CA2253874A1 - Viral particles which are masked or unmasked with respect to a cell receptor - Google Patents
- Publication number
- CA2253874A1
- Authority
- CA
- Canada
- Prior art keywords
- leu
- pro
- gly
- thr
- ser
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
- 230000003612 virological effect Effects 0.000 title claims description 128
- 239000002245 particle Substances 0.000 title claims description 95
- 210000004027 cell Anatomy 0.000 claims abstract description 181
- 108090000765 processed proteins & peptides Proteins 0.000 claims abstract description 176
- 108020001580 protein domains Proteins 0.000 claims abstract description 98
- 150000001413 amino acids Chemical class 0.000 claims abstract description 80
- 210000004899 c-terminal region Anatomy 0.000 claims abstract description 66
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 60
- 210000003527 eukaryotic cell Anatomy 0.000 claims abstract description 58
- 238000011144 upstream manufacturing Methods 0.000 claims abstract description 55
- 102000004196 processed proteins & peptides Human genes 0.000 claims abstract description 54
- 230000003993 interaction Effects 0.000 claims abstract description 42
- 229920001184 polypeptide Polymers 0.000 claims abstract description 37
- 229920000037 Polyproline Polymers 0.000 claims abstract description 22
- 238000007514 turning Methods 0.000 claims abstract description 16
- 238000010276 construction Methods 0.000 claims abstract description 13
- 108010026466 polyproline Proteins 0.000 claims abstract description 10
- 239000000427 antigen Substances 0.000 claims abstract description 7
- 108091007433 antigens Proteins 0.000 claims abstract description 7
- 102000036639 antigens Human genes 0.000 claims abstract description 7
- 230000005764 inhibitory process Effects 0.000 claims abstract description 4
- 230000002401 inhibitory effect Effects 0.000 claims abstract description 3
- 108020003175 receptors Proteins 0.000 claims description 156
- 230000027455 binding Effects 0.000 claims description 87
- 241000700605 Viruses Species 0.000 claims description 74
- 102100021696 Syncytin-1 Human genes 0.000 claims description 67
- 101710121417 Envelope glycoprotein Proteins 0.000 claims description 63
- 230000008685 targeting Effects 0.000 claims description 63
- 230000001177 retroviral effect Effects 0.000 claims description 59
- 239000002773 nucleotide Substances 0.000 claims description 52
- 125000003729 nucleotide group Chemical group 0.000 claims description 52
- 241000714177 Murine leukemia virus Species 0.000 claims description 45
- 230000007246 mechanism Effects 0.000 claims description 42
- 239000013598 vector Substances 0.000 claims description 41
- 238000000034 method Methods 0.000 claims description 36
- 241000713869 Moloney murine leukemia virus Species 0.000 claims description 35
- 241001430294 unidentified retrovirus Species 0.000 claims description 33
- 230000000873 masking effect Effects 0.000 claims description 30
- 238000012546 transfer Methods 0.000 claims description 28
- 230000006870 function Effects 0.000 claims description 26
- 230000004927 fusion Effects 0.000 claims description 26
- 230000008569 process Effects 0.000 claims description 24
- 108090000288 Glycoproteins Proteins 0.000 claims description 23
- 102000003886 Glycoproteins Human genes 0.000 claims description 23
- 239000003446 ligand Substances 0.000 claims description 23
- 108070000030 Viral receptors Proteins 0.000 claims description 22
- 108020004707 nucleic acids Proteins 0.000 claims description 22
- 102000039446 nucleic acids Human genes 0.000 claims description 22
- 150000007523 nucleic acids Chemical class 0.000 claims description 22
- 241000713813 Gibbon ape leukemia virus Species 0.000 claims description 16
- 241000713877 Simian sarcoma-associated virus Species 0.000 claims description 12
- 125000001500 prolyl group Chemical group [H]N1C([H])(C(=O)[*])C([H])([H])C([H])([H])C1([H])[H] 0.000 claims description 11
- 241000702421 Dependoparvovirus Species 0.000 claims description 9
- 101800000135 N-terminal protein Proteins 0.000 claims description 9
- 101800001452 P1 proteinase Proteins 0.000 claims description 9
- 108010003533 Viral Envelope Proteins Proteins 0.000 claims description 8
- 239000003102 growth factor Substances 0.000 claims description 8
- 108090000695 Cytokines Proteins 0.000 claims description 7
- 102000004127 Cytokines Human genes 0.000 claims description 7
- 230000000694 effects Effects 0.000 claims description 7
- 239000000813 peptide hormone Substances 0.000 claims description 7
- 230000009870 specific binding Effects 0.000 claims description 7
- 241000714165 Feline leukemia virus Species 0.000 claims description 6
- 108010038988 Peptide Hormones Proteins 0.000 claims description 6
- 102000015731 Peptide Hormones Human genes 0.000 claims description 6
- 241000701161 unidentified adenovirus Species 0.000 claims description 6
- 108010076504 Protein Sorting Signals Proteins 0.000 claims description 5
- 241001529453 unidentified herpesvirus Species 0.000 claims description 5
- 239000000833 heterodimer Substances 0.000 claims description 4
- 239000000178 monomer Substances 0.000 claims description 4
- 241000712461 unidentified influenza virus Species 0.000 claims description 4
- 101000851056 Bos taurus Elastin Proteins 0.000 claims description 3
- 238000000338 in vitro Methods 0.000 claims description 3
- 239000013543 active substance Substances 0.000 claims description 2
- 239000003937 drug carrier Substances 0.000 claims description 2
- 239000008194 pharmaceutical composition Substances 0.000 claims description 2
- 239000013603 viral vector Substances 0.000 claims description 2
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 abstract description 19
- 230000001413 cellular effect Effects 0.000 abstract description 9
- 102000005962 receptors Human genes 0.000 description 125
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 74
- 235000001014 amino acid Nutrition 0.000 description 73
- 229940024606 amino acid Drugs 0.000 description 73
- 108010077515 glycylproline Proteins 0.000 description 61
- 208000015181 infectious disease Diseases 0.000 description 49
- 101800001707 Spacer peptide Proteins 0.000 description 48
- 101100356020 Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd) recA gene Proteins 0.000 description 46
- 101100042680 Mus musculus Slc7a1 gene Proteins 0.000 description 46
- 239000012634 fragment Substances 0.000 description 43
- 108010026333 seryl-proline Proteins 0.000 description 43
- 108020004414 DNA Proteins 0.000 description 41
- 108010061238 threonyl-glycine Proteins 0.000 description 41
- 108010090894 prolylleucine Proteins 0.000 description 39
- KUTPGXNAAOQSPD-LPEHRKFASA-N Glu-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)N)C(=O)O KUTPGXNAAOQSPD-LPEHRKFASA-N 0.000 description 37
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 37
- 239000013612 plasmid Substances 0.000 description 37
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 36
- 108010031719 prolyl-serine Proteins 0.000 description 34
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 33
- KBBRNEDOYWMIJP-KYNKHSRBSA-N Thr-Gly-Thr Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N)O KBBRNEDOYWMIJP-KYNKHSRBSA-N 0.000 description 30
- 241000520777 MLV-related retrovirus Species 0.000 description 29
- 108010047495 alanylglycine Proteins 0.000 description 28
- CUXJENOFJXOSOZ-BIIVOSGPSA-N Ser-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CO)N)C(=O)O CUXJENOFJXOSOZ-BIIVOSGPSA-N 0.000 description 27
- QLSRIZIDQXDQHK-RCWTZXSCSA-N Arg-Val-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QLSRIZIDQXDQHK-RCWTZXSCSA-N 0.000 description 26
- IKWHIGGRTYBSIW-OBJOEFQTSA-N (2s)-2-[[(2s)-2-[[(2s)-1-(2-aminoacetyl)pyrrolidine-2-carbonyl]amino]-5-(diaminomethylideneamino)pentanoyl]amino]-3-methylbutanoic acid Chemical compound NC(N)=NCCC[C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN IKWHIGGRTYBSIW-OBJOEFQTSA-N 0.000 description 25
- TXPUNZXZDVJUJQ-LPEHRKFASA-N Pro-Asn-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N2CCC[C@@H]2C(=O)O TXPUNZXZDVJUJQ-LPEHRKFASA-N 0.000 description 25
- XYHMFGGWNOFUOU-QXEWZRGKSA-N Pro-Ile-Gly Chemical compound OC(=O)CNC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 XYHMFGGWNOFUOU-QXEWZRGKSA-N 0.000 description 25
- 108010034529 leucyl-lysine Proteins 0.000 description 25
- MFFOYNGMOYFPBD-DCAQKATOSA-N Asn-Arg-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O MFFOYNGMOYFPBD-DCAQKATOSA-N 0.000 description 23
- KVYVOGYEMPEXBT-GUBZILKMSA-N Gln-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O KVYVOGYEMPEXBT-GUBZILKMSA-N 0.000 description 23
- ILDSIMPXNFWKLH-KATARQTJSA-N Leu-Thr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ILDSIMPXNFWKLH-KATARQTJSA-N 0.000 description 23
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 22
- UGDMQJSXSSZUKL-IHRRRGAJSA-N Pro-Ser-Tyr Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O UGDMQJSXSSZUKL-IHRRRGAJSA-N 0.000 description 22
- 108010089804 glycyl-threonine Proteins 0.000 description 22
- 210000002845 virion Anatomy 0.000 description 22
- DEAGTWNKODHUIY-MRFFXTKBSA-N Ala-Tyr-Trp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O DEAGTWNKODHUIY-MRFFXTKBSA-N 0.000 description 21
- 108091034117 Oligonucleotide Proteins 0.000 description 21
- RPVDDQYNBOVWLR-HOCLYGCPSA-N Trp-Gly-Leu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O RPVDDQYNBOVWLR-HOCLYGCPSA-N 0.000 description 21
- MCKSLROAGSDNFC-ACZMJKKPSA-N Ala-Asp-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MCKSLROAGSDNFC-ACZMJKKPSA-N 0.000 description 20
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 20
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 20
- 108010083327 glycyl-prolyl-arginyl-valine Proteins 0.000 description 20
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 20
- 108010077112 prolyl-proline Proteins 0.000 description 20
- XQDGOJPVMSWZSO-SRVKXCTJSA-N Gln-Pro-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)N)N XQDGOJPVMSWZSO-SRVKXCTJSA-N 0.000 description 19
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 19
- UQJOKDAYFULYIX-AVGNSLFASA-N Lys-Pro-Pro Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 UQJOKDAYFULYIX-AVGNSLFASA-N 0.000 description 19
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 19
- ZHQWPWQNVRCXAX-XQQFMLRXSA-N Val-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZHQWPWQNVRCXAX-XQQFMLRXSA-N 0.000 description 19
- 108010010147 glycylglutamine Proteins 0.000 description 19
- WGDNWOMKBUXFHR-BQBZGAKWSA-N Ala-Gly-Arg Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N WGDNWOMKBUXFHR-BQBZGAKWSA-N 0.000 description 18
- 101800003838 Epidermal growth factor Proteins 0.000 description 18
- IIVZNQCUUMBBKF-GVXVVHGQSA-N His-Gln-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CC1=CN=CN1 IIVZNQCUUMBBKF-GVXVVHGQSA-N 0.000 description 18
- HASRFYOMVPJRPU-SRVKXCTJSA-N Leu-Arg-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HASRFYOMVPJRPU-SRVKXCTJSA-N 0.000 description 18
- PTWIYDNFWPXQSD-GARJFASQSA-N Ser-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CO)N)C(=O)O PTWIYDNFWPXQSD-GARJFASQSA-N 0.000 description 18
- NUQZCPSZHGIYTA-HKUYNNGSSA-N Tyr-Trp-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC3=CC=C(C=C3)O)N NUQZCPSZHGIYTA-HKUYNNGSSA-N 0.000 description 18
- PQSNETRGCRUOGP-KKHAAJSZSA-N Val-Thr-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(N)=O PQSNETRGCRUOGP-KKHAAJSZSA-N 0.000 description 18
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 18
- 229940116977 epidermal growth factor Drugs 0.000 description 18
- 235000013930 proline Nutrition 0.000 description 18
- 108010071097 threonyl-lysyl-proline Proteins 0.000 description 18
- VBEQCZHXXJYVRD-GACYYNSASA-N uroanthelone Chemical compound C([C@@H](C(=O)N[C@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CS)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)C(C)C)[C@@H](C)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CC=1NC=NC=1)NC(=O)[C@H](CCSC)NC(=O)[C@H](CS)NC(=O)[C@@H](NC(=O)CNC(=O)CNC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CS)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CO)NC(=O)[C@H](CO)NC(=O)[C@H]1N(CCC1)C(=O)[C@H](CS)NC(=O)CNC(=O)[C@H]1N(CCC1)C(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O)C(C)C)[C@@H](C)CC)C1=CC=C(O)C=C1 VBEQCZHXXJYVRD-GACYYNSASA-N 0.000 description 18
- BRPMXFSTKXXNHF-IUCAKERBSA-N (2s)-1-[2-[[(2s)-pyrrolidine-2-carbonyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound OC(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@H]1NCCC1 BRPMXFSTKXXNHF-IUCAKERBSA-N 0.000 description 17
- SUMJNGAMIQSNGX-TUAOUCFPSA-N Arg-Val-Pro Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N1CCC[C@@H]1C(O)=O SUMJNGAMIQSNGX-TUAOUCFPSA-N 0.000 description 17
- 102400001368 Epidermal growth factor Human genes 0.000 description 17
- GQKSJYINYYWPMR-NGZCFLSTSA-N Ile-Gly-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N GQKSJYINYYWPMR-NGZCFLSTSA-N 0.000 description 17
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 17
- RAGOJJCBGXARPO-XVSYOHENSA-N Phe-Thr-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 RAGOJJCBGXARPO-XVSYOHENSA-N 0.000 description 17
- UGFMVXRXULGLNO-XPUUQOCRSA-N Val-Ser-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O UGFMVXRXULGLNO-XPUUQOCRSA-N 0.000 description 17
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 17
- 108010045269 tryptophyltryptophan Proteins 0.000 description 17
- VHQSGALUSWIYOD-QXEWZRGKSA-N Asn-Pro-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O VHQSGALUSWIYOD-QXEWZRGKSA-N 0.000 description 16
- NQSUTVRXXBGVDQ-LKXGYXEUSA-N Cys-Asn-Thr Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NQSUTVRXXBGVDQ-LKXGYXEUSA-N 0.000 description 16
- JKGHDYGZRDWHGA-SRVKXCTJSA-N Leu-Asn-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JKGHDYGZRDWHGA-SRVKXCTJSA-N 0.000 description 16
- JIHDFWWRYHSAQB-GUBZILKMSA-N Leu-Ser-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JIHDFWWRYHSAQB-GUBZILKMSA-N 0.000 description 16
- YOOAQCZYZHGUAZ-KATARQTJSA-N Thr-Leu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YOOAQCZYZHGUAZ-KATARQTJSA-N 0.000 description 16
- AHERARIZBPOMNU-KATARQTJSA-N Thr-Ser-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O AHERARIZBPOMNU-KATARQTJSA-N 0.000 description 16
- JYPCXBJRLBHWME-UHFFFAOYSA-N glycyl-L-prolyl-L-arginine Natural products NCC(=O)N1CCCC1C(=O)NC(CCCN=C(N)N)C(O)=O JYPCXBJRLBHWME-UHFFFAOYSA-N 0.000 description 16
- 230000002458 infectious effect Effects 0.000 description 16
- 108010015796 prolylisoleucine Proteins 0.000 description 16
- 108010073969 valyllysine Proteins 0.000 description 16
- TTXMOJWKNRJWQJ-FXQIFTODSA-N Ala-Arg-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N TTXMOJWKNRJWQJ-FXQIFTODSA-N 0.000 description 15
- YBPLKDWJFYCZSV-ZLUOBGJFSA-N Ala-Asn-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N YBPLKDWJFYCZSV-ZLUOBGJFSA-N 0.000 description 15
- LMPKCSXZJSXBBL-NHCYSSNCSA-N Arg-Gln-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O LMPKCSXZJSXBBL-NHCYSSNCSA-N 0.000 description 15
- MVRGBQGZSDJBSM-GMOBBJLQSA-N Asp-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)O)N MVRGBQGZSDJBSM-GMOBBJLQSA-N 0.000 description 15
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 15
- 241000880493 Leptailurus serval Species 0.000 description 15
- ONHCDMBHPQIPAI-YTQUADARSA-N Leu-Trp-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N3CCC[C@@H]3C(=O)O)N ONHCDMBHPQIPAI-YTQUADARSA-N 0.000 description 15
- UCRJTSIIAYHOHE-ULQDDVLXSA-N Leu-Tyr-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UCRJTSIIAYHOHE-ULQDDVLXSA-N 0.000 description 15
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 15
- YSPZCHGIWAQVKQ-AVGNSLFASA-N Lys-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN YSPZCHGIWAQVKQ-AVGNSLFASA-N 0.000 description 15
- WZVSHTFTCYOFPL-GARJFASQSA-N Lys-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCCCN)N)C(=O)O WZVSHTFTCYOFPL-GARJFASQSA-N 0.000 description 15
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 15
- HGNGAMWHGGANAU-WHOFXGATSA-N Phe-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HGNGAMWHGGANAU-WHOFXGATSA-N 0.000 description 15
- RMJZWERKFFNNNS-XGEHTFHBSA-N Pro-Thr-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O RMJZWERKFFNNNS-XGEHTFHBSA-N 0.000 description 15
- SIEBDTCABMZCLF-XGEHTFHBSA-N Ser-Val-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SIEBDTCABMZCLF-XGEHTFHBSA-N 0.000 description 15
- CGCMNOIQVAXYMA-UNQGMJICSA-N Thr-Met-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O CGCMNOIQVAXYMA-UNQGMJICSA-N 0.000 description 15
- 108010013835 arginine glutamate Proteins 0.000 description 15
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 15
- 235000018102 proteins Nutrition 0.000 description 15
- 102000004169 proteins and genes Human genes 0.000 description 15
- 108010033670 threonyl-aspartyl-tyrosine Proteins 0.000 description 15
- CXRCVCURMBFFOL-FXQIFTODSA-N Ala-Ala-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CXRCVCURMBFFOL-FXQIFTODSA-N 0.000 description 14
- 108010076441 Ala-His-His Proteins 0.000 description 14
- HPSVTWMFWCHKFN-GARJFASQSA-N Arg-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O HPSVTWMFWCHKFN-GARJFASQSA-N 0.000 description 14
- LVMUGODRNHFGRA-AVGNSLFASA-N Arg-Leu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O LVMUGODRNHFGRA-AVGNSLFASA-N 0.000 description 14
- PZBSKYJGKNNYNK-ULQDDVLXSA-N Arg-Leu-Tyr Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCN=C(N)N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O PZBSKYJGKNNYNK-ULQDDVLXSA-N 0.000 description 14
- BYLSYQASFJJBCL-DCAQKATOSA-N Asn-Pro-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O BYLSYQASFJJBCL-DCAQKATOSA-N 0.000 description 14
- JFSNBQJNDMXMQF-XHNCKOQMSA-N Gln-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N)C(=O)O JFSNBQJNDMXMQF-XHNCKOQMSA-N 0.000 description 14
- IWAXHBCACVWNHT-BQBZGAKWSA-N Gly-Asp-Arg Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IWAXHBCACVWNHT-BQBZGAKWSA-N 0.000 description 14
- TWQIYNGNYNJUFM-NHCYSSNCSA-N Leu-Asn-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TWQIYNGNYNJUFM-NHCYSSNCSA-N 0.000 description 14
- YFBBUHJJUXXZOF-UWVGGRQHSA-N Leu-Gly-Pro Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O YFBBUHJJUXXZOF-UWVGGRQHSA-N 0.000 description 14
- DPURXCQCHSQPAN-AVGNSLFASA-N Leu-Pro-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DPURXCQCHSQPAN-AVGNSLFASA-N 0.000 description 14
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 14
- UZWMJZSOXGOVIN-LURJTMIESA-N Met-Gly-Gly Chemical compound CSCC[C@H](N)C(=O)NCC(=O)NCC(O)=O UZWMJZSOXGOVIN-LURJTMIESA-N 0.000 description 14
- HRIXMVRZRGFKNQ-HJGDQZAQSA-N Pro-Thr-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HRIXMVRZRGFKNQ-HJGDQZAQSA-N 0.000 description 14
- DGHFNYXVIXNNMC-GUBZILKMSA-N Ser-Gln-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CO)N DGHFNYXVIXNNMC-GUBZILKMSA-N 0.000 description 14
- SFTZWNJFZYOLBD-ZDLURKLDSA-N Ser-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO SFTZWNJFZYOLBD-ZDLURKLDSA-N 0.000 description 14
- DKGRNFUXVTYRAS-UBHSHLNASA-N Ser-Ser-Trp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O DKGRNFUXVTYRAS-UBHSHLNASA-N 0.000 description 14
- BVOVIGCHYNFJBZ-JXUBOQSCSA-N Thr-Leu-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O BVOVIGCHYNFJBZ-JXUBOQSCSA-N 0.000 description 14
- CRHFOYCJGVJPLE-AVGNSLFASA-N Tyr-Gln-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O CRHFOYCJGVJPLE-AVGNSLFASA-N 0.000 description 14
- JHORGUYURUBVOM-KKUMJFAQSA-N Tyr-His-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O JHORGUYURUBVOM-KKUMJFAQSA-N 0.000 description 14
- YCMXFKWYJFZFKS-LAEOZQHASA-N Val-Gln-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCMXFKWYJFZFKS-LAEOZQHASA-N 0.000 description 14
- VHIZXDZMTDVFGX-DCAQKATOSA-N Val-Ser-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N VHIZXDZMTDVFGX-DCAQKATOSA-N 0.000 description 14
- 108010006664 gamma-glutamyl-glycyl-glycine Proteins 0.000 description 14
- 108010090333 leucyl-lysyl-proline Proteins 0.000 description 14
- 108010017391 lysylvaline Proteins 0.000 description 14
- NLYYHIKRBRMAJV-AEJSXWLSSA-N Ala-Val-Pro Chemical compound C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N NLYYHIKRBRMAJV-AEJSXWLSSA-N 0.000 description 13
- AITKTFCQOBRJTG-CIUDSAMLSA-N Asp-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N AITKTFCQOBRJTG-CIUDSAMLSA-N 0.000 description 13
- DLISPGXMKZTWQG-IFFSRLJSSA-N Glu-Thr-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O DLISPGXMKZTWQG-IFFSRLJSSA-N 0.000 description 13
- QGAJQIGFFIQJJK-IHRRRGAJSA-N Glu-Tyr-Gln Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O QGAJQIGFFIQJJK-IHRRRGAJSA-N 0.000 description 13
- QCTLGOYODITHPQ-WHFBIAKZSA-N Gly-Cys-Ser Chemical compound [H]NCC(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O QCTLGOYODITHPQ-WHFBIAKZSA-N 0.000 description 13
- HMHRTKOWRUPPNU-RCOVLWMOSA-N Gly-Ile-Gly Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O HMHRTKOWRUPPNU-RCOVLWMOSA-N 0.000 description 13
- JIUYRPFQJJRSJB-QWRGUYRKSA-N His-His-Gly Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)NCC(O)=O)C1=CN=CN1 JIUYRPFQJJRSJB-QWRGUYRKSA-N 0.000 description 13
- VCBWXASUBZIFLQ-IHRRRGAJSA-N His-Pro-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O VCBWXASUBZIFLQ-IHRRRGAJSA-N 0.000 description 13
- NKVZTQVGUNLLQW-JBDRJPRFSA-N Ile-Ala-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)O)N NKVZTQVGUNLLQW-JBDRJPRFSA-N 0.000 description 13
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 13
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 13
- IEVXCWPVBYCJRZ-IXOXFDKPSA-N Lys-Thr-His Chemical compound NCCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 IEVXCWPVBYCJRZ-IXOXFDKPSA-N 0.000 description 13
- AFFKUNVPPLQUGA-DCAQKATOSA-N Met-Leu-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O AFFKUNVPPLQUGA-DCAQKATOSA-N 0.000 description 13
- 108010066427 N-valyltryptophan Proteins 0.000 description 13
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 13
- KWMUAKQOVYCQJQ-ZPFDUUQYSA-N Pro-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@@H]1CCCN1 KWMUAKQOVYCQJQ-ZPFDUUQYSA-N 0.000 description 13
- KDBHVPXBQADZKY-GUBZILKMSA-N Pro-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 KDBHVPXBQADZKY-GUBZILKMSA-N 0.000 description 13
- BPMRXBZYPGYPJN-WHFBIAKZSA-N Ser-Gly-Asn Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O BPMRXBZYPGYPJN-WHFBIAKZSA-N 0.000 description 13
- VLMIUSLQONKLDV-HEIBUPTGSA-N Ser-Thr-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VLMIUSLQONKLDV-HEIBUPTGSA-N 0.000 description 13
- XGUAUKUYQHBUNY-SWRJLBSHSA-N Thr-Trp-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(O)=O XGUAUKUYQHBUNY-SWRJLBSHSA-N 0.000 description 13
- PEYSVKMXSLPQRU-FJHTZYQYSA-N Trp-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O PEYSVKMXSLPQRU-FJHTZYQYSA-N 0.000 description 13
- GQNCRIFNDVFRNF-BPUTZDHNSA-N Trp-Pro-Asp Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O GQNCRIFNDVFRNF-BPUTZDHNSA-N 0.000 description 13
- SUGLEXVWEJOCGN-ONUFPDRFSA-N Trp-Thr-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CNC4=CC=CC=C43)N)O SUGLEXVWEJOCGN-ONUFPDRFSA-N 0.000 description 13
- MTEQZJFSEMXXRK-CFMVVWHZSA-N Tyr-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N MTEQZJFSEMXXRK-CFMVVWHZSA-N 0.000 description 13
- LHTGRUZSZOIAKM-SOUVJXGZSA-N Tyr-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O LHTGRUZSZOIAKM-SOUVJXGZSA-N 0.000 description 13
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 13
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 13
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 13
- 108010091092 arginyl-glycyl-proline Proteins 0.000 description 13
- 108010060199 cysteinylproline Proteins 0.000 description 13
- 108010050848 glycylleucine Proteins 0.000 description 13
- 238000003752 polymerase chain reaction Methods 0.000 description 13
- 108010014614 prolyl-glycyl-proline Proteins 0.000 description 13
- 108010053725 prolylvaline Proteins 0.000 description 13
- 108700004896 tripeptide FEG Proteins 0.000 description 13
- NAARDJBSSPUGCF-FXQIFTODSA-N Arg-Cys-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)CN=C(N)N NAARDJBSSPUGCF-FXQIFTODSA-N 0.000 description 12
- DAPLJWATMAXPPZ-CIUDSAMLSA-N Asn-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(N)=O DAPLJWATMAXPPZ-CIUDSAMLSA-N 0.000 description 12
- RSMIHCFQDCVVBR-CIUDSAMLSA-N Asp-Gln-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCNC(N)=N RSMIHCFQDCVVBR-CIUDSAMLSA-N 0.000 description 12
- PCJOFZYFFMBZKC-PCBIJLKTSA-N Asp-Phe-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PCJOFZYFFMBZKC-PCBIJLKTSA-N 0.000 description 12
- OHLLDUNVMPPUMD-DCAQKATOSA-N Cys-Leu-Val Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CS)N OHLLDUNVMPPUMD-DCAQKATOSA-N 0.000 description 12
- MSWBLPLBSLQVME-XIRDDKMYSA-N Cys-Trp-Leu Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CS)=CNC2=C1 MSWBLPLBSLQVME-XIRDDKMYSA-N 0.000 description 12
- JSYULGSPLTZDHM-NRPADANISA-N Gln-Ala-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O JSYULGSPLTZDHM-NRPADANISA-N 0.000 description 12
- FGYPOQPQTUNESW-IUCAKERBSA-N Gln-Gly-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N FGYPOQPQTUNESW-IUCAKERBSA-N 0.000 description 12
- IULKWYSYZSURJK-AVGNSLFASA-N Gln-Leu-Lys Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O IULKWYSYZSURJK-AVGNSLFASA-N 0.000 description 12
- WLRYGVYQFXRJDA-DCAQKATOSA-N Gln-Pro-Pro Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 WLRYGVYQFXRJDA-DCAQKATOSA-N 0.000 description 12
- BJVBMSTUUWGZKX-JYJNAYRXSA-N Gln-Tyr-His Chemical compound N[C@@H](CCC(N)=O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O BJVBMSTUUWGZKX-JYJNAYRXSA-N 0.000 description 12
- KHHDJQRWIFHXHS-NRPADANISA-N Gln-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)N)N KHHDJQRWIFHXHS-NRPADANISA-N 0.000 description 12
- HILMIYALTUQTRC-XVKPBYJWSA-N Glu-Gly-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HILMIYALTUQTRC-XVKPBYJWSA-N 0.000 description 12
- INGJLBQKTRJLFO-UKJIMTQDSA-N Glu-Ile-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O INGJLBQKTRJLFO-UKJIMTQDSA-N 0.000 description 12
- FKYQEVBRZSFAMJ-QWRGUYRKSA-N Gly-Ser-Tyr Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FKYQEVBRZSFAMJ-QWRGUYRKSA-N 0.000 description 12
- BXDLTKLPPKBVEL-FJXKBIBVSA-N Gly-Thr-Met Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(O)=O BXDLTKLPPKBVEL-FJXKBIBVSA-N 0.000 description 12
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 12
- KPJJOZUXFOLGMQ-CIUDSAMLSA-N Lys-Asp-Asn Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N KPJJOZUXFOLGMQ-CIUDSAMLSA-N 0.000 description 12
- YTJFXEDRUOQGSP-DCAQKATOSA-N Lys-Pro-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O YTJFXEDRUOQGSP-DCAQKATOSA-N 0.000 description 12
- CGBYDGAJHSOGFQ-LPEHRKFASA-N Pro-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 CGBYDGAJHSOGFQ-LPEHRKFASA-N 0.000 description 12
- QDDJNKWPTJHROJ-UFYCRDLUSA-N Pro-Tyr-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 QDDJNKWPTJHROJ-UFYCRDLUSA-N 0.000 description 12
- RHAPJNVNWDBFQI-BQBZGAKWSA-N Ser-Pro-Gly Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O RHAPJNVNWDBFQI-BQBZGAKWSA-N 0.000 description 12
- GZGFSPWOMUKKCV-NAKRPEOUSA-N Ser-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO GZGFSPWOMUKKCV-NAKRPEOUSA-N 0.000 description 12
- QUGRFWPMPVIAPW-IHRRRGAJSA-N Ser-Pro-Phe Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QUGRFWPMPVIAPW-IHRRRGAJSA-N 0.000 description 12
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 12
- NERYDXBVARJIQS-JYBASQMISA-N Ser-Trp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CO)N)O NERYDXBVARJIQS-JYBASQMISA-N 0.000 description 12
- BSNZTJXVDOINSR-JXUBOQSCSA-N Thr-Ala-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BSNZTJXVDOINSR-JXUBOQSCSA-N 0.000 description 12
- VUVCRYXYUUPGSB-GLLZPBPUSA-N Thr-Gln-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O VUVCRYXYUUPGSB-GLLZPBPUSA-N 0.000 description 12
- YZUWGFXVVZQJEI-PMVVWTBXSA-N Thr-Gly-His Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O YZUWGFXVVZQJEI-PMVVWTBXSA-N 0.000 description 12
- IVDFVBVIVLJJHR-LKXGYXEUSA-N Thr-Ser-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IVDFVBVIVLJJHR-LKXGYXEUSA-N 0.000 description 12
- YRJOLUDFVAUXLI-GSSVUCPTSA-N Thr-Thr-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O YRJOLUDFVAUXLI-GSSVUCPTSA-N 0.000 description 12
- BKIOKSLLAAZYTC-KKHAAJSZSA-N Thr-Val-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O BKIOKSLLAAZYTC-KKHAAJSZSA-N 0.000 description 12
- GHXXDFDIDHIEIL-WFBYXXMGSA-N Trp-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N GHXXDFDIDHIEIL-WFBYXXMGSA-N 0.000 description 12
- QOEZFICGUZTRFX-IHRRRGAJSA-N Tyr-Cys-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O QOEZFICGUZTRFX-IHRRRGAJSA-N 0.000 description 12
- YQYFYUSYEDNLSD-YEPSODPASA-N Val-Thr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O YQYFYUSYEDNLSD-YEPSODPASA-N 0.000 description 12
- 108010016616 cysteinylglycine Proteins 0.000 description 12
- 238000001415 gene therapy Methods 0.000 description 12
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 12
- 108010034507 methionyltryptophan Proteins 0.000 description 12
- 108010012581 phenylalanylglutamate Proteins 0.000 description 12
- 108010051242 phenylalanylserine Proteins 0.000 description 12
- 238000012360 testing method Methods 0.000 description 12
- 108010038745 tryptophylglycine Proteins 0.000 description 12
- 108010003137 tyrosyltyrosine Proteins 0.000 description 12
- RCAUJZASOAFTAJ-FXQIFTODSA-N Arg-Asp-Cys Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N)CN=C(N)N RCAUJZASOAFTAJ-FXQIFTODSA-N 0.000 description 11
- ZATRYQNPUHGXCU-DTWKUNHWSA-N Arg-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ZATRYQNPUHGXCU-DTWKUNHWSA-N 0.000 description 11
- GMRGSBAMMMVDGG-GUBZILKMSA-N Asn-Arg-Arg Chemical compound C(C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N GMRGSBAMMMVDGG-GUBZILKMSA-N 0.000 description 11
- AMGQTNHANMRPOE-LKXGYXEUSA-N Asn-Thr-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O AMGQTNHANMRPOE-LKXGYXEUSA-N 0.000 description 11
- RKNIUWSZIAUEPK-PBCZWWQYSA-N Asp-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC(=O)O)N)O RKNIUWSZIAUEPK-PBCZWWQYSA-N 0.000 description 11
- DTMLKCYOQKZXKZ-HJGDQZAQSA-N Gln-Arg-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DTMLKCYOQKZXKZ-HJGDQZAQSA-N 0.000 description 11
- LVNILKSSFHCSJZ-IHRRRGAJSA-N Gln-Gln-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N LVNILKSSFHCSJZ-IHRRRGAJSA-N 0.000 description 11
- YTSVAIMKVLZUDU-YUMQZZPRSA-N Gly-Leu-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YTSVAIMKVLZUDU-YUMQZZPRSA-N 0.000 description 11
- LSQHWKPPOFDHHZ-YUMQZZPRSA-N His-Asp-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N LSQHWKPPOFDHHZ-YUMQZZPRSA-N 0.000 description 11
- JKSIBWITFMQTOA-XUXIUFHCSA-N Leu-Ile-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O JKSIBWITFMQTOA-XUXIUFHCSA-N 0.000 description 11
- UBZGNBKMIJHOHL-BZSNNMDCSA-N Leu-Leu-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 UBZGNBKMIJHOHL-BZSNNMDCSA-N 0.000 description 11
- AIRUUHAOKGVJAD-JYJNAYRXSA-N Leu-Phe-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIRUUHAOKGVJAD-JYJNAYRXSA-N 0.000 description 11
- OKCJTECLRDARDZ-XIRDDKMYSA-N Lys-Trp-Cys Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CS)C(O)=O)=CNC2=C1 OKCJTECLRDARDZ-XIRDDKMYSA-N 0.000 description 11
- WFLWKEUBTSOFMP-FXQIFTODSA-N Pro-Cys-Cys Chemical compound OC(=O)[C@H](CS)NC(=O)[C@H](CS)NC(=O)[C@@H]1CCCN1 WFLWKEUBTSOFMP-FXQIFTODSA-N 0.000 description 11
- PRKWBYCXBBSLSK-GUBZILKMSA-N Pro-Ser-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O PRKWBYCXBBSLSK-GUBZILKMSA-N 0.000 description 11
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 11
- FKYWFUYPVKLJLP-DCAQKATOSA-N Ser-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FKYWFUYPVKLJLP-DCAQKATOSA-N 0.000 description 11
- KDKLLPMFFGYQJD-CYDGBPFRSA-N Val-Ile-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](C(C)C)N KDKLLPMFFGYQJD-CYDGBPFRSA-N 0.000 description 11
- 108010004914 prolylarginine Proteins 0.000 description 11
- BYXHQQCXAJARLQ-ZLUOBGJFSA-N Ala-Ala-Ala Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O BYXHQQCXAJARLQ-ZLUOBGJFSA-N 0.000 description 10
- MQIGTEQXYCRLGK-BQBZGAKWSA-N Ala-Gly-Pro Chemical compound C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O MQIGTEQXYCRLGK-BQBZGAKWSA-N 0.000 description 10
- HHRAXZAYZFFRAM-CIUDSAMLSA-N Ala-Leu-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O HHRAXZAYZFFRAM-CIUDSAMLSA-N 0.000 description 10
- BHFOJPDOQPWJRN-XDTLVQLUSA-N Ala-Tyr-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCC(N)=O)C(O)=O BHFOJPDOQPWJRN-XDTLVQLUSA-N 0.000 description 10
- HKRXJBBCQBAGIM-FXQIFTODSA-N Arg-Asp-Ser Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N)CN=C(N)N HKRXJBBCQBAGIM-FXQIFTODSA-N 0.000 description 10
- NKNILFJYKKHBKE-WPRPVWTQSA-N Arg-Gly-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O NKNILFJYKKHBKE-WPRPVWTQSA-N 0.000 description 10
- RKQRHMKFNBYOTN-IHRRRGAJSA-N Arg-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N RKQRHMKFNBYOTN-IHRRRGAJSA-N 0.000 description 10
- GXXWTNKNFFKTJB-NAKRPEOUSA-N Arg-Ile-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O GXXWTNKNFFKTJB-NAKRPEOUSA-N 0.000 description 10
- UHFUZWSZQKMDSX-DCAQKATOSA-N Arg-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UHFUZWSZQKMDSX-DCAQKATOSA-N 0.000 description 10
- AMIQZQAAYGYKOP-FXQIFTODSA-N Arg-Ser-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O AMIQZQAAYGYKOP-FXQIFTODSA-N 0.000 description 10
- SUEIIIFUBHDCCS-PBCZWWQYSA-N Asn-His-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SUEIIIFUBHDCCS-PBCZWWQYSA-N 0.000 description 10
- TVYMKYUSZSVOAG-ZLUOBGJFSA-N Cys-Ala-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O TVYMKYUSZSVOAG-ZLUOBGJFSA-N 0.000 description 10
- WOACHWLUOFZLGJ-GUBZILKMSA-N Gln-Arg-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O WOACHWLUOFZLGJ-GUBZILKMSA-N 0.000 description 10
- NXPXQIZKDOXIHH-JSGCOSHPSA-N Gln-Gly-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N NXPXQIZKDOXIHH-JSGCOSHPSA-N 0.000 description 10
- MXPBQDFWIMBACQ-ACZMJKKPSA-N Glu-Cys-Cys Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CS)C(O)=O MXPBQDFWIMBACQ-ACZMJKKPSA-N 0.000 description 10
- XPJBQTCXPJNIFE-ZETCQYMHSA-N Gly-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 10
- MIIVFRCYJABHTQ-ONGXEEELSA-N Gly-Leu-Val Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O MIIVFRCYJABHTQ-ONGXEEELSA-N 0.000 description 10
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 10
- KUIDCYNIEJBZBU-AJNGGQMLSA-N Leu-Ile-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O KUIDCYNIEJBZBU-AJNGGQMLSA-N 0.000 description 10
- OMHLATXVNQSALM-FQUUOJAGSA-N Leu-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(C)C)N OMHLATXVNQSALM-FQUUOJAGSA-N 0.000 description 10
- DDVHDMSBLRAKNV-IHRRRGAJSA-N Leu-Met-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O DDVHDMSBLRAKNV-IHRRRGAJSA-N 0.000 description 10
- XIZQPFCRXLUNMK-BZSNNMDCSA-N Lys-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCCCN)N XIZQPFCRXLUNMK-BZSNNMDCSA-N 0.000 description 10
- HYSVGEAWTGPMOA-IHRRRGAJSA-N Lys-Pro-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O HYSVGEAWTGPMOA-IHRRRGAJSA-N 0.000 description 10
- QAHFGYLFLVGBNW-DCAQKATOSA-N Met-Ala-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN QAHFGYLFLVGBNW-DCAQKATOSA-N 0.000 description 10
- MVBZBRKNZVJEKK-DTWKUNHWSA-N Met-Gly-Pro Chemical compound CSCC[C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N MVBZBRKNZVJEKK-DTWKUNHWSA-N 0.000 description 10
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 10
- QTDBZORPVYTRJU-KKXDTOCCSA-N Phe-Tyr-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O QTDBZORPVYTRJU-KKXDTOCCSA-N 0.000 description 10
- SKICPQLTOXGWGO-GARJFASQSA-N Pro-Gln-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)N)C(=O)N2CCC[C@@H]2C(=O)O SKICPQLTOXGWGO-GARJFASQSA-N 0.000 description 10
- BRKHVZNDAOMAHX-BIIVOSGPSA-N Ser-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N BRKHVZNDAOMAHX-BIIVOSGPSA-N 0.000 description 10
- XJDMUQCLVSCRSJ-VZFHVOOUSA-N Ser-Thr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O XJDMUQCLVSCRSJ-VZFHVOOUSA-N 0.000 description 10
- ZSDXEKUKQAKZFE-XAVMHZPKSA-N Ser-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N)O ZSDXEKUKQAKZFE-XAVMHZPKSA-N 0.000 description 10
- SNXUIBACCONSOH-BWBBJGPYSA-N Ser-Thr-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CO)C(O)=O SNXUIBACCONSOH-BWBBJGPYSA-N 0.000 description 10
- PMTWIUBUQRGCSB-FXQIFTODSA-N Ser-Val-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O PMTWIUBUQRGCSB-FXQIFTODSA-N 0.000 description 10
- NFMPFBCXABPALN-OWLDWWDNSA-N Thr-Ala-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O NFMPFBCXABPALN-OWLDWWDNSA-N 0.000 description 10
- KCRQEJSKXAIULJ-FJXKBIBVSA-N Thr-Gly-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O KCRQEJSKXAIULJ-FJXKBIBVSA-N 0.000 description 10
- CYCGARJWIQWPQM-YJRXYDGGSA-N Thr-Tyr-Ser Chemical compound C[C@@H](O)[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CO)C([O-])=O)CC1=CC=C(O)C=C1 CYCGARJWIQWPQM-YJRXYDGGSA-N 0.000 description 10
- VXFXIBCCVLJCJT-JYJNAYRXSA-N Tyr-Pro-Pro Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N1CCC[C@H]1C(O)=O VXFXIBCCVLJCJT-JYJNAYRXSA-N 0.000 description 10
- ZLFHAAGHGQBQQN-GUBZILKMSA-N Val-Ala-Pro Natural products CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O ZLFHAAGHGQBQQN-GUBZILKMSA-N 0.000 description 10
- IDKGBVZGNTYYCC-QXEWZRGKSA-N Val-Asn-Pro Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(O)=O IDKGBVZGNTYYCC-QXEWZRGKSA-N 0.000 description 10
- QHDXUYOYTPWCSK-RCOVLWMOSA-N Val-Asp-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N QHDXUYOYTPWCSK-RCOVLWMOSA-N 0.000 description 10
- JXGWQYWDUOWQHA-DZKIICNBSA-N Val-Gln-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N JXGWQYWDUOWQHA-DZKIICNBSA-N 0.000 description 10
- HGJRMXOWUWVUOA-GVXVVHGQSA-N Val-Leu-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N HGJRMXOWUWVUOA-GVXVVHGQSA-N 0.000 description 10
- UMPVMAYCLYMYGA-ONGXEEELSA-N Val-Leu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O UMPVMAYCLYMYGA-ONGXEEELSA-N 0.000 description 10
- KTEZUXISLQTDDQ-NHCYSSNCSA-N Val-Lys-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KTEZUXISLQTDDQ-NHCYSSNCSA-N 0.000 description 10
- 108010045514 alpha-lactorphin Proteins 0.000 description 10
- 108010008355 arginyl-glutamine Proteins 0.000 description 10
- 108010071207 serylmethionine Proteins 0.000 description 10
- 125000006850 spacer group Chemical group 0.000 description 10
- ICRHGPYYXMWHIE-LPEHRKFASA-N Arg-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ICRHGPYYXMWHIE-LPEHRKFASA-N 0.000 description 9
- 108020004705 Codon Proteins 0.000 description 9
- DZLQXIFVQFTFJY-BYPYZUCNSA-N Cys-Gly-Gly Chemical compound SC[C@H](N)C(=O)NCC(=O)NCC(O)=O DZLQXIFVQFTFJY-BYPYZUCNSA-N 0.000 description 9
- OXFOKRAFNYSREH-BJDJZHNGSA-N Cys-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CS)N OXFOKRAFNYSREH-BJDJZHNGSA-N 0.000 description 9
- NITLUESFANGEIW-BQBZGAKWSA-N Cys-Pro-Gly Chemical compound [H]N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O NITLUESFANGEIW-BQBZGAKWSA-N 0.000 description 9
- XXGQRGQPGFYECI-WDSKDSINSA-N Gly-Cys-Glu Chemical compound NCC(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CCC(O)=O XXGQRGQPGFYECI-WDSKDSINSA-N 0.000 description 9
- HPCFRQWLTRDGHT-AJNGGQMLSA-N Ile-Leu-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O HPCFRQWLTRDGHT-AJNGGQMLSA-N 0.000 description 9
- 108010065920 Insulin Lispro Proteins 0.000 description 9
- PPQRKXHCLYCBSP-IHRRRGAJSA-N Leu-Leu-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)O)N PPQRKXHCLYCBSP-IHRRRGAJSA-N 0.000 description 9
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 9
- ZQCVMVCVPFYXHZ-SRVKXCTJSA-N Lys-Asn-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN ZQCVMVCVPFYXHZ-SRVKXCTJSA-N 0.000 description 9
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 9
- FIRWJEJVFFGXSH-RYUDHWBXSA-N Phe-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 FIRWJEJVFFGXSH-RYUDHWBXSA-N 0.000 description 9
- NPLGQVKZFGJWAI-QWHCGFSZSA-N Phe-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O NPLGQVKZFGJWAI-QWHCGFSZSA-N 0.000 description 9
- ZOGICTVLQDWPER-UFYCRDLUSA-N Phe-Tyr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O ZOGICTVLQDWPER-UFYCRDLUSA-N 0.000 description 9
- XKHCJJPNXFBADI-DCAQKATOSA-N Pro-Asp-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O XKHCJJPNXFBADI-DCAQKATOSA-N 0.000 description 9
- KQNDIKOYWZTZIX-FXQIFTODSA-N Ser-Ser-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KQNDIKOYWZTZIX-FXQIFTODSA-N 0.000 description 9
- FLMYSKVSDVHLEW-SVSWQMSJSA-N Ser-Thr-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLMYSKVSDVHLEW-SVSWQMSJSA-N 0.000 description 9
- KGKWKSSSQGGYAU-SUSMZKCASA-N Thr-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N)O KGKWKSSSQGGYAU-SUSMZKCASA-N 0.000 description 9
- FLPZMPOZGYPBEN-PPCPHDFISA-N Thr-Leu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLPZMPOZGYPBEN-PPCPHDFISA-N 0.000 description 9
- UQCNIMDPYICBTR-KYNKHSRBSA-N Thr-Thr-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UQCNIMDPYICBTR-KYNKHSRBSA-N 0.000 description 9
- UEFHVUQBYNRNQC-SFJXLCSZSA-N Trp-Phe-Thr Chemical compound C([C@@H](C(=O)N[C@@H]([C@H](O)C)C(O)=O)NC(=O)[C@@H](N)CC=1C2=CC=CC=C2NC=1)C1=CC=CC=C1 UEFHVUQBYNRNQC-SFJXLCSZSA-N 0.000 description 9
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 9
- GUIYPEKUEMQBIK-JSGCOSHPSA-N Val-Tyr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)NCC(O)=O GUIYPEKUEMQBIK-JSGCOSHPSA-N 0.000 description 9
- VVIZITNVZUAEMI-DLOVCJGASA-N Val-Val-Gln Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCC(N)=O VVIZITNVZUAEMI-DLOVCJGASA-N 0.000 description 9
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 9
- 108010018691 arginyl-threonyl-arginine Proteins 0.000 description 9
- 108700004025 env Genes Proteins 0.000 description 9
- 108010027668 glycyl-alanyl-valine Proteins 0.000 description 9
- 108010062266 glycyl-glycyl-argininal Proteins 0.000 description 9
- 108010045126 glycyl-tyrosyl-glycine Proteins 0.000 description 9
- 238000003780 insertion Methods 0.000 description 9
- 230000037431 insertion Effects 0.000 description 9
- XAXHGSOBFPIRFG-LSJOCFKGSA-N Ala-Pro-His Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O XAXHGSOBFPIRFG-LSJOCFKGSA-N 0.000 description 8
- FFZJHQODAYHGPO-KZVJFYERSA-N Ala-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N FFZJHQODAYHGPO-KZVJFYERSA-N 0.000 description 8
- HDHZCEDPLTVHFZ-GUBZILKMSA-N Asn-Leu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O HDHZCEDPLTVHFZ-GUBZILKMSA-N 0.000 description 8
- ZMWOJVAXTOUHAP-ZKWXMUAHSA-N Cys-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CS)N ZMWOJVAXTOUHAP-ZKWXMUAHSA-N 0.000 description 8
- VFGADOJXRLWTBU-JBDRJPRFSA-N Cys-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CS)N VFGADOJXRLWTBU-JBDRJPRFSA-N 0.000 description 8
- YJIUYQKQBBQYHZ-ACZMJKKPSA-N Gln-Ala-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O YJIUYQKQBBQYHZ-ACZMJKKPSA-N 0.000 description 8
- KVXVVDFOZNYYKZ-DCAQKATOSA-N Gln-Gln-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KVXVVDFOZNYYKZ-DCAQKATOSA-N 0.000 description 8
- JWNZHMSRZXXGTM-XKBZYTNZSA-N Glu-Ser-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JWNZHMSRZXXGTM-XKBZYTNZSA-N 0.000 description 8
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 8
- IOQWIOPSKJOEKI-SRVKXCTJSA-N Lys-Ser-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IOQWIOPSKJOEKI-SRVKXCTJSA-N 0.000 description 8
- HUKLXYYPZWPXCC-KZVJFYERSA-N Met-Ala-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HUKLXYYPZWPXCC-KZVJFYERSA-N 0.000 description 8
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 8
- FDMKYQQYJKYCLV-GUBZILKMSA-N Pro-Pro-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 FDMKYQQYJKYCLV-GUBZILKMSA-N 0.000 description 8
- OHKFXGKHSJKKAL-NRPADANISA-N Ser-Glu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O OHKFXGKHSJKKAL-NRPADANISA-N 0.000 description 8
- UIPXCLNLUUAMJU-JBDRJPRFSA-N Ser-Ile-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UIPXCLNLUUAMJU-JBDRJPRFSA-N 0.000 description 8
- AZWNCEBQZXELEZ-FXQIFTODSA-N Ser-Pro-Ser Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O AZWNCEBQZXELEZ-FXQIFTODSA-N 0.000 description 8
- NHQVWACSJZJCGJ-FLBSBUHZSA-N Thr-Thr-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NHQVWACSJZJCGJ-FLBSBUHZSA-N 0.000 description 8
- CDKZJGMPZHPAJC-ULQDDVLXSA-N Tyr-Leu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CDKZJGMPZHPAJC-ULQDDVLXSA-N 0.000 description 8
- 108010077245 asparaginyl-proline Proteins 0.000 description 8
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 8
- 230000000875 corresponding effect Effects 0.000 description 8
- 210000005260 human cell Anatomy 0.000 description 8
- 238000010348 incorporation Methods 0.000 description 8
- 108010091871 leucylmethionine Proteins 0.000 description 8
- 239000011159 matrix material Substances 0.000 description 8
- 239000002609 medium Substances 0.000 description 8
- GVUGOAYIVIDWIO-UFWWTJHBSA-N nepidermin Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)NC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@H](CS)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CS)NC(=O)[C@H](C)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CCSC)NC(=O)[C@H](CS)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC=1NC=NC=1)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CS)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC=1NC=NC=1)NC(=O)[C@H](CO)NC(=O)[C@H](CC(C)C)NC(=O)[C@H]1N(CCC1)C(=O)[C@H](CS)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O)C(C)C)[C@@H](C)CC)C(C)C)C(C)C)C1=CC=C(O)C=C1 GVUGOAYIVIDWIO-UFWWTJHBSA-N 0.000 description 8
- 239000006228 supernatant Substances 0.000 description 8
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 7
- OBVSBEYOMDWLRJ-BFHQHQDPSA-N Ala-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N OBVSBEYOMDWLRJ-BFHQHQDPSA-N 0.000 description 7
- GNKVBRYFXYWXAB-WDSKDSINSA-N Asn-Glu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O GNKVBRYFXYWXAB-WDSKDSINSA-N 0.000 description 7
- FHETWELNCBMRMG-HJGDQZAQSA-N Asn-Leu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FHETWELNCBMRMG-HJGDQZAQSA-N 0.000 description 7
- KRXIWXCXOARFNT-ZLUOBGJFSA-N Asp-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O KRXIWXCXOARFNT-ZLUOBGJFSA-N 0.000 description 7
- BDWIZLQVVWQMTB-XKBZYTNZSA-N Cys-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CS)N)O BDWIZLQVVWQMTB-XKBZYTNZSA-N 0.000 description 7
- STHSGOZLFLFGSS-SUSMZKCASA-N Gln-Thr-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O STHSGOZLFLFGSS-SUSMZKCASA-N 0.000 description 7
- CAQXJMUDOLSBPF-SUSMZKCASA-N Glu-Thr-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAQXJMUDOLSBPF-SUSMZKCASA-N 0.000 description 7
- BUEFQXUHTUZXHR-LURJTMIESA-N Gly-Gly-Pro zwitterion Chemical compound NCC(=O)NCC(=O)N1CCC[C@H]1C(O)=O BUEFQXUHTUZXHR-LURJTMIESA-N 0.000 description 7
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 7
- CKRJBQJIGOEKMC-SRVKXCTJSA-N His-Lys-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O CKRJBQJIGOEKMC-SRVKXCTJSA-N 0.000 description 7
- MUXNCRWTWBMNHX-SRVKXCTJSA-N Lys-Leu-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O MUXNCRWTWBMNHX-SRVKXCTJSA-N 0.000 description 7
- CVAUVSOFHJKCHN-BZSNNMDCSA-N Phe-Tyr-Cys Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CS)C(O)=O)C1=CC=CC=C1 CVAUVSOFHJKCHN-BZSNNMDCSA-N 0.000 description 7
- IHCXPSYCHXFXKT-DCAQKATOSA-N Pro-Arg-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O IHCXPSYCHXFXKT-DCAQKATOSA-N 0.000 description 7
- ZCXQTRXYZOSGJR-FXQIFTODSA-N Pro-Asp-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZCXQTRXYZOSGJR-FXQIFTODSA-N 0.000 description 7
- LCUOTSLIVGSGAU-AVGNSLFASA-N Pro-His-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O LCUOTSLIVGSGAU-AVGNSLFASA-N 0.000 description 7
- FPCGZYMRFFIYIH-CIUDSAMLSA-N Ser-Lys-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O FPCGZYMRFFIYIH-CIUDSAMLSA-N 0.000 description 7
- PMCMLDNPAZUYGI-DCAQKATOSA-N Ser-Lys-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMCMLDNPAZUYGI-DCAQKATOSA-N 0.000 description 7
- MHCLIYHJRXZBGJ-AAEUAGOBSA-N Trp-Gly-Cys Chemical compound N[C@@H](CC1=CNC2=CC=CC=C12)C(=O)NCC(=O)N[C@@H](CS)C(=O)O MHCLIYHJRXZBGJ-AAEUAGOBSA-N 0.000 description 7
- LVILBTSHPTWDGE-PMVMPFDFSA-N Tyr-Trp-Lys Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCCCN)C(O)=O)C1=CC=C(O)C=C1 LVILBTSHPTWDGE-PMVMPFDFSA-N 0.000 description 7
- BQASAMYRHNCKQE-IHRRRGAJSA-N Tyr-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N BQASAMYRHNCKQE-IHRRRGAJSA-N 0.000 description 7
- DLYOEFGPYTZVSP-AEJSXWLSSA-N Val-Cys-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N1CCC[C@@H]1C(=O)O)N DLYOEFGPYTZVSP-AEJSXWLSSA-N 0.000 description 7
- JAKHAONCJJZVHT-DCAQKATOSA-N Val-Lys-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)O)N JAKHAONCJJZVHT-DCAQKATOSA-N 0.000 description 7
- 230000004913 activation Effects 0.000 description 7
- 108010060035 arginylproline Proteins 0.000 description 7
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 7
- 101150030339 env gene Proteins 0.000 description 7
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 7
- 108010081551 glycylphenylalanine Proteins 0.000 description 7
- CZPAHAKGPDUIPJ-CIUDSAMLSA-N Ala-Gln-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(O)=O CZPAHAKGPDUIPJ-CIUDSAMLSA-N 0.000 description 6
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 6
- XUCHENWTTBFODJ-FXQIFTODSA-N Ala-Met-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O XUCHENWTTBFODJ-FXQIFTODSA-N 0.000 description 6
- OSRZOHXQCUFIQG-FPMFFAJLSA-N Ala-Phe-Pro Chemical compound C([C@H](NC(=O)[C@@H]([NH3+])C)C(=O)N1[C@H](CCC1)C([O-])=O)C1=CC=CC=C1 OSRZOHXQCUFIQG-FPMFFAJLSA-N 0.000 description 6
- UFBURHXMKFQVLM-CIUDSAMLSA-N Arg-Glu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O UFBURHXMKFQVLM-CIUDSAMLSA-N 0.000 description 6
- MOGMYRUNTKYZFB-UNQGMJICSA-N Arg-Thr-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MOGMYRUNTKYZFB-UNQGMJICSA-N 0.000 description 6
- VWADICJNCPFKJS-ZLUOBGJFSA-N Asn-Ser-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O VWADICJNCPFKJS-ZLUOBGJFSA-N 0.000 description 6
- NONWUQAWAANERO-BZSNNMDCSA-N Asp-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CC(O)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 NONWUQAWAANERO-BZSNNMDCSA-N 0.000 description 6
- ZQFRDAZBTSFGGW-SRVKXCTJSA-N Asp-Ser-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZQFRDAZBTSFGGW-SRVKXCTJSA-N 0.000 description 6
- XABFFGOGKOORCG-CIUDSAMLSA-N Cys-Asp-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O XABFFGOGKOORCG-CIUDSAMLSA-N 0.000 description 6
- DZSICRGTVPDCRN-YUMQZZPRSA-N Cys-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CS)N DZSICRGTVPDCRN-YUMQZZPRSA-N 0.000 description 6
- 102000004190 Enzymes Human genes 0.000 description 6
- 108090000790 Enzymes Proteins 0.000 description 6
- VOLVNCMGXWDDQY-LPEHRKFASA-N Gln-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N)C(=O)O VOLVNCMGXWDDQY-LPEHRKFASA-N 0.000 description 6
- FLLRAEJOLZPSMN-CIUDSAMLSA-N Glu-Asn-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FLLRAEJOLZPSMN-CIUDSAMLSA-N 0.000 description 6
- ZWQVYZXPYSYPJD-RYUDHWBXSA-N Glu-Gly-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZWQVYZXPYSYPJD-RYUDHWBXSA-N 0.000 description 6
- XMPAXPSENRSOSV-RYUDHWBXSA-N Glu-Gly-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XMPAXPSENRSOSV-RYUDHWBXSA-N 0.000 description 6
- RBXSZQRSEGYDFG-GUBZILKMSA-N Glu-Lys-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O RBXSZQRSEGYDFG-GUBZILKMSA-N 0.000 description 6
- VJVAQZYGLMJPTK-QEJZJMRPSA-N Glu-Trp-Asp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N VJVAQZYGLMJPTK-QEJZJMRPSA-N 0.000 description 6
- CEXINUGNTZFNRY-BYPYZUCNSA-N Gly-Cys-Gly Chemical compound [NH3+]CC(=O)N[C@@H](CS)C(=O)NCC([O-])=O CEXINUGNTZFNRY-BYPYZUCNSA-N 0.000 description 6
- QSVMIMFAAZPCAQ-PMVVWTBXSA-N Gly-His-Thr Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QSVMIMFAAZPCAQ-PMVVWTBXSA-N 0.000 description 6
- NSVOVKWEKGEOQB-LURJTMIESA-N Gly-Pro-Gly Chemical compound NCC(=O)N1CCC[C@H]1C(=O)NCC(O)=O NSVOVKWEKGEOQB-LURJTMIESA-N 0.000 description 6
- PROLDOGUBQJNPG-RWMBFGLXSA-N His-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC2=CN=CN2)N)C(=O)O PROLDOGUBQJNPG-RWMBFGLXSA-N 0.000 description 6
- PJYSOYLLTJKZHC-GUBZILKMSA-N Leu-Asp-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O PJYSOYLLTJKZHC-GUBZILKMSA-N 0.000 description 6
- HYMLKESRWLZDBR-WEDXCCLWSA-N Leu-Gly-Thr Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O HYMLKESRWLZDBR-WEDXCCLWSA-N 0.000 description 6
- HRTRLSRYZZKPCO-BJDJZHNGSA-N Leu-Ile-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HRTRLSRYZZKPCO-BJDJZHNGSA-N 0.000 description 6
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 6
- MPOHDJKRBLVGCT-CIUDSAMLSA-N Lys-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N MPOHDJKRBLVGCT-CIUDSAMLSA-N 0.000 description 6
- VUTWYNQUSJWBHO-BZSNNMDCSA-N Lys-Leu-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VUTWYNQUSJWBHO-BZSNNMDCSA-N 0.000 description 6
- GHKXHCMRAUYLBS-CIUDSAMLSA-N Lys-Ser-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O GHKXHCMRAUYLBS-CIUDSAMLSA-N 0.000 description 6
- CTJUSALVKAWFFU-CIUDSAMLSA-N Lys-Ser-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N CTJUSALVKAWFFU-CIUDSAMLSA-N 0.000 description 6
- IEIHKHYMBIYQTH-YESZJQIVSA-N Lys-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCCN)N)C(=O)O IEIHKHYMBIYQTH-YESZJQIVSA-N 0.000 description 6
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 6
- RIYZXJVARWJLKS-KKUMJFAQSA-N Phe-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 RIYZXJVARWJLKS-KKUMJFAQSA-N 0.000 description 6
- MHHQQZIFLWFZGR-DCAQKATOSA-N Pro-Lys-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O MHHQQZIFLWFZGR-DCAQKATOSA-N 0.000 description 6
- GMJDSFYVTAMIBF-FXQIFTODSA-N Pro-Ser-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GMJDSFYVTAMIBF-FXQIFTODSA-N 0.000 description 6
- GYXVUTAOICLGKJ-ACZMJKKPSA-N Ser-Glu-Cys Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N GYXVUTAOICLGKJ-ACZMJKKPSA-N 0.000 description 6
- HHJFMHQYEAAOBM-ZLUOBGJFSA-N Ser-Ser-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O HHJFMHQYEAAOBM-ZLUOBGJFSA-N 0.000 description 6
- UQGAAZXSCGWMFU-UBHSHLNASA-N Ser-Trp-Asp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CO)N UQGAAZXSCGWMFU-UBHSHLNASA-N 0.000 description 6
- LMMDEZPNUTZJAY-GCJQMDKQSA-N Thr-Asp-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O LMMDEZPNUTZJAY-GCJQMDKQSA-N 0.000 description 6
- VGNLMPBYWWNQFS-ZEILLAHLSA-N Thr-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O VGNLMPBYWWNQFS-ZEILLAHLSA-N 0.000 description 6
- LHHDBONOFZDWMW-AAEUAGOBSA-N Trp-Asp-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N LHHDBONOFZDWMW-AAEUAGOBSA-N 0.000 description 6
- VDUJEEQMRQCLHB-YTQUADARSA-N Trp-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O VDUJEEQMRQCLHB-YTQUADARSA-N 0.000 description 6
- SMLCYZYQFRTLCO-UWJYBYFXSA-N Tyr-Cys-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CS)C(=O)N[C@@H](C)C(O)=O SMLCYZYQFRTLCO-UWJYBYFXSA-N 0.000 description 6
- YLRLHDFMMWDYTK-KKUMJFAQSA-N Tyr-Cys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 YLRLHDFMMWDYTK-KKUMJFAQSA-N 0.000 description 6
- LTSIAOZUVISRAQ-QWRGUYRKSA-N Tyr-Gly-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N)O LTSIAOZUVISRAQ-QWRGUYRKSA-N 0.000 description 6
- NWEGIYMHTZXVBP-JSGCOSHPSA-N Tyr-Val-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O NWEGIYMHTZXVBP-JSGCOSHPSA-N 0.000 description 6
- WFENBJPLZMPVAX-XVKPBYJWSA-N Val-Gly-Glu Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O WFENBJPLZMPVAX-XVKPBYJWSA-N 0.000 description 6
- LYERIXUFCYVFFX-GVXVVHGQSA-N Val-Leu-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LYERIXUFCYVFFX-GVXVVHGQSA-N 0.000 description 6
- AEFJNECXZCODJM-UWVGGRQHSA-N Val-Val-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](C(C)C)C(=O)NCC([O-])=O AEFJNECXZCODJM-UWVGGRQHSA-N 0.000 description 6
- 108010093581 aspartyl-proline Proteins 0.000 description 6
- 230000029087 digestion Effects 0.000 description 6
- 239000013613 expression plasmid Substances 0.000 description 6
- 108010049041 glutamylalanine Proteins 0.000 description 6
- 108010054155 lysyllysine Proteins 0.000 description 6
- 108010079317 prolyl-tyrosine Proteins 0.000 description 6
- 108010048818 seryl-histidine Proteins 0.000 description 6
- 238000004448 titration Methods 0.000 description 6
- 108010079202 tyrosyl-alanyl-cysteine Proteins 0.000 description 6
- YAXNATKKPOWVCP-ZLUOBGJFSA-N Ala-Asn-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O YAXNATKKPOWVCP-ZLUOBGJFSA-N 0.000 description 5
- FMYQECOAIFGQGU-CYDGBPFRSA-N Arg-Val-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FMYQECOAIFGQGU-CYDGBPFRSA-N 0.000 description 5
- SQZIAWGBBUSSPJ-ZKWXMUAHSA-N Asn-Cys-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)N)N SQZIAWGBBUSSPJ-ZKWXMUAHSA-N 0.000 description 5
- AITGTTNYKAWKDR-CIUDSAMLSA-N Asn-His-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O AITGTTNYKAWKDR-CIUDSAMLSA-N 0.000 description 5
- YRTOMUMWSTUQAX-FXQIFTODSA-N Asn-Pro-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O YRTOMUMWSTUQAX-FXQIFTODSA-N 0.000 description 5
- BIGRHVNFFJTHEB-UBHSHLNASA-N Asn-Trp-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(O)=O BIGRHVNFFJTHEB-UBHSHLNASA-N 0.000 description 5
- HPNDBHLITCHRSO-WHFBIAKZSA-N Asp-Ala-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)NCC(O)=O HPNDBHLITCHRSO-WHFBIAKZSA-N 0.000 description 5
- HICVMZCGVFKTPM-BQBZGAKWSA-N Asp-Pro-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O HICVMZCGVFKTPM-BQBZGAKWSA-N 0.000 description 5
- 102000001301 EGF receptor Human genes 0.000 description 5
- 108060006698 EGF receptor Proteins 0.000 description 5
- KZEUVLLVULIPNX-GUBZILKMSA-N Gln-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N KZEUVLLVULIPNX-GUBZILKMSA-N 0.000 description 5
- QKCZZAZNMMVICF-DCAQKATOSA-N Gln-Leu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O QKCZZAZNMMVICF-DCAQKATOSA-N 0.000 description 5
- ITVBKCZZLJUUHI-HTUGSXCWSA-N Glu-Phe-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ITVBKCZZLJUUHI-HTUGSXCWSA-N 0.000 description 5
- QIZJOTQTCAGKPU-KWQFWETISA-N Gly-Ala-Tyr Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 QIZJOTQTCAGKPU-KWQFWETISA-N 0.000 description 5
- DTPOVRRYXPJJAZ-FJXKBIBVSA-N Gly-Arg-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N DTPOVRRYXPJJAZ-FJXKBIBVSA-N 0.000 description 5
- CQZDZKRHFWJXDF-WDSKDSINSA-N Gly-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)CN CQZDZKRHFWJXDF-WDSKDSINSA-N 0.000 description 5
- GAAHQHNCMIAYEX-UWVGGRQHSA-N Gly-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GAAHQHNCMIAYEX-UWVGGRQHSA-N 0.000 description 5
- MYXNLWDWWOTERK-BHNWBGBOSA-N Gly-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN)O MYXNLWDWWOTERK-BHNWBGBOSA-N 0.000 description 5
- UKTUOMWSJPXODT-GUDRVLHUSA-N Ile-Asn-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N UKTUOMWSJPXODT-GUDRVLHUSA-N 0.000 description 5
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 5
- FIICHHJDINDXKG-IHPCNDPISA-N Leu-Lys-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O FIICHHJDINDXKG-IHPCNDPISA-N 0.000 description 5
- JVTYXRRFZCEPPK-RHYQMDGZSA-N Leu-Met-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC(C)C)N)O JVTYXRRFZCEPPK-RHYQMDGZSA-N 0.000 description 5
- INCJJHQRZGQLFC-KBPBESRZSA-N Leu-Phe-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O INCJJHQRZGQLFC-KBPBESRZSA-N 0.000 description 5
- SBANPBVRHYIMRR-GARJFASQSA-N Leu-Ser-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N SBANPBVRHYIMRR-GARJFASQSA-N 0.000 description 5
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 5
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 5
- WINFHLHJTRGLCV-BZSNNMDCSA-N Lys-Tyr-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=C(O)C=C1 WINFHLHJTRGLCV-BZSNNMDCSA-N 0.000 description 5
- CULGJGUDIJATIP-STQMWFEESA-N Met-Tyr-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 CULGJGUDIJATIP-STQMWFEESA-N 0.000 description 5
- PESQCPHRXOFIPX-UHFFFAOYSA-N N-L-methionyl-L-tyrosine Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-UHFFFAOYSA-N 0.000 description 5
- JIYJYFIXQTYDNF-YDHLFZDLSA-N Phe-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N JIYJYFIXQTYDNF-YDHLFZDLSA-N 0.000 description 5
- DEDANIDYQAPTFI-IHRRRGAJSA-N Pro-Asp-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O DEDANIDYQAPTFI-IHRRRGAJSA-N 0.000 description 5
- GQLOZEMWEBDEAY-NAKRPEOUSA-N Pro-Cys-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GQLOZEMWEBDEAY-NAKRPEOUSA-N 0.000 description 5
- SUENWIFTSTWUKD-AVGNSLFASA-N Pro-Leu-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O SUENWIFTSTWUKD-AVGNSLFASA-N 0.000 description 5
- IIRBTQHFVNGPMQ-AVGNSLFASA-N Pro-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 IIRBTQHFVNGPMQ-AVGNSLFASA-N 0.000 description 5
- BCAVNDNYOGTQMQ-AAEUAGOBSA-N Ser-Trp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O BCAVNDNYOGTQMQ-AAEUAGOBSA-N 0.000 description 5
- 102100022791 Sodium/potassium-transporting ATPase subunit beta-2 Human genes 0.000 description 5
- 101710193880 Sodium/potassium-transporting ATPase subunit beta-2 Proteins 0.000 description 5
- LVHHEVGYAZGXDE-KDXUFGMBSA-N Thr-Ala-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(=O)O)N)O LVHHEVGYAZGXDE-KDXUFGMBSA-N 0.000 description 5
- CAJFZCICSVBOJK-SHGPDSBTSA-N Thr-Ala-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAJFZCICSVBOJK-SHGPDSBTSA-N 0.000 description 5
- RKDFEMGVMMYYNG-WDCWCFNPSA-N Thr-Gln-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O RKDFEMGVMMYYNG-WDCWCFNPSA-N 0.000 description 5
- ZOCJFNXUVSGBQI-HSHDSVGOSA-N Thr-Trp-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N)O ZOCJFNXUVSGBQI-HSHDSVGOSA-N 0.000 description 5
- LVRFMARKDGGZMX-IZPVPAKOSA-N Thr-Tyr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=C(O)C=C1 LVRFMARKDGGZMX-IZPVPAKOSA-N 0.000 description 5
- QJBWZNTWJSZUOY-UWJYBYFXSA-N Tyr-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N QJBWZNTWJSZUOY-UWJYBYFXSA-N 0.000 description 5
- MICSYKFECRFCTJ-IHRRRGAJSA-N Tyr-Arg-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O MICSYKFECRFCTJ-IHRRRGAJSA-N 0.000 description 5
- ILTXFANLDMJWPR-SIUGBPQLSA-N Tyr-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N ILTXFANLDMJWPR-SIUGBPQLSA-N 0.000 description 5
- WBUOKGBHGDPYMH-GUBZILKMSA-N Val-Cys-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H](N)C(C)C WBUOKGBHGDPYMH-GUBZILKMSA-N 0.000 description 5
- GVNLOVJNNDZUHS-RHYQMDGZSA-N Val-Thr-Lys Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O GVNLOVJNNDZUHS-RHYQMDGZSA-N 0.000 description 5
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 5
- 108010047857 aspartylglycine Proteins 0.000 description 5
- 108020001507 fusion proteins Proteins 0.000 description 5
- 102000037865 fusion proteins Human genes 0.000 description 5
- 230000006872 improvement Effects 0.000 description 5
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 5
- 101150066555 lacZ gene Proteins 0.000 description 5
- 108010045397 lysyl-tyrosyl-lysine Proteins 0.000 description 5
- 230000035772 mutation Effects 0.000 description 5
- 108091008146 restriction endonucleases Proteins 0.000 description 5
- 210000002966 serum Anatomy 0.000 description 5
- QRBLKGHRWFGINE-UGWAGOLRSA-N 2-[2-[2-[[2-[[4-[[2-[[6-amino-2-[3-amino-1-[(2,3-diamino-3-oxopropyl)amino]-3-oxopropyl]-5-methylpyrimidine-4-carbonyl]amino]-3-[(2r,3s,4s,5s,6s)-3-[(2s,3r,4r,5s)-4-carbamoyl-3,4,5-trihydroxy-6-(hydroxymethyl)oxan-2-yl]oxy-4,5-dihydroxy-6-(hydroxymethyl)- Chemical compound N=1C(C=2SC=C(N=2)C(N)=O)CSC=1CCNC(=O)C(C(C)=O)NC(=O)C(C)C(O)C(C)NC(=O)C(C(O[C@H]1[C@@]([C@@H](O)[C@H](O)[C@H](CO)O1)(C)O[C@H]1[C@@H]([C@](O)([C@@H](O)C(CO)O1)C(N)=O)O)C=1NC=NC=1)NC(=O)C1=NC(C(CC(N)=O)NCC(N)C(N)=O)=NC(N)=C1C QRBLKGHRWFGINE-UGWAGOLRSA-N 0.000 description 4
- MBWYUTNBYSSUIQ-HERUPUMHSA-N Ala-Asn-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N MBWYUTNBYSSUIQ-HERUPUMHSA-N 0.000 description 4
- KRHRBKYBJXMYBB-WHFBIAKZSA-N Ala-Cys-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O KRHRBKYBJXMYBB-WHFBIAKZSA-N 0.000 description 4
- NBTGEURICRTMGL-WHFBIAKZSA-N Ala-Gly-Ser Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O NBTGEURICRTMGL-WHFBIAKZSA-N 0.000 description 4
- IPZQNYYAYVRKKK-FXQIFTODSA-N Ala-Pro-Ala Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IPZQNYYAYVRKKK-FXQIFTODSA-N 0.000 description 4
- KTXKIYXZQFWJKB-VZFHVOOUSA-N Ala-Thr-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O KTXKIYXZQFWJKB-VZFHVOOUSA-N 0.000 description 4
- ZCUFMRIQCPNOHZ-NRPADANISA-N Ala-Val-Gln Chemical compound C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N ZCUFMRIQCPNOHZ-NRPADANISA-N 0.000 description 4
- QIWYWCYNUMJBTC-CIUDSAMLSA-N Arg-Cys-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(O)=O QIWYWCYNUMJBTC-CIUDSAMLSA-N 0.000 description 4
- UBGGJTMETLEXJD-DCAQKATOSA-N Asn-Leu-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O UBGGJTMETLEXJD-DCAQKATOSA-N 0.000 description 4
- VLDRQOHCMKCXLY-SRVKXCTJSA-N Asn-Ser-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VLDRQOHCMKCXLY-SRVKXCTJSA-N 0.000 description 4
- HMQDRBKQMLRCCG-GMOBBJLQSA-N Asp-Arg-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HMQDRBKQMLRCCG-GMOBBJLQSA-N 0.000 description 4
- POTCZYQVVNXUIG-BQBZGAKWSA-N Asp-Gly-Pro Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O POTCZYQVVNXUIG-BQBZGAKWSA-N 0.000 description 4
- UMHUHHJMEXNSIV-CIUDSAMLSA-N Asp-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UMHUHHJMEXNSIV-CIUDSAMLSA-N 0.000 description 4
- 241000713826 Avian leukosis virus Species 0.000 description 4
- SRIRHERUAMYIOQ-CIUDSAMLSA-N Cys-Leu-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SRIRHERUAMYIOQ-CIUDSAMLSA-N 0.000 description 4
- KVGPYKUIHZJWGA-BQBZGAKWSA-N Cys-Met-Gly Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O KVGPYKUIHZJWGA-BQBZGAKWSA-N 0.000 description 4
- 239000006144 Dulbecco's modified Eagle's medium Substances 0.000 description 4
- 101710177291 Gag polyprotein Proteins 0.000 description 4
- AAOBFSKXAVIORT-GUBZILKMSA-N Gln-Asn-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O AAOBFSKXAVIORT-GUBZILKMSA-N 0.000 description 4
- VSXBYIJUAXPAAL-WDSKDSINSA-N Gln-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O VSXBYIJUAXPAAL-WDSKDSINSA-N 0.000 description 4
- SOYWRINXUSUWEQ-DLOVCJGASA-N Glu-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 4
- PYUCNHJQQVSPGN-BQBZGAKWSA-N Gly-Arg-Cys Chemical compound C(C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN)CN=C(N)N PYUCNHJQQVSPGN-BQBZGAKWSA-N 0.000 description 4
- FMNHBTKMRFVGRO-FOHZUACHSA-N Gly-Asn-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)CN FMNHBTKMRFVGRO-FOHZUACHSA-N 0.000 description 4
- SXJHOPPTOJACOA-QXEWZRGKSA-N Gly-Ile-Arg Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SXJHOPPTOJACOA-QXEWZRGKSA-N 0.000 description 4
- NSTUFLGQJCOCDL-UWVGGRQHSA-N Gly-Leu-Arg Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NSTUFLGQJCOCDL-UWVGGRQHSA-N 0.000 description 4
- YIFUFYZELCMPJP-YUMQZZPRSA-N Gly-Leu-Cys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(O)=O YIFUFYZELCMPJP-YUMQZZPRSA-N 0.000 description 4
- TVUWMSBGMVAHSJ-KBPBESRZSA-N Gly-Leu-Phe Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TVUWMSBGMVAHSJ-KBPBESRZSA-N 0.000 description 4
- VEPBEGNDJYANCF-QWRGUYRKSA-N Gly-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN VEPBEGNDJYANCF-QWRGUYRKSA-N 0.000 description 4
- DBUNZBWUWCIELX-JHEQGTHGSA-N Gly-Thr-Glu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DBUNZBWUWCIELX-JHEQGTHGSA-N 0.000 description 4
- JQFILXICXLDTRR-FBCQKBJTSA-N Gly-Thr-Gly Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)NCC(O)=O JQFILXICXLDTRR-FBCQKBJTSA-N 0.000 description 4
- LPFBXFILACZHIB-LAEOZQHASA-N Ile-Gly-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)O)C(=O)O)N LPFBXFILACZHIB-LAEOZQHASA-N 0.000 description 4
- TWYOYAKMLHWMOJ-ZPFDUUQYSA-N Ile-Leu-Asn Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O TWYOYAKMLHWMOJ-ZPFDUUQYSA-N 0.000 description 4
- CUXRXAIAVYLVFD-ULQDDVLXSA-N Leu-Arg-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 CUXRXAIAVYLVFD-ULQDDVLXSA-N 0.000 description 4
- FAELBUXXFQLUAX-AJNGGQMLSA-N Leu-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(C)C FAELBUXXFQLUAX-AJNGGQMLSA-N 0.000 description 4
- JLWZLIQRYCTYBD-IHRRRGAJSA-N Leu-Lys-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JLWZLIQRYCTYBD-IHRRRGAJSA-N 0.000 description 4
- ODRREERHVHMIPT-OEAJRASXSA-N Leu-Thr-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ODRREERHVHMIPT-OEAJRASXSA-N 0.000 description 4
- MVJRBCJCRYGCKV-GVXVVHGQSA-N Leu-Val-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O MVJRBCJCRYGCKV-GVXVVHGQSA-N 0.000 description 4
- OVAOHZIOUBEQCJ-IHRRRGAJSA-N Lys-Leu-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OVAOHZIOUBEQCJ-IHRRRGAJSA-N 0.000 description 4
- UIJVKVHLCQSPOJ-XIRDDKMYSA-N Lys-Ser-Trp Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O UIJVKVHLCQSPOJ-XIRDDKMYSA-N 0.000 description 4
- GILLQRYAWOMHED-DCAQKATOSA-N Lys-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN GILLQRYAWOMHED-DCAQKATOSA-N 0.000 description 4
- 101710125418 Major capsid protein Proteins 0.000 description 4
- BMHIFARYXOJDLD-WPRPVWTQSA-N Met-Gly-Val Chemical compound [H]N[C@@H](CCSC)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O BMHIFARYXOJDLD-WPRPVWTQSA-N 0.000 description 4
- RGMLUHANLDVMPB-ULQDDVLXSA-N Phe-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N RGMLUHANLDVMPB-ULQDDVLXSA-N 0.000 description 4
- LTQCLFMNABRKSH-UHFFFAOYSA-N Phleomycin Natural products N=1C(C=2SC=C(N=2)C(N)=O)CSC=1CCNC(=O)C(C(O)C)NC(=O)C(C)C(O)C(C)NC(=O)C(C(OC1C(C(O)C(O)C(CO)O1)OC1C(C(OC(N)=O)C(O)C(CO)O1)O)C=1NC=NC=1)NC(=O)C1=NC(C(CC(N)=O)NCC(N)C(N)=O)=NC(N)=C1C LTQCLFMNABRKSH-UHFFFAOYSA-N 0.000 description 4
- 108010035235 Phleomycins Proteins 0.000 description 4
- YKQNVTOIYFQMLW-IHRRRGAJSA-N Pro-Cys-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 YKQNVTOIYFQMLW-IHRRRGAJSA-N 0.000 description 4
- KWMZPPWYBVZIER-XGEHTFHBSA-N Pro-Ser-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KWMZPPWYBVZIER-XGEHTFHBSA-N 0.000 description 4
- BNUKRHFCHHLIGR-JYJNAYRXSA-N Pro-Trp-Asp Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CC(=O)O)C(=O)O BNUKRHFCHHLIGR-JYJNAYRXSA-N 0.000 description 4
- UGGWCAFQPKANMW-FXQIFTODSA-N Ser-Met-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O UGGWCAFQPKANMW-FXQIFTODSA-N 0.000 description 4
- FLONGDPORFIVQW-XGEHTFHBSA-N Ser-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FLONGDPORFIVQW-XGEHTFHBSA-N 0.000 description 4
- HSWXBJCBYSWBPT-GUBZILKMSA-N Ser-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)C(C)C)C(O)=O HSWXBJCBYSWBPT-GUBZILKMSA-N 0.000 description 4
- TWLMXDWFVNEFFK-FJXKBIBVSA-N Thr-Arg-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O TWLMXDWFVNEFFK-FJXKBIBVSA-N 0.000 description 4
- ZQUKYJOKQBRBCS-GLLZPBPUSA-N Thr-Gln-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O ZQUKYJOKQBRBCS-GLLZPBPUSA-N 0.000 description 4
- LIXBDERDAGNVAV-XKBZYTNZSA-N Thr-Gln-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O LIXBDERDAGNVAV-XKBZYTNZSA-N 0.000 description 4
- WYKJENSCCRJLRC-ZDLURKLDSA-N Thr-Gly-Cys Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N)O WYKJENSCCRJLRC-ZDLURKLDSA-N 0.000 description 4
- LECUEEHKUFYOOV-ZJDVBMNYSA-N Thr-Thr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)[C@@H](C)O LECUEEHKUFYOOV-ZJDVBMNYSA-N 0.000 description 4
- 108700019146 Transgenes Proteins 0.000 description 4
- YXONONCLMLHWJX-SZMVWBNQSA-N Trp-Glu-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O)=CNC2=C1 YXONONCLMLHWJX-SZMVWBNQSA-N 0.000 description 4
- 102000046255 Type III Sodium-Phosphate Cotransporter Proteins Human genes 0.000 description 4
- 108091006286 Type III sodium-phosphate co-transporters Proteins 0.000 description 4
- GULIUBBXCYPDJU-CQDKDKBSSA-N Tyr-Leu-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 GULIUBBXCYPDJU-CQDKDKBSSA-N 0.000 description 4
- NMANTMWGQZASQN-QXEWZRGKSA-N Val-Arg-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N NMANTMWGQZASQN-QXEWZRGKSA-N 0.000 description 4
- FXVDGDZRYLFQKY-WPRPVWTQSA-N Val-Gly-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)C(C)C FXVDGDZRYLFQKY-WPRPVWTQSA-N 0.000 description 4
- BVWPHWLFGRCECJ-JSGCOSHPSA-N Val-Gly-Tyr Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N BVWPHWLFGRCECJ-JSGCOSHPSA-N 0.000 description 4
- UEXPMFIAZZHEAD-HSHDSVGOSA-N Val-Thr-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](C(C)C)N)O UEXPMFIAZZHEAD-HSHDSVGOSA-N 0.000 description 4
- 238000004458 analytical method Methods 0.000 description 4
- 238000010367 cloning Methods 0.000 description 4
- 108010004073 cysteinylcysteine Proteins 0.000 description 4
- 239000013604 expression vector Substances 0.000 description 4
- 108010078144 glutaminyl-glycine Proteins 0.000 description 4
- 238000003119 immunoblot Methods 0.000 description 4
- 238000004519 manufacturing process Methods 0.000 description 4
- 239000012528 membrane Substances 0.000 description 4
- 230000004048 modification Effects 0.000 description 4
- 238000012986 modification Methods 0.000 description 4
- 239000002243 precursor Substances 0.000 description 4
- 108010020532 tyrosyl-proline Proteins 0.000 description 4
- 238000005199 ultracentrifugation Methods 0.000 description 4
- LZRNYBIJOSKKRJ-XVYDVKMFSA-N Ala-Asp-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LZRNYBIJOSKKRJ-XVYDVKMFSA-N 0.000 description 3
- PUBLUECXJRHTBK-ACZMJKKPSA-N Ala-Glu-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O PUBLUECXJRHTBK-ACZMJKKPSA-N 0.000 description 3
- AWZKCUCQJNTBAD-SRVKXCTJSA-N Ala-Leu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN AWZKCUCQJNTBAD-SRVKXCTJSA-N 0.000 description 3
- VRTOMXFZHGWHIJ-KZVJFYERSA-N Ala-Thr-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VRTOMXFZHGWHIJ-KZVJFYERSA-N 0.000 description 3
- XMIAMUXIMWREBJ-HERUPUMHSA-N Ala-Trp-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(=O)N)C(=O)O)N XMIAMUXIMWREBJ-HERUPUMHSA-N 0.000 description 3
- IASNWHAGGYTEKX-IUCAKERBSA-N Arg-Arg-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(O)=O IASNWHAGGYTEKX-IUCAKERBSA-N 0.000 description 3
- PNIGSVZJNVUVJA-BQBZGAKWSA-N Arg-Gly-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O PNIGSVZJNVUVJA-BQBZGAKWSA-N 0.000 description 3
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 3
- POOCJCRBHHMAOS-FXQIFTODSA-N Asn-Arg-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O POOCJCRBHHMAOS-FXQIFTODSA-N 0.000 description 3
- XWFPGQVLOVGSLU-CIUDSAMLSA-N Asn-Gln-Arg Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N XWFPGQVLOVGSLU-CIUDSAMLSA-N 0.000 description 3
- NECWUSYTYSIFNC-DLOVCJGASA-N Asp-Ala-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 NECWUSYTYSIFNC-DLOVCJGASA-N 0.000 description 3
- RYKWOUUZJFSJOH-FXQIFTODSA-N Asp-Gln-Glu Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N RYKWOUUZJFSJOH-FXQIFTODSA-N 0.000 description 3
- RQHLMGCXCZUOGT-ZPFDUUQYSA-N Asp-Leu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RQHLMGCXCZUOGT-ZPFDUUQYSA-N 0.000 description 3
- FAUPLTGRUBTXNU-FXQIFTODSA-N Asp-Pro-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O FAUPLTGRUBTXNU-FXQIFTODSA-N 0.000 description 3
- JJQGZGOEDSSHTE-FOHZUACHSA-N Asp-Thr-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JJQGZGOEDSSHTE-FOHZUACHSA-N 0.000 description 3
- UXIPUCUHQBIQOS-SRVKXCTJSA-N Asp-Tyr-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O UXIPUCUHQBIQOS-SRVKXCTJSA-N 0.000 description 3
- SBMGKDLRJLYZCU-BIIVOSGPSA-N Cys-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CS)N)C(=O)O SBMGKDLRJLYZCU-BIIVOSGPSA-N 0.000 description 3
- JEKIARHEWURQRJ-BZSNNMDCSA-N Cys-Phe-Tyr Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)NC(=O)[C@H](CS)N JEKIARHEWURQRJ-BZSNNMDCSA-N 0.000 description 3
- WZJLBUPPZRZNTO-CIUDSAMLSA-N Cys-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N WZJLBUPPZRZNTO-CIUDSAMLSA-N 0.000 description 3
- DQGIAOGALAQBGK-BWBBJGPYSA-N Cys-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N)O DQGIAOGALAQBGK-BWBBJGPYSA-N 0.000 description 3
- FKXCBKCOSVIGCT-AVGNSLFASA-N Gln-Lys-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O FKXCBKCOSVIGCT-AVGNSLFASA-N 0.000 description 3
- KKCUFHUTMKQQCF-SRVKXCTJSA-N Glu-Arg-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O KKCUFHUTMKQQCF-SRVKXCTJSA-N 0.000 description 3
- VSMQDIVEBXPKRT-QEJZJMRPSA-N Glu-Cys-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)O)N VSMQDIVEBXPKRT-QEJZJMRPSA-N 0.000 description 3
- NUSWUSKZRCGFEX-FXQIFTODSA-N Glu-Glu-Cys Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CS)C(O)=O NUSWUSKZRCGFEX-FXQIFTODSA-N 0.000 description 3
- QYPKJXSMLMREKF-BPUTZDHNSA-N Glu-Glu-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)N QYPKJXSMLMREKF-BPUTZDHNSA-N 0.000 description 3
- BYYNJRSNDARRBX-YFKPBYRVSA-N Gly-Gln-Gly Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O BYYNJRSNDARRBX-YFKPBYRVSA-N 0.000 description 3
- HQRHFUYMGCHHJS-LURJTMIESA-N Gly-Gly-Arg Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N HQRHFUYMGCHHJS-LURJTMIESA-N 0.000 description 3
- QSQXZZCGPXQBPP-BQBZGAKWSA-N Gly-Pro-Cys Chemical compound C1C[C@H](N(C1)C(=O)CN)C(=O)N[C@@H](CS)C(=O)O QSQXZZCGPXQBPP-BQBZGAKWSA-N 0.000 description 3
- WRFOZIJRODPLIA-QWRGUYRKSA-N Gly-Tyr-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN)O WRFOZIJRODPLIA-QWRGUYRKSA-N 0.000 description 3
- HQSKKSLNLSTONK-JTQLQIEISA-N Gly-Tyr-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 HQSKKSLNLSTONK-JTQLQIEISA-N 0.000 description 3
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 3
- 101000851176 Homo sapiens Pro-epidermal growth factor Proteins 0.000 description 3
- VZIFYHYNQDIPLI-HJWJTTGWSA-N Ile-Arg-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N VZIFYHYNQDIPLI-HJWJTTGWSA-N 0.000 description 3
- FJWALBCCVIHZBS-QXEWZRGKSA-N Ile-Met-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)NCC(=O)O)N FJWALBCCVIHZBS-QXEWZRGKSA-N 0.000 description 3
- RQJUKVXWAKJDBW-SVSWQMSJSA-N Ile-Ser-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N RQJUKVXWAKJDBW-SVSWQMSJSA-N 0.000 description 3
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 3
- IGUOAYLTQJLPPD-DCAQKATOSA-N Leu-Asn-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IGUOAYLTQJLPPD-DCAQKATOSA-N 0.000 description 3
- DLCOFDAHNMMQPP-SRVKXCTJSA-N Leu-Asp-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DLCOFDAHNMMQPP-SRVKXCTJSA-N 0.000 description 3
- RRSLQOLASISYTB-CIUDSAMLSA-N Leu-Cys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(O)=O RRSLQOLASISYTB-CIUDSAMLSA-N 0.000 description 3
- PPBKJAQJAUHZKX-SRVKXCTJSA-N Leu-Cys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CC(C)C PPBKJAQJAUHZKX-SRVKXCTJSA-N 0.000 description 3
- DLCXCECTCPKKCD-GUBZILKMSA-N Leu-Gln-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DLCXCECTCPKKCD-GUBZILKMSA-N 0.000 description 3
- HQUXQAMSWFIRET-AVGNSLFASA-N Leu-Glu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HQUXQAMSWFIRET-AVGNSLFASA-N 0.000 description 3
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 3
- BMVFXOQHDQZAQU-DCAQKATOSA-N Leu-Pro-Asp Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N BMVFXOQHDQZAQU-DCAQKATOSA-N 0.000 description 3
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 3
- GZRABTMNWJXFMH-UVOCVTCTSA-N Leu-Thr-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZRABTMNWJXFMH-UVOCVTCTSA-N 0.000 description 3
- AAKRWBIIGKPOKQ-ONGXEEELSA-N Leu-Val-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 3
- GCMWRRQAKQXDED-IUCAKERBSA-N Lys-Glu-Gly Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)N[C@@H](CCC([O-])=O)C(=O)NCC([O-])=O GCMWRRQAKQXDED-IUCAKERBSA-N 0.000 description 3
- UWHCKWNPWKTMBM-WDCWCFNPSA-N Lys-Thr-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O UWHCKWNPWKTMBM-WDCWCFNPSA-N 0.000 description 3
- TUZSWDCTCGTVDJ-PJODQICGSA-N Met-Trp-Ala Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCSC)C(=O)N[C@@H](C)C(O)=O)=CNC2=C1 TUZSWDCTCGTVDJ-PJODQICGSA-N 0.000 description 3
- 241001529936 Murinae Species 0.000 description 3
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 3
- LLGTYVHITPVGKR-RYUDHWBXSA-N Phe-Gln-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O LLGTYVHITPVGKR-RYUDHWBXSA-N 0.000 description 3
- SSWJYJHXQOYTSP-SRVKXCTJSA-N Pro-His-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O SSWJYJHXQOYTSP-SRVKXCTJSA-N 0.000 description 3
- BRJGUPWVFXKBQI-XUXIUFHCSA-N Pro-Leu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRJGUPWVFXKBQI-XUXIUFHCSA-N 0.000 description 3
- RMODQFBNDDENCP-IHRRRGAJSA-N Pro-Lys-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O RMODQFBNDDENCP-IHRRRGAJSA-N 0.000 description 3
- LEIKGVHQTKHOLM-IUCAKERBSA-N Pro-Pro-Gly Chemical compound OC(=O)CNC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 LEIKGVHQTKHOLM-IUCAKERBSA-N 0.000 description 3
- AJNGQVUFQUVRQT-JYJNAYRXSA-N Pro-Pro-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1N(CCC1)C(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 AJNGQVUFQUVRQT-JYJNAYRXSA-N 0.000 description 3
- MKGIILKDUGDRRO-FXQIFTODSA-N Pro-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 MKGIILKDUGDRRO-FXQIFTODSA-N 0.000 description 3
- LEBTWGWVUVJNTA-FKBYEOEOSA-N Pro-Trp-Phe Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CC4=CC=CC=C4)C(=O)O LEBTWGWVUVJNTA-FKBYEOEOSA-N 0.000 description 3
- FIDNSJUXESUDOV-JYJNAYRXSA-N Pro-Tyr-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O FIDNSJUXESUDOV-JYJNAYRXSA-N 0.000 description 3
- RDFQNDHEHVSONI-ZLUOBGJFSA-N Ser-Asn-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RDFQNDHEHVSONI-ZLUOBGJFSA-N 0.000 description 3
- UBRMZSHOOIVJPW-SRVKXCTJSA-N Ser-Leu-Lys Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O UBRMZSHOOIVJPW-SRVKXCTJSA-N 0.000 description 3
- YUJLIIRMIAGMCQ-CIUDSAMLSA-N Ser-Leu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YUJLIIRMIAGMCQ-CIUDSAMLSA-N 0.000 description 3
- DFTCYYILCSQGIZ-GCJQMDKQSA-N Thr-Ala-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DFTCYYILCSQGIZ-GCJQMDKQSA-N 0.000 description 3
- QQWNRERCGGZOKG-WEDXCCLWSA-N Thr-Gly-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O QQWNRERCGGZOKG-WEDXCCLWSA-N 0.000 description 3
- IWAVRIPRTCJAQO-HSHDSVGOSA-N Thr-Pro-Trp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O IWAVRIPRTCJAQO-HSHDSVGOSA-N 0.000 description 3
- BBPCSGKKPJUYRB-UVOCVTCTSA-N Thr-Thr-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O BBPCSGKKPJUYRB-UVOCVTCTSA-N 0.000 description 3
- QGVBFDIREUUSHX-IFFSRLJSSA-N Thr-Val-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O QGVBFDIREUUSHX-IFFSRLJSSA-N 0.000 description 3
- HDQJVXVRGJUDML-UBHSHLNASA-N Trp-Cys-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)N)C(=O)O)N HDQJVXVRGJUDML-UBHSHLNASA-N 0.000 description 3
- YTVJTXJTNRWJCR-JBACZVJFSA-N Trp-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N YTVJTXJTNRWJCR-JBACZVJFSA-N 0.000 description 3
- RCLOWEZASFJFEX-KKUMJFAQSA-N Tyr-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 RCLOWEZASFJFEX-KKUMJFAQSA-N 0.000 description 3
- HVHJYXDXRIWELT-RYUDHWBXSA-N Tyr-Glu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O HVHJYXDXRIWELT-RYUDHWBXSA-N 0.000 description 3
- LMKKMCGTDANZTR-BZSNNMDCSA-N Tyr-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=C(O)C=C1 LMKKMCGTDANZTR-BZSNNMDCSA-N 0.000 description 3
- WOCYUGQDXPTQPY-FXQIFTODSA-N Val-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C(C)C)N WOCYUGQDXPTQPY-FXQIFTODSA-N 0.000 description 3
- NZGOVKLVQNOEKP-YDHLFZDLSA-N Val-Phe-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N NZGOVKLVQNOEKP-YDHLFZDLSA-N 0.000 description 3
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 3
- 108010092854 aspartyllysine Proteins 0.000 description 3
- 244000309466 calf Species 0.000 description 3
- 230000008859 change Effects 0.000 description 3
- 230000002079 cooperative effect Effects 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 201000010099 disease Diseases 0.000 description 3
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 3
- 238000001943 fluorescence-activated cell sorting Methods 0.000 description 3
- 238000007499 fusion processing Methods 0.000 description 3
- 108010037850 glycylvaline Proteins 0.000 description 3
- 230000003394 haemopoietic effect Effects 0.000 description 3
- 238000011534 incubation Methods 0.000 description 3
- 230000000977 initiatory effect Effects 0.000 description 3
- 230000010354 integration Effects 0.000 description 3
- 238000011835 investigation Methods 0.000 description 3
- 108010012058 leucyltyrosine Proteins 0.000 description 3
- 239000003550 marker Substances 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 108010005942 methionylglycine Proteins 0.000 description 3
- 239000002953 phosphate buffered saline Substances 0.000 description 3
- 108010089520 pol Gene Products Proteins 0.000 description 3
- 108700004029 pol Genes Proteins 0.000 description 3
- 230000002829 reductive effect Effects 0.000 description 3
- 230000001105 regulatory effect Effects 0.000 description 3
- 230000008093 supporting effect Effects 0.000 description 3
- 210000001519 tissue Anatomy 0.000 description 3
- 238000001890 transfection Methods 0.000 description 3
- 108010080629 tryptophan-leucine Proteins 0.000 description 3
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 2
- LMFXXZPPZDCPTA-ZKWXMUAHSA-N Ala-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N LMFXXZPPZDCPTA-ZKWXMUAHSA-N 0.000 description 2
- GSHKMNKPMLXSQW-KBIXCLLPSA-N Ala-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C)N GSHKMNKPMLXSQW-KBIXCLLPSA-N 0.000 description 2
- 241001128034 Amphotropic murine leukemia virus Species 0.000 description 2
- OTCJMMRQBVDQRK-DCAQKATOSA-N Arg-Asp-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O OTCJMMRQBVDQRK-DCAQKATOSA-N 0.000 description 2
- GIVWETPOBCRTND-DCAQKATOSA-N Arg-Gln-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O GIVWETPOBCRTND-DCAQKATOSA-N 0.000 description 2
- SKTGPBFTMNLIHQ-KKUMJFAQSA-N Arg-Glu-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SKTGPBFTMNLIHQ-KKUMJFAQSA-N 0.000 description 2
- CYXCAHZVPFREJD-LURJTMIESA-N Arg-Gly-Gly Chemical compound NC(=N)NCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O CYXCAHZVPFREJD-LURJTMIESA-N 0.000 description 2
- CTAPSNCVKPOOSM-KKUMJFAQSA-N Arg-Tyr-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O CTAPSNCVKPOOSM-KKUMJFAQSA-N 0.000 description 2
- DMLSCRJBWUEALP-LAEOZQHASA-N Asn-Glu-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O DMLSCRJBWUEALP-LAEOZQHASA-N 0.000 description 2
- WUQXMTITJLFXAU-JIOCBJNQSA-N Asn-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N)O WUQXMTITJLFXAU-JIOCBJNQSA-N 0.000 description 2
- IXIWEFWRKIUMQX-DCAQKATOSA-N Asp-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(O)=O IXIWEFWRKIUMQX-DCAQKATOSA-N 0.000 description 2
- JDHOJQJMWBKHDB-CIUDSAMLSA-N Asp-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N JDHOJQJMWBKHDB-CIUDSAMLSA-N 0.000 description 2
- SBHUBSDEZQFJHJ-CIUDSAMLSA-N Asp-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O SBHUBSDEZQFJHJ-CIUDSAMLSA-N 0.000 description 2
- UWOPETAWXDZUJR-ACZMJKKPSA-N Asp-Cys-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(O)=O)C(O)=O UWOPETAWXDZUJR-ACZMJKKPSA-N 0.000 description 2
- CLUMZOKVGUWUFD-CIUDSAMLSA-N Asp-Leu-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O CLUMZOKVGUWUFD-CIUDSAMLSA-N 0.000 description 2
- 101100118680 Caenorhabditis elegans sec-61.G gene Proteins 0.000 description 2
- 101100533230 Caenorhabditis elegans ser-2 gene Proteins 0.000 description 2
- 241000283707 Capra Species 0.000 description 2
- ATPDEYTYWVMINF-ZLUOBGJFSA-N Cys-Cys-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O ATPDEYTYWVMINF-ZLUOBGJFSA-N 0.000 description 2
- PFAQXUDMZVMADG-AVGNSLFASA-N Cys-Gln-Tyr Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PFAQXUDMZVMADG-AVGNSLFASA-N 0.000 description 2
- SKSJPIBFNFPTJB-NKWVEPMBSA-N Cys-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CS)N)C(=O)O SKSJPIBFNFPTJB-NKWVEPMBSA-N 0.000 description 2
- JUUMIGUJJRFQQR-KKUMJFAQSA-N Cys-Lys-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CS)N)O JUUMIGUJJRFQQR-KKUMJFAQSA-N 0.000 description 2
- BCFXQBXXDSEHRS-FXQIFTODSA-N Cys-Ser-Arg Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O BCFXQBXXDSEHRS-FXQIFTODSA-N 0.000 description 2
- LHRCZIRWNFRIRG-SRVKXCTJSA-N Cys-Tyr-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N)O LHRCZIRWNFRIRG-SRVKXCTJSA-N 0.000 description 2
- 101150039808 Egfr gene Proteins 0.000 description 2
- 108010014258 Elastin Proteins 0.000 description 2
- 102000016942 Elastin Human genes 0.000 description 2
- 101710091045 Envelope protein Proteins 0.000 description 2
- KZKBJEUWNMQTLV-XDTLVQLUSA-N Gln-Ala-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KZKBJEUWNMQTLV-XDTLVQLUSA-N 0.000 description 2
- ZBKUIQNCRIYVGH-SDDRHHMPSA-N Gln-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZBKUIQNCRIYVGH-SDDRHHMPSA-N 0.000 description 2
- MPZWMIIOPAPAKE-BQBZGAKWSA-N Glu-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N MPZWMIIOPAPAKE-BQBZGAKWSA-N 0.000 description 2
- PVBBEKPHARMPHX-DCAQKATOSA-N Glu-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCC(O)=O PVBBEKPHARMPHX-DCAQKATOSA-N 0.000 description 2
- VSRCAOIHMGCIJK-SRVKXCTJSA-N Glu-Leu-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VSRCAOIHMGCIJK-SRVKXCTJSA-N 0.000 description 2
- SYWCGQOIIARSIX-SRVKXCTJSA-N Glu-Pro-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O SYWCGQOIIARSIX-SRVKXCTJSA-N 0.000 description 2
- GPSHCSTUYOQPAI-JHEQGTHGSA-N Glu-Thr-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O GPSHCSTUYOQPAI-JHEQGTHGSA-N 0.000 description 2
- LZEUDRYSAZAJIO-AUTRQRHGSA-N Glu-Val-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LZEUDRYSAZAJIO-AUTRQRHGSA-N 0.000 description 2
- QSDKBRMVXSWAQE-BFHQHQDPSA-N Gly-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN QSDKBRMVXSWAQE-BFHQHQDPSA-N 0.000 description 2
- UPOJUWHGMDJUQZ-IUCAKERBSA-N Gly-Arg-Arg Chemical compound NC(=N)NCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UPOJUWHGMDJUQZ-IUCAKERBSA-N 0.000 description 2
- DHDOADIPGZTAHT-YUMQZZPRSA-N Gly-Glu-Arg Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DHDOADIPGZTAHT-YUMQZZPRSA-N 0.000 description 2
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 2
- YNIMVVJTPWCUJH-KBPBESRZSA-N Gly-His-Tyr Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YNIMVVJTPWCUJH-KBPBESRZSA-N 0.000 description 2
- YKJUITHASJAGHO-HOTGVXAUSA-N Gly-Lys-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)CN YKJUITHASJAGHO-HOTGVXAUSA-N 0.000 description 2
- SCJJPCQUJYPHRZ-BQBZGAKWSA-N Gly-Pro-Asn Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O SCJJPCQUJYPHRZ-BQBZGAKWSA-N 0.000 description 2
- OOCFXNOVSLSHAB-IUCAKERBSA-N Gly-Pro-Pro Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 OOCFXNOVSLSHAB-IUCAKERBSA-N 0.000 description 2
- DKJWUIYLMLUBDX-XPUUQOCRSA-N Gly-Val-Cys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CS)C(=O)O DKJWUIYLMLUBDX-XPUUQOCRSA-N 0.000 description 2
- 102220541510 Golgi to ER traffic protein 4 homolog_D84K_mutation Human genes 0.000 description 2
- BIAKMWKJMQLZOJ-ZKWXMUAHSA-N His-Ala-Ala Chemical compound C[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](N)Cc1cnc[nH]1)C(O)=O BIAKMWKJMQLZOJ-ZKWXMUAHSA-N 0.000 description 2
- VXZZUXWAOMWWJH-QTKMDUPCSA-N His-Thr-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O VXZZUXWAOMWWJH-QTKMDUPCSA-N 0.000 description 2
- JRYQSFOFUFXPTB-RWRJDSDZSA-N Ile-Gln-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N JRYQSFOFUFXPTB-RWRJDSDZSA-N 0.000 description 2
- CKRFDMPBSWYOBT-PPCPHDFISA-N Ile-Lys-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CKRFDMPBSWYOBT-PPCPHDFISA-N 0.000 description 2
- PELCGFMHLZXWBQ-BJDJZHNGSA-N Ile-Ser-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)O)N PELCGFMHLZXWBQ-BJDJZHNGSA-N 0.000 description 2
- SAEWJTCJQVZQNZ-IUKAMOBKSA-N Ile-Thr-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SAEWJTCJQVZQNZ-IUKAMOBKSA-N 0.000 description 2
- 101710192606 Latent membrane protein 2 Proteins 0.000 description 2
- VFQOCUQGMUXTJR-DCAQKATOSA-N Leu-Cys-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCSC)C(=O)O)N VFQOCUQGMUXTJR-DCAQKATOSA-N 0.000 description 2
- VQPPIMUZCZCOIL-GUBZILKMSA-N Leu-Gln-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O VQPPIMUZCZCOIL-GUBZILKMSA-N 0.000 description 2
- LAGPXKYZCCTSGQ-JYJNAYRXSA-N Leu-Glu-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LAGPXKYZCCTSGQ-JYJNAYRXSA-N 0.000 description 2
- PBGDOSARRIJMEV-DLOVCJGASA-N Leu-His-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O PBGDOSARRIJMEV-DLOVCJGASA-N 0.000 description 2
- HNDWYLYAYNBWMP-AJNGGQMLSA-N Leu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N HNDWYLYAYNBWMP-AJNGGQMLSA-N 0.000 description 2
- XGDCYUQSFDQISZ-BQBZGAKWSA-N Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O XGDCYUQSFDQISZ-BQBZGAKWSA-N 0.000 description 2
- GAOJCVKPIGHTGO-UWVGGRQHSA-N Lys-Arg-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O GAOJCVKPIGHTGO-UWVGGRQHSA-N 0.000 description 2
- LOGFVTREOLYCPF-RHYQMDGZSA-N Lys-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN LOGFVTREOLYCPF-RHYQMDGZSA-N 0.000 description 2
- SBQDRNOLGSYHQA-YUMQZZPRSA-N Lys-Ser-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SBQDRNOLGSYHQA-YUMQZZPRSA-N 0.000 description 2
- JOSAKOKSPXROGQ-BJDJZHNGSA-N Lys-Ser-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JOSAKOKSPXROGQ-BJDJZHNGSA-N 0.000 description 2
- OPJRECCCQSDDCZ-TUSQITKMSA-N Lys-Trp-Trp Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O OPJRECCCQSDDCZ-TUSQITKMSA-N 0.000 description 2
- VWPJQIHBBOJWDN-DCAQKATOSA-N Lys-Val-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O VWPJQIHBBOJWDN-DCAQKATOSA-N 0.000 description 2
- 102000018697 Membrane Proteins Human genes 0.000 description 2
- 108010052285 Membrane Proteins Proteins 0.000 description 2
- LIIXIZKVWNYQHB-STECZYCISA-N Met-Tyr-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LIIXIZKVWNYQHB-STECZYCISA-N 0.000 description 2
- WYBVBIHNJWOLCJ-UHFFFAOYSA-N N-L-arginyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCCN=C(N)N WYBVBIHNJWOLCJ-UHFFFAOYSA-N 0.000 description 2
- 206010028980 Neoplasm Diseases 0.000 description 2
- 238000012408 PCR amplification Methods 0.000 description 2
- CDQCFGOQNYOICK-IHRRRGAJSA-N Phe-Glu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 CDQCFGOQNYOICK-IHRRRGAJSA-N 0.000 description 2
- BFYHIHGIHGROAT-HTUGSXCWSA-N Phe-Glu-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BFYHIHGIHGROAT-HTUGSXCWSA-N 0.000 description 2
- BYAIIACBWBOJCU-URLPEUOOSA-N Phe-Ile-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BYAIIACBWBOJCU-URLPEUOOSA-N 0.000 description 2
- YFXXRYFWJFQAFW-JHYOHUSXSA-N Phe-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N)O YFXXRYFWJFQAFW-JHYOHUSXSA-N 0.000 description 2
- 102000011755 Phosphoglycerate Kinase Human genes 0.000 description 2
- DZZCICYRSZASNF-FXQIFTODSA-N Pro-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 DZZCICYRSZASNF-FXQIFTODSA-N 0.000 description 2
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 2
- VYWNORHENYEQDW-YUMQZZPRSA-N Pro-Gly-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 VYWNORHENYEQDW-YUMQZZPRSA-N 0.000 description 2
- SBVPYBFMIGDIDX-SRVKXCTJSA-N Pro-Pro-Pro Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1N(C(=O)[C@H]2NCCC2)CCC1 SBVPYBFMIGDIDX-SRVKXCTJSA-N 0.000 description 2
- KHRLUIPIMIQFGT-AVGNSLFASA-N Pro-Val-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KHRLUIPIMIQFGT-AVGNSLFASA-N 0.000 description 2
- 101710188315 Protein X Proteins 0.000 description 2
- VAIZFHMTBFYJIA-ACZMJKKPSA-N Ser-Asp-Gln Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O VAIZFHMTBFYJIA-ACZMJKKPSA-N 0.000 description 2
- KDGARKCAKHBEDB-NKWVEPMBSA-N Ser-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CO)N)C(=O)O KDGARKCAKHBEDB-NKWVEPMBSA-N 0.000 description 2
- FZEUTKVQGMVGHW-AVGNSLFASA-N Ser-Phe-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O FZEUTKVQGMVGHW-AVGNSLFASA-N 0.000 description 2
- BVLGVLWFIZFEAH-BPUTZDHNSA-N Ser-Pro-Trp Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O BVLGVLWFIZFEAH-BPUTZDHNSA-N 0.000 description 2
- PXIPVTKHYLBLMZ-UHFFFAOYSA-N Sodium azide Chemical compound [Na+].[N-]=[N+]=[N-] PXIPVTKHYLBLMZ-UHFFFAOYSA-N 0.000 description 2
- 101710109576 Terminal protein Proteins 0.000 description 2
- 101001099217 Thermotoga maritima (strain ATCC 43589 / DSM 3109 / JCM 10099 / NBRC 100826 / MSB8) Triosephosphate isomerase Proteins 0.000 description 2
- WFUAUEQXPVNAEF-ZJDVBMNYSA-N Thr-Arg-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CCCN=C(N)N WFUAUEQXPVNAEF-ZJDVBMNYSA-N 0.000 description 2
- SKHPKKYKDYULDH-HJGDQZAQSA-N Thr-Asn-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SKHPKKYKDYULDH-HJGDQZAQSA-N 0.000 description 2
- NLSNVZAREYQMGR-HJGDQZAQSA-N Thr-Asp-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NLSNVZAREYQMGR-HJGDQZAQSA-N 0.000 description 2
- NIEWSKWFURSECR-FOHZUACHSA-N Thr-Gly-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O NIEWSKWFURSECR-FOHZUACHSA-N 0.000 description 2
- VYEHBMMAJFVTOI-JHEQGTHGSA-N Thr-Gly-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O VYEHBMMAJFVTOI-JHEQGTHGSA-N 0.000 description 2
- MUAFDCVOHYAFNG-RCWTZXSCSA-N Thr-Pro-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MUAFDCVOHYAFNG-RCWTZXSCSA-N 0.000 description 2
- VUXIQSUQQYNLJP-XAVMHZPKSA-N Thr-Ser-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N)O VUXIQSUQQYNLJP-XAVMHZPKSA-N 0.000 description 2
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 2
- RVMNUBQWPVOUKH-HEIBUPTGSA-N Thr-Ser-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMNUBQWPVOUKH-HEIBUPTGSA-N 0.000 description 2
- NDZYTIMDOZMECO-SHGPDSBTSA-N Thr-Thr-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O NDZYTIMDOZMECO-SHGPDSBTSA-N 0.000 description 2
- GTNCSPKYWCJZAC-XIRDDKMYSA-N Trp-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N GTNCSPKYWCJZAC-XIRDDKMYSA-N 0.000 description 2
- NKUIXQOJUAEIET-AQZXSJQPSA-N Trp-Asp-Thr Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@H](O)C)C(O)=O)=CNC2=C1 NKUIXQOJUAEIET-AQZXSJQPSA-N 0.000 description 2
- UPUNWAXSLPBMRK-XTWBLICNSA-N Trp-Thr-Thr Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UPUNWAXSLPBMRK-XTWBLICNSA-N 0.000 description 2
- HZDQUVQEVVYDDA-ACRUOGEOSA-N Tyr-Tyr-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HZDQUVQEVVYDDA-ACRUOGEOSA-N 0.000 description 2
- ZLFHAAGHGQBQQN-AEJSXWLSSA-N Val-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZLFHAAGHGQBQQN-AEJSXWLSSA-N 0.000 description 2
- BYOHPUZJVXWHAE-BYULHYEWSA-N Val-Asn-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N BYOHPUZJVXWHAE-BYULHYEWSA-N 0.000 description 2
- XIFAHCUNWWKUDE-DCAQKATOSA-N Val-Cys-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N XIFAHCUNWWKUDE-DCAQKATOSA-N 0.000 description 2
- GQMNEJMFMCJJTD-NHCYSSNCSA-N Val-Pro-Gln Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O GQMNEJMFMCJJTD-NHCYSSNCSA-N 0.000 description 2
- USLVEJAHTBLSIL-CYDGBPFRSA-N Val-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C USLVEJAHTBLSIL-CYDGBPFRSA-N 0.000 description 2
- KSFXWENSJABBFI-ZKWXMUAHSA-N Val-Ser-Asn Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O KSFXWENSJABBFI-ZKWXMUAHSA-N 0.000 description 2
- OFTXTCGQJXTNQS-XGEHTFHBSA-N Val-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N)O OFTXTCGQJXTNQS-XGEHTFHBSA-N 0.000 description 2
- 230000009471 action Effects 0.000 description 2
- 235000004279 alanine Nutrition 0.000 description 2
- 150000001295 alanines Chemical class 0.000 description 2
- 238000013459 approach Methods 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 230000000903 blocking effect Effects 0.000 description 2
- 239000000872 buffer Substances 0.000 description 2
- 201000011510 cancer Diseases 0.000 description 2
- 210000000234 capsid Anatomy 0.000 description 2
- 239000002299 complementary DNA Substances 0.000 description 2
- 108010069495 cysteinyltyrosine Proteins 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 2
- 230000008021 deposition Effects 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 238000010790 dilution Methods 0.000 description 2
- 239000012895 dilution Substances 0.000 description 2
- 238000010494 dissociation reaction Methods 0.000 description 2
- 230000005593 dissociations Effects 0.000 description 2
- 229920002549 elastin Polymers 0.000 description 2
- 102000052116 epidermal growth factor receptor activity proteins Human genes 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- MHMNJMPURVTYEJ-UHFFFAOYSA-N fluorescein-5-isothiocyanate Chemical compound O1C(=O)C2=CC(N=C=S)=CC=C2C21C1=CC=C(O)C=C1OC1=CC(O)=CC=C21 MHMNJMPURVTYEJ-UHFFFAOYSA-N 0.000 description 2
- 108700004026 gag Genes Proteins 0.000 description 2
- 239000000499 gel Substances 0.000 description 2
- 238000012239 gene modification Methods 0.000 description 2
- 230000005017 genetic modification Effects 0.000 description 2
- 235000013617 genetically modified food Nutrition 0.000 description 2
- 108010020688 glycylhistidine Proteins 0.000 description 2
- 230000001939 inductive effect Effects 0.000 description 2
- 108010057821 leucylproline Proteins 0.000 description 2
- 208000032839 leukemia Diseases 0.000 description 2
- 239000006166 lysate Substances 0.000 description 2
- 108010003700 lysyl aspartic acid Proteins 0.000 description 2
- 230000001404 mediated effect Effects 0.000 description 2
- 210000004779 membrane envelope Anatomy 0.000 description 2
- 239000000203 mixture Substances 0.000 description 2
- 238000004806 packaging method and process Methods 0.000 description 2
- YBYRMVIVWMBXKQ-UHFFFAOYSA-N phenylmethanesulfonyl fluoride Chemical compound FS(=O)(=O)CC1=CC=CC=C1 YBYRMVIVWMBXKQ-UHFFFAOYSA-N 0.000 description 2
- 101150088264 pol gene Proteins 0.000 description 2
- 230000008488 polyadenylation Effects 0.000 description 2
- 239000000047 product Substances 0.000 description 2
- 108010029020 prolylglycine Proteins 0.000 description 2
- 238000006467 substitution reaction Methods 0.000 description 2
- 230000001225 therapeutic effect Effects 0.000 description 2
- 238000002560 therapeutic procedure Methods 0.000 description 2
- PGOHTUIFYSHAQG-LJSDBVFPSA-N (2S)-6-amino-2-[[(2S)-5-amino-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-4-amino-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-5-amino-2-[[(2S)-5-amino-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S,3R)-2-[[(2S)-5-amino-2-[[(2S)-2-[[(2S)-2-[[(2S,3R)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-5-amino-2-[[(2S)-1-[(2S,3R)-2-[[(2S)-2-[[(2S)-2-[[(2R)-2-[[(2S)-2-[[(2S)-2-[[2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-1-[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-amino-4-methylsulfanylbutanoyl]amino]-3-(1H-indol-3-yl)propanoyl]amino]-5-carbamimidamidopentanoyl]amino]propanoyl]pyrrolidine-2-carbonyl]amino]-3-methylbutanoyl]amino]-4-methylpentanoyl]amino]-4-methylpentanoyl]amino]acetyl]amino]-3-hydroxypropanoyl]amino]-4-methylpentanoyl]amino]-3-sulfanylpropanoyl]amino]-4-methylsulfanylbutanoyl]amino]-5-carbamimidamidopentanoyl]amino]-3-hydroxybutanoyl]pyrrolidine-2-carbonyl]amino]-5-oxopentanoyl]amino]-3-hydroxypropanoyl]amino]-3-hydroxypropanoyl]amino]-3-(1H-imidazol-5-yl)propanoyl]amino]-4-methylpentanoyl]amino]-3-hydroxybutanoyl]amino]-3-(1H-indol-3-yl)propanoyl]amino]-5-carbamimidamidopentanoyl]amino]-5-oxopentanoyl]amino]-3-hydroxybutanoyl]amino]-3-hydroxypropanoyl]amino]-3-carboxypropanoyl]amino]-3-hydroxypropanoyl]amino]-5-oxopentanoyl]amino]-5-oxopentanoyl]amino]-3-phenylpropanoyl]amino]-5-carbamimidamidopentanoyl]amino]-3-methylbutanoyl]amino]-4-methylpentanoyl]amino]-4-oxobutanoyl]amino]-5-carbamimidamidopentanoyl]amino]-3-(1H-indol-3-yl)propanoyl]amino]-4-carboxybutanoyl]amino]-5-oxopentanoyl]amino]hexanoic acid Chemical compound CSCC[C@H](N)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O PGOHTUIFYSHAQG-LJSDBVFPSA-N 0.000 description 1
- AEUAEICGCMSYCQ-UHFFFAOYSA-N 4-n-(7-chloroquinolin-1-ium-4-yl)-1-n,1-n-diethylpentane-1,4-diamine;dihydrogen phosphate Chemical compound OP(O)(O)=O.ClC1=CC=C2C(NC(C)CCCN(CC)CC)=CC=NC2=C1 AEUAEICGCMSYCQ-UHFFFAOYSA-N 0.000 description 1
- OPIFSICVWOWJMJ-AEOCFKNESA-N 5-bromo-4-chloro-3-indolyl beta-D-galactoside Chemical compound O[C@@H]1[C@@H](O)[C@@H](O)[C@@H](CO)O[C@H]1OC1=CNC2=CC=C(Br)C(Cl)=C12 OPIFSICVWOWJMJ-AEOCFKNESA-N 0.000 description 1
- 208000030507 AIDS Diseases 0.000 description 1
- 101150078244 AMO1 gene Proteins 0.000 description 1
- HRPVXLWXLXDGHG-UHFFFAOYSA-N Acrylamide Chemical compound NC(=O)C=C HRPVXLWXLXDGHG-UHFFFAOYSA-N 0.000 description 1
- LJTZPXOCBZRFBH-CIUDSAMLSA-N Ala-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N LJTZPXOCBZRFBH-CIUDSAMLSA-N 0.000 description 1
- HFBFSOAKPUZCCO-ZLUOBGJFSA-N Ala-Cys-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)N)C(=O)O)N HFBFSOAKPUZCCO-ZLUOBGJFSA-N 0.000 description 1
- NOGFDULFCFXBHB-CIUDSAMLSA-N Ala-Leu-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)O)N NOGFDULFCFXBHB-CIUDSAMLSA-N 0.000 description 1
- FEGOCLZUJUFCHP-CIUDSAMLSA-N Ala-Pro-Gln Chemical compound [H]N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O FEGOCLZUJUFCHP-CIUDSAMLSA-N 0.000 description 1
- BUQICHWNXBIBOG-LMVFSUKVSA-N Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)N BUQICHWNXBIBOG-LMVFSUKVSA-N 0.000 description 1
- 208000002267 Anti-neutrophil cytoplasmic antibody-associated vasculitis Diseases 0.000 description 1
- MFAMTAVAFBPXDC-LPEHRKFASA-N Arg-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O MFAMTAVAFBPXDC-LPEHRKFASA-N 0.000 description 1
- PMGDADKJMCOXHX-BQBZGAKWSA-N Arg-Gln Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O PMGDADKJMCOXHX-BQBZGAKWSA-N 0.000 description 1
- GOWZVQXTHUCNSQ-NHCYSSNCSA-N Arg-Glu-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GOWZVQXTHUCNSQ-NHCYSSNCSA-N 0.000 description 1
- WYBVBIHNJWOLCJ-IUCAKERBSA-N Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCNC(N)=N WYBVBIHNJWOLCJ-IUCAKERBSA-N 0.000 description 1
- WKPXXXUSUHAXDE-SRVKXCTJSA-N Arg-Pro-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O WKPXXXUSUHAXDE-SRVKXCTJSA-N 0.000 description 1
- AIFHRTPABBBHKU-RCWTZXSCSA-N Arg-Thr-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AIFHRTPABBBHKU-RCWTZXSCSA-N 0.000 description 1
- NPDLYUOYAGBHFB-WDSKDSINSA-N Asn-Arg Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NPDLYUOYAGBHFB-WDSKDSINSA-N 0.000 description 1
- IBLAOXSULLECQZ-IUKAMOBKSA-N Asn-Ile-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC(N)=O IBLAOXSULLECQZ-IUKAMOBKSA-N 0.000 description 1
- WIDVAWAQBRAKTI-YUMQZZPRSA-N Asn-Leu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O WIDVAWAQBRAKTI-YUMQZZPRSA-N 0.000 description 1
- WLVLIYYBPPONRJ-GCJQMDKQSA-N Asn-Thr-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O WLVLIYYBPPONRJ-GCJQMDKQSA-N 0.000 description 1
- UXHYOWXTJLBEPG-GSSVUCPTSA-N Asn-Thr-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UXHYOWXTJLBEPG-GSSVUCPTSA-N 0.000 description 1
- WSOKZUVWBXVJHX-CIUDSAMLSA-N Asp-Arg-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O WSOKZUVWBXVJHX-CIUDSAMLSA-N 0.000 description 1
- KIJLEFNHWSXHRU-NUMRIWBASA-N Asp-Gln-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KIJLEFNHWSXHRU-NUMRIWBASA-N 0.000 description 1
- RXBGWGRSWXOBGK-KKUMJFAQSA-N Asp-Lys-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RXBGWGRSWXOBGK-KKUMJFAQSA-N 0.000 description 1
- 241000714235 Avian retrovirus Species 0.000 description 1
- 101000583086 Bunodosoma granuliferum Delta-actitoxin-Bgr2b Proteins 0.000 description 1
- 101100163949 Caenorhabditis elegans asp-3 gene Proteins 0.000 description 1
- 101100392772 Caenorhabditis elegans gln-2 gene Proteins 0.000 description 1
- 101100129088 Caenorhabditis elegans lys-2 gene Proteins 0.000 description 1
- 102000014914 Carrier Proteins Human genes 0.000 description 1
- KKZHXOOZHFABQQ-UWJYBYFXSA-N Cys-Ala-Tyr Chemical compound SC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KKZHXOOZHFABQQ-UWJYBYFXSA-N 0.000 description 1
- QJUDRFBUWAGUSG-SRVKXCTJSA-N Cys-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CS)N QJUDRFBUWAGUSG-SRVKXCTJSA-N 0.000 description 1
- UDPSLLFHOLGXBY-FXQIFTODSA-N Cys-Glu-Glu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UDPSLLFHOLGXBY-FXQIFTODSA-N 0.000 description 1
- NXQCSPVUPLUTJH-WHFBIAKZSA-N Cys-Ser-Gly Chemical compound SC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O NXQCSPVUPLUTJH-WHFBIAKZSA-N 0.000 description 1
- ALTQTAKGRFLRLR-GUBZILKMSA-N Cys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CS)N ALTQTAKGRFLRLR-GUBZILKMSA-N 0.000 description 1
- 201000003883 Cystic fibrosis Diseases 0.000 description 1
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 1
- 101800001467 Envelope glycoprotein E2 Proteins 0.000 description 1
- PRBLYKYHAJEABA-SRVKXCTJSA-N Gln-Arg-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O PRBLYKYHAJEABA-SRVKXCTJSA-N 0.000 description 1
- LOJYQMFIIJVETK-WDSKDSINSA-N Gln-Gln Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O LOJYQMFIIJVETK-WDSKDSINSA-N 0.000 description 1
- FNAJNWPDTIXYJN-CIUDSAMLSA-N Gln-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCC(N)=O FNAJNWPDTIXYJN-CIUDSAMLSA-N 0.000 description 1
- ARYKRXHBIPLULY-XKBZYTNZSA-N Gln-Thr-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ARYKRXHBIPLULY-XKBZYTNZSA-N 0.000 description 1
- KGNSGRRALVIRGR-QWRGUYRKSA-N Gln-Tyr Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KGNSGRRALVIRGR-QWRGUYRKSA-N 0.000 description 1
- MRVYVEQPNDSWLH-XPUUQOCRSA-N Gln-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(N)=O MRVYVEQPNDSWLH-XPUUQOCRSA-N 0.000 description 1
- CSMHMEATMDCQNY-DZKIICNBSA-N Gln-Val-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CSMHMEATMDCQNY-DZKIICNBSA-N 0.000 description 1
- JZDHUJAFXGNDSB-WHFBIAKZSA-N Glu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O JZDHUJAFXGNDSB-WHFBIAKZSA-N 0.000 description 1
- ITYRYNUZHPNCIK-GUBZILKMSA-N Glu-Ala-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O ITYRYNUZHPNCIK-GUBZILKMSA-N 0.000 description 1
- OWVURWCRZZMAOZ-XHNCKOQMSA-N Glu-Cys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)O)N)C(=O)O OWVURWCRZZMAOZ-XHNCKOQMSA-N 0.000 description 1
- OAGVHWYIBZMWLA-YFKPBYRVSA-N Glu-Gly-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)NCC(O)=O OAGVHWYIBZMWLA-YFKPBYRVSA-N 0.000 description 1
- LRPXYSGPOBVBEH-IUCAKERBSA-N Glu-Gly-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O LRPXYSGPOBVBEH-IUCAKERBSA-N 0.000 description 1
- NNQDRRUXFJYCCJ-NHCYSSNCSA-N Glu-Pro-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O NNQDRRUXFJYCCJ-NHCYSSNCSA-N 0.000 description 1
- UQHGAYSULGRWRG-WHFBIAKZSA-N Glu-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(O)=O UQHGAYSULGRWRG-WHFBIAKZSA-N 0.000 description 1
- QOXDAWODGSIDDI-GUBZILKMSA-N Glu-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N QOXDAWODGSIDDI-GUBZILKMSA-N 0.000 description 1
- HMJULNMJWOZNFI-XHNCKOQMSA-N Glu-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N)C(=O)O HMJULNMJWOZNFI-XHNCKOQMSA-N 0.000 description 1
- RXJFSLQVMGYQEL-IHRRRGAJSA-N Glu-Tyr-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 RXJFSLQVMGYQEL-IHRRRGAJSA-N 0.000 description 1
- WGYHAAXZWPEBDQ-IFFSRLJSSA-N Glu-Val-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGYHAAXZWPEBDQ-IFFSRLJSSA-N 0.000 description 1
- JRDYDYXZKFNNRQ-XPUUQOCRSA-N Gly-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN JRDYDYXZKFNNRQ-XPUUQOCRSA-N 0.000 description 1
- JXYMPBCYRKWJEE-BQBZGAKWSA-N Gly-Arg-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O JXYMPBCYRKWJEE-BQBZGAKWSA-N 0.000 description 1
- RQZGFWKQLPJOEQ-YUMQZZPRSA-N Gly-Arg-Gln Chemical compound C(C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)CN)CN=C(N)N RQZGFWKQLPJOEQ-YUMQZZPRSA-N 0.000 description 1
- DUYYPIRFTLOAJQ-YUMQZZPRSA-N Gly-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN DUYYPIRFTLOAJQ-YUMQZZPRSA-N 0.000 description 1
- XTQFHTHIAKKCTM-YFKPBYRVSA-N Gly-Glu-Gly Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O XTQFHTHIAKKCTM-YFKPBYRVSA-N 0.000 description 1
- SWQALSGKVLYKDT-ZKWXMUAHSA-N Gly-Ile-Ala Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SWQALSGKVLYKDT-ZKWXMUAHSA-N 0.000 description 1
- DKEXFJVMVGETOO-LURJTMIESA-N Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CN DKEXFJVMVGETOO-LURJTMIESA-N 0.000 description 1
- ULZCYBYDTUMHNF-IUCAKERBSA-N Gly-Leu-Glu Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ULZCYBYDTUMHNF-IUCAKERBSA-N 0.000 description 1
- DBJYVKDPGIFXFO-BQBZGAKWSA-N Gly-Met-Ala Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O DBJYVKDPGIFXFO-BQBZGAKWSA-N 0.000 description 1
- QAMMIGULQSIRCD-IRXDYDNUSA-N Gly-Phe-Tyr Chemical compound C([C@H](NC(=O)C[NH3+])C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C([O-])=O)C1=CC=CC=C1 QAMMIGULQSIRCD-IRXDYDNUSA-N 0.000 description 1
- JYPCXBJRLBHWME-IUCAKERBSA-N Gly-Pro-Arg Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JYPCXBJRLBHWME-IUCAKERBSA-N 0.000 description 1
- WDXLKVQATNEAJQ-BQBZGAKWSA-N Gly-Pro-Asp Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O WDXLKVQATNEAJQ-BQBZGAKWSA-N 0.000 description 1
- ZZJVYSAQQMDIRD-UWVGGRQHSA-N Gly-Pro-His Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O ZZJVYSAQQMDIRD-UWVGGRQHSA-N 0.000 description 1
- HFPVRZWORNJRRC-UWVGGRQHSA-N Gly-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN HFPVRZWORNJRRC-UWVGGRQHSA-N 0.000 description 1
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 1
- OLIFSFOFKGKIRH-WUJLRWPWSA-N Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CN OLIFSFOFKGKIRH-WUJLRWPWSA-N 0.000 description 1
- UMRIXLHPZZIOML-OALUTQOASA-N Gly-Trp-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)NC(=O)CN UMRIXLHPZZIOML-OALUTQOASA-N 0.000 description 1
- KOYUSMBPJOVSOO-XEGUGMAKSA-N Gly-Tyr-Ile Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KOYUSMBPJOVSOO-XEGUGMAKSA-N 0.000 description 1
- BAYQNCWLXIDLHX-ONGXEEELSA-N Gly-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN BAYQNCWLXIDLHX-ONGXEEELSA-N 0.000 description 1
- 102100031573 Hematopoietic progenitor cell antigen CD34 Human genes 0.000 description 1
- 229920000209 Hexadimethrine bromide Polymers 0.000 description 1
- UPGJWSUYENXOPV-HGNGGELXSA-N His-Gln-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N UPGJWSUYENXOPV-HGNGGELXSA-N 0.000 description 1
- VHHYJBSXXMPQGZ-AVGNSLFASA-N His-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N VHHYJBSXXMPQGZ-AVGNSLFASA-N 0.000 description 1
- ZUPVLBAXUUGKKN-VHSXEESVSA-N His-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CC2=CN=CN2)N)C(=O)O ZUPVLBAXUUGKKN-VHSXEESVSA-N 0.000 description 1
- UXSATKFPUVZVDK-KKUMJFAQSA-N His-Lys-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CN=CN1)N UXSATKFPUVZVDK-KKUMJFAQSA-N 0.000 description 1
- 241000282412 Homo Species 0.000 description 1
- 101500025419 Homo sapiens Epidermal growth factor Proteins 0.000 description 1
- 101000777663 Homo sapiens Hematopoietic progenitor cell antigen CD34 Proteins 0.000 description 1
- 241000725303 Human immunodeficiency virus Species 0.000 description 1
- WECYRWOMWSCWNX-XUXIUFHCSA-N Ile-Arg-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(O)=O WECYRWOMWSCWNX-XUXIUFHCSA-N 0.000 description 1
- KTGFOCFYOZQVRJ-ZKWXMUAHSA-N Ile-Glu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O KTGFOCFYOZQVRJ-ZKWXMUAHSA-N 0.000 description 1
- NZOCIWKZUVUNDW-ZKWXMUAHSA-N Ile-Gly-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O NZOCIWKZUVUNDW-ZKWXMUAHSA-N 0.000 description 1
- JODPUDMBQBIWCK-GHCJXIJMSA-N Ile-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O JODPUDMBQBIWCK-GHCJXIJMSA-N 0.000 description 1
- ZSESFIFAYQEKRD-CYDGBPFRSA-N Ile-Val-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCSC)C(=O)O)N ZSESFIFAYQEKRD-CYDGBPFRSA-N 0.000 description 1
- RQZFWBLDTBDEOF-RNJOBUHISA-N Ile-Val-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N RQZFWBLDTBDEOF-RNJOBUHISA-N 0.000 description 1
- 108060003951 Immunoglobulin Proteins 0.000 description 1
- 108010021625 Immunoglobulin Fragments Proteins 0.000 description 1
- 102000008394 Immunoglobulin Fragments Human genes 0.000 description 1
- PWWVAXIEGOYWEE-UHFFFAOYSA-N Isophenergan Chemical compound C1=CC=C2N(CC(C)N(C)C)C3=CC=CC=C3SC2=C1 PWWVAXIEGOYWEE-UHFFFAOYSA-N 0.000 description 1
- HFKJBCPRWWGPEY-BQBZGAKWSA-N L-arginyl-L-glutamic acid Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HFKJBCPRWWGPEY-BQBZGAKWSA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- 101150118523 LYS4 gene Proteins 0.000 description 1
- ZRLUISBDKUWAIZ-CIUDSAMLSA-N Leu-Ala-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O ZRLUISBDKUWAIZ-CIUDSAMLSA-N 0.000 description 1
- PBCHMHROGNUXMK-DLOVCJGASA-N Leu-Ala-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 PBCHMHROGNUXMK-DLOVCJGASA-N 0.000 description 1
- RFUBXQQFJFGJFV-GUBZILKMSA-N Leu-Asn-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RFUBXQQFJFGJFV-GUBZILKMSA-N 0.000 description 1
- FIJMQLGQLBLBOL-HJGDQZAQSA-N Leu-Asn-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FIJMQLGQLBLBOL-HJGDQZAQSA-N 0.000 description 1
- HIZYETOZLYFUFF-BQBZGAKWSA-N Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CS)C(O)=O HIZYETOZLYFUFF-BQBZGAKWSA-N 0.000 description 1
- DKEZVKFLETVJFY-CIUDSAMLSA-N Leu-Cys-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)N)C(=O)O)N DKEZVKFLETVJFY-CIUDSAMLSA-N 0.000 description 1
- PPTAQBNUFKTJKA-BJDJZHNGSA-N Leu-Cys-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PPTAQBNUFKTJKA-BJDJZHNGSA-N 0.000 description 1
- JYOAXOMPIXKMKK-YUMQZZPRSA-N Leu-Gln Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C([O-])=O)CCC(N)=O JYOAXOMPIXKMKK-YUMQZZPRSA-N 0.000 description 1
- POZULHZYLPGXMR-ONGXEEELSA-N Leu-Gly-Val Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O POZULHZYLPGXMR-ONGXEEELSA-N 0.000 description 1
- YWYQSLOTVIRCFE-SRVKXCTJSA-N Leu-His-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O YWYQSLOTVIRCFE-SRVKXCTJSA-N 0.000 description 1
- AZLASBBHHSLQDB-GUBZILKMSA-N Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(C)C AZLASBBHHSLQDB-GUBZILKMSA-N 0.000 description 1
- OTXBNHIUIHNGAO-UWVGGRQHSA-N Leu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN OTXBNHIUIHNGAO-UWVGGRQHSA-N 0.000 description 1
- BGZCJDGBBUUBHA-KKUMJFAQSA-N Leu-Lys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O BGZCJDGBBUUBHA-KKUMJFAQSA-N 0.000 description 1
- RTIRBWJPYJYTLO-MELADBBJSA-N Leu-Lys-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N RTIRBWJPYJYTLO-MELADBBJSA-N 0.000 description 1
- CPONGMJGVIAWEH-DCAQKATOSA-N Leu-Met-Ala Chemical compound CSCC[C@H](NC(=O)[C@@H](N)CC(C)C)C(=O)N[C@@H](C)C(O)=O CPONGMJGVIAWEH-DCAQKATOSA-N 0.000 description 1
- VQHUBNVKFFLWRP-ULQDDVLXSA-N Leu-Tyr-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=C(O)C=C1 VQHUBNVKFFLWRP-ULQDDVLXSA-N 0.000 description 1
- 241000244189 Lineus Species 0.000 description 1
- PBIPLDMFHAICIP-DCAQKATOSA-N Lys-Glu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PBIPLDMFHAICIP-DCAQKATOSA-N 0.000 description 1
- QBHGXFQJFPWJIH-XUXIUFHCSA-N Lys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN QBHGXFQJFPWJIH-XUXIUFHCSA-N 0.000 description 1
- LUTDBHBIHHREDC-IHRRRGAJSA-N Lys-Pro-Lys Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O LUTDBHBIHHREDC-IHRRRGAJSA-N 0.000 description 1
- CFOLERIRBUAYAD-HOCLYGCPSA-N Lys-Trp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O CFOLERIRBUAYAD-HOCLYGCPSA-N 0.000 description 1
- SUZVLFWOCKHWET-CQDKDKBSSA-N Lys-Tyr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O SUZVLFWOCKHWET-CQDKDKBSSA-N 0.000 description 1
- 208000033868 Lysosomal disease Diseases 0.000 description 1
- 208000015439 Lysosomal storage disease Diseases 0.000 description 1
- 108010060534 MSH (11-13) Proteins 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- LMKSBGIUPVRHEH-FXQIFTODSA-N Met-Ala-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(N)=O LMKSBGIUPVRHEH-FXQIFTODSA-N 0.000 description 1
- WPTDJKDGICUFCP-XUXIUFHCSA-N Met-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CCSC)N WPTDJKDGICUFCP-XUXIUFHCSA-N 0.000 description 1
- 208000021642 Muscular disease Diseases 0.000 description 1
- 241000714199 Myeloproliferative leukemia virus Species 0.000 description 1
- 201000009623 Myopathy Diseases 0.000 description 1
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 1
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 1
- 239000000020 Nitrocellulose Substances 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 108010033276 Peptide Fragments Proteins 0.000 description 1
- 102000007079 Peptide Fragments Human genes 0.000 description 1
- HXSUFWQYLPKEHF-IHRRRGAJSA-N Phe-Asn-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HXSUFWQYLPKEHF-IHRRRGAJSA-N 0.000 description 1
- DJPXNKUDJKGQEE-BZSNNMDCSA-N Phe-Asp-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DJPXNKUDJKGQEE-BZSNNMDCSA-N 0.000 description 1
- IILUKIJNFMUBNF-IHRRRGAJSA-N Phe-Gln-Gln Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O IILUKIJNFMUBNF-IHRRRGAJSA-N 0.000 description 1
- JXWLMUIXUXLIJR-QWRGUYRKSA-N Phe-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JXWLMUIXUXLIJR-QWRGUYRKSA-N 0.000 description 1
- JWQWPTLEOFNCGX-AVGNSLFASA-N Phe-Glu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JWQWPTLEOFNCGX-AVGNSLFASA-N 0.000 description 1
- MCIXMYKSPQUMJG-SRVKXCTJSA-N Phe-Ser-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MCIXMYKSPQUMJG-SRVKXCTJSA-N 0.000 description 1
- 102000016462 Phosphate Transport Proteins Human genes 0.000 description 1
- 108010092528 Phosphate Transport Proteins Proteins 0.000 description 1
- 102100033126 Phosphatidate cytidylyltransferase 2 Human genes 0.000 description 1
- 101710178746 Phosphatidate cytidylyltransferase 2 Proteins 0.000 description 1
- OLHDPZMYUSBGDE-GUBZILKMSA-N Pro-Arg-Cys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O OLHDPZMYUSBGDE-GUBZILKMSA-N 0.000 description 1
- UTAUEDINXUMHLG-FXQIFTODSA-N Pro-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 UTAUEDINXUMHLG-FXQIFTODSA-N 0.000 description 1
- KPDRZQUWJKTMBP-DCAQKATOSA-N Pro-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 KPDRZQUWJKTMBP-DCAQKATOSA-N 0.000 description 1
- HXNYBZQLBWIADP-WDSKDSINSA-N Pro-Cys Chemical compound OC(=O)[C@H](CS)NC(=O)[C@@H]1CCCN1 HXNYBZQLBWIADP-WDSKDSINSA-N 0.000 description 1
- AIZVVCMAFRREQS-GUBZILKMSA-N Pro-Cys-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AIZVVCMAFRREQS-GUBZILKMSA-N 0.000 description 1
- NOXSEHJOXCWRHK-DCAQKATOSA-N Pro-Cys-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@@H]1CCCN1 NOXSEHJOXCWRHK-DCAQKATOSA-N 0.000 description 1
- QNZLIVROMORQFH-BQBZGAKWSA-N Pro-Gly-Cys Chemical compound C1C[C@H](NC1)C(=O)NCC(=O)N[C@@H](CS)C(=O)O QNZLIVROMORQFH-BQBZGAKWSA-N 0.000 description 1
- WSRWHZRUOCACLJ-UWVGGRQHSA-N Pro-Gly-His Chemical compound C([C@@H](C(=O)O)NC(=O)CNC(=O)[C@H]1NCCC1)C1=CN=CN1 WSRWHZRUOCACLJ-UWVGGRQHSA-N 0.000 description 1
- FKLSMYYLJHYPHH-UWVGGRQHSA-N Pro-Gly-Leu Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O FKLSMYYLJHYPHH-UWVGGRQHSA-N 0.000 description 1
- AFXCXDQNRXTSBD-FJXKBIBVSA-N Pro-Gly-Thr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O AFXCXDQNRXTSBD-FJXKBIBVSA-N 0.000 description 1
- RUDOLGWDSKQQFF-DCAQKATOSA-N Pro-Leu-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O RUDOLGWDSKQQFF-DCAQKATOSA-N 0.000 description 1
- VTFXTWDFPTWNJY-RHYQMDGZSA-N Pro-Leu-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VTFXTWDFPTWNJY-RHYQMDGZSA-N 0.000 description 1
- SRBFGSGDNNQABI-FHWLQOOXSA-N Pro-Leu-Trp Chemical compound N([C@@H](CC(C)C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C(=O)[C@@H]1CCCN1 SRBFGSGDNNQABI-FHWLQOOXSA-N 0.000 description 1
- ULWBBFKQBDNGOY-RWMBFGLXSA-N Pro-Lys-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N2CCC[C@@H]2C(=O)O ULWBBFKQBDNGOY-RWMBFGLXSA-N 0.000 description 1
- WFIVLLFYUZZWOD-RHYQMDGZSA-N Pro-Lys-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WFIVLLFYUZZWOD-RHYQMDGZSA-N 0.000 description 1
- GFHXZNVJIKMAGO-IHRRRGAJSA-N Pro-Phe-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O GFHXZNVJIKMAGO-IHRRRGAJSA-N 0.000 description 1
- HWLKHNDRXWTFTN-GUBZILKMSA-N Pro-Pro-Cys Chemical compound C1C[C@H](NC1)C(=O)N2CCC[C@H]2C(=O)N[C@@H](CS)C(=O)O HWLKHNDRXWTFTN-GUBZILKMSA-N 0.000 description 1
- BGWKULMLUIUPKY-BQBZGAKWSA-N Pro-Ser-Gly Chemical compound OC(=O)CNC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 BGWKULMLUIUPKY-BQBZGAKWSA-N 0.000 description 1
- SNGZLPOXVRTNMB-LPEHRKFASA-N Pro-Ser-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N2CCC[C@@H]2C(=O)O SNGZLPOXVRTNMB-LPEHRKFASA-N 0.000 description 1
- 241000712907 Retroviridae Species 0.000 description 1
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 1
- QEDMOZUJTGEIBF-FXQIFTODSA-N Ser-Arg-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O QEDMOZUJTGEIBF-FXQIFTODSA-N 0.000 description 1
- HQTKVSCNCDLXSX-BQBZGAKWSA-N Ser-Arg-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O HQTKVSCNCDLXSX-BQBZGAKWSA-N 0.000 description 1
- ZXLUWXWISXIFIX-ACZMJKKPSA-N Ser-Asn-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZXLUWXWISXIFIX-ACZMJKKPSA-N 0.000 description 1
- FIDMVVBUOCMMJG-CIUDSAMLSA-N Ser-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO FIDMVVBUOCMMJG-CIUDSAMLSA-N 0.000 description 1
- MMAPOBOTRUVNKJ-ZLUOBGJFSA-N Ser-Asp-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CO)N)C(=O)O MMAPOBOTRUVNKJ-ZLUOBGJFSA-N 0.000 description 1
- RNFKSBPHLTZHLU-WHFBIAKZSA-N Ser-Cys-Gly Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)NCC(=O)O)N)O RNFKSBPHLTZHLU-WHFBIAKZSA-N 0.000 description 1
- SVWQEIRZHHNBIO-WHFBIAKZSA-N Ser-Gly-Cys Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CS)C(O)=O SVWQEIRZHHNBIO-WHFBIAKZSA-N 0.000 description 1
- SNVIOQXAHVORQM-WDSKDSINSA-N Ser-Gly-Gln Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O SNVIOQXAHVORQM-WDSKDSINSA-N 0.000 description 1
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 1
- BXLYSRPHVMCOPS-ACZMJKKPSA-N Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO BXLYSRPHVMCOPS-ACZMJKKPSA-N 0.000 description 1
- SBMNPABNWKXNBJ-BQBZGAKWSA-N Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CO SBMNPABNWKXNBJ-BQBZGAKWSA-N 0.000 description 1
- ZKBKUWQVDWWSRI-BZSNNMDCSA-N Ser-Phe-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKBKUWQVDWWSRI-BZSNNMDCSA-N 0.000 description 1
- DINQYZRMXGWWTG-GUBZILKMSA-N Ser-Pro-Pro Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DINQYZRMXGWWTG-GUBZILKMSA-N 0.000 description 1
- SZRNDHWMVSFPSP-XKBZYTNZSA-N Ser-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N)O SZRNDHWMVSFPSP-XKBZYTNZSA-N 0.000 description 1
- KIEIJCFVGZCUAS-MELADBBJSA-N Ser-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CO)N)C(=O)O KIEIJCFVGZCUAS-MELADBBJSA-N 0.000 description 1
- BIWBTRRBHIEVAH-IHPCNDPISA-N Ser-Tyr-Trp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O BIWBTRRBHIEVAH-IHPCNDPISA-N 0.000 description 1
- OSFZCEQJLWCIBG-BZSNNMDCSA-N Ser-Tyr-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OSFZCEQJLWCIBG-BZSNNMDCSA-N 0.000 description 1
- ANOQEBQWIAYIMV-AEJSXWLSSA-N Ser-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ANOQEBQWIAYIMV-AEJSXWLSSA-N 0.000 description 1
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 1
- 108091081024 Start codon Proteins 0.000 description 1
- 108091027544 Subgenomic mRNA Proteins 0.000 description 1
- 101800001271 Surface protein Proteins 0.000 description 1
- VPZKQTYZIVOJDV-LMVFSUKVSA-N Thr-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(O)=O VPZKQTYZIVOJDV-LMVFSUKVSA-N 0.000 description 1
- TZKPNGDGUVREEB-FOHZUACHSA-N Thr-Asn-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O TZKPNGDGUVREEB-FOHZUACHSA-N 0.000 description 1
- UHBPFYOQQPFKQR-JHEQGTHGSA-N Thr-Gln-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O UHBPFYOQQPFKQR-JHEQGTHGSA-N 0.000 description 1
- CYVQBKQYQGEELV-NKIYYHGXSA-N Thr-His-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O CYVQBKQYQGEELV-NKIYYHGXSA-N 0.000 description 1
- FKIGTIXHSRNKJU-IXOXFDKPSA-N Thr-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CN=CN1 FKIGTIXHSRNKJU-IXOXFDKPSA-N 0.000 description 1
- LUMXICQAOKVQOB-YWIQKCBGSA-N Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)[C@@H](C)O LUMXICQAOKVQOB-YWIQKCBGSA-N 0.000 description 1
- LCCSEJSPBWKBNT-OSUNSFLBSA-N Thr-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N LCCSEJSPBWKBNT-OSUNSFLBSA-N 0.000 description 1
- WFAUDCSNCWJJAA-KXNHARMFSA-N Thr-Lys-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(O)=O WFAUDCSNCWJJAA-KXNHARMFSA-N 0.000 description 1
- WYLAVUAWOUVUCA-XVSYOHENSA-N Thr-Phe-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WYLAVUAWOUVUCA-XVSYOHENSA-N 0.000 description 1
- BIBYEFRASCNLAA-CDMKHQONSA-N Thr-Phe-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 BIBYEFRASCNLAA-CDMKHQONSA-N 0.000 description 1
- NDXSOKGYKCGYKT-VEVYYDQMSA-N Thr-Pro-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O NDXSOKGYKCGYKT-VEVYYDQMSA-N 0.000 description 1
- DEGCBBCMYWNJNA-RHYQMDGZSA-N Thr-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O DEGCBBCMYWNJNA-RHYQMDGZSA-N 0.000 description 1
- KERCOYANYUPLHJ-XGEHTFHBSA-N Thr-Pro-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O KERCOYANYUPLHJ-XGEHTFHBSA-N 0.000 description 1
- HUPLKEHTTQBXSC-YJRXYDGGSA-N Thr-Ser-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HUPLKEHTTQBXSC-YJRXYDGGSA-N 0.000 description 1
- MFMGPEKYBXFIRF-SUSMZKCASA-N Thr-Thr-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MFMGPEKYBXFIRF-SUSMZKCASA-N 0.000 description 1
- ZMYCLHFLHRVOEA-HEIBUPTGSA-N Thr-Thr-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ZMYCLHFLHRVOEA-HEIBUPTGSA-N 0.000 description 1
- BGHVVGPELPHRCI-HZTRNQAASA-N Thr-Trp-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC3=CNC4=CC=CC=C43)C(=O)O)N)O BGHVVGPELPHRCI-HZTRNQAASA-N 0.000 description 1
- PWONLXBUSVIZPH-RHYQMDGZSA-N Thr-Val-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O PWONLXBUSVIZPH-RHYQMDGZSA-N 0.000 description 1
- BTAJAOWZCWOHBU-HSHDSVGOSA-N Thr-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)O)C(C)C)C(O)=O)=CNC2=C1 BTAJAOWZCWOHBU-HSHDSVGOSA-N 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- IBBBOLAPFHRDHW-BPUTZDHNSA-N Trp-Asn-Arg Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N IBBBOLAPFHRDHW-BPUTZDHNSA-N 0.000 description 1
- DVWAIHZOPSYMSJ-ZVZYQTTQSA-N Trp-Glu-Val Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O)=CNC2=C1 DVWAIHZOPSYMSJ-ZVZYQTTQSA-N 0.000 description 1
- DZHDVYLBNKMLMB-ZFWWWQNUSA-N Trp-Lys Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 DZHDVYLBNKMLMB-ZFWWWQNUSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/005—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K47/00—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient
- A61K47/50—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates
- A61K47/69—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the conjugate being characterised by physical or galenical forms, e.g. emulsion, particle, inclusion complex, stent or kit
- A61K47/6901—Conjugates being cells, cell fragments, viruses, ghosts, red blood cells or viral vectors
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K1/00—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
- C07K1/107—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length by chemical modification of precursor peptides
- C07K1/1072—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length by chemical modification of precursor peptides by covalent attachment of residues or functional groups
- C07K1/1075—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length by chemical modification of precursor peptides by covalent attachment of residues or functional groups by covalent attachment of amino acids or peptide residues
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/475—Growth factors; Growth regulators
- C07K14/485—Epidermal growth factor [EGF], i.e. urogastrone
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/62—DNA sequences coding for fusion proteins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
- C12N15/86—Viral vectors
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N7/00—Viruses; Bacteriophages; Compositions thereof; Preparation or purification thereof
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2740/00—Reverse transcribing RNA viruses
- C12N2740/00011—Details
- C12N2740/10011—Retroviridae
- C12N2740/13011—Gammaretrovirus, e.g. murine leukeamia virus
- C12N2740/13022—New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2740/00—Reverse transcribing RNA viruses
- C12N2740/00011—Details
- C12N2740/10011—Retroviridae
- C12N2740/13011—Gammaretrovirus, e.g. murine leukeamia virus
- C12N2740/13041—Use of virus, viral particle or viral elements as a vector
- C12N2740/13043—Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2740/00—Reverse transcribing RNA viruses
- C12N2740/00011—Details
- C12N2740/10011—Retroviridae
- C12N2740/13011—Gammaretrovirus, e.g. murine leukeamia virus
- C12N2740/13041—Use of virus, viral particle or viral elements as a vector
- C12N2740/13045—Special targeting system for viral vectors
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2740/00—Reverse transcribing RNA viruses
- C12N2740/00011—Details
- C12N2740/10011—Retroviridae
- C12N2740/13011—Gammaretrovirus, e.g. murine leukeamia virus
- C12N2740/13061—Methods of inactivation or attenuation
- C12N2740/13062—Methods of inactivation or attenuation by genetic engineering
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2810/00—Vectors comprising a targeting moiety
- C12N2810/50—Vectors comprising as targeting moiety peptide derived from defined protein
- C12N2810/60—Vectors comprising as targeting moiety peptide derived from defined protein from viruses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2810/00—Vectors comprising a targeting moiety
- C12N2810/50—Vectors comprising as targeting moiety peptide derived from defined protein
- C12N2810/60—Vectors comprising as targeting moiety peptide derived from defined protein from viruses
- C12N2810/6009—Vectors comprising as targeting moiety peptide derived from defined protein from viruses dsDNA viruses
- C12N2810/6018—Adenoviridae
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2810/00—Vectors comprising a targeting moiety
- C12N2810/50—Vectors comprising as targeting moiety peptide derived from defined protein
- C12N2810/60—Vectors comprising as targeting moiety peptide derived from defined protein from viruses
- C12N2810/6045—RNA rev transcr viruses
- C12N2810/6054—Retroviridae
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2810/00—Vectors comprising a targeting moiety
- C12N2810/50—Vectors comprising as targeting moiety peptide derived from defined protein
- C12N2810/80—Vectors comprising as targeting moiety peptide derived from defined protein from vertebrates
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2810/00—Vectors comprising a targeting moiety
- C12N2810/50—Vectors comprising as targeting moiety peptide derived from defined protein
- C12N2810/80—Vectors comprising as targeting moiety peptide derived from defined protein from vertebrates
- C12N2810/85—Vectors comprising as targeting moiety peptide derived from defined protein from vertebrates mammalian
- C12N2810/851—Vectors comprising as targeting moiety peptide derived from defined protein from vertebrates mammalian from growth factors; from growth regulators
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2810/00—Vectors comprising a targeting moiety
- C12N2810/50—Vectors comprising as targeting moiety peptide derived from defined protein
- C12N2810/80—Vectors comprising as targeting moiety peptide derived from defined protein from vertebrates
- C12N2810/85—Vectors comprising as targeting moiety peptide derived from defined protein from vertebrates mammalian
- C12N2810/852—Vectors comprising as targeting moiety peptide derived from defined protein from vertebrates mammalian from cytokines; from lymphokines; from interferons
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2810/00—Vectors comprising a targeting moiety
- C12N2810/50—Vectors comprising as targeting moiety peptide derived from defined protein
- C12N2810/80—Vectors comprising as targeting moiety peptide derived from defined protein from vertebrates
- C12N2810/85—Vectors comprising as targeting moiety peptide derived from defined protein from vertebrates mammalian
- C12N2810/854—Vectors comprising as targeting moiety peptide derived from defined protein from vertebrates mammalian from hormones
Abstract
The invention features the use of a peptide for transferring genes into a eukaryotic target cell, which peptide has about 10 to about 200, in particular about 15 to about 150 amino acids, and advantageously about 20 amino acids, in which at least 30 % of the amino acids are proline residues, which proline residues are regularly arranged so as to induce turns of the polypeptide chain of about 180° ("β-turn" or "reverse turn"), these turns being regularly spaced and assembling into a polyproline β-turn helix, in a polypeptide construction containing, on the N-terminal side (upstream) of the said peptide, an N-terminal (upstream) protein domain capable of recognising a targeted surface molecule or an antigen expressed on a cell surface, in particular an appropriate receptor (targeted receptor) located on the said eukaryotic cell, and, on the C-terminal side (downstream) of the said peptide, a C-terminal (downstream) protein domain capable of recognising an appropriate receptor (auxiliary receptor) located on the said eukaryotic cell, which peptide is capable of facilitating or inhibiting interaction between the C-terminal (downstream) protein domain and the auxiliary receptor, the inhibition of this interaction taking place so long as the N-terminal (upstream) protein domain has not interacted with the targeted receptor, and the facilitation of this interaction taking place once the N-terminal (upstream) protein domain has interacted with the targeted receptor.
Description
VIRAL PARTICLES WHICH ARE MASKED OR UNMASKED WITH RESPECT TO A CELL RECEPTOR
The invention relates to recombinant viral particles containing a peptide possessing properties of masking and of unmasking with respect to a biological mechanism, notably with respect to a mechanism of cellular interaction.
The invention also relates to the application of the aforesaid viral particles, notably for cell targeting in gene transfer.
Retroviruses and therefore retrovirus vectors initiate their infectious cycle by recognizing specific cell surface molecules, called retrovirus receptors, with envelope glycoproteins expressed on the surface of the retroviral particles. This recognition then leads to fusion between the viral and cellular membranes, a process that is complex and poorly understood, which is also mediated by a second function of the envelope glycoprotein.
The possibility of altering the specificity of the interaction with the surface of the target cell has been demonstrated previously, notably by means of genetic modifications introduced in the retroviral envelope glycoprotein.
A number of studies have shown that such modifications do not disturb the complex processes that permit the retroviral envelope glycoprotein to mature, to be expressed on the cell surface, and to be incorporated selectively in the virions. Moreover, it has now been shown that these modifications can lead to the specific recognition of cells by interaction with the surface molecules corresponding to these polypeptides. Finally, in certain cases, this recognition permits continuation of the infectious cycle and integration of the transgene in the targeted cell, but with an efficiency that, even in the best possible case, is greatly reduced relative to the efficiency provided by an unmodified retroviral envelope via its normal retrovirus receptor. Hence it is concluded, on the one hand, that certain target surface molecules cannot be utilized as a receptor for initiating infection and, on the other hand, that when they can be so utilized, the processes following the primary interaction take place with extremely low efficiency. These conclusions hold for "one-stage targeting" strategies, i.e. approaches in which the aim of the primary interaction is to lead directly to continuation of the infectious cycle, without the intervention of auxiliary processes.
Retrovirus vectors are now the most-used vectors for gene transfer and in particular for gene therapy, as we require stable integration and expression of the transgene. Other gene transfer vectors exist (adenovirus vectors, liposomes, vectors derived from herpesviruses, or vectors derived from AAVs) but do not permit stable and efficient integration of the transgene. Whereas most of the gene therapy protocols examined up to now and using retroviral vectors are based on the explantation of the patient's cells, their transgenesis and expansion in vitro, followed by their reimplantation, it would be highly desirable to be able to transfer a therapeutic gene in the in vivo context by means of retroviral vectors. For that, the retroviral particle carrying the therapeutic gene would have to be endowed with certain additional characteristics, and more particularly, the ability to recognize very specifically the target cells of the gene transfer. In fact, the surface molecules that are recognized naturally by the retroviruses for initiation of infection are expressed very widely on most cells.
This does not permit precise discrimination of the cells in which one wishes to effect a gene transfer.
Several works have had the aim of altering the infection tropism of retroviral vectors. Some of these works are based on biochemical modifications of retroviral particles, others on genetic modifications of the retroviral envelope glycoproteins, which guarantees that all the retroviral particles will be altered. In this last context, the works have consisted of "single-stage targeting", for which the modified viral particle attaches to the targeted cell surface molecule, leading to continuation of the infectious cycle.
However, the efficiency of the retroviral vectors altered in this way is very low relative to what is required for obtaining a tool that can be used for purposes of therapy.
It is possible that the development of targeting strategies could not succeed with the "single-stage" system, for reasons already mentioned above: inefficiency of the gene fusion process of the chimeric envelope glycoproteins after their binding on the targeted surface molecule, and the impossibility of utilizing certain surface molecules as retroviral receptors.
A certain number of human gene therapy protocols will require retroviral vectors that are capable of effecting gene transfer in vivo, by direct inoculation of the recombinant retroviral particles. Among the improvements that this presupposes relative to the retroviral vectors developed until now, we may cite:
- improvement of the infectious titres,
- improvement of the stability of the viral particles in the serum and more generally in the various body fluids,
- the possibility of infecting quiescent cells,
- the possibility of discrimination of the target cells of gene therapy.
The invention has the aim of proposing means for discriminating the target cells of gene therapy. It is essential, for certain applications in gene therapy, to guarantee that gene transfer will only have taken place in the cells to be treated, and not in other categories of cells. For example, when we wish to confer a selective advantage on normal cells with respect to a chemotherapy, it is imperative that the transferred gene conferring this advantage has not been introduced into cancer cells.
A CELL RECEPTOR
The invention relates to recombinant viral particles containing a peptide possessing properties of masking and of unmasking with respect to a biological mechanism, notably with respect to a mechanism of cellular interaction.
The invention also relates to the application of the aforesaid viral particles, notably for cell targeting in gene transfer.
Retroviruses and therefore retrovirus vectors initiate their infectious cycle by recognizing specific cell surface molecules, called retrovirus receptors, with envelope glycoproteins expressed on the surface of the retroviral particles. This recognition then leads to fusion between the viral and cellular membranes, a process that is complex and poorly understood, which is also mediated by a second function of the envelope glycoprotein.
The possibility of altering the specificity of the interaction with the surface of the target cell has been demonstrated previously, notably by means of genetic modifications introduced in the retroviral envelope glycoprotein.
A certain number of works have shown that such modifications do not involve disturbing effects in the complex processes that permit the retroviral envelope glycoprotein to become mature, to be expressed on the cell surface, and to be incorporated selectively in the virions. Moreover, it has now been proved that these modifications can lead to the specific recognition of cells by interaction with the surface molecules corresponding to these polypeptides. Finally, in certain cases, this recognition permits continuation of the infectious cycle and integration of the transgene in the targeted cell with, in the best possible case, an ei~iciency that is greatly reduced relative to the efficiency that is provided by an unmodified retroviral envelope via its normal retrovirus receptor. Hence it is concluded, on the one hand, that certain target surface molecules cannot be utilized as a receptor for initiating infection and, on the other hand, when they can be so utilized, the processes following the primary interaction take place with extremely low efficiency. These conclusions are true with regard to "one stage targeting" strategies, i.e. approaches in which the aim of the primary interaction is to lead directly to continuation of the infectious cycle, without the intervention of auxiliary processes.
Retrovirus vectors are now the most-used vectors for gene transfer and in particular for gene therapy, where stable integration and expression of the transgene are required. Other gene transfer vectors exist (adenovirus vectors, liposomes, vectors derived from herpesviruses, or vectors derived from AAVs) but do not permit stable and efficient integration of the transgene. Whereas most of the gene therapy protocols examined up to now and using retroviral vectors are based on the explantation of the patient's cells, their transgenesis and expansion in vitro, followed by their reimplantation, it would be highly desirable to be able to transfer a therapeutic gene in the in vivo context by means of retroviral vectors. For that, the retroviral particle carrying the therapeutic gene would have to be endowed with certain additional characteristics, and more particularly, the ability to recognize very specifically the target cells of the gene transfer. In fact, the surface molecules that are recognized naturally by the retroviruses for initiation of infection are expressed very widely on most of the cells.
This does not permit precise discrimination of the cells in which one wishes to effect a gene transfer.
Several works have had the aim of altering the infection tropism of retroviral vectors. Some of these works are based on biochemical modifications of retroviral particles, others on genetic modifications of the retroviral envelope glycoproteins, which guarantees that all the retroviral particles will be altered. In this last context, the works have consisted of "single-stage targeting", for which the modified viral particle attaches to the targeted cell surface molecule, leading to continuation of the infectious cycle.
However, the efficiency of the retroviral vectors altered in this way is very low relative to what is required for obtaining a tool that can be used for purposes of therapy.
It is possible that the development of targeting strategies could not succeed with the "single-stage" system, for reasons already mentioned above: inefficiency of the gene fusion process of the chimeric envelope glycoproteins after their binding on the targeted surface molecule, and the impossibility of utilizing certain surface molecules as retroviral receptors.
A certain number of human gene therapy protocols will require retroviral vectors that are capable of effecting gene transfer in vivo, by direct inoculation of the recombinant retroviral particles. Among the improvements that this presupposes relative to the retroviral vectors developed until now, we may cite:
- improvement of the infectious titres,
- improvement of the stability of the viral particles in the serum and more generally in the various body fluids,
- the possibility of infecting quiescent cells,
- the possibility of discrimination of the target cells of gene therapy.
The invention has the aim of proposing means for discriminating the target cells of gene therapy. It is essential, for certain applications in gene therapy, to guarantee that gene transfer will only have taken place in the cells to be treated, and not in other categories of cells. For example, when we wish to confer a selective advantage on normal cells with respect to a chemotherapy, it is imperative that the transferred gene conferring this advantage has not been introduced into cancer cells.
The invention relates to a two-stage mechanism, in which the second stage is dependent on realization of the first stage.
The invention relates to an alternative that is beneficial with regard to performance in targeting, particularly because it combines specific recognition of the target cell and entry into the target cell connected with a natural retroviral mechanism, known for its efficiency.
The invention relates more particularly to a two-stage targeting mechanism:
- the first stage permitting recognition of a targeted surface molecule by means of the new N-terminal binding domain, inserted in an envelope glycoprotein,
- the second stage permitting conditional recognition of a normal retroviral receptor via a domain inherent in the initial envelope glycoprotein and thus permitting a relay in the process of entry of the viral particle into the cell, the term "conditional" signifying that the relay in the entry mechanism can only be effected if the viral particle has previously interacted with the initial surface molecule, which in turn guarantees that the infection is truly targeted.
The invention relates to new peptides for carrying out the first stage of a two-stage mechanism. These peptides perform the role of "masking" with respect to the second stage for as long as the first stage has not taken place, and the role of "unmasking", i.e. permitting the second stage, if and only if the first stage has taken place.
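The conditional logic of the two-stage mechanism can be illustrated by a minimal sketch. It is a purely logical model, not a description of the molecular events: the class `Cell`, its two boolean fields and the function `entry_permitted` are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical description of a cell by the receptors it expresses."""
    has_targeted_receptor: bool   # surface molecule recognized in the first stage
    has_auxiliary_receptor: bool  # normal retroviral receptor used in the second stage


def entry_permitted(cell: Cell) -> bool:
    """Return True only if the first stage, then the second stage, can take place."""
    # First stage: the N-terminal binding domain recognizes the targeted surface molecule.
    first_stage_done = cell.has_targeted_receptor
    # The peptide keeps the downstream domain masked until the first stage has occurred.
    downstream_domain_unmasked = first_stage_done
    # Second stage: only an unmasked domain can recognize the auxiliary receptor.
    return downstream_domain_unmasked and cell.has_auxiliary_receptor
```

In this model a cell that expresses the normal retroviral receptor but not the targeted surface molecule is not entered, which corresponds to the "masking" role; entry requires both receptors, which corresponds to "unmasking" after the first stage.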
The present invention also relates to the construction of chimeric envelope glycoproteins using these novel peptides.
The invention relates to the use of a peptide for transfer of genes into a target eukaryotic cell, this peptide containing from about 10 to about 200, especially from about 15 to about 150 amino acids, and preferably about 20 amino acids, in which at least 30% of the amino acids are made up of proline residues, these proline residues being regularly arranged so as to induce turnings of the polypeptide chain to about 180° ("β-turn" or "reverse-turn"), these turnings being evenly spaced and forming a polyproline helix with β type turning ("polyproline β-turn helix"), in a polypeptide construction containing, on the N-terminal side (upstream) of the said peptide, an N-terminal (upstream) protein region capable of recognizing a targeted surface molecule or an antigen expressed on a cell surface, especially a suitable receptor (targeted receptor) located on the said eukaryotic cell, and on the C-terminal side (downstream) of the said peptide, a C-terminal (downstream) protein region capable of recognizing a suitable receptor (auxiliary receptor) located on the aforesaid eukaryotic cell, this peptide being capable of promoting or inhibiting interaction between the C-terminal (downstream) protein region and the auxiliary receptor, inhibition of this interaction occurring for as long as the N-terminal (upstream) protein domain has not interacted with the targeted receptor and promotion of interaction between the C-terminal (downstream) protein domain and the auxiliary receptor occurring when the N-terminal (upstream) protein domain has interacted with the targeted receptor.
In the case of a peptide of 20 amino acids (ΔPRO defined below), this β-turn polyproline helix contains four β turnings and therefore 4 turns, and moreover is incompatible with an α-helix or β-sheet secondary structure. Advantageously, the polyproline helix with β type turning positioned between the two domains of the chimeric protein (N-terminal domain and auxiliary domain) possesses intrinsically: 1) an elastomeric force, 2) the property of self assembly with other polyproline helices, probably in connection with the trimeric nature of the envelope, 3) the property of transmitting, to the auxiliary domain, a distortion that is induced by binding of the N-terminal domain with its receptor, causing activation of the auxiliary domain.
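The composition criteria stated above for the peptide (a length of about 10 to about 200 amino acids, at least 30% proline, regularly arranged proline residues) can be checked on a candidate sequence with a short sketch. The function names are hypothetical, the hard thresholds are a simplification of the "about" ranges in the text, and the example sequence used in the usage note is an invented placeholder, not one of the peptides of the invention.

```python
from typing import List


def proline_fraction(sequence: str) -> float:
    """Fraction of residues that are proline ('P' in the one-letter code)."""
    residues = sequence.strip().upper()
    if not residues:
        return 0.0
    return residues.count("P") / len(residues)


def meets_composition_criteria(
    sequence: str,
    min_length: int = 10,   # "about 10" in the text, treated here as a hard bound
    max_length: int = 200,  # "about 200" in the text, treated here as a hard bound
    min_proline_fraction: float = 0.30,
) -> bool:
    """Check the length range and the minimum proline content."""
    length = len(sequence.strip())
    in_range = min_length <= length <= max_length
    return in_range and proline_fraction(sequence) >= min_proline_fraction


def proline_spacings(sequence: str) -> List[int]:
    """Distances between successive prolines, as a crude view of their regularity."""
    residues = sequence.strip().upper()
    positions = [i for i, residue in enumerate(residues) if residue == "P"]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]
```

For the invented 20-residue placeholder `"GPSP" * 5`, `proline_fraction` gives 0.5, `meets_composition_criteria` is satisfied, and every value returned by `proline_spacings` is 2. Such a check says nothing about whether a sequence actually forms a polyproline β-turn helix.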
The invention also relates in general to any two-stage mechanism, in which the second stage can only be effected if the first stage has taken place, and relates for example to an enzymatic mechanism involving a chimeric protein which is only to occur if the chimeric protein is able to recognize its substrate.
The expression "N-terminal (upstream) protein domain capable of recognizing a targeted surface molecule, or an antigen expressed on a cell surface", means that:
1) the interaction between this N-terminal protein domain and the targeted surface molecule can be characterized by a dissociation constant (of nanomolar order with respect to interaction between wild-type retroviral envelope glycoprotein and retroviral receptor);
2) the soluble form of this N-terminal protein domain (i.e. not associated in the construction of the chimeric envelope glycoprotein) possesses binding characteristics similar to this same protein domain when it is inserted at the N-terminal position in the chimeric envelope glycoprotein;
3) the chimeric envelope glycoprotein containing the N-terminal protein domain can be characterized according to classical techniques of virology (e.g. binding test; cf "Examples").
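To give a sense of what a dissociation constant of nanomolar order implies, the following sketch computes the fraction of receptor occupied at equilibrium. It assumes a simple 1:1 binding model, which the text does not state; the function name and the concentrations in the usage note are illustrative assumptions.

```python
def fraction_bound(ligand_concentration_molar: float, kd_molar: float) -> float:
    """Equilibrium receptor occupancy for simple 1:1 binding: [L] / ([L] + Kd)."""
    if ligand_concentration_molar < 0:
        raise ValueError("ligand concentration must be non-negative")
    if kd_molar <= 0:
        raise ValueError("dissociation constant must be positive")
    return ligand_concentration_molar / (ligand_concentration_molar + kd_molar)
```

Under this model, a binding domain present at a concentration equal to its dissociation constant (for example 1 nM for a Kd of 1 nM) occupies half of the available receptor, and occupancy rises towards 1 as the concentration exceeds the Kd.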
The following may be mentioned as examples of targeted surface molecule or of antigen expressed on a cell surface:
- markers for differentiating the various haematopoietic lineages, in particular markers expressed on immature cells and/or haematopoietic stem cells (example: CD34),
- markers expressed on tumour cells (example: carcino-embryonic antigens),
- markers present specifically on various differentiated tissues (example: receptor of growth factors, of peptide hormones).
As an example of a targeted surface molecule, we may mention in particular a receptor which will be designated as targeted receptor hereinafter. For convenience of terminology, the expression "targeted receptor" will be used in the following to encompass any targeted surface molecule or any antigen expressed on a cellular surface.
The expression "C-terminal (downstream) protein domain capable of recognizing a suitable receptor (auxiliary receptor)" means that the C-terminal protein domain can interact with the auxiliary receptor, this interaction being characterized by a dissociation constant which is of nanomolar order if the C-terminal protein domain is derived from a retroviral envelope glycoprotein and if the auxiliary receptor is the retroviral receptor used by this same glycoprotein, this interaction permitting the triggering of the gene fusion process in a mechanism that is strictly similar to the natural process, i. e. outside of the context of a chimeric envelope glycoprotein.
The peptide that is the subject of the invention is such that, positioned between two protein domains (an N-terminal protein domain relative to the said peptide and a C-terminal protein domain relative to the said peptide), it can induce the function of the C-terminal protein domain (for example binding if that is the function of this C-terminal domain) if, and only if, the N-terminal protein domain has been mobilized in its function (for example binding).
Non-induction of the function of the C-terminal protein domain by the peptide of the invention corresponds to the mechanism of "masking" of the peptide of the invention, whereas induction of the function of the C-terminal protein domain by the peptide of the invention corresponds to the mechanism of "unmasking" of the peptide of the invention.
That is why the peptide of the invention will also be designated hereinafter as "masking/unmasking peptide".
The invention relates to the use of a peptide according to the invention, in the construction of a glycoprotein with targeting and gene-fusion activity, essentially intact, carried by a viral or non-viral recombinant gene-transfer vector capable of infecting a eukaryotic cell, the said eukaryotic cell possessing a targeted receptor and an auxiliary receptor permitting facilitation of entry of the said viral or non-viral vector into the eukaryotic cell, the aforesaid glycoprotein comprising:
- the aforesaid peptide,
- a protein domain on the N-terminal (upstream) side of the said peptide, capable of interacting with the above-mentioned targeted receptor, this protein domain permitting specific binding of the aforesaid gene-transfer vector, and
- a protein domain on the C-terminal (downstream) side of the said peptide, capable of interacting with the aforesaid auxiliary receptor, this interaction performing the role of auxiliary mechanism of entry of the aforesaid gene-transfer vector into the eukaryotic cell,
the process of cell entry of the viral or non-viral recombinant vector into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to occur when the N-terminal (upstream) protein domain has recognized and bound the viral or non-viral recombinant vector with the targeted receptor of the eukaryotic cell, leading, by means of the aforesaid peptide, to a mechanism of "unmasking" or of accessibility of the auxiliary receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the aforesaid gene-transfer vector and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, a mechanism of "masking" or of non-accessibility is produced, by means of the aforesaid peptide, of the auxiliary receptor with respect to the C-terminal (downstream) protein domain.
The expression glycoprotein with targeting and gene-fusion activity denotes a glycoprotein which is:
1) capable of being incorporated efficiently on (retro)viral particles carrying a transgene,
2) capable of specifically recognizing the targeted cell-surface molecule and of specifically redirecting the binding of the (retro)viral particle which carries it to this molecule,
3) capable of causing fusion, after fixation on the molecular target, of the membrane of the (retro)viral particle and the cytoplasmic membrane of the cell, according to the mechanism used naturally by the (retro)virus from which the envelope glycoprotein was derived.
The expression "substantially intact" refers to a viral glycoprotein that retains all its necessary determinants for preserving the post-translation processes: oligomerization, the properties of viral incorporation and of fusion, as required. However, certain changes (such as mutations, deletions, additions) can be made to the glycoprotein without significantly affecting its functions and the glycoproteins containing these minor changes are regarded as substantially intact for the needs of the invention.
In particular, the glycoprotein may lack some amino acids (for example about 1 to 10), especially at the N-terminal end, but will generally be of the same size as the wild-type protein and possess essentially the same biological properties as the wild-type protein.
The expression "viral recombinant gene-transfer vector" means any virus capable of infecting cells of the eukaryotic type, and preferably a virus that is suitable for gene therapy, such as an adenovirus or a retrovirus (for example a type C retrovirus).
The expression "non-viral recombinant gene-transfer vector" means macromolecular complexes combining the DNA containing the transferred gene, its regulatory sequences, and molecules belonging to the class of lipids, carbohydrates, or proteins, which possess functional properties capable of: 1) targeting deposition of DNA on the surface of the target cell, 2) introducing this DNA into the targeted cell, and 3) introducing this DNA into the nucleus of the targeted cell.
The expression "process of cell entry of the viral recombinant gene-transfer vector" means all of the events leading to introduction of the transported gene into the cytoplasm of the targeted cell following initial contact between the surface of this cell and the gene-transfer vector.
As an example, for retroviral vectors, in relation to a defined cellular target for which a "targetable" surface molecule is known (i.e. sufficiently specific relative to the other tissues) and a ligand for the surface molecule (ligand or single-chain antibody), a gene coding for the envelope glycoprotein targeting this surface molecule can be constructed genetically. This is accomplished by fusing (from N to C-terminal) a signal peptide, the ligand, the "masking/unmasking" peptide, and the rest of the retroviral envelope. An expression vector for this chimeric molecule is inserted into a "semi-transcomplementing" cell line expressing the gag and pol proteins of the MLV virus (coding for the viral capsid and the enzymes of replication of the retrovirus). A "transcomplementing" line is obtained, which can then be used for producing retroviral vectors if a plasmid carrying this retroviral vector is additionally introduced, as occurs with the conventional transcomplementing lines expressing normal retroviral envelopes.
The invention also relates to the use of a peptide according to the invention, in the construction of an essentially intact (retro)viral envelope glycoprotein, carried by a recombinant (retro)viral particle capable of infecting a eukaryotic cell, the said envelope glycoprotein preferably being of polymeric form, and especially of trimeric form, each monomer of the polymeric form being in its turn of heterodimer form, the said eukaryotic cell possessing a targeted receptor and an auxiliary receptor permitting facilitation of entry of the aforesaid (retro)viral particle ((retro)viral receptor) into the eukaryotic cell, the envelope glycoprotein comprising:
- the aforesaid peptide,
- a protein domain on the N-terminal side (upstream) of the aforesaid peptide, capable of interacting with the aforesaid targeted receptor, this interaction permitting specific binding of the (retro)viral particle, and
- a protein domain on the C-terminal side (downstream) of the aforesaid peptide, capable of interacting with the aforesaid (retro)viral receptor, this interaction performing the role of auxiliary mechanism of entry of the (retro)viral particle into the eukaryotic cell,
the process of cell entry of the (retro)viral recombinant particle into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to occur when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the (retro)viral recombinant particle, leading, via the aforesaid peptide, to a mechanism of "unmasking" or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the viral recombinant particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, a mechanism of "masking" or of non-accessibility is produced, by means of the aforesaid peptide, of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain.
The (retro)viral envelope glycoproteins are trimers of heterodimers with surface subunit (SU) and transmembrane subunit (TM). This concept of trimerization is fundamental for the functionality of the (retro)viral envelope. The envelope glycoproteins of the invention are preferably of trimeric form.
According to an advantageous embodiment of the invention, the N-terminal (upstream) protein domain is chosen from the following polypeptides:
- single-chain antibodies recognizing cell-surface molecules,
- any ligand for a cell-surface molecule, especially polypeptide hormones, cytokines, growth factors.
According to an advantageous embodiment of the invention, the C-terminal (downstream) protein domain corresponds to a (retro)viral envelope glycoprotein, essentially intact, including the natural binding domain, the functions of fusion and of attachment of the wild-type envelope glycoprotein from which is derived the envelope glycoprotein carried by the recombinant (retro)viral particle.
According to an advantageous embodiment of the invention, the peptide originates from the envelope glycoprotein of type C retroviruses, and in that the virus is preferably chosen from: the ecotropic MLV virus, the amphotropic MLV virus, the xenotropic MLV virus, the MCF MLV virus, the MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and especially in that the peptide is chosen from those containing or consisting of one of the following sequences: PRO(4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb.
The invention relates to the use of a peptide derived or adapted from bovine elastin and chosen from those containing or consisting of one of the following sequences:
EL3, EL3-V, ELS.
The invention also relates to peptide sequences chosen from those containing or consisting of one of the following sequences:
- PRO(4070A), PRO(MoMLV), PROβ, PRO+, ΔPRO, ΔPROβ, ΔPRO+, MOAPRO, MOAΔPRO,
- EMOPRO, EMOPROβ, EMOPRO+, EAPRO, EAPROβ, EAPRO+, EMOΔPRO, EMOΔPROβ, EMOΔPRO+, EAΔPRO, EAΔPROβ, EAΔPRO+, AMOEL3, AMOEL3-V, AMOELS.
PRO(4070A), PRO(MoMLV), PROβ, PRO+, ΔPRO, ΔPROβ, ΔPRO+, EL3, EL3-V, ELS are masking/unmasking peptides of the invention.
AMOPRO, AMOΔPRO, AMOEL3, AMOEL3-V, AMOELS correspond to Ram-1 targeting envelopes.
MOAPRO, MOAΔPRO correspond to Rec-1 targeting envelopes.
EMOPRO, EMOPROβ, EMOPRO+, EAPRO, EAPROβ, EAPRO+, EMOΔPRO, EMOΔPROβ, EMOΔPRO+, EAΔPRO, EAΔPROβ, EAΔPRO+ correspond to EGFR targeting envelopes.
The invention also relates to a polypeptide sequence containing:
- a peptide of about 10 to about 200, especially from about 15 to about 150 amino acids, and preferably about 20, in which at least 30% of the amino acids consist of proline residues, and these proline residues are regularly arranged so as to induce turnings of the polypeptide chain at about 180° ("β-turn" or "reverse-turn"), these turnings being regularly spaced and assembling themselves into a polyproline β-turn helix,
- an N-terminal protein domain (upstream) of the aforesaid peptide, capable of reacting with a suitable receptor (targeted receptor) located on a eukaryotic cell, and this protein domain permits specific binding of a recombinant (retro)viral particle containing the said N-terminal protein domain, and
- a C-terminal protein domain (downstream) of the aforesaid peptide, capable of interacting with a suitable auxiliary (retro)viral receptor ((retro)viral receptor) located on the said eukaryotic cell, and this interaction performs the role of auxiliary mechanism of entry of the (retro)viral particle into the said eukaryotic cell,
the process of cell entry of the said recombinant (retro)viral particle into the said eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to occur when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the said recombinant (retro)viral particle, leading, by means of the aforesaid peptide, to a mechanism of unmasking or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the recombinant viral particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced a mechanism of masking or of non-accessibility, by means of the aforesaid peptide, of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain.
The invention also relates to a recombinant (retro)viral particle capable of infecting a eukaryotic cell, this cell containing a targeted receptor and an auxiliary receptor of the aforesaid (retro)viral particle, including a substantially intact envelope glycoprotein, especially of polymeric form and preferably of trimeric form, each monomer of the polymeric form preferably being itself of heterodimer form, containing:
- a peptide of about 10 to about 200, especially of about 15 to about 150 amino acids, and preferably of about 20, in which at least 30% of the amino acids are made up of proline residues, these proline residues being regularly arranged so as to induce turnings of the polypeptide chain at about 180° ("β-turn" or "reverse-turn"), these turnings being regularly spaced and assembling themselves into a polyproline β-turn helix,
- a protein domain on the N-terminal side (upstream) of the aforesaid peptide, capable of interacting with the aforesaid targeted receptor, this peptide region permitting specific binding of the (retro)viral particle, and
- a protein domain on the C-terminal side (downstream) of the aforesaid peptide, capable of interacting with the aforesaid (retro)viral receptor, this interaction performing the role of auxiliary mechanism of entry of the (retro)viral particle into the eukaryotic cell,
the process of cell entry of the recombinant (retro)viral particle into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to occur when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the recombinant (retro)viral particle, leading, via the aforesaid peptide, to a mechanism of unmasking or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the recombinant viral particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced a mechanism of masking or of non-accessibility, via the aforesaid peptide, of the retroviral receptor with respect to the C-terminal (downstream) protein domain.
The invention also relates to a recombinant (retro)viral particle characterized in that the N-terminal (upstream) protein domain is chosen from the following peptides:
- single-chain antibody recognizing cell surface molecules,
- any ligand for a cell surface molecule, especially polypeptide hormones, cytokines, growth factors.
The invention also relates to a recombinant (retro)viral particle characterized in that the C-terminal (downstream) protein domain corresponds to a polypeptide of (retro)viral origin possessing functions of binding, of fusion and of attachment of the wild-type envelope glycoprotein from which is derived the envelope glycoprotein carried by the recombinant (retro)viral particle, and can originate from natural regions possessing functions of binding, of fusion and of attachment of the envelope glycoproteins derived from retroviruses MLV-A, GALV, FeLV B, or viruses such as adenoviruses, herpesviruses, AAV (Adeno Associated Virus), or more generally from viral glycoproteins derived from viruses of eukaryotic origin, especially orthomyxoviruses (such as influenza viruses) or paramyxoviruses (such as SV5).
The invention also relates to a recombinant (retro)viral particle characterized in that the peptide is derived from the envelope glycoprotein of type C retroviruses, and in that the peptide is preferably derived from a virus chosen from: ecotropic MLV virus, amphotropic MLV virus, xenotropic MLV virus, MLV MCF virus, MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and especially in that the peptide is chosen from those containing or consisting of one of the following sequences: PRO(4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb.
The invention also relates to a recombinant (retro)viral particle characterized in that:
- the peptide originates from the envelope glycoprotein of type C retroviruses, and in that the virus is preferably chosen from: ecotropic MLV virus, amphotropic MLV virus, xenotropic MLV virus, MLV MCF virus, MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and especially in that the peptide is chosen from those containing or consisting of one of the following sequences: PRO(4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb,
- the N-terminal (upstream) protein domain is chosen from the following peptides:
* single-chain antibodies recognizing cell surface molecules,
* any ligand for a cell surface molecule, especially polypeptide hormones, cytokines, growth factors,
- the C-terminal protein domain corresponds to a polypeptide of (retro)viral origin possessing functions of binding, fusion and attachment of the wild-type envelope glycoprotein from which is derived the envelope glycoprotein carried by the recombinant (retro)viral particle, and can originate from natural regions possessing functions of binding, of fusion and of attachment of the envelope glycoproteins derived from retroviruses MLV-A, GALV, FeLV B, or from viruses such as adenoviruses, herpesviruses, AAV (Adeno Associated Virus), or more generally from viral glycoproteins derived from viruses of eukaryotic origin, especially orthomyxoviruses (such as influenza viruses) or paramyxoviruses (such as SV5).
The invention also relates to a recombinant (retro)viral particle characterized in that the 5' end of the nucleotide sequence coding for the N-terminal (upstream) protein domain is contiguous with the 3' end of the nucleotide sequence coding for the signal peptide, the 3' end of the nucleotide sequence coding for the N-terminal (upstream) protein domain is contiguous with the 5' end of the nucleotide sequence coding for the peptide, and the 3' end of the nucleotide sequence coding for the peptide is contiguous with the 5' end of the nucleotide sequence coding for the C-terminal (downstream) protein domain.
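The contiguity condition on the coding sequences can be expressed as a simple check on segment coordinates. This is a sketch under stated assumptions: the segment names, the half-open 5' to 3' coordinate convention and the function `is_contiguous_in_order` are hypothetical, and the coordinates in the usage note are invented placeholders rather than values from the constructs described.

```python
from typing import List, Tuple

# A coding segment as (name, start, end), in 5'->3' nucleotide coordinates, end exclusive.
Segment = Tuple[str, int, int]

# Order of the coding segments from 5' to 3', as described in the text.
EXPECTED_ORDER = [
    "signal_peptide",
    "n_terminal_domain",
    "peptide",
    "c_terminal_domain",
]


def is_contiguous_in_order(segments: List[Segment]) -> bool:
    """True if the segments follow the expected order and each 3' end meets the next 5' end."""
    names = [name for name, _start, _end in segments]
    if names != EXPECTED_ORDER:
        return False
    # Each segment must end exactly where the following one begins (no gap, no overlap).
    return all(
        previous[2] == following[1]
        for previous, following in zip(segments, segments[1:])
    )
```

For instance, the invented layout `[("signal_peptide", 0, 99), ("n_terminal_domain", 99, 258), ("peptide", 258, 318), ("c_terminal_domain", 318, 2000)]` passes the check, whereas moving the start of the peptide segment to 261 introduces a gap and fails it.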
The invention also relates to a nucleic acid coding for a peptide or for a recombinant particle according to the invention.
The invention also relates to a method of selective in vitro or ex vivo transfer of a nucleic acid into eukaryotic target cells present among other non-target cells, comprising the administration to the target and non-target cells, of a recombinant (retro)viral particle according to the invention, containing the nucleic acid to be transferred.
The invention also relates to a pharmaceutical composition containing, as active substance, a (retro)viral particle according to the invention, and also containing a gene to be transferred, together with a physiologically suitable pharmaceutical vehicle.
With regard to genes to be transferred that are important for gene therapy, these are for example IFN, IL2, p53, VEGF, TNF, CFTR, HSV-TK, lacZ, GFP, gene of various cytokines, other types of suicide genes including conditional suicide genes, other genes with antiviral activity, other genes with antitumour activity, other marker genes and any gene for therapy of a mono- or multigenic disease. As an example, the pathologies most specifically involved are: most mono- or multigenic diseases (mucoviscidosis, myopathy, lysosomal diseases), various forms of cancer, viral diseases (AIDS), etc.
For a proper understanding of the mechanism of the invention (see Fig. 1), we must bear in mind that the envelope glycoproteins according to the invention (also denoted by "chimeric envelopes") possess, as well as an additional recognition region, the functions corresponding to their own particular regions; that is (see Fig. 2), 1) the natural binding domain located in the N-terminal part of the surface subunit (SU) of the wild-type envelope glycoprotein, and therefore just downstream of the supernumerary binding domain and 2) the fusion domain located in the C-terminal part of the subunit (SU) and in the transmembrane subunit (TM) of the envelope glycoprotein complex. For the chimeric envelopes constructed previously (EMO and AMO envelopes, for example), on the basis of the general structure shown diagrammatically in Fig. 2, the natural binding domain is functional. If the retroviral receptor that it recognizes is expressed at the surface of the target cell, then this domain will recognize it, and will permit infection to proceed. Then there will be no possibility of specific targeting, even if a surface molecule specifically recognizing the supernumerary binding domain is also expressed.
However, depending on the peptide inserted between the supernumerary binding domain and the natural binding domain, it is possible for the functionality of the natural binding domain to be adjusted considerably and, for some of these peptides, there can be effective prevention of its accessibility for recognition of the retroviral receptor (first action). It will be possible for this site to be unmasked, and hence rendered accessible to interaction with the normal retroviral receptor, if and only if the supernumerary binding domain has previously interacted with the targeted surface molecule. This second action is also mediated by the peptide separating the two domains. Here the normal retroviral receptor plays the role of auxiliary molecule.
Symbols on the diagrams:
- Fig. 1 represents the two-stage entry process of the targeting viral particle. The viral particles are generated (A) with targeting envelope glycoproteins composed of an N-terminal domain (ligand, single-chain antibody, etc.), of the masking/unmasking peptide, and of a C-terminal domain (B). The stages giving rise to introduction of the virion into the targeted cell involve a mechanism that is coordinated by the masking/unmasking peptide (C).
- Fig. 2 is a schematic representation of some of the targeting envelopes investigated. The position of some functional regions is shown. Vertical arrows: sites of proteolytic cleavage. SU: surface subunit, TM: transmembrane subunit, SP: signal peptide, PRO: polyproline region, T: transmembrane domain, Ram-1 ligand: binding domain for the amphotropic receptor, Rec-1 ligand: binding domain for the ecotropic receptor, EGF: epidermal growth factor. Dark grey boxes: sequences derived from the env gene of MoMLV. Light grey boxes: sequences derived from the env gene of MLV-4070A. White boxes: other sequences derived from MLVs. Black boxes: spacer peptides derived from the polyproline region. All the env genes are expressed starting from the same promoter (LTR) and polyadenylation signal (pA), from the sub-genomic mRNA using the retroviral splicing sites, donor (SD) and acceptor (SA), with an identical intron sequence of 190 nt containing the end of the pol gene (ΔPOL). The position of some restriction sites is shown.
- Fig. 3 shows the sequence of the spacer peptides and of the binding domains investigated. (A) Sequence of the spacer peptides in the series AMO. The binding domain AS208 is fused with the various spacer peptides, and the whole is fused with codon 7 of the SU of the envelope of the MoMLV. (B) Sequence of the spacer peptides in the series MOA. The binding domain at Rec-1 is fused with the various spacer peptides, and the whole is fused with codon 5 of the SU of the envelope of the amphotropic MLV. (C) Sequence of the spacer peptides in the series EMO and EA. The binding domain EGF is fused with the various spacer peptides, and the whole is fused with codon 5 of the SU of the envelope of the amphotropic MLV or with codon 7 of the SU of the envelope of the MoMLV.
- Fig. 4 shows detection of membrane expression of the envelopes of the EMO series. Populations of transfected cells, selected using phleomycin, are labelled with (black histograms) or without (white histograms) anti-hEGF antibodies, then with anti-mouse IgG antibodies conjugated to FITC.
- Fig. 5 shows expression and viral incorporation of the chimeric envelopes of the AMO series. Immunoblots on lysates of TELCeB6 cells transfected by the plasmids expressing the chimeric envelopes (see Fig. 2 and Fig. 3A) and on pellets of viral particles purified by ultracentrifugation. The immunoblots are detected with an anti-SU antiserum (top part) or with an anti-p30-CA antiserum (bottom part, size less than 46 kDa). The positions of the p30-CA (CA) and, for the MO wild-type envelopes, of the precursor (PR) and of the surface protein (SU) of the envelope complex are shown.
- Fig. 6 shows binding tests on human cells of the envelopes of series EMO (A) and AMO (B). The background noise of fluorescence is provided by incubation of human cells with the ecotropic envelope (white histograms).
- Fig. 7 shows the amino-acid and nucleotide sequence of PRO(4070A).
- Fig. 8 shows the amino-acid and nucleotide sequence of PRO(MoMLV).
- Fig. 9 shows the amino-acid and nucleotide sequence of PROβ(MoMLV).
- Fig. 10 shows the amino-acid and nucleotide sequence of PRO+(4070A).
- Fig. 11 shows the amino-acid and nucleotide sequence of ΔPRO.
- Fig. 12 shows the amino-acid and nucleotide sequence of ΔPROβ.
- Fig. 13 shows the amino-acid and nucleotide sequence of ΔPRO+.
- Fig. 14 shows the amino-acid and nucleotide sequence of AMOPRO.
- Fig. 15 shows the amino-acid and nucleotide sequence of AMOΔPRO.
- Fig. 16 shows the amino-acid and nucleotide sequence of MOAPRO.
- Fig. 17 shows the amino-acid and nucleotide sequence of MOAΔPRO.
- Fig. 18 shows the amino-acid and nucleotide sequence of EMOPRO.
- Fig. 19 shows the amino-acid and nucleotide sequence of EMOPROβ.
- Fig. 20 shows the amino-acid and nucleotide sequence of EMOPRO+.
- Fig. 21 shows the amino-acid and nucleotide sequence of EAPRO.
- Fig. 22 shows the amino-acid and nucleotide sequence of EAPROβ.
- Fig. 23 shows the amino-acid and nucleotide sequence of EAPRO+.
- Fig. 24 shows the amino-acid and nucleotide sequence of EMOΔPRO.
- Fig. 25 shows the amino-acid and nucleotide sequence of EMOΔPROβ.
- Fig. 26 shows the amino-acid and nucleotide sequence of EMOΔPRO+.
- Fig. 27 shows the amino-acid and nucleotide sequence of EAΔPRO.
- Fig. 28 shows the amino-acid and nucleotide sequence of EAΔPROβ.
- Fig. 29 shows the amino-acid and nucleotide sequence of EAΔPRO+.
- Fig. 30 shows the amino-acid and nucleotide sequence of AMOEL3.
- Fig. 31 shows the amino-acid and nucleotide sequence of EL3.
- Fig. 32 shows the amino-acid and nucleotide sequence of AMOEL3-V.
- Fig. 33 shows the amino-acid and nucleotide sequence of EL3-V.
- Fig. 34 shows the amino-acid and nucleotide sequence of AMOELS.
- Fig. 35 shows the amino-acid and nucleotide sequence of ELS.
- Fig. 36 shows the amino-acid and nucleotide sequence of OPR04-beta.
- Fig. 37 shows the amino-acid and nucleotide sequence of OPR04-int.
- Fig. 38 shows the amino-acid and nucleotide sequence of OPR04-vrb.
- Fig. 39 shows the amino-acid and nucleotide sequence of PRO-beta.
- Fig. 40 shows the amino-acid and nucleotide sequence of PRO-int.
- Fig. 41 shows the amino-acid and nucleotide sequence of PRO-vrb.
EXAMPLES:
EXAMPLE 1:
The retroviruses utilize a certain number of cell surface molecules, called viral receptors, for initiating the infectious process (23). Apart from some notable exceptions, especially in the case of human immunodeficiency viruses, most of the receptors utilized by the other retroviruses, and in particular the type C mammalian retroviruses, are distributed over most cell types of the host organism. For example, the amphotropic murine leukemia virus (MLV-A) is capable of infecting the majority of mammalian cells because its receptor, the phosphate transporter Ram-1, is expressed on almost all the cells.
The type C mammalian retroviruses are currently used for making retroviral vectors, in particular for purposes of gene transfer in humans, in gene therapy. Certain gene therapy procedures would be facilitated if the retroviral vectors were capable of very accurately recognizing the true target cells of gene transfer. For this, a certain number of research groups, including ours, have developed various strategies aiming to modify the recognition between the viral particle and the cell surface. This interaction essentially involves the retroviral envelope glycoprotein; it therefore seems logical to make genetic changes to this protein so as to enable it to recognize cell surface molecules specifically expressed on the target cells of gene transfer.
Two types of strategies permitting such changes have been developed recently.
In the first strategy, the natural binding domain of the retroviral envelope glycoprotein for its receptor was altered by insertion or substitution of peptides of reduced size that are able to bind a cell surface molecule. This work has demonstrated the feasibility of cell targeting for gene transfer (20).
In the second approach, polypeptides (ligands, single-chain antibodies) capable of binding various cell surface molecules were inserted at the N-terminal end of the SU subunit of the envelope glycoprotein (6) (10) (13) (15) (21). In general, investigation of the virions generated with these various types of targeting envelopes showed that it was possible for the binding of viral particles to be redirected specifically and efficiently towards new surface molecules. Some factors limiting the efficacy of targeting were also identified. The first seems to depend on physiological properties of the surface molecule targeted (dimerization, internalization, intracellular transport ("trafficking") process) (6); the second is connected with the low intrinsic fusogenic capacity of the chimeric envelopes generated by N-terminal insertion of ligands (6) (21). It was observed that this low fusogenic capacity can be partially overcome by introducing a spacer peptide between the new binding domain and the envelope (21). However, the best infectious titres obtained are 100 times lower than can be obtained with retroviral vectors bearing a wild-type envelope. Moreover, it is possible that these results obtained in a particular targeting model (targeting of Ram-1) cannot be extended to other types of targeting envelope glycoproteins. It therefore seemed essential to develop alternative strategies to solve these problems.
Furthermore, a general finding made with the targeting envelope glycoproteins generated by N-terminal insertions is that the natural binding domain of the supporting envelope is always functional. To the extent that the target cells are human cells in gene therapy, this functionality of the natural binding domain does not pose problems of "background noise" of infection because the supporting glycoprotein used is the ecotropic envelope of the MoMLV virus which does not recognize a receptor on the cells of higher mammals. However, it seemed interesting to characterize these chimeric envelope glycoproteins that are able to recognize two different surface molecules, to see what influence the spacer peptide could have in this recognition, and to assess the relative contributions of the two types of interaction in the infectious process.
These observations, which form the subject of the work described below, led to the development of a two-stage targeting strategy, involving first a specific recognition by the ligand inserted at the N-terminal end of the targeting glycoprotein, and then an auxiliary mechanism that facilitates entry of the virus, specifically bound to the correct cellular target, by means of the natural retroviral receptor.
To avoid any problem of background noise of infection connected with direct interaction between the natural binding domain and the natural retroviral receptor, masking/unmasking spacer peptides were also developed, inserted between the targeting site and the supporting envelope glycoprotein, and which are able to mask the natural binding domain for as long as the viral particle has not interacted with the targeted surface molecule.
Realization of this interaction induces unmasking of the natural binding domain and interaction between the natural binding domain and the natural retroviral receptor (auxiliary mechanism) which then takes over for introducing the virus into the cell.
Equipment and Methods: Cell lines.
The cell line TELCeB6 (7) is derived from the TELacZ line (19) by transfection and clonal selection of cells expressing the gag and pol proteins of MoMLV (Moloney Murine Leukemia Virus). The TELacZ cells express the retroviral vector MFGnlslacZ, which is able to transduce a nuclear β-galactosidase. The TELCeB6 cells permit production of retroviral capsids (non-infectious, as they are devoid of envelopes) transporting the nlsLacZ retroviral marker vector. Cells A431 (ATCC CRL1555) and TE671 (ATCC CRL8805) are cultivated in DMEM medium (Gibco-BRL) supplemented with 10% of foetal calf serum (Gibco-BRL). Cells CHO, CERD9 (9), and CEAR13 (9) are cultivated in DMEM medium (Gibco-BRL) supplemented with 10% of foetal calf serum and proline (Gibco-BRL). The NIH-3T3 cell lines and NIH-3T3 derivatives are cultivated in DMEM medium (Gibco-BRL) supplemented with 10% of newborn calf serum (Gibco-BRL).
Chimeric envelopes.
The DNA fragments coding for the polypeptides recognizing either EGFR (EGF receptor) or Ram-1 (MLV-A receptor) were generated by PCR (polymerase chain reaction) using oligonucleotides containing restriction sites. These polypeptides were introduced at the N-terminal of the SU protein of MLV (surface protein gp70) in which the SfiI and NotI restriction sites were created at codon +6 (33). A schematic diagram of the various env genes used in this article is shown in Fig. 2. Briefly, a DNA fragment derived from PCR amplification, coding for the 53 amino acids of human EGF (3), was generated using a cDNA template (ATCC 59957) and two oligonucleotides: OUEGF (5'> ATGCTCAGAGGGGTCAGTACGGCCCAGCCGGCCATGGCCAATAGTGACTCTGAATGTCC), with an SfiI restriction site, and OLEGF (5'> ACCTGAAGTGGTGGGAACTGCGCGCGGCCGCATGTGGGGGTCCAGACTCC), containing a NotI site. After digestion by SfiI and NotI, these fragments were cloned in a gene coding either for the SU protein of MoMLV in the case of the chimeric protein EMO, or for the SU of the 4070A virus for the chimeric protein EA (6).
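As an illustration of the primer design described above, the following minimal sketch scans the two oligonucleotide sequences quoted in the text for the restriction sites they are said to carry. The recognition sequences used (SfiI: GGCCNNNNNGGCC, NotI: GCGGCCGC) are the standard ones for these enzymes; the function and variable names are hypothetical and are not part of the original work.

```python
import re

# Standard recognition sequences (N = any base).
SITES = {
    "SfiI": "GGCCNNNNNGGCC",
    "NotI": "GCGGCCGC",
}

# Primer sequences as quoted in the text (5' to 3').
PRIMERS = {
    "OUEGF": "ATGCTCAGAGGGGTCAGTACGGCCCAGCCGGCCATGGCCAATAGTGACTCTGAATGTCC",
    "OLEGF": "ACCTGAAGTGGTGGGAACTGCGCGCGGCCGCATGTGGGGGTCCAGACTCC",
}


def find_sites(seq):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    hits = {}
    for enzyme, site in SITES.items():
        # A lookahead is used so that overlapping occurrences are also reported.
        pattern = re.compile("(?=(" + site.replace("N", "[ACGT]") + "))")
        positions = [m.start() for m in pattern.finditer(seq)]
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Run on the two primers, the scan reports an SfiI site only in OUEGF and a NotI site only in OLEGF, in agreement with the description.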
For the AMO construct (6), a NotI site was created at the end of the receptor recognition domain of the 4070A envelope (called AS208) (2), at nucleotide (nt) 750 (14), using a PCR fragment generated from the XhoI site (nt 594) up to nt 750, before the proline-rich region, by means of two oligonucleotides: 805FC (5'> TCCAATTCCTTCCAAGGGGC), upstream of XhoI, and 806FC (5'> ACCCCCACATGCGGCCGCTCCCACATTAAGGACCTGCCG), containing a NotI restriction site. The chimeric envelope is constructed by cloning the XhoI/NotI PCR fragment and the NotI/ClaI fragment, isolated from the env EMO gene (coding for the SU and TM-p15E transmembrane proteins of MoMLV), between the XhoI/ClaI sites of the env gene of 4070A MLV.
The resulting constructs are recovered in the form of a BglII/ClaI fragment (corresponding to positions 5408 and 7676 in MoMLV) and cloned at the BamHI and ClaI sites of an FBMOSALF expression plasmid (7), in which a selection marker gene (8) fused to the polyadenylation sequences of the PGK (phosphoglycerate kinase) gene was introduced downstream of the 3' LTR of the MLV-C57 virus.
For EMO, EA, or AMO, the new recognition site was separated from the rest of the MLV envelope by a spacer peptide consisting of three alanines, supplied by the NotI cloning site (15). In three other series of targeting envelopes (derived from envelopes EMO, EA or AMO), spacer amino acids were introduced either after the recognition domain of EGFR (EGF), or after the recognition domain of Ram-1 (AS208) as described below.
The series of envelopes targeting Ram-1 was generated by introducing different spacers between the recognition domain of Ram-1 and the MoMLV envelope (Fig. 3A).
For AMOPRO, a proline-rich region of 59 amino acids originating from the amphotropic SU (nucleotides 751 to 927) (14) was used. A shorter proline-rich region, also isolated from the envelope of MLV 4070A (nt 751-789), was used for AMOΔPRO. This region corresponds to the 13 amino acid spacer of the v-mpl product (originating from the myeloproliferative leukemia virus) (18), located between its region derived from env and the equivalent of the cellular gene mpl.
In the case of AMO1, the first 208 amino acids, derived from the envelope of MLV 4070A, were fused to amino acid 1 of the SU of MoMLV. For AMO1Fx, a 4 amino acid site corresponding to the cleavage site of blood coagulation factor Xa (Ile-Glu-Gly-Arg) (12) was inserted after the Ram-1 recognition site and fused to the +1 codon of the SU of MoMLV. The strategy used for these constructs is described above. Briefly, an oligonucleotide (5'-TCCAATTCCTTCCAAGGGGC-3') located just upstream of the XhoI site of the env gene of 4070A (nt 594) was used in combination with one or other of the following two oligonucleotides bearing the NotI site: 5'-AGTATGCGGCCGCTGGGGGTGGCTGTGGGACAC-3' and 5'-TATCTGCGGCCGCGTCGGGTAATACTGGGTTGG-3', so as to generate by PCR, using an env 4070A template, 3' fragments for the AMOPRO and AMOΔPRO envelopes respectively.
These PCR fragments were digested with XhoI and NotI and cloned in the FBAMOSALF plasmid, a plasmid expressing an AMO type of envelope, opened at XhoI/NotI. The plasmids expressing the envelopes AMOFx, AMO1 and AMO1Fx were generated by cloning the NdeI/NotI fragment of FBAMOSALF, containing the Ram-1 recognition site, in a series of plasmids (13) expressing the MoMLV envelopes modified so as to create a NotI site at codon 1 or at codon 6, with (AMO1Fx, AMOFx) or without (AMO1) the Xa sequence. Envelopes derived from AMO and containing other types of spacer peptides were constructed. All of these spacer peptides are shown in Fig. 3A.
The MOAPRO and MOAΔPRO envelopes were generated according to a method similar to that of the AMOPRO and AMOΔPRO envelopes. The FBEASALF plasmid, expressing the EA envelopes, was opened at NdeI/NotI. This DNA was used for cloning two fragments. The 5' fragment was the NdeI/BamHI fragment from digestion of the FBMOSALF plasmid (expressing the ecotropic MO envelopes), containing, in addition to the 5' LTR and the retroviral leader sequence, the N-terminal end of the env gene of the MoMLV virus (position 6565) (17). The 3' fragments were generated by PCR using the env gene of MoMLV as template, the oligonucleotide 5'-ACTGGGGCTTACGTTTGT-3', upstream of the BamHI site, as 5' primer, and the oligonucleotide 5'-TATGTGCGGCCGCCGGTGGAAGTTGGGTAGGGG-3' or 5'-TATGTGCGGCCGCGTCTGGCAGAACGGGGTTTGG-3' as 3' primer for constructing the MOAPRO and MOAΔPRO envelopes, respectively. These PCR fragments were digested with BamHI and NotI, and co-ligated with the 5' fragment. The sequence of the spacer peptides for these two constructs is shown in Fig. 3B.
FBEMOSALF, expressing the EMO chimeric envelopes (6), was submitted to digestion by BaaHII, filling by Klenow enzyme and digestion by NdeI. The resulting 1.8 kb fragment, containing the 5' LTR, the leader sequence, the end of the pol gene and human EGF, was isolated and inserted either in FBAMOΔPROSALF or FBAMOPROSALF (plasmids expressing the AMOΔPRO and AMOPRO chimeric envelopes respectively), in which the NdeI/EcoRI fragment was eliminated and the EcoRI site was filled, so as to generate the plasmids expressing the envelopes EMOΔPRO+ and EMOPRO+, respectively. Plasmids expressing the envelopes EMO1 and EMO1Fx were also generated. The sequence of the spacer peptides for these two constructs is shown in Fig. 3C.
The plasmids expressing the EAPRO+ and EAΔPRO+ envelopes were generated by replacing the SfiI/NotI fragment of the FBEASALF plasmid by the SfiI/NotI fragments obtained from plasmids expressing the EMOPRO+ and EMOΔPRO+ envelopes.
Finally, for these various envelopes (EMOΔPRO+, EMOPRO+, EAPRO+ and EAΔPRO+), the spacer peptides were reduced in their N-terminal part. For this, a DNA fragment was generated by PCR using the EMO gene as template, the oligonucleotide 5'-ACCATCCTCTAGACGGACATG-3', upstream of the XbaI site preceding the initiator codon, as 5' primer, and the oligonucleotide 5'-TATCAGGATCCCAAATGTAAGCCCTGGATCGCGCAGTTCCCACCACTTCAGGTCTCGGTACTGAC-3', containing a BamHI site, as 3' primer. This DNA was digested with XbaI/BamHI and cloned in one or other of the plasmids expressing the EMOPRO+ or EMOΔPRO+ envelopes, after removing the XbaI/NotI fragments beforehand, by co-ligation with the BamHI/NotI fragments obtained from the plasmids expressing the MOAPRO and MOAΔPRO envelopes. This results in two plasmids that are able to express the EMOPROβ and EMOΔPROβ envelopes, respectively (Fig. 3C), in which EGF is fused just upstream of the BamHI site of the envelope of the MoMLV virus (nt 6537) (17), before the proline-rich region and leaving intact the potential β sheet. One or other of the SfiI/NotI fragments resulting from these last two constructs was then introduced into the FBEASALF plasmid after prior removal of the SfiI/NotI fragment; this results in two plasmids capable of expressing the EAPROβ and EAΔPROβ envelopes, respectively (Fig. 3C).
In another construction series (EMOPRO, EMOΔPRO, EAPRO, EAΔPRO), the potential β sheet was removed, and EGF was fused directly at the level of the proline-rich region (Fig. 3C).
Production of viruses.
The plasmids expressing the envelopes were transfected by the calcium phosphate precipitate method (16) in the TELCeB6 cell line. The cells were submitted to selection with phleomycin (50 µg/ml), then the resistant clones were trypsinized in bulk. These confluent cells were used for recovering the viral supernatants after overnight incubation in DMEM medium in the presence of FCS (10%). These supernatants are submitted to ultracentrifugation with the aim of obtaining samples for analysis in Western blots, in binding tests and in infection tests.
Immunoblots.
The virus-producing cells are lysed for 10 min at 4°C in 20 mM Tris-HCl buffer (pH 7.5) containing Triton X-100 1%, SDS 0.05%, deoxycholate 5 mg/ml, NaCl 150 mM and PMSF 1 mM. After centrifugation for 10 min at 10 000 g to pellet the cell nuclei, the supernatants are frozen at -70°C until analysis. The viral samples are obtained by ultracentrifugation of the viral supernatants (10 ml) in a SW41 Beckman rotor (30 000 rpm, 1 h at 4°C). The pellets are resuspended in 100 µl of PBS (phosphate buffered saline) and frozen at -70°C. The samples (30 µg of cellular lysates or 10 µl of purified viruses) are mixed in a ratio of 5:1 with a buffer of 375 mM Tris-HCl (pH 6.8) containing SDS 6%, β-mercaptoethanol 30%, glycerol 10% and bromophenol blue 0.06%, then boiled for 3 min and analysed on 10% acrylamide/SDS gels. After transferring the proteins onto nitrocellulose membrane, immunological labelling is effected in TBS (Tris-buffered saline, pH 7.4) in the presence of skimmed milk 5% and Tween 0.1%. Antibodies (Quality Biotech Inc., USA) obtained from goat antiserum, directed against gp70-SU of RLV (Rauscher Leukemia Virus) or p30 of RLV, were used at a dilution of 1:1000 or 1:10 000 respectively. The blots were developed using a conjugated antibody of rabbit origin directed against goat immunoglobulins (DAKO, UK) and an electrochemoluminescence kit (Amersham Life Science).
Binding tests.
The target cells were washed with PBS and detached by incubation for 10 min at 37°C with Versene 0.02% in PBS. These cells are rinsed with PBA (PBS containing 2% of FCS and sodium azide 0.1%). 10^6 cells are then incubated in the presence of viruses for 30 min at 4°C for the EMO envelope series or 45 min at 37°C for the AMO envelope series. After rinsing with PBA, the cells are incubated in the presence of monoclonal antibodies (Evans et al., 1990) for 30 min at 4°C. After rinsing twice with PBA, the cells are incubated for 30 min at 4°C in the presence of anti-rat antibodies conjugated to FITC (Dako, UK). 5 min before the two final rinsings in PBA, the cells are counterstained with propidium iodide (20 µg/ml). The fluorescence of the live cells is analysed in a FACS (FACSCalibur, Becton Dickinson).
Infection tests.
The target cells are inoculated in 24-well culture plates at a density of 3.10 cells per well. Various dilutions of the viral supernatants, containing Polybrene at 4 µg/ml, are added to the cells for 3 to 5 h at 37°C. The supernatants are then replaced with fresh medium and the cells are incubated for 24 to 48 h at 37°C. X-gal staining is then carried out as described previously (4). The viral titres are estimated as reported previously (5) in number of colonies per ml (lacZ i.u./ml).
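The titre calculation itself is not spelled out in the text, which refers to an earlier publication. As a hedged illustration only, the sketch below assumes the usual end-point dilution arithmetic: the number of lacZ-positive colonies counted in a well, multiplied by the dilution factor of the supernatant and divided by the inoculum volume. The function name, parameters and the numbers in the usage example are hypothetical.

```python
def titre_iu_per_ml(colonies, dilution_factor, inoculum_ml):
    """Estimate a titre in lacZ i.u./ml from one well.

    Assumed formula (not taken from the text):
        titre = colonies counted * dilution factor / inoculum volume in ml
    """
    if dilution_factor <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution_factor and inoculum_ml must be positive")
    if colonies < 0:
        raise ValueError("colonies must be non-negative")
    return colonies * dilution_factor / inoculum_ml


if __name__ == "__main__":
    # Hypothetical well: 40 blue colonies after a 100-fold dilution, 1 ml added.
    print(titre_iu_per_ml(40, 100, 1.0))
```

In practice a well with a countable number of colonies would be chosen, and several dilutions averaged; that refinement is left out of the sketch.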
In order to block the EGFRs, the target cells are incubated for 30 min at 37°C in a medium containing 10~ M of human recombinant EGF (236-EG, R&D Systems, UK).
The cells are then rinsed and infections are carried out as described previously. To block acidification of the endosomes, 100 µM of chloroquine phosphate (Sigma, UK) is added to the medium. Six hours after infection, the cells are rinsed and incubated in a normal medium.
Results and discussion.
Construction of the mutant envelopes.
Two series of modified envelopes capable of recognizing either the retroviral receptor Ram-1 (11) (22) or the EGF receptor were generated. A first envelope targeting Ram-1, AMO, was constructed by insertion, at the N-terminal of the envelope of MoMLV (by fusion with codon 7), of a polypeptide recognizing Ram-1 (AS208, Fig. 3A) and corresponding to the first 208 amino acids of the SU of MLV-A (1). The sequence coding for EGF was inserted in the env gene of MLV in position +6 of the SU of MoMLV (Fig. 2). It had previously been demonstrated that this insertion site permits expression of a single-chain antibody fragment on the surface of virions (15). In the case of the chimeric envelope EMO (Fig. 2), human EGF was inserted in the envelope of MoMLV at the same position, whereas for the envelope EA, insertion was effected in the amphotropic envelope of MLV in position +5.
For the AMO, EMO and EA envelopes, the new binding domains were separated from the recognition domain of the retroviral receptor by a spacer peptide corresponding to three alanines. For the two types of parental envelopes, targeting Ram-1 or targeting EGFR, various constructs were then generated by insertion of spacers of different sizes and structures. The protein sequences of these different spacers are shown in Fig. 3A in the case of the envelopes targeting Ram-1 and in Fig. 3C for the envelopes targeting EGFR.
The plasmids expressing the various envelopes, including the ecotropic (MO) and amphotropic (A) control envelopes, were transfected into the cell line TELCeB6 which expresses the proteins coded by the gag and pol genes, as well as a retroviral vector nlsLacZ (7).
Expression and incorporation of the envelopes in the virions.
The protein lysates of the corresponding cells were analysed for the expression of envelopes by means of antibodies directed against the SU of MLV (Fig. 5 for most of the envelopes of the AMO series; not shown for the other chimeric envelopes). For all the chimeric envelopes, the precursors and the mature form SU of the envelopes could be detected at the expected size and at a level similar to the wild-type envelopes, suggesting that these chimeric envelopes are normally produced and matured.
Expression on the cell surface was determined by analyses of the producing cells in the FACS, using antibodies directed against the SU or using an anti-EGF monoclonal antibody. The cells transfected by the various envelopes can be labelled by the anti-SU antibody (not shown). Only the cells expressing the EGF fusion envelopes can be labelled by means of anti-EGF monoclonal antibodies (Fig. 4). This demonstrates expression of the chimeric envelopes on the cell surface and correct folding of the EGF on the chimeric glycoproteins.
To demonstrate incorporation of the chimeric envelopes in the retroviral particles, the supernatants of the TELCeB6 cell lines transfected with the various envelopes were submitted to ultracentrifugation and the pellets of viral particles were recovered. These pellets were analysed by immunoblots for their expression of products of the gag gene (CAp30) and of the envelope proteins (Fig. 5 for most of the envelopes of the AMO series, not shown for the other chimeric envelopes). With the aim of comparing the efficiency of viral incorporation between the various chimeric envelopes, identical quantities of viral particles (determined by labelling the gag proteins by means of anti-CAp30 antibodies) were loaded on the gels.
The SU proteins could be detected for all the mutants, at the expected size but at a level slightly lower than was observed for the wild-type envelopes. In the case of the AMOG2X and AMOG3X envelopes only, the efficiency of incorporation is appreciably lower relative to the wild-type envelopes. As expected, no envelope expression was observed in the pellets from supernatants of TELacZ cells (not expressing gag and pol proteins) transfected by the various envelopes. These results show that the chimeric SU proteins are associated with retroviral particles.
Binding of the envelopes to the receptors.
Human cells expressing the Ram-1 and/or EGF receptors were used for this investigation. These cells are incubated in the presence of viral preparations and the binding of the viral envelopes on the target receptor is determined by analysis with the FACS with the aid of antibodies directed against the SU (Fig. 6B). As expected, no binding is observed in the case of viruses expressing MO ecotropic envelopes on the various human cells (not shown), whereas the viruses that have chimeric envelopes targeting Ram-1 are able to bind to the TE671 cells with an efficiency similar to that observed for the viruses expressing unmodified amphotropic envelopes. All the envelopes targeting Ram-1, derived from AMO, are able to bind to the TE671 cells with a similar efficiency. This binding can be inhibited by competition with the AS208 fragment (the purified recognition domain of Ram-1) (2), which suggests that this recognition is specific (results not presented).
The envelopes targeting EGFR (EMO series) are moreover able to bind to the A431 cells, an EGFR expressor (Fig. 6A). This binding seems specific since pre-incubation of the A431 cells in the presence of EGF (inducing endocytosis of the EGFRs) inhibits this binding (not shown).
Ram-1 and Rec-1 cooperation in infection.
Transduction of the retroviral vectors pseudotyped by the various targeting envelopes was measured on cells expressing different types of receptors: human cells TE671 expressing the EGF and Ram-1 receptors; 3T3 cells expressing murine EGF, Rec-1 and Ram-1 receptors; CEAR13 cells expressing Rec-1 and Ram-1; CERD9 cells expressing only Rec-1. The titrations were carried out as described previously (6). As expected, it was shown that the viruses pseudotyped by MO ecotropic envelopes were not capable of infecting the TE671 cells, but did permit infection of the murine cells 3T3, CEAR13 and CERD9 (with titres of the order of 10' lacZ i.u./ml). Conversely, the viruses bearing the amphotropic A envelopes are able to infect the murine cells 3T3 and the TE671 cells (with titres of the order of 10' lacZ i.u./ml).
The viruses that have chimeric AMO envelopes are able to infect the TE671 cells at a titre of 4 × 10^3 lacZ i.u./ml (Table 1). In comparison, despite a similar efficiency of binding to the receptor (Fig. 6B), the titres obtained with the wild-type envelopes are 10 000 times higher. Surprisingly, the viruses expressing AMOPRO envelopes, despite good efficiency of binding, proved incapable of infecting the human cells. Compared with the titres obtained for the AMO envelopes (Table 1), the other types of spacers inserted in the envelopes of the AMO series permit an increase in titres from 30-fold (for AMOΔPRO) to more than 100-fold (for AMO1Fx), making it possible to reach titres of 4 × 10^5 lacZ i.u./ml. It has been shown that these infections take place via the targeted receptor Ram-1. This was demonstrated by an interference test on target cells chronically infected with MLV-A virus. These cells become specifically refractory to infection by viruses bearing envelopes targeting Ram-1 (results not shown).
The viruses bearing the chimeric envelopes in which the site for binding to Ram-1 was separated from the SU of MoMLV by various spacers proved very infectious on 3T3 cells. Compared with the titres obtained for the AMO envelopes, an increase from 200-fold (for AMOPRO) to more than 1000-fold (for AMO1Fx) in the viral titres was measured (Table 1).
Infection of the 3T3 cells is effected via Rec-1 or via Ram-1 (Table 1). This can be demonstrated by interference tests carried out on 3T3 cells chronically infected either by MLV-A (blocking Ram-1) or by MoMLV (blocking Rec-1). The viruses expressing the AMO envelopes seem to be capable of infecting the 3T3 cells equally well whether one or the other, or both, of the Rec-1 and Ram-1 receptors are available on the target cell. Compared with these AMO viruses, the viral particles containing the other envelopes capable of targeting Ram-1 are far less capable of infecting the 3T3 cells when only one of the two receptors is available. For example, when 100 particles (according to the titre determined on intact 3T3 cells) containing the AMOFx envelopes are used for infecting interfering 3T3 cells, 4 viruses are capable of infecting the cells if only Rec-1 is available and 2 viruses are capable of infecting the cells if only Ram-1 is available. This indicates a considerable loss of infectivity (more than 94% of the viruses are not infectious) when only one receptor is available compared with when both receptors are available. This also suggests that the two receptors Ram-1 and Rec-1 cooperate in infecting the 3T3 cells. It appears that this phenomenon of cooperation is even more marked in the case of viruses bearing the AMOPRO envelopes. These last-mentioned viruses can infect the 3T3 cells with difficulty when only Rec-1 is available and cannot infect them at all when only Ram-1 is available. However, when Rec-1 and Ram-1 are both available, infection is possible and titres of the order of 6x 10° lacZ i.u./ml can be obtained (Table 1).
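The arithmetic behind the AMOFx example can be made explicit with a short sketch. It takes the figures quoted above (100 infectious particles when both receptors are available, 4 when only Rec-1 is available, 2 when only Ram-1 is available) and computes the percentage of particles that are no longer infectious in each case. The function name is hypothetical.

```python
def percent_not_infectious(infectious_single, infectious_both=100):
    """Percentage of particles that lose infectivity when only one receptor
    is available, relative to the count obtained when both are available."""
    if infectious_both <= 0:
        raise ValueError("infectious_both must be positive")
    if not 0 <= infectious_single <= infectious_both:
        raise ValueError("infectious_single must lie between 0 and infectious_both")
    return 100.0 * (infectious_both - infectious_single) / infectious_both


if __name__ == "__main__":
    print("Rec-1 only:", percent_not_infectious(4))  # 4 of 100 still infect
    print("Ram-1 only:", percent_not_infectious(2))  # 2 of 100 still infect
```

Both values (96% and 98%) lie above the 94% quoted in the text; 94 is also what remains of the 100 particles once the 4 and the 2 that manage with a single receptor are set aside.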
For better characterization of this cooperation effect, infection tests were carried out using as targets CHO cells (naturally devoid of Ram-1 and Rec-1 receptors) altered so as to express either Rec-1 only (Cerd9 cells) or Rec-1 and Ram-1 (Cear13 cells), and TE671 cells expressing Ram-1 only. Furthermore, other envelopes derived from the AMO envelope were generated. These envelopes possess other types of spacer peptides (see Fig. 3A) after the site targeting Ram-1, in particular flexible spacers.
The results of a typical experiment are shown in Table 2. For each envelope, cooperativity indices were calculated as the ratio of the titre obtained on the cell type expressing just one receptor to the titre obtained on the cell type expressing both types of receptors. An index of 1 therefore indicates that the titre is the same, whether there is just one or both receptors. This is obviously the case with ecotropic or amphotropic wild-type envelopes.
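The index defined above can be sketched in a few lines of Python. This is only an illustration of the ratio described in the text: the function name `cooperativity_index` and the example titres are assumptions chosen for the sketch, not values taken from Table 2.

```python
def cooperativity_index(titre_single_receptor: float, titre_both_receptors: float) -> float:
    """Ratio of the titre on cells expressing one receptor to the titre on
    cells expressing both receptors (titres in lacZ i.u./ml)."""
    if titre_both_receptors <= 0:
        raise ValueError("the titre on cells expressing both receptors must be positive")
    return titre_single_receptor / titre_both_receptors


# Hypothetical titres, for illustration only (not data from the tables).
titre_ram1_only = 1e4   # cells expressing Ram-1 alone
titre_both = 1e5        # cells co-expressing Ram-1 and Rec-1

ram_id = cooperativity_index(titre_ram1_only, titre_both)
print(f"RamID = {ram_id:.2g}")
```

With these illustrative numbers the index is about 0.1, i.e. the titre is roughly ten times higher when both receptors are present than when Ram-1 is expressed alone.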
An index less than 1 indicates that the titre is lower when a single receptor is expressed than when both are, and that both receptors are needed to promote infection. The lower this index is, the greater is the requirement for two receptors. As suggested in Table 1, the infectivity of the virions with the original AMO envelopes is not affected, whether there is a single type of receptor or both types (Table 2). In fact, the indices are even greater than 1, suggesting that the simultaneous presence of the two receptors hampers the infectious process, perhaps because the two binding domains hinder each other. The situation is different for viruses with the AMO1Fx envelopes even though, compared with the AMO virions, their infectivity is at least 100 times better in the TE671 cells that express Ram-1 only. This increase in infectivity via Ram-1 can be explained by the increased size of the spacer peptide separating the two binding domains: it is possible that the AS208 site induces less steric hindrance with respect to the rest of the glycoprotein and that these envelopes can more easily induce the gene-fusion process. Moreover, the Cerd9 cells expressing Rec-1 only are infected relatively easily by the AMO1Fx virions. However, in accordance with the results in Table 1, infection is facilitated by a factor of 10 when both molecules Ram-1 and Rec-1 are co-expressed (index of about 0.1) compared with when only one or the other of the two receptors is present. The envelopes with the "flexible" spacers (AMOG1Fx, AMOG2, AMOG2Fx and AMOG3) seem to behave like the AMO1Fx envelopes with regard to infection via Rec-1 expressed alone. However, infectivity by Ram-1 expressed alone (RamID) tends to decrease as a function of the length of the spacer. This probably reflects a decrease in transmission of the gene-fusion signal following binding on Ram-1 owing to the increase in distance between the AS208 domain and the fusion domain. With these envelopes as well, infection is favoured when the two receptors are co-expressed on the surface of the same cell.
As for the AMO1Fx envelopes, but non-symmetrically (RamID similar, but RecID very different), the virions containing the AMOΔPRO envelope can infect cells efficiently when Ram-1 is expressed alone. For this envelope as well, infectivity is increased about 10-fold when Rec-1 is also present on the cell surface. This difference is not due to the mere fact that the AMOΔPRO virions utilize Rec-1 preferentially for infection. In fact, infection of cells on which Rec-1 alone is available is extremely slight (Table 1) or even undetectable (Table 2) compared with when Ram-1 and Rec-1 are co-expressed. The RecID index is less than 10^-5 (Table 2). This also demonstrates that the two receptors can synergize infection. These results also suggest that the domain of binding to the ecotropic receptor Rec-1 is not accessible when the AMOΔPRO envelope is expressed on viral particles, and only becomes accessible if these virions interact with Ram-1 beforehand. It can also be suggested that following binding with Ram-1, the domain for binding to Rec-1 is unmasked and recruited for facilitating the infectious process. It is possible that this masking/unmasking takes place according to an allosteric type of mechanism causing a change in conformation of the chimeric glycoprotein that is induced by the Ram-1/AS208 interaction and which involves the spacer peptide.
It is likely that this mechanism is strongly dependent on the amino acid composition of the spacer peptide. With comparable size, there is a difference of at least 1000 times in the RecIDs when the AMOΔPRO virions are compared with the virions containing the envelopes with the flexible spacers AMO1Fx, AMOG1Fx and AMOG2. The ΔPRO peptide contains 5 prolines probably arranged in a type II polyproline helix, whereas the AMOG1Fx and AMOG2 envelopes contain essentially glycines.
Similarly to the AMOΔPRO virions, the virions containing the AMOPRO envelopes require the simultaneous presence of the two types of receptors for infecting the cells. The infectious titres in cell types co-expressing the two receptors are, however, lower than those observed with the AMOΔPRO virions, though it is not possible to exclude the hypothesis that the lesser extent of incorporation of these envelopes is responsible for this result. Even more markedly than with AMOΔPRO, the AMOPRO viruses cannot infect the cells when either one of the two receptors is expressed alone (Table 2). The two indices RamID and RecID are in fact less than 10^-5. These results suggest that:
1) interaction of the AMOPRO virions with Ram-1 when it is expressed alone is not sufficient to trigger the changes in conformation of the glycoprotein permitting its gene-fusion. Furthermore, it is possible that the PRO spacer peptide is either too rigid, or too long to favour such a transition,
2) the domain for binding with Rec-1 is not accessible for interaction with Rec-1 and to take over in the entry process as long as the AMOPRO virion has not interacted with Ram-1.
For the purpose of better discrimination of whether the masking of the binding domain located downstream of the targeting site is a unique property of the peptide conjugated to the PRO spacer peptide, the inverse construction was effected.
The MOAPRO envelopes contain the binding domain of the ecotropic envelope followed by the proline-rich region of this same envelope, the whole being fused at the N-terminal end of the amphotropic envelope (Fig. 2). The results shown in Table 2 show that, in a similar manner to the virions containing the AMOPRO envelopes, the MOAPRO virions can infect the cells expressing only either one of the receptors Rec-1 or Ram-1 with difficulty, or not at all. It even seems that the Ram-1 domain in the MOAPRO envelope is even less accessible (RamID less than 7 × 10^-5) than the Rec-1 domain is in the AMOPRO envelope (RecID less than 5.6 × 10^-4). The MOAPRO envelopes can efficiently infect the cells expressing the two types of receptors, with titres of the order of 10^5 lacZ i.u./ml, suggesting that, for this envelope as well, the presence of the two receptors synergizes the infectious process.
These results, taken together, suggest that the spacer peptide inserted between the targeting domain and the rest of the retroviral envelope exercises control over the accessibility of the domain located downstream of the said peptide and over the activation of fusion. This control depends on the peptide itself and is influenced by its length and by its biochemical composition. The hypothesis formulated is that the PRO
spacer peptide would finally perform the same role as the proline-rich region in question and which is located, in the unmodified glycoprotein, between the binding domain to the receptor and the fusion domain. This role would be masking of the domain downstream (fusion domain for the wild-type envelope or binding domain for the chimeric envelope) and subsequent unmasking for interaction of the domain upstream with its receptor. In the case of the wild-type envelope, this unmasking would lead to activation of fusion, whereas in the case of chimeric envelopes, unmasking would lead to accessibility of the binding domain to the viral receptor. If the receptor is expressed at the cell surface, there can then be interaction, and this then triggers activation of the fusion domain, explaining why the simultaneous presence of the two receptors synergizes infection.
These results make it possible to propose a two-stage targeting strategy for which a targeting envelope is constructed with various domains, whose functions are activated and coordinated by means of specific spacer peptides containing proline-rich sequences. These chimeric envelope glycoproteins can be conceived as follows, with, from N-terminal to C-terminal: a "targeting" domain capable of recognizing a cell surface molecule specifically expressed on the targeted tissue or targeted cell (for example a single-chain antibody or a ligand for a surface receptor); a spacer peptide capable of masking an auxiliary region which is in turn capable of facilitating penetration of the virus when it is activated. Such an auxiliary domain can be an entire retroviral envelope, i.e. a structure capable of mediating and taking over from viral infection by means of interaction with a ubiquitous retroviral receptor, which therefore has a very strong likelihood of being co-expressed with the targeted surface molecule. Ideally, the auxiliary domain should be masked until the viral particle has specifically interacted with the targeted surface molecule. For example, in the case of the AMOPRO and AMOΔPRO envelopes, the targeted surface molecule is Ram-1 whereas the auxiliary domain is the ecotropic envelope.
EGFR and Rec-1 cooperation in infection.
To verify whether the PRO and ΔPRO spacer peptides could mediate the masking/unmasking mechanism in the case of another type of targeting envelope, another two-stage targeting model was explored by means of the EGF receptor. The results obtained with the targeting of Ram-1 made it possible to propose C-terminal ends of the masking/unmasking spacer peptides. However, it was not possible to define their N-terminal ends exactly. That is why, in the first place, the EMOPRO+ and EMOΔPRO+ envelopes were constructed (Fig. 3B), in which the PRO and ΔPRO spacer peptides contain in addition, at the N-terminus, 41 amino acids derived from the amphotropic envelope and located immediately upstream of the proline-rich region. For the EMOPRO+ and EMOΔPRO+ envelopes, the targeting domain is EGF, whereas the auxiliary domain is the ecotropic envelope. These two envelopes were compared with the EMO envelope (Fig. 2 and 2B) which does not contain a spacer peptide.
The infection tests were carried out with cells expressing Rec-1 alone (Cerd9 cells) or with cells co-expressing Rec-1 and EGFR (3T3 cells). The results of a typical experiment are presented in Table 3. As expected from the results obtained with the AMO envelopes, the viruses containing the EMO envelopes can efficiently infect the Cerd9 and 3T3 cells, indicating that the binding domain to Rec-1 in these envelopes is not masked. In comparison with the EMO viruses, the viral particles containing the EMOPRO+ and EMOΔPRO+ envelopes can only infect the Cerd9 cells with difficulty (between 1000 and 10 000 times less well than the EMO viruses). However, when Rec-1 and EGFR are co-expressed, even though this does not affect the titre of the EMO virions, the viral particles containing the EMOPRO+ and EMOΔPRO+ envelopes are and 60 times more infectious, respectively, compared with when Rec-1 is expressed alone.
In relation to the results obtained with the AMOPRO and AMOΔPRO envelopes, masking is apparently effected less well, leading to non-negligible infectivity on Cerd9 cells. This is perhaps due to the fact that the PRO+ and ΔPRO+ spacer peptides are not optimized for their function, but perhaps also to the fact that the Cerd9 cells express a few EGF receptors which would contribute to activation of the EMOPRO+ and EMOΔPRO+ envelopes.
Table 1. Titres (lacZ i.u./ml) obtained for the viruses containing the envelopes targeting Ram-1 in interference tests

env        TE671        3T3 (a)             3T3-MLV-A (a, b)    3T3-MoMLV (a, b)
           Ram-1 (c)    Ram-1 + Rec-1 (c)   Rec-1 (c)           Ram-1 (c)
MO         <1           92,000,000 (100)    46,000,000 (100)    40
A          10,000,000   12,000,000 (100)    240                 8,000,000 (100)
AMO        4,000        24 (100)            32 (266.7)          8 (50)
AMOFx      230,000      440,000 (100)       8,000 (3.6)         6,000 (2)
AMO1       330,000      1,920,000 (100)     78,000 (8.1)        62,000 (4.8)
AMO1Fx     400,000      1,620,000 (100)     60,000 (7.4)        74,000 (6.8)
AMOΔPRO    150,000      280,000 (100)       400 (0.29)          64,000 (34.3)
AMOPRO     10           60,000 (100)        4 (0.013)           <1 (0.0025)

a: percentages calculated assigning a value of 100 to the titres obtained on
b: infection on 3T3 chronically infected by MLV-A (3T3-MLV-A) or by MoMLV (3T3-MoMLV)
c: receptor available at the surface of the cell in question

Table 2. Titres (lacZ i.u./ml) obtained for the viruses containing the envelopes targeting Ram-1

Spacer
peptide   env        3T3         TE671        Cerd9       RamID         RecID
-         MO         2.8×10^?    <1.7×10^0    2.8×10^?    <6.1×10^-?    1
-         A          5×10^5      5×10^5       6.2×10^0    1             1.2×10^-5
3         AMO        1×10^?      2.2×10^2     6.2×10^0    2.2×10^0      6.2×10^-2
13        AMO1Fx     6×10^4      1.6×10^4     2.2×10^5    2.7×10^-1     3.7×10^0
16        AMOΔPRO    1.9×10^5    5×10^3       6.2×10^0    2.6×10^-1     3.3×10^-?
18        AMOG1Fx    9×10^?      4.5×10^3     8.7×10^3    1.1×10^-1     2.2×10^-1
19        AMOG2      8×10^3      2.7×10^3     3.1×10^3    3.9×10^-1     3.9×10^-1
23        AMOG2Fx    6×10^3      1.2×10^?     1.2×10^?    2×10^-1       2×10^-1
2?        AMOG3Fx    9×10^?      1×10^?       ?.2×10^?    ?.1×10^-1     1.3×10^0
62        AMOPRO     1.8×10^3    <1.7×10^0    <1×10^?     <9.9×10^-4    <5.6×10^-4
          MOAPRO     1.3×10^?    <9.1×10^0    1.7×10^3    <7×10^-5      1.3×10^-2

Table 3. Titres (lacZ i.u./ml) obtained for the viruses containing the envelopes targeting EGFR

env        3T3         Cerd9       RecID
MO         9.2×10^6    1.3×10^7    1
EMOΔPRO    3.5×10^4    8.5×10^2    1.7×10^-2
EMOPRO     9.6×10^2    7×10^1      5.2×10^-2
Cpl        2×10^6      3×10^6      1

EXAMPLE 2:
With the aim of characterizing the cooperation between the Rec-1 and Ram-1 receptors, as well as the peptides that are capable of regulating this cooperation of receptors, a new series of type AMO chimeric envelope glycoproteins (see preceding example) was constructed:
- in order to verify whether the infection obtained with the AMOPRO and AMOΔPRO envelopes passes, in a second stage, through an interaction with Rec-1, the binding domain with Rec-1 was inactivated by point mutagenesis (D84K mutation) (MacKrell et al., J. Virology, 70:1768-1774 (1996)) in the AMOPRO and AMOΔPRO envelopes as well as in the AMOG1X control envelope which does not require the cooperation of receptors to permit infection (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223. (1997)).
- in order to demonstrate the role of the type II polyproline helix structure for the cooperating peptides, the envelopes AMOEL3 and AMOEL5 were constructed. These envelopes have respectively 3 and 5 turns of a type II polyproline helix as characterized in the literature (Urry, Journal of Protein Chemistry 7:1-34. (1988)).
Retroviruses were generated with these chimeric envelopes and were characterized by infection of cells expressing either Rec-1 alone, or Ram-1 alone, or the two molecules Ram-1 and Rec-1.
Material and Methods.
The oligonucleotides elast3U: (5'-TTT ATG GTC ACC GCG GCC GCA CCT
GGG GTA GGG GCT CCG GGG GTA GGG GCT CCT GGG GTG GCC ATA TAA) and elast3L (5'-TTA TAT GGC CAC CCC AGG AGC CCC TAC CCC CGG AGC
CCC TAC CCC AGG TGC GGC CGC GGT GAC CAT AAA) were hybridized together. The resulting double-stranded DNA fragment was digested with the EaeI restriction enzyme and cloned in the FBAMOSALF expression plasmid previously opened at NotI.
The result was the plasmid FBAMOEL3SALF (see sequence of the gene env AMOEL3 in Fig. 30) containing the peptide EL3, the peptide sequence of which is shown in Table 4 (see nucleotide sequence in Fig. 31).
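Hybridizing two synthetic oligonucleotides presupposes that one is the reverse complement of the other. The following Python sketch checks this for the elast3U and elast3L sequences as printed above; the helper names `reverse_complement` and `anneals_fully` are illustrative and not part of the original protocol.

```python
# Complement mapping for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove the spaces used to print the sequence in triplets."""
    return "".join(seq.split()).upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def anneals_fully(upper: str, lower: str) -> bool:
    """True if `lower` is the exact reverse complement of `upper`."""
    return reverse_complement(upper) == clean(lower)


# Sequences copied from the text (written 5' to 3').
ELAST3U = ("TTT ATG GTC ACC GCG GCC GCA CCT GGG GTA GGG GCT CCG GGG "
           "GTA GGG GCT CCT GGG GTG GCC ATA TAA")
ELAST3L = ("TTA TAT GGC CAC CCC AGG AGC CCC TAC CCC CGG AGC CCC TAC "
           "CCC AGG TGC GGC CGC GGT GAC CAT AAA")

print(anneals_fully(ELAST3U, ELAST3L))
```

The same check can be applied to any other upper/lower oligonucleotide pair in this example before annealing.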
The oligonucleotides UpEl5: (5'-GAT GTA CCT GGG GTA GGC GCC CCT
GGA GTC GGG GCT CCT GGG GTA GGA TTC AT) and LowEl5: (5'-ATG AAT
CCT ACC CCA GGA GCC CCG ACT CCA GGG GCG CCT ACC CCA GGT ACA
TC) were hybridized together. The resulting double-stranded DNA fragment was digested with the EcoNI restriction enzyme and cloned in the FBAMOEL3SALF expression plasmid, previously opened at EcoNI. The result is the plasmid FBAMOEL5SALF (see sequence of the gene env AMOEL5 in Fig. 32) containing the peptide EL5, the peptide sequence of which is shown in Table 4 (see nucleotide sequence in Fig. 33).
The oligonucleotides DELASTIN3-V Upper: (5'-GTC ACC GCG GCC GTC
CCT GGG GTA GGG GTG CCG GGG GTA GGG GTG CCT GGG GTG GCC ATA
TAA) and DELASTIN3-V Lower (5'-TTA TAT GGC CAC CCC AGG CAC CCC TAC
CCC CGG CAC CCC TAC CCC AGG GAC GGC CGC GGT GAC) were hybridized together. The resulting double-stranded DNA fragment was digested with the EaeI restriction enzyme and cloned in the FBAMOSALF expression plasmid, previously opened at NotI.
The result is the plasmid FBAMOEL3-VSALF (see sequence of the gene AMOEL3-V in Fig. 34) containing the EL3-V peptide, the peptide sequence of which is shown in Table 4 (see nucleotide sequence in Fig. 35).
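The EaeI digestion step can be illustrated by locating the recognition sites in the annealed fragment. The sketch below assumes the commonly cited EaeI recognition sequence YGGCCR (Y = C or T, R = A or G) and scans the DELASTIN3-V Upper strand as printed above; the function name `find_sites` is hypothetical.

```python
import re

# Assumed EaeI recognition sequence YGGCCR, written as a regular expression.
EAEI_PATTERN = "[CT]GGCC[AG]"

DELASTIN3_V_UPPER = ("GTC ACC GCG GCC GTC CCT GGG GTA GGG GTG CCG GGG GTA "
                     "GGG GTG CCT GGG GTG GCC ATA TAA")


def find_sites(seq: str, pattern: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched site) for every occurrence of `pattern`.

    A lookahead is used so that overlapping sites would also be reported.
    """
    cleaned = "".join(seq.split()).upper()
    return [(m.start(), m.group(1))
            for m in re.finditer(f"(?=({pattern}))", cleaned)]


for start, site in find_sites(DELASTIN3_V_UPPER, EAEI_PATTERN):
    print(start, site)
```

Under this assumption two sites are found, one near each end of the oligonucleotide, which is consistent with the fragment being trimmed on both sides before ligation into the NotI-opened plasmid.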
The oligonucleotides DELASTIN3-I Upper: (5'-GTC ACC GCG GCC GTC
ATA GGG GTA GGG GTG ATT GGG GTA GGG GTG ATC GGG GTG GCC ATA
TAA) and DELASTIN3-I Lower (5'-TTA TAT GGC CAC CCC GAT CAC CCC TAC
CCC AAT CAC CCC TAC CCC TAT GAC GGC CGC GGT GAC) were hybridized together. The resulting double-stranded DNA fragment was digested with the EaeI restriction enzyme and cloned in the FBAMOSALF expression plasmid, previously opened at NotI.
This resulted in the plasmid FBAMOEL3-ISALF containing the peptide EL3-I, the peptide sequence of which is shown in Table 4.
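To make the relationship between the EL3-V and EL3-I constructs concrete, the sketch below translates the two upper-strand oligonucleotides with the standard genetic code and lists the positions at which the encoded peptides differ. It assumes that the reading frame starts at the first printed base, which is an assumption of this sketch rather than a statement from the source; the helper names are illustrative.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate a DNA sequence in frame 1, stopping at the first stop codon."""
    cleaned = "".join(seq.split()).upper()
    peptide = []
    for i in range(0, len(cleaned) - 2, 3):
        aa = CODON_TABLE[cleaned[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


DELASTIN3_V_UPPER = ("GTC ACC GCG GCC GTC CCT GGG GTA GGG GTG CCG GGG GTA "
                     "GGG GTG CCT GGG GTG GCC ATA TAA")
DELASTIN3_I_UPPER = ("GTC ACC GCG GCC GTC ATA GGG GTA GGG GTG ATT GGG GTA "
                     "GGG GTG ATC GGG GTG GCC ATA TAA")

pep_v = translate(DELASTIN3_V_UPPER)
pep_i = translate(DELASTIN3_I_UPPER)

# Positions (0-based) where the two encoded peptides differ.
substitutions = [(i, a, b) for i, (a, b) in enumerate(zip(pep_v, pep_i)) if a != b]
print(pep_v)
print(pep_i)
print(substitutions)
```

In this frame the only differences are three proline-to-isoleucine replacements, one per VPGVG-type repeat, which matches the description of EL3-I as the variant in which the proline of each beta-turn is replaced by an isoleucine.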
The oligonucleotides UpXhoD84K: (5'-AGG CTG CTC GAG AAA ATG CGA
AGA ACC TTT AAC CTC CC) and LoXhoD84K: (5'-ATT TTC TCG AGC AGC CTG
GGC TGC TGC CCC C) were synthesized. Starting from the oligonucleotides 805FC and LMOADeltaPRO3 (see sequence above), the pairs 805FC/LoXhoD84K or UpXhoD84K/LMOADeltaPRO3 were used independently for PCR amplification of two DNA fragments starting from the FBAMOSALF template. These two DNAs were digested by the enzymes NotI/XhoI and XhoI/BamHI respectively and co-ligated in one or other of the three plasmids FBAMOSALF, FBAMODeltaPROSALF, and FBAMOProSALF previously opened at NotI and BamHI. The resulting plasmids express respectively the envelopes AMOD84K, AMODeltaProD84K, and AMOProD84K.
Two DNA fragments of 2005 bp and 241 bp were isolated from the plasmid FBAMOG1X (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223. (1997)) by digestion with the restriction enzymes NdeI/XhoI and XhoI/BstEII respectively.
These two inserts were cloned in the plasmid FBAMOD84KSALF previously digested by the enzymes NdeI and BstEII, resulting in a plasmid capable of expressing the AMOG1XD84K envelope.
Results and Discussion.
Expression and viral incorporation of the chimeric envelopes. The expression plasmids for the envelopes AMO, AMODeltaPRO, AMOPRO, AMOEL3, AMOEL5, AMOEL3-V, AMOEL3-I, AMO1FX, AMOG1X, AMOD84K, AMODeltaPROD84K, AMOPROD84K, AMOG1XD84K, AMODeltaPRO2 (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223. (1997)), and AMODeltaPRO4 (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223. (1997)) were introduced by transfection into the cells of the TELCeB6 line (Cosset et al., Journal of Virology 69:7430-7436. (1995b)). After selection by phleomycin, the phleomycin-resistant colonies were combined for each DNA and virions were generated and analysed following the procedures originally described (Cosset et al., Journal of Virology 69:6314-6322. (1995a)).
These various chimeric envelopes are normally expressed and matured in the cells, and, moreover, efficiently incorporated on the viral particles (results not shown).
The binding tests that were carried out show that these retroviruses can bind specifically on human cells by means of the targeted surface molecule Ram-1 (results not shown).
These various viruses were used for infecting cells expressing either Rec-1 only (Cerd9), or Ram-1 only (CHO-Ram-1), or the two molecules Ram-1 and Rec-1 (Cear13). The results of titration of these viruses are presented in Table 4.
These results can be summarized as follows:
- in an AMO envelope, substitution of the spacer peptide by three beta-turns of a synthetic (AMOEL3) or natural polyproline helix, described in the literature (AMOEL3-V, from bovine elastin), confers, with regard to capacity for masking the function of the ecotropic envelope and for regulating the cooperation of the Ram-1 and Rec-1 receptors, a phenotype similar to the viruses bearing the "AMO" envelopes containing the cooperating spacer peptides DeltaPRO2, DeltaPRO, DeltaPRO4, or PRO. Since the peptides derived from elastin (AMOEL3-V and AMOEL3) are arranged as a type II
polyproline helix, it can be suggested on the basis of the results obtained that regulation of the cooperation of the Ram-1 and Rec-1 receptors by the DeltaPro and Pro peptides is probably due to their presumed secondary structure, as a type II polyproline helix.
Moreover, mutations introduced into the spacer peptide derived from elastin (AMOEL3-V) and having the purpose of destroying the folding of the peptide into a type II
polyproline helix (AMOEL3-I, mutations obtained by replacing the proline of each beta-turn with an isoleucine) lead to cancellation of receptor cooperation.
- destruction of the capacity for binding to the ecotropic receptor (D84K mutations) stops receptor cooperation for the envelopes containing cooperating spacer peptides, especially PRO (see results AMOPRO vs AMOPROD84K), but does not affect the functionality of the control envelopes bearing the flexible spacer peptide G1X (see results AMOG1X vs AMOG1XD84K). We deduce from this that binding to the ecotropic receptor is necessary for infection, in a second stage, following fixation on the Ram-1 receptor.
Note that the present results show that in the case of the retroviruses generated with the chimeric envelope AMOPro, the binding domain to the ecotropic receptor is masked (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223. (1997)). The results, taken together, are therefore compatible with a model of two-stage interaction in which:
- in its "naive" configuration, i.e. when it has not been permitted to interact with a cell, the "AMOPRO" retrovirus can potentially interact with the targeted "primary" receptor (the Ram-1 molecule), but cannot directly interact with the auxiliary receptor (the Rec-1 molecule). This masking seems to be due to a first property of the Pro spacer peptide.
- when this virus is permitted to interact with Ram-1, a local change in conformation occurs at the level of the Pro spacer peptide which will make the binding domain to Rec-1 accessible. This change in conformation is due to a second property of the Pro spacer peptide.
- if the Rec-1 receptor is present at the surface of the same cell that has Ram-1 and on which the virus is bound, then in a second stage, this receptor will serve as an entry molecule for the virus.

Table 4. Results of titration.

env               sequence of the spacer peptide (a)                    Cear13 (b)   CHO-Ram-1 (b)   Cerd9 (b)
AMO               NVG AAA PHQV                                          +            +               +
AMOD84K           NVG PRVPIGPNPAA PHQV                                  +            +               -
AMODeltaPro2      NVG PRVPIGPNPAA PHQV                                  ++++         +++
AMO1FX            NVG AAAIEGRASPGSS PHQV                                ++++         +++             +++
AMODeltaPro       NVG PRVPIGPNPVLPDAAA PHQV                             ++++         +++             -
AMODeltaProD84K   NVG PRVPIGPNPVLPDAAA PHQV                             +++          +++             -
AMOEL3            NVG AAAPGVGAPGVGAPGVAA PHQV                           +++          +               -
AMOEL3-V          NVG AAVPGVGVPGVGVPGVAA PHQV                           +++          +               -
AMOEL3-I          NVG AAVIGVGVIGVGVIGVAA PHQV                           +++          +++
AMOG1X            NVG AAAGGGGSIEGRASPGSS PHQV                           +++          ++              ++
AMOG1XD84K        NVG AAAGGGGSIEGRASPGSS PHQV                           ++           ++
AMODeltaPro4      NVG PRVPIGPNPVLPDQRLPSSAA PHQV                        +++          ++              -
AMOEL5            NVG AAAPGVGAPGVGAPGVGAPGVGAPGVAA PHQV                 +++          -               -
AMOPRO            NVG PRVPIGPNPVLPDQRLPSSPIEIVPAPQPPSP...
                  ...LNTSYPPSTTSTPSTSPTSPSVPQPPPAAA PHQV                +++          -               -
AMOPROD84K        NVG PRVPIGPNPVLPDQRLPSSPIEIVPAPQPPSP...
                  ...LNTSYPPSTTSTPSTSPTSPSVPQPPPAAA PHQV                -            -

a: ... envelope. "PHQV" represents the amino acids 7 to 10 of the envelope of MoMLV and "NVG" represents the last 3 amino acids of the binding domain to Ram-1.
b: relative titres obtained on the cells indicated: Cear13, expressing the receptors Ram-1 and Rec-1; CHO-Ram-1, expressing Ram-1 only; Cerd9, expressing Rec-1 only.
EXAMPLE 3.
The development of strategies of targeting gene transfer by means of the construction of chimeric envelope glycoproteins, generated by N-terminal insertions of ligands, comes up against the difficulty, in particular, of low capacity, or even incapacity, of the interaction between virus and targeted surface molecule to activate fusion of these targeting envelopes (Cosset and Russell, Gene Therapy 3:946-956 (1996)). The possibility of causing two surface molecules to cooperate (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223. (1997)), one being the targeted receptor or cell surface molecule of attachment, the other being a (retro)viral receptor specialized for fusion or auxiliary surface molecule, makes it possible to envisage a means of overcoming this problem of low gene-fusion capacity of chimeric envelopes and more generally of low efficiency of the targeting retroviruses. The cooperation of receptors was tested in three models of targeting, in which the following three cell surface molecules serve as points of attachment for the targeting retroviruses: (i) receptor of EGF (epidermal growth factor), and (ii) class I molecule of human MHC. The binding domains for these two surface molecules are either growth factors (EGFR), or a single-chain antibody (MHC-I). These ligands were inserted by fusion at the N-terminal end of the amphotropic MLV envelope (4070A) and various peptides from the proline-rich region carried by the SU subunit of the amphotropic MLV virus were inserted between the ligands and the envelope (see Table 5).
Materials and Methods
DNA fragments coding for the spacer peptides DeltaPro2, DeltaPro3, DeltaPro4, and Pro (see Table 5) were generated by PCR using the gene env 4070A as DNA template, with, at 5', the oligonucleotide PRO-5-NE (5'-ATC GAG GTC ACC GCG GCC GCG GGA CCC CGA GTC CCC ATA GGG CCC), which is the same for the four PCR fragments, and, as 3' oligonucleotides, the sequences AMODPRO(-H + P-A): (5'-TAT GAG CGG CCG GGT TGG GCC CTA TGG GGA C), DPro3: (5'-TTA TAC GGC CGT GTC GGG TAA TAC TGG), AMODPRO(+H+S-A): (5'-TAT GTG CGG CCG AGG AAG GGA GTC TTT GGT C) and PRO-3-NE: (5'-ATA ATC GGC CGG GGG TGG CTG TGG GAC).
The corresponding DNA fragments were digested by the enzyme EagI and inserted separately in the plasmid FBEASALF (expressing the chimeric envelope glycoproteins EA) (Cosset et al., Journal of Virology 69:6314-6322. (1995a)) previously opened at the NotI restriction site. The resulting plasmids express the envelopes EADeltaPro2, EADeltaPro3, EADeltaPro4, and EAPro.
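The cloning step relies on each PCR primer carrying a site that EagI can cut, so that the digested ends can be ligated into a NotI-opened vector. The sketch below checks the primers listed above for the commonly cited recognition sequences CGGCCG (EagI) and GCGGCCGC (NotI); these site definitions and the variable names are assumptions of this sketch, and the claim that the two enzymes leave compatible ends should be verified against the enzyme supplier's documentation.

```python
EAGI_SITE = "CGGCCG"    # assumed EagI recognition sequence
NOTI_SITE = "GCGGCCGC"  # assumed NotI recognition sequence (contains an EagI site)


def clean(seq: str) -> str:
    """Remove the spaces used to print the sequence in triplets."""
    return "".join(seq.split()).upper()


# The 5' oligonucleotide shared by the four PCR fragments, as printed in the text.
FORWARD_PRIMER = "ATC GAG GTC ACC GCG GCC GCG GGA CCC CGA GTC CCC ATA GGG CCC"

# The four 3' oligonucleotides, as printed in the text.
REVERSE_PRIMERS = {
    "AMODPRO(-H + P-A)": "TAT GAG CGG CCG GGT TGG GCC CTA TGG GGA C",
    "DPro3": "TTA TAC GGC CGT GTC GGG TAA TAC TGG",
    "AMODPRO(+H+S-A)": "TAT GTG CGG CCG AGG AAG GGA GTC TTT GGT C",
    "PRO-3-NE": "ATA ATC GGC CGG GGG TGG CTG TGG GAC",
}

forward_has_noti = NOTI_SITE in clean(FORWARD_PRIMER)
reverse_have_eagi = {name: EAGI_SITE in clean(seq)
                     for name, seq in REVERSE_PRIMERS.items()}

print("5' primer contains a NotI site:", forward_has_noti)
for name, present in reverse_have_eagi.items():
    print(f"{name} contains an EagI site:", present)
```

Because the NotI sequence contains the EagI sequence, a single EagI digestion would be expected to cut both ends of each PCR product under these assumptions.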
The NdeI/NotI fragment containing the promoter FB29 as well as the scFv anti-MHC-I provided with the signal peptide of the envelope glycoprotein of the MoMLV virus (Marin et al., Journal of Virology 70:2957-2962. (1996)) was cloned in the FBEASALF plasmid from which the NdeI/NotI fragment was removed beforehand. This results in the plasmid FB34ASALF capable of expressing a 4070 chimeric envelope with the scFv fused at its N-terminal end. This plasmid was then opened at NotI for inserting the spacer peptides DeltaPro2, DeltaPro3, DeltaPro4, and Pro (see Table 5) previously digested with the EagI enzyme. This results in a series of expression vectors for the envelopes 34DeltaPro2, 34DeltaPro3, 34DeltaPro4, and 34Pro.
Results and Discussion.
These various DNAs were introduced by transfection into the cells of the TELCeB6 line and retroviruses were generated following the usual procedure (see examples 1 and 2). It was shown that these retroviruses correctly express the chimeric envelope glycoproteins and that the latter permit efficient redirection of binding of the viral particles on the specific cellular targets (results not shown).
The viruses produced with the chimeric envelopes of the various groups were used for infecting cells that only express the amphotropic receptor and not the targeted surface molecule. The results of titration of these viruses are shown in Table 5.
These results show that it is possible to mask the functions of the amphotropic envelope by means of fragments from the proline-rich region. In the case of chimeras effected with EGF, it is necessary to insert at least five beta-turns to obtain a significant masking effect, and insertion of the whole of the proline-rich region leads to complete inhibition. For the chimeras effected with scFv anti-MHC-I, three beta-turns are required to obtain a complete masking effect.
Table 5. Results of titration.

peptides (a)                                                       ligand for (b):
name          sequence                                             EGFR     MHC-I
without (c)   AAA PHQV                                             6e3      39e2
DeltaPro2     AAA GPRVPIGPNPAA PHQV                                7e3      18e1
DeltaPro3     AAA GPRVPIGPNPVLPDTAA PHQV                           1.2e3    <1e1
DeltaPro4     AAA GPRVPIGPNPVLPDQRLPSSAA PHQV                      7e1      <1e1
Pro           AAA GPRVPIGPNPVLPDQRLPSSPIEIVPAPQPP...
              ...SPLNTSYPPSTTSTPSTSPTSPSVPQPPPAA PHQV              <1e1     <1e1

a: peptide inserted between the targeting binding domain and the 4070A envelope. "AAA" codes for the NotI site used for effecting fusion in the chimeric envelope; "PHQV" represents the amino acids 4 to 7 of the amphotropic envelope. The beta-turns are underlined.
b: titration on Cear13 cells for the EGFR targeting envelopes (ligand: EGF) and for the targeting envelopes targeting MHC-I (ligand: scFv anti-MHC-I).
c: the ligand is directly fused at the end of the amphotropic SU (with the 4th amino acid), and does not have a spacer peptide.
EXAMPLE 4.
The previous investigations made it possible to delimit the C-terminal ends of the cooperating peptides and to determine the number of turns of type II
polyproline helix necessary for obtaining a masking effect and a minimal cooperative effect. In the case of the model of the AMO chimeric envelopes (see above), a minimum of two turns of the helix is sufficient (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223.
(1997)).
However, for chimeric envelopes generated with other binding domains than that for Ram-1 (in the case of AMO chimeras) and using the amphotropic envelope as support envelope, the cooperative effect is less marked, on the one hand because masking of the functions of the amphotropic envelope requires four turns of polyproline helix (see Table 5) and on the other hand because activation of the functions of the amphotropic envelope is less strong following binding of the viruses on the targeted surface molecules. One possible explanation is that, in the model of the AMO chimeras, apart from the PRO spacer peptide, the binding domain to Ram-1 itself carries important determinants for inducing, in a concerted manner with this PRO peptide, activation of the functions of the ecotropic envelope. The binding domain for Ram-1 is in fact a fragment of retroviral envelope (derived from the amphotropic envelope) which is naturally located immediately upstream of the proline-rich region. In order to determine the presence and the importance of such regions in receptor cooperation, chimeric envelopes were constructed combining a targeting domain with the amphotropic envelope and, inserted between these two polypeptides, various peptides tested for their cooperative effect containing notably the proline-rich region (or a fragment of this region) combined with peptide fragments derived from the N-terminal domain of the amphotropic envelope.
Materials and Methods
DNA fragments coding for the spacer peptides DeltaPro4-beta, DeltaPro4-int, DeltaPro4-vrb and.. were generated by PCR using the gene env 4070A as DNA template, with, at 3', the oligonucleotide AMODPRO(+H+S-A): (5'-TAT GTG CGG CCG AGG AAG GGA GTC TTT GGT C) and, at 5', the oligonucleotides UPro-beta: (5'-ATG CTG GCG GCC GCG GAT CCT ATT ACC ATG TTC TCC CTG ACC CGG C), UPro-int: (5'-ATG CTG GCG GCC GCG AAC CCT CTA GTC CTA GAA TTC ACT GAT GC), and UPRO-vrb: (5'-ATG CTG GCG GCC GCG GAA ACC ACC GGA CAG GCT TAC TGG AAG CCC), respectively (see Figs. 36 to 38).
DNA fragments coding for the spacer peptides Pro-beta, Pro-int and Pro-vrb were generated by PCR using the 4070A env gene as DNA template, with the oligonucleotide PRO-3-NE (ATA ATC GGC CGG GGG TGG CTG TGG GAC) as 3' primer and, as 5' primers, the oligonucleotides UPro-beta (5'-ATG CTG GCG GCC GCG GAT CCT ATT ACC ATG TTC TCC CTG ACC CGG C), UPro-int (5'-ATG CTG GCG GCC GCG AAC CCT CTA GTC CTA GAA TTC ACT GAT GC), and UPRO-vrb (5'-ATG CTG GCG GCC GCG GAA ACC ACC GGA CAG GCT TAC TGG AAG CCC), respectively (see Figs. 39 to 41).
These DNA fragments were digested with the EagI enzyme and inserted either in the FBEASALF plasmid (see above), resulting in production of the expression vectors for the chimeric envelopes EADeltaPro4-beta, EADeltaPro4-int, EADeltaPro4-vrb, EAPro-beta, EAPro-int and EAPro-vrb, or in the FB34ASALF plasmid (see above), resulting in production of the expression vectors for the chimeric envelopes 34ADeltaPro4-beta, 34ADeltaPro4-int, 34ADeltaPro4-vrb, 34APro-beta, 34APro-int and 34APro-vrb.
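The cloning step above relies on each PCR fragment carrying EagI sites at both ends. As a minimal sketch, the snippet below scans the primer sequences quoted in this section for the EagI recognition sequence (CGGCCG, the standard site for this enzyme). The helper name find_site and the dictionary layout are illustrative assumptions and are not part of the original disclosure.

```python
# Primer sequences as quoted in the text above, written 5'->3' with spaces removed.
PRIMERS = {
    "AMODPRO(+H+S-A)": "TATGTGCGGCCGAGGAAGGGAGTCTTTGGTC",
    "PRO-3-NE": "ATAATCGGCCGGGGGTGGCTGTGGGAC",
    "UPro-beta": "ATGCTGGCGGCCGCGGATCCTATTACCATGTTCTCCCTGACCCGGC",
    "UPro-int": "ATGCTGGCGGCCGCGAACCCTCTAGTCCTAGAATTCACTGATGC",
    "UPRO-vrb": "ATGCTGGCGGCCGCGGAAACCACCGGACAGGCTTACTGGAAGCCC",
}

# Standard EagI recognition sequence (palindromic, so one strand is enough).
EAGI_SITE = "CGGCCG"


def find_site(seq, site=EAGI_SITE):
    """Return the 0-based index of the first occurrence of `site`, or -1."""
    return seq.upper().replace(" ", "").find(site)


# Hypothetical helper output: position of the EagI site in each primer.
EAGI_POSITIONS = {name: find_site(seq) for name, seq in PRIMERS.items()}

if __name__ == "__main__":
    for name, pos in EAGI_POSITIONS.items():
        status = f"EagI site at index {pos}" if pos >= 0 else "no EagI site"
        print(f"{name}: {status}")
```

In the three upstream primers the EagI site sits inside the longer sequence GCGGCCGC, which is also the NotI recognition site; whether that was intended by the authors is not stated in the text.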
BIBLIOGRAPHY
1. Battini, J. L., O. Danos, and J. M. Heard. 1995. Receptor-binding domain of murine leukemia virus envelope glycoproteins. J. Virol. 69:713-719.
2. Battini, J. L., P. Rodrigues, R. Müller, O. Danos, and J.-M. Heard. 1996. Receptor-binding properties of a purified fragment of the 4070A amphotropic murine leukemia virus envelope glycoprotein. J. Virol. in press.
3. Bell, G. I., N. M. Fong, M. M. Stempien, M. A. Wormsted, D. Caput, L. Ku, M. S. Urdea, L. B. Rall, and R. Sanchez-Pescador. 1986. Human epidermal growth factor precursor: cDNA sequence, expression in vitro and gene organization. Nucleic Acids Res. 14:8427-8446.
4. Cosset, F.-L., C. Legras, Y. Chebloune, P. Savatier, P. Thoraval, J. L. Thomas, J. Samarut, V. M. Nigon, and G. Verdier. 1990. A new avian leukosis virus-based packaging cell line that uses two separate transcomplementing helper genomes. J. Virol. 64:1070-1078.
5. Cosset, F.-L., C. Legras, J. L. Thomas, R. M. Molina, Y. Chebloune, C. Faure, V. M. Nigon, and G. Verdier. 1991. Improvement of avian leukosis virus (ALV)-based retrovirus vectors by using different cis-acting sequences from ALVs. J. Virol. 65:3388-3394.
6. Cosset, F.-L., F. J. Morling, Y. Takeuchi, R. A. Weiss, M. K. L. Collins, and S. J. Russell. 1995a. Retroviral retargeting by envelopes expressing an N-terminal binding domain. J. Virol. 69:6314-6322.
The invention relates to an alternative that is beneficial with regard to performance in targeting, particularly because it combines specific recognition of the target cell and entry into the target cell connected with a natural retroviral mechanism, known for its efficiency.
The invention relates more particularly to a two-stage targeting mechanism:
- the first stage permitting recognition of a targeted surface molecule by means of the new N-terminal binding domain, inserted in an envelope glycoprotein,
- the second stage permitting conditional recognition of a normal retroviral receptor via a domain inherent in the initial envelope glycoprotein and thus permitting a relay in the process of entry of the viral particle into the cell, the term "conditional" signifying that the relay in the entry mechanism can only be effected if the viral particle has previously interacted with the initial surface molecule, which in turn guarantees that the infection is truly targeted.
The invention relates to new peptides for carrying out the first stage in a two-stage mechanism, which perform the role of "masking" with respect to the second stage for as long as the first stage has not taken place, and which permit the second stage, i.e. perform the role of "unmasking" with respect to the second stage, if, and only if, the first stage has taken place.
The present invention also relates to the construction of chimeric envelope glycoproteins using these novel peptides.
The invention relates to the use of a peptide for transfer of genes into a target eukaryotic cell, this peptide containing from about 10 to about 200, especially from about 15 to about 150 amino acids, and preferably about 20 amino acids, in which at least 30% of the amino acids are made up of proline residues, these proline residues being regularly arranged so as to induce turnings of the polypeptide chain of about 180° ("β-turn" or "reverse-turn"), these turnings being evenly spaced and forming a polyproline helix with β-type turning ("polyproline β-turn helix"), in a polypeptide construction containing, on the N-terminal side (upstream) of the said peptide, an N-terminal (upstream) protein region capable of recognizing a targeted surface molecule or an antigen expressed on a cell surface, especially a suitable receptor (targeted receptor) located on the said eukaryotic cell, and on the C-terminal side (downstream) of the said peptide, a C-terminal (downstream) protein region capable of recognizing a suitable receptor (auxiliary receptor) located on the aforesaid eukaryotic cell, this peptide being capable of promoting or inhibiting interaction between the C-terminal (downstream) protein region and the auxiliary receptor, inhibition of this interaction occurring for as long as the N-terminal (upstream) protein domain has not interacted with the targeted receptor and promotion of interaction between the C-terminal (downstream) protein domain and the auxiliary receptor occurring when the N-terminal (upstream) protein domain has interacted with the targeted receptor.
In the case of a peptide of 20 amino acids (ΔPRO, defined below), this polyproline β-turn helix contains four β-turnings and therefore four turns, and moreover is incompatible with an α-helix or β-sheet secondary structure. Advantageously, the polyproline helix with β-type turning positioned between the two domains of the chimeric protein (N-terminal domain and auxiliary domain) intrinsically possesses: 1) an elastomeric force, 2) the property of self-assembly with other polyproline helices, probably in connection with the trimeric nature of the envelope, 3) the property of transmitting, to the auxiliary domain, a distortion that is induced by binding of the N-terminal domain to its receptor, causing activation of the auxiliary domain.
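The length and proline-content criteria stated above can be checked mechanically. The following is a minimal sketch under stated assumptions: the function names are illustrative, the "about 10 to about 200" bounds are treated as hard limits for simplicity, and the example sequence is hypothetical and is not one of the peptides of the invention. The sketch does not test the regular spacing of the proline residues or the formation of β-turns, which the text also requires.

```python
def proline_fraction(peptide):
    """Fraction of residues that are proline (one-letter code 'P')."""
    seq = peptide.upper()
    return seq.count("P") / len(seq)


def meets_composition_criteria(peptide, min_len=10, max_len=200, min_pro=0.30):
    """Check only the length range and the minimum proline content.

    Assumption: the approximate bounds in the text are applied as exact limits.
    """
    if not (min_len <= len(peptide) <= max_len):
        return False
    return proline_fraction(peptide) >= min_pro


# Hypothetical 20-residue sequence used purely for illustration.
EXAMPLE = "GPPSPLPGPPSPLPGPPSPL"

if __name__ == "__main__":
    print(len(EXAMPLE), round(proline_fraction(EXAMPLE), 2),
          meets_composition_criteria(EXAMPLE))
```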
The invention also relates in general to any two-stage mechanism, in which the second stage can only be effected if the first stage has taken place, and relates for example to an enzymatic mechanism involving a chimeric protein which is only to occur if the chimeric protein is able to recognize its substrate.
The expression "N-terminal (upstream) protein domain capable of recognizing a targeted surface molecule, or an antigen expressed on a cell surface", means that:
1) the interaction between this N-terminal protein domain and the targeted surface molecule can be characterized by a dissociation constant (of nanomolar order with respect to interaction between wild-type retroviral envelope glycoprotein and retroviral receptor);
2) the soluble form of this N-terminal protein domain (i.e. not associated in the construction of the chimeric envelope glycoprotein) possesses binding characteristics similar to this same protein domain when it is inserted at the N-terminal position in the chimeric envelope glycoprotein;
3) the chimeric envelope glycoprotein containing the N-terminal protein domain can be characterized according to classical techniques of virology (e.g. binding test; cf. "Examples").
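To give a feel for what a dissociation constant of nanomolar order implies, the sketch below computes the equilibrium fraction of receptor occupied under the simplest one-site binding model, fraction = [L] / ([L] + Kd). This model and the numerical values are assumptions made for illustration; the text does not specify a binding model or particular concentrations.

```python
def fraction_bound(ligand_nM, kd_nM):
    """Equilibrium receptor occupancy for simple 1:1 binding.

    Assumptions: a single class of non-cooperative sites and free ligand
    concentration approximately equal to total ligand concentration.
    """
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_nM / (ligand_nM + kd_nM)


if __name__ == "__main__":
    # Hypothetical values: Kd of 1 nM, ligand from 0.1 nM to 100 nM.
    for ligand in (0.1, 1.0, 10.0, 100.0):
        print(f"[L] = {ligand:6.1f} nM -> occupancy {fraction_bound(ligand, 1.0):.3f}")
```

With this model, half of the receptors are occupied when the ligand concentration equals the dissociation constant.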
The following may be mentioned as examples of targeted surface molecule or of antigen expressed on a cell surface:
- markers for differentiating the various haematopoietic lineages, in particular markers expressed on immature cells and/or haematopoietic stem cells (example: CD34),
- markers expressed on tumour cells (example: carcino-embryonic antigens),
- markers present specifically on various differentiated tissues (example: receptors of growth factors, of peptide hormones).
As an example of a targeted surface molecule, we may mention in particular a receptor, which will be designated as targeted receptor hereinafter. For convenience of terminology, the expression "targeted receptor" will be used in the following to encompass any targeted surface molecule or any antigen expressed on a cellular surface.
The expression "C-terminal (downstream) protein domain capable of recognizing a suitable receptor (auxiliary receptor)" means that the C-terminal protein domain can interact with the auxiliary receptor, this interaction being characterized by a dissociation constant which is of nanomolar order if the C-terminal protein domain is derived from a retroviral envelope glycoprotein and if the auxiliary receptor is the retroviral receptor used by this same glycoprotein, this interaction permitting the triggering of the gene fusion process in a mechanism that is strictly similar to the natural process, i. e. outside of the context of a chimeric envelope glycoprotein.
The peptide that is the subject of the invention is such that, positioned between two protein domains (an N-terminal protein domain relative to the said peptide and a C-terminal protein domain relative to the said peptide), it can induce the function of the C-terminal protein domain (for example binding if that is the function of this C-terminal domain) if, and only if, the N-terminal protein domain has been mobilized in its function (for example binding).
Non-induction of the function of the C-terminal protein domain by the peptide of the invention corresponds to the mechanism of "masking" of the peptide of the invention, whereas induction of the function of the C-terminal protein domain by the peptide of the invention corresponds to the mechanism of "unmasking" of the peptide of the invention.
That is why the peptide of the invention will also be designated hereinafter as "masking/unmasking peptide".
The invention relates to the use of a peptide according to the invention, in the construction of a glycoprotein with targeting and gene-fusion activity, essentially intact, carried by a viral or non-viral recombinant gene-transfer vector capable of infecting a eukaryotic cell, the said eukaryotic cell possessing a targeted receptor and an auxiliary receptor permitting facilitation of entry of the said viral or non-viral vector into the eukaryotic cell, the aforesaid glycoprotein comprising:
- the aforesaid peptide,
- a protein domain on the N-terminal (upstream) side of the said peptide, capable of interacting with the above-mentioned targeted receptor, this protein domain permitting specific binding of the aforesaid gene-transfer vector, and
- a protein domain on the C-terminal (downstream) side of the said peptide, capable of interacting with the aforesaid auxiliary receptor, this interaction performing the role of auxiliary mechanism of entry of the aforesaid gene-transfer vector into the eukaryotic cell,
the process of cell entry of the viral or non-viral recombinant vector into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to occur when the N-terminal (upstream) protein domain has recognized and bound the viral or non-viral recombinant vector with the targeted receptor of the eukaryotic cell, leading, by means of the aforesaid peptide, to a mechanism of "unmasking" or of accessibility of the auxiliary receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the aforesaid gene-transfer vector and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, a mechanism of "masking" or of non-accessibility is produced, by means of the aforesaid peptide, of the auxiliary receptor with respect to the C-terminal (downstream) protein domain.
The expression glycoprotein with targeting and gene-fusion activity denotes a glycoprotein which is:
1) capable of being incorporated efficiently on (retro)viral particles carrying a transgene,
2) capable of specifically recognizing the targeted cell-surface molecule and of specifically redirecting the binding of the (retro)viral particle which carries it to this molecule,
3) capable of causing fusion, after fixation on the molecular target, of the membrane of the (retro)viral particle and the cytoplasmic membrane of the cell, according to the mechanism used naturally by the (retro)virus from which the envelope glycoprotein was derived.
The expression "substantially intact" refers to a viral glycoprotein that retains all its necessary determinants for preserving the post-translation processes:
oligomerization, the properties of viral incorporation and of fusion, as required. However, certain changes (such as mutations, deletions, additions) can be made to the glycoprotein without significantly affecting its functions and the glycoproteins containing these minor changes are regarded as substantially intact for the needs of the invention.
In particular, the glycoprotein may lack some amino acids (for example about 1 to 10), especially at the N-terminal end, but will generally be of the same size as the wild-type protein and possesses essentially the same biological properties as the wild-type protein.
The expression "viral recombinant gene-transfer vector" means any virus capable of infecting cells of the eukaryotic type, and preferably a virus that is suitable for gene therapy, such as an adenovirus or a retrovirus (for example a type C retrovirus).
The expression "non-viral recombinant gene-transfer vector" means macromolecular complexes combining the DNA containing the transferred gene, its regulatory sequences, and molecules belonging to the class of lipids, carbohydrates, or proteins, which possess functional properties capable of: 1) targeting deposition of DNA on the surface of the target cell, 2) introducing this DNA into the targeted cell, and 3) introducing this DNA into the nucleus of the targeted cell.
The expression "process of cell entry of the viral recombinant gene-transfer vector" means all of the events leading to introduction of the transported gene into the cytoplasm of the targeted cell following initial contact between the surface of this cell and the gene-transfer vector.
As an example, for retroviral vectors, in relation to a defined cellular target for which a "targetable" surface molecule is known (i.e. sufficiently specific relative to the other tissues) and a ligand for the surface molecule (ligand or single-chain antibody), a gene coding for the envelope glycoprotein targeting this surface molecule can be constructed genetically. This is accomplished by fusing (from N to C-terminal) a signal peptide, the ligand, the "masking/unmasking" peptide, and the rest of the retroviral envelope. An expression vector for this chimeric molecule is inserted into a "semi-transcomplementing" cell line expressing the gag and pol proteins of the MLV virus (coding for the viral capsid and the enzymes of replication of the retrovirus). A "transcomplementing" line is obtained, which can then be used for producing retroviral vectors if a plasmid carrying this retroviral vector is additionally introduced, as occurs with the conventional transcomplementing lines expressing normal retroviral envelopes.
The invention also relates to the use of a peptide according to the invention, in the construction of an essentially intact (retro)viral envelope glycoprotein, carried by a recombinant (retro)viral particle capable of infecting a eukaryotic cell, the said envelope glycoprotein preferably being of polymeric form, and especially of trimeric form, each monomer of the polymeric form being in its turn of heterodimer form, the said eukaryotic cell possessing a targeted receptor and an auxiliary receptor permitting facilitation of entry of the aforesaid (retro)viral particle ((retro)viral receptor) into the eukaryotic cell, the envelope glycoprotein comprising:
- the aforesaid peptide,
- a protein domain on the N-terminal side (upstream) of the aforesaid peptide, capable of interacting with the aforesaid targeted receptor, this interaction permitting specific binding of the (retro)viral particle, and
- a protein domain on the C-terminal side (downstream) of the aforesaid peptide, capable of interacting with the aforesaid (retro)viral receptor, this interaction performing the role of auxiliary mechanism of entry of the (retro)viral particle into the eukaryotic cell,
the process of cell entry of the (retro)viral recombinant particle into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to occur when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the (retro)viral recombinant particle, leading, via the aforesaid peptide, to a mechanism of "unmasking" or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the viral recombinant particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, a mechanism of "masking" or of non-accessibility is produced, by means of the aforesaid peptide, of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain.
The (retro)viral envelope glycoproteins are trimers of heterodimers with surface subunit (SU) and transmembrane subunit (TM). This concept of trimerization is fundamental for the functionality of the (retro)viral envelope. The envelope glycoproteins of the invention are preferably of trimeric form.
According to an advantageous embodiment of the invention, the N-terminal (upstream) protein domain is chosen from the following polypeptides:
- single-chain antibodies recognizing cell-surface molecules,
- any ligand for a cell-surface molecule, especially polypeptide hormones, cytokines, growth factors.
According to an advantageous embodiment of the invention, the C-terminal (downstream) protein domain corresponds to a (retro)viral envelope glycoprotein, essentially intact, including the natural binding domain, the functions of fusion and of attachment of the wild-type envelope glycoprotein from which is derived the envelope glycoprotein carried by the recombinant (retro)viral particle.
According to an advantageous embodiment of the invention, the peptide originates from the envelope glycoprotein of type C retroviruses, the virus preferably being chosen from: the ecotropic MLV virus, the amphotropic MLV virus, the xenotropic MLV virus, the MCF MLV virus, the MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and the peptide especially being chosen from those containing or consisting of one of the following sequences: PRO(4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb.
The invention relates to the use of a peptide derived or adapted from bovine elastin and chosen from those containing or consisting of one of the following sequences: EL3, EL3-V, EL5.
The invention also relates to peptide sequences chosen from those containing or consisting of one of the following sequences:
- PRO(4070A), PRO(MoMLV), PROβ, PRO+, ΔPRO, ΔPROβ, ΔPRO+, MOAPRO, MOAΔPRO,
- EMOPRO, EMOPROβ, EMOPRO+, EAPRO, EAPROβ, EAPRO+, EMOΔPRO, EMOΔPROβ, EMOΔPRO+, EAΔPRO, EAΔPROβ, EAΔPRO+, AMOEL3, AMOEL3-V, AMOEL5.
PRO(4070A), PRO(MoMLV), PROβ, PRO+, ΔPRO, ΔPROβ, ΔPRO+, EL3, EL3-V, EL5 are masking/unmasking peptides of the invention.
AMOPRO, AMOΔPRO, AMOEL3, AMOEL3-V, AMOEL5 correspond to Ram-1 targeting envelopes.
MOAPRO, MOAΔPRO correspond to Rec-1 targeting envelopes.
EMOPRO, EMOPROβ, EMOPRO+, EAPRO, EAPROβ, EAPRO+, EMOΔPRO, EMOΔPROβ, EMOΔPRO+, EAΔPRO, EAΔPROβ, EAΔPRO+ correspond to EGFR targeting envelopes.
The invention also relates to a polypeptide sequence containing:
- a peptide of about 10 to about 200, especially from about 15 to about 150 amino acids, and preferably about 20, in which at least 30% of the amino acids consist of proline residues, and these proline residues are regularly arranged so as to induce turnings of the polypeptide chain of about 180° ("β-turn" or "reverse-turn"), these turnings being regularly spaced and assembling themselves into a polyproline β-turn helix,
- an N-terminal protein domain (upstream) of the aforesaid peptide, capable of reacting with a suitable receptor (targeted receptor) located on a eukaryotic cell, this protein domain permitting specific binding of a recombinant (retro)viral particle containing the said N-terminal protein domain, and
- a C-terminal protein domain (downstream) of the aforesaid peptide, capable of interacting with a suitable auxiliary (retro)viral receptor ((retro)viral receptor) located on the said eukaryotic cell, this interaction performing the role of auxiliary mechanism of entry of the (retro)viral particle into the said eukaryotic cell,
the process of cell entry of the said recombinant (retro)viral particle into the said eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to occur when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the said recombinant (retro)viral particle, leading, by means of the aforesaid peptide, to a mechanism of unmasking or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the recombinant viral particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced a mechanism of masking or of non-accessibility, by means of the aforesaid peptide, of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain.
The invention also relates to a recombinant (retro)viral particle capable of infecting a eukaryotic cell, this cell containing a targeted receptor and an auxiliary receptor of the aforesaid (retro)viral particle, including a substantially intact envelope glycoprotein, especially of polymeric form and preferably of trimeric form, each monomer of the polymeric form preferably being itself of heterodimer form, containing:
- a peptide of about 10 to about 200, especially of about 15 to about 150 amino acids, and preferably of about 20, in which at least 30% of the amino acids are made up of proline residues, these proline residues being regularly arranged so as to induce turnings of the polypeptide chain of about 180° ("β-turn" or "reverse-turn"), these turnings being regularly spaced and assembling themselves into a polyproline β-turn helix,
- a protein domain on the N-terminal side (upstream) of the aforesaid peptide, capable of interacting with the aforesaid targeted receptor, this peptide region permitting specific binding of the (retro)viral particle, and
- a protein domain on the C-terminal side (downstream) of the aforesaid peptide, capable of interacting with the aforesaid (retro)viral receptor, this interaction performing the role of auxiliary mechanism of entry of the (retro)viral particle into the eukaryotic cell,
the process of cell entry of the recombinant (retro)viral particle into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to occur when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the recombinant (retro)viral particle, leading, via the aforesaid peptide, to a mechanism of unmasking or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the recombinant viral particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced a mechanism of masking or of non-accessibility, via the aforesaid peptide, of the retroviral receptor with respect to the C-terminal (downstream) protein domain.
The invention also relates to a recombinant (retro)viral particle characterized in that the N-terminal (upstream) protein domain is chosen from the following peptides:
- single-chain antibody recognizing cell surface molecules,
- any ligand for a cell surface molecule, especially polypeptide hormones, cytokines, growth factors.
The invention also relates to a recombinant (retro)viral particle characterized in that the C-terminal (downstream) protein domain corresponds to a polypeptide of (retro)viral origin possessing functions of binding, of fusion and of attachment of the wild-type envelope glycoprotein from which is derived the envelope glycoprotein carried by the recombinant (retro)viral particle, and can originate from natural regions possessing functions of binding, of fusion and of attachment of the envelope glycoproteins derived from retroviruses MLV-A, GALV, FeLVB, or viruses such as adenoviruses, herpesviruses, AAV (Adeno Associated Virus), or more generally from viral glycoproteins derived from viruses of eukaryotic origin, especially orthomyxoviruses (such as influenza viruses) or paramyxoviruses (such as SV5).
The invention also relates to a recombinant (retro)viral particle characterized in that the peptide is derived from the envelope glycoprotein of type C retroviruses, and in that the peptide is preferably derived from a virus chosen from: ecotropic MLV virus, amphotropic MLV virus, xenotropic MLV virus, MLV MCF virus, MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and especially in that the peptide is chosen from those containing or consisting of one of the following sequences: PRO(4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb.
The invention also relates to a recombinant (retro)viral particle characterized in that:
- the peptide originates from the envelope glycoprotein of type C retroviruses, and in that the virus is preferably chosen from: ecotropic MLV virus, amphotropic MLV virus, xenotropic MLV virus, MLV MCF virus, MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and especially in that the peptide is chosen from those containing or consisting of one of the following sequences: PRO(4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb,
- the N-terminal (upstream) protein domain is chosen from the following peptides:
* single-chain antibodies recognizing cell surface molecules,
* any ligand for a cell surface molecule, especially polypeptide hormones, cytokines, growth factors,
- the C-terminal protein domain corresponds to a polypeptide of (retro)viral origin possessing functions of binding, fusion and attachment of the wild-type envelope glycoprotein from which is derived the envelope glycoprotein carried by the recombinant (retro)viral particle, and can originate from natural regions possessing functions of binding, of fusion and of attachment of the envelope glycoproteins derived from retroviruses MLV-A, GALV, FeLVB, or from viruses such as adenoviruses, herpesviruses, AAV (Adeno Associated Virus), or more generally from viral glycoproteins derived from viruses of eukaryotic origin, especially orthomyxoviruses (such as influenza viruses) or paramyxoviruses (such as SV5).
The invention also relates to a recombinant (retro)viral particle characterized in that the 5' end of the nucleotide sequence coding for the N-terminal (upstream) protein domain is contiguous with the 3' end of the nucleotide sequence coding for the signal peptide, the 3' end of the nucleotide sequence coding for the N-terminal (upstream) protein domain is contiguous with the 5' end of the nucleotide sequence coding for the peptide, and the 3' end of the nucleotide sequence coding for the peptide is contiguous with the 5' end of the nucleotide sequence coding for the C-terminal (downstream) protein domain.
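The 5'-to-3' contiguity described above amounts to joining four coding segments in a fixed order. The following is a minimal sketch of that ordering; the Segment class, the segment names and the placeholder nucleotide strings are hypothetical, and the check that each segment is a whole number of codons is an added assumption (an in-frame fusion) rather than something stated in the text.

```python
from dataclasses import dataclass

# Order of the coding segments, 5' -> 3', as described in the text.
EXPECTED_ORDER = (
    "signal_peptide",
    "n_terminal_domain",
    "masking_unmasking_peptide",
    "c_terminal_domain",
)


@dataclass(frozen=True)
class Segment:
    name: str
    coding_seq: str  # placeholder nucleotides, not real sequences


def assemble(segments):
    """Concatenate the coding segments after checking order and reading frame."""
    names = tuple(s.name for s in segments)
    if names != EXPECTED_ORDER:
        raise ValueError(f"segments out of order: {names}")
    for s in segments:
        # Assumption: each segment is in frame, i.e. a whole number of codons.
        if len(s.coding_seq) % 3 != 0:
            raise ValueError(f"{s.name} is not a whole number of codons")
    return "".join(s.coding_seq for s in segments)


if __name__ == "__main__":
    parts = [
        Segment("signal_peptide", "ATGGCG"),
        Segment("n_terminal_domain", "AACTCC"),
        Segment("masking_unmasking_peptide", "CCTCCA"),
        Segment("c_terminal_domain", "GGGTAA"),
    ]
    print(assemble(parts))
```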
The invention also relates to a nucleic acid coding for a peptide or for a recombinant particle according to the invention.
The invention also relates to a method of selective in vitro or ex vivo transfer of a nucleic acid into eukaryotic target cells present among other non-target cells, comprising the administration to the target and non-target cells, of a recombinant (retro)viral particle according to the invention, containing the nucleic acid to be transferred.
The invention also relates to a pharmaceutical composition containing, as active substance, a (retro)viral particle according to the invention, and also containing a gene to be transferred, together with a physiologically suitable pharmaceutical vehicle.
With regard to genes to be transferred that are important for gene therapy, these are for example IFN, IL2, p53, VEGF, TNF, CFTR, HSV-TK, lacZ, GFP, genes of various cytokines, other types of suicide genes including conditional suicide genes, other genes with antiviral activity, other genes with antitumour activity, other marker genes and any gene for therapy of a mono- or multigenic disease. As an example, the pathologies most specifically involved are: most mono- or multigenic diseases (mucoviscidosis, myopathy, lysosomal diseases, various forms of cancer, viral diseases (AIDS), etc.).
For a proper understanding of the mechanism of the invention (see Fig. 1), we must bear in mind that the envelope glycoproteins according to the invention (also denoted by "chimeric envelopes") possess, as well as an additional recognition region, the functions corresponding to their own particular regions; that is (see Fig. 2), 1) the natural binding domain located in the N-terminal part of the surface subunit (SU) of the wild-type envelope glycoprotein, and therefore just downstream of the supernumerary binding domain, and 2) the fusion domain located in the C-terminal part of the subunit (SU) and in the transmembrane subunit (TM) of the envelope glycoprotein complex. For the chimeric envelopes constructed previously (EMO and AMO envelopes, for example), on the basis of the general structure shown diagrammatically in Fig. 2, the natural binding domain is functional. If the retroviral receptor that it recognizes is expressed at the surface of the target cell, then this domain will recognize it, and will permit infection to proceed. Then there will be no possibility of specific targeting, even if a surface molecule specifically recognizing the supernumerary binding domain is also expressed.
However, depending on the peptide inserted between the supernumerary binding domain and the natural binding domain, it is possible for the functionality of the natural binding domain to be adjusted considerably and, for some of these peptides, there can be effective prevention of its accessibility for recognition of the retroviral receptor (first action). It will be possible for this site to be unmasked, and hence rendered accessible to interaction with the normal retroviral receptor, if and only if the supernumerary binding domain has previously interacted with the targeted surface molecule. This second action is also mediated by the peptide separating the two domains. Here the normal retroviral receptor plays the role of auxiliary molecule.
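The conditional logic described above can be summarized as a small truth table. The sketch below is an idealized, all-or-nothing model written for illustration only: the function name can_enter and its parameters are assumptions, and real masking and unmasking are expected to be partial rather than binary.

```python
def can_enter(targeted_receptor_present, auxiliary_receptor_present, masked=True):
    """Idealized model of entry of a particle carrying a chimeric envelope.

    masked=True  : a masking/unmasking peptide separates the two domains.
    masked=False : a previously constructed chimera whose natural binding
                   domain stays functional regardless of the first stage.
    """
    # First stage: the supernumerary (N-terminal) domain binds its target.
    n_terminal_bound = targeted_receptor_present
    # The natural binding domain is accessible if it was never masked,
    # or if the first stage has taken place (unmasking).
    natural_domain_accessible = (not masked) or n_terminal_bound
    # Second stage: the natural domain engages the auxiliary retroviral receptor.
    return natural_domain_accessible and auxiliary_receptor_present


if __name__ == "__main__":
    for masked in (True, False):
        for targeted in (True, False):
            for auxiliary in (True, False):
                print(f"masked={masked!s:5} targeted={targeted!s:5} "
                      f"auxiliary={auxiliary!s:5} -> "
                      f"{can_enter(targeted, auxiliary, masked)}")
```

In this model the only difference between the two envelope types appears on cells that express the auxiliary receptor but not the targeted molecule: the masked envelope does not enter, whereas the unmasked one does, which is the loss of targeting described in the preceding paragraph.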
Symbols on the diagrams:
- Fig. 1 represents the two-stage entry process of the targeting viral particle. The viral particles are generated (A) with targeting envelope glycoproteins composed of an N-terminal domain (ligand, single-chain antibody etc.), of the masking/unmasking peptide, and a C-terminal domain (B). The stages giving rise to introduction of the virion into the targeted cell involve a mechanism that is coordinated by the masking/unmasking peptide (C).
- Fig. 2 is a schematic representation of some of the targeting envelopes investigated. The position of some functional regions is shown. Vertical arrows: sites of proteolytic cleavage. SU: surface subunit, TM: transmembrane subunit, SP: signal peptide, PRO: polyproline region, T: transmembrane domain, Ram-1 ligand: binding domain for the amphotropic receptor, Rec-1 ligand: binding domain for the ecotropic receptor, EGF: epidermal growth factor. Dark grey boxes: sequences derived from the env gene of MoMLV. Light grey boxes: sequences derived from the env gene of MLV-4070A. White boxes: other sequences derived from MLVs. Black boxes: spacer peptides derived from the polyproline region. All the env genes are expressed starting from the same promoter (LTR) and polyadenylation signal (pA), from the sub-genomic mRNA using the retroviral splicing sites, donor (SD) and acceptor (SA), with an identical intron sequence of 190 nt containing the end of the pol gene (ΔPOL). The position of some restriction sites is shown.
- Fig. 3 shows the sequence of the spacer peptides and of the binding domains investigated. (A) Sequence of the spacer peptides in the series AMO. AS208 is fused with the various spacer peptides, and the whole is fused with codon 7 of the SU of the envelope of the MoMLV. (B) Sequence of the spacer peptides in the series MOA. The binding domain for Rec-1 is fused with the various spacer peptides, and the whole is fused with codon 5 of the SU of the envelope of the amphotropic MLV. (C) Sequence of the spacer peptides in the series EMO and EA. The binding domain EGF is fused with the various spacer peptides, and the whole is fused with codon 5 of the SU of the envelope of the amphotropic MLV or with codon 7 of the SU of the envelope of the MoMLV.
- Fig. 4 shows detection of membrane expression of the envelopes of the EMO series. Populations of transfected cells, selected using phleomycin, are marked with (black histograms) or without (white histograms) anti-hEGF antibodies, then with anti-mouse IgG antibodies conjugated to FITC.
- Fig. 5 shows expression and viral incorporation of the chimeric envelopes of the AMO series. Immunoblots on lysates of TELCeB6 cells transfected with the plasmids expressing the chimeric envelopes (see Fig. 2 and Fig. 3A) and on pellets of viral particles purified by ultracentrifugation. The immunoblots are detected with an anti-SU antiserum (top part) or with an anti-p30-CA antiserum (bottom part, size less than 46 kDa). The positions of the p30-CA (CA) and, for the MO wild-type envelopes, of the precursor (PR) and of the surface protein (SU) of the envelope complex are shown.
- Fig. 6 shows binding tests on human cells of the envelopes of series EMO (A) and AMO (B). The background noise of fluorescence is provided by incubation of human cells with the ecotropic envelope (white histograms).
- Fig. 7 shows the amino-acid and nucleotide sequence of PRO(4070A).
- Fig. 8 shows the amino-acid and nucleotide sequence of PRO(MoMLV).
- Fig. 9 shows the amino-acid and nucleotide sequence of PROβ(MoMLV).
- Fig. 10 shows the amino-acid and nucleotide sequence of PRO+(4070A).
- Fig. 11 shows the amino-acid and nucleotide sequence of ΔPRO.
- Fig. 12 shows the amino-acid and nucleotide sequence of ΔPROβ.
- Fig. 13 shows the amino-acid and nucleotide sequence of ΔPRO+.
- Fig. 14 shows the amino-acid and nucleotide sequence of AMOPRO.
- Fig. 15 shows the amino-acid and nucleotide sequence of AMOΔPRO.
- Fig. 16 shows the amino-acid and nucleotide sequence of MOAPRO.
- Fig. 17 shows the amino-acid and nucleotide sequence of MOAΔPRO.
- Fig. 18 shows the amino-acid and nucleotide sequence of EMOPRO.
- Fig. 19 shows the amino-acid and nucleotide sequence of EMOPROβ.
- Fig. 20 shows the amino-acid and nucleotide sequence of EMOPRO+.
- Fig. 21 shows the amino-acid and nucleotide sequence of EAPRO.
- Fig. 22 shows the amino-acid and nucleotide sequence of EAPROβ.
- Fig. 23 shows the amino-acid and nucleotide sequence of EAPRO+.
- Fig. 24 shows the amino-acid and nucleotide sequence of EMOΔPRO.
- Fig. 25 shows the amino-acid and nucleotide sequence of EMOΔPROβ.
- Fig. 26 shows the amino-acid and nucleotide sequence of EMOΔPRO+.
- Fig. 27 shows the amino-acid and nucleotide sequence of EAΔPRO.
- Fig. 28 shows the amino-acid and nucleotide sequence of EAΔPROβ.
- Fig. 29 shows the amino-acid and nucleotide sequence of EAΔPRO+.
- Fig. 30 shows the amino-acid and nucleotide sequence of AMOEL3.
- Fig. 31 shows the amino-acid and nucleotide sequence of EL3.
- Fig. 32 shows the amino-acid and nucleotide sequence of AMOEL3-V.
- Fig. 33 shows the amino-acid and nucleotide sequence of EL3-V.
- Fig. 34 shows the amino-acid and nucleotide sequence of AMOELS.
- Fig. 35 shows the amino-acid and nucleotide sequence of ELS.
- Fig. 36 shows the amino-acid and nucleotide sequence of OPR04-beta.
- Fig. 37 shows the amino-acid and nucleotide sequence of OPR04-int.
- Fig. 38 shows the amino-acid and nucleotide sequence of OPR04-vrb.
- Fig. 39 shows the amino-acid and nucleotide sequence of PRO-beta.
- Fig. 40 shows the amino-acid and nucleotide sequence of PRO-int.
- Fig. 41 shows the amino-acid and nucleotide sequence of PRO-vrb.
EXAMPLES:
EXAMPLE 1:
The retroviruses utilize a certain number of cell surface molecules, called viral receptors, for initiating the infectious process (23). Apart from some notable exceptions, especially in the case of human immunodeficiency viruses, most of the receptors utilized by the other retroviruses, and in particular the type C mammalian retroviruses, are distributed over most cell types of the host organism. For example, the amphotropic murine leukemia virus (MLV-A) is capable of infecting the majority of mammalian cells because its receptor, the phosphate transporter Ram-1, is expressed on almost all cells.
The type C mammalian retroviruses are currently used for making retroviral vectors, in particular for purposes of gene transfer in humans, in gene therapy. Certain gene therapy procedures would be facilitated if the retroviral vectors were capable of very accurately recognizing the true target cells of gene transfer. For this, a certain number of research groups, including ours, have developed various strategies aiming to modify the recognition between the viral particle and the cell surface. This interaction essentially involves the retroviral envelope glycoprotein; it therefore seems logical to make genetic changes to this protein so as to enable it to recognize cell surface molecules specifically expressed on the target cells of gene transfer.
Two types of strategies permitting such changes have been developed recently.
In the first strategy, the natural binding domain of the retroviral envelope glycoprotein for its receptor was altered by insertion or substitution of peptides of reduced size that are able to bind a cell surface molecule. This work has demonstrated the feasibility of cell targeting for gene transfer (20).
In the second approach, polypeptides (ligands, single-chain antibodies) capable of binding various cell surface molecules were inserted at the N-terminal end of the SU subunit of the envelope glycoprotein (6) (10) (13) (15) (21). In general, investigation of the virions generated with these various types of targeting envelopes showed that it was possible for the binding of viral particles to be redirected specifically and efficiently towards new surface molecules. Some factors limiting the efficacy of targeting were also identified. The first seems to depend on physiological properties of the surface molecule targeted (dimerization, internalization, intracellular transport ("trafficking") process) (6); the second is connected with the low intrinsic fusogenic capacity of the chimeric envelopes generated by N-terminal insertion of ligands (6) (21). It was observed that this low fusogenic capacity can be partially overcome by introducing a spacer peptide between the new binding domain and the envelope (21). However, the best infectious titres obtained are 100 times lower than can be obtained with retroviral vectors bearing a wild-type envelope. Moreover, it is possible that these results obtained in a particular targeting model (targeting of Ram-1) cannot be extended to other types of targeting envelope glycoproteins. It therefore seemed essential to develop alternative strategies to solve these problems.
Furthermore, a general finding made with the targeting envelope glycoproteins generated by N-terminal insertions is that the natural binding domain of the supporting envelope is always functional. To the extent that the target cells are human cells in gene therapy, this functionality of the natural binding domain does not pose problems of "background noise" of infection because the supporting glycoprotein used is the ecotropic envelope of the MoMLV virus, which does not recognize a receptor on the cells of higher mammals. However, it seemed interesting to characterize these chimeric envelope glycoproteins that are able to recognize two different surface molecules, to see what influence the spacer peptide could have in this recognition, and to assess the relative contributions of the two types of interaction in the infectious process.
These observations, which form the subject of the work described below, led to the development of a two-stage targeting strategy, firstly involving specific recognition between the ligand inserted at the N-terminal end of the targeting glycoprotein and its target, and then an auxiliary mechanism making it possible to facilitate entry of the virus specifically bound to the correct cellular target by means of the natural retroviral receptor.
To avoid any problem of background noise of infection connected with direct interaction between the natural binding domain and the natural retroviral receptor, masking/unmasking spacer peptides were also developed, inserted between the targeting site and the supporting envelope glycoprotein, and which are able to mask the natural binding domain for as long as the viral particle has not interacted with the targeted surface molecule.
Realization of this interaction induces unmasking of the natural binding domain and interaction between the natural binding domain and the natural retroviral receptor (auxiliary mechanism) which then takes over for introducing the virus into the cell.
Equipment and Methods: Cell lines.
The cell line TELCeB6 (7) is derived from the TELacZ line (19) by transfection and clonal selection of cells expressing the gag and pol proteins of MoMLV (Moloney Murine Leukemia Virus). The TELacZ cells express the retroviral vector MFGnlslacZ, which is able to transduce a nuclear β-galactosidase. The TELCeB6 cells permit production of retroviral capsids (non-infectious, as they are devoid of envelopes) transporting the nlsLacZ retroviral marker vector. Cells A431 (ATCC CRL1555) and TE671 (ATCC CRL8805) are cultivated in DMEM medium (Gibco-BRL) supplemented with 10% of foetal calf serum (Gibco-BRL). Cells CHO, CERD9 (9), and CEAR13 (9) are cultivated in DMEM medium (Gibco-BRL) supplemented with 10% of foetal calf serum and proline (Gibco-BRL). The NIH-3T3 cell lines and NIH-3T3 derivatives are cultivated in DMEM medium (Gibco-BRL) supplemented with 10% of newborn calf serum (Gibco-BRL).
Chimeric envelopes.
The DNA fragments coding for the polypeptides recognizing either EGFR (EGF receptor) or Ram-1 (MLV-A receptor) were generated by PCR (polymerase chain reaction) using oligonucleotides containing restriction sites. These polypeptides were introduced at the N-terminal end of the SU protein of MLV (surface protein gp70), in which the SfiI and NotI restriction sites were created at codon +6 (33). A schematic diagram of the various env genes used in this article is shown in Fig. 2. Briefly, a DNA fragment derived from PCR amplification, coding for the 53 amino acids of human EGF (3), was generated using a cDNA template (ATCC 59957) and two oligonucleotides: OUEGF
(5'> ATGCTCAGAGGGGTCAGTACGGCCCAGCCGGCCATGGCCAATAGTGACTCTGAATGTCC) with an SfiI restriction site and OLEGF
(5'> ACCTGAAGTGGTGGGAACTGCGCGCGGCCGCATGTGGGGGTCCAGACTCC) containing a NotI site. After digestion by SfiI and NotI, these fragments were cloned in a gene coding either for the SU protein of MoMLV in the case of the chimeric protein EMO, or for the SU of the 4070A virus for the chimeric protein EA (6).
For the AMO construct (6), a NotI site was created at the end of the receptor recognition domain of the 4070A envelope (called AS208) (2), at nucleotide (nt) 750 (14), using a PCR fragment generated from the XhoI site (nt 594) up to nt 750, before the proline-rich region, with two oligonucleotides: 805FC (5'> TCCAATTCCTTCCAAGGGGC) upstream of XhoI and 806FC (5'> ACCCCCACATGCGGCCGCTCCCACATTAAGGACCTGCCG) containing a NotI restriction site. The chimeric envelope is constructed by cloning the XhoI/NotI PCR fragment and the NotI/ClaI fragment, isolated from the EMO env gene (coding for the SU and TM-p15E transmembrane proteins of MoMLV), between the XhoI/ClaI sites of the env gene of 4070A MLV.
The resulting constructs are recovered in the form of a BglII/ClaI fragment (corresponding to positions 5408 and 7676 in MoMLV) and cloned at the BamHI and ClaI sites of an FBMOSALF expression plasmid (7), in which a selection marker gene (8) fused to the polyadenylation sequences of the PGK (phosphoglycerate kinase) gene was introduced downstream of the 3' LTR of the MLV-C57 virus.
For EMO, EA, or AMO, the new recognition site was separated from the rest of the MLV envelope by a spacer peptide consisting of three alanines, supplied by the NotI cloning site (15). In three other series of targeting envelopes (derived from envelopes EMO, EA or AMO), spacer amino acids were introduced either after the recognition domain of EGFR (EGF), or after the recognition domain of Ram-1 (AS208) as described below.
The series of envelopes targeting Ram-1 was generated by introducing different spacers between the recognition domain of Ram-1 and the MoMLV envelope (Fig. 3A).
For AMOPRO, a region of 59 amino acids rich in proline originating from the amphotropic SU (nucleotides 751 to 927) (14) was used. A shorter proline-rich region, also isolated from the MLV 4070A envelope (nt 751-789), was used for AMOΔPRO.
This region corresponds to the 13 amino acid spacer of the v-mpl product (originating from the myeloproliferative leukemia virus) (18) located between its region derived from env and the equivalent of the cellular gene mpl.
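The nucleotide ranges and spacer lengths quoted above can be cross-checked with simple arithmetic: an inclusive range of n nucleotides read in frame encodes n/3 residues. The sketch below is illustrative only; the helper name `residues_in_range` is hypothetical.

```python
def residues_in_range(first_nt, last_nt):
    """Number of amino acids encoded by an inclusive, in-frame nucleotide range."""
    length = last_nt - first_nt + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


if __name__ == "__main__":
    # Ranges quoted in the text for the long and short proline-rich spacers.
    print(residues_in_range(751, 927))  # long proline-rich region
    print(residues_in_range(751, 789))  # short proline-rich region
```

With the values from the text, nt 751 to 927 gives 59 residues and nt 751 to 789 gives 13 residues, in agreement with the spacer lengths stated.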
In the case of AMO1, the first 208 amino acids, derived from the envelope of MLV 4070A, were fused to amino acid 1 of the SU of MoMLV. For AMO1Fx, a 4 amino acid site corresponding to the cleavage site of blood coagulation factor Xa (Ile-Glu-Gly-Arg) (12) was inserted after the Ram-1 recognition site and fused to the +1 codon of the SU of MoMLV. The strategy used for these constructs is described above.
Briefly, an oligonucleotide (5'-TCCAATTCCTTCCAAGGGGC-3') located just upstream of the XhoI site of the env gene of 4070A (nt 594) was used in combination with one or other of the following two oligonucleotides bearing the NotI site:
5'-AGTATGCGGCCGCTGGGGGTGGCTGTGGGACAC-3' and 5'-TATCTGCGGCCGCGTCGGGTAATACTGGGTTGG-3'
so as to generate by PCR, using an env 4070A template, 3' fragments for the AMOPRO and AMOΔPRO envelopes respectively.
These PCR fragments were digested with XhoI and NotI and cloned into the FBAMOSALF plasmid, a plasmid expressing an AMO type of envelope, opened at XhoI/NotI. The plasmids expressing the envelopes AMOFx, AMO1 and AMO1Fx were generated by cloning the NdeI/NotI fragment of FBAMOSALF, containing the Ram-1 recognition site, in a series of plasmids (13) expressing MoMLV envelopes modified so as to create a NotI site at codon 1 or at codon 6, with (AMO1Fx, AMOFx) or without (AMO1) the Xa sequence. Envelopes derived from AMO and containing other types of spacer peptides were also constructed. All of these spacer peptides are shown in Fig. 3A.
The MOAPRO and MOAΔPRO envelopes were generated according to a method similar to that of the AMOPRO and AMOΔPRO envelopes. The FBEASALF plasmid, expressing the EA envelopes, was opened at NdeI/NotI. This DNA was used for cloning two fragments. The 5' fragment was the NdeI/BamHI fragment from digestion of the FBMOSALF plasmid (expressing the ecotropic MO envelopes), containing, in addition to the 5' LTR and the retroviral leader sequence, the N-terminal end of the env gene of the MoMLV virus (position 6565) (17). The 3' fragments were generated by PCR using the env gene of MoMLV as template, a 5' oligonucleotide (5'-ACTGGGGCTTACGTTTGT-3') upstream of the BamHI site, and as 3' oligonucleotide either (5'-TATGTGCGGCCGCCGGTGGAAGTTGGGTAGGGG-3') or (5'-TATGTGCGGCCGCGTCTGGCAGAACGGGGTTTGG-3') for constructing the MOAPRO and MOAΔPRO envelopes, respectively. These PCR fragments were digested with BamHI and NotI, and co-ligated with the 5' fragment. The sequence of the spacer peptides for these two constructs is shown in Fig. 3B.
FBEMOSALF, expressing the EMO chimeric envelopes (6), was digested with BssHII, filled in with Klenow enzyme and digested with NdeI. The resulting 1.8 kb fragment, containing the 5' LTR, the leader sequence, the end of the pol gene and human EGF, was isolated and inserted either in FBAMOΔPROSALF or FBAMOPROSALF (plasmids expressing the AMOΔPRO and AMOPRO chimeric envelopes respectively), in which the NdeI/EcoRI fragment was eliminated and the EcoRI site was filled in, so as to generate the plasmids expressing the envelopes EMOΔPRO+ and EMOPRO+, respectively. Plasmids expressing the envelopes EMO1 and EMO1Fx were also generated. The sequence of the spacer peptides for these constructs is shown in Fig. 3C.
The plasmids expressing the EAPRO+ and EAΔPRO+ envelopes were generated by replacing the SfiI/NotI fragment of the FBEASALF plasmid by the SfiI/NotI fragments obtained from plasmids expressing the EMOPRO+ and EMOΔPRO+ envelopes.
Finally, for these various envelopes (EMOΔPRO+, EMOPRO+, EAPRO+ and EAΔPRO+), the spacer peptides were reduced in their N-terminal part. For this, a DNA fragment was generated by PCR using as template the EMO gene, as 5' oligonucleotide (5'-ACCATCCTCTAGACGGACATG-3'), upstream of the XbaI site preceding the initiator codon, and as 3' oligonucleotide (5'-TATCAGGATCCCAAATGTAAGCCCTGGATCGCGCAGTTCCCACCACTTCAGGTCTCGGTACTGAC-3'), containing a BamHI site.
This DNA was digested with XbaI/BamHI and cloned in one or other of the plasmids expressing the EMOPRO+ or EMOΔPRO+ envelopes, after removing the XbaI/NotI fragments beforehand, by co-ligation with the BamHI/NotI fragments obtained from the plasmids expressing the MOAPRO and MOAΔPRO envelopes. This results in two plasmids that are able to express the EMOPROβ and EMOΔPROβ envelopes, respectively (Fig. 3C), in which EGF is fused just upstream of the BamHI site of the envelope of the MoMLV virus (nt 6537) (17), before the proline-rich region and leaving intact the potential β sheet. One or other of the SfiI/NotI fragments resulting from these last two constructs was then introduced into the FBEASALF plasmid after prior removal of the SfiI/NotI fragment; this results in two plasmids capable of expressing the EAPROβ and EAΔPROβ envelopes, respectively (Fig. 3C).
In another construction series (EMOPRO, EMOΔPRO, EAPRO, EAΔPRO), the potential β sheet was removed, and EGF was fused directly at the level of the proline-rich region (Fig. 3C).
Production of viruses.
The plasmids expressing the envelopes were transfected by the calcium phosphate precipitation method (16) into the TELCeB6 cell line. The cells were submitted to selection with phleomycin (50 µg/ml), then the resistant clones were trypsinized in bulk.
These confluent cells were used for recovering the viral supernatants after overnight incubation in DMEM medium in the presence of FCS (10%). These supernatants are submitted to ultracentrifugation with the aim of obtaining samples for analysis in Western blots, in binding tests and in infection tests.
Immunoblots.
The virus-producing cells are lysed for 10 min at 4°C in a buffer of 20 mM Tris-HCl (pH 7.5) containing Triton X-100 1%, SDS 0.05%, deoxycholate 5 mg/ml, NaCl 150 mM and PMSF 1 mM. After centrifugation for 10 min at 10 000 g to pellet the cell nuclei, the supernatants are frozen at -70°C until analysis. The viral samples are obtained by ultracentrifugation of the viral supernatants (10 ml) in a SW41 Beckman rotor (30 000 rpm, 1 h at 4°C).
The pellets are resuspended in 100 µl of PBS (phosphate buffered saline) and frozen at -70°C. The samples (30 µg of cellular lysates or 10 µl of purified viruses) are mixed in a ratio of 5:1 with a buffer of 375 mM Tris-HCl (pH 6.8) containing SDS 6%, β-mercaptoethanol 30%, glycerol 10% and bromophenol blue 0.06%, then boiled for 3 min and analysed on acrylamide 10%/SDS gels. After transferring the proteins onto nitrocellulose membrane, immunologic marking is effected in TBS (Tris base saline, pH 7.4) in the presence of skimmed milk 5% and Tween 0.1%. Antibodies (Quality Biotech Inc., USA) obtained from goat antiserum, directed against gp70-SU of RLV (Rauscher Leukemia Virus) or p30 of RLV, were used at a dilution of 1:1000 or 1:10 000 respectively. The blots were developed using a conjugated antibody of rabbit origin directed against goat immunoglobulins (DAKO, UK) and an electrochemiluminescence kit (Amersham Life Science).
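The 5:1 mixing ratio stated above implies a sixfold dilution of the loading buffer. The following minimal sketch works out the resulting final concentrations, assuming that 5:1 means five parts sample to one part buffer and ignoring any contribution of the sample itself (the lysis buffer also contains Tris and SDS). The names `LOADING_BUFFER` and `final_concentrations` are illustrative only.

```python
# Stock concentrations of the loading buffer as stated in the text.
LOADING_BUFFER = {
    "Tris-HCl (mM)": 375.0,
    "SDS (%)": 6.0,
    "beta-mercaptoethanol (%)": 30.0,
    "glycerol (%)": 10.0,
    "bromophenol blue (%)": 0.06,
}


def final_concentrations(stock, sample_parts=5, buffer_parts=1):
    """Scale each buffer component by buffer_parts / (sample_parts + buffer_parts)."""
    factor = buffer_parts / (sample_parts + buffer_parts)
    return {name: conc * factor for name, conc in stock.items()}


if __name__ == "__main__":
    for name, conc in final_concentrations(LOADING_BUFFER).items():
        print(f"{name}: {conc:.3g}")
```

Under these assumptions the gel sample contains 62.5 mM Tris-HCl, 1% SDS and 5% β-mercaptoethanol from the loading buffer.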
Binding tests.
The target cells were washed with PBS and detached by incubation for 10 min at 37°C with Versene 0.02% in PBS. These cells are rinsed with PBA (PBS containing 2% of FCS and sodium azide 0.1%). 10^6 cells are then incubated in the presence of viruses for 30 min at 4°C for the EMO envelope series or 45 min at 37°C for the AMO envelope series. After rinsing with PBA, the cells are incubated in the presence of monoclonal antibodies (Evans et al., 1990) for 30 min at 4°C. After rinsing twice with PBA, the cells are incubated for 30 min at 4°C in the presence of anti-rat antibodies conjugated to FITC (Dako, UK). Five minutes before the two final rinsings in PBA, the cells are counterstained with propidium iodide (20 µg/ml). The fluorescence of the live cells is analysed in a FACS (FACSCalibur, Becton Dickinson).
Infection tests.
The target cells are inoculated in 24-well culture plates at a density of 3.10 cells per well. Various dilutions of the viral supernatants, containing Polybrene at 4 µg/ml, are added to the cells for 3 to 5 h at 37°C. The supernatants are then replaced with fresh medium and the cells are incubated for 24 to 48 h at 37°C. X-gal staining is then carried out as described previously (4). The viral titres are estimated as reported previously (5) in number of colonies per ml (lacZ i.u./ml).
In order to block the EGFRs, the target cells are incubated for 30 min at 37°C in a medium containing 10~ M of human recombinant EGF (236-EG, R&D Systems, UK).
The cells are then rinsed and infections are carried out as described previously. To block acidification of the endosomes, 100 µM of chloroquine phosphate (Sigma, UK) is added to the medium. Six hours after infection, the cells are rinsed and incubated in a normal medium.
Results and discussion.
Construction of the mutant envelopes.
Two series of modified envelopes capable of recognizing either the retroviral receptor Ram-1 (11) (22) or the EGF receptor were generated. A first envelope targeting Ram-1, AMO, was constructed by insertion, at the N-terminal end of the envelope of MoMLV (by fusion with codon 7), of a polypeptide recognizing Ram-1 (AS208, Fig. 3A) and corresponding to the first 208 amino acids of the SU of MLV-A (1).
The sequence coding for EGF was inserted in the env gene of MLV in position +6 of the SU of MoMLV (Fig. 2). It had previously been demonstrated that this insertion site permits expression of a single-chain antibody fragment on the surface of virions (15). In the case of the chimeric envelope EMO (Fig. 2), human EGF was inserted in the envelope of MoMLV at the same position, whereas for the envelope EA, insertion was effected in the amphotropic envelope of MLV in position +5.
For the AMO, EMO and EA envelopes, the new binding domains were separated from the recognition domain of the retroviral receptor by a spacer peptide corresponding to three alanines. For the two types of parental envelopes, targeting Ram-1 or targeting EGFR, various constructs were then generated by insertion of spacers of different sizes and structures. The protein sequences of these different spacers are shown in Fig. 3A in the case of the envelopes targeting Ram-1 and in Fig. 3C for the envelopes targeting EGFR.
The plasmids expressing the various envelopes, including the ecotropic (MO) and amphotropic (A) control envelopes, were transfected into the cell line TELCeB6 which expresses the proteins coded by the gag and pol genes, as well as a retroviral vector nlsLacZ (7).
Expression and incorporation of the envelopes in the virions.
The protein lysates of the corresponding cells were analysed for the expression of envelopes by means of antibodies directed against the SU of MLV (Fig. 5) for most of the envelopes of the AMO series (not shown for the other chimeric envelopes).
For all the chimeric envelopes, the precursors and the mature SU form of the envelopes could be detected at the expected size and at a level similar to the wild-type envelopes, suggesting that these chimeric envelopes are normally produced and matured.
Expression on the cell surface was determined by FACS analysis of the producing cells, using antibodies directed against the SU or using an anti-EGF monoclonal antibody. The cells transfected by the various envelopes can be marked by the anti-SU antibody (not shown). Only the cells expressing the EGF fusion envelopes can be marked by means of anti-EGF monoclonal antibodies (Fig. 4). This demonstrates expression of the chimeric envelopes on the cell surface and correct folding of the EGF on the chimeric glycoproteins.
To demonstrate incorporation of the chimeric envelopes in the retroviral particles, the supernatants of the TELCeB6 cell lines transfected with the various envelopes were submitted to ultracentrifugation and the pellets of viral particles were recovered. These pellets were analysed by immunoblots for their expression of products of the gag gene (CAp30) and of the envelope proteins (Fig. 5 for most of the envelopes of the AMO series, not shown for the other chimeric envelopes). With the aim of comparing the efficiency of viral incorporation between the various chimeric envelopes, identical quantities of viral particles (determined by marking the gag proteins by means of anti-CAp30 antibodies) were loaded on the gels.
The SU proteins could be detected for all the mutants at the expected size, but at a level slightly lower than was observed for the wild-type envelopes. In the case of the AMOG2X and AMOG3X envelopes only, the efficiency of incorporation is appreciably lower relative to the wild-type envelopes. As expected, no envelope expression was observed in the pellets from supernatants of TELacZ cells (not expressing gag and pol proteins) transfected with the various envelopes. These results show that the chimeric SU proteins are associated with retroviral particles.
Binding of the envelopes to the receptors.
Human cells expressing the Ram-1 and/or EGF receptors were used for this investigation. These cells are incubated in the presence of viral preparations and the binding of the viral envelopes to the target receptor is determined by FACS analysis with the aid of antibodies directed against the SU (Fig. 6B). As expected, no binding is observed in the case of viruses expressing MO ecotropic envelopes on the various human cells (not shown), whereas the viruses that have chimeric envelopes targeting Ram-1 are able to bind to the TE671 cells with an efficiency similar to that observed for the viruses expressing unmodified amphotropic envelopes. All the envelopes targeting Ram-1, derived from AMO, are able to bind to the TE671 cells with a similar efficiency. This binding can be inhibited by competition with the AS208 fragment (the purified recognition domain of Ram-1) (2), which suggests that this recognition is specific (results not presented).
The envelopes targeting EGFR (EMO series) are moreover able to bind to the A431 cells, an EGFR-expressing line (Fig. 6A). This binding seems specific since pre-incubation of the A431 cells in the presence of EGF (inducing endocytosis of the EGFRs) inhibits this binding (not shown).
Ram-1 and Rec-1 cooperation in infection.
Transduction by the retroviral vectors pseudotyped with the various targeting envelopes was measured on cells expressing different types of receptors: human TE671 cells expressing the EGF and Ram-1 receptors; 3T3 cells expressing murine EGF, Rec-1 and Ram-1 receptors; CEAR13 cells expressing Rec-1 and Ram-1; CERD9 cells expressing only Rec-1. The titrations were carried out as described previously (6). As expected, the viruses pseudotyped with MO ecotropic envelopes were not capable of infecting the TE671 cells, but did permit infection of murine cells 3T3, CEAR13 and CERD9 (with titres of the order of 10^7 lacZ i.u./ml). Conversely, the viruses bearing the amphotropic A envelopes are able to infect the murine 3T3 cells and the TE671 cells (with titres of the order of 10^7 lacZ i.u./ml).
The viruses that have chimeric AMO envelopes are able to infect the TE671 cells at a titre of 4 × 10^3 lacZ i.u./ml (Table 1). In comparison, despite a similar efficiency of binding to the receptor (Fig. 6B), the titres obtained with the wild-type envelopes are 10 000 times higher. Surprisingly, the viruses expressing AMOPRO envelopes, despite good efficiency of binding, proved incapable of infecting the human cells. Compared with the titres obtained for the AMO envelopes (Table 1), the other types of spacers inserted in the envelopes of the AMO series permit an increase in titres from 30-fold (for AMOΔPRO) to more than 100-fold (for AMO1Fx), making it possible to reach titres of 4 × 10^5 lacZ i.u./ml. It has been shown that these infections take place via the targeted receptor Ram-1. This was demonstrated by an interference test on target cells chronically infected with MLV-A virus. These cells become specifically refractory to infection by viruses bearing envelopes targeting Ram-1 (results not shown).
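The fold changes quoted above can be checked against the stated titres with a few lines of arithmetic. This is a minimal sketch using only the figures given in the text; the labels and the function name `scaled_titre` are illustrative.

```python
# Reference titre of the AMO envelope on TE671 cells (lacZ i.u./ml), from the text.
AMO_TITRE = 4 * 10**3

# Fold changes relative to AMO, as quoted in the text.
FOLD_CHANGES = {
    "wild-type envelope": 10_000,
    "spacer giving a 30-fold increase": 30,
    "spacer giving a 100-fold increase": 100,
}


def scaled_titre(reference, fold):
    """Titre obtained when the reference titre is multiplied by a fold change."""
    return reference * fold


if __name__ == "__main__":
    for label, fold in FOLD_CHANGES.items():
        print(f"{label}: {scaled_titre(AMO_TITRE, fold):.1e} lacZ i.u./ml")
```

A 100-fold increase over the AMO titre gives 4 × 10^5 lacZ i.u./ml, which matches the maximum titre stated, and a 10 000-fold increase gives 4 × 10^7 lacZ i.u./ml for the wild-type envelopes.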
The viruses bearing the chimeric envelopes in which the site for binding to Ram-1 was separated from the SU of MoMLV by various spacers proved very infectious on 3T3 cells.
Compared with the titres obtained for the AMO envelopes, an increase from 200-fold (for AMOPRO) to more than 1000-fold (for AMO1Fx) in the viral titres was measured (Table 1).
Infection of the 3T3's is effected via Rec-1 or via Ram-1 (Table 1). This can be demonstrated by interference tests carried out on 3T3 cells chronically infected either by MLV-A (blocking Ram-1) or by MoMLV (blocking Rec-1). The viruses expressing the AMO envelopes seem to be capable of infecting the 3T3's indiscriminately, whether one or the other, or both, of the Rec-1 and Ram-1 receptors are available on the target cell. Compared with these AMO viruses, the viral particles containing the other envelopes capable of targeting Ram-1 are far less capable of infecting the 3T3's when only one of the two receptors is available. For example, when 100 particles (according to the titre determined on intact 3T3's) containing the AMOFx envelopes are used for infecting interfering 3T3's, 4 viruses are capable of infecting the cells if only Rec-1 is available and 2 viruses are capable of infecting the cells if only Ram-1 is available. This indicates a considerable loss of infectivity (more than 94% of the viruses are not infectious) when only one receptor is available compared with when both receptors are available. This also suggests that the two receptors Ram-1 and Rec-1 cooperate in infecting the 3T3's. It appears that this phenomenon of cooperation is even more marked in the case of viruses bearing the AMOPRO envelopes. These last-mentioned viruses can infect the 3T3's only with difficulty when only Rec-1 is available and cannot infect them at all when only Ram-1 is available. However, when Rec-1 and Ram-1 are both available, infection is possible and titres of the order of 6x 10° lacZ i.u./ml can be obtained (Table 1).
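The loss of infectivity in the AMOFx example above can be made explicit. The sketch below uses the counts quoted in the text (100 particles, 4 infectious via Rec-1 alone, 2 via Ram-1 alone). Reading the 94% figure as the share of particles that need both receptors, that is 100 - (4 + 2), is an interpretation and not something the text states directly. The function name `percent_lost` is illustrative.

```python
def percent_lost(total_particles, still_infectious):
    """Percentage of particles that are no longer infectious."""
    return 100.0 * (total_particles - still_infectious) / total_particles


if __name__ == "__main__":
    total = 100        # particles, titrated on intact 3T3 cells
    rec1_only = 4      # infectious when only Rec-1 is available
    ram1_only = 2      # infectious when only Ram-1 is available

    print("loss with Rec-1 only:", percent_lost(total, rec1_only))
    print("loss with Ram-1 only:", percent_lost(total, ram1_only))
    # Assumed reading of the 94% figure: particles that need both receptors.
    print("need both receptors:", percent_lost(total, rec1_only + ram1_only))
```

On these numbers the loss is 96% with Rec-1 alone and 98% with Ram-1 alone, both consistent with the statement that more than 94% of the viruses are not infectious when a single receptor is available.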
For better characterization of this cooperation effect, infection tests were carried out using as targets CHO cells (naturally devoid of Ram-1 and Rec-1 receptors) altered so as to express either Rec-1 only (Cerd9 cells) or Rec-1 and Ram-1 (Cear13 cells), or TE671 cells expressing Ram-1 only. Furthermore, other envelopes derived from the AMO envelope were generated. These envelopes possess other types of spacer peptides (see Fig. 3A) after the site targeting Ram-1, in particular flexible spacers.
The results of a typical experiment are shown in Table 2. For each envelope, cooperativity indices were calculated as the ratio of the titre obtained on the cell type expressing just one receptor to the titre obtained on the cell type expressing both types of receptors. An index of 1 therefore indicates that the titre is the same, whether there is just one or both receptors. This is obviously the case with ecotropic or amphotropic wild-type envelopes.
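The index defined above is a simple ratio, which can be sketched in Python as follows. The function names and the example titres are hypothetical and chosen only to illustrate the calculation; they are not values from Table 2.

```python
def cooperativity_index(titre_single_receptor: float, titre_both_receptors: float) -> float:
    """Ratio of the titre on cells expressing one receptor to the titre on
    cells expressing both receptors (RamID or RecID in the text)."""
    if titre_both_receptors <= 0:
        raise ValueError("titre on cells expressing both receptors must be positive")
    return titre_single_receptor / titre_both_receptors


def needs_both_receptors(index: float) -> bool:
    """An index below 1 means infection is less efficient with a single receptor."""
    return index < 1.0


# Hypothetical titres in lacZ i.u./ml (illustration only).
wild_type_index = cooperativity_index(1e6, 1e6)   # same titre with one or both receptors
chimera_index = cooperativity_index(1e3, 1e5)     # 100-fold lower with a single receptor

print(wild_type_index, needs_both_receptors(wild_type_index))
print(chimera_index, needs_both_receptors(chimera_index))
```

When a titre is reported only as an upper bound (for example "<1"), the resulting index is likewise an upper bound.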
An index less than 1 indicates that the titre is lower when a single receptor is expressed than when both are, and that both receptors are needed to promote infection. The lower this index is, the greater is the requirement for two receptors. As suggested in Table 1, the infectivity of the virions with the original AMO envelopes is not affected, whether there is a single type of receptor or both types (Table 2). In fact, the indices are even greater than 1, suggesting that the simultaneous presence of the two receptors hampers the infectious process, perhaps because the two binding domains hinder each other. The situation is different for viruses with the AMO1Fx envelopes even though, compared with the AMO virions, their infectivity is at least 100 times better in the TE671 cells that express Ram-1 only. This increase in infectivity via Ram-1 can be explained by the increased size of the spacer peptide separating the two binding domains: it is possible that the AS208 site induces less steric hindrance with respect to the rest of the glycoprotein and that these envelopes can more easily induce the fusion process. Moreover, the Cerd9 cells expressing Rec-1 only are infected relatively easily by the AMO1Fx virions. However, in accordance with the results in Table 1, infection is facilitated by a factor of 10 when both molecules Ram-1 and Rec-1 are co-expressed (index of about 0.1) compared with when only one or the other of the two receptors is present. The envelopes with the "flexible" spacers (AMOG1Fx, AMOG2, AMOG2Fx and AMOG3) seem to behave like the AMO1Fx envelopes with regard to infection via Rec-1 expressed alone. However, infectivity by Ram-1 expressed alone (RamID) tends to decrease as a function of the length of the spacer. This probably reflects a decrease in transmission of the fusion signal following binding on Ram-1 owing to the increase in distance between the AS208 domain and the fusion domain. With these envelopes as well, infection is favoured when the two receptors are co-expressed on the surface of the same cell.
As with the AMO1Fx envelopes, but non-symmetrically (RamID similar, but RecID very different), the virions containing the AMOΔPRO envelope can infect cells efficiently when Ram-1 is expressed alone. For this envelope as well, infectivity is increased about 10-fold when Rec-1 is also present on the cell surface. This difference is not due to the mere fact that the AMOΔPRO virions utilize Rec-1 preferentially for infection. In fact, infection of cells on which Rec-1 alone is available is extremely slight (Table 1) or even undetectable (Table 2) compared with when Ram-1 and Rec-1 are co-expressed. The RecID index is less than 10^-5 (Table 2). This also demonstrates that the two receptors can synergize infection. These results also suggest that the domain of binding to the ecotropic receptor Rec-1 is not accessible when the AMOΔPRO envelope is expressed on viral particles, and only becomes accessible if these virions interact with Ram-1 beforehand. It can also be suggested that, following binding with Ram-1, the domain for binding to Rec-1 is unmasked and recruited for facilitating the infectious process. It is possible that this masking/unmasking takes place according to an allosteric type of mechanism causing a change in conformation of the chimeric glycoprotein that is induced by the Ram-1/AS208 interaction and which involves the spacer peptide.
It is likely that this mechanism is strongly dependent on the amino acid composition of the spacer peptide. With comparable size, there is a difference of at least 1000 times in the RecID values when the AMOΔPRO virions are compared with the virions containing the envelopes with the flexible spacers AMO1Fx, AMOG1Fx and AMOG2. The ΔPRO peptide contains 5 prolines probably arranged in a type II polyproline helix, whereas the AMOG1Fx and AMOG2 envelopes contain essentially glycines.
Similarly to the AMOΔPRO virions, the virions containing the AMOPRO envelopes require the simultaneous presence of the two types of receptors for infecting the cells. The infectious titres in cell types co-expressing the two receptors are, however, lower than those observed with the AMOΔPRO virions, though it is not possible to exclude the hypothesis that the lesser extent of incorporation of these envelopes is responsible for this result. Even more markedly than with AMOΔPRO, the AMOPRO viruses cannot infect the cells when either one of the two receptors is expressed alone (Table 2). The two indices RamID and RecID are in fact less than 10^-5. These results suggest that:
1) interaction of the AMOPRO virions with Ram-1 when it is expressed alone is not sufficient to trigger the changes in conformation of the glycoprotein that permit fusion. Furthermore, it is possible that the PRO spacer peptide is either too rigid or too long to favour such a transition;
2) the domain for binding with Rec-1 is not accessible for interaction with Rec-1, and cannot take over in the entry process, as long as the AMOPRO virion has not interacted with Ram-1.
For the purpose of better discrimination of whether the masking of the binding domain located downstream of the targeting site is a unique property of the peptide conjugated to the PRO spacer peptide, the inverse construction was effected.
The MOAPRO envelopes contain the binding domain of the ecotropic envelope followed by the proline-rich region of this same envelope, the whole being fused at the N-terminal end of the amphotropic envelope (Fig. 2). The results shown in Table 2 show that, in a similar manner to the virions containing the AMOPRO envelopes, the MOAPRO virions can infect the cells expressing only one of the receptors Rec-1 or Ram-1 with difficulty, or not at all. It even seems that the Ram-1 domain in the MOAPRO envelope is even less accessible (RamID less than 7 × 10^-5) than the Rec-1 domain is in the AMOPRO envelope (RecID less than 5.6 × 10^-4). The MOAPRO envelopes can efficiently infect the cells expressing the two types of receptors, with titres of the order of 10^5 lacZ i.u./ml, suggesting that, for this envelope as well, the presence of the two receptors synergizes the infectious process.
These results, taken together, suggest that the spacer peptide inserted between the targeting domain and the rest of the retroviral envelope exercises control over the accessibility of the domain located downstream of the said peptide and over the activation of fusion. This control depends on the peptide itself and is influenced by its length and by its biochemical composition. The hypothesis formulated is that the PRO spacer peptide would ultimately perform the same role as the proline-rich region which is located, in the unmodified glycoprotein, between the binding domain to the receptor and the fusion domain. This role would be masking of the domain downstream (fusion domain for the wild-type envelope or binding domain for the chimeric envelope) and subsequent unmasking upon interaction of the domain upstream with its receptor. In the case of the wild-type envelope, this unmasking would lead to activation of fusion, whereas in the case of chimeric envelopes, unmasking would lead to accessibility of the binding domain to the viral receptor. If the receptor is expressed at the cell surface, there can then be interaction, and this then triggers activation of the fusion domain, explaining why the simultaneous presence of the two receptors synergizes infection.
These results make it possible to propose a two-stage targeting strategy for which a targeting envelope is constructed with various domains, whose functions are activated and coordinated by means of specific spacer peptides containing proline-rich sequences. These chimeric envelope glycoproteins can be conceived as follows, with, from N-terminal to C-terminal, a "targeting" domain capable of recognizing a cell surface molecule specifically expressed on the targeted tissue or targeted cell (for example a single-chain antibody or a ligand for a surface receptor); a spacer peptide capable of masking an auxiliary region which is in turn capable of facilitating penetration of the virus when it is activated. Such an auxiliary domain can be an entire retroviral envelope, i.e. a structure capable of mediating and taking over from viral infection by means of interaction with a ubiquitous retroviral receptor, which therefore has a very strong likelihood of being co-expressed with the targeted surface molecule. Ideally, the auxiliary domain should be masked until the viral particle has specifically interacted with the targeted surface molecule. For example, in the case of the AMOPRO and AMOΔPRO envelopes, the targeted surface molecule is Ram-1 whereas the auxiliary domain is the ecotropic envelope.
EGFR and Rec-1 cooperation in infection.
To verify whether the PRO and ΔPRO spacer peptides could mediate the masking/unmasking mechanism in the case of another type of targeting envelope, another two-stage targeting model was explored by means of the EGF receptor. The results obtained with the targeting of Ram-1 made it possible to propose C-terminal ends of the masking/unmasking spacer peptides. However, it was not possible to define their N-terminal ends exactly. That is why, in the first place, the EMOPRO+ and EMOΔPRO+ envelopes were constructed (Fig. 3B), in which the PRO and ΔPRO spacer peptides contain in addition, at the N-terminus, 41 amino acids derived from the amphotropic envelope and located immediately upstream of the proline-rich region. For the EMOPRO+ and EMOΔPRO+ envelopes, the targeting domain is EGF, whereas the auxiliary domain is the ecotropic envelope. These two envelopes were compared with the EMO envelope (Fig. 2 and 2B) which does not contain a spacer peptide.
The infection tests were carried out with cells expressing Rec-1 alone (Cerd9 cells) or with cells co-expressing Rec-1 and EGFR (3T3 cells). The results of a typical experiment are presented in Table 3. As expected from the results obtained with the AMO envelopes, the viruses containing the EMO envelopes can efficiently infect the Cerd9 and 3T3 cells, indicating that the binding domain to Rec-1 in these envelopes is not masked. In comparison with the EMO viruses, the viral particles containing the EMOPRO+ and EMOΔPRO+ envelopes can only infect the Cerd9 cells with difficulty (between 1000 and 10 000 times less well than the EMO viruses). However, when Rec-1 and EGFR are co-expressed, even though this does not affect the titre of the EMO virions, the viral particles containing the EMOPRO+ and EMOΔPRO+ envelopes are and 60 times more infectious, respectively, compared with when Rec-1 is expressed alone.
In relation to the results obtained with the AMOPRO and AMOΔPRO envelopes, masking is apparently effected less well, leading to non-negligible infectivity on Cerd9 cells. This is perhaps due to the fact that the PRO+ and ΔPRO+ spacer peptides are not optimized for their function, but perhaps also to the fact that the Cerd9 cells express a few EGF receptors which would contribute to activation of the EMOPRO+ and EMOΔPRO+ envelopes.
Table 1. Titres (lacZ i.u./ml) obtained for the viruses containing the envelopes targeting Ram-1 in interference tests

env        TE671        3T3 (a)              3T3-MLV-A (a,b)     3T3-MoMLV (a,b)
           Ram-1 (c)    Ram-1 + Rec-1 (c)    Rec-1 (c)           Ram-1 (c)
MO         <1           92,000,000 (100)     46,000,000 (100)    40
A          10,000,000   12,000,000 (100)     240                 8,000,000 (100)
AMO        4,000        24 (100)             32 (266.7)          8 (50)
AMOFx      230,000      440,000 (100)        8,000 (3.6)         6,000 (2)
AMO1       330,000      1,920,000 (100)      78,000 (8.1)        62,000 (4.8)
AMO1Fx     400,000      1,620,000 (100)      60,000 (7.4)        74,000 (6.8)
AMOΔPRO    150,000      280,000 (100)        400 (0.29)          64,000 (34.3)
AMOPRO     10           60,000 (100)         4 (0.013)           <1 (0.0025)

a: percentages calculated assigning a value of 100 to the titres obtained on
b: infection on 3T3 chronically infected by MLV-A (3T3-MLV-A) or by MoMLV (3T3-MoMLV)
c: receptor available at the surface of the cell in question

Table 2. Titres (lacZ i.u./ml) obtained for the viruses containing the envelopes targeting Ram-1

Spacer peptide   env        3T3         TE671        CERD9       RamID        RecID
                 MO         2.8x10'E    <1.7x10^0    2.8x10'     <6.1x10      1
                 A          5x10^5      5x10^5       6.2x10^0    1            1.2x10^-5
3                AMO        1x10        2.2x10^2     6.2x10^0    2.2x10^0     6.2x10^-2
13               AMO1Fx     6x10^4      1.6x10^4     2.2x10^5    2.7x10^-1    3.7x10^0
16               AMOΔPRO    1.9x10^5    5x10^3       6.2x10^0    2.6x10^-1    3.3x10^-9
18               AMOG1Fx    9x10        4.5x10^3     8.7x10^3    1.1x10 1     2.2x10^-1
19               AMOG2      8x10^3      2.7x10^3     3.1x10^3    3.9x10^-1    3.9x10^-1
23               AMOG2Fx    6x10^3      1.2x10'      1.2x10"     2x10^-1      2x10^-1
2E               AMOG3Fx    9x10"       1x10+~       _.2x10''    _.1x10-1     1.3x10^0
62               AMOPRO     1.8x10^3    <1.7x10^0    <1x10^2     <9.9x10-q    <5.6x10^-4
                 MOAPRO     1.3x10"     <9.1x10^0    1.7x10^3    <7x10^-5     1.3x10^-2

Table 3. Titres (lacZ i.u./ml) obtained for the viruses containing the envelopes targeting EGFR

env        3T3         Cerd9       RecID
MO         9.2x10^6    1.3x10^7    1
EMOΔPRO    3.5x10^4    8.5x10^2    1.7x10^-2
EMOPRO     9.6x10^2    7x10^1      5.2x10^-2
Cpl        2x10^6      3x10^6      1

EXAMPLE 2:
With the aim of characterizing the cooperation between the Rec-1 and Ram-1 receptors, as well as the peptides that are capable of regulating this cooperation of receptors, a new series of type AMO chimeric envelope glycoproteins (see preceding example) was constructed:
- in order to verify whether the infection obtained with the AMOPRO and AMOΔPRO envelopes passes, in a second stage, through an interaction with Rec-1, the binding domain with Rec-1 was inactivated by point mutagenesis (D84K mutation) (MacKrell et al., J. Virology, 70:1768-1774 (1996)) in the AMOPRO and AMOΔPRO envelopes as well as in the AMOG1X control envelope, which does not require the cooperation of receptors to permit infection (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)).
- in order to demonstrate the role of the type II polyproline helix structure for the cooperating peptides, the envelopes AMOEL3 and AMOEL5 were constructed. These envelopes have respectively 3 and 5 turns of a type II polyproline helix as characterized in the literature (Urry, Journal of Protein Chemistry 7:1-34 (1988)).
Retroviruses were generated with these chimeric envelopes and were characterized by infection of cells expressing either Rec-1 alone, or Ram-1 alone, or the two molecules Ram-1 and Rec-1.
Material and Methods.
The oligonucleotides elast3U (5'-TTT ATG GTC ACC GCG GCC GCA CCT GGG GTA GGG GCT CCG GGG GTA GGG GCT CCT GGG GTG GCC ATA TAA) and elast3L (5'-TTA TAT GGC CAC CCC AGG AGC CCC TAC CCC CGG AGC CCC TAC CCC AGG TGC GGC CGC GGT GAC CAT AAA) were hybridized together. The resulting double-stranded DNA fragment was digested with the EaeI restriction enzyme and cloned in the FBAMOSALF expression plasmid previously opened at NotI.
The result was the plasmid FBAMOEL3SALF (see sequence of the env gene AMOEL3 in Fig. 30) containing the peptide EL3, the peptide sequence of which is shown in Table 4 (see nucleotide sequence in Fig. 31).
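As a consistency check on the two oligonucleotides quoted above, the following Python sketch tests whether the lower strand is the exact reverse complement of the upper strand, and counts EaeI recognition sites (Y/GGCCR) in the upper strand. The helper names are hypothetical; the sequences are copied from the text.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(oligo: str) -> str:
    """Remove the triplet spacing used in the text."""
    return oligo.replace(" ", "").upper()


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


ELAST3U = clean("TTT ATG GTC ACC GCG GCC GCA CCT GGG GTA GGG GCT CCG GGG "
                "GTA GGG GCT CCT GGG GTG GCC ATA TAA")
ELAST3L = clean("TTA TAT GGC CAC CCC AGG AGC CCC TAC CCC CGG AGC CCC TAC "
                "CCC AGG TGC GGC CGC GGT GAC CAT AAA")

# True if the two oligonucleotides anneal over their full length.
anneals_fully = reverse_complement(ELAST3U) == ELAST3L

# EaeI recognizes YGGCCR (Y = C or T, R = A or G).
eaei_sites = [m.start() for m in re.finditer(r"[CT]GGCC[AG]", ELAST3U)]

print(len(ELAST3U), anneals_fully, eaei_sites)
```

The two EaeI sites flank the elastin-like coding segment, which is consistent with the digestion step described before cloning at NotI.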
The oligonucleotides UpEl5 (5'-GAT GTA CCT GGG GTA GGC GCC CCT GGA GTC GGG GCT CCT GGG GTA GGA TTC AT) and LowEl5 (5'-ATG AAT CCT ACC CCA GGA GCC CCG ACT CCA GGG GCG CCT ACC CCA GGT ACA TC) were hybridized together. The resulting double-stranded DNA fragment was digested with the EcoNI restriction enzyme and cloned in the FBAMOEL3SALF expression plasmid, previously opened at EcoNI. The result is the plasmid FBAMOEL5SALF (see sequence of the env gene AMOEL5 in Fig. 32) containing the peptide EL5, the peptide sequence of which is shown in Table 4 (see nucleotide sequence in Fig. 33).
The oligonucleotides DELASTIN3-V Upper (5'-GTC ACC GCG GCC GTC CCT GGG GTA GGG GTG CCG GGG GTA GGG GTG CCT GGG GTG GCC ATA TAA) and DELASTIN3-V Lower (5'-TTA TAT GGC CAC CCC AGG CAC CCC TAC CCC CGG CAC CCC TAC CCC AGG GAC GGC CGC GGT GAC) were hybridized together. The resulting double-stranded DNA fragment was digested with the EaeI restriction enzyme and cloned in the FBAMOSALF expression plasmid, previously opened at NotI.
The result is the plasmid FBAMOEL3-VSALF (see sequence of the gene AMOEL3-V in Fig. 34) containing the EL3-V peptide, the peptide sequence of which is shown in Table 4 (see nucleotide sequence in Fig. 35).
The oligonucleotides DELASTIN3-I Upper (5'-GTC ACC GCG GCC GTC ATA GGG GTA GGG GTG ATT GGG GTA GGG GTG ATC GGG GTG GCC ATA TAA) and DELASTIN3-I Lower (5'-TTA TAT GGC CAC CCC GAT CAC CCC TAC CCC AAT CAC CCC TAC CCC TAT GAC GGC CGC GGT GAC) were hybridized together. The resulting double-stranded DNA fragment was digested with the EaeI restriction enzyme and cloned in the FBAMOSALF expression plasmid, previously opened at NotI.
This resulted in the plasmid FBAMOEL3-ISALF containing the peptide EL3-I, the peptide sequence of which is shown in Table 4.
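To illustrate how the three upper-strand oligonucleotides differ at the peptide level, the sketch below translates each of them with the standard genetic code. The reading frame is an assumption: translation is started at the first GCG GCC (Ala-Ala) of the NotI-compatible end, which reproduces the spacer sequences listed in Table 4. The last translated residue (Ile, from ATA) lies beyond the downstream EaeI site and would not be part of the cloned spacer. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

UPPER_OLIGOS = {
    "EL3": "TTT ATG GTC ACC GCG GCC GCA CCT GGG GTA GGG GCT CCG GGG "
           "GTA GGG GCT CCT GGG GTG GCC ATA TAA",
    "EL3-V": "GTC ACC GCG GCC GTC CCT GGG GTA GGG GTG CCG GGG GTA GGG "
             "GTG CCT GGG GTG GCC ATA TAA",
    "EL3-I": "GTC ACC GCG GCC GTC ATA GGG GTA GGG GTG ATT GGG GTA GGG "
             "GTG ATC GGG GTG GCC ATA TAA",
}


def translate(dna: str) -> str:
    """Translate in frame until a stop codon or the end of the sequence."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def encoded_peptide(oligo: str) -> str:
    dna = oligo.replace(" ", "").upper()
    start = dna.find("GCGGCC")  # assumed start of the reading frame (Ala-Ala)
    return translate(dna[start:])


PEPTIDES = {name: encoded_peptide(seq) for name, seq in UPPER_OLIGOS.items()}
for name, pep in PEPTIDES.items():
    print(f"{name:6s} {pep}  prolines: {pep.count('P')}")
```

Under this assumed frame, EL3 and EL3-V each carry three prolines (one per PGVG-type turn), whereas EL3-I carries none, each proline being replaced by an isoleucine.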
The oligonucleotides UpXhoD84K (5'-AGG CTG CTC GAG AAA ATG CGA AGA ACC TTT AAC CTC CC) and LoXhoD84K (5'-ATT TTC TCG AGC AGC CTG GGC TGC TGC CCC C) were synthesized. Starting from the oligonucleotides 805FC and LMOADeltaPRO3 (see sequence above), the pairs 805FC/LoXhoD84K and UpXhoD84K/LMOADeltaPRO3 were used independently for PCR amplification of two DNA fragments, using FBAMOSALF as template. These two DNAs were digested by the enzymes NotI/XhoI and XhoI/BamHI respectively and co-ligated in one or other of the three plasmids FBAMOSALF, FBAMODeltaPROSALF, and FBAMOProSALF previously opened at NotI and BamHI. The resulting plasmids express respectively the envelopes AMOD84K, AMODeltaProD84K, and AMOProD84K.
Two DNA fragments of 2005 bp and 241 bp were isolated from the plasmid FBAMOG1X (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)) by digestion with the restriction enzymes NdeI/XhoI and XhoI/BstEII respectively.
These two inserts were cloned in the plasmid FBAMOD84KSALF previously digested by the enzymes NdeI and BstEII, resulting in a plasmid capable of expressing the AMOG1XD84K envelope.
Results and Discussion.
Expression and viral incorporation of the chimeric envelopes. The expression plasmids for the envelopes AMO, AMODeltaPRO, AMOPRO, AMOEL3, AMOEL5, AMOEL3-V, AMOEL3-I, AMO1FX, AMOG1X, AMOD84K, AMODeltaPROD84K, AMOPROD84K, AMOG1XD84K, AMODeltaPRO2 (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)), and AMODeltaPRO4 (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)) were introduced by transfection into the cells of the TELCeB6 line (Cosset et al., Journal of Virology 69:7430-7436 (1995b)). After selection by phleomycin, the phleomycin-resistant colonies were combined for each DNA and virions were generated and analysed following the procedures originally described (Cosset et al., Journal of Virology 69:6314-6322 (1995a)).
These various chimeric envelopes are normally expressed and matured in the cells and, moreover, efficiently incorporated on the viral particles (results not shown).
The binding tests that were carried out show that these retroviruses can bind specifically on human cells by means of the targeted surface molecule Ram-1 (results not shown).
These various viruses were used for infecting cells expressing either Rec-1 only (Cerd9), or Ram-1 only (CHO-Ram-1), or the two molecules Ram-1 and Rec-1 (Cear13). The results of titration of these viruses are presented in Table 4.
These results can be summarized as follows:
- in an AMO envelope, substitution of the spacer peptide by three beta-turns of a synthetic (AMOEL3) or natural (AMOEL3-V, from bovine elastin) polyproline helix described in the literature confers, with regard to capacity for masking the function of the ecotropic envelope and for regulating the cooperation of the Ram-1 and Rec-1 receptors, a phenotype similar to that of the viruses bearing the "AMO" envelopes containing the cooperating spacer peptides DeltaPRO2, DeltaPRO, DeltaPRO4, or PRO. Since the peptides derived from elastin (AMOEL3-V and AMOEL3) are arranged as a type II polyproline helix, it can be suggested on the basis of the results obtained that regulation of the cooperation of the Ram-1 and Rec-1 receptors by the DeltaPro and Pro peptides is probably due to their presumed secondary structure, as a type II polyproline helix. Moreover, mutations introduced into the spacer peptide derived from elastin (AMOEL3-V) and having the purpose of destroying the folding of the peptide into a type II polyproline helix (AMOEL3-I, mutations obtained by replacing the proline of each beta-turn with an isoleucine) lead to cancellation of receptor cooperation.
- destruction of the capacity for binding to the ecotropic receptor (D84K mutations) stops receptor cooperation for the envelopes containing cooperating spacer peptides, especially PRO (see results AMOPRO vs AMOPROD84K), but does not affect the functionality of the control envelopes bearing the flexible spacer peptide G1X (see results AMOG1X vs AMOG1XD84K). We deduce from this that binding to the ecotropic receptor is necessary for infection, in a second stage, following fixation on the Ram-1 receptor.
Note that the present results show that, in the case of the retroviruses generated with the chimeric envelope AMOPro, the binding domain to the ecotropic receptor is masked (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)). The results, taken together, are therefore compatible with a model of two-stage interaction in which:
- in its "naive" configuration, i.e. when it has not been permitted to interact with a cell, the "AMOPRO" retrovirus can potentially interact with the targeted "primary" receptor (the Ram-1 molecule), but cannot directly interact with the auxiliary receptor (the Rec-1 molecule). This masking seems to be due to a first property of the Pro spacer peptide.
- when this virus is permitted to interact with Ram-1, a local change in conformation occurs at the level of the Pro spacer peptide which will make the binding domain to Rec-1 accessible. This change in conformation is due to a second property of the Pro spacer peptide.
- if the Rec-1 receptor is present at the surface of the same cell that has Ram-1 and on which the virus is bound, then, in a second stage, this receptor will serve as an entry molecule for the virus.

Table 4. Results of titration.

env                sequence of the spacer peptide (a)    relative titres (b), in the order Cear13, CHO-Ram-1, Cerd9
AMO                NVG AAA PHQV    + + +
AMOD84K            NVG PRVPIGPNPAA PHQV    + + -
AMODeltaPro2       NVG PRVPIGPNPAA PHQV    ++++ +++
AMO1FX             NVG AAAIEGRASPGSS PHQV    ++++ +++ +++
AMODeltaPro        NVG PRVPIGPNPVLPDAAA PHQV    ++++ +++ -
AMODeltaProD84K    NVG PRVPIGPNPVLPDAAA PHQV    +++ +++ -
AMOEL3             NVG AAAPGVGAPGVGAPGVAA PHQV    +++ + -
AMOEL3-V           NVG AAVPGVGVPGVGVPGVAA PHQV    +++ + -
AMOEL3-I           NVG AAVIGVGVIGVGVIGVAA PHQV    +++ +++
AMOG1X             NVG AAAGGGGSIEGRASPGSS PHQV    +++ ++ ++
AMOG1XD84K         NVG AAAGGGGSIEGRASPGSS PHQV    ++ ++
AMODeltaPro4       NVG PRVPIGPNPVLPDQRLPSSAA PHQV    +++ ++ -
AMOEL5             NVG AAAPGVGAPGVGAPGVGAPGVGAPGVAA PHQV    +++ - -
AMOPRO             NVG PRVPIGPNPVLPDQRLPSSPIEIVPAPQPPSP... ...LNTSYPPSTTSTPSTSPTSPSVPQPPPAAA PHQV    +++ - -
AMOPROD84K         NVG PRVPIGPNPVLPDQRLPSSPIEIVPAPQQPPSP... ...LNTSYPPSTTSTPSTSPTSPSVPQPPPAAA PHQV    - -

a: ... envelope. "PHQV" represents the amino acids 7 to 10 of the envelope of MoMLV and "NVG" represents the last 3 amino acids of the binding domain to Ram-1.
b: relative titres obtained on the cells indicated: Cear13, expressing the receptors Ram-1 and Rec-1; CHO-Ram-1, expressing Ram-1 only; Cerd9, expressing Rec-1 only.
EXAMPLE 3.
The development of strategies of targeting gene transfer by means of the construction of chimeric envelope glycoproteins, generated by N-terminal insertions of ligands, comes up against the difficulty, in particular, of low capacity, or even incapacity, of the interaction between virus and targeted surface molecule for activating fusion of these targeting envelopes (Cosset and Russell, Gene Therapy 3:946-956 (1996)). The possibility of causing two surface molecules to cooperate (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)), one being the targeted receptor or cell surface molecule of attachment, the other being a (retro)viral receptor specialized for fusion or auxiliary surface molecule, makes it possible to envisage a means of overcoming this problem of low fusogenic capacity of chimeric envelopes and more generally of low efficiency of the targeting retroviruses. The cooperation of receptors was tested in three models of targeting, in which the following three cell surface molecules serve as points of attachment for the targeting retroviruses: (i) receptor of EGF (epidermal growth factor), and (ii) class I molecule of human MHC. The binding domains for these two surface molecules are either growth factors (EGFR), or a single-chain antibody (MHC-I).
These ligands were inserted by fusion at the N-terminal end of the amphotropic MLV envelope (4070A) and various peptides from the proline-rich region carried by the SU subunit of the amphotropic MLV virus were inserted between the ligands and the envelope (see Table 5).
Materials and Methods
DNA fragments coding for the spacer peptides DeltaPro2, DeltaPro3, DeltaPro4, and Pro (see Table 5) were generated by PCR using the env 4070A gene as DNA template, with, at 5', the oligonucleotide PRO-5-NE (5'-ATC GAG GTC ACC GCG GCC GCG GGA CCC CGA GTC CCC ATA GGG CCC), which is the same for the four PCR fragments, and, as 3' oligonucleotides, the sequences AMODPRO(-H+P-A) (5'-TAT GAG CGG CCG GGT TGG GCC CTA TGG GGA C), DPro3 (5'-TTA TAC GGC CGT GTC GGG TAA TAC TGG), AMODPRO(+H+S-A) (5'-TAT GTG CGG CCG AGG AAG GGA GTC TTT GGT C) and PRO-3-NE (5'-ATA ATC GGC CGG GGG TGG CTG TGG GAC).
The corresponding DNA fragments were digested by the enzyme EagI and inserted separately in the plasmid FBEASALF (expressing the chimeric envelope glycoproteins EA) (Cosset et al., Journal of Virology 69:6314-6322 (1995a)) previously opened at the NotI restriction site. The resulting plasmids express the envelopes EADeltaPro2, EADeltaPro3, EADeltaPro4, and EAPro.
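The cloning step relies on each PCR primer carrying an EagI site (C/GGCCG), whose GGCC overhang is compatible with a vector opened at NotI (GC/GGCCGC). The Python sketch below simply checks the primer sequences quoted above for these two recognition sequences; the dictionary keys and the function name are illustrative only.

```python
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence
EAGI_SITE = "CGGCCG"    # EagI recognition sequence; also contained in every NotI site

PRIMERS = {
    "PRO-5-NE (5' primer)": "ATC GAG GTC ACC GCG GCC GCG GGA CCC CGA GTC CCC ATA GGG CCC",
    "AMODPRO(-H+P-A)": "TAT GAG CGG CCG GGT TGG GCC CTA TGG GGA C",
    "DPro3": "TTA TAC GGC CGT GTC GGG TAA TAC TGG",
    "AMODPRO(+H+S-A)": "TAT GTG CGG CCG AGG AAG GGA GTC TTT GGT C",
    "PRO-3-NE": "ATA ATC GGC CGG GGG TGG CTG TGG GAC",
}


def site_report(oligo: str) -> dict:
    """Report which of the two recognition sequences occur in the primer.
    Both sites are palindromic, so scanning one strand is sufficient."""
    seq = oligo.replace(" ", "").upper()
    return {"EagI": EAGI_SITE in seq, "NotI": NOTI_SITE in seq}


REPORT = {name: site_report(seq) for name, seq in PRIMERS.items()}
for name, sites in REPORT.items():
    print(f"{name:22s} EagI={sites['EagI']}  NotI={sites['NotI']}")
```

Every primer contains an EagI site, so each PCR product can be trimmed at both ends by EagI alone; only the 5' primer carries a complete NotI site.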
The NdeI/NotI fragment containing the promoter FB29 as well as the scFv anti-MHC-I provided with the signal peptide of the envelope glycoprotein of the MoMLV virus (Marin et al., Journal of Virology 70:2957-2962 (1996)) was cloned in the FBEASALF plasmid from which the NdeI/NotI fragment was removed beforehand. This results in the plasmid FB34ASALF capable of expressing a 4070A chimeric envelope with the scFv fused at its N-terminal end. This plasmid was then opened at NotI for inserting the spacer peptides DeltaPro2, DeltaPro3, DeltaPro4, and Pro (see Table 5) previously digested with the EagI enzyme. This results in a series of expression vectors for the envelopes 34DeltaPro2, 34DeltaPro3, 34DeltaPro4, and 34Pro.
Results and Discussion.
These various DNAs were introduced by transfection into the cells of the TELCeB6 line and retroviruses were generated following the usual procedure (see Examples 1 and 2). It was shown that these retroviruses correctly express the chimeric envelope glycoproteins and that the latter permit efficient redirection of binding of the viral particles on the specific cellular targets (results not shown).
The viruses produced with the chimeric envelopes of the various groups were used for infecting cells that only express the amphotropic receptor and not the targeted surface molecule. The results of titration of these viruses are shown in Table 5.
These results show that it is possible to mask the functions of the amphotropic envelope by means of fragments from the proline-rich region. In the case of chimeras constructed with EGF, it is necessary to insert at least five beta-turns to obtain a significant masking effect, and insertion of the whole of the proline-rich region leads to complete inhibition. For the chimeras constructed with scFv anti-MHC-I, three beta-turns are required to obtain a complete masking effect.
Table 5. Results of titration.

peptides (a)                                                   ligand for (b):
name           sequence                                        EGFR      MHC-I
without (c)    AAA PHQV                                        6e3       39e2
DeltaPro2      AAA GPRVPIGPNPAA PHQV                           7e3       18e1
DeltaPro3      AAA GPRVPIGPNPVLPDTAA PHQV                      1.2e3     <1e1
DeltaPro4      AAA GPRVPIGPNPVLPDQRLPSSAA PHQV                 7e1       <1e1
Pro            AAA GPRVPIGPNPVLPDQRLPSSPIEIVPAPQPP... ...SPLNTSYPPSTTSTPSTSPTSPSVPQPPPAA PHQV    <1e1    <1e1

a: peptide inserted between the targeting binding domain and the 4070A envelope. "AAA" is encoded by the NotI site used for effecting fusion in the chimeric envelope; "PHQV" represents the amino acids 4 to 7 of the amphotropic envelope. The beta-turns are underlined.
b: titration on Cear13 cells for the EGFR targeting envelopes (ligand: EGF) and for the envelopes targeting MHC-I (ligand: scFv anti-MHC-I).
c: the ligand is directly fused at the end of the amphotropic SU (with the 4th amino acid), and does not have a spacer peptide.
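The degree of masking can be expressed as the fold reduction in titre relative to the envelope without a spacer peptide. The following Python sketch does this for the EGFR column of the table above. The titres are taken as read from that column, the `<1e1` entry for Pro is treated as an upper bound, and the function name is hypothetical.

```python
# Titres (lacZ i.u./ml) for the EGF-targeted envelopes, as read from the EGFR column.
EGFR_TITRES = {
    "without": 6e3,
    "DeltaPro2": 7e3,
    "DeltaPro3": 1.2e3,
    "DeltaPro4": 7e1,
}
PRO_UPPER_BOUND = 1e1  # Pro is reported only as "<1e1"


def fold_reduction(reference_titre: float, titre: float) -> float:
    """How many times lower the titre is than the reference (values above 1 indicate masking)."""
    return reference_titre / titre


reference = EGFR_TITRES["without"]
MASKING = {name: fold_reduction(reference, t)
           for name, t in EGFR_TITRES.items() if name != "without"}
# For Pro only a lower bound on the fold reduction can be given.
pro_min_fold = fold_reduction(reference, PRO_UPPER_BOUND)

for name, fold in MASKING.items():
    print(f"{name:10s} {fold:6.1f}-fold")
print(f"Pro        >{pro_min_fold:.0f}-fold")
```

On these figures, masking is negligible for DeltaPro2, modest for DeltaPro3 (about 5-fold), and becomes substantial only with DeltaPro4 and the complete proline-rich region.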
EXAMPLE 4.
The previous investigations made it possible to delimit the C-terminal ends of the cooperating peptides and to determine the number of turns of type II polyproline helix necessary for obtaining a masking effect and a minimal cooperative effect. In the case of the model of the AMO chimeric envelopes (see above), a minimum of two turns of the helix is sufficient (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)).
However, for chimeric envelopes generated with binding domains other than that for Ram-1 (in the case of AMO chimeras) and using the amphotropic envelope as support envelope, the cooperative effect is less marked, on the one hand because masking of the functions of the amphotropic envelope requires four turns of polyproline helix (see Table 5) and on the other hand because activation of the functions of the amphotropic envelope is less strong following binding of the viruses on the targeted surface molecules. One possible explanation is that, in the model of the AMO chimeras, apart from the PRO spacer peptide, the binding domain to Ram-1 itself carries important determinants for inducing, in a concerted manner with this PRO peptide, activation of the functions of the ecotropic envelope. The binding domain for Ram-1 is in fact a fragment of retroviral envelope (derived from the amphotropic envelope) which is naturally located immediately upstream of the proline-rich region. In order to determine the presence and the importance of such regions in receptor cooperation, chimeric envelopes were constructed combining a targeting domain with the amphotropic envelope and, inserted between these two polypeptides, various peptides tested for their cooperative effect, containing notably the proline-rich region (or a fragment of this region) combined with peptide fragments derived from the N-terminal domain of the amphotropic envelope.
Materials and Methods
DNA fragments coding for the spacer peptides DeltaPro4-beta, DeltaPro4-int and DeltaPro4-vrb were generated by PCR using the env 4070A gene as DNA template, with, at 3', the oligonucleotide AMODPRO(+H+S-A) (5'-TAT GTG CGG CCG AGG AAG GGA GTC TTT GGT C) and, at 5', the oligonucleotides UPro-beta (5'-ATG CTG GCG GCC GCG GAT CCT ATT ACC ATG TTC TCC CTG ACC CGG C), UPro-int (5'-ATG CTG GCG GCC GCG AAC CCT CTA GTC CTA GAA TTC ACT GAT GC), and UPRO-vrb (5'-ATG CTG GCG GCC GCG GAA ACC ACC GGA CAG GCT TAC TGG AAG CCC), respectively (see Figs. 36 to 38).
DNA fragments coding for the spacer peptides Pro-beta, Pro-int and Pro-vrb were generated by PCR using the 4070A env gene as DNA template, the oligonucleotide PRO-3-NE (ATA ATC GGC CGG GGG TGG CTG TGG GAC) as 3' primer and, as 5' primers, the oligonucleotides UPro-beta (5'-ATG CTG GCG GCC GCG GAT CCT ATT ACC ATG TTC TCC CTG ACC CGG C), UPro-int (5'-ATG CTG GCG GCC GCG AAC CCT CTA GTC CTA GAA TTC ACT GAT GC) and UPRO-vrb (5'-ATG CTG GCG GCC GCG GAA ACC ACC GGA CAG GCT TAC TGG AAG CCC), respectively (see Figs. 39 to 41).
These DNA fragments were digested with the EagI enzyme and inserted either into the FBEASALF plasmid (see above), giving the expression vectors for the chimeric envelopes EADeltaPro4-beta, EADeltaPro4-int, EADeltaPro4-vrb, EAPro-beta, EAPro-int and EAPro-vrb, or into the FB34ASALF plasmid (see above), giving the expression vectors for the chimeric envelopes 34ADeltaPro4-beta, 34ADeltaPro4-int, 34ADeltaPro4-vrb, 34APro-beta, 34APro-int and 34APro-vrb.
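Since the PCR fragments are cloned after EagI digestion, each primer is expected to carry an EagI recognition site (CGGCCG). The Python sketch below searches the primers quoted in this section for that site. The function names find_sites and reverse_complement and the dictionary name PRIMERS are hypothetical helpers introduced for illustration, and the primer sequences are copied from the text as printed.

```python
# EagI recognition sequence (cuts C^GGCCG); the site is palindromic,
# so scanning one strand is sufficient.
EAGI_SITE = "CGGCCG"

# Primers as printed in the text; spaces are only for readability.
PRIMERS = {
    "AMODPRO(+H+S-A)": "TAT GTG CGG CCG AGG AAG GGA GTC TTT GGT C",
    "PRO-3-NE": "ATA ATC GGC CGG GGG TGG CTG TGG GAC",
    "UPro-beta": "ATG CTG GCG GCC GCG GAT CCT ATT ACC ATG TTC TCC CTG ACC CGG C",
    "UPro-int": "ATG CTG GCG GCC GCG AAC CCT CTA GTC CTA GAA TTC ACT GAT GC",
    "UPRO-vrb": "ATG CTG GCG GCC GCG GAA ACC ACC GGA CAG GCT TAC TGG AAG CCC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(primer: str, site: str = EAGI_SITE) -> list[int]:
    """Return the 0-based start positions of `site` in the primer."""
    seq = primer.replace(" ", "").upper()
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(f"{name}: EagI site(s) at {find_sites(primer)}")
```

With the sequences as printed, every primer contains exactly one EagI site, which is consistent with cloning the amplified fragments through EagI-compatible ends.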
BIBLIOGRAPHY
1. Battini, J. L., O. Danos, and J. M. Heard. 1995. Receptor-binding domain of murine leukemia virus envelope glycoproteins. J. Virol. 69:713-719.
2. Battini, J. L., P. Rodrigues, R. Müller, O. Danos, and J.-M. Heard. 1996. Receptor-binding properties of a purified fragment of the 4070A amphotropic murine leukemia virus envelope glycoprotein. J. Virol. in press.
3. Bell, G. I., N. M. Fong, M. M. Stempien, M. A. Wormsted, D. Caput, L. Ku, M. S. Urdea, L. B. Rall, and R. Sanchez-Pescador. 1986. Human epidermal growth factor precursor: cDNA sequence, expression in vitro and gene organization. Nucleic Acids Res. 14:8427-8446.
4. Cosset, F.-L., C. Legras, Y. Chebloune, P. Savatier, P. Thoraval, J. L. Thomas, J. Samarut, V. M. Nigon, and G. Verdier. 1990. A new avian leukosis virus-based packaging cell line that uses two separate transcomplementing helper genomes. J. Virol. 64:1070-1078.
5. Cosset, F.-L., C. Legras, J. L. Thomas, R. M. Molina, Y. Chebloune, C. Faure, V. M. Nigon, and G. Verdier. 1991. Improvement of avian leukosis virus (ALV)-based retrovirus vectors by using different cis-acting sequences from ALVs. J. Virol. 65:3388-3394.
6. Cosset, F.-L., F. J. Morling, Y. Takeuchi, R. A. Weiss, M. K. L. Collins, and S. J. Russell. 1995a. Retroviral retargeting by envelopes expressing an N-terminal binding domain. J. Virol. 69:6314-6322.
7. Cosset, F.-L., Y. Takeuchi, J. L. Battini, R. A. Weiss, and M. K. L. Collins. 1995b. High titer packaging systems producing recombinant retroviruses resistant to human serum. J. Virol. 69:7430-7436.
8. Gatignol, A., H. Durand, and G. Tiraby. 1988. Bleomycin resistance conferred by a drug-binding protein. FEBS Letters. 230:171-175.
9. Kozak, S. L., D. C. Siess, M. P. Kavanaugh, A. D. Miller, and D. Kabat. 1995. The envelope glycoprotein of an amphotropic murine retrovirus binds specifically to the cellular receptor/phosphate transporter of susceptible species. J. Virol. 69:3433-3440.
10. Marin, M., D. Noël, S. Valsesia-Wittmann, F. Brockly, M. Etienne-Julan, S. J. Russell, F.-L. Cosset, and M. Piechaczyk. 1996. Targeted infection of human cells via MHC class I molecules by MoMuLV-derived viruses displaying single-chain antibody fragment-envelope fusion proteins. In press.
11. Miller, D. G., R. H. Edwards, and A. D. Miller. 1994. Cloning of the cellular receptor for amphotropic murine retroviruses reveals homology to that for gibbon ape leukemia virus. Proc. Natl. Acad. Sci. USA. 91:78-82.
12. Nagai, K., and H. C. Thøgersen. 1984. Generation of beta-globin by sequence-specific proteolysis of a hybrid protein produced in Escherichia coli. Nature (London) 309:810-812.
13. Nilson, B. H. K., F. J. Morling, F.-L. Cosset, and S. J. Russell. 1996. Targeting of retroviral vectors through protease-substrate interactions. Gene Ther. in press.
14. Ott, D., R. Friedrich, and A. Rein. 1990. Sequence analysis of amphotropic and 10A1 murine leukemia virus: close relationship to mink cell focus forming viruses. J. Virol. 64:757-766.
15. Russell, S. J., R. E. Hawkins, and G. Winter. 1993. Retroviral vectors displaying functional antibody fragments. Nucleic Acids Research. 21:1081-1085.
16. Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press, New York.
17. Shinnick, T. M., R. A. Lerner, and J. G. Sutcliffe. 1981. Nucleotide sequence of Moloney murine leukemia virus. Nature 293:543-548.
18. Souyri, M., I. Vigon, J. F. Penciolelli, J. M. Heard, P. Tambourin, and F. Wendling. 1990. A putative truncated cytokine receptor gene transduced by the myeloproliferative leukemia virus immortalizes hematopoietic progenitors. Cell. 63:1137-1147.
19. Takeuchi, Y., F. L. Cosset, P. J. Lachmann, H. Okada, R. A. Weiss, and M. K. L. Collins. 1994. Type C retrovirus inactivation by human complement is determined by both the viral genome and producer cell. J. Virol. 68:8001-8007.
20. Valsesia-Wittmann, S., A. Drynda, G. Deleage, M. Aumailley, J.-M. Heard, O. Danos, G. Verdier, and F.-L. Cosset. 1994. Modifications in the binding domain of avian retrovirus envelope protein to redirect the host range of retroviral vectors. J. Virol. 68:4609-4619.
21. Valsesia-Wittmann, S., F. J. Morling, B. H. K. Nilson, Y. Takeuchi, S. J. Russell, and F.-L. Cosset. 1996. Improvement of retroviral retargeting by using amino acid spacers between an additional binding domain and the N terminus of Moloney murine leukemia virus SU. J. Virol. 70:2059-2064.
22. VanZeijl, M., S. V. Johann, E. Cross, J. Cunningham, R. Eddy, T. B. Shows, and B. O'Hara. 1994. An amphotropic virus receptor is a second member of the gibbon ape leukemia virus receptor family. Proc. Natl. Acad. Sci. USA. 91:1168-1172.
23. Weiss, R. A. 1993. Cellular receptors and viral glycoproteins involved in retroviral entry, p. 1-108. In J. Levy (ed.), The Retroviridae, vol. 2. Plenum Press.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE
(ii) TITLE OF INVENTION: VIRAL PARTICLES WHICH ARE MASKED OR UNMASKED WITH RESPECT TO A CELL RECEPTOR
(iii) NUMBER OF SEQUENCES: 70
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: FETHERSTONHAUGH & CO.
(B) STREET: P.O. BOX 2999, STATION D
(C) CITY: OTTAWA
(D) STATE: ONT
(E) COUNTRY: CANADA
(F) ZIP: K1P 5Y6
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: ASCII (text)
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: CA 2,253,874
(B) FILING DATE: 16-MAY-1997
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: FR 96/06234
(B) FILING DATE: 20-MAY-1996
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: FETHERSTONHAUGH & CO.
(B) REGISTRATION NUMBER:
(C) REFERENCE/DOCKET NUMBER: 11534-16
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (613)-235-4373
(B) TELEFAX: (613)-232-8440
(2) INFORMATION FOR SEQ ID NO.: 1:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 189
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(189)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 1:
Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 2:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 63
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 2:
Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 3:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 144
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(144)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 3:
Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 4:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 48
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 4:
Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 5:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 189
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(189)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 5:
Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 6:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 63
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 6:
Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 7:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 312
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(312)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 7:
Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 8:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 104
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 8:
Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 9:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 60
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(60)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 9:
Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln
(2) INFORMATION FOR SEQ ID NO.: 10:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 20
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 10:
Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln
(2) INFORMATION FOR SEQ ID NO.: 11:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 105
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(105)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 11:
Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln
(2) INFORMATION FOR SEQ ID NO.: 12:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 35
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 12:
Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln
(2) INFORMATION FOR SEQ ID NO.: 13:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 183
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(183)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 13:
Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln
(2) INFORMATION FOR SEQ ID NO.: 14:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 61
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 14:
Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln
(2) INFORMATION FOR SEQ ID NO.: 15:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2780
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2778)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 15:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro 4 0 Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Aep Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp 
Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile 2 0 Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 16:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 926
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 16:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala 50 Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser 2 0 Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr 
Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lye Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val 4 0 Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile 2 0 Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 17:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2642
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2640)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 17:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro 5 0 Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys 2 0 Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro 4 0 Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser AAG
Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lye Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val 3 0 Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 18:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 880
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 18:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys 50 Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg 2 0 Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lye Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro 
Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala 4 0 Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 19:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2792 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2790) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 19:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly
Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 20:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 930 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 20:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr
Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 21:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2700 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2697) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 21:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro 
Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 22:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 899 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 22:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln
Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 23:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2322 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2319) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 23:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys
Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met
Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 24:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 773 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 24:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys
Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 25:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2367 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2364) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 25:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val
Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser
Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 26:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 788 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 26:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr
Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 27:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2490 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2487) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 27:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys
Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 28:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 829 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 28:
2 0 Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser 2 0 Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln 
Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 29:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2289 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2286) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 29:
AAA GAT AAC
CCC AAG
Met Ala ArgSer ThrLeuSer LysProPro GlnAspLys IleAsnPro Trp Lys ProLeu IleValMet GlyValLeu LeuGlyVal GlyMetAla Glu Ser AlaAla GlnProAla MetAlaAsn SerAspSer GluCysPro Leu Ser HisAsp GlyTyrCys LeuHisAsp GlyValCys MetTyrIle Glu Ala LeuAsp LysTyrAla CysAanCys ValValGly TyrIleGly Glu Arg CysGln TyrArgAsp LeuLysTrp TrpGluLeu ArgGlyPro Arg Val ProIle GlyProAsn ProValLeu AlaAspGln GlnProLeu Ser Lys ProLys ProValLys SerProSer ValThrLys ProProSer Gly Thr ProLeu SerProThr GlnLeuPro ProAlaAla AlaProHis Gln Val PheAsn ValThrTrp ArgValThr AsnLeuMet ThrGlyArg Thr Ala AsnAla ThrSerLeu LeuGlyThr ValGlnAsp AlaPhePro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp GAC CTA ATC TCC CTT AAG CGC GGT AAC ACC CCC TGG GAC ACG GGA TG~ 816 Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu 3 0 Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro 5 0 Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr 
Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 30:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 762 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 30:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser 50 Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Aen Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu 4 0 Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln 
Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 31:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2334 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2331) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 31:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu 3 0 Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly 
Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile 4 0 Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Aen Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr 6 0 Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 32:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 777 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 32:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg 100 ~ 105 110 Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly 2 0 Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly 
Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 33:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2457 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2454) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 33:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu SerHisAsp GlyTyrCys LeuHisAsp GlyValCys MetTyrIle Glu AlaLeuAsp LysTyrAla CysAsnCys ValValGly TyrIleGly 3 0 Glu ArgCysGln TyrArgAsp LeuLysTrp TrpGluLeu ArgGluPhe Thr AspAlaGly LysLysAla AsnTrpAsp GlyProLys SerTrpGly Leu ArgLeuTyr ArgThrGly ThrAspPro IleThrMet PheSerLeu AAT GGG
Thr Arg GlnVal LeuAsnVal GlyProArg ValProIle GlyProAsn Pro Val LeuPro AspGlnArg LeuProSer SerProIle GluIleVal Pro Ala ProGln ProProSer ProLeuAsn ThrSerTyr ProProSer Thr Thr SerThr ProSerThr SerProThr SerProSer ValProGln Pro Pro ProAla AlaAlaPro HisGlnVal PheAsnVal ThrTrpArg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly 
Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 34:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 818 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 34:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro 4 0 Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr 
Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr 2 0 Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 35:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2229 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2226) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 35:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser 4 0 Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser 4 0 Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala 4 0 Cys 
Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 36:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 742 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 36:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser 4 0 Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser 2 0 Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser 
Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 37:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2274 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2271) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 37:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser 3 0 Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln 50 Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys. Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly 3 0 Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg 
Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 38:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 757 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 38:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser
Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 39:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2352 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2349) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 39:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr
Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 40:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 783 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 40:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr
Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 41:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2196 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2193) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 41:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu
Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 42:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 731 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 42:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val
Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 43:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2241 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2238) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 43:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu
Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 44:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 746 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 44:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr
Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 45:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2319 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2316) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 45:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His 
Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 46:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 772 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 46:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys
Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 47:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2649 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2646) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 47:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg
Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly ATA CTA
GGA ATG
ACA GCC
GGG ACT
ACT
ACT
Ile Ala AlaGly IleGly GlyThrThr Ala Met Ala Gln Thr Leu Thr GCC GAT GAG
Gln Phe GlnGln LeuGln AlaValGln Asp Leu Arg Val Ala Asp Glu CTA ACT TCT
Glu Lys SerIle SerAsn GluLysSer Leu Ser Leu Glu Leu Thr Ser AGG TTA AAA
Val Val LeuGln AsnArg GlyLeuAsp Leu Phe Leu Glu Arg Leu Lys CTA TGC GCG
Gly Gly LeuCys AlaAla LysGluGlu Cys Phe Tyr Asp Leu Cys Ala GAC TTG AGG
His Thr GlyLeu ValArg SerMetAla Lys Arg Glu Leu Asp Leu Arg TTT GGA GAG
Asn Gln ArgGln LysLeu GluSerThr Gln Trp Phe Gly Phe Gly Glu Leu Phe AsnArg SerPro PheThrThr Leu Ser Thr Met Trp Ile Ile CTA TTC TGC
Gly Pro LeuIle ValLeu MetIleLeu Leu Gly Pro Ile Leu Phe Cys TTT ATC GTC
Leu Asn ArgLeu ValGln ValLysAsp Arg Ser Val Gln Phe Ile Val CAA AAA GAA
Ala Leu ValLeu ThrGln TyrHisGln Leu Pro Leu Tyr Gln Lys Glu Glu Pro (2) INFORMATION FOR SEQ ID NO.: 48:-(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 882 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 48:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser
Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 49:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 54 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(54) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 49:
Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 50:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 18 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 50:
Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 51:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2649 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2646) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 51:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys
Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 52:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 882 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 52:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys
Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 53:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 54 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(54) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 53:
Ala Ala Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 54:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 18 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 54:
Ala Ala Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 55:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2679 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2676) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 55:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val
Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 56:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 892 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 56:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn
Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 57:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 84 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(84) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 57:
Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 58:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 28 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 58:
Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 59:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 118 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(117) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 59:
Ala Ala Ala Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 60:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 39 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 60:
Ala Ala Ala Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 61:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 211 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(210) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 61:
Ala Ala Ala Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 62:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 70 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 62:
Ala Ala Ala Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 63:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 382 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(381) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 63:
Ala Ala Ala Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 64:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 127 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 64:
Ala Ala Ala Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 65:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 238 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(237) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 65:
Ala Ala Ala Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 66:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 79 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 66:
Ala Ala Ala Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 67:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 331 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(330) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 67:
Ala Ala Ala Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 68:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 110 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 68:
Ala Ala Ala Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 69:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 502 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(501) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 69:
Ala Ala Ala Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 70:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 167 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 70:
Ala Ala Ala Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala
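The declared lengths in this listing can be cross-checked against the residues actually printed. The following is a minimal sketch, not part of the filed listing: it assumes the standard three-letter amino acid abbreviations, and the helper names `parse_residues` and `to_one_letter` are hypothetical. It counts the residues of SEQ ID NO.: 50 as printed above, converts them to one-letter code, and flags any token that is not a standard abbreviation, which is how character-recognition errors such as "Lye" for "Lys" can be spotted.

```python
# Standard three-letter to one-letter amino acid abbreviations.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_residues(listing_text):
    """Split a three-letter listing into (known residues, unknown tokens)."""
    residues, unknown = [], []
    for token in listing_text.split():
        if token in THREE_TO_ONE:
            residues.append(token)
        else:
            unknown.append(token)  # e.g. an OCR artefact or a stray page number
    return residues, unknown


def to_one_letter(residues):
    """Convert a list of three-letter residues to a one-letter string."""
    return "".join(THREE_TO_ONE[r] for r in residues)


# SEQ ID NO.: 50 as printed in the listing (declared LENGTH: 18).
SEQ_ID_50 = (
    "Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val "
    "Gly Ala Pro Gly Val Ala Ala"
)

residues, unknown = parse_residues(SEQ_ID_50)
one_letter = to_one_letter(residues)
coding_nucleotides = 3 * len(residues)  # one codon (3 nt) per residue

print(len(residues), one_letter, coding_nucleotides, unknown)
```

For SEQ ID NO.: 50 the residue count agrees with the declared length of 18, and three nucleotides per residue gives 54, which agrees with the declared length and CDS location (1)..(54) of the corresponding nucleic acid entry, SEQ ID NO.: 49.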
Plenum Press.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE
(ii) TITLE OF INVENTION: VIRAL PARTICLES WHICH ARE MASKED OR UNMASKED
WITH RESPECT TO A CELL RECEPTOR
(iii) NUMBER OF SEQUENCES: 70
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: FETHERSTONHAUGH & CO.
(B) STREET: P.O. BOX 2999, STATION D
(C) CITY: OTTAWA
(D) STATE: ONT
(E) COUNTRY: CANADA
(F) ZIP: K1P 5Y6
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: ASCII (text)
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: CA 2,253,874
(B) FILING DATE: 16-MAY-1997
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: FR 96/06234
(B) FILING DATE: 20-MAY-1996
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: FETHERSTONHAUGH & CO.
(B) REGISTRATION NUMBER:
(C) REFERENCE/DOCKET NUMBER: 11534-16
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (613)-235-4373
(B) TELEFAX: (613)-232-8440
(2) INFORMATION FOR SEQ ID NO.: 1:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 189
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(189)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 1:
Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 2:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 63
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 2:
Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 3:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 144
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(144)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 3:
Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 4:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 48
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 4:
Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 5:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 189
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(189)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 5:
Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 6:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 63
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 6:
Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 7:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 312
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(312)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 7:
Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 8:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 104
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 8:
Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala
(2) INFORMATION FOR SEQ ID NO.: 9:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 60
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(60)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 9:
Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln
(2) INFORMATION FOR SEQ ID NO.: 10:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 20
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 10:
Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln
(2) INFORMATION FOR SEQ ID NO.: 11:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 105
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(105)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 11:
Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln
(2) INFORMATION FOR SEQ ID NO.: 12:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 35
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 12:
Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln
(2) INFORMATION FOR SEQ ID NO.: 13:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 183
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(183)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 13:
Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln
(2) INFORMATION FOR SEQ ID NO.: 14:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 61
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 14:
Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln
(2) INFORMATION FOR SEQ ID NO.: 15:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2780
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2778)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 15:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp
Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 16:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 926
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 16:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr
Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 17:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2642
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2640)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 17:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser
Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 18:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 880
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 18:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro
Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 19:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2792
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2790)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 19:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly
Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 20:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 930
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 20:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr
Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 21:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2700
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2697)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 21:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro 
Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 22:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 899
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 22:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu _ 77 _ Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg 2 0 Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln 
Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro _ 78 _ Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lye Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly 4 0 Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu _ 79 _ Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 23:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2322
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2319)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 23:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp '' Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys 
Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met
Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 24:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 773
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 24:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Aap Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp 2 0 Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys 
Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His 4 0 Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 25:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2367
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2364)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 25:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lye Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val AAG
ValGlyTyrIle GlyGluArg CysGlnTyr ArgAspLeu LysTrpTrp GluLeuArgAsp ProGlyLeu ThrPheGly IleArgLeu ArgTyrGln AsnLeuGlyPro ArgValPro IleGlyPro AsnProVal LeuAlaAsp GlnGlnProLeu SerLysPro LysProVal LysSerPro SerValThr 2 0 Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro 4 0 Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys _ 87 _ Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lye Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser 
Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 26:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 788
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 26:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val 2 0 Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys 2 0 Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr 
Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser 5 0 Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 27:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2490
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2487)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 27:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lye Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys 
Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 28:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 829
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 28:
2 0 Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser 2 0 Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln 
Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 29:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2289
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2286)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 29:
AAA GAT AAC
CCC AAG
Met Ala ArgSer ThrLeuSer LysProPro GlnAspLys IleAsnPro Trp Lys ProLeu IleValMet GlyValLeu LeuGlyVal GlyMetAla Glu Ser AlaAla GlnProAla MetAlaAsn SerAspSer GluCysPro Leu Ser HisAsp GlyTyrCys LeuHisAsp GlyValCys MetTyrIle Glu Ala LeuAsp LysTyrAla CysAanCys ValValGly TyrIleGly Glu Arg CysGln TyrArgAsp LeuLysTrp TrpGluLeu ArgGlyPro Arg Val ProIle GlyProAsn ProValLeu AlaAspGln GlnProLeu Ser Lys ProLys ProValLys SerProSer ValThrLys ProProSer Gly Thr ProLeu SerProThr GlnLeuPro ProAlaAla AlaProHis Gln Val PheAsn ValThrTrp ArgValThr AsnLeuMet ThrGlyArg Thr Ala AsnAla ThrSerLeu LeuGlyThr ValGlnAsp AlaPhePro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp GAC CTA ATC TCC CTT AAG CGC GGT AAC ACC CCC TGG GAC ACG GGA TG~ 816 Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu 3 0 Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro 5 0 Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr 
Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 30:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 762
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 30:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser 50 Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Aen Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu 4 0 Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln 
Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu 4 0 Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 31:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2334
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2331)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 31:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu 3 0 Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly 
Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile 4 0 Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Aen Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr 6 0 Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 32:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 777
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 32:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg 100 ~ 105 110 Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly 2 0 Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly 
Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr 4 0 Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 33:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2457
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2454)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 33:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu SerHisAsp GlyTyrCys LeuHisAsp GlyValCys MetTyrIle Glu AlaLeuAsp LysTyrAla CysAsnCys ValValGly TyrIleGly 3 0 Glu ArgCysGln TyrArgAsp LeuLysTrp TrpGluLeu ArgGluPhe Thr AspAlaGly LysLysAla AsnTrpAsp GlyProLys SerTrpGly Leu ArgLeuTyr ArgThrGly ThrAspPro IleThrMet PheSerLeu AAT GGG
Thr Arg GlnVal LeuAsnVal GlyProArg ValProIle GlyProAsn Pro Val LeuPro AspGlnArg LeuProSer SerProIle GluIleVal Pro Ala ProGln ProProSer ProLeuAsn ThrSerTyr ProProSer Thr Thr SerThr ProSerThr SerProThr SerProSer ValProGln Pro Pro ProAla AlaAlaPro HisGlnVal PheAsnVal ThrTrpArg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly 
Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lye Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 34:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 818
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 34:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro 4 0 Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr 
Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr 2 0 Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 35:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2229
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2226)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 35:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser 4 0 Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser 4 0 Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala 4 0 Cys 
Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lye Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro 40 (2) INFORMATION FOR SEQ ID NO.: 36:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 742
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 36:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser 4 0 Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser 2 0 Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser 
Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 37:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2274
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2271)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 37:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser 3 0 Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln 50 Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys. Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly 3 0 Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg 
Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 38:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 757
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 38:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His 2 0 Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Aap Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser 
Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 39:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2352
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2349)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 39:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro 2 0 Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp 4 0 Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lye Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr 
Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 40:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 783 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 40:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr
Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 41:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2196 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2193) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 41:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu
Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 42:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 731 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 42:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val
Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 43:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2241 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2238) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 43:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu
Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 44:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 746 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 44:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr
Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 45:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2319 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2316) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 45:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His 
Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 46:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 772 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 46:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys
Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 47:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2649 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2646) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 47:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg
Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 48:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 882 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 48:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser
Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 49:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 54 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(54) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 49:
Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 50:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 18 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 50:
Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 51:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2649 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2646) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 51:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys
Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 52:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 882 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 52:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys
Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 53:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 54 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(54) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 53:
Ala Ala Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 54:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 18 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 54:
Ala Ala Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 55:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2679 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2676) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 55:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val
Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 56:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 892 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 56:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn
Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 57:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 84 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(84) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 57:
Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 58:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 28 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 58:
Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 59:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 118 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(117) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 59:
Ala Ala Ala Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 60:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 39 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 60:
Ala Ala Ala Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 61:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 211 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(210) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 61:
Ala Ala Ala Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 62:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 70 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 62:
Ala Ala Ala Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 63:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 382 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(381) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 63:
Ala Ala Ala Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 64:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 127 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 64:
Ala Ala Ala Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 65:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 238 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(237) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 65:
Ala Ala Ala Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 66:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 79 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 66:
Ala Ala Ala Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 67:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 331 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(330) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 67:
Ala Ala Ala Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 68:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 110 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 68:
Ala Ala Ala Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 69:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 502 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(501) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 69:
Ala Ala Ala Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 70:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 167 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 70:
Ala Ala Ala Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala
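The three-letter listings above can be cross-checked against their declared LENGTH fields with a short script. The following is a minimal sketch in Python; the names `THREE_TO_ONE`, `to_one_letter`, `SEQ_ID_50` and `SEQ_ID_54` are introduced here for illustration only, and the two residue strings are copied from the entries for SEQ ID NO.: 50 and SEQ ID NO.: 54.

```python
# Standard three-letter to one-letter amino acid codes (20 common residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq):
    """Convert a whitespace-separated three-letter sequence to one-letter code.

    Raises ValueError on any token that is not one of the 20 standard
    residues, which also catches scanning errors in a transcribed listing.
    """
    letters = []
    for token in three_letter_seq.split():
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognized residue token: {token!r}")
        letters.append(THREE_TO_ONE[token])
    return "".join(letters)


# Residues copied from the listing (hypothetical variable names).
SEQ_ID_50 = ("Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly "
             "Ala Pro Gly Val Ala Ala")
SEQ_ID_54 = ("Ala Ala Val Pro Gly Val Gly Val Pro Gly Val Gly "
             "Val Pro Gly Val Ala Ala")

if __name__ == "__main__":
    for name, seq in (("SEQ ID NO.: 50", SEQ_ID_50),
                      ("SEQ ID NO.: 54", SEQ_ID_54)):
        one = to_one_letter(seq)
        # Both entries declare LENGTH: 18; the matching DNA entries declare 54 = 3 x 18.
        print(name, one, len(one))
```

Strict validation is a deliberate choice in this sketch: a token outside the standard set stops the conversion instead of being silently skipped, so a residue count that matches the declared LENGTH can be trusted.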
Claims (18)
1. Use of a peptide for transferring genes into a eukaryotic target cell, this peptide containing from about 10 to about 200, especially from about 15 to about 150 amino acids, and advantageously about 20 amino acids, in which at least 30% of the amino acids consist of proline residues, these proline residues being arranged regularly so as to induce turnings of the polypeptide chain at about 180° ("β-turn" or "reverse-turn"), these turnings being regularly spaced and forming a polyproline β-turn helix, in a polypeptide construction containing, on the N-terminal side (upstream) of the said peptide, an N-terminal (upstream) protein domain capable of recognizing a targeted surface molecule or an antigen expressed on a cell surface, especially a suitable receptor (targeted receptor) located on the said eukaryotic cell, and on the C-terminal side (downstream) of the said peptide, a C-terminal (downstream) protein domain capable of recognizing a suitable receptor (auxiliary receptor) located on the aforesaid eukaryotic cell, this peptide being capable of facilitating or inhibiting interaction between the C-terminal (downstream) protein domain and the auxiliary receptor, inhibition of this interaction occurring for as long as the N-terminal (upstream) protein domain has not interacted with the targeted receptor and promotion of interaction between the C-terminal (downstream) protein domain and the auxiliary receptor occurring when the N-terminal (upstream) protein domain has interacted with the targeted receptor.
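The numerical part of claim 1 (a peptide of about 10 to about 200 amino acids in which at least 30% of the residues are proline) can be illustrated with a small composition check. This is a minimal sketch under stated assumptions: the function names `proline_fraction` and `meets_composition_criteria` and the toy peptide are hypothetical, the "about" bounds are treated as hard limits for simplicity, and the regular spacing of the prolines and the resulting β-turn geometry required by the claim are not tested at all.

```python
def proline_fraction(three_letter_seq):
    """Return the fraction of 'Pro' tokens in a whitespace-separated
    three-letter amino acid sequence (0.0 for an empty sequence)."""
    residues = three_letter_seq.split()
    if not residues:
        return 0.0
    return residues.count("Pro") / len(residues)


def meets_composition_criteria(three_letter_seq, min_len=10, max_len=200,
                               min_pro_fraction=0.30):
    """Check only the length window and the minimum proline content.

    Assumption: the 'about 10' and 'about 200' limits are applied as hard
    bounds. Proline spacing and secondary structure are NOT evaluated.
    """
    n = len(three_letter_seq.split())
    return (min_len <= n <= max_len
            and proline_fraction(three_letter_seq) >= min_pro_fraction)


if __name__ == "__main__":
    # Hypothetical toy peptide, not taken from the sequence listing:
    # 12 residues with a proline at every third position.
    toy = " ".join(["Pro", "Gly", "Ala"] * 4)
    print(proline_fraction(toy), meets_composition_criteria(toy))
```

A sequence that passes this check satisfies only the compositional thresholds; whether it forms the polyproline β-turn helix described in the claim would have to be established by other means.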
2. Use of a peptide according to claim 1, in the construction of a glycoprotein with targeting and gene-fusion activity, essentially intact, carried by a viral or non-viral recombinant gene-transfer vector capable of infecting a eukaryotic cell, and this eukaryotic cell has a targeted receptor and an auxiliary receptor permitting facilitation of entry of the aforesaid viral or non-viral vector into the eukaryotic cell, the aforesaid glycoprotein comprising:
- the aforesaid peptide,
- a protein domain on the N-terminal side (upstream) of the said peptide, capable of interacting with the said targeted receptor, this protein domain permitting specific binding of the said gene-transfer vector and
- a protein domain on the C-terminal side (downstream) of the said peptide, capable of interacting with the said auxiliary receptor, this interaction performing the role of auxiliary mechanism of entry of the said gene-transfer vector into the eukaryotic cell,
the process of cell entry of the viral or non-viral recombinant vector into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to take place when the N-terminal (upstream) protein domain has recognized and bound the viral or non-viral recombinant vector with the targeted receptor of the eukaryotic cell, leading, through the agency of the aforesaid peptide, to a mechanism of "unmasking" or of accessibility of the auxiliary receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the aforesaid gene-transfer vector and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced a mechanism of "masking" or of non-accessibility, through the agency of the aforesaid peptide, of the auxiliary receptor with respect to the C-terminal (downstream) protein domain.
3. Use of a peptide according to one of the claims 1 to 2, in the construction of an essentially intact (retro)viral envelope glycoprotein, carried by a recombinant (retro)viral particle capable of infecting a eukaryotic cell, the said envelope glycoprotein being advantageously of polymeric form, and especially of trimeric form, each monomer of the polymeric form being in itself of heterodimer form, the said eukaryotic cell containing a targeted receptor and an auxiliary receptor permitting facilitation of entry of the said (retro)viral particle ((retro)viral receptor) into the eukaryotic cell, the envelope glycoprotein comprising:
- the aforesaid peptide,
- a protein domain on the N-terminal side (upstream) of the said peptide, capable of interacting with the said targeted receptor, this interaction permitting specific binding of the (retro)viral particle and
- a protein domain on the C-terminal side (downstream) of the said peptide, capable of interacting with the said (retro)viral receptor, this interaction performing the role of auxiliary mechanism of entry of the (retro)viral particle into the eukaryotic cell,
the process of cell entry of the recombinant (retro)viral particle into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to take place when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the recombinant (retro)viral particle, leading, through the agency of the aforesaid peptide, to a mechanism of "unmasking" or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the recombinant viral particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced a mechanism of "masking" or of non-accessibility, through the agency of the aforesaid peptide, of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain.
4. Use of a peptide according to any one of the claims 1 to 3, characterized in that the N-terminal (upstream) protein domain is chosen from the following polypeptides:
- single-strand antibodies recognizing cell surface molecules,
- any ligand for a cell surface molecule, especially polypeptide hormones, cytokines, growth factors.
5. Use of a peptide according to any one of the claims 1 to 4, characterized in that the C-terminal (downstream) protein domain corresponds to a (retro)viral envelope glycoprotein, essentially intact, containing the natural binding domain, the functions of fusion and of attachment of the wild-type envelope glycoprotein from which the envelope glycoprotein carried by the recombinant (retro)viral particle is derived.
6. Use of a peptide according to any one of the claims 1 to 5, characterized in that the peptide comes from the envelope glycoprotein of type C retroviruses, and in that the virus is preferably chosen from: the ecotropic MLV virus, the amphotropic MLV virus, the xenotropic MLV virus, the MLV MCF virus, the MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and especially in that the peptide is chosen from those containing or that are constituted of one of the following sequences: PRO(4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb.
7. Use of a peptide according to any one of the claims 1 to 5, characterized in that the peptide is derived or adapted from bovine elastin and is chosen from those containing or that are constituted of one of the following sequences: EL3, EL3-V, EL5.
8. Peptide sequences chosen from those containing or constituted of one of the following sequences:
- PRO(4070A), PRO(MoMLV), PROβ, PRO+, ΔPRO, ΔPROβ, ΔPRO+,
- MOAPRO, MOAΔPRO,
- EMOPRO, EMOPROβ, EMOPRO+, EAPRO, EAPROβ, EAPRO+, EMOΔPRO, EMOΔPROβ, EMOΔPRO+, EAΔPRO, EAΔPROβ, EAΔPRO+, EL3, EL3-V, EL5, AMOEL3, AMOEL3-V, AMOEL5, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb.
9. Peptide sequence containing a peptide of about 10 to about 200, especially about 15 to about 150 amino acids, and preferably about 20, in which at least 30% of the amino acids consist of proline residues, these proline residues being arranged regularly so as to induce turnings of the polypeptide chain at about 180° ("β-turn" or "reverse-turn"), these turnings being regularly spaced and forming a polyproline β-turn helix,
- an N-terminal protein domain (upstream) of the said peptide, capable of reacting with a suitable receptor (targeted receptor) located on a eukaryotic cell, this protein domain permitting specific binding of a recombinant (retro)viral particle containing the said N-terminal protein domain and
- a C-terminal protein domain (downstream) of the said peptide, capable of interacting with a suitable auxiliary (retro)viral receptor ((retro)viral receptor) located on the said eukaryotic cell, this interaction performing the role of auxiliary mechanism of entry of the (retro)viral particle into the said eukaryotic cell,
the process of cell entry of the said recombinant (retro)viral particle into the said eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to take place when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the said recombinant (retro)viral particle, leading, through the agency of the aforesaid peptide, to a mechanism of unmasking or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the recombinant viral particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced a mechanism of masking or of non-accessibility, through the agency of the aforesaid peptide, of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain.
10. Recombinant (retro)viral particle capable of infecting a eukaryotic cell, this cell possessing a targeted receptor and an auxiliary receptor of the aforesaid (retro)viral particle, comprising a substantially intact envelope glycoprotein, especially of polymeric form and advantageously of trimeric form, each monomer of the polymeric form being advantageously itself of heterodimer form, containing a peptide of about 10 to about 200, especially about 15 to about 150 amino acids, and preferably about 20, in which at least 30% of the amino acids are constituted of proline residues, these proline residues being arranged regularly so as to induce turnings of the polypeptide chain at about 180° ("β-turn" or "reverse-turn"), these turnings being regularly spaced and forming a polyproline β-turn helix,
- a protein domain on the N-terminal side (upstream) of the aforesaid peptide, capable of interacting with the aforesaid targeted receptor, this peptide domain permitting specific binding of the (retro)viral particle and
- a protein domain on the C-terminal side (downstream) of the aforesaid peptide, capable of interacting with the aforesaid (retro)viral receptor, this interaction performing the role of auxiliary mechanism of entry of the (retro)viral particle into the eukaryotic cell,
the process of cell entry of the recombinant (retro)viral particle into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to take place when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the recombinant (retro)viral particle, leading through the agency of the aforesaid peptide to a mechanism of unmasking or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the recombinant viral particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced a mechanism of masking or of non-accessibility, through the agency of the aforesaid peptide, of the retroviral receptor with respect to the C-terminal (downstream) protein domain.
11. Recombinant (retro)viral particle according to claim 10, characterized in that the N-terminal (upstream) protein domain is chosen from the following peptides:
- single-strand antibodies recognizing cell surface molecules,
- any ligand for a cell surface molecule, notably polypeptide hormones, cytokines, growth factors.
12. Recombinant (retro)viral particle according to one of the claims 10 or 11, characterized in that the C-terminal (downstream) protein domain corresponds to a polypeptide of (retro)viral origin possessing functions of binding, of fusion and of attachment of the wild-type envelope glycoprotein from which the envelope glycoprotein carried by the recombinant (retro)viral particle is derived, and can originate from the natural domains possessing the functions of binding, of fusion and of attachment of the envelope glycoproteins from retroviruses MLV-A, GALV, FeLV B, or from viruses such as adenoviruses, herpesviruses, AAV (Adeno Associated Virus), or more generally from viral glycoproteins from viruses of eukaryotic origin, especially orthomyxoviruses (such as influenza viruses) or paramyxoviruses (such as SV5).
13. Recombinant (retro)viral particle according to any one of the claims 10 to 12, characterized in that the peptide originates from the envelope glycoprotein of type C retroviruses, and in that the peptide originates advantageously from a virus chosen from: the ecotropic MLV virus, the amphotropic MLV virus, the xenotropic MLV virus, the MLV MCF virus, the MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and especially in that the peptide is chosen from those containing or constituted of one of the following sequences: PRO(4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb.
14. Recombinant (retro)viral particle according to any one of the claims 10 to 13, characterized in that:
- the peptide originates from the envelope glycoprotein of type C retroviruses, and in that the virus is preferably chosen from: the ecotropic MLV virus, the amphotropic MLV virus, the xenotropic MLV virus, the MLV MCF virus, the MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and especially in that the peptide is chosen from those containing or constituted of one of the following sequences: PRO(4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb,
- the N-terminal (upstream) protein domain is chosen from the following peptides:
* single-strand antibodies recognizing cell surface molecules,
* any ligand for a cell surface molecule, notably polypeptide hormones, cytokines, growth factors,
- the C-terminal protein domain corresponds to a polypeptide of (retro)viral origin possessing the functions of binding, of fusion and of attachment of the wild-type envelope glycoprotein from which the envelope glycoprotein carried by the recombinant (retro)viral particle is derived, and can originate from the natural domains possessing the functions of binding, of fusion and of attachment of the envelope glycoproteins from the retroviruses MLV-A, GALV, FeLV B, or from viruses such as adenoviruses, herpesviruses, AAV (Adeno Associated Virus), or more generally viral glycoproteins from viruses of eukaryotic origin, especially orthomyxoviruses (such as influenza viruses) or paramyxoviruses (such as SV5).
15. Recombinant (retro)viral particle according to one of the claims 10 to 14, characterized in that the 5' end of the nucleotide sequence coding for the N-terminal (upstream) protein domain is contiguous with the 3' end of the nucleotide sequence coding for the signal peptide, the 3' end of the nucleotide sequence coding for the N-terminal (upstream) protein domain is contiguous with the 5' end of the nucleotide sequence coding for the peptide, the 3' end of the nucleotide sequence coding for the peptide is contiguous with the 5' end of the nucleotide sequence coding for the C-terminal (downstream) protein domain.
16. Nucleic acid coding for a peptide or for a recombinant particle according to any one of the claims 10 to 15.
17. Method of selective transfer in vitro or ex vivo of a nucleic acid into target eukaryotic cells present among other non-target cells, comprising the administration, to the target and non-target cells, of a recombinant (retro)viral particle according to one of the claims 10 to 15, containing the nucleic acid to be transferred.
18. Pharmaceutical composition containing as active substance a (retro)viral particle according to any one of the claims 10 to 15, and also containing a gene to be transferred, in combination with a physiologically suitable pharmaceutical vehicle.
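The composition criteria recited for the peptide in claims 9 and 10 (a length of about 10 to about 200 amino acids, of which at least 30% are proline residues) can be illustrated with a minimal Python sketch. The function names, the hard numeric bounds standing in for "about", and the example sequence are assumptions made for illustration only; none of them come from the patent. The sketch does not test the regular spacing of the proline residues or the β-turn geometry, which the claims also require.

```python
def proline_fraction(sequence: str) -> float:
    """Return the fraction of residues in a one-letter sequence that are proline (P)."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    return residues.count("P") / len(residues)


def meets_composition_criteria(sequence: str,
                               min_len: int = 10,
                               max_len: int = 200,
                               min_pro: float = 0.30) -> bool:
    """Check only the length and proline-content criteria.

    Assumption: "about 10 to about 200" is treated as the hard bounds 10..200.
    The regular spacing of prolines is NOT checked here.
    """
    n = len(sequence)
    # The length test runs first, so an empty sequence returns False
    # without reaching proline_fraction().
    return min_len <= n <= max_len and proline_fraction(sequence) >= min_pro


# Hypothetical 20-residue sequence (alternating proline and alanine),
# invented for this example; it is not one of the sequences named in the claims.
example = "PA" * 10
print(len(example), proline_fraction(example), meets_composition_criteria(example))
```

A sequence passing this check satisfies only the numerical part of the definition; whether it forms a polyproline β-turn helix would have to be established separately.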
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR9606234A FR2748747B1 (en) | 1996-05-20 | 1996-05-20 | RECOMBINANT VIRAL PARTICLES COMPRISING A PEPTIDE HAVING MASKING AND UNMASKING PROPERTIES WITH RESPECT TO A BIOLOGICAL MECHANISM |
FR96/06234 | 1996-05-20 | ||
PCT/FR1997/000870 WO1997044474A2 (en) | 1996-05-20 | 1997-05-16 | Viral particles which are masked or unmasked with respect to a cell receptor |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2253874A1 (en) | 1997-11-27 |
Family
ID=9492280
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002253874A Abandoned CA2253874A1 (en) | 1996-05-20 | 1997-05-16 | Viral particles which are masked or unmasked with respect to a cell receptor |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP0953053A2 (en) |
JP (1) | JP2000511051A (en) |
AU (1) | AU725632B2 (en) |
CA (1) | CA2253874A1 (en) |
FR (1) | FR2748747B1 (en) |
WO (1) | WO1997044474A2 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2773561A1 (en) * | 1998-01-15 | 1999-07-16 | Centre Nat Rech Scient | Retroviral glycoprotein envelope proline-rich sequences, mutant and chimeric proteins, useful for gene therapy expression of retroviral vectors |
WO2000071578A2 (en) * | 1999-05-20 | 2000-11-30 | Cnrs Centre National De La Recherche Scientifique | New polypeptides and their use for the rescue of fusion defective virus or retrovirus glycoproteins |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
ATE111715T1 (en) * | 1988-04-21 | 1994-10-15 | Uab Research Foundation | BIOELASTOMER WITHOUT ADHESION PROPERTIES FOR USE AT THE WOUND SITE. |
US5637481A (en) * | 1993-02-01 | 1997-06-10 | Bristol-Myers Squibb Company | Expression vectors encoding bispecific fusion proteins and methods of producing biologically active bispecific fusion proteins in a mammalian cell |
WO1994011524A1 (en) * | 1992-11-09 | 1994-05-26 | The United States Government As Represented By The Secretary Of The Department Of Health And Human Services | Targetable vector particles |
US5773577A (en) * | 1994-03-03 | 1998-06-30 | Protein Polymer Technologies | Products comprising substrates capable of enzymatic cross-linking |
GB9412844D0 (en) * | 1994-06-27 | 1994-08-17 | Medical Res Council | Improvements in or relating to therapeutic methods |
US5662885A (en) * | 1994-07-22 | 1997-09-02 | Resolution Pharmaceuticals Inc. | Peptide derived radionuclide chelators |
1996
- 1996-05-20: FR application FR9606234A, published as FR2748747B1 (not active: Expired - Fee Related)
1997
- 1997-05-16: CA application CA002253874A, published as CA2253874A1 (not active: Abandoned)
- 1997-05-16: WO application PCT/FR1997/000870, published as WO1997044474A2 (not active: Application Discontinuation)
- 1997-05-16: AU application AU30356/97A, published as AU725632B2 (not active: Ceased)
- 1997-05-16: EP application EP97925095A, published as EP0953053A2 (not active: Withdrawn)
- 1997-05-16: JP application JP09541705A, published as JP2000511051A (active: Pending)
Also Published As
Publication number | Publication date |
---|---|
AU3035697A (en) | 1997-12-09 |
EP0953053A2 (en) | 1999-11-03 |
JP2000511051A (en) | 2000-08-29 |
FR2748747A1 (en) | 1997-11-21 |
WO1997044474A3 (en) | 1998-03-05 |
FR2748747B1 (en) | 1998-08-07 |
WO1997044474A2 (en) | 1997-11-27 |
AU725632B2 (en) | 2000-10-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP0840797B1 (en) | Methods and means for targeted gene delivery | |
US7820157B2 (en) | Transgene delivering retrovirus targeting collagen exposed at site of tissue injury | |
US6534051B1 (en) | Cell type specific gene transfer using retroviral vectors containing antibody-envelope fusion proteins and wild-type envelope fusion proteins | |
US5858743A (en) | Delivery of nucleic acids | |
US6448390B1 (en) | Stable envelope proteins for retroviral, viral and liposome vectors and use in gene drug therapy | |
WO1995023846A1 (en) | Cell-type specific gene transfer using retroviral vectors containing antibody-envelope and wild-type envelope-fusion proteins | |
US20110020901A1 (en) | Methods of Making Viral Particles Having a Modified Cell Binding Activity and Uses Thereof | |
AU725632B2 (en) | Viral particles which are masked or unmasked with respect to a cell receptor | |
US20030129163A1 (en) | Retroviral vectors including modified envelope escort proteins | |
US6762031B2 (en) | Targeting viral vectors to specific cells | |
US7138272B1 (en) | Gene transfer in human lymphocytes using retroviral scFv cell targeting | |
Harboring | Efficient Gene Delivery to Quiescent Interleukin-2 (IL-2) | |
Yee | Prospects for using retroviral vectors for human gene therapy | |
Chu | Engineered retroviruses enable cell-specific gene transfer through antigen-antibody interaction | |
NZ536851A (en) | Vector particles containing TVTM peptides | |
WO2000071578A2 (en) | New polypeptides and their use for the rescue of fusion defective virus or retrovirus glycoproteins |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request | ||
FZDE | Dead |