CA2253874A1

CA2253874A1 - Viral particles which are masked or unmasked with respect to a cell receptor

Info

Publication number: CA2253874A1
Application number: CA002253874A
Authority: CA
Inventors: Francois-Loic Cosset; Sandrine Valsesia; Stephen J. Russell
Original assignee: Individual
Current assignee: Centre National de la Recherche Scientifique CNRS
Priority date: 1996-05-20
Filing date: 1997-05-16
Publication date: 1997-11-27
Also published as: AU3035697A; EP0953053A2; JP2000511051A; FR2748747A1; WO1997044474A3; FR2748747B1; WO1997044474A2; AU725632B2

Abstract

The invention features the use of a peptide for transferring genes into a eukaryotic target cell, which peptide has about 10 to about 200, in particular about 15 to about 150 amino acids, and advantageously about 20 amino acids, in which at least 30% of the amino acids are proline residues, which proline residues are regularly arranged so as to induce turns of the polypeptide chain of about 180° ("β-turn" or "reverse-turn"), these turns being regularly spaced and assembling into a polyproline β-turn helix, in a polypeptide construction containing, on the N-terminal side (upstream) of the said peptide, an N-terminal (upstream) protein domain capable of recognising a targeted surface molecule or an antigen expressed on a cell surface, in particular an appropriate receptor (targeted receptor) located on the said eukaryotic cell, and, on the C-terminal side (downstream) of the said peptide, a C-terminal (downstream) protein domain capable of recognising an appropriate receptor (auxiliary receptor) located on the said eukaryotic cell, which peptide is capable of facilitating or inhibiting interaction between the C-terminal (downstream) protein domain and the auxiliary receptor, the inhibition of this interaction taking place so long as the N-terminal (upstream) protein domain has not interacted with the targeted receptor, and the facilitation of the interaction between the C-terminal (downstream) protein domain and the auxiliary receptor taking place when the N-terminal (upstream) protein domain has interacted with the targeted receptor.

Description

VIRAL PARTICLES WHICH ARE MASKED OR UNMASKED WITH RESPECT TO A CELL RECEPTOR
The invention relates to recombinant viral particles containing a peptide possessing properties of masking and of unmasking with respect to a biological mechanism, notably with respect to a mechanism of cellular interaction.
The invention also relates to the application of the aforesaid viral particles, notably for cell targeting in gene transfer.
Retroviruses and therefore retrovirus vectors initiate their infectious cycle by recognizing specific cell surface molecules, called retrovirus receptors, with envelope glycoproteins expressed on the surface of the retroviral particles. This recognition then leads to fusion between the viral and cellular membranes, a process that is complex and poorly understood, which is also mediated by a second function of the envelope glycoprotein.
The possibility of altering the specificity of the interaction with the surface of the target cell has been demonstrated previously, notably by means of genetic modifications introduced in the retroviral envelope glycoprotein.
A certain number of works have shown that such modifications do not involve disturbing effects in the complex processes that permit the retroviral envelope glycoprotein to become mature, to be expressed on the cell surface, and to be incorporated selectively in the virions. Moreover, it has now been proved that these modifications can lead to the specific recognition of cells by interaction with the surface molecules corresponding to these polypeptides. Finally, in certain cases, this recognition permits continuation of the infectious cycle and integration of the transgene in the targeted cell with, in the best possible case, an efficiency that is greatly reduced relative to the efficiency that is provided by an unmodified retroviral envelope via its normal retrovirus receptor. Hence it is concluded, on the one hand, that certain target surface molecules cannot be utilized as a receptor for initiating infection and, on the other hand, when they can be so utilized, the processes following the primary interaction take place with extremely low efficiency. These conclusions are true with regard to "one-stage targeting" strategies, i.e. approaches in which the aim of the primary interaction is to lead directly to continuation of the infectious cycle, without the intervention of auxiliary processes.
Retrovirus vectors are now the most-used vectors for gene transfer, and in particular for gene therapy, where stable integration and expression of the transgene are required. Other gene transfer vectors exist (adenovirus vectors, liposomes, vectors derived from herpesviruses, or vectors derived from AAVs) but do not permit stable and efficient integration of the transgene. Whereas most of the gene therapy protocols examined up to now and using retroviral vectors are based on the explantation of the patient's cells, their transgenesis and expansion in vitro, followed by their reimplantation, it would be highly desirable to be able to transfer a therapeutic gene in the in vivo context by means of retroviral vectors. For that, the retroviral particle carrying the therapeutic gene would have to be endowed with certain additional characteristics, and more particularly, the ability to recognize very specifically the target cells of the gene transfer. In fact, the surface molecules that are recognized naturally by the retroviruses for initiation of infection are expressed very widely on most cells.
This does not permit precise discrimination of the cells in which one wishes to effect a gene transfer.
Several works have had the aim of altering the infection tropism of retroviral vectors. Some of these works are based on biochemical modifications of retroviral particles, others on genetic modifications of the retroviral envelope glycoproteins, which guarantees that all the retroviral particles will be altered. In this last context, the works have consisted of "single-stage targeting", for which the modified viral particle attaches to the targeted cell surface molecule, leading to continuation of the infectious cycle.
However, the efficiency of the retroviral vectors altered in this way is very low relative to what is required for obtaining a tool that can be used for purposes of therapy.
It is possible that the development of targeting strategies could not succeed with the "single-stage" system, for reasons already mentioned above: inefficiency of the fusion process of the chimeric envelope glycoproteins after their binding on the targeted surface molecule, and the impossibility of utilizing certain surface molecules as retroviral receptors.
A certain number of human gene therapy protocols will require retroviral vectors that are capable of effecting gene transfer in vivo, by direct inoculation of the recombinant retroviral particles. Among the improvements that this presupposes relative to the retroviral vectors developed until now, we may cite:
- improvement of the infectious titres,
- improvement of the stability of the viral particles in the serum and more generally in the various body fluids,
- the possibility of infecting quiescent cells,
- the possibility of discrimination of the target cells of gene therapy.
The invention has the aim of proposing means for discriminating the target cells of gene therapy. It is essential, for certain applications in gene therapy, to guarantee that gene transfer will only have taken place in the cells to be treated, and not in other categories of cells. For example, when we wish to confer a selective advantage on normal cells with respect to a chemotherapy, it is imperative that the transferred gene conferring this advantage has not been introduced into cancer cells.

The invention relates to a two-stage mechanism, in which the second stage is dependent on realization of the first stage.
The invention relates to an alternative that is beneficial with regard to performance in targeting, particularly because it combines specific recognition of the target cell and entry into the target cell connected with a natural retroviral mechanism, known for its efficiency.
The invention relates more particularly to a two-stage targeting mechanism:
- the first stage permitting recognition of a targeted surface molecule by means of the new N-terminal binding domain, inserted in an envelope glycoprotein,
- the second stage permitting conditional recognition of a normal retroviral receptor via a domain inherent in the initial envelope glycoprotein and thus permitting a relay in the process of entry of the viral particle into the cell, the term "conditional" signifying that the relay in the entry mechanism can only be effected if the viral particle has previously interacted with the initial surface molecule, which in turn guarantees that the infection is truly targeted.
The invention relates to new peptides for carrying out the first stage in a two-stage mechanism, which perform the role of "masking" with respect to the second stage for as long as the first stage has not taken place, and which permit the second stage, i.e. perform the role of "unmasking" with respect to the second stage, if, and only if, the first stage has taken place.
The present invention also relates to the construction of chimeric envelope glycoproteins using these novel peptides.
The invention relates to the use of a peptide for transfer of genes into a target eukaryotic cell, this peptide containing from about 10 to about 200, especially from about 15 to about 150 amino acids, and preferably about 20 amino acids, in which at least 30% of the amino acids are made up of proline residues, these proline residues being regularly arranged so as to induce turns of the polypeptide chain of about 180° ("β-turn" or "reverse-turn"), these turns being evenly spaced and forming a polyproline helix with β-type turns ("polyproline β-turn helix"), in a polypeptide construction containing, on the N-terminal side (upstream) of the said peptide, an N-terminal (upstream) protein region capable of recognizing a targeted surface molecule or an antigen expressed on a cell surface, especially a suitable receptor (targeted receptor) located on the said eukaryotic cell, and on the C-terminal side (downstream) of the said peptide, a C-terminal (downstream) protein region capable of recognizing a suitable receptor (auxiliary receptor) located on the aforesaid eukaryotic cell, this peptide being capable of promoting or inhibiting interaction between the C-terminal (downstream) protein region and the auxiliary receptor, inhibition of this interaction occurring for as long as the N-terminal (upstream) protein domain has not interacted with the targeted receptor and promotion of interaction between the C-terminal (downstream) protein domain and the auxiliary receptor occurring when the N-terminal (upstream) protein domain has interacted with the targeted receptor.
In the case of a peptide of 20 amino acids (ΔPRO, defined below), this polyproline β-turn helix contains four β-turns and therefore 4 turns, and moreover is incompatible with an α-helix or β-sheet secondary structure. Advantageously, the polyproline helix with β-type turns positioned between the two domains of the chimeric protein (N-terminal domain and auxiliary domain) possesses intrinsically: 1) an elastomeric force, 2) the property of self-assembly with other polyproline helices, probably in connection with the trimeric nature of the envelope, 3) the property of transmitting, to the auxiliary domain, a distortion that is induced by binding of the N-terminal domain with its receptor, causing activation of the auxiliary domain.
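By way of illustration only, the two numerical criteria stated above (a length of about 10 to about 200 amino acids, and at least 30% proline) can be checked on a one-letter amino acid sequence with the following minimal Python sketch. The function names and the example sequence are hypothetical and are not taken from the invention. The sketch reports the spacing between prolines but does not predict whether a sequence actually folds into a polyproline helix of the kind described, which cannot be decided from composition alone.

```python
def proline_fraction(peptide: str) -> float:
    """Fraction of residues that are proline (one-letter code 'P')."""
    seq = peptide.upper()
    return seq.count("P") / len(seq) if seq else 0.0


def proline_spacings(peptide: str) -> list[int]:
    """Distances between consecutive proline positions in the sequence."""
    positions = [i for i, aa in enumerate(peptide.upper()) if aa == "P"]
    return [b - a for a, b in zip(positions, positions[1:])]


def meets_stated_criteria(peptide: str, min_len: int = 10,
                          max_len: int = 200, min_pro: float = 0.30) -> bool:
    """Check only the length window and proline content quoted in the text."""
    return (min_len <= len(peptide) <= max_len
            and proline_fraction(peptide) >= min_pro)


# Hypothetical 20-residue sequence, NOT one of the peptides of the invention.
example = "GPSPAPTPGPSPAPTPGPSP"
print(len(example), proline_fraction(example), proline_spacings(example))
print(meets_stated_criteria(example))
```

A constant spacing in the output of `proline_spacings` is only a crude indicator of the "regular arrangement" of prolines required by the text.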
The invention also relates in general to any two-stage mechanism, in which the second stage can only be effected if the first stage has taken place, and relates for example to an enzymatic mechanism involving a chimeric protein which is only to occur if the chimeric protein is able to recognize its substrate.
The expression "N-terminal (upstream) protein domain capable of recognizing a targeted surface molecule, or an antigen expressed on a cell surface", means that:
1) the interaction between this N-terminal protein domain and the targeted surface molecule can be characterized by a dissociation constant (of nanomolar order with respect to interaction between wild-type retroviral envelope glycoprotein and retroviral receptor);
2) the soluble form of this N-terminal protein domain (i.e. not associated in the construction of the chimeric envelope glycoprotein) possesses binding characteristics similar to this same protein domain when it is inserted at the N-terminal position in the chimeric envelope glycoprotein;
3) the chimeric envelope glycoprotein containing the N-terminal protein domain can be characterized according to classical techniques of virology (e.g. binding test; cf "Examples").
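For reference, a dissociation constant of the kind mentioned in point 1) is conventionally defined as follows for a simple one-to-one equilibrium between the N-terminal protein domain D and the targeted surface molecule R. The symbols D, R, K_d and θ (fraction of surface molecules occupied) are notations assumed here for illustration; the text itself does not specify a binding model.

```latex
K_d = \frac{[D]\,[R]}{[DR]}, \qquad
\theta = \frac{[DR]}{[R] + [DR]} = \frac{[D]}{[D] + K_d}
```

Under this assumption, a free domain concentration equal to K_d (for example 1 nM for a K_d of 1 nM) corresponds to half of the surface molecules being occupied.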
The following may be mentioned as examples of targeted surface molecule or of antigen expressed on a cell surface:
- markers for differentiating the various haematopoietic lineages, in particular markers expressed on immature cells and/or haematopoietic stem cells (example: CD34),
- markers expressed on tumour cells (example: carcino-embryonic antigens),
- markers present specifically on various differentiated tissues (example: receptors of growth factors, of peptide hormones).

As an example of a targeted surface molecule, we may mention in particular a receptor, which will be designated as targeted receptor hereinafter.
For convenience of terminology, the expression "targeted receptor" will be used in the following to encompass any targeted surface molecule or any antigen expressed on a cellular surface.
The expression "C-terminal (downstream) protein domain capable of recognizing a suitable receptor (auxiliary receptor)" means that the C-terminal protein domain can interact with the auxiliary receptor, this interaction being characterized by a dissociation constant which is of nanomolar order if the C-terminal protein domain is derived from a retroviral envelope glycoprotein and if the auxiliary receptor is the retroviral receptor used by this same glycoprotein, this interaction permitting the triggering of the fusion process in a mechanism that is strictly similar to the natural process, i.e. outside of the context of a chimeric envelope glycoprotein.
The peptide that is the subject of the invention is such that, positioned between two protein domains (an N-terminal protein domain relative to the said peptide and a C-terminal protein domain relative to the said peptide), it can induce the function of the C-terminal protein domain (for example binding if that is the function of this C-terminal domain) if, and only if, the N-terminal protein domain has been mobilized in its function (for example binding).
Non-induction of the function of the C-terminal protein domain by the peptide of the invention corresponds to the mechanism of "masking" of the peptide of the invention, whereas induction of the function of the C-terminal protein domain by the peptide of the invention corresponds to the mechanism of "unmasking" of the peptide of the invention.
That is why the peptide of the invention will also be designated hereinafter as "masking/unmasking peptide".
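The conditional logic of masking and unmasking described above can be summarized, purely as an idealized sketch, by the following Boolean model in Python. The class and function names are hypothetical, and real receptor binding is a matter of affinities and probabilities rather than true/false values; the sketch only captures the "if, and only if" dependency of the second stage on the first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    has_targeted_receptor: bool   # recognized by the N-terminal domain
    has_auxiliary_receptor: bool  # recognized by the C-terminal domain


def c_terminal_domain_unmasked(cell: Cell) -> bool:
    """Stage 1: the peptide unmasks the C-terminal domain only if the
    N-terminal domain has bound its targeted receptor."""
    return cell.has_targeted_receptor


def entry_permitted(cell: Cell) -> bool:
    """Stage 2: the C-terminal domain can engage the auxiliary receptor
    only once it has been unmasked by stage 1."""
    return c_terminal_domain_unmasked(cell) and cell.has_auxiliary_receptor


for cell in (Cell(True, True), Cell(False, True), Cell(True, False)):
    print(cell, "->", entry_permitted(cell))
```

In this model a cell that carries the auxiliary receptor but not the targeted receptor is not entered, which is the discrimination the two-stage mechanism is intended to provide.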
The invention relates to the use of a peptide according to the invention, in the construction of a glycoprotein with targeting and fusogenic activity, essentially intact, carried by a viral or non-viral recombinant gene-transfer vector capable of infecting a eukaryotic cell, the said eukaryotic cell possessing a targeted receptor and an auxiliary receptor permitting facilitation of entry of the said viral or non-viral vector into the eukaryotic cell, the aforesaid glycoprotein comprising:
- the aforesaid peptide,
- a protein domain on the N-terminal (upstream) side of the said peptide, capable of interacting with the above-mentioned targeted receptor, this protein domain permitting specific binding of the aforesaid gene-transfer vector, and
- a protein domain on the C-terminal (downstream) side of the said peptide, capable of interacting with the aforesaid auxiliary receptor, this interaction performing the role of auxiliary mechanism of entry of the aforesaid gene-transfer vector into the eukaryotic cell,
the process of cell entry of the viral or non-viral recombinant vector into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to occur when the N-terminal (upstream) protein domain has recognized the targeted receptor of the eukaryotic cell and bound the viral or non-viral recombinant vector to it, leading, by means of the aforesaid peptide, to a mechanism of "unmasking" or of accessibility of the auxiliary receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the aforesaid gene-transfer vector and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, a mechanism of "masking" or of non-accessibility of the auxiliary receptor with respect to the C-terminal (downstream) protein domain being produced by means of the aforesaid peptide.
The expression "glycoprotein with targeting and fusogenic activity" denotes a glycoprotein which is:
1) capable of being incorporated efficiently on (retro)viral particles carrying a transgene,
2) capable of specifically recognizing the targeted cell-surface molecule and of specifically redirecting the binding of the (retro)viral particle which carries it to this molecule,
3) capable of causing fusion, after fixation on the molecular target, of the membrane of the (retro)viral particle and the cytoplasmic membrane of the cell, according to the mechanism used naturally by the (retro)virus from which the envelope glycoprotein was derived.
The expression "substantially intact" refers to a viral glycoprotein that retains all the determinants necessary for preserving the post-translational processes: oligomerization, the properties of viral incorporation and of fusion, as required. However, certain changes (such as mutations, deletions, additions) can be made to the glycoprotein without significantly affecting its functions, and the glycoproteins containing these minor changes are regarded as substantially intact for the needs of the invention.
In particular, the glycoprotein may lack some amino acids (for example about 1 to 10), especially at the N-terminal end, but will generally be of the same size as the wild-type protein and possess essentially the same biological properties as the wild-type protein.
The expression "viral recombinant gene-transfer vector" means any virus capable of infecting cells of the eukaryotic type, and preferably a virus that is suitable for gene therapy, such as an adenovirus or a retrovirus (for example a type C retrovirus).
The expression "non-viral recombinant gene-transfer vector" means macromolecular complexes combining the DNA containing the transferred gene, its regulatory sequences, and molecules belonging to the class of lipids, carbohydrates, or proteins, which possess functional properties capable of: 1) targeting deposition of DNA on the surface of the target cell, 2) introducing this DNA into the targeted cell, and 3) introducing this DNA into the nucleus of the targeted cell.
The expression "process of cell entry of the viral recombinant gene-transfer vector" means all of the events leading to introduction of the transported gene into the cytoplasm of the targeted cell following initial contact between the surface of this cell and the gene-transfer vector.
As an example, for retroviral vectors, in relation to a defined cellular target for which a "targetable" surface molecule is known (i.e. sufficiently specific relative to the other tissues) and a ligand for the surface molecule (ligand or single-chain antibody), a gene coding for the envelope glycoprotein targeting this surface molecule can be constructed genetically. This is accomplished by fusing (from N to C-terminal) a signal peptide, the ligand, the "masking/unmasking" peptide, and the rest of the retroviral envelope. An expression vector for this chimeric molecule is inserted into a "semi-transcomplementing" cell line expressing the gag and pol proteins of the MLV virus (coding for the viral capsid and the enzymes of replication of the retrovirus). A "transcomplementing" line is obtained, which can then be used for producing retroviral vectors if a plasmid carrying this retroviral vector is additionally introduced, as occurs with the conventional transcomplementing lines expressing normal retroviral envelopes.
The invention also relates to the use of a peptide according to the invention, in the construction of an essentially intact (retro)viral envelope glycoprotein, carried by a recombinant (retro)viral particle capable of infecting a eukaryotic cell, the said envelope glycoprotein preferably being of polymeric form, and especially of trimeric form, each monomer of the polymeric form being in its turn of heterodimer form, the said eukaryotic cell possessing a targeted receptor and an auxiliary receptor permitting facilitation of entry of the aforesaid (retro)viral particle ((retro)viral receptor) into the eukaryotic cell, the envelope glycoprotein comprising:
- the aforesaid peptide,
- a protein domain on the N-terminal side (upstream) of the aforesaid peptide, capable of interacting with the aforesaid targeted receptor, this interaction permitting specific binding of the (retro)viral particle, and
- a protein domain on the C-terminal side (downstream) of the aforesaid peptide, capable of interacting with the aforesaid (retro)viral receptor, this interaction performing the role of auxiliary mechanism of entry of the (retro)viral particle into the eukaryotic cell,
the process of cell entry of the (retro)viral recombinant particle into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to occur when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the (retro)viral recombinant particle, leading, via the aforesaid peptide, to a mechanism of "unmasking" or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the viral recombinant particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, a mechanism of "masking" or of non-accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain being produced by means of the aforesaid peptide.
The (retro)viral envelope glycoproteins are trimers of heterodimers with surface subunit (SU) and transmembrane subunit (TM). This concept of trimerization is fundamental for the functionality of the (retro)viral envelope. The envelope glycoproteins of the invention are preferably of trimeric form.
According to an advantageous embodiment of the invention, the N-terminal (upstream) protein domain is chosen from the following polypeptides:
- single-chain antibodies recognizing cell-surface molecules,
- any ligand for a cell-surface molecule, especially polypeptide hormones, cytokines, growth factors.
According to an advantageous embodiment of the invention, the C-terminal (downstream) protein domain corresponds to a (retro)viral envelope glycoprotein, essentially intact, including the natural binding domain, the functions of fusion and of attachment of the wild-type envelope glycoprotein from which is derived the envelope glycoprotein carried by the recombinant (retro)viral particle.
According to an advantageous embodiment of the invention, the peptide originates from the envelope glycoprotein of type C retroviruses, the virus preferably being chosen from: the ecotropic MLV virus, the amphotropic MLV virus, the xenotropic MLV virus, the MCF MLV virus, the MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and the peptide is especially chosen from those containing or consisting of one of the following sequences: PRO (4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb.
The invention relates to the use of a peptide derived or adapted from bovine elastin and chosen from those containing or consisting of one of the following sequences: EL3, EL3-V, EL5.
The invention also relates to peptide sequences chosen from those containing or consisting of one of the following sequences:
- PRO (4070A), PRO(MoMLV), PROβ, PRO+, ΔPRO, ΔPROβ, ΔPRO+, MOAPRO, MOAΔPRO,
- EMOPRO, EMOPROβ, EMOPRO+, EAPRO, EAPROβ, EAPRO+, EMOΔPRO, EMOΔPROβ, EMOΔPRO+, EAΔPRO, EAΔPROβ, EAΔPRO+, AMOEL3, AMOEL3-V, AMOEL5.
PRO (4070A), PRO(MoMLV), PROβ, PRO+, ΔPRO, ΔPROβ, ΔPRO+, EL3, EL3-V, EL5 are masking/unmasking peptides of the invention.
AMOPRO, AMOΔPRO, AMOEL3, AMOEL3-V, AMOEL5 correspond to Ram-1 targeting envelopes.
MOAPRO, MOAΔPRO correspond to Rec-1 targeting envelopes.
EMOPRO, EMOPROβ, EMOPRO+, EAPRO, EAPROβ, EAPRO+, EMOΔPRO, EMOΔPROβ, EMOΔPRO+, EAΔPRO, EAΔPROβ, EAΔPRO+ correspond to EGFR targeting envelopes.
The invention also relates to a polypeptide sequence containing:
- a peptide of about 10 to about 200, especially from about 15 to about 150 amino acids, and preferably about 20, in which at least 30% of the amino acids consist of proline residues, and these proline residues are regularly arranged so as to induce turns of the polypeptide chain of about 180° ("β-turn" or "reverse-turn"), these turns being regularly spaced and assembling themselves into a polyproline β-turn helix,
- an N-terminal protein domain (upstream) of the aforesaid peptide, capable of reacting with a suitable receptor (targeted receptor) located on a eukaryotic cell, and this protein domain permits specific binding of a recombinant (retro)viral particle containing the said N-terminal protein domain, and
- a C-terminal protein domain (downstream) of the aforesaid peptide, capable of interacting with a suitable auxiliary (retro)viral receptor ((retro)viral receptor) located on the said eukaryotic cell, and this interaction performs the role of auxiliary mechanism of entry of the (retro)viral particle into the said eukaryotic cell,
the process of cell entry of the said recombinant (retro)viral particle into the said eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to occur when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the said recombinant (retro)viral particle, leading, by means of the aforesaid peptide, to a mechanism of unmasking or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the recombinant viral particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced, by means of the aforesaid peptide, a mechanism of masking or of non-accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain.
The invention also relates to a recombinant (retro)viral particle capable of infecting a eukaryotic cell, this cell containing a targeted receptor and an auxiliary receptor of the aforesaid (retro)viral particle, including a substantially intact envelope glycoprotein, especially of polymeric form and preferably of trimeric form, each monomer of the polymeric form preferably being itself of heterodimer form, containing:
- a peptide of about 10 to about 200, especially of about 15 to about 150 amino acids, and preferably of about 20, in which at least 30% of the amino acids are made up of proline residues, these proline residues being regularly arranged so as to induce turns of the polypeptide chain of about 180° ("β-turn" or "reverse-turn"), these turns being regularly spaced and assembling themselves into a polyproline β-turn helix,
- a protein domain on the N-terminal side (upstream) of the aforesaid peptide, capable of interacting with the aforesaid targeted receptor, this peptide region permitting specific binding of the (retro)viral particle, and
- a protein domain on the C-terminal side (downstream) of the aforesaid peptide, capable of interacting with the aforesaid (retro)viral receptor, this interaction performing the role of auxiliary mechanism of entry of the (retro)viral particle into the eukaryotic cell,
the process of cell entry of the recombinant (retro)viral particle into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to occur when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the recombinant (retro)viral particle, leading, via the aforesaid peptide, to a mechanism of unmasking or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the recombinant viral particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced, via the aforesaid peptide, a mechanism of masking or of non-accessibility of the retroviral receptor with respect to the C-terminal (downstream) protein domain.
The invention also relates to a recombinant (retro)viral particle characterized in that the N-terminal (upstream) protein domain is chosen from the following peptides:
- single-chain antibodies recognizing cell surface molecules,
- any ligand for a cell surface molecule, especially polypeptide hormones, cytokines, growth factors.
The invention also relates to a recombinant (retro)viral particle characterized in that the C-terminal (downstream) protein domain corresponds to a polypeptide of (retro)viral origin possessing functions of binding, of fusion and of attachment of the wild-type envelope glycoprotein from which is derived the envelope glycoprotein carried by the recombinant (retro)viral particle, and can originate from natural regions possessing functions of binding, of fusion and of attachment of the envelope glycoproteins derived from retroviruses MLV-A, GALV, FeLV B, or viruses such as adenoviruses, herpesviruses, AAV (Adeno Associated Virus), or more generally from viral glycoproteins derived from viruses of eukaryotic origin, especially orthomyxoviruses (such as influenza viruses) or paramyxoviruses (such as SV5).
The invention also relates to a recombinant (retro)viral particle characterized in that the peptide is derived from the envelope glycoprotein of type C retroviruses, and in that the peptide is preferably derived from a virus chosen from: ecotropic MLV virus, amphotropic MLV virus, xenotropic MLV virus, MLV MCF virus, MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and especially in that the peptide is chosen from those containing or consisting of one of the following sequences: PRO (4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb.
The invention also relates to a recombinant (retro)viral particle characterized in that:
- the peptide originates from the envelope glycoprotein of type C retroviruses, and in that the virus is preferably chosen from: ecotropic MLV virus, amphotropic MLV virus, xenotropic MLV virus, MLV MCF virus, MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and especially in that the peptide is chosen from those containing or consisting of one of the following sequences: PRO (4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb,
- the N-terminal (upstream) protein domain is chosen from the following peptides:
* single-chain antibodies recognizing cell surface molecules,
* any ligand for a cell surface molecule, especially polypeptide hormones, cytokines, growth factors,
- the C-terminal protein domain corresponds to a polypeptide of (retro)viral origin possessing functions of binding, fusion and attachment of the wild-type envelope glycoprotein from which is derived the envelope glycoprotein carried by the recombinant (retro)viral particle, and can originate from natural regions possessing functions of binding, of fusion and of attachment of the envelope glycoproteins derived from retroviruses MLV-A, GALV, FeLV B, or from viruses such as adenoviruses, herpesviruses, AAV (Adeno Associated Virus), or more generally from viral glycoproteins derived from viruses of eukaryotic origin, especially orthomyxoviruses (such as influenza viruses) or paramyxoviruses (such as SV5).
The invention also relates to a recombinant (retro)viral particle characterized in that the 5' end of the nucleotide sequence coding for the N-terminal (upstream) protein domain is contiguous with the 3' end of the nucleotide sequence coding for the signal peptide, the 3' end of the nucleotide sequence coding for the N-terminal (upstream) protein domain is contiguous with the 5' end of the nucleotide sequence coding for the peptide, and the 3' end of the nucleotide sequence coding for the peptide is contiguous with the 5' end of the nucleotide sequence coding for the C-terminal (downstream) protein domain.
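The 5' to 3' arrangement of the coding segments described above can be sketched as follows. This is only an illustration of the ordering; the segment names and the short placeholder sequences are hypothetical and are not taken from the constructs of the invention.

```python
# Order of the coding segments, 5' -> 3', as stated in the paragraph above.
SEGMENT_ORDER = [
    "signal_peptide",
    "n_terminal_domain",   # targeting domain (ligand, single-chain antibody, ...)
    "peptide",             # proline-rich masking/unmasking peptide
    "c_terminal_domain",   # envelope-derived binding/fusion domain
]


def assemble_coding_sequence(segments: dict) -> str:
    """Concatenate the segments so that each 3' end abuts the next 5' end."""
    missing = [name for name in SEGMENT_ORDER if name not in segments]
    if missing:
        raise KeyError(f"missing segments: {missing}")
    return "".join(segments[name] for name in SEGMENT_ORDER)


def junctions() -> list:
    """Return the (upstream, downstream) pairs that must be contiguous."""
    return list(zip(SEGMENT_ORDER[:-1], SEGMENT_ORDER[1:]))


# Hypothetical placeholder sequences, for illustration only.
example = {
    "signal_peptide": "ATGGCG",
    "n_terminal_domain": "AATAGT",
    "peptide": "CCACCC",
    "c_terminal_domain": "GCTTCG",
}
print(assemble_coding_sequence(example))
print(junctions())
```

The three pairs returned by `junctions()` correspond to the three contiguity conditions listed in the paragraph.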
The invention also relates to a nucleic acid coding for a peptide or for a recombinant particle according to the invention.
The invention also relates to a method of selective in vitro or ex vivo transfer of a nucleic acid into eukaryotic target cells present among other non-target cells, comprising the administration to the target and non-target cells, of a recombinant (retro)viral particle according to the invention, containing the nucleic acid to be transferred.
The invention also relates to a pharmaceutical composition containing, as active substance, a (retro)viral particle according to the invention, and also containing a gene to be transferred, together with a physiologically suitable pharmaceutical vehicle.
With regard to genes to be transferred that are important for gene therapy, examples are IFN, IL2, p53, VEGF, TNF, CFTR, HSV-TK, lacZ, GFP, genes of various cytokines, other types of suicide genes including conditional suicide genes, other genes with antiviral activity, other genes with antitumour activity, other marker genes and any gene for therapy of a mono- or multigenic disease. As an example, the pathologies most specifically involved are: most mono- or multigenic diseases (mucoviscidosis, myopathy, lysosomal diseases, various forms of cancer, viral diseases (AIDS), etc.).
For a proper understanding of the mechanism of the invention (see Fig. 1), we must bear in mind that the envelope glycoproteins according to the invention (also denoted "chimeric envelopes") possess, in addition to a supplementary recognition region, the functions corresponding to their own particular regions; that is (see Fig. 2): 1) the natural binding domain located in the N-terminal part of the surface subunit (SU) of the wild-type envelope glycoprotein, and therefore just downstream of the supernumerary binding domain, and 2) the fusion domain located in the C-terminal part of the SU subunit and in the transmembrane subunit (TM) of the envelope glycoprotein complex. For the chimeric envelopes constructed previously (EMO and AMO envelopes, for example), on the basis of the general structure shown diagrammatically in Fig. 2, the natural binding domain is functional. If the retroviral receptor that it recognizes is expressed at the surface of the target cell, then this domain will recognize it and will permit infection to proceed. There will then be no possibility of specific targeting, even if a surface molecule specifically recognized by the supernumerary binding domain is also expressed.
However, depending on the peptide inserted between the supernumerary binding domain and the natural binding domain, it is possible for the functionality of the natural binding domain to be adjusted considerably and, for some of these peptides, there can be effective prevention of its accessibility for recognition of the retroviral receptor (first action). It will be possible for this site to be unmasked, and hence rendered accessible to interaction with the normal retroviral receptor, if and only if the supernumerary binding domain has previously interacted with the targeted surface molecule. This second action is also mediated by the peptide separating the two domains. Here the normal retroviral receptor plays the role of auxiliary molecule.
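The masking/unmasking logic described above can be summarized as a small truth-table model. This is a purely qualitative sketch, not a quantitative model of infection; the class and function names are hypothetical, and binding is treated as all-or-nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    has_targeted_molecule: bool    # surface molecule bound by the supernumerary domain
    has_auxiliary_receptor: bool   # normal retroviral receptor (auxiliary molecule)


def natural_domain_accessible(peptide_masks: bool, targeted_binding_done: bool) -> bool:
    """First action: a masking peptide hides the natural binding domain.
    Second action: the domain is unmasked only after targeted binding."""
    return (not peptide_masks) or targeted_binding_done


def entry_possible(cell: Cell, peptide_masks: bool) -> bool:
    """Entry requires an accessible natural domain and the auxiliary receptor."""
    targeted_binding_done = cell.has_targeted_molecule
    accessible = natural_domain_accessible(peptide_masks, targeted_binding_done)
    return accessible and cell.has_auxiliary_receptor


target_cell = Cell(has_targeted_molecule=True, has_auxiliary_receptor=True)
non_target_cell = Cell(has_targeted_molecule=False, has_auxiliary_receptor=True)

print(entry_possible(target_cell, peptide_masks=True))       # masked envelope, target cell
print(entry_possible(non_target_cell, peptide_masks=True))   # masked envelope, non-target cell
print(entry_possible(non_target_cell, peptide_masks=False))  # unmasked envelope, non-target cell
```

With a non-masking spacer, the model lets the particle enter any cell bearing the retroviral receptor, which is the loss of specific targeting described for the earlier chimeric envelopes.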
Symbols on the diagrams:
- Fig. 1 represents the two-stage entry process of the targeting viral particle. The viral particles are generated (A) with targeting envelope glycoproteins composed of an N-terminal domain (ligand, single-chain antibody, etc.), the masking/unmasking peptide, and a C-terminal domain (B). The stages giving rise to introduction of the virion into the targeted cell involve a mechanism that is coordinated by the masking/unmasking peptide (C).
- Fig. 2 is a schematic representation of some of the targeting envelopes investigated. The position of some functional regions is shown. Vertical arrows: sites of proteolytic cleavage. SU: surface subunit, TM: transmembrane subunit, SP: signal peptide, PRO: polyproline region, T: transmembrane domain, Ram-1 ligand: binding domain for the amphotropic receptor, Rec-1 ligand: binding domain for the ecotropic receptor, EGF: epidermal growth factor. Dark grey boxes: sequences derived from the env gene of MoMLV. Light grey boxes: sequences derived from the env gene of MLV-4070A. White boxes: other sequences derived from MLVs. Black boxes: spacer peptides derived from the polyproline region. All the env genes are expressed from the same promoter (LTR) and polyadenylation signal (pA), from the sub-genomic mRNA using the retroviral splicing sites, donor (SD) and acceptor (SA), with an identical intron sequence of 190 nt containing the end of the pol gene (ΔPOL). The position of some restriction sites is shown.
- Fig. 3 shows the sequence of the spacer peptides and of the binding domains investigated. (A) Sequence of the spacer peptides in the series AMO. The binding domain at Ram-1 (AS208) is fused with the various spacer peptides, and the whole is fused with codon 7 of the SU of the envelope of the MoMLV. (B) Sequence of the spacer peptides in the series MOA. The binding domain at Rec-1 is fused with the various spacer peptides, and the whole is fused with codon 5 of the SU of the envelope of the amphotropic MLV. (C) Sequence of the spacer peptides in the series EMO and EA. The binding domain EGF is fused with the various spacer peptides, and the whole is fused with codon 5 of the SU of the envelope of the amphotropic MLV or with codon 7 of the SU of the envelope of the MoMLV.
- Fig. 4 shows detection of membrane expression of the envelopes of the EMO series. Populations of transfected cells, selected using phleomycin, are stained with (black histograms) or without (white histograms) anti-hEGF antibodies, then with FITC-conjugated anti-mouse IgG antibodies.
- Fig. 5 shows expression and viral incorporation of the chimeric envelopes of the AMO series. Immunoblots on lysates of TELCeB6 cells transfected with the plasmids expressing the chimeric envelopes (see Fig. 2 and Fig. 3A) and on pellets of viral particles purified by ultracentrifugation. The immunoblots are probed with an anti-SU antiserum (top part) or with an anti-p30-CA antiserum (bottom part, size less than 46 kDa). The positions of the p30-CA (CA) and, for the MO wild-type envelopes, of the precursor (PR) and of the surface protein (SU) of the envelope complex are shown.
- Fig. 6 shows binding tests on human cells of the envelopes of series EMO (A) and AMO (B). The background noise of fluorescence is provided by incubation of human cells with the ecotropic envelope (white histograms).
- Fig. 7 shows the amino-acid and nucleotide sequence of PRO(4070A).
- Fig. 8 shows the amino-acid and nucleotide sequence of PRO(MoMLV).
- Fig. 9 shows the amino-acid and nucleotide sequence of PROβ(MoMLV).
- Fig. 10 shows the amino-acid and nucleotide sequence of PRO+(4070A).
- Fig. 11 shows the amino-acid and nucleotide sequence of ΔPRO.
- Fig. 12 shows the amino-acid and nucleotide sequence of ΔPROβ.
- Fig. 13 shows the amino-acid and nucleotide sequence of ΔPRO+.
- Fig. 14 shows the amino-acid and nucleotide sequence of AMOPRO.
- Fig. 15 shows the amino-acid and nucleotide sequence of AMOΔPRO.
- Fig. 16 shows the amino-acid and nucleotide sequence of MOAPRO.
- Fig. 17 shows the amino-acid and nucleotide sequence of MOAΔPRO.
- Fig. 18 shows the amino-acid and nucleotide sequence of EMOPRO.
- Fig. 19 shows the amino-acid and nucleotide sequence of EMOPROβ.
- Fig. 20 shows the amino-acid and nucleotide sequence of EMOPRO+.
- Fig. 21 shows the amino-acid and nucleotide sequence of EAPRO.
- Fig. 22 shows the amino-acid and nucleotide sequence of EAPROβ.
- Fig. 23 shows the amino-acid and nucleotide sequence of EAPRO+.
- Fig. 24 shows the amino-acid and nucleotide sequence of EMOΔPRO.
- Fig. 25 shows the amino-acid and nucleotide sequence of EMOΔPROβ.
- Fig. 26 shows the amino-acid and nucleotide sequence of EMOΔPRO+.
- Fig. 27 shows the amino-acid and nucleotide sequence of EAΔPRO.
- Fig. 28 shows the amino-acid and nucleotide sequence of EAΔPROβ.
- Fig. 29 shows the amino-acid and nucleotide sequence of EAΔPRO+.
- Fig. 30 shows the amino-acid and nucleotide sequence of AMOEL3.
- Fig. 31 shows the amino-acid and nucleotide sequence of EL3.
- Fig. 32 shows the amino-acid and nucleotide sequence of AMOEL3-V.
- Fig. 33 shows the amino-acid and nucleotide sequence of EL3-V.
- Fig. 34 shows the amino-acid and nucleotide sequence of AMOELS.
- Fig. 35 shows the amino-acid and nucleotide sequence of ELS.
- Fig. 36 shows the amino-acid and nucleotide sequence of ΔPROΔ-beta.
- Fig. 37 shows the amino-acid and nucleotide sequence of ΔPROΔ-int.
- Fig. 38 shows the amino-acid and nucleotide sequence of ΔPROΔ-vrb.
- Fig. 39 shows the amino-acid and nucleotide sequence of PRO-beta.
- Fig. 40 shows the amino-acid and nucleotide sequence of PRO-int.
- Fig. 41 shows the amino-acid and nucleotide sequence of PRO-vrb.

EXAMPLES:
EXAMPLE 1:
Retroviruses utilize a certain number of cell surface molecules, called viral receptors, for initiating the infectious process (23). Apart from some notable exceptions, especially in the case of human immunodeficiency viruses, most of the receptors utilized by the other retroviruses, and in particular by the type C mammalian retroviruses, are distributed over most cell types of the host organism. For example, the amphotropic murine leukemia virus (MLV-A) is capable of infecting the majority of mammalian cells because its receptor, the phosphate transporter Ram-1, is expressed on almost all cells.
The type C mammalian retroviruses are currently used for making retroviral vectors, in particular for purposes of gene transfer in humans, in gene therapy. Certain gene therapy procedures would be facilitated if the retroviral vectors were capable of very accurately recognizing the true target cells of gene transfer. For this, a certain number of research groups, including ours, have developed various strategies aiming to modify the recognition between the viral particle and the cell surface. This interaction essentially involves the retroviral envelope glycoprotein; it therefore seems logical to make genetic changes to this protein so as to enable it to recognize cell surface molecules specifically expressed on the target cells of gene transfer.
Two types of strategies permitting such changes have been developed recently.
In the first strategy, the natural binding domain of the retroviral envelope glycoprotein for its receptor was altered by insertion or substitution of peptides of reduced size that are able to bind a cell surface molecule. This work has demonstrated the feasibility of cell targeting for gene transfer (20).
In the second approach, polypeptides (ligands, single-chain antibodies) capable of binding various cell surface molecules were inserted at the N-terminal end of the SU subunit of the envelope glycoprotein (6) (10) (13) (15) (21). In general, investigation of the virions generated with these various types of targeting envelopes showed that it was possible for the binding of viral particles to be redirected specifically and efficiently towards new surface molecules. Some factors limiting the efficacy of targeting were also identified. The first seems to depend on physiological properties of the surface molecule targeted (dimerization, internalization, intracellular transport ("trafficking") process) (6); the second is connected with the low intrinsic fusogenic capacity of the chimeric envelopes generated by N-terminal insertion of ligands (6) (21). It was observed that this low fusogenic capacity can be partially overcome by introducing a spacer peptide between the new binding domain and the envelope (21). However, the best infectious titres obtained are 100 times lower than can be obtained with retroviral vectors bearing a wild-type envelope. Moreover, it is possible that these results, obtained in a particular targeting model (targeting of Ram-1), cannot be extended to other types of targeting envelope glycoproteins. It therefore seemed essential to develop alternative strategies to solve these problems.
Furthermore, a general finding made with the targeting envelope glycoproteins generated by N-terminal insertions is that the natural binding domain of the supporting envelope is always functional. To the extent that the target cells are human cells in gene therapy, this functionality of the natural binding domain does not pose problems of "background noise" of infection, because the supporting glycoprotein used is the ecotropic envelope of the MoMLV virus, which does not recognize a receptor on the cells of higher mammals. However, it seemed interesting to characterize these chimeric envelope glycoproteins that are able to recognize two different surface molecules, to see what influence the spacer peptide could have in this recognition, and to assess the relative contributions of the two types of interaction in the infectious process.
These observations, which form the subject of the work described below, led to the development of a two-stage targeting strategy, involving first a specific recognition by the ligand inserted at the N-terminal end of the targeting glycoprotein, and then an auxiliary mechanism which, by means of the natural retroviral receptor, facilitates entry of the virus specifically bound to the correct cellular target.
To avoid any problem of background noise of infection connected with direct interaction between the natural binding domain and the natural retroviral receptor, masking/unmasking spacer peptides were also developed, inserted between the targeting site and the supporting envelope glycoprotein, and which are able to mask the natural binding domain for as long as the viral particle has not interacted with the targeted surface molecule.
Realization of this interaction induces unmasking of the natural binding domain and interaction between the natural binding domain and the natural retroviral receptor (auxiliary mechanism) which then takes over for introducing the virus into the cell.
Materials and Methods:
Cell lines.
The cell line TELCeB6 (7) is derived from the TELacZ line (19) by transfection and clonal selection of cells expressing the gag and pol proteins of MoMLV (Moloney Murine Leukemia Virus). The TELacZ cells express the retroviral vector MFGnlslacZ, which is able to transduce a nuclear β-galactosidase. The TELCeB6 cells permit production of retroviral capsids (non-infectious, as they are devoid of envelopes) carrying the nlsLacZ retroviral marker vector. Cells A431 (ATCC CRL1555) and TE671 (ATCC CRL8805) are cultivated in DMEM medium (Gibco-BRL) supplemented with 10% of foetal calf serum (Gibco-BRL). Cells CHO, CERD9 (9) and CEAR13 (9) are cultivated in DMEM medium (Gibco-BRL) supplemented with 10% of foetal calf serum and proline (Gibco-BRL). The NIH-3T3 cell lines and NIH-3T3 derivatives are cultivated in DMEM medium (Gibco-BRL) supplemented with 10% of newborn calf serum (Gibco-BRL).
Chimeric envelopes.
The DNA fragments coding for the polypeptides recognizing either EGFR (EGF receptor) or Ram-1 (MLV-A receptor) were generated by PCR (polymerase chain reaction) using oligonucleotides containing restriction sites. These polypeptides were introduced at the N-terminal of the SU protein of MLV (surface protein gp70) in which the SfiI and NotI restriction sites were created at codon +6 (33). A schematic diagram of the various env genes used in this work is shown in Fig. 2. Briefly, a DNA fragment derived from PCR amplification, coding for the 53 amino acids of human EGF (3), was generated using a cDNA template (ATCC 59957) and two oligonucleotides: OUEGF (5'> ATGCTCAGAGGGGTCAGTACGGCCCAGCCGGCCATGGCCAATAGTGACTCTGAATGTCC), with an SfiI restriction site, and OLEGF (5'> ACCTGAAGTGGTGGGAACTGCGCGCGGCCGCATGTGGGGGTCCAGACTCC), containing a NotI site. After digestion by SfiI and NotI, these fragments were cloned in a gene coding either for the SU protein of MoMLV in the case of the chimeric protein EMO, or for the SU of the 4070A virus for the chimeric protein EA (6).
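As a quick check of the two oligonucleotides quoted above, the following sketch searches them for the SfiI and NotI recognition sequences. The recognition sequences used (GGCCNNNNNGGCC for SfiI, GCGGCCGC for NotI) are the standard ones; the function name is hypothetical and only the top strand, as written, is scanned.

```python
import re

# Standard recognition sequences (N = any base).
SITES = {
    "SfiI": re.compile(r"GGCC[ACGT]{5}GGCC"),
    "NotI": re.compile(r"GCGGCCGC"),
}

# Oligonucleotides as quoted in the text, written 5' -> 3'.
OUEGF = "ATGCTCAGAGGGGTCAGTACGGCCCAGCCGGCCATGGCCAATAGTGACTCTGAATGTCC"
OLEGF = "ACCTGAAGTGGTGGGAACTGCGCGCGGCCGCATGTGGGGGTCCAGACTCC"


def sites_in(sequence: str) -> set:
    """Return the names of the recognition sites found in the sequence as written."""
    sequence = sequence.upper()
    return {name for name, pattern in SITES.items() if pattern.search(sequence)}


print("OUEGF:", sites_in(OUEGF))
print("OLEGF:", sites_in(OLEGF))
```

Scanning a single strand is sufficient here because both recognition sequences are palindromic.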
For the AMO construct (6), a NotI site was created at the end of the receptor-recognition domain of the 4070A envelope (called AS208) (2), at nucleotide (nt) 750 (14), using a PCR fragment generated from the XhoI site (nt 594) up to nt 750, before the proline-rich region, by means of two oligonucleotides: 805FC (5'> TCCAATTCCTTCCAAGGGGC) upstream of XhoI and 806FC (5'> ACCCCCACATGCGGCCGCTCCCACATTAAGGACCTGCCG) containing a NotI restriction site. The chimeric envelope is constructed by cloning the XhoI/NotI PCR fragment and the NotI/ClaI fragment, isolated from the EMO env gene (coding for the SU and TM-p15E transmembrane proteins of MoMLV), between the XhoI/ClaI sites of the env gene of MLV 4070A.
The resulting constructs are recovered in the form of a BglII/ClaI fragment (corresponding to positions 5408 and 7676 in MoMLV) and cloned at the BamHI and ClaI sites of an FBMOSALF expression plasmid (7), in which a selection marker gene (8) fused to the polyadenylation sequences of the PGK (phosphoglycerate kinase) gene was introduced downstream of the 3' LTR of the MLV-C57 virus.
For EMO, EA, or AMO, the new recognition site was separated from the rest of the MLV envelope by a spacer peptide consisting of three alanines, supplied by the NotI cloning site (15). In three other series of targeting envelopes (derived from envelopes EMO, EA or AMO), spacer amino acids were introduced either after the recognition domain of EGFR (EGF), or after the recognition domain of Ram-1 (AS208) as described below.
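The statement that the NotI cloning site supplies three alanines can be illustrated with a short translation sketch. It assumes that the reading frame starts at the first base of the NotI sequence (GCG GCC GCN); whichever base follows the site, the third codon is then still an alanine codon. The function name is hypothetical and the codon table is deliberately restricted to the four alanine codons of the standard genetic code.

```python
# Alanine codons of the standard genetic code (GCN).
ALANINE_CODONS = {"GCA": "A", "GCC": "A", "GCG": "A", "GCT": "A"}

NOTI_SITE = "GCGGCCGC"  # 8 bp, so the third codon borrows one base from what follows


def translate_noti_in_frame(next_base: str) -> str:
    """Translate the NotI site read in frame, completed by the following base."""
    sequence = NOTI_SITE + next_base.upper()
    codons = [sequence[i:i + 3] for i in range(0, 9, 3)]
    # A KeyError here would mean a codon is not an alanine codon.
    return "".join(ALANINE_CODONS[codon] for codon in codons)


for base in "ACGT":
    print(base, translate_noti_in_frame(base))
```

In the other two reading frames the site does not encode three alanines, so the sketch only applies to the in-frame arrangement assumed here.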
The series of envelopes targeting Ram-1 was generated by introducing different spacers between the recognition domain of Ram-1 and the MoMLV envelope (Fig. 3A).
For AMOPRO, a proline-rich region of 59 amino acids originating from the amphotropic SU (nucleotides 751 to 927) (14) was used. A shorter proline-rich region, also isolated from the envelope of MLV 4070A (nt 751-789), was used for AMOΔPRO. This region corresponds to the 13 amino acid spacer of the product v-mpl (originating from the myeloproliferative leukemia virus) (18), located between its region derived from env and the equivalent of the cellular gene mpl.
In the case of AMO1, the first 208 amino acids, derived from the envelope of MLV 4070A, were fused to amino acid 1 of the SU of MoMLV. For AMO1Fx, a 4 amino acid site corresponding to the cleavage site of blood coagulation factor Xa (Ile-Glu-Gly-Arg) (12) was inserted after the Ram-1 recognition site and fused to the +1 codon of the SU of MoMLV. The strategy used for these constructs is described above.
Briefly, an oligonucleotide (5'-TCCAATTCCTTCCAAGGGGC-3') located just upstream of the XhoI site of the env gene of 4070A (nt 594) was used in combination with one or other of the following two oligonucleotides bearing the NotI site:
5'-AGTATGCGGCCGCTGGGGGTGGCTGTGGGACAC-3' and 5'-TATCTGCGGCCGCGTCGGGTAATACTGGGTTGG-3'
so as to generate by PCR, using an env 4070A template, 3' fragments for the AMOPRO and AMOΔPRO envelopes respectively.
These PCR fragments were digested with XhoI and NotI and cloned in the FBAMOSALF plasmid, a plasmid expressing an AMO type of envelope, opened at XhoI/NotI. The plasmids expressing the envelopes AMOFx, AMO1 and AMO1Fx were generated by cloning the NdeI/NotI fragment of FBAMOSALF, containing the Ram-1 recognition site, in a series of plasmids (13) expressing MoMLV envelopes modified so as to create a NotI site at codon 1 or at codon 6, with (AMO1Fx, AMOFx) or without (AMO1) the Xa sequence. Envelopes derived from AMO and containing other types of spacer peptides were also constructed. All of these spacer peptides are shown in Fig. 3A.
The MOAPRO and MOAΔPRO envelopes were generated according to a method similar to that of the AMOPRO and AMOΔPRO envelopes. The FBEASALF plasmid, expressing the EA envelopes, was opened at NdeI/NotI. This DNA was used for cloning two fragments. The first is the 5' NdeI/BamHI fragment from digestion of the FBMOSALF plasmid (expressing the ecotropic MO envelopes), containing, in addition to the 5' LTR and the retroviral leader sequence, the N-terminal end of the env gene of the MoMLV virus (position 6565) (17). The second is a 3' fragment generated by PCR using the env gene of MoMLV as template, with a 5' oligonucleotide (5'-ACTGGGGCTTACGTTTGT-3') upstream of the BamHI site and, as 3' oligonucleotide, (5'-TATGTGCGGCCGCCGGTGGAAGTTGGGTAGGGG-3') or (5'-TATGTGCGGCCGCGTCTGGCAGAACGGGGTTTGG-3') for constructing the MOAPRO and MOAΔPRO envelopes, respectively. These PCR fragments were digested with BamHI and NotI, and co-ligated with the 5' fragment. The sequence of the spacer peptides for these two constructs is shown in Fig. 3B.
FBEMOSALF, expressing the EMO chimeric envelopes (6), was submitted to digestion by BssHII, filling-in with Klenow enzyme and digestion by NdeI. The resulting 1.8 kb fragment, containing the 5' LTR, the leader sequence, the end of the pol gene and human EGF, was isolated and inserted either in FBAMOΔPROSALF or in FBAMOPROSALF (plasmids expressing the AMOΔPRO and AMOPRO chimeric envelopes respectively), in which the NdeI/EcoRI fragment was eliminated and the EcoRI site was filled in, so as to generate the plasmids expressing the envelopes EMOΔPRO+ and EMOPRO+, respectively. Plasmids expressing the envelopes EMO1 and EMO1Fx were also generated. The sequence of the spacer peptides for these two constructs is shown in Fig. 3C.
The plasmids expressing the EAPRO+ and EAΔPRO+ envelopes were generated by replacing the SfiI/NotI fragment of the FBEASALF plasmid by the SfiI/NotI fragments obtained from plasmids expressing the EMOPRO+ and EMOΔPRO+ envelopes.
Finally, for these various envelopes (EMOΔPRO+, EMOPRO+, EAPRO+ and EAΔPRO+), the spacer peptides were reduced in their N-terminal part. For this, a DNA fragment was generated by PCR using the EMO gene as template, a 5' oligonucleotide (5'-ACCATCCTCTAGACGGACATG-3') upstream of the XbaI site preceding the initiator codon and a 3' oligonucleotide (5'-TATCAGGATCCCAAATGTAAGCCCTGGATCGCGCAGTTCCCACCACTTCAGGTCTCGGTACTGAC-3') containing a BamHI site.
This DNA was digested with XbaI/BamHI and cloned in one or other of the plasmids expressing the EMOPRO+ or EMOΔPRO+ envelopes, after removing the XbaI/NotI fragments beforehand, by co-ligation with the BamHI/NotI fragments obtained from the plasmids expressing the MOAPRO and MOAΔPRO envelopes. This results in two plasmids that are able to express the EMOPROβ and EMOΔPROβ envelopes, respectively (Fig. 3C), in which EGF is fused just upstream of the BamHI site of the envelope of the MoMLV virus (nt 6537) (17), before the proline-rich region and leaving intact the potential β sheet. One or other of the SfiI/NotI fragments resulting from these last two constructs was then introduced into the FBEASALF plasmid after prior removal of the SfiI/NotI fragment; this results in two plasmids capable of expressing the EAPROβ and EAΔPROβ envelopes, respectively (Fig. 3C).
In another construction series (EMOPRO, EMOΔPRO, EAPRO, EAΔPRO), the potential β sheet was removed, and EGF was fused directly at the level of the proline-rich region (Fig. 3C).
Production of viruses.
The plasmids expressing the envelopes were transfected by the calcium phosphate precipitation method (16) into the TELCeB6 cell line. The cells were submitted to selection with phleomycin (50 µg/ml), then the resistant clones were trypsinized and pooled.
These cells, at confluence, were used for recovering the viral supernatants after overnight incubation in DMEM medium in the presence of FCS (10%). These supernatants are submitted to ultracentrifugation with the aim of obtaining samples for analysis in Western blots, in binding tests and in infection tests.
Immunoblots.
The virus-producing cells are lysed for 10 min at 4°C in 20 mM Tris-HCl buffer (pH 7.5) containing Triton X-100 1%, SDS 0.05%, deoxycholate 5 mg/ml, NaCl 150 mM and PMSF 1 mM. After centrifugation for 10 min at 10 000 g to pellet the cell nuclei, the supernatants are frozen at -70°C until analysis. The viral samples are obtained by ultracentrifugation of the viral supernatants (10 ml) in a Beckman SW41 rotor (30 000 rpm, 1 h at 4°C).
The pellets are resuspended in 100 µl of PBS (phosphate buffered saline) and frozen at -70°C. The samples (30 µg of cellular lysates or 10 µl of purified viruses) are mixed in a ratio of 5:1 with a buffer of 375 mM Tris-HCl (pH 6.8) containing SDS 6%, β-mercaptoethanol 30%, glycerol 10% and bromophenol blue 0.06%, then boiled for 3 min and analysed on 10% acrylamide/SDS gels. After transferring the proteins onto nitrocellulose membrane, immunostaining is carried out in TBS (Tris-buffered saline, pH 7.4) in the presence of skimmed milk 5% and Tween 0.1%. Antibodies (Quality Biotech Inc., USA) obtained from goat antiserum, directed against gp70-SU of RLV (Rauscher Leukemia Virus) or p30 of RLV, were used at a dilution of 1:1000 or 1:10000 respectively. The blots were developed using a conjugated antibody of rabbit origin directed against goat immunoglobulins (DAKO, UK) and an enhanced chemiluminescence kit (Amersham Life Science).
Binding tests.
The target cells were washed with PBS and detached by incubation for 10 min at 37°C with Versene 0.02% in PBS. These cells are rinsed with PBA (PBS containing 2% of FCS and sodium azide 0.1%). 10^6 cells are then incubated in the presence of viruses for 30 min at 4°C for the EMO envelope series or 45 min at 37°C for the AMO envelope series. After rinsing with PBA, the cells are incubated in the presence of monoclonal antibodies (Evans et al., 1990) for 30 min at 4°C. After rinsing twice with PBA, the cells are incubated for 30 min at 4°C in the presence of FITC-conjugated anti-rat antibodies (Dako, UK). 5 min before the two final rinsings in PBA, the cells are counterstained with propidium iodide (20 µg/ml). The fluorescence of the live cells is analysed in a FACS (FACSCalibur, Becton Dickinson).
Infection tests.
The target cells are inoculated in 24-well culture plates at a density of 3.10 cells per well. Various dilutions of the viral supernatants, containing Polybrene at 4 µg/ml, are added to the cells for 3 to 5 h at 37°C. The supernatants are then replaced with fresh medium and the cells are incubated for 24 to 48 h at 37°C. X-gal staining is then carried out as described previously (4). The viral titres are estimated as reported previously (5) in number of colonies per ml (lacZ i.u./ml).
In order to block the EGFRs, the target cells are incubated for 30 min at 37°C in a medium containing 10~ M of human recombinant EGF (236-EG, R&D Systems, UK). The cells are then rinsed and infections are carried out as described previously. To block acidification of the endosomes, 100 µM of chloroquine phosphate (Sigma, UK) is added to the medium. Six hours after infection, the cells are rinsed and incubated in a normal medium.
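The titre estimate expressed in lacZ i.u./ml can be illustrated by a simple calculation. The text refers to reference (5) for the actual procedure; the formula below (colonies counted, multiplied by the dilution factor and divided by the inoculum volume) is a common convention assumed here for illustration, and the function name and example numbers are hypothetical.

```python
def titre_iu_per_ml(colonies: int, dilution_factor: float, inoculum_ml: float) -> float:
    """Estimate a titre in infectious units per ml of undiluted supernatant.

    colonies        -- number of X-gal-positive colonies counted in the well
    dilution_factor -- fold dilution of the supernatant (e.g. 1000 for a 1:1000 dilution)
    inoculum_ml     -- volume of diluted supernatant added to the well, in ml
    """
    if colonies < 0:
        raise ValueError("colonies must be non-negative")
    if dilution_factor <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution_factor and inoculum_ml must be positive")
    return colonies * dilution_factor / inoculum_ml


# Hypothetical example: 50 colonies from 0.5 ml of a 1:1000 dilution.
print(titre_iu_per_ml(colonies=50, dilution_factor=1000, inoculum_ml=0.5))
```

In practice the count would be taken from a dilution giving well-separated colonies, which is why several dilutions of each supernatant are tested.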
Results and discussion.
Construction of the mutant envelopes.
Two series of modified envelopes capable of recognizing either the retroviral receptor Ram-1 (11) (22) or the EGF receptor were generated. A first envelope targeting Ram-1, AMO, was constructed by insertion, at the N-terminal of the envelope of MoMLV (by fusion with codon 7), of a polypeptide recognizing Ram-1 (AS208, Fig. 3A) and corresponding to the first 208 amino acids of the SU of MLV-A (1).
The sequence coding for EGF was inserted in the env gene of MLV in position +6 of the SU of MoMLV (Fig. 2). It had previously been demonstrated that this insertion site permits expression of a single-chain antibody fragment on the surface of virions (15). In the case of the chimeric envelope EMO (Fig. 2), human EGF was inserted in the envelope of MoMLV at the same position, whereas for the envelope EA, insertion was effected in the amphotropic envelope of MLV in position +5.
For the AMO, EMO and EA envelopes, the new binding domains were separated from the recognition domain of the retroviral receptor by a spacer peptide corresponding to three alanines. For the two types of parental envelopes, targeting Ram-1 or targeting EGFR, various constructs were then generated by insertion of spacers of different sizes and structures. The protein sequences of these different spacers are shown in Fig. 3A in the case of the envelopes targeting Ram-1 and in Fig. 3C for the envelopes targeting EGFR.
The plasmids expressing the various envelopes, including the ecotropic (MO) and amphotropic (A) control envelopes, were transfected into the cell line TELCeB6 which expresses the proteins coded by the gag and pol genes, as well as a retroviral vector nlsLacZ (7).
Expression and incorporation of the envelopes in the virions.
The protein lysates of the corresponding cells were analysed for the expression of envelopes by means of antibodies directed against the SU of MLV (Fig. 5) for most of the envelopes of the AMO series (not shown for the other chimeric envelopes).
For all the chimeric envelopes, the precursors and the mature SU form of the envelopes could be detected at the expected size and at a level similar to the wild-type envelopes, suggesting that these chimeric envelopes are normally produced and processed.
Expression on the cell surface was determined by FACS analysis of the producing cells, using antibodies directed against the SU or using an anti-EGF monoclonal antibody. The cells transfected with the various envelopes can be labelled by the anti-SU antibody (not shown). Only the cells expressing the EGF fusion envelopes can be labelled by means of anti-EGF monoclonal antibodies (Fig. 4). This demonstrates expression of the chimeric envelopes on the cell surface and correct folding of the EGF on the chimeric glycoproteins.
To demonstrate incorporation of the chimeric envelopes in the retroviral particles, the supernatants of the TELCeB6 cell lines transfected with the various envelopes were submitted to ultracentrifugation and the pellets of viral particles were recovered. These pellets were analysed by immunoblots for their content of products of the gag gene (CAp30) and of the envelope proteins (Fig. 5 for most of the envelopes of the AMO series, not shown for the other chimeric envelopes). With the aim of comparing the efficiency of viral incorporation between the various chimeric envelopes, identical quantities of viral particles (determined by labelling the gag proteins by means of anti-CAp30 antibodies) were loaded on the gels.
The SU proteins could be detected for all the mutants, at the expected size but at a level slightly lower than was observed for the wild-type envelopes. In the case of the AMOG2X and AMOG3X envelopes only, the efficiency of incorporation is appreciably lower relative to the wild-type envelopes. As expected, no envelope expression was observed in the pellets from TELacZ supernatants (not expressing gag and pol proteins) transfected with the various envelopes. These results show that the chimeric SU proteins are associated with retroviral particles.
Binding of the envelopes to the receptors.
Human cells expressing the Ram-1 receptor and/or the EGF receptor were used for this investigation. These cells are incubated in the presence of viral preparations and the binding of the viral envelopes to the target receptor is determined by FACS analysis with the aid of antibodies directed against the SU (Fig. 6B). As expected, no binding is observed in the case of viruses expressing MO ecotropic envelopes on the various human cells (not shown), whereas the viruses that have chimeric envelopes targeting Ram-1 are able to bind to the TE671 cells with an efficiency similar to that observed for the viruses expressing unmodified amphotropic envelopes. All the envelopes targeting Ram-1, derived from AMO, are able to bind to the TE671 cells with a similar efficiency. This binding can be inhibited by competition with the AS208 fragment (the purified Ram-1 recognition domain) (2), which suggests that this recognition is specific (results not presented).
The envelopes targeting EGFR (EMO series) are moreover able to bind to the A431 cells, which express EGFR (Fig. 6A). This binding seems specific, since pre-incubation of the A431 cells in the presence of EGF (inducing endocytosis of the EGFRs) inhibits this binding (not shown).
Ram-1 and Rec-1 cooperation in infection.
Transduction of the retroviral vectors pseudotyped by the various targeting envelopes was measured on cells expressing different types of receptors: human TE671 cells expressing the EGF and Ram-1 receptors; 3T3 cells expressing murine EGF, Rec-1 and Ram-1 receptors; Cear13 cells expressing Rec-1 and Ram-1; Cerd9 cells expressing only Rec-1. The titrations were carried out as described previously (6). As expected, it was shown that the viruses pseudotyped by MO ecotropic envelopes were not capable of infecting the TE671 cells, but did permit infection of the murine 3T3, Cear13 and Cerd9 cells (with titres of the order of 10^7 lacZ i.u./ml). Conversely, the viruses bearing the amphotropic A envelopes are able to infect the murine 3T3 cells and the TE671 cells (with titres of the order of 10^7 lacZ i.u./ml).
The viruses that have chimeric AMO envelopes are able to infect the TE671 cells at a titre of 4x10^3 lacZ i.u./ml (Table 1). In comparison, despite a similar efficiency of binding to the receptor (Fig. 6B), the titres obtained with the wild-type envelopes are 10 000 times higher. Surprisingly, the viruses expressing AMOPRO envelopes, despite good efficiency of binding, proved incapable of infecting the human cells. Compared with the titres obtained for the AMO envelopes (Table 1), the other types of spacers inserted in the envelopes of the AMO series permit an increase in titres from 30-fold (for AMOΔPRO) to more than 100-fold (for AMO1Fx), making it possible to reach titres of 4x10^5 lacZ i.u./ml. It has been shown that these infections take place via the targeted receptor Ram-1. This was demonstrated by an interference test on target cells chronically infected with MLV-A virus. These cells become specifically refractory to infection by viruses bearing envelopes targeting Ram-1 (results not shown).
The viruses bearing the chimeric envelopes in which the site for binding to Ram-1 was separated from the SU of MoMLV by various spacers proved very infectious on 3T3 cells.
Compared with the titres obtained for the AMO envelopes, an increase from 200-fold (for AMOPRO) to more than 1000-fold (for AMO1Fx) in the viral titres was measured (Table 1).
Infection of the 3T3 cells takes place via Rec-1 or via Ram-1 (Table 1). This can be demonstrated by interference tests carried out on 3T3 cells chronically infected either by MLV-A (blocking Ram-1) or by MoMLV (blocking Rec-1). The viruses expressing the AMO envelopes seem to be capable of infecting the 3T3 cells indiscriminately, whether one or the other, or both, of the Rec-1 and Ram-1 receptors are available on the target cell. Compared with these AMO viruses, the viral particles containing the other envelopes capable of targeting Ram-1 are far less capable of infecting the 3T3 cells when only one of the two receptors is available. For example, when 100 particles (according to the titre determined on intact 3T3 cells) containing the AMOFx envelopes are used for infecting interfering 3T3 cells, 4 viruses are capable of infecting the cells if only Rec-1 is available and 2 viruses are capable of infecting the cells if only Ram-1 is available. This indicates a considerable loss of infectivity (more than 94% of the viruses are not infectious) when only one receptor is available compared with when both receptors are available. This also suggests that the two receptors Ram-1 and Rec-1 cooperate in infecting the 3T3 cells. It appears that this phenomenon of cooperation is even more marked in the case of viruses bearing the AMOPRO envelopes. These last-mentioned viruses can infect the 3T3 cells only with difficulty when only Rec-1 is available and cannot infect them at all when only Ram-1 is available. However, when Rec-1 and Ram-1 are both available, infection is possible and titres of the order of 6x10^4 lacZ i.u./ml can be obtained (Table 1).
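The percentages given in parentheses in Table 1 can be reproduced if the titre on interfering cells is divided by the titre on intact 3T3 cells and then normalised so that the wild-type control (MO on 3T3-MLV-A, A on 3T3-MoMLV) equals 100. This normalisation is an inference from the table values, since the corresponding footnote is truncated in the source; the function name below is hypothetical. A minimal Python sketch under that assumption, using the AMOFx row:

```python
def relative_infectivity(titre_blocked, titre_intact, ctrl_blocked, ctrl_intact):
    """Percent infectivity on interfering cells, normalised to a control envelope.

    Assumption (not stated in the source): the control envelope's own ratio
    blocked/intact is taken as 100%.
    """
    chimera_ratio = titre_blocked / titre_intact
    control_ratio = ctrl_blocked / ctrl_intact
    return 100.0 * chimera_ratio / control_ratio


# Titres (lacZ i.u./ml) as read from Table 1.
MO_3T3, MO_3T3_MLVA = 92_000_000, 46_000_000      # ecotropic control
A_3T3, A_3T3_MOMLV = 12_000_000, 8_000_000        # amphotropic control
AMOFX_3T3, AMOFX_MLVA, AMOFX_MOMLV = 440_000, 8_000, 6_000

# Only Rec-1 available (Ram-1 blocked by MLV-A): compare with the MO control.
rec1_only = relative_infectivity(AMOFX_MLVA, AMOFX_3T3, MO_3T3_MLVA, MO_3T3)
# Only Ram-1 available (Rec-1 blocked by MoMLV): compare with the A control.
ram1_only = relative_infectivity(AMOFX_MOMLV, AMOFX_3T3, A_3T3_MOMLV, A_3T3)

print(round(rec1_only, 1), round(ram1_only, 1))  # 3.6 2.0
```

These two values correspond to the "4 viruses" and "2 viruses" per 100 particles quoted in the text for AMOFx.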
For better characterization of this cooperation effect, infection tests were carried out using as targets CHO cells (naturally devoid of Ram-1 and Rec-1 receptors) altered so as to express either Rec-1 only (Cerd9 cells) or Rec-1 and Ram-1 (Cear13 cells), or TE671 cells expressing Ram-1 only. Furthermore, other envelopes derived from the AMO envelope were generated. These envelopes possess other types of spacer peptides (see Fig. 3A) after the site targeting Ram-1, in particular flexible spacers.
The results of a typical experiment are shown in Table 2. For each envelope, cooperativity indices were calculated as the ratio of the titre obtained on the cell type expressing just one receptor to the titre obtained on the cell type expressing both types of receptors. An index of 1 therefore indicates that the titre is the same, whether there is just one or both receptors. This is obviously the case with ecotropic or amphotropic wild-type envelopes.
An index less than 1 indicates that the titre is lower when a single receptor is expressed than when both are, and that both receptors are needed to promote infection. The lower this index is, the greater is the requirement for two receptors. As suggested in Table 1, the infectivity of the virions with the original AMO envelopes is not affected, whether there is a single type of receptor or both types (Table 2). In fact, the indices are even greater than 1, suggesting that the simultaneous presence of the two receptors hampers the infectious process, perhaps because the two binding domains hinder each other. The situation is different for viruses with the AMO1Fx envelopes even though, compared with the AMO virions, their infectivity is at least 100 times better in the TE671 cells that express Ram-1 only. This increase in infectivity via Ram-1 can be explained by the increased size of the spacer peptide separating the two binding domains: it is possible that the AS208 site induces less steric hindrance with respect to the rest of the glycoprotein and that these envelopes can more easily induce the fusion process. Moreover, the Cerd9 cells expressing Rec-1 only are infected relatively easily by the AMO1Fx virions. However, in accordance with the results in Table 1, infection is facilitated by a factor of 10 when both molecules Ram-1 and Rec-1 are co-expressed (index of about 0.1) compared with when only one or the other of the two receptors is present. The envelopes with the "flexible" spacers (AMOG1Fx, AMOG2, AMOG2Fx and AMOG3) seem to behave like the AMO1Fx envelopes with regard to infection via Rec-1 expressed alone. However, infectivity via Ram-1 expressed alone (RamID) tends to decrease as a function of the length of the spacer. This probably reflects a decrease in transmission of the fusion signal following binding to Ram-1, owing to the increase in distance between the AS208 domain and the fusion domain. With these envelopes as well, infection is favoured when the two receptors are co-expressed on the surface of the same cell.
As for the AMO1Fx envelopes, but non-symmetrically (RamID similar, but RecID very different), the virions containing the AMOΔPRO envelope can infect cells efficiently when Ram-1 is expressed alone. For this envelope as well, infectivity is increased about 10-fold when Rec-1 is also present on the cell surface. This difference is not due to the mere fact that the AMOΔPRO virions utilize Rec-1 preferentially for infection. In fact, infection of cells on which Rec-1 alone is available is extremely slight (Table 1) or even undetectable (Table 2) compared with when Ram-1 and Rec-1 are co-expressed. The RecID index is less than 10^-5 (Table 2). This also demonstrates that the two receptors can synergize infection. These results also suggest that the domain of binding to the ecotropic receptor Rec-1 is not accessible when the AMOΔPRO envelope is expressed on viral particles, and only becomes accessible if these virions interact with Ram-1 beforehand. It can also be suggested that, following binding to Ram-1, the domain for binding to Rec-1 is unmasked and recruited to facilitate the infectious process. It is possible that this masking/unmasking takes place according to an allosteric type of mechanism, causing a change in conformation of the chimeric glycoprotein that is induced by the Ram-1/AS208 interaction and that involves the spacer peptide.
It is likely that this mechanism is strongly dependent on the amino acid composition of the spacer peptide. With comparable size, there is a difference of at least 1000 times in the RecID values when the AMOΔPRO virions are compared with the virions containing the envelopes with the flexible spacers AMO1Fx, AMOG1Fx and AMOG2. The ΔPRO peptide contains 5 prolines, probably arranged in a type II polyproline helix, whereas the AMOG1Fx and AMOG2 envelopes contain essentially glycines.
Similarly to the AMOΔPRO virions, the virions containing the AMOPRO envelopes require the simultaneous presence of the two types of receptors for infecting the cells. The infectious titres in cell types co-expressing the two receptors are, however, lower than those observed with the AMOΔPRO virions, though it is not possible to exclude the hypothesis that the lesser extent of incorporation of these envelopes is responsible for this result. Even more markedly than with AMOΔPRO, the AMOPRO viruses cannot infect the cells when either one of the two receptors is expressed alone (Table 2). The two indices RamID and RecID are in fact less than 10^-5. These results suggest that:
1) interaction of the AMOPRO virions with Ram-1 when it is expressed alone is not sufficient to trigger the changes in conformation of the glycoprotein that permit fusion. Furthermore, it is possible that the PRO spacer peptide is either too rigid or too long to favour such a transition;
2) the domain for binding to Rec-1 is not accessible for interaction with Rec-1, and cannot take over in the entry process, as long as the AMOPRO virion has not interacted with Ram-1.
For the purpose of better discrimination of whether the masking of the binding domain located downstream of the targeting site is a unique property of the peptide conjugated to the PRO spacer peptide, the inverse construction was effected.
The MOAPRO envelopes contain the binding domain of the ecotropic envelope followed by the proline-rich region of this same envelope, the whole being fused at the N-terminal end of the amphotropic envelope (Fig. 2). The results shown in Table 2 show that, in a similar manner to the virions containing the AMOPRO envelopes, the MOAPRO virions can infect the cells expressing only one of the receptors Rec-1 or Ram-1 with difficulty, or not at all. It even seems that the Ram-1 domain in the MOAPRO envelope is less accessible (RamID less than 7x10^-5) than the Rec-1 domain is in the AMOPRO envelope (RecID less than 5.6x10^-4). The MOAPRO envelopes can efficiently infect the cells expressing the two types of receptors, with titres of the order of 10^5 lacZ i.u./ml, suggesting that, for this envelope as well, the presence of the two receptors synergizes the infectious process.
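The cooperativity indices used throughout this discussion (RamID, RecID) are simple titre ratios, and the values quoted for MOAPRO can be checked with a short Python sketch. The titres below are the MOAPRO row of Table 2 as far as it can be read (the table is partly garbled in the source, so they should be treated as assumptions), and a titre reported as "less than" a detection limit gives only an upper bound on the index. The function name is hypothetical.

```python
def cooperativity_index(titre_single_receptor, titre_both_receptors):
    """Ratio of the titre on cells with one receptor to the titre on cells with both.

    An index near 1 means no requirement for the second receptor;
    the lower the index, the stronger the requirement for both receptors.
    """
    if titre_both_receptors <= 0:
        raise ValueError("titre on cells expressing both receptors must be positive")
    return titre_single_receptor / titre_both_receptors


# MOAPRO titres (lacZ i.u./ml), as read from Table 2 (assumed values).
titre_both = 1.3e5            # cells expressing Ram-1 and Rec-1
titre_ram1_only_limit = 9.1   # Ram-1 only: below this detection limit
titre_rec1_only = 1.7e3       # Rec-1 only

ram_id_upper = cooperativity_index(titre_ram1_only_limit, titre_both)
rec_id = cooperativity_index(titre_rec1_only, titre_both)

print(f"RamID < {ram_id_upper:.1e}, RecID = {rec_id:.1e}")
# RamID < 7.0e-05, RecID = 1.3e-02
```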
These results, taken together, suggest that the spacer peptide inserted between the targeting domain and the rest of the retroviral envelope exercises control over the accessibility of the domain located downstream of the said peptide and over the activation of fusion. This control depends on the peptide itself and is influenced by its length and by its biochemical composition. The hypothesis formulated is that the PRO spacer peptide performs the same role as the proline-rich region from which it is derived, which is located, in the unmodified glycoprotein, between the receptor-binding domain and the fusion domain. This role would be masking of the downstream domain (fusion domain for the wild-type envelope or binding domain for the chimeric envelope) and its subsequent unmasking upon interaction of the upstream domain with its receptor. In the case of the wild-type envelope, this unmasking would lead to activation of fusion, whereas in the case of chimeric envelopes, unmasking would lead to accessibility of the binding domain to the viral receptor. If the receptor is expressed at the cell surface, there can then be interaction, and this then triggers activation of the fusion domain, explaining why the simultaneous presence of the two receptors synergizes infection.
These results make it possible to propose a two-stage targeting strategy in which a targeting envelope is constructed with various domains, whose functions are activated and coordinated by means of specific spacer peptides containing proline-rich sequences. These chimeric envelope glycoproteins can be conceived as follows, from N-terminal to C-terminal: a "targeting" domain capable of recognizing a cell surface molecule specifically expressed on the targeted tissue or targeted cell (for example a single-chain antibody or a ligand for a surface receptor); a spacer peptide capable of masking an auxiliary region; and the auxiliary region itself, which is capable of facilitating penetration of the virus when it is activated. Such an auxiliary domain can be an entire retroviral envelope, i.e. a structure capable of mediating and taking over viral infection by means of interaction with a ubiquitous retroviral receptor, which therefore has a very strong likelihood of being co-expressed with the targeted surface molecule. Ideally, the auxiliary domain should be masked until the viral particle has specifically interacted with the targeted surface molecule. For example, in the case of the AMOPRO and AMOΔPRO envelopes, the targeted surface molecule is Ram-1 whereas the auxiliary domain is the ecotropic envelope.
EGFR and Rec-1 cooperation in infection.

To verify whether the PRO and ΔPRO spacer peptides could mediate the masking/unmasking mechanism in the case of another type of targeting envelope, another two-stage targeting model was explored by means of the EGF receptor. The results obtained with the targeting of Ram-1 made it possible to propose C-terminal ends of the masking/unmasking spacer peptides. However, it was not possible to define their N-terminal ends exactly. That is why, in the first place, the EMOPRO+ and EMOΔPRO+ envelopes were constructed (Fig. 3B), in which the PRO and ΔPRO spacer peptides contain in addition, at the N-terminus, 41 amino acids derived from the amphotropic envelope and located immediately upstream of the proline-rich region. For the EMOPRO+ and EMOΔPRO+ envelopes, the targeting domain is EGF, whereas the auxiliary domain is the ecotropic envelope. These two envelopes were compared with the EMO envelope (Fig. 2 and 2B), which does not contain a spacer peptide.
The infection tests were carried out with cells expressing Rec-1 alone (Cerd9 cells) or with cells co-expressing Rec-1 and EGFR (3T3 cells). The results of a typical experiment are presented in Table 3. As expected from the results obtained with the AMO envelopes, the viruses containing the EMO envelopes can efficiently infect the Cerd9 and 3T3 cells, indicating that the binding domain to Rec-1 in these envelopes is not masked. In comparison with the EMO viruses, the viral particles containing the EMOPRO+ and EMOΔPRO+ envelopes can only infect the Cerd9 cells with difficulty (between 1000 and 10 000 times less well than the EMO viruses). However, when Rec-1 and EGFR are co-expressed, even though this does not affect the titre of the EMO virions, the viral particles containing the EMOPRO+ and EMOΔPRO+ envelopes are and 60 times more infectious, respectively, compared with when Rec-1 is expressed alone.
In relation to the results obtained with the AMOPRO and AMOΔPRO envelopes, masking is apparently less effective, leading to non-negligible infectivity on Cerd9 cells. This is perhaps due to the fact that the PRO+ and ΔPRO+ spacer peptides are not optimized for their function, but perhaps also to the fact that the Cerd9 cells express a few EGF receptors, which would contribute to activation of the EMOPRO+ and EMOΔPRO+ envelopes.

Table 1. Titres (lacZ i.u./ml) obtained for the viruses containing the envelopes targeting Ram-1 in interference tests

env        TE671        3T3 (a)             3T3-MLV-A (a, b)     3T3-MoMLV (a, b)
           Ram-1 (c)    Ram-1 + Rec-1 (c)   Rec-1 (c)            Ram-1 (c)
MO         <1           92,000,000 (100)    46,000,000 (100)     40
A          10,000,000   12,000,000 (100)    240                  8,000,000 (100)
AMO        4,000        24 (100)            32 (266.7)           8 (50)
AMOFx      230,000      440,000 (100)       8,000 (3.6)          6,000 (2)
AMO1       330,000      1,920,000 (100)     78,000 (8.1)         62,000 (4.8)
AMO1Fx     400,000      1,620,000 (100)     60,000 (7.4)         74,000 (6.8)
AMOΔPRO    150,000      280,000 (100)       400 (0.29)           64,000 (34.3)
AMOPRO     10           60,000 (100)        4 (0.013)            <1 (0.0025)

a: percentages calculated assigning a value of 100 to the titres obtained on
b: infection on 3T3 chronically infected by MLV-A (3T3-MLV-A) or by MoMLV (3T3-MoMLV)
c: receptor available at the surface of the cell in question

Table 2. Titres (lacZ i.u./ml) obtained for the viruses containing the envelopes targeting Ram-1

Spacer    env        3T3          TE671        Cerd9        RamID          RecID
peptide
          MO         2.8x10^[?]   <1.7x10^0    2.8x10^[?]   <6.1x10^-[?]   1
          A          5x10^5       5x10^5       6.2x10^0     1              1.2x10^-5
3         AMO        1x10^2       2.2x10^2     6.2x10^0     2.2x10^0       6.2x10^-2
13        AMO1Fx     6x10^4       1.6x10^4     2.2x10^5     2.7x10^-1      3.7x10^0
16        AMOΔPRO    1.9x10^5     5x10^3       6.2x10^0     2.6x10^-1      3.3x10^-5
18        AMOG1Fx    9x10^[?]     4.5x10^3     8.7x10^3     1.1x10^-1      2.2x10^-1
19        AMOG2      8x10^3       2.7x10^3     3.1x10^3     3.9x10^-1      3.9x10^-1
23        AMOG2Fx    6x10^3       1.2x10^3     1.2x10^3     2x10^-1        2x10^-1
2[?]      AMOG3Fx    9x10^[?]     1x10^[?]     [?].2x10^[?] [?].1x10^-1    1.3x10^0
62        AMOPRO     1.8x10^3     <1.7x10^0    <1x10^0      <9.9x10^-4     <5.6x10^-4
          MOAPRO     1.3x10^5     <9.1x10^0    1.7x10^3     <7x10^-5       1.3x10^-2

Table 3. Titres (lacZ i.u./ml) obtained for the viruses containing the envelopes targeting EGFR
env        3T3         Cerd9       RecID
MO         9.2x10^6    1.3x10^7    1
EMOΔPRO    3.5x10^4    8.5x10^2    1.7x10^-2
EMOPRO     9.6x10^2    7x10^1      5.2x10^-2
Cpl        2x10^6      3x10^6      1

EXAMPLE 2:
With the aim of characterizing the cooperation between the Rec-1 and Ram-1 receptors, as well as the peptides that are capable of regulating this cooperation of receptors, a new series of type AMO chimeric envelope glycoproteins (see preceding example) was constructed:
- in order to verify whether the infection obtained with the AMOPRO and AMOΔPRO envelopes passes, in a second stage, through an interaction with Rec-1, the binding domain to Rec-1 was inactivated by point mutagenesis (D84K mutation) (MacKrell et al., J. Virology 70:1768-1774 (1996)) in the AMOPRO and AMOΔPRO envelopes as well as in the AMOG1X control envelope, which does not require the cooperation of receptors to permit infection (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)).
- in order to demonstrate the role of the type II polyproline helix structure for the cooperating peptides, the envelopes AMOEL3 and AMOEL5 were constructed. These envelopes have respectively 3 and 5 turns of a type II polyproline helix as characterized in the literature (Urry, Journal of Protein Chemistry 7:1-34 (1988)).
Retroviruses were generated with these chimeric envelopes and were characterized by infection of cells expressing either Rec-1 alone, or Ram-1 alone, or the two molecules Ram-1 and Rec-1.
Material and Methods.
The oligonucleotides elast3U (5'-TTT ATG GTC ACC GCG GCC GCA CCT GGG GTA GGG GCT CCG GGG GTA GGG GCT CCT GGG GTG GCC ATA TAA) and elast3L (5'-TTA TAT GGC CAC CCC AGG AGC CCC TAC CCC CGG AGC CCC TAC CCC AGG TGC GGC CGC GGT GAC CAT AAA) were hybridized together. The resulting double-stranded DNA fragment was digested with the EaeI restriction enzyme and cloned in the FBAMOSALF expression plasmid previously opened at NotI.
The result was the plasmid FBAMOEL3SALF (see sequence of the gene env AMOEL3 in Fig. 30) containing the peptide EL3, the peptide sequence of which is shown in Table 4 (see nucleotide sequence in Fig. 31).
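As a consistency check on the sequences given above, the two oligonucleotides can be verified to be exact reverse complements of each other, the EaeI sites flanking the insert can be located, and the upper strand can be translated to confirm that it encodes the EL3 repeat. The following Python sketch is illustrative only: the helper names are hypothetical, the EaeI recognition sequence is taken to be YGGCCR, and the reading frame is assumed to start at the first base of elast3U.

```python
import re

ELAST3U = ("TTT ATG GTC ACC GCG GCC GCA CCT GGG GTA GGG GCT CCG GGG "
           "GTA GGG GCT CCT GGG GTG GCC ATA TAA").replace(" ", "")
ELAST3L = ("TTA TAT GGC CAC CCC AGG AGC CCC TAC CCC CGG AGC CCC TAC "
           "CCC AGG TGC GGC CGC GGT GAC CAT AAA").replace(" ", "")

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate in frame 0; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def eae1_sites(seq):
    """Start positions of YGGCCR motifs (assumed EaeI recognition sequence)."""
    return [m.start() for m in re.finditer(r"(?=[CT]GGCC[AG])", seq)]


print(reverse_complement(ELAST3U) == ELAST3L)  # True
print(eae1_sites(ELAST3U))                     # [13, 58]
print(translate(ELAST3U))                      # FMVTAAAPGVGAPGVGAPGVAI*
```

The translated upper strand contains AAAPGVGAPGVGAPGVA, which matches the EL3 spacer listed in Table 4 apart from the final residue contributed by the cloning junction.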
The oligonucleotides UpEl5 (5'-GAT GTA CCT GGG GTA GGC GCC CCT GGA GTC GGG GCT CCT GGG GTA GGA TTC AT) and LowEl5 (5'-ATG AAT CCT ACC CCA GGA GCC CCG ACT CCA GGG GCG CCT ACC CCA GGT ACA TC) were hybridized together. The resulting double-stranded DNA fragment was digested with the EcoNI restriction enzyme and cloned in the FBAMOEL3SALF expression plasmid, previously opened at EcoNI. The result is the plasmid FBAMOEL5SALF (see sequence of the gene env AMOEL5 in Fig. 32) containing the peptide EL5, the peptide sequence of which is shown in Table 4 (see nucleotide sequence in Fig. 33).

The oligonucleotides DELASTIN3-V Upper (5'-GTC ACC GCG GCC GTC CCT GGG GTA GGG GTG CCG GGG GTA GGG GTG CCT GGG GTG GCC ATA TAA) and DELASTIN3-V Lower (5'-TTA TAT GGC CAC CCC AGG CAC CCC TAC CCC CGG CAC CCC TAC CCC AGG GAC GGC CGC GGT GAC) were hybridized together. The resulting double-stranded DNA fragment was digested with the EaeI restriction enzyme and cloned in the FBAMOSALF expression plasmid, previously opened at NotI.
The result is the plasmid FBAMOEL3-VSALF (see sequence of the gene AMOEL3-V in Fig. 34) containing the EL3-V peptide, the peptide sequence of which is shown in Table 4 (see nucleotide sequence in Fig. 35).
The oligonucleotides DELASTIN3-I Upper (5'-GTC ACC GCG GCC GTC ATA GGG GTA GGG GTG ATT GGG GTA GGG GTG ATC GGG GTG GCC ATA TAA) and DELASTIN3-I Lower (5'-TTA TAT GGC CAC CCC GAT CAC CCC TAC CCC AAT CAC CCC TAC CCC TAT GAC GGC CGC GGT GAC) were hybridized together. The resulting double-stranded DNA fragment was digested with the EaeI restriction enzyme and cloned in the FBAMOSALF expression plasmid, previously opened at NotI.
This resulted in the plasmid FBAMOEL3-ISALF containing the peptide EL3-I, the peptide sequence of which is shown in Table 4.
The oligonucleotides UpXhoD84K (5'-AGG CTG CTC GAG AAA ATG CGA AGA ACC TTT AAC CTC CC) and LoXhoD84K (5'-ATT TTC TCG AGC AGC CTG GGC TGC TGC CCC C) were synthesized. Using the oligonucleotides 805FC and LMOADeltaPRO3 (see sequence above), the pairs 805FC/LoXhoD84K and UpXhoD84K/LMOADeltaPRO3 were used independently for PCR amplification of two DNA fragments from the FBAMOSALF template. These two DNAs were digested by the enzymes NotI/XhoI and XhoI/BamHI respectively and co-ligated in one or other of the three plasmids FBAMOSALF, FBAMODeltaPROSALF and FBAMOProSALF, previously opened at NotI and BamHI. The resulting plasmids express respectively the envelopes AMOD84K, AMODeltaProD84K and AMOProD84K.
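The two mutagenic primers can be checked for the features this cloning step relies on: each should carry an XhoI site (CTCGAG), so that the two PCR fragments can be joined there, and the two primers should overlap on opposite strands. The Python sketch below is a plausibility check only; the helper names are hypothetical and nothing is assumed about where the primers anneal on the template.

```python
UP_XHO_D84K = "AGG CTG CTC GAG AAA ATG CGA AGA ACC TTT AAC CTC CC".replace(" ", "")
LO_XHO_D84K = "ATT TTC TCG AGC AGC CTG GGC TGC TGC CCC C".replace(" ", "")
XHOI_SITE = "CTCGAG"


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def overlap_length(upstream, downstream):
    """Length of the longest suffix of `upstream` that is a prefix of `downstream`."""
    for k in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream[-k:] == downstream[:k]:
            return k
    return 0


# Position of the XhoI site in each primer (-1 would mean "absent").
up_site = UP_XHO_D84K.find(XHOI_SITE)
lo_site = LO_XHO_D84K.find(XHOI_SITE)

# Read the lower primer on the same strand as the upper one, then measure
# how far its 3' end runs into the 5' end of the upper primer.
shared = overlap_length(reverse_complement(LO_XHO_D84K), UP_XHO_D84K)

print(up_site, lo_site, shared)  # 6 5 17
```

The 17-nucleotide overlap spans the XhoI site, which is consistent with the two PCR fragments being digested with XhoI and ligated to each other at that position.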
Two DNA fragments of 2005 bp and 241 bp were isolated from the plasmid FBAMOG1X (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)) by digestion with the restriction enzymes NdeI/XhoI and XhoI/BstEII respectively.
These two inserts were cloned in the plasmid FBAMOD84KSALF previously digested by the enzymes NdeI and BstEII, resulting in a plasmid capable of expressing the AMOG1XD84K envelope.
Results and Discussion.
Expression and viral incorporation of the chimeric envelopes. The expression plasmids for the envelopes AMO, AMODeltaPRO, AMOPRO, AMOEL3, AMOEL5, AMOEL3-V, AMOEL3-I, AMO1FX, AMOG1X, AMOD84K, AMODeltaPROD84K, AMOPROD84K, AMOG1XD84K, AMODeltaPRO2 (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)) and AMODeltaPRO4 (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)) were introduced by transfection into the cells of the TELCeB6 line (Cosset et al., Journal of Virology 69:7430-7436 (1995b)). After selection by phleomycin, the phleomycin-resistant colonies were pooled for each DNA and virions were generated and analysed following the procedures originally described (Cosset et al., Journal of Virology 69:6314-6322 (1995a)).
These various chimeric envelopes are normally expressed and matured in the cells and, moreover, efficiently incorporated on the viral particles (results not shown).
The binding tests that were carried out show that these retroviruses can bind specifically on human cells by means of the targeted surface molecule Ram-1 (results not shown).
These various viruses were used for infecting cells expressing either Rec-1 only (Cerd9), or Ram-1 only (CHO-Ram-1), or the two molecules Ram-1 and Rec-1 (Cear13). The results of titration of these viruses are presented in Table 4.
These results can be summarized as follows:
- in an AMO envelope, substitution of the spacer peptide by three beta-turns of a synthetic polyproline helix (AMOEL3) or of a natural one described in the literature (AMOEL3-V, from bovine elastin) confers, with regard to the capacity for masking the function of the ecotropic envelope and for regulating the cooperation of the Ram-1 and Rec-1 receptors, a phenotype similar to that of the viruses bearing the "AMO" envelopes containing the cooperating spacer peptides DeltaPRO2, DeltaPRO, DeltaPRO4 or PRO. Since the peptides derived from elastin (AMOEL3-V and AMOEL3) are arranged as a type II polyproline helix, it can be suggested on the basis of the results obtained that regulation of the cooperation of the Ram-1 and Rec-1 receptors by the DeltaPro and Pro peptides is probably due to their presumed secondary structure, as a type II polyproline helix.
Moreover, mutations introduced into the spacer peptide derived from elastin (AMOEL3-V) with the purpose of destroying the folding of the peptide into a type II polyproline helix (AMOEL3-I, mutations obtained by replacing the proline of each beta-turn with an isoleucine) abolish receptor cooperation.
- destruction of the capacity for binding to the ecotropic receptor (D84K mutations) stops receptor cooperation for the envelopes containing cooperating spacer peptides, especially PRO (see results AMOPRO vs AMOPROD84K), but does not affect the functionality of the control envelopes bearing the flexible spacer peptide G1X (see results AMOG1X vs AMOG1XD84K). We deduce from this that binding to the ecotropic receptor is necessary for infection, in a second stage, following attachment to the Ram-1 receptor.
Note that the present results show that in the case of the retroviruses generated with the chimeric envelope AMOPro, the binding domain to the ecotropic receptor is masked (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)). The results, taken together, are therefore compatible with a model of two-stage interaction in which:
- in its "naive" configuration, i.e. when it has not been permitted to interact with a cell, the "AMOPRO" retrovirus can potentially interact with the targeted "primary" receptor (the Ram-1 molecule), but cannot directly interact with the auxiliary receptor (the Rec-1 molecule). This masking seems to be due to a first property of the Pro spacer peptide.
- when this virus is permitted to interact with Ram-1, a local change in conformation occurs at the level of the Pro spacer peptide which will make the binding domain to Rec-1 accessible. This change in conformation is due to a second property of the Pro spacer peptide.
- if the Rec-1 receptor is present at the surface of the same cell that has Ram-1 and on which the virus is bound, then, in a second stage, this receptor will serve as an entry molecule for the virus.

Table 4. Results of titration.

env               sequence of the spacer peptide (a)                Cear13 (b)   CHO-Ram-1 (b)   Cerd9 (b)
AMO               NVG AAA PHQV                                      +            +               +
AMOD84K           NVG PRVPIGPNPAA PHQV                              +            +               -
AMODeltaPro2      NVG PRVPIGPNPAA PHQV                              ++++         +++
AMO1FX            NVG AAAIEGRASPGSS PHQV                            ++++         +++             +++
AMODeltaPro       NVG PRVPIGPNPVLPDAAA PHQV                         ++++         +++             -
AMODeltaProD84K   NVG PRVPIGPNPVLPDAAA PHQV                         +++          +++             -
AMOEL3            NVG AAAPGVGAPGVGAPGVAA PHQV                       +++          +               -
AMOEL3-V          NVG AAVPGVGVPGVGVPGVAA PHQV                       +++          +               -
AMOEL3-I          NVG AAVIGVGVIGVGVIGVAA PHQV                       +++          +++
AMOG1X            NVG AAAGGGGSIEGRASPGSS PHQV                       +++          ++              ++
AMOG1XD84K        NVG AAAGGGGSIEGRASPGSS PHQV                       ++           ++
AMODeltaPro4      NVG PRVPIGPNPVLPDQRLPSSAA PHQV                    +++          ++              -
AMOEL5            NVG AAAPGVGAPGVGAPGVGAPGVGAPGVAA PHQV             +++          -               -
AMOPRO            NVG PRVPIGPNPVLPDQRLPSSPIEIVPAPQPPSP...
                  ...LNTSYPPSTTSTPSTSPTSPSVPQPPPAAA PHQV            +++          -               -
AMOPROD84K        NVG PRVPIGPNPVLPDQRLPSSPIEIVPAPQQPPSP...
                  ...LNTSYPPSTTSTPSTSPTSPSVPQPPPAAA PHQV            -            -

a: ...envelope. "PHQV" represents the amino acids 7 to 10 of the envelope of MoMLV and "NVG" represents the last 3 amino acids of the binding domain to Ram-1.
b: relative titres obtained on the cells indicated: Cear13, expressing the receptors Ram-1 and Rec-1; CHO-Ram-1, expressing Ram-1 only; Cerd9, expressing Rec-1 only.
EXAMPLE 3.
The development of strategies for targeting gene transfer by means of the construction of chimeric envelope glycoproteins, generated by N-terminal insertions of ligands, comes up against the difficulty, in particular, that the interaction between virus and targeted surface molecule has little or no capacity to activate fusion of these targeting envelopes (Cosset and Russell, Gene Therapy 3:946-956 (1996)). The possibility of causing two surface molecules to cooperate (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)), one being the targeted receptor or cell surface molecule of attachment, the other being a (retro)viral receptor specialized for fusion or auxiliary surface molecule, makes it possible to envisage a means of overcoming this problem of low fusion capacity of chimeric envelopes and, more generally, of low efficiency of the targeting retroviruses. The cooperation of receptors was tested in three models of targeting, in which the following three cell surface molecules serve as points of attachment for the targeting retroviruses: (i) receptor of EGF (epidermal growth factor), and (ii) class I molecule of human MHC. The binding domains for these two surface molecules are either a growth factor (for EGFR) or a single-chain antibody (for MHC-I).
These ligands were inserted by fusion at the N-terminal end of the amphotropic MLV envelope (4070A), and various peptides from the proline-rich region carried by the SU subunit of the amphotropic MLV virus were inserted between the ligands and the envelope (see Table 5).
Materials and Methods
DNA fragments coding for the spacer peptides DeltaPro2, DeltaPro3, DeltaPro4 and Pro (see Table 5) were generated by PCR using as DNA template the gene env 4070A, as 5' oligonucleotide PRO-5-NE (5'-ATC GAG GTC ACC GCG GCC GCG GGA CCC CGA GTC CCC ATA GGG CCC), which is the same for the four PCR fragments, and as 3' oligonucleotides the sequences AMODPRO(-H + P-A) (5'-TAT GAG CGG CCG GGT TGG GCC CTA TGG GGA C), DPro3 (5'-TTA TAC GGC CGT GTC GGG TAA TAC TGG), AMODPRO(+H+S-A) (5'-TAT GTG CGG CCG AGG AAG GGA GTC TTT GGT C) and PRO-3-NE (5'-ATA ATC GGC CGG GGG TGG CTG TGG GAC).
The corresponding DNA fragments were digested with the enzyme EagI and inserted separately into the plasmid FBEASALF (expressing the chimeric envelope glycoproteins EA) (Cosset et al., Journal of Virology 69:6314-6322 (1995a)), previously opened at the NotI restriction site. The resulting plasmids express the envelopes EADeltaPro2, EADeltaPro3, EADeltaPro4 and EAPro.

The NdeI/NotI fragment containing the FB29 promoter as well as the scFv anti-MHC-I, provided with the signal peptide of the envelope glycoprotein of the MoMLV virus (Marin et al., Journal of Virology 70:2957-2962 (1996)), was cloned into the FBEASALF plasmid from which the NdeI/NotI fragment had been removed beforehand. This results in the plasmid FB34ASALF, capable of expressing a 4070A chimeric envelope with the scFv fused at its N-terminal end. This plasmid was then opened at NotI for insertion of the DNA fragments coding for the spacer peptides DeltaPro2, DeltaPro3, DeltaPro4 and Pro (see Table S), previously digested with the EagI enzyme. This results in a series of expression vectors for the envelopes 34DeltaPro2, 34DeltaPro3, 34DeltaPro4 and 34Pro.
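The cloning described above relies on the PCR fragments being cut with EagI and ligated into a NotI-opened plasmid. EagI (C^GGCCG) and NotI (GC^GGCCGC) both leave a 5'-GGCC overhang, so their ends are compatible. The following minimal sketch, written in Python for illustration only, checks that each oligonucleotide quoted in this section carries an EagI recognition site. The function and variable names are hypothetical and are not part of the described method.

```python
# Recognition sequences (both are palindromic, so one strand is enough).
EAGI = "CGGCCG"      # EagI cuts C^GGCCG
NOTI = "GCGGCCGC"    # NotI cuts GC^GGCCGC; it contains an EagI site

# Oligonucleotides as quoted in the text (5' to 3').
PRIMERS = {
    "PRO-5-NE": "ATC GAG GTC ACC GCG GCC GCG GGA CCC CGA GTC CCC ATA GGG CCC",
    "AMODPRO(-H+P-A)": "TAT GAG CGG CCG GGT TGG GCC CTA TGG GGA C",
    "DPro3": "TTA TAC GGC CGT GTC GGG TAA TAC TGG",
    "AMODPRO(+H+S-A)": "TAT GTG CGG CCG AGG AAG GGA GTC TTT GGT C",
    "PRO-3-NE": "ATA ATC GGC CGG GGG TGG CTG TGG GAC",
}


def site_positions(primer: str, site: str) -> list[int]:
    """Return 0-based start positions of `site` in the primer, ignoring spaces."""
    seq = "".join(primer.split()).upper()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


# Map each primer name to the positions of its EagI site(s).
eagi_report = {name: site_positions(seq, EAGI) for name, seq in PRIMERS.items()}

if __name__ == "__main__":
    for name, positions in eagi_report.items():
        print(f"{name}: EagI site at {positions}")
```

In this sketch the common 5' oligonucleotide also contains a full NotI site, which is consistent with the "AAA" (three alanines) junction described for the chimeric envelopes.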
Results and Discussion.
These various DNAs were introduced by transfection into cells of the TELCeB6 line and retroviruses were generated following the usual procedure (see Examples 1 and 2). It was shown that these retroviruses correctly express the chimeric envelope glycoproteins and that the latter permit efficient redirection of binding of the viral particles to the specific cellular targets (results not shown).
The viruses produced with the chimeric envelopes of the various groups were used for infecting cells that only express the amphotropic receptor and not the targeted surface molecule. The results of titration of these viruses are shown in Table S.
These results show that it is possible to mask the functions of the amphotropic envelope by means of fragments from the proline-rich region. In the case of chimeras made with EGF, it is necessary to insert at least five beta-turns to obtain a significant masking effect, and insertion of the whole proline-rich region leads to complete inhibition. For the chimeras made with the scFv anti-MHC-I, three beta-turns are required to obtain a complete masking effect.

Table S. Results of titration.

name (a)       EGFR (b)   MHC-I (b)   sequence (a)
without (c)    6e3        39e2        AAA PHQV
DeltaPro2      7e3        18e1        AAA GPRVPIGPNPAA PHQV
DeltaPro3      1.2e3      <1e1        AAA GPRVPIGPNPVLPDTAA PHQV
DeltaPro4      7e1        <1e1        AAA GPRVPIGPNPVLPDQRLPSSAA PHQV
Pro            <1e1       <1e1        AAA GPRVPIGPNPVLPDQRLPSSPIEIVPAPQPPSPLNTSYPPSTTSTPSTSPTSPSVPQPPPAA PHQV

a: peptide inserted between the targeting binding domain and the 4070A envelope. "AAA" is encoded by the NotI site used for effecting fusion in the chimeric envelope; "PHQV" represents amino acids 4 to 7 of the amphotropic envelope. The beta-turns are underlined in the original.
b: titration on Cear13 cells for the EGFR targeting envelopes (ligand: EGF) and for the envelopes targeting MHC-I (ligand: scFv anti-MHC-I).
c: the ligand is fused directly to the end of the amphotropic SU (at the 4th amino acid), without a spacer peptide.
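To make the masking effect in Table S easier to read, the titers can be expressed as a fold reduction relative to the envelope without a spacer peptide. The sketch below does this for the EGFR column only, using the values as read from the table ("7e1" for DeltaPro4 is read from a partly garbled entry and should be treated as an assumption). The Pro titer is reported only as below 1e1, so only a lower bound on its fold reduction can be given. All names are illustrative.

```python
# EGFR-column titers as read from Table S (infection of cells that express
# only the amphotropic receptor). The DeltaPro4 value is an assumed reading.
CONTROL = "without"
EGFR_TITERS = {
    "without": 6e3,
    "DeltaPro2": 7e3,
    "DeltaPro3": 1.2e3,
    "DeltaPro4": 7e1,
}
DETECTION_LIMIT = 1e1  # "Pro" is reported as < 1e1


def fold_reduction(titers: dict[str, float], control: str) -> dict[str, float]:
    """Return control titer divided by each construct's titer."""
    reference = titers[control]
    return {name: reference / titer
            for name, titer in titers.items() if name != control}


folds = fold_reduction(EGFR_TITERS, CONTROL)
# For a titer below the detection limit, only a minimum fold reduction is known.
pro_min_fold = EGFR_TITERS[CONTROL] / DETECTION_LIMIT

if __name__ == "__main__":
    for name, fold in folds.items():
        print(f"{name}: {fold:.1f}-fold reduction")
    print(f"Pro: more than {pro_min_fold:.0f}-fold reduction")
```

A value below 1 (as for DeltaPro2) means the titer was not reduced relative to the control.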
EXAMPLE 4.
The previous investigations made it possible to delimit the C-terminal ends of the cooperating peptides and to determine the number of turns of type II polyproline helix necessary for obtaining a masking effect and a minimal cooperative effect. In the case of the model of the AMO chimeric envelopes (see above), a minimum of two turns of the helix is sufficient (Valsesia-Wittmann et al., The EMBO Journal 16:1214-1223 (1997)).
However, for chimeric envelopes generated with binding domains other than that for Ram-1 (as in the AMO chimeras) and using the amphotropic envelope as support envelope, the cooperative effect is less marked, on the one hand because masking of the functions of the amphotropic envelope requires four turns of polyproline helix (see Table S) and on the other hand because activation of the functions of the amphotropic envelope is weaker following binding of the viruses to the targeted surface molecules. One possible explanation is that, in the model of the AMO chimeras, apart from the PRO spacer peptide, the binding domain for Ram-1 itself carries important determinants for inducing, in a concerted manner with this PRO peptide, activation of the functions of the ecotropic envelope. The binding domain for Ram-1 is in fact a fragment of retroviral envelope (derived from the amphotropic envelope) which is naturally located immediately upstream of the proline-rich region. In order to determine the presence and the importance of such regions in receptor cooperation, chimeric envelopes were constructed combining a targeting domain with the amphotropic envelope and, inserted between these two polypeptides, various peptides to be tested for their cooperative effect, containing notably the proline-rich region (or a fragment of this region) combined with peptide fragments derived from the N-terminal domain of the amphotropic envelope.
Materials and Methods
DNA fragments coding for the spacer peptides DeltaPro4-beta, DeltaPro4-int and DeltaPro4-vrb were generated by PCR using the env 4070A gene as DNA template, with the oligonucleotide AMODPRO(+H+S-A) (5'-TAT GTG CGG CCG AGG AAG GGA GTC TTT GGT C) at 3' and, at 5', the oligonucleotides UPro-beta (5'-ATG CTG GCG GCC GCG GAT CCT ATT ACC ATG TTC TCC CTG ACC CGG C), UPro-int (5'-ATG CTG GCG GCC GCG AAC CCT CTA GTC CTA GAA TTC ACT GAT GC) and UPRO-vrb (5'-ATG CTG GCG GCC GCG GAA ACC ACC GGA CAG GCT TAC TGG AAG CCC), respectively (see Figs. 36 to 38).
DNA fragments coding for the spacer peptides Pro-beta, Pro-int and Pro-vrb were generated by PCR using the env 4070A gene as DNA template, with the oligonucleotide PRO-3-NE (5'-ATA ATC GGC CGG GGG TGG CTG TGG GAC) at 3' and, at 5', the oligonucleotides UPro-beta (5'-ATG CTG GCG GCC GCG GAT CCT ATT ACC ATG TTC TCC CTG ACC CGG C), UPro-int (5'-ATG CTG GCG GCC GCG AAC CCT CTA GTC CTA GAA TTC ACT GAT GC) and UPRO-vrb (5'-ATG CTG GCG GCC GCG GAA ACC ACC GGA CAG GCT TAC TGG AAG CCC), respectively (see Figs. 39 to 41).
These DNA fragments were digested with the EagI enzyme and inserted either into the FBEASALF plasmid (see above), resulting in the expression vectors for the chimeric envelopes EADeltaPro4-beta, EADeltaPro4-int, EADeltaPro4-vrb, EAPro-beta, EAPro-int and EAPro-vrb, or into the FB34ASALF plasmid (see above), resulting in the expression vectors for the chimeric envelopes 34ADeltaPro4-beta, 34ADeltaPro4-int, 34ADeltaPro4-vrb, 34APro-beta, 34APro-int and 34APro-vrb.
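As a rough cross-check of the 5' oligonucleotides used in this example, the sketch below translates each one with the standard genetic code. It assumes that the reading frame starts at the first base, as the triplet grouping printed in the text suggests; this is an assumption, not something the text states. Under that assumption each primer encodes the three alanines of the NotI junction followed by residues that also appear in the envelope sequences of the sequence listing. Function and variable names are illustrative only.

```python
from itertools import product

# Standard genetic code, built in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}

# 5' oligonucleotides as quoted in the text.
FORWARD_PRIMERS = {
    "UPro-beta": "ATG CTG GCG GCC GCG GAT CCT ATT ACC ATG TTC TCC CTG ACC CGG C",
    "UPro-int": "ATG CTG GCG GCC GCG AAC CCT CTA GTC CTA GAA TTC ACT GAT GC",
    "UPRO-vrb": "ATG CTG GCG GCC GCG GAA ACC ACC GGA CAG GCT TAC TGG AAG CCC",
}


def translate(primer: str) -> str:
    """Translate from the first base; trailing bases short of a codon are dropped."""
    seq = "".join(primer.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


translations = {name: translate(seq) for name, seq in FORWARD_PRIMERS.items()}

if __name__ == "__main__":
    for name, peptide in translations.items():
        print(f"{name}: {peptide}")
```

In each translation the "AAA" following the first two residues corresponds to the NotI site (GCG GCC GCG) used for fusion.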

BIBLIOGRAPHY
1. Battini, J. L., O. Danos, and J. M. Heard. 1995. Receptor-binding domain of murine leukemia virus envelope glycoproteins. J. Virol. 69:713-719.

2. Battini, J. L., P. Rodrigues, R. Müller, O. Danos, and J.-M. Heard. 1996. Receptor-binding properties of a purified fragment of the 4070A amphotropic murine leukemia virus envelope glycoprotein. J. Virol. in press.

3. Bell, G. I., N. M. Fong, M. M. Stempien, M. A. Wormsted, D. Caput, L. Ku, M. S. Urdea, L. B. Rall, and R. Sanchez-Pescador. 1986. Human epidermal growth factor precursor: cDNA sequence, expression in vitro and gene organization. Nucleic Acids Res. 14:8427-8446.

4. Cosset, F.-L., C. Legras, Y. Chebloune, P. Savatier, P. Thoraval, J. L. Thomas, J. Samarut, V. M. Nigon, and G. Verdier. 1990. A new avian leukosis virus-based packaging cell line that uses two separate transcomplementing helper genomes. J. Virol. 64:1070-1078.

5. Cosset, F.-L., C. Legras, J. L. Thomas, R. M. Molina, Y. Chebloune, C. Faure, V. M. Nigon, and G. Verdier. 1991. Improvement of avian leukosis virus (ALV)-based retrovirus vectors by using different cis-acting sequences from ALVs. J. Virol. 65:3388-3394.

6. Cosset, F.-L., F. J. Morling, Y. Takeuchi, R. A. Weiss, M. K. L. Collins, and S. J. Russell. 1995a. Retroviral retargeting by envelopes expressing an N-terminal binding domain. J. Virol. 69:6314-6322.

7. Cosset, F.-L., Y. Takeuchi, J. L. Battini, R. A. Weiss, and M. K. L. Collins. 1995b. High titer packaging systems producing recombinant retroviruses resistant to human serum. J. Virol. 69:7430-7436.

8. Gatignol, A., H. Durand, and G. Tiraby. 1988. Bleomycin resistance conferred by a drug-binding protein. FEBS Letters 230:171-175.

9. Kozak, S. L., D. C. Siess, M. P. Kavanaugh, A. D. Miller, and D. Kabat. 1995. The envelope glycoprotein of an amphotropic murine retrovirus binds specifically to the cellular receptor/phosphate transporter of susceptible species. J. Virol. 69:3433-3440.

10. Marin, M., D. Noël, S. Valsesia-Wittmann, F. Brockly, M. Etienne-Julan, S. J. Russell, F.-L. Cosset, and M. Piechaczyk. 1996. Targeted infection of human cells via MHC class I molecules by MoMuLV-derived viruses displaying single-chain antibody fragment-envelope fusion proteins. in press.

11. Miller, D. G., R. H. Edwards, and A. D. Miller. 1994. Cloning of the cellular receptor for amphotropic murine retroviruses reveals homology to that for gibbon ape leukemia virus. Proc. Natl. Acad. Sci. USA 91:78-82.

12. Nagai, K., and H. C. Thøgersen. 1984. Generation of β-globin by sequence-specific proteolysis of a hybrid protein produced in Escherichia coli. Nature (London) 309:810-812.

13. Nilson, B. H. K., F. J. Morling, F.-L. Cosset, and S. J. Russell. 1996. Targeting of retroviral vectors through protease-substrate interactions. Gene Ther. in press.

14. Ott, D., R. Friedrich, and A. Rein. 1990. Sequence analysis of amphotropic and 10A1 murine leukemia virus: close relationship to mink cell focus forming viruses. J. Virol. 64:757-766.

15. Russell, S. J., R. E. Hawkins, and G. Winter. 1993. Retroviral vectors displaying functional antibody fragments. Nucleic Acids Res. 21:1081-1085.

16. Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular cloning, A laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press, New York.

17. Shinnick, T. M., R. A. Lerner, and J. G. Sutcliffe. 1981. Nucleotide sequence of Moloney murine leukemia virus. Nature 293:543-548.

18. Souyri, M., I. Vigon, J. F. Penciolelli, J. M. Heard, P. Tambourin, and F. Wendling. 1990. A putative truncated cytokine receptor gene transduced by the myeloproliferative leukemia virus immortalizes hematopoietic progenitors. Cell 63:1137-1147.

19. Takeuchi, Y., F. L. Cosset, P. J. Lachmann, H. Okada, R. A. Weiss, and M. K. L. Collins. 1994. Type C retrovirus inactivation by human complement is determined by both the viral genome and producer cell. J. Virol. 68:8001-8007.

20. Valsesia-Wittmann, S., A. Drynda, G. Deleage, M. Aumailley, J.-M. Heard, O. Danos, G. Verdier, and F.-L. Cosset. 1994. Modifications in the binding domain of avian retrovirus envelope protein to redirect the host range of retroviral vectors. J. Virol. 68:4609-4619.

21. Valsesia-Wittmann, S., F. J. Morling, B. H. K. Nilson, Y. Takeuchi, S. J. Russell, and F.-L. Cosset. 1996. Improvement of retroviral retargeting by using amino acid spacers between an additional binding domain and the N terminus of Moloney murine leukemia virus SU. J. Virol. 70:2059-2064.

22. VanZeijl, M., S. V. Johann, E. Cross, J. Cunningham, R. Eddy, T. B. Shows, and B. O'Hara. 1994. An amphotropic virus receptor is a second member of the gibbon ape leukemia virus receptor family. Proc. Natl. Acad. Sci. USA 91:1168-1172.

23. Weiss, R. A. 1993. Cellular receptors and viral glycoproteins involved in retroviral entry, p. 1-108. In J. Levy (ed.), The Retroviridae, vol. 2. Plenum Press.

SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE
(ii) TITLE OF INVENTION: VIRAL PARTICLES WHICH ARE MASKED OR UNMASKED WITH RESPECT TO A CELL RECEPTOR
(iii) NUMBER OF SEQUENCES: 70
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: FETHERSTONHAUGH & CO.
(B) STREET: P.O. BOX 2999, STATION D
(C) CITY: OTTAWA
(D) STATE: ONT
(E) COUNTRY: CANADA
(F) ZIP: K1P 5Y6
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: ASCII (text)
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: CA 2,253,874
(B) FILING DATE: 16-MAY-1997
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: FR 96/06234
(B) FILING DATE: 20-MAY-1996
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: FETHERSTONHAUGH & CO.
(B) REGISTRATION NUMBER:
(C) REFERENCE/DOCKET NUMBER: 11534-16
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (613)-235-4373
(B) TELEFAX: (613)-232-8440

(2) INFORMATION FOR SEQ ID NO.: 1:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 189
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(189)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 1:

Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala

(2) INFORMATION FOR SEQ ID NO.: 2:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 63
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 2:

Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala

(2) INFORMATION FOR SEQ ID NO.: 3:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 144
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(144)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 3:

Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala

(2) INFORMATION FOR SEQ ID NO.: 4:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 48
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 4:

Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala

(2) INFORMATION FOR SEQ ID NO.: 5:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 189
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(189)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 5:

Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala

(2) INFORMATION FOR SEQ ID NO.: 6:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 63
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 6:

Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala

(2) INFORMATION FOR SEQ ID NO.: 7:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 312
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(312)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 7:

Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala

(2) INFORMATION FOR SEQ ID NO.: 8:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 104
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 8:

Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala

(2) INFORMATION FOR SEQ ID NO.: 9:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 60
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(60)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 9:

Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln

(2) INFORMATION FOR SEQ ID NO.: 10:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 20
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 10:

Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln

(2) INFORMATION FOR SEQ ID NO.: 11:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 105
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(105)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 11:

Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln

(2) INFORMATION FOR SEQ ID NO.: 12:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 35
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 12:

Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln

(2) INFORMATION FOR SEQ ID NO.: 13:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 183
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(183)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 13:

Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln

(2) INFORMATION FOR SEQ ID NO.: 14:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 61
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 14:

Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln

(2) INFORMATION FOR SEQ ID NO.: 15:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2780
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2778)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 15:

Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro 4 0 Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Aep Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp 
Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile 2 0 Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 16:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 926
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 16:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala 50 Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser 2 0 Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr 
Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lye Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val 4 0 Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile 2 0 Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 17:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2642
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2640)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 17:

Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro 5 0 Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys 2 0 Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro 4 0 Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser AAG

Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lye Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val 3 0 Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 18:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 880
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 18:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro
Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 19:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2792
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2790)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 19:

Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly
Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 20:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 930
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 20:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr
Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 21:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2700
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2697)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 21:

Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro 
Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 22:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 899
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 22:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln
Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 23:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2322
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2319)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 23:

Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys
Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met

Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 24:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 773
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 24:

Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys
Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 25:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2367
(B) TYPE: nucleic acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2364)
(C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 25:

Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val

Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser
Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 26:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 788
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 26:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr
Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 27:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2490 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2487) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 27:

Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lye Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys 
Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 28:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 829 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 28:
2 0 Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser 2 0 Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln 
Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 29:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2289 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:

(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2286) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 29:

AAA GAT AAC
CCC AAG

Met Ala ArgSer ThrLeuSer LysProPro GlnAspLys IleAsnPro Trp Lys ProLeu IleValMet GlyValLeu LeuGlyVal GlyMetAla Glu Ser AlaAla GlnProAla MetAlaAsn SerAspSer GluCysPro Leu Ser HisAsp GlyTyrCys LeuHisAsp GlyValCys MetTyrIle Glu Ala LeuAsp LysTyrAla CysAanCys ValValGly TyrIleGly Glu Arg CysGln TyrArgAsp LeuLysTrp TrpGluLeu ArgGlyPro Arg Val ProIle GlyProAsn ProValLeu AlaAspGln GlnProLeu Ser Lys ProLys ProValLys SerProSer ValThrLys ProProSer Gly Thr ProLeu SerProThr GlnLeuPro ProAlaAla AlaProHis Gln Val PheAsn ValThrTrp ArgValThr AsnLeuMet ThrGlyArg Thr Ala AsnAla ThrSerLeu LeuGlyThr ValGlnAsp AlaPhePro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp GAC CTA ATC TCC CTT AAG CGC GGT AAC ACC CCC TGG GAC ACG GGA TG~ 816 Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu 3 0 Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro 5 0 Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr 
Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 30:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 762 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 30:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser 50 Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Aen Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu 4 0 Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln 
Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 31:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2334 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2331) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 31:

Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu 3 0 Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly 
Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 32:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 777 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 32:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg 100 ~ 105 110 Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly 2 0 Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly 
Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 33:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2457 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2454) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 33:

Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu AAT GGG

Thr Arg GlnVal LeuAsnVal GlyProArg ValProIle GlyProAsn Pro Val LeuPro AspGlnArg LeuProSer SerProIle GluIleVal Pro Ala ProGln ProProSer ProLeuAsn ThrSerTyr ProProSer Thr Thr SerThr ProSerThr SerProThr SerProSer ValProGln Pro Pro ProAla AlaAlaPro HisGlnVal PheAsnVal ThrTrpArg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly 
Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 34:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 818 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 34:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro 4 0 Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr 
Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 35:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2229 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2226) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 35:

Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser 4 0 Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser 4 0 Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala 4 0 Cys 
Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro
(2) INFORMATION FOR SEQ ID NO.: 36:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 742 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 36:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser
Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 37:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2274 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2271) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 37:

Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg
Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 38:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 757 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 38:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser
Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 39:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2352 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2349) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 39:

Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr
Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 40:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 783 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 40:
Met Ala Arg Ser Thr Leu Ser Lys Pro Leu Lys Asn Lys Val Asn Pro Arg Gly Pro Leu Ile Pro Leu Ile Leu Leu Met Leu Arg Gly Val Ser Thr Ala Ser Pro Gly Ser Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr
Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 41:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2196 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2193) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 41:

Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu
Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 42:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 731 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 42:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val
Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 43:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2241 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2238) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 43:

AAA GAT AAC
AAG

Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu
Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 44:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 746 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 44:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr
Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 45:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2319 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE

(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2316) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 45:

Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys Thr His 
Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 46:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 772 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 46:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Ala Ala Gln Pro Ala Met Ala Asn Ser Asp Ser Glu Cys Pro Leu Ser His Asp Gly Tyr Cys Leu His Asp Gly Val Cys Met Tyr Ile Glu Ala Leu Asp Lys Tyr Ala Cys Asn Cys Val Val Gly Tyr Ile Gly Glu Arg Cys Gln Tyr Arg Asp Leu Lys Trp Trp Glu Leu Arg Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Ala Ala Ala Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Gly Thr Gly Asp Arg Leu Leu Ala Leu Val Lys Gly Ala Tyr Gln Ala Leu Asn Leu Thr Asn Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ser Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Val Gly Thr Tyr Thr Asn His Ser Thr Ala Pro Ala Asn Cys Thr Ala Thr Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Met Gly Ala Val Pro Lys
Thr His Gln Ala Leu Cys Asn Thr Thr Gln Ser Ala Gly Ser Gly Ser Tyr Tyr Leu Ala Ala Pro Ala Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Leu Ser Thr Thr Val Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Ile Tyr His Ser Pro Asp Tyr Met Tyr Gly Gln Leu Glu Gln Arg Thr Lys Tyr Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Ile Lys Thr Gln Gln Phe Glu Gln Leu His Ala Ala Ile Gln Thr Asp Leu Asn Glu Val Glu Lys Ser Ile Thr Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Thr Gly Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Leu Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Ile Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 47:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2649 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:

(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2646) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 47:

AAA GAT
AAG

Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg
Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly ATA CTA
GGA ATG
ACA GCC
GGG ACT
ACT
ACT

Ile Ala AlaGly IleGly GlyThrThr Ala Met Ala Gln Thr Leu Thr GCC GAT GAG

Gln Phe GlnGln LeuGln AlaValGln Asp Leu Arg Val Ala Aap Glu CTA ACT TCT

Glu Lys SerIle SerAsn GluLysSer Leu Ser Leu Glu Leu Thr Ser AGG TTA AAA

Val Val LeuGln AsnArg GlyLeuAsp Leu Phe Leu Glu Arg Leu Lys CTA TGC GCG

Gly Gly LeuCys AlaAla LysGluGlu Cys Phe Tyr Asp Leu Cys Ala GAC TTG AGG

His Thr GlyLeu ValArg SerMetAla Lys Arg Glu Leu Asp Leu Arg TTT GGA GAG

Asn Gln ArgGln LysLeu GluSerThr Gln Trp Phe Gly Phe Gly Glu Leu Phe AsnArg SerPro PheThrThr Leu Ser Thr Met Trp Ile Ile CTA TTC TGC

Gly Pro LeuIle ValLeu MetIleLeu Leu Gly Pro Ile Leu Phe Cys TTT ATC GTC

Leu Asn ArgLeu ValGln ValLysAsp Arg Ser Val Gln Phe Ile Val CAA AAA GAA

Ala Leu ValLeu ThrGln TyrHisGln Leu Pro Leu Tyr Gln Lys Glu Glu Pro (2) INFORMATION FOR SEQ ID NO.: 48:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 882 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 48:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser
Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 49:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 54 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(54) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 49:

Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 50:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 18 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 50:
Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 51:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2649 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2646) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 51:

Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys
Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 52:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 882 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 52:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys
Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 53:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 54 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(54) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 53:

Ala Ala Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 54:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 18 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 54:
Ala Ala Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 55:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 2679 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(2676) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 55:

Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val
Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 56:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 892 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 56:
Met Ala Arg Ser Thr Leu Ser Lys Pro Pro Gln Asp Lys Ile Asn Pro Trp Lys Pro Leu Ile Val Met Gly Val Leu Leu Gly Val Gly Met Ala Glu Ser Pro His Gln Val Phe Asn Val Thr Trp Arg Val Thr Asn Leu Met Thr Gly Arg Thr Ala Asn Ala Thr Ser Leu Leu Gly Thr Val Gln Asp Ala Phe Pro Lys Leu Tyr Phe Asp Leu Cys Asp Leu Val Gly Glu Glu Trp Asp Pro Ser Asp Gln Glu Pro Tyr Val Gly Tyr Gly Cys Lys Tyr Pro Ala Gly Arg Gln Arg Thr Arg Thr Phe Asp Phe Tyr Val Cys Pro Gly His Thr Val Lys Ser Gly Cys Gly Gly Pro Gly Glu Gly Tyr Cys Gly Lys Trp Gly Cys Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala Pro His Gln Val Tyr Asn Ile Thr Trp Glu Val Thr Asn Gly Asp Arg Glu Thr Val Trp Ala Thr Ser Gly Asn His Pro Leu Trp Thr Trp Trp Pro Asp Leu Thr Pro Asp Leu Cys Met Leu Ala His His Gly Pro Ser Tyr Trp Gly Leu Glu Tyr Gln Ser Pro Phe Ser Ser Pro Pro Gly Pro Pro Cys Cys Ser Gly Gly Ser Ser Pro Gly Cys Ser Arg Asp Cys Glu Glu Pro Leu Thr Ser Leu Thr Pro Arg Cys Asn Thr Ala Trp Asn Arg Leu Lys Leu Asp Gln Thr Thr His Lys Ser Asn Glu Gly Phe Tyr Val Cys Pro Gly Pro His Arg Pro Arg Glu Ser Lys Ser Cys Gly Gly Pro Asp Ser Phe Tyr Cys Ala Tyr Trp Gly Cys Glu Thr Thr Gly Arg Ala Tyr Trp Lys Pro Ser Ser Ser Trp Asp Phe Ile Thr Val Asn Asn Asn Leu Thr Ser Asp Gln Ala Val Gln Val Cys Lys Asp Asn Lys Trp Cys Asn Pro Leu Val Ile Arg Phe Thr Asp Ala Gly Arg Arg Val Thr Ser Trp Thr Thr Gly His Tyr Trp Gly Leu Arg Leu Tyr Val Ser Gly Gln Asp Pro Gly Leu Thr Phe Gly Ile Arg Leu Arg Tyr Gln Asn Leu Gly Pro Arg Val Pro Ile Gly Pro Asn
Pro Val Leu Ala Asp Gln Gln Pro Leu Ser Lys Pro Lys Pro Val Lys Ser Pro Ser Val Thr Lys Pro Pro Ser Gly Thr Pro Leu Ser Pro Thr Gln Leu Pro Pro Ala Gly Thr Glu Asn Arg Leu Leu Asn Leu Val Asp Gly Ala Tyr Gln Ala Leu Asn Leu Thr Ser Pro Asp Lys Thr Gln Glu Cys Trp Leu Cys Leu Val Ala Gly Pro Pro Tyr Tyr Glu Gly Val Ala Val Leu Gly Thr Tyr Ser Asn His Thr Ser Ala Pro Ala Asn Cys Ser Val Ala Ser Gln His Lys Leu Thr Leu Ser Glu Val Thr Gly Gln Gly Leu Cys Ile Gly Ala Val Pro Lys Thr His Gln Ala Leu Cys Asn Thr Thr Gln Thr Ser Ser Arg Gly Ser Tyr Tyr Leu Val Ala Pro Thr Gly Thr Met Trp Ala Cys Ser Thr Gly Leu Thr Pro Cys Ile Ser Thr Thr Ile Leu Asn Leu Thr Thr Asp Tyr Cys Val Leu Val Glu Leu Trp Pro Arg Val Thr Tyr His Ser Pro Ser Tyr Val Tyr Gly Leu Phe Glu Arg Ser Asn Arg His Lys Arg Glu Pro Val Ser Leu Thr Leu Ala Leu Leu Leu Gly Gly Leu Thr Met Gly Gly Ile Ala Ala Gly Ile Gly Thr Gly Thr Thr Ala Leu Met Ala Thr Gln Gln Phe Gln Gln Leu Gln Ala Ala Val Gln Asp Asp Leu Arg Glu Val Glu Lys Ser Ile Ser Asn Leu Glu Lys Ser Leu Thr Ser Leu Ser Glu Val Val Leu Gln Asn Arg Arg Gly Leu Asp Leu Leu Phe Leu Lys Glu Gly Gly Leu Cys Ala Ala Leu Lys Glu Glu Cys Cys Phe Tyr Ala Asp His Thr Gly Leu Val Arg Asp Ser Met Ala Lys Leu Arg Glu Arg Leu Asn Gln Arg Gln Lys Leu Phe Glu Ser Thr Gln Gly Trp Phe Glu Gly Leu Phe Asn Arg Ser Pro Trp Phe Thr Thr Leu Ile Ser Thr Ile Met Gly Pro Leu Ile Val Leu Leu Met Ile Leu Leu Phe Gly Pro Cys Ile Leu Asn Arg Leu Val Gln Phe Val Lys Asp Arg Ile Ser Val Val Gln Ala Leu Val Leu Thr Gln Gln Tyr His Gln Leu Lys Pro Leu Glu Tyr Glu Pro (2) INFORMATION FOR SEQ ID NO.: 57:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 84 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(84) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 57:

Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 58:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 28 (B) TYPE: amino acid (C) STRANDEDNESS:

(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 58:
Ala Ala Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Gly Ala Pro Gly Val Ala Ala (2) INFORMATION FOR SEQ ID NO.: 59:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 118 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(117) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 59:

Ala Ala Ala Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 60:
(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 39 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 60:
Ala Ala Ala Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 61:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 211 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(210) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 61:

Ala Ala Ala Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 62:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 70 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 62:
Ala Ala Ala Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 63:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 382 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(381) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 63:

Ala Ala Ala Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 64:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 127 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 64:
Ala Ala Ala Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Ala (2) INFORMATION FOR SEQ ID NO.: 65:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 238 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(237) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 65:

Ala Ala Ala Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 66:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 79 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 66:
Ala Ala Ala Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 67:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 331 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(330) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 67:

Ala Ala Ala Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 68:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 110 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 68:

Ala Ala Ala Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 69:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 502 (B) TYPE: nucleic acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (ix) FEATURE
(A) NAME/KEY: CDS
(B) LOCATION: (1)..(501) (C) OTHER INFORMATION: Description of Unknown Organism: UNKNOWN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 69:

Ala Ala Ala Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala (2) INFORMATION FOR SEQ ID NO.: 70:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 167 (B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY:
(ii) MOLECULE TYPE: polypeptide (vi) ORIGINAL SOURCE:
(A) ORGANISM: MLV-related retrovirus (xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 70:
Ala Ala Ala Glu Thr Thr Gly Gln Ala Tyr Trp Lys Pro Thr Ser Ser Trp Asp Leu Ile Ser Leu Lys Arg Gly Asn Thr Pro Trp Asp Thr Gly Cys Ser Lys Val Ala Cys Gly Pro Cys Tyr Asp Leu Ser Lys Val Ser Asn Ser Phe Gln Gly Ala Thr Arg Gly Gly Arg Cys Asn Pro Leu Val Leu Glu Phe Thr Asp Ala Gly Lys Lys Ala Asn Trp Asp Gly Pro Lys Ser Trp Gly Leu Arg Leu Tyr Arg Thr Gly Thr Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp Gln Arg Leu Pro Ser Ser Pro Ile Glu Ile Val Pro Ala Pro Gln Pro Pro Ser Pro Leu Asn Thr Ser Tyr Pro Pro Ser Thr Thr Ser Thr Pro Ser Thr Ser Pro Thr Ser Pro Ser Val Pro Gln Pro Pro Pro Ala
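
The listing pairs each DNA entry with its translation, so the stated lengths can be cross-checked. For example, a CDS located at (1)..(117) spans 117 nucleotides and therefore 117 / 3 = 39 codons, which is the LENGTH stated for the polypeptide of SEQ ID NO.: 60. The Python sketch below illustrates that check. It is not part of the application: the helper names are hypothetical, and only the residue string is copied from SEQ ID NO.: 60 above.

```python
# Illustrative consistency check for one sequence-listing entry.
# The helper names are hypothetical; the residue string is copied from
# SEQ ID NO.: 60 in the listing above.

SEQ_ID_60 = (
    "Ala Ala Ala Asp Pro Ile Thr Met Phe Ser Leu Thr Arg Gln Val Leu "
    "Asn Val Gly Pro Arg Val Pro Ile Gly Pro Asn Pro Val Leu Pro Asp "
    "Gln Arg Leu Pro Ser Ser Ala"
)


def residues(three_letter_sequence):
    """Split a space-separated three-letter sequence into a list of residues."""
    return three_letter_sequence.split()


def cds_codon_count(start, end):
    """Return the number of codons in a CDS given 1-based inclusive coordinates."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("CDS span is not a multiple of three nucleotides")
    return span // 3


if __name__ == "__main__":
    # Residue count of the polypeptide and codon count of the CDS (1)..(117).
    print(len(residues(SEQ_ID_60)), cds_codon_count(1, 117))
```

The same two helpers can be applied to the other DNA/polypeptide pairs in the listing by substituting the residue string and the CDS coordinates.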

Claims

1. Use of a peptide for transferring genes into a eukaryotic target cell, this peptide containing from about 10 to about 200, especially from about 15 to about 150 amino acids, and advantageously about 20 amino acids, in which at least 30% of the amino acids consist of proline residues, these proline residues being arranged regularly so as to induce turnings of the polypeptide chain at about 180° ("β-turn" or "reverse-turn"), these turnings being regularly spaced and forming a polyproline β-turn helix, in a polypeptide construction containing, on the N-terminal side (upstream) of the said peptide, an N-terminal (upstream) protein domain capable of recognizing a targeted surface molecule or an antigen expressed on a cell surface, especially a suitable receptor (targeted receptor) located on the said eukaryotic cell, and on the C-terminal side (downstream) of the said peptide, a C-terminal (downstream) protein domain capable of recognizing a suitable receptor (auxiliary receptor) located on the aforesaid eukaryotic cell, this peptide being capable of facilitating or inhibiting interaction between the C-terminal (downstream) protein domain and the auxiliary receptor, inhibition of this interaction occurring for as long as the N-terminal (upstream) protein domain has not interacted with the targeted receptor and promotion of interaction between the C-terminal (downstream) protein domain and the auxiliary receptor occurring when the N-terminal (upstream) protein domain has interacted with the targeted receptor.

2. Use of a peptide according to claim 1, in the construction of a glycoprotein with targeting and gene-fusion activity, essentially intact, carried by a viral or non-viral recombinant gene-transfer vector capable of infecting a eukaryotic cell, and this eukaryotic cell has a targeted receptor and an auxiliary receptor permitting facilitation of entry of the aforesaid viral or non-viral vector into the eukaryotic cell, the aforesaid glycoprotein comprising:
- the aforesaid peptide,
- a protein domain on the N-terminal side (upstream) of the said peptide, capable of interacting with the said targeted receptor, this protein domain permitting specific binding of the said gene-transfer vector and
- a protein domain on the C-terminal side (downstream) of the said peptide, capable of interacting with the said auxiliary receptor, this interaction performing the role of auxiliary mechanism of entry of the said gene-transfer vector into the eukaryotic cell,
the process of cell entry of the viral or non-viral recombinant vector into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to take place when the N-terminal (upstream) protein domain has recognized and bound the viral or non-viral recombinant vector with the targeted receptor of the eukaryotic cell, leading, through the agency of the aforesaid peptide, to a mechanism of "unmasking" or of accessibility of the auxiliary receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the aforesaid gene-transfer vector and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced a mechanism of "masking" or of non-accessibility, through the agency of the aforesaid peptide, of the auxiliary receptor with respect to the C-terminal (downstream) protein domain.

3. Use of a peptide according to one of the claims 1 to 2, in the construction of an essentially intact (retro)viral envelope glycoprotein, carried by a recombinant (retro)viral particle capable of infecting a eukaryotic cell, the said envelope glycoprotein being advantageously of polymeric form, and especially of trimeric form, each monomer of the polymeric form being in itself of heterodimer form, the said eukaryotic cell containing a targeted receptor and an auxiliary receptor permitting facilitation of entry of the said (retro)viral particle ((retro)viral receptor) into the eukaryotic cell, the envelope glycoprotein comprising:
- the aforesaid peptide,
- a protein domain on the N-terminal side (upstream) of the said peptide, capable of interacting with the said targeted receptor, this interaction permitting specific binding of the (retro)viral particle and
- a protein domain on the C-terminal side (downstream) of the said peptide, capable of interacting with the said (retro)viral receptor, this interaction performing the role of auxiliary mechanism of entry of the (retro)viral particle into the eukaryotic cell,
the process of cell entry of the recombinant (retro)viral particle into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to take place when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the recombinant (retro)viral particle, leading, through the agency of the aforesaid peptide, to a mechanism of "unmasking" or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the recombinant viral particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced a mechanism of "masking" or of non-accessibility, through the agency of the aforesaid peptide, of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain.

4. Use of a peptide according to any one of the claims 1 to 3, characterized in that the N-terminal (upstream) protein domain is chosen from the following polypeptides:
- single-chain antibodies recognizing cell surface molecules,
- any ligand for a cell surface molecule, especially polypeptide hormones, cytokines, growth factors.

5. Use of a peptide according to any one of the claims 1 to 4, characterized in that the C-terminal (downstream) protein domain corresponds to a (retro)viral envelope glycoprotein, essentially intact, containing the natural binding domain, the functions of fusion and of attachment of the wild-type envelope glycoprotein from which the envelope glycoprotein carried by the recombinant (retro)viral particle is derived.

6. Use of a peptide according to any one of the claims 1 to 5, characterized in that the peptide comes from the envelope glycoprotein of type C retroviruses, and in that the virus is preferably chosen from: the ecotropic MLV virus, the amphotropic MLV virus, the xenotropic MLV virus, the MLV MCF virus, the MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and especially in that the peptide is chosen from those containing or that are constituted of one of the following sequences: PRO(4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb.

7. Use of a peptide according to any one of the claims 1 to 5, characterized in that the peptide is derived or adapted from bovine elastin and is chosen from those containing or that are constituted of one of the following sequences: EL3, EL3-V, EL5.

8. Peptide sequences chosen from those containing or constituted of one of the following sequences:
- PRO(4070A), PRO(MoMLV), PROβ, PRO+, ΔPRO, ΔPROβ, ΔPRO+,
- MOAPRO, MOAΔPRO,
- EMOPRO, EMOPROβ, EMOPRO+, EAPRO, EAPROβ, EAPRO+, EMOΔPRO, EMOΔPROβ, EMOΔPRO+, EAΔPRO, EAΔPROβ, EAΔPRO+, EL3, EL3-V, EL5, AMOEL3, AMOEL3-V, AMOEL5, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb.

9. Peptide sequence containing a peptide of about 10 to about 200, especially about 15 to about 150 amino acids, and preferably about 20, in which at least 30% of the amino acids consist of proline residues, these proline residues being arranged regularly so as to induce turnings of the polypeptide chain at about 180° ("β-turn" or "reverse-turn"), these turnings being regularly spaced and forming a polyproline β-turn helix,
- an N-terminal protein domain (upstream) of the said peptide, capable of reacting with a suitable receptor (targeted receptor) located on a eukaryotic cell, this protein domain permitting specific binding of a recombinant (retro)viral particle containing the said N-terminal protein domain and
- a C-terminal protein domain (downstream) of the said peptide, capable of interacting with a suitable auxiliary (retro)viral receptor ((retro)viral receptor) located on the said eukaryotic cell, this interaction performing the role of auxiliary mechanism of entry of the (retro)viral particle into the said eukaryotic cell,
the process of cell entry of the said recombinant (retro)viral particle into the said eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to take place when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the said recombinant (retro)viral particle, leading, through the agency of the aforesaid peptide, to a mechanism of unmasking or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the recombinant viral particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced a mechanism of masking or of non-accessibility, through the agency of the aforesaid peptide, of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain.

10. Recombinant (retro)viral particle capable of infecting a eukaryotic cell, this cell possessing a targeted receptor and an auxiliary receptor of the aforesaid (retro)viral particle, comprising a substantially intact envelope glycoprotein, especially of polymeric form and advantageously of trimeric form, each monomer of the polymeric form being advantageously itself of heterodimer form, containing a peptide of about 10 to about 200, especially about 15 to about 150 amino acids, and preferably about 20, in which at least 30% of the amino acids are constituted of proline residues, these proline residues being arranged regularly so as to induce turnings of the polypeptide chain at about 180° ("β-turn" or "reverse-turn"), these turnings being regularly spaced and forming a polyproline β-turn helix,
- a protein domain on the N-terminal side (upstream) of the aforesaid peptide, capable of interacting with the aforesaid targeted receptor, this peptide domain permitting specific binding of the (retro)viral particle and
- a protein domain on the C-terminal side (downstream) of the aforesaid peptide, capable of interacting with the aforesaid (retro)viral receptor, this interaction performing the role of auxiliary mechanism of entry of the (retro)viral particle into the eukaryotic cell,
the process of cell entry of the recombinant (retro)viral particle into the eukaryotic cell by means of the C-terminal (downstream) protein domain only being able to take place when the N-terminal (upstream) protein domain has recognized and bound the targeted receptor of the eukaryotic cell with the recombinant (retro)viral particle, leading through the agency of the aforesaid peptide to a mechanism of unmasking or of accessibility of the (retro)viral receptor with respect to the C-terminal (downstream) protein domain, and, in the case when recognition does not occur between the recombinant viral particle and the targeted receptor of the eukaryotic cell by means of the N-terminal (upstream) protein domain, there is produced a mechanism of masking or of non-accessibility, through the agency of the aforesaid peptide, of the retroviral receptor with respect to the C-terminal (downstream) protein domain.

11. Recombinant (retro)viral particle according to claim 10, characterized in that the N-terminal (upstream) protein domain is chosen from the following peptides:
- single-chain antibodies recognizing cell surface molecules,
- any ligand for a cell surface molecule, notably polypeptide hormones, cytokines, growth factors.

12. Recombinant (retro)viral particle according to one of the claims 10 or 11, characterized in that the C-terminal (downstream) protein domain corresponds to a polypeptide of (retro)viral origin possessing functions of binding, of fusion and of attachment of the wild-type envelope glycoprotein from which the envelope glycoprotein carried by the recombinant (retro)viral particle is derived, and can originate from the natural domains possessing the functions of binding, of fusion and of attachment of the envelope glycoproteins from retroviruses MLV-A, GALV, FeLV B, or from viruses such as adenoviruses, herpesviruses, AAV (Adeno Associated Virus), or more generally from viral glycoproteins from viruses of eukaryotic origin, especially orthomyxoviruses (such as influenza viruses) or paramyxoviruses (such as SV5).

13. Recombinant (retro)viral particle according to any one of the claims 10 to 12, characterized in that the peptide originates from the envelope glycoprotein of type C retroviruses, and in that the peptide originates advantageously from a virus chosen from: the ecotropic MLV virus, the amphotropic MLV virus, the xenotropic MLV virus, the MLV MCF virus, the MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and especially in that the peptide is chosen from those containing or constituted of one of the following sequences: PRO(4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb.

14. Recombinant (retro)viral particle according to any one of the claims 10 to 13, characterized in that:
- the peptide originates from the envelope glycoprotein of type C retroviruses, and in that the virus is preferably chosen from: the ecotropic MLV virus, the amphotropic MLV virus, the xenotropic MLV virus, the MLV MCF virus, the MLV 10A1 virus, GALV (Gibbon Ape Leukemia Virus), SSAV (Simian Sarcoma Associated Virus), FeLV A, FeLV B, FeLV C (FeLV: Feline Leukemia Virus), and especially in that the peptide is chosen from those containing or constituted of one of the following sequences: PRO(4070A), PRO(MoMLV), ΔPRO, PRO+, ΔPRO+, PROβ, ΔPROβ, ΔPRO4-β, ΔPRO4-int, ΔPRO4-vrb, PROβ, PRO-int, PRO-vrb,
- the N-terminal (upstream) protein domain is chosen from the following peptides:
* single-chain antibodies recognizing cell surface molecules,
* any ligand for a cell surface molecule, notably polypeptide hormones, cytokines, growth factors,
- the C-terminal protein domain corresponds to a polypeptide of (retro)viral origin possessing the functions of binding, of fusion and of attachment of the wild-type envelope glycoprotein from which the envelope glycoprotein carried by the recombinant (retro)viral particle is derived, and can originate from the natural domains possessing the functions of binding, of fusion and of attachment of the envelope glycoproteins from the retroviruses MLV-A, GALV, FeLV B, or from viruses such as adenoviruses, herpesviruses, AAV (Adeno Associated Virus), or more generally viral glycoproteins from viruses of eukaryotic origin, especially orthomyxoviruses (such as influenza viruses) or paramyxoviruses (such as SV5).

15. Recombinant (retro)viral particle according to one of the claims 10 to 14, characterized in that the 5' end of the nucleotide sequence coding for the N-terminal (upstream) protein domain is contiguous with the 3' end of the nucleotide sequence coding for the signal peptide, the 3' end of the nucleotide sequence coding for the N-terminal (upstream) protein domain is contiguous with the 5' end of the nucleotide sequence coding for the peptide, the 3' end of the nucleotide sequence coding for the peptide is contiguous with the 5' end of the nucleotide sequence coding for the C-terminal (downstream) protein domain.

16. Nucleic acid coding for a peptide or for a recombinant particle according to any one of the claims 10 to 15.

17. Method of selective transfer in vitro or ex vivo of a nucleic acid into target eukaryotic cells present among other non-target cells, comprising the administration, to the target and non-target cells, of a recombinant (retro)viral particle according to one of the claims 10 to 15, containing the nucleic acid to be transferred.

18. Pharmaceutical composition containing as active substance a (retro)viral particle according to any one of the claims 10 to 15, and also containing a gene to be transferred, in combination with a physiologically suitable pharmaceutical vehicle.
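
The numerical composition criteria recited for the peptide in claims 1, 9 and 10 (about 10 to about 200 amino acids, of which at least 30% are proline) can be expressed as a simple screen. The Python sketch below is illustrative only and forms no part of the claims: the function name and the example sequence are hypothetical, "about" is treated as a hard bound purely as a simplifying assumption, and neither the regular spacing of the prolines nor the resulting turn structure is assessed.

```python
# Illustrative screen for the numerical composition criteria only.
# Assumptions (not from the application): the function name, the hard
# length bounds standing in for "about 10 to about 200", and the
# hypothetical example peptide defined below.

def meets_composition_criteria(three_letter_sequence,
                               min_len=10,
                               max_len=200,
                               min_pro_fraction=0.30):
    """Return True if length and proline fraction fall within the bounds."""
    residues = three_letter_sequence.split()
    if not residues:
        return False
    pro_fraction = residues.count("Pro") / len(residues)
    return min_len <= len(residues) <= max_len and pro_fraction >= min_pro_fraction


# Hypothetical 20-residue peptide with 10 prolines (50%); not a sequence
# from the listing.
HYPOTHETICAL_PEPTIDE = " ".join(["Pro", "Gly", "Pro", "Ala"] * 5)

if __name__ == "__main__":
    print(meets_composition_criteria(HYPOTHETICAL_PEPTIDE))
```

A screen of this kind only narrows the candidates; whether a given peptide actually masks or unmasks the auxiliary receptor is a functional property that a composition check cannot establish.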