CA2225824A1

CA2225824A1 - Breast specific genes and proteins

Info

Publication number: CA2225824A1
Application number: CA002225824A
Authority: CA
Inventors: Hongjun Ji; Craig A. Rosen
Original assignee: Individual
Current assignee: Human Genome Sciences Inc
Priority date: 1995-06-30
Filing date: 1995-06-30
Publication date: 1997-01-23
Also published as: EP0851869A4; WO1997002280A1; JPH11509093A; AU3092995A; EP0851869A1

Abstract

Human breast specific gene polypeptides and DNA (RNA) encoding such polypeptides and a procedure for producing such polypeptides by recombinant techniques are disclosed. Also disclosed are methods for utilizing such polynucleotides or polypeptides as a diagnostic marker for breast cancer and as an agent to determine if breast cancer has metastasized. Also disclosed are antibodies specific to the breast specific gene polypeptides which may be used to target cancer cells and be used as part of a breast cancer vaccine. Methods of screening for antagonists for the polypeptide and therapeutic uses thereof are also disclosed.

Description

CA 02225824 1997-12-24 WO 97/02280 PCT/US95/08295

BREAST SPECIFIC GENES AND PROTEINS

This invention relates to newly identified polynucleotides, polypeptides encoded by such polynucleotides, and the use of such polynucleotides and polypeptides for detecting disorders of the breast, particularly the presence of breast cancer and breast cancer metastases. The present invention further relates to inhibiting the production and function of the polypeptides of the present invention. The twenty breast specific genes of the present invention are sometimes hereinafter referred to as "BSG1", "BSG2" etc.
The mammary gland is subject to a variety of disorders that should be readily detectable. Detection may be accomplished by inspection which usually consists of palpation. Unfortunately, so few periodic self-examinations are made that many breast masses are discovered only by accidental palpation. Aspiration of suspected cysts with a fine-gauge needle is another fairly common diagnostic practice. Mammography or xeroradiography (soft-tissue x-ray) of the breast is yet another. A biopsy of a lesion or suspected area is an extreme method of diagnostic test.
There are many types of tumors and cysts which affect the mammary gland. Fibroadenoma is the most common benign breast tumor. As a pathological entity, it ranks third behind cystic disease and carcinoma, respectively. These tumors are seen most frequently in young people and are usually readily recognized because they feel encapsulated.
Fibrocystic disease, a benign condition, is the most common disease of the female breast, occurring in about 20% of pre-menopausal women. Lipomas of the breast are also common and they are benign in nature. Carcinoma of the breast is the most common malignant condition among women and carries with it the highest fatality rate of all cancers affecting this sex. At some time during her life, one of every 15 women in the USA will develop cancer of the breast. Its reported annual incidence is 70 per 100,000 females in the population in 1947, rising to 72.5 in 1969 for whites, and rising from 47.8 to 60.1 for blacks. The annual mortality rate from 1930 to the present has remained fairly constant, at approximately 23 per 100,000 female population. Breast cancer is rare in men, but when it does occur, it is usually not recognized until late, and thus the results of treatment are poor. In women, carcinoma of the breast is rarely seen before age 30 and the incidence rises rapidly after menopause. For this reason, post-menopausal breast masses should be considered cancer until proved otherwise.
In accordance with an aspect of the present invention, there are provided nucleic acid probes comprising nucleic acid molecules of sufficient length to specifically hybridize to the RNA transcribed from the human breast specific genes of the present invention or to DNA corresponding to such RNA.
In accordance with another aspect of the present invention there is provided a method of and products for diagnosing breast cancer formation and breast cancer metastases by detecting the presence of RNA transcribed from the human breast specific genes of the present invention or DNA corresponding to such RNA in a sample derived from a host.

In accordance with yet another aspect of the present invention, there is provided a method of and products for diagnosing breast cancer formation and breast cancer metastases by detecting an altered level of a polypeptide corresponding to the breast specific genes of the present invention in a sample derived from a host, whereby an elevated level of the polypeptide indicates a breast cancer diagnosis.
In accordance with another aspect of the present invention, there are provided isolated polynucleotides encoding human breast specific polypeptides, including mRNAs, DNAs, cDNAs, genomic DNAs, as well as antisense analogs and biologically active and diagnostically or therapeutically useful fragments thereof.
In accordance with still another aspect of the present invention there are provided human breast specific genes which include polynucleotides as set forth in the sequence listing.
In accordance with a further aspect of the present invention, there are provided novel polypeptides encoded by the polynucleotides, as well as biologically active and diagnostically or therapeutically useful fragments, analogs and derivatives thereof.
In accordance with yet a further aspect of the present invention, there is provided a process for producing such polypeptides by recombinant techniques comprising culturing recombinant prokaryotic and/or eukaryotic host cells, containing a polynucleotide of the present invention, under conditions promoting expression of said proteins and subsequent recovery of said proteins.
In accordance with yet a further aspect of the present invention, there are provided antibodies specific to such polypeptides, which may be employed to detect breast cancer cells or breast cancer metastasis.

In accordance with another aspect of the present invention, there are provided processes for using one or more of the polypeptides of the present invention to treat breast cancer and for using the polypeptides to screen for compounds which interact with the polypeptides, for example, compounds which inhibit or activate the polypeptides of the present invention.
In accordance with yet another aspect of the present invention, there is provided a screen for detecting compounds which inhibit activation of one or more of the polynucleotides and/or polypeptides of the present invention, which compounds may be used therapeutically, for example, in the treatment of breast cancer.
In accordance with yet a further aspect of the present invention, there are provided processes for utilizing such polypeptides, or polynucleotides encoding such polypeptides, for in vitro purposes related to scientific research, synthesis of DNA and manufacture of DNA vectors.
These and other aspects of the present invention should be apparent to those skilled in the art from the teachings herein.
The following drawings are illustrative of embodiments of the invention and are not meant to limit the scope of the invention as encompassed by the claims.
Figure 1 is a full length cDNA sequence of breast specific gene 1 of the present invention.
Figure 2 is a partial cDNA sequence and the corresponding deduced amino acid sequence of breast specific gene 2 of the present invention.
Figure 3 is a partial cDNA sequence and deduced amino acid sequence of breast specific gene 3 of the invention.
Figure 4 is a partial cDNA sequence and the corresponding deduced amino acid sequence of breast specific gene 4 of the present invention.
Figure 5 is a partial cDNA sequence of breast specific gene 5 of the present invention.
Figure 6 is a partial cDNA and deduced amino acid sequence of breast specific gene 6 of the present invention.
Figure 7 is a partial cDNA sequence of breast specific gene 7 of the present invention.
Figure 8 is a partial cDNA sequence of breast specific gene 8 of the present invention.
Figure 9 is a partial cDNA sequence of breast specific gene 9 of the present invention.
Figure 10 is a partial cDNA sequence of breast specific gene 10 of the present invention.
Figure 11 is a partial cDNA sequence of breast specific gene 11 of the present invention.
Figure 12 is a partial cDNA sequence of breast specific gene 12 of the present invention.
Figure 13 is a partial cDNA sequence of breast specific gene 13 of the present invention.
Figure 14 is a partial cDNA sequence of breast specific gene 14 of the present invention.
Figure 15 is a partial cDNA sequence of breast specific gene 15 of the present invention.
Figure 16 is a partial cDNA sequence of breast specific gene 16 of the present invention.
Figure 17 is a partial cDNA sequence of breast specific gene 17 of the present invention.
Figure 18 is a partial cDNA sequence of breast specific gene 18 of the present invention.
Figure 19 is a partial cDNA sequence of breast specific gene 19 of the present invention.
Figure 20 is a partial cDNA sequence of breast specific gene 20 of the present invention.
The term "breast specific gene" means that such gene is primarily expressed in tissues derived from the breast, and such genes may be expressed in cells derived from tissues other than from the breast. However, the expression of such genes is significantly higher in tissues derived from the breast than from non-breast tissues.
In accordance with one aspect of the present invention there is provided a polynucleotide which encodes the mature polypeptides having the deduced amino acid sequence of Figure 1 (SEQ ID NO:1) and fragments, analogues and derivatives thereof.
In accordance with a further aspect of the present invention there is provided a polynucleotide which encodes the same mature polypeptide as a human gene having a coding portion which contains a polynucleotide which is at least 90% identical (preferably at least 95% identical and most preferably at least 97% or 100% identical) to one of the polynucleotides of Figures 2-20 (SEQ ID NO:2-20), as well as fragments thereof.
In accordance with still another aspect of the present invention there is provided a polynucleotide which encodes the same mature polypeptide as a human gene whose coding portion includes a polynucleotide which is at least 90% identical to (preferably at least 95% identical to and most preferably at least 97% or 100% identical to) one of the polynucleotides included in ATCC Deposit No. 97175 deposited June 2, 1995.
In accordance with yet another aspect of the present invention, there is provided a polynucleotide probe which hybridizes to mRNA (or the corresponding cDNA) which is transcribed from the coding portion of a human gene which coding portion includes a DNA sequence which is at least 90% identical to (preferably at least 95% identical to and most preferably at least 97% or 100% identical to) one of the polynucleotide sequences of Figures 1-20 (SEQ ID NO:1-20).
The present invention further relates to a mature polypeptide encoded by a coding portion of a human gene which coding portion includes a DNA sequence which is at least 90% identical to (preferably at least 95% identical to and more preferably 97% or 100% identical to) one of the polynucleotides of Figures 2-20 (SEQ ID NO:2-20), as well as analogues, derivatives and fragments of such polypeptides.
The present invention also relates to one of the mature polypeptides of Figure 1 (SEQ ID NO:1) and fragments, analogues and derivatives of such polypeptides.
The present invention further relates to the same mature polypeptide encoded by a human gene whose coding portion includes DNA which is at least 90% identical to (preferably at least 95% identical to and more preferably at least 97% or 100% identical to) one of the polynucleotides included in ATCC Deposit No. 97175 deposited June 2, 1995.
In accordance with an aspect of the present invention, there are provided isolated nucleic acids (polynucleotides) which encode the mature polypeptides having the deduced amino acid sequence of Figure 1 (SEQ ID NO:1) or fragments, analogues or derivatives thereof.
The polynucleotides of the present invention may be in the form of RNA or in the form of DNA, which DNA includes cDNA, genomic DNA, and synthetic DNA. The DNA may be double-stranded or single-stranded, and if single stranded may be the coding strand or non-coding (anti-sense) strand. The coding sequence which encodes the mature polypeptide may include DNA identical to Figures 1-20 (SEQ ID NO:1-20) or that of the deposited clone or may be a different coding sequence which coding sequence, as a result of the redundancy or degeneracy of the genetic code, encodes the same mature polypeptide as the coding sequence of a gene which coding sequence includes the DNA of Figures 1-20 (SEQ ID NO:1-20) or the deposited cDNA.
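The degeneracy referred to above can be illustrated with a minimal Python sketch. It assumes the standard genetic code; the two short coding sequences are hypothetical and are not taken from the sequence listing. It shows only that coding sequences differing at several nucleotide positions can encode the same polypeptide.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(coding_dna: str) -> str:
    """Translate a coding-strand DNA sequence, stopping at the first stop codon."""
    seq = coding_dna.upper()
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for pos in range(0, len(seq) - len(seq) % 3, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical coding sequences (illustration only) that use different
# synonymous codons for alanine, leucine and lysine.
variant_a = "ATGGCTTTAAAA"
variant_b = "ATGGCCCTGAAG"

print(translate(variant_a), translate(variant_b))
```

Both hypothetical sequences translate to the same four-residue peptide, which is the sense in which a "different coding sequence" may encode the same mature polypeptide.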
The polynucleotide which encodes a mature polypeptide of the present invention may include, but is not limited to: only the coding sequence for the mature polypeptide; the coding sequence for the mature polypeptide and additional coding sequence such as a leader or secretory sequence or a proprotein sequence; the coding sequence for the mature polypeptide (and optionally additional coding sequence) and non-coding sequence, such as introns or non-coding sequence 5' and/or 3' of the coding sequence for the mature polypeptide.
Thus, the term "polynucleotide encoding a polypeptide" encompasses a polynucleotide which includes only coding sequence for the polypeptide as well as a polynucleotide which includes additional coding and/or non-coding sequence.
The present invention further relates to variants of the hereinabove described polynucleotides which encode fragments, analogs and derivatives of a mature polypeptide of the present invention. The variant of the polynucleotide may be a naturally occurring allelic variant of the polynucleotide or a non-naturally occurring variant of the polynucleotide.
Thus, the present invention includes polynucleotides encoding the same mature polypeptide as hereinabove described as well as variants of such polynucleotides which variants encode a fragment, derivative or analog of a polypeptide of the invention. Such nucleotide variants include deletion variants, substitution variants and addition or insertion variants.
The polynucleotides of the invention may have a coding sequence which is a naturally occurring allelic variant of the human gene whose coding sequence includes DNA as shown in Figures 1-20 (SEQ ID NO:1-20) or of the coding sequence of the DNA in the deposited clone. As known in the art, an allelic variant is an alternate form of a polynucleotide sequence which may have a substitution, deletion or addition of one or more nucleotides, which does not substantially alter the function of the encoded polypeptide.
The present invention also includes polynucleotides, wherein the coding sequence for the mature polypeptide may be fused in the same reading frame to a polynucleotide sequence which aids in expression and secretion of a polypeptide from a host cell, for example, a leader sequence which functions as a secretory sequence for controlling transport of a polypeptide from the cell. The polypeptide having a leader sequence is a preprotein and may have the leader sequence cleaved by the host cell to form the mature form of the polypeptide. The polynucleotides may also encode a proprotein which is the mature protein plus additional 5' amino acid residues. A mature protein having a prosequence is a proprotein and is an inactive form of the protein. Once the prosequence is cleaved an active mature protein remains. Thus, for example, the polynucleotide of the present invention may encode a mature protein, or a protein having a prosequence or a protein having both a prosequence and a presequence (leader sequence).
The polynucleotides of the present invention may also have the coding sequence fused in frame to a marker sequence which allows for purification of the polypeptide of the present invention. The marker sequence may be a hexa-histidine tag supplied by a pQE-9 vector to provide for purification of the mature polypeptide fused to the marker in the case of a bacterial host, or, for example, the marker sequence may be a hemagglutinin (HA) tag when a mammalian host, e.g. COS-7 cells, is used. The HA tag corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson, I., et al., Cell, 37:767 (1984)).
The present invention further relates to polynucleotides which hybridize to the hereinabove-described polynucleotides if there is at least 70%, preferably at least 90%, and more preferably at least 95% identity between the sequences. The present invention particularly relates to polynucleotides which hybridize under stringent conditions to the hereinabove-described polynucleotides. As herein used, the term "stringent conditions" means hybridization will occur only if there is at least 95% and preferably at least 97% identity between the sequences. The polynucleotides which hybridize to the hereinabove described polynucleotides in a preferred embodiment encode polypeptides which retain substantially the same biological function or activity as the mature polypeptide of the present invention encoded by a coding sequence which includes the DNA of Figures 1-20 (SEQ ID NO:1-20) or the deposited cDNA(s).
Alternatively, the polynucleotide may have at least 10 or 20 bases, preferably at least 30 bases, and more preferably at least 50 bases which hybridize to a polynucleotide of the present invention and which has an identity thereto, as hereinabove described, and which may or may not retain activity. For example, such polynucleotides may be employed as probes for polynucleotides, for example, for recovery of the polynucleotide or as a diagnostic probe or as a PCR primer.
Thus, the present invention is directed to polynucleotides having at least a 70% identity, preferably at least 90% and more preferably at least 95% identity to a polynucleotide which encodes the mature polypeptide encoded by a human gene which includes the DNA of one of Figures 1-20 (SEQ ID NO:1-20) as well as fragments thereof, which fragments have at least 30 bases and preferably at least 50 bases and to polypeptides encoded by such polynucleotides.
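The identity thresholds recited above (70%, 90%, 95%) can be made concrete with a short Python sketch. The text does not state how percent identity is to be computed, so the convention used here (matching positions divided by compared columns of a pre-computed pairwise alignment, with gap-versus-base columns counted as mismatches) is an assumption, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of two already-aligned sequences.

    Assumed convention: columns where both sequences have a gap are skipped;
    a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the tiers recited in the text."""
    if pct >= 95.0:
        return "at least 95%"
    if pct >= 90.0:
        return "at least 90%"
    if pct >= 70.0:
        return "at least 70%"
    return "below 70%"


# Hypothetical aligned 10-base sequences differing at one position.
pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(pct, identity_tier(pct))
```

Other conventions (for example, normalizing by the length of the shorter sequence) would give different values for the same pair, which is why the convention is stated explicitly here.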
The partial sequences are specific tags for messenger RNA molecules. The complete sequence of that messenger RNA, in the form of cDNA, can be determined using the partial sequence as a probe to identify a cDNA clone corresponding to a full-length transcript, followed by sequencing of that clone. The partial cDNA clone can also be used as a probe to identify a genomic clone or clones that contain the complete gene including regulatory and promoter regions, exons, and introns.
The partial sequences of Figures 2-20 (SEQ ID NO:2-20) may be used to identify the corresponding full length gene from which they were derived. The partial sequences can be nick-translated or end-labelled with 32P using polynucleotide kinase using labelling methods known to those with skill in the art (Basic Methods in Molecular Biology, L.G. Davis, M.D. Dibner, and J.F. Battey, ed., Elsevier Press, NY, 1986). A lambda library prepared from human breast tissue can be directly screened with the labelled sequences of interest or the library can be converted en masse to pBluescript (Stratagene Cloning Systems, La Jolla, CA 92037) to facilitate bacterial colony screening. Regarding pBluescript, see Sambrook et al., Molecular Cloning-A Laboratory Manual, Cold Spring Harbor Laboratory Press (1989), pg. 1.20. Both methods are well known in the art.
Briefly, filters with bacterial colonies containing the library in pBluescript or bacterial lawns containing lambda plaques are denatured and the DNA is fixed to the filters. The filters are hybridized with the labelled probe using hybridization conditions described by Davis et al., supra. The partial sequences, cloned into lambda or pBluescript, can be used as positive controls to assess background binding and to adjust the hybridization and washing stringencies necessary for accurate clone identification. The resulting autoradiograms are compared to duplicate plates of colonies or plaques; each exposed spot corresponds to a positive colony or plaque. The colonies or plaques are selected, expanded and the DNA is isolated from the colonies for further analysis and sequencing.
Positive cDNA clones are analyzed to determine the amount of additional sequence they contain using PCR with one primer from the partial sequence and the other primer from the vector. Clones with a larger vector-insert PCR product than the original partial sequence are analyzed by restriction digestion and DNA sequencing to determine whether they contain an insert of the same size or similar as the mRNA size determined from Northern blot analysis.

Once one or more overlapping cDNA clones are identified, the complete sequence of the clones can be determined. The preferred method is to use exonuclease III digestion (McCombie, W.R., Kirkness, E., Fleming, J.T., Kerlavage, A.R., Iovannisci, D.M., and Martin-Gallardo, R., Methods, 3:33-40, 1991). A series of deletion clones are generated, each of which is sequenced. The resulting overlapping sequences are assembled into a single contiguous sequence of high redundancy (usually three to five overlapping sequences at each nucleotide position), resulting in a highly accurate final sequence.
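How redundancy at each nucleotide position yields an accurate final sequence can be sketched in Python as a simple majority vote. This is an illustration only, not the assembly procedure of the cited reference: it assumes the position (offset) of each read in the contig is already known, and the reads themselves are hypothetical.

```python
from collections import Counter


def consensus(reads):
    """Majority-vote consensus from (offset, sequence) reads.

    Returns the consensus string and the read depth at each position.
    Positions covered by no read are reported as "N".
    """
    length = max(offset + len(seq) for offset, seq in reads)
    columns = [Counter() for _ in range(length)]
    for offset, seq in reads:
        for i, base in enumerate(seq.upper()):
            columns[offset + i][base] += 1
    bases = []
    depths = []
    for column in columns:
        depth = sum(column.values())
        depths.append(depth)
        bases.append(column.most_common(1)[0][0] if depth else "N")
    return "".join(bases), depths


# Hypothetical overlapping reads; the last one carries a single base error
# (T instead of A at contig position 4) that the other reads outvote.
reads = [
    (0, "ACGTAC"),
    (2, "GTACGG"),
    (4, "ACGGTT"),
    (2, "GTTCGG"),
]
sequence, depth = consensus(reads)
print(sequence)
print(depth)
```

With three or more reads covering a position, a single erroneous base call is outvoted, which is the practical benefit of the three-to-five-fold redundancy mentioned above.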
The DNA sequences (as well as the corresponding RNA sequences) also include sequences which are or contain a DNA sequence identical to one contained in and isolatable from ATCC Deposit No. 97175, deposited June 2, 1995, and fragments or portions of the isolated DNA sequences (and corresponding RNA sequences), as well as DNA (RNA) sequences encoding the same polypeptide.
The deposit(s) referred to herein will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Micro-organisms for purposes of Patent Procedure. These deposits are provided merely as a convenience to those of skill in the art and are not an admission that a deposit is required under 35 U.S.C. 112.
The sequence of the polynucleotides contained in the deposited materials, as well as the amino acid sequence of the polypeptides encoded thereby, are incorporated herein by reference and are controlling in the event of any conflict with any description of sequences herein. A license may be required to make, use or sell the deposited materials, and no such license is hereby granted.
The present invention further relates to polynucleotides which have at least 10 bases, preferably at least 20 bases, and may have 30 or more bases, which polynucleotides are hybridizable to and have at least a 70% identity to RNA (and DNA which corresponds to such RNA) transcribed from a human gene whose coding portion includes DNA as hereinabove described.
Thus, the polynucleotide sequences which hybridize as described above may be used to hybridize to and detect the expression of the human genes to which they correspond for use in diagnostic assays as hereinafter described.
In accordance with still another aspect of the present invention there are provided diagnostic assays for detecting micrometastases of breast cancer in a host. While applicant does not wish to limit the reasoning of the present invention to any specific scientific theory, it is believed that the presence of active transcription of a breast specific gene of the present invention in cells of the host, other than those derived from the breast, is indicative of breast cancer metastases. This is true because, while the breast specific genes are found in all cells of the body, their transcription to mRNA, cDNA and expression products is primarily limited to the breast in non-diseased individuals. However, if breast cancer is present, breast cancer cells migrate from the cancer to other cells, such that these other cells are now actively transcribing and expressing a breast specific gene at a greater level than is normally found in non-diseased individuals, i.e., transcription is higher than found in non-breast tissues in healthy individuals. It is the detection of this enhanced transcription or enhanced protein expression in cells, other than those derived from the breast, which is indicative of metastases of breast cancer.
In one example of such a diagnostic assay, an RNA sequence in a sample derived from a tissue other than the breast is detected by hybridization to a probe. The sample contains a nucleic acid or a mixture of nucleic acids, at least one of which is suspected of containing a human breast specific gene or fragment thereof of the present invention which is transcribed and expressed in such tissue. Thus, for example, in a form of an assay for determining the presence of a specific RNA in cells, initially RNA is isolated from the cells.
A sample may be obtained from cells derived from tissue other than from the breast including but not limited to blood, urine, saliva, tissue biopsy and autopsy material.
The use of such methods for detecting enhanced transcription to mRNA from a human breast specific gene of the present invention or fragment thereof in a sample obtained from cells derived from other than the breast is well within the scope of those skilled in the art from the teachings herein.
The isolation of mRNA comprises isolating total cellular RNA by disrupting a cell and performing differential centrifugation. Once the total RNA is isolated, mRNA is isolated by making use of the adenine nucleotide residues known to those skilled in the art as a poly(A) tail found on virtually every eukaryotic mRNA molecule at the 3' end thereof. Oligonucleotides composed of only deoxythymidine [oligo(dT)] are linked to cellulose and the oligo(dT)-cellulose packed into small columns. When a preparation of total cellular RNA is passed through such a column, the mRNA molecules bind to the oligo(dT) by the poly(A) tails while the rest of the RNA flows through the column. The bound mRNAs are then eluted from the column and collected.
One example of detecting isolated mRNA transcribed from a breast specific gene of the present invention comprises screening the collected mRNAs with the gene specific oligonucleotide probes, as hereinabove described.
It is also appreciated that such probes can be and are preferably labeled with an analytically detectable reagent to facilitate identification of the probe. Useful reagents include but are not limited to radioactivity, fluorescent dyes or enzymes capable of catalyzing the formation of a detectable product.

An example of detecting a polynucleotide complementary to the mRNA sequence (cDNA) utilizes the polymerase chain reaction (PCR) in conjunction with reverse transcriptase. PCR is a very powerful method for the specific amplification of DNA or RNA stretches (Saiki et al., Nature, 324:163-166 (1986)). One application of this technology is in nucleic acid probe technology to bring up nucleic acid sequences present in low copy numbers to a detectable level. Numerous diagnostic and scientific applications of this method have been described by H.A. Erlich (ed.) in PCR Technology-Principles and Applications for DNA Amplification, Stockton Press, USA, 1989, and by M.A. Innis (ed.) in PCR Protocols, Academic Press, San Diego, USA, 1990.
RT-PCR is a combination of PCR with the reverse transcriptase enzyme. Reverse transcriptase is an enzyme which produces cDNA molecules from corresponding mRNA molecules. This is important since PCR amplifies nucleic acid molecules, particularly DNA, and this DNA may be produced from the mRNA isolated from a sample derived from the host.
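The relationship between an mRNA and the cDNA produced from it by reverse transcriptase can be sketched in Python as simple base complementarity. The sequence below is hypothetical and the function names are illustrative; the sketch models only base pairing, not enzyme kinetics or priming.

```python
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """First-strand cDNA complementary to an mRNA, written 5'->3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


# Hypothetical mRNA fragment (illustration only).
mrna = "AUGGCUUUAAAA"
cdna = first_strand_cdna(mrna)
second_strand = reverse_complement(cdna)
print(cdna)
print(second_strand)
```

The second strand has the same sequence as the mRNA with T in place of U, which is why primers designed from a cDNA sequence can be used to amplify material derived from the corresponding transcript.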
A specific example of an RT-PCR diagnostic assay involves removing a sample from a tissue of a host. Such a sample will be from a tissue, other than the breast, for example, blood. Therefore, an example of such a diagnostic assay comprises whole blood gradient isolation of nucleated cells, total RNA extraction, RT-PCR of total RNA and agarose gel electrophoresis of PCR products. The PCR products comprise cDNA complementary to RNA transcribed from one or more breast specific genes of the present invention or fragments thereof. More particularly, a blood sample is obtained and the whole blood is combined with an equal volume of phosphate buffered saline, centrifuged and the lymphocyte and granulocyte layer is carefully aspirated and rediluted in phosphate buffered saline and centrifuged again. The supernate is discarded and the pellet containing nucleated cells is used for RNA extraction using the RNAzol B method as described by the manufacturer (Tel-Test Inc., Friendswood, TX).
Oligonucleotide primers and probes are prepared with high specificity to the DNA sequences of the present invention. The probes are at least 10 base pairs in length, preferably at least 30 base pairs in length and most preferably at least 50 base pairs in length or more. The reverse transcriptase reaction and PCR amplification are performed sequentially without interruption. Taq polymerase is used during PCR and the PCR products are concentrated and the entire sample is run on a Tris-borate-EDTA agarose gel containing ethidium bromide.
In accordance with another aspect of the present invention, there is provided a method of diagnosing a disorder of the breast, for example breast cancer, by determining altered levels of the breast specific polypeptides of the present invention in a biological sample derived from tissue other than from the breast. Elevated levels of the breast specific polypeptides of the present invention indicate active transcription and expression of the corresponding breast specific gene product. Assays used to detect levels of a breast specific gene polypeptide in a sample derived from a host are well-known to those skilled in the art and include radioimmunoassays, competitive-binding assays, Western blot analysis, ELISA assays and "sandwich" assays. A biological sample may include, but is not limited to, tissue extracts, cell samples or biological fluids; however, in accordance with the present invention, a biological sample specifically does not include tissue or cells of the breast.
An ELISA assay (Coligan, et al., Current Protocols in Immunology, 1(2), Chapter 6, 1991) initially comprises preparing an antibody specific to a breast specific polypeptide of the present invention, preferably a monoclonal antibody. In addition, a reporter antibody is prepared against the monoclonal antibody. To the reporter antibody is attached a detectable reagent such as radioactivity, fluorescence or, in this example, a horseradish peroxidase enzyme. A sample is removed from a host and incubated on a solid support, e.g., a polystyrene dish, that binds the proteins in the sample. Any free protein binding sites on the dish are then covered by incubating with a non-specific protein, such as BSA. Next, the monoclonal antibody is incubated in the dish, during which time the monoclonal antibodies attach to the breast specific polypeptide attached to the polystyrene dish. All unbound monoclonal antibody is washed out with buffer. The reporter antibody linked to horseradish peroxidase is now placed in the dish, resulting in binding of the reporter antibody to any monoclonal antibody bound to the breast specific gene polypeptide. Unattached reporter antibody is then washed out. Peroxidase substrates are then added to the dish, and the amount of color developed in a given time period is a measurement of the amount of the breast specific polypeptide present in a given volume of patient sample when compared against a standard curve.
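The comparison against a standard curve described above can be sketched as a simple interpolation. The following is a minimal sketch in Python, assuming that the color signal (absorbance) rises monotonically with concentration and that linear interpolation between neighbouring standards is adequate. The function name, the standards and the absorbance values are hypothetical and are not taken from the present disclosure.

```python
from bisect import bisect_left


def interpolate_concentration(standards, absorbance):
    """Estimate a concentration from an absorbance reading.

    standards: list of (concentration, absorbance) pairs measured for
    known amounts of polypeptide (hypothetical values, any consistent unit).
    absorbance: reading obtained for the patient sample.
    """
    # Order the standards by signal so the reading can be bracketed.
    points = sorted(standards, key=lambda p: p[1])
    signals = [a for _, a in points]
    if not signals[0] <= absorbance <= signals[-1]:
        raise ValueError("absorbance outside the range of the standard curve")
    i = bisect_left(signals, absorbance)
    if signals[i] == absorbance:
        return points[i][0]
    # Linear interpolation between the two bracketing standards.
    (c0, a0), (c1, a1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)
```

A reading that falls outside the range of the standards is rejected rather than extrapolated, since the curve is only known between the measured standards.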
A competition assay may be employed where antibodies specific to a breast specific polypeptide are attached to a solid support. The breast specific polypeptide is then labeled, and the labeled polypeptide and a sample derived from the host are passed over the solid support. The amount of label detected, for example, by liquid scintillation chromatography, can be correlated to a quantity of the breast specific polypeptide in the sample.
A "sandwich" assay is similar to an ELISA assay. In a "sandwich" assay, breast specific polypeptides are passed over a solid support and bind to antibody attached to the solid support. A second antibody is then bound to the breast specific polypeptide. A third antibody, which is labeled and is specific to the second antibody, is then passed over the solid support and binds to the second antibody, and an amount can then be quantified.
In alternative methods, labeled antibodies to a breast specific polypeptide are used. In a one-step assay, the target molecule, if it is present, is immobilized and incubated with a labeled antibody. The labeled antibody binds to the immobilized target molecule. After washing to remove the unbound molecules, the sample is assayed for the presence of the label. In a two-step assay, immobilized target molecule is incubated with an unlabeled antibody. The target molecule-unlabeled antibody complex, if present, is then bound to a second, labeled antibody that is specific for the unlabeled antibody. The sample is washed and assayed for the presence of the label.
Such antibodies specific to breast specific gene proteins, for example, anti-idiotypic antibodies, can be used to detect breast cancer cells by being labeled as described above and binding tightly to the breast cancer cells, thereby revealing their presence.
The antibodies may also be used to target breast cancer cells, for example, in a method of homing interaction agents which, when contacting breast cancer cells, destroy them.
This is possible because the antibodies are specific for breast specific genes which are primarily expressed in breast cancer, so that linking the interaction agent to the antibody would cause the interaction agent to be carried directly to the breast.
Antibodies of this type may also be used to do in vivo imaging, for example, by labeling the antibodies to facilitate scanning of the breast. One method for imaging comprises contacting any cancer cells of the breast to be imaged with an anti-breast specific gene protein antibody labeled with a detectable marker. The method is performed under conditions such that the labeled antibody binds to the breast specific gene protein. In a specific example, the antibodies interact with the breast, for example, breast cancer cells, and fluoresce upon such contact such that imaging and visibility of the breast is enhanced to allow a determination of the diseased or non-diseased state of the breast.
The choice of marker used to label the antibodies will vary depending upon the application. However, the choice of marker is readily determinable to one skilled in the art.
These labeled antibodies may be used in immunoassays as well as in histological applications to detect the presence of the proteins. The labeled antibodies may be polyclonal or monoclonal.
Active transcription of the breast specific genes at a level greater than that normally found in cells other than those of the breast, as shown by an altered level of mRNA, cDNA or expression products, is an important indication of the presence of a breast cancer which has metastasized, since breast cancer cells are migrating from the breast into the general circulation. Accordingly, this phenomenon may have important clinical implications, since the method of treating a localized, as opposed to a metastasized, tumor is entirely different.
Of the 20 breast specific genes disclosed, only breast specific gene 1 is a full-length gene. Breast specific gene 1 is 79% identical and 83% similar to human Alzheimer's disease amyloid gene. Breast specific gene 2 is 30% identical and 48% similar to human hydroxyindole-O-methyltransferase gene. Breast specific gene 3 is 58% identical and 62% similar to human O6-methylguanine-DNA methyltransferase gene. Breast specific gene 4 is 34% identical and 65% similar to the mouse p120 gene. Breast specific gene 5 is 78% identical and 89% similar to human p70 ribosomal S6 kinase alpha-II gene. Breast specific gene 6 is 77% identical and 79% similar to the human transcription factor NFATp gene.

As stated previously, the breast specific genes of the present invention are putative molecular markers in the diagnosis of breast cancer formation and breast cancer metastases. As shown in the following Table 1, when tested in normal breast, breast cancer, embryo and other cancer libraries, the breast specific genes of the present invention were found to be most prevalent in the breast cancer library, indicating that the genes of the present invention may be employed for detecting breast cancer, as discussed previously. The table also indicates a putative identification, based on homology, of BSG1 through BSG6 to known genes.

Table 1

Gene name    Homolog gene (class)    Norm Br    Br Ca    Embryo    Other cancers    Others
BSG1    AD amyloid (3)    1    6
BSG2    Hydroxyindole-O-methyltransferase (2)    3
BSG3    O6-methylguanine-DNA methyltransferase (1)
BSG4    p120 (3)    3
BSG5    p70 ribosomal S6 kinase alpha-II (2)    -    3
BSG6    Transcription factor NFATp (3)    2
BSG11    3
BSG14    2
BSG19

The assays described above may also be used to test whether bone marrow preserved before chemotherapy is contaminated with micrometastases of a breast cancer cell.
In the assay, blood cells from the bone marrow are isolated and treated as described above. This method allows one to determine whether preserved bone marrow is still suitable for transplantation after chemotherapy.
The present invention further relates to mature polypeptides, for example the BSG1 polypeptide, as well as fragments, analogs and derivatives of such polypeptide.
The terms "fragment," "derivative" and "analog" when referring to the polypeptides encoded by the genes of the invention mean a polypeptide which retains essentially the same biological function or activity as such polypeptide.
Thus, an analog includes a proprotein which can be activated by cleavage of the proprotein portion to produce an active mature polypeptide.
The polypeptides of the present invention may be recombinant polypeptides, natural polypeptides or synthetic polypeptides, preferably recombinant polypeptides.
The fragment, derivative or analog of the polypeptides encoded by the genes of the invention may be (i) one in which one or more of the amino acid residues are substituted with a conserved or non-conserved amino acid residue (preferably a conserved amino acid residue) and such substituted amino acid residue may or may not be one encoded by the genetic code, or (ii) one in which one or more of the amino acid residues includes a substituent group, or (iii) one in which the polypeptide is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol), or (iv) one in which additional amino acids are fused to the polypeptide, such as a leader or secretory sequence or a sequence which is employed for purification of the mature polypeptide or a proprotein sequence. Such fragments, derivatives and analogs are deemed to be within the scope of those skilled in the art from the teachings herein.
The polypeptides and polynucleotides of the present invention are preferably provided in an isolated form, and preferably are purified to homogeneity.
The term "isolated" means that the material is removed from its original environment (e.g., the natural environment if it is naturally occurring). For example, a naturally-occurring polynucleotide or polypeptide present in a living animal is not isolated, but the same polynucleotide or polypeptide, separated from some or all of the coexisting materials in the natural system, is isolated. Such polynucleotides could be part of a vector and/or such polynucleotides or polypeptides could be part of a composition, and still be isolated in that such vector or composition is not part of its natural environment.
The polypeptides of the present invention include the polypeptides of Figure 1 (SEQ ID NO:1) (in particular the mature polypeptides) as well as polypeptides which have at least 70% similarity (preferably at least a 70% identity) to the polypeptides of Figure 1 (SEQ ID NO:1) and more preferably at least a 90% similarity (more preferably at least a 90% identity) to the polypeptides of Figures 8 and 9 (SEQ ID NO:8 and 9) and still more preferably at least a 95% similarity (still more preferably at least 95% identity) to the polypeptides of Figure 1 (SEQ ID NO:1) and also include portions of such polypeptides with such portion of the polypeptide generally containing at least 30 amino acids and more preferably at least 50 amino acids.
As known in the art, "similarity" between two polypeptides is determined by comparing the amino acid sequence and its conserved amino acid substitutes of one polypeptide to the sequence of a second polypeptide.
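By way of illustration only, the distinction between identity and similarity can be sketched as a position-by-position comparison of two sequences that have already been aligned. The following minimal Python sketch assumes equal-length aligned inputs with "-" marking gaps, counts only ungapped positions, and uses a hypothetical grouping of conservative substitutes. Neither the grouping nor the function name is taken from the present disclosure, and established alignment tools use substitution matrices rather than fixed groups.

```python
# Hypothetical groups of residues treated as conserved substitutes.
CONSERVATIVE_GROUPS = ["AVLIM", "FYW", "KRH", "DE", "STNQ"]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    group_of = {res: i for i, grp in enumerate(CONSERVATIVE_GROUPS) for res in grp}
    identical = similar = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped positions are not counted (an assumption)
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif a in group_of and group_of[a] == group_of.get(b):
            similar += 1  # conserved substitute
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared
```

Because identical residues are also counted as similar, percent similarity computed this way is never lower than percent identity, which matches the pairs of figures quoted for BSG1 through BSG6.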
Fragments or portions of the polypeptides of the present invention may be employed for producing the corresponding full-length polypeptide by peptide synthesis; therefore, the fragments may be employed as intermediates for producing the full-length polypeptides. Fragments or portions of the polynucleotides of the present invention may be used to synthesize full-length polynucleotides of the present invention.
The present invention also relates to vectors which include polynucleotides of the present invention, host cells which are genetically engineered with vectors of the invention and the production of polypeptides of the invention by recombinant techniques.
Host cells are genetically engineered (transduced or transformed or transfected) with the vectors of this invention which may be, for example, a cloning vector or an expression vector. The vector may be, for example, in the form of a plasmid, a viral particle, a phage, etc. The engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants or amplifying the breast specific genes. The culture conditions, such as temperature, pH and the like, are those previously used with the host cell selected for expression, and will be apparent to those of ordinary skill in the art.
The polynucleotides of the present invention may be employed for producing polypeptides by recombinant techniques. Thus, for example, the polynucleotide may be included in any one of a variety of expression vectors for expressing a polypeptide. Such vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from combinations of plasmids and phage DNA; and viral DNA such as vaccinia, adenovirus, fowl pox virus, and pseudorabies.
However, any other vector may be used as long as it is replicable and viable in the host.
The appropriate DNA sequence may be inserted into the vector by a variety of procedures. In general, the DNA sequence is inserted into an appropriate restriction endonuclease site(s) by procedures known in the art. Such procedures and others are deemed to be within the scope of those skilled in the art.
The DNA sequence in the expression vector is operatively linked to an appropriate expression control sequence(s) (promoter) to direct mRNA synthesis. As representative examples of such promoters, there may be mentioned: LTR or SV40 promoter, the E. coli lac or trp, the phage lambda PL promoter and other promoters known to control expression of genes in prokaryotic or eukaryotic cells or their viruses.
The expression vector also contains a ribosome binding site for translation initiation and a transcription terminator.
The vector may also include appropriate sequences for amplifying expression.
In addition, the expression vectors preferably contain one or more selectable marker genes to provide a phenotypic trait for selection of transformed host cells such as dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as tetracycline or ampicillin resistance in E. coli.
The vector containing the appropriate DNA sequence as hereinabove described, as well as an appropriate promoter or control sequence, may be employed to transform an appropriate host to permit the host to express the protein.

As representative examples of appropriate hosts, there may be mentioned: bacterial cells, such as E. coli, Streptomyces, Salmonella typhimurium; fungal cells, such as yeast; insect cells such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS or Bowes melanoma; adenoviruses; plant cells, etc. The selection of an appropriate host is deemed to be within the scope of those skilled in the art from the teachings herein.
More particularly, the present invention also includes recombinant constructs comprising one or more of the sequences as broadly described above. The constructs comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention has been inserted, in a forward or reverse orientation. In a preferred aspect of this embodiment, the construct further comprises regulatory sequences, including, for example, a promoter, operably linked to the sequence. Large numbers of suitable vectors and promoters are known to those of skill in the art, and are commercially available. The following vectors are provided by way of example. Bacterial: pQE70, pQE60, pQE-9 (Qiagen), pBS, pD10, phagescript, psiX174, pbluescript SK, pBSKS, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLNEO, pSV2CAT, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia). However, any other plasmid or vector may be used as long as it is replicable and viable in the host.
Promoter regions can be selected from any desired gene using CAT (chloramphenicol transferase) vectors or other vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art.
In a further embodiment, the present invention relates to host cells containing the above-described constructs. The host cell can be a higher eukaryotic cell, such as a mammalian cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a bacterial cell. Introduction of the construct into the host cell can be effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology, (1986)).
The constructs in host cells can be used in a conventional manner to produce the gene product encoded by the recombinant sequence. Alternatively, the polypeptides of the invention can be synthetically produced by conventional peptide synthesizers.
Proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of which is hereby incorporated by reference.
Transcription of the DNA encoding the polypeptides of the present invention by higher eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp, that act on a promoter to increase its transcription. Examples include the SV40 enhancer on the late side of the replication origin (bp 100 to 270), a cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.
Generally, recombinant expression vectors will include origins of replication and selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct transcription of a downstream structural sequence. Such promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences. Optionally, the heterologous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a desired protein together with suitable translation initiation and termination signals in operable reading frame with a functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication to ensure maintenance of the vector and, if desirable, to provide amplification within the host. Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be employed as a matter of choice.
As a representative but nonlimiting example, useful expression vectors for bacterial use can comprise a selectable marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotec, Madison, WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence to be expressed.
Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter is induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period.
Cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification.
Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents; such methods are well known to those skilled in the art.
Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described by Gluzman, Cell, 23:175 (1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and BHK cell lines. Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 splice and polyadenylation sites may be used to provide the required nontranscribed genetic elements.
The breast specific gene polypeptides can be recovered and purified from recombinant cell cultures by methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography and lectin chromatography. Protein refolding steps can be used, as necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can be employed for final purification steps.
The polynucleotides of the present invention may have the coding sequence fused in frame to a marker sequence which allows for purification of the polypeptide of the present invention. An example of a marker sequence is a hexahistidine tag which may be supplied by a vector, preferably a pQE-9 vector, which provides for purification of the polypeptide fused to the marker in the case of a bacterial host, or, for example, the marker sequence may be a hemagglutinin (HA) tag when a mammalian host, e.g. COS-7 cells, is used. The HA tag corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson, I., et al., Cell, 37:767 (1984)).
The polypeptides of the present invention may be a naturally purified product, or a product of chemical synthetic procedures, or produced by recombinant techniques from a prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and mammalian cells in culture). Depending upon the host employed in a recombinant production procedure, the polypeptides of the present invention may be glycosylated or may be non-glycosylated.
Polypeptides of the invention may also include an initial methionine amino acid residue.
BSG1, and other breast specific genes, and the protein product thereof may be employed for early detection of breast cancer since they are over-expressed in the breast cancer state.

In accordance with another aspect of the present invention there are provided assays which may be used to screen for therapeutics to inhibit the action of the breast specific genes or breast specific proteins of the present invention. The present invention discloses methods for selecting a therapeutic which forms a complex with breast specific gene proteins with sufficient affinity to prevent their biological action. The methods include various assays, including competitive assays where the proteins are immobilized to a support, and are contacted with a natural substrate and a labeled therapeutic either simultaneously or in either consecutive order, and determining whether the therapeutic effectively competes with the natural substrate in a manner sufficient to prevent binding of the protein to its substrate.
In another embodiment, the substrate is immobilized to a support, and is contacted with both a labeled breast specific polypeptide and a therapeutic (or unlabeled proteins and a labeled therapeutic), and it is determined whether the amount of the breast specific polypeptide bound to the substrate is reduced in comparison to the assay without the therapeutic added. The breast specific polypeptide may be labeled with antibodies.
Potential therapeutic compounds include antibodies and anti-idiotypic antibodies as described above, or in some cases, an oligonucleotide, which binds to the polypeptide.
Another example is an antisense construct prepared using antisense technology, which is directed to a breast specific polynucleotide to prevent transcription. Antisense technology can be used to control gene expression through triple-helix formation or antisense DNA or RNA, both of which methods are based on binding of a polynucleotide to DNA or RNA. For example, the 5' coding portion of the polynucleotide sequence, which encodes for the mature polypeptides of the present invention, is used to design an antisense RNA oligonucleotide of from about 10 to 40 base pairs in length. A DNA oligonucleotide is designed to be complementary to a region of the gene involved in transcription (triple helix, see Lee et al., Nucl. Acids Res., 6:3073 (1979); Cooney et al., Science, 241:456 (1988); and Dervan et al., Science, 251:1360 (1991)), thereby preventing transcription and the production of a breast specific polynucleotide. The antisense RNA oligonucleotide hybridizes to the mRNA in vivo and blocks translation of the mRNA molecule into the breast specific gene polypeptide (antisense: Okano, J. Neurochem., 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)). The oligonucleotides described above can also be delivered to cells such that the antisense RNA or DNA may be expressed in vivo to inhibit production of the breast specific polypeptides.
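As a purely illustrative sketch of the design step described above, an antisense RNA oligonucleotide can be derived as the reverse complement of the 5' coding portion. The Python function below assumes the coding (sense) strand is supplied 5' to 3' as a DNA string and returns the antisense RNA, also written 5' to 3'. The function name, the default length and the example sequence are hypothetical, and practical design would also weigh factors such as target accessibility and specificity, which are not modelled here.

```python
# Base-pairing partners for building an RNA strand complementary to the target.
RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A", "U": "A"}


def antisense_rna(coding_dna, length=20):
    """Return an antisense RNA oligonucleotide against the 5' coding portion.

    coding_dna: sense-strand sequence, 5' to 3' (hypothetical input).
    length: oligonucleotide length; limited to the 10 to 40 range given above.
    """
    if not 10 <= length <= 40:
        raise ValueError("length should be from about 10 to 40 bases")
    target = coding_dna.upper()[:length]
    if len(target) < length:
        raise ValueError("coding sequence is shorter than the requested length")
    # Reverse the target and complement each base so the oligonucleotide
    # pairs antiparallel with the corresponding mRNA.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(target))
```

For instance, the hypothetical input `ATGGCCAAGTTT` with `length=10` targets the bases `ATGGCCAAGT`, and the returned oligonucleotide pairs base-for-base with the mRNA transcribed from that stretch.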
Another example is a small molecule which binds to and occupies the active site of the breast specific polypeptide, thereby making the active site inaccessible to substrate such that normal biological activity is prevented. Examples of small molecules include but are not limited to small peptides or peptide-like molecules.
These compounds may be employed to treat breast cancer, since they interact with the function of breast specific polypeptides in a manner sufficient to inhibit natural function which is necessary for the viability of breast cancer cells. This follows because the BSGs and their protein products are primarily expressed in breast cancer tissues and are, therefore, suspected of being critical to the formation of this state.
The compounds may be employed in a composition with a pharmaceutically acceptable carrier, e.g., as hereinafter described.

The compounds of the present invention may be employed in combination with a suitable pharmaceutical carrier. Such compositions comprise a therapeutically effective amount of the polypeptide, and a pharmaceutically acceptable carrier or excipient. Such a carrier includes but is not limited to saline, buffered saline, dextrose, water, glycerol, ethanol, and combinations thereof. The formulation should suit the mode of administration.
The invention also provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration. In addition, the pharmaceutical compositions may be employed in conjunction with other therapeutic compounds.
The pharmaceutical compositions may be administered in a convenient manner such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal, intra-anal or intradermal routes. The pharmaceutical compositions are administered in an amount which is effective for treatment and/or prophylaxis of the specific indication. In general, they are administered in an amount of at least about 10 µg/kg body weight and in most cases they will be administered in an amount not in excess of about 8 mg/kg body weight per day. In most cases, the dosage is from about 10 µg/kg to about 1 mg/kg body weight daily, taking into account the routes of administration, symptoms, etc.
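The per-kilogram figures above translate into absolute daily amounts by simple multiplication. The short Python sketch below shows only that arithmetic, using the three bounds stated in the text (about 10 µg/kg, about 1 mg/kg and about 8 mg/kg per day). The function name and the example body weight are hypothetical, and the sketch does not account for route of administration or symptoms.

```python
def daily_dose_window_mg(body_weight_kg):
    """Convert the stated per-kilogram bounds into milligrams per day."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return {
        # at least about 10 micrograms/kg, i.e. 0.010 mg/kg
        "minimum_mg": 0.010 * body_weight_kg,
        # usual upper bound of about 1 mg/kg
        "usual_maximum_mg": 1.0 * body_weight_kg,
        # not in excess of about 8 mg/kg
        "ceiling_mg": 8.0 * body_weight_kg,
    }
```

For a hypothetical body weight of 70 kg, the three bounds work out to roughly 0.7 mg, 70 mg and 560 mg per day respectively.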
The breast specific genes and compounds which are polypeptides may also be employed in accordance with the present invention by expression of such polypeptides in vivo, which is often referred to as "gene therapy."

Thus, for example, cells from a patient may be engineered with a polynucleotide (DNA or RNA) encoding a polypeptide ex vivo, with the engineered cells then being provided to a patient to be treated with the polypeptide.
Such methods are well known in the art. For example, cells may be engineered by procedures known in the art by use of a retroviral particle containing RNA encoding a polypeptide of the present invention.
Similarly, cells may be engineered in vivo for expression of a polypeptide in vivo by, for example, procedures known in the art. As known in the art, a producer cell for producing a retroviral particle containing RNA encoding a polypeptide of the present invention may be administered to a patient for engineering cells in vivo and expression of the polypeptide in vivo. These and other methods for administering a polypeptide of the present invention by such method should be apparent to those skilled in the art from the teachings of the present invention. For example, the expression vehicle for engineering cells may be other than a retrovirus, for example, an adenovirus which may be used to engineer cells in vivo after combination with a suitable delivery vehicle.
Retroviruses from which the retroviral plasmid vectors hereinabove mentioned may be derived include, but are not limited to, Moloney Murine Leukemia Virus, spleen necrosis virus, retroviruses such as Rous Sarcoma Virus, Harvey Sarcoma Virus, avian leukosis virus, gibbon ape leukemia virus, human immunodeficiency virus, adenovirus, Myeloproliferative Sarcoma Virus, and mammary tumor virus.
In one embodiment, the retroviral plasmid vector is derived from Moloney Murine Leukemia Virus.
The vector includes one or more promoters. Suitable promoters which may be employed include, but are not limited to, the retroviral LTR; the SV40 promoter; and the human cytomegalovirus (CMV) promoter described in Miller, et al., Biotechniques, Vol. 7, No. 9, 980-990 (1989), or any other promoter (e.g., cellular promoters such as eukaryotic cellular promoters including, but not limited to, the histone, pol III, and β-actin promoters). Other viral promoters which may be employed include, but are not limited to, adenovirus promoters, thymidine kinase (TK) promoters, and B19 parvovirus promoters. The selection of a suitable promoter will be apparent to those skilled in the art from the teachings contained herein.
The nucleic acid sequence encoding the polypeptide of the present invention is under the control of a suitable promoter. Suitable promoters which may be employed include, but are not limited to, adenoviral promoters, such as the adenoviral major late promoter; or heterologous promoters, such as the cytomegalovirus (CMV) promoter; the respiratory syncytial virus (RSV) promoter; inducible promoters, such as the MMT promoter, the metallothionein promoter; heat shock promoters; the albumin promoter; the ApoAI promoter; human globin promoters; viral thymidine kinase promoters, such as the Herpes Simplex thymidine kinase promoter; retroviral LTRs (including the modified retroviral LTRs hereinabove described); the β-actin promoter; and human growth hormone promoters. The promoter also may be the native promoter which controls the genes encoding the polypeptides.
The retroviral plasmid vector is employed to transduce packaging cell lines to form producer cell lines. Examples of packaging cells which may be transfected include, but are not limited to, the PE501, PA317, ψ-2, ψ-AM, PA12, T19-14X, VT-19-17-H2, ψCRE, ψCRIP, GP+E-86, GP+envAm12, and DAN cell lines as described in Miller, Human Gene Therapy, Vol. 1, pgs. 5-14 (1990), which is incorporated herein by reference in its entirety. The vector may transduce the packaging cells through any means known in the art. Such means include, but are not limited to, electroporation, the use of liposomes, and CaPO4 precipitation. In one alternative, the retroviral plasmid vector may be encapsulated into a liposome, or coupled to a lipid, and then administered to a host.
The producer cell line generates infectious retroviral vector particles which include the nucleic acid sequence(s) encoding the polypeptides. Such retroviral vector particles then may be employed to transduce eukaryotic cells, either in vitro or in vivo. The transduced eukaryotic cells will express the nucleic acid sequence(s) encoding the polypeptide. Eukaryotic cells which may be transduced include, but are not limited to, embryonic stem cells, embryonic carcinoma cells, as well as hematopoietic stem cells, hepatocytes, fibroblasts, myoblasts, keratinocytes, endothelial cells, and bronchial epithelial cells.
This invention is also related to the use of the breast specific genes of the present invention as a diagnostic. For example, some diseases result from inherited defective genes. The breast specific genes, CSG7 and CSG10, for example, have been found to have a reduced expression in breast cancer cells as compared to that in normal cells. Further, the remaining breast specific genes of the present invention are overexpressed in breast cancer. Accordingly, a mutation in these genes allows a detection of breast disorders, for example, breast cancer. A mutation in a breast specific gene of the present invention at the DNA level may be detected by a variety of techniques. Nucleic acids used for diagnosis (genomic DNA, mRNA, etc.) may be obtained from a patient's cells, other than from the breast, such as from blood, urine, saliva, tissue biopsy and autopsy material. The genomic DNA may be used directly for detection or may be amplified enzymatically by using PCR (Saiki, et al., Nature, 324:163-166 (1986)) prior to analysis. RNA or cDNA may also be used for the same purpose. As an example, PCR primers complementary to the nucleic acid of the instant invention can be used to identify and analyze mutations in a breast specific polynucleotide of the present invention. For example, deletions and insertions can be detected by a change in size of the amplified product in comparison to the normal genotype. Point mutations can be identified by hybridizing amplified DNA to radiolabelled breast specific RNA or, alternatively, radiolabelled antisense DNA sequences.
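The size-based detection of deletions and insertions described above can be sketched as a comparison of amplicon lengths. This is a minimal illustration in Python and is not part of the original disclosure; the function name, the tolerance parameter, and the example sizes are hypothetical.

```python
def classify_amplicon(observed_bp: int, reference_bp: int, tolerance_bp: int = 0) -> str:
    """Classify a PCR product by its size relative to the normal-genotype product."""
    difference = observed_bp - reference_bp
    if abs(difference) <= tolerance_bp:
        # Same apparent size: a point mutation would not be visible by size alone.
        return "no size change"
    return "insertion" if difference > 0 else "deletion"


# Hypothetical product sizes in base pairs, for illustration only.
print(classify_amplicon(312, 300))
print(classify_amplicon(288, 300))
print(classify_amplicon(300, 300))
```

A product of unchanged size does not rule out a point mutation, which is why the text turns to hybridization for that case.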
Another well-established method for screening for mutations in particular segments of DNA after PCR amplification is single-strand conformation polymorphism (SSCP) analysis. PCR products are prepared for SSCP by ten cycles of reamplification to incorporate 32P-dCTP, digested with an appropriate restriction enzyme to generate 200-300 bp fragments, and denatured by heating to 85°C for 5 min. and then plunged into ice. Electrophoresis is then carried out in a nondenaturing gel (5% glycerol, 5% acrylamide) (Glavac, D. and Dean, M., Human Mutation, 2:404-414 (1993)).
Sequence differences between the reference gene and "mutants" may be revealed by the direct DNA sequencing method. In addition, cloned DNA segments may be used as probes to detect specific DNA segments. The sensitivity of this method is greatly enhanced when combined with PCR. For example, a sequencing primer is used with double-stranded PCR product or a single-stranded template molecule generated by a modified PCR. The sequence determination is performed by conventional procedures with radiolabeled nucleotides or by automatic sequencing procedures with fluorescent tags.
Genetic testing based on DNA sequence differences may be achieved by detection of alteration in electrophoretic mobility of DNA fragments in gels with or without denaturing agents. Small sequence deletions and insertions can be visualized by high-resolution gel electrophoresis. DNA fragments of different sequences may be distinguished on denaturing formamide gradient gels in which the mobilities of different DNA fragments are retarded in the gel at different positions according to their specific melting or partial melting temperatures (see, e.g., Myers, et al., Science, 230:1242 (1985)). In addition, sequence alterations, in particular small deletions, may be detected as changes in the migration pattern of DNA.
Sequence changes at specific locations may also be revealed by nuclease protection assays, such as RNase and S1 protection or the chemical cleavage method (e.g., Cotton, et al., PNAS, USA, 85:4397-4401 (1985)).
Thus, the detection of the specific DNA sequence may be achieved by methods such as hybridization, RNase protection, chemical cleavage, direct DNA sequencing, or the use of restriction enzymes (e.g., Restriction Fragment Length Polymorphisms (RFLP)) and Southern blotting.
The sequences of the present invention are also valuable for chromosome identification. The sequence is specifically targeted to and can hybridize with a particular location on an individual human chromosome. Moreover, there is a current need for identifying particular sites on the chromosome. Few chromosome marking reagents based on actual sequence data (repeat polymorphisms) are presently available for marking chromosomal location. The mapping of DNAs to chromosomes according to the present invention is an important first step in correlating those sequences with genes associated with disease.
Briefly, sequences can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp) from the cDNA. Computer analysis of the 3' untranslated region is used to rapidly select primers that do not span more than one exon in the genomic DNA, since spanning an exon boundary would complicate the amplification process. These primers are then used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the primer will yield an amplified fragment.
PCR mapping of somatic cell hybrids is a rapid procedure for assigning a particular DNA to a particular chromosome.

Using the present invention with the same oligonucleotide primers, sublocalization can be achieved with panels of fragments from specific chromosomes or pools of large genomic clones in an analogous manner. Other mapping strategies that can similarly be used to map to its chromosome include in situ hybridization, prescreening with labeled flow-sorted chromosomes and preselection by hybridization to construct chromosome specific-cDNA libraries.
Fluorescence in situ hybridization (FISH) of a cDNA clone to a metaphase chromosomal spread can be used to provide a precise chromosomal location in one step. This technique can be used with cDNA as short as 50 or 60 bases. For a review of this technique, see Verma et al., Human Chromosomes: a Manual of Basic Techniques, Pergamon Press, New York (1988).
Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man (available on line through Johns Hopkins University Welch Medical Library). The relationship between genes and diseases that have been mapped to the same chromosomal region is then identified through linkage analysis (coinheritance of physically adjacent genes).
Next, it is necessary to determine the differences in the cDNA or genomic sequence between affected and unaffected individuals. If a mutation is observed in some or all of the affected individuals but not in any normal individuals, then the mutation is likely to be the causative agent of the disease.
With current resolution of physical mapping and genetic mapping techniques, a cDNA precisely localized to a chromosomal region associated with the disease could be one of between 50 and 500 potential causative genes. (This assumes 1 megabase mapping resolution and one gene per 20 kb).
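The arithmetic behind this estimate can be written out as a short sketch. It uses only the gene density stated above (one gene per 20 kb); the function name is hypothetical. Note that the lower figure of 50 follows directly from a 1 megabase region, while the upper figure of 500 would correspond to a 10 megabase region at the same density, which is an inference from the arithmetic and not a statement in the text.

```python
def candidate_gene_count(region_bp: int, bp_per_gene: int = 20_000) -> int:
    """Estimate how many genes fall within a mapped region of the given size."""
    if region_bp < 0 or bp_per_gene <= 0:
        raise ValueError("region size must be non-negative and gene spacing positive")
    return region_bp // bp_per_gene


print(candidate_gene_count(1_000_000))   # 1 megabase mapping resolution
print(candidate_gene_count(10_000_000))  # region size that would give the upper bound
```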
The polypeptides, their fragments or other derivatives, or analogs thereof, or cells expressing them can be used as an immunogen to produce antibodies thereto. These antibodies can be, for example, polyclonal or monoclonal antibodies.
The present invention also includes chimeric, single chain, and humanized antibodies, as well as Fab fragments, or the product of an Fab expression library. Various procedures known in the art may be used for the production of such antibodies and fragments.
Antibodies generated against the polypeptides corresponding to a sequence of the present invention can be obtained by direct injection of the polypeptides into an animal or by administering the polypeptides to an animal, preferably a nonhuman. The antibody so obtained will then bind the polypeptides themselves. In this manner, even a sequence encoding only a fragment of the polypeptides can be used to generate antibodies binding the whole native polypeptides. Such antibodies can then be used to isolate the polypeptide from tissue expressing that polypeptide.
For preparation of monoclonal antibodies, any technique which provides antibodies produced by continuous cell line cultures can be used. Examples include the hybridoma technique (Kohler and Milstein, 1975, Nature, 256:495-497), the trioma technique, the human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole, et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).
Techniques described for the production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce single chain antibodies to immunogenic polypeptide products of this invention. Transgenic mice may also be used to generate antibodies.

The antibodies may also be employed to target breast cancer cells, for example, in a method of homing interaction agents which, when contacting breast cancer cells, destroy them. This is true since the antibodies are specific for the breast specific polypeptides of the present invention. A linking of the interaction agent to the antibody would cause the interaction agent to be carried directly to the breast.
Antibodies of this type may also be used to do in vivo imaging, for example, by labeling the antibodies to facilitate scanning of the pelvic area and the breast. One method for imaging comprises contacting any cancer cells of the breast to be imaged with an anti-breast specific protein-antibody labeled with a detectable marker. The method is performed under conditions such that the labeled antibody binds to the breast specific polypeptides. In a specific example, the antibodies interact with the breast, for example, breast cancer cells, and fluoresce upon contact such that imaging and visibility of the breast are enhanced to allow a determination of the diseased or non-diseased state of the breast.
The present invention will be further described with reference to the following examples; however, it is to be understood that the present invention is not limited to such examples. All parts or amounts, unless otherwise specified, are by weight.
In order to facilitate understanding of the following examples certain frequently occurring methods and/or terms will be described.
"Plasmids" are designated by a lower case p preceded and/or followed by capital letters and/or numbers. The starting plasmids herein are either commercially available, publicly available on an unrestricted basis, or can be constructed from available plasmids in accord with published procedures. In addition, equivalent plasmids to those described are known in the art and will be apparent to the ordinarily skilled artisan.
"Digestion" of DNA refers to catalytic cleavage of the DNA with a restriction enzyme that acts only at certain sequences in the DNA. The various restriction enzymes used herein are commercially available and their reaction conditions, cofactors and other requirements were used as would be known to the ordinarily skilled artisan. For analytical purposes, typically 1 µg of plasmid or DNA fragment is used with about 2 units of enzyme in about 20 µl of buffer solution. For the purpose of isolating DNA fragments for plasmid construction, typically 5 to 50 µg of DNA are digested with 20 to 250 units of enzyme in a larger volume. Appropriate buffers and substrate amounts for particular restriction enzymes are specified by the manufacturer. Incubation times of about 1 hour at 37°C are ordinarily used, but may vary in accordance with the supplier's instructions. After digestion the reaction is electrophoresed directly on a polyacrylamide gel to isolate the desired fragment.
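The enzyme-to-DNA ratios implied by these typical quantities can be compared with a short calculation. This is an illustrative sketch only; the function name is hypothetical, and pairing the low ends (20 units with 5 µg) and the high ends (250 units with 50 µg) of the preparative ranges is an assumption, since the text gives the two ranges independently.

```python
def units_per_microgram(enzyme_units: float, dna_ug: float) -> float:
    """Return restriction enzyme units used per microgram of DNA."""
    if dna_ug <= 0:
        raise ValueError("DNA amount must be positive")
    return enzyme_units / dna_ug


# Analytical digest: about 2 units of enzyme for 1 ug of DNA.
print(units_per_microgram(2, 1))
# Preparative digest, assuming the range endpoints are paired (see note above).
print(units_per_microgram(20, 5))
print(units_per_microgram(250, 50))
```

Under that pairing assumption the preparative digests use roughly two to two and a half times more enzyme per microgram than the analytical digest.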
Size separation of the cleaved fragments is performed using 1 percent TAE agarose gel described by Sambrook, et al., "Molecular Cloning: A Laboratory Manual" Cold Spring Harbor Laboratory Press, (1989).
"Oligonucleotides" refers to either a single stranded polydeoxynucleotide or two complementary polydeoxynucleotide strands which may be chemically synthesized. Such synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated.
"Ligation" refers to the process of forming phosphodiester bonds between two double stranded nucleic acid fragments (Maniatis, T., et al., Id., p. 146). Unless otherwise provided, ligation may be accomplished using known buffers and conditions with 10 units of T4 DNA ligase ("ligase") per 0.5 µg of approximately equimolar amounts of the DNA fragments to be ligated.
Unless otherwise stated, transformation was performed as described in the method of Graham, F. and Van der Eb, A., Virology, 52:456-457 (1973).

Example 1
Determination of Transcription of a Breast Specific Gene
To assess the presence or absence of active transcription of a breast specific gene RNA, approximately 6 ml of venous blood is obtained with a standard venipuncture technique using heparinized tubes. Whole blood is mixed with an equal volume of phosphate buffered saline, which is then layered over 8 ml of Ficoll (Pharmacia, Uppsala, Sweden) in a 15-ml polystyrene tube. The gradient is centrifuged at 1800 x g for 20 min at 5°C. The lymphocyte and granulocyte layer (approximately 5 ml) is carefully aspirated and rediluted up to 50 ml with phosphate-buffered saline in a 50-ml tube, which is centrifuged again at 1800 x g for 20 min. at 5°C. The supernatant is discarded and the pellet containing nucleated cells is used for RNA extraction using the RNazole B method as described by the manufacturer (Tel-Test Inc., Friendswood, TX).
To determine the quantity of mRNA from the gene of interest, a probe is designed with an identity to at least a portion of the mRNA sequence transcribed from a human gene whose coding portion includes a DNA sequence of Figures 1-20 (SEQ ID NO:1-20). This probe is mixed with the extracted RNA and the mixed DNA and RNA are precipitated with ethanol (-70°C for 15 minutes). The pellet is resuspended in hybridization buffer and dissolved. The tubes containing the mixture are incubated in a 72°C water bath for 10-15 mins. to denature the DNA. The tubes are rapidly transferred to a water bath at the desired hybridization temperature. Hybridization temperature depends on the G + C content of the DNA. Hybridization is done for 3 hrs. 0.3 ml of nuclease-S1 buffer is added and mixed well. 50 µl of 4.0 M ammonium acetate and 0.1 M EDTA is added to stop the reaction. The mixture is extracted with phenol/chloroform and 20 µg of carrier tRNA is added and precipitation is done with an equal volume of isopropanol. The precipitate is dissolved in 40 µl of TE (pH 7.4) and run on an alkaline agarose gel. Following electrophoresis, the RNA is microsequenced to confirm the nucleotide sequence. (See Favaloro, J. et al., Methods Enzymol., 65:718 (1980) for a more detailed review).
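Because the hybridization temperature is said to depend on the G + C content of the DNA, the quantity that would be computed first is the G + C fraction of the probe. The sketch below shows only that calculation; the text gives no formula relating G + C content to temperature, so none is assumed here. The function name and the example sequence are hypothetical.

```python
def gc_fraction(sequence: str) -> float:
    """Return the fraction of bases in a DNA sequence that are G or C."""
    bases = sequence.upper()
    if not bases:
        raise ValueError("sequence must not be empty")
    gc_count = sum(1 for base in bases if base in "GC")
    return gc_count / len(bases)


# Hypothetical probe sequence, for illustration only.
print(gc_fraction("ATGCATGC"))
```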
Two oligonucleotide primers are employed to amplify the sequence isolated by the above methods. The 5' primer is 20 nucleotides long and the 3' primer is a complementary sequence for the 3' end of the isolated mRNA. The primers are custom designed according to the isolated mRNA. The reverse transcriptase reaction and PCR amplification are performed sequentially without interruption in a Perkin Elmer 9600 PCR machine (Emeryville, CA). Four hundred ng total RNA in 20 µl diethylpyrocarbonate-treated water are placed in a 65°C water bath for 5 min. and then quickly chilled on ice immediately prior to the addition of PCR reagents. The 50-µl total PCR volume consists of 2.5 units Taq polymerase (Perkin-Elmer); 2 units avian myeloblastosis virus reverse transcriptase (Boehringer Mannheim, Indianapolis, IN); 200 µM each of dCTP, dATP, dGTP and dTTP (Perkin Elmer); 18 pM each primer; 10 mM Tris-HCl; 50 mM KCl; and 2 mM MgCl2 (Perkin Elmer). PCR conditions are as follows: cycle 1 is 42°C for 15 min. then 97°C for 15 s (1 cycle); cycle 2 is 95°C for 1 min., 60°C for 1 min., and 72°C for 30 s (15 cycles); cycle 3 is 95°C for 1 min., 60°C for 1 min., and 72°C for 1 min. (10 cycles); cycle 4 is 95°C for 1 min., 60°C for 1 min., and 72°C for 2 min. (8 cycles); cycle 5 is 72°C for 15 min. (1 cycle); and the final cycle is a 4°C hold until the sample is taken out of the machine. The 50-µl PCR products are concentrated down to 10 µl with vacuum centrifugation, and a sample is then run on a thin 1.2% Tris-borate-EDTA agarose gel containing ethidium bromide. A band of expected size would indicate that this gene is present in the tissue assayed. The amount of RNA in the pellet may be quantified in numerous ways, for example, it may be weighed.
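The thermal program above can be written down as data, which makes it easy to check the number of amplification cycles and the programmed hold time. This is a minimal sketch, assuming the times and temperatures exactly as listed; the class and function names are hypothetical, and ramp times between temperatures and the final 4°C hold are not counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    steps: tuple   # each step is (temperature in degrees C, duration in seconds)
    cycles: int    # number of times the steps are repeated


# Stages as listed in the text, in order (the 4 C hold is omitted).
PROGRAM = (
    Stage(((42, 15 * 60), (97, 15)), 1),            # reverse transcription, then denaturation
    Stage(((95, 60), (60, 60), (72, 30)), 15),
    Stage(((95, 60), (60, 60), (72, 60)), 10),
    Stage(((95, 60), (60, 60), (72, 120)), 8),
    Stage(((72, 15 * 60),), 1),                     # final extension
)


def programmed_seconds(program) -> int:
    """Sum the programmed hold times over all stages and cycles."""
    return sum(stage.cycles * sum(seconds for _, seconds in stage.steps) for stage in program)


def amplification_cycles(program) -> int:
    """Count cycles in stages that contain a 95 C denaturation step."""
    return sum(stage.cycles for stage in program if any(temp == 95 for temp, _ in stage.steps))


print(amplification_cycles(PROGRAM))
print(programmed_seconds(PROGRAM) / 60)
```

The extension time at 72°C lengthens from 30 s to 2 min across the three amplification stages, which the data layout makes visible at a glance.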
Verification of the nucleotide sequence of the PCR products is done by microsequencing. The PCR product is purified with a Qiagen PCR Product Purification Kit (Qiagen, Chatsworth, CA) as described by the manufacturer. One µg of the PCR product undergoes PCR sequencing by using the Taq DyeDeoxy Terminator Cycle sequencing kit in a Perkin-Elmer 9600 PCR machine as described by Applied Biosystems (Foster City, CA). The sequenced product is purified using Centri-Sep columns (Princeton Separations, Adelphia, NJ) as described by the company. This product is then analyzed with an ABI model 373A DNA sequencing system (Applied Biosystems) integrated with a Macintosh IIci computer.

Example 2
Bacterial Expression and Purification of the BSG Proteins and Use For Preparing a Monoclonal Antibody
The DNA sequence encoding a polypeptide of the present invention, for this example BSG1, ATCC # 97175, is initially amplified using PCR oligonucleotide primers corresponding to the 5' sequences of the protein and the vector sequences 3' to the protein. Additional nucleotides corresponding to the DNA sequence are added to the 5' and 3' sequences respectively. The 5' oligonucleotide primer has the sequence 5' GCCACCATGGAl~rl~l~l-~AAG 3' (SEQ ID NO:21) and contains an NcoI restriction enzyme site followed by 15 nucleotides of coding sequence starting from the initial amino acid of the processed protein. The 3' sequence 5' GCGCAGATCTGTCTCCCCCA~-l--l~C 3' (SEQ ID NO:22) contains a complementary sequence to a BglII restriction enzyme site and is followed by 18 nucleotides of the nucleic acid sequence encoding the protein. The restriction enzyme sites correspond to the restriction enzyme sites on a bacterial expression vector, pQE-60 (Qiagen, Inc. Chatsworth, CA). pQE-60 encodes antibiotic resistance (Ampr), a bacterial origin of replication (ori), an IPTG-regulatable promoter operator (P/O), a ribosome binding site (RBS), a 6-His tag and restriction enzyme sites. pQE-60 is then digested with NcoI and BglII. The amplified sequences are ligated into pQE-60 and inserted in frame with the sequence encoding for the histidine tag and the RBS. The ligation mixture is then used to transform an E. coli strain M15/rep4 (Qiagen) by the procedure described in Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, (1989). M15/rep4 contains multiple copies of the plasmid pREP4, which expresses the lacI repressor and also confers kanamycin resistance (Kanr). Transformants are identified by their ability to grow on LB plates and ampicillin/kanamycin resistant colonies are selected. Plasmid DNA is isolated and confirmed by restriction analysis.
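The presence of the NcoI and BglII sites in the two primers can be checked with a simple string search. The sketch below uses only the first ten bases of each primer as they appear legibly in the text; the remaining bases are not reproduced. The recognition sequences (CCATGG for NcoI, AGATCT for BglII) are the standard ones for these enzymes, and the function and constant names are hypothetical.

```python
# Standard recognition sequences, written 5' to 3'.
RECOGNITION_SITES = {"NcoI": "CCATGG", "BglII": "AGATCT"}

# Legible leading bases of the two primers only; the rest is not reproduced here.
FIVE_PRIME_PRIMER_START = "GCCACCATGG"
THREE_PRIME_PRIMER_START = "GCGCAGATCT"


def find_site(sequence: str, enzyme: str) -> int:
    """Return the 0-based position of the enzyme's site, or -1 if it is absent."""
    return sequence.upper().find(RECOGNITION_SITES[enzyme])


print(find_site(FIVE_PRIME_PRIMER_START, "NcoI"))
print(find_site(THREE_PRIME_PRIMER_START, "BglII"))
```

In both primers the site sits after four leading bases, which is consistent with the description of extra nucleotides added ahead of the restriction site.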
Clones containing the desired constructs are grown overnight (O/N) in liquid culture in LB media supplemented with both Amp (100 µg/ml) and Kan (25 µg/ml). The O/N culture is used to inoculate a large culture at a ratio of 1:100 to 1:250. The cells are grown to an optical density 600 (O.D.600) of between 0.4 and 0.6. IPTG ("Isopropyl-β-D-thiogalactopyranoside") is then added to a final concentration of 1 mM. IPTG induces by inactivating the lacI repressor, clearing the P/O leading to increased gene expression. Cells are grown an extra 3 to 4 hours. Cells are then harvested by centrifugation. The cell pellet is solubilized in the chaotropic agent 6 Molar Guanidine HCl.
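The inoculation ratio and the IPTG addition reduce to two dilution calculations, sketched below. The 1000 ml culture volume and the 1 M (1000 mM) IPTG stock are hypothetical values chosen for illustration; the text specifies only the 1:100 to 1:250 ratio and the 1 mM final concentration. The small volume contributed by the added stock is neglected.

```python
def inoculum_volume_ml(culture_ml: float, dilution: float) -> float:
    """Volume of overnight culture needed for a 1:dilution inoculation."""
    if dilution <= 0:
        raise ValueError("dilution must be positive")
    return culture_ml / dilution


def stock_volume_ml(final_mm: float, final_ml: float, stock_mm: float) -> float:
    """Volume of stock to add, from C1*V1 = C2*V2 (added volume neglected)."""
    if stock_mm <= 0:
        raise ValueError("stock concentration must be positive")
    return final_mm * final_ml / stock_mm


CULTURE_ML = 1000.0      # hypothetical large-culture volume
IPTG_STOCK_MM = 1000.0   # hypothetical 1 M stock

print(inoculum_volume_ml(CULTURE_ML, 100))
print(inoculum_volume_ml(CULTURE_ML, 250))
print(stock_volume_ml(1.0, CULTURE_ML, IPTG_STOCK_MM))
```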
After clarification, solubilized protein is purified from this solution by chromatography on a Nickel-Chelate column under conditions that allow for tight binding by proteins containing the 6-His tag (Hochuli, E. et al., J. Chromatography 411:177-184 (1984)). BSG1 protein (>90% pure) is eluted from the column in 6 molar guanidine HCl pH 5.0 and for the purpose of renaturation adjusted to 3 molar guanidine HCl, 100 mM sodium phosphate, 10 mmolar glutathione (reduced) and 2 mmolar glutathione (oxidized). After incubation in this solution for 12 hours the protein is dialyzed to 10 mmolar sodium phosphate.
The protein purified in this manner may be used as an epitope to raise monoclonal antibodies specific to such protein. The monoclonal antibodies generated against the isolated protein can be obtained by direct injection of the polypeptides into an animal or by administering the polypeptides to an animal. The antibodies so obtained will then bind to the protein itself. Such antibodies can then be used to isolate the protein from tissue expressing that polypeptide by the use of, for example, an ELISA assay.

Example 3
Preparation of cDNA Libraries from Breast Tissue
Total cellular RNA is prepared from tissues by the guanidinium-phenol method as previously described (P. Chomczynski and N. Sacchi, Anal. Biochem., 162:156-159 (1987)) using RNAzol (Cinna-Biotecx). An additional ethanol precipitation of the RNA is included. Poly A mRNA is isolated from the total RNA using oligo dT-coated latex beads (Qiagen). Two rounds of poly A selection are performed to ensure better separation from non-polyadenylated material when sufficient quantities of total RNA are available.
The mRNA selected on the oligo dT is used for the synthesis of cDNA by a modification of the method of Gubler and Hoffman (Gubler, U. and B.J. Hoffman, 1983, Gene, 25:263). The first strand synthesis is performed using either Moloney murine sarcoma virus reverse transcriptase (Stratagene) or Superscript II (RNase H minus Moloney murine reverse transcriptase, Gibco-BRL). First strand synthesis is primed using a primer/linker containing an Xho I restriction site. The nucleotide mix used in the synthesis contains methylated dCTP to prevent restriction within the cDNA sequence. For second-strand synthesis E. coli polymerase Klenow fragment is used and [32P]-dATP is incorporated as a tracer of nucleotide incorporation.
Following 2nd strand synthesis, the cDNA is made blunt ended using either T4 DNA polymerase or Klenow fragment. Eco RI adapters are added to the cDNA and the cDNA is restricted with Xho I. The cDNA is size fractionated over a Sephacryl S-500 column (Pharmacia) to remove excess linkers and cDNAs under approximately 500 base pairs.
The cDNA is cloned unidirectionally into the Eco RI-Xho I sites of either pBluescript II phagemid or lambda Uni-Zap XR (Stratagene). In the case of cloning into pBluescript II, the plasmids are electroporated into E. coli SURE competent cells (Stratagene). When the cDNA is cloned into Uni-Zap XR it is packaged using the Gigapack II packaging extract (Stratagene). The packaged phage is used to infect SURE cells and amplified. The pBluescript phagemid containing the cDNA inserts are excised from the lambda Zap phage using the helper phage ExAssist (Stratagene). The rescued phagemid is plated on SOLR E. coli cells (Stratagene).
Preparation of Sequencing Templates
Template DNA for sequencing is prepared by 1) a boiling method or 2) PCR amplification.
The boiling method is a modification of the method of Holmes and Quigley (Holmes, D.S. and M. Quigley, 1981, Anal. Biochem., 114:193). Colonies from either cDNA cloned into Bluescript II or rescued Bluescript phagemid are grown in an enriched bacterial media overnight. 400 µl of cells are centrifuged and resuspended in STET (0.1 M NaCl, 10 mM Tris pH 8.0, 1.0 mM EDTA and 5% Triton X-100) including lysozyme (80 µg/ml) and RNase A (4 µg/ml). Cells are boiled for 40 seconds and centrifuged for 10 minutes. The supernatant is removed and the DNA is precipitated with PEG/NaCl and washed with 70% ethanol (2x). Templates are resuspended in water at approximately 250 ng/µl.
Preparation of templates by PCR is a modification of the method of Rosenthal et al. (Rosenthal, et al., Nucleic Acids Res., 1993, 21:173-174). Colonies containing cDNA cloned into pBluescript II or rescued pBluescript phagemid are grown overnight in LB containing ampicillin in a 96 well tissue culture plate. Two µl of the cultures are used as template in a PCR reaction (Saiki, RK, et al., Science, 239:487-493, 1988; and Saiki, RK, et al., Science, 230:1350-1354, 1985) using a tricine buffer system (Ponce and Micol., Nucleic Acids Res., 1992, 20:1992.) and 200 µM dNTPs. The primer set chosen for amplification of the templates is outside of primer sites chosen for sequencing of the templates. The primers used are 5'-ATGCTTCCGGCTCGTATG-3' (SEQ ID NO:23) which is 5' of the M13 reverse sequence in pBluescript and 5'-GGGTTTTCCCAGTCACGAC-3' (SEQ ID NO:24) which is 3' of the M13 forward primer in pBluescript. Any primers which correspond to the sequence flanking the M13 forward and reverse sequences can be used. Perkin-Elmer 9600 thermocyclers are used for amplification of the templates with the following cycler conditions: 5 min at 94°C (1 cycle); (20 sec at 94°C; 20 sec at 55°C; 1 min at 72°C) (30 cycles); 7 min at 72°C (1 cycle). Following amplification the PCR templates are precipitated using PEG/NaCl and washed three times with 70% ethanol. The templates are resuspended in water.

Example 4
Isolation of a Selected Clone From Breast Tissue
Two approaches are used to isolate a particular clone from a cDNA library prepared from human breast tissue.
In the first, a clone is isolated directly by screening the library using an oligonucleotide probe. To isolate a particular clone, a specific oligonucleotide with 30-40 nucleotides is synthesized using an Applied Biosystems DNA synthesizer according to one of the partial sequences described in this application. The oligonucleotide is labeled with 32P-γ-ATP using T4 polynucleotide kinase and purified according to the standard protocol (Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring, NY, 1982). The Lambda cDNA library is plated on 1.5% agar plates to a density of 20,000-50,000 pfu/150 mm plate. These plates are screened using Nylon membranes according to the standard phage screening protocol (Stratagene, 1993). Specifically, the Nylon membrane with denatured and fixed phage DNA is prehybridized in 6 x SSC, 20 mM NaH2PO4, 0.4% SDS, 5 x Denhardt's 500 µg/ml denatured, sonicated salmon sperm DNA; and 6 x SSC, 0.1% SDS. After one hour of prehybridization, the membrane is hybridized with hybridization buffer 6 x SSC, 20 mM NaH2PO4, 0.4% SDS, 500 µg/ml denatured, sonicated salmon sperm DNA with 1 x 10^6 cpm/ml 32P-probe overnight at 42°C. The membrane is washed at 45-50°C with washing buffer 6 x SSC, 0.1% SDS for 20-30 minutes, dried and exposed to Kodak X-ray film overnight.
Positive clones are isolated and purified by secondary and tertiary screening. The purified clone is sequenced to verify its identity to the partial sequence described in this application.
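The plating density given above (20,000-50,000 pfu per 150 mm plate) determines how many plates a primary screen requires. The sketch below works this out for a library of one million pfu; that library size is a hypothetical figure chosen for illustration and does not appear in the text, and the function name is likewise hypothetical.

```python
def plates_needed(total_pfu: int, pfu_per_plate: int) -> int:
    """Number of plates required to screen total_pfu at the given density (rounded up)."""
    if total_pfu < 0 or pfu_per_plate <= 0:
        raise ValueError("total_pfu must be non-negative and pfu_per_plate positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_pfu // pfu_per_plate)


LIBRARY_PFU = 1_000_000  # hypothetical number of plaques to screen

print(plates_needed(LIBRARY_PFU, 50_000))  # densest plating in the stated range
print(plates_needed(LIBRARY_PFU, 20_000))  # sparsest plating in the stated range
```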
An alternative approach to screen the cDNA library prepared from human breast tissue is to prepare a DNA probe corresponding to the entire partial sequence. To prepare a probe, two oligonucleotide primers of 17-20 nucleotides derived from both ends of the partial sequence reported are synthesized and purified. These two oligonucleotides are used to amplify the probe using the cDNA library template. The DNA template is prepared from the phage lysate of the cDNA library according to the standard phage DNA preparation protocol (Maniatis et al.). The polymerase chain reaction is carried out in 25 µl reaction mixture with 0.5 µg of the above cDNA template. The reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer and 0.25 Unit of Taq polymerase. Thirty five cycles of PCR (denaturation at 94°C for 1 min; annealing at 55°C for 1 min; elongation at 72°C for 1 min) are performed with the Perkin-Elmer Cetus automated thermal cycler. The amplified product is analyzed by agarose gel electrophoresis and the DNA band with expected molecular weight is excised and purified. The PCR product is verified to be the probe by subcloning and sequencing the DNA product. The probe is labeled with the Multiprime DNA Labelling System (Amersham) at a specific activity c 1 x 10^9 dpm/µg. This probe is used to screen the lambda cDNA library according to Stratagene's protocol. Hybridization is carried out with 5X TEN (20X TEN: 0.3 M Tris-HCl pH 8.0, 0.02 M EDTA and 3 M NaCl), 5X Denhardt's, 0.5% sodium pyrophosphate, 0.1% SDS, 0.2 mg/ml heat denatured salmon sperm DNA and 1 x 10^6 cpm/ml of [32P]-labeled probe at 55°C for 12 hours. The filters are washed in 0.5X TEN at room temperature for 20-30 min., then at 55°C for 15 min. The filters are dried and autoradiographed at -70°C using Kodak XAR-5 film. The positive clones are purified by secondary and tertiary screening. The sequence of the isolated clone is verified by DNA sequencing.
General procedures for obtaining complete sequences from partial sequences described herein are summarized as follows:
Procedure 1
Selected human DNA from the partial sequence clone (the cDNA clone that was sequenced to give the partial sequence) is purified e.g., by endonuclease digestion using EcoRI, gel electrophoresis, and isolation of the clone by removal from low melting agarose gel. The isolated insert DNA is radiolabeled e.g., with 32P labels, preferably by nick translation or random primer labeling. The labeled insert is used as a probe to screen a lambda phage cDNA library or a plasmid cDNA library. Colonies containing clones related to the probe cDNA are identified and purified by known purification methods. The ends of the newly purified clones are nucleotide sequenced to identify full length sequences. Complete sequencing of full length clones is then performed by Exonuclease III digestion or primer walking. Northern blots of the mRNA from various tissues using at least part of the deposited clone from which the partial sequence is obtained as a probe can optionally be performed to check the size of the mRNA against that of the purported full length cDNA.
The ~ollowing procedures 2 and 3 can be used to obtain full length genes or full length coding portions of genes where a clone isolated from the deposited clone mixture does not contain a full length sequence. A library derived from human breast tissue or from the deposited clone mixture is also applicable to obtAin;ng full length sequences from clones obtained from sources other than the deposited mixture by use of the partial sequences of the present invention.

Procedure 2
RACE Protocol For Recovery of Full-Length Genes
Partial cDNA clones can be made full-length by utilizing the rapid amplification of cDNA ends (RACE) procedure described in Frohman, M.A., Dush, M.K. and Martin, G.R. (1988) Proc. Nat'l. Acad. Sci. USA, 85:8998-9002. A cDNA clone missing either the 5' or 3' end can be reconstructed to include the absent base pairs extending to the translational start or stop codon, respectively. In most cases, cDNAs are missing the start of translation, therefor. The following briefly describes a modification of this original 5' RACE procedure. Poly A+ or total RNA is reverse transcribed with Superscript II (Gibco/BRL) and an antisense or complementary primer specific to the cDNA sequence. The primer is removed from the reaction with a Microcon Concentrator (Amicon). The first-strand cDNA is then tailed with dATP and terminal deoxynucleotide transferase (Gibco/BRL). Thus, an anchor sequence is produced which is needed for PCR amplification. The second strand is synthesized from the dA-tail in PCR buffer, Taq DNA polymerase (Perkin-Elmer Cetus), an oligo-dT primer containing three adjacent restriction sites (XhoI, SalI and ClaI) at the 5' end and a primer containing just these restriction sites. This double-stranded cDNA is PCR amplified for 40 cycles with the same primers as well as a nested cDNA-specific antisense primer. The PCR products are size-separated on an ethidium bromide-agarose gel and the region of gel containing cDNA products of the predicted size of the missing protein-coding DNA is removed. cDNA is purified from the agarose with the Magic PCR Prep kit (Promega), restriction digested with XhoI or SalI, and ligated to a plasmid such as pBluescript SKII (Stratagene) at XhoI and EcoRV sites. This DNA is transformed into bacteria and the plasmid clones sequenced to identify the correct protein-coding inserts. Correct 5' ends are confirmed by comparing this sequence with the putatively identified homologue and overlap with the partial cDNA clone.
Several quality-controlled kits are available for purchase. Similar reagents and methods to those above are supplied in kit form from Gibco/BRL. A second kit is available from Clontech which is a modification of a related technique, SLIC (single-stranded ligation to single-stranded cDNA) developed by Dumas et al. (Dumas, J.B., Edwards, M., Delort, J. and Mallet, J., 1991, Nucleic Acids Res., 19:5227-5232). The major differences in procedure are that the RNA is alkaline hydrolyzed after reverse transcription and RNA ligase is used to join a restriction site-containing anchor primer to the first-strand cDNA. This obviates the necessity for the dA-tailing reaction which results in a polyT stretch that is difficult to sequence past.
An alternative to generating 5' cDNA from RNA is to use cDNA library double-stranded DNA. An asymmetric PCR-amplified antisense cDNA strand is synthesized with an antisense cDNA-specific primer and a plasmid-anchored primer. These primers are removed and a symmetric PCR reaction is performed with a nested cDNA-specific antisense primer and the plasmid-anchored primer.
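By way of illustration only, the anchor primer arrangement used in the RACE protocol above can be sketched in Python. The recognition sequences of XhoI, SalI and ClaI are standard; the primer layout itself (a two-base clamp, the three sites in that order, then 17 T residues) is an assumption made for this sketch, since the text specifies only that the oligo-dT primer carries the three sites at its 5' end.

```python
# Recognition sequences of the three enzymes named in the RACE protocol.
SITES = {"XhoI": "CTCGAG", "SalI": "GTCGAC", "ClaI": "ATCGAT"}


def build_anchor(enzymes, clamp="GA"):
    """Join a short 5' clamp and the recognition sites (hypothetical layout)."""
    return clamp + "".join(SITES[name] for name in enzymes)


def find_sites(seq):
    """Map each enzyme to the 0-based start positions of its site in seq."""
    hits = {}
    for name, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        hits[name] = positions
    return hits


# Primer containing just the restriction sites (assumed layout).
anchor_primer = build_anchor(["XhoI", "SalI", "ClaI"])
# Oligo-dT primer: the same sites at the 5' end, T residues to anneal to the dA-tail.
oligo_dt_primer = anchor_primer + "T" * 17

for enzyme, positions in find_sites(oligo_dt_primer).items():
    print(enzyme, positions)
```

A check of this kind is also a convenient way to confirm that a candidate cDNA-specific primer does not itself contain one of the cloning sites.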

Procedure 3
RNA Ligase Protocol For Generating The 5' End Sequences To Obtain Full Length Genes
Once a gene of interest is identified, several methods are available for the identification of the 5' or 3' portions of the gene which may not be present in the original deposited clone. These methods include but are not limited to filter probing, clone enrichment using specific probes and protocols similar or identical to 5' and 3' RACE. While the full length gene may be present in a library and can be identified by probing, a useful method for generating the 5' end is to use the existing sequence information from the original partial sequence to generate the missing information. A method similar to 5' RACE is available for generating the missing 5' end of a desired full-length gene. (This method was published by Fromont-Racine et al., Nucleic Acids Res., 21(7):1683-1684 (1993).) Briefly, a specific RNA oligonucleotide is ligated to the 5' ends of a population of RNA presumably containing full-length gene RNA transcript, and a primer set containing a primer specific to the ligated RNA oligonucleotide and a primer specific to a known sequence (EST) of the gene of interest is used to PCR amplify the 5' portion of the desired full length gene, which may then be sequenced and used to generate the full length gene. This method starts with total RNA isolated from the desired source; poly A RNA may be used but is not a prerequisite for this procedure. The RNA preparation may then be treated with phosphatase if necessary to eliminate 5' phosphate groups on degraded or damaged RNA which may interfere with the later RNA ligase step. The phosphatase, if used, is then inactivated and the RNA is treated with tobacco acid pyrophosphatase in order to remove the cap structure present at the 5' ends of messenger RNAs. This reaction leaves a 5' phosphate group at the 5' end of the cap-cleaved RNA which can then be ligated to an RNA oligonucleotide using T4 RNA ligase. This modified RNA preparation can then be used as a template for first strand cDNA synthesis using a gene-specific oligonucleotide. The first strand synthesis reaction can then be used as a template for PCR amplification of the desired 5' end using a primer specific to the ligated RNA oligonucleotide and a primer specific to the known sequence (EST) of the gene of interest. The resultant product is then sequenced and analyzed to confirm that the 5' end sequence belongs to the partial sequence.
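The final amplification and confirmation steps of Procedure 3 can be sketched, purely as an illustration, as an in silico PCR. All sequences below are hypothetical placeholders, the template is written as its sense-strand equivalent for readability, and exact string matching stands in for primer annealing; none of these details comes from the procedure itself.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def amplify(template, forward_primer, reverse_primer):
    """Return the sense-strand product bounded by the two primers, or None."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# All sequences below are hypothetical placeholders.
ligated_oligo = "GCTGATGGCGATGAATGAAC"      # DNA copy of the ligated RNA oligo
missing_5prime = "ACCGGTCAGCATGGCCAAG"      # region absent from the partial clone
known_est = "TTGACCGAGGCTAGCTTCAAGGTCCTA"   # known partial sequence (EST)
template = ligated_oligo + missing_5prime + known_est

# Antisense primer specific to the known sequence.
gene_primer = reverse_complement(known_est[5:25])
product = amplify(template, ligated_oligo, gene_primer)

# Strip the oligo-derived bases, then confirm the overlap with the EST.
new_sequence = product[len(ligated_oligo):]
overlaps_est = new_sequence.endswith(known_est[:25])
print(new_sequence, overlaps_est)
```

The overlap test mirrors the confirmation step: the newly obtained 5' sequence is accepted only if it runs into the known partial sequence.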

Example 5
Cloning and expression of BSG1 using the baculovirus expression system
The DNA sequence encoding the full length BSG1 protein, ATCC # 97175, was amplified using PCR oligonucleotide primers corresponding to the 5' and 3' sequences of the gene:
The 5' primer has the sequence 5' AAAGGA~I~CCATCATGG Al~rl-l-lCAAGAAG 3' (SEQ ID NO:25) and contains a BamHI restriction enzyme site (in bold) followed by 8 nucleotides resembling an efficient signal for the initiation of translation in eukaryotic cells (Kozak, M., J. Mol. Biol., 196:947-950 (1987)) of the BSG1 gene (the initiation codon for translation "ATG" is underlined).
The 3' primer has the sequence 5' A~ATCTAGACTAGTCTCCCCC ACTCTG 3' (SEQ ID NO:26) and contains the cleavage site for the restriction endonuclease XbaI and 21 nucleotides complementary to the 3' sequence of the BSG1 gene. The amplified sequences were isolated from a 1% agarose gel using a commercially available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment was then digested with the endonucleases BamHI and XbaI and then purified again on a 1% agarose gel. This fragment is designated F2.
The vector pA2 (modification of pVL941 vector, discussed below) is used for the expression of the BSG1 protein using the baculovirus expression system (for review see: Summers, M.D. and Smith, G.E. 1987, A manual of methods for baculovirus vectors and insect cell culture procedures, Texas Agricultural Experimental Station Bulletin No. 1555). This expression vector contains the strong polyhedrin promoter of the Autographa californica nuclear polyhedrosis virus (AcMNPV) followed by the recognition sites for the restriction endonucleases BamHI and XbaI. The polyadenylation site of the simian virus (SV)40 is used for efficient polyadenylation. For an easy selection of recombinant virus the beta-galactosidase gene from E. coli is inserted in the same orientation as the polyhedrin promoter followed by the polyadenylation signal of the polyhedrin gene. The polyhedrin sequences are flanked at both sides by viral sequences for the cell-mediated homologous recombination of co-transfected wild-type viral DNA. Many other baculovirus vectors could be used in place of pA2 such as pRG1, pAc373, pVL941 and pAcIM1 (Luckow, V.A. and Summers, M.D., Virology, 170:31-39).
The plasmid was digested with the restriction enzymes BamHI and XbaI and dephosphorylated using calf intestinal phosphatase by procedures known in the art. The DNA was then isolated from a 1% agarose gel using the commercially available kit ("Geneclean" BIO 101 Inc., La Jolla, Ca.). This vector DNA is designated V2.

Fragment F2 and the dephosphorylated plasmid pA2 were ligated with T4 DNA ligase. E. coli HB101 cells were then transformed and bacteria identified that contained the plasmid (pBacBSG1) with the BSG1 gene using the enzymes BamHI and XbaI. The sequence of the cloned fragment was confirmed by DNA sequencing.
5 µg of the plasmid pBacBSG1 was co-transfected with 1.0 µg of a commercially available linearized baculovirus ("BaculoGold™ baculovirus DNA", Pharmingen, San Diego, CA.) using the lipofection method (Felgner et al. Proc. Natl. Acad. Sci. USA, 84:7413-7417 (1987)).
1 µg of BaculoGold™ virus DNA and 5 µg of the plasmid pBacBSG1 were mixed in a sterile well of a microtiter plate containing 50 µl of serum free Grace's medium (Life Technologies Inc., Gaithersburg, MD). Afterwards 10 µl Lipofectin plus 90 µl Grace's medium were added, mixed and incubated for 15 minutes at room temperature. Then the transfection mixture was added drop-wise to the Sf9 insect cells (ATCC CRL 1711) seeded in a 35 mm tissue culture plate with 1 ml Grace's medium without serum. The plate was rocked back and forth to mix the newly added solution. The plate was then incubated for 5 hours at 27°C. After 5 hours the transfection solution was removed from the plate and 1 ml of Grace's insect medium supplemented with 10% fetal calf serum was added. The plate was put back into an incubator and cultivation continued at 27°C for four days.
After four days the supernatant was collected and a plaque assay performed similarly to that described by Summers and Smith (supra). As a modification an agarose gel with "Blue Gal" (Life Technologies Inc., Gaithersburg) was used which allows an easy isolation of blue stained plaques. (A detailed description of a "plaque assay" can also be found in the user's guide for insect cell culture and baculovirology distributed by Life Technologies Inc., Gaithersburg, page 9-10).

Four days after the serial dilution, the virus was added to the cells and blue stained plaques were picked with the tip of an Eppendorf pipette. The agar containing the recombinant viruses was then resuspended in an Eppendorf tube containing 200 µl of Grace's medium. The agar was removed by a brief centrifugation and the supernatant containing the recombinant baculovirus was used to infect Sf9 cells seeded in 35 mm dishes. Four days later the supernatants of these culture dishes were harvested and then stored at 4°C.
Sf9 cells were grown in Grace's medium supplemented with 10% heat-inactivated FBS. The cells were infected with the recombinant baculovirus V-BSG1 at a multiplicity of infection (MOI) of 2. Six hours later the medium was removed and replaced with SF900 II medium minus methionine and cysteine (Life Technologies Inc., Gaithersburg). 42 hours later 5 µCi of 35S-methionine and 5 µCi 35S cysteine (Amersham) were added. The cells were further incubated for 16 hours before they were harvested by centrifugation and the labelled proteins visualized by SDS-PAGE and autoradiography.
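As a worked illustration of the infection step, the volume of virus stock needed for a given MOI follows from the definition MOI = plaque-forming units added / number of cells. In the sketch below only the MOI of 2 is taken from the text; the cell count and the titer are assumed values chosen for the arithmetic, and in practice the titer would come from the plaque assay.

```python
def virus_volume_ml(moi, cell_count, titer_pfu_per_ml):
    """Volume of virus stock (ml) that delivers the requested MOI.

    MOI = plaque-forming units added / number of cells, so
    volume = MOI * cells / titer.
    """
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return moi * cell_count / titer_pfu_per_ml


# MOI of 2 is from the text; cell count and titer are assumed values.
volume = virus_volume_ml(moi=2, cell_count=1_000_000, titer_pfu_per_ml=1e8)
print(f"{volume * 1000:.1f} microliters of virus stock")
```

With these assumed figures the calculation is 2 x 10^6 pfu required divided by 10^8 pfu/ml, i.e. 0.02 ml of stock.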

Example 6
Expression of Recombinant BSG1 in COS cells
The expression plasmid, BSG1 HA, is derived from a vector pcDNAI/Amp (Invitrogen) containing: 1) SV40 origin of replication, 2) ampicillin resistance gene, 3) E. coli replication origin, 4) CMV promoter followed by a polylinker region, an SV40 intron and polyadenylation site. A DNA fragment encoding the entire precursor and a HA tag fused in frame to its 3' end was cloned into the polylinker region of the vector; therefore, the recombinant protein expression is directed under the CMV promoter. The HA tag corresponds to an epitope derived from the influenza hemagglutinin protein as previously described (I. Wilson, H. Niman, R. Houghten, A. Cherenson, M. Connolly, and R. Lerner, Cell 37:767 (1984)). The fusion of the HA tag to the target protein allows easy detection of the recombinant protein with an antibody that recognizes the HA epitope.
The plasmid construction strategy is described as follows:
The DNA sequence encoding BSG1, ATCC # 97175, was constructed by PCR using two primers: the 5' primer AAAGGA T~CCCC~CCATCATGGA~ lCAAGA~G 3' (SEQ ID NO:27) contains a BamHI site followed by 18 nucleotides of BSG1 coding sequence starting from the initiation codon; the 3' sequence AAATC TAGAcTA~AGcGTA~ Alw~lA~ L~ 'A~-J'~ 3' (SEQ ID NO:28) contains complementary sequences to an XbaI site, translation stop codon, HA tag and the last 18 nucleotides of the BSG1 coding sequence (not including the stop codon). Therefore, the PCR product contains a BamHI site, BSG1 coding sequence followed by HA tag fused in frame, a translation termination stop codon next to the HA tag, and an XbaI site. The PCR amplified DNA fragment and the vector, pcDNAI/Amp, were digested with BamHI and XbaI restriction enzymes and ligated. The ligation mixture was transformed into E. coli strain SURE (available from Stratagene Cloning Systems, 11099 North Torrey Pines Road, La Jolla, CA 92037), the transformed culture was plated on ampicillin media plates and resistant colonies were selected. Plasmid DNA was isolated from transformants and examined by restriction analysis for the presence of the correct fragment. For expression of the recombinant BSG protein, COS cells were transfected with the expression vector by the DEAE-DEXTRAN method (J. Sambrook, E. Fritsch, T. Maniatis, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, (1989)). The expression of the BSG HA protein was detected by radiolabelling and immunoprecipitation method (E. Harlow, D. Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, (1988)). Cells were labelled for 8 hours with 35S-cysteine two days post transfection. Culture media was then collected and cells were lysed with detergent (RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5) (Wilson, I. et al., Id. 37:767 (1984)). Both cell lysate and culture media were precipitated with an HA specific monoclonal antibody. Proteins precipitated were analyzed on 15% SDS-PAGE gels.
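The layout of the HA-tagging 3' primer described in the construction strategy above can be sketched as follows, by way of illustration only. The HA tag nucleotide sequence shown is a commonly used encoding of the epitope YPYDVPDYA; the 30-nucleotide coding sequence is a hypothetical stand-in for the BSG1 coding region, and the three-base pads are likewise assumptions for the sketch.

```python
HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"  # encodes the HA epitope YPYDVPDYA
STOP = "TAG"
BAMHI = "GGATCC"
XBAI = "TCTAGA"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def design_3prime_primer(cds_no_stop, n_gene=18, pad="AAA"):
    """Antisense primer: pad + XbaI + stop + HA tag + last n_gene nt of the CDS."""
    sense_tail = cds_no_stop[-n_gene:] + HA_TAG + STOP + XBAI
    return pad + reverse_complement(sense_tail)


def in_frame(orf):
    """True if orf is whole codons with a stop codon only at the very end."""
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return (len(orf) % 3 == 0
            and codons[-1] in STOP_CODONS
            and not any(c in STOP_CODONS for c in codons[:-1]))


# Hypothetical 30-nt coding sequence standing in for the BSG1 coding region.
cds = "ATGGCTAAGGAACTGCCGTTCAGCGGTGAC"
primer_3 = design_3prime_primer(cds)

# Expected sense strand of the PCR product: BamHI, CDS, HA tag, stop, XbaI.
pcr_product = "AAA" + BAMHI + cds + HA_TAG + STOP + XBAI + "TTT"
fusion_orf = cds + HA_TAG + STOP
print(primer_3, in_frame(fusion_orf))
```

Because the primer is antisense, the XbaI site and the complement of the stop codon appear at its 5' end, ahead of the complement of the HA tag and of the gene-specific bases.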
Numerous modifications and variations of the present invention are possible in light of the above teachings and, therefore, within the scope of the appended claims, the invention may be practiced otherwise than as particularly described.

Claims

WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a member selected from the group consisting of (a) a polynucleotide encoding the same polypeptide as the polynucleotide of Figure 1 (SEQ ID NO:1);
(b) a polynucleotide encoding the same mature polypeptide as a human gene having a coding portion which includes DNA having at least a 90% identity to the DNA of one of Figures 2-20 (SEQ ID NO:2-20);
(c) a polynucleotide which hybridizes to the polynucleotide of (a) and which has at least a 70% identity thereto; and (d) a polynucleotide encoding the same mature polypeptide as a human gene having a coding portion which includes DNA having at least a 90% identity to a DNA included in the deposited clone.

2. The polynucleotide of Claim 1 wherein the human gene includes DNA contained in the deposited clone.

3. The polynucleotide of Claim 1 wherein the member is a polynucleotide encoding the same polypeptide as the polynucleotide of Figure 1 (SEQ ID NO:1).

4. A vector containing the polynucleotide of claim 1.

5. A host cell transformed or transfected with the vector of Claim 4.

6. A process for producing cells capable of expressing a polypeptide comprising genetically engineering cells with the vector of Claim 4.

7. A process for producing a polypeptide comprising:
expressing from the host cell of Claim 5 the polypeptide encoded by said polynucleotide.

8. A polypeptide comprising a member selected from the group consisting of: (i) a polypeptide encoded by a human gene, said human gene having a coding portion whose DNA has at least a 90% identity to the DNA of one of Figures 2-20 (SEQ ID NO:2-20); (ii) a polypeptide having the deduced amino acid sequence as set forth in Figure 1 (SEQ ID NO:1) and fragments, analogs and derivatives thereof; and (iii) a polypeptide encoded by the human gene whose coding region includes a DNA having at least a 90% identity to the DNA
contained in the deposited clone and fragments, analogs and derivatives of said polypeptide.

9. The polypeptide of Claim 8 wherein the polypeptide has the deduced amino acid sequence as set forth in Figure 1 (SEQ ID NO:1).

10. An antibody against the polypeptide of claim 8.

11. A compound which inhibits activation of the polypeptide of claim 8.

12. A method for the treatment of a patient having need to inhibit a breast specific gene protein comprising:
administering to the patient a therapeutically effective amount of the compound of Claim 11.

13. The method of claim 12 wherein the compound is a polypeptide and the therapeutically effective amount of the compound is administered by providing to the patient DNA
encoding said polypeptide and expressing said polypeptide in vivo.

14. A method for the treatment of a patient having need of a breast specific gene protein comprising: administering to the patient a therapeutically effective amount of the polypeptide of claim 8.

15. A process for diagnosing a disorder of the breast in a host comprising:
determining transcription of a human gene in a sample derived from non-breast tissue of a host, said gene having a coding portion which includes DNA having at least 90% identity to DNA selected from the group consisting of the DNA of Figures 1-20 (SEQ ID NO:1-20), whereby said transcription indicates a disorder of the breast in the host.

16. The process of claim 15 wherein transcription is determined by detecting the presence of an altered level of RNA transcribed from said human gene.

17. The process of claim 15 wherein transcription is determined by detecting the presence of an altered level of DNA complementary to the RNA transcribed from said human gene.

18. The process of claim 15 wherein transcription is determined by detecting the presence of an altered level of an expression product of said human gene.

19. A process for determining a disorder of a breast in a host comprising:
contacting an antibody specific to a BSG antigen or an epitopic portion thereof, to a fluid sample derived from a host;
determining the presence of an altered level of a BSG gene product in said sample.

20. A process for identifying antagonists to the polypeptide of claim 8 comprising:
contacting said polypeptide with a natural substrate and a labeled compound to be screened either simultaneously or in either consecutive order; and determining whether the therapeutic effectively competes with the natural substrate in a manner sufficient to prevent binding of the protein to its substrate.