WO2006065938A2 - Cancer-specific spanx-n markers - Google Patents

Cancer-specific spanx-n markers

Info

Publication number
WO2006065938A2
Authority
WO
WIPO (PCT)
Prior art keywords
spanx
seq
nucleic acid
sequence
antibody
Prior art date
Application number
PCT/US2005/045317
Other languages
French (fr)
Other versions
WO2006065938A3 (en)
Inventor
Natalay Kouprina
Vladimir Larionov
Paul Goldsmith
J. Carl Barrett
Original Assignee
Government Of The United States Of America As Represented By The Secretary Of The Department Of Health And Human Services
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Government Of The United States Of America As Represented By The Secretary Of The Department Of Health And Human Services
Priority to CA002591918A (CA2591918A1)
Priority to AU2005316532A (AU2005316532A1)
Priority to EP05854102A (EP1838872A2)
Publication of WO2006065938A2
Publication of WO2006065938A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/136 Screening for pharmacological compounds

Definitions

  • the invention relates to SPANX-N genes, a new cluster of genes whose expression is detected in few normal adult tissues, with the highest expression in normal testis tissues. Some of the SPANX-N genes are also highly expressed in tumor tissues, including prostate, melanoma, uterine and cervical cancer tissues.
  • the invention therefore provides SPANX nucleic acids, polypeptides and antibodies useful for detecting and treating prostate cancer.
  • prostate cancer is the most common cancer, occurring in as many as 15% of men in the United States. Approximately 330,000 new cases are diagnosed annually. Prostate cancer kills about 40,000 men in the United States each year and is second only to lung cancer in mortality to men. Castration, treatment with anti-androgens, and prostatectomy with its associated urogenital risk, are all treatments that can seriously compromise the quality of life for men diagnosed too late for less drastic prostate cancer treatments. Hence, early detection and treatment is critical.
  • serum prostate-specific antigen (PSA) level and prostate digital rectal exams are the only early diagnostic tests in routine use for screening for prostate cancer.
  • small, aggressive tumors can be missed by digital rectal exams and even by needle biopsy, and only modest increases in prostate-specific antigen, i.e., below the 4 ng/mL threshold between normal and elevated PSA levels, are generated by these tumors.
  • These aggressive tumors have the potential to suddenly dedifferentiate, grow, spread, and metastasize rapidly.
  • the NCI Fact Sheet also indicates that a need exists for a prostate cancer screen with an improved ability to differentiate between prostate cancer and benign conditions such as prostatitis, benign prostatic hypertrophy (BPH), inflammation and infection.
  • a need also exists for prostate cancer screens that can differentiate between slow-growing and fast-growing cancers.
  • cancer of the cervix is one of the most common malignancies in women and remains a significant public health problem throughout the world.
  • invasive cervical cancer accounts for approximately 19% of all gynecological cancers.
  • it was estimated that there were 14,700 newly diagnosed cases and 4900 deaths attributed to this disease (American Cancer Society, Cancer Facts & Figures 1996, Atlanta, Ga.: American Cancer Society, 1996).
  • the clinical problem is more serious.
  • the number of new cases is estimated to be 471,000 with a four-year survival rate of only 40% (Munoz et al., 1989, Epidemiology of
  • Cervical cancer is detected by cellular diagnosis conducted by scrubbing a cervical surface with a cotton swab or a scrubber, immediately smearing the scrubbed cells on a slide glass to prepare a sample and observing the sample under a microscope or the like. The diagnosis is then performed by observing the form of the cells under a microscope and involves examination of each sample by a cytotechnologist. Therefore, a need exists for improved accuracy and speed in processing samples so that cervical cancer can be detected and treated before malignancy develops.
  • the invention provides polypeptides and nucleic acids that are expressed in cancer cells and that can act as cancer markers.
  • one aspect of the invention is an isolated polypeptide having an amino acid sequence corresponding to any one of SEQ ID NO: 1-5.
  • the isolated polypeptide is a SPANX-N1 polypeptide with SEQ ID NO:1.
  • nucleic acid encoding a polypeptide having an amino acid sequence corresponding to any one of SEQ ID NO: 1-5.
  • nucleic acid encodes an isolated SPANX-N1 polypeptide with SEQ ID NO: 1.
  • Another aspect of the invention is an isolated nucleic acid comprising a SPANX-N promoter that includes one of the following nucleotide sequences SEQ ID NO:206-210.
  • the promoter is a SPANX-N1 promoter (SEQ ID NO:206), which can promote expression in a variety of cancer cells and tumor tissue types.
  • Another aspect of the invention is an expression cassette that includes a nucleic acid encoding a therapeutic gene product operably linked to a SPANX-N promoter of the invention.
  • Another aspect of the invention is an isolated antibody that can bind to a polypeptide having an amino acid sequence corresponding to any one of SEQ ID NO: 1-5.
  • the antibody can bind to a SPANX-N peptide consisting essentially of any one of SEQ ID NO: 12-25, or 136.
  • Another aspect of the invention is an isolated SPANX-N-specific nucleic acid.
  • the invention provides a SPANX-N-specific probe or primer with any one of SEQ ID NO:37, 38, 145-176.
  • the isolated SPANX-N primer or probe has any one of SEQ ID NO: 145-152.
  • the invention provides SPANX-N nucleic acids with any one of SEQ ID NO:26-30.
  • nucleic acids that can inhibit the function of a SPANX mRNA include DNA or RNA molecules that can hybridize to a nucleic acid encoding a SPANX-N polypeptide having an amino acid sequence comprising any one of SEQ ID NO: 1-5.
  • the nucleic acid can be a small interfering RNA (siRNA), ribozyme, or antisense nucleic acid.
  • siRNA that consists essentially of a double-stranded RNA with any one of SEQ ID NO:39-61.
  • Another aspect of the invention is a method for detecting cancer that involves contacting a non-testis tissue sample with a SPANX-N probe and observing whether an mRNA or cDNA in the sample hybridizes to the SPANX-N probe; wherein the SPANX-N probe comprises any one of SEQ ID NO:26-30, 37, 38, 145-176, or a combination thereof.
  • Another aspect of the invention involves a method for detecting cancer comprising performing nucleic acid amplification of RNA from a non-testis tissue sample using SPANX-N primers consisting essentially of SEQ ID NO:37 and 38, and observing whether a SPANX-N nucleic acid fragment is amplified.
  • the SPANX-N nucleic acid fragment can be about 240 to about 290 base pairs in length, or about 260-270 base pairs in length.
  • Another aspect of the invention involves a method for detecting cancer comprising performing nucleic acid amplification of RNA from a non-testis tissue sample using SPANX-N1 primers consisting essentially of SEQ ID NO: 1
  • Another aspect of the invention involves a method for detecting cancer comprising contacting a non-testis tissue sample with an anti-SPANX-N antibody and observing whether a complex forms between the antibody and a SPANX-N polypeptide.
  • the SPANX-N polypeptide can have any one of SEQ ID NO: 1-5.
  • the antibody can bind to any SPANX-N peptidyl epitope.
  • the peptide epitope is a SPANX-N epitope that includes SEQ ID NO: 136.
  • Another aspect of the invention is a method for treating cancer in a mammal comprising administering to the mammal an effective amount of an antibody that can bind to a SPANX-N peptide consisting of SEQ ID NO: 136.
  • Another aspect of the invention is a method for treating cancer in a mammal comprising administering to the mammal an effective amount of a nucleic acid that encodes an anti-cancer agent operably linked to a SPANX-N1 promoter comprising SEQ ID NO:206.
  • the anti-cancer agent can, for example, be a cytokine, interferon, hormone, cell growth inhibitor, cell cycle regulator, apoptosis regulator, cytotoxin, cytolytic viral product, or antibody.
  • Another aspect of the invention involves a method for treating cancer comprising administering to a mammal an effective amount of a nucleic acid that can inhibit the function of a SPANX-N mRNA, wherein the nucleic acid comprises a DNA or RNA that can hybridize to a mRNA encoding a SPANX-N polypeptide having an amino acid sequence comprising any one of SEQ ID NO: 1-5.
  • the SPANX-N mRNA can be complementary to any one of SEQ ID NO:26-30.
  • the nucleic acid can be a small interfering RNA
  • Another aspect of the invention is a method to identify an agent that modulates SPANX-N expression. This method involves contacting a test cell with a candidate agent and determining if the candidate agent increases or decreases expression of a SPANX-N gene in the test cell when compared to expression of the SPANX-N gene in a control cell that was not contacted with the candidate agent. The agent can increase or decrease SPANX-N expression.
  • FIG. 1 illustrates the sizes of DNA fragments from the SPANX family from chimpanzee (African great apes), orangutan (great apes), rhesus macaque (Old World monkeys), and tamarin (New World monkeys), which can be amplified by polymerase chain reaction. Oligonucleotide primers were designed to be complementary to sequences within the promoter and 3' non-coding regions. The double upper bands for rhesus macaque and tamarin are presumably due to polymorphism in paralogs.
  • FIG. 2 schematically illustrates the location of the SPANX family genes on human chromosome X.
  • SPANX-N1 positions 142995930-143005820
  • SPANX-N2 positions 141495326-141490625
  • SPANX-N3 positions 141297635-141291834
  • SPANX-N4 positions 140806882-140816198
  • SPANX-N5 positions 51791606-51793934
  • SPANX genes reside within large segmental duplications across the chromosome, where each duplication is homologous to the others.
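The coordinate pairs listed for FIG. 2 can be summarized with a short calculation. The following is a minimal sketch, not part of the disclosure: it assumes the positions are inclusive chromosome X coordinates and that a pair listed with the larger number first denotes a gene on the reverse strand. The names `SPANX_N_LOCI`, `describe_locus` and `summarize` are hypothetical.

```python
# Coordinates as listed for FIG. 2 (human chromosome X).
# Assumption: pairs are inclusive, and start > end means reverse strand.
SPANX_N_LOCI = {
    "SPANX-N1": (142995930, 143005820),
    "SPANX-N2": (141495326, 141490625),
    "SPANX-N3": (141297635, 141291834),
    "SPANX-N4": (140806882, 140816198),
    "SPANX-N5": (51791606, 51793934),
}


def describe_locus(start, end):
    """Return (span in bp, inferred strand) for an inclusive coordinate pair."""
    span = abs(end - start) + 1
    strand = "+" if end >= start else "-"
    return span, strand


def summarize(loci):
    """Map each gene name to its (span, inferred strand)."""
    return {gene: describe_locus(*coords) for gene, coords in loci.items()}


if __name__ == "__main__":
    for gene, (span, strand) in summarize(SPANX_N_LOCI).items():
        print(f"{gene}: {span} bp, inferred strand {strand}")
```

Under these assumptions, SPANX-N1 to SPANX-N4 fall within roughly 2.2 Mb of one another, while SPANX-N5 lies about 90 Mb away on the same chromosome.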
  • FIG. 3A-B are images of gels illustrating SPANX-N gene expression in human and mouse normal tissues as detected by reverse transcription-polymerase chain reaction (RT-PCR).
  • FIG. 3A shows the cDNA prepared from a panel of human tissue mRNAs. Oligonucleotide primers employed were from exons 1 and 2 of the genomic sequence and designed to amplify putative transcripts. A 264-bp band of the expected size was observed only in testis. Two members of the human SPANX-N subfamily, SPANX-N2 and SPANX-N3, were detected upon cloning and sequencing of PCR products.
  • FIG. 3B illustrates cDNA bands prepared from a panel of mouse tissue mRNAs.
  • Oligonucleotide primers employed were from exons 1 and 2 of the genomic sequence and designed to amplify a putative transcript. A 264-bp band of the expected size was observed only in testis. Control PCR assays were carried out with the same samples by using actin-specific primers.
  • FIG. 4A is a schematic diagram illustrating a hypothetical evolutionary tree for the SPANX gene family. The expansion of SPANX genes is superimposed on the tree of primate evolution.
  • FIG. 4B provides a schematic diagram of the genomic changes that may have occurred during evolution of the different SPANX genes.
  • FIG. 5 provides a neighbor-joining tree of primate SPANX genes. Noncoding regions (5' flanking regions and introns) were used for the tree reconstruction. The numbers at the interior branches indicate the percentage of 5,000 bootstrap pseudo-replicates that support the respective fork.
  • FIG. 6A-B provides analyses of the affinity-purified anti-EQPT antibodies prepared against peptide sequence EQPTSSTNGEKRKSPCESNN (positions 2-21, SEQ ID NO: 136) from the SPANX-N sequence.
  • recombinant SPANX proteins were expressed in the pMAL-p2X bacterial expression vector. The proteins were purified as fusions with MBP by affinity chromatography. Lane 1 contained SPANX-N2 (uninduced) proteins. Lane 2 contained SPANX-N2 (induced) proteins. Lane 3 contained SPANX-B (induced) proteins. Lane 4 contained SPANX-C (induced) proteins. Lane 5 contained a 10-200 kDa ladder of molecular weight markers.
  • Lane 6 contained SPANX-B purified proteins. Lane 7 contained SPANX-C purified proteins. Lane 8 contained SPANX-N2 purified (5 µl) proteins. Lane 9 contained SPANX-N2 purified (10 µl) proteins.
  • FIG. 6B: after separation of the recombinant proteins by SDS-PAGE, the gel was immunoblotted using anti-SPANX-N (EQPT) antibodies (lanes 1, 2 and 3). The antibodies exhibit a high specificity for the SPANX-N protein (lane 3).
  • FIG. 7A-C illustrates expression of SPANX-N genes in normal and cancer tissues.
  • SPANX-N expression was determined in normal tissues (FIG. 7A), in primary uterine tumors and melanoma cell lines (FIG. 7B), and in normal and tumor pairs (FIG. 7C).
  • SPANX-N1 bars are horizontally (-) cross-hatched
  • SPANX-N2 bars are diagonally (/) cross-hatched
  • SPANX-N3 bars are vertically (|) cross-hatched
  • SPANX-N4 bars are diagonally (\) cross-hatched
  • SPANX-N5 bars are double cross-hatched (x).
  • FIG. 8A-B provides a comparison of human SPANX-A/D and SPANX-N promoter sequences.
  • the detected transcription start sites and the translation initiation codons (ATG) are indicated. Noncoding sequences are in lowercase.
  • SPANX-N copies differ from SPANX-A/D genes by the almost complete lack of CpG dinucleotides in the promoter regions (CG above sequences). However, these CpGs are perfectly preserved in all of the SPANX-A/D copies. Another difference is the presence of the Sp1 binding consensus in four SPANX-N copies.
  • the invention relates to nucleic acids, polypeptides and antibodies useful for detecting cancer.
  • the invention also provides nucleic acids that can modulate the function of SPANX mRNA transcripts and antibodies that can modulate the function of SPANX gene products.
  • the invention relates to SPANX-N promoters that can express gene products in a tissue-specific manner, for example, in cancer cells where the promoter is generally active.
  • nucleic acid refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form, composed of monomers (nucleotides) containing a sugar, phosphate and a base that is either a purine or pyrimidine. Unless specifically limited, the term encompasses nucleic acids containing known analogs of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides.
  • nucleic acid sequence also encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated.
  • degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucl. Acids Res., 19, 508 (1991); Ohtsuka et al., JBC, 260, 2605 (1985); Rossolini et al., Mol. Cell. Probes, 8, 91 (1994)).
  • nucleic acid refers to any nucleic acid molecule.
  • nucleic acid fragment is a portion of a given nucleic acid molecule.
  • DNA in the majority of organisms is the genetic material while ribonucleic acid (RNA) is involved in the transfer of information contained within DNA into proteins.
  • the invention encompasses isolated or substantially purified nucleic acid, peptide or polypeptide compositions.
  • an "isolated" or "purified" DNA molecule or RNA molecule or an "isolated" or "purified" polypeptide is a DNA molecule, RNA molecule, or polypeptide that exists apart from its native environment and is therefore not a product of nature.
  • An isolated DNA molecule, RNA molecule or polypeptide may exist in a purified form or may exist in a non-native environment such as, for example, a transgenic host cell.
  • an "isolated" or "purified" nucleic acid molecule or protein, or biologically active portion thereof, is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.
  • an "isolated" nucleic acid is free of sequences that naturally flank the nucleic acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived.
  • the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleotide sequences that naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived.
  • a protein that is substantially free of cellular material includes preparations of protein or polypeptide having less than about 30%, 20%, 10%, or 5% (by dry weight) of contaminating protein.
  • culture medium represents less than about 30%, 20%, 10%, or 5% (by dry weight) of chemical precursors or non-protein-of-interest chemicals.
  • Fragments and variants of the disclosed nucleotide sequences and proteins or partial-length proteins encoded thereby are also encompassed by the present invention.
  • by "fragment" or "portion" is meant a full length or less than full length of the nucleotide sequence encoding, or the amino acid sequence of, a polypeptide or protein.
  • genes include coding sequences and/or the regulatory sequences required for their expression.
  • gene refers to a nucleic acid fragment that expresses mRNA, functional RNA, or specific protein, including regulatory sequences.
  • Genes also include non-expressed DNA segments that, for example, form recognition sequences for other proteins.
  • Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and may include sequences designed to have desired parameters.
  • Naturally occurring is used to describe an object that can be found in nature as distinct from being artificially produced.
  • a protein or nucleotide sequence present in an organism including a virus, which can be isolated from a source in nature and which has not been intentionally modified by a person in the laboratory, is naturally occurring.
  • a variant is a sequence that is substantially similar to the sequence of the native molecule.
  • variants include those sequences that, because of the degeneracy of the genetic code, encode the identical amino acid sequence of the native protein.
  • Naturally occurring allelic variants such as these can be identified with the use of molecular biology techniques, as, for example, with polymerase chain reaction (PCR) and hybridization techniques.
  • variant nucleotide sequences also include synthetically derived nucleotide sequences, such as those generated, for example, by using site-directed mutagenesis, which encode the native protein, as well as those that encode a polypeptide having amino acid substitutions.
  • nucleotide sequence variants of the invention will have at least 40%, 50%, 60%, to 70%, e.g., 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, to 79%, generally at least 80%, e.g., 81%-84%, at least 85%, e.g., 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, to 98%, sequence identity to the native (endogenous) nucleotide sequence.
  • "Conservatively modified variations" of a particular nucleic acid sequence refers to those nucleic acid sequences that encode identical or essentially identical amino acid sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given polypeptide. For instance, the codons CGT, CGC, CGA, CGG, AGA and AGG all encode the amino acid arginine. Thus, at every position where an arginine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded protein.
  • such nucleic acid variations are "silent variations," which are one species of "conservatively modified variations." Every nucleic acid sequence described herein that encodes a polypeptide also describes every possible silent variation, except where otherwise noted.
  • each codon in a nucleic acid except ATG, which is ordinarily the only codon for methionine
  • each "silent variation" of a nucleic acid that encodes a polypeptide is implicit in each described sequence.
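The degeneracy argument above can be illustrated with the standard genetic code. The following is a minimal sketch under stated assumptions: the standard nuclear codon table is used, and the helper names `synonymous_codons` and `count_encodings` are hypothetical. It lists the codons for a given amino acid and counts how many distinct coding sequences (the original plus all of its silent variations) specify a short peptide.

```python
# Standard genetic code in TCAG order (DNA alphabet); "*" marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def synonymous_codons(amino_acid):
    """Return all codons encoding the one-letter amino acid, sorted."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_encodings(peptide):
    """Count the distinct coding sequences that specify the peptide."""
    total = 1
    for residue in peptide:
        total *= len(synonymous_codons(residue))
    return total


if __name__ == "__main__":
    print("Arginine:", synonymous_codons("R"))
    print("Methionine:", synonymous_codons("M"))
    # GEKT is the four-residue peptide listed as SEQ ID NO:16.
    print("Encodings of GEKT:", count_encodings("GEKT"))
```

For arginine the sketch returns the same six codons named in the text, and for methionine only ATG. Even the four-residue peptide GEKT has 4 x 2 x 2 x 4 = 64 possible coding sequences, which shows why a single described sequence implies a large family of silent variations.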
  • the invention contemplates SPANX-N polypeptides and peptides.
  • Such polypeptides and peptides have utility, for example, for generating SPANX-N-specific antibodies.
  • Such antibodies can be used for detecting, diagnosing and treating cancer.
  • the invention provides a human SPANX-N1 polypeptide of the following sequence (SEQ ID NO:1).
  • the invention provides a human SPANX-N2 polypeptide of the following sequence (SEQ ID NO:2).
  • the invention provides a human SPANX-N3 polypeptide of the following sequence (SEQ ID NO:3).
  • SEQ ID NO:3
        1 MEQPTSSTNG EKTKSPCESN NKKNDEMQEV PNRVLAPEQS
       41 LKKTKTSEYP IIFVYYLRKG KKINSNQLEN EQSQENSINP
       81 IQKEEDEGVD LSEGSSNEDE DLGPCEGPSK EDKDLDSSEG
      121 SSQEDEDLGL SEGSSQDSGE D
  • the invention provides a human SPANX-N4 polypeptide of the following sequence (SEQ ID NO:4).
  • the invention provides a human SPANX-N5 polypeptide of the following sequence (SEQ ID NO:5).
  • the SPANX-N polypeptides of the invention have about 50% amino acid sequence identity with the SPANX-A/D proteins. Hence, they are structurally distinct and can readily be used to generate SPANX-N-specific antibodies. Sequences for various SPANX-A/D polypeptides and nucleic acids are publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). For example, a sequence for a human SPANX-A1 polypeptide can be found in the NCBI database at accession number NP_038481.2 (gi:14192937). This sequence for human SPANX-A1 is as follows (SEQ ID NO:6):
  • SPANX-A2 polypeptide sequence is as follows (SEQ ID NO:7):
  • A sequence for the human SPANX-B1 polypeptide can be found in the NCBI database at accession number NP_115850.1 (gi:14196344). This sequence for SPANX-B1 is as follows (SEQ ID NO:8):
  • A sequence for the human SPANX-C polypeptide can be found in the NCBI database at accession number NP_073152.1 (gi:13435137). This SPANX-C polypeptide has the following sequence (SEQ ID NO:10):
  • A sequence for the human SPANX-D polypeptide can be found in the NCBI database at accession number NP_115793.1 (gi:14192939). This SPANX-D polypeptide has the following sequence (SEQ ID NO:11):
  • "sequence identity" or "identity" in the context of two nucleic acid or polypeptide sequences refers to a specified percentage of residues in the two sequences that are the same when the sequences are aligned for maximum correspondence over a specified comparison window, as measured by sequence comparison algorithms or by visual inspection.
  • substantially identical in the context of a polypeptide or peptide indicates that a polypeptide or peptide comprises a sequence with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, or 79%, preferably 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, or 89%, more preferably at least 90%, 91%, 92%, 93%, or 94%, or even more preferably, 95%, 96%, 97%, 98% or 99%, sequence identity to the reference sequence over a specified comparison window.
  • optimal alignment is conducted using the homology alignment algorithm of Needleman and Wunsch (JMB, 48, 443 (1970)).
  • a peptide is substantially identical to a second peptide, for example, where the two peptides differ only by a conservative substitution.
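A percent identity of the kind described above can be computed from a global alignment in the style of Needleman and Wunsch. The following is a minimal sketch, not the scoring scheme of the disclosure: the match, mismatch and gap values are illustrative assumptions, identity is reported over the full alignment length (one of several conventions), and the function names are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right cell, preferring the diagonal move.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns holding identical residues."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    top, bottom = needleman_wunsch("GEKRK", "GEKTK")
    print(top, bottom, percent_identity(top, bottom))
```

In practice an established alignment tool with a substitution matrix would be used; the sketch only shows how the comparison window and the identity percentage relate.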
  • the invention also contemplates variant SPANX-N polypeptides and peptides as well as SPANX-N polypeptides and peptides from mammalian species other than humans. Such SPANX-N polypeptides and peptides are useful for raising antibodies and for detecting cancer in mammalian species.
  • Residue positions in variant polypeptides and peptides may not be identical to those in the reference sequence but often differ, for example, by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. Sequences that differ by such conservative substitutions are said to have "sequence similarity" or "similarity." When sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are available to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity.
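The upward adjustment for conservative substitutions can be sketched as follows. This is an illustration only: the residue groupings and the partial credit of 0.5 are assumptions chosen for the example, not values given in the text, and `adjusted_identity` is a hypothetical helper that expects two already aligned, gap-free sequences of equal length.

```python
# Illustrative grouping of residues with similar chemical properties
# (an assumption for this sketch; other groupings are in common use).
CONSERVATIVE_GROUPS = [
    set("KRH"),    # basic
    set("DE"),     # acidic
    set("ST"),     # small hydroxyl
    set("NQ"),     # amide
    set("AVLIM"),  # aliphatic / hydrophobic
    set("FYW"),    # aromatic
]


def is_conservative(x, y):
    """True if two different residues share one of the groups above."""
    return x != y and any(x in g and y in g for g in CONSERVATIVE_GROUPS)


def adjusted_identity(a, b, partial_credit=0.5):
    """Percent identity with conservative substitutions scored as partial matches."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    total = 0.0
    for x, y in zip(a, b):
        if x == y:
            total += 1.0
        elif is_conservative(x, y):
            total += partial_credit
    return 100.0 * total / len(a)


if __name__ == "__main__":
    print(adjusted_identity("GEKRK", "GEKKK"))  # R to K: conservative here
    print(adjusted_identity("GEKRK", "GEKTK"))  # R to T: not conservative here
```

With these assumed groups, a single arginine-to-lysine change in a five-residue peptide scores 90% rather than the 80% strict identity, while an arginine-to-threonine change stays at 80%.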
  • by "variant" polypeptide is intended a polypeptide derived from the native protein by deletion (also called "truncation") or addition of one or more amino acids to the N-terminal and/or C-terminal end of the native protein; deletion or addition of one or more amino acids at one or more sites in the native protein; or substitution of one or more amino acids at one or more sites in the native protein.
  • variants may result from, for example, genetic polymorphism, species variation or from human manipulation. Methods for such manipulations are generally known in the art.
  • polypeptides of the invention may have sequence differences including amino acid substitutions, deletions, truncations, and insertions. Methods for such manipulations are generally known in the art.
  • amino acid sequence variants of the polypeptides can be prepared by mutations in the DNA. Methods for mutagenesis and nucleotide sequence alterations are well known in the art. See, for example, Kunkel, Proc. Natl. Acad. Sci. USA, 82, 488 (1985); Kunkel et al., Meth. Enzymol., 154, 367 (1987); U.S. Patent No. 4,873,192; Walker and Gaastra (1983), and the references cited therein.
  • polypeptides of the invention encompass both naturally occurring proteins as well as variations and modified forms thereof. Such variants will continue to possess the desired activity.
  • the deletions, insertions, and substitutions of the polypeptide sequence encompassed herein are not expected to produce radical changes in the characteristics of the polypeptide. However, when it is difficult to predict the exact effect of the substitution, deletion, or insertion in advance of doing so, one skilled in the art will appreciate that the effect will be evaluated by routine screening assays.
  • any of the entire SPANX-N 1-5 polypeptides can be used for generating antibodies.
  • selected peptides from any of the SPANX-N polypeptides can be used for this purpose. These antibodies are useful for detecting and treating cancer.
  • the following peptide (SEQ ID NO: 136) can be used to generate antibodies that specifically bind to each of the SPANX-N 1-5 polypeptides:
  • EQPTSSTNGEKRKSPCESNN (SPANX-N amino acid positions 2-21).
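The position numbering of this peptide can be checked against the one SPANX-N sequence reproduced in this text, SEQ ID NO:3 (SPANX-N3). The following is a minimal sketch that takes both sequences as transcribed here; the helper names `residues` and `mismatches` are hypothetical.

```python
# SEQ ID NO:3 (human SPANX-N3) as listed in this document, spaces removed.
SPANX_N3 = (
    "MEQPTSSTNGEKTKSPCESNNKKNDEMQEVPNRVLAPEQS"
    "LKKTKTSEYPIIFVYYLRKGKKINSNQLENEQSQENSINP"
    "IQKEEDEGVDLSEGSSNEDEDLGPCEGPSKEDKDLDSSEG"
    "SSQEDEDLGLSEGSSQDSGED"
)
# SEQ ID NO:136, described as SPANX-N amino acid positions 2-21.
EQPT_PEPTIDE = "EQPTSSTNGEKRKSPCESNN"


def residues(sequence, first, last):
    """Return residues first..last using 1-based, inclusive numbering."""
    return sequence[first - 1:last]


def mismatches(peptide, protein, first):
    """List (position, peptide residue, protein residue) where they differ."""
    window = residues(protein, first, first + len(peptide) - 1)
    return [
        (first + offset, p, q)
        for offset, (p, q) in enumerate(zip(peptide, window))
        if p != q
    ]


if __name__ == "__main__":
    print(residues(SPANX_N3, 2, 21))
    print(mismatches(EQPT_PEPTIDE, SPANX_N3, 2))
```

As transcribed, the peptide matches SPANX-N3 positions 2-21 at 19 of 20 residues, differing only at position 13 (R in the peptide, T in SPANX-N3). That difference is consistent with GEKT (SEQ ID NO:16) being listed as a SPANX-N3-specific epitope.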
  • SPANX-N1 specific peptide epitope has the following sequence:
  • SPANX-N2 specific peptide epitopes include those with the following peptidyl sequences:
  • SPANX-N3 specific peptide epitopes include those with the following sequences:
  • GEKT (SEQ ID NO:16)
  • PIIFV (SEQ ID NO:17)
  • SKEDKLSEGSS (SEQ ID NO:18); NKKNDEMQEVPNRVL (SEQ ID NO:19); and KTKTSEYPIIFVYYL (SEQ ID NO:20).
  • SPANX-N4 specific peptide epitopes include those with the following sequences:
  • ESNNLH (SEQ ID NO:21)
  • DGGQ (SEQ ID NO:22)
  • SPANX-N5 specific peptide epitopes include those with the following sequences: GEICRK (SEQ ID NO:23);
  • LVLEPS (SEQ ID NO:24)
  • STVLVLCY (SEQ ID NO:25)
  • the entire SPANX-N polypeptides (SEQ ID NO: 1-5) and/or any of peptides SEQ ID NO: 12-25 or 136 can be used to immunize animals to obtain SPANX-N specific antibodies.
  • the invention therefore provides antibodies made by available procedures that can bind SPANX-N peptides and/or polypeptides.
  • the binding domains of such antibodies for example, the CDR regions of these antibodies, can be transferred into or utilized with any convenient binding entity backbone.
  • Antibody molecules belong to a family of plasma proteins called immunoglobulins, whose basic building block, the immunoglobulin fold or domain, is used in various forms in many molecules of the immune system and other biological recognition systems.
  • a standard antibody is a tetrameric structure consisting of two identical immunoglobulin heavy chains and two identical light chains and has a molecular weight of about 150,000 daltons.
  • the heavy and light chains of an antibody consist of different domains. Each light chain has one variable domain (VL) and one constant domain (CL), while each heavy chain has one variable domain (VH) and three or four constant domains (CH). See, e.g., Alzari, P. N., Lascombe, M.-B. & Poljak, R. J. (1988) Three-dimensional structure of antibodies. Annu. Rev. Immunol. 6, 555-580. Each domain, consisting of about 110 amino acid residues, is folded into a characteristic β-sandwich structure formed from two β-sheets packed against each other, the immunoglobulin fold.
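  • The domain counts above can be checked against the stated molecular weight of about 150,000 daltons with rough arithmetic. The sketch below is only an estimate: it assumes an average residue mass of about 110 Da (a common rule of thumb, not a figure from this text) and ignores hinge residues and glycosylation. The function names are illustrative.

```python
RESIDUES_PER_DOMAIN = 110   # "about 110 amino acid residues" per domain (from the text)
AVG_RESIDUE_MASS_DA = 110   # ASSUMPTION: rule-of-thumb average residue mass


def chain_residues(n_domains: int) -> int:
    """Approximate residue count of a chain built from n immunoglobulin domains."""
    return n_domains * RESIDUES_PER_DOMAIN


def antibody_mass_da(heavy_constant_domains: int = 3) -> int:
    """Approximate mass of a tetramer with two heavy and two light chains."""
    light = chain_residues(1 + 1)                        # VL + CL
    heavy = chain_residues(1 + heavy_constant_domains)   # VH + CH domains
    total_residues = 2 * light + 2 * heavy
    return total_residues * AVG_RESIDUE_MASS_DA


print(antibody_mass_da(3))  # three constant domains per heavy chain
print(antibody_mass_da(4))  # four constant domains per heavy chain
```

  • With three heavy-chain constant domains the estimate is 145,200 Da, close to the quoted figure of about 150,000 daltons.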
  • VH and VL domains each have three complementarity determining regions (CDR1-3) that are loops, or turns, connecting β-strands at one end of the domains.
  • the variable regions of both the light and heavy chains generally contribute to antigen specificity, although the contribution of the individual chains to specificity is not always equal.
  • Antibody molecules have evolved to bind to a large number of molecules by using six randomized loops (CDRs).
  • Immunoglobulins can be assigned to different classes depending on the amino acid sequences of the constant domain of their heavy chains. There are at least five (5) major classes of immunoglobulins: IgA, IgD, IgE, IgG and IgM. Several of these may be further divided into subclasses (isotypes), for example, IgG-I, IgG-2, IgG-3 and IgG-4; IgA-I and IgA-2.
  • the heavy chain constant domains that correspond to the IgA, IgD, IgE, IgG and IgM classes of immunoglobulins are called alpha (α), delta (δ), epsilon (ε), gamma (γ) and mu (μ), respectively.
  • the light chains of antibodies can be assigned to one of two clearly distinct types, called kappa (κ) and lambda (λ), based on the amino acid sequences of their constant domain.
  • the subunit structures and three-dimensional configurations of different classes of immunoglobulins are well known.
  • variable in the context of variable domain of antibodies, refers to the fact that certain portions of variable domains differ extensively in sequence from one antibody to the next.
  • the variable domains are for binding and determine the specificity of each particular antibody for its particular antigen.
  • the variability is not evenly distributed through the variable domains of antibodies. Instead, the variability is concentrated in three segments called complementarity determining regions (CDRs), also known as hypervariable regions in both the light chain and the heavy chain variable domains.
  • The more highly conserved portions of variable domains are called framework (FR) regions.
  • the variable domains of native heavy and light chains each comprise four FR regions, largely adopting a β-sheet configuration, connected by three CDRs, which form loops connecting, and in some cases forming part of, the β-sheet structure.
  • the CDRs in each chain are held together in close proximity by the FR regions and, with the CDRs from another chain, contribute to the formation of the antigen-binding site of antibodies.
  • the constant domains are not involved directly in binding an antibody to an antigen, but exhibit various effector functions, such as participation of the antibody in antibody-dependent cellular toxicity.
  • an antibody that is contemplated for use in the present invention thus can be in any of a variety of forms, including a whole immunoglobulin, an antibody fragment such as Fv, Fab, and similar fragments, a single chain antibody which includes the variable domain complementarity determining regions (CDR), and the like forms, all of which fall under the broad term "antibody”, as used herein.
  • the present invention contemplates the use of any specificity of an antibody, polyclonal or monoclonal, and is not limited to antibodies that recognize and immunoreact with a specific SPANX-N polypeptide or derivative thereof.
  • the binding regions, or CDR, of antibodies can be placed within the backbone of any convenient binding entity polypeptide.
  • an antibody, binding entity or fragment thereof is used that is immunospecific for a SPANX-N polypeptide, as well as the variants and derivatives thereof.
  • antibody fragment refers to a portion of a full-length antibody, generally the antigen binding or variable region.
  • antibody fragments include Fab, Fab', F(ab')2 and Fv fragments.
  • Papain digestion of antibodies produces two identical antigen binding fragments, called Fab fragments, each with a single antigen binding site, and a residual Fc fragment.
  • Fab fragments thus have an intact light chain and a portion of one heavy chain.
  • Pepsin treatment yields an F(ab')2 fragment that has two antigen binding fragments that are capable of cross-linking antigen, and a residual fragment that is termed a pFc' fragment.
  • Fab' fragments are obtained after reduction of a pepsin digested antibody, and consist of an intact light chain and a portion of the heavy chain. Two Fab' fragments are obtained per antibody molecule. Fab' fragments differ from Fab fragments by the addition of a few residues at the carboxyl terminus of the heavy chain CH1 domain including one or more cysteines from the antibody hinge region.
  • Fv is the minimum antibody fragment that contains a complete antigen recognition and binding site. This region consists of a dimer of one heavy and one light chain variable domain in a tight, non-covalent association (VH-VL dimer). It is in this configuration that the three CDRs of each variable domain interact to define an antigen binding site on the surface of the VH-VL dimer.
  • variable domain or half of an Fv comprising only three CDRs specific for an antigen
  • functional fragment refers to Fv, F(ab) and F(ab')2 fragments.
  • Additional fragments can include diabodies, linear antibodies, single-chain antibody molecules, and multispecific antibodies formed from antibody fragments.
  • Single chain antibodies are genetically engineered molecules containing the variable region of the light chain, the variable region of the heavy chain, linked by a suitable polypeptide linker as a genetically fused single chain molecule.
  • Such single chain antibodies are also referred to as "single-chain Fv" or "sFv” antibody fragments.
  • the Fv polypeptide further comprises a polypeptide linker between the VH and VL domains that enables the sFv to form the desired structure for antigen binding.
  • diabodies refers to small antibody fragments with two antigen-binding sites, where the fragments comprise a heavy chain variable domain (VH) connected to a light chain variable domain (VL) in the same polypeptide chain (VH-VL).
  • Antibody fragments contemplated by the invention are therefore not full-length antibodies. However, such antibody fragments can have similar or improved immunological properties relative to a full-length antibody. Such antibody fragments may be as small as about 4 amino acids, 5 amino acids, 6 amino acids, 7 amino acids, 9 amino acids, about 12 amino acids, about 15 amino acids, about 17 amino acids, about 18 amino acids, about 20 amino acids, about 25 amino acids, about 30 amino acids or more.
  • an antibody fragment of the invention can have any upper size limit so long as it has similar or improved immunological properties relative to an antibody that binds with specificity to a SPANX-N polypeptide.
  • smaller binding entities and light chain antibody fragments can have less than about 200 amino acids, less than about 175 amino acids, less than about 150 amino acids, or less than about 120 amino acids if the antibody fragment is related to a light chain antibody subunit.
  • larger binding entities and heavy chain antibody fragments can have less than about 425 amino acids, less than about 400 amino acids, less than about 375 amino acids, less than about 350 amino acids, less than about 325 amino acids or less than about 300 amino acids if the antibody fragment is related to a heavy chain antibody subunit.
  • Antibodies directed against a SPANX-N peptide or polypeptide can be made by any available procedure. Methods for the preparation of polyclonal antibodies are available to those skilled in the art. See, for example, Green, et al., Production of Polyclonal Antisera, in: Immunochemical Protocols (Manson, ed.), pages 1-5 (Humana Press); Coligan, et al., Production of Polyclonal Antisera in Rabbits, Rats Mice and Hamsters, in: Current Protocols in Immunology, section 2.4.1 (1992), which are hereby incorporated by reference. Monoclonal antibodies can also be employed in the invention. The term
  • monoclonal antibody refers to an antibody obtained from a population of substantially homogeneous antibodies. In other words, the individual antibodies comprising the population are identical except for occasional naturally occurring mutations in some antibodies that may be present in minor amounts. Monoclonal antibodies are highly specific, being directed against a single antigenic site. Furthermore, in contrast to polyclonal antibody preparations that typically include different antibodies directed against different determinants (epitopes), each monoclonal antibody is directed against a single determinant on the antigen. In addition to their specificity, the monoclonal antibodies are advantageous in that they are synthesized by the hybridoma culture, uncontaminated by other immunoglobulins.
  • the modifier "monoclonal” indicates that the antibody is obtained from a substantially homogeneous population of antibodies, and is not to be construed as requiring production of the antibody by any particular method.
  • the monoclonal antibodies herein specifically include "chimeric" antibodies in which a portion of the heavy and/or light chain is identical or homologous to corresponding sequences in antibodies derived from a particular species or belonging to a particular antibody class or subclass, while the remainder of the chain(s) is identical or homologous to corresponding sequences in antibodies derived from another species or belonging to another antibody class or subclass. Fragments of such antibodies can also be used, so long as they exhibit the desired biological activity. See U.S. Patent No. 4,816,567; Morrison et al. Proc.
  • Monoclonal antibodies can be isolated and purified from hybridoma cultures by a variety of well-established techniques. Such isolation techniques include affinity chromatography with Protein-A Sepharose, size-exclusion chromatography, and ion-exchange chromatography.
  • the monoclonal antibodies to be used in accordance with the present invention may be made by the hybridoma method as described above or may be made by recombinant methods, e.g., as described in U.S. Pat. No. 4,816,567.
  • Monoclonal antibodies for use with the present invention may also be isolated from phage antibody libraries using the techniques described in Clackson et al., Nature 352:624-628 (1991), as well as in Marks et al., J. Mol. Biol. 222:581-597 (1991).
  • Antibody fragments of the present invention can be prepared by proteolytic hydrolysis of the antibody or by expression of nucleic acids encoding the antibody fragment in a suitable host.
  • Antibody fragments can be obtained by pepsin or papain digestion of whole antibodies by conventional methods.
  • antibody fragments can be produced by enzymatic cleavage of antibodies with pepsin to provide a 5S fragment described as F(ab')2.
  • This fragment can be further cleaved using a thiol reducing agent, and optionally using a blocking group for the sulfhydryl groups resulting from cleavage of disulfide linkages, to produce 3.5S Fab' monovalent fragments.
  • pepsin produces two monovalent Fab' fragments and an Fc fragment directly.
  • Fv fragments comprise an association of VH and VL chains. This association may be noncovalent or the variable chains can be linked by an intermolecular disulfide bond or cross-linked by chemicals such as glutaraldehyde.
  • the Fv fragments comprise VH and VL chains connected by a peptide linker.
  • sFv single-chain antigen binding proteins
  • CDR peptides (“minimal recognition units") are often involved in antigen recognition and binding.
  • CDR peptides can be obtained by cloning or constructing genes encoding the CDR of an antibody of interest. Such genes are prepared, for example, by using the polymerase chain reaction to synthesize the variable region from RNA of antibody-producing cells. See, for example, Larrick, et al., Methods: a Companion to Methods in Enzymology, Vol. 2, page 106 (1991).
  • humanized antibodies are chimeric immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) that contain minimal sequence derived from non-human immunoglobulin.
  • humanized antibodies are human immunoglobulins (recipient antibody) in which residues from a complementary determining region (CDR) of the recipient are replaced by residues from a CDR of a nonhuman species (donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity and capacity.
  • humanized antibodies may comprise residues that are found neither in the recipient antibody nor in the imported CDR or framework sequences. These modifications are made to further refine and optimize antibody performance.
  • humanized antibodies will comprise substantially all of at least one, and typically two, variable domains, in which all or substantially all of the CDR regions correspond to those of a non-human immunoglobulin and all or substantially all of the FR regions are those of a human immunoglobulin consensus sequence.
  • the humanized antibody optimally also will comprise at least a portion of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin.
  • binding entities which comprise polypeptides that can recognize and bind to a SPANX-N polypeptide.
  • a number of proteins can serve as protein scaffolds to which binding domains reactive with a SPANX-N peptide can be attached and thereby form a suitable binding entity.
  • the binding domains bind or interact with a SPANX-N peptide while the protein scaffold merely holds and stabilizes the binding domains so that they can bind.
  • a number of protein scaffolds can be used.
  • phage capsid proteins can be used. See Review in Clackson & Wells, Trends Biotechnol. 12:173-184 (1994).
  • Phage capsid proteins have been used as scaffolds for displaying random peptide sequences, including bovine pancreatic trypsin inhibitor (Roberts et al., PNAS 89:2429-2433 (1992)), human growth hormone (Lowman et al., Biochemistry 30:10832-10838 (1991)), Venturini et al., Protein Peptide Letters 1:70-75 (1994)), and the IgG binding domain of Streptococcus (O'Neil et al., Techniques in Protein Chemistry V (Crabb, L., ed.) pp. 517-524, Academic Press, San Diego (1994)). These scaffolds have displayed a single randomized loop or region that can be modified to include binding domains for a SPANX-N peptide or polypeptide.
  • Tendamistat is a β-sheet protein from Streptomyces tendae. It has a number of features that make it an attractive scaffold for binding peptides, including its small size, stability, and the availability of high resolution NMR and X-ray structural data.
  • the overall topology of Tendamistat is similar to that of an immunoglobulin domain, with two β-sheets connected by a series of loops.
  • In contrast to immunoglobulin domains, the β-sheets of Tendamistat are held together with two rather than one disulfide bond, accounting for the considerable stability of the protein.
  • the loops of Tendamistat can serve a similar function to the CDR loops found in immunoglobulins and can be easily randomized by in vitro mutagenesis.
  • Tendamistat is derived from Streptomyces tendae and may be antigenic in humans.
  • binding entities that employ Tendamistat are preferably employed in vitro.
  • Fibronectin type III domain has also been used as a protein scaffold to which binding entities can be attached.
  • Fibronectin type III is part of a large subfamily (Fn3 family or s-type Ig family) of the immunoglobulin superfamily. Sequences, vectors and cloning procedures for using such a fibronectin type III domain as a protein scaffold for binding entities (e.g. CDR peptides) are provided, for example, in U.S. Patent Application Publication 20020019517. See also, Bork, P. & Doolittle, R. F. (1992) Proposed acquisition of an animal protein domain by bacteria. Proc. Natl. Acad. Sci. USA 89, 8990-8994; Jones, E. Y.
  • Variant binding entities, antibody fragments and antibodies therefore can also be generated through display-type technologies.
  • display-type technologies include, for example, phage display, retroviral display, ribosomal display, and other techniques.
  • Techniques available in the art can be used for generating libraries of binding entities and for screening those libraries, and the selected binding entities can be subjected to additional maturation, such as affinity maturation.
  • Wright and Harris, supra. Hanes and Plückthun, PNAS USA 94:4937-4942 (1997) (ribosomal display), Parmley and Smith, Gene 73:305-318 (1988) (phage display), Scott, TIBS 17:241-245 (1992), Cwirla et al., PNAS USA 87:6378-6382 (1990), Russel et al.
  • a mutant binding domain refers to an amino acid sequence variant of a selected binding domain (e.g. a CDR). In general, one or more of the amino acid residues in the mutant binding domain is different from what is present in the reference binding domain.
  • Such mutant antibodies necessarily have less than 100% sequence identity or similarity with the reference amino acid sequence. In general, mutant binding domains have at least 75% amino acid sequence identity or similarity with the amino acid sequence of the reference binding domain.
  • mutant binding domains have at least 80%, more preferably at least 85%, even more preferably at least 90%, and most preferably at least 95% amino acid sequence identity or similarity with the amino acid sequence of the reference binding domain.
  • affinity maturation using phage display can be utilized as one method for generating mutant binding domains.
  • Affinity maturation using phage display refers to a process described in Lowman et al., Biochemistry 30(45):10832-10838 (1991), see also Hawkins et al., J. Mol. Biol. 254:889-896 (1992).
  • this process can be described briefly as involving mutation of several binding domains or antibody hypervariable regions at a number of different sites with the goal of generating all possible amino acid substitutions at each site.
  • the binding domain mutants thus generated are displayed in a monovalent fashion from filamentous phage particles as fusion proteins. Fusions are generally made to the gene III product of M13.
  • the phage expressing the various mutants can be cycled through several rounds of selection for the trait of interest, e.g. binding affinity or selectivity.
  • the mutants of interest are isolated and sequenced. Such methods are described in more detail in U.S. Patent 5,750,373, U.S. Patent 6,290,957 and Cunningham, B. C. et al., EMBO J. 13(11), 2508-2515 (1994).
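  • To give a feel for the mutagenesis step described above (generating all possible amino acid substitutions at selected sites), the sketch below enumerates single-site substitutions and counts the size of a fully randomized library. It is a minimal illustration only: the parent sequence is a hypothetical CDR-like string, the function names are invented for this example, and nothing here models phage display or selection.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_site_variants(parent: str, sites: list[int]) -> list[str]:
    """All single substitutions of `parent` at the given 0-based sites."""
    variants = []
    for i in sites:
        for aa in AMINO_ACIDS:
            if aa != parent[i]:
                variants.append(parent[:i] + aa + parent[i + 1:])
    return variants


def full_library_size(n_sites: int) -> int:
    """Number of sequences when every chosen site is fully randomized at once."""
    return len(AMINO_ACIDS) ** n_sites


parent_loop = "GYTFTSYW"  # HYPOTHETICAL CDR-like sequence, not from the source
variants = single_site_variants(parent_loop, [2, 5])

print(len(variants))         # 19 alternatives at each of the 2 sites
print(full_library_size(4))  # combinatorial size for 4 randomized sites
```

  • The combinatorial count grows as 20 to the power of the number of randomized sites, which is why display libraries are cycled through rounds of selection rather than screened clone by clone.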
  • the invention provides methods of manipulating binding entity or antibody polypeptides or the nucleic acids encoding them to generate binding entities, antibodies and antibody fragments with improved binding properties that recognize a SPANX-N polypeptide.
  • Such methods of mutating portions of an existing binding entity or antibody involve fusing a nucleic acid encoding a polypeptide that encodes a binding domain reactive with a SPANX-N peptide to a nucleic acid encoding a phage coat protein to generate a recombinant nucleic acid encoding a fusion protein, mutating the recombinant nucleic acid encoding the fusion protein to generate a mutant nucleic acid encoding a mutant fusion protein, expressing the mutant fusion protein on the surface of a phage, and selecting phage that bind to a SPANX-N polypeptide.
  • the invention provides antibodies, antibody fragments, and binding entity polypeptides that can recognize and bind to a SPANX-N polypeptide.
  • the invention further provides methods of manipulating those antibodies, antibody fragments, and binding entity polypeptides to optimize their binding properties or other desirable properties (e.g., stability, size, ease of use).
  • Such antibodies, antibody fragments, and binding entity polypeptides can be modified to include a label or reporter molecule useful for detecting the presence of the antibody.
  • the labeled antibody can then be used for detection of SPANX-N polypeptides.
  • a label or reporter molecule is any molecule that can be associated with an antibody, directly or indirectly, and that results in a measurable, detectable signal, either directly or indirectly.
  • methods by which labels can be incorporated into or coupled onto an antibody or binding entity are available to those of skill in the art.
  • labels suitable for use with the antibodies and binding entities of the invention include radioactive isotopes, fluorescent molecules, phosphorescent molecules, enzymes, secondary antibodies, and ligands.
  • examples of fluorescent labels include fluorescein (FITC), 5,6-carboxymethyl fluorescein, Texas red, nitrobenz-2-oxa-1,3-diazol-4-yl (NBD), coumarin, dansyl chloride, rhodamine, 4'-6-diamidino-2-phenylindole (DAPI), and the cyanine dyes Cy3, Cy3.5, Cy5, Cy5.5 and Cy7.
  • the fluorescent label is fluorescein (5-carboxyfluorescein-N-hydroxysuccinimide ester) or rhodamine (5,6-tetramethyl rhodamine).
  • Fluorescent labels for combinatorial multicolor used in some embodiments include FITC and the cyanine dyes Cy3, Cy3.5, Cy5, Cy5.5 and Cy7.
  • the absorption and emission maxima, respectively, for these fluors are: FITC (490 nm; 520 nm), Cy3 (554 nm; 568 nm), Cy3.5 (581 nm; 588 nm), Cy5 (652 nm; 672 nm), Cy5.5 (682 nm; 703 nm) and Cy7 (755 nm; 778 nm), thus allowing their simultaneous detection.
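  • The maxima listed above can be tabulated to see how well separated the dyes are. The sketch below uses only the wavelengths given in the text; the helper names are illustrative, and whether a given gap is resolvable depends on filter sets and instrumentation that the text does not specify.

```python
# (absorption maximum nm, emission maximum nm), values as listed in the text
FLUORS = {
    "FITC": (490, 520),
    "Cy3": (554, 568),
    "Cy3.5": (581, 588),
    "Cy5": (652, 672),
    "Cy5.5": (682, 703),
    "Cy7": (755, 778),
}


def stokes_shift(name: str) -> int:
    """Difference between emission and absorption maxima, in nm."""
    absorption, emission = FLUORS[name]
    return emission - absorption


def closest_emission_pair(fluors: dict[str, tuple[int, int]]) -> tuple[int, str, str]:
    """Smallest gap between neighbouring emission maxima: (gap nm, dye A, dye B)."""
    ordered = sorted(fluors.items(), key=lambda item: item[1][1])
    gaps = [(b[1][1] - a[1][1], a[0], b[0]) for a, b in zip(ordered, ordered[1:])]
    return min(gaps)


print({name: stokes_shift(name) for name in FLUORS})
print(closest_emission_pair(FLUORS))
```

  • On these figures the tightest pair is Cy3 and Cy3.5, whose emission maxima are 20 nm apart.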
  • fluorescent labels can be obtained from a variety of commercial sources, including Molecular Probes, Eugene, OR and Research Organics, Cleveland, Ohio.
  • Biotin can be detected using streptavidin-alkaline phosphatase conjugate (Tropix, Inc.) that binds to the biotin and subsequently can be detected by chemiluminescence of suitable substrates (for example, the chemiluminescent substrate CSPD: disodium 3-(4-methoxyspiro-[1,2-dioxetane-3,2'-(5'-chloro)tricyclo[3.3.1.1³,⁷]decane]-4-yl) phenyl phosphate; Tropix, Inc.).
  • Molecules that combine two or more of these reporter molecules or detection labels can also be used in the invention. Any of the known detection labels can be used with the disclosed antibodies, antibody fragments, binding entities, and methods. Methods for detecting and measuring signals generated by detection labels are also available to those of skill in the art. For example, radioactive isotopes can be detected by scintillation counting or direct visualization; fluorescent molecules can be detected with fluorescent spectrophotometers; phosphorescent molecules can be detected with a scanner or spectrophotometer, or directly visualized with a camera; enzymes can be detected by visualization of the product of a reaction catalyzed by the enzyme. Such methods can be used directly in methods for detecting SPANX-N polypeptides.
  • the invention provides SPANX-N nucleic acids that encode SPANX-N polypeptides and peptides.
  • the invention provides SPANX-specific nucleic acids, probes and primers that can be used to detect tissues that express SPANX, for example, normal testis, cervical cancer, uterine cancer, melanoma and prostate cancer tissues.
  • the invention provides SPANX-N promoters that can be used to express desirable gene products in a tissue-specific manner.
  • the SPANX-N1 promoter is active in a variety of cancer cells.
  • the invention relates to SPANX-specific nucleic acids that can modulate or inhibit the function of a SPANX mRNA.
  • the SPANX-specific nucleic acids that can modulate or inhibit the function of a SPANX mRNA are useful for treating cancer.
  • the invention provides nucleic acids that encode SPANX-N polypeptides.
  • the SPANX-N1 polypeptide having SEQ ID NO:1 is encoded by the cDNA sequence provided below (SEQ ID NO:26):
  • SPANX-N2 polypeptide having SEQ ID NO:2 is encoded by the cDNA sequence provided below (SEQ ID NO:27):
  • SPANX-N3 polypeptide having SEQ ID NO:3 is encoded by the cDNA sequence provided below (SEQ ID NO:28):
  • SPANX-N4 polypeptide having SEQ ID NO:4 is encoded by the cDNA sequence provided below (SEQ ID NO:29):
  • SPANX-N5 polypeptide having SEQ ID NO: 5 is encoded by the cDNA sequence provided below (SEQ ID NO:30):
  • SPANX-A1 polypeptide having SEQ ID NO:6 is encoded by the cDNA sequence provided below (SEQ ID NO:31):
  • SPANX-A2 polypeptide having SEQ ID NO:7 is encoded by the cDNA sequence provided below (SEQ ID NO:32):
  • SPANX-B1 polypeptide having SEQ ID NO:8 is encoded by the cDNA sequence provided below (SEQ ID NO:33): 1 GTCACCAGGA GGGTATGCAT AGGGAGGGCA AGAGCTCTGG
  • SPANX-B2 polypeptide having SEQ ID NO: 9 is encoded by the cDNA sequence provided below (SEQ ID NO:34): 1 GTCACCAGGA GGGTATGCAT AGGGAGGGCA AGAGCTCTGG
  • SPANX-C polypeptide having SEQ ID NO: 10 is encoded by the cDNA sequence provided below (SEQ ID NO:35):
  • SPANX-D polypeptide having SEQ ID NO:11 is encoded by the cDNA sequence provided below (SEQ ID NO:36): 1 AAGCCTGCCG CTGACATTGA AGAACCAATA TATACAATGG
  • the invention provides SPANX-N-specific nucleic acids, primers or probes.
  • SPANX-N-specific nucleic acids, primers or probes can be used for detection of SPANX-N expression, for example, in testes and prostate cancer cells.
  • expression of SPANX-N in non-testes tissues is an indication that such tissues are cancerous.
  • any of the SPANX-N cDNAs (SEQ ID NO:26-30) provided above can serve as a SPANX-N-specific nucleic acid or probes.
  • the invention also provides primers or probes that are referred to herein as the FhuS-F and RhuS-R primers, which specifically hybridize to the SPANX-N subfamily of genes. The sequences of these SPANX-N specific primers are shown below:
  • FhuS-F 5'-atggaacagccgacttcaag-3' (SEQ ID NO:37)
  • RhuS-R 5'-tgagtctaggccttcgtcct-3' (SEQ ID NO:38)
  • the SEQ ID NO:37-38 probes are specific for the SPANX-N subfamily and distinguish these genes from the SPANX-A1, -A2, -B, -C and -D gene subfamily. These primers were used for detecting SPANX-N expression, and the products detected were confirmed to be SPANX-N nucleic acids by sequencing. In addition, the following probes can be used to detect SPANX-N transcripts and expression patterns, and the following nucleic acids are useful for nucleic acid amplification of specific SPANX-N RNA and DNA sequences, including specific portions of those sequences. SPANX-N1 Exon1: Particularly useful for nucleic acid amplification (1,946 bp product)
  • N1ex1-R 5'-acaactttcgttaaccgcca-3' (SEQ ID NO:138)
  • Exon1 Particularly useful for nucleic acid sequencing
  • Exon2 Particularly useful for nucleic acid amplification (1,779 bp) N1ex2-F 5'-agggaagtgaatacaccaga-3' (SEQ ID NO:141)
  • SeqN2ex2-F 5'-taacaggtgaccctacccat-3' (SEQ ID NO:143)
  • SeqN2ex2-R 5'-gatcactggagaaggaggaa-3' (SEQ ID NO:144)
  • SPANX-N2 Exon1 Particularly useful for nucleic acid amplification (3,710 bp)
  • N2ex2-F 5'-tgagcgagtactccagaga-3' (SEQ ID NO: 149)
  • N2ex2-R 5'-ctggttgtgacgtactatact-3' (SEQ ID NO: 150)
  • Exon1 Particularly useful for nucleic acid amplification (4,593 bp) N3ex1-F 5'-aggttcgcttggtttgttag-3' (SEQ ID NO:153)
  • N3ex1-R 5'-acagcaactgaccaatcttc-3' (SEQ ID NO:154)
  • SPANX-N4 Exon1 Particularly useful for nucleic acid amplification (1,515 bp)
  • Exon1 Particularly useful for nucleic acid sequencing
  • Seq1N4 R 5'-tctgcaggtgtctgcagtat-3' (SEQ ID NO:164)
  • SPANX-N1-5 Detecting SPANX-N expression in humans (e.g. by RT-PCR) spaxn1/n5-F (180 bp) 5'-aagaggaagagcccctgtga-3' (SEQ ID NO:169) spaxn1/n5-R (180 bp) 5'-ggtcattctccagttgatttga-3' (SEQ ID NO:170)
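  • Basic properties of the primers listed above can be checked with a few lines of code. The sketch below computes GC content, a rough melting temperature using the Wallace rule (Tm = 2(A+T) + 4(G+C), an approximation chosen for this illustration and not a method named in the text), the reverse complement, and an in-frame translation with the standard genetic code. Function names are illustrative. Translating the FhuS-F primer (SEQ ID NO:37) gives MEQPTS, whose residues after the initial Met match the start of the EQPTSS... peptide of SEQ ID NO:136.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough Tm in degrees C by the Wallace rule (an assumed approximation)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(primer: str) -> str:
    """Reverse complement, preserving lower case."""
    return primer.lower().translate(str.maketrans("acgt", "tgca"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the first base; trailing bases are ignored."""
    seq = dna.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


FHUS_F = "atggaacagccgacttcaag"   # SEQ ID NO:37
RHUS_R = "tgagtctaggccttcgtcct"   # SEQ ID NO:38

print(gc_fraction(FHUS_F), wallace_tm(FHUS_F), wallace_tm(RHUS_R))
print(reverse_complement(RHUS_R))
print(translate(FHUS_F))
```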
  • the SPANX-N promoter sequences can promote expression in selected cell and tissue types, including testis tissue, and in the case of the SPANX-N1 gene, in cancer cells.
  • SPANX-N4 transcripts were detected only in testis
  • SPANX-N2-5 transcripts were detected in several normal nongametogenic tissues (placenta, prostate, proximal and distal colon, lung, and cervix), although the levels of SPANX-N expression in these tissues were lower than those observed in testis (FIG. 7A and Table 10).
  • SPANX-N1 was not expressed in normal, nongametogenic tissues. In other words, detectable levels of SPANX-N1 are not observed in normal tissues.
  • the SPANX-N1, SPANX-N2, SPANX-N3, SPANX-N4 and SPANX-N5 promoters have utility for expressing gene products in testis
  • the SPANX-N2, SPANX-N3, SPANX-N4 and SPANX-N5 (but not SPANX-N1) promoters have utility for expressing gene products in tissues such as placenta, prostate, proximal and distal colon, lung, and cervix.
  • although SPANX-N1 is not expressed in normal tissues (except testis and sperm), it is expressed in cancer cells.
  • SPANX-N1 is a diagnostic marker for cancer.
  • the SPANX-N1 promoter can be used to promote expression of anti-cancer gene products.
  • the promoters of the SPANX-N1 through SPANX-N5 genes have the following sequences.
  • SPANX-N1 Promoter SEQ ID NO:206
  • SPANX-N2 Promoter SEQ ID NO:207
  • SPANX-N4 Promoter (SEQ ID NO:209): 1 TCCATGTGAA CCATGAACAT TAAACATGGA GAAATGAGGA
  • SPANX-N5 Promoter SEQ ID NO:210:
  • genes, promoters and nucleotide sequences of the invention include both the naturally occurring sequences as well as variants thereof.
  • the invention contemplates SPANX-N nucleic acids from mammalian species other than humans.
  • the following terms are used to describe the sequence relationships between two or more nucleic acids or polynucleotides: (a) “reference sequence,” (b) “comparison window,” (c) “sequence identity,” (d) “percentage of sequence identity,” and (e) “substantial identity”.
  • reference sequence is a defined sequence used as a basis for sequence comparison.
  • a reference sequence may be a subset or the entirety of a specified sequence; for example, as a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene sequence.
  • comparison window makes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
  • the comparison window is at least 15 or 17 contiguous nucleotides in length, and optionally can be 20, 30, 40, 50, 100, or longer.
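  • A percentage of sequence identity over a comparison window can be sketched as follows. This is an illustration under stated assumptions, not the calculation performed by any of the named programs: the two sequences are taken as already aligned, "-" marks a gap, gapped positions count as non-matching, and the denominator is the full window length (other conventions exist). The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_ref: str, aligned_test: str, min_window: int = 15) -> float:
    """Percent of window positions where both aligned sequences carry the same base."""
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    if len(aligned_ref) < min_window:
        raise ValueError(f"comparison window shorter than {min_window} positions")
    matches = sum(
        1 for a, b in zip(aligned_ref, aligned_test) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_ref)


# Hypothetical 18-position window with one mismatch and one gap in the test sequence.
reference = "ACGTACGTACGTACGTAC"
test_seq = "ACGTACGAACGT-CGTAC"

print(round(percent_identity(reference, test_seq), 1))
```

  • Here 16 of 18 positions match, giving roughly 88.9% identity over the window.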
  • Computer implementations of these mathematical algorithms can be utilized for comparison of sequences to determine sequence identity. Such implementations include, but are not limited to: CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, California); the ALIGN program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Version 8 (available from Genetics Computer Group (GCG), 575 Science Drive, Madison, Wisconsin, USA). Alignments using these programs can be performed using the default parameters. Further information on the CLUSTAL program can be found in Higgins et al., Gene, 73, 237 (1988); Higgins et al., CABIOS, 5, 151 (1989); Corpet et al., Nucl.
  • HSPs high scoring sequence pairs
  • Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0).
  • M reward score for a pair of matching residues; always > 0.
  • N penalty score for mismatching residues; always < 0.
  • a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when the cumulative alignment score falls off by the quantity X from its maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or the end of either sequence is reached.
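The halting rules described above can be sketched in a few lines. This is a minimal illustration of an ungapped, one-directional extension for nucleotide sequences, not the BLAST implementation itself; the function name `extend_right` and the default values chosen for M, N, and X are assumptions made for the example.

```python
def extend_right(query, subject, q_start, s_start, word_len, M=1, N=-3, X=20):
    """Extend an exact word hit to the right without gaps.

    M is the reward for a match (> 0), N the penalty for a mismatch (< 0),
    and X the allowed drop from the best score seen so far. The default
    values are illustrative assumptions, not BLAST defaults.
    Returns (best_score, best_length).
    """
    score = M * word_len          # the seed word is assumed to match exactly
    best_score = score
    best_len = word_len
    i = q_start + word_len
    j = s_start + word_len
    while i < len(query) and j < len(subject):   # stop at the end of either sequence
        score += M if query[i] == subject[j] else N
        if score > best_score:
            best_score = score
            best_len = i - q_start + 1
        # stop when the score falls X below its maximum, or reaches zero or below
        if best_score - score >= X or score <= 0:
            break
        i += 1
        j += 1
    return best_score, best_len
```

Extension to the left follows the same logic with the indices decremented; the two results together delimit a high scoring sequence pair.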
  • In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences.
  • One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance.
  • P(N) the smallest sum probability
  • a test nucleic acid sequence is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid sequence to the reference nucleic acid sequence is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
  • Gapped BLAST in BLAST 2.0
  • PSI-BLAST in BLAST 2.0
  • the default parameters of the respective programs e.g. BLASTN for nucleotide sequences, BLASTX for proteins
  • W wordlength
  • E expectation
  • BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix. See website at ncbi.nlm.nih.gov. Alignment may also be performed manually by inspection.
  • comparison of nucleotide sequences for determination of percent sequence identity to the promoter sequences disclosed herein is preferably made using the BlastN program (version 1.4.7 or later) with its default parameters or any equivalent program.
  • equivalent program any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by the preferred program.
  • percentage of sequence identity means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
  • the percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity.
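The calculation described above can be written out directly. The sketch below assumes the two sequences have already been optimally aligned and padded with a gap character so that they have equal length; the function name `percent_identity` and the use of `-` as the gap symbol are assumptions for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of sequence identity over a comparison window.

    Both inputs must be pre-aligned strings of equal length. Gap positions
    count toward the total number of positions but never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)
```

For example, `percent_identity("ACGT-ACGTA", "ACGTTACCTA")` counts 8 matched positions out of 10 in the window.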
  • substantially identical of polynucleotide sequences means that a polynucleotide comprises a sequence that has at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, or 79%, preferably at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, or 89%, more preferably at least 90%, 91%, 92%, 93%, or 94%, and most preferably at least 95%, 96%, 97%, 98%, or 99% sequence identity, compared to a reference sequence using one of the alignment programs described using standard parameters.
  • amino acid sequences for these purposes normally means sequence identity of at least 70%, more preferably at least 80%, 90%, and most preferably at least 95%.
  • nucleotide sequences are substantially identical if two molecules hybridize to each other under stringent conditions.
  • stringent conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
  • stringent conditions encompass temperatures in the range of about 1°C to about 20°C, depending upon the desired degree of stringency as otherwise qualified herein.
  • Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides they encode are substantially identical. This may occur, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code.
  • One indication that two nucleic acid sequences are substantially identical is when the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the polypeptide encoded by the second nucleic acid.
  • sequence comparison typically one sequence acts as a reference sequence to which test sequences are compared.
  • test and reference sequences are input into a computer, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are designated.
  • sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
  • hybridizing specifically to refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA.
  • Bind(s) substantially refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target nucleic acid sequence.
  • “Stringent hybridization conditions” and “stringent hybridization wash conditions” in the context of nucleic acid hybridization experiments such as Southern and Northern hybridizations are sequence dependent, and are different under different environmental parameters. Longer sequences hybridize specifically at higher temperatures.
  • the Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Specificity is typically the function of post-hybridization washes, the critical factors being the ionic strength and temperature of the final wash solution. For DNA-DNA hybrids, the Tm can be approximated from the equation of Meinkoth and Wahl (Anal. Biochem., 138.
  • Tm = 81.5°C + 16.6 (log M) + 0.41 (%GC) - 0.61 (% form) - 500/L; where M is the molarity of monovalent cations, %GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs.
  • Tm is reduced by about 1°C for each 1% of mismatching; thus, Tm, hybridization, and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with >90% identity are sought, the Tm can be decreased 10°C.
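As a worked illustration of the Meinkoth and Wahl approximation and the 1°C-per-1%-mismatch adjustment, the following sketch evaluates both. The function names are hypothetical, and the logarithm is assumed to be base 10, as is conventional for this equation.

```python
import math

def tm_meinkoth_wahl(monovalent_molar, percent_gc, percent_formamide, length_bp):
    """Approximate Tm (in degrees C) of a DNA-DNA hybrid.

    Tm = 81.5 + 16.6*log10(M) + 0.41*(%GC) - 0.61*(% form) - 500/L
    """
    return (
        81.5
        + 16.6 * math.log10(monovalent_molar)
        + 0.41 * percent_gc
        - 0.61 * percent_formamide
        - 500.0 / length_bp
    )

def tm_for_identity(tm, min_identity_percent):
    """Lower Tm by about 1 degree C for each 1% of mismatching to be tolerated."""
    return tm - (100.0 - min_identity_percent)
```

With 1 M monovalent cation, 50% GC, no formamide, and a 500 bp hybrid, the approximation gives about 101°C; seeking sequences with at least 90% identity lowers that figure by 10°C.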
  • stringent conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence and its complement at a defined ionic strength and pH.
  • severely stringent conditions can utilize a hybridization and/or wash at 1, 2, 3, or 4°C lower than the thermal melting point (Tm);
  • moderately stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10°C lower than the thermal melting point (Tm);
  • low stringency conditions can utilize a hybridization and/or wash at 11, 12, 13, 14, 15, or 20°C lower than the thermal melting point (Tm).
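The stringency tiers above can be summarized as a simple lookup on the number of degrees below Tm at which hybridization or washing is performed. This is only a sketch: the function name is hypothetical, "about 5°C" is interpreted here as strictly between 4 and 6°C, and values that fall outside the listed figures are reported as unclassified rather than guessed.

```python
def classify_stringency(degrees_below_tm):
    """Map a temperature offset below Tm (degrees C) to a stringency tier."""
    d = degrees_below_tm
    if 1 <= d <= 4:
        return "severely stringent"
    if 4 < d < 6:
        return "stringent"
    if 6 <= d <= 10:
        return "moderately stringent"
    if 11 <= d <= 20:
        return "low stringency"
    return "unclassified"
```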
  • An example of highly stringent wash conditions is 0.15 M NaCl at 72°C for about 15 minutes.
  • An example of stringent wash conditions is a 0.2X SSC wash at 65°C for 15 minutes (see, Sambrook and Russell, infra, for a description of SSC buffer).
  • a high stringency wash is preceded by a low stringency wash to remove background probe signal.
  • An example medium stringency wash for a duplex of, e.g., more than 100 nucleotides is 1X SSC at 45°C for 15 minutes.
  • An example low stringency wash for a duplex of, e.g., more than 100 nucleotides is 4-6X SSC at 40°C for 15 minutes.
  • stringent conditions typically involve salt concentrations of less than about 1.5 M, more preferably about 0.01 to 1.0 M, Na ion concentration (or other salts) at pH 7.0 to 8.3, and the temperature is typically at least about 30°C and at least about 60°C for long probes (e.g., >50 nucleotides).
  • Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.
  • a signal to noise ratio of 2X (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization.
  • Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the proteins that they encode are substantially identical. This occurs, e.g. , when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code.
  • Very stringent conditions are selected to be equal to the T m for a particular probe.
  • An example of stringent conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or Northern blot is 50% formamide, e.g., hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 0.1 X SSC at 60 to 65°C.
  • Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1.0 M NaCl, 1% SDS at 37°C, and a wash in 0.5X to 1X SSC at 55 to 60°C.
  • the SPANX-N1 through SPANX-N5 promoter sequences of the invention can be used to promote expression of therapeutic gene products in a tissue-specific manner.
  • the SPANX-N1 through SPANX-N5 promoters can be operably linked to nucleic acids that encode beneficial and/or therapeutic gene products to form expression cassettes and/or expression vectors useful for promoting expression of those beneficial and/or therapeutic gene products in tissues where the SPANX-N1 through SPANX-N5 promoters are active.
  • Nucleic acids encoding beneficial and/or therapeutic gene products that can be operably linked to the promoters of the invention include any available nucleic acids selected by one of skill in the art.
  • nucleic acids that encode beneficial and/or therapeutic gene products include cytokines, interferons, growth factors, hormones, cell growth inhibitors, cell cycle regulators, apoptosis regulators, cytotoxins, cytolytic viruses, antibodies and the like.
  • the nucleic acid encodes interleukins and cytokines, such as interleukin 1 (IL-1), IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IFN-alpha, IFN-beta, IFN-gamma, angiostatin, thrombospondin, endostatin, METH-1, METH-2, Flk2/Flt3 ligand, GM-CSF, G-CSF, M-CSF, and tumor necrosis factor (TNF).
  • Interferons are soluble proteins that originally were found to induce antiviral activity in target cells. IFNs are now known to inhibit cell division and modulate the immune response. IFN-alpha produces an overall response rate of 20% in advanced melanoma and is associated with a 42% improvement in the fraction of patients with high risk melanoma who are disease-free.
  • the melanoma differentiation associated protein 7 may be encoded in the nucleic acids linked to the SPANX-N promoters of the invention.
  • MDA7 was identified following treatment of melanoma cells with interferon-alpha and mezerein.
  • Jiang and Fisher noted loss of proliferative ability and terminal differentiation (Jiang et al., Proc. Natl. Acad. Sci. USA, 93:9160-9165 (1996)).
  • Jiang and Fisher developed a novel subtraction hybridization scheme in human melanoma cells and this resulted in the identification and cloning of a series of melanoma-differentiation-associated (MDA) genes implicated in growth-controlled differentiation and apoptosis.
  • MDA melanoma-differentiation-associated
  • One of the MDA genes identified, MDA7, was noted to be a novel gene and expression of this gene correlated with the induction of terminal differentiation in human melanoma cells (Jiang et al., 1996; Jiang et al., Oncogene, 11:2477-2486 (1995)).
  • the MDA7 gene was noted to be expressed at high levels in proliferating normal melanocytes, but the expression was decreased as disease progressed to metastatic disease.
  • MDA7 Ther., 7:2051-2057 (2000); Saeki et al., Oncogene, 21:4558-4566 (2002)).
  • the MDA7 gene was recently mapped to chromosome 1q32, an area containing a cluster of genes associated with the IL-10 family of cytokines (Mhashilkar et al., Mol. Med., 7:271-282 (2001)).
  • MDA7 has now been classified as interleukin-24 and has been demonstrated to bind to the IL-20 and IL-22 receptors, and subsequently mediate cell signaling.
  • the nucleic acids linked to the promoters of the invention encode a hormone.
  • hormones or steroids can be used in the present invention: insulin, somatotropin, gonadotropin, ACTH, CGH, or gastrointestinal hormones such as secretin.
  • therapeutic gene products encoded by nucleic acids linked to the promoters of the invention include plant-, fungus-, or bacteria-derived toxins such as ricin A-chain (Burbage, Leuk Res., 21(7):681-690 (1997)), a ribosome inactivating protein, α-sarcin, aspergillin, restrictocin, a ribonuclease, diphtheria toxin A (Masuda et al., Mol. Cell. Biol., 17:2066-2075 (1997); Lidor, Am. J. Obstet. Gynecol., 177(3):579-585 (1997)), pertussis toxin A subunit, E.
  • Chemokines also may be used in the nucleic acids linked to promoters of the present invention. Chemokines generally act as chemoattractants to recruit immune effector cells to the site of chemokine expression. It may be advantageous to express a particular chemokine gene in combination with, for example, a cytokine gene, to enhance the recruitment of other immune system components to the site of treatment. Such chemokines include RANTES, MCAF, MIP1-alpha, MIP1-beta, and IP-10. The skilled artisan will recognize that certain cytokines are also known to have chemoattractant effects and could also be classified under the term chemokines.
  • nucleic acids encoding cell cycle regulators can be operably linked to the promoters of the invention.
  • Such cell cycle regulators include p27, p16, p21, p57, p18, p73, p19, p15, E2F-1, E2F-2, E2F-3, p107, p130 and E2F-4.
  • cell cycle regulators include anti-angiogenic proteins, such as soluble Flt1 (dominant negative soluble VEGF receptor), soluble Wnt receptors, soluble Tie2/Tek receptor, soluble hemopexin domain of matrix metalloprotease 2 and soluble receptors of other angiogenic cytokines (e.g., VEGFR1/KDR, VEGFR3/Flt4, both VEGF receptors).
  • nucleic acids operably linked to the promoters of the invention can encode inducers of apoptosis, such as Bax, Bak, Bcl-Xs, Bad, Bim, Bik, Bid, Harakiri, Ad E1B, Bad, ICE-CED3 proteases, TRAIL, SARP-2 and apoptin.
  • tumor suppressors may also be encoded in nucleic acids operably linked to the promoters of the present invention.
  • Such tumor suppressors include, but are not limited to p53, p16, CCAM, p21, p15, BRCA1, BRCA2, IRF-1, PTEN (MMAC1), RB, APC, DCC, NF-1, NF-2, WT-1, MEN-I, MEN-II, zac1, p73, VHL, FCC, MCC, DBCCR1, DCP4 and p57.
  • an antibody fragment or a single-chain antibody can be encoded in nucleic acids linked to the promoters of the invention.
  • a single chain antibody is created by fusing together the variable domains of the heavy and light chains using a short peptide linker, thereby reconstituting an antigen binding site on a single molecule.
  • Single-chain antibody variable fragments scFvs
  • Single-chain antibody variable fragments (scFvs), in which the C-terminus of one variable domain is tethered to the N-terminus of the other via a 15 to 25 amino acid peptide or linker, have been developed without significantly disrupting antigen binding or specificity of the binding.
  • These scFvs lack the constant regions (Fc) present in the heavy and light chains of the native antibody.
  • Antibodies capable of binding to a wide variety of molecules are contemplated, including antibodies that bind SPANX polypeptides, as described herein. However, the antibodies and antibody fragments can bind to oncogenes, growth factors, hormones, enzymes, transcription factors, receptors, viral proteins and the like. Also contemplated are secreted antibodies, targeted to serum, against angiogenic factors (VEGF/VSP, beta-FGF, alpha-FGF and others) and endothelial antigens necessary for angiogenesis (i.e., V3 integrin). Specifically contemplated are growth factors such as transforming growth factor and platelet derived growth factor.
  • Oncogenes that are targets for such antibodies and/or antibody fragments include ras, myc, neu, raf, erb, src, fms, jun, trk, ret, hst, gsp, bcl-2 and abl. Also contemplated to be useful will be anti-apoptotic genes and angiogenesis promoters.
  • cytolytic or oncolytic viral proteins may be encoded in a nucleic acid that is operably linked to a promoter of the invention. The cell will typically localize in a tumor microenvironment where the viral product is expressed by the promoter. In some cases, an entire viral genome may be expressed and the virus may infect the surrounding cells.
  • the virus will selectively or preferentially lyse or kill hyperproliferative or tumor cells.
  • Cytolytic or oncolytic viruses are known. Examples of oncolytic viruses include mutated adenovirus (Heise et al., Nat. Med., 3:639-645 (1997)), mutated vaccinia virus (Gnant et al., Cancer Res., 59:3396-3403 (1999)) and mutated reovirus (Coffey et al., Science, 282:1332-1334 (1998)). Examples of viral vectors for use in gene therapy include mutated vaccinia virus (Lattime et al., Semin.
  • any one particular construct or expression cassette that includes a promoter of the invention may be combined with any other construct or expression cassette, either in the same or different expression vector.
  • Such “combined” therapies may have particular import in treating multiple aspects of a condition, disease, or other abnormal physiology, for example, when treating multidrug resistant (MDR) cancers.
  • MDR multidrug resistant
  • one aspect of the present invention utilizes a combination of expression cassettes, each encoding a beneficial gene product, wherein at least one of the gene products is operably linked to a promoter of the invention. This combination permits expression of the beneficial agent(s) in an appropriate site in a tissue, organ or organism for treatment of diseases, so that both agents can beneficially operate to optimally treat the disease.
  • the present invention also relates to a process for treating cancer comprising operably linking a nucleic acid that encodes an anti-cancer or other beneficial gene product to a promoter of the invention such that expression of the anti-cancer or beneficial gene product suppresses the cancer.
  • the promoter is selected from the group consisting of the sequences of SEQ ID NO:206-210.
  • the anti-cancer or beneficial gene product is generally operably linked to an N1 promoter, for example, a promoter with SEQ ID NO:206.
  • Nucleic acids that can inhibit the functioning of SPANX RNA include small interfering RNAs (siRNAs), ribozymes, antisense nucleic acids, and the like.
  • prostate cancer can be treated by administering to a mammal a nucleic acid that can inhibit the functioning of an SPANX RNA.
  • the nucleic acid that inhibits the function of SPANX-N mRNA can be operably linked to a SPANX-N promoter to generate an expression cassette useful for inhibiting production of SPANX-N polypeptides.
  • Nucleic acids that can inhibit the function of an SPANX RNA can be generated from coding and non-coding regions of the SPANX gene.
  • nucleic acids that can inhibit the function of an SPANX RNA are often selected to be complementary to sequences near the 5' end of the coding region of the RNA.
  • the nucleic acid that can inhibit the functioning of an SPANX RNA can be complementary to a SPANX-N mRNA sequence encoded near the 5' end of any one of SEQ ID NO:26 to 30.
  • nucleic acids that can inhibit the function of an SPANX RNA can be complementary to a SPANX-A/D mRNA, for example, a mRNA encoded by any one of SEQ ID NO:31-35.
  • nucleic acids that can inhibit the function of an SPANX RNA can be complementary to SPANX RNAs from other species (e.g., mouse, rat, cat, dog, goat, pig, gorilla or a monkey SPANX RNA).
  • a nucleic acid that can inhibit the functioning of an SPANX RNA need not be 100% complementary to a selected region of mRNA. Instead, some variability in the sequence of the nucleic acid that can inhibit the functioning of an SPANX RNA is permitted.
  • a nucleic acid that can inhibit the functioning of a human SPANX RNA can be complementary to a nucleic acid encoding a mouse or rat SPANX gene product. Nucleic acids encoding mouse SPANX gene product, for example, can be found in the NCBI database at GenBank.
  • nucleic acids that can hybridize under moderately or highly stringent hybridization conditions are sufficiently complementary to inhibit the functioning of an SPANX RNA and can be utilized in the compositions of the invention.
  • stringent hybridization conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
  • stringent conditions encompass temperatures in the range of about 1°C to about 20°C lower than the thermal melting point of the selected sequence, depending upon the desired degree of stringency as otherwise qualified herein.
  • the nucleic acids that can inhibit the functioning of SPANX RNA can hybridize to an SPANX RNA under physiological conditions, for example, physiological temperatures and salt concentrations.
  • Inhibitory nucleic acid molecules that comprise, for example, 2, 3, 4, or 5 or more stretches of contiguous nucleotides that are precisely complementary to an SPANX coding sequence, each separated by a stretch of contiguous nucleotides that are not complementary to adjacent SPANX mRNA coding sequences, can inhibit the function of SPANX mRNA.
  • each stretch of contiguous nucleotides is at least 4, 5, 6, 7, or 8 or more nucleotides in length.
  • Non-complementary intervening sequences are preferably 1, 2, 3, or 4 nucleotides in length.
  • One skilled in the art can easily use the calculated melting point of a nucleic acid hybridized to a sense nucleic acid to estimate the degree of mismatching that will be tolerated between a particular nucleic acid for inhibiting expression of a particular SPANX RNA.
  • a nucleic acid that can inhibit the function of an endogenous SPANX RNA is an anti-sense oligonucleotide.
  • the anti-sense oligonucleotide is complementary to at least a portion of the sequence of a SPANX mRNA.
  • Such anti-sense oligonucleotides are generally at least six nucleotides in length, but can be about 8, 12, 15, 20, 25, 30, 35, 40, 45, or 50 nucleotides long. Longer oligonucleotides can also be used.
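To make the notion of complementarity concrete, the sketch below derives an antisense RNA oligonucleotide as the reverse complement of a chosen mRNA segment. The function name and the example input used in the note that follows are hypothetical; the input is not a SPANX sequence.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")

def antisense_rna(mrna_segment):
    """Return the reverse complement (5'->3') of an mRNA segment as RNA.

    Input may be written as RNA or as the corresponding cDNA (T is read as U).
    """
    seq = mrna_segment.upper().replace("T", "U")
    unexpected = set(seq) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)[::-1]
```

For instance, the hypothetical segment `AUGGCC` yields `GGCCAU`. In practice the segment would be taken from the target mRNA at the desired length, such as the lengths listed above.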
  • SPANX anti-sense oligonucleotides can be provided in a DNA construct or expression cassette and introduced into cells or into tumor sites.
  • ribozyme is an RNA molecule with catalytic activity. See, e.g., Cech, 1987, Science 236: 1532-1539; Cech, 1990, Ann. Rev. Biochem. 59:543-568; Cech, 1992, Curr. Opin. Struct. Biol. 2: 605-609; Couture and Stinchcomb, 1996, Trends Genet. 12: 510-515. Ribozymes can be used to inhibit gene function by cleaving an RNA sequence, as is known in the art (see, e.g., Haseloff et al., U.S. Pat. No. 5,641,673).
  • SPANX nucleic acids complementary to mRNA encoded by any of SEQ ID NO:26-30 or SEQ ID NO:31-35 can be used to generate ribozymes that will specifically bind to mRNA transcribed from an SPANX gene.
  • SPANX nucleic acids complementary to mRNA include those with sequence identity to the cDNA sequences of SEQ ID NO:26-30 or SEQ ID NO:31-35.
  • the cleavage activity of ribozymes can be targeted to specific RNAs by engineering a discrete "hybridization" region into the ribozyme.
  • the hybridization region contains a sequence complementary to the target RNA and thus specifically hybridizes with the target (see, for example, Gerlach et al., EP 321,201).
  • the target sequence can be a segment of about 10, 12, 15, 20, or 50 contiguous nucleotides selected from a nucleotide sequence having SEQ ID NO:26-30 or SEQ ID NO:31-35. Longer complementary sequences can be used to increase the affinity of the hybridization sequence for the target.
  • RNA interference involves post-transcriptional gene silencing
  • siRNAs Small interfering RNAs
  • siRNAs are generally 21-23 nucleotide dsRNAs that mediate post-transcriptional gene silencing.
  • Introduction of siRNAs can induce post-transcriptional gene silencing in mammalian cells.
  • siRNAs can also be produced in vivo by cleavage of dsRNA introduced directly or via a transgene or virus. Amplification by an RNA-dependent RNA polymerase may occur in some organisms.
  • siRNAs are incorporated into the RNA-induced silencing complex, guiding the complex to the homologous endogenous mRNA where the complex cleaves the transcript. Rules for designing siRNAs are available. See, e.g., Elbashir SM,
  • an effective siRNA can be made by selecting target sites within SEQ ID NO:26-35 that begin with AA, that have 3' UU overhangs for both the sense and antisense siRNA strands, and that have an approximate 50% G/C content.
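The selection criteria just described (a target site beginning with AA, roughly 50% G/C content, and a 3' UU overhang) can be sketched as a simple scan. This is an illustrative sketch only: the function name, the 23-nucleotide site length (AA plus 21 nucleotides, chosen to match the 25-nucleotide sequences listed in this document), and the 40 to 60% G/C window are assumptions, and only the sense strand is produced.

```python
def find_sirna_candidates(mrna, core_len=21, gc_min=40.0, gc_max=60.0):
    """Scan an mRNA (or cDNA) for candidate siRNA sense-strand sequences.

    A candidate site is AA followed by core_len nucleotides whose overall
    G/C content lies within [gc_min, gc_max] percent. Each hit is returned
    as (start_index, site + "UU", gc_percent).
    """
    seq = mrna.upper().replace("T", "U")
    site_len = 2 + core_len
    candidates = []
    for i in range(len(seq) - site_len + 1):
        site = seq[i:i + site_len]
        if not site.startswith("AA"):
            continue
        gc = 100.0 * (site.count("G") + site.count("C")) / site_len
        if gc_min <= gc <= gc_max:
            candidates.append((i, site + "UU", gc))   # append the 3' UU overhang
    return candidates
```

Candidates found this way would still need to be checked for specificity against other transcripts before use.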
  • a siRNA of the invention for inhibiting SPANX-N mRNA functioning can have any of the following sequences:
  • AACAGCCCAC UUCAAGCAUC AAUUU (SEQ ID NO:39) from SPANX-N1.
  • AAGCAUCAAU GGGGAGAAGA GGUUU (SEQ ID NO:40) from SPANX-N1.
  • AACUUCCAGC ACCAAUGGGG AGUUU (SEQ ID NO:44) from SPANX-N3.
  • AAGAGCCAAC UUCCAGCACC AAUUU (SEQ ID NO:45) from SPANX-N4.
  • a siRNA of the invention for inhibiting SPANX-A/D mRNA functioning can have any of the following sequences:
  • AAGAUUCAAA ACCUACAAAA GCCUUU (SEQ ID NO:51) from SPANX-A2.
  • AACCUACAAA AGCCUGCCAC UUU (SEQ ID NO:52) from SPANX-A2.
  • AAGAGCUCUG GGCCACUGCG AAG UUU (SEQ ID NO:53) from SPANX-B1 and SPANX-B2.
  • AAGAUUCAAA AGCUCCAAAA ACCUUU (SEQ ID NO:55)
  • AAGCCUGCCG CAGACAUUGA AGUUU (SEQ ID NO:59) from SPANX-C.
  • AAGCCUGCCG CUGACAUUGA AGUUU (SEQ ID NO:60) from SPANX-D.
  • Nucleic acids that can decrease SPANX expression or translation can hybridize to an mRNA encoded by any one of SEQ ID NO:26-36 under physiological conditions. In other embodiments, these nucleic acids can hybridize to an mRNA encoded by a nucleic acid comprising SEQ ID NO:26-36 under stringent hybridization conditions. Examples of nucleic acids that can modulate the expression or translation of an SPANX polypeptide include a siRNA that consists essentially of a double-stranded RNA with any one of SEQ ID NO:39-61.
  • the invention provides a method to identify an agent that modulates SPANX expression.
  • the method involves contacting a test cell with a candidate agent and determining if the agent modulates SPANX expression, either by increasing or decreasing SPANX expression within the test cell.
  • the invention provides a method for identifying agents that increase or decrease SPANX expression.
  • An increase or decrease in SPANX expression within a cell can be determined by comparing the SPANX expression within a test cell that was contacted with a candidate agent, with the SPANX expression within a control cell that was not contacted with a candidate agent.
  • the SPANX expression in a control cell may be determined before, concurrently with, or after the SPANX expression within the test cell is determined.
  • SPANX expression can be determined by detecting activity of an SPANX promoter. An increase or decrease in transcription from a SPANX promoter can be determined through use of many art recognized methods. For example, the presence and quantity of messenger RNA (mRNA) encoded by an SPANX regulated gene in a cell or other sample can be determined through use of hybridization based procedures, such as northern blotting, gene chip technologies, or through production and hybridization of complementary DNA (cDNA). Additional examples of methods that can be used to detect and quantify mRNA of SPANX regulated genes include nucleic acid amplification based methods, such as polymerase chain reaction, ligase chain reaction, and the like. Instrumental methods may be used to detect and quantify mRNA of SPANX regulated genes.
  • mRNA messenger RNA
  • cDNA complementary DNA
  • probes containing a detectable label may be hybridized to the mRNA.
  • Such probes may be labeled with a fluorescent tag that allows for rapid detection of the mRNA, and therefore provides for high-throughput screening of candidate agents that modulate SPANX expression.
  • Such methods can be automated according to procedures in common practice in the pharmaceutical industry.
  • Numerous labeled probes may be constructed, and include those that use fluorescence resonance energy transfer (FRET) or fluorescence quenching for detection.
  • FRET fluorescence resonance energy transfer
  • Such probes and instrumental methods are known in the art and have been reported (Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd edition. Cold Spring Harbor Press, Cold Spring Harbor, N. Y.
  • Candidate agents can be identified that cause an increase or decrease in the transcription or translation of the gene encoding SPANX. Accordingly, a test cell can be contacted with a candidate agent. Production of SPANX mRNA or SPANX protein within the cell can be determined and compared to production in a control cell to determine if a candidate agent increases or decreases production of SPANX mRNA or protein. Such methods have been described herein and are known in the art.
  • Antibodies have been described herein and can also be produced that bind to the SPANX protein. These antibodies can be used to determine if a candidate agent increases or decreases expression of the SPANX protein within a cell.
  • the antibodies can be utilized in immunosorbent assays, such as enzyme-linked immunosorbent (ELISA) or radio-immunosorbent assays (RIA), to detect SPANX protein.
  • ELISA enzyme-linked immunosorbent
  • RIA radio-immunosorbent assays
  • Test cells can also be constructed that express an SPANX protein that includes a tag.
  • a fusion protein can be constructed such that the tag is an epitope that can be bound by an antibody (Shimada et al., Internat. Immunol., 11:1357-1362 (1999)).
  • An example of such a tag is the FLAG® tag.
  • An increase or decrease in the production of the fusion protein can then be readily followed through use of immunological techniques as are known in the art and described herein (Harlow et al., Antibodies: A Laboratory Manual, page 319 (Cold Spring Harbor Pub. 1988)).
  • the antibodies, expression cassettes, expression vectors and nucleic acids of the invention, including their salts, are administered so as to achieve a reduction in at least one symptom associated with an indication or disease.
  • the antibodies, expression cassettes, expression vectors and nucleic acids or combinations thereof may be administered as single or divided dosages, for example, of at least about 0.01 mg/kg to about 500 to 750 mg/kg, of at least about 0.01 mg/kg to about 300 to 500 mg/kg, at least about 0.1 mg/kg to about 100 to 300 mg/kg or at least about 1 mg/kg to about 50 to 100 mg/kg of body weight, although other dosages may provide beneficial results.
  • the amount administered will vary depending on various factors including, but not limited to, the antibodies or nucleic acids chosen, the disease, the weight, the physical condition, the health and the age of the mammal, and whether prevention or treatment is to be achieved.
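  • The weight-based dosages quoted above translate into absolute amounts by multiplying by body weight. The sketch below is purely arithmetic and uses the outermost bounds of the four quoted ranges; the 70 kg body weight and all names are assumptions chosen for illustration, not recommendations.

```python
# Outermost bounds of the four ranges quoted in the text, in mg per kg of body weight.
DOSE_RANGES_MG_PER_KG = [(0.01, 750.0), (0.01, 500.0), (0.1, 300.0), (1.0, 100.0)]


def dose_range_mg(weight_kg: float, low_mg_per_kg: float, high_mg_per_kg: float) -> tuple:
    """Convert a mg/kg range into an absolute mg range for one body weight."""
    if weight_kg <= 0 or low_mg_per_kg < 0 or high_mg_per_kg < low_mg_per_kg:
        raise ValueError("invalid weight or dose range")
    return (low_mg_per_kg * weight_kg, high_mg_per_kg * weight_kg)


if __name__ == "__main__":
    weight_kg = 70.0  # hypothetical body weight
    for low, high in DOSE_RANGES_MG_PER_KG:
        low_mg, high_mg = dose_range_mg(weight_kg, low, high)
        print(f"{low}-{high} mg/kg -> {low_mg:g}-{high_mg:g} mg")
```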
  • Administration of the therapeutic agents in accordance with the present invention may be in a single dose, in multiple doses, in a continuous or intermittent manner, depending, for example, upon the recipient's physiological condition, whether the purpose of the administration is therapeutic or prophylactic, and other factors known to skilled practitioners.
  • the administration of the therapeutic agents of the invention may be essentially continuous over a pre-selected period of time or may be in a series of spaced doses. Both local and systemic administration is contemplated.
  • therapeutic agents are synthesized or otherwise obtained, purified as necessary or desired and then lyophilized and stabilized.
  • the therapeutic agents can then be adjusted to the appropriate concentration, and optionally combined with other agents.
  • the absolute weight of a given antibody or nucleic acid included in a unit dose can vary widely. For example, about 0.01 to about 2 g, or about 0.1 to about 500 mg, of at least one
  • the unit dosage can vary from about 0.01 g to about 50 g, from about 0.01 g to about 35 g, from about 0.1 g to about 25 g, from about 0.5 g to about 12 g, from about 0.5 g to about 8 g, from about 0.5 g to about 4 g, or from about 0.5 g to about 2 g.
  • Daily doses of the therapeutic agents of the invention can vary as well. Such daily doses can range, for example, from about 0.1 g/day to about 50 g/day, from about 0.1 g/day to about 25 g/day, from about 0.1 g/day to about 12 g/day, from about 0.5 g/day to about 8 g/day, from about 0.5 g/day to about 4 g/day, and from about 0.5 g/day to about 2 g/day.
  • one or more suitable unit dosage forms comprising the therapeutic agents of the invention can be administered by a variety of routes including oral, parenteral (including subcutaneous, intravenous, intramuscular and intraperitoneal), rectal, dermal, transdermal, intrathoracic, intrapulmonary and intranasal (respiratory) routes.
  • the therapeutic agents may also be formulated for sustained release (for example, using microencapsulation, see WO 94/07529, and U.S. Patent No. 4,962,091).
  • the formulations may, where appropriate, be conveniently presented in discrete unit dosage forms and may be prepared by any of the methods well known to the pharmaceutical arts. Such methods may include the step of mixing the therapeutic agent with liquid carriers, solid matrices, semi-solid carriers, finely divided solid carriers or combinations thereof, and then, if necessary, introducing or shaping the product into the desired delivery system.
  • the therapeutic agents of the invention are prepared for oral administration, they are generally combined with a pharmaceutically acceptable carrier, diluent or excipient to form a pharmaceutical formulation, or unit dosage form.
  • the therapeutic agents may be present as a powder, a granular formulation, a solution, a suspension, an emulsion or in a natural or synthetic polymer or resin for ingestion of the active ingredients from a chewing gum.
  • the active ingredients may also be presented as a bolus, electuary or paste.
  • Orally administered therapeutic agents of the invention can also be formulated for sustained release, e.g., the antibodies and/or nucleic acids can be coated, micro-encapsulated, or otherwise placed within a sustained delivery device.
  • the total active ingredients in such formulations comprise from 0.1 to 99.9% by weight of the formulation.
  • pharmaceutically acceptable it is meant a carrier, diluent, excipient, and/or salt that is compatible with the other ingredients of the formulation, and not deleterious to the recipient thereof.
  • compositions containing the therapeutic agents of the invention can be prepared by procedures known in the art using well-known and readily available ingredients.
  • the therapeutic agents can be formulated with common excipients, diluents, or carriers, and formed into tablets, capsules, solutions, suspensions, powders, aerosols and the like.
  • excipients, diluents, and carriers that are suitable for such formulations include buffers, as well as fillers and extenders such as starch, cellulose, sugars, mannitol, and silicic derivatives.
  • Binding agents can also be included such as carboxymethyl cellulose, hydroxymethylcellulose, hydroxypropyl methylcellulose and other cellulose derivatives, alginates, gelatin, and polyvinyl-pyrrolidone.
  • Moisturizing agents can be included such as glycerol, disintegrating agents such as calcium carbonate and sodium bicarbonate.
  • Agents for retarding dissolution can also be included such as paraffin.
  • Resorption accelerators such as quaternary ammonium compounds can also be included.
  • Surface active agents such as cetyl alcohol and glycerol monostearate can be included.
  • Adsorptive carriers such as kaolin and bentonite can be added.
  • Lubricants such as talc, calcium and magnesium stearate, and solid polyethylene glycols can also be included. Preservatives may also be added.
  • the compositions of the invention can also contain thickening agents such as cellulose and/or cellulose derivatives. They may also contain gums such as xanthan, guar or carob gum or gum arabic, or alternatively polyethylene glycols, bentones and montmorillonites, and the like.
  • tablets or caplets containing the therapeutic agents of the invention can include buffering agents such as calcium carbonate, magnesium oxide and magnesium carbonate.
  • Caplets and tablets can also include inactive ingredients such as cellulose, pre-gelatinized starch, silicon dioxide, hydroxypropyl methylcellulose, magnesium stearate, microcrystalline cellulose, starch, talc, titanium dioxide, benzoic acid, citric acid, corn starch, mineral oil, polypropylene glycol, sodium phosphate, zinc stearate, and the like.
  • Hard or soft gelatin capsules containing at least one antibody or nucleic acid of the invention can contain inactive ingredients such as gelatin, microcrystalline cellulose, sodium lauryl sulfate, starch, talc, and titanium dioxide, and the like, as well as liquid vehicles such as polyethylene glycols (PEGs) and vegetable oil.
  • PEGs polyethylene glycols
  • enteric-coated caplets or tablets containing one or more antibodies or nucleic acids of the invention are designed to resist disintegration in the stomach and dissolve in the more neutral to alkaline environment of the duodenum.
  • the therapeutic agents of the invention can also be formulated as elixirs or solutions for convenient oral administration or as solutions appropriate for parenteral administration, for instance by intramuscular, subcutaneous, intraperitoneal or intravenous routes.
  • the pharmaceutical formulations of the therapeutic agents of the invention can also take the form of an aqueous or anhydrous solution or dispersion, or alternatively the form of an emulsion or suspension or salve.
  • the therapeutic agents may be formulated for parenteral administration (e.g., by injection, for example, bolus injection or continuous infusion) and may be presented in unit dose form in ampoules, pre-filled syringes, small volume infusion containers or in multi-dose containers.
  • preservatives can be added to help maintain the shelf life of the dosage form.
  • the active agents and other ingredients may form suspensions, solutions, or emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing and/or dispersing agents.
  • the active agents and other ingredients may be in powder form, obtained by aseptic isolation of sterile solid or by lyophilization from solution, for constitution with a suitable vehicle, e.g., sterile, pyrogen-free water, before use.
  • formulations can contain pharmaceutically acceptable carriers, vehicles and adjuvants that are well known in the art. It is possible, for example, to prepare solutions using one or more organic solvent(s) that is/are acceptable from the physiological standpoint, chosen, in addition to water, from solvents such as acetone, ethanol, isopropyl alcohol, glycol ethers such as the products sold under the name "Dowanol," polyglycols and polyethylene glycols, C1-C4 alkyl esters of short-chain acids, ethyl or isopropyl lactate, fatty acid triglycerides such as the products marketed under the name "Miglyol," isopropyl myristate, animal, mineral and vegetable oils and polysiloxanes.
  • an adjuvant chosen from antioxidants, surfactants, other preservatives, film-forming, keratolytic or comedolytic agents, perfumes, flavorings and colorings.
  • Antioxidants such as t-butylhydroquinone, butylated hydroxyanisole, butylated hydroxytoluene and α-tocopherol and its derivatives can be added.
  • the therapeutic agents are well suited to formulation as sustained release dosage forms and the like.
  • the formulations can be so constituted that they release the active agent, for example, in a particular part of the intestinal, urogenital or respiratory tract, possibly over a period of time.
  • Coatings, envelopes, and protective matrices may be made, for example, from polymeric substances, such as polylactide-glycolates, liposomes, microemulsions, microparticles, nanoparticles, or waxes. These coatings, envelopes, and protective matrices are useful to coat indwelling devices, e.g., stents, catheters, peritoneal dialysis tubing, draining devices and the like.
  • the therapeutic agents may be formulated as is known in the art for direct application to a target area.
  • Forms chiefly conditioned for topical application take the form, for example, of creams, milks, gels, dispersion or microemulsions, lotions thickened to a greater or lesser extent, impregnated pads, ointments or sticks, aerosol formulations (e.g., sprays or foams), soaps, detergents, lotions or cakes of soap.
  • Other conventional forms for this purpose include wound dressings, coated bandages or other polymer coverings, ointments, creams, lotions, pastes, jellies, sprays, and aerosols.
  • the therapeutic agents of the invention can be delivered via patches or bandages for dermal administration.
  • the therapeutic agents can be formulated to be part of an adhesive polymer, such as polyacrylate or acrylate/vinyl acetate copolymer.
  • the backing layer can be any appropriate thickness that will provide the desired protective and support functions.
  • a suitable thickness will generally be from about 10 to about 200 microns.
  • Ointments and creams may, for example, be formulated with an aqueous or oily base with the addition of suitable thickening and/or gelling agents.
  • Lotions may be formulated with an aqueous or oily base and will in general also contain one or more emulsifying agents, stabilizing agents, dispersing agents, suspending agents, thickening agents, or coloring agents.
  • the therapeutic agents can also be delivered via iontophoresis, e.g., as disclosed in U.S. Patent Nos. 4,140,122; 4,383,529; or 4,051,842.
  • the percent by weight of a therapeutic agent of the invention present in a topical formulation will depend on various factors, but generally will be from 0.01% to 95% of the total weight of the formulation, and typically 0.1-85% by weight.
  • Drops, such as eye drops or nose drops, may be formulated with one or more of the therapeutic agents in an aqueous or non-aqueous base also comprising one or more dispersing agents, solubilizing agents or suspending agents.
  • Liquid sprays are conveniently delivered from pressurized packs. Drops can be delivered via a simple eye dropper-capped bottle, or via a plastic bottle adapted to deliver liquid contents dropwise, via a specially shaped closure.
  • the therapeutic agents may further be formulated for topical administration in the mouth or throat.
  • the active ingredients may be formulated as a lozenge further comprising a flavored base, usually sucrose and acacia or tragacanth; pastilles comprising the composition in an inert base such as gelatin and glycerin or sucrose and acacia; and mouthwashes comprising the composition of the present invention in a suitable liquid carrier.
  • the pharmaceutical formulations of the present invention may include, as optional ingredients, pharmaceutically acceptable carriers, diluents, solubilizing or emulsifying agents, and salts of the type that are available in the art.
  • pharmaceutically acceptable carriers such as physiologically buffered saline solutions and water.
  • diluents such as phosphate buffered saline solutions pH 7.0-8.0.
  • the therapeutic agents of the invention can also be administered to the respiratory tract.
  • the present invention also provides aerosol pharmaceutical formulations and dosage forms for use in the methods of the invention.
  • dosage forms comprise an amount of at least one of the agents of the invention effective to treat or prevent the clinical symptoms of a specific indication or disease. Any statistically significant attenuation of one or more symptoms of an indication or disease that has been treated pursuant to the method of the present invention is considered to be a treatment of such indication or disease within the scope of the invention.
  • the composition may take the form of a dry powder, for example, a powder mix of the therapeutic agent and a suitable powder base such as lactose or starch.
  • the powder composition may be presented in unit dosage form in, for example, capsules or cartridges, or, e.g., gelatin or blister packs from which the powder may be administered with the aid of an inhalator, insufflator, or a metered-dose inhaler (see, for example, the pressurized metered dose inhaler (MDI) and the dry powder inhaler disclosed in Newman, S. P. in Aerosols and the Lung.
  • MDI pressurized metered dose inhaler
  • Therapeutic agents of the present invention can also be administered in an aqueous solution when administered in an aerosol or inhaled form.
  • other aerosol pharmaceutical formulations may comprise, for example, a physiologically acceptable buffered saline solution containing between about 0.1 mg/ml and about 100 mg/ml of one or more of the agents of the present invention specific for the indication or disease to be treated.
  • Dry aerosol in the form of finely divided solid antibody or nucleic acid particles that are not dissolved or suspended in a liquid are also useful in the practice of the present invention.
  • Therapeutic agents of the present invention may be formulated as dusting powders and comprise finely divided particles having an average particle size of between about 1 and 5 µm, alternatively between 2 and 3 µm.
  • Finely divided particles may be prepared by pulverization and screen filtration using techniques well known in the art.
  • the particles may be administered by inhaling a predetermined quantity of the finely divided material, which can be in the form of a powder.
  • the unit content of active ingredient or ingredients contained in an individual aerosol dose of each dosage form need not in itself constitute an effective amount for treating the particular indication or disease since the necessary effective amount can be reached by administration of a plurality of dosage units.
  • the effective amount may be achieved using less than the dose in the dosage form, either individually, or in a series of administrations.
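  • Because a single aerosol dose need not contain the full effective amount, the number of dosage units required follows from a ceiling division. The sketch below assumes hypothetical amounts in milligrams; neither the function name nor the numbers come from this document.

```python
import math


def units_needed(effective_amount_mg: float, unit_content_mg: float) -> int:
    """Smallest whole number of dosage units whose combined content reaches the effective amount."""
    if effective_amount_mg <= 0 or unit_content_mg <= 0:
        raise ValueError("amounts must be positive")
    return math.ceil(effective_amount_mg / unit_content_mg)


if __name__ == "__main__":
    # Hypothetical values: 250 mg effective amount, 100 mg of active agent per actuation.
    print(units_needed(250, 100))
```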
  • the therapeutic agents of the invention are conveniently delivered from a nebulizer or a pressurized pack or other convenient means of delivering an aerosol spray.
  • Pressurized packs may comprise a suitable propellant such as dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas.
  • the dosage unit may be determined by providing a valve to deliver a metered amount.
  • Nebulizers include, but are not limited to, those described in U.S. Patent Nos. 4,624,251; 3,703,173; 3,561,444; and 4,635,627.
  • Aerosol delivery systems of the type disclosed herein are available from numerous commercial sources including Fisons Corporation (Bedford, Mass.), Schering Corp. (Kenilworth, NJ) and American Pharmoseal Co., (Valencia, CA).
  • the therapeutic agent may also be administered via nose drops, a liquid spray, such as via a plastic bottle atomizer or metered-dose inhaler.
  • atomizers are the Mistometer (Winthrop) and the Medihaler (Riker).
  • the active ingredients may also be used in combination with other therapeutic agents, for example, pain relievers, anti-inflammatory agents, and the like, whether for the conditions described or some other condition.
  • the present invention further pertains to a packaged pharmaceutical composition for controlling prostate cancer such as a kit or other container.
  • the kit or container holds a therapeutically effective amount of a pharmaceutical composition for controlling prostate cancer and instructions for using the pharmaceutical composition for control of the prostate cancer.
  • the pharmaceutical composition includes at least one antibody or nucleic acid of the present invention, in a therapeutically effective amount such that a prostate cancer is controlled.
  • This Example describes the sequence, genomic organization and cellular expression patterns of SPANX-N genes. This Example also provides evidence that SPANX-A1, -A2, -B, -C and -D evolved from the SPANX-N genes.
  • F2/ R2 5'-cattctcagtccatgtgga-3' (SEQ ID NO:78) 5'-gcaggctactcatgacacta-3' (SEQ ID NO:79)
  • PrimSX-RR/ PrimSX-F5 5'-ctacctcttcccttcccttccttc-3' (SEQ ID NO:86) 5'-tgggacactgcctgtatgat-3' (SEQ ID NO:87)
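  • Basic properties of the primers listed above (length, GC content and a rough melting temperature) can be estimated with a few lines of code. The sketch uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a coarse approximation for short oligonucleotides; this document does not report melting temperatures, so the choice of this rule is an assumption made for illustration.

```python
# Primer sequences as listed above, keyed by SEQ ID NO to avoid assuming
# which name in each pair belongs to which sequence.
PRIMERS = {
    "SEQ ID NO:78": "cattctcagtccatgtgga",
    "SEQ ID NO:79": "gcaggctactcatgacacta",
    "SEQ ID NO:86": "ctacctcttcccttcccttccttc",
    "SEQ ID NO:87": "tgggacactgcctgtatgat",
}


def base_counts(seq: str) -> dict:
    """Count A, C, G and T in a primer, rejecting any other character."""
    seq = seq.lower()
    if set(seq) - set("acgt"):
        raise ValueError("primer contains characters other than a, c, g, t")
    return {base: seq.count(base) for base in "acgt"}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the primer."""
    counts = base_counts(seq)
    return (counts["g"] + counts["c"]) / len(seq)


def wallace_tm(seq: str) -> int:
    """Wallace-rule melting temperature estimate in degrees Celsius."""
    counts = base_counts(seq)
    return 2 * (counts["a"] + counts["t"]) + 4 * (counts["g"] + counts["c"])


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, Tm ~{wallace_tm(seq)} C")
```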
  • PCR was performed by using 1 µl of genomic DNA (100 ng) in a 50 µl reaction volume under the following conditions: 94°C, 2 min (thirty cycles of 94°C, 30 s; 60°C, 10 s; 68°C, 9 min); 72°C, 7 min; 4°C, hold.
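  • The cycling conditions above can be written down as a small data structure, which makes the programmed run time easy to check. This is a sketch under stated assumptions: the class and function names are invented for the example, and the total ignores ramp times and the final 4°C hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # programmed hold time


# Program as described in the text for the genomic PCR.
INITIAL = Step(94, 2 * 60)
CYCLE = [Step(94, 30), Step(60, 10), Step(68, 9 * 60)]
N_CYCLES = 30
FINAL = Step(72, 7 * 60)


def programmed_seconds(initial: Step, cycle: list, n_cycles: int, final: Step) -> int:
    """Sum of programmed hold times, excluding ramping and the final hold."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle) + final.seconds


if __name__ == "__main__":
    total = programmed_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
    print(f"{total} s = {total / 60:.0f} min of programmed hold time")
```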
  • RNAs from mouse (brain, testis, liver, and heart) and human (brain, testis, liver, and skeletal muscle) tissues were used for screening SPANX expression with the primers described in Table 1.
  • Complementary DNA (cDNA) was made from 1 µg of total RNA using the Superscript first-strand system kit (Invitrogen) and priming with oligo(dT) pursuant to the manufacturer's standard protocol.
  • Human β-actin primers (BD Biosciences Clontech) were used as positive controls for both human and mouse RT-PCR.
  • NCI-60 cancer cell lines were from the National Cancer Institute.
  • RT-PCR was performed by using 1 µl of cDNA or 1 µl of genomic DNA in a 50-µl reaction volume. Standard reaction conditions were as follows: 94°C, 5 min (35 cycles of 94°C, 1 min; 55°C, 1 min; 72°C, 1 min); 72°C, 7 min; and 4°C, hold.
  • TAR cloning experiments were carried out as described in Kouprina, N. & Larionov, V. (2003) FEMS Microbiol. Rev. 27, 1-21.
  • the TAR vector was constructed by using pVC604.
  • the vector contained 5' 164-bp and 3' 187-bp targeting sequences, specific to the unique sequences flanking SPANX-C. These targeting sequences were amplified from human genomic DNA with specific primers (Table 1). The 5' and 3' targeting sequences correspond to positions 39,708-39,872 and 122,818-123,004 in the bacterial artificial chromosome (BAC) (AL109799). Before use in TAR cloning experiments, the vector was linearized with SpIiI. Genomic DNAs were prepared from primate tissue culture lines (Coriell Institute for Medical Research). To identify clones positive for LDOC1, yeast transformants were examined by PCR by using a pair of diagnostic primers (Table 1).
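  • The BAC coordinates of the two targeting sequences determine both their lengths and the size of the segment they bracket. The sketch below assumes 1-based, inclusive coordinates, which is an assumption about the convention used: under it the 5' interval comes out one base longer than the quoted 164 bp, while the 3' interval matches the quoted 187 bp, and the bracketed span is roughly 83 kb.

```python
# Positions in BAC AL109799 as quoted in the text (assumed 1-based, inclusive).
FIVE_PRIME_TARGET = (39_708, 39_872)
THREE_PRIME_TARGET = (122_818, 123_004)


def inclusive_length(start: int, end: int) -> int:
    """Number of bases in an interval when both endpoints are counted."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def bracketed_span(upstream: tuple, downstream: tuple) -> int:
    """Length from the first base of the upstream hook to the last base of the downstream hook."""
    return inclusive_length(upstream[0], downstream[1])


if __name__ == "__main__":
    print("5' hook:", inclusive_length(*FIVE_PRIME_TARGET), "bp")
    print("3' hook:", inclusive_length(*THREE_PRIME_TARGET), "bp")
    print("bracketed segment:", bracketed_span(FIVE_PRIME_TARGET, THREE_PRIME_TARGET), "bp")
```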
  • the yield of LDOC1-positive clones from African ape genomic DNAs was the same as with human DNA (1%).
  • the size, Alu profiles, and retrofitting of yeast artificial chromosomes (YACs) into BACs were determined as described in Kouprina, N. & Larionov, V. (2003) FEMS Microbiol. Rev. 27, 1-21. Alu profiles of three independent TAR isolates for each species were indistinguishable. These results strongly suggest that the isolated YACs contain non-rearranged genomic segments.
  • the human SPANX-N1 to -N5 gene sequences have accession numbers AY825029-AY825033.
  • Protein secondary structure was predicted by using the PHD program (see website at cubic.bioc.columbia.edu/predictprotein), with a multiple sequence alignment submitted as a query. See Rost, B., Sander, C. & Schneider, R. (1994) Comput. Appl. Biosci. 10, 53-60.
  • the phylogenetic tree was constructed by using the neighbor-joining method (Saitou, N. & Nei, M. (1987) Mol. Biol. Evol.
  • The SPANX Family: Identification of a Second Subfamily in Primates and a Single SPANX Gene in Rodents.
  • primate genomic segments homologous to human SPANX were isolated and characterized.
  • the SPANX regions from five species (chimpanzee, gorilla, orangutan, rhesus macaque, and tamarin) were amplified by using a set of primers developed from the conserved 5' and 3' flanking sequences of human SPANX genes (see Materials and Methods).
  • PCR products with a size predicted for the SPANX-A/D genes (1.2 kb) were obtained only from African apes (FIG. 1).
  • Expansion and variability of exon 2 size are due to the presence of a 39-bp mini-satellite sequence at its 5' end. Similar amplification of mini-satellites in exons without disruption of the ORF has been previously described for other genes. See Yang et al. (2000) Am. J. Med. Genet. 95, 385-390; Lievers et al. (2001) Eur. J. Hum. Genet. 9, 583-589. End sequencing of 9.0-kb primate clones revealed significant sequence similarity to exon 1 and exon 2 of the 1.4-kb clones. The 1.4- and 9.0-kb clones differed in that the latter contained a second LTR upstream of exon 2.
  • the ERV-containing genes were classified as a second SPANX subfamily, which the inventors have named SPANX-N.
  • the SPANX-N genes encode proteins with predicted polypeptide sequences of 72 amino acids (SPANX-N1), 180 amino acids (SPANX-N2), 141 amino acids (SPANX-N3), 159 amino acids (SPANX-N4), and 72 amino acids (SPANX-N5).
  • a search of the GenBank database revealed two regions of significant similarity to human SPANX-N in the mouse and rat genomes. Both mouse and rat SPANX-N homologs are previously unannotated genes; the expression of the mouse gene was supported by the detection of eight ESTs in Database of Expressed Sequence Tags (dbEST) (BU939216, CA463062, CA464820, CB273391, BX635129, BC048649, CB273391, and BU946237).
  • the mouse and rat genes encode, respectively, 87 amino acid and 115 amino acid proteins with 28-36% amino acid identity to human SPANX-N genes.
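  • Percent amino acid identity of the kind quoted above is computed over the columns of a pairwise alignment. The sketch below shows one common convention (identical residues divided by columns in which neither sequence has a gap); the convention actually used for the 28-36% figures is not stated here, and the two aligned strings are invented toy sequences, not SPANX sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments used only to exercise the function.
    print(f"{percent_identity('MDKQ-SS', 'MEKQASS'):.1f}% identity")
```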
  • the mouse gene contains a 250-bp intron that shares about 65% identity with the primate SPANX intron.
  • the smaller size of the intron is due to the absence of the ERV sequence or the LTR, which is present in all primate SPANX genes.
  • This murine SPANX homolog appears to be a single murine ortholog of human SPANX-N1-N4 because it shows the closest similarity to the SPANX-N genes and is located in the mouse chromosome X region syntenic to SPANX-N1-N4.
  • the SPANX-N subfamily is apparently represented not only in all primates but also in rodents, whereas the SPANX-A/D genes appear to be present exclusively in the African great apes and humans.
  • the most prominent conserved sequence feature of the SPANX family is the central hydrophobic patch ending with an arginine. Secondary structure prediction suggested that the central conserved region formed a β-hairpin with a strongly hydrophobic proximal strand, followed by an α-helix. The rest of the protein seems to have a disordered structure with few residues conserved throughout the family but with considerable conservation within subfamilies and a marked preponderance of charged and polar residues.
  • the bipartite nuclear localization signal that has been previously detected in the SPANX-A/D subfamily (Zendman et al. (2003) Gene 309, 125-133) is conserved in most of the SPANX-N proteins, with the exception of SPANX-N2 and -N4 but not in the rodent sequences; however, the latter contain a putative monopartite nuclear localization signal.
  • the presence of a small globular core embedded in apparently disordered structure suggests that SPANX protein monomers may be unstable and is compatible with the reported dimer formation. See Westbrook et al. (2001) Biol. Reprod. 64, 345-358; Westbrook et al. (2004) Clin. Cancer Res. 10, 101-112.
  • SPANX-N Genes are Expressed in Normal Testis and in Melanoma Cell Lines.
  • expression of these genes was analyzed in a panel of normal tissues.
  • a 264-bp band of expected size was detected only in testis (FIG. 3A).
  • Sequencing of the RT-PCR products confirmed the identity of these transcripts to the SPANX-N2 and -N3 genes.
  • the amplified sequences corresponded to two ESTs in dbEST (BU569937 and BF967778).
  • Similar experiments with a panel of normal tissues from mice also detected expression of the mouse SPANX gene only in testis (FIG. 3B).
  • the exclusive expression of these genes in normal testis correlated with the conservation of the promoter region, which contained two recognition sites for testis-specific transcription factors (FIG. S).
  • SPANX-N expression was also examined in the NCI-60 panel of cancer cell lines that represent nine different types of cancers. See Zendman et al. (2003) Gene 309, 125-133. RT-PCR products of SPANX-N2 or -N3 of the expected size were detected only in a melanoma cell line (Table 3).
  • NSCLC non-small cell lung cancer
  • *LOX IMVI cell line was derived from a malignant amelanotic melanoma.
  • The SPANX-A/D subfamily is also expressed in the same line.
  • Co-expression of members of the two SPANX gene subfamilies is not surprising because of the remarkable conservation of the promoter sequences.
  • expression profile analysis indicates that the SPANX-N subfamily, similar to the SPANX-A/D subfamily, consists of cancer/testis antigen (CTA) genes.
  • CTA cancer/testis antigens
  • the rate of evolution of SPANX genes is outstanding even among reproductive proteins.
  • the highest level of conservation between rodent SPANX proteins and human SPANX-N family members is about 36%, substantially less than the values observed for most testis-associated proteins and about the same as for transition protein 2, the most rapidly evolving among analyzed human and mouse orthologs. Makalowski, W. & Boguski, M. S. (1998) J. Mol. Evol. 47, 119-121; Swanson, W. J. & Vacquier, V. D. (2002) Nat. Rev. Genet. 3, 137-144.
  • the dN/dS ratio for SPANX genes was typically close to 1 (Table 4), which normally would be interpreted as evolution under substantially relaxed purifying selection, perhaps near-neutral evolution.
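  • The conventional reading of the ratio of nonsynonymous to synonymous substitution rates can be sketched as follows. The tolerance band around 1 and the example rate values are arbitrary assumptions made for illustration; they are not taken from Table 4, and real analyses rely on statistical tests rather than a fixed cutoff.

```python
def rate_ratio(dn: float, ds: float) -> float:
    """Ratio of nonsynonymous (dn) to synonymous (ds) substitution rates."""
    if dn < 0 or ds <= 0:
        raise ValueError("dn must be non-negative and ds positive")
    return dn / ds


def interpret(dn: float, ds: float, tolerance: float = 0.2) -> str:
    """Textbook interpretation of the ratio; the tolerance is an assumed, arbitrary band."""
    ratio = rate_ratio(dn, ds)
    if ratio < 1 - tolerance:
        return "purifying selection"
    if ratio > 1 + tolerance:
        return "possible positive selection"
    return "near-neutral or relaxed purifying selection"


if __name__ == "__main__":
    # Hypothetical substitution rates, chosen only to exercise each branch.
    for dn, ds in [(0.05, 0.5), (0.48, 0.5), (1.0, 0.5)]:
        print(f"dn={dn}, ds={ds}: {interpret(dn, ds)}")
```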
  • Table 5. Mean evolutionary distances for the 5' flanking regions, the intron, and the coding sequences of the SPANX genes.
  • One highly unusual feature of the SPANX family is that both synonymous and nonsynonymous positions in the coding sequences of many SPANX genes evolved much faster than the noncoding sequences of the 5' UTR and the intron (Tables 4 and 5). This anomalous mode of evolution was detected both among the closely related paralogs within the SPANX-A/D and -N subfamilies and in intersubfamily comparisons. Most of the intron sequences do not seem to contain specific functional signals and are believed to evolve
  • SPANX-N genes lacking the ERV and containing only a solo LTR in their intron apparently evolved independently in New World monkeys and great apes via duplications accompanied by homologous recombination between the ERV LTRs.
  • Other SPANX-N duplications left the ERV intact, as illustrated by the existence of four ERV-containing SPANX-N genes in humans (the exact number of such genes in apes and monkeys remains to be determined).
  • the emergence of the SPANX-A/D gene subfamily appears to be a more recent event, subsequent to the separation of the hominoid lineage from orangutan.
  • this subfamily evolved via duplication of one of the SPANX-N genes accompanied by deletion of the distal part of exon 2 and rapid divergence (FIG. 4).
  • the phylogenetic tree of the SPANX-A/D subfamily is most compatible with independent amplification of these genes in gorilla, chimpanzee, and humans (FIG. 5).
  • the SPANX-C locus was chosen as a target. Because SPANX-C resides within an approximate 20-kb segmental duplication, the targeting sequences in the TAR vector were designed from unique sequences flanking SPANX-C. The vector efficiently clones an 83-kb human genomic segment containing SPANX-C and LDOC1 genes.
  • SPANX-C flanking sequences The size difference is due to the absence of the 20-kb internal sequence containing the SPANX-C gene in African great apes. Partial sequencing of the chimpanzee clone revealed a similar organization of this syntenic region. Because this 20-kb sequence corresponds to a series of segmental duplications in chromosome X, it appears that at least the duplication that yielded SPANX-C occurred only in the human lineage (the alternative would require independent deletion of the same region in the gorilla, bonobo, and chimpanzee lineages, a highly unlikely event).
  • S6 duplication is likely not polymorphic, because a SPANX-C null allele was not detected in a human population analysis that involved 200 individuals by PCR by using specifically designed primers (Table 2).
  • a detailed sequence analysis of the gorilla TAR clone showed that its greater length, compared with the bonobo and chimpanzee sequence, was caused by a 3.4-kb insertion.
  • This insertion contains an ORF homologous to several human ESTs (AL136558).
  • the human gene corresponding to these ESTs consists of eight exons and spans about 30 kb on chromosome 3.
  • the intron-less insert in gorilla appears to represent a reverse-transcribed duplication of this gene, most likely a retropseudogene.
  • This Example provides genomic sequences for several SPANX-N genes, as well as information about where SPANX-N polypeptides are encoded in the genomic sequences.
  • the SPANX-N1 genomic sequence (SEQ ID NO:92) is provided below.
  • the SPANX-N1 polypeptide is encoded by nucleotides 119 to 253 and 8263 to 8406.
  • SPANX-N2 genomic sequence (SEQ ID NO:93) is provided below.
  • the SPANX-N2 polypeptide is encoded by nucleotides 179 to 256 and 8342 to 8806.
  • SPANX-N3 genomic sequence (SEQ ID NO:94) is provided below.
  • the SPANX-N3 polypeptide is encoded by nucleotides 179 to 256 and 8407 to
  • SPANX-N4 genomic sequence (SEQ ID NO: 95) is provided below.
  • the SPANX-N4 polypeptide is encoded by nucleotides 181 to 258 and 8191 to
  • the SPANX-N5 genomic sequence (SEQ ID NO:96) is provided below.
  • the SPANX-N5 polypeptide is encoded by nucleotides 174 to 248 and 891 to 1034.
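  • The exon coordinates given for the complete entries above can be converted into a predicted polypeptide length. The sketch assumes the coordinates are 1-based and inclusive and that each coding range ends with a stop codon; under those assumptions the SPANX-N2 and SPANX-N5 coordinates give 180 and 72 residues, in line with the polypeptide lengths stated earlier in this document. The function names are invented for the example.

```python
# Coding ranges as quoted in the text (assumed 1-based, inclusive, stop codon included).
CODING_RANGES = {
    "SPANX-N2": [(179, 256), (8342, 8806)],
    "SPANX-N5": [(174, 248), (891, 1034)],
}


def cds_length(exons: list) -> int:
    """Total number of coding nucleotides across all exon ranges."""
    return sum(end - start + 1 for start, end in exons)


def protein_length(exons: list) -> int:
    """Number of amino acids, assuming the last codon of the CDS is a stop codon."""
    nucleotides = cds_length(exons)
    if nucleotides % 3 != 0:
        raise ValueError("coding length is not a multiple of three")
    return nucleotides // 3 - 1


if __name__ == "__main__":
    for gene, exons in CODING_RANGES.items():
        print(f"{gene}: {cds_length(exons)} nt -> {protein_length(exons)} amino acids")
```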
  • Prostate Cancer X-chromosome region DNA from these cell lines was used for PCR analysis. Genomic DNA from 40 normal individuals used as eligible controls (Caucasians) was purchased from Coriell Institute for Medical Research (Camden, NJ). Melanoma cell lines LoxMVI, 537 MEL, 938 MEL and 888 MEL were obtained from the National Cancer Institute, NIH. Melanoma cell line VMM150 was derived from a tumor digest obtained from a patient at the
  • RNA from normal adult human tissues prostate, placenta, proximal and distal colon, lung, and cervix
  • matching normal/tumor tissue pairs Ambion, Austin, TX
  • melanoma cell lines and primary uterine tumors was used for screening SPANX-N expression with the primers described in Table 9.
  • N1ex2-F 5'-agggaagtgaatacaccaga-3' (SEQ ID NO:141)
  • SeqN2ex2-F 5'-taacaggtgaccctacccat-3' (SEQ ID NO:143)
  • N2ex2-F 5'-tgagcgagtactccagaga-3' (SEQ ID NO:149) N2ex2-R 5'-ctggttgtgacgtactatact-3' (SEQ ID NO:150) Sequencing

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Immunology (AREA)
  • Genetics & Genomics (AREA)
  • Analytical Chemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Pathology (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Oncology (AREA)
  • Hospice & Palliative Care (AREA)
  • Peptides Or Proteins (AREA)
  • Medicines Containing Antibodies Or Antigens For Use As Internal Diagnostic Agents (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides SPANX polypeptides, nucleic acids and antibodies that are useful for detecting and treating cancer. The invention relates to a new subfamily of SPANX genes, SPANX-N1, SPANX-N2, SPANX-N3, SPANX-N4 and SPANX-N5. In contrast to the previously identified SPANX-A/D subfamily, expression of SPANX-N genes is not restricted to testis, and was detected in normal prostate, placenta, lung, colon, ovary and cervix tissues. Comparison of matched normal/tumor tissues revealed a predominant activation of only one member of the subfamily, SPANX-N1, in tumors, indicating that expression of this gene is diagnostic of cancer. Methods and reagents described herein permit detection and treatment of cancers associated with SPANX-N1 expression.

Description

CANCER-SPECIFIC SPANX-N MARKERS
This application claims benefit of the filing date of U.S. Provisional Ser. No. 60/636,811, filed December 15, 2004, the contents of which are incorporated by reference herein.
Government Funding
The invention described herein was developed with support from the National Cancer Institute. The U.S. Government has certain rights in the invention.
Field of the Invention
The invention relates to SPANX-N genes, a new cluster of genes whose expression is detected in few normal adult tissues, with the highest expression in normal testis tissues. Some of the SPANX-N genes are also highly expressed in tumor tissues, including prostate, melanoma, uterine and cervical cancer tissues. The invention therefore provides SPANX nucleic acids, polypeptides and antibodies useful for detecting and treating prostate cancer.
Background of the Invention
According to the National Cancer Institute, since 1990 over 17 million people have been diagnosed with cancer, and an additional 1,334,100 new cancer cases are expected to be diagnosed in 2003. About 556,500 Americans are expected to die of cancer in 2003, more than 1500 people every day. Cancer is the second leading cause of death in the United States, exceeded only by heart disease. The National Institutes of Health estimate the overall costs of cancer in the year 2002 at $171.6 billion (Cancer Facts & Figures, 2003). Clearly, cancer is an enormous problem, and new methods for detecting and treating cancer are needed.
For example, prostate cancer is the most common cancer, occurring in as many as 15% of men in the United States. Approximately 330,000 new cases are diagnosed annually. Prostate cancer kills about 40,000 men in the United States each year and is second only to lung cancer in mortality to men. Castration, treatment with anti-androgens, and prostatectomy with its associated urogenital risk, are all treatments that can seriously compromise the quality of life for men diagnosed too late for less drastic prostate cancer treatments. Hence, early detection and treatment is critical.
Currently, serum prostate-specific antigen (PSA), a serine protease, level and prostate digital rectal exams are the only early diagnostic tests in routine use for screening for prostate cancer. However, small, aggressive tumors can be missed by digital rectal exams and even by needle biopsy, and only modest increases in prostate-specific antigen, i.e., below the 4 ng/mL threshold between normal and elevated PSA levels, are generated by these tumors. These aggressive tumors have the potential to suddenly dedifferentiate, grow, spread, and metastasize rapidly.
In addition to such lethal false negatives, PSA testing is also plagued by false positives, causing unneeded tests, medical expenses and needless worry amongst patients. An NCI fact sheet indicates that 80% of men above 50 years old who have PSA test levels above 4 ng/mL will turn out to not have prostate cancer. See NCI Fact Sheet 5.29, Jan. 11, 2001, at cis.nci.gov. The NCI Fact Sheet also indicates that a need exists for a prostate cancer screen with an improved ability to differentiate between prostate cancer and benign conditions such as prostatitis, benign prostatic hypertrophy (BPH), inflammation and infection. A need also exists for prostate cancer screens that can differentiate between slow-growing and fast-growing cancers.
Similarly, cancer of the cervix is one of the most common malignancies in women and remains a significant public health problem throughout the world. In the United States alone, invasive cervical cancer accounts for approximately 19% of all gynecological cancers. In 1996, it was estimated that there were 14,700 newly diagnosed cases and 4900 deaths attributed to this disease (American Cancer Society, Cancer Facts & Figures 1996, Atlanta, Ga.: American Cancer Society, 1996). In many developing countries, where mass screening programs are not widely available, the clinical problem is more serious. Worldwide, the number of new cases is estimated to be 471,000 with a four-year survival rate of only 40% (Munoz et al., 1989, Epidemiology of Cervical Cancer In: "Human Papillomavirus", New York, Oxford Press, pp 9-39; National Institutes of Health, Consensus Development Conference Statement on Cervical Cancer, Apr. 1-3, 1996).
Cervical cancer is detected by cellular diagnosis conducted by scrubbing a cervical surface with a cotton swab or a scrubber, immediately smearing the scrubbed cells on a slide glass to prepare a sample and observing the sample under a microscope or the like. The diagnosis is then performed by observing the form of the cells under a microscope and involves examination of each sample by a cytotechnologist. Therefore, a need exists for improved accuracy and speed in processing samples so that cervical cancer can be detected and treated before malignancy develops.
Therefore, new methods are needed for detecting and treating cancer, including prostate cancer, melanoma, uterine cancer and cervical cancer.
Summary of the Invention
The invention provides polypeptides and nucleic acids that are expressed in cancer cells and that can act as cancer markers.
Hence, one aspect of the invention is an isolated polypeptide having an amino acid sequence corresponding to any one of SEQ ID NO:1-5. In some embodiments, the isolated polypeptide is a SPANX-N1 polypeptide with SEQ ID NO:1.
Another aspect of the invention is an isolated nucleic acid encoding a polypeptide having an amino acid sequence corresponding to any one of SEQ ID NO:1-5. In some embodiments, the nucleic acid encodes an isolated SPANX-N1 polypeptide with SEQ ID NO:1.
Another aspect of the invention is an isolated nucleic acid comprising a SPANX-N promoter that includes one of the following nucleotide sequences: SEQ ID NO:206-210. In some embodiments, the promoter is a SPANX-N1 promoter (SEQ ID NO:206), which can promote expression in a variety of cancer cells and tumor tissue types.
Another aspect of the invention is an expression cassette that includes a nucleic acid encoding a therapeutic gene product operably linked to a SPANX-N promoter of the invention.
Another aspect of the invention is an isolated antibody that can bind to a polypeptide having an amino acid sequence corresponding to any one of SEQ ID NO:1-5. For example, the antibody can bind to a SPANX-N peptide consisting essentially of any one of SEQ ID NO:12-25, or 136.
Another aspect of the invention is an isolated SPANX-N-specific nucleic acid. For example, in one embodiment, the invention provides a SPANX-N-specific probe or primer with any one of SEQ ID NO:37, 38, 145-176. In some embodiments, the isolated SPANX-N primer or probe has any one of SEQ ID NO:145-152. In another embodiment, the invention provides SPANX-N nucleic acids with any one of SEQ ID NO:26-30.
Another aspect of the invention is an isolated nucleic acid that can inhibit the function of a SPANX mRNA. For example, the nucleic acids that can inhibit the function of a SPANX mRNA include DNA or RNA molecules that can hybridize to a nucleic acid encoding a SPANX-N polypeptide having an amino acid sequence comprising any one of SEQ ID NO: 1-5. The nucleic acid can be a small interfering RNA (siRNA), ribozyme, or antisense nucleic acid.
Another aspect of the invention is an siRNA that consists essentially of a double-stranded RNA with any one of SEQ ID NO:39-61.
Another aspect of the invention is a method for detecting cancer that involves contacting a non-testis tissue sample with a SPANX-N probe and observing whether an mRNA or cDNA in the sample hybridizes to the SPANX-N probe, wherein the SPANX-N probe comprises any one of SEQ ID NO:26-30, 37, 38, 145-176, or a combination thereof.
Another aspect of the invention involves a method for detecting cancer comprising performing nucleic acid amplification of RNA from a non-testis tissue sample using SPANX-N primers consisting essentially of SEQ ID NO:37 and 38, and observing whether a SPANX-N nucleic acid fragment is amplified. For example, the SPANX-N nucleic acid fragment can be about 240 to about 290 base pairs in length, or about 260-270 base pairs in length.
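The amplification check described above can be sketched in silico. The following Python example is only an illustration under stated assumptions: the primer and template sequences are hypothetical toy strings (they are not SEQ ID NO:37 or 38), the function names are invented for this sketch, and exact string matching stands in for primer annealing. It locates a forward primer and the reverse complement of a reverse primer on a template and tests whether the resulting product falls within the 240 to 290 base pair range mentioned in the text.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def amplicon_length(template: str, forward: str, reverse: str):
    """Length of the product bounded by the two primers, or None if a primer is not found."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement is searched for downstream of the forward primer.
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


def in_expected_range(length, low: int = 240, high: int = 290) -> bool:
    """True if a product was found and its size lies within [low, high] bp."""
    return length is not None and low <= length <= high


# Hypothetical toy primers and template, for illustration only.
forward_primer = "ACGTTGCA"
reverse_primer = "GGATCCTA"
template = "TT" + forward_primer + "A" * 250 + reverse_complement(reverse_primer) + "CC"

size = amplicon_length(template, forward_primer, reverse_primer)
print(size, in_expected_range(size))
```

In practice the fragment size is read from a gel or determined by sequencing; this sketch only shows how an expected size follows from primer positions on a known template.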
Another aspect of the invention involves a method for detecting cancer comprising performing nucleic acid amplification of RNA from a non-testis tissue sample using SPANX-N1 primers consisting essentially of SEQ ID NO:145 and 146, or SEQ ID NO:151 and 152, and observing whether a nucleic acid fragment is amplified.
Another aspect of the invention involves a method for detecting cancer comprising contacting a non-testis tissue sample with an anti-SPANX-N antibody and observing whether a complex forms between the antibody and a SPANX-N polypeptide. The SPANX-N polypeptide can have any one of SEQ ID NO:1-5. The antibody can bind to any SPANX-N peptidyl epitope. However, in some embodiments, the peptide epitope is a SPANX-N epitope that includes SEQ ID NO:136.
Another aspect of the invention is a method for treating cancer in a mammal comprising administering to the mammal an effective amount of an antibody that can bind to a SPANX-N peptide consisting of SEQ ID NO:136.
Another aspect of the invention is a method for treating cancer in a mammal comprising administering to the mammal an effective amount of a nucleic acid that encodes an anti-cancer agent operably linked to a SPANX-N1 promoter comprising SEQ ID NO:206. The anti-cancer agent can, for example, be a cytokine, interferon, hormone, cell growth inhibitor, cell cycle regulator, apoptosis regulator, cytotoxin, cytolytic viral product, or antibody.
Another aspect of the invention involves a method for treating cancer comprising administering to a mammal an effective amount of a nucleic acid that can inhibit the function of a SPANX-N mRNA, wherein the nucleic acid comprises a DNA or RNA that can hybridize to a mRNA encoding a SPANX-N polypeptide having an amino acid sequence comprising any one of SEQ ID NO:1-5. For example, the SPANX-N mRNA can be complementary to any one of SEQ ID NO:26-30. The nucleic acid can be a small interfering RNA (siRNA), ribozyme, or antisense nucleic acid. For example, the nucleic acid sequence can consist essentially of SEQ ID NO:49-61.
Another aspect of the invention is a method to identify an agent that modulates SPANX-N expression. This method involves contacting a test cell with a candidate agent and determining if the candidate agent increases or decreases expression of a SPANX-N gene in the test cell when compared to expression of the SPANX-N gene in a control cell that was not contacted with the candidate agent. The agent can increase or decrease SPANX-N expression.
Description of the Drawings
FIG. 1 illustrates the sizes of DNA fragments from the SPANX family from chimpanzee (African great apes), orangutan (great apes), rhesus macaque (Old World monkeys), and tamarin (New World monkeys), which can be amplified by polymerase chain reaction. Oligonucleotide primers were designed to be complementary to sequences within the promoter and 3' non-coding regions. The double upper bands for rhesus macaque and tamarin are presumably due to polymorphism in paralogs.
FIG. 2 schematically illustrates the location of the SPANX family genes on human chromosome X. Five members of the SPANX-A/D subfamily, SPANX-A1, SPANX-A2, SPANX-B, SPANX-C, and SPANX-D, are clustered within an approximate 800-kb region at Xq27.2. Four members of the SPANX-N subfamily, SPANX-N1 (positions 142995930-143005820), SPANX-N2 (positions 141495326-141490625), SPANX-N3 (positions 141297635-141291834), and SPANX-N4 (positions 140806882-140816198), are located about 2 Mb apart from the SPANX-A/D cluster (see, University of California, Santa Cruz, July 2003, website at genome.ucsc.edu). SPANX-N5 (positions 51791606-51793934) is located on the short arm of the chromosome at Xp11. SPANX genes reside within large segmental duplications across the chromosome, where each duplication is homologous to the others.
FIG. 3A-B are images of gels illustrating SPANX-N gene expression in human and mouse normal tissues as detected by reverse transcription-polymerase chain reaction (RT-PCR). FIG. 3A shows the cDNA prepared from a panel of human tissue mRNAs. Oligonucleotide primers employed were from exons 1 and 2 of the genomic sequence and designed to amplify putative transcripts. A 264-bp band of the expected size was observed only in testis. Two members of the human SPANX-N subfamily, SPANX-N2 and SPANX-N3, were detected upon cloning and sequencing of PCR products. FIG. 3B illustrates cDNA bands prepared from a panel of mouse tissue mRNAs. Oligonucleotide primers employed were from exons 1 and 2 of the genomic sequence and designed to amplify a putative transcript. A 264-bp band of the expected size was observed only in testis. Control PCR assays were carried out with the same samples by using actin-specific primers.
FIG. 4A is a schematic diagram illustrating a hypothetical evolutionary tree for the SPANX gene family. The expansion of SPANX genes is superimposed on the tree of primate evolution. FIG. 4B provides a schematic diagram of the genomic changes that may have occurred during evolution of the different SPANX genes.
FIG. 5 provides a neighbor-joining tree of primate SPANX genes. Noncoding regions (5' flanking regions and introns) were used for the tree reconstruction. The numbers at the interior branches indicate the percentage of 5,000 bootstrap pseudo-replicates that support the respective fork.
FIG. 6A-B provides analyses of the affinity-purified anti-EQPT antibodies prepared against peptide sequence EQPTSSTNGEKRKSPCESNN (positions 2-21, SEQ ID NO:136) from the SPANX-N sequence. In FIG. 6A, recombinant SPANX proteins were expressed in the pMAL-p2X bacterial expression vector. The proteins were purified as fusions with MBP by affinity chromatography. Lane 1 contained SPANX-N2 (uninduced) proteins. Lane 2 contained SPANX-N2 (induced) proteins. Lane 3 contained SPANX-B (induced) proteins. Lane 4 contained SPANX-C (induced) proteins. Lane 5 contained a 10-200 kDa ladder of molecular weight markers. Lane 6 contained SPANX-B purified proteins. Lane 7 contained SPANX-C purified proteins. Lane 8 contained SPANX-N2 purified (5 µl) proteins. Lane 9 contained SPANX-N2 purified (10 µl) proteins. In FIG. 6B, after separation of the recombinant proteins by SDS-PAGE, the gel was immunoblotted using anti-SPANX-N (EQPT) (lanes 1, 2 and 3) antibodies. The antibodies exhibit a high specificity for the SPANX-N protein (lane 3).
FIG. 7A-C illustrates expression of SPANX-N genes in normal and cancer tissues. SPANX-N expression was determined in normal tissues (FIG. 7A), in primary uterine tumors and melanoma cell lines (FIG. 7B), and in normal and tumor pairs (FIG. 7C). For each SPANX-N allele, the number of cloned and sequenced RT-PCR products constituted a fraction of gene-specific transcripts in a tissue. The percentage reflected the number of tumor-associated clones with a gene-specific insert (Table 10). SPANX-N1 bars are horizontally (-) cross-hatched, SPANX-N2 bars are diagonally (/) cross-hatched, SPANX-N3 bars are vertically (|) cross-hatched, SPANX-N4 bars are diagonally (\) cross-hatched, and SPANX-N5 bars are double cross-hatched (x).
FIG. 8A-B provides a comparison of human SPANX-A/D and SPANX-N promoter sequences. The detected transcription start sites and the translation initiation codons (ATG) are indicated. Noncoding sequences are in lowercase. SPANX-N copies differ from SPANX-A/D genes by the almost complete lack of all CpG dinucleotides in the promoter regions (CG above sequences). However, these CpGs are perfectly preserved in all of the SPANX-A/D copies. Another difference is the presence of the Sp1 binding consensus in four SPANX-N copies.
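The CpG comparison described for FIG. 8 amounts to counting CG dinucleotides in each promoter sequence. The Python sketch below shows one simple way to do that. The two sequences are hypothetical toy strings chosen only to contrast a CpG-rich and a CpG-depleted region; they are not the SPANX promoter sequences, and the function names are assumptions made for this illustration.

```python
def count_cpg(seq: str) -> int:
    """Count CG dinucleotides (CpG sites) on the given strand, ignoring case."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def cpg_per_100bp(seq: str) -> float:
    """CpG sites per 100 bases of sequence (0.0 for an empty sequence)."""
    return 100.0 * count_cpg(seq) / len(seq) if seq else 0.0


# Hypothetical toy sequences, for illustration only.
cpg_rich = "ACGTCGCGATCGA"
cpg_poor = "ACATTGCAATCAA"

for name, seq in (("CpG-rich", cpg_rich), ("CpG-poor", cpg_poor)):
    print(name, count_cpg(seq), round(cpg_per_100bp(seq), 1))
```

Lowercase input is accepted because noncoding sequence is shown in lowercase in the figure.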
Detailed Description of the Invention
The invention relates to nucleic acids, polypeptides and antibodies useful for detecting cancer. The invention also provides nucleic acids that can modulate the function of SPANX mRNA transcripts and antibodies that can modulate the function of SPANX gene products. In another embodiment, the invention relates to SPANX-N promoters that can express gene products in a tissue-specific manner, for example, in cancer cells where the promoter is generally active.
Definitions
The term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form, composed of monomers (nucleotides) containing a sugar, phosphate and a base that is either a purine or pyrimidine. Unless specifically limited, the term encompasses nucleic acids containing known analogs of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucl. Acids Res., 19, 508 (1991); Ohtsuka et al., JBC, 260, 2605 (1985); Rossolini et al., Mol. Cell. Probes, 8, 91 (1994)).
The terms "nucleic acid," "nucleic acid molecule," "nucleic acid fragment," "nucleic acid sequence or segment," or "polynucleotide" are used interchangeably and may also be used interchangeably with gene, cDNA, DNA and RNA encoded by a gene. In some embodiments, "nucleic acid fragment" is a portion of a given nucleic acid molecule. Deoxyribonucleic acid (DNA) in the majority of organisms is the genetic material while ribonucleic acid (RNA) is involved in the transfer of information contained within DNA into proteins.
The invention encompasses isolated or substantially purified nucleic acid, peptide or polypeptide compositions. In the context of the present invention, an "isolated" or "purified" DNA molecule or RNA molecule or an "isolated" or "purified" polypeptide is a DNA molecule, RNA molecule, or polypeptide that exists apart from its native environment and is therefore not a product of nature. An isolated DNA molecule, RNA molecule or polypeptide may exist in a purified form or may exist in a non-native environment such as, for example, a transgenic host cell. For example, an "isolated" or "purified" nucleic acid molecule or protein, or biologically active portion thereof, is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. In one embodiment, an "isolated" nucleic acid is free of sequences that naturally flank the nucleic acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleotide sequences that naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. A protein that is substantially free of cellular material includes preparations of protein or polypeptide having less than about 30%, 20%, 10%, or 5% (by dry weight) of contaminating protein. When the protein of the invention, or biologically active portion thereof, is recombinantly produced, preferably culture medium represents less than about 30%, 20%, 10%, or 5% (by dry weight) of chemical precursors or non-protein-of-interest chemicals. 
Fragments and variants of the disclosed nucleotide sequences and proteins or partial-length proteins encoded thereby are also encompassed by the present invention. By "fragment" or "portion" is meant a full length or less than full length of the nucleotide sequence encoding, or the amino acid sequence of, a polypeptide or protein.
The term "gene" is used broadly to refer to any segment of nucleic acid associated with a biological function. Thus, genes include coding sequences and/or the regulatory sequences required for their expression. For example, "gene" refers to a nucleic acid fragment that expresses mRNA, functional RNA, or specific protein, including regulatory sequences. "Genes" also include non-expressed DNA segments that, for example, form recognition sequences for other proteins. "Genes" can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and may include sequences designed to have desired parameters.
"Naturally occurring" is used to describe an object that can be found in nature as distinct from being artificially produced. For example, a protein or nucleotide sequence present in an organism (including a virus), which can be isolated from a source in nature and which has not been intentionally modified by a person in the laboratory, is naturally occurring.
The terms "protein," "peptide" and "polypeptide" are used interchangeably herein.
A "variant" of a molecule is a sequence that is substantially similar to the sequence of the native molecule. For nucleotide sequences, variants include those sequences that, because of the degeneracy of the genetic code, encode the identical amino acid sequence of the native protein. Naturally occurring allelic variants such as these can be identified with the use of molecular biology techniques, as, for example, with polymerase chain reaction (PCR) and hybridization techniques. Variant nucleotide sequences also include synthetically derived nucleotide sequences, such as those generated, for example, by using site-directed mutagenesis, which encode the native protein, as well as those that encode a polypeptide having amino acid substitutions.
Generally, nucleotide sequence variants of the invention will have at least 40%, 50%, 60%, to 70%, e.g., 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, to 79%, generally at least 80%, e.g., 81%-84%, at least 85%, e.g., 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, to 98%, sequence identity to the native (endogenous) nucleotide sequence.
"Conservatively modified variations" of a particular nucleic acid sequence refers to those nucleic acid sequences that encode identical or essentially identical amino acid sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given polypeptide. For instance, the codons CGT, CGC, CGA, CGG, AGA and AGG all encode the amino acid arginine. Thus, at every position where an arginine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded protein. Such nucleic acid variations are "silent variations," which are one species of "conservatively modified variations." Every nucleic acid sequence described herein that encodes a polypeptide also describes every possible silent variation, except where otherwise noted. One of skill in the art will recognize that each codon in a nucleic acid (except ATG, which is ordinarily the only codon for methionine) can be modified to yield a functionally identical molecule by standard techniques. Accordingly, each "silent variation" of a nucleic acid that encodes a polypeptide is implicit in each described sequence.
SPANX Polypeptides and Peptides
In one embodiment, the invention contemplates SPANX-N polypeptides and peptides. Such polypeptides and peptides have utility, for example, for generating SPANX-N-specific antibodies. Such antibodies can be used for detecting, diagnosing and treating cancer. For example, in one embodiment, the invention provides a human SPANX-N1 polypeptide of the following sequence (SEQ ID NO:1).
1 MEQPTSSING EKRKSPCESN NENDEMQETP NRDLAPEPSL
41 KKMKTSEYST VLAFCYRKAK KIHSNQLEND QS
In another embodiment, the invention provides a human SPANX-N2 polypeptide of the following sequence (SEQ ID NO:2).
1 MEQPTSSTNG EKRKSPCESN NKKNDEMQEA PNRVLAPKQS
41 LQKTKTIEYL TIIVYYYRKH TKINSNQLEK DQSRENSINP
81 VQEEEDEGLD SAEGSSQEDE DLDSSEGSSQ EDEDLDSSEG
121 SSQEDEDLDS SEGSSQEDED LDSSEGSSQE DEDLDPPEGS
161 SQEDEDLDSS EGSSQEGGED
In another embodiment, the invention provides a human SPANX-N3 polypeptide of the following sequence (SEQ ID NO:3).
1 MEQPTSSTNG EKTKSPCESN NKKNDEMQEV PNRVLAPEQS
41 LKKTKTSEYP IIFVYYLRKG KKINSNQLEN EQSQENSINP
81 IQKEEDEGVD LSEGSSNEDE DLGPCEGPSK EDKDLDSSEG
121 SSQEDEDLGL SEGSSQDSGE D
In another embodiment, the invention provides a human SPANX-N4 polypeptide of the following sequence (SEQ ID NO:4).
1 MEEPTSSTNE NKMKSPCESN KRKVDKKKKN LHRASAPEQS
41 LKETEKAKYP TLVFYCRKNK KRNSNQLENN QPTESSTDPI
81 KEKGDLDISA GSPQDGGQN
In another embodiment, the invention provides a human SPANX-N5 polypeptide of the following sequence (SEQ ID NO:5).
1 MEKPTSSTNG EKRKSPCDSN SKNDEMQETP NRDLVLEPSL
41 KKMKTSEYST VLVLCYRKTK KIHSNQLEND QS
As illustrated herein, the SPANX-N polypeptides of the invention have about 50% amino acid sequence identity with the SPANX-A/D proteins. Hence, they are structurally distinct and can readily be used to generate SPANX-N-specific antibodies. Sequences for various SPANX-A/D polypeptides and nucleic acids are publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). For example, a sequence for a human SPANX-A1 polypeptide can be found in the NCBI database at accession number NP_038481.2 (gi:14192937). This human SPANX-A1 polypeptide has the following sequence (SEQ ID NO:6):
1 MDKQSSAGGV KRSVPCDSNE ANEMMPETPT GDSDPQPAPK
41 KMKTSESSTI LVVRYRRNFK RTSPEELLND HARENRINPL
81 QMEEEEFMEI MVEIPAK
A sequence for a human SPANX-A2 polypeptide can be found in the NCBI database at accession number NP_663695.1 (gi:22027489). This SPANX-A2 polypeptide sequence is as follows (SEQ ID NO:7):
1 MDKQSSAGGV KRSVPCDSNE ANEMMPETPT GDSDPQPAPK
41 KMKTSESSTI LVVRYRRNFK RTSPEELLND HARENRINPL
81 QMEEEEFMEI MVEIPAK
A sequence for the human SPANX-B1 polypeptide can be found in the NCBI database at accession number NP_115850.1 (gi:14196344). This sequence for SPANX-B1 is as follows (SEQ ID NO:8):
1 MGQQSSVRRL KRSVPCESNE ANEANEANKT MPETPTGDSD
41 PQPAPKKMKT SESSTILVVR YRRNVKRTSP EELVNDHARE
81 NRINPDQMEE EEFIEITTER PKK
A sequence for the human SPANX-B2 polypeptide can be found in the NCBI database at accession number NP_663697.1 (gi:22027492). This SPANX-B2 polypeptide has the following sequence (SEQ ID NO:9):
1 MGQQSSVRRL KRSVPCESNE ANEANEANKT MPETPTGDSD
41 PQPAPKKMKT SESSTILVVR YRRNVKRTSP EELVNDHARE
81 NRINPDQMEE EEFIEITTER PKK
A sequence for the human SPANX-C polypeptide can be found in the NCBI database at accession number NP_073152.1 (gi:13435137). This SPANX-C polypeptide has the following sequence (SEQ ID NO:10):
1 MDKQSSAGGV KRSVPCESNE VNETMPETPT GDSDPQPAPK
41 KMKTSESSTI LVVRYRRNVK RTSPEELLND HARENRINPL
81 QMEEEEFMEI MVEIPAK
A sequence for the human SPANX-D polypeptide can be found in the NCBI database at accession number NP_115793.1 (gi:14192939). This SPANX-D polypeptide has the following sequence (SEQ ID NO:11):
1 MDKQSSAGGV KRSVPCDSNE ANEMMPETSS GYSDPQPAPK
41 KLKTSESSTI LVVRYRRNFK RTSPEELVND HARKNRINPL
81 QMEEEEFMEI MVEIPAK
As illustrated herein, the SPANX-N nucleic acids and gene products have some degree of sequence identity. As used herein, "sequence identity" or "identity" in the context of two nucleic acid or polypeptide sequences refers to a specified percentage of residues in the two sequences that are the same when the sequences are aligned for maximum correspondence over a specified comparison window, as measured by sequence comparison algorithms or by visual inspection.
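As a minimal sketch of this definition, the Python example below computes percent identity for two sequences that are already aligned and of equal length. It uses the SPANX-A1 (SEQ ID NO:6) and SPANX-D (SEQ ID NO:11) polypeptide sequences listed above, which are both 97 residues long and are compared here position by position without gaps. That is a simplifying assumption: sequences of different length would first need to be aligned with an alignment algorithm. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned to the same length; "-" marks a gap
    and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# SEQ ID NO:6 (SPANX-A1) and SEQ ID NO:11 (SPANX-D), written in blocks of ten residues.
SPANX_A1 = (
    "MDKQSSAGGV" "KRSVPCDSNE" "ANEMMPETPT" "GDSDPQPAPK"
    "KMKTSESSTI" "LVVRYRRNFK" "RTSPEELLND" "HARENRINPL"
    "QMEEEEFMEI" "MVEIPAK"
)
SPANX_D = (
    "MDKQSSAGGV" "KRSVPCDSNE" "ANEMMPETSS" "GYSDPQPAPK"
    "KLKTSESSTI" "LVVRYRRNFK" "RTSPEELVND" "HARKNRINPL"
    "QMEEEEFMEI" "MVEIPAK"
)

print(round(percent_identity(SPANX_A1, SPANX_D), 2))
```

Under this gapless comparison the two sequences match at 91 of 97 positions, or about 93.8% identity.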
The term "substantial identity" in the context of a polypeptide or peptide indicates that a polypeptide or peptide comprises a sequence with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, or 79%, preferably 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, or 89%, more preferably at least 90%, 91%, 92%, 93%, or 94%, or even more preferably, 95%, 96%, 97%, 98% or 99%, sequence identity to the reference sequence over a specified comparison window. Preferably, optimal alignment is conducted using the homology alignment algorithm of Needleman and Wunsch (JMB, 48, 443 (1970)). An indication that two peptide sequences are substantially identical is that one peptide is immunologically reactive with antibodies raised against the second peptide. Thus, a peptide is substantially identical to a second peptide, for example, where the two peptides differ only by a conservative substitution.
The invention also contemplates variant SPANX-N polypeptides and peptides as well as SPANX-N polypeptides and peptides from mammalian species other than humans. Such SPANX-N polypeptides and peptides are useful for raising antibodies and for detecting cancer in mammalian species. Residue positions in variant polypeptides and peptides may not be identical to those in the reference sequence but often differ, for example, by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. Sequences that differ by such conservative substitutions are said to have "sequence similarity" or "similarity." When sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are available to those of skill in the art.
Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated, e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, California).
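A minimal sketch of this partial-mismatch scoring follows. It assumes the two sequences are already aligned to equal length, uses the five conservative substitution groups listed later in this section, and assigns a conservative substitution a score of 0.5. The value 0.5 and the function names are illustrative assumptions only. The text requires only a score between zero and 1, and this is not the PC/GENE implementation.

```python
# Conservative substitution groups as listed in this document.
GROUPS = {
    "aliphatic": set("GAVLI"),
    "aromatic": set("FYW"),
    "sulfur-containing": set("MC"),
    "basic": set("RKH"),
    "acidic": set("DENQ"),
}


def is_conservative(x, y):
    """True if x and y are different residues drawn from the same group."""
    return x != y and any(x in g and y in g for g in GROUPS.values())


def adjusted_identity(aligned_a, aligned_b, conservative_score=0.5):
    """Percent identity with conservative substitutions scored as partial matches.

    conservative_score is an assumed value between 0 and 1.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("inputs must be non-empty and of equal aligned length")
    total = 0.0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue                      # gap column scores zero
        if x == y:
            total += 1.0                  # identical residue
        elif is_conservative(x, y):
            total += conservative_score   # partial credit
    return 100.0 * total / len(aligned_a)


# Toy aligned peptides (hypothetical): V/I and K/R are conservative changes.
plain = adjusted_identity("AVLK", "AILR", conservative_score=0.0)
adjusted = adjusted_identity("AVLK", "AILR")
```

With the toy pair, two of four positions are identical and two are conservative substitutions, so the unadjusted value is 50% and the adjusted value is 75%.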
Thus, the invention is also directed to variant SPANX-N polypeptides and peptides. By "variant" polypeptide is intended a polypeptide derived from the native protein by deletion (also called "truncation") or addition of one or more amino acids to the N-terminal and/or C-terminal end of the native protein; deletion or addition of one or more amino acids at one or more sites in the native protein; or substitution of one or more amino acids at one or more sites in the native protein. Such variants may result from, for example, genetic polymorphism, species variation or human manipulation. Methods for such manipulations are generally known in the art.
Thus, the polypeptides of the invention may have sequence differences including amino acid substitutions, deletions, truncations, and insertions. For example, amino acid sequence variants of the polypeptides can be prepared by mutations in the DNA. Methods for mutagenesis and nucleotide sequence alterations are well known in the art. See, for example, Kunkel, Proc. Natl. Acad. Sci. USA 82, 488 (1985); Kunkel et al., Meth. Enzymol. 154, 367 (1987); U.S. Patent No. 4,873,192; Walker and Gaastra (1983), and the references cited therein. Guidance as to appropriate amino acid substitutions that do not affect biological activity of the protein of interest may be found in the model of Dayhoff et al. (1978). Conservative substitutions, such as exchanging one amino acid with another having similar properties, are preferred. Therefore, the polypeptides of the invention encompass both naturally-occurring proteins as well as variations and modified forms thereof. Such variants will continue to possess the desired activity. The deletions, insertions, and substitutions of the polypeptide sequence encompassed herein are not expected to produce radical changes in the characteristics of the polypeptide. However, when it is difficult to predict the exact effect of the substitution, deletion, or insertion in advance of doing so, one skilled in the art will appreciate that the effect will be evaluated by routine screening assays.
Individual substitutions, deletions or additions that alter, add or delete a single amino acid or a small percentage of amino acids (typically less than 5%, more typically less than 1%) in an encoded sequence are "conservatively modified variations," where the alterations result in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. The following five groups each contain amino acids that are conservative substitutions for one another: Aliphatic: Glycine (G), Alanine (A), Valine (V), Leucine (L), Isoleucine (I); Aromatic: Phenylalanine (F), Tyrosine (Y), Tryptophan (W); Sulfur-containing: Methionine (M), Cysteine (C); Basic: Arginine (R), Lysine (K), Histidine (H); Acidic: Aspartic acid (D), Glutamic acid (E), Asparagine (N), Glutamine (Q). In addition, individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small percentage of amino acids in an encoded sequence are also "conservatively modified variations."
SPANX-N Specific Antibodies
Any of the entire SPANX-N 1-5 polypeptides can be used for generating antibodies. Alternatively, selected peptides from any of the SPANX-N polypeptides can be used for this purpose. These antibodies are useful for detecting and treating cancer.
For example, the following peptide (SEQ ID NO: 136) can be used to generate antibodies that specifically bind to each of the SPANX-N 1-5 polypeptides:
EQPTSSTNGEKRKSPCESNN (SPANX-N amino acid positions 2-21).
One SPANX-N1 specific peptide epitope has the following sequence:
STVLAFCYRKA (SEQ ID NO: 12)
Several SPANX-N2 specific peptide epitopes include those with the following peptidyl sequences:
KQSLQKTKTIEYLTII (SEQ ID NO: 13);
VLAPKQSLQKTKTIEYLTIIVYYY (SEQ ID NO: 14); and
KDQSRENSI (SEQ ID NO: 15).
SPANX-N3 specific peptide epitopes include those with the following sequences:
GEKT (SEQ ID NO: 16);
PIIFV (SEQ ID NO: 17);
SKEDKLSEGSS (SEQ ID NO: 18);
NKKNDEMQEVPNRVL (SEQ ID NO: 19); and
KTKTSEYPIIFVYYL (SEQ ID NO: 20).
SPANX-N4 specific peptide epitopes include those with the following sequences:
ESNNLH (SEQ ID NO: 21); and
DGGQ (SEQ ID NO: 22).
SPANX-N5 specific peptide epitopes include those with the following sequences:
GEICRK (SEQ ID NO: 23);
LVLEPS (SEQ ID NO: 24); and
STVLVLCY (SEQ ID NO: 25).
The entire SPANX-N polypeptides (SEQ ID NO: 1-5) and/or any of peptide SEQ ID NO: 12-25, 136 can be used to immunize animals to obtain SPANX-N specific antibodies.
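As an illustration of how such peptides might be located within a candidate protein sequence, the following sketch reports the 1-based start and end positions of each listed peptide found in a query. The query sequence here is a hypothetical toy construct, not an actual SPANX-N sequence, and the function name and the subset of peptides included are illustrative assumptions.

```python
# A subset of the peptide epitopes listed in the text, keyed by sequence identifier.
EPITOPES = {
    "SEQ ID NO:136": "EQPTSSTNGEKRKSPCESNN",
    "SEQ ID NO:12": "STVLAFCYRKA",
    "SEQ ID NO:15": "KDQSRENSI",
    "SEQ ID NO:24": "LVLEPS",
    "SEQ ID NO:25": "STVLVLCY",
}


def find_epitopes(protein, epitopes=EPITOPES):
    """Return {identifier: (start, end)} for each peptide found in protein.

    Positions are 1-based and inclusive; only the first occurrence is reported.
    """
    hits = {}
    for name, peptide in epitopes.items():
        start = protein.find(peptide)
        if start != -1:
            hits[name] = (start + 1, start + len(peptide))
    return hits


# Hypothetical toy query: an initiator methionine, the shared peptide,
# a glycine spacer, and one specific peptide. Not a real SPANX-N sequence.
toy_query = "M" + EPITOPES["SEQ ID NO:136"] + "GGGG" + EPITOPES["SEQ ID NO:12"]
hits = find_epitopes(toy_query)
```

In the toy query the shared 20-residue peptide occupies positions 2-21, matching the position range stated for it in the text.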
The invention therefore provides antibodies made by available procedures that can bind SPANX-N peptides and/or polypeptides. The binding domains of such antibodies, for example, the CDR regions of these antibodies, can be transferred into or utilized with any convenient binding entity backbone.
Antibody molecules belong to a family of plasma proteins called immunoglobulins, whose basic building block, the immunoglobulin fold or domain, is used in various forms in many molecules of the immune system and other biological recognition systems. A standard antibody is a tetrameric structure consisting of two identical immunoglobulin heavy chains and two identical light chains and has a molecular weight of about 150,000 daltons.
The heavy and light chains of an antibody consist of different domains. Each light chain has one variable domain (VL) and one constant domain (CL), while each heavy chain has one variable domain (VH) and three or four constant domains (CH). See, e.g., Alzari, P. N., Lascombe, M.-B. & Poljak, R. J. (1988) Three-dimensional structure of antibodies. Annu. Rev. Immunol. 6, 555-580. Each domain, consisting of about 110 amino acid residues, is folded into a characteristic β-sandwich structure formed from two β-sheets packed against each other, the immunoglobulin fold. The VH and VL domains each have three complementarity determining regions (CDR1-3) that are loops, or turns, connecting β-strands at one end of the domains. The variable regions of both the light and heavy chains generally contribute to antigen specificity, although the contribution of the individual chains to specificity is not always equal. Antibody molecules have evolved to bind to a large number of molecules by using six randomized loops (CDRs).
Immunoglobulins can be assigned to different classes depending on the amino acid sequences of the constant domain of their heavy chains. There are at least five (5) major classes of immunoglobulins: IgA, IgD, IgE, IgG and IgM. Several of these may be further divided into subclasses (isotypes), for example, IgG-1, IgG-2, IgG-3 and IgG-4; IgA-1 and IgA-2. The heavy chain constant domains that correspond to the IgA, IgD, IgE, IgG and IgM classes of immunoglobulins are called alpha (α), delta (δ), epsilon (ε), gamma (γ) and mu (μ), respectively. The light chains of antibodies can be assigned to one of two clearly distinct types, called kappa (κ) and lambda (λ), based on the amino acid sequences of their constant domains. The subunit structures and three-dimensional configurations of different classes of immunoglobulins are well known.
The term "variable" in the context of variable domain of antibodies, refers to the fact that certain portions of variable domains differ extensively in sequence from one antibody to the next. The variable domains are for binding and determine the specificity of each particular antibody for its particular antigen. However, the variability is not evenly distributed through the variable domains of antibodies. Instead, the variability is concentrated in three segments called complementarity determining regions (CDRs), also known as hypervariable regions in both the light chain and the heavy chain variable domains.
The more highly conserved portions of variable domains are called framework (FR) regions. The variable domains of native heavy and light chains each comprise four FR regions, largely adopting a β-sheet configuration, connected by three CDRs, which form loops connecting, and in some cases forming part of, the β-sheet structure. The CDRs in each chain are held together in close proximity by the FR regions and, with the CDRs from another chain, contribute to the formation of the antigen-binding site of antibodies. The constant domains are not involved directly in binding an antibody to an antigen, but exhibit various effector functions, such as participation of the antibody in antibody-dependent cellular toxicity.
An antibody that is contemplated for use in the present invention thus can be in any of a variety of forms, including a whole immunoglobulin, an antibody fragment such as Fv, Fab, and similar fragments, a single chain antibody which includes the variable domain complementarity determining regions (CDR), and the like forms, all of which fall under the broad term "antibody", as used herein. The present invention contemplates the use of any specificity of an antibody, polyclonal or monoclonal, and is not limited to antibodies that recognize and immunoreact with a specific SPANX-N polypeptide or derivative thereof. Moreover, the binding regions, or CDR, of antibodies can be placed within the backbone of any convenient binding entity polypeptide. In preferred embodiments, in the context of methods described herein, an antibody, binding entity or fragment thereof is used that is immunospecific for a SPANX-N polypeptide, as well as the variants and derivatives thereof.
The term "antibody fragment" refers to a portion of a full-length antibody, generally the antigen binding or variable region. Examples of antibody fragments include Fab, Fab', F(ab')2 and Fv fragments. Papain digestion of antibodies produces two identical antigen binding fragments, called Fab fragments, each with a single antigen binding site, and a residual Fc fragment. Fab fragments thus have an intact light chain and a portion of one heavy chain. Pepsin treatment yields an F(ab')2 fragment that has two antigen binding fragments that are capable of cross-linking antigen, and a residual fragment that is termed a pFc' fragment. Fab' fragments are obtained after reduction of a pepsin digested antibody, and consist of an intact light chain and a portion of the heavy chain. Two Fab' fragments are obtained per antibody molecule. Fab' fragments differ from Fab fragments by the addition of a few residues at the carboxyl terminus of the heavy chain CH1 domain including one or more cysteines from the antibody hinge region. Fv is the minimum antibody fragment that contains a complete antigen recognition and binding site. This region consists of a dimer of one heavy and one light chain variable domain in a tight, non-covalent association (VH-VL dimer). It is in this configuration that the three CDRs of each variable domain interact to define an antigen binding site on the surface of the VH-VL dimer. Collectively, the six CDRs confer antigen binding specificity to the antibody. However, even a single variable domain (or half of an Fv comprising only three CDRs specific for an antigen) has the ability to recognize and bind antigen, although at a lower affinity than the entire binding site. As used herein, "functional fragment" with respect to antibodies, refers to Fv, F(ab) and F(ab')2 fragments.
Additional fragments can include diabodies, linear antibodies, single-chain antibody molecules, and multispecific antibodies formed from antibody fragments. Single chain antibodies are genetically engineered molecules containing the variable region of the light chain and the variable region of the heavy chain, linked by a suitable polypeptide linker as a genetically fused single chain molecule. Such single chain antibodies are also referred to as "single-chain Fv" or "sFv" antibody fragments. Generally, the Fv polypeptide further comprises a polypeptide linker between the VH and VL domains that enables the sFv to form the desired structure for antigen binding. For a review of sFv see Pluckthun in The Pharmacology of Monoclonal Antibodies, vol. 113, Rosenburg and Moore eds. Springer-Verlag, N.Y., pp. 269-315 (1994).
The term "diabodies" refers to small antibody fragments with two antigen-binding sites, where the fragments comprise a heavy chain variable domain (VH) connected to a light chain variable domain (VL) in the same polypeptide chain (VH-VL). By using a linker that is too short to allow pairing between the two domains on the same chain, the domains are forced to pair with the complementary domains of another chain and create two antigen-binding sites. Diabodies are described more fully in, for example, EP 404,097; WO 93/11161, and Hollinger et al., Proc. Natl. Acad. Sci. USA 90: 6444-6448 (1993).
Antibody fragments contemplated by the invention are therefore not full-length antibodies. However, such antibody fragments can have similar or improved immunological properties relative to a full-length antibody. Such antibody fragments may be as small as about 4 amino acids, 5 amino acids, 6 amino acids, 7 amino acids, 9 amino acids, about 12 amino acids, about 15 amino acids, about 17 amino acids, about 18 amino acids, about 20 amino acids, about 25 amino acids, about 30 amino acids or more.
In general, an antibody fragment of the invention can have any upper size limit so long as it has similar or improved immunological properties relative to an antibody that binds with specificity to a SPANX-N polypeptide. For example, smaller binding entities and light chain antibody fragments can have less than about 200 amino acids, less than about 175 amino acids, less than about 150 amino acids, or less than about 120 amino acids if the antibody fragment is related to a light chain antibody subunit. Moreover, larger binding entities and heavy chain antibody fragments can have less than about 425 amino acids, less than about 400 amino acids, less than about 375 amino acids, less than about 350 amino acids, less than about 325 amino acids or less than about 300 amino acids if the antibody fragment is related to a heavy chain antibody subunit.
Antibodies directed against a SPANX-N peptide or polypeptide can be made by any available procedure. Methods for the preparation of polyclonal antibodies are available to those skilled in the art. See, for example, Green, et al., Production of Polyclonal Antisera, in: Immunochemical Protocols (Manson, ed.), pages 1-5 (Humana Press); Coligan, et al., Production of Polyclonal Antisera in Rabbits, Rats, Mice and Hamsters, in: Current Protocols in Immunology, section 2.4.1 (1992), which are hereby incorporated by reference.
Monoclonal antibodies can also be employed in the invention. The term "monoclonal antibody" as used herein refers to an antibody obtained from a population of substantially homogeneous antibodies. In other words, the individual antibodies comprising the population are identical except for occasional naturally occurring mutations in some antibodies that may be present in minor amounts. Monoclonal antibodies are highly specific, being directed against a single antigenic site. Furthermore, in contrast to polyclonal antibody preparations that typically include different antibodies directed against different determinants (epitopes), each monoclonal antibody is directed against a single determinant on the antigen. In addition to their specificity, the monoclonal antibodies are advantageous in that they are synthesized by the hybridoma culture, uncontaminated by other immunoglobulins. The modifier "monoclonal" indicates that the antibody is obtained from a substantially homogeneous population of antibodies, and is not to be construed as requiring production of the antibody by any particular method. The monoclonal antibodies herein specifically include "chimeric" antibodies in which a portion of the heavy and/or light chain is identical or homologous to corresponding sequences in antibodies derived from a particular species or belonging to a particular antibody class or subclass, while the remainder of the chain(s) is identical or homologous to corresponding sequences in antibodies derived from another species or belonging to another antibody class or subclass. Fragments of such antibodies can also be used, so long as they exhibit the desired biological activity. See U.S. Patent No. 4,816,567; Morrison et al., Proc. Natl. Acad. Sci. 81, 6851-55 (1984).
The preparation of monoclonal antibodies likewise is conventional. See, for example, Kohler & Milstein, Nature, 256:495 (1975); Coligan, et al., sections 2.5.1-2.6.7; and Harlow, et al., in: Antibodies: A Laboratory Manual, page 726 (Cold Spring Harbor Pub. (1988)), which are hereby incorporated by reference. Monoclonal antibodies can be isolated and purified from hybridoma cultures by a variety of well-established techniques. Such isolation techniques include affinity chromatography with Protein-A Sepharose, size-exclusion chromatography, and ion-exchange chromatography. See, e.g., Coligan, et al., sections 2.7.1-2.7.12 and sections 2.9.1-2.9.3; Barnes, et al., Purification of Immunoglobulin G (IgG), in: Methods in Molecular Biology, Vol. 10, pages 79-104 (Humana Press (1992)).
Methods of in vitro and in vivo manipulation of antibodies are available to those skilled in the art. For example, the monoclonal antibodies to be used in accordance with the present invention may be made by the hybridoma method as described above or may be made by recombinant methods, e.g., as described in U.S. Pat. No. 4,816,567. Monoclonal antibodies for use with the present invention may also be isolated from phage antibody libraries using the techniques described in Clackson et al., Nature 352: 624-628 (1991), as well as in Marks et al., J. Mol. Biol. 222: 581-597 (1991). Methods of making antibody fragments are also known in the art (see, for example, Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, New York, (1988), incorporated herein by reference). Antibody fragments of the present invention can be prepared by proteolytic hydrolysis of the antibody or by expression of nucleic acids encoding the antibody fragment in a suitable host. Antibody fragments can be obtained by pepsin or papain digestion of whole antibodies by conventional methods. For example, antibody fragments can be produced by enzymatic cleavage of antibodies with pepsin to provide a 5S fragment described as F(ab')2. This fragment can be further cleaved using a thiol reducing agent, and optionally using a blocking group for the sulfhydryl groups resulting from cleavage of disulfide linkages, to produce 3.5S Fab' monovalent fragments. Alternatively, enzymatic cleavage using papain produces two monovalent Fab fragments and an Fc fragment directly. These methods are described, for example, in US Patents No. 4,036,945 and No. 4,331,647, and references contained therein. These patents are hereby incorporated by reference in their entireties.
Other methods of cleaving antibodies, such as separation of heavy chains to form monovalent light-heavy chain fragments, further cleavage of fragments, or other enzymatic, chemical, or genetic techniques may also be used, so long as the fragments bind to the antigen that is recognized by the intact antibody. For example, Fv fragments comprise an association of VH and VL chains. This association may be noncovalent or the variable chains can be linked by an intermolecular disulfide bond or cross-linked by chemicals such as glutaraldehyde. Preferably, the Fv fragments comprise VH and VL chains connected by a peptide linker. These single-chain antigen binding proteins (sFv) are prepared by constructing a structural gene comprising DNA sequences encoding the VH and VL domains connected by an oligonucleotide. The structural gene is inserted into an expression vector, which is subsequently introduced into a host cell such as E. coli. The recombinant host cells synthesize a single polypeptide chain with a linker peptide bridging the two V domains. Methods for producing sFvs are described, for example, by Whitlow, et al., Methods: a Companion to Methods in Enzymology, Vol. 2, page 97 (1991); Bird, et al., Science 242:423-426 (1988); Ladner, et al., US Patent No. 4,946,778; and Pack, et al., Bio/Technology 11:1271-77 (1993).
Another form of an antibody fragment is a peptide coding for a single complementarity-determining region (CDR). CDR peptides ("minimal recognition units") are often involved in antigen recognition and binding. CDR peptides can be obtained by cloning or constructing genes encoding the CDR of an antibody of interest. Such genes are prepared, for example, by using the polymerase chain reaction to synthesize the variable region from RNA of antibody-producing cells. See, for example, Larrick, et al., Methods: a Companion to Methods in Enzymology, Vol. 2, page 106 (1991).
The invention contemplates human and humanized forms of non-human (e.g. murine) antibodies. Such humanized antibodies are chimeric immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) that contain minimal sequence derived from non-human immunoglobulin. For the most part, humanized antibodies are human immunoglobulins (recipient antibody) in which residues from a complementarity determining region (CDR) of the recipient are replaced by residues from a CDR of a nonhuman species (donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity and capacity. In some instances, Fv framework residues of the human immunoglobulin are replaced by corresponding non-human residues. Furthermore, humanized antibodies may comprise residues that are found neither in the recipient antibody nor in the imported CDR or framework sequences. These modifications are made to further refine and optimize antibody performance. In general, humanized antibodies will comprise substantially all of at least one, and typically two, variable domains, in which all or substantially all of the CDR regions correspond to those of a non-human immunoglobulin and all or substantially all of the FR regions are those of a human immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at least a portion of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin. For further details, see: Jones et al., Nature 321, 522-525 (1986); Riechmann et al., Nature 332, 323-329 (1988); Presta, Curr. Op. Struct. Biol. 2, 593-596 (1992); Holmes, et al., J. Immunol., 158:2192-2201 (1997) and Vaswani, et al., Annals Allergy, Asthma & Immunol., 81:105-115 (1998).
While standardized procedures are available to generate antibodies, the size of antibodies, the multi-stranded structure of antibodies and the complexity of six binding loops present in antibodies constitute a hurdle to the improvement and the manufacture of large quantities of antibodies. Hence, the invention further contemplates using binding entities, which comprise polypeptides that can recognize and bind to a SPANX-N polypeptide.
A number of proteins can serve as protein scaffolds to which binding domains reactive with a SPANX-N peptide can be attached and thereby form a suitable binding entity. The binding domains bind or interact with a SPANX-N peptide while the protein scaffold merely holds and stabilizes the binding domains so that they can bind. A number of protein scaffolds can be used. For example, phage capsid proteins can be used. See the review in Clackson & Wells, Trends Biotechnol. 12:173-184 (1994). Phage capsid proteins have been used as scaffolds for displaying random peptide sequences, including bovine pancreatic trypsin inhibitor (Roberts et al., PNAS 89:2429-2433 (1992)), human growth hormone (Lowman et al., Biochemistry 30:10832-10838 (1991); Venturini et al., Protein Peptide Letters 1:70-75 (1994)), and the IgG binding domain of Streptococcus (O'Neil et al., Techniques in Protein Chemistry V (Crabb, L., ed.) pp. 517-524, Academic Press, San Diego (1994)). These scaffolds have displayed a single randomized loop or region that can be modified to include binding domains for a SPANX-N peptide or polypeptide.
Researchers have also used the small 74 amino acid α-amylase inhibitor Tendamistat as a presentation scaffold on the filamentous phage M13. McConnell, S. J., & Hoess, R. H., J. Mol. Biol. 250:460-470 (1995). Tendamistat is a β-sheet protein from Streptomyces tendae. It has a number of features that make it an attractive scaffold for binding peptides, including its small size, stability, and the availability of high resolution NMR and X-ray structural data. The overall topology of Tendamistat is similar to that of an immunoglobulin domain, with two β-sheets connected by a series of loops. In contrast to immunoglobulin domains, the β-sheets of Tendamistat are held together with two rather than one disulfide bond, accounting for the considerable stability of the protein. The loops of Tendamistat can serve a similar function to the CDR loops found in immunoglobulins and can be easily randomized by in vitro mutagenesis. Tendamistat is derived from Streptomyces tendae and may be antigenic in humans. Hence, binding entities that employ Tendamistat are preferably employed in vitro.
Fibronectin type III domain has also been used as a protein scaffold to which binding entities can be attached. Fibronectin type III is part of a large subfamily (Fn3 family or s-type Ig family) of the immunoglobulin superfamily. Sequences, vectors and cloning procedures for using such a fibronectin type III domain as a protein scaffold for binding entities (e.g. CDR peptides) are provided, for example, in U.S. Patent Application Publication 20020019517. See also, Bork, P. & Doolittle, R. F. (1992) Proposed acquisition of an animal protein domain by bacteria. Proc. Natl. Acad. Sci. USA 89, 8990-8994; Jones, E. Y. (1993) The immunoglobulin superfamily. Curr. Opinion Struct. Biol. 3, 846-852; Bork, P., Horn, L. & Sander, C. (1994) The immunoglobulin fold. Structural classification, sequence patterns and common core. J. Mol. Biol. 242, 309-320; Campbell, I. D. & Spitzfaden, C. (1994) Building proteins with fibronectin type III modules. Structure 2, 233-337; Harpez, Y. & Chothia, C (1994).
In the immune system, specific antibodies are selected and amplified from a large library (affinity maturation). The combinatorial techniques employed in immune cells can be mimicked by mutagenesis and generation of combinatorial libraries of binding entities. Variant binding entities, antibody fragments and antibodies therefore can also be generated through display-type technologies. Such display-type technologies include, for example, phage display, retroviral display, ribosomal display, and other techniques. Techniques available in the art can be used for generating libraries of binding entities and for screening those libraries, and the selected binding entities can be subjected to additional maturation, such as affinity maturation. See Wright and Harris, supra; Hanes and Pluckthun, PNAS USA 94:4937-4942 (1997) (ribosomal display); Parmley and Smith, Gene 73:305-318 (1988) (phage display); Scott, TIBS 17:241-245 (1992); Cwirla et al., PNAS USA 87:6378-6382 (1990); Russel et al., Nucl. Acids Research 21:1081-1085 (1993); Hoogenboom et al., Immunol. Reviews 130:43-68 (1992); Chiswell and McCafferty, TIBTECH 10:80-84 (1992); and U.S. Pat. No. 5,733,743.
The invention therefore also provides methods of mutating antibodies, CDRs or binding domains to optimize their affinity, selectivity, binding strength and/or other desirable properties. A mutant binding domain refers to an amino acid sequence variant of a selected binding domain (e.g. a CDR). In general, one or more of the amino acid residues in the mutant binding domain is different from what is present in the reference binding domain. Such mutant antibodies necessarily have less than 100% sequence identity or similarity with the reference amino acid sequence. In general, mutant binding domains have at least 75% amino acid sequence identity or similarity with the amino acid sequence of the reference binding domain. Preferably, mutant binding domains have at least 80%, more preferably at least 85%, even more preferably at least 90%, and most preferably at least 95% amino acid sequence identity or similarity with the amino acid sequence of the reference binding domain. For example, affinity maturation using phage display can be utilized as one method for generating mutant binding domains. Affinity maturation using phage display refers to a process described in Lowman et al., Biochemistry 30(45): 10832-10838 (1991); see also Hawkins et al., J. Mol. Biol. 254: 889-896 (1992). While not strictly limited to the following description, this process can be described briefly as involving mutation of several binding domains or antibody hypervariable regions at a number of different sites with the goal of generating all possible amino acid substitutions at each site. The binding domain mutants thus generated are displayed in a monovalent fashion from filamentous phage particles as fusion proteins. Fusions are generally made to the gene III product of M13. The phage expressing the various mutants can be cycled through several rounds of selection for the trait of interest, e.g. binding affinity or selectivity. The mutants of interest are isolated and sequenced. Such methods are described in more detail in U.S. Patent 5,750,373, U.S. Patent 6,290,957 and Cunningham, B. C. et al., EMBO J. 13(11), 2508-2515 (1994).
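The idea of generating all possible amino acid substitutions at selected sites can be sketched in silico as follows. This is only an enumeration of single-site variants for a hypothetical reference binding-domain sequence, together with a check against the 75% identity floor mentioned above. The reference sequence, the chosen positions, and the function names are illustrative assumptions, and the sketch does not model phage display or selection.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_site_variants(reference, positions):
    """All variants differing from reference at exactly one listed position.

    positions are 0-based indices; each yields 19 substitutions.
    """
    variants = []
    for pos in positions:
        for aa in AMINO_ACIDS:
            if aa != reference[pos]:
                variants.append(reference[:pos] + aa + reference[pos + 1:])
    return variants


def percent_identity(a, b):
    """Percent identical positions for two equal-length, ungapped sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical 10-residue reference binding domain (not taken from the text).
reference = "ARDYYGSSYW"
sites = [2, 5]

library = single_site_variants(reference, sites)
identities = {percent_identity(reference, v) for v in library}

# Full combinatorial randomization of k sites would instead give 20**k sequences.
combinatorial_size = len(AMINO_ACIDS) ** len(sites)
```

Each single-site variant of a 10-residue reference retains 90% identity, so every member of this toy library stays above the 75% threshold. Randomizing many sites at once lowers identity and grows the library exponentially.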
Therefore, in one embodiment, the invention provides methods of manipulating binding entity or antibody polypeptides or the nucleic acids encoding them to generate binding entities, antibodies and antibody fragments with improved binding properties that recognize a SPANX-N polypeptide. Such methods of mutating portions of an existing binding entity or antibody involve fusing a nucleic acid encoding a polypeptide that encodes a binding domain reactive with a SPANX-N peptide to a nucleic acid encoding a phage coat protein to generate a recombinant nucleic acid encoding a fusion protein, mutating the recombinant nucleic acid encoding the fusion protein to generate a mutant nucleic acid encoding a mutant fusion protein, expressing the mutant fusion protein on the surface of a phage, and selecting phage that bind to a SPANX-N polypeptide.
Accordingly, the invention provides antibodies, antibody fragments, and binding entity polypeptides that can recognize and bind to a SPANX-N polypeptide. The invention further provides methods of manipulating those antibodies, antibody fragments, and binding entity polypeptides to optimize their binding properties or other desirable properties (e.g., stability, size, ease of use). Such antibodies, antibody fragments, and binding entity polypeptides can be modified to include a label or reporter molecule useful for detecting the presence of the antibody. The labeled antibody can then be used for detection of SPANX-N polypeptides. As used herein, a label or reporter molecule is any molecule that can be associated with an antibody, directly or indirectly, and that results in a measurable, detectable signal, either directly or indirectly. Many such labels that can be incorporated into or coupled onto an antibody or binding entity are available to those of skill in the art. Examples of labels suitable for use with the antibodies and binding entities of the invention include radioactive isotopes, fluorescent molecules, phosphorescent molecules, enzymes, secondary antibodies, and ligands.
Examples of suitable fluorescent labels include fluorescein (FITC), 5,6-carboxymethyl fluorescein, Texas red, nitrobenz-2-oxa-1,3-diazol-4-yl (NBD), coumarin, dansyl chloride, rhodamine, 4',6-diamidino-2-phenylindole (DAPI), and the cyanine dyes Cy3, Cy3.5, Cy5, Cy5.5 and Cy7. In some embodiments, the fluorescent label is fluorescein (5-carboxyfluorescein-N-hydroxysuccinimide ester) or rhodamine (5,6-tetramethyl rhodamine). Fluorescent labels used for combinatorial multicolor detection in some embodiments include FITC and the cyanine dyes Cy3, Cy3.5, Cy5, Cy5.5 and Cy7. The absorption and emission maxima, respectively, for these fluors are: FITC (490 nm; 520 nm), Cy3 (554 nm; 568 nm), Cy3.5 (581 nm; 588 nm), Cy5 (652 nm; 672 nm), Cy5.5 (682 nm; 703 nm) and Cy7 (755 nm; 778 nm), thus allowing their simultaneous detection. Such fluorescent labels can be obtained from a variety of commercial sources, including Molecular Probes, Eugene, OR and Research Organics, Cleveland, Ohio.
Detection labels that are incorporated into an antibody or binding entity, such as biotin, can be subsequently detected using sensitive methods available in the art. For example, biotin can be detected using streptavidin-alkaline phosphatase conjugate (Tropix, Inc.) that binds to the biotin and subsequently can be detected by chemiluminescence of suitable substrates (for example, the chemiluminescent substrate CSPD: disodium 3-(4-methoxyspiro-[1,2-dioxetane-3,2'-(5'-chloro)tricyclo[3.3.1.1^3,7]decane]-4-yl)phenyl phosphate; Tropix, Inc.).
Molecules that combine two or more of these reporter molecules or detection labels can also be used in the invention. Any of the known detection labels can be used with the disclosed antibodies, antibody fragments, binding entities, and methods. Methods for detecting and measuring signals generated by detection labels are also available to those of skill in the art. For example, radioactive isotopes can be detected by scintillation counting or direct visualization; fluorescent molecules can be detected with fluorescent spectrophotometers; phosphorescent molecules can be detected with a scanner or spectrophotometer, or directly visualized with a camera; enzymes can be detected by visualization of the product of a reaction catalyzed by the enzyme. Such methods can be used directly in methods for detecting SPANX-N polypeptides.
SPANX Specific Nucleic Acids
The invention provides SPANX-N nucleic acids that encode SPANX-N polypeptides and peptides. In another embodiment, the invention provides SPANX-specific nucleic acids, probes and primers that can be used to detect tissues that express SPANX, for example, normal testis, cervical cancer, uterine cancer, melanoma and prostate cancer tissues. In another embodiment, the invention provides SPANX-N promoters that can be used to express desirable gene products in a tissue-specific manner. For example, the SPANX-N1 promoter is active in a variety of cancer cells. In a further embodiment, the invention relates to SPANX-specific nucleic acids that can modulate or inhibit the function of a SPANX mRNA. The SPANX-specific nucleic acids that can modulate or inhibit the function of a SPANX mRNA are useful for treating cancer.
Thus, the invention provides nucleic acids that encode SPANX-N polypeptides. For example, the SPANX-N1 polypeptide having SEQ ID NO:1 is encoded by the cDNA sequence provided below (SEQ ID NO:26):
1 ATGGAACAGC CCACTTCAAG CATCAATGGG GAGAAGAGGA
41 AGAGCCCCTG TGAATCCAAC AATGAAAATG ATGAGATGCA
81 GGAGACACCA AACAGGGACT TAGCCCCCGA ACCGAGTTTG
121 AAAAAGATGA AAACGTCAGA ATATTCAACA GTATTAGCGT
161 TTTGCTACAG GAAAGCTAAG AAAATACATT CAAATCAACT
201 GGAGAATGAC CAGTCCTGA
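As a quick consistency check on the listing above, the cDNA can be translated in silico. The following is a minimal Python sketch, assuming the standard genetic code; the names `CODON_TABLE`, `SEQ_ID_26` and `translate` are illustrative and are not part of the disclosure. It only shows that SEQ ID NO:26, as listed, reads as a single open reading frame that begins with ATG and ends with a stop codon.

```python
# Illustrative sketch: translate the SPANX-N1 cDNA (SEQ ID NO:26) using the
# standard genetic code. All helper names here are hypothetical.
from itertools import product

BASES = "TCAG"
# Amino acids in the conventional TCAG codon-table order; "*" marks a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Sequence copied from the listing above; spaces are removed before use.
SEQ_ID_26 = (
    "ATGGAACAGC CCACTTCAAG CATCAATGGG GAGAAGAGGA "
    "AGAGCCCCTG TGAATCCAAC AATGAAAATG ATGAGATGCA "
    "GGAGACACCA AACAGGGACT TAGCCCCCGA ACCGAGTTTG "
    "AAAAAGATGA AAACGTCAGA ATATTCAACA GTATTAGCGT "
    "TTTGCTACAG GAAAGCTAAG AAAATACATT CAAATCAACT "
    "GGAGAATGAC CAGTCCTGA"
).replace(" ", "")


def translate(cdna: str) -> str:
    """Translate codon by codon from position 1, stopping at the first stop."""
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        residue = CODON_TABLE[cdna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


protein = translate(SEQ_ID_26)
print(len(SEQ_ID_26), "nt ->", len(protein), "residues")
print(protein)
```

The same function can be applied to the other cDNA listings after removing the position numbers and spaces; for the SPANX-A through SPANX-D listings, which include untranslated flanking sequence, the reading frame would first have to be located at the initiating ATG.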
The SPANX-N2 polypeptide having SEQ ID NO:2 is encoded by the cDNA sequence provided below (SEQ ID NO:27):
1 ATGGAACAGC CGACTTCAAG CACCAATGGG GAGAAGAGGA
41 AGAGCCCCTG TGAATCCAAT AACAAAAAAA ATGATGAGAT
81 GCAGGAGGCA CCGAACAGGG TCTTAGCCCC CAAACAGAGC
121 TTGCAAAAGA CAAAAACAAT AGAATATCTA ACAATAATAG
161 TGTATTACTA CAGGAAGCAT ACGAAAATAA ATTCAAATCA
201 ACTGGAGAAG GACCAGTCCC GAGAGAACTC CATCAATCCC
241 GTCCAAGAGG AGGAGGACGA AGGCCTAGAC TCAGCTGAAG
281 GATCTTCACA GGAGGACGAA GACCTGGACT CATCTGAAGG
321 ATCTTCACAG GAGGACGAAG ACCTGGACTC ATCTGAAGGA
361 TCTTCACAGG AGGACGAAGA CCTGGACTCA TCTGAAGGAT
401 CTTCACAGGA GGACGAAGAC CTGGACTCAT CTGAAGGATC
441 TTCACAGGAG GACGAAGACC TGGACCCACC TGAAGGATCT
481 TCACAGGAGG ACGAAGACCT AGACTCATCT GAAGGATCTT
521 CACAGGAGGG TGGGGAGGAC TAG
The SPANX-N3 polypeptide having SEQ ID NO:3 is encoded by the cDNA sequence provided below (SEQ ID NO:28):
1 ATGGAACAGC CAACTTCCAG CACCAATGGG GAGAAGACGA
41 AGAGCCCCTG TGAATCCAAT AACAAAAAAA ATGATGAGAT
81 GCAAGAGGTA CCAAACAGAG TCTTAGCCCC CGAACAGAGT
121 TTGAAGAAGA CAAAAACATC AGAATATCCA ATAATATTTG
161 TGTATTACCT CAGGAAGGGT AAGAAAATAA ATTCAAATCA
201 ACTGGAGAAT GAACAGTCCC AAGAGAACTC CATCAATCCA
241 ATCCAAAAGG AGGAGGACGA AGGCGTAGAC TTATCTGAAG
281 GATCTTCAAA TGAGGATGAA GACCTAGGCC CATGTGAAGG
321 ACCTTCAAAG GAGGACAAAG ATCTAGACTC ATCTGAAGGA
361 TCCTCACAGG AGGATGAAGA CCTAGGCTTA TCTGAAGGAT
401 CTTCACAGGA CAGTGGGGAG GATTAG
The SPANX-N4 polypeptide having SEQ ID NO:4 is encoded by the cDNA sequence provided below (SEQ ID NO:29):
1 ATGGAAGAGC CAACTTCCAG CACCAACGAG AATAAAATGA
41 AGAGCCCCTG TGAATCTAAC AAAAGAAAAG TTGACAAGAA
81 GAAGAAGAAT CTGCACAGAG CCTCAGCCCC TGAACAGAGT
121 TTGAAAGAGA CAGAAAAAGC AAAATATCCA ACATTAGTGT
161 TTTACTGCAG GAAGAATAAG AAAAGAAATT CAAATCAACT
201 GGAGAATAAC CAGCCTACAG AGAGCTCCAC TGATCCAATC
241 AAAGAGAAAG GAGACCTAGA CATATCTGCA GGATCTCCAC
281 AGGATGGTGG GCAGAATTAG
The SPANX-N5 polypeptide having SEQ ID NO: 5 is encoded by the cDNA sequence provided below (SEQ ID NO:30):
1 ATGGAAAAGC CCACTTCAAG CACCAATGGG GAGAAGAGGA
41 AGAGCCCCTG TGACTCCAAC AGCAAAAATG ATGAGATGCA
81 GGAGACACCA AACAGGGACT TAGTCCTCGA ACCGAGTTTG
121 AAAAAGATGA AAACATCAGA ATATTCAACA GTATTAGTGT
161 TGTGCTACAG GAAGACTAAG AAAATACATT CAAATCAACT
201 GGAGAATGAC CAGTCCTGA
The SPANX-A1 polypeptide having SEQ ID NO:6 is encoded by the cDNA sequence provided below (SEQ ID NO:31):
1 AAGCCTGCCA CTGACATTGA AGAACCAATA TATACAATGG
41 ACAAACAATC CAGTGCCGGC GGGGTGAAGA GGAGCGTCCC
81 CTGTGATTCC AACGAGGCCA ACGAGATGAT GCCGGAGACC
121 CCAACTGGGG ACTCAGACCC GCAACCTGCT CCTAAAAAAA
161 TGAAAACATC TGAGTCCTCG ACCATACTAG TGGTTCGCTA
201 CAGGAGGAAC TTTAAAAGAA CATCTCCAGA GGAACTGCTG
241 AATGACCACG CCCGAGAGAA CAGAATCAAC CCCCTCCAAA
281 TGGAGGAGGA GGAATTCATG GAAATAATGG TTGAAATACC
321 TGCAAAGTAG CAAGAAGCTA CATCTCTCAA CCTTGGGCAA
361 TGAAAATAAA GTTTGAGAAG CTGA
The SPANX-A2 polypeptide having SEQ ID NO:7 is encoded by the cDNA sequence provided below (SEQ ID NO:32):
1 ACTGAGAAGA TTCAAAACCT ACAAAAGCCT GCCACTGACA
41 TTGAAGAACC AATATATACA ATGGACAAAC AATCCAGTGC
81 CGGCGGGGTG AAGAGGAGCG TCCCCTGTGA TTCCAACGAG
121 GCCAACGAGA TGATGCCGGA GACCCCAACT GGGGACTCAG
161 ACCCGCAACC TGCTCCTAAA AAAATGAAAA CATCTGAGTC
201 CTCGACCATA CTAGTGGTTC GCTACAGGAG GAACTTTAAA
241 AGAACATCTC CAGAGGAACT GCTGAATGAC CACGCCCGAG
281 AGAACAGAAT CAACCCCCTC CAAATGGAGG AGGAGGAATT
321 CATGGAAATA ATGGTTGAAA TACCTGCAAA GTAGCAAGAA
361 GCTACATCTC TCAACCTTGG GCAATGAAAA TAAAGTTTGA
401 GAAGCTGATG GCTGTGTAAA
The SPANX-B1 polypeptide having SEQ ID NO:8 is encoded by the cDNA sequence provided below (SEQ ID NO:33):
1 GTCACCAGGA GGGTATGCAT AGGGAGGGCA AGAGCTCTGG
41 GCCACTGCGA AGATTCAAAA GCTCCAAAAA CCTACTGTAG
81 ACATCGAAGA ACCAATATAT ACAATGGGCC AACAATCCAG
121 TGTCCGCAGG CTGAAGAGGA GCGTCCCCTG TGAATCCAAC
161 GAGGCCAACG AGGCCAATGA GGCCAACAAG ACGATGCCGG
201 AGACCCCAAC TGGGGACTCA GACCCGCAAC CTGCTCCTAA
241 AAAAATGAAA ACATCTGAGT CCTCGACCAT ACTAGTGGTT
281 CGCTACAGGA GGAACGTGAA AAGAACATCT CCAGAGGAAC
321 TGGTGAATGA CCACGCCCGA GAGAACAGAA TCAACCCCGA
361 CCAAATGGAG GAGGAGGAAT TCATAGAAAT AACGACTGAA
401 AGACCTAAAA AGTAGCAAGA AGCTACATCC CTCAAACTTC
441 GGCAATGAAA ATAAAGTTTG AGAAGCTGAA AA
The SPANX-B2 polypeptide having SEQ ID NO:9 is encoded by the cDNA sequence provided below (SEQ ID NO:34):
1 GTCACCAGGA GGGTATGCAT AGGGAGGGCA AGAGCTCTGG
41 GCCACTGCGA AGATTCAAAA GCTCCAAAAA CCTACTGTAG
81 ACATCGAAGA ACCAATATAT ACAATGGGCC AACAATCCAG
121 TGTCCGCAGG CTGAAGAGGA GCGTCCCCTG TGAATCCAAC
161 GAGGCCAACG AGGCCAATGA GGCCAACAAG ACGATGCCGG
201 AGACCCCAAC TGGGGACTCA GACCCGCAAC CTGCTCCTAA
241 AAAAATGAAA ACATCTGAGT CCTCGACCAT ACTAGTGGTT
281 CGCTACAGGA GGAACGTGAA AAGAACATCT CCAGAGGAAC
321 TGGTGAATGA CCACGCCCGA GAGAACAGAA TCAACCCCGA
361 CCAAATGGAG GAGGAGGAAT TCATAGAAAT AACGACTGAA
401 AGACCTAAAA AGTAGCAAGA AGCTACATCC CTCAAACTTC
441 GGCAATGAAA ATAAAGTTTG AGAAGCTGA
The SPANX-C polypeptide having SEQ ID NO: 10 is encoded by the cDNA sequence provided below (SEQ ID NO:35):
1 CGAAGATTCA AAACCTACAA AAGCCTGCCG CAGACATTGA
41 AGAACCAATA TATACAATGG ACAAACAATC CAGTGCCGGC
81 GGGGTGAAGA GGAGCGTCCC CTGTGAATCC AACGAGGTGA
121 ATGAGACGAT GCCGGAGACC CCAACTGGGG ACTCAGACCC
161 GCAACCTGCT CCTAAAAAAA TGAAAACATC TGAGTCCTCG
201 ACCATACTAG TGGTTCGCTA CAGGAGGAAC GTGAAAAGAA
241 CATCTCCAGA GGAACTGCTG AATGACCACG CCCGAGAGAA
281 CAGAATCAAC CCCCTCCAAA TGGAGGAGGA GGAATTCATG
321 GAAATAATGG TTGAAATACC TGCAAAGTAG CAAGAAGCTA
361 CATCTCTCAA CCTTGGGCAA TGAAAATAAA GTTTGAGAAG
401 CTGAAAAAAA AAAAAAAAAA AAAAA
The SPANX-D polypeptide having SEQ ID NO:11 is encoded by the cDNA sequence provided below (SEQ ID NO:36):
1 AAGCCTGCCG CTGACATTGA AGAACCAATA TATACAATGG
41 ACAAACAATC CAGTGCCGGC GGGGTGAAGA GGAGCGTCCC
81 CTGTGATTCC AACGAGGCCA ACGAGATGAT GCCGGAGACC
121 TCGAGTGGGT ACTCAGACCC GCAACCTGCT CCGAAAAAAC
161 TAAAAACATC TGAGTCCTCG ACCATACTAG TGGTTCGCTA
201 CAGGAGGAAC TTTAAAAGAA CATCTCCAGA GGAACTGGTG
241 AATGACCACG CCCGAAAGAA CAGAATCAAC CCCCTCCAAA
281 TGGAGGAGGA GGAATTCATG GAAATAATGG TTGAAATACC
321 TGCAAAGTAG CAAGAAGCTA CATCTCTCAA CCTTGGGCAA
361 TGACAATAAA GTTTGAGAAG CTGA
In another embodiment, the invention provides SPANX-N-specific nucleic acids, primers or probes. Such SPANX-N-specific nucleic acids, primers or probes can be used for detection of SPANX-N expression, for example, in testes and prostate cancer cells. As described and illustrated herein, expression of SPANX-N in non-testes tissues is an indication that such tissues are cancerous.
Any of the SPANX-N cDNAs (SEQ ID NO:26-30) provided above can serve as a SPANX-N-specific nucleic acid or probes. Moreover, the invention also provides primers or probes that are referred to herein as the FhuS-F and RhuS-R primers, which specifically hybridize to the SPANX-N subfamily of genes. The sequences of these SPANX-N specific primers are shown below:
FhuS-F: 5'-atggaacagccgacttcaag-3' (SEQ ID NO:37)
RhuS-R: 5'-tgagtctaggccttcgtcct-3' (SEQ ID NO:38)
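The primer pair above can be checked against the listed cDNAs in silico. The following is a minimal Python sketch, assuming exact-match priming and using only the first 280 nucleotides of SEQ ID NO:27 (SPANX-N2) as copied from the listing above. The function names `reverse_complement` and `amplicon` are hypothetical, and the sketch does not model mismatch tolerance, annealing temperature or any other PCR condition.

```python
# Illustrative sketch: locate the FhuS-F / RhuS-R primer sites on a template
# by exact string matching. Names and approach are assumptions for illustration.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Nucleotides 1-280 of SEQ ID NO:27 as listed above (spaces removed below).
N2_CDNA_1_280 = (
    "ATGGAACAGC CGACTTCAAG CACCAATGGG GAGAAGAGGA "
    "AGAGCCCCTG TGAATCCAAT AACAAAAAAA ATGATGAGAT "
    "GCAGGAGGCA CCGAACAGGG TCTTAGCCCC CAAACAGAGC "
    "TTGCAAAAGA CAAAAACAAT AGAATATCTA ACAATAATAG "
    "TGTATTACTA CAGGAAGCAT ACGAAAATAA ATTCAAATCA "
    "ACTGGAGAAG GACCAGTCCC GAGAGAACTC CATCAATCCC "
    "GTCCAAGAGG AGGAGGACGA AGGCCTAGAC TCAGCTGAAG"
).replace(" ", "")

FORWARD = "atggaacagccgacttcaag".upper()  # FhuS-F (SEQ ID NO:37)
REVERSE = "tgagtctaggccttcgtcct".upper()  # RhuS-R (SEQ ID NO:38)


def amplicon(template: str, fwd: str, rev: str):
    """Return the template segment bounded by both primers, or None."""
    start = template.find(fwd)
    if start == -1:
        return None
    rev_site = reverse_complement(rev)  # the reverse primer anneals to the other strand
    site = template.find(rev_site, start + len(fwd))
    if site == -1:
        return None
    return template[start:site + len(rev_site)]


product = amplicon(N2_CDNA_1_280, FORWARD, REVERSE)
print(len(product) if product else "no exact-match product")
```

Exact matching is a deliberately strict simplification: a template that differs from a primer at a single position would be reported as having no product here, even though such a primer may still anneal in practice.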
The SEQ ID NO:37-38 primers are specific for the SPANX-N subfamily and distinguish these genes from the SPANX-A1, -A2, -B, -C and -D gene subfamily. These primers were used for detecting SPANX-N expression and the products detected were confirmed to be SPANX-N nucleic acids by sequencing. In addition, the following probes can be used to detect SPANX-N transcripts and expression patterns. The following nucleic acids are also useful for nucleic acid amplification of specific SPANX-N RNA and DNA sequences, including specific portions of SPANX-N RNA and DNA sequences.
SPANX-N1
Exon1: Particularly useful for nucleic acid amplification (1,946 bp product)
N1ex1-F 5'-aggcttgaagcttgtaccct-3' (SEQ ID NO:137)
N1ex1-R 5'-acaactttcgttaaccgcca-3' (SEQ ID NO:138)
Exon1: Particularly useful for nucleic acid sequencing
SeqN1ex1-R 5'-acaagacggacaaaggtcca-3' (SEQ ID NO:139)
SeqPrimSX-F5 5'-tgggacactgcctgtatgat-3' (SEQ ID NO:140)
Exon2: Particularly useful for nucleic acid amplification (1,779 bp)
N1ex2-F 5'-agggaagtgaatacaccaga-3' (SEQ ID NO:141)
N1ex2-R 5'-aatggtaggtccctgcagat-3' (SEQ ID NO:142)
Exon2: Particularly useful for nucleic acid sequencing
SeqN2ex2-F 5'-taacaggtgaccctacccat-3' (SEQ ID NO:143)
SeqN2ex2-R 5'-gatcactggagaaggaggaa-3' (SEQ ID NO:144)
SPANX-N2
Exon1: Particularly useful for nucleic acid amplification (3,710 bp)
N2ex1-F 5'-cttactgtgtttgatgtggca-3' (SEQ ID NO:145)
N2ex1-R 5'-accagtcggattccagaaaat-3' (SEQ ID NO:146)
Exon1: Particularly useful for nucleic acid sequencing
Seq1-F 5'-tcctcaacctgcattccttc-3' (SEQ ID NO:147)
Seq1-R 5'-catcagacaacaggcagaga-3' (SEQ ID NO:148)
Exon2: Particularly useful for nucleic acid amplification (4,143 bp)
N2ex2-F 5'-tgagcgagtactccagaga-3' (SEQ ID NO:149)
N2ex2-R 5'-ctggttgtgacgtactatact-3' (SEQ ID NO:150)
Exon2: Particularly useful for nucleic acid sequencing
Seq2-F 5'-cctcaacctgcattccttct-3' (SEQ ID NO:151)
SeqPrimSX-RR 5'-ctacctcttcccttcccttc-3' (SEQ ID NO:152)
SPANX-N3
Exon1: Particularly useful for nucleic acid amplification (4,593 bp)
N3ex1-F 5'-aggttcgcttggtttgttag-3' (SEQ ID NO:153)
N3ex1-R 5'-acagcaactgaccaatcttc-3' (SEQ ID NO:154)
Exon1: Particularly useful for nucleic acid sequencing
SeqPrimSX-F5 5'-tgggacactgcctgtatgat-3' (SEQ ID NO:155)
Seq1N3-R 5'-gtctgcagtattcctgtgtt-3' (SEQ ID NO:156)
Exon2: Particularly useful for nucleic acid amplification (923 bp)
N3ex2-F 5'-cagctgcctggaaaggggaa-3' (SEQ ID NO:157)
N3ex2-R 5'-ttcccttccccaccccacac-3' (SEQ ID NO:158)
Exon2: Particularly useful for nucleic acid sequencing
SeqPrimSX-RR 5'-cacctgcattccttctcata-3' (SEQ ID NO:159)
Seq2N3-F 5'-cacctgcattccttctcata-3' (SEQ ID NO:160)
SPANX-N4
Exon1: Particularly useful for nucleic acid amplification (1,515 bp)
N4ex1-F 5'-ctccccttccacactaaatg-3' (SEQ ID NO:161)
N4ex1-R 5'-tcagtctaaactgcactctct-3' (SEQ ID NO:162)
Exon1: Particularly useful for nucleic acid sequencing
SeqPrimSX-F5 5'-tgggacactgcctgtatgat-3' (SEQ ID NO:163)
Seq1N4-R 5'-tctgcaggtgtctgcagtat-3' (SEQ ID NO:164)
Exon2: Particularly useful for nucleic acid amplification (2,245 bp)
N4ex2-F 5'-agggaagcaagtacctcaga-3' (SEQ ID NO:165)
N4ex2-R 5'-actgcaagctctgtctctag-3' (SEQ ID NO:166)
Exon2: Particularly useful for nucleic acid sequencing
Seq2N4-F 5'-tgattccacctgctcttct-3' (SEQ ID NO:167)
Seq2N4-R 5'-cctatcttttcccttccctt-3' (SEQ ID NO:168)
SPANX-N1-5: Detecting SPANX-N expression in humans (e.g. by RT-PCR)
spaxn1/n5-F (180 bp) 5'-aagaggaagagcccctgtga-3' (SEQ ID NO:169)
spaxn1/n5-R (180 bp) 5'-ggtcattctccagttgatttga-3' (SEQ ID NO:170)
In addition, the SPANX-N promoter sequences can promote expression in selected cell and tissue types, including testis tissue, and in the case of the SPANX-N1 gene, in cancer cells. As shown herein, while SPANX-N4 transcripts were detected only in testis, SPANX-N2-5 transcripts were detected in several normal nongametogenic tissues (placenta, prostate, proximal and distal colon, lung, and cervix), although the levels of this SPANX-N expression in these tissues were lower than that observed in testis (FIG. 7A and Table 10). In contrast, SPANX-N1 was not expressed in normal, nongametogenic tissues. In other words, detectable levels of SPANX-N1 are not observed in normal tissues. Therefore, the SPANX-N1, SPANX-N2, SPANX-N3, SPANX-N4 and SPANX-N5 promoters have utility for expressing gene products in testis, while the SPANX-N2, SPANX-N3, SPANX-N4 and SPANX-N5 (but not SPANX-N1) promoters have utility for expressing gene products in tissues such as placenta, prostate, proximal and distal colon, lung, and cervix. However, while SPANX-N1 is not expressed in normal tissues (except testis and sperm), it is expressed in cancer cells. Substantially exclusive expression of SPANX-N1 was observed in three primary uterine cancers and four melanoma cell lines (Table 10 and FIG. 7B-C). Therefore, while most SPANX genes are not expressed in cancer tissues, SPANX-N1 is expressed in a variety of cancer tissues including uterine cancer cells and melanoma cells. Hence, SPANX-N1 is a diagnostic marker for cancer.
Moreover, for example, the SPANX-N1 promoter can be used to promote expression of anti-cancer gene products.
The promoters of the SPANX-N1 through SPANX-N5 genes have the following sequences.
SPANX-N1 Promoter (SEQ ID NO:206):
1 TCCATGTGAA CCATGAACAT TAAACATGGA GAAATGAGGA
41 GCGGCGGCAG ATCGGTTTGG GATGCATCTT CAGGGGATGC
81 TGAAACAACA ACAGCATTTG GTTTCCTCTA CATCCCTGTC
121 ACCCCTCCCC CACAAGCCCA GGGATTGGTC AGCAGTGGTG
161 CTTCGTGATG TCAAAGCCAC CCTAGGACTG CCATTGGCTG
201 GGACACTGCC TGTATGATCA AACAAAGCTC AAGGGTGTGG
241 CTTTGCCTTG TCACCAGGAG GGTATATATA GGGAGGGCAA
281 GAGCTCTGGG ACATCCTCCT GGGAAGCTTC AATACAGCTG
321 TGCAAGTCTG GAGTCTACAA GAGCCTACTA TAGACATTCT
361 ACAACCAACC AGAATC
SPANX-N2 Promoter (SEQ ID NO:207):
1 TCCATGTGAA CCATGAACAT TAAACATGGA GAAATGAGCA
41 GCAGCAGATT AGTTTGGGAT GCATCTTCAG GGGATGCTGA
81 AACAACAACA GCATTTGGTT TCCTCTACAC CCCTGTCATC
121 CGTCCCCCAC AAGCCCAGGG AGTGGTCAGC AGTGGTGCTT
161 TGTGATGTCT AAGCCACCCT AGGACTGCCA TTGGCTGGGA
201 CACTGCCTGT ATGATCAAAC AAAGCTCAAG AGTGTGGCTT
241 TGCCTTGCCA CCAGGAAGGT ATACATAGGG AGGGCCAGAG
281 CTCTGGGACA TCCTCCTGGC AAGCTTCAAT ATAGCTGTGG
321 AAGTCTGCAG TCTACAAGAG CCTACTATAG ACATTCTACA
361 ACCAAGCAGA ATC
SPANX-N3 Promoter (SEQ ID NO:208):
1 TCCATGTGAA CCGTGAACAT TAAACATGGA GAAATGAGGA
41 GTGGTGGCAG ATCAGTTTGG GATGCATCTT CAGGGGATGC
81 TGAAACAACA ACAGCATTTG GTTTCCTCTA CATCCCTGTC
121 ACCCCTCCCC CACAAGCCCA GGGAGTGGTC AGCAGTGGTG
161 CTTTGTGATG TCAAAGCCAC CTTAGGACCG CCATTGGCTG
201 GGACACTGCC TGTATGATCA AACAAAGCTC AAGGGTGTGG
241 CTTTGCCTTG TCACCAGGAG GGTATATATA GGGAGGGCAA
281 GAGCTCTGGG ACATCCCACT GGGAAGCTTC AACATAGCTG
321 TGGAAGTCTG CAGTCTACAG GAGCCTACTA TAGACATTCT
361 ACAACCAACC AGAATCATGG AACAGCC
SPANX-N4 Promoter (SEQ ID NO:209):
1 TCCATGTGAA CCATGAACAT TAAACATGGA GAAATGAGGA
41 GCGGAGGCAG ATCAGTTTGG GATGCATCTT CAGGGGATGC
81 TGAAACAACA ACAGCATTTG GTTTCCTCTA CACCCCTTTC
121 ACCCGTCCCC CACAAACCCA GGGAGTTGTC AGAGGTGGTT
161 CTTTGTGATG CCAAAGCCAC CCTAGGACTA CCATTGGCTG
201 GGACACTGCC TGTATGATCA AACAAAGCTC AAGGGTGTGG
241 CTTCGTCTTG CTCCCAGGAG GGTATATATA CAGGGTGGGC
281 AAAAGCTCTG GGACAGCCCA CTGGAAAGCT TCAATACAGC
321 TGTGGAAATC TGCACCCTAG AAGATCCTAG TACAGAAATT
361 CTACAACCAA CCATAATCAT GGAAGAGCC
SPANX-N5 Promoter (SEQ ID NO:210):
1 TCCATGTGAA CCATGAACAT TAAACATGGA GAAATGAGGA
41 GCGGCAGCAG ATCAGTTTGG GATGCGTCTT CAGGGGATGC
81 TGAAACAACA GCAGCATTTG GTTTCCTCTA CACCCCTGTC
121 ACCCCTCCCC CACAAGCCCA GGGAGTGGTC AGCAGTGGTG
161 CTTTGTGATG TCTAAGCCAC CCTTGGACTG CCATTGGCTG
201 GGACACTGCC TGTATGATCA AACAAAGCTC AAGGGTGTGG
241 CTTTGCCTTG TCACCAGGAG GGTATATATA GGGAGGGCAA
281 GAGCTCTGGG CCACTGGGAA GCTTCAATAT AGCTGTGGAA
321 GTCTGGACTC TACAAGATCC TGCTGTAGAC ATTCAACAAC
361 CAACCAGAAT C
The genes, promoters and nucleotide sequences of the invention include both the naturally occurring sequences as well as variants thereof. For example, the invention contemplates SPANX-N nucleic acids from mammalian species other than humans.
The following terms are used to describe the sequence relationships between two or more nucleic acids or polynucleotides: (a) "reference sequence," (b) "comparison window," (c) "sequence identity," (d) "percentage of sequence identity," and (e) "substantial identity".
As used herein, "reference sequence" is a defined sequence used as a basis for sequence comparison. A reference sequence may be a subset or the entirety of a specified sequence; for example, as a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene sequence.
As used herein, "comparison window" makes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Generally, the comparison window is at least 15 or 17 contiguous nucleotides in length, and optionally can be 20, 30, 40, 50, 100, or longer. Those of skill in the art understand that to avoid a high similarity to a reference sequence due to inclusion of gaps in the polynucleotide sequence, a gap penalty is typically introduced and is subtracted from the number of matches.
Methods of alignment of sequences for comparison are available in the art. Thus, the determination of percent identity between any two sequences can be accomplished using a mathematical algorithm. Preferred, non-limiting examples of such mathematical algorithms are the algorithm of Myers and Miller (CABIOS, 4, 11 (1988)); the local homology algorithm of Smith et al. (Adv. Appl. Math., 2, 482 (1981)); the homology alignment algorithm of Needleman and Wunsch (JMB, 48, 443 (1970)); the search-for-similarity method of Pearson and Lipman (Proc. Natl. Acad. Sci. USA, 85, 2444 (1988)); and the algorithm of Karlin and Altschul (Proc. Natl. Acad. Sci. USA, 87, 2264 (1990)), modified as in Karlin and Altschul (Proc. Natl. Acad. Sci. USA, 90, 5873 (1993)).
Computer implementations of these mathematical algorithms can be utilized for comparison of sequences to determine sequence identity. Such implementations include, but are not limited to: CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, California); the ALIGN program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Version 8 (available from Genetics Computer Group (GCG), 575 Science Drive, Madison, Wisconsin, USA). Alignments using these programs can be performed using the default parameters. Further information on the CLUSTAL program can be found in Higgins et al., Gene, 73, 237 (1988); Higgins et al., CABIOS, 5, 151 (1989); Corpet et al., Nucl. Acids Res., 16, 10881 (1988); Huang et al., CABIOS, 8, 155 (1992); and Pearson et al., Meth. Mol. Biol., 24, 307 (1994). The ALIGN program is based on the algorithm of Myers and Miller, CABIOS, 4, 11 (1988). The BLAST programs of Altschul et al. (JMB, 215, 403 (1990)) are based on the algorithm of Karlin and Altschul, supra.
Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (see website at ncbi.nlm.nih.gov). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold. These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when the cumulative alignment score falls off by the quantity X from its maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or the end of either sequence is reached.
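The seed-and-extend procedure described above can be sketched in a few lines. The following Python example is a simplified illustration only, not the NCBI implementation: it uses exact word matches as seeds, extends without gaps, and omits the statistical evaluation. The function names and the value chosen for X (`x_drop=20`) are assumptions for illustration; W=11, M=5 and N=-4 are nucleotide settings quoted elsewhere in this description.

```python
# Illustrative sketch of ungapped seed-and-extend scoring for nucleotides.
# Function names and the x_drop value are hypothetical.
def find_seeds(query, subject, w):
    """Yield (i, j) where query[i:i+w] exactly matches subject[j:j+w]."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            yield i, j


def extend(query, subject, i, j, step, start_score, match, mismatch, x_drop):
    """Extend in one direction (step = +1 or -1); return (score gain, length)."""
    score = best = start_score
    best_len = n = 0
    while 0 <= i < len(query) and 0 <= j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        n += 1
        if score > best:
            best, best_len = score, n
        elif best - score >= x_drop or score <= 0:
            break  # fell off by X from the maximum, or dropped to zero or below
        i += step
        j += step
    return best - start_score, best_len


def best_hsp(query, subject, w=11, match=5, mismatch=-4, x_drop=20):
    """Return (score, query_start, subject_start, length) of the best HSP."""
    best = None
    for i, j in find_seeds(query, subject, w):
        seed = w * match
        r_gain, r_len = extend(query, subject, i + w, j + w, 1,
                               seed, match, mismatch, x_drop)
        l_gain, l_len = extend(query, subject, i - 1, j - 1, -1,
                               seed + r_gain, match, mismatch, x_drop)
        hsp = (seed + r_gain + l_gain, i - l_len, j - l_len, w + l_len + r_len)
        if best is None or hsp > best:
            best = hsp
    return best


query = "ATGGAACAGCCGACTTCAAG"
subject = query[:15] + "A" + query[16:]  # one mismatch introduced at index 15
print(best_hsp(query, subject))
```

A larger X lets the extension pass through an isolated mismatch and continue into the matching region beyond it, whereas a small X stops the extension at the first mismatch.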
In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences. One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a test nucleic acid sequence is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid sequence to the reference nucleic acid sequence is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
To obtain gapped alignments for comparison purposes, Gapped BLAST (in BLAST 2.0) can be utilized as described in Altschul et al. (1997). Alternatively, PSI-BLAST (in BLAST 2.0) can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al., supra. When utilizing BLAST, Gapped BLAST, PSI-BLAST, the default parameters of the respective programs (e.g. BLASTN for nucleotide sequences, BLASTX for proteins) can be used. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix. See website at ncbi.nlm.nih.gov. Alignment may also be performed manually by inspection. For purposes of the present invention, comparison of nucleotide sequences for determination of percent sequence identity to the promoter sequences disclosed herein is preferably made using the BlastN program (version 1.4.7 or later) with its default parameters or any equivalent program. By "equivalent program" is intended any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by the preferred program.
As used herein, "percentage of sequence identity" means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity.
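The calculation just described can be written out directly. The following is a minimal Python sketch, assuming the two sequences have already been aligned to equal length with "-" marking gap positions; the function name `percent_identity` and the gap symbol are illustrative choices, not part of the disclosure. Gap positions count toward the window length but never count as matches.

```python
# Illustrative sketch: percentage of sequence identity over a comparison
# window, given two pre-aligned sequences of equal length.
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Matched positions divided by window length, multiplied by 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# First 20 nucleotides of SEQ ID NO:26 and SEQ ID NO:27 as listed above.
n1_window = "ATGGAACAGCCCACTTCAAG"
n2_window = "ATGGAACAGCCGACTTCAAG"
print(percent_identity(n1_window, n2_window))
```

In this 20-nucleotide window the two listed sequences differ at a single position, so 19 of 20 positions match.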
The term "substantial identity" of polynucleotide sequences means that a polynucleotide comprises a sequence that has at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, or 79%, preferably at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, or 89%, more preferably at least 90%, 91%, 92%, 93%, or 94%, and most preferably at least 95%, 96%, 97%, 98%, or 99% sequence identity, compared to a reference sequence using one of the alignment programs described using standard parameters. One of skill in the art will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning, and the like. Substantial identity of amino acid sequences for these purposes normally means sequence identity of at least 70%, more preferably at least 80%, 90%, and most preferably at least 95%.
Another indication that nucleotide sequences are substantially identical is if two molecules hybridize to each other under stringent conditions. Generally, stringent conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. However, stringent conditions encompass temperatures in the range of about 1°C to about 20°C, depending upon the desired degree of stringency as otherwise qualified herein. Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides they encode are substantially identical. This may occur, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code. One indication that two nucleic acid sequences are substantially identical is when the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the polypeptide encoded by the second nucleic acid.
For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
As noted above, another indication that two nucleic acid sequences are substantially identical is that the two molecules hybridize to each other under stringent conditions. The phrase "hybridizing specifically to" refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA. "Bind(s) substantially" refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target nucleic acid sequence.
"Stringent hybridization conditions" and "stringent hybridization wash conditions" in the context of nucleic acid hybridization experiments such as Southern and Northern hybridizations are sequence dependent, and are different under different environmental parameters. Longer sequences hybridize specifically at higher temperatures. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Specificity is typically the function of post-hybridization washes, the critical factors being the ionic strength and temperature of the final wash solution. For DNA-DNA hybrids, the Tm can be approximated from the equation of Meinkoth and Wahl (Anal. Biochem., 138, 267 (1984)): Tm = 81.5°C + 16.6 (log M) + 0.41 (%GC) - 0.61 (% form) - 500/L; where M is the molarity of monovalent cations, %GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. Tm is reduced by about 1°C for each 1% of mismatching; thus, Tm, hybridization, and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with >90% identity are sought, the Tm can be decreased 10°C. Generally, stringent conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence and its complement at a defined ionic strength and pH. However, severely stringent conditions can utilize a hybridization and/or wash at 1, 2, 3, or 4°C lower than the thermal melting point (Tm); moderately stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10°C lower than the thermal melting point (Tm); low stringency conditions can utilize a hybridization and/or wash at 11, 12, 13, 14, 15, or 20°C lower than the thermal melting point (Tm).
Using the equation, hybridization and wash compositions, and desired T, those of ordinary skill will understand that variations in the stringency of hybridization and/or wash solutions are inherently described. If the desired degree of mismatching results in a T of less than 45°C (aqueous solution) or 32°C (formamide solution), it is preferred to increase the SSC concentration so that a higher temperature can be used. An extensive guide to the hybridization of nucleic acids is found in Tijssen (Laboratory Techniques in Biochemistry and Molecular Biology Hybridization with Nucleic Acid Probes, part I chapter 2 "Overview of principles of hybridization and the strategy of nucleic acid probe assays" Elsevier, New York (1993)). Generally, highly stringent hybridization and wash conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
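The Tm approximation of Meinkoth and Wahl and the mismatch adjustment described above lend themselves to a short worked example. The following Python sketch assumes that "log" in the equation denotes the base-10 logarithm; the function names and the example input values are illustrative and are not taken from the disclosure.

```python
# Illustrative sketch: Tm estimate for DNA-DNA hybrids and a wash temperature
# derived from it. Function names and example inputs are hypothetical.
import math


def tm_dna_hybrid(monovalent_molar: float, gc_percent: float,
                  formamide_percent: float, length_bp: int) -> float:
    """Tm = 81.5 + 16.6 log10(M) + 0.41 (%GC) - 0.61 (% form) - 500/L, in °C."""
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def wash_temperature(tm: float, mismatch_percent: float = 0.0,
                     degrees_below_tm: float = 5.0) -> float:
    """Lower Tm by about 1°C per 1% mismatch, then apply the chosen offset."""
    return tm - mismatch_percent - degrees_below_tm


# Example inputs chosen for illustration: 1 M monovalent cation, 50% GC,
# 50% formamide, and a 500 bp hybrid.
tm = tm_dna_hybrid(1.0, 50.0, 50.0, 500)
print(round(tm, 1), round(wash_temperature(tm, mismatch_percent=10.0), 1))
```

With these inputs the terms are 81.5 + 0 + 20.5 - 30.5 - 1.0. The `degrees_below_tm` argument can be set to values in the ranges given above for severely stringent, moderately stringent or low stringency conditions.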
An example of highly stringent wash conditions is 0.15 M NaCl at 720C for about 15 minutes. An example of stringent wash conditions is a 0.2X SSC wash at 650C for 15 minutes (see, Sambrook and Russell, infra, for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An example medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is IX SSC at 45 °C for 15 minutes. An example low stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4-6X SSC at 400C for 15 minutes. For short probes {e.g., about 10 to 50 nucleotides), stringent conditions typically involve salt concentrations of less than about 1.5 M, more preferably about 0.01 to 1.0 M, Na ion concentration (or other salts) at pH 7.0 to 8.3, and the temperature is typically at least about 30°C and at least about 6O0C for long probes (e.g., >50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. In general, a signal to noise ratio of 2X (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization. Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the proteins that they encode are substantially identical. This occurs, e.g. , when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code.
Very stringent conditions are selected to be equal to the Tm for a particular probe. An example of stringent conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or Northern blot is 50% formamide, e.g., hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 0.1X SSC at 60 to 65°C. Exemplary low stringency conditions include hybridization with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulfate) at 37°C, and a wash in 1X to 2X SSC (20X SSC = 3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55°C. Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1.0 M NaCl, 1% SDS at 37°C, and a wash in 0.5X to 1X SSC at 55 to 60°C.
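The wash conditions above are stated in multiples of SSC, with 20X SSC defined as 3.0 M NaCl/0.3 M trisodium citrate. As a minimal illustrative sketch (the function name `ssc_molarity` is hypothetical and not part of the specification), the salt concentration of any wash strength follows by linear scaling from that definition:

```python
# Composition of 20X SSC as given in the text: 3.0 M NaCl / 0.3 M trisodium citrate.
NACL_M_AT_20X = 3.0
CITRATE_M_AT_20X = 0.3


def ssc_molarity(fold):
    """Return (NaCl molarity, trisodium citrate molarity) for an SSC solution
    at the given fold strength, by linear dilution from 20X SSC."""
    if fold <= 0:
        raise ValueError("fold strength must be positive")
    scale = fold / 20.0
    return NACL_M_AT_20X * scale, CITRATE_M_AT_20X * scale


# Wash strengths mentioned in the passage above.
for fold in (0.1, 0.2, 0.5, 1.0, 2.0):
    nacl, citrate = ssc_molarity(fold)
    print(f"{fold:g}X SSC: {nacl * 1000:.1f} mM NaCl, {citrate * 1000:.2f} mM citrate")
```

Under this scaling, 1X SSC corresponds to 0.15 M NaCl, the same sodium chloride concentration named in the highly stringent wash example.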
Therapeutic Gene Products including Anti-cancer Gene Products
The SPANX-N1 through SPANX-N5 promoter sequences of the invention can be used to promote expression of therapeutic gene products in a tissue-specific manner. For example, the SPANX-N1 through SPANX-N5 promoters can be operably linked to nucleic acids that encode beneficial and/or therapeutic gene products to form expression cassettes and/or expression vectors useful for promoting expression of those beneficial and/or therapeutic gene products in tissues where the SPANX-N1 through SPANX-N5 promoters are active.
Nucleic acids encoding beneficial and/or therapeutic gene products that can be operably linked to the promoters of the invention include any available nucleic acids selected by one of skill in the art. For example, the beneficial and/or therapeutic gene products that can be encoded include cytokines, interferons, growth factors, hormones, cell growth inhibitors, cell cycle regulators, apoptosis regulators, cytotoxins, cytolytic viruses, antibodies and the like.
Thus, in one embodiment the nucleic acid encodes interleukins and cytokines, such as interleukin 1 (IL-1), IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IFN-alpha, IFN-beta, IFN-gamma, angiostatin, thrombospondin, endostatin, METH-1, METH-2, Flk2/Flt3 ligand, GM-CSF, G-CSF, M-CSF, and tumor necrosis factor (TNF).
Interferons (IFNs) are soluble proteins that originally were found to induce antiviral activity in target cells. IFNs are now known to inhibit cell division and modulate the immune response. IFN-alpha produces an overall response rate of 20% in advanced melanoma and is associated with a 42% improvement in the fraction of patients with high risk melanoma who are disease-free.
In various embodiments of the invention, the melanoma differentiation associated protein 7 (MDA7) may be encoded in the nucleic acids linked to the SPANX-N promoters of the invention. MDA7 was identified following treatment of melanoma cells with interferon-alpha and mezerin. In particular, Jiang and Fisher noted loss of proliferative ability and terminal differentiation (Jiang et al., Proc. Natl. Acad. Sci. USA, 93:9160-9165 (1996)). Jiang and Fisher developed a novel subtraction hybridization scheme in human melanoma cells and this resulted in the identification and cloning of a series of melanoma-differentiation-associated (MDA) genes implicated in growth-controlled differentiation and apoptosis. One of the MDA genes identified, MDA7, was noted to be a novel gene and expression of this gene correlated with the induction of terminal differentiation in human melanoma cells (Jiang et al., 1996; Jiang et al., Oncogene, 11:2477-2486 (1995)). The MDA7 gene was noted to be expressed at high levels in proliferating normal melanocytes, but the expression was decreased as disease progressed to metastatic disease. Jiang et al. (1995 and 1996) subsequently demonstrated that, when MDA7 was expressed in a wide variety of tumor cells, this resulted in growth suppression and apoptosis. This has subsequently been confirmed by several additional groups. In addition, several groups have confirmed that the MDA7 gene effectively induces cell death in tumor cells with no significant toxicity to normal cells (Saeki et al., Gene Ther., 7:2051-2057 (2000); Saeki et al., Oncogene, 21:4558-4566 (2002)). The MDA7 gene was recently mapped to chromosome 1q32, an area containing a cluster of genes associated with the IL-10 family of cytokines (Mhashilkar et al., Mol. Med., 7:271-282 (2001)). MDA7 has now been classified as interleukin-24 and has been demonstrated to bind to the IL-20 and IL-22 receptors, and subsequently mediate cell signaling. Because of its potent antitumor activity and the apparent selectivity for cancer cells without toxicity to normal cells, this gene has been proposed as a novel tumor suppressor gene that may be effective in the treatment of cancer.
In other embodiments, the nucleic acids linked to the promoters of the invention encode a hormone. For example, the following hormones or steroids can be used in the present invention: insulin, somatotropin, gonadotropin, ACTH, CGH, or gastrointestinal hormones such as secretin.
In other embodiments, therapeutic gene products encoded by nucleic acids linked to the promoters of the invention include plant-, fungus-, or bacteria-derived toxins such as ricin A-chain (Burbage, Leuk. Res., 21(7):681-690 (1997)), a ribosome inactivating protein, α-sarcin, aspergillin, restrictocin, a ribonuclease, diphtheria toxin A (Masuda et al., Mol. Cell. Biol., 17:2066-2075 (1997); Lidor, Am. J. Obstet. Gynecol., 177(3):579-585 (1997)), pertussis toxin A subunit, E. coli enterotoxin A subunit, cholera toxin A subunit, and Pseudomonas toxin C-terminal. It has been demonstrated that transfection of a plasmid containing a fusion protein regulatable diphtheria toxin A chain gene was cytotoxic for cancer cells. Thus, gene transfer of regulated toxin genes might also be applied to the treatment of diseases (Masuda et al., Mol. Cell. Biol., 17:2066-2075 (1997)).
Chemokines also may be used in the nucleic acids linked to promoters of the present invention. Chemokines generally act as chemoattractants to recruit immune effector cells to the site of chemokine expression. It may be advantageous to express a particular chemokine gene in combination with, for example, a cytokine gene, to enhance the recruitment of other immune system components to the site of treatment. Such chemokines include RANTES, MCAF, MIP1-alpha, MIP1-beta, and IP-10. The skilled artisan will recognize that certain cytokines are also known to have chemoattractant effects and could also be classified under the term chemokines.
In another embodiment, nucleic acids encoding cell cycle regulators can be operably linked to the promoters of the invention. Such cell cycle regulators include p27, p16, p21, p57, p18, p73, p19, p15, E2F-1, E2F-2, E2F-3, p107, p130 and E2F-4. Other cell cycle regulators include anti-angiogenic proteins, such as soluble Flt1 (dominant negative soluble VEGF receptor), soluble Wnt receptors, soluble Tie2/Tek receptor, soluble hemopexin domain of matrix metalloprotease 2 and soluble receptors of other angiogenic cytokines (e.g., VEGFR1/KDR, VEGFR3/Flt4, both VEGF receptors). Moreover, nucleic acids operably linked to the promoters of the invention can encode inducers of apoptosis, such as Bax, Bak, Bcl-Xs, Bad, Bim, Bik, Bid, Harakiri, Ad E1B, Bad, ICE-CED3 proteases, TRAIL, SARP-2 and apoptin.
In another embodiment, tumor suppressors may also be encoded in nucleic acids operably linked to the promoters of the present invention. Such tumor suppressors include, but are not limited to, p53, p16, CCAM, p21, p15, BRCA1, BRCA2, IRF-1, PTEN (MMAC1), RB, APC, DCC, NF-1, NF-2, WT-1, MEN-I, MEN-II, zac1, p73, VHL, FCC, MCC, DBCCR1, DCP4 and p57.
In yet another embodiment, an antibody fragment or a single-chain antibody can be encoded in nucleic acids linked to the promoters of the invention. Methods for the production of single-chain antibodies are described herein and are available in the art. A single-chain antibody is created by fusing together the variable domains of the heavy and light chains using a short peptide linker, thereby reconstituting an antigen binding site on a single molecule. Single-chain antibody variable fragments (scFvs), in which the C-terminus of one variable domain is tethered to the N-terminus of the other via a 15 to 25 amino acid peptide or linker, have been developed without significantly disrupting antigen binding or specificity of the binding. These scFvs lack the constant regions (Fc) present in the heavy and light chains of the native antibody. Antibodies capable of binding to a wide variety of molecules are contemplated, including antibodies that bind SPANX polypeptides, as described herein. However, the antibodies and antibody fragments can bind to oncogenes, growth factors, hormones, enzymes, transcription factors, receptors, viral proteins and the like. Also contemplated are secreted antibodies, targeted to serum, against angiogenic factors (VEGF/VSP, beta-FGF, alpha-FGF and others) and endothelial antigens necessary for angiogenesis (i.e., V3 integrin). Specifically contemplated are growth factors such as transforming growth factor and platelet derived growth factor.
Particular oncogenes that are targets for such antibodies and/or antibody fragments include ras, myc, neu, raf, erb, src, fms, jun, trk, ret, hst, gsp, bcl-2 and abl. Also contemplated to be useful will be anti-apoptotic genes and angiogenesis promoters.
It may be advantageous to combine portions of genomic DNA with cDNA or synthetic sequences to generate specific constructs. For example, where an intron is desired in the ultimate construct, a genomic clone will need to be used. The cDNA or a synthesized polynucleotide may provide more convenient restriction sites for the remaining portion of the construct and, therefore, would be used for the rest of the sequence.
In certain embodiments, cytolytic or oncolytic viral proteins may be encoded in a nucleic acid that is operably linked to a promoter of the invention. The cell will typically localize in a tumor microenvironment where the viral product is expressed by the promoter. In some cases, an entire viral genome may be expressed and the virus may infect the surrounding cells. In certain embodiments the virus will selectively or preferentially lyse or kill hyperproliferative or tumor cells. Cytolytic or oncolytic viruses are known. Examples of oncolytic viruses include mutated adenovirus (Heise et al., Nat. Med., 3:639-645 (1997)), mutated vaccinia virus (Gnant et al., Cancer Res., 59:3396-3403 (1999)) and mutated reovirus (Coffey et al., Science, 282:1332-1334 (1998)). Examples of viral vectors for use in gene therapy include mutated vaccinia virus (Lattime et al., Semin. Oncol., 23:88-100 (1996)), mutated herpes simplex virus (Toda et al., Hum. Gene Ther., 9:2177-2185 (1998)), mutated adenovirus (U.S. Pat. No. 5,698,443) and mutated retroviruses (Anderson, Nature, 392(Suppl.):25-30 (1998)), each of which is incorporated herein by reference.
Moreover, it is contemplated that any one particular construct or expression cassette that includes a promoter of the invention may be combined with any other construct or expression cassette, either in the same or a different expression vector. In many therapies, it will be advantageous to provide more than one functional therapeutic. Such "combined" therapies may have particular import in treating multiple aspects of a condition, disease, or other abnormal physiology, for example, when treating multidrug resistant (MDR) cancers. Thus, one aspect of the present invention utilizes a combination of expression cassettes, each encoding a beneficial gene product, wherein at least one of the gene products is operably linked to a promoter of the invention. This combination permits expression of the beneficial agent(s) in an appropriate site in a tissue, organ or organism for treatment of diseases, so that both agents can beneficially operate to optimally treat the disease.
The present invention also relates to a process for treating cancer comprising operably linking a nucleic acid that encodes an anti-cancer or other beneficial gene product to a promoter of the invention such that expression of the anti-cancer or beneficial gene product suppresses the cancer. The promoter is selected from the group consisting of the sequences of SEQ ID NO:206-210. For cancer treatment the anti-cancer or beneficial gene product is generally operably linked to an N1 promoter, for example, a promoter with SEQ ID NO:206.
Nucleic Acids that Inhibit the Function of SPANX-N mRNA
Nucleic acids that can inhibit the functioning of SPANX RNA include small interfering RNAs (siRNAs), ribozymes, antisense nucleic acids, and the like. In one embodiment, prostate cancer can be treated by administering to a mammal a nucleic acid that can inhibit the functioning of an SPANX RNA. In another embodiment, the nucleic acid that inhibits the function of SPANX-N mRNA can be operably linked to a SPANX-N promoter to generate an expression cassette useful for inhibiting production of SPANX-N polypeptides. Nucleic acids that can inhibit the function of an SPANX RNA can be generated from coding and non-coding regions of the SPANX gene. However, nucleic acids that can inhibit the function of an SPANX RNA are often selected to be complementary to sequences near the 5' end of the coding region of the RNA. Hence, in some embodiments, the nucleic acid that can inhibit the functioning of an SPANX RNA can be complementary to SPANX-N mRNA sequences encoded near the 5' end of SEQ ID NO:26 to 30. In other embodiments, nucleic acids that can inhibit the function of an SPANX RNA can be complementary to a SPANX-A/D mRNA, for example, an mRNA encoded by any one of SEQ ID NO:31-35. In another embodiment, nucleic acids that can inhibit the function of an SPANX RNA can be complementary to SPANX RNAs from other species (e.g., mouse, rat, cat, dog, goat, pig, gorilla or a monkey SPANX RNA).
A nucleic acid that can inhibit the functioning of an SPANX RNA need not be 100% complementary to a selected region of mRNA. Instead, some variability in the sequence of the nucleic acid that can inhibit the functioning of an SPANX RNA is permitted. For example, a nucleic acid that can inhibit the functioning of a human SPANX RNA can be complementary to a nucleic acid encoding a mouse or rat SPANX gene product. Nucleic acids encoding mouse SPANX gene product, for example, can be found in the NCBI database at GenBank.
Moreover, nucleic acids that can hybridize under moderately or highly stringent hybridization conditions are sufficiently complementary to inhibit the functioning of an SPANX RNA and can be utilized in the compositions of the invention. Generally, stringent hybridization conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. However, stringent conditions encompass temperatures in the range of about 1°C to about 20°C lower than the thermal melting point of the selected sequence, depending upon the desired degree of stringency as otherwise qualified herein. In some embodiments, the nucleic acids that can inhibit the functioning of SPANX RNA can hybridize to an SPANX RNA under physiological conditions, for example, physiological temperatures and salt concentrations. Precise complementarity is therefore not required for successful duplex formation between a nucleic acid that can inhibit an SPANX RNA and the complementary coding sequence of an SPANX RNA. Inhibitory nucleic acid molecules that comprise, for example, 2, 3, 4, or 5 or more stretches of contiguous nucleotides that are precisely complementary to an SPANX coding sequence, each separated by a stretch of contiguous nucleotides that are not complementary to adjacent SPANX mRNA coding sequences, can inhibit the function of SPANX mRNA. In general, each stretch of contiguous nucleotides is at least 4, 5, 6, 7, or 8 or more nucleotides in length. Non-complementary intervening sequences are preferably 1, 2, 3, or 4 nucleotides in length. One skilled in the art can easily use the calculated melting point of a nucleic acid hybridized to a sense nucleic acid to estimate the degree of mismatching that will be tolerated between a particular nucleic acid for inhibiting expression of a particular SPANX RNA.
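The pattern described above, stretches of precise complementarity separated by short non-complementary gaps, can be checked mechanically. The following is a minimal sketch only; the function name and the example sequences are hypothetical and are not taken from the sequence listing. It aligns an inhibitory nucleic acid antiparallel to an equal-length target region and reports the length of each contiguous complementary stretch.

```python
# Watson-Crick pairs, allowing either RNA (U) or DNA (T) in the inhibitory strand.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("A", "T"), ("T", "A")}


def complementary_stretches(oligo, target):
    """Return the lengths of contiguous complementary runs between an
    inhibitory oligo (5'->3') and a target region of equal length (5'->3').

    The two strands pair antiparallel, so the oligo is compared against
    the reversed target.
    """
    if len(oligo) != len(target):
        raise ValueError("oligo and target region must have equal length")
    stretches = []
    run = 0
    for o, t in zip(oligo.upper(), reversed(target.upper())):
        if (o, t) in PAIRS:
            run += 1
        else:
            if run:
                stretches.append(run)
            run = 0
    if run:
        stretches.append(run)
    return stretches


# Hypothetical 10-nt target region and two candidate inhibitory strands.
target_region = "AUGGCCAAGU"
print(complementary_stretches("ACUUGGCCAU", target_region))  # fully complementary
print(complementary_stretches("ACUUCGCCAU", target_region))  # one internal mismatch
```

A candidate could then be screened against the criteria in the text, for example by requiring that each reported stretch be at least 4 nucleotides long.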
In some embodiments a nucleic acid that can inhibit the function of an endogenous SPANX RNA is an anti-sense oligonucleotide. The anti-sense oligonucleotide is complementary to at least a portion of the sequence of a SPANX mRNA. Such anti-sense oligonucleotides are generally at least six nucleotides in length, but can be about 8, 12, 15, 20, 25, 30, 35, 40, 45, or 50 nucleotides long. Longer oligonucleotides can also be used. SPANX anti-sense oligonucleotides can be provided in a DNA construct or expression cassette and introduced into cells or into tumor sites.
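As an illustration of how an anti-sense oligonucleotide relates to its target, the sketch below derives the DNA reverse complement of a chosen window of an mRNA. The function name and the example sequence are hypothetical; the minimum length of six nucleotides reflects the text above.

```python
# Complement table producing a DNA anti-sense strand from an RNA or DNA target.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}

MIN_LENGTH = 6  # the text states anti-sense oligonucleotides are generally at least six nucleotides


def antisense_oligo(mrna, start, length):
    """Return the DNA anti-sense oligonucleotide (5'->3') complementary to
    mrna[start:start + length]."""
    if length < MIN_LENGTH:
        raise ValueError("oligonucleotide should be at least six nucleotides long")
    target = mrna[start:start + length].upper()
    if len(target) < length:
        raise ValueError("window extends past the end of the mRNA")
    # Reverse the target, then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(target))


# Hypothetical mRNA fragment, not from the sequence listing.
print(antisense_oligo("AUGGCCAAGUCC", 0, 6))
```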
In one embodiment of the invention, expression of an SPANX gene is decreased using a ribozyme. A ribozyme is an RNA molecule with catalytic activity. See, e.g., Cech, 1987, Science 236:1532-1539; Cech, 1990, Ann. Rev. Biochem. 59:543-568; Cech, 1992, Curr. Opin. Struct. Biol. 2:605-609; Couture and Stinchcomb, 1996, Trends Genet. 12:510-515. Ribozymes can be used to inhibit gene function by cleaving an RNA sequence, as is known in the art (see, e.g., Haseloff et al., U.S. Pat. No. 5,641,673). SPANX nucleic acids complementary to mRNA encoded by any of SEQ ID NO:26-30 or SEQ ID NO:31-35 can be used to generate ribozymes that will specifically bind to mRNA transcribed from an SPANX gene. SPANX nucleic acids complementary to mRNA include those with sequence identity to the cDNA sequences of SEQ ID NO:26-30 or SEQ ID NO:31-35. Methods of designing and constructing ribozymes that can cleave other RNA molecules in trans in a highly sequence specific manner have been developed and described in the art (see Haseloff et al. (1988), Nature 334:585-591). For example, the cleavage activity of ribozymes can be targeted to specific RNAs by engineering a discrete "hybridization" region into the ribozyme. The hybridization region contains a sequence complementary to the target RNA and thus specifically hybridizes with the target (see, for example, Gerlach et al., EP 321,201). The target sequence can be a segment of about 10, 12, 15, 20, or 50 contiguous nucleotides selected from a nucleotide sequence having SEQ ID NO:26-30 or SEQ ID NO:31-35. Longer complementary sequences can be used to increase the affinity of the hybridization sequence for the target. The hybridizing and cleavage regions of the ribozyme can be integrally related; thus, upon hybridizing to the target RNA through the complementary regions, the catalytic region of the ribozyme can cleave the target.
RNA interference (RNAi) involves post-transcriptional gene silencing (PTGS) induced by the direct introduction of dsRNA. Small interfering RNAs (siRNAs) are generally 21-23 nucleotide dsRNAs that mediate post-transcriptional gene silencing. Introduction of siRNAs can induce post-transcriptional gene silencing in mammalian cells. siRNAs can also be produced in vivo by cleavage of dsRNA introduced directly or via a transgene or virus. Amplification by an RNA-dependent RNA polymerase may occur in some organisms. siRNAs are incorporated into the RNA-induced silencing complex, guiding the complex to the homologous endogenous mRNA where the complex cleaves the transcript.
Rules for designing siRNAs are available. See, e.g., Elbashir SM, Harborth J, Lendeckel W, Yalcin A, Weber K, Tuschl T (2001). Duplexes of 21-nucleotide RNAs mediate RNA interference in mammalian cell culture. Nature 411:494-498; J. Harborth, S. M. Elbashir, K. Vandenburgh, H. Manninga, S. A. Scaringe, K. Weber and T. Tuschl (2003). Sequence, chemical, and structural variation of small interfering RNAs and short hairpin RNAs and the effect on mammalian gene silencing. Antisense Nucleic Acid Drug Dev. 13:83-106.
Thus, an effective siRNA can be made by selecting target sites within SEQ ID NO:26-35 that begin with AA, that have 3' UU overhangs for both the sense and antisense siRNA strands, and that have an approximate 50% G/C content. For example, a siRNA of the invention for inhibiting SPANX-N mRNA functioning can have any of the following sequences:
AACAGCCCAC UUCAAGCAUC AAUUU (SEQ ID NO:39) from SPANX-N1.
AAGCAUCAAU GGGGAGAAGA GGUUU (SEQ ID NO:40) from SPANX-N1.
AACAGCCGAC UUCAAGCACC AAUUU (SEQ ID NO:41) from SPANX-N2.
AAGCACCAAU GGGGAGAAGA GGUUU (SEQ ID NO:42) from SPANX-N2.
AACAGCCAAC UUCCAGCACC AAUUU (SEQ ID NO:43) from SPANX-N3.
AACUUCCAGC ACCAAUGGGG AGUUU (SEQ ID NO:44) from SPANX-N3.
AAGAGCCAAC UUCCAGCACC AAUUU (SEQ ID NO:45) from SPANX-N4.
AACUUCCAGC ACCAACGAGA AUUU (SEQ ID NO:46) from SPANX-N4.
AAAAGCCCAC UUCAAGCACC AAUUU (SEQ ID NO:47) from SPANX-N5.
AAGCACCAAU GGGGAGAAGA GGUUU (SEQ ID NO:48) from SPANX-N5.
A siRNA of the invention for inhibiting SPANX-A/D mRNA functioning can have any of the following sequences:
AAGCCUGCCA CUGACAUUGA AGUUU (SEQ ID NO:49) from SPANX-A1.
AAGAACCAAU AUAUACAAUG GACUUU (SEQ ID NO:50) from SPANX-A1.
AAGAUUCAAA ACCUACAAAA GCCUUU (SEQ ID NO:51) from SPANX-A2.
AACCUACAAA AGCCUGCCAC UUU (SEQ ID NO:52) from SPANX-A2.
AAGAGCUCUG GGCCACUGCG AAGUUU (SEQ ID NO:53) from SPANX-B1 and SPANX-B2.
AAGCUCCAAA AACCUACUGU AGUUU (SEQ ID NO:54) from SPANX-B1.
AAGAUUCAAA AGCUCCAAAA ACCUUU (SEQ ID NO:55) from SPANX-B2.
AAGCUCCAAA AACCUACUGU AGUUU (SEQ ID NO:56) from SPANX-B2.
AAGAUUCAAA ACCUACAAAA GUUU (SEQ ID NO:57) from SPANX-C.
AACCUACAAA AGCCUGCCGC AGUUU (SEQ ID NO:58) from SPANX-C.
AAGCCUGCCG CAGACAUUGA AGUUU (SEQ ID NO:59) from SPANX-C.
AAGCCUGCCG CUGACAUUGA AGUUU (SEQ ID NO:60) from SPANX-D.
AAUCCAGUGC CGGCGGGGUG UUU (SEQ ID NO:61) from SPANX-D.
Nucleic acids that can decrease SPANX expression or translation can hybridize to an mRNA encoded by any one of SEQ ID NO:26-36 under physiological conditions. In other embodiments, these nucleic acids can hybridize to an mRNA encoded by a nucleic acid comprising SEQ ID NO:26-36 under stringent hybridization conditions. Examples of nucleic acids that can modulate the expression or translation of an SPANX polypeptide include a siRNA that consists essentially of a double-stranded RNA with any one of SEQ ID NO:39-61.
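The siRNA selection rules stated above (target sites beginning with AA, 3' UU overhangs on both strands, and roughly 50% G/C content) can be expressed as a short scan over an mRNA sequence. The sketch below is illustrative only: the 19-nucleotide core length and the 40-60% G/C window are assumptions chosen for the example rather than values given in the specification, the function names are hypothetical, and the example sequence is not from the sequence listing.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the RNA reverse complement of seq (5'->3')."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def gc_fraction(seq):
    """Fraction of G and C bases in seq."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_sirnas(mrna, core_len=19, gc_min=0.40, gc_max=0.60):
    """Scan an mRNA (or cDNA) for target sites of the form AA + core.

    core_len, gc_min and gc_max are assumed illustration values.
    Each hit gives the sense strand (core + UU) and the anti-sense
    strand (reverse complement of the core + UU).
    """
    mrna = mrna.upper().replace("T", "U")
    site_len = core_len + 2
    hits = []
    for i in range(len(mrna) - site_len + 1):
        site = mrna[i:i + site_len]
        if not site.startswith("AA"):
            continue
        if set(site) - set("ACGU"):
            continue  # skip windows containing ambiguous bases
        core = site[2:]
        if gc_min <= gc_fraction(core) <= gc_max:
            hits.append({
                "position": i,
                "target": site,
                "sense": core + "UU",
                "antisense": reverse_complement_rna(core) + "UU",
            })
    return hits


# Hypothetical 25-nt sequence containing a single qualifying site.
example = "GGAAGCAUGCAUGCAUGCAUGCAGG"
for hit in candidate_sirnas(example):
    print(hit["position"], hit["sense"], hit["antisense"])
```

Candidates found this way would still need to be checked for specificity against other transcripts, which this sketch does not attempt.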
Methods to Identify Agents that Modulate SPANX Expression
The invention provides a method to identify an agent that modulates SPANX expression. In one aspect, the method involves contacting a test cell with a candidate agent and determining if the agent modulates SPANX expression, either by increasing or decreasing SPANX expression within the test cell. Thus, the invention provides a method for identifying agents that increase or decrease SPANX expression.
An increase or decrease in SPANX expression within a cell can be determined by comparing the SPANX expression within a test cell that was contacted with a candidate agent with the SPANX expression within a control cell that was not contacted with a candidate agent. The SPANX expression in a control cell may be determined before, concurrently with, or after the SPANX expression within the test cell is determined.
SPANX expression can be determined by detecting activity of an SPANX promoter. An increase or decrease in transcription from a SPANX promoter can be determined through use of many art recognized methods. For example, the presence and quantity of messenger RNA (mRNA) encoded by an SPANX regulated gene in a cell or other sample can be determined through use of hybridization based procedures, such as northern blotting, gene chip technologies, or through production and hybridization of complementary DNA (cDNA). Additional examples of methods that can be used to detect and quantify mRNA of SPANX regulated genes include nucleic acid amplification based methods, such as polymerase chain reaction, ligase chain reaction, and the like. Instrumental methods may be used to detect and quantify mRNA of SPANX regulated genes. For example, probes containing a detectable label may be hybridized to the mRNA. Such probes may be labeled with a fluorescent tag that allows for rapid detection of the mRNA, and therefore provides for high-throughput screening of candidate agents that modulate SPANX expression. Such methods can be automated according to procedures in common practice in the pharmaceutical industry. Numerous labeled probes may be constructed, and include those that use fluorescence resonance energy transfer (FRET) or fluorescence quenching for detection. Such probes and instrumental methods are known in the art and have been reported (Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd edition, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (2001); Harlow et al., Antibodies: A Laboratory Manual, page 319 (Cold Spring Harbor Pub. (1988))).
Candidate agents can be identified that cause an increase or decrease in the transcription or translation of the gene encoding SPANX. Accordingly, a test cell can be contacted with a candidate agent. Production of SPANX mRNA or SPANX protein within the cell can be determined and compared to production in a control cell to determine if a candidate agent increases or decreases production of SPANX mRNA or protein. Such methods have been described herein and are known in the art.
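The comparison between a treated test cell and an untreated control cell described above reduces to a ratio of two expression measurements. The sketch below is a hedged illustration: the function name is hypothetical, the inputs are assumed to be expression levels already normalized to the same scale (for example, normalized mRNA signal), and the two-fold cutoff is an assumed example threshold, not a value from the specification.

```python
def classify_agent(test_level, control_level, threshold=2.0):
    """Compare SPANX expression in a test cell with that in a control cell.

    Returns (fold_change, call), where call is "increases", "decreases",
    or "no clear change". The threshold is an assumed illustration value.
    """
    if control_level <= 0 or test_level < 0 or threshold <= 1:
        raise ValueError("levels must be non-negative, control positive, threshold > 1")
    fold_change = test_level / control_level
    if fold_change >= threshold:
        return fold_change, "increases"
    if fold_change <= 1.0 / threshold:
        return fold_change, "decreases"
    return fold_change, "no clear change"


# Hypothetical normalized expression values for three candidate agents.
for name, test, control in [("agent 1", 8.0, 2.0), ("agent 2", 0.5, 2.0), ("agent 3", 2.2, 2.0)]:
    fold, call = classify_agent(test, control)
    print(f"{name}: fold change {fold:.2f}, {call} SPANX expression")
```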
Antibodies have been described herein and can also be produced that bind to the SPANX protein. These antibodies can be used to determine if a candidate agent increases or decreases expression of the SPANX protein within a cell. For example, the antibodies can be utilized in immunosorbent assays, such as enzyme-linked immunosorbent assays (ELISA) or radio-immunosorbent assays (RIA), to detect SPANX protein.
Test cells can also be constructed that express an SPANX protein that includes a tag. Such a fusion protein can be constructed such that the tag is an epitope that can be bound by an antibody (Shimada et al., Internat. Immunol., 11:1357-1362 (1999)). An example of such a tag is the FLAG® tag. An increase or decrease in the production of the fusion protein can then be readily followed through use of immunological techniques as are known in the art and described herein (Harlow et al., Antibodies: A Laboratory Manual, page 319 (Cold Spring Harbor Pub. 1988)).
Formulations and Administration
The antibodies, expression cassettes, expression vectors and nucleic acids of the invention, including their salts, are administered so as to achieve a reduction in at least one symptom associated with an indication or disease.
To achieve the desired effect(s), the antibodies, expression cassettes, expression vectors and nucleic acids or combinations thereof, may be administered as single or divided dosages, for example, of at least about 0.01 mg/kg to about 500 to 750 mg/kg, of at least about 0.01 mg/kg to about 300 to 500 mg/kg, at least about 0.1 mg/kg to about 100 to 300 mg/kg or at least about 1 mg/kg to about 50 to 100 mg/kg of body weight, although other dosages may provide beneficial results. The amount administered will vary depending on various factors including, but not limited to, the antibodies or nucleic acids chosen, the disease, the weight, the physical condition, the health, the age of the mammal, whether prevention or treatment is to be achieved. Such factors can be readily determined by the clinician employing animal models or other test systems that are available in the art. Administration of the therapeutic agents in accordance with the present invention may be in a single dose, in multiple doses, in a continuous or intermittent manner, depending, for example, upon the recipient's physiological condition, whether the purpose of the administration is therapeutic or prophylactic, and other factors known to skilled practitioners. The administration of the therapeutic agents of the invention may be essentially continuous over a pre-selected period of time or may be in a series of spaced doses. Both local and systemic administration is contemplated.
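The dosage ranges above are expressed per kilogram of body weight, so the total amount for a given subject is the per-kilogram figure multiplied by the subject's weight. The following arithmetic sketch uses a hypothetical function name and a hypothetical 70 kg body weight, with the last range quoted in the text (at least about 1 mg/kg up to about 100 mg/kg) as input.

```python
def dose_range_mg(body_weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Convert a per-kilogram dosage range into total milligrams for one subject."""
    if body_weight_kg <= 0 or low_mg_per_kg < 0 or high_mg_per_kg < low_mg_per_kg:
        raise ValueError("invalid body weight or dosage range")
    return body_weight_kg * low_mg_per_kg, body_weight_kg * high_mg_per_kg


# Hypothetical 70 kg subject; range of about 1 mg/kg to about 100 mg/kg from the text.
low_mg, high_mg = dose_range_mg(70.0, 1.0, 100.0)
print(f"total dose range: {low_mg:.0f} mg to {high_mg:.0f} mg")
```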
To prepare the composition, therapeutic agents are synthesized or otherwise obtained, purified as necessary or desired and then lyophilized and stabilized. The therapeutic agents can then be adjusted to the appropriate concentration, and optionally combined with other agents. The absolute weight of a given antibody or nucleic acid included in a unit dose can vary widely. For example, a unit dose can contain about 0.01 to about 2 g, or about 0.1 to about 500 mg, of at least one antibody or nucleic acid of the invention, or a plurality of antibodies and/or nucleic acids. Alternatively, the unit dosage can vary from about 0.01 g to about 50 g, from about 0.01 g to about 35 g, from about 0.1 g to about 25 g, from about 0.5 g to about 12 g, from about 0.5 g to about 8 g, from about 0.5 g to about 4 g, or from about 0.5 g to about 2 g.
Daily doses of the therapeutic agents of the invention can vary as well. Such daily doses can range, for example, from about 0.1 g/day to about 50 g/day, from about 0.1 g/day to about 25 g/day, from about 0.1 g/day to about 12 g/day, from about 0.5 g/day to about 8 g/day, from about 0.5 g/day to about 4 g/day, and from about 0.5 g/day to about 2 g/day.
Thus, one or more suitable unit dosage forms comprising the therapeutic agents of the invention can be administered by a variety of routes including oral, parenteral (including subcutaneous, intravenous, intramuscular and intraperitoneal), rectal, dermal, transdermal, intrathoracic, intrapulmonary and intranasal (respiratory) routes. The therapeutic agents may also be formulated for sustained release (for example, using microencapsulation, see WO 94/07529, and U.S. Patent No. 4,962,091). The formulations may, where appropriate, be conveniently presented in discrete unit dosage forms and may be prepared by any of the methods well known to the pharmaceutical arts. Such methods may include the step of mixing the therapeutic agent with liquid carriers, solid matrices, semi-solid carriers, finely divided solid carriers or combinations thereof, and then, if necessary, introducing or shaping the product into the desired delivery system.
When the therapeutic agents of the invention are prepared for oral administration, they are generally combined with a pharmaceutically acceptable carrier, diluent or excipient to form a pharmaceutical formulation, or unit dosage form. For oral administration, the therapeutic agents may be present as a powder, a granular formulation, a solution, a suspension, an emulsion or in a natural or synthetic polymer or resin for ingestion of the active ingredients from a chewing gum. The active ingredients may also be presented as a bolus, electuary or paste. Orally administered therapeutic agents of the invention can also be formulated for sustained release, e.g., the antibodies and/or nucleic acids can be coated, micro-encapsulated, or otherwise placed within a sustained delivery device. The total active ingredients in such formulations comprise from 0.1 to 99.9% by weight of the formulation.
By "pharmaceutically acceptable" it is meant a carrier, diluent, excipient, and/or salt that is compatible with the other ingredients of the formulation, and not deleterious to the recipient thereof.
Pharmaceutical formulations containing the therapeutic agents of the invention can be prepared by procedures known in the art using well-known and readily available ingredients. For example, the therapeutic agents can be formulated with common excipients, diluents, or carriers, and formed into tablets, capsules, solutions, suspensions, powders, aerosols and the like. Examples of excipients, diluents, and carriers that are suitable for such formulations include buffers, as well as fillers and extenders such as starch, cellulose, sugars, mannitol, and silicic derivatives. Binding agents can also be included such as carboxymethyl cellulose, hydroxymethylcellulose, hydroxypropyl methylcellulose and other cellulose derivatives, alginates, gelatin, and polyvinyl-pyrrolidone. Moisturizing agents can be included such as glycerol, disintegrating agents such as calcium carbonate and sodium bicarbonate. Agents for retarding dissolution can also be included such as paraffin. Resorption accelerators such as quaternary ammonium compounds can also be included. Surface active agents such as cetyl alcohol and glycerol monostearate can be included. Adsorptive carriers such as kaolin and bentonite can be added. Lubricants such as talc, calcium and magnesium stearate, and solid polyethyl glycols can also be included. Preservatives may also be added. The compositions of the invention can also contain thickening agents such as cellulose and/or cellulose derivatives. They may also contain gums such as xanthan, guar or carbo gum or gum arabic, or alternatively polyethylene glycols, bentones and montmorillonites, and the like.
For example, tablets or caplets containing the therapeutic agents of the invention can include buffering agents such as calcium carbonate, magnesium oxide and magnesium carbonate. Caplets and tablets can also include inactive ingredients such as cellulose, pre-gelatinized starch, silicon dioxide, hydroxy propyl methyl cellulose, magnesium stearate, microcrystalline cellulose, starch, talc, titanium dioxide, benzoic acid, citric acid, corn starch, mineral oil, polypropylene glycol, sodium phosphate, zinc stearate, and the like. Hard or soft gelatin capsules containing at least one antibody or nucleic acid of the invention can contain inactive ingredients such as gelatin, microcrystalline cellulose, sodium lauryl sulfate, starch, talc, and titanium dioxide, and the like, as well as liquid vehicles such as polyethylene glycols (PEGs) and vegetable oil.
Moreover, enteric-coated caplets or tablets containing one or more antibodies or nucleic acids of the invention are designed to resist disintegration in the stomach and dissolve in the more neutral to alkaline environment of the duodenum.
The therapeutic agents of the invention can also be formulated as elixirs or solutions for convenient oral administration or as solutions appropriate for parenteral administration, for instance by intramuscular, subcutaneous, intraperitoneal or intravenous routes. The pharmaceutical formulations of the therapeutic agents of the invention can also take the form of an aqueous or anhydrous solution or dispersion, or alternatively the form of an emulsion or suspension or salve.
Thus, the therapeutic agents may be formulated for parenteral administration (e.g., by injection, for example, bolus injection or continuous infusion) and may be presented in unit dose form in ampoules, pre-filled syringes, small volume infusion containers or in multi-dose containers. As noted above, preservatives can be added to help maintain the shelf life of the dosage form. The active agents and other ingredients may form suspensions, solutions, or emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing and/or dispersing agents. Alternatively, the active agents and other ingredients may be in powder form, obtained by aseptic isolation of sterile solid or by lyophilization from solution, for constitution with a suitable vehicle, e.g., sterile, pyrogen-free water, before use.
These formulations can contain pharmaceutically acceptable carriers, vehicles and adjuvants that are well known in the art. It is possible, for example, to prepare solutions using one or more organic solvent(s) that is/are acceptable from the physiological standpoint, chosen, in addition to water, from solvents such as acetone, ethanol, isopropyl alcohol, glycol ethers such as the products sold under the name "Dowanol," polyglycols and polyethylene glycols, C1-C4 alkyl esters of short-chain acids, ethyl or isopropyl lactate, fatty acid triglycerides such as the products marketed under the name "Miglyol," isopropyl myristate, animal, mineral and vegetable oils and polysiloxanes.
It is possible to add, if necessary, an adjuvant chosen from antioxidants, surfactants, other preservatives, film-forming, keratolytic or comedolytic agents, perfumes, flavorings and colorings. Antioxidants such as t-butylhydroquinone, butylated hydroxyanisole, butylated hydroxytoluene and α-tocopherol and its derivatives can be added.
Additionally, the therapeutic agents are well suited to formulation as sustained release dosage forms and the like. The formulations can be so constituted that they release the active agent, for example, in a particular part of the intestinal, urogenital or respiratory tract, possibly over a period of time. Coatings, envelopes, and protective matrices may be made, for example, from polymeric substances, such as polylactide-glycolates, liposomes, microemulsions, microparticles, nanoparticles, or waxes. These coatings, envelopes, and protective matrices are useful to coat indwelling devices, e.g., stents, catheters, peritoneal dialysis tubing, draining devices and the like.
For topical administration, the therapeutic agents may be formulated as is known in the art for direct application to a target area. Forms chiefly conditioned for topical application take the form, for example, of creams, milks, gels, dispersion or microemulsions, lotions thickened to a greater or lesser extent, impregnated pads, ointments or sticks, aerosol formulations (e.g., sprays or foams), soaps, detergents, lotions or cakes of soap. Other conventional forms for this purpose include wound dressings, coated bandages or other polymer coverings, ointments, creams, lotions, pastes, jellies, sprays, and aerosols. Thus, the therapeutic agents of the invention can be delivered via patches or bandages for dermal administration. Alternatively, the therapeutic agents can be formulated to be part of an adhesive polymer, such as polyacrylate or acrylate/vinyl acetate copolymer. For long-term applications it might be desirable to use microporous and/or breathable backing laminates, so hydration or maceration of the skin can be minimized. The backing layer can be any appropriate thickness that will provide the desired protective and support functions. A suitable thickness will generally be from about 10 to about 200 microns. Ointments and creams may, for example, be formulated with an aqueous or oily base with the addition of suitable thickening and/or gelling agents. Lotions may be formulated with an aqueous or oily base and will in general also contain one or more emulsifying agents, stabilizing agents, dispersing agents, suspending agents, thickening agents, or coloring agents. The therapeutic agents can also be delivered via iontophoresis, e.g., as disclosed in U.S. Patent Nos. 4,140,122; 4,383,529; or 4,051,842. The percent by weight of a therapeutic agent of the invention present in a topical formulation will depend on various factors, but generally will be from 0.01% to 95% of the total weight of the formulation, and typically 0.1-85% by weight.
Drops, such as eye drops or nose drops, may be formulated with one or more of the therapeutic agents in an aqueous or non-aqueous base also comprising one or more dispersing agents, solubilizing agents or suspending agents. Liquid sprays are conveniently delivered from pressurized packs. Drops can be delivered via a simple eye dropper-capped bottle, or via a plastic bottle adapted to deliver liquid contents dropwise, via a specially shaped closure.
The therapeutic agents may further be formulated for topical administration in the mouth or throat. For example, the active ingredients may be formulated as a lozenge further comprising a flavored base, usually sucrose and acacia or tragacanth; pastilles comprising the composition in an inert base such as gelatin and glycerin or sucrose and acacia; and mouthwashes comprising the composition of the present invention in a suitable liquid carrier.
The pharmaceutical formulations of the present invention may include, as optional ingredients, pharmaceutically acceptable carriers, diluents, solubilizing or emulsifying agents, and salts of the type that are available in the art. Examples of such substances include normal saline solutions such as physiologically buffered saline solutions and water. Specific non-limiting examples of the carriers and/or diluents that are useful in the pharmaceutical formulations of the present invention include water and physiologically acceptable buffered saline solutions such as phosphate buffered saline solutions pH 7.0-8.0.
The therapeutic agents of the invention can also be administered to the respiratory tract. Thus, the present invention also provides aerosol pharmaceutical formulations and dosage forms for use in the methods of the invention. In general, such dosage forms comprise an amount of at least one of the agents of the invention effective to treat or prevent the clinical symptoms of a specific indication or disease. Any statistically significant attenuation of one or more symptoms of an indication or disease that has been treated pursuant to the method of the present invention is considered to be a treatment of such indication or disease within the scope of the invention.
Alternatively, for administration by inhalation or insufflation, the composition may take the form of a dry powder, for example, a powder mix of the therapeutic agent and a suitable powder base such as lactose or starch. The powder composition may be presented in unit dosage form in, for example, capsules or cartridges, or, e.g., gelatin or blister packs from which the powder may be administered with the aid of an inhalator, insufflator, or a metered-dose inhaler (see, for example, the pressurized metered dose inhaler (MDI) and the dry powder inhaler disclosed in Newman, S. P. in Aerosols and the Lung, Clarke, S. W. and Davia, D. eds., pp. 197-224, Butterworths, London, England, 1984).
Therapeutic agents of the present invention can also be administered in an aqueous solution when administered in an aerosol or inhaled form. Thus, other aerosol pharmaceutical formulations may comprise, for example, a physiologically acceptable buffered saline solution containing between about 0.1 mg/ml and about 100 mg/ml of one or more of the agents of the present invention specific for the indication or disease to be treated. Dry aerosol in the form of finely divided solid antibody or nucleic acid particles that are not dissolved or suspended in a liquid is also useful in the practice of the present invention. Therapeutic agents of the present invention may be formulated as dusting powders and comprise finely divided particles having an average particle size of between about 1 and 5 μm, alternatively between 2 and 3 μm. Finely divided particles may be prepared by pulverization and screen filtration using techniques well known in the art. The particles may be administered by inhaling a predetermined quantity of the finely divided material, which can be in the form of a powder. It will be appreciated that the unit content of active ingredient or ingredients contained in an individual aerosol dose of each dosage form need not in itself constitute an effective amount for treating the particular indication or disease since the necessary effective amount can be reached by administration of a plurality of dosage units. Moreover, the effective amount may be achieved using less than the dose in the dosage form, either individually, or in a series of administrations.
For administration to the upper (nasal) or lower respiratory tract by inhalation, the therapeutic agents of the invention are conveniently delivered from a nebulizer or a pressurized pack or other convenient means of delivering an aerosol spray. Pressurized packs may comprise a suitable propellant such as dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In the case of a pressurized aerosol, the dosage unit may be determined by providing a valve to deliver a metered amount. Nebulizers include, but are not limited to, those described in U.S. Patent Nos. 4,624,251; 3,703,173; 3,561,444; and 4,635,627. Aerosol delivery systems of the type disclosed herein are available from numerous commercial sources including Fisons Corporation (Bedford, Mass.), Schering Corp. (Kenilworth, NJ) and American Pharmoseal Co., (Valencia, CA). For intra-nasal administration, the therapeutic agent may also be administered via nose drops, a liquid spray, such as via a plastic bottle atomizer or metered-dose inhaler. Typical of atomizers are the Mistometer (Wintrop) and the Medihaler (Riker).
Furthermore, the active ingredients may also be used in combination with other therapeutic agents, for example, pain relievers, anti-inflammatory agents, and the like, whether for the conditions described or some other condition. The present invention further pertains to a packaged pharmaceutical composition for controlling prostate cancer such as a kit or other container. The kit or container holds a therapeutically effective amount of a pharmaceutical composition for controlling prostate cancer and instructions for using the pharmaceutical composition for control of the prostate cancer. The pharmaceutical composition includes at least one antibody or nucleic acid of the present invention, in a therapeutically effective amount such that a prostate cancer is controlled.
Having now generally described the invention, the same will be more readily understood through reference to the following examples which are provided by way of illustration, and are not intended to be limiting of the present invention.
EXAMPLE 1: SPANX-N Prostate Cancer Marker Genes Are the Ancestral Progenitors of SPANX-A/D Genes
This Example describes the sequence, genomic organization and cellular expression patterns of SPANX-N genes. This Example also provides evidence that SPANX-A1, -A2, -B, -C and -D evolved from the SPANX-N genes.
Materials and Methods
Amplification of SPANX Genes from Primates. Genomic DNAs from chimpanzee (Pan troglodytes), gorilla (Gorilla gorilla), orangutan (Pongo pygmaeus), rhesus macaque (Macaca mulatta), and tamarin (Saguinus labiatus) (Coriell Institute for Medical Research, Camden, NJ) were used as template DNA for amplification using the specific primers provided in Table 1.
Table 1. Primers Employed
Figure imgf000068_0001
doc2-F/doc2-R 5'-gcgtcttacatgctcgtgaa-3' (SEQ ID NO:70) 5'-aagcacttccgagtgagtga-3' (SEQ ID NO:71)
Nat-Fl/Nat-Rl 5'-ataggaatggctgtggggaa-3' (SEQ ID NO:72) 5'-gtggtcattca(c/g)cagttcct-3' (SEQ ID NO:73)
Nat F2/ Nat Rl 2 5'-ggcttgaaggaatcatcctc-3' (SEQ ID NO:74) 5'-ttcctcctcctccatttgga-3' (SEQ ID NO:75)
Fl/Rl 5'-agctgctcaaaagtagggac-3' (SEQ ID NO:76) 5'-gtacctgtgggtcaatctgt-3' (SEQ ID NO:77)
F2/ R2 5'-cattctcagtccatgtgga-3' (SEQ ID NO:78) 5'-gcaggctactcatgacacta-3' (SEQ ID NO:79)
FI-I/ RI-IR 5'-taacttctcccactccgtca-3' (SEQ ID NO:80) 5'-actcgtactgctgtaccttc-3' (SEQ ID NO:81)
FI-2/ RI-2R 5'-ttgtggacgctagactcaag-3' (SEQ ID NO:82) 5'-aatcctggagcagtgcaatg-3' (SEQ ID NO:83)
FI-3/ RI-3R 5'-aactggaccttctcctccaa-3' (SEQ ID NO:84) 5'-acctagccagctaagcagaa-3' (SEQ ID NO:85)
Amplification of SPANX in apes
PrimSX-RR/ PrimSX-F5 5'-ctacctcttcccttcccttc-3' (SEQ ID NO:86) 5'-tgggacactgcctgtatgat-3' (SEQ ID NO:87)
RT-PCR in human
FhuS-F/RhuS-R 5'-atggaacagccgacttcaag-3' (SEQ ID NO:37) 5'-tgagtctaggccttcgtcct-3' (SEQ ID NO:38)
RT-PCR in mouse
MoSPAN-F/MoSPAN-R 5'-gatggaagaacagcctacac-3' (SEQ ID NO:88)
Figure imgf000070_0001
*These primers were used to analyze the SPANX-C insertion polymorphism in a population study. The primers were designed so that if the SPANX-C duplication was absent, the predicted size of the PCR product was 5.6 kb. No PCR product was produced if the 20-kb SPANX-C duplication was present. PCR was performed by using 1 μl of genomic DNA (100 ng) in a 50 μl reaction volume under the following conditions: 94°C, 2 min (thirty cycles of 94°C, 30 s; 60°C, 10 s; 68°C, 9 min); 72°C, 7 min; 4°C, hold.
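The cycling conditions in the footnote above can be written down as a small data structure, which makes the programmed run time easy to check. The following Python sketch is illustrative only: the `Step` class and `program_seconds` function are hypothetical names introduced here, the open-ended 4°C hold is omitted, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold of a thermal cycling program (hypothetical helper type)."""
    temp_c: float
    seconds: int


def program_seconds(initial, cycle, n_cycles, final):
    """Total programmed hold time in seconds, ignoring ramping between steps."""
    return (
        sum(s.seconds for s in initial)
        + n_cycles * sum(s.seconds for s in cycle)
        + sum(s.seconds for s in final)
    )


# Long-range PCR conditions from the Table 1 footnote:
# 94°C, 2 min; 30 cycles of (94°C, 30 s; 60°C, 10 s; 68°C, 9 min); 72°C, 7 min.
LONG_RANGE = dict(
    initial=[Step(94, 120)],
    cycle=[Step(94, 30), Step(60, 10), Step(68, 540)],
    n_cycles=30,
    final=[Step(72, 420)],
)

total = program_seconds(**LONG_RANGE)
print(f"{total} s = {total / 60:.0f} min")  # 17940 s = 299 min
```

The 9-minute extension step dominates the run, which fits the need to amplify a 5.6-kb product.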
RT-PCR. Total RNAs from mouse (brain, testis, liver, and heart) and human (brain, testis, liver, and skeletal muscle) tissues (Ambion, Austin, TX) were used for screening SPANX expression with the primers described in Table 1. Complementary DNA (cDNA) was made from 1 μg of total RNA using the Superscript first-strand system kit (Invitrogen) and priming with oligo(dT) pursuant to the manufacturer's standard protocol. Human β-actin primers (BD Biosciences Clontech) were used as positive controls for both human and mouse RT-PCR. NCI-60 cancer cell lines were from the National Cancer Institute. RT-PCR was performed by using 1 μl of cDNA or 1 μl of genomic DNA in a 50-μl reaction volume. Standard reaction conditions were as follows: 94°C, 5 min (35 cycles of 94°C, 1 min; 55°C, 1 min; 72°C, 1 min); 72°C, 7 min; and 4°C, hold.
Construction of Transformation-Associated Recombination (TAR) Vector and Cloning by in Vivo Recombination in Yeast. TAR cloning experiments were carried out as described in Kouprina, N. & Larionov, V. (2003) FEMS Microbiol. Rev. 27, 1-21. The TAR vector was constructed by using pVC604. The vector contained 5' 164-bp and 3' 187-bp targeting sequences, specific to the unique sequences flanking SPANX-C. These targeting sequences were amplified from human genomic DNA with specific primers (Table 1). The 5' and 3' targeting sequences correspond to positions 39,708-39,872 and 122,818-123,004 in the bacterial artificial chromosome (BAC) (AL109799). Before use in TAR cloning experiments, the vector was linearized with SphI. Genomic DNAs were prepared from primate tissue culture lines (Coriell Institute for Medical Research). To identify clones positive for LDOC1, yeast transformants were examined by PCR by using a pair of diagnostic primers (Table 1). The yield of LDOC1-positive clones from African ape genomic DNAs (chimpanzee, gorilla, and bonobo) was the same as with human DNA (1%).
The size, Alu profiles, and retrofitting of yeast artificial chromosomes (YACs) into BACs were determined as described in Kouprina, N. & Larionov, V. (2003) FEMS Microbiol. Rev. 27, 1-21. Alu profiles of three independent TAR isolates for each species were indistinguishable. These results strongly suggest that the isolated YACs contain non-rearranged genomic segments.
Sequencing. Bonobo and gorilla TAR clones were directly sequenced from bacterial artificial chromosome DNAs by Fidelity Systems (Gaithersburg, MD). Sequences of primate paralogs were generated by TA subcloning of 81 PCR products, 1.2 or 1.4 kb, amplified from genomic DNAs (20 for chimpanzee, 20 for gorilla, 20 for orangutan, and 20 for tamarin) and one for rhesus macaque (9.0 kb). Forward and reverse sequencing reactions were run on a 3100 Automated Capillary DNA Sequencer (PE Applied Biosystems). DNA sequences were compared by using the GCG DNA ANALYSIS Wisconsin Package (see website at accelrys.com/support/bio/faqs.wis.pkg.html) and National Center for Biotechnology Information BLAST. Non-human sequences were deemed paralogous if more than two sequence differences were observed. All clones were named and numbered according to the clone/accession identifier (Table 2).
Table 2. Clones and deposited accessions
Identification Species Accession nos.*
Chimp I1 Pan troglodytes AY457936
Chimp 2 P. troglodytes AY457935
Chimp 3 P. troglodytes AY457934
Chimp 4 P. troglodytes AY457933
Figure imgf000072_0001
*GenBank accession nos. corresponding to PCR products obtained with PrimSX-RR/PrimSX-F5.
All sequences except Rhesus 1 were derived from 1.4-kb PCR products. *GenBank accession nos. corresponding to genomic regions obtained by TAR cloning from bonobo and gorilla.
The human SPANX-N1 to -N5 gene sequences have accession numbers AY825029-AY825033.
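The paralogy criterion stated in the Sequencing paragraph (non-human sequences were deemed paralogous if more than two sequence differences were observed) can be illustrated with a minimal Python sketch. The function names are hypothetical, the inputs are assumed to be pre-aligned and of equal length, and counting a gap column as a difference is an assumption made here for illustration, not a statement of how the comparison was actually performed.

```python
def count_differences(a: str, b: str) -> int:
    """Count mismatched columns between two aligned, equal-length sequences.

    Any column where the two characters differ (including a gap against a
    base) is counted as one difference; this is an illustrative assumption.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)


def is_paralogous(a: str, b: str, max_same_locus_diffs: int = 2) -> bool:
    """Apply the stated rule: more than two differences means paralogous."""
    return count_differences(a, b) > max_same_locus_diffs


# Toy sequences, not taken from the clones in Table 2.
print(is_paralogous("ACGTACGT", "ACGTACGA"))  # one difference
print(is_paralogous("ACGTACGT", "TCGAACGA"))  # three differences
```

Sequences with two or fewer differences would, under this rule, be treated as copies of the same locus that differ by allelic variation or amplification error.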
Sequence Analysis. Database searches were performed by using the versions of the BLAST program appropriate for different types of sequence comparisons: BLASTN for nucleotide sequences, BLASTP for protein sequences, and TBLASTN for searching a nucleotide database translated in six frames with a protein query. See Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D. J. (1997) Nucleic Acids Res. 25, 3389-3402. Multiple alignments of protein sequences were constructed by using the MACAW program. Schuler, G. D., Altschul, S. F. & Lipman, D. J. (1991) Proteins 9, 180-190. Multiple alignments of nucleotide sequences were constructed to correspond to the protein sequence alignments. Protein secondary structure was predicted by using the PHD program (see website at cubic.bioc.columbia.edu/predictprotein), with a multiple sequence alignment submitted as a query. See Rost, B., Sander, C. & Schneider, R. (1994) Comput. Appl. Biosci. 10, 53-60. The phylogenetic tree was constructed by using the neighbor-joining method (Saitou, N. & Nei, M. (1987) Mol. Biol. Evol. 4, 406-425) as implemented in the MEGA2 program (Kumar, S., Tamura, K., Jakobsen, I. B. & Nei, M. (2001) Bioinformatics 17, 1244-1245) with the maximum parsimony and maximum likelihood methods as implemented in the PAUP* program (website at paup.csit.fsu.edu). Evolutionary rates for synonymous and nonsynonymous positions in the coding sequences were calculated by using the modified Nei-Gojobori method. See Zhang, J., Rosenberg, H. F. & Nei, M. (1998) Proc. Natl. Acad. Sci. USA 95, 3708-3713. The evolutionary rates for noncoding sequences were calculated by using the two-parameter Kimura model. See Kimura, M. (1980) J. Mol. Evol. 16, 111-120.
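For orientation, the two-parameter Kimura distance mentioned above has a closed form: with P the proportion of sites that differ by a transition and Q the proportion that differ by a transversion, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q). The Python sketch below is a minimal illustration of that formula, not the MEGA2 implementation; the function name is hypothetical, and skipping columns that contain gaps or ambiguous bases is an assumption made here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a: str, b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = sites = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # assumption: ignore gap/ambiguous columns
        sites += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1   # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    w1 = 1 - 2 * p - q
    w2 = 1 - 2 * q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)


# Toy example: one transition in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

Because the correction grows with the observed difference, the distance for one transition in ten sites is slightly larger than the raw proportion of 0.1.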
Results
The SPANX Family: Identification of a Second Subfamily in Primates and a Single SPANX Gene in Rodents. To shed light on the evolution of the SPANX family, primate genomic segments homologous to human SPANX were isolated and characterized. The SPANX regions from five species (chimpanzee, gorilla, orangutan, rhesus macaque, and tamarin) were amplified by using a set of primers developed from the conserved 5' and 3' flanking sequences of human SPANX genes (see Materials and Methods). PCR products with a size predicted for the SPANX-A/D genes (1.2 kb) were obtained only from African apes (FIG. 1). Sequence analysis of these fragments revealed genes with 65-90% nucleotide identity to the human SPANX genes. These gene fragments also were organized like human SPANX genes, i.e., two exons and a 650-bp highly conserved intron that includes approximately 400 bp of ERV LTR. In addition to or instead of the 1.2-kb fragment, two other PCR products were observed that were approximately 1.4 kb and 9.0 kb in size. These bands were amplified from DNA samples of chimpanzee, gorilla, orangutan, and tamarin. However, in rhesus macaques, only the 9.0-kb fragment was detected (FIG. 1). Sequence analysis of the 1.4-kb product obtained from anthropoids revealed SPANX-like genes that were organized similarly to the known SPANX-A/D genes, i.e., contained two exons separated by a 650-bp intron with a solo LTR and shared about 50% amino acid sequence identity with the SPANX-A/D proteins. A characteristic difference between the newly detected SPANX-related genes and human SPANX-A/D genes was found in exon 2. This exon is longer in primate genes than in human genes and is variable in size (280-320 bp vs. 219 bp in SPANX-A/D). Expansion and variability of exon 2 size are due to the presence of a 39-bp mini-satellite sequence at its 5' end. Similar amplification of mini-satellites in exons without disruption of the ORF has been previously described for other genes. See Yang et al. (2000) Am. J. Med. Genet. 95, 385-390; Lievers et al. (2001) Eur. J. Hum. Genet. 9, 583-589.
End sequencing of 9.0-kb primate clones revealed significant sequence similarity to exon 1 and exon 2 of the 1.4-kb clones. The 1.4- and 9.0-kb clones differed in that the latter contained a second LTR upstream of exon 2. Sequencing of a 9.0-kb clone of rhesus macaque showed the presence of an intact ERV sequence. Presence of ERV in 9.0-kb clones of other species was confirmed by PCR (data not shown). A search of the GenBank database detected five regions of significant similarity to the 1.4- and 9.0-kb primate sequences in the human genome. One of these regions, which is located at Xp11, produced a contiguous alignment with the 1.4-kb primate sequences. Analysis of this genomic sequence revealed a SPANX-like gene consisting of two exons separated by a 650-bp LTR-containing intron. Four other regions of similarity were identified at Xq27, about 2 Mb away from the SPANX-A/D gene cluster (FIG. 2). In this region, the predicted exon sequences are separated by an intron containing a complete ERV, in an arrangement similar to the primate 9.0-kb clones. Like the SPANX homologs in primates, the five new members of the SPANX family in humans have a variable size of exon 2 due to the presence of a 39-bp mini-satellite repeat at the 5' end. The protein sequences encoded by the five human SPANX-like genes identified here share 50-80% identity with each other and 40-50% identity with the sequences of the SPANX-A/D proteins. In all cases, the exon boundaries and the splice sites were well conserved (data not shown), suggesting that these genes are expressed.
Based on the distinct gene organization and the relatively low sequence similarity to SPANX-A/D genes, the ERV-containing genes were classified as a second SPANX subfamily, which the inventors have named SPANX-N. In humans, the SPANX-N genes encode proteins with predicted polypeptide sequences of 72 amino acids (SPANX-N1), 180 amino acids (SPANX-N2), 141 amino acids (SPANX-N3), 159 amino acids (SPANX-N4), and 72 amino acids (SPANX-N5).
A search of the GenBank database revealed two regions of significant similarity to human SPANX-N in the mouse and rat genomes. Both mouse and rat SPANX-N homologs are previously unannotated genes; the expression of the mouse gene was supported by the detection of eight ESTs in Database of Expressed Sequence Tags (dbEST) (BU939216, CA463062, CA464820, CB273391, BX635129, BC048649, CB273391, and BU946237). The mouse and rat genes encode, respectively, 87 amino acid and 115 amino acid proteins with 28-36% amino acid identity to human SPANX-N genes. The mouse gene contains a 250-bp intron that shares about 65% identity with the primate SPANX intron. The smaller size of the intron is due to the absence of the ERV sequence or the LTR, which is present in all primate SPANX genes. This murine SPANX homolog appears to be a single murine ortholog of human SPANX-N1-N4 because it shows the closest similarity to the SPANX-N genes and is located in the mouse chromosome X region syntenic to SPANX-N1-N4. Thus, the SPANX-N subfamily is apparently represented not only in all primates but also in rodents, whereas the SPANX-A/D genes appear to be present exclusively in the African great apes and humans.
Identification of the SPANX-N family and, particularly, the rodent SPANX homologs, which showed limited sequence similarity to the primate SPANX sequences, provided an opportunity to gain some insight into the structure and putative functional motifs of the SPANX proteins. The most prominent conserved sequence feature of the SPANX family is the central hydrophobic patch ending with an arginine. Secondary structure prediction suggested that the central conserved region formed a β-hairpin with a strongly hydrophobic proximal strand, followed by an α-helix. The rest of the protein seems to have a disordered structure with few residues conserved throughout the family but with considerable conservation within subfamilies and a marked preponderance of charged and polar residues. The bipartite nuclear localization signal that has been previously detected in the SPANX-A/D subfamily (Zendman et al. (2003) Gene 309, 125-133) is conserved in most of the SPANX-N proteins, with the exception of SPANX-N2 and -N4 but not in the rodent sequences; however, the latter contain a putative monopartite nuclear localization signal. The presence of a small globular core embedded in apparently disordered structure suggests that SPANX protein monomers may be unstable and is compatible with the reported dimer formation. See Westbrook et al. (2001) Biol. Reprod. 64, 345-358; Westbrook et al. (2004) Clin. Cancer Res. 10, 101-112.
SPANX-N Genes Are Expressed in Normal Testis and in Melanoma Cell Lines. Using primers specific to the SPANX-N2 and -N3 mRNA, expression of these genes was analyzed in a panel of normal tissues. A 264-bp band of expected size was detected only in testis (FIG. 3A). Sequencing of the RT-PCR products confirmed the identity of these transcripts to the SPANX-N2 and -N3 genes. Furthermore, the amplified sequences corresponded to two ESTs in dbEST (BU569937 and BF967778). Similar experiments with a panel of normal tissues from mice also detected expression of the mouse SPANX gene only in testis (FIG. 3B). The exclusive expression of these genes in normal testis correlated with the conservation of the promoter region, which contained two recognition sites for testis-specific transcription factors (FIG. S).
SPANX-N expression was also examined in the NCI-60 panel of cancer cell lines that represent nine different types of cancers. See Zendman et al. (2003) Gene 309, 125-133. RT-PCR products of SPANX-N2 or -N3 of the expected size were detected only in a melanoma cell line (Table 3).
Figure imgf000077_0001
Figure imgf000078_0001
Figure imgf000079_0001
NSCLC, non-small cell lung cancer.
*LOX IMVI cell line was derived from a malignant amelanotic melanoma.
Specific primers, Xb-F/Xb-R for SPANX-B and Xa-F/Xa-R for SPANX-A members (1) were used.
ND, not determined.
Notably, the SPANX-A/D subfamily is also expressed in the same line. Co-expression of members of the two SPANX gene subfamilies is not surprising because of the remarkable conservation of the promoter sequences. Thus, expression profile analysis indicates that the SPANX-N subfamily, similar to the SPANX-A/D subfamily, consists of cancer/testis antigen (CTA) genes. Furthermore, the testis-specific pattern of expression of SPANX genes is conserved between primates and rodents.
Evolution of the SPANX Family: An Unusual Case of Apparent Positive Selection in both Nonsynonymous and Synonymous Positions. Reproduction-related genes tend to evolve rapidly. Moreover, many of them appear to be subject to positive selection, which is usually detected by measuring the da/ds ratio, i.e., the ratio of the evolutionary rates in nonsynonymous and synonymous codon positions. Makalowski, W. & Boguski, M. S. (1998) J. Mol. Evol. 47, 119-121; Wyckoff, G. J., Wang, W. & Wu, C. I. (2000) Nature 403, 304-309; Swanson, W. J., Clark, A. G., Waldrip-Dail, H. M., Wolfner, M. F. & Aquadro, C. F. (2001) Proc. Natl. Acad. Sci. USA 98, 7375-7379; Swanson, W. J. & Vacquier, V. D. (2002) Nat. Rev. Genet. 3, 137-144. For most genes, da/ds is <<1, which is a sign of evolution under purifying selection; in contrast, when da/ds >1, then this is considered to be an indication of positive (diversifying) selection.
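The interpretation of the nonsynonymous-to-synonymous rate ratio described above can be sketched as a simple rule of thumb. The Python function below is a hypothetical illustration: its name, the labels it returns, and especially the `tolerance` band used to call a ratio "close to 1" are assumptions introduced here, and a ratio alone does not establish statistical significance.

```python
def selection_regime(d_a: float, d_s: float, tolerance: float = 0.2) -> str:
    """Classify a rate ratio (nonsynonymous / synonymous) into a rough regime.

    `tolerance` is an arbitrary illustrative band around 1, not a value
    taken from the text.
    """
    if d_s <= 0:
        raise ValueError("synonymous rate must be positive")
    ratio = d_a / d_s
    if ratio > 1 + tolerance:
        return "positive (diversifying) selection"
    if ratio < 1 - tolerance:
        return "purifying selection"
    return "near-neutral"


# Toy rates, not taken from Table 4.
print(selection_regime(0.02, 0.20))  # ratio 0.1
print(selection_regime(0.10, 0.10))  # ratio 1.0
print(selection_regime(0.40, 0.20))  # ratio 2.0
```

In practice a ratio above 1 computed from only a handful of substitutions would still need a formal test before positive selection could be claimed.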
The rate of evolution of SPANX genes is outstanding even among reproductive proteins. The highest level of conservation between rodent SPANX proteins and human SPANX-N family members is about 36%, substantially less than the values observed for most testis-associated proteins and about the same as for transition protein 2, the most rapidly evolving among analyzed human and mouse orthologs. Makalowski, W. & Boguski, M. S. (1998) J. Mol. Evol. 47, 119-121; Swanson, W. J. & Vacquier, V. D. (2002) Nat. Rev. Genet. 3, 137-144. The da/ds ratio for SPANX genes was typically close to 1 (Table 4), which normally would be interpreted as evolution under substantially relaxed purifying selection, perhaps near-neutral evolution. For several comparisons of closely related sequences, within the SPANX-A/D and -N subfamilies, da/ds values close to 2 were observed. However, because of the small total number of nucleotide substitutions, these failed to pass statistical tests for positive selection (Tables 4 and 5).
Table 4. Evolutionary distances between coding and noncoding sequences of SPANX genes
Figure imgf000081_0001
Figure imgf000082_0001
Figure imgf000083_0001
Figure imgf000084_0001
Figure imgf000085_0001
Figure imgf000086_0001
Kimura two-parameter model was used for noncoding sequences. CDS/N, nonsynonymous positions in the coding region; CDS/S, synonymous positions in the coding region.
Table 5: Mean evolutionary distances for the 5' flanking regions, the intron, and the coding sequences of the SPANX genes.
Figure imgf000086_0002
One highly unusual feature of the SPANX family is that both synonymous and nonsynonymous positions in the coding sequences of many SPANX genes evolved much faster than the noncoding sequences of the 5' UTR and the intron (Tables 4 and 5). This anomalous mode of evolution was detected both among the closely related paralogs within the SPANX-A/D and -N subfamilies and in intersubfamily comparisons. Most of the intron sequences do not seem to contain specific functional signals and are believed to evolve (nearly) neutrally. Therefore, the almost 2-fold acceleration of evolution of all positions in the coding sequence compared to the intron (Tables 4 and 5) seems to suggest that, in the SPANX family, positive selection acts with nearly equal strength on synonymous and nonsynonymous positions.
Given that the coding sequences of the SPANX genes appeared to have evolved under positive selection, they were not suitable for phylogenetic analysis. A phylogenetic tree constructed from alignments of the 5'-UTR and intron sequences clearly supported the fundamental split between the SPANX-N and -A/D subfamilies (FIG. 5). Beyond that, however, the resolution of the phylogenetic trees was low, presumably because of the small number of informative positions, non-uniformity of evolutionary rates, or both. Nevertheless, comparison of the organization of SPANX genes in different mammalian species allowed a confident reconstruction of the main events in the evolution of this family (FIG. 4). The common ancestor of rodents and primates apparently had a single SPANX-N subfamily gene (extensive search for potential divergent members of the SPANX family in the mouse and rat genomes failed to detect any). Because the ERV-containing intron was detected in SPANX-N subfamily genes from all examined primates, including tamarin (a representative of early-branching New World monkeys), ERV insertion apparently occurred early in primate evolution, before the divergence of the major monkey lineages (about 50 million years ago). SPANX-N genes lacking the ERV and containing only a solo LTR in their intron apparently evolved independently in New World monkeys and great apes via duplications accompanied by homologous recombination between the ERV LTRs. Other SPANX-N duplications left the ERV intact, as illustrated by the existence of four ERV-containing SPANX-N genes in humans (the exact number of such genes in apes and monkeys remains to be determined).
The emergence of the SPANX-A/D gene subfamily appears to be a more recent event, subsequent to the separation of the hominoid lineage from orangutan. Apparently, this subfamily evolved via duplication of one of the SPANX-N genes accompanied by deletion of the distal part of exon 2 and rapid divergence (FIG. 4). Notably, the phylogenetic tree of the SPANX-A/D subfamily is most compatible with independent amplification of these genes in gorilla, chimpanzee, and humans (FIG. 5).
Lineage-Specific Amplification of SPANX-A/D Genes in Humans. Rapid evolution of the SPANX-A/D subfamily and location of these genes within segmental duplications impede the routine PCR analysis of syntenic chromosomal segments that is required to detect lineage-specific duplications. To overcome this problem, syntenic genomic segments from African great ape genomes (chimpanzee, bonobo, and gorilla) were selectively isolated by TAR cloning in yeast. Kouprina, N. & Larionov, V. (2003) FEMS Microbiol. Rev. 27, 1-21. Because up to 15% divergence in DNA sequences does not prevent selective gene isolation by TAR, the targeting hooks were developed by using the human genomic sequence. The SPANX-C locus was chosen as a target. Because SPANX-C resides within an approximately 20-kb segmental duplication, the targeting sequences in the TAR vector were designed from unique sequences flanking SPANX-C. The vector efficiently clones an 83-kb human genomic segment containing the SPANX-C and LDOC1 genes. Nagasaki, K., Schem, C., von Kaisenberg, C., Biallek, M., Rosel, F., Jonat, W. & Maass, N. (2003) Int. J. Cancer 105, 454-458. The size of the TAR yeast artificial chromosomes isolated from apes was different from that of human clones (about 58 kb for chimpanzee and bonobo and about 61 kb for gorilla). To clarify the size differences between human and ape clones, the isolates from bonobo and gorilla were sequenced. All isolates shared a high level of sequence identity (about 95%) within the SPANX-C flanking sequences. The size difference is due to the absence of the 20-kb internal sequence containing the SPANX-C gene in African great apes. Partial sequencing of the chimpanzee clone revealed a similar organization of this syntenic region. Because this 20-kb sequence corresponds to a series of segmental duplications in chromosome X, it appears that at least the duplication that yielded SPANX-C occurred only in the human lineage (the alternative would require independent deletion of the same region in the gorilla, bonobo, and chimpanzee lineages, a highly unlikely event). The SPANX-C duplication is likely not polymorphic, because a SPANX-C null allele was not detected in a PCR analysis of a human population of 200 individuals that used specifically designed primers (Table 2). A detailed sequence analysis of the gorilla TAR clone showed that its greater length, compared with the bonobo and chimpanzee sequences, was caused by a 3.4-kb insertion. This insertion contains an ORF homologous to several human ESTs (AL136558). The human gene corresponding to these ESTs consists of eight exons and spans about 30 kb on chromosome 3. Thus, the intron-less insert in gorilla appears to represent a reverse-transcribed duplication of this gene, most likely a retropseudogene. Identification of the recent insertion of the apparent pseudogene in the syntenic region in gorilla suggests that the SPANX-C-harboring locus is a rearrangement hot spot. Thus, analysis of TAR clones revealed the lineage-specific amplification of SPANX-A/D genes in humans.
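The identity and divergence figures quoted in this Example (about 95% identity in the flanking sequences, up to 15% divergence tolerated by TAR) are pairwise comparisons of aligned sequences. A minimal Python sketch of such a calculation is given below; the function name `percent_identity`, the gap symbol, and the choice to ignore gapped columns are assumptions made for illustration and do not describe the method actually used for the TAR isolates.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical bases over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns in which both sequences carry a base (assumed convention).
    pairs = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)
```

Percent divergence under the same convention is simply 100 minus this value.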
EXAMPLE 2: Sequences of SPANX-N Genes (Genomic Sequences)
This Example provides genomic sequences for several SPANX-N genes, as well as information about where SPANX-N polypeptides are encoded in the genomic sequences.
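The coordinates given in this Example appear to be 1-based, inclusive positions in the numbered listings. As a rough sketch of how such a listing might be processed, the Python below strips the position numbers and spacing from a listing, joins the stated exon ranges into a coding sequence, and checks that the result starts with ATG, ends with a stop codon, and is in frame. The helper names (`parse_listing`, `extract_cds`, `looks_like_orf`) and the toy listing are hypothetical and serve only as an illustration; they are not part of the disclosed sequences.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_listing(text):
    """Drop position numbers and whitespace from a numbered sequence listing."""
    return "".join(re.findall(r"[ACGTacgt]+", text)).upper()


def extract_cds(genomic, exons):
    """Join exon ranges given as 1-based, inclusive (start, end) pairs."""
    return "".join(genomic[start - 1:end] for start, end in exons)


def looks_like_orf(cds):
    """True if the sequence is in frame, starts with ATG, and ends with a stop codon."""
    return len(cds) % 3 == 0 and cds.startswith("ATG") and cds[-3:] in STOP_CODONS


# Toy listing (hypothetical): exon 1 = positions 5-10, exon 2 = positions 21-26.
toy_listing = """
1 CCCCATGGAG GTAAGTTTAG
21 GCCTGACCCC
"""
toy_genomic = parse_listing(toy_listing)
toy_cds = extract_cds(toy_genomic, [(5, 10), (21, 26)])
```

Note that `parse_listing` silently discards any character other than A, C, G, or T, so a listing that still contains extraction damage would yield shifted coordinates; the sequence length should be checked against the last position number before any exon is extracted.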
The SPANX-N1 genomic sequence (SEQ ID NO:92) is provided below. The SPANX-N1 polypeptide is encoded by nucleotides 179 to 253 and 8263 to 8406.
1 TGGGACACTG CCTGTATGAT CAAACAAAGC TCAAGGGTGT
41 GGCTTTGCCT TGTCACCAGG AGGGTATATA TAGGGAGGGC
81 AAGAGCTCTG GGACATCCTC CTGGGAAGCT TCAATACAGC 121 TGTGCAAGTC TGGAGTCTAC AAGAGCCTAC TATAGACATT
161 CTACAACCAA CCAGAATCAT GGAACAGCCC ACTTCAAGCA
201 TCAATGGGGA GAAGAGGAAG AGCCCCTGTG AATCCAACAA
241 TGAAAATGAT GAGGTAAGAT TGTTAGGTTT TGAAGGGAAG
281 GTGAGGGTGA AAGAAAGACA CACAGAGAAG GGACGGCTCA 321 AACAGCAACA CAGGAATATT GCAGACACCT GTGGAAGTGG
361 GAGACCCGCT TAATGCCAGA GCCCACCGCC GCTTACAGCC
401 TGGGGTGCTT GTAGGTAGGG GTGGGAGGGG CCTGGGCAGT
441 ATGGCTTGCT GCCCGGCAGG ATATTGATAA GATGTTTTTA 481 TGATCAGGCT GTTTGGTCCT TTTTCCAGTG GGATGTCATT 521 GTGGTGTTTC TTGGAACTTT GCCCAGCAAG ATACCATAGG 561 AAAGTTTCTT TAGTTGGACC TTTGTCCGTC TTGTGTTACG 601 ATGATTAGGC AAGATTTTTC TCAGCCTGAA CCCCCGTGGA 641 ATGTTTCACT TTGATCAAGG TCTGAAAAAT AGCAGGGTGC 681 TTACTAAATG GTGGTTTGGA CTCACATTCT TGCCTTCTAC 721 TTTAGTATAA AAGGAAGAGG GGCATTGTTG ATTATCTGGC 761 TGCTTCCTGC CGAATAGGGG AGCTGTAATC AGGGTTTGGG 801 TTTTGAAGCA GTGGGTGTTG GACTTCAGAG TTGTTTTCCT 841 GGAGGCACTG GTACCGGACT TGGCAGAGGA GAAGGATGGT 881 ATCAATGTGT TGCTGGGTGG CTGCCTGGAC AGGGGAGTTT
921 AGCCTTCGGG AGATAAAGCA GGATATGAAG GTGAGTATAC
961 ATCGGCCAAA TGTACTATCA GGAGGAGGAA GATTAACGGC
1001 CCTAAGAAAG GGACCACCTG CTATCCCAGT GCTGAGTGCA 1041 TCAGAGATGC TTAGTTCTGC TAACACTAGA ATGAGATGTA
1081 CAGCCTGCTT AGTGTGTGTG GAGGAAGATG AGACAGAAGC
1121 TACTAAAGGA ACCTGGATGG TCTGCTTGTT AGGCAAAATG
1161 TTAATGTTTA GATTAACAAA AAAGTTGTAG ATTATGGCGG
1201 TTAACGAAAG TTGTAGGTTA TGGTGGCAAG CCATGAGAAG 1241 GTGGGGGTTG AATTCTCCAC AAACACTTTG TTTTAACTTT
1281 TATATTCCTC CAAGTTCAAT AGCTGCCAGC AAGGGTAGCC
1321 CCACTGAAGG CCTGATAGGA AATGTTGGTG GTGGAGGAGG
1361 TGAAGGCTGC TTGGTTTTGG AGAGAGAGGG GGAAATGGGT
1401 TTTTGTAACT AATAGCCACT GTGGTCCTTA CAGGGCAGAA 1441 GGATACAGGT TGCAGTTGAG GGAATTGTTT GTGCATGTTT
1481 TCCAGGGAGC GCCCTTGAGT TGGATACATG AAGAGTGGCT
1521 CTAGCCAAGA GGGGAGGCGT TGGTAAGAGG TATTTTATGG
1561 AATCAGAGTT GTAACATGTT TAGTTAAGAG GCTATGAGGT
1601 GAAGGAGGGT GACAGCCCTG TGTGTGATGG TGTGGGTGGA 1641 TTTAATGGAT TGGGGGGAAA GGTGATTAGC CGAGCAAGAT
1681 TATAAACAGG ATTTGGGAAT GAATTGGTTA AGTAGGTGAG
1721 ATGCAGGTTT ATTTTCGACC AGGTTTATGT GGCCAGGTTG
1761 ACAGGAAGGG ATGTGAACTG CTGGATTTGT GTGGACAAAC
1801 AAGTTTAACA GTTGGAAGTA AAAGGGGAGT GTGACTGATT
1841 TAATGGGCAA TGTGTGAGGC TGATGAAGGG GGCTCAACTG
1881 CTAGAGGTTG GGGGTGGGGA CTTAAGAGTA TCCCACAGAC
1921 AAAGAGGACA AAGGAGAAAA AGGAGATTTG AGTAGGAGTG
1961 AAATTTTGGA AGGTGCCCTA AAGTCATACC TCCTGCGTTA
2001 GTGTGAGAAA TTGGCATAGA CTATCCAGGG ACATAGGGAA
2041 GGGAAACCAG GAAAGATCTG AAACAAAGTA GGAGATAAAA
2081 GATTGGAAAT TGGCGACGGA GAGTGTTATG GGCTAGGGTG
2121 TTCTGGATTG GCAAATTCTG GAATTCTTGT TAAGTACAGT
2161 GAGGCTGGTC CTGTGAGGGA ACGAGAATAA TTTGGGGGTG
2201 AACAAATTTC TAGATGTGGA TCTGGTGCTC TTTTTAGTTT
2241 GAAATGGTGC ATTCAGTGTG GAAAGGATGT TAGCTTTGGC
2281 GCTGTGGGAG TAGTTAGCAT CACCTGGTGA GAACCCCTCC
2321 ACTTAGGTTG GAGAGGGAGG AGGAGGAGTC TGTGATTCAG
2361 ACCCAGTCCC CTGGTTGTAG GGACAGGGAG GAGTGTTTTG
2401 AGGATGGACT TTTAGGCTGG GTCAAGTAAG CATTTGTGTA
2441 CTGTCTTATT AGATGTTGGG TGAGGTGTAA CGCCGGCCAA
2481 GTATTCCATA TAGAAGGGGG AGAAAATACG GGGATATTTT
2521 GGAGGATAAA TGGAGCATTT GTACATGAGT TTAAATGGGC
2561 TTAGGCTGAG GAGCTTTTGG GAAATGACTC ATAAATGCAT
2601 GAGGAACAAT GGGAGAAGTG AAGTCCAGGC CATTTTAACC
2641 GTTAGGGAGA GTTTGGTTAG TTGTTTTTTA TAAAATTTTT
2681 CCTGAAGATT CGGAGTGGTA AGGAATATGG AAAGCCCATT
2721 TAATGTTTAG AGCCTTTGCC AGCTGTTGGC TGAACTGTGA
2761 AACAAATTGA CCCATTGTCT GACTGGATGC AAGAGCGAGT
2801 TTAAACCGGT GGATAATATG GGTGAAGAGA ATAGACGTGA
2841 TGGTGTGTGC CTTTTTGGTG GTGGTGGGAA AAACTTTTAT
2881 TCATCCAGAG AATGTATCTA CTATTGTGAG GTATAGGAAT
2921 CACTTTCTGG GGGGCATGTG AGTGATGTTG ATTTGCTAGT
2961 CCTGCCCTGG TAAGTGTCCT CAGGCCTCGT GTGTTGGAAA
3001 GGAGGTGGTT TGATAGCTTC CTGAGGGGAA GTTTGAGTGC
3041 AAAGGGAACA TGCCTTAGTA ATATCTTTGA GATGGGCAGC
3081 CATGGAGGAA GAATATATGT ATGTTTTTAA AAGCTGGAGT
3121 AGGGGCAGTA ACCGGCATGG AAATGGTTGT GCACATATGA
3161 AAGTTCAGAA GGTTTTTTAA ACTTGGACAA GACACTTTTA
3201 TCATTGAGGT GGAACCATTT TTTCCTGAAT GGCGCCAGTC
3241 CGGGCAAGTG GGGTTTGTTC CTGCTGGGTA TATACAGGGT
3281 GTATGCTGGA AAAAAATGAG CAATAACGAT GGGGTTTTAA
3321 GGGCTGCCTG CCAGGCTGCC AAATTTATTG AAAGGTTTTT
3361 CTCGGTTTTG GCATCTGTAA CCTTTCGGTG TCCCTTGCAA
3401 TGGATAATGG TGGCCTTTAG TGGTAGTTTA TCTACCCCCA
3441 GCAGCTTATG TATAAGTTTG CCATTTCTTG TGGAGGTTCC
3481 TTTTGTAGTT AGGAAACCCT GTTCCTGCCA GATGAAGGCA
3521 TGAGACTGTA GGACATGGTA TGCATATTTG GAATGGGTGT
3561 AAATGTTGAC CCTCTTTCCC TTTGCTAGGG TGAGGGCCCT
3601 GGTTAGGGCA ACTAGCTTTG CCTGTTGAGA GGCAGCATGG
3641 GGTGGGAGAA CATTGGATTC TAGGAGTTTA TTTTCAGCAA 3681 TGATGGAATA GCCAGCTGCT GGACATGGCT CCCTAAAAGA
3721 GCTTCCATTA ACTAACCGTG TATGTGTTCC CTGCAAAGAG
3761 GCATCTGAAA TGTGTTGGAA GGGGGAGGAG AGGGAGTCTA
3801 AGAGGTCCAG ACAGGAGTGA GAGAGCTTAG AGTCGGAAGT
3841 GTTTACAGGG AGGAGGGTGG CTGGGTTGAG AGTTTTATAT
3881 CTCTGGAAGG TGATTAGAGG GTTACCTATG AATAAGGCAT
3921 GTACCTGCTG TAAGCAGGAT GGTGGGAGGC ATAGAAGAAA
3961 TTGATGGTTT ATGAGGTCCT GTAGGTAATG GGAAGATGCA
4001 ATAGTAATGT GTTGGTAAAG AGGGAGTTTC TGTGCCTCTA
4041 AGGCCAGTGA TGTGGCCACA CCCAAGATTT TTAGTCAGAG
4081 TGACCAGCAT TGGATGACAG AGTCCAGTTG TTTTGAGAGA
4121 TGTGCAATGG CTTCTGGGGC GTCGCCGTAT GTTTGACAGA
4161 GTAGTCCAAG GACAAGGCCT TGGTCGGAAT GTACATACAA
4201 AGTAAAGGGC TTGGTGAGGT TGGGCAGTCC TGGTGCCAGG
4241 GCCATTAAAA GGGCATTTTT AAGTTTTTTG TTTGTTTGTT
4281 TTGGTTTTGG TTTTTTGTTT TGTTTTGTTT TGTTTTAGTG
4321 GGAATTGATG GGGTAAGCTG GGTTTGGGGA TTTTAGGATG 4361 GGCCCATGTG AGGCCATGTA GAGTGGCTTG GCCAGCAAGT
4401 CAAAGTTGGA AATTCATGGC CAAAAGTATC CCACAAGGCC
4441 CAAGAAGGAG AGGAGGTCCT TTTTTTGTGT GGGGAAGGGG
4481 CATGTCCCAA ATTAGCTCCT TTCATTGGGT TGGGATGGCT
4521 CGAGAATTAG GGGTTAGGAC AAACCCATGT TTGTGCTACC 4561 TGAGATTTTG TGGGTTAGGC CCAATCTCCT CGATTATGGA
4601 AGCAGTTTAA AACCTGAGTG GTGTGTTGAA TGGACAGGTT
4641 AAGGAAAGGG CTACAGAGAA GGAGGTCATT GACATATTGG
4681 AGGAGGGCAC TAGGAAAAAG AGGAAGTTCA GCTAGGTCCT 4721 TGATGAGGGC CTGTCTGAAT AGATGGGGGC TATCCTGGAA
4761 CCCCTGTGGG AGTATAATCC ATGTTAGTTG GGTGGACATG
4801 TGAGTATTAG GATTTGGCCA AGTTAGAACA AAGAGACTTT
4841 GGTAAGCTGG GTTTAAGGGA AGAGTGAAAT AGGCCTCTTT
4881 TAGCTCCAAC ACAGAGGAGT GTGTGGTAGA TGTAGGAGTA 4921 TGAGAGAGTA GAGTATATGG GTTGCAGACC ACCAGATGGA
4961 TTGGTACCAC TGCCTGGTTA ACAACTTGGA GATCCTGGAA 5001 AAAGGGGTAA GTCTTGTCTG TCTTCTGGAA AGCCAGAATA 5041 GGGGTGTTGT GGGGAGAGTT GACGGGCTTG AGAATTTGAG 5081 CTTGCAAAAG TTTACAAATA GGATTGAGGC CCCTTAAACC 5121 GGCTGGATTA AGTGGATATT GAGACTGATG AAGGAAAATG
5161 GAGGGGTTCT GGAGGGTTAT TTTAACTGGG ATGTGATGTG
5201 TGGCTATTGT GGGTTTAGAA ACATTCCAAA GTTTAGAATT
5241 AACAGAAGGT AACAGGGTGG ATAATGAGGA TGAGTAGGGG
5281 GAGAGGGAAG CGTTTGGGTG TCAGAGTAAA ATAAAAGGGG 5321 TAGAATTGTA GGAGCCACAT TGCATGGAGG TCTGGGATTT
5361 ACTTAATATG TCCCACCCCA AGATAGGGGT AGGGGACTGA
5401 GGGATAACCA GGAAAGAGTG GGTGAAGGGG GCTGTTGAAT
5441 AGGTTGCATA ATAAAGGACG AGTCTGTGCC TAGAGGGGAT
5481 TCTATTGACT CTCACAATAG AGATTGAAAA ACGGAAAAAG 5521 GGTCTAGAAT ATTTGGGTAA AACCGAGTAA CTAGTTGTCC
5561 TATTGAGCAT CCAATAGGAA AGACATGGGC TTACTAGAGA 5601 CTGACAGAAT TACCCTGGCT TCCGAGGTGG TGATGGCAGT
5641 ACGGGCGGTG GACTCTAGGC CCCGTCTGTC TTCAGGTACG 5681 GGTGGTGAAG CAGGATGAAG AGTTTCACTG AGCACAGTCT 5721 GACTTCCAGT GTTGCTTGAT ACCACAGATG GGGCAAGGTT
5761 TCCAGAATGT CCTGGGATTA GGGTAGGCTT TTTCCCAGTG
5801 TCCGTGTTGG ACGCACTTAT AACAGGCTCC TGGAGTTGAC 5841 TGTCGCCAAG TTGAGGAATT CTGTATGCCT TGGGAACCCT 5881 GTTGAGTGGC AGCCACCAGC ATTTGGTATT TAGCCTGGTC 5921 CTTCTTTAGG TTTGAGGTTT TTGGTTCTTC CTCTCTATTC
5961 TTAAAGACCT TAAAGGCCAC TTTGATTAAG TCTCTTTGGG
6001 AGGTTTGAGG GCCATCCTTC ATTTTTTAAT GTTTTTTTTC
6041 AAATGTCTGG GGCTGACTGG GAGATAAAGT GTAAATGGAG 6081 GTCGATTTTG CCCTTATTGT TATTAGGGCT TAAGGTGGCA 6121 TATTTAGTTG TGACCTTTGG AAGGCAGGAG AGGAAAAGAT
6161 CAGGGGCGAT ATTATATATG CCAGCTAGAA GGCATGATAT
6201 GATATGGCCT GGTTTCTGTC TGTCTACAGA AGTTGCCTGA
6241 CAATTCCAAT TGGGGTCAGT TCTGGGGACA GCTAGGGTCC 6281 CTATTGGGTT GTGAGTAGCA TCTTTTTGTT GGAGGGTGTT
6321 CTCATGGGCC TGGGTGCCAA TAAAGATGTG TTCCCTGTCC
6361 TCCAGGGTGA GGGTAGAGGA GAGGATGACA TATACGTCAT
6401 GCCAGGTAAG GTCATAAGAT TTAGGGAAGT ACAGAAATTT
6441 CTTGCAGAAG GAGGTAGGAT CCATTGAAAA GGAGTCAAGT 6481 CTCTTTTCAA TATGGGAGAA GTCAGCTAGG GAACTTGAAT
6521 TCAGATTGTG TCTTCAGGCC TCGCAACCTC CTGCAAGGGA
6561 AGGACTTTGG AAGGCCTTTG GGCATGTGCC TGGTTTCTAG
6601 CGAAAGGGGG TTGGGTGTGA GACCTGGTTT GAGGTGGAGA
6641 AGGAAGGGGG ATGGAGGGTG ACAGTAAGGG GGTGGAGGAG 6681 CCGGATCCGA AGGAGAAGGT GGAGGGAATG GGAGTGGGCA
6721 GGAGGCAGAG GAAGGGGTTT GTTGTCCAGG TTGAAATTCA
6761 GAGGAGGAAG GCTTGTCCGG AGGAGGAGGC ATTTTGGGGG
6801 GTTTCTCCTT GAGGAGGAAG ATTTGAAAAG GTGAACAAGA
6841 TTGGCAGAGA GGGAGAGTGA GCTGTGAACG AAGATAAGCA 6881 AAGCACTTCA GATTATTTAC CCTTGTGTTG GAGAAAGTTT
6921 TTCAGATGAT GTTGAATTTG GAAGTTGAAT GTTTTGTTTT
6961 TTTGGAAGAG GGGAGTGAAT ACTCTGGATG GGAGTGACTA
7001 CCCCTAAGGA AAGTGAGTAC CCTGGAAGAG AGTGAATATC
7041 CCAGAGCGGA GTGAGTTCTC CAGAGGGAAG TGAATACACC 7081 AGAGGAAAGT GAGTACTTGA TTCCGCCTGG AAGAAGGGAC
7121 CTGGAGAGTC GGGGCAGCCC CGTACTATTC ATGCTCCCCA
7161 GAGGCCGGAG CGGCCAGAGT GGCAAGCATT CTGAGGCTGT
7201 CCCCTGAGAA GGGAGGCTGC GGTCACCTGG TGACTGGGGA
7241 GACCTCTCAC CCAGCACTTG ATTGTTTTTT GGAGAACAGG 7281 CAGAAATAGA GAGGAATCCA GAAAGGAGGA AGAGGGAGAC 7321 TCACCCACTA ATTGACGCCC AGTGTTGGAT ATGATGTCCA 7361 GAAATGGGGG TGATCCTTGT TGCAGCAGCG CAGAAAGGGG 7401 AAGGAAGGGA AAAGAAAAAG GTCGAGAGTT TGATCAATAT 7441 CTGGAGGTTA TCCCAGGTCT GGAGAAGACT AGAGTATAAG 7481 GGGAATGGAA CTAGGAACAG GGAGATCACA GGATCTGGAA 7521 GCAGGCCCCG GTCTAGTTTG TGCTGCTTTT CGCTTTCTGG 7561 GTTGCAGCAG AAAACTTACC CATTAGAACC CAATCACCAA 7601 CCACTCCAGG TTTTGACACC AAAATGTCAG GTTTTGAAGA 7641 GAAGGCGAGG ATGAAAGAAA GACACACAGA GATGGGGCAG 7681 CTCAAACAGC AACACAGGAA TATTGCAGAA ACTTGTGGAA 7721 GTGGGGGAAC AGCTTAATGC CAGATCCCAC CACTGCTTAC 7761 AGCCTGGGGT ACTTACAGGT ATGGGTGGGA GGGATCTGGG 7801 CAGTATGGCT TGCTGCCAGG CAGGATATTG ATAAGATGTT 7841 CTTATGATCA GGTGGTTTGG CCCTTTTTCT GGTGGAATAT
7881 CATTGTGGTG TTCCTTAGAA CTTTGCCAAG CAAGATATGA
7921 TAGGGATGTT TCTTTAGTTG GGCCTTTGTC CGCCTTGTGG
7961 ACAGGTGGTT AGGCAGGATG TTTCTCACGG CCTGAACCCC
8001 CATGGGATGT TTCACTTTGA CCAAGGTCTG CAAAATGGCA 8041 AAGAACTTAC AAAATGGTGC ACTTTGGACT AACAGGTGAC
8081 CCTACCCATG CTCCTCTTCT TCTTCCCCAT AGATCCCTAC
8121 CCTATGATTC AACCTTGTTC TTCTCTGGCA CACACCCCTT
8161 CCTCCACCTG CATTCCTTCT CATAAAGCCC CCCTTGCTAT
8201 CCAGTCTCTA TCCTATTCAC CCAAAATAAT GTCTTTCTGG
8241 CCTCTCCCTG TTTTCTTAAC AGATGCAGGA GACACCAAAC
8281 AGGGACTTAG CCCCCGAACC GAGTTTGAAA AAGATGAAAA
8321 CGTCAGAATA TTCAACAGTA TTAGCGTTTT GCTACAGGAA
8361 AGCTAAGAAA ATACATTCAA ATCAACTGGA GAATGACCAG
8401 TCCTGAGAGA ACTCCATCAA TCCAGTCCAA GAGGAGGAGG
8441 ACGAAGGCCT AGACTCAGCT GAAGGATCTT CAAAGCAGGA
8481 TGAAGACCTA GACTTACCTG AAGGCCTAGA CTCAGCTGAA
8521 GGATCTTCAA AGCAGGATGA AGACCTAGAC TTACCTGAAG
8561 GATCTTCAAA GCAGAATGAA GACCTAGGCT TACCTGAAGG
8601 ATCTTCAAAG CAGGATGAAG ACCTAGACTT ACCTGAAGGA
8641 TCTTCACAGG AGGATAAAGA CGTAGAATTA CCTGAAGGAT
8681 CTTCACAGGA GGATGAAGAC CTAGAGAAAA AGCGATCATC
8721 TAGGTCAAGA AATGAAATCG AAATTTCAGA AATTACAGAA
8761 ATTCTCCAAG GAAGGAGCCA GATAAATAAA TATTCTGATT 8801 TCATTTTCCT CCTTCTCCAG TGATCTGCAG GGACCTACCA
8841 TTCACTAAAA CTAACCCGAA GCCATAGAAC AAGACAGCTT 8881 ATTAATACAG TCCATGGTTT ATATGAGTAG ATTTTTAGCA 8921 GCACAAAGTA GAGTGGAG
The SPANX-N2 genomic sequence (SEQ ID NO:93) is provided below.
The SPANX-N2 polypeptide is encoded by nucleotides 179 to 256 and 8342 to 8806.
1 TGGGACACTG CCTGTATGAT CAAACAAAGC TCAAGAGTGT
41 GGCTTTGCCT TGCCACCAGG AAGGTATACA TAGGGAGGGC 81 CAGAGCTCTG GGACATCCTC CTGGCAAGCT TCAATATAGC
121 TGTGGAAGTC TGCAGTCTAC AAGAGCCTAC TATAGACATT
161 CTACAACCAA GCAGAATCAT GGAACAGCCG ACTTCAAGCA
201 CCAATGGGGA GAAGAGGAAG AGCCCCTGTG AATCCAATAA
241 CAAAAAAAAT GATGAGGTAA GATTGTTAGG TTTTGAAGGG 281 AAGGTGAGGG TGAAAGAAAG ACACACAGAG AAGGGACGGC
321 TCAAACAGCA ACACAGGAAT ACTGCAGACA CCTGTGGAAG
361 TGGGGGACCT GCTTAATGCC AGAACCCACC GCCGCTTACA
401 GGCTGGGGTG CTTATAGGTA TGGGTGGGAG GGGCCGGGGC
441 AGTATGGCTT GCTGCCCGGC AGGATATTGA TAAGATGTTT
481 TTATGATCAG GCTGTTTGGT CCTTTTTCCA GTGGGATGTC
521 ATTGTGGTGT TTCTTGGAAC TTTGCCCAGC AAGATACCAT
561 AGGAAAGTTT CTTTAGTTGG ACCTGTGTCC GCCTTGTGTT
601 ACGACGATTA GGCAAGATTT TTCTCATGGC CCGAACCCCC
641 GTGGAATGTT TCACTTTGAT CAAGGTCTGC AAAATAGCAG 681 GGTGCTTACT AAATGGTGGT TTGGACTCAC ATTCTTGCCT
721 TCTACTTTAG TATAAAAGGA AGAGGGGCAT TGTTGATTAT
761 CTGGCTGCTT GCTGCCGAAT AGGGGAGCTG TAATCAGGGT
801 TTGGGTTTTG AAGCAGGGGG TGACGGACTT CAGAGTTGTT
841 TTCCTGGAGG CACTGGTACC GGACTTGGCA GAGGAGAAGG 881 ATGGTATCAA TGTGTTGCTG GGTGGCTGCC TGGAGAGGGG
921 AGTTTAGCCT TCGGGAGATA AAGCAGGATA TGAAGGTGAG
961 TATACATCGG CCAAATGTAC TATCAGGAGG AGGAAGATTA
1001 ACGGCCCTAA GAAAGGGACC ACCTGCTATC CCAGTGCTGA 1041 GTGCATCAGA GATGTTTAGT TCTGCTAACA CTAGAATGAG
1081 ATGTACAGCC TGCTTAGTGT GTGTGGAGGA AGATGAGACA
1121 GAAGCTACTA AAGGAACCTG GATGGTCTGC TTGTTAGGCA
1161 AAGTGTTAAT GTTTAGATTG AAACACAAGG GTGCATGTTC
1201 CTGACCAATT GGCCAGTAGG CAGAGATAAG AGTTTGTGCC 1241 GCAAAGGAAG AAGACACTAG GGGTGGACAG ACAAAAGTTG 1281 TAGGTTATGG TGGCAAGTCA TGAGAAGGTG GGGGTTGAAT 1321 TCTGCACAAA CCCTTTGTTT TAACTTTTAT GTTTTTCCAA 1361 GTTGAATACC TGCCAGCAAG GGTAGCCCCT CTGAAGGCCT 1401 GATAGGAAAT GTTGGTAGTG GAGGAGGTGA AGGCTGTTTG 1441 GTTTTGGAGA GAGAGCAGGA AATGGGTTTT TGTAACTAAT 1481 AGCCATTGCG GTCCTGACAG CGCAGAAGGA TACAGGTTGC 1521 AGTTGAGGGA ATTGTTTGTG GATATTTTCC AGGGAGCGCC 1561 CTTGACTTGG ATACATGGAG AGCGGCTCCG GCAGAAAGGG 1601 GAGGGGTTGA TAAGAGGTAT TTTCTGGAAT CCGACTGGTA 1641 ACATGTTTAG TTAAGAGGCT ATGAGGTGAA GGAGGGTGAC 1681 AGCCCTGTGT GTGACGGTGT GGGTGGATTT GATGGATTGG 1721 TGGGGGGGGA AAGGTGATTA GCCGAGCAAG ATTATAAGCA
1761 GGTTTTGCGA ATGAATTGGT CAATTAGGTG AAATGCAGGT
1801 TTATTTTGGA CAGGGTTCAT GTGGCCAGGT TAACAGGAAG
1841 GGCTGTGAAC TACTGGATTT GTGTGGACAA ACAAATTTAA
1881 CAGTTGGAAG GAAGAGGGGA GTGTGACTGA TTTAAGAGGC 1921 AGTGAGGCTG ATGAAGGGGG CTCATCTGTT AGACGTTCGG 1961 GGTGGGGACT TAGAATATCC CACAGATAGA GAGGACAAAA 2001 GGAGAAAGAG GAGATTTGAG TAGGAGTGAA ATTTTGGAAG 2041 GTGCCCTGAA GTCATATCTC CTTGATTAGT GTGACAAATT 2081 GGCCTGGACT ATTCAGGGTC ATGGGGAAGG GAAACCAGGA 2121 AAGATCTGAA ATAAAGTAGG AGACAAAAGA TTGGAAACTG 2161 GAGACAGAGA GTGTTATGGA CTAAGGTGTT CTGGATTGGC 2201 AACTTCTGGA ATTCTTGTTA AGTACAGTGA GGTTGGTCTT 2241 GTCAGGGAAG GAGAATAATT TGGGGGTGAA CAAATTTCTA
2281 GATGTGGATC TGGTGCTCTT TTTAGTTTGA AATGTGCATT
2321 CAGTGTGCAA AGGATGTTAG CTTTGGCGCT GTGGGAGTAG
2361 TTAGCATCAC CTGGTGAGAA TTCCTCCACT TAGGTTGGAG 2401 AGCGGGGGAG GAGGAGTCTG TGATTCAGAC CCAGCCCGCT
2441 GGTTGTAGGG ACAGGGAGGA GTGTTTTGAG GATGGACTTT
2481 TAGGCTGGGT CAAGTAAGCA TTTGCGTACT GTCTTATTAG
2521 ATGTTGGGTG AGGTGTAACG CCGGCCAAGT ATTCCATATA
2561 GAATGGGGGG AAAATACAGG GAGATTCTGG AGGATAAATG 2601 GAGCATTTGT ACATGAATTT AAATGGGCTT AGGCTGAGCG
2641 CTTTTAGGAA ATGACTCATA ATCGCATGAG GAACAATGGG
2681 AGAAGTGAAG TCCAGGCCAT TTTAACCTTT AAGGAGAGTT
2721 TGGTTAGTTG TTGTTTTTTA TAAAATTTTT CCTGAAGATT
2761 CGGAGTGGTA AGGAATATGG AAAGCCCATT TAATGTTTAG 2801 AGCCTTTACC AGCTGTTGGT TAAACTGAAA CAAATTGGCC
2841 CATTGTCTGA CTGGATGCAA GAGGGGAGTT TAAACCGGTG
2881 GATAATATGG GTGGAGAGAA TAGAAGCGAT GGTGTGTGCC
2921 TTTTTGGTGG TGGTAGGAAA AGCTTTTATT CATCCAGAGA
2961 ATGTATCTAC TATTGTTAGA ATGTATAGGA ATCACTTTCT 3001 GGGGAGCATG TGAGTGATGT TGATTTGCTA GTCCTGCCCT
3041 AGTAAGTGTC CTCAGGTCTG GTGTGTGGGA AAGGAGATGG
3081 TTTGATAGCT TCCTGAGGGG AAGTTTGAGT GCAAAGGGAA
3121 CATGCCTTAG TAATATCTCT GAGATGGGCA ACCATGGTGG
3161 AAGAATGTCT AGAAGTTTTT AAAAGCTGGA GTAGGGGCAG
3201 TAACCGGCAT GGAAATGGTT GTGCACATAT GAAAGTACAG
3241 AAGGTTTTTT AAACTTGGGC AAGACACTTT ATCAGTGAGA
3281 TCGAACTATT TTTTCCTGAA TGGCACCAGC CTGGGCAAGT
3321 GGGTTTTTTT CCTGCTGGGT ATATACAGGG TGTATGCTGG
3361 GAAAAAAAAT GAGCAATAAC GATGGGGTTT TAAGGGCTGC 3401 CTGCCAGGCT GCCGAATTTG TTGAAAGGTT TTTCTCGGTT
3441 TTGGCATCTG TAGCCTTTCG GTGTCCCTTG CAATGGATAA
3481 TGGTGGCCTT TGGTGGTGGT TTATCCACCC CCAACAGCTT
3521 ATGTGTAAGT TTGCCATTTC TTATGGGGGT TCCTTTTGTA
3561 GTTAGGAAAC CCTGTTCCTG CCAGATGAAG GCATGAGACT
3601 GTAGGACATG GTATGCATAT TTGGAATGGG TGTAAATGTT
3641 GACCCTCTTT CCCTTTGCTA GGGTGAGGGC CCTGGTTAGG
3681 GCAACTAGCT TTGCCTGTTG AGAGGTAGCA TGGGGTGTGA 3721 GAGCATTCGA TTCTAGGAGT TTATCTTCAG CACTGATGGA 3761 ATAGCCAGCT GCTGGACATG GCTGCCTAAA AGAGGTTCCA 3801 TTAACTAACC GTGTATGTGT TCCCTGCAAA GGGGCATCTG 3841 AAATGTGTTG GAAGGGGGAG GAGAGGGAGT CTAAGAGGTC 3881 CAGACAGGAG TGAGAGAGCT TAGAGTCGGA AGTGTTTACA 3921 GGGAGAAGGG TGGCTGGGTA GAGAATTTTA TGTCTCTGGA 3961 AGGTGATTAG AGGGTTACCT ATGAATAAGG CATGCACCTG
4001 CTGTAAGCAG GATGGTGGGA GGGATAGAAG GGATTGGTGG 4041 TTTATGAGGT CCTGTAGATA ATGGGAAGAT GCAATAGTAA 4081 TGTGTTGGTA GAGAGGGAGT TTCCGTGCAT CTAAGGCCAG 4121 CAATGTGGCC ATACCCAAGA TTTTTAGTCA GAGTGACCAG 4161 CCTCGGATGA CAGAGTCCAG TTGTTTTGAG AGGCGTGCAA 4201 TGGCTTCTGG GGCGTTGCCA TATGTTTGGC AGAGTAGTCC 4241 AAGGGCAAGG CGCTGGTCAG AATGTACATA GAAAGTAACA 4281 GGCTTGGTGA GGTTTGGCAG TCCCAGTGCC AGGGCCATTA 4321 AAAGGGCATT TTTAAGTTTT TAAGTTGGAG TTGATGGGGC 4361 AAGCTGGGTT CAGGGGTTTT AGGATGGGCC CATGTGAGGC 4401 TACGTATAGT GGCTTGGCCA GCAAGTCAAA GTTAGGAATT 4441 CACAGCCAGA AGTATCCCAC AAGGCCCAAG AAGGAGGGGA 4481 GGTCCTTTTT TTTTATGTGG GGAAGGG.GCA TGTCCCAAAC 4521 TGGCTCCTTT CATTGGGTTG GGATGGCTCG AGAATTAGGG 4561 GTTAGGACAA ACCCAAGGTA AGTAACTTTG ATTTGGGTAC 4601 CTGAGATTTT GTGGGTTAGG CCCAATATAC TTGATTATGA 4641 AGGAAGTTTA AAACCTGAGT GGTGTGTTGA ATGGACAGGT 4681 TAAGAAAGGG GCTACAGAGA AGGAGGTCAT TGACATATTG 4721 GAGGAGGGTA CTAGGAGCAA GGGGAAGTTC AGCTAGGTCC 4761 TTGATGAGGG CCTGTCTGAA CAGGTGGGGC CTATCCCGGA 4801 ACCCCTCTAA GAGTACAGTC TAGGTGTAAG TCCATGTGTA 4841 AGTCCTGTAA GAGTACAGGA CATGTGAGTA TTAGGATTTG 4881 ACCAAGTCAA AGCAAAAAGA CTTTGGTAAG CCAGATTTAA 4921 GGGAATAGTG AAATAGTTGT CTTTTAGTTC AATACAAAGG
4961 AGTGTGTGGT AGATGGGGCA ATACGGTAGA GTAGAGTATA
5001 TTGGTTGAGG ACCACTAGAT GGATTGGTAC CACGGCCTGG
5041 TTAACGACTT GGAGATCCTA GACAAAGGGG TAAGTCCCGT
5081 CTGTCTTTTG GAAAGCCAGA ATAGGGGTGT TGTGGGGAGA
5121 GTTAACAGGC TTGAGAATAT GAGCTTGTAA AAGTTTACAG
5161 ATAATAGGTT TGAGACCCCC TAAGCAAGGC TGGATTAAGG
5201 GGATATTGAG ACTGATGAAG GAAAATGGAG GGGTTTTGAA
5241 GGGTTATTTT AACTGGGATG TGATGTGTGG CTATTGTGGG
5281 TTTAGAAACA TTCCAAGCTT TAGAATTAAC AGAAGGTAAC
5321 AGGGTGGATA ATGAGGATGA GTGGGGGGAG AGGGAAGCGT
5361 TTGGGTGTCA GAGTAAAATA AAAGGGGTAG AATTGTAGGA
5401 GCCACATTGC ATGGAGTTCT GGAATTTACT TAATATGTCC
5441 CATCCCAAGA TAGGGGTAGG GGACTGAGGG ATAACCAGGA
5481 AAGAGTGGGT GAAGGGGGCT GCTGAATAGG TTGTATAATA
5521 AAGGACCAGT CTGTTTGTGC CTAGAGGGGA TTCTATTGAT
5561 TCTCACAATA GAGATGGAAG AACTGAAAAA GGTTCTAGAA
5601 TATTCTGGTC AAACTGAGTA ACTAGTTGTC CTATTGAGTA
5641 TCCAATAGGA AAGACATGGG CTTACTAGAG ACTGACAGAA 5681 TTACCCTGGG TTCCGAGGTG GTGATGGCAG TAGGGGCGGT 5721 GGACTCCAGG CCCCACCTGT CTTCAGGTAC AGGTGGTGAA
5761 GCAGGATGAA GAGTTTCAGT GAGCACAGTC TGACTTCCAG
5801 TGTTGCTTGA TACCACAGAT GGGGCAAGAT TTCCAGAGTG
5841 TCCTGGGATT AGGCTAGGCT TTTGCCCAGT GTCCCTGTTA
5881 GACGAACTTG TAACAGGCTC CTGGAGTTGA CTGTTGCCAA 5921 GTTGAGGAAT TCTGTAGGCC TTGGGAACCC TGTTGAGTGG
5961 CAGCTACCAG CATTTGGTAT TTAGCCTGGT CCTTCTTTAG
6001 GTTTGGATTT TTTAGTTCTT CCTCTCTAAT CTTATAGATC
6041 TTGAAGGCCA CTTTGATTAA GTCTCTTTGG AAGGTTTGAG
6081 GGCCACCCTC CAGTTTTTTA AGTTTTTTTC AAATGTCTGG 6121 GTCTGACTGG GAGATAAAGT ATAAATGGAG GTAGTTTTTG
6161 CCCTTATTGG TATTAGGGCT TAAGGTGGCA TATTTAGTTG
6201 TGACTTTTGA AAGGCAGGAG AGGAAACGAG CAGGGACAGC
6241 TTTATTTATG CCAGCTAGAA GGCATGATAT CATATGGCCT
6281 GATTTCTGTC TGTCTACTGA AGTTGCCTGA TAATTCCAAC 6321 TGGGGCCAGT TCTGGGGACA GCTAGGGTCC CTATTGGATT
6361 ATGAGCAGCA TCTTGTTGGT GGAGGGTGTT CTCATGGACC
6401 TGGGCACCCA TAAAGATGTG TTCCCTGTCC TCCAGGGTGA
6441 GGGTAGAGGA GAGGATGACA TATATGTCAT GCCAGGTAAG 6481 GTCATAAGAT TTAGGGAAGT ACAAGAATTC CTTGTGGAAG
6521 GAGGTTGGAT CCATTGAAAA GGAGTCAAGT CTCCTTTCAA
6561 TATGGGAAAA CTCAGCTAGA GAGAAGGGAA CTTGAATTCA
6601 GATTATGCCT TTAGCCCTCG CAACCTCCTG CAAGGGAAGG
6641 ACTTTGGAAG GCCTTTGGGC ATGTGCCTGG TTCCTAGCGA 6681 AAGGGGGTTG GGTGTGAGAC CTGGTTTGAG GTGGAGAAGG
6721 AAGGGGGATG GAGGGCGACA GTAAGGGGGT GGAGGAGCCG
6761 GATCCGAAGG AGAAGGTGGA GGGAATGGGA GAGGGCAGGA
6801 GGCTGAAGAA CGGGTGTGTT GGCCAGGTTG AAATTCAGAG
6841 GAAGAAGACT TGTCCGGAGG AAGAGGCATT TTGGGGGGTT 6881 TCTCCTTGAG GAGGAAGATT TGAAAAGGTG AACAAGATTG
6921 GCAGAGAGAG AGAGAGAGAG AGAGAGAGTG AGGCATGAAA
6961 GGAGATAAGC AAAGCACTTC AGATTATTTA CCATTGCATT
7001 GAAGAAAGTT TTTTAGATTA TGTTGAATTT GGAAGTTGAA
7041 TGTTTTGTTT TTTGGAAGAG GGGAGTGAAT ACTCTGGATG 7081 GGAGTGACTA CCCCTGAGGA AAGTGAGTAC CCTGGAAGGG
7121 AGTGAGTACC CCAGGGGTGA GCGAGTACTC CAGAGAGAAG
7161 TGAATACCCC ATAGGGAAGT GAGTACTTGA TTCCTCCTGG
7201 GAGAAGGGAC CTGGAGAGTT GGGGCAGCCC CGTACTATTC
7241 ATGCTCCCCA GAGGCCAGAG CAGCCAGAGT GGCAAACGTT 7281 TTGAAGGTGT ACCCTAAGAA GGGAGGCTGC AGTCACCTGG
7321 TGACTGGGGA GACCTCTCAC CCAGCACTGG AATTTTTTGG
7361 AGAACGGGCA GAAATAGAGA GGAATCTGGA AAGGAGGAAG
7401 AGGGAGACTC ACCCACTAAT TGAAGCCCAG TGTTGGATGT
7441 GATGTCCAGA AATGGGGGTG ATCCTTGTTG CAGCTGCCCG 7481 GAAACAGGAA GGAACAAAAA ATAAGGTGGC TTGTTTGATC
7521 ACTATCTGGA GGTTAGCCCA GGTCTGGAGA AGACTAGAGA
7561 ATAAGGAGAA TGGGACTGGG AACAGGGAGT ACACTCAGGA
7601 TCTGGAAGCA GGCCCTGGTC TAGTTTGTGC TGCTTGCTGC
7641 TTTCCAGGTT GCAAGAGAAA ACTTACCCAT TAGAACCCAA 7681 TCACCACTCC AGGTTTTGGC ACCAAAATGT TAGGTTTTGA
7721 AGGGAAGGCA AGGATGAAAG AAAGACACAC AGGGGGCAGC
7761 TCAAACAGCA ACACAGGAAT ATGGCAGACA CCTATGGAAG
7801 TGGAGGACAA GCTTAATGCC AGAGCCCACC GCCGCTTACA 7841 GGCTGGGGTA CTTACAGGGA TGGGCAGGAG GGATCTGGGC
7881 AATGTGGGAT GCTATCCGGC AGGATATTGA TAAGATGTTC
7921 TTATGATCAG GTGGTTTGGC CCTTTTTCCA GTGGAATATC
7961 ATTGTGGCGT TCCTTAGAAC TTTGCCAAGC AAGATATGAT
8001 AGGAATGTTT CTTTAGTTGG GCCTTTGTCC GCCCTGTGGT 8041 CAGGTGGTTA GGCAGGATGT TTCTCACGGC CTGAGCCCCC
8081 ATGGAGTGTC TCACTTTGAC CAAGGTCTGC AAAATAGCAA
8121 AGAACTTGCG AAATCGTGCA GTTTGGACTA ACAGATGACC
8161 CTACCCATGC TCCTCTTCTT CTTCCCCATA GATCACTACC
8201 CTATGATTCC ACCTTGTTCT TCTCTGGCAC ACACCCCTTC 8241 CTCAACCTGC ATTCCTTCTC ATAAAGCCCC CCTTGCTATC
8281 CAGTCTCTAC CCTATTCACC CAAAATAATG ACTTTCTGGC
8321 CTCTCCCTGT TTTCTTATCA GATGCAGGAG GCACCGAACA
8361 GGGTCTTAGC CCCCAAACAG AGCTTGCAAA AGACAAAAAC
8401 AATAGAATAT CTAACAATAA TAGTGTATTA CTACAGGAAG 8441 CATACGAAAA TAAATTCAAA TCAACTGGAG AAGGACCAGT
8481 CCCGAGAGAA CTCCATCAAT CCCGTCCAAG AGGAGGAGGA
8521 CGAAGGCCTA GACTCAGCTG AAGGATCTTC ACAGGAGGAC
8561 GAAGACCTGG ACTCATCTGA AGGATCTTCA CAGGAGGACG
8601 AAGACCTGGA CTCATCTGAA GGATCTTCAC AGGAGGACGA
8641 AGACCTGGAC TCATCTGAAG GATCTTCACA GGAGGACGAA
8681 GACCTGGACT CATCTGAAGG ATCTTCACAG GAGGACGAAG
8721 ACCTGGACCC ACCTGAAGGA TCTTCACAGG AGGACGAAGA
8761 CCTAGACTCA TCTGAAGGAT CTTCACAGGA GGGTGGGGAG
8801 GACTAGTCAA ACATGGAGAA ACCAAATTGG ACAAATCCTC
8841 ACTACCAATG GCGATGATTA CAATAAAATC AAGTTTGAGG
8881 AGCTGATGAC CGTGTATATC TCTGCCTGTT GTCTGATGGT
8921 G
The SPANX-N3 genomic sequence (SEQ ID NO:94) is provided below. The SPANX-N3 polypeptide is encoded by nucleotides 179 to 256 and 8407 to 8754.
1 TGGGACACTG CCTGTATGAT CAAACAAAGC TCAAGGGTGT 41 GGCTTTGCCT TGTCACCAGG AGGGTATATA TAGGGAGGGC
81 AAGAGCTCTG GGACATCCCA CTGGGAAGCT TCAACATAGC
121 TGTGGAAGTC TGCAGTCTAC AGGAGCCTAC TATAGACATT
161 CTACAACCAA CCAGAATCAT GGAACAGCCA ACTTCCAGCA
201 CCAATGGGGA GAAGACGAAG AGCCCCTGTG AATCCAATAA 241 CAAAAAAAAT GATGAGGTAA GATTGTCAGG TTTTGAAGGG
281 AAGGTGAGGG TGAAAGAAAG ACACACAGAG AGGGGGTGGC
321 TCAAACAGCA ACACAGGAAT ACTGCAGACA CCTGTGGAAG
361 TGGGGGACCC GCTTAATGCC AGAGCCCACC GCCGCTTACA
401 GGCTGGGGTG CTTGTAGGTA GGGGTGGGAG GGGCCTGGGC 441 AGTATGGCTT GCTGCCCGGC AGGATATTGA TAAGATGTTT
481 TTATGATCAG GCTGTTTGGT CCTTTTTCCA GTGGGATGTC 521 ATTGTGGTGT ATCTTGGAAC TTTGCCCAGC AAGATATGAT
561 AGGAAAGTTT CTTTAGTTGG ACCTTTGTCC GTCTTGTGTT 601 ACGACGATTA GGCAAGATTT TTCTCATGGC CCGAACCCCC 641 GTGGAATGTT TCACTTTGAC CAAGGTCTGA AAAATAGCAG
681 GGTGCTTACT AAATGGTGGT TTGGACTCAC ATTCTTGCCT
721 TCTACTTTAG TATAAAAGGA AGAGGGGCAT TGTTGATTAT
761 CTGGCTGCTT CCTGCCGAAT AGGGGAGCTG TAATCAGGGT
801 TTGGGTTTTG AAGCAGTGGG TGTTGGACTT CAGAGTTGTT
841 TTCCTGGAGG CACTGGTACT GGACTTGGCA GAGGAGAAGG
881 ATGGTATCAA TGTGTTGCTG GGTGGCCGCC TGGACAGGGG
921 AGTTTAGCCT TCGGGAGATA AAGCGGGATA TGAAGGTAAA
961 TATACATGGG CCAATTGTAG TATTAGGAGG AGGAAAATTA
1001 AGGGTCCTAA GAAAGGGGCC CCCTGCTATC TCAGTGCTGA 1041 CTGCAGCAGA GATGTTTACT TCTGCTAACA CTAGAATGAG
1081 ATGTAGAGCC TGCTTAGTGT GTGTGGAGGA AGATGAGACA
1121 GAAGCTACTA AAGGAACCTG GATGGTCTGC TTGTTAGGCA
1161 AAATGTTAAT GTTTAGGCTG AAACACCAGG GTGTATGTTC
1201 CTGACCATTT GGCCAGTAGG CAGAGATAAG AGTTTGTGGC 1241 ACAAAAGAAG AAGACACTGG GGGTGGACAG ACAAAAGTTG 1281 TAGGTTATGG TGGCAAGCCA TGAGAAGGTG GGTGTTGAAT 1321 TCTGCAGAAA TTCTTTGTTT TAACTTTTAT ATTTTTCCAA 1361 GTTGAATAGC TGCCAGCAAG GGTAGCCCCT CTGACGGCCT 1401 GATAGGAAAT GTTGGTGGTG GAGGAGATGG AGGCTGTTTG 1441 GTTTTGGAGA GAGAGAGATA AATGGGTTTT TGTAACTAAT 1481 AGCCATTGTA GCCCTGACAG GGCAGAAGGA TATAGGTTGC 1521 ACTTGAGGGA ATTGTTTGTG CATGTTTTCC AGCGAGCGCC 1561 CTTGAGTTGG ATATATGAAG AGCGGCTCCA GCCAAGAGGG 1601 GAGGGGTTTC TAAAAGGTGT TTTATGGAAT CCGTTTGTAA
1641 CATATTCAGT TAAGAGGCTA TGAGGCGAGG GAGGGTGACA 1681 GCCCTGTATG TGATGGTGTG GGTGGATTTG ATGGATTGGG 1721 GGAAAAAGAG TTTAGCTGAT CAAGATTATA AACAGGTTTG 1761 GGAATGAACT GGTCAAGTAG GTGAGATGCA GGTTTATTTT 1801 GGACCGAGTT CATGTGGCCA GGTTGACAGG AAGGGCTGTG 1841 AATTGCTGGA TTTGTGTGGA CAAACAAGTT TAACAGTTGG 1881 AAGGAAGAGG GGAGTGTGAC TGATTTAAGA GGAAAGGTGT 1921 GAGGCTGATG AAGGGGGCTC AACTGTCAGA GGTTGGGGGT 1961 GGGGACTTAG AGTATCCCAC AGACAAAGAG GACAAAAGAA 2001 GAAAGAGGAG ATTTGAGTAG GAGTGAAATT TTGGAAGGTG 2041 CCCTGCAGTC ATACCTCCTG GATTAGTATG ACGAATTGGC 2081 ATGGACTATT CAGGGACATG GGGAAGGGTA GCCAGGAAAG 2121 ATCTGAAATA AAGTAGGAGA TAAAAGATTG GAAATTGGAG - 2161 ACAGAGAGTG TTATGGACTA AGGTGTTCTG GATTGGCAAC 2201 TTCTGGAATT CTTGTTAAGT ACAGTGAGGT TGGTCCTGTC 2241 GGGGAAGGAG AATAATATGG GGGTGAGGAA ATTTCTAGAT 2281 GTGGGTCTGG TGCTCTTTTT AGTTTGGAAT GGTGTATTCA 2321 GTGTGGAAAG GATGTTAGCT TTGCCCCAGT CGGAGTAGTT 2361 AGGATAACCT GGTGGGAACA TTTCCGCTTA GGTTGGAGAG 2401 GGGAGGAGGA GGAGTCTGTG ATTCAGACCC AGTCCCCTGG 2441 TTGTAGGGAC AGGGAGGAGT GTTTTGAGGA TGGACTTTTA 2481 GGCTGGGTCA AGTAAGCATT TGCGTACTGT CTTATTAGAT 2521 GTTGGGTGAG GTGTAACGCC GGCCAAGTAT TCCATATAGA 2561 ATGGGGGGAA AATACAGGGA GATTCTGGAG GATAAAAGGG 2601 CATTTGTACA TGAGTTTAAA TGGGCTTAGG CTGAGGAGCT 2641 TTTGGGAAAT GACTCATAAA TGCATGAGGA ACAATGGGAG 2681 AAGTGAAGTC CAGGCCATTT TAACCTTTAG GGAAGATTGG 2721 TCAGTTGCTG TTTTCTAAAA AAGTTTCCCT GAAGATTCGG 2761 GGTGGTAAGG AATATGCAAA GCCAATTTAA TGTTTACAGC 2801 CTTTACCAGC TGTTGGTTAA ACTGTGAAAC AAATTGGCCC 2841 GTTGTCTGAC TGGATGCAAG AGGGGAGTTT AAACCGGTGG 2881 ATAATATGGG TGAAGAGAAT AGAAGCAATG GTGTGTGCCT 2921 TTTCGGTGGT GGTGGGAAAA GCTTTTATCC ATCCAGAGAA 2961 TGTATCTACT ATTGTCAGAA GGTATAGGAA TCACTTTCTG 3001 GGGGACATGT GAGTGACGTT GATTTGCTAG TCCTGCCCTG 3041 GTAAGTGTCC TCAGACCTTG TGTGTTGGAA AGGAGGTGGT 3081 TTGATAGCTT CCTGAGGGGA AGTTTGAGTG CAAAGGGAAC 3121 ATGCCTTAGT AATATATTTC AGATTGGCAG CCATGGCAGA 3161 AGAATGTATA TAATTTTCTA AAAGCTGGAG TAGGGGCAGT
3201 AAGCAGCATG GAAATGGTTG TGCACATATG AAAGTACAGA
3241 AGGTTTTTTA AACTTGGACA AGACACTTTA TCACTGAGAT 3281 CAAACAATTT TTTCCTGAAT GGCGCCAGCC CGGGCAAGTG 3321 GGGTTCATTC CTTCTGGGTA TATACAGGAA AAAATGAGCA 3361 ATAACGATGG GGTTTTAAGG GCTGCCTGCC AGGCTGCCAA 3401 ATTTGTTGAA AGGTTTCTCT CGGTTATGGC ATCTTTAGAC 3441 TTTCGGCACC CCTTGCAATG GATAATGGTG GCCTTTGGTG 3481 GTAGTTTATC CACCCCCAGC AGCTTATGTA TAAGTTTGCC .. 3521 ATTTCTTATG GGAGTTCCTT TTGTAGTTAG GAAACCCTGT 3561 TCCTGCCAGA TGAAGGCATG AGACTGTAGG ACATGGTATG 3601 CATATTTGGA ATGGGTGTAA ATGTTGACCC TCTTTCCCTT 3641 TGCTAGGGTG AGGGCCCTGG TTAGGGCAAC TAGCTTAGGC 3681 TGTTGAGAGG TAGCATGGGG TGGCAGAGCA TTGGATTCTA 3721 GGAGTTTATT TTCAGCAATG ATGGAATAGC CAGCTGCTGG 3761 ACATGGCTCC CTAAAAGAGC TTCCATTAAC TAACCGTGTA 3801 TGTGTTCCCT GCAAAGGGGC ATGTGAAATG TGTTGGAAGG 3841 AGGAGGAGAG GGAGTCTAAG AGGTCCAGAC AGGAGTGAGA 3881 GAGCCTAGAA TCGGAAGTGT TTACAGGGAG GAGGGTGGCT 3921 GGGTTGAGAG TTTTATATCT CTGGAAGGGG ATTAGATGGC 3961 TACCTAGGAA TAAGGCATGT ACCTGTTATA AGCAGGATGG
4001 TAGGAGGGAT AGAAGGGATT GATGGTTTAT GAGGTCCTGT
4041 AGGTAATGGG AAGATGCAAT AGTAATGTGT TGGTAAAGAG
4081 GGAGTTTCCG TGCCTCTAAG GCCAGCGATG TGGCCACAGC 4121 CAAGATTTTT AGTCAGAGTG ACCAGCCTTG GTCACACAAA
4161 ACAACAGAGT CCAGTTGTTT TGAGAGAAGT GCAGTGGCTT
4201 GTGTGGCGTT GCCGTATGTT TGGCAGAGTA GTCCAAGGGC
4241 AAGGCCCTGG TCAGAATGTA CATACAAAGT AACGGGCTTG
4281 GTGAGGTTGG GCAGTCCCTG TGCCAGGGCC ATTAAAAGGG 4321 CATTTTTAAG TTTTTAAGTG GGAGTTGATG GGGCAAGCTG
4361 GGTTTAGGGG TTTTAGGATG GGCCCATGTG AGGCCCTGTT
4401 TAGTGGCTTG GCCAGCAAGT CAAAATTGGG AACTGACAGC
4441 CAGAAGTAAC CCACAAGGCC AGAGAAGGAG AGAAGGTACT
4481 TTTTTGTGTG AGGAAAGGGC ATGTCCCAAA TTAGCTCCTT 4521 TCATTGGGTT GGGATGGCTC GAGAATTAGG GGTTAGGACA
4561 AACCCAAGGT AAGTAACTTT GGTTTGGGCT ACCTGAAATA
4601 TTGTGGGCTA GGCCCAAACT TTGGTTTGGG CTACCTGAGA
4641 TTTCCTCTAT TATGGAGGAA GTTTAAAACC TAAGTGGTGT
4681 GTTGGATGGA CAGGTTAAGG AAGGGGCTAC AGAGAAGGAG 4721 TTTGACATGT TGCAGGAGGG TGCTAGCAGC AAGGGAAAGT
4761 TCACCTAGGT CCTTGATGAG GGCAAGTCTG AATAGGTGGG
4801 GGCTATCCTG GAACCCCTGT AAGAGTACAG TCCACTCTAG
4841 TTGGGTGCAC ATGTGAGTAT TAGGACTTGA CCAAGTTAAA
4881 GCAAAAAGAC TTCTGTAAGC CGGATTTAAG GGAATGGTGA 4921 AATAGGCATC TTTCAGTTCC AATACAGAGG AGTGTGTAGT
4961 AGATTTAGGA ATATGGGAGA GTAGAGCATA TTGGTTGAGG
5001 ACCACTAGAT GTATTGGTAC CACTGCCTGG TCAACAACTT
5041 GGAGATCCTG GACAAAGGGG TAAGTCCCGT CTGTCTTTTG
5081 GAAAGCCAGA ATAGGGGTGT TCTGGGGAGA GTTAACAGGC 5121 TTGAGAATAT GAGCTTGTTA AAGTTTACAA ATAATAGGTT
5161 TGAGGCCCCT AAGCACGGCT GGATTAAGGG GATATTGAGA
5201 CTGAAAAAAG AAAATGGAGG AGTTTTGAAA GGTTATTTTA
5241 ACTGGGATGT GATGTGTGGC TATTGTGGGT TTAGAAACAT
5281 TCCAAGCTTT AGAATTAACA GAAGGTAACA GGGTGGATAA 5321 TGAGGATGAG TAGGGGGAGA GGGAAGCGTT TGGGTGTCGG 5361 AGTAAAATAA AAGGGGTAGA ATTGTAGGAG CCGCATTGCA 5401 TGGAGGCCTG GAATTTACTT AATATGTCCC ACCCCAAGAT 5441 AGGGGTAGGG GACTGAGGGA TAACCAGGAA ACAGTGGGTG 5481 AAGGGGGCTG TTGAATAGGT TGCATAATAA AGGACGAGTC 5521 TGTGCCTAGA GGGGATTTTA TTGATTCTCA CAATAGAGAT 5561 CGAAGAACAG AAAAAGGGTC TAGAATATTC TGGTAAAACC
5601 GAGTAACTAG TTGTCGTATT GAGTATCCAA TAGGAAAGAC
5641 ATGGACTTAC TAGAGACTGA CAGAATTACC CTGGGTTCCG 5681 AGGTGGTGAT GGCAGTAGGG GTGGTGGACT CTAGGCCCCA
5721 CCTGTCTTCA GGTACAGGTG GTGAAGCTAG ATGAAGAGTT
5761 TCACTGAGCA CAGTCTGACT TCCAGTGTTG CTTGATATCA 5801 CAGATGGGGC AAGGCTTCCA GAGTGTCACG GGATTAGGCC 5841 AGGCTTTTGC TCAGTGTCCC TGTTGGATGC ACTTATAACA 5881 GGCTCCTGGA GTTGACTGTT GCCAAGTTGA GGAATTCTGT 5921 ATGCCTTGGG AACCCTGTTG AGTGGCAGCT ATCAGCATTT 5961 GATATTTAGC CTGGCCCTTT TTTAGGTTTG GGTTTTTTAG 6001 TTCTTCCTCT CTAATCTTAT AGACCTTAAA GGCCACTTTG 6041 ATTAAGTCTC TTTGGAGGTT TGAGGGCCAT CCTCCACTTT 6081 TTAAAGTTTT TTTTCAAATG TCTGGGGCTG ACTGGGAGAT 6121 AAAGTGTAAA TGGAGGTCGA TTTTGCCCTT AATGGTATTA 6161 GGGCTTAAGG TGGCATATTT AGTTGTGACC TTTGAAAAGG 6201 CGGGAGAGGA AAAGAGCAGG GGCGACTTTA TTTATGCCAG 6241 CTAGAAGGCA TGATATCATA TGGCCTGGTT TCTGCCTGTC 6281 TACAGAAGTT GCCTGATAAT TCCGATTGAG GTCAGTTCTG 6321 GGGACAGCTA GGGTCCCTAC TGGGTTATGA GCAGCATCTT 6361 GTTGGTGGAG GGTGTTCTCA TGGGCCTGGG CACCCGTAAA 6401 TATGTGTTCC CTGTCCTCCA GGGTGAGGGT AGAGGAGAGG
6441 ATGACATATA TGTCATGCCA GGTAAGGTCA TAAGATTTAG 6481 GGAAGTACAG ACATTCCTTG AGGAAGGAGG TAGGATCCAT
6521 TGAAAAGGAG CAAAGTCTCT TTTCAGTATG GGAGAAGTCA
6561 GCTAGGGAAC TTGAATTCAG ATTATGCCTT CAGTCCTTGC
6601 AACCTCCTGC AAGGGAAAGA CTTTGGAAGG CCTTTGGGCA
6641 TGTGTCTGGT TTTCGGCAAA AGAGGGTTGG GCGTGAGACC 6681 TGGTTTGAGG TGGAGAAGGA AGGGGGATGG AGGGTGACAG
6721 TGAGGGGGTG GAGGAGCCGG ATCCGAAGGA GAAGGTGGAG
6761 GGAATGGGAG AGGGCAGGAG CCAGACGAAG GGGTTTGTTG
6801 GCCAGGTTGA AATTCAGAGG AGGAAGGCTT GTCCCGAGGA 6841 GGAGGCATTT TGGGGGGTTT CTCCTTGAGG AGGAAGATTT
6881 GAAAAGGCGA ACAAGATTGG CAGGAGAGTG AGTGAGGCAT
6921 GAGCGGAGAT AAGCAAAGCA CTTCAGATCA TTTACCATTG
6961 TGTTGGAGAA AGATTTTTAG ATGATGTTGA ATTTGAAAGT
7001 TGAATGTTTT GTTTTTTAGA AGAGGGGAGT GAATACTCTG 7041 GATGGGAGTG AATACTCTGG GGAAAGTGAG TACCCCAGAA
7081 GGGAGTGAAT ACCCCAGAGG GGAGTGAGTT CTCCAGAGGG
7121 AAGTGAATAC CCCAGAGGAA AGTGAGTACT GGATTCCACC
7161 TGGGAGAAGG GACCTGGAGA GTTGGGGCAG CCCCGTAGCT
7201 ATTCATGCTC CCCAGAGGAC GGAGCGGCCA GAATGGCAAG 7241 TGTTCTGAGG CTGTCCCCTG AGAAGGGAGG CTGCAGTCAC
7281 CTGGTGACTG GGGAGACCTC TCACCCAGCA CTGGAATTTT
7321 TTTGGAGAAC AAGCAGAAAT AGAAAGGAAT CCAGAAAGGA
7361 AGAATAGGGA GACTCACCCA CGAATTGAAG CCCAGTGTTG
7401 GATGTGATGT CCAGCAATGG GGGTGATCCT GGTTGCAGCT 7441 GCTTGGGAAG GGGAAGGAAG GAAAAATAAG ATGGAGAGTT
7481 TGATCAATAT CTGGAGGTTA GTCCAGGGCT GGAGAAGATG
7521 AGAGAATAGG GGAAACAGGA CTGGGAACAG ACAGTCCACT
7561 CAGGATCTGG AAGTAGGCCC CGGTCTAGTT TGTGCTGCTT
7601 GCTGCTTTCT GCTTTCCAGG TTGCAAGAGA AAACTTACCC 7641 ATTAGAACCC AATCACCGTC CTGGGTTTTG GCACCAAAGT
7681 GTTAGGTTTT GAAGGGAAGA TGAGGGTGAA AGAAAGACAG
7721 ACAGAGAGGG TGCATCTCAA ACAACAACAC AGGAATATTG
7761 CAGACACCTG TGGAAGTGGG GGACCAGCTT AATGCCAGAG
7801 CCCACTGCCG CTTATAGCCT GGGGTACTTA CAGGTATGGG 7841 TGGGAGGGAG CTGGGCAGTA TGGCTTGCTG CCCAGGAGGA
7881 TATTGATAAG ATGTTCTTAT GATCAGGTGG TTTGGCCCTT
7921 TTTCCAGTGG AATATCATTG TGGTGTTCCT TTGCAGCTGC
7961 CTGGAAAGGG GAATGAAGGA AAAATAAGAT GGAGAGTTTA
8001 ATCAATATCT GGAGGTTAGC CCAGGGCTGG AGAAGATGAA 8041 AGAATAAGGG GAACAGGACT TATGATAGGG ATGTTTCTTT
8081 AGTTGAGCCT TTGTCCACCT TGTGGTCAGG TGGTTAGGCA
8121 GGATGTTGCT CATGGCCTGA ACCCCCATGG AATGTTTCAC
8161 TTTGACCAAG GTCTGCAACA TAGCAAAAAC TTACAAAATC 8201 ATGCAGTTTG GACTAACAGG TGACTCTACC CATGCTCCTC
8241 TTCTTCTTCT GCATAGATAC CTACCCTATG ATTTCACCTT
8281 GTTCTTCTCT GGCACACACC CCTTCCTCCA CCTGCATTCC
8321 TTCTCATAAA GCCCCCCTTG CTATCTAGTC TCTATCCTAT
8361 TCACCCAAAA TAATGTCTTT CTGGCCTCTC CCTATTTTCT 8401 TACCAGATGC AAGAGGTACC AAACAGAGTC TTAGCCCCCG
8441 AACAGAGTTT GAAGAAGACA AAAACATCAG AATATCCAAT
8481 AATATTTGTG TATTACCTCA GGAAGGGTAA GAAAATAAAT
8521 TCAAATCAAC TGGAGAATGA ACAGTCCCAA GAGAACTCCA
8561 TCAATCCAAT CCAAAAGGAG GAGGACGAAG GCGTAGACTT 8601 ATCTGAAGGA TCTTCAAATG AGGATGAAGA CCTAGGCCCA
8641 TGTGAAGGAC CTTCAAAGGA GGACAAAGAT CTAGACTCAT
8681 CTGAAGGATC CTCACAGGAG GATGAAGACC TAGGCTTATC
8721 TGAAGGATCT TCACAGGACA GTGGGGAGGA TTAGTCACAC
8761 ATGGAGAAAC CAAATTGGAC AAATCATCAC CACTGATGGC 8801 GATGATTACA ATAAAATCAA GTTTAAGGAG CTGATGACTG
8841 TGTTTATCTC TGCCTGTTGT CTGGTGTGGG GTGGGGAAGG
8881 GAAGGGAAGG GAAGAGGTAG
The SPANX-N4 genomic sequence (SEQ ID NO:95) is provided below. The SPANX-N4 polypeptide is encoded by nucleotides 181 to 258 and 8191 to 8412.
1 TGGGACACTG CCTGTATGAT CAAACAAAGC TCAAGGGTGT
41 GGCTTCGTCT TGCTCCCAGG AGGGTATATA TACAGGGTGG 81 GCAAAAGCTC TGGGACAGCC CACTGGAAAG CTTCAATACA
121 GCTGTGGAAA TCTGCACCCT AGAAGATCCT AGTACAGAAA
161 TTCTACAACC AACCATAATC ATGGAAGAGC CAACTTCCAG
201 CACCAACGAG AATAAAATGA AGAGCCCCTG TGAATCTAAC 241 AAAAGAAAAG TTGACAAGGT CAGATTGTTA GGTTTTGAAG
281 GGAAGGTGAG GGTGAAAGAA AGACACACAG ATAGGGCGCG
321 GGTCAAACAG CAACACGGGT ATACTGCAGA CACCTGCAGA
361 AATGTGGGGC CAGCTTCATG CCAGAGCCCA CCGCTACATA 401 CAGGCCAGCG TACTTATAGG TATGGGTGGG AGAGGTTTGG
441 GCAGTATGGC TGGCTGCTCG GCGGAATATT GATAAGATGT
481 TGTTATGATC AGGCAGTTTG GCCCTTTTTC TGGTGGGATG
521 TCATCATGGT GTTGCTTGGA CCTTTTTCCC CAACAAGATA
561 TGATAGGGAT GTTTTTTTAG ATGGGCCTTT ATCCACCTCA 601 TGGTCAGGCA GTTAGGCGGG ATTTTTCTCA CGGCCAGAAC
641 TCCCGTGGAA TGTTTCACTT TGACCAAGGT CTACAAAATA
681 GAAGGGAGCT TACAAGAGAG TGCAGTTTAG ACTGATATTC
721 TTGCCTTCTA CTTTATCATA AAAGGAAGAG GGACGTGGTT
761 GATTATCTGG CTGCTTGCTG CTGAATAGGG GAGCTGTATT 801 CAGGGTTTGG GTTTTGAAGC AGTGGGTGTT GGACTTCAGA
841 GTTGTTTTCC TGGAGGCACT GGTACCGGAC TTGGCAGCGG
881 AGAAGGATGG TATCAATGTG TTGCTGGGTG GCTGCCTGGA
921 CAGGGGAGTT TAGCTTTAGG GAGATAAAGA GGGATATGAA
961 GGTGAGTATA CATGGGCCAA TTGTAGTATT TGGAGGAGAA 1001 AGATTAGGGG ACCTAAGAAA GCGGCCCCCT GCTATCCCAG
1041 TGCTGAGTGC AGCAGAGATG TTTAGTTCTG CTAACACTAG
1081 AATGAAATGT AGAGCCTGCT TAGTGTGTGT GGAGGAAGAT
1121 GAGACAGAAG CTACTAAAGG AAACCGGATG GTCTGCTTGT
1161 GAGGCAAAAT GTTAACGTTT AGACTGAAAC ACAAGGGTGC 1201 ATGTTCCTGA CCAATTGGCC AGTAGGCAGA GATAAGAGTT
1241 TGTGCCACAA AGGAAGAAGA CACTGGGGGT GGACAGACAA
1281 AAGTTGTAGC TTATGGTGGC AAGTCATGAG AAGATGGGGG
1321 TTGAATTCTG CACAAACCCT TTGTTTTAAC TTTTATGCTT
1361 TTCCAAGTTG AATAGCTGCC AGCAAGGGTA GGCCCTGTGA 1401 GGGCATGACA GAAAATGCTG GTGGGGGAGG AGTTGAAGGC
1441 TGTTTGGTTT TGGAGAGAGA GAGATAAATG GGTTTTTGTA
1481 ACTAATAGTC ATTGCGGTCC TGATATGGCA GAAGGATATA
1521 GGTTGCAGTT GAGGGAATTG TTTGTGCATT TTTTCCAGGG
1561 AGCGCTGTTG AGTTGGATAC ATGAAGAGCG GCTCTGGCAG 1601 AGAGGGGAGG GGTTGGTAAG AGGTGTTTTA TGGGATCCAG
1641 CTTGTATCGT GTTCAGTTAA GAGGCTGGCT ATGAGGCGAA
1681 GGAGGGTGAC AGCCCTGTGT GTGACAGTGT GGGTGGATCT
1721 GATAGACTGG GGAGAAAGGT GTTTAGCTGA GCAAGATTAT 1761 AAACAGGTTT TGAGAATGAA TTGGCCAAGT AGGTGAGATG
1801 CAGGTTTATT TTGGACTGGG TTTACGTGGC CAGTTTGAAA
1841 GGAAGGGCTG TGAACTGCTG GATTTGTGTG GACAAACAAA
1881 TTTAACAGTT GGAAGAAAGA CGGGAGGGAG TGTGACTGAT
1921 TTAAGAGGCA ATGTGTTAGG CTGATGAAGG GGGCTCAACT 1961 GTTAGAGGTT GGGGGTGGGG ACTTAGAATA TTCCACAGAC
2001 AAAAAGGGCA AAAGGAGAAA AAGGAGATTT GAGTAGGAGT
2041 GAAATTTTGG AAGGCGCCCT GCAGTCATAC CTCCTGCATT
2081 ACTGTGGGAA ATTGGCATGG GCTATCCAGG GACATGGGGA
2121 AGGGACACCA GGAAAGATCT GAAATAAAGT AGGAGATAAA 2161 AGATTGGAAA TTGGAGATGG AGACTGTTAT GCACTAGGGC
2201 ATTCTGGATT GGCAACTTCT GGAATTCTTG TTAAGTGCAG
2241 TGAGGTTGGT CCTGTGAGGG AGAGAGAGTA ATTTGGGGGT
2281 GAGGAAATTT CTAGATGTGG ATCTGGTGCT CTTTTTAGTT
2321 TGGAATGGTG TATTCAGTGT GGGAAGGATG TTAGCTTTGC 2361 CACTGTGTGA GTAGTTAGGA TAACCTGCTG AGGACCCGTC
2401 CACTTAGGTT GGAGAGGGGA GGAGGAGGAG TCTGCGATTC
2441 AGACCCAGTC CCCTGGTTGT AGGGACAGGG AGGAGTGTTT 2481 TGAGGATGGA CTTTTGGGCT GGGGATAATA GGGCTTAGGC
2521 TGAGGGGATT TGAGGAAATG GCTCATAAAT GCATGAGGAC 2561 CAGTAAGAGA AGTGAAGCCT GGGCCGTTTT AACCTCTAGG
2601 GAGTTTTTGT TTGTTTGTTT GTTTTTTAAT TACATGAAGA
2641 TTAGGGGAGG TAAGGAATAT GGAAGGCCCA CTTAATGTTT
2681 AGAGCCTTTG CCAGCTGTTG GTTAGCCTGT GAAACAAATG
2721 TTGGCCCATT GTCTGACTGA ATCCAAGAGG GGAGTTTAAA 2761 CCTGTGGATA ATATGAGTGA AGAGAATAGA AGTGATGGGG
2801 TGTGCCTTTT CGGTGGTGGT AGGAAAAGCT TTTATCCATC
2841 CAGAGAATGT ATCTACTATT AGCAGAAGGT ATAGGAATCA
2881 TTTGTTGGGA GGCATGTGGG TGAAGCTGAT TTGTTAGTCC
2921 TGCCCTGGTG TGTGTCCTCA GGCCTGGTGT GTGGGAAAGG 2961 AGATGGTTTG ATAGCTTCCT GAGGGGAAGT TTAAGTGCAA
3001 AGGGAACATG TCTTAGTAAT ATCTTTCAGA TTGGCAGCCA 3041 TGGTCGAAGA ATGTATTTAA GCTTTTAAAA GCTGGAGTAG
3081 GGGCAGTAAC TGGCATGGAA ATGGTTGCGC ACGTATAAAA 3121 GTACAGAAGG TTTTTAATAC TTGGACAAGA CACTTTCATC 3161 ATTGAGGTAG AACTATTTTT TCCTGAATGG TGCCAGCCCG 3201 GGCAAGTGGG GTTTGTTCCG CGTGGGTATA TACAGGTTGT 3241 ATGCTGGGAA AAATGGGTAA TAATGATGAA ATTTTAAGGG 3281 CTGCCTGCCA GGCTGCTGAA CGTGTTTAGA AGTTTCCCTT 3321 TGTTATGGCA TCTGTAGCCT TTTGGTGTCC CTTTCAATGG 3361 ATAATGGTGG CCTGCGGTGG TAGGTTAGCC ACCTCCAGCA 3401 GCTTGTGTGC AAGTTTCCTA TTTCTTATGA GGGTTCCTTT 3441 TGTAGTTAGA AAACCCCTTT CCTGCCAGAT TAAGGTGTGA 3481 GAGTGTAGGA TGTACCACGC ATATTTGGAA TTGGTGTGAA 3521 TGTTAACTCT CTTTCCGTTT GCTAGGGTGA GATCCTGGGT 3561 TAGGGCTACT AGCTTTGCCT GTTGAAAGGT AGTATGGAGT
3601 GGGAGAGCAT TGGATTCTAG GAGCTCAGTT TTGGTAATGG 3641 TGACGTGGAC AGCTGCTGGA CATGGCTCCC TTAAAGAGCT 3681 TCCATTAATG AACCATGTAG GCGTTCCCCG CAAAGGAGCT 3721 TTTGAAATAT GTTGGAAGCG GGAGGAGAGG AAGTTTAAGA 3761 GGTCCAGGCA GGAGTGAGAG AGCTTAGAGT TAGAGATATT 3801 AACAGGGAGG AGGGTGTTCA GGTTGAGAGT TTTATATCTC 3841 TGGAAGGTGA TTAGAGGGTT ACCTATGAGT AAGGCATGTA 3881 CCTGCTATAA GTGGGATGAT GGGAGGGATG GAAGGGATTC 3921 ACAGTTTATG AGGTCCTGTA GGTAATGGGA AGATGCACTA
3961 GTAATGTGTT GGTAAAGAGG GAGTTTCTGT ACCTCTAAGG
4001 CCAGCAATGT GGACACACCC AAGATTTTTA GTCAGAGTGA 4041 CCAGCCTTGG ATGACAGAGT CCAGTTGTTT TGAGAGGTAT
4081 GCAACGGCTT CTGGGGCGTC GCCGTATATT TGGCAGAGTA 4121 GTCCAAGAGC AAGGCCTTGG TCAGAATGTA CATACAAAGT
4161 AAAGGGCTTA GTGGGGTTAG GCAGTCCCAC TGACAGGGCC
4201 CTTAAAAGGG CATTTTTTAG TTTGTTTTTT TTTTAAGTGG
4241 GAGTTGATGG GGCAAGCTGG GTTCAGGTGT TTTAAGATGG
4281 GCCCATGTGA GGCCGTGTAA AGCAGCTTGG CCAGCAAGTC 4321 AAAGGTGGGA ATGCACAGCT GGAAGTATCC CACAAGGCTC 4361 AAGAAGGAGA AGAGGTCCTT TTTGGTGTAG GGAAGGGGCA 4401 TGTCCTGAAT TAGCTCATTT TGTTGGGTTG GGATGGCCCA
4441 AGAATTAGGG GGTAGGAGAA ACGCAAGGTA AGTAACCTGG 4481 GTTTGGGCTA TCTGAGCTTT TGTGGGTGAC ACCCGACATC 4521 CTCAATTATG GAGGAAGTTT AAAACCTTAG TGATGTGTTG 4561 GACTGACAGG TTATGGGAGG GACTACAGAG AAGGAGGTCA
4601 TTGACATATT GGAGGAGGGT GCTAAAATGA AGGGGAAGTT
4641 CAGCTAGGTC CTCGATGAGG GCCTGTCTGA ATAGGTGGGG 4681 GCTATCCTGG AACCCCTGTG GGAGTATGGT CCATGTCAGT 4721 TGGGTGAAAG TGTGAGTATT AGGATTTGAC TACATGAAAG 4761 CAAAAAAAAA AAAACTTTGG GAAGCCAGAT TTAAGGGAAT 4801 AGTGAAAAAG GCATCTTTTA GATCCAATTC AGAGAAGTGT 4841 ATGGTAGATG GGGCAATATG GGAGAGTAAA GTATATGGGT 4881 TGGGGACCAC CTAATGAATT GGTACCATCA CCTGGTTAAC 4921 AACTCCAAGA TCTTGGACCA ACAGGCAAGA TCCATCTGTC 4961 TTTTGGACAG CCAGGATAGG CATGTTGTAG CGAAAGTTGA 5001 CAGACTTGAG AATCTGAGCT TGTTAAACTT TACAGATAAT 5041 AGGCTTGAGG GCCCTGAGGC CAGCTGGGTT AAGGGGATAT 5081 TGAGACTGAT GAAGGAAAGT GGAGGAGATT GAAGGGTTAT 5121 TTTAACTGGG ATGTTGTGTG TGGCTATTGT GGGTTTAGAA 5161 ACGTTCCAAA CTTTAGAATT AACAGAAGGT AACGGTGGAT 5201 AATGAAGATG AGGGGGAGGA GAGGGAAGCA TTTTGGTGGC 5241 AGAGTAAAAT AAAAGTAGAA TTGTAGGAGC CGCATTGCAT 5281 GGAGTTCTGG AATTTACTTA ACTTTCCCAC CCCAAGATAG 5321 GGGTAGGGCA CTGAGGGATA ACCAGGAAGG AGTGGTGAAG 5361 GAGGGTGTTG AATAGATTGT ATAATAAAGG ACCAGTCTGA 5401 TTGTGCCTAG AGGGGATTTC ATTGACTTTC ACGATAGAGA 5441 CAGAGGAACT GAGGAGAGGT CCAGAATATT CTGGTAAAAC 5481 TGAATAACTA TCCCCTGTAT CCAATAGGAA AGACACGGGC 5521 TTACTAGAGA CTGACAGTGT TATCCTGGGT TCTGAGGTGG 5561 TGATGGTGGT AGGGGGGTGG GGGCAGACTC CAGGCCCTGT 5601 CAGTCTTCAG TTAGAGGTTA TGAAGCAAGC GGGATGAAAA 5641 GTTTTCCTGA GCACAGTCTG ACTTTCAGTG TTCCTTGATA 5681 ACACAGATGG GGCGAGGTTT CCTGGGTGTC CTGGGATTAG
5721 GGCAGGCTTT TGCCCAATGT CCCTGTTGGT TGCACTTATA
5761 ACAGGCTCCT GGCAGTGACT GTTGTAACAC TGAGGACTTC
5801 TGTATGAATT GGGAACCCTG TTGAATGGCA GCTGCCAGCA 5841 TTTGGTATTT AGCCTGGTCC TTTTTTGTGT TTTTAGTTCT
5881 TCTTCTCTAT TATTAAAGAC CTCAAATGCC ACTTCAATTA
5921 AGTGTCTTTG GGAGGTTTGA GGGCCATCCT CCAGTTTTTT
5961 TAAGTTTCTT TTGGATGTCG GGGGCTGACT GGGAGATAAA
6001 GTGTAAATGA AGGTGGATTT TGCCCTTATT GGTATCAGGG 6041 CTTTTGAAAG GCAGTAGAGG AAAAGAGCAT GGGCAGCTTT
6081 ATTTATGCCA GCTAGAAGGC ATGATATCGT ATGGCCTTGT
6121 TTTTGTCTGT CTGCAGAAGT TGCCTGATAA TTCCAATTGG
6161 TGTAAGTCCT GGGGACAGCT AGGGTCCCTA TTGGGTTATG
6201 GGCAGCATGT TGTTGGTGGA GGGTGTTTGC ATTGGCATAG 6241 ACAGCCATCC AGATGCATTC CCTGTCTTCT GGGGAGAGGG
6281 TGGAGGAGAG CATGACATAC GTGTCTTGCC AGGTAAGGTC 6321 GTAAGATTGA GTGGCATACA AAAATTCCTT GTGGAAAGAG
6361 ATGGGATCCT TGGATAAAGG AGCCAAGTCT CTTTTAAAAT
6401 GGGAGAGGCT GGCTAGAGAG ATGGGAACAT GATCTCAGAT 6441 TATGCTTTTG TCCCTCATGA TCTCCTGCAA GGGAAGGGCT
6481 TTGGAAGGCC TTTGGGTGTG TGCCTGGTTT TCAGCACTAG
6521 GGGGGTGGGT GTGAGACCTG GTTTGTGGTG GAGAACGAAG
6561 AGGGGATGGA GGGCGACAGT AAAGGGGTGG AGGAGCCAGA
6601 GCCAAAGGAG AAGAAGGAGG AAATGATGTT GGGAGGGGGA 6641 CTGAGATAGG GGTTTTTTTA ACCAGGCTGA AATTAGAGGA
6681 GAAAGGCTTG TCCTGAGGAG GAGGCATTTT TGCGGGTTTT
6721 TCATTGAGGA GGAGGATTTG AAAAGGGGAA AAAGATTGGC
6761 AGAGAGAGAG AGCAATCCAT GAACGGAGAT AAGCAAAGCA
6801 CTTTTGATCC TTTACCATTG CATTAGAGAA AGTTTTTAGA 6841 TTATGTTCAA TTTGGAAGTT GAAAGTTTTG TTGTTTGGAG
6881 GAGGGGAGTG CATACCCCTG AAGGGAATGA GTATCTCCAG
6921 AGGGTAGTGA ATACCCCAGA GGGAAGCAAG TACCTCAGAA
6961 GGGAGTTAAT ATCCCAGAAA AGAGTGAGTG CCCAGAAGGG
7001 GAGTGAATAC CCCGGAGAAA AGTGACTACC CAATTCCTCC 7041 CAGGAGAATG GACGTAGGAG AGCTGGGATG TACCAAAGAG 7081 CTGTGGCAGC TCTTCACAGT CCCCTAGAGG CCAGAGTGGC 7121 TGGAGTGGTG AGCGTTCTGA AGGTGTCCCC TGAGAAGGCA 7161 GGCTGCGGTC ACCTGGTGAC CAGGGAGACC TCCAACTCAG 7201 TGCTAGGATT TTCTGGAGAA CCAGCAGAAA TAGAGAAAAA 7241 TCTGGAAAGG AGGAAAAGGG AGATTCACCC ACTAACTGGA 7281 GACTGGTGTT GGATGTAAGA TCCAGCAATG GGGGTGATCC 7321 TTACTGGAGC CACCCAGAAA GGGTAAGGAA GCAGAGGATG 7361 GAGAGTTTGA TCAGGACCTA GAGGTCAGCC CAGGACTGGA 7401 GAAAATGAGA GAATAAGGAG AACGGGGCTG GGAACAGGTA 7441 GTCCACTCAG GATATGGAAG CAGGCTCAGG TCTAGTTTTT 7481 GCTGCTTGCT GCCTTCCAGG ATGCAAAGAA GACTTACCTG 7521 GTAGAACTCA ATCCCCATCC TGGGTTTTGG CACCAAAATG 7561 TTAGGTTTTG AAGGGAAGGC AAGGGTTAAA GACACACAAA 7601 GAGGGGACCA CTCAAACAGC AACACAGGTA TATTGCCGAC 7641 AGCTGCAGAA GTGGGGGACC AGCTTAATGC CAGAGCCCAC 7681 TACCACTTAC AGGCTGGGGT ACTTACAGGT ATTGGTGGGA 7721 GGCAGCTGGG CAGTATGGGA TGCTGCCTGG CAGGATATTG 7761 ATAAGATATT CTTATGATCA GGTGGTTTGA CCCTTTTTCC 7801 TGTGGGATGT CATTGTGGTG TTTCTTGGAC CTTTGCCCAG 7841 AAAGATATGA TAGCAATGTT TCTTAGTTGG GCCTTTGTCC 7881 ACCTTGTAGT CAAGTGGTTA GACAGGATGT TTCTCATGGC 7921 CTGAACCCCC .ATGGAATGTT TCACTTTGAC CGAGGTCTGC
7961 ATAATAGCAG GGAGCTTACA AAGTCATGCA GTTTGGAGTA 8001 ACACAGATGA CCCTACCCAT GCTCCTCTTC TTCCCCATAG
8041 ATCCCTACCC TATGATTCCA CCTTGCTCTT CTCTGGCACA 8081 AGCCCCTTCC TCAACCTGCA TTCCTTCTTA TAAAGCCCCC 8121 CTTGCTATCT ATTCTCTACC CTCTTCATCC AAAATAACAC 8161 CATTCTGGCC TCTCCCTGTT TTTTTACCAG AAGAAGAAGA 8201 ATCTGCACAG AGCCTCAGCC CCTGAACAGA GTTTGAAAGA 8241 GACAGAAAAA GCAAAATATC CAACATTAGT GTTTTACTGC
8281 AGGAAGAATA AGAAAAGAAA TTCAAATCAA CTGGAGAATA
8321 ACCAGCCTAC AGAGAGCTCC ACTGATCCAA TCAAAGAGAA 8361 AGGAGACCTA GACATATCTG CAGGATCTCC ACAGGATGGT 8401 GGGCAGAATT AGTCCAAATT GGACAAATCA TCACCACTGA 8441 TGGCGATGAT TACAATAAAA TCAGTTTGAG GAGCTGATGA 8481 CTGTGTATAC CTCTGCCTTT TTTTCTGATG GTGGGGAGGA 8521 AGGGAAGGGA AAAGATAGGC ATTTGAGAAC GGAGGGATAT 8561 GAGATCCTGT AGCATTGGCG GACAGATCCA CAGGTTAGCC 8601 AGACATTGTA AATAAAGACT GGGGGAGGAC TGATTCCTGG 8641 AGACAAATTT TCTTCTTAAA ATTTTATCTC ACAGGAGAGG 8681 AAGAAAAGGA TTTGGTGTTC CTTGGAGCAC GTGCATGTTT 8721 AGAAGAGCAC ATAAGAAGAT CTGTATTGTA TGTAGGTGAC 8761 AGTGACACTC TTTCCAAAAT GAAGACATCA GAATCACCAC 8801 CTTCAGGCCA CATACCTCAG AGTGGTGTGT TTTGTAACTC 8841 ACCCAATGCT GTGAGCAACT GAAGGTCCTC AAAAGGAACA 8881 AGAACCTCAC CCTACAGAAA GTAAGACAGC ATGCCACACA 8921 CAATAAAATA ATGGTGTTTG GAGGACATGC TTCCAATACT 8961 AGGGCGCTTG TTAAGA
The SPANX-N5 genomic sequence (SEQ ID NO:96) is provided below. The SPANX-N5 polypeptide is encoded by nucleotides 174 to 248 and 891 to 1034.
1 TGGGACACTG CCTGTATGAT CAAACAAAGC TCAAGGGTGT 41 GGCTTTGCCT TGTCACCAGG AGGGTATATA TAGGGAGGGC
81 AAGAGCTCTG GGCCACTGGG AAGCTTCAAT ATAGCTGTGG
121 AAGTCTGGAC TCTACAAGAT CCTGCTGTAG ACATTCAACA
161 ACCAACCAGA ATCATGGAAA AGCCCACTTC AAGCACCAAT
201 GGGGAGAAGA GGAAGAGCCC CTGTGACTCC AACAGCAAAA 241 ATGATGAGGT AAGATTGTTA GGTTTTGAAG TGAAGGCGAG
281 GGTGAAAGAA AGACACACAG AGAGGGGGCA GCTCAAACAG
321 CAACACAGGA ATACTGCAGA CGCTTGTGGA AGTGGGGGAC
361 CAGCTTAATG CCAGATCCCA CCACCGCTTA CAGCCTGGGG
401 TACTTACAGG TATGGGTGGG AGGGATCTGG GCAGCATGGC 441 TTGCTGCCCA GCAGGATATT GATAAGATGT TCTTATGATC 481 AGGTGGTTTG GCCCTTTTTC TGGTGGAATA TCGTTGTGGT 521 GTTCCTTAGA ACTTTGCCAA GCAAGATATG ATAGGGATGT 561 TTCTTTAGTT GGGCCTTTGT CTGCCTTGTG GACAGGTGGT 601 TAGGCAGGAT GTTTCTCACG GCCTGAACCC CCATGGGATG 641 TTTCACTTTG ACCAAGGTCT GCAAAATAGC AAAGAACTTA
681 CAAAATGGTG CAGTTTGGAC TAACAGATGA CCCTACCCAT
721 GCTCCTCTTC TTCCCCATAG ATCCCTACCC TGTGAGTCAA
761 CCTTGTTCTT CTCTGGATCG AACCCCTTCC TCAACCTGCA 801 TTCCTTCTCA TAAAGCCCCC CTTGCTATCC AGTCTCTATC
841 CTATTCACCC AAAATAATGT CTTTCTGGCC TCTCCCTGTT
881 TTCTTAACAG ATGCAGGAGA CACCAAACAG GGACTTAGTC
921 CTCGAACCGA GTTTGAAAAA GATGAAAACA TCAGAATATT
961 CAACAGTATT AGTGTTGTGC TACAGGAAGA CTAAGAAAAT 1001 ACATTCAAAT CAACTGGAGA ATGACCAGTC CTGAGAGAAC
1041 TCCATCAATC CAGTCCAAGA GGAGGAGGAC GAAGGATCCT
1081 CACAGGAGGA TGAAGACCTA GACTCATCTG CAGAATCTTC
1121 AAAGCAGGAT GAAGACCTAC AATTACCTGA AGGATCTTCA
1161 CAGGAGGATG AAGACCTAGG GTTATCTGAA GGATCTTCAC 1201 AAGAGGATGA AGACCTAGAC TCATCTGAAG GATCTTTGAT
1241 GGAGGAAGAA GACCCAGACT CATCTGAAGG ATCGTCAGAG
1281 GAGGGTGAGG AAGACTAATT AAACATGGAG AAACCAAATT
1321 GGACAAATCC TCACCACCAA CGGTGATGAT TACAATAAAA
1361 TCAAGTTTGA GAAGCTGATG GCTGCATATA TCTATGCCTG 1401 TTTTCTGATG GTGGGGGAGA AGGGAAGGGA AGAGGTAG
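The coding coordinates quoted for each genomic sequence are 1-based and inclusive, and each polypeptide is encoded by two separate nucleotide ranges. As a minimal sketch (Python; the helper names `clean_listing`, `splice_cds` and `cds_length` are illustrative assumptions and not part of the original disclosure), the following shows how a numbered listing can be reduced to a plain sequence and how the two ranges can be joined. The fragment used is the first 200 nucleotides of the SPANX-N5 listing above.

```python
import re

def clean_listing(text: str) -> str:
    """Strip position numbers, spaces and any other non-base characters.

    Caution: a base that was mis-recognized during text extraction is
    dropped as well, which would shift every downstream coordinate.
    """
    return re.sub(r"[^ACGT]", "", text.upper())

def splice_cds(genomic: str, exons: list[tuple[int, int]]) -> str:
    """Join 1-based, inclusive coordinate ranges into one coding sequence."""
    return "".join(genomic[start - 1:end] for start, end in exons)

def cds_length(exons: list[tuple[int, int]]) -> int:
    """Total number of nucleotides covered by the coordinate ranges."""
    return sum(end - start + 1 for start, end in exons)

# First 200 nt of the SPANX-N5 genomic listing, copied with its numbering.
N5_START = """
1 TGGGACACTG CCTGTATGAT CAAACAAAGC TCAAGGGTGT 41 GGCTTTGCCT TGTCACCAGG AGGGTATATA TAGGGAGGGC
81 AAGAGCTCTG GGCCACTGGG AAGCTTCAAT ATAGCTGTGG
121 AAGTCTGGAC TCTACAAGAT CCTGCTGTAG ACATTCAACA
161 ACCAACCAGA ATCATGGAAA AGCCCACTTC AAGCACCAAT
"""

# Coordinate ranges as stated in the text for SPANX-N5 and SPANX-N4.
N5_EXONS = [(174, 248), (891, 1034)]
N4_EXONS = [(181, 258), (8191, 8412)]

seq = clean_listing(N5_START)
start_codon = seq[174 - 1:174 + 2]   # nucleotides 174-176
print(len(seq), start_codon)
print(cds_length(N5_EXONS), cds_length(N4_EXONS))
```

Both stated ranges sum to a multiple of three (219 and 300 nucleotides), and the first range of SPANX-N5 begins at an ATG, which is a quick consistency check on the coordinates.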
EXAMPLE 3: SPANX-N1 Expression is activated in Tumors
As shown in this Example, expression of SPANX-N alleles in matched normal/tumor tissues revealed a predominant activation of only one member of the subfamily, SPANX-N1, in tumors, indicating that expression of this gene may be diagnostic for malignancy.
Materials and Methods
Cell lines. The clinical materials were obtained as described in the previous Examples. Briefly, transformed lymphoblast cell lines were developed from 18 patients (029-049, 075-014, 075-008, 018-014, 076-006, 076-008, 231-020, 231-024, 087-005, 087-011, 086-013, 086-017, 194-004, 194-008, 239-019, 082-003, 082-011 and 032-003; 11 families) linked to the HPCX (Hereditary Prostate Cancer X-chromosome) region. DNA from these cell lines was used for PCR analysis. Genomic DNA from 40 normal individuals (Caucasians) used as eligible controls was purchased from the Coriell Institute for Medical Research (Camden, NJ). Melanoma cell lines LoxMVI, 537 MEL, 938 MEL and 888 MEL were obtained from the National Cancer Institute, NIH. Melanoma cell line VMM150 was derived from a tumor digest obtained from a patient at the University of Virginia (Westbrook et al., Clin. Cancer Res. 10: 101-112 (2004)). All the cell lines were cultured in RPMI 1640 supplemented with 10% fetal bovine serum, penicillin, and streptomycin. Cells were washed with Dulbecco's PBS, and the pellets were snap-frozen in liquid nitrogen and stored at -80 °C until they were used for subsequent DNA, RNA and protein extraction.
Analysis of normal and cancer tissues by RT-PCR. Total RNA from normal adult human tissues (prostate, placenta, proximal and distal colon, lung, and cervix), matching normal/tumor tissue pairs (Ambion, Austin, TX), melanoma cell lines and primary uterine tumors was used for screening SPANX-N expression with the primers described in Table 9.
Table 9: Primers used for amplification and sequencing of SPANX-N genes
Oligonucleotide Name    Sequence (5'-3')

SPANX-N1
Exon 1 PCR amplification (1,946 bp)**
N1ex1-F    5'-aggcttgaagcttgtaccct-3' (SEQ ID NO:137)
N1ex1-R    5'-acaactttcgttaaccgcca-3' (SEQ ID NO:138)
Sequencing
SeqN1ex1-R    5'-acaagacggacaaaggtcca-3' (SEQ ID NO:139)
SeqPrimSX-F5    5'-tgggacactgcctgtatgat-3' (SEQ ID NO:140)
Exon 2 PCR amplification (1,779 bp)
N1ex2-F    5'-agggaagtgaatacaccaga-3' (SEQ ID NO:141)
N1ex2-R    5'-aatggtaggtccctgcagat-3' (SEQ ID NO:142)
Sequencing
SeqN2ex2-F    5'-taacaggtgaccctacccat-3' (SEQ ID NO:143)
SeqN2ex2-R    5'-gatcactggagaaggaggaa-3' (SEQ ID NO:144)

SPANX-N2
Exon 1 PCR amplification (3,710 bp)
N2ex1-F    5'-cttactgtgtttgatgtggca-3' (SEQ ID NO:145)
N2ex1-R    5'-accagtcggattccagaaaat-3' (SEQ ID NO:146)
Sequencing
Seq1-F    5'-tcctcaacctgcattccttc-3' (SEQ ID NO:147)
Seq1-R    5'-catcagacaacaggcagaga-3' (SEQ ID NO:148)
Exon 2 PCR amplification (4,143 bp)
N2ex2-F    5'-tgagcgagtactccagaga-3' (SEQ ID NO:149)
N2ex2-R    5'-ctggttgtgacgtactatact-3' (SEQ ID NO:150)
Sequencing
Seq2-F    5'-cctcaacctgcattccttct-3' (SEQ ID NO:151)
SeqPrimSX-RR    5'-ctacctcttcccttcccttc-3' (SEQ ID NO:152)

SPANX-N3
Exon 1 PCR amplification (4,593 bp)
N3ex1-F    5'-aggttcgcttggtttgttag-3' (SEQ ID NO:153)
N3ex1-R    5'-acagcaactgaccaatcttc-3' (SEQ ID NO:154)
Sequencing
SeqPrimSX-F5    5'-tgggacactgcctgtatgat-3' (SEQ ID NO:155)
Seq1N3-R    5'-gtctgcagtattcctgtgtt-3' (SEQ ID NO:156)
Exon 2 PCR amplification (923 bp)
N3ex2-F    5'-cagctgcctggaaaggggaa-3' (SEQ ID NO:157)
N3ex2-R    5'-ttcccttccccaccccacac-3' (SEQ ID NO:158)
Sequencing
SeqPrimSX-RR    5'-cacctgcattccttctcata-3' (SEQ ID NO:159)
Seq2N3-F    5'-cacctgcattccttctcata-3' (SEQ ID NO:160)

SPANX-N4
Exon 1 PCR amplification (1,515 bp)
N4ex1-F    5'-ctccccttccacactaaatg-3' (SEQ ID NO:161)
N4ex1-R    5'-tcagtctaaactgcactctct-3' (SEQ ID NO:162)
Sequencing
SeqPrimSX-F5    5'-tgggacactgcctgtatgat-3' (SEQ ID NO:163)
Seq1N4-R    5'-tctgcaggtgtctgcagtat-3' (SEQ ID NO:164)
Exon 2 PCR amplification (2,245 bp)
N4ex2-F    5'-agggaagcaagtacctcaga-3' (SEQ ID NO:165)
N4ex2-R    5'-actgcaagctctgtctctag-3' (SEQ ID NO:166)
Sequencing
Seq2N4-F    5'-tgattccacct gctcttct-3' (SEQ ID NO:167)
Seq2N4-R    5'-cctatcttttcccttccctt-3' (SEQ ID NO:168)

RT-PCR expression in human
spaxn1/n5-F (180 bp)    5'-aagaggaagagcccctgtga-3' (SEQ ID NO:169)
spaxn1/n5-R (180 bp)    5'-ggtcattctccagttgatttga-3' (SEQ ID NO:170)

Minisatellite polymorphism
MSN1-F    5'-tgcaggagacaccaaacagg-3' (SEQ ID NO:171)
MSN1-R    5'-gaatggtaggtccctgcaga-3' (SEQ ID NO:172)
MSN2-F    5'-gcaggaggcaccgaacagg-3' (SEQ ID NO:173)
MSN2-R    5'-ctagtcctccccaccctcct-3' (SEQ ID NO:174)
MSN3-F    5'-gcaagaggtaccaaacaga-3' (SEQ ID NO:175)
MSN3-R    5'-ctccatgtgtgactaatcct-3' (SEQ ID NO:176)

SPANX-N in dog
Dogspa1 Fo (710 bp)    5'-cacccacagagatgtcagaa-3' (SEQ ID NO:177)
Dogspa1 Re (710 bp)    5'-catcctcatgagacacagga-3' (SEQ ID NO:178)
Dogspa2 Fo (606 bp)    5'-gtcaggaaccctgagataga-3' (SEQ ID NO:179)
Dogspa2 Re (606 bp)    5'-tggtctctctgtagttggct-3' (SEQ ID NO:180)

RT-PCR expression in dog
Dogspa1 eF (572 bp)    5'-atgaccaaagccaaagggtg-3' (SEQ ID NO:181)
Dogspa1 eR (572 bp)    5'-cggggtctcttgtttatcct-3' (SEQ ID NO:182)
Dogspa2 eF (460 bp)    5'-gaatgccatgcaccaaagga-3' (SEQ ID NO:183)
Dogspa2 eR (460 bp)    5'-catgtagataggcccagatc-3' (SEQ ID NO:184)

Site of initiation of transcription of SPANX-N2
225-F (458 bp)    5'-gcagtggtgctttgtgatgt-3' (SEQ ID NO:185)
209-F (442 bp)    5'-atgtctaagccaccctagga-3' (SEQ ID NO:186)
206-F (439 bp)    5'-tctaagccaccctaggactg-3' (SEQ ID NO:187)
198-F (431 bp)    5'-accctaggactgccattg-3' (SEQ ID NO:188)
193-F (426 bp)    5'-aggactgccattggctgg-3' (SEQ ID NO:189)
178-F (411 bp)    5'-tgggacactgcctgtatgat-3' (SEQ ID NO:190)
130-F (363 bp)    5'-cttgtcaccaggagggtata-3' (SEQ ID NO:191)
70-F (303 bp)    5'-cttcaacatagctgtggaagt-3' (SEQ ID NO:192)
N2-R    5'-atggagttctcttgggactg-3' (SEQ ID NO:193)

Site of initiation of transcription of SPANX-A/D
291-F (468 bp)    5'-atgcatcttcaggggatgct-3' (SEQ ID NO:194)
251-F (418 bp)    5'-tttgctctacacccctgtca-3' (SEQ ID NO:195)
204-F (371 bp)    5'-agcagtggggctttgtga-3' (SEQ ID NO:196)
154-F (321 bp)    5'-tgggacactgcctgtatgat-3' (SEQ ID NO:197)
118-F (285 bp)    5'-tgtggctttgccttgtcac-3' (SEQ ID NO:198)
B2-R    5'-atggtcgaggactcagatgt-3' (SEQ ID NO:199)

Amplification of full-size SPANX-C, SPANX-B and SPANX-N2 ORF by RT-PCR
F-C (291 bp)    5'-atggacaaacaatccagtgc-3' (SEQ ID NO:200)
R-C (291 bp)    5'-ctttgcaggtatttcaaccat-3' (SEQ ID NO:201)
F-B (309 bp)    5'-atgggccaacaatccagtgt-3' (SEQ ID NO:202)
R-B (309 bp)    5'-ctttttaggtctttcagtcgt-3' (SEQ ID NO:203)
F-N (540 bp)    5'-atggaacagccgacttcaag-3' (SEQ ID NO:204)
R-N (540 bp)    5'-gtcctccccaccctcctgt-3' (SEQ ID NO:205)

*These primers were used to analyze SPANX-N gene polymorphisms in normal individuals and in X-linked prostate cancer families. **In parentheses: the size of the PCR product.
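A primer listed in Table 9 can be checked against a template by searching for the primer itself (forward orientation) or for its reverse complement (reverse orientation). The following is a minimal sketch in Python; the function names `revcomp` and `find_primer` are illustrative assumptions, and the two templates are short stretches copied from the SPANX-N5 genomic listing (nucleotides 1-30 and 1001-1030).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def find_primer(template: str, primer: str):
    """Return (strand, 1-based start on the listed strand) or None.

    "+" means the primer reads the same as the listed strand;
    "-" means its reverse complement is found on the listed strand.
    Only exact matches are reported (no mismatches tolerated).
    """
    t, p = template.upper(), primer.upper()
    i = t.find(p)
    if i != -1:
        return ("+", i + 1)
    i = t.find(revcomp(p))
    if i != -1:
        return ("-", i + 1)
    return None

# SPANX-N5 genomic listing, nucleotides 1-30 and 1001-1030.
N5_1_30 = "TGGGACACTGCCTGTATGATCAAACAAAGC"
N5_1001_1030 = "ACATTCAAATCAACTGGAGAATGACCAGTC"

# Sequencing primer SeqPrimSX-F5 (SEQ ID NO:140).
forward_hit = find_primer(N5_1_30, "tgggacactgcctgtatgat")
# Reverse RT-PCR primer (SEQ ID NO:170).
reverse_hit = find_primer(N5_1001_1030, "ggtcattctccagttgatttga")
print(forward_hit, reverse_hit)
```

Exact string matching is used here for simplicity; it would miss a primer that anneals with mismatches, so a negative result from this sketch is not proof that a primer fails to bind.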
cDNA was made from 1 μg of total RNA using the SuperScript first-strand system kit (Invitrogen, Carlsbad, CA, USA), priming with oligo dT per the standard protocol. Human β-actin primers (BD Biosciences Clontech, Mountain View, CA, USA) were used as positive controls. RT-PCR was performed using 1 μl of cDNA in a 50 μl reaction volume. Standard reaction conditions were 94 °C for 5 min; 35 cycles of 94 °C for 1 min, 55 °C for 1 min and 72 °C for 1 min; 72 °C for 7 min; and a 4 °C hold. Before sequencing, PCR products were cloned in a TA vector (Invitrogen, La Jolla, CA). Database analysis was performed using versions of the BLAST program appropriate for different types of sequence comparisons: BLASTN for nucleotide sequences and BLASTP for protein sequences.
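The cycling program described above can be written down as data, which makes the programmed hold time easy to total. This is a minimal sketch in Python; the names `Step` and `programmed_minutes` are illustrative assumptions, and the total excludes ramp times and the final 4 °C hold.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    minutes: int   # hold time in minutes

INITIAL_DENATURATION = Step(94, 5)
CYCLE = (Step(94, 1), Step(55, 1), Step(72, 1))  # denature, anneal, extend
N_CYCLES = 35
FINAL_EXTENSION = Step(72, 7)

def programmed_minutes() -> int:
    """Sum of programmed hold times (no ramping, no 4 °C hold)."""
    per_cycle = sum(step.minutes for step in CYCLE)
    return (INITIAL_DENATURATION.minutes
            + N_CYCLES * per_cycle
            + FINAL_EXTENSION.minutes)

total = programmed_minutes()
print(total)  # 5 + 35 * 3 + 7 = 117
```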
Immunohistochemistry. Multitumor tissue arrays, containing approximately 500 tissue samples of prostate, ovarian colon adenocarcinoma, carcinoma of the breast, lung, colonic adenocarcinoma and prostatic adenocarcinoma as well as normal control tissues, were obtained from the TARP laboratory, NCI. A complete description of the arrays (TARP1) can be obtained at www.cancer.gov/tarp. Immunostaining was performed on formalin-fixed, paraffin-embedded sections, 4 μm thick, which were thaw-mounted onto 16 h and incubated for 30 min at 60 °C; slides were deparaffinized and rehydrated in graded alcohols. Affinity-purified polyclonal EQPT antibody against SPANX-N was used at a dilution of 1:2000 for immunohistochemistry of SPANX-N. One section of tumor and one section of non-tumor tissue were stained in each case. The immunostaining was graded according to the percentage of immunoreactive cells (1+, 1-25%; 2+, 26-50%; 3+, 51-75%; 4+, 76-99%).
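The grading scale above maps a percentage of immunoreactive cells to one of four grades. As a minimal sketch in Python (the function name `immunostain_grade` is hypothetical), the mapping can be written as follows. The text does not say how values below 1% or above 99%, or fractional values between the stated bands, are treated; returning `None` outside 1-99% and assigning fractional values to the next band are assumptions of this sketch.

```python
from typing import Optional

def immunostain_grade(percent_positive: float) -> Optional[str]:
    """Grade by percentage of immunoreactive cells.

    1+ = 1-25%, 2+ = 26-50%, 3+ = 51-75%, 4+ = 76-99%.
    Values outside 1-99% are not covered by the stated scale, so None
    is returned for them (an assumption, not part of the source).
    """
    if not 1 <= percent_positive <= 99:
        return None
    if percent_positive <= 25:
        return "1+"
    if percent_positive <= 50:
        return "2+"
    if percent_positive <= 75:
        return "3+"
    return "4+"

grades = [immunostain_grade(p) for p in (10, 26, 75, 99)]
print(grades)
```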
Peptide-specific antibodies. A synthetic peptide having amino acids EQPTSSTNGEKRKSPCESNN (positions 2-21, SEQ ID NO:136) from the SPANX-N sequence was conjugated to Keyhole Limpet Hemocyanin and used as immunogen as previously described (Goldsmith et al., Biochemistry 27: 7085-90 (1988)). The resulting antisera (called EQPT for the anti-SPANX-N antibodies) were affinity-purified over columns of peptide conjugated to Affigel 15 (Bio-Rad, Hercules, CA, USA) and concentrated in stirred cells with YM30 membranes (Millipore, Billerica, MA, USA). The concentrates were then subjected to gel filtration chromatography using 2.6 x 60 cm Superdex 200 columns (GE Healthcare, Piscataway, NJ, USA), and the monomeric IgG fractions were pooled and concentrated. The protein concentrations were determined using the Bradford assay (Bio-Rad, Hercules, CA, USA). The antibody specificity was first confirmed using recombinant SPANX-N, SPANX-B and SPANX-C proteins. No cross-reactivity between SPANX-B, SPANX-C and SPANX-N2 was observed with the anti-SPANX-N antibodies during Western analysis (FIG. 6B). The EQPT anti-SPANX-N antibodies were also used to detect endogenous expression of SPANX-N proteins in melanoma cell lines (data not shown). A band of the predicted size was detected in all cases, and pre-absorption of the antibody with the corresponding peptide used for immunization abolished the signal (data not shown).
Production of recombinant SPANX proteins, polyacrylamide gel electrophoresis and Western blotting. For production of the SPANX proteins in E. coli cells, the full-size ORFs of SPANX-B (309 bp), SPANX-C (291 bp) and SPANX-N2 (540 bp) were generated by RT-PCR from testis RNA using the primers described in Table 9 and cloned into the BamHI site of the pMAL-p2X expression vector (New England BioLabs Inc., Beverly, MA, USA) to produce a fusion protein with MBP. The recombinant proteins were purified by affinity chromatography using a column with MBP. For one-dimensional SDS-PAGE, electrophoresis was performed on 15% acrylamide gels with 50 μg of protein per lane. After SDS-PAGE, polypeptides were either visualized by amido black staining or transferred onto nitrocellulose for Western blotting. Western blots were incubated in PBS containing 0.05% Tween-20 and 5% nonfat dry milk to block nonspecific protein-binding sites. Blots were washed with PBS containing 0.05% Tween-20 between all subsequent incubation steps and were incubated in antibody diluted in PBS-normal goat serum (PBS-NGS) or PBS-normal donkey serum (PBS-NDS), followed by horseradish peroxidase (HRP)-conjugated F(ab')2 fragments of goat anti-mouse IgG/IgM or donkey anti-guinea pig IgG (Jackson ImmunoResearch, West Grove, PA). The HRP conjugates were visualized using TMB reagent according to the manufacturer's protocol (Kirkegaard & Perry Laboratories, Gaithersburg, MD).
Sequencing of SPANX-N genes. The fragments containing SPANX-N sequences were PCR amplified from genomic DNA using a set of specific primers developed for each gene (Table 9). Because coding regions were PCR amplified along with long flanking segments (from 2 to 5 kb), another pair of specific primers was used for direct sequencing of the coding regions from the PCR fragments (Table 9). Minisatellite repeats of the SPANX-N1, SPANX-N2 and SPANX-N3 genes were amplified using a set of specific primers (Table 9). To analyze minisatellite repeats, the PCR fragments were cloned into a TA vector and sequenced. Forward and reverse sequencing reactions were run on a PE-Applied Biosystems 3100 Automated Capillary DNA Sequencer. Complete sequences of SPANX-N alleles were named and numbered according to the clone/accession identifier (Table 9). Sequences were aligned with MAVID (Bray and Pachter, Genome Res. 14: 693-699 (2004); http://baboon.math.berkeley.edu/mavid/). Pictures of alignments and alignment schemes were prepared in WAViS (Zika et al., Nucl. Acids Res. 32: W48-49 (2004)). Database searches were performed using versions of the BLAST program appropriate for different types of sequence comparisons: BLASTN for nucleotide sequences, BLASTP for protein sequences, and TBLASTN for searching a nucleotide database translated in 6 frames with a protein query (Altschul et al., J. Mol. Biol. 215: 403-410 (1990)).
RESULTS
SPANX-N2, SPANX-N3, SPANX-N4 and SPANX-N5 genes are expressed in normal tissues while SPANX-N1 is expressed in cancer tissues
As shown in the previous Examples, the human SPANX-A/D gene subfamily and the SPANX-N2 and SPANX-N3 genes are expressed in normal testis, but their transcripts were not detected in adult brain, liver or skeletal muscle tissues. In this study, a more representative set of normal adult tissues, as well as a set of tumor tissues, was analyzed. Using new primers that recognize all five SPANX-N genes (Table 9), SPANX-N transcripts were detected in several normal nongametogenic tissues (placenta, prostate, proximal and distal colon, lung, and cervix), although their levels were much lower than in testis.
The amplified coding region of ~180 bp has a nucleotide sequence that is specific for each SPANX-N gene. Further cloning of RT-PCR products into a TA vector and sequencing of the inserts permitted determination of which gene family members are expressed in which tissues. This analysis revealed that SPANX-N2, SPANX-N3, and SPANX-N5 transcripts are abundant in certain normal tissues, while SPANX-N4 transcripts were detected only in testis (FIG. 7 and Table 10).
Table 10: Expression of the SPANX-N genes in normal and tumor tissues and cell lines
[Table 10 is reproduced as images in the original publication.]
However, among the 86 TA-cloned sequences obtained by RT-PCR from seven tissues, only 4 (one from cervix and 3 from testis) corresponded to SPANX-N1, suggesting that this gene is silenced in nongametogenic normal tissues. In contrast, almost exclusive expression of SPANX-N1 was observed in three primary uterine cancers and four melanoma cell lines: 47 of 58 sequenced RT-PCR clones (81%) had the SPANX-N1 sequence (Table 10 and FIG. 7B). Further expansion of this analysis to matching pairs of normal and cancer tissues produced similar results: almost all of the clones from cancer cells expressed SPANX-N1 but not SPANX-N2 through SPANX-N5 (Table 10 and FIG. 7C).
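The clone proportions quoted above follow from simple arithmetic, which can be reproduced with the short Python sketch below. The function name `percent_of_clones` is an illustrative assumption; the counts are those stated in the text (47 of 58 clones from cancer samples and 4 of 86 clones from normal tissues).

```python
def percent_of_clones(matching, total):
    """Percentage of sequenced clones that carry a given gene sequence."""
    if total <= 0:
        raise ValueError("total number of clones must be positive")
    if not 0 <= matching <= total:
        raise ValueError("matching clones must lie between 0 and total")
    return 100.0 * matching / total


# Counts taken from the text above.
cancer_fraction = percent_of_clones(47, 58)   # SPANX-N1 clones in cancers and melanoma lines
normal_fraction = percent_of_clones(4, 86)    # SPANX-N1 clones in normal tissues
print(f"cancer: {cancer_fraction:.0f}%  normal: {normal_fraction:.1f}%")
```

Rounded to the nearest whole percent, 47 of 58 gives the 81% stated in the text, whereas 4 of 86 corresponds to fewer than 5% of the clones from normal tissues.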
Therefore, in contrast to the SPANX-A/D subfamily, at least three SPANX-N genes are widely expressed in a variety of normal adult tissues. Moreover, the observed differential activation of SPANX-N1 in cancer tissues indicates that SPANX-N1 expression may be a new diagnostic indicator of cancer.
SPANX-N proteins were detected in normal and tumor tissues.
Although SPANX-N mRNAs were frequently detected in normal adult tissues and some malignancies, it was not clear whether the mRNAs were translated into SPANX-N proteins. To check this, a polyclonal rabbit antibody, EQPT, was generated against a chemically synthesized SPANX-N peptide as described in the Materials and Methods. Based on the peptide sequence, it appeared that the EQPT antibody would recognize at least four SPANX-N proteins (SPANX-N1, -N2, -N3, and -N5). Multi-tumor tissue arrays, including approximately 500 samples from 7 tumor types such as prostatic adenocarcinoma, were immunostained using this antibody. Protein expression was also examined in matching normal samples. The EQPT antibody preferentially stained cellular nuclei.
Consistent with the RT-PCR analysis, expression of the proteins was detected in normal and tumor tissues of prostate, lung, colon and ovary. The highest expression of SPANX-N protein was observed in prostate tissues. No detectable level of SPANX-N protein was observed in breast, brain or lymphoma samples. A closer look at the tissue array showed some variability of expression between different types of tumors and also between different specimens of the same type of tumor. The fraction of tumors expressing SPANX-N proteins ranged from 8% in ovarian to 45% in prostate cancers. In normal tissues the fraction of cells ranged from 15% in colon to 38% in prostate.
These data indicate that SPANX-N proteins are expressed in a variety of adult normal tissues and tumors including prostate tissues.
SPANX-N promoter lacks most of the CpG dinucleotides that are present in SPANX-A/D. Previous analysis of members of the SPANX gene family by the inventors revealed high conservation of the noncoding sequences of the 5' UTR, suggesting a general control of transcription for all 10 SPANX genes. To shed light on the mechanism underlying the different expression of the SPANX-A/D and SPANX-N gene subfamilies, the initiation sites for transcription of the genes were determined and a comparative analysis of the promoter regions was performed.
The sites for initiation of transcription for SPANX-N and SPANX-A/D genes were determined by RT-PCR using a set of nested primers (Table 9). For this purpose, total mRNA from normal human testis was used. Detectable transcription started at -193 and -204 from the initiation codon for SPANX-N and SPANX-A/D genes, respectively (FIG. 8).
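The promoter comparison introduced in this section depends on how many CpG dinucleotides lie in a region upstream of the initiation codon. A minimal Python sketch of such a count is given below. The function names are illustrative assumptions, the sketch counts CpG on a single strand only, and the short sequence in the usage note is an arbitrary example rather than a SPANX promoter sequence.

```python
def count_cpg(sequence):
    """Count CpG dinucleotides (a C immediately followed by a G) on one strand."""
    return sequence.upper().count("CG")


def cpg_per_100_bp(sequence):
    """CpG count normalized per 100 bp, for comparing regions of unequal length."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return 100.0 * count_cpg(sequence) / len(sequence)
```

For the arbitrary sequence "ACGTCGGGCGA", `count_cpg` returns 3. Applying the same count to two aligned promoter regions would show which CpG sites are present in one subfamily and absent in the other.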
Further computational analysis revealed a difference between the predicted SPANX-A/D and SPANX-N promoter regions with regard to the density of CpG dinucleotides. In particular, SPANX-A/D promoters contain six CpG dinucleotides. However, several of these CpG dinucleotides are not present within the SPANX-N promoters (FIG. 8). The link between CpG islands and global DNA methylation is well known. These observations suggest that inactivation of SPANX-A/D genes by methylation is more efficient compared to SPANX-N. The difference in promoter methylation between the genes may be even greater because of the presence of the Sp1-binding site within the promoter sequence of four SPANX-N genes (FIG. 8). It is known that binding of the Sp1 transcription factor may prevent DNA methylation (Brandeis et al., Nature 371:435-438 (1994); Macleod et al., Genes Dev. 8:2282-2292 (1994)). Taken together, these analyses suggest that expression of the SPANX gene family is generally regulated through promoter demethylation and that the evolutionarily old group of genes, SPANX-N, may partially escape strong regulatory control due to a lower density of CpG dinucleotides.
Analysis of SPANX-N genes in X-linked prostate cancer families did not reveal any mutations in coding and 5' untranslated regions. Genetic linkage studies indicate that SPANX-N genes overlap the HPCX locus implicated in prostate carcinogenesis. As shown herein, these genes are expressed in normal prostate. Given their possible role in hereditary prostate cancer, a mutational analysis in eleven X-linked families affected by prostate cancer was performed.
The first screen included analysis of minisatellite repeats that are known to be a common source of gene inactivation (Bois, Genomics. 81: 349-355 (2003)). SPANX-N genes, except SPANX-N4, contain 39 bp minisatellite repeats. Four, seven, three and five copies of the repeat are present in the SPANX-N1, -N2, -N3 and -N5 genes, respectively. In the case of SPANX-N1 and SPANX-N5, the minisatellite arrays were located after the stop codon. Therefore, these genes were excluded from the analysis. In the case of SPANX-N2 and SPANX-N3, which were preferentially expressed in prostate, the repeats are in frame with exon 2 and encode the C-terminus of the proteins.
The number of minisatellite repeats in SPANX-N genes may be variable in the human population and some minisatellite alleles may be associated with a risk of cancer by affecting expression of nearby genes. To test this, specific primers were designed that amplify the entire block of minisatellites within SPANX-N2 and SPANX-N3. The minisatellite polymorphism was then examined in X-linked families with hereditary prostate cancer and compared to unaffected controls. However, no polymorphism was found, indicating that the repeats are reasonably stable and have no link to prostate malignancy.
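The repeat arithmetic underlying the minisatellite screen can be illustrated with a short Python sketch. Because 39 is divisible by 3, each repeat unit adds a whole number of codons (13), which is consistent with the repeats in SPANX-N2 and SPANX-N3 remaining in frame. The function names are hypothetical, and the copy counter assumes exact, back-to-back copies of a supplied unit; real minisatellite units may differ slightly from copy to copy, so this is a simplification.

```python
REPEAT_UNIT_BP = 39  # repeat unit length stated in the text


def codons_per_repeat(unit_bp=REPEAT_UNIT_BP):
    """Number of codons contributed by one repeat unit; requires an in-frame length."""
    if unit_bp % 3 != 0:
        raise ValueError("repeat unit length is not a multiple of 3 (not in frame)")
    return unit_bp // 3


def count_tandem_copies(sequence, unit):
    """Length of the longest run of exact, back-to-back copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    start = sequence.find(unit)
    while start != -1:
        copies = 0
        position = start
        # Walk forward one unit at a time while the next unit matches exactly.
        while sequence.startswith(unit, position):
            copies += 1
            position += len(unit)
        best = max(best, copies)
        start = sequence.find(unit, start + 1)
    return best
```

Comparing the copy count obtained from each individual's amplified repeat block would reveal a length polymorphism if one were present; in the study described above, none was found.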
The next stage included a mutational screen of the entire SPANX-N exon and 5' UTR sequences. Each of the four genes that mapped to Xq27 was checked in eleven X-linked families that were predisposed to prostate cancer and 40 unaffected controls. Exon 1 and exon 2 regions along with the flanking sequences were PCR amplified from each individual using a pair of specific primers and sequenced. Sequence analysis of the patients identified four alleles of SPANX-N1, five alleles of SPANX-N2 and two alleles each of SPANX-N3 and SPANX-N4 (Table 11).
Table 11: Nucleotide and amino acid changes in the SPANX-N1, -N2, -N3 and -N4 coding regions: frequency and location
[Table 11 is reproduced as images in the original publication.]
*In total, 40 normal individuals and 18 X-linked prostate cancer patients from 11 families were analyzed.
As shown in Table 11, none of the four SPANX-N1 variants contain mutations resulting in amino acid substitutions. Five SPANX-N2 alleles had one non-conservative change (T8I) and four conservative substitutions in codons 4, 80 and 151. One amino acid replacement (K43N) was found in two SPANX-N3 alleles. Two SPANX-N4 alleles had one non-conservative change (K48N). All other gene variants had single conservative missense mutations. Because the same substitutions were found in unaffected individuals, these changes may represent normal genetic polymorphism.
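The substitution labels used above (for example T8I, K43N and K48N) follow the convention of reference residue, codon position and replacement residue. A minimal Python sketch for parsing such labels is shown below; the function name is hypothetical, and the sketch does not attempt to classify a change as conservative or non-conservative, because the classification scheme is not specified in the text.

```python
import re

# One-letter reference residue, codon position, one-letter replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'K43N' into (reference residue, position, replacement residue)."""
    match = _SUBSTITUTION.match(label.strip())
    if match is None:
        raise ValueError(f"not a single amino acid substitution label: {label!r}")
    reference, position, replacement = match.groups()
    return reference, int(position), replacement
```

Parsed in this way, K43N denotes replacement of lysine at codon 43 by asparagine, and the labels for affected and unaffected individuals can be compared position by position.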
These data indicate that the predisposition to prostate cancer in X-linked families is not linked to specific mutations in the SPANX-N genes.
Identification and analysis of SPANX-N homologs. Comparison of SPANX gene homologs indicated that their expansion is still an ongoing process in hominids. Based on the proposed model of evolution of these genes, the common ancestor of rodents and primates apparently had a single SPANX gene. Amplification of this gene was initiated in primates and coincided with the insertion of ERV into the intronic sequence. To identify SPANX-N homologs in other species, bioinformatics was utilized in combination with experimental testing. A search for members of the SPANX-N subfamily in the chimpanzee genome detected five contigs with incomplete sequences on the X chromosome. Thus, the chimpanzee genome, like the human genome, contains five SPANX-N genes. After PCR amplification of the corresponding chimp genomic regions using primers designed for the human SPANX-N genes, followed by sequencing and assembly, the complete coding regions of three gene homologs, SPANX-N2, SPANX-N3 and SPANX-N5, were obtained.
The chimpanzee gene homologs encode proteins that share about 95% identity with the human SPANX-N. Comparison of chimpanzee SPANX-N2 and SPANX-N3 coding sequences to human revealed four non-conservative substitutions, two of which are in the core (K43N and Y55H). The chimpanzee SPANX-N3 gene contains 10 non-conserved changes compared to human, three in the conserved core (E15K, N21S and K23E). There is also a single amino acid deletion (del 122K) and two conservative changes. Similar to human, all chimpanzee genes, except SPANX-N4, contained 39 bp minisatellite repeats at their 3' ends. Because these repeats are also present in tamarin and rhesus SPANX-N genes, they likely arose during early primate evolution.
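Percent identity figures such as the approximately 95% quoted above are computed from aligned sequences. The following Python sketch shows one common way to do so, under the assumption that the two sequences are already aligned to equal length and that columns containing a gap in either sequence are skipped; other conventions for handling gaps exist, and the text does not state which was used. The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned columns that contain no gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if residue_a == residue_b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared
```

For two made-up five-residue strings that differ at one position, such as "MEKPT" and "MEQPT", the function returns 80.0.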
Comparison of the SPANX-N genomic sequences permitted reconstruction of the probable scheme of evolution of the genes in human. The most likely scenario is that SPANX-N3 was the original locus. Expansion of other genes seems to have occurred by duplication of this chromosomal segment.
Dog and wolf SPANX-N gene homologs were also identified. Two regions of significant similarity to the human SPANX-N were identified in the canine genome, one on chromosome X and another on chromosome 31. However, the chromosome X-linked copy of the SPANX gene was a pseudogene; this copy contained a stop codon in the middle of exon 2 and is not expressed in testis. In contrast, a gene copy corresponding to the chromosome 31 contig lacks a frameshift mutation and is abundantly expressed in testis (data not shown). Because the dog genome is not yet completed, additional experiments are required to elucidate whether the functional copy is indeed outside of chromosome X, or whether this is a mapping error. Similarly, one gene and one pseudogene were detected in the wolf genome. The canine SPANX-N protein shares only 30% amino acid identity with the human and mouse SPANX-N proteins. Sequence analysis of the SPANX-N gene in 33 canine breeds revealed four different alleles that may be useful for pedigree analysis. Two non-conservative changes were found in four canine and two wolf alleles.
These results indicate that a similar organization exists for SPANX-N genes in the human and chimpanzee genomes. Identification of a single functional SPANX-N gene homolog in dog, which represents the only animal model of prostate cancer, may be important to elucidate a possible contribution of this group of genes to prostate malignancies.
In addition, the results provided herein show that SPANX-N1 expression is correlated with the development of several types of cancer, including uterine and cervical cancers, melanoma, testicular and prostate cancers.
References:
Aktas et al., Int. Urol. Nephrol. 28: 819-29 (1996).
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403-410.
Babcock M, Pavlicek A, Spiteri E, Kashork CD, Ioshikhes I, Shaffer LG, Jurka J, Morrow BE. Shuffling of genes within low-copy repeats on 22q11 (LCR22) by Alu-mediated recombination events during evolution. Genome Res. 2003, 13, 2519-2532.
Baccetti, et al., Gamete Res. 23: 181-88 (1989).
Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, Schwartz S, Adams MD, Myers EW, Li PW, Eichler EE. Recent segmental duplications in the human genome. Science. 2002, 297: 1003-1007.
Ball MA, Parker GA. Sperm competition games: sperm selection by females. J Theor Biol. 2003, 224, 27-42.
Birkhead TR, Pizzari T. Postcopulatory sexual selection. Nat Rev Genet. 2002, 3, 262-273.
Bochum S, Paiss T, Vogel W, Herkommer K, Hautmann R, Haeussler J. Confirmation of the prostate cancer susceptibility locus HPCX in a set of 104 German prostate cancer families. Prostate. 2002, 52, 12-19.
Bray N, Pachter L. 2004 MAVID: constrained ancestral alignment of multiple sequences. Genome Res. 14, 693-699.
Chen JM, Ferec C. Gene conversion-like missense mutations in the human cationic trypsinogen gene and insights into the molecular evolution of the human trypsinogen family. Mol Genet Metab. 2000 7, 463-469.
Courseaux et al., Genome Res. 13: 369-381 (2003).
Eichler, E.E. Trends Genet. 17:661-669 (2001).
Gillanders EM, Xu J, Chang BL, Lange EM, Wiklund F, Bailey-Wilson JE, Baffoe-Bonnie A, Jones M, Gildea D, Riedesel E, Albertus J, Isaacs SD, Wiley KE, Mohai CE, Matikainen MP, Tammela TL, Zheng SL, Brown WM, Rokman A, Carpten JD, Meyers DA, Walsh PC, Schleutker J, Gronberg H, Cooney KA, Isaacs WB, Trent JM. Combined genome-wide scan for prostate cancer susceptibility genes. J Natl Cancer Inst. 2004, 96, 1240-1247.
Grasser KD. Chromatin-associated HMGA and HMGB proteins: versatile co-regulators of DNA-dependent processes. Plant Mol Biol. 2003, 53, 281-295.
Guy et al., Hum. Mol. Genet. 9: 2029-2042 (2000).
Holzik MF, Rapley EA, Hoekstra HJ, Sleijfer DT, Nolte IM, Sijmons RH. Genetic predisposition to testicular germ-cell tumours. Lancet Oncol. 2004, 5, 363-371.
Jurka, J. (2000) Repbase update: a database and an electronic journal of repetitive elements. Trends Genet., 16, 418-420.
Jurka, J., Klonowski, P., Dagman, V. and Pelton, P. (1996) CENSOR-a program for identification and elimination of repetitive elements from DNA sequences. Comput. Chem., 20, 119-121.
Kent WJ. 2002 BLAT-the BLAST-like alignment tool. Genome Res. Apr;12(4):656-64.
Kibel AS, Faith DA, Bova GS, Isaacs WB. Xq27-28 deletions in prostate carcinoma. Genes Chromosomes Cancer. 2003, 37, 381-388.
Kimura, M. J. Mol. Evol. 16: 111-120 (1980).
Kouprina N., Annab L., Graves J., Afshari C., Barrett J. C., Resnick M. A., and Larionov V. Functional copies of a human gene can be directly isolated by TAR cloning with a small 3' end target sequence. Proc. Natl. Acad. Sci. USA 95: 4469-4474, 1998.
Kouprina, N. and Larionov, V., FEMS Microbiol. Rev. 27:1-21 (2003).
Kumar et al., Bioinformatics 17: 1244-45 (2001).
Larson, R.E. and Chenoweth, P.J. 1990, Mol. Repro. Dev. 25: 87-96.
Leem S.-H., V. N. Noskov, J-E Park, S I Kim, V. Larionov, and N. Kouprina. Optimum conditions for selective isolation of genes from complex genomes by transformation-associated recombination cloning, Nucl. Acids Res., 31, e29, 2003.
Leem, S.-H., N. Kouprina, J. Grimwood, J.-H. Kim, M. Mullokandov, Y.-H. Yoon, J.-Y. Chae, J. Morgan, S. Lucas, P. Richardson, C. Detter, T. Glavina, E. Rubin, J. C. Barrett, V. Larionov. Closing the Gaps on Human Chromosome 19 Revealed Genes with a High Density of Repetitive Tandemly Arrayed Elements. Genome Research, 14, 239-246, 2004.
Lievers et al., Eur. J. Hum. Genet. 9: 583-89 (2001).
Makalowski, W. and Boguski, M.S., J. Mol. Evol. 47: 119-121 (1998).
Montironi R, Scarpelli M, Lopez Beltran A. Carcinoma of the prostate: inherited susceptibility, somatic gene defects and androgen receptors. Virchows Arch. 2004, 444, 503-508.
Nagasaki K, Manabe T, Hanzawa H, Maass N, Tsukada T, Yamaguchi K. (1999) Identification of a novel gene, LDOC1, down-regulated in cancer cell lines. Cancer Lett 140:227-234.
Nagasaki K, Schem C, von Kaisenberg C, Biallek M, Rosel F, Jonat W, Maass N (2003) Leucine-zipper protein, LDOC1, inhibits NF-kappaB activation and sensitizes pancreatic cancer cells to apoptosis. Int J Cancer 105:454-458.
Newman T, Trask BJ. Complex evolution of 7E olfactory receptor genes in segmental duplications. Genome Res. 2003 May; 13(5):781-793.
Ohno, S. Evolution by Gene Duplication. Springer, Berlin, 1970.
Richardson C, Jasin M (2000) Coupled homologous and nonhomologous repair of a double-strand break preserves genomic integrity in mammalian cells. Mol Cell Biol 20:9068-9075.
Richardson C, Moynahan ME, Jasin M (1998) Double-strand break repair by interchromosomal recombination: suppression of chromosomal translocations. Genes Dev 12:3831-3842.
Rost et al., Comput. Appl. Biosci. 10: 53-60 (1994).
Saitou, N. and Nei, M., Mol. Biol. Evol. 4: 406-25 (1987).
Samonte, R. V. and Eichler, E.E., Nat. Rev. Genet. 3: 65-72 (2002).
Sawyer, S. A. (1989) Statistical tests for detecting gene conversion. Molecular Biology and Evolution 6, 526-538.
Scanlan MJ, Simpson AJ, Old LJ. The cancer/testis genes: review, standardization, and commentary. Cancer Immun. 2004, 4, 1.
Schaid DJ. The complex genetic epidemiology of prostate cancer. Mol Genet. 2004, 13 Spec No 1: R103-121.
Schuler, G.D. et al. Proteins 9:180-190 (1991).
Sebat J, Lakshmi B, Troge J, Alexander J, Young J, Lundin P, Maner S, Massa H, Walker M, Chi M, Navin N, Lucito R, Healy J, Hicks J, Ye K, Reiner A, Gilliam TC, Trask B, Patterson N, Zetterberg A, Wigler M. Large-scale copy number polymorphism in the human genome. Science. 2004, 305, 525-528.
Shaw CJ, Lupski JR. Implications of human genome architecture for rearrangement-based disorders: the genomic basis of disease. Hum Mol Genet. 2004, 13, Spec No 1:R57-64.
Singh, R.S. and Kulathinal, R.J., Genes Genet. Syst. 75: 119-130 (2000).
Skaletsky H, Kuroda-Kawaguchi T, Minx PJ, Cordum HS, Hillier L, Brown LG, Repping S, Pyntikova T, Ali J, Bieri T, Chinwalla A, Delehaunty A, Delehaunty K, Du H, Fewell G, Fulton L, Fulton R, Graves T, Hou SF, Latrielle P, Leonard S, Mardis E, Maupin R, McPherson J, Miner T, Nash W, Nguyen C, Ozersky P, Pepin K, Rock S, Rohlfing T, Scott K, Schultz B, Strong C, Tin-Wollam A, Yang SP, Waterston RH, Wilson RK, Rozen S, Page DC (2003) The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature 423:825-837.
Sousa, M. and Caravalheiro, J. 1994. Anat. Embryol. 190: 479-87.
Stephen DA, Howell GR, Teslovich TM, Coffey AJ, Smith L, Bailey-Wilson JE, Malechek L, Gildea D, Smith JR, Gillanders EM, Schleutker J, Hu P, Steingruber HE, Dhami P, Robbins CM, Makalowska I, Carpten JD, Sood R, Mumm S, Reinbold R, Bonner TI, Baffoe-Bonnie A, Bubendorf L, Heiskanen M, Kallioneimi OP, Baxevanis AD, Joseph SS, Zucchi I, Burk RD, Isaacs W, Ross MT, Trent JM. Physical and transcript map of the hereditary prostate cancer region at Xq27. Genomics. 2002, 79, 41-50.
Suh J, Rabson AB (2004) NF-kappaB activation in human prostate cancer: important mediator or epiphenomenon? J Cell Biochem 91:100-117.
Swanson et al., Proc. Natl. Acad. Sci. USA 98: 7375-7379 (2001).
Swanson et al., Nat. Rev. Genet. 3: 137-144 (2002).
Tenzen T, Yamagata T, Fukagawa T, Sugaya K, Ando A, Inoko H, Gojobori T, Fujiyama A, Okumura K, Ikemura T (1997) Precise switching of DNA replication timing in the GC content transition area in the human major histocompatibility complex. Mol Cell Biol. 17:4043-4050.
Tremblay A, Jasin M, Chartrand P (2000) A double-strand break in a chromosomal LINE element can be repaired by gene conversion with various endogenous LINE elements in mouse cells. Mol Cell Biol 20:54-60.
Wang, Z., Zhang, Y., Liu, H., Salati, E., Chiriva-Internati, M., and Lim, S.H. 2003. Gene expression and immunologic consequence of SPAN-Xb in myeloma and other hematologic malignancies. Blood. 101:955-960.
Warburton PE, Giordano J, Cheung F, Gelfand Y, Benson G (2004) Inverted repeat structure of the human genome: the x-chromosome contains a preponderance of large, highly homologous inverted repeats that contain testes genes. Genome Res 14:1861-1869.
Watanabe Y, Tenzen T, Nagasaka Y, Inoko H, Ikemura T (2000) Replication timing of the human X-inactivation center (XIC) region: correlation with chromosome bands. Gene. 252:163-172.
Westbrook, V.A., Diekman, A.B., Klotz, K.L., Khole, V.V., von Kap-Herr, C., Golden, W.L., Eddy, R.L., Shows, T.B., Stoler, M.H., Lee, C.Y., Flickinger, C.J., and Herr, J.C. 2000. Spermatid-specific expression of the novel X-linked gene product SPAN-X localized to the nucleus of human spermatozoa. Biol. Reprod. 63:469-481.
Westbrook, V.A., Diekman, A.B., Naaby-Hansen, S., Coonrod, S.A., Klotz, K.L., Thomas, T.S., Norton, E.J., Flickinger, C.J., and Herr, J.C. 2001. Differential nuclear localization of the cancer/testis-associated protein, SPAN-X/CTp11, in transfected cells and in 50% of human spermatozoa. Biol. Reprod. 64:345-358.
Westbrook, V.A., Schoppee, P.D., Diekman, A.B., Klotz, K.L., Allietta, M., Hogan, K.T., Slingluff, C.L., Patterson, J.W., Frierson, H.F., Irvin, W.P. Jr., Flickinger, C.J., Coppola, M.A., and Herr, J.C. 2004. Genomic organization, incidence, and localization of the SPAN-x family of cancer-testis antigens in melanoma tumors and cell lines. Clin. Cancer Res. 10:101-112.
Wyckoff et al., Nature 403: 304-309 (2000).
Xu J, Gillanders EM, Isaacs SD, Chang BL, Wiley KE, Zheng SL, Jones M, Gildea D, Riedesel E, Albertus J, Freas-Lutz D, Markey C, Meyers DA, Walsh PC, Trent JM, Isaacs WB. Genome-wide scan for prostate cancer susceptibility genes in the Johns Hopkins hereditary prostate cancer families. Prostate. 2003, 57, 320-325.
Xu J, Meyers D, Freije D, Isaacs S, Wiley K, Nusskern D, Ewing C, Wilkens E, Bujnovszky P, Bova GS, Walsh P, Isaacs W, Schleutker J, Matikainen M, Tammela T, Visakorpi T, Kallioniemi OP, Berry R, Schaid D, French A, McDonnell S, Schrøeder J, Blute M, Thibodeau S, Trent J, et al. Evidence for a prostate cancer susceptibility locus on the X chromosome. Nat Genet. 1998, 20, 175-179.
Yang et al., Am. J. Mol. Genet. 95: 385-390 (2000).
Zendman, A.J., Cornelissen, I.M., Weidle, U.H., Ruiter, D.J., and van Muijen, G.N. 1999. CTp11, a novel member of the family of human cancer/testis antigens. Cancer Res. 59:6223-6229.
Zendman, A.J., Ruiter, D.J., Weiss, E.H., and van Muijen, G.N. 2003. Cell Physiol. 194:272-288.
Zhang et al., Proc. Natl. Acad. Sci. USA 95: 3708-13 (1998).
Zika R, Paces J, Pavlicek A, Paces V (2004) WAViS server for handling, visualization and presentation of multiple alignments of nucleotide or amino acids sequences. Nucleic Acids Res 32:W48-49.
All patents and publications referenced or mentioned herein are indicative of the levels of skill of those skilled in the art to which the invention pertains, and each such referenced patent or publication is hereby incorporated by reference to the same extent as if it had been incorporated by reference in its entirety individually or set forth herein in its entirety. Applicants reserve the right to physically incorporate into this specification any and all materials and information from any such cited patents or publications.
The specific methods and compositions described herein are representative of preferred embodiments and are exemplary and not intended as limitations on the scope of the invention. Other objects, aspects, and embodiments will occur to those skilled in the art upon consideration of this specification, and are encompassed within the spirit of the invention as defined by the scope of the claims. It will be readily apparent to one skilled in the art that varying substitutions and modifications may be made to the invention disclosed herein without departing from the scope and spirit of the invention. The invention illustratively described herein suitably may be practiced in the absence of any element or elements, or limitation or limitations, which is not specifically disclosed herein as essential. The methods and processes illustratively described herein suitably may be practiced in differing orders of steps, and they are not necessarily restricted to the orders of steps indicated herein or in the claims. As used herein and in the appended claims, the singular forms "a," "an," and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to "an antibody" includes a plurality (for example, a solution of antibodies or a series of antibody preparations) of such antibodies, and so forth. Under no circumstances may the patent be interpreted to be limited to the specific examples or embodiments or methods specifically disclosed herein. Under no circumstances may the patent be interpreted to be limited by any statement made by any Examiner or any other official or employee of the Patent and Trademark Office unless such statement is specifically and without qualification or reservation expressly adopted in a responsive writing by Applicants.
The terms and expressions that have been employed are used as terms of description and not of limitation, and there is no intent in the use of such terms and expressions to exclude any equivalent of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention as claimed. Thus, it will be understood that although the present invention has been specifically disclosed by preferred embodiments and optional features, modification and variation of the concepts herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention as defined by the appended claims.
The invention has been described broadly and generically herein. Each of the narrower species and subgeneric groupings falling within the generic disclosure also form part of the invention. This includes the generic description of the invention with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein. Other embodiments are within the following claims. In addition, where features or aspects of the invention are described in terms of Markush groups, those skilled in the art will recognize that the invention is also thereby described in terms of any individual member or subgroup of members of the Markush group.

Claims

WHAT IS CLAIMED:
1. An isolated SPANX-N polypeptide having an amino acid sequence comprising any one of SEQ ID NO: 1-5.
2. An isolated antibody that can bind to a polypeptide having an amino acid sequence corresponding to any one of SEQ ID NO: 1-5.
3. The isolated antibody of claim 2, wherein the antibody can bind to a SPANX-N peptide consisting essentially of any one of SEQ ID NO: 12-25, or 136.
4. An isolated nucleic acid encoding a SPANX-N polypeptide having an amino acid sequence comprising any one of SEQ ID NO: 1-5.
5. The isolated nucleic acid of claim 4, wherein the nucleic acid comprises any one of SEQ ID NO:26-30.
6. An isolated nucleic acid encoding a SPANX-N promoter comprising any one of SEQ ID NO:206-210.
7. The nucleic acid of claim 6, wherein the promoter is a SPANX-N1 promoter consisting of SEQ ID NO:206.
8. An expression cassette comprising a nucleic acid encoding a therapeutic gene product operably linked to the isolated nucleic acid of claim 6.
9. The expression cassette of claim 8, wherein the promoter is a SPANX-N1 promoter consisting of SEQ ID NO:206.
10. An isolated cell comprising the expression cassette of claim 8.
11. An isolated SPANX-N specific primer or probe consisting essentially of any one of SEQ ID NO:37, 38, 137-170, or a combination thereof.
12. The isolated SPANX-N specific primer or probe of claim 11, wherein the primer or probe is a SPANX-N1 primer or probe that consists essentially of any one of SEQ ID NO: 137-144, or a combination thereof.
13. An isolated nucleic acid that can inhibit the function of a SPANX mRNA comprising a DNA or RNA that can hybridize to a nucleic acid encoding a SPANX-N polypeptide having an amino acid sequence comprising any one of SEQ ID NO:1-5.
14. The isolated nucleic acid of claim 13, wherein the SPANX mRNA is complementary to any one of SEQ ID NO:26-30.
15. The isolated nucleic acid of claim 13, wherein the isolated nucleic acid is a small interfering RNA (siRNA), ribozyme, or antisense nucleic acid.
16. The isolated nucleic acid of claim 13, wherein the isolated nucleic acid sequence consists essentially of SEQ ID NO:39-48.
17. A method for detecting cancer comprising contacting a mammalian tissue sample with a SPANX-N probe and observing whether an mRNA or cDNA in the sample hybridizes to the SPANX-N probe; wherein the SPANX-N probe comprises any one of SEQ ID NO:26-30, 37, 38, 137-170.
18. The method of claim 17, wherein the probe comprises any one of SEQ ID NO:137-144.
19. The method of claim 17, wherein observing whether an mRNA or cDNA in the sample hybridizes to the SPANX-N probe further comprises nucleic acid amplification of a nucleic acid in the tissue sample.
20. The method of claim 17, wherein the mammalian tissue sample is not a testis tissue sample.
21. A method for detecting cancer comprising performing nucleic acid amplification of RNA from a non-testis tissue sample using SPANX-N1 primers consisting essentially of SEQ ID NO: 137 and 138, or SEQ ID NO: 141 and 142, and observing whether a nucleic acid fragment is amplified.
22. The method of claim 21, wherein the nucleic acid fragment is about 1,750 to about 1,950 base pairs in length.
23. A method for detecting cancer comprising contacting a non-testis tissue sample with an anti-SPANX-N antibody and observing whether a complex forms between the antibody and a SPANX-N polypeptide.
24. The method of claim 23, wherein the SPANX-N polypeptide comprises SEQ ID NO: 1-5.
25. The method of claim 23, wherein the antibody can bind to a SPANX-N peptide consisting of SEQ ID NO: 136.
26. A method for treating cancer in a mammal comprising administering to the mammal an effective amount of an antibody that can bind to a SPANX-N peptide consisting of SEQ ID NO: 136.
27. The method of claim 26, wherein the antibody is linked to an anti-cancer agent.
28. A method for treating cancer in a mammal comprising administering to the mammal an effective amount of a nucleic acid that can inhibit the function of a SPANX-N mRNA, wherein the nucleic acid comprises a DNA or RNA that can hybridize to a mRNA encoding a SPANX-N polypeptide having an amino acid sequence comprising any one of SEQ ID NO: 1-5.
29. The method of claim 28, wherein the SPANX-N mRNA is complementary to any one of SEQ ID NO:26-30.
30. The method of claim 28, wherein the nucleic acid is a small interfering RNA (siRNA), ribozyme, or antisense nucleic acid.
31. The method of claim 28, wherein the nucleic acid's sequence consists essentially of SEQ ID NO:39-48.
32. A method for treating cancer in a mammal comprising administering to the mammal an effective amount of a nucleic acid that encodes an anti-cancer agent operably linked to a SPANX-N1 promoter comprising SEQ ID NO:206.
33. The method of claim 32, wherein the anti-cancer agent is a cytokine, interferon, hormone, cell growth inhibitor, cell cycle regulator, apoptosis regulator, cytotoxin, cytolytic viral product, or antibody.
34. The method of claim 32, wherein the anti-cancer agent is interferon-alpha, p53, p16, CCAM, p21, p15, BRCA1, BRCA2, IRF-1, PTEN (MMAC1), RB, APC, DCC, NF-1, NF-2, WT-1, MEN-I, MEN-II, zac1, p73, VHL, FCC, MCC, DBCCR1, DCP4, p57, Bax, Bak, Bcl-Xs, Bad, Bim, Bik, Bid, Harakiri, Ad E1B, Bad, ICE-CED3 protease, TRAIL, SARP-2, apoptin, p27, p16, p21, p57, p18, p73, p19, p15, E2F-1, E2F-2, E2F-3, p107, p130, E2F-4, soluble Flt1 (dominant negative soluble VEGF receptor), soluble Wnt receptor, soluble Tie2/Tek receptor, soluble hemopexin domain of matrix metalloprotease 2, soluble receptor of VEGFR1/KDR, soluble receptor of VEGFR3/Flt4, RANTES, MCAF, MIP1-alpha, MIP1-beta, IP-10, ribosome inactivating protein, α-sarcin, aspergillin, restrictocin, a ribonuclease, diphtheria toxin A, pertussis toxin A subunit, E. coli enterotoxin toxin A subunit, cholera toxin A subunit, pseudomonas toxin c-terminal, ricin A-chain or melanoma differentiation associated protein 7 (MDA7).
35. A method to identify an agent that modulates SPANX-N expression comprising contacting a test cell with a candidate agent, and determining if the candidate agent increases or decreases expression of an SPANX-N gene in the test cell when compared to expression of the SPANX-N gene in a control cell that was not contacted with the candidate agent.
36. The method of claim 35, wherein the agent increases SPANX-N expression.
37. The method of claim 35, wherein the agent decreases SPANX-N expression.
38. The method of claim 35, wherein the SPANX-N is SPANX-N1.
39. The method of claim 35, wherein expression of an SPANX-N gene is determined by observing whether an anti-SPANX-N antibody binds to a SPANX-N polypeptide expressed by the SPANX-N gene in the test cell.
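
The comparison recited in claims 35 to 37 (expression in a test cell contacted with a candidate agent versus expression in an untreated control cell) can be illustrated with a minimal sketch. This is not part of the claims and does not describe the applicants' assay. The function name `classify_agent`, the use of replicate expression measurements, and the 1.5-fold cutoff are all assumptions made only for illustration.

```python
from statistics import mean


def classify_agent(test_levels, control_levels, fold_threshold=1.5):
    """Label a candidate agent by its effect on measured expression.

    test_levels:    expression readings from cells contacted with the agent
    control_levels: expression readings from cells not contacted with it
    fold_threshold: hypothetical cutoff; the claims do not specify one
    """
    if not test_levels or not control_levels:
        raise ValueError("both groups need at least one measurement")
    control_mean = mean(control_levels)
    if control_mean <= 0:
        raise ValueError("control expression must be positive")

    # Ratio of treated to untreated expression
    fold_change = mean(test_levels) / control_mean

    if fold_change >= fold_threshold:
        return "increases", fold_change
    if fold_change <= 1 / fold_threshold:
        return "decreases", fold_change
    return "no clear change", fold_change


if __name__ == "__main__":
    # Made-up readings, for illustration only
    label, ratio = classify_agent([0.4, 0.5], [1.0, 1.0])
    print(label, round(ratio, 2))
```

In practice the readings could come from any expression measurement, such as the antibody-binding readout of claim 39, and a real screen would add replicates and a statistical test rather than a fixed fold cutoff.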
PCT/US2005/045317 2004-12-15 2005-12-15 Cancer-specific spanx-n markers WO2006065938A2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CA002591918A CA2591918A1 (en) 2004-12-15 2005-12-15 Cancer-specific spanx-n markers
AU2005316532A AU2005316532A1 (en) 2004-12-15 2005-12-15 Cancer-specific SPANX-N markers
EP05854102A EP1838872A2 (en) 2004-12-15 2005-12-15 Cancer-specific spanx-n markers

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US63681104P 2004-12-15 2004-12-15
US60/636,811 2004-12-15

Publications (2)

Publication Number Publication Date
WO2006065938A2 WO2006065938A2 (en) 2006-06-22
WO2006065938A3 WO2006065938A3 (en) 2007-02-08

Family

ID=36588517

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2005/045317 WO2006065938A2 (en) 2004-12-15 2005-12-15 Cancer-specific spanx-n markers

Country Status (4)

Country Link
EP (1) EP1838872A2 (en)
AU (1) AU2005316532A1 (en)
CA (1) CA2591918A1 (en)
WO (1) WO2006065938A2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010084488A1 (en) 2009-01-20 2010-07-29 Ramot At Tel-Aviv University Ltd. Mir-21 promoter driven targeted cancer therapy
US8664183B2 (en) 2009-02-27 2014-03-04 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services SPANX-B polypeptides and their use

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114262683B (en) * 2022-03-01 2022-06-17 中国科学院动物研究所 Bacterial preparation for expressing VEGFR 3D 2 polypeptide and construction method and application thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004020607A2 (en) * 2002-08-30 2004-03-11 Texas Tech University SPAN-Xb GENE AND PROTEIN FOR THE DIAGNOSIS AND TREATMENT OF CANCER
WO2004048518A2 (en) * 2002-11-26 2004-06-10 Incyte Corporation Organelle-associated proteins

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
DATABASE UniProt 7 December 2004 (2004-12-07), XP002394415 retrieved from EBI Database accession no. Q5VSR9 & KOUPRINA NATALAY ET AL: "The SPANX gene family of cancer/testis-specific antigens: Rapid evolution and amplification in African great apes and hominids." PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, vol. 101, no. 9, 2 March 2004 (2004-03-02), pages 3077-3082, ISSN: 0027-8424 *
HOPP T P ET AL: "PREDICTION OF PROTEIN ANTIGENIC DETERMINANTS FROM AMINO ACID SEQUENCES" PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE USA, NEW YORK, NY, US, vol. 78, no. 6, 1 June 1981 (1981-06-01), pages 3824-3828, XP000647365 *
KOUPRINA NATALAY ET AL: "Dynamic structure of the SPANX gene cluster mapped to the prostate cancer susceptibility locus HPCX at Xq27" GENOME RESEARCH, vol. 15, no. 11, November 2005 (2005-11), pages 1477-1486, XP002394412 ISSN: 1088-9051 *
WERNER THOMAS: "Target gene identification from expression array data by promoter analysis" BIOMOLECULAR ENGINEERING, vol. 17, no. 3, March 2001 (2001-03), pages 87-94, XP002394413 ISSN: 1389-0344 *
ZENDMAN A J W ET AL: "The human SPANX multigene family: genomic organization, alignment and expression in male germ cells and tumor cell lines" GENE: AN INTERNATIONAL JOURNAL ON GENES AND GENOMES, ELSEVIER, AMSTERDAM, NL, vol. 309, no. 2, 8 May 2003 (2003-05-08), pages 125-133, XP004426689 ISSN: 0378-1119 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010084488A1 (en) 2009-01-20 2010-07-29 Ramot At Tel-Aviv University Ltd. Mir-21 promoter driven targeted cancer therapy
US8492133B2 (en) 2009-01-20 2013-07-23 Ramot At Tel Aviv University, Ltd. MIR-21 promoter driven targeted cancer therapy
US9044506B2 (en) 2009-01-20 2015-06-02 Alona Zilberberg MIR-21 promoter driven targeted cancer therapy
US8664183B2 (en) 2009-02-27 2014-03-04 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services SPANX-B polypeptides and their use
US9238684B2 (en) 2009-02-27 2016-01-19 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services SPANX-B polypeptides and their use

Also Published As

Publication number Publication date
CA2591918A1 (en) 2006-06-22
EP1838872A2 (en) 2007-10-03
WO2006065938A3 (en) 2007-02-08
AU2005316532A1 (en) 2006-06-22

Similar Documents

Publication Publication Date Title
AU2019201577B2 (en) Cancer diagnostics using biomarkers
US7125663B2 (en) Genes, compositions, kits and methods for identification, assessment, prevention, and therapy of cervical cancer
EP3062106B1 (en) Method for determining androgen receptor variants in prostate cancer
US6165713A (en) Composition and methods relating to DNA mismatch repair genes
CN101874120B (en) Genetic variants on chr2 and chr16 as markers for use in breast cancer risk assessment, diagnosis, prognosis and treatment
KR20100017865A (en) Genetic variants on chr 5p12 and 10q26 as markers for use in breast cancer risk assessment, diagnosis, prognosis and treatment
KR20110081807A (en) Genetic variants useful for risk assessment of thyroid cancer
US20060068411A1 (en) Cancer specific gene MH15
Darling et al. Revertant mosaicism: partial correction of a germ-line mutation in COL17A1 by a frame-restoring mutation
AU2023202663A1 (en) Hydroxysteroid 17-beta dehydrogenase 13 (HSD17B13) variants and uses thereof
US20070243176A1 (en) Human genes and gene expression products
KR20100095564A (en) Methods and compositions for assessing responsiveness of b-cell lymphoma to treatment with anti-cd40 antibodies
JP2018201515A (en) MECP2E1 gene
KR20110015409A (en) Gene expression markers for inflammatory bowel disease
JP2015109806A (en) Method for detecting new ret fused body
US20050170500A1 (en) Methods for identifying risk of melanoma and treatments thereof
US20030215803A1 (en) Human genes and gene expression products isolated from human prostate
CN114182023A (en) Detection kit and method for F-circP3F in rhabdomyosarcoma
WO2006065938A2 (en) Cancer-specific spanx-n markers
DK2148932T3 (en) SOX11 expression in malignant lymphomas
Müllenbach et al. A novel discoidin domain receptor 1 (Ddr1) transcript is expressed in postmeiotic germ cells of the rat testis depending on the major histocompatibility complex haplotype
CN113832189B (en) gRNA for knocking out pig immunoglobulin heavy chain IGHG region and application thereof
US20090011424A1 (en) Cancer-suppressing agents
WO2007115068A2 (en) Genetic variants in the indoleamine 2,3-dioxygenase gene
Nagai et al. Down-regulation in human cancers of DRHC, a novel helicase-like gene from 17q25. 1 that inhibits cell growth

Legal Events

Date Code Title Description
AK Designated states; Kind code of ref document: A2; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KN KP KR KZ LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW
AL Designated countries for regional patents; Kind code of ref document: A2; Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG
WWE WIPO information: entry into national phase; Ref document number: 2591918; Country of ref document: CA
NENP Non-entry into the national phase in: Ref country code: DE
WWE WIPO information: entry into national phase; Ref document number: 2005316532; Country of ref document: AU
ENP Entry into the national phase in: Ref document number: 2005316532; Country of ref document: AU; Date of ref document: 20051215; Kind code of ref document: A
WWP WIPO information: published in national office; Ref document number: 2005316532; Country of ref document: AU
WWE WIPO information: entry into national phase; Ref document number: 2005854102; Country of ref document: EP
121 EP: the EPO has been informed by WIPO that EP was designated in this application
WWP WIPO information: published in national office; Ref document number: 2005854102; Country of ref document: EP