WO2008008923A2

WO2008008923A2 - Compositions and methods for targeting cancer-specific transcription complexes

Info

Publication number: WO2008008923A2
Application number: PCT/US2007/073407
Authority: WO
Inventors: Anzelika Liik; Anna Kazantrseva
Original assignee: Oncotx, Inc.
Priority date: 2006-07-12
Filing date: 2007-07-12
Publication date: 2008-01-17
Also published as: WO2008008923A3; EP2038301A2; AU2007272452A1; US7973135B2; EP2038301A4; US20110151478A1; JP2009543559A; US20080027002A1; CA2655042A1

Abstract

The invention provides molecules that target cancer-specific transcription complexes (CSTC), compositions and kits comprising CSTC-targeting molecules, and methods of using CSTC-targeting molecules for the treatment, detection and monitoring of cancer.

Description

COMPOSITIONS AND METHODS FOR TARGETING CANCER-SPECIFIC TRANSCRIPTION COMPLEXES

Throughout this application various publications are referenced. The disclosures of these publications in their entireties are hereby incorporated by reference into this application in order to describe more fully the state of the art to which this invention pertains.

TECHNICAL FIELD OF THE INVENTION

The present invention relates generally to detection and therapy of cancer. The invention is more specifically related to novel molecules directed against cancer-specific transcription complexes. The molecules of the invention can be used in vaccines and pharmaceutical compositions for the treatment of various cancers expressing the targeted transcription complexes, as well as in methods of detecting and assessing the malignancy of such cancers.

BACKGROUND OF THE INVENTION

Cancer remains a significant health problem throughout the world. Current therapies, which are generally based on a combination of chemotherapy or surgery and radiation, continue to prove inadequate in many patients.

Among cancers, melanoma is well known both for its rapidly increasing incidence and its resistance to virtually all but surgical therapies. Melanoma arises from melanocytes, neural crest-derived pigment cells in the skin and eye. During melanoma carcinogenesis, many of the normal markers of the melanocyte lineage become lost. Gene expression patterns in melanoma cells and melanocytes have significant differences that reflect the cancerous nature of melanoma. In general, gene expression is regulated by two types of factors: DNA-binding transcription factors and co-regulators, which form cell type-specific complexes, including the mediator complex and the chromatin remodeling complex, that control the activity of RNA polymerase II (pol II). Cofactor complexes integrate signals from DNA-binding transcription factors as well as from different signaling systems to control RNA synthesis. Cofactor complexes are highly cell- and stimulus-specific, and vary from one physiological stage to another. Cancer cells express transcriptional co-factors with modified structure that is a result of mutations, post-translational modifications, alternative splicing, and fusion of different fragments of different proteins, to name but a few.

Transcriptional control of melanoma and melanocyte development

Despite altered gene expression patterns, most, if not all, melanomas retain expression of the basic/helix-loop-helix/leucine-zipper (bHLHzip) transcription factor Microphthalmia-associated transcription factor (MITF) (King et al., 1999) that is characteristic for melanocytes. Published data suggest a role for MITF in the commitment, proliferation, and survival of melanocytes before and/or during neural crest cell migration (Opdecamp et al., 1997). Numerous studies also suggest that MITF, in addition to its role in differentiation pathways such as pigmentation, may have an important role in the proliferation and/or survival of developing melanocytes. The retention of MITF expression in the vast majority of human primary melanomas, including nonpigmented tumors, is consistent with this possibility and has also led to the widespread use of MITF as a diagnostic marker in this malignancy (King et al., 1999; Salti et al., 2000; Chang and Folpe, 2001; Miettinen et al., 2001). The Wnt signaling pathway and beta-catenin are significant regulators of melanoma cell growth, with MITF as a critical downstream target.

Importantly, disruption of the canonical Wnt pathway abrogates growth of melanoma cells, and constitutive overexpression of MITF rescues the growth suppression.

The invention disclosed herein arises from a search for MITF target genes that influence cell cycle progression, to examine the possibility that MITF contributes to maintenance of the cell cycle machinery while perhaps not directly participating in the mitogenic response. Cell cycle targets of Wnt signaling such as c-Myc, Cyclin D1 (He et al., 1998; Tetsu and McCormick, 1999; Shtutman et al., 1999), and others may more directly mediate beta-catenin's mitogenic effects. In addition, it has been shown that MITF serves as an upstream regulator of a variety of proliferation-related genes such as CDK2, p21 (Cip1) and INK4A. MITF interacts with several transcription factors (TFs), including Rb, TFEB, ITF2, PIAS3 and STAT3, to regulate a network of downstream genes that are related to different aspects of melanocyte and melanoma development.

In addition to the MITF pathway, several other signaling pathways have been reported to be associated with melanoma cells, including NOTCH, interferon, nuclear hormone receptor and immune modulatory pathways. Some differentially expressed genes reside on chromosomal regions displaying common loss or gain in melanomas or are known to be regulated by CpG promoter methylation. Several lines of data also indicate that transcription cofactors are differentially expressed in melanomas compared to melanocytes. Goldberg et al. (2003) reported that the tumor suppressor genes TXNIP and KISS1, which are down-regulated in metastatic melanomas, are controlled by the transcriptional factor DRIP130/CRSP3. DRIP130/CRSP3 is located on chromosome 6 in a region that is frequently deleted in melanomas.

Transcriptional control

Precise temporal and spatial regulation of the transcription of protein-encoding genes by RNA polymerase II (pol II) is vital to the execution of complex gene expression programs in response to growth, developmental and homeostatic signals. The molecular circuitry that enables coordinated gene expression is largely based on DNA-binding transcription factors (TFs) that bring regulatory information to the target genes. As a rule, DNA-binding TFs do not interact directly with pol II and other basal transcriptional complex components. Factors called co-regulators, including co-activators, co-repressors and a mediator complex, have emerged as central players in the process of transcription. These co-regulators mediate between DNA-binding TFs and the pol II complex to control transcriptional activity of specific genes. Although it has been realized that co-regulators are universally required for the expression of almost all genes, the full implications of a requirement for a multi-subunit co-regulator complex are not yet readily apparent. By inserting itself between the DNA-binding TFs and the basal transcriptional machinery, the mediator complex probably affords additional opportunities to control the diverse regulatory inputs received both from the DNA-binding factors and, most likely, from other signals and to present an appropriately calibrated output to the pol II machinery. In its capacity as a processor of diverse signals in the form of activators and repressors that impinge on it, and its location at the interface of pol II and general transcription factors (GTFs), the mediator represents a final check-point before pol II transcription actually commences. The central role of co-regulator complexes in transcriptional control makes them an attractive drug target. Interference at this point of the transcription machinery could enable researchers and clinicians to control or correct expression of a large number of genes.
The transcriptional complex, which contains 70-80 subunits, has a different composition in different cell types and on different promoters. This cell-specific variability of the transcriptional complex assures specificity of potential treatments that target the transcriptional machinery.

There remains a need for molecules useful in the treatment of cancer. The invention disclosed herein meets this need by providing isoforms of transcription factors and molecules that specifically target the transcription complexes found in cancer.

SUMMARY OF THE INVENTION

The present invention identifies cancer-specific transcriptional complexes (CSTCs) that contain isoforms of individual cofactors in melanoma cells. The melanoma-specific isoform-related transcriptional complexes (TFCs) have altered function compared to wild type TFCs and are part of the molecular machinery that is responsible for malignant transformation. Therefore, melanoma-specific TFCs represent attractive drug targets for treatment of melanoma. In addition, these specific TFCs can be used as diagnostic and prognostic biomarkers. Since individual melanomas express different sets of cofactors and TFCs, the efficacy of many current and novel drugs likely depends on the composition of TFCs. Modified TFCs provide tools for theranostics, i.e., to select patients who will have a favorable response to specific treatments. Moreover, the cancer-specific isoforms of transcriptional co-regulators described herein are expressed in a variety of other cancers, extending the usefulness of the disclosed molecules and methods beyond melanoma.

The invention provides molecules that target cancer-specific transcription complexes (CSTCs), compositions and kits comprising CSTC-targeting molecules, and methods of using CSTC-targeting molecules for the treatment and detection of cancer. In one embodiment, the invention provides an expression vector comprising a nucleic acid molecule that encodes a CSTC-targeting molecule operably linked to an expression control sequence. In another embodiment, the invention provides an oligonucleotide that encodes a CSTC-targeting molecule. The nucleic acid molecule may encode the CSTC-targeting molecule in a sense or anti-sense orientation, depending on the intended use. Also provided are host cells containing such expression vectors, which can be used for the production of CSTC-targeting molecules. In some embodiments, the nucleic acid molecule is labeled with a detectable marker, or provided in a composition with a pharmaceutically acceptable carrier.

The invention additionally provides CSTC-targeting peptides and small molecules, including peptides that target transcription complexes modified by cancer-specific isoforms of transcriptional co-regulators. More specifically, the CSTC-targeting molecules of the invention include molecules that modulate the activity of a cancer-specific mediator complex, containing MED24/TRAP100 and isoforms thereof, and a cancer-specific chromatin modifying complex, containing BAF57 and isoforms thereof. The CSTC-targeting molecule may be provided in a variety of forms, as appropriate for a particular use, including, for example, in a soluble form, immobilized on a substrate, or in combination with a pharmaceutically acceptable carrier. In some embodiments, the CSTC-targeting molecule is labeled with a detectable marker, or provided in a composition with a pharmaceutically acceptable carrier.

The methods provided by the invention include a method for inhibiting proliferation of cancer cells comprising contacting a cancer cell with a CSTC-targeting molecule of the invention. Typically, the molecule comprises a peptide, oligonucleotide (e.g., siRNA) or small molecule that modulates the activity of a cancer-specific mediator complex containing MED24/TRAP100 and its isoforms, and a cancer-specific chromatin modifying complex containing BAF57 and its isoforms. In one embodiment, the peptide comprises the amino acid sequence PQMQQNVFQYPGAGMVPQGEANF (SEQ ID NO: 1) or NDRLSDGDSKYSQTSHKLVQLL (SEQ ID NO: 2), which interfere with the function of cancer-specific isoforms of TRAP100 and BAF57, respectively. In a typical embodiment, the peptide further comprises additional sequence selected to facilitate delivery into cells and into nuclei. For example, a cell penetrating peptide (CPP) can be added, such as the following amino acid sequence: RRRRRRR (SEQ ID NO: 3). An example of a peptide that facilitates nuclear delivery is the nucleus localizing signal (NLS) having the amino acid sequence PKKRKV (SEQ ID NO: 4). A peptide of the invention is exemplified by the peptide having the amino acid sequence of PKKRKVRRRRRRRPQMQQNVFQYPGAGMVPQGEANF (TRAP100 P05; SEQ ID NO: 5) or PKKRKVRRRRRRRNDRLSDGDSKYSQTSHKLVQLL (BAF57 P12; SEQ ID NO: 6).

Other methods provided include a method for treating cancer in a subject by administering to the subject a CSTC-targeting molecule of the invention, a method of inhibiting tumor growth, a method for detecting cancer, and a method for inducing apoptosis. The method for inhibiting tumor growth and the method for inducing apoptosis each comprise contacting a tumor or cancer cell with a CSTC-targeting molecule. The method for detecting cancer comprises contacting a tissue specimen with a detectable molecule that specifically binds a CSTC and detecting binding of the detectable molecule. Binding of the detectable molecule is indicative of cancer. Examples of a detectable molecule include a peptide, antibody or other molecule that specifically binds to a CSTC. Typically, the cancer is melanoma.

DETAILED DESCRIPTION OF THE INVENTION

The present invention is based on the discovery of cancer-specific transcription complexes (CSTCs) that contain isoforms of transcriptional co-regulators specific to human cancers. These molecules provide novel targets for treatment and detection of cancer. Moreover, the data described herein show that molecules directed against the CSTC of the invention are effective in inhibiting proliferation of cancer cells, inducing apoptosis and inhibiting tumor growth. This invention thus provides CSTC-targeting molecules as diagnostic and therapeutic agents for the detection, monitoring and treatment of various cancers.

Transcriptional complexes as novel promising drug targets

Transcriptional regulators determine regulatory networks that control gene-specific transcription. The misregulation of these networks is correlated with a growing number of human diseases that are characterized by altered gene expression patterns. This has spurred intense efforts toward the development of artificial transcriptional regulators and/or molecules that modify TFCs to correct and restore "normal" expression of affected genes. Numerous research groups and companies are focusing on development of treatment strategies that target signaling systems, mostly kinases and phosphatases, and cell surface molecules that control gene expression and regulate cell division and differentiation. All potential treatments that target signaling and cell surface molecules have one critical problem: cell type specificity. To be effective with minimal side effects, treatments have to affect only diseased cells. Signaling systems and surface molecules are expressed and function in a wide variety of cell populations, which makes achieving localized/restricted effects extremely difficult.

It is well known that transcriptional control of individual genes is cell type specific and that different transcription factor complexes are responsible for this specificity. We propose to use the cell type specificity of TFCs to control expression of proteins that are critical for cancer development. Achieving this goal will allow us to manipulate growth and apoptosis of cancer cells. For a long time, TFs have been considered to be difficult targets for effective drug development. Recently, numerous reports have shown that small molecules can be developed that interact with specific TFs and control activity of specific TFCs.

Peptide drugs - targeting transcription complexes

The ultimate action of TFs on target genes, after site-specific DNA binding, is to enhance the recruitment and/or function of the general transcription machinery (RNA polymerase II and general transcription factors TFII-A, -B, -D, -E, -F, and -H; Roeder, 1996) on cognate core promoter elements. Recent studies have implicated a large multisubunit coactivator complex, a mediator, as the main pathway for direct communication between DNA-binding TFs and the general transcription machinery (reviewed in Malik and Roeder, 2000). A large number of protein/protein interactions determine the specificity and function of the mediator complex. Peptides that represent interaction surfaces of different transcription factors have been designed and used to manipulate expression of target genes (Kalinichenko et al., 2004, Chinmay et al., 2005, Gail et al., 2005) and control disease. Prediction of the structures of multimolecular complexes has largely not been addressed, probably due to the magnitude of the combinatorial complexity of the problem. Docking applications have traditionally been used to predict pairwise interactions between molecules. Several algorithms that extend the application of docking to multimolecular assemblies have been developed. We apply these algorithms to predict quaternary structures of both oligomers and multi-protein complexes. These algorithms have predicted well a near-native arrangement of the subunits of mediator complexes. We have used these computational tools to design a small library of peptides that interact with a cancer-specific mediator complex and a cancer-specific chromatin modifying complex containing cancer-specific isoforms of MED24/TRAP100 and BAF57, respectively. Screening of these libraries has identified peptides that affect growth and apoptosis of melanoma cells.

Another critical issue is delivery of therapeutic peptides to the cell nucleus, where transcription factor complexes are localized and where they perform their function. Cell membranes act as protective walls to exclude peptides that are not actively imported by living cells. In order to overcome this barrier for effective delivery of membrane-impermeable peptides, several chemical and physical methods have been developed, including electroporation and cationic lipids/liposomes. These methods have been shown to be effective for delivering hydrophobic macromolecules. The drawbacks of these harsh methods are, first, the unwanted cellular effects exerted by them, and, second, their limitation to in vitro applications. The last decade's discovery of cell-penetrating peptides (CPP) translocating themselves across cell membranes of various cell lines, along with a cargo 100-fold their own size, via a seemingly energy-independent process, opens up the possibility for efficient delivery of proteins, peptides and small molecules into cells both in vitro and in vivo. The only consistently found feature present in all CPPs is the high content of basic amino acids, resulting in a positive net charge. Rothbard et al. (2000) showed that cyclosporin A was efficiently delivered into dermal T lymphocytes and inhibited inflammation by linking to a hepta-arginine segment, suggesting that positive charge is the required feature for cellular translocation. CPPs possess an appealing set of desirable features for cellular targeting, such as effective delivery in vivo, targeting of the nucleus, applicability to all cell types, no apparent size constraint of cargo and seemingly no immunogenic, antigenic or inflammatory properties.
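The observation that CPPs are rich in basic amino acids and carry a positive net charge can be illustrated with a rough residue count. The following is a minimal sketch, not a method taken from this application: the function name `approx_net_charge` is hypothetical, and the calculation assumes that only lysine and arginine contribute positive charge and only aspartate and glutamate contribute negative charge near neutral pH (histidine and the peptide termini are ignored).

```python
# Simplifying assumption: near neutral pH, K and R count as +1,
# D and E count as -1; histidine and the termini are ignored.
BASIC = set("KR")
ACIDIC = set("DE")


def approx_net_charge(sequence: str) -> int:
    """Return (#K + #R) - (#D + #E) for a one-letter amino acid sequence."""
    seq = sequence.upper()
    positive = sum(1 for aa in seq if aa in BASIC)
    negative = sum(1 for aa in seq if aa in ACIDIC)
    return positive - negative


# A hepta-arginine segment such as the one used by Rothbard et al. (2000).
hepta_arginine = "RRRRRRR"
print(hepta_arginine, approx_net_charge(hepta_arginine))
```

Under these assumptions a hepta-arginine segment scores one positive charge per residue. A more careful estimate would use residue pKa values at a stated pH.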

As delivery vectors, cell-penetrating peptides definitely have proven their value. Their ability to effectively deliver hydrophobic macromolecules into practically all types of cells in vitro, as well as in vivo, without marked levels of cytotoxicity, is impressive.

Combining CPP and TFC interfering peptides opens a new and more effective approach to the targeting of transcriptional complexes with therapeutic peptides.

Cancer and transcriptional control

Cancer is a disease of enormous complexity. To date, thousands of genes representing virtually every sub-group of genes have been implicated in cancer. Currently, cancer is thought to develop from proliferating stem or progenitor cells with either mutated genes or rearranged chromosomes. As a result of these genetic alterations, tumor cells also possess an altered gene and protein expression compared with non-malignant cells. Whole-genome analysis of gene expression clearly shows specific differences between normal and cancerous cells as well as between cancer types. This suggests that regulatory networks determining the expression of specific genes are different in malignant and non-malignant cells.

Cancer patients have a highly variable clinical course and outcome. Intrinsic genetic heterogeneity of the primary tumor has been suggested to play a role in this variability and may explain it in part (Chang et al., 2003). Pathological and clinical factors are insufficient to capture the complex cascade of events that drive the clinical behavior of tumors. Extensive analyses of gene expression patterns of a variety of tumors have resulted in an understanding that histologically similar tumors have different gene expression patterns. Oligonucleotide and cDNA microarray techniques have identified molecular subgroups of specific types of cancer (Perou et al., 2000, Hedenfalk et al., 2001, West et al., 2001, Zajchowski et al., 2001).

Molecular profiling of tumors has also been used to predict survival of patients and to select patients for adjuvant therapy (van't Veer et al., 2002, van de Vijver et al., 2002).

Cancer specific TFCs - novel drug targets with high specificity

Well-known characteristics of cancer cells are mutations in a variety of regulatory molecules including transcription factors, misexpression of transcription factors, expression of mRNA splice variants encoding specific isoforms of proteins and presence of posttranslational modifications that are not present in normal cells. Mutations and expression of fusion proteins are described in almost every single type of cancer (Leroy H, Roumier C, Huyghe P, Biggio V, Fenaux P, Preudhomme C, CEBPA point mutations in hematological malignancies. Leukemia. 2005 Mar;19(3):329-34; Xia and Barr, Chromosome translocations in sarcomas and the emergence of oncogenic transcription factors. Eur J Cancer. 2005 Nov;41(16):2513-27). A large number of papers report identification of cancer-specific or enriched mRNA alternative splice variants. For example, a genome-wide computational screening of 11,014 genes using 3,471,822 human expressed sequence tag (EST) sequences identified 26,258 alternatively spliced transcripts/mRNAs, of which 845 were significantly associated with cancer (Wang et al., 2003). Several of the gene-specific splice variants have been shown to have a prognostic value. Patients with a high expression of the alternative splice variant of helix-loop-helix transcription factor ARNT have a worse relapse-free and overall survival than patients with a low expression (Qin et al., 2001). As a rule, the expression of cancer-specific or enriched alternatively spliced mRNAs is not related to mutations in splice donor or acceptor sites but is due to changes in the expression of splicing factors.

Our in silico analysis using a variety of gene expression and EST databases has revealed a large number of alternative splice variants of transcriptional coactivators, including the mediator complex, that have cell type- and disease-specific expression. Not all of these splice variants result in protein isoforms with altered function; some instead represent cryptic splicing that leads to degradation of mRNAs. However, a number of splice variants become translated into functional proteins that will become part of cancer-specific TFCs. These changed TFCs may contribute to the development of cancer. We have generated peptides that specifically affect MED24/TRAP100 and BAF57 isoform-containing TFCs and block proliferation and induce apoptosis of melanoma cells.

Therapeutic Approach

Our therapeutic approach is based on identification of cancer-specific transcription factor complexes (TFCs) that contain components that are mutated, altered by posttranslational modifications and/or alternative splicing, and/or modified by a genomic rearrangement. These cancer-specific TFCs have a structure and function that differ from the structure and function of TFCs in normal, non-cancerous cells.

As an example of our approach, we have specifically identified a number of novel isoforms of transcriptional co-regulators that are components of cancer-specific TFCs, including but not limited to the mediator complex and the chromatin remodeling complex. We have focused on two of these altered complexes:

1. Mediator complex that contains cancer specific isoform of MED24/TRAP100.
2. Chromatin modifying complex that contains cancer specific isoform of BAF57.

Using different modeling tools and the current understanding of the composition, structure and function of mediator and chromatin remodeling complexes, we identified potential interactions that are unique to complexes that contain cancer-specific isoforms of MED24 and BAF57 and identified potential therapeutic peptides. These peptides interact with the MED and chromatin remodeling complexes and alter the function of the transcriptional machinery, which results in apoptosis and growth arrest of melanoma cells.

MED24 isoform containing complex

The mediator complex consists of approximately 30 proteins that have different functions and participate in different signaling pathways to respond to a variety of regulatory signals. MED24 is a part of a MED complex "tail" subunit that is present in specific MED complexes. MED24 co-precipitates with MED16, MED23 and MED25, which are other subunits of the "tail" module. Incorporation of the MED24 isoform into the "tail" subunit modifies interactions of subunit components and opens an opportunity to design interfering molecules that target the MED24 isoform-specific complex. Therapeutic peptide TRAP100 P05 likely interacts with a "tail" complex structure that is composed of MED16, MED23, MED24 and MED25.

Based on these potential interactions, we have designed a small library of peptides that interact with a cancer-specific mediator complex "tail" subunit containing a cancer-specific isoform of MED24/TRAP100. Screening of this library has identified a peptide that affects growth and apoptosis of melanoma cells. This peptide does not contain sequence from the MED24 isoform and was found to affect transcription via binding to the altered structure of the "tail" subunit of the MED complex.

Chromatin modifying complex

The chromatin modifying complex consists of a large number of SWI/SNF/SMARC/BAF proteins, histone acetylases (HAT) and histone deacetylases (HDAC). BAF57 is a specific member of the BAF complex and interacts directly with BAF155, BAF170, steroid hormone receptor co-activators and several HDAC proteins. The melanoma-specific isoform of BAF57 modifies the structure and function of a chromatin modifying complex. We have used modeling tools to predict changes in the structure and interactions of a chromatin modifying complex containing the isoform of BAF57. Based on this information, we have designed a peptide library, and screening of this library resulted in the identification of peptides that affect growth and apoptosis of melanoma cells. Specifically, the therapeutic peptide which we denoted as BAF57 P12 likely interacts with a chromatin modifying complex subunit that contains BAF155, BAF170 and one or more different HDAC molecules.

Definitions

All scientific and technical terms used in this application have meanings commonly used in the art unless otherwise specified. As used in this application, the following words or phrases have the meanings specified. As used herein, "peptide" or "polypeptide" includes fragments of proteins, and peptides, whether isolated from natural sources, produced by recombinant techniques or chemically synthesized. Polypeptides (and peptides) of the invention typically comprise at least about 6 amino acids.

As used herein, "CSTC-targeting molecule" includes CSTC-targeting peptides, polynucleotides encoding CSTC-targeting peptides, polynucleotides complementary to those encoding CSTC-targeting peptides, antibodies that specifically recognize and bind CSTCs, and other small molecules exhibiting the same targeting activity.

A "small molecule" means a molecule having a molecular weight of less than 2000 daltons, in some embodiments less than 1000 daltons, and in still other embodiments 500 daltons or less. Such molecules include, for example, heterocyclic compounds, carboxylic compounds, sterols, amino acids, lipids, and nucleic acids.

As used herein, "CSTC-targeting" refers to the specific binding of a CSTC-targeting molecule to a cancer-specific transcription complex, wherein the specificity is such that the CSTC-targeting molecule essentially does not bind normal or native transcription complex.

As used herein, "vector" means a construct, which is capable of delivering, and preferably expressing, one or more gene(s) or sequence(s) of interest in a host cell. Examples of vectors include, but are not limited to, viral vectors, naked DNA or RNA expression vectors, plasmid, cosmid or phage vectors, DNA or RNA expression vectors associated with cationic condensing agents, DNA or RNA expression vectors encapsulated in liposomes, and certain eukaryotic cells, such as producer cells.

As used herein, "expression control sequence" means a nucleic acid sequence that directs transcription of a nucleic acid. An expression control sequence can be a promoter, such as a constitutive or an inducible promoter, or an enhancer. The expression control sequence is operably linked to the nucleic acid sequence to be transcribed. The term "nucleic acid" or "polynucleotide" refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses known analogs of natural nucleotides that hybridize to nucleic acids in a manner similar to naturally-occurring nucleotides.

As used herein, "tumor protein" is a protein that is expressed by tumor cells. A tumor protein is tumor specific if it is not expressed in non-tumor cells.

As used herein, "pharmaceutically acceptable carrier" includes any material which, when combined with an active ingredient, allows the ingredient to retain biological activity and is non-reactive with the subject's immune system. Examples include, but are not limited to, any of the standard pharmaceutical carriers such as a phosphate buffered saline solution, water, emulsions such as oil/water emulsion, and various types of wetting agents. Preferred diluents for aerosol or parenteral administration are phosphate buffered saline or normal (0.9%) saline.

Compositions comprising such carriers are formulated by well known conventional methods (see, for example, Remington's Pharmaceutical Sciences, 18th edition, A. Gennaro, ed., Mack Publishing Co., Easton, PA, 1990).

As used herein, "a" or "an" means at least one, unless clearly indicated otherwise.

CSTC-Targeting Peptides

CSTC-targeting peptides and polypeptides as described herein may be of any length. Additional sequences derived from the native protein and/or heterologous sequences may be present, and such sequences retain the ability to modulate the transcription complex. Preferred peptides comprise the amino acid sequence PQMQQNVFQYPGAGMVPQGEANF (SEQ ID NO: 1) or NDRLSDGDSKYSQTSHKLVQLL (SEQ ID NO: 2), peptides that interfere with the function of CSTCs containing cancer-specific isoforms of TRAP100 and BAF57, respectively. In a typical embodiment, the peptide further comprises additional sequence selected to facilitate delivery into cells and into nuclei. For example, a cell penetrating peptide (CPP) can be added, such as the following amino acid sequence: RRRRRRR (SEQ ID NO: 3). Those skilled in the art are aware of other CPPs that can be suitable for use with the invention, such as those described in Ülo Langel, ed., Cell-Penetrating Peptides: Processes and Applications, Culinary & Hospitality Industry Publications Services (CHIPS), Weimar, Texas, 2002. An example of a peptide that facilitates nuclear delivery is a nuclear localizing signal (NLS). Typically, this signal consists of a few short sequences of positively charged lysines or arginines, such as PPKKRKV (SEQ ID NO: 9). In one embodiment, the NLS has the amino acid sequence PKKRKV (SEQ ID NO: 4). A peptide of the invention is exemplified by the peptide having the amino acid sequence of PKKRKVRRRRRRRPQMQQNVFQYPGAGMVPQGEANF (TRAP100 P05; SEQ ID NO: 5) or PKKRKVRRRRRRRNDRLSDGDSKYSQTSHKLVQLL (BAF57 P12; SEQ ID NO: 6).
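The exemplified peptides read as an NLS followed by a CPP followed by the targeting sequence. A minimal sketch of that assembly is given here for illustration only. The helper name `build_construct` and the dictionary layout are hypothetical, and the segment order is inferred from the exemplified sequences rather than stated as a requirement.

```python
# Segment sequences as recited in the text (one-letter amino acid codes).
NLS = "PKKRKV"    # SEQ ID NO: 4, nuclear localizing signal
CPP = "RRRRRRR"   # SEQ ID NO: 3, cell penetrating peptide

TARGETING = {
    "TRAP100 P05": "PQMQQNVFQYPGAGMVPQGEANF",  # SEQ ID NO: 1
    "BAF57 P12": "NDRLSDGDSKYSQTSHKLVQLL",     # SEQ ID NO: 2
}


def build_construct(targeting: str, nls: str = NLS, cpp: str = CPP) -> str:
    """Concatenate NLS + CPP + targeting sequence (order is an assumption
    inferred from the exemplified peptides)."""
    return nls + cpp + targeting


constructs = {name: build_construct(seq) for name, seq in TARGETING.items()}

for name, seq in constructs.items():
    # Print each assembled construct with its length for manual comparison
    # against the recited full-length sequences.
    print(f"{name}: {len(seq)} residues: {seq}")
```

The printed strings can be compared by eye with the full-length sequences recited for TRAP100 P05 and BAF57 P12. Swapping in a different CPP or NLS only requires passing other values for `cpp` or `nls`.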

Those skilled in the art will appreciate that certain variants thereof will be useful in the treatment and detection of cancer. A peptide "variant," as used herein, is a peptide that differs from a native CSTC-targeting peptide in one or more substitutions, deletions, additions and/or insertions, such that the transcription complex targeting activity of the peptide is not substantially diminished. In other words, the ability of a variant to bind the transcription complex may be enhanced or unchanged, relative to the native peptide, or may be diminished by less than 50%, and preferably less than 20%, relative to the native peptide. Such variants may generally be identified by modifying one of the above peptide sequences and evaluating the binding of the modified peptide with the targeted transcription complex as described herein. Peptide variants preferably exhibit at least about 70%, more preferably at least about 90% and most preferably at least about 95% identity (determined as described herein) to the identified peptides.

Preferably, a variant contains conservative substitutions. A "conservative substitution" is one in which an amino acid is substituted for another amino acid that has similar properties, such that one skilled in the art of peptide chemistry would expect the secondary structure and hydropathic nature of the peptide to be substantially unchanged. Amino acid substitutions may generally be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity and/or the amphipathic nature of the residues. For example, negatively charged amino acids include aspartic acid and glutamic acid; positively charged amino acids include lysine and arginine; and amino acids with uncharged polar head groups having similar hydrophilicity values include leucine, isoleucine and valine; glycine and alanine; asparagine and glutamine; and serine, threonine, phenylalanine and tyrosine. Other groups of amino acids that may represent conservative changes include: (1) ala, pro, gly, glu, asp, gln, asn, ser, thr; (2) cys, ser, tyr, thr; (3) val, ile, leu, met, ala, phe; (4) lys, arg, his; and (5) phe, tyr, trp, his. A variant may also, or alternatively, contain nonconservative changes. In a preferred embodiment, variant peptides differ from a native sequence by substitution, deletion or addition of five amino acids or fewer.
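As an illustration only, the five numbered groups above can be encoded as sets and used to check whether a proposed substitution falls within a single group. The helper name `is_conservative` is a hypothetical choice, and group membership is only one of the criteria the text mentions.

```python
# Minimal sketch: the five groups of residues listed in the text, written
# with three-letter codes. A substitution is treated as conservative here
# if both residues appear together in at least one group.
GROUPS = [
    {"ala", "pro", "gly", "glu", "asp", "gln", "asn", "ser", "thr"},  # (1)
    {"cys", "ser", "tyr", "thr"},                                     # (2)
    {"val", "ile", "leu", "met", "ala", "phe"},                       # (3)
    {"lys", "arg", "his"},                                            # (4)
    {"phe", "tyr", "trp", "his"},                                     # (5)
]


def is_conservative(native: str, replacement: str) -> bool:
    """True if the two distinct residues share at least one listed group."""
    a, b = native.lower(), replacement.lower()
    return a != b and any(a in group and b in group for group in GROUPS)


print(is_conservative("Lys", "Arg"))  # both in group (4)
print(is_conservative("Lys", "Asp"))  # no shared group
```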

Specifically, amino acid residues of the peptides of the invention can be varied as follows:

Variant Residues for MED24 P05

PQ(N)MQ(N)Q(N)N(Q)VFQ(N)YPG(A)A(G)G(A)MV(L)PQ(N)GE(D)A(G)N(Q)F (SEQ ID NO: 7);

Variant Residues for BAF57 P12

ND(E)R(K)L(V)SD(E)GD(E)SK(R)YSQ(N)TSHK(R)L(V)V(L)QL(V)L(V) (SEQ ID NO: 8);

wherein each indicated native residue that is followed by an alternative in parentheses can optionally be substituted with that alternative residue. One or more of the indicated alternatives can be employed in a given variant peptide. Such variant peptides are referred to herein as "conservatively modified variants".
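The parenthesis notation of SEQ ID NOs: 7 and 8 can be read mechanically. The following Python sketch, offered under the assumption that every position carries at most one single-letter alternative, recovers the native sequence and lazily enumerates the conservatively modified variants. The function names are hypothetical.

```python
import re
from itertools import product

# Notation strings as given for SEQ ID NOs: 7 and 8.
MED24_P05 = "PQ(N)MQ(N)Q(N)N(Q)VFQ(N)YPG(A)A(G)G(A)MV(L)PQ(N)GE(D)A(G)N(Q)F"
BAF57_P12 = "ND(E)R(K)L(V)SD(E)GD(E)SK(R)YSQ(N)TSHK(R)L(V)V(L)QL(V)L(V)"

TOKEN = re.compile(r"([A-Z])(?:\(([A-Z])\))?")
WHOLE = re.compile(r"(?:[A-Z](?:\([A-Z]\))?)+")


def parse(notation: str):
    """Return a list of (native, alternative-or-None) pairs."""
    if not WHOLE.fullmatch(notation):
        raise ValueError("unexpected character in variant notation")
    return [(m.group(1), m.group(2)) for m in TOKEN.finditer(notation)]


def native(notation: str) -> str:
    """Native sequence, ignoring all parenthesized alternatives."""
    return "".join(res for res, _ in parse(notation))


def variants(notation: str):
    """Yield every combination of native and alternative residues."""
    choices = [(res,) if alt is None else (res, alt)
               for res, alt in parse(notation)]
    for combo in product(*choices):
        yield "".join(combo)


variable = sum(alt is not None for _, alt in parse(MED24_P05))
print(native(MED24_P05), variable, 2 ** variable)
```

Because the number of combinations doubles with each variable position, `variants` is written as a generator so that callers can stop early or filter, for example by the preferred limit of five or fewer changes.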

Recombinant peptides encoded by DNA sequences as described herein may be readily prepared from the DNA sequences using any of a variety of expression vectors known to those of ordinary skill in the art. Expression may be achieved in any appropriate host cell that has been transformed or transfected with an expression vector containing a DNA molecule that encodes a recombinant peptide. Suitable host cells include prokaryotes, yeast and higher eukaryotic cells. Preferably, the host cells employed are E. coli, yeast, insect cells or a mammalian cell line such as COS or CHO. Supernatants from suitable host/vector systems that secrete recombinant protein or peptide into culture media may be first concentrated using a commercially available filter. Following concentration, the concentrate may be applied to a suitable purification matrix such as an affinity matrix or an ion exchange resin. Finally, one or more reverse phase HPLC steps can be employed to further purify a recombinant peptide.

Portions and other variants having fewer than about 100 amino acids, and generally fewer than about 50 amino acids, may also be generated by synthetic means, using techniques well known to those of ordinary skill in the art. For example, such peptides may be synthesized using any of the commercially available solid-phase techniques, such as the Merrifield solid-phase synthesis method, where amino acids are sequentially added to a growing amino acid chain. See Merrifield, J. Am. Chem. Soc. 85:2149-2154, 1963. Equipment for automated synthesis of peptides is commercially available from suppliers such as Perkin Elmer/Applied BioSystems Division (Foster City, CA), and may be operated according to the manufacturer's instructions.

Peptides can be synthesized on a Perkin Elmer/Applied Biosystems Division 430A peptide synthesizer using FMOC chemistry with HBTU (O-Benzotriazole-N,N,N',N'-tetramethyluronium hexafluorophosphate) activation. A Gly-Cys-Gly sequence may be attached to the amino terminus of the peptide to provide a method of conjugation, binding to an immobilized surface, or labeling of the peptide. Cleavage of the peptides from the solid support may be carried out using the following cleavage mixture: trifluoroacetic acid:ethanedithiol:thioanisole:water:phenol (40:1:2:2:3). After cleaving for 2 hours, the peptides may be precipitated in cold methyl-t-butyl-ether. The peptide pellets may then be dissolved in water containing 0.1% trifluoroacetic acid (TFA) and lyophilized prior to purification by C18 reverse phase HPLC. A gradient of 0%-60% acetonitrile (containing 0.1% TFA) in water may be used to elute the peptides. Following lyophilization of the pure fractions, the peptides may be characterized using electrospray or other types of mass spectrometry and by amino acid analysis.

In general, peptides (including fusion proteins) and polynucleotides as described herein are isolated. An "isolated" peptide or polynucleotide is one that is removed from its original environment. For example, a naturally occurring protein is isolated if it is separated from some or all of the coexisting materials in the natural system. Preferably, such peptides are at least about 90% pure, more preferably at least about 95% pure and most preferably at least about 99% pure. A polynucleotide is considered to be isolated if, for example, it is cloned into a vector that is not a part of the natural environment.

Polynucleotides of the Invention

The invention provides polynucleotides that encode one or more CSTC-targeting peptides, as described above. Preferred polynucleotides comprise at least 15 consecutive nucleotides, preferably at least 30 consecutive nucleotides and more preferably 35 consecutive nucleotides, that encode a CSTC-targeting peptide. Polynucleotides that are fully complementary to any such sequences are also encompassed by the present invention. Polynucleotides may be single-stranded (coding or antisense) or double-stranded, and may be DNA (genomic, cDNA or synthetic) or RNA molecules. RNA molecules include HnRNA molecules, which contain introns and correspond to a DNA molecule in a one-to-one manner, and mRNA molecules, which do not contain introns. Additional coding or non-coding sequences may, but need not, be present within a polynucleotide of the present invention, and a polynucleotide may, but need not, be linked to other molecules and/or support materials. Portions of such CSTC-targeting polynucleotides can be useful as primers and probes for the amplification and detection of CSTC-targeting molecules.

Polynucleotides may comprise a native sequence (i.e., a sequence that encodes a CSTC-targeting peptide as described above or a portion thereof) or may comprise a variant of such a sequence. Polynucleotide variants contain one or more substitutions, additions, deletions and/or insertions such that the specific CSTC binding of the encoded peptide is not diminished, relative to a native peptide. Variants preferably exhibit at least about 70% identity, more preferably at least about 80% identity and most preferably at least about 90% identity to a polynucleotide sequence that encodes a native CSTC-targeting peptide or a portion thereof.

Two polynucleotide or peptide sequences are said to be "identical" if the sequence of nucleotides or amino acids in the two sequences is the same when aligned for maximum correspondence as described below. Comparisons between two sequences are typically performed by comparing the sequences over a comparison window to identify and compare local regions of sequence similarity. A "comparison window" as used herein, refers to a segment of at least about 20 contiguous positions, usually 30 to about 75, or 40 to about 50, in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Optimal alignment of sequences for comparison may be conducted using the Megalign program in the Lasergene suite of bioinformatics software (DNASTAR, Inc., Madison, WI), using default parameters. This program embodies several alignment schemes described in the following references: Dayhoff, M.O. (1978) A model of evolutionary change in proteins - Matrices for detecting distant relationships. In Dayhoff, M.O. (ed.) Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, Washington DC Vol. 5, Suppl. 3, pp. 345-358; Hein J. (1990) Unified Approach to Alignment and Phylogenies pp. 626-645 Methods in Enzymology vol. 183, Academic Press, Inc., San Diego, CA; Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153; Myers, E.W. and Muller W. (1988) CABIOS 4:11-17; Robinson, E.D. (1971) Comb. Theor. 11:105; Saitou, N. and Nei, M. (1987) Mol. Biol. Evol. 4:406-425; Sneath, P.H.A. and Sokal, R.R. (1973) Numerical Taxonomy - the Principles and Practice of Numerical Taxonomy, Freeman Press, San Francisco, CA; Wilbur, W.J. and Lipman, D.J. (1983) Proc. Natl. Acad. Sci. USA 80:726-730.

Preferably, the "percentage of sequence identity" is determined by comparing two optimally aligned sequences over a window of comparison of at least 20 positions, wherein the portion of the polynucleotide or peptide sequence in the comparison window may comprise additions or deletions (i.e. gaps) of 20 percent or less, usually 5 to 15 percent, or 10 to 12 percent, as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the reference sequence (i.e. the window size) and multiplying the result by 100 to yield the percentage of sequence identity.
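The calculation just described can be sketched in a few lines of Python. This sketch assumes the two sequences have already been optimally aligned by an external program (it does not perform the alignment itself), that gaps in the compared sequence are written as "-", and that the function name and the `min_window` parameter are illustrative choices.

```python
def percent_identity(reference: str, candidate: str,
                     gap: str = "-", min_window: int = 20) -> float:
    """Percentage of sequence identity over a pre-aligned comparison window.

    matched positions / total positions in the reference window * 100
    """
    if len(reference) != len(candidate):
        raise ValueError("aligned sequences must have the same length")
    window = len(reference)
    if window < min_window:
        raise ValueError(f"comparison window must span at least {min_window} positions")
    # A gap in the candidate never counts as a match.
    matched = sum(1 for r, c in zip(reference, candidate)
                  if r == c and c != gap)
    return 100.0 * matched / window


native_seq = "PQMQQNVFQYPGAGMVPQGEANF"    # SEQ ID NO: 1
one_change = "PNMQQNVFQYPGAGMVPQGEANF"    # hypothetical variant, Q2 -> N
print(round(percent_identity(native_seq, one_change), 2))
```

With 22 of 23 positions matching, the hypothetical single-substitution variant scores roughly 95.7% identity, which would satisfy the most preferred 95% threshold stated for peptide variants.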

Variants may also, or alternatively, be substantially homologous to a native gene, or a portion or complement thereof. Such polynucleotide variants are capable of hybridizing under moderately stringent conditions to a naturally occurring DNA sequence encoding a native protein (or a complementary sequence).

Suitable "moderately stringent conditions" include prewashing in a solution of 5X SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0); hybridizing at 50°C-65°C, 5X SSC, overnight; followed by washing twice at 65°C for 20 minutes with each of 2X, 0.5X and 0.2X SSC containing 0.1% SDS. As used herein, "highly stringent conditions" or "high stringency conditions" are those that: (1) employ low ionic strength and high temperature for washing, for example 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate at 50°C; (2) employ during hybridization a denaturing agent, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride, 75 mM sodium citrate at 42°C; or (3) employ 50% formamide, 5X SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5X Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate at 42°C, with washes at 42°C in 0.2X SSC (sodium chloride/sodium citrate) and 50% formamide at 55°C, followed by a high-stringency wash consisting of 0.1X SSC containing EDTA at 55°C. The skilled artisan will recognize how to adjust the temperature, ionic strength, etc. as necessary to accommodate factors such as probe length and the like.

It will be appreciated by those of ordinary skill in the art that, as a result of the degeneracy of the genetic code, there are many nucleotide sequences that encode a peptide as described herein. Some of these polynucleotides bear minimal homology to the nucleotide sequence of any native gene. Nonetheless, polynucleotides that vary due to differences in codon usage are specifically contemplated by the present invention. Further, alleles of the genes comprising the polynucleotide sequences provided herein are within the scope of the present invention. Alleles are endogenous genes that are altered as a result of one or more mutations, such as deletions, additions and/or substitutions of nucleotides. The resulting mRNA and protein may, but need not, have an altered structure or function. Alleles may be identified using standard techniques (such as hybridization, amplification and/or database sequence comparison).
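To give a sense of scale for the degeneracy mentioned above, the number of distinct coding sequences for a peptide is the product of the number of codons available for each residue. The sketch below assumes the standard genetic code and excludes stop codons; the function name is a hypothetical choice.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (61 sense codons in total; stop codons are not counted).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the peptide."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


for name, seq in [("SEQ ID NO: 1", "PQMQQNVFQYPGAGMVPQGEANF"),
                  ("SEQ ID NO: 2", "NDRLSDGDSKYSQTSHKLVQLL")]:
    print(name, count_encodings(seq))
```

Even for peptides of about 22 residues the count is very large, which is why polynucleotides that differ only in codon usage are treated as a class rather than listed individually.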

Polynucleotides may be prepared using any of a variety of techniques known in the art, including, for example, oligonucleotide synthesis. Libraries can be screened with probes designed to identify the gene of interest or the peptide encoded by it. Screening the cDNA or other library with the selected probe may be conducted using standard procedures, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989). The oligonucleotide sequences selected as probes should be sufficiently long and sufficiently unambiguous that false positives are minimized. The oligonucleotide is preferably labeled such that it can be detected upon hybridization to DNA in the library being screened. Methods of labeling are well known in the art, and include the use of radiolabels, such as ³²P-labeled ATP, biotinylation or enzyme labeling. Hybridization conditions, including moderate stringency and high stringency, are provided in Sambrook et al., supra.

Polynucleotide variants may generally be prepared by any method known in the art, including chemical synthesis by, for example, solid phase phosphoramidite chemical synthesis. Modifications in a polynucleotide sequence may also be introduced using standard mutagenesis techniques, such as oligonucleotide-directed site-specific mutagenesis (see Adelman et al., DNA 2:183, 1983). Alternatively, RNA molecules may be generated by in vitro or in vivo transcription of DNA sequences encoding a CSTC-targeting peptide, or portion thereof, provided that the DNA is incorporated into a vector with a suitable RNA polymerase promoter (such as T7 or SP6). Certain portions may be used to prepare an encoded peptide, as described herein. In addition, or alternatively, a portion may be administered to a patient such that the encoded peptide is generated in vivo (e.g., by transfecting antigen-presenting cells, such as dendritic cells, with a cDNA construct encoding a CSTC-targeting peptide, and administering the transfected cells to the patient).

Any polynucleotide may be further modified to increase stability in vivo. Possible modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' ends; the use of phosphorothioate or 2' O-methyl rather than phosphodiester linkages in the backbone; and/or the inclusion of nontraditional bases such as inosine, queuosine and wybutosine, as well as acetyl-, methyl-, thio- and other modified forms of adenine, cytidine, guanine, thymine and uridine.

Nucleotide sequences can be joined to a variety of other nucleotide sequences using established recombinant DNA techniques. For example, a polynucleotide may be cloned into any of a variety of cloning vectors, including plasmids, phagemids, lambda phage derivatives and cosmids. Vectors of particular interest include expression vectors, replication vectors, probe generation vectors and sequencing vectors. In general, a vector will contain an origin of replication functional in at least one organism, convenient restriction endonuclease sites and one or more selectable markers. Other elements will depend upon the desired use, and will be apparent to those of ordinary skill in the art.

Within certain embodiments, polynucleotides may be formulated so as to permit entry into a cell of a mammal, and to permit expression therein. Such formulations are particularly useful for therapeutic purposes, as described below. Those of ordinary skill in the art will appreciate that there are many ways to achieve expression of a polynucleotide in a target cell, and any suitable method may be employed. For example, a polynucleotide may be incorporated into a viral vector such as, but not limited to, adenovirus, adeno-associated virus, retrovirus, or vaccinia or other pox virus (e.g., avian pox virus). Techniques for incorporating DNA into such vectors are well known to those of ordinary skill in the art. A retroviral vector may additionally transfer or incorporate a gene for a selectable marker (to aid in the identification or selection of transduced cells) and/or a targeting moiety, such as a gene that encodes a ligand for a receptor on a specific target cell, to render the vector target specific. Targeting may also be accomplished using an antibody, by methods known to those of ordinary skill in the art. Some embodiments of the peptides of the invention have been described herein with a cell penetrating peptide (CPP) incorporated into the peptide for facilitation of entry into a cell.

Other formulations for therapeutic purposes include colloidal dispersion systems, such as macromolecule complexes, nanocapsules, microspheres, beads, and lipid-based systems including oil-in-water emulsions, micelles, mixed micelles, and liposomes. A preferred colloidal system for use as a delivery vehicle in vitro and in vivo is a liposome (i.e., an artificial membrane vesicle). The preparation and use of such systems is well known in the art.

Antisense and inhibitory nucleic acid molecules

The antisense molecules of the present invention comprise a sequence substantially complementary, or preferably fully complementary, to all or a fragment of a nucleic acid molecule that encodes a CSTC-targeting peptide and/or a cancer-specific isoform of a transcription modulator as described herein. Included are fragments of oligonucleotides within a coding sequence, and inhibitory nucleotides that inhibit the expression of CSTCs and/or cancer-specific isoforms of transcription modulators. Antisense oligonucleotides of DNA or RNA complementary to sequences at the boundary between introns and exons can be employed to prevent the maturation of newly-generated nuclear RNA transcripts of specific genes into mRNA for translation. Antisense RNA, including siRNA, complementary to specific genes can hybridize with the mRNA for that gene and prevent its translation. The antisense molecule can be DNA, RNA, or a derivative or hybrid thereof. Examples of such derivative molecules include, but are not limited to, peptide nucleic acid (PNA) and phosphorothioate-based molecules such as deoxyribonucleic guanidine (DNG) or ribonucleic guanidine (RNG).
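A fully complementary antisense sequence is the reverse complement of the target. The following Python sketch shows that relationship for a short target fragment; the fragment shown is a hypothetical placeholder and not a sequence from this disclosure, and the function name is an illustrative assumption.

```python
# Complement tables: the target may be written as RNA (with U) or DNA (with T).
TO_DNA = str.maketrans("ACGTU", "TGCAA")
TO_RNA = str.maketrans("ACGTU", "UGCAA")


def antisense(target: str, rna: bool = False) -> str:
    """Return the fully complementary antisense strand, written 5' to 3'."""
    target = target.upper()
    if set(target) - set("ACGTU"):
        raise ValueError("target must contain only A, C, G, T or U")
    table = TO_RNA if rna else TO_DNA
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(table)[::-1]


fragment = "AUGGCC"  # hypothetical mRNA fragment, for illustration only
print(antisense(fragment))            # antisense DNA oligonucleotide
print(antisense(fragment, rna=True))  # antisense RNA
```

A "substantially complementary" molecule would tolerate some mismatches relative to this output; the sketch covers only the fully complementary case.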

Antisense RNA can be provided to the cell as "ready-to-use" RNA synthesized in vitro or as an antisense gene stably transfected into cells which will yield antisense RNA upon transcription. Hybridization with mRNA results in degradation of the hybridized molecule by RNase H and/or inhibition of the formation of translation complexes. Both result in a failure to produce the product of the original gene.

Both antisense RNA and DNA molecules and ribozymes of the invention may be prepared by any method known in the art for the synthesis of RNA molecules. These include techniques for chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis. Alternatively, RNA molecules may be generated by in vitro or in vivo transcription of DNA sequences encoding the antisense RNA molecule. Such DNA sequences may be incorporated into a wide variety of vectors with suitable RNA polymerase promoters such as T7 or SP6. Alternatively, antisense cDNA constructs that synthesize antisense RNA constitutively or inducibly can be introduced into cell lines, cells or tissues.

DNA molecules may be modified to increase intracellular stability and half-life. Possible modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' ends of the molecule or the use of phosphorothioate or 2' O-methyl rather than phosphodiester linkages within the backbone of the molecule. Other modifications include the use of chimeric antisense compounds. Chimeric antisense compounds of the invention may be formed as composite structures of two or more oligonucleotides, modified oligonucleotides, oligonucleosides and/or oligonucleotide mimetics. Such compounds have also been referred to in the art as hybrids or gapmers. Representative United States patents that teach the preparation of such hybrid structures include, but are not limited to, U.S. Pat. Nos.: 5,700,922 and 6,277,603.

The antisense compounds used in accordance with this invention may be conveniently and routinely made through the well-known technique of solid phase synthesis. Equipment for such synthesis is sold by several vendors including, for example, Applied Biosystems (Foster City, Calif.). Any other means for such synthesis known in the art may additionally or alternatively be employed. It is well known to use similar techniques to prepare oligonucleotides such as the phosphorothioates and alkylated derivatives.

Antisense compositions of the invention include oligonucleotides formed of homopyrimidines that can recognize local stretches of homopurines in the DNA double helix and bind to them in the major groove to form a triple helix. See: Helene, C. and Toulme, J.J. Specific regulation of gene expression by antisense, sense, and antigene nucleic acids. Biochim. Biophys. Acta, 1049:99-125, 1990. Formation of the triple helix would interrupt the ability of the specific gene to undergo transcription by RNA polymerase. Triple helix formation using myc-specific oligonucleotides has been observed. See: Cooney, M., et al. Science 241:456-459.

Antisense sequences of DNA or RNA can be delivered to cells. Several chemical modifications have been developed to prolong the stability and improve the function of these molecules without interfering with their ability to recognize specific sequences. These include increasing their resistance to degradation by DNases (for example, by use of phosphotriesters, methylphosphonates, phosphorothioates or alpha-anomers), increasing their affinity for binding partners by covalent linkage to various intercalating agents such as psoralens, and increasing uptake by cells by conjugation to various groups including polylysine. These molecules recognize specific sequences encoded in mRNA, and their hybridization prevents translation of and increases the degradation of these messages. Antisense compositions including oligonucleotides, derivatives and analogs thereof, conjugation protocols, and antisense strategies for inhibition of transcription and translation are generally described in: Antisense Research and Applications, Crooke, S. and B. Lebleu, eds. CRC Press, Inc. Boca Raton Fla. 1993; Nucleic Acids in Chemistry and Biology Blackburn, G. and M. J. Gait, eds. IRL Press at Oxford University Press, Inc. New York 1990; and Oligonucleotides and Analogues: A Practical Approach Eckstein, F. ed., IRL Press at Oxford University Press, Inc. New York 1991; which are each hereby incorporated herein by reference including all references cited therein which are hereby incorporated herein by reference.

Pharmaceutical Compositions and Vaccines

The invention provides CSTC-targeting peptides, cancer-specific isoforms of transcription modulators, polynucleotides, T cells and/or antigen presenting cells that are incorporated into pharmaceutical compositions. Pharmaceutical compositions comprise one or more such compounds and, optionally, a physiologically acceptable carrier. Vaccines may comprise one or more such compounds and an adjuvant that serves as a non-specific immune response enhancer. The adjuvant may be any substance that enhances an immune response to an exogenous antigen. Examples of adjuvants include conventional adjuvants, biodegradable microspheres (e.g., polylactic galactide), immunostimulatory oligonucleotides and liposomes (into which the compound is incorporated; see e.g., Fullerton, U.S. Patent No. 4,235,877). Vaccine preparation is generally described in, for example, M.F. Powell and M.J. Newman, eds., "Vaccine Design (the subunit and adjuvant approach)," Plenum Press (NY, 1995). Pharmaceutical compositions and vaccines within the scope of the present invention may also contain other compounds that may be biologically active or inactive. For example, one or more immunogenic portions of other tumor antigens may be present, either incorporated into a fusion polypeptide or as a separate compound, within the composition or vaccine.

A pharmaceutical composition can contain DNA encoding one or more of the peptides as described above, such that the peptide is generated in situ. As noted above, the DNA may be present within any of a variety of delivery systems known to those of ordinary skill in the art, including nucleic acid expression systems, bacteria and viral expression systems. Numerous gene delivery techniques are well known in the art, such as those described by Rolland, Crit. Rev. Therap. Drug Carrier Systems 15:143-198, 1998, and references cited therein. Appropriate nucleic acid expression systems contain the necessary DNA sequences for expression in the patient (such as a suitable promoter and terminating signal). Bacterial delivery systems involve the administration of a bacterium (such as Bacillus Calmette-Guerin) that expresses an immunogenic portion of the polypeptide on its cell surface or secretes such an epitope.

In a preferred embodiment, the DNA may be introduced using a viral expression system (e.g., vaccinia or other pox virus, retrovirus, or adenovirus), which may involve the use of a nonpathogenic (defective), replication competent virus. Suitable systems are disclosed, for example, in Fisher-Hoch et al., Proc. Natl. Acad. Sci. USA 86:317-321, 1989; Flexner et al., Ann. N.Y. Acad. Sci. 569:86-103, 1989; Flexner et al., Vaccine 8:17-21, 1990; U.S. Patent Nos. 4,603,112, 4,769,330, and 5,017,487; WO 89/01973; U.S. Patent No. 4,777,127; GB 2,200,651; EP 0,345,242; WO 91/02805; Berkner, Biotechniques 6:616-627, 1988; Rosenfeld et al., Science 252:431-434, 1991; Kolls et al., Proc. Natl. Acad. Sci. USA 91:215-219, 1994; Kass-Eisler et al., Proc. Natl. Acad. Sci. USA 90:11498-11502, 1993; Guzman et al., Circulation 88:2838-2848, 1993; and Guzman et al., Circ. Res. 73:1202-1207, 1993. Techniques for incorporating DNA into such expression systems are well known to those of ordinary skill in the art. The DNA may also be "naked," as described, for example, in Ulmer et al., Science 259:1745-1749, 1993 and reviewed by Cohen, Science 259:1691-1692, 1993. The uptake of naked DNA may be increased by coating the DNA onto biodegradable beads, which are efficiently transported into the cells.

While any suitable carrier known to those of ordinary skill in the art may be employed in the pharmaceutical compositions of this invention, the type of carrier will vary depending on the mode of administration. Compositions of the present invention may be formulated for any appropriate manner of administration, including for example, topical, oral, nasal, intravenous, intracranial, intraperitoneal, subcutaneous, intradermal or intramuscular administration. For parenteral administration, such as subcutaneous injection, the carrier preferably comprises water, saline, alcohol, a fat, a wax or a buffer. For oral administration, any of the above carriers or a solid carrier, such as mannitol, lactose, starch, magnesium stearate, sodium saccharine, talcum, cellulose, glucose, sucrose, and magnesium carbonate, may be employed. Biodegradable microspheres (e.g., polylactate polyglycolate) may also be employed as carriers for the pharmaceutical compositions of this invention. Suitable biodegradable microspheres are disclosed, for example, in U.S. Patent Nos. 4,897,268 and 5,075,109. In addition, the carrier may contain other pharmacologically-acceptable excipients for modifying or maintaining the pH, osmolality, viscosity, clarity, color, sterility, stability, rate of dissolution, or odor of the formulation. Similarly, the carrier may contain still other pharmacologically-acceptable excipients for modifying or maintaining the stability, rate of dissolution, release, or absorption or penetration across the blood-brain barrier of the delivered molecule. Such excipients are those substances usually and customarily employed to formulate dosages for parenteral administration in either unit dose or multi-dose form or for direct infusion into the CSF by continuous or periodic infusion from an implanted pump.

Such compositions may also comprise buffers (e.g., neutral buffered saline or phosphate buffered saline), carbohydrates (e.g., glucose, mannose, sucrose or dextrans), mannitol, proteins, polypeptides or amino acids such as glycine, antioxidants, chelating agents such as EDTA or glutathione, adjuvants (e.g., aluminum hydroxide) and/or preservatives. Alternatively, compositions of the present invention may be formulated as a lyophilizate. Compounds may also be encapsulated within liposomes using well known technology.

Any of a variety of adjuvants may be employed in the vaccines of this invention. Most adjuvants contain a substance designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or Mycobacterium tuberculosis derived proteins. Suitable adjuvants are commercially available as, for example, Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit, MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); aluminum salts such as aluminum hydroxide gel (alum) or aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides; polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A. Cytokines, such as GM-CSF or interleukin-2, -7, or -12, may also be used as adjuvants.

Within the vaccines provided herein, the adjuvant composition is preferably designed to induce an immune response predominantly of the Th1 type. High levels of Th1-type cytokines (e.g., IFN-α, IL-2 and IL-12) tend to favor the induction of cell mediated immune responses to an administered antigen. In contrast, high levels of Th2-type cytokines (e.g., IL-4, IL-5, IL-6, IL-10 and TNF-β) tend to favor the induction of humoral immune responses. Following application of a vaccine as provided herein, a patient will support an immune response that includes Th1- and Th2-type responses. Within a preferred embodiment, in which a response is predominantly Th1-type, the level of Th1-type cytokines will increase to a greater extent than the level of Th2-type cytokines. The levels of these cytokines may be readily assessed using standard assays. For a review of the families of cytokines, see Mosmann and Coffman, Ann. Rev. Immunol. 7:145-173, 1989.

The compositions described herein may be administered as part of a sustained release formulation (i.e., a formulation such as a capsule or sponge that effects a slow release of compound following administration). Such formulations may generally be prepared using well known technology and administered by, for example, oral, rectal or subcutaneous implantation, or by implantation at the desired target site, such as a site of surgical excision of a tumor. Sustained-release formulations may contain a peptide, polynucleotide or antibody dispersed in a carrier matrix and/or contained within a reservoir surrounded by a rate controlling membrane. Carriers for use within such formulations are biocompatible, and may also be biodegradable; preferably the formulation provides a relatively constant level of active component release. The amount of active compound contained within a sustained release formulation depends upon the site of implantation, the rate and expected duration of release and the nature of the condition to be treated or prevented.

Therapeutic and Prophylactic Methods

Treatment includes prophylaxis and therapy. Prophylaxis or therapy can be accomplished by a single direct injection at a single time point or multiple time points to a single or multiple sites. Administration can also be nearly simultaneous to multiple sites. Patients or subjects include mammals, such as human, bovine, equine, canine, feline, porcine, and ovine animals. The subject is preferably a human.

A cancer may be diagnosed using criteria generally accepted in the art, including the presence of a malignant tumor. Pharmaceutical compositions and vaccines may be administered either prior to or following surgical removal of primary tumors and/or treatment such as administration of radiotherapy or conventional chemotherapeutic drugs. Within certain embodiments, immunotherapy may be active immunotherapy, in which treatment relies on the in vivo stimulation of the endogenous host immune system to react against tumors or infected cells with the administration of immune response-modifying agents (such as peptides and polynucleotides disclosed herein). Within other embodiments, immunotherapy may be passive immunotherapy, in which treatment involves the delivery of agents with established tumor-immune reactivity (such as effector cells or antibodies) that can directly or indirectly mediate antitumor effects and does not necessarily depend on an intact host immune system. Examples of effector cells include T cells as discussed above, T lymphocytes (such as CD8+ cytotoxic T lymphocytes and CD4+ T-helper tumor-infiltrating lymphocytes), killer cells (such as Natural Killer cells and lymphokine-activated killer cells), B cells and antigen-presenting cells (such as dendritic cells and macrophages) expressing a peptide provided herein. In a preferred embodiment, dendritic cells are modified in vitro to present the peptide, and these modified APCs are administered to the subject. T cell receptors and antibody receptors specific for the peptides recited herein may be cloned, expressed and transferred into other vectors or effector cells for adoptive immunotherapy. The peptides provided herein may also be used to generate antibodies or anti-idiotypic antibodies (as described above and in U.S. Patent No. 4,918,164) for passive immunotherapy.

Administration and Dosage

The compositions are administered in any suitable manner, often with pharmaceutically acceptable carriers. Suitable methods of administering cells in the context of the present invention to a subject are available, and, although more than one route can be used to administer a particular cell composition, a particular route can often provide a more immediate and more effective reaction than another route. The dose administered to a patient, in the context of the present invention, should be sufficient to effect a beneficial therapeutic response in the patient over time, or to inhibit disease progression. Thus, the composition is administered to a subject in an amount sufficient to alleviate, reduce, cure or at least partially arrest symptoms and/or complications from the disease and/or to elicit an effective immune response to the specific antigens. An amount adequate to accomplish this is defined as a "therapeutically effective dose."

Routes and frequency of administration of the therapeutic compositions disclosed herein, as well as dosage, will vary from individual to individual, and may be readily established using standard techniques. In general, the pharmaceutical compositions and vaccines may be administered by injection (e.g., intracutaneous, intratumoral, intramuscular, intraperitoneal, intravenous or subcutaneous), intranasally (e.g., by aspiration) or orally. Preferably, between 1 and 10 doses may be administered over a 52-week period. Preferably, 6 doses are administered, at intervals of 1 month, and booster vaccinations may be given periodically thereafter. Alternate protocols may be appropriate for individual patients. In one embodiment, 2 intradermal injections of the composition are administered 10 days apart. In another embodiment, a dose is administered daily or once every 2 or 3 days over an extended period, such as weeks or months.

A suitable dose is an amount of a compound that, when administered as described above, is capable of promoting an anti-tumor response, and is at least 10-50% above the basal (i.e., untreated) level. Such response can be monitored, for example, by measuring reduction in tumor size or the level of anti-tumor antibodies in a patient or by vaccine-dependent generation of cytolytic effector cells capable of killing the patient's tumor cells in vitro. Such therapies should also be capable of causing a response that leads to an improved clinical outcome (e.g., more frequent remissions, complete or partial, or longer disease-free survival) in patients as compared to untreated patients. In general, for pharmaceutical compositions and vaccines comprising one or more peptides, the amount of each peptide present in a dose ranges from about 100 μg to 5 mg per kg of host. Suitable volumes will vary with the size of the patient, but will typically range from about 0.1 mL to about 5 mL.
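Because the stated peptide range is expressed per kg of host, the amount per dose scales linearly with body weight. The following Python sketch only illustrates that arithmetic; the function name `peptide_dose_range_ug` and the 70 kg example weight are assumptions introduced here for illustration and do not come from the disclosure.

```python
# Range stated in the text: about 100 micrograms to 5 mg of each peptide per kg of host.
MIN_UG_PER_KG = 100
MAX_UG_PER_KG = 5000  # 5 mg expressed in micrograms


def peptide_dose_range_ug(weight_kg):
    """Return (low, high) amount of one peptide per dose, in micrograms."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return MIN_UG_PER_KG * weight_kg, MAX_UG_PER_KG * weight_kg


if __name__ == "__main__":
    # Hypothetical 70 kg host, chosen only to show the calculation.
    low, high = peptide_dose_range_ug(70)
    print(f"{low / 1000:g} mg to {high / 1000:g} mg of each peptide per dose")
```

Working in micrograms keeps the arithmetic in whole numbers; the conversion to milligrams happens only when printing.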

In general, an appropriate dosage and treatment regimen provides the active compound(s) in an amount sufficient to provide therapeutic and/or prophylactic benefit. Such a response can be monitored by establishing an improved clinical outcome (e.g., more frequent remissions, complete or partial, or longer disease-free survival) in treated patients as compared to non-treated patients. Increases in preexisting immune responses to a tumor protein generally correlate with an improved clinical outcome. Such immune responses may generally be evaluated using standard proliferation, cytotoxicity or cytokine assays, which may be performed using samples obtained from a patient before and after treatment.

Diagnostic Methods

The invention provides a method for detecting cancer in a tissue comprising contacting the tissue with a molecule that recognizes and binds a CSTC or cancer-specific isoform of a transcription modulator described herein. The molecule can be, for example, a CSTC-targeting peptide, an antibody directed against a CSTC or cancer-specific isoform of a transcription modulator, or an oligonucleotide probe or antisense molecule directed against a cancer-specific molecule. The tissue can be from a mammal, such as human, bovine, equine, canine, feline, porcine, and ovine tissue. The tissue is preferably human. The tissue can comprise a tumor specimen, cerebrospinal fluid, or other suitable specimen. In one embodiment, the method comprises use of an ELISA-type assay. Those skilled in the art will appreciate additional variations suitable for the method of detecting cancer in tissue through detection of a cancer-specific molecule in a specimen. This method can also be used to monitor levels of the cancer-specific molecule in tissue of a patient undergoing treatment for cancer. The suitability of a CSTC-targeted therapeutic regimen for initial or continued treatment can be determined by monitoring such levels using this method.

The invention additionally provides a method for identifying a molecule that inhibits proliferation of cancer cells. The method comprises contacting a candidate molecule with a CSTC and determining whether the candidate molecule disrupts the biological activity of the CSTC. Disruption of the biological activity of the CSTC is indicative of a molecule that inhibits proliferation of cancer cells. Representative molecules include antibodies, proteins, peptides and nucleotides.

Kits

For use in the diagnostic and therapeutic applications described herein, kits are also within the scope of the invention. Such kits can comprise a carrier, package or container that is compartmentalized to receive one or more containers such as vials, tubes, and the like, each of the container(s) comprising one of the separate elements to be used in the method. For example, the container(s) can comprise a probe that is or can be detectably labeled. The probe can be an antibody or polynucleotide specific for a cancer-specific molecule of the invention. The kit can also include containers containing nucleotide(s) for amplification of a target nucleic acid sequence and/or a container comprising a reporter-means, such as a biotin-binding protein, e.g., avidin or streptavidin, bound to a detectable label, e.g., an enzymatic, fluorescent, or radioisotope label. The kit can include all or part of an amino acid sequence of the sequences described herein, or a nucleic acid molecule that encodes such amino acid sequences.

The kit of the invention will typically comprise the container described above and one or more other containers comprising materials desirable from a commercial and user standpoint, including buffers, diluents, filters, needles, syringes, and package inserts with instructions for use. In addition, a label can be provided on the container to indicate that the composition is used for a specific therapeutic or non-therapeutic application, and can also indicate directions for either in vivo or in vitro use, such as those described above. Directions and/or other information can also be included on an insert which is included with the kit.

EXAMPLES

The following examples are presented to illustrate the present invention and to assist one of ordinary skill in making and using the same. The examples are not intended in any way to otherwise limit the scope of the invention.

Example 1: Identification of isoforms of transcriptional co-regulators

This example demonstrates the results of an extensive in silico analysis of components of transcriptional co-regulators and use of PCR primers designed to identify novel isoforms with altered activity.

MATERIAL & METHODS

Primary tumors

Surgical specimens were obtained from human patients undergoing surgery for melanoma. Specimens were trypsinized and prepared for analysis using conventional techniques. RNA isolation was performed as described below.

Cell culture

Human melanoma cell lines SK-MEL-28 and WM-266-4 were obtained from the American Type Culture Collection (ATCC; Manassas, Virginia; SK-MEL-28 deposited by T. Takehashi and subject to release terms set by The Memorial Sloan-Kettering Cancer Center; WM-266-4 deposited by M. Herlyn). Cells were cultured according to recommendations of ATCC (DMEM, 10% FCS, penicillin + streptomycin) and used in experiments after two passages in the laboratory. Cells were grown in 24-well plates, with each treatment performed in triplicate. Cells were plated 16 hours before treatments started. Peptides were added to the media, and the media was changed every day during the 7-day experiment. The CPP concentration was 10 μM.

For cell counting, cells were trypsinized (0.25% trypsin, 2 mM EDTA) in Ca2+- and Mg2+-free PBS. Cells were precipitated and resuspended in 100 μl of PBS, and 5 μl was removed for counting.
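The counting step above implies a simple scale-up from the 5 μl aliquot to the 100 μl suspension, followed by averaging over the triplicate wells. The Python sketch below illustrates that arithmetic under the assumption that every cell in the removed aliquot is counted without further dilution; the text does not specify the counting device, and the function names and example counts are hypothetical.

```python
def total_cells(counted_in_aliquot, suspension_ul=100.0, aliquot_ul=5.0):
    """Scale a count from the removed aliquot up to the whole suspension."""
    if aliquot_ul <= 0 or aliquot_ul > suspension_ul:
        raise ValueError("aliquot volume must be positive and no larger than the suspension")
    return counted_in_aliquot * (suspension_ul / aliquot_ul)


def mean_of_replicates(values):
    """Average the per-well totals of one treatment (triplicate wells in the text)."""
    if not values:
        raise ValueError("at least one replicate is required")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical aliquot counts for three wells of a single treatment.
    aliquot_counts = [240, 250, 260]
    per_well = [total_cells(c) for c in aliquot_counts]
    print(per_well, mean_of_replicates(per_well))
```

With the volumes given in the text the scale factor is 100 / 5 = 20, so each counted cell represents twenty cells in the well.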

Apoptosis was analyzed using Biovision Annexin V-Cy3 Apoptosis Kit according to manufacturer's protocols.

Identification of isoforms of transcriptional co-regulators in melanoma cells

RNA was isolated from human melanoma cell lines SK-MEL-28 and WM-266-4 and from primary tumors using an RNA isolation kit (Qiagen). RT-PCR was used to identify isoforms of co-regulators. Primers used to analyze isoforms are presented in Table 1.
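Primer pairs such as those listed in Table 1 can be given a quick computational sanity check before use. The Python sketch below reports length, GC fraction and reverse complement for the GTF2H1 pair (SEQ ID NO: 14 and 15); it is an illustration only, the helper names are assumptions introduced here, and the patent does not describe how its primers were designed or checked.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(primer):
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def reverse_complement(primer):
    """Reverse complement, e.g. to locate an antisense primer on the sense strand."""
    return primer.upper().translate(COMPLEMENT)[::-1]


# GTF2H1 primer pair as listed in Table 1.
PRIMERS = {
    "S": "ACTTCCTGTCTAGAGTTGTAGC",
    "AS3": "GTAAGTCAGCTATACTAAGTTCTG",
}

if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"GC={gc_fraction(seq):.2f}", reverse_complement(seq))
```

The same helpers apply unchanged to any other primer pair in the table.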

First strand cDNAs were synthesized with reverse transcriptase (SuperScript II, Life Technologies Inc.) using 5-10 μg of mRNA from different cell lines as a template. PCR reactions were performed in a volume of 25 μl containing one tenth of the RT reaction as a template, using the GC-Rich PCR System or the Expand™ Long Distance PCR System kit (Roche) according to the manufacturer's instructions. All amplified PCR products were sequenced, and the sequences were analyzed to identify novel functional isoforms of transcriptional co-regulators.

Table 1. Oligonucleotide primers used to isolate and characterize isoforms of transcriptional co-regulators in human melanoma cells.

1. GTF2H1 (NM_005316)

Product amplified with primers ACTTCCTGTCTAGAGTTGTAGC (S; SEQ ID NO: 14) and GTAAGTCAGCTATACTAAGTTCTG (AS3; SEQ ID NO: 15)

67 aa: shorter protein (different sequence after 53 aa)

MATSSEEVLLIVKKVRQKKQDGALYLMAERIAWAPEGKDRFTISHMYADIKCKSAILSSDVFVCHSC* (SEQ ID NO: 16)

Alignment with P32780 (548 aa; SEQ ID NO: 16 and 17):

  1 MATSSEEVLLIVKKVRQKKQDGALYLMAERIAWAPEGKDRFTISHMYADIKCQKISPEGK
    ||||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MATSSEEVLLIVKKVRQKKQDGALYLMAERIAWAPEGKDRFTISHMYADIKCKSAILSSD

 61 AKIQLQLVLHAGDTTNFHFSNESTAVKERDAVKDLLQQLLPKFKRKANKELEEKNRMLQE
 61 VFVCHSC

121 DPVLFQLYKDLVVSQVISAEEFWANRLNVNATDSSSTSNHKQDVGISAAFLADVRPQTDG
 68

181 CNGLRYNLTSDIIESIFRTYPAVKMKYAENVPHNMTEKEFWTRFFQSHYFHRDRLNTGSK
 68

241 DLFAECAKIDEKGLKTMVSLGVKNPLLDLTALEDKPLDEGYGISSVPSASNSKSIKENSN
 68

301 AAIIKRFNHHSAMVLAAGLRKQEAQNEQTSEPSNMDGNSGDADCFQPAVKRAKLQESIEY
 68

361 EDLGKNNSVKTIALNLKKSDRYYHGPTPIQSLQYATSQDIINSFQSIRQEMEAYTPKLTQ
 68

421 VLSSSAASSTITALSPGGALMQGGTQQAINQMVPNDIQSELKHLYVAVGELLRHFWSCFP
 68

481 VNTPFLEEKVVKMKSNLERFQVTKLCPFQEKIRRQYLSTNLVSHIEEMLQTAYNKLHTWQ
 68

541 SRRLMKKT
 68
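The pairwise alignments in Table 1 can be spot-checked by comparing an isoform with its reference residue by residue. The Python sketch below counts the identical N-terminal residues and rebuilds an identity line for the first 60 residues of P32780 and the 67 aa GTF2H1 isoform (SEQ ID NO: 16). It marks identities only; the similarity symbols of the original alignment output are not reproduced, and the function names are assumptions introduced here rather than part of the described method.

```python
def common_prefix_length(a, b):
    """Number of leading residues that are identical in both sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def match_line(a, b):
    """Identity line for two equally registered sequences: '|' where residues agree."""
    return "".join("|" if x == y else " " for x, y in zip(a, b))


# First 60 residues of the reference and the full isoform, as printed in the table.
REFERENCE_1_60 = "MATSSEEVLLIVKKVRQKKQDGALYLMAERIAWAPEGKDRFTISHMYADIKCQKISPEGK"
ISOFORM = "MATSSEEVLLIVKKVRQKKQDGALYLMAERIAWAPEGKDRFTISHMYADIKCKSAILSSDVFVCHSC"

if __name__ == "__main__":
    shared = common_prefix_length(REFERENCE_1_60, ISOFORM)
    print("identical N-terminal residues:", shared)
    print("first differing position:", shared + 1)
    print(REFERENCE_1_60)
    print(match_line(REFERENCE_1_60, ISOFORM))
    print(ISOFORM[:60])
```

This ungapped comparison suits isoforms that differ only at the C-terminus; the isoforms in the table with internal deletions would need a gapped alignment instead.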

2. GTF2H2 (NM_001515)

1. Product amplified with primers TTTCCGGCTGAGAGTCCTTC (S1; SEQ ID NO: 18) and CACATCACTTCAGCTTAACTC (AS1; SEQ ID NO: 19)

165 aa: 230 aa shorter C-term, different 8 aa of C-term;

MDEEPERTKRWEGGYERTWEILKEDESGSLKATIEDILFKAKRKRVFEHHGQVRLGMMRHLYVVVDGSRTMEDQDLKPNRLTCTLKLLEYFVEEYFDQNPISQIGIIVTKSKRAEKLTELSGNPRKHITSLKKAVDMTCHGEPSLYNSLSIAMQTLKLVLYIMYN* (SEQ ID NO: 20)

Alignment with Q13888 (395 aa; SEQ ID NO: 20 and 21):

  1 MDEEPERTKRWEGGYERTWEILKEDESGSLKATIEDILFKAKRKRVFEHHGQVRLGMMRH
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MDEEPERTKRWEGGYERTWEILKEDESGSLKATIEDILFKAKRKRVFEHHGQVRLGMMRH

 61 LYVVVDGSRTMEDQDLKPNRLTCTLKLLEYFVEEYFDQNPISQIGIIVTKSKRAEKLTEL
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
 61 LYVVVDGSRTMEDQDLKPNRLTCTLKLLEYFVEEYFDQNPISQIGIIVTKSKRAEKLTEL

121 SGNPRKHITSLKKAVDMTCHGEPSLYNSLSIAMQTLKLVLYIMYN
    |||||||||||||||||||||||||||||||||||||
121 SGNPRKHITSLKKAVDMTCHGEPSLYNSLSIAMQTLKHMPGHTSREVLIIFSSLTTCDPS

166
181 NIYDLIKTLKAAKIRVSVIGLSAEVRVCTVLARETGGTYHVILDESHYKELLTHHVSPPP

166
241 ASSSSECSLIRMGFPQHTIASLSDQDAKPSFSMAHLDGNTEPGLTLGGYFCPQCRAKYCE

166
301 LPVECKICGLTLVSAPHLARSYHHLFPLDAFQEIPLEEYNGERFCYGCQGELKDQHVYVC

166
361 AVCQNVFCVDCDVFVHDSLHCCPGCIHKIPAPSGV

2. Product amplified with primers AGGATGTGAAGGAGCTTGTGAAG (S2; SEQ ID NO: 22) and CAAGTACAGTGCAAACGCGAAC (AS5; SEQ ID NO: 23)

338 aa: 57 aa shorter N-terminus (or NMD?), same as BG828327

PSLYNSLSIAMQTLKHMPGHTSREVLIIFSSLTTCDPSNIYDLIKTLKAAKIRVSVIGLSAEVRVCTVL... (SEQ ID NO: 24)

Alignment with Q13888 (395 aa; SEQ ID NO: 24 and 25):

  1 MDEEPERTKRWEGGYERTWEILKEDESGSLKATIEDILFKAKRKRVFEHHGQVRLGMMRH
                                                             |||
  1                                                          MRH

 61 LYVVVDGSRTMEDQDLKPNRLTCTLKLLEYFVEEYFDQNPISQIGIIVTKSKRAEKLTEL
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
  4 LYVVVDGSRTMEDQDLKPNRLTCTLKLLEYFVEEYFDQNPISQIGIIVTKSKRAEKLTEL

121 SGNPRKHITSLKKAVDMTCHGEPSLYNSLSIAMQTLKHMPGHTSREVLIIFSSLTTCDPS
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
 64 SGNPRKHITSLKKAVDMTCHGEPSLYNSLSIAMQTLKHMPGHTSREVLIIFSSLTTCDPS

181 NIYDLIKTLKAAKIRVSVIGLSAEVRVCTVLARETGGTYHVILDESHYKELLTHHVSPPP
    |||||||||||||||||||||||||||||||
124 NIYDLIKTLKAAKIRVSVIGLSAEVRVCTVL

3. GTF2H3 (NM_001516)

1. Product amplified with primers GACAGCCATGGTTTCAGACG (S1; SEQ ID NO: 26) and CAGAAACTTTGCTGGCAGGAT (AS1; SEQ ID NO: 27)

267 aa: 41 aa shorter N-term,

MVLGNSHLFMNRSNKLAVIASHIQESRFLYPGKNGRLGDFFGDPGNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMTKSDIKGQHTETLLAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQYMNFMNVIFAAQKQNILIDACVLDSDSGLLQQACDITGGLYLKVPQMPSLLQYLLWVFLPDQDQRSQLILPPPVHVDYRAACFCHRNLIEIGYVCSVCLSIFCNFSPICTTCETAFKISLPPVLKAKKKKLKVSA* (SEQ ID NO: 28)

Alignment with Q13889 (308 aa; SEQ ID NO: 28 and 29):

  1                                          MVLGNSHLFMNRSNKLAVI
                                             |||||||||||||||||||
  1 MVSDEDELNLLVIVVDANPIWWGKQALKESQFTLSKCIDAVMVLGNSHLFMNRSNKLAVI

 20 ASHIQESRFLYPGKNGRLGDFFGDPGNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMT
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
 61 ASHIQESRFLYPGKNGRLGDFFGDPGNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMT

 80 KSDIKGQHTETLLAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQYMNFMNV
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
121 KSDIKGQHTETLLAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQYMNFMNV

140 IFAAQKQNILIDACVLDSDSGLLQQACDITGGLYLKVPQMPSLLQYLLWVFLPDQDQRSQ
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
181 IFAAQKQNILIDACVLDSDSGLLQQACDITGGLYLKVPQMPSLLQYLLWVFLPDQDQRSQ

200 LILPPPVHVDYRAACFCHRNLIEIGYVCSVCLSIFCNFSPICTTCETAFKISLPPVLKAK
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
241 LILPPPVHVDYRAACFCHRNLIEIGYVCSVCLSIFCNFSPICTTCETAFKISLPPVLKAK

260 KKKLKVSA
    ||||||||
301 KKKLKVSA

2. Products amplified with primers GACAGCCATGGTTTCAGACG (S1; SEQ ID NO: 30) and CGTGGTGAAAACATGGTGAAAC (AS3; SEQ ID NO: 31)

a) 43 aa: shorter protein, new product (Δ3-13, different last exon)

MVSDEDELNLLVIVVDANPIWWGKQALKESQPPK* (SEQ ID NO: 32)

Alignment with Q13889 (308 aa; SEQ ID NO: 29):

  1 MVSDEDELNLLVIVVDANPIWWGKQALKESQFTLSKCIDAVMVLGNSHLFMNRSNKLAVI
    |||||||||||||||||||||||||||||||
  1 MVSDEDELNLLVIVVDANPIWWGKQALKESQPPK

 61 ASHIQESRFLYPGKNGRLGDFFGDPGNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMT
 35

121 KSDIKGQHTETLLAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQYMNFMNV
 35

181 IFAAQKQNILIDACVLDSDSGLLQQACDITGGLYLKVPQMPSLLQYLLWVFLPDQDQRSQ
 35

241 LILPPPVHVDYRAACFCHRNLIEIGYVCSVCLSIFCNFSPICTTCETAFKISLPPVLKAK
 35

301 KKKLKVSA
 35

b) 120 aa: shorter protein, different sequence after 108 aa, new product (Δ(4)-13, different last exon)

GNPPEFNPSGSKDGKYELLTSASQVAGITTLLNP* (SEQ ID NO: 33)

Alignment with Q13889 (308 aa; SEQ ID NO: 29) :

  1 MVSDEDELNLLVIVVDANPIWWGKQALKESQFTLSKCIDAVMVLGNSHLFMNRSNKLAVI
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MVSDEDELNLLVIVVDANPIWWGKQALKESQFTLSKCIDAVMVLGNSHLFMNRSNKLAVI

 61 ASHIQESRFLYPGKNGRLGDFFGDPGNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMT
    |||||||||||||||||||||||||||||||||||||||||||||||.:|
 61 ASHIQESRFLYPGKNGRLGDFFGDPGNPPEFNPSGSKDGKYELLTSASQVAGITTLLNP

121 KSDIKGQHTETLLAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQYMNFMNV
120

181 IFAAQKQNILIDACVLDSDSGLLQQACDITGGLYLKVPQMPSLLQYLLWVFLPDQDQRSQ
120

241 LILPPPVHVDYRAACFCHRNLIEIGYVCSVCLSIFCNFSPICTTCETAFKISLPPVLKAK
120

301 KKKLKVSA
120

c) 110 aa: shorter protein, different C-term beginning with 74 aa,

MVSDEDELNLLVIVVDANPIWWGKQALKESQFTLSKCIDAVMVLGNSHLFMNRSNKLAVIASHIQESRFLYPGFTPFSCLSLPSSWDYYSTEPMRQKFETILPNVVKTW* (SEQ ID NO: 34)

Alignment with Q13889 (308 aa; SEQ ID NO: 29):

  1 MVSDEDELNLLVIVVDANPIWWGKQALKESQFTLSKCIDAVMVLGNSHLFMNRSNKLAVI
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MVSDEDELNLLVIVVDANPIWWGKQALKESQFTLSKCIDAVMVLGNSHLFMNRSNKLAVI

 61 ASHIQESRFLYPGFTPFSCLSLPSSWDYYSTEPMRQKFETILPNVVKTW
    |||||||||||||                   |   |
 61 ASHIQESRFLYPGKNGRLGDFFGDPGNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMT

110
121 KSDIKGQHTETLLAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQYMNFMNV

110
181 IFAAQKQNILIDACVLDSDSGLLQQACDITGGLYLKVPQMPSLLQYLLWVFLPDQDQRSQ

110
241 LILPPPVHVDYRAACFCHRNLIEIGYVCSVCLSIFCNFSPICTTCETAFKISLPPVLKAK

110
301 KKKLKVSA

d) 297 aa: different C-terminus (16 aa), (shorter exon 13, extra exon after exon 13)

...MVLGNSHLFMNRSNKLAVIASHIQESRFLYPGKNGRLGDFFGDPGNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMTKSDIKGQHTETLLAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQYMNFMNVIFAAQKQNILIDACVLDSDSGLLQQACDITGGLYLKVPQMPSLLQYLL...K* (SEQ ID NO: 35)

  1 MVSDEDELNLLVIVVDANPIWWGKQALKESQFTLSKCIDAVMVLGNSHLFMNRSNKLAVI
                                             |||||||||||||||||||
  1                                          MVLGNSHLFMNRSNKLAVI

 61 ASHIQESRFLYPGKNGRLGDFFGDPGNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMT
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
 20 ASHIQESRFLYPGKNGRLGDFFGDPGNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMT

121 KSDIKGQHTETLLAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQYMNFMNV
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
 80 KSDIKGQHTETLLAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQYMNFMNV

181 IFAAQKQNILIDACVLDSDSGLLQQACDITGGLYLKVPQMPSLLQYLLWVFLPDQDQRSQ
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
140 IFAAQKQNILIDACVLDSDSGLLQQACDITGGLYLKVPQMPSLLQYLLWVFLPDQDQRSQ

241 LILPPPVHVDYRAACFCHRNLIEIGYVCSVCLSIFCNFSPICTTCETAFKISLPPVLKAK
    |||||||||||||||||||||||||||||||||||||||||||||||||||| ||
200 LILPPPVHVDYRAACFCHRNLIEIGYVCSVCLSIFCNFSPICTTCETAFKISQPPK....

301 KKKLKVSA
256

4. GTF2H4 (BC016302)

1. Product amplified with GAGACTTTGGCTCCGATTAAG (S1; SEQ ID NO: 36) and GAAGTGCTCCAAGGAACAGC (AS1; SEQ ID NO: 37)

81 aa: shorter protein,

MESTPSRGLNRVHLQCRNLQEFLGGLSPGVLDRLYGHPATCLAVFRELPSLAKNWVMRMLFLEQPLPQAAVALWVKKEFSK* (SEQ ID NO: 38)

Alignment with Q92759 (462 aa; SEQ ID NO: 39):

  1 MESTPSRGLNRVHLQCRNLQEFLGGLSPGVLDRLYGHPATCLAVFRELPSLAKNWVMRML
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MESTPSRGLNRVHLQCRNLQEFLGGLSPGVLDRLYGHPATCLAVFRELPSLAKNWVMRML

 61 FLEQPLPQAAVALWVKKEFSKAQEESTGLLSGLRIWHTQLLPGGLQGLILNPIFRQNLRI
    |||||||||||||||||||||
 61 FLEQPLPQAAVALWVKKEFSK

121 ALLGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEVVLHFMVGSPSAAVSQDLAQLLS
 82

181 QAGLMKSTEPGEPPCITSAGFQFLLLDTPAQLWYFMLQYLQTAQSRGMDLVEILSFLFQL
 82

241 SFSTLGKDYSVEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYPTRLAINLSSGVSGAGGT
 82

301 VHQPGFIVVETNYRLYAYTESELQIALIALFSEMLYRFPNMVVAQVTRESVQQAIASGIT
 82

361 AQQIIHFLRTRAHPVMLKQTPVLPPTITDQIRLWELERDRLRFTEGVLYNQFLSQVDFEL
 82

421 LLAHARELGVLVFENSAKRLMVVTPAGHSDVKRFWKRQKHSS
 82

2. Product amplified with GAGACTTTGGCTCCGATTAAG (S1; SEQ ID NO: 40) and TGAGCGAGCATCCGCATCA (AS1; SEQ ID NO: 41)

442 aa: internal 20 aa missing (61-82 aa; SEQ ID NO: 42),

MESTPSRGLNRVHLQCRNLQEFLGGLSPGVLDRLYGHPATCLAVFRELPSLAKNWVMRMLAQEESTGLLSGLRIWHTQLLPGGLQGLILNPIFRQNLRIALLGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEVVLHFMVGSPSAAVSQDLAQLLSQAGLMKSTEPGEPPCITSAGFQFLLLDTPAQLWYFMLQYLQTAQSRGMDLVEILSFLFQLSFSTLGKDYSVEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYPTRLAINLSSGVSGAGGTVHQPGFIVVETNYRLYAYTESELQIALIALFSEMLYRFPNMVVAQVTRESVQQAIASGITAQQIIHFLRTRAHPVMLKQTPVLPPTITDQIRLWELERDRLRFTEGVLYNQFLSQVDFELLLAHARELGVLVFENSAKRLMVVTPAGHSDVKRFWKRQKHSS*

Alignment with Q92759 (462 aa; SEQ ID NO: 43):

  1 MESTPSRGLNRVHLQCRNLQEFLGGLSPGVLDRLYGHPATCLAVFRELPSLAKNWVMRML
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MESTPSRGLNRVHLQCRNLQEFLGGLSPGVLDRLYGHPATCLAVFRELPSLAKNWVMRML

 61 FLEQPLPQAAVALWVKKEFSKAQEESTGLLSGLRIWHTQLLPGGLQGLILNPIFRQNLRI
                         |||||||||||||||||||||||||||||||||||||||
 61                      AQEESTGLLSGLRIWHTQLLPGGLQGLILNPIFRQNLRI

121 ALLGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEVVLHFMVGSPSAAVSQDLAQLLS
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
100 ALLGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEVVLHFMVGSPSAAVSQDLAQLLS

181 QAGLMKSTEPGEPPCITSAGFQFLLLDTPAQLWYFMLQYLQTAQSRGMDLVEILSFLFQL
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
160 QAGLMKSTEPGEPPCITSAGFQFLLLDTPAQLWYFMLQYLQTAQSRGMDLVEILSFLFQL

241 SFSTLGKDYSVEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYPTRLAINLSSGVSGAGGT
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
220 SFSTLGKDYSVEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYPTRLAINLSSGVSGAGGT

301 VHQPGFIVVETNYRLYAYTESELQIALIALFSEMLYRFPNMVVAQVTRESVQQAIASGIT
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
280 VHQPGFIVVETNYRLYAYTESELQIALIALFSEMLYRFPNMVVAQVTRESVQQAIASGIT

361 AQQIIHFLRTRAHPVMLKQTPVLPPTITDQIRLWELERDRLRFTEGVLYNQFLSQVDFEL
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
340 AQQIIHFLRTRAHPVMLKQTPVLPPTITDQIRLWELERDRLRFTEGVLYNQFLSQVDFEL

421 LLAHARELGVLVFENSAKRLMVVTPAGHSDVKRFWKRQKHSS
    ||||||||||||||||||||||||||||||||||||||||||
400 LLAHARELGVLVFENSAKRLMVVTPAGHSDVKRFWKRQKHSS

5. ERCC2 (NM_000400)

Product amplified with TGGGGTCATCGGCTCAACGTG (S2; SEQ ID NO: 44) and TCTTGAGCAGTAGATGAGTTTGG (AS2; SEQ ID NO: 45)

736 aa: 24 aa shorter N-terminus (SEQ ID NO: 46),

MRELKRTLDAKGHGVLEMPSGTGKTVSLLALIMAYQRAYPLEVTKLIYCSRTVPEIEKVIEELRKLLNFYEKQEGEKLPFLGLAL

RLRDEYRRLVEGLREASAARETDAHLANPV...

Alignment with P18074 (760 aa; SEQ ID NO: 47):

  1 MKLNVDGLLVYFPYDYIYPEQFSYMRELKRTLDAKGHGVLEMPSGTGKTVSLLALIMAYQ
                            ||||||||||||||||||||||||||||||||||||
  1                         MRELKRTLDAKGHGVLEMPSGTGKTVSLLALIMAYQ

 61 RAYPLEVTKLIYCSRTVPEIEKVIEELRKLLNFYEKQEGEKLPFLGLALSSRKNLCIHPE
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
 53 RAYPLEVTKLIYCSRTVPEIEKVIEELRKLLNFYEKQEGEKLPFLGLALSSRKNLCIHPE

121 VTPLRFGKDVDGKCHSLTASYVRAQYQHDTSLPHCRFYEEFDAHGREVPLPAGIYNLDDL
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
113 VTPLRFGKDVDGKCHSLTASYVRAQYQHDTSLPHCRFYEEFDAHGREVPLPAGIYNLDDL

181 KALGRRQGWCPYFLARYSILHANVVVYSYHYLLDPKIADLVSKELARKAVVVFDEAHNID
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
173 KALGRRQGWCPYFLARYSILHANVVVYSYHYLLDPKIADLVSKELARKAVVVFDEAHNID

241 NVCIDSMSVNLTRRTLDRCQGNLETLQKTVLRIKETDEQRLRDEYRRLVEGLREASAARE
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
233 NVCIDSMSVNLTRRTLDRCQGNLETLQKTVLRIKETDEQRLRDEYRRLVEGLREASAARE

6. MNAT1 (NM_002431)

Product amplified with GGTCAACATATTTCACTGGCAC (S2; SEQ ID NO: 48) and TCCATCAGATGAGGCTTATCGT (AS3; SEQ ID NO: 49)

278 aa: 31 aa shorter C-terminus, different sequence after 271 aa; (shorter exon 8), (SEQ ID NO: 50),

...MQLEKPKPVKPVTFSTGIKMGQHISLAPIHKLEEALYEYQPLQIETYGPHVPELEMLGRLGGFDTISLI*

Alignment with P51948 (309 aa; SEQ ID NO: 51):

121 KMEIYQKENKDVIQKNKLKLTREQEELEEALEVERQENEQRRLFIQKEEQLQQILKRKNK
  1

181 QAFLDELESSDLPVALLLAQHKDRSTQLEMQLEKPKPVKPVTFSTGIKMGQHISLAPIHK
                                 |||||||||||||||||||||||||||||||
  1                              MQLEKPKPVKPVTFSTGIKMGQHISLAPIHK

241 LEEALYEYQPLQIETYGPHVPELEMLGRLGYLNHVRAASPQDLAGGYTSSLACHRALQDA
    ||||||||||||||||||||||||||||||
 32 LEEALYEYQPLQIETYGPHVPELEMLGRLGGFDTISLI

301 FSGLFWQPS
 70

7. Cdk7 (NM_001799)

Product amplified with primers CAAGGCCAGAGATAAGAACACC (S; SEQ ID NO: 52) and GTAGGCTTTGATGTGTGATGGT (AS1; SEQ ID NO: 53)

323 aa: internal 23 aa missing (77-100 aa), new product (Δ5), SEQ ID NO: 54

Alignment with P50613 (346 aa; SEQ ID NO: 55):

  1 MALDVKSRAKRYEKLDFLGEGQFATVYKARDKNTNQIVAIKKIKLGHRSEAKDGINRTAL
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MALDVKSRAKRYEKLDFLGEGQFATVYKARDKNTNQIVAIKKIKLGHRSEAKDGINRTAL

 61 REIKLLQELSHPNIIGLLDAFGHKSNISLVFDFMETDLEVIIKDNSLVLTPSHIKAYMLM
    ||||||||||||||||                       ||||||||||||||||||
 61 REIKLLQELSHPNIIG                       VIIKDNSLVLTPSHIKAY...

8. CCNH (NM_001239)

Product amplified with primers CTTGGACAGGAGAAGGCAC (S1; SEQ ID NO: 56) and CAGTATAGTCACACCAGAATG (AS2; SEQ ID NO: 57)

329 aa: longer protein, different C-terminus after 312 aa, (intron retention between exons 8 and 9, longer exon 9), (SEQ ID NO: 58),

MPRSVVGTACMYFKRFYLNNSVMEYHPRIIMLTCAFLACKVDEFNVSSPQFVGNLRESPLGQEKALEQILEYELLLIQQLNFHLIVHNPYRPFEGFLIDLKTRYPILENPEILRKTADDFLNRIALTDAYLLYTPSQIALTAILSSASRAGITMESYLSESLMLKENRT

Alignment with P51946 (323 aa; SEQ ID NO: 59):

  1 MYHNSSQKRHWTFSSEEQLARLRADANRKFRCKAVANGKVLPNDPVFLEPHEEMTLCKYY
  1

 61 EKRLLEFCSVFKPAMPRSVVGTACMYFKRFYLNNSVMEYHPRIIMLTCAFLACKVDEFNV
                  ||||||||||||||||||||||||||||||||||||||||||||||
  1               MPRSVVGTACMYFKRFYLNNSVMEYHPRIIMLTCAFLACKVDEFNV

121 SSPQFVGNLRESPLGQEKALEQILEYELLLIQQLNFHLIVHNPYRPFEGFLIDLKTRYPI
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
 47 SSPQFVGNLRESPLGQEKALEQILEYELLLIQQLNFHLIVHNPYRPFEGFLIDLKTRYPI

181 LENPEILRKTADDFLNRIALTDAYLLYTPSQIALTAILSSASRAGITMESYLSESLMLKE
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
107 LENPEILRKTADDFLNRIALTDAYLLYTPSQIALTAILSSASRAGITMESYLSESLMLKE

241 NRTCLSQLLDIMKSMRNLVKKYEPPRSEEVAVLKQKLERCHSAELALNVITKKRKGYEDD
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
167 NRTCLSQLLDIMKSMRNLVKKYEPPRSEEVAVLKQKLERCHSAELALNVITKKRKGYEDD

301 DYVSKKSKHEEEEWTDDDLVESL
    |||||||||||  .|
227 DYVSKKSKHEEVCFTPKMNSKLFLLYILV

9. GTF2F1 (NM_002096)

1. Product amplified with primers GCTCGGAGGAAGTTCAAGG (S3; SEQ ID NO: 60) and GTCCTGGTCCTGATCCTTG (AS1; SEQ ID NO: 61)

490 aa: internal 27 aa missing (83-110 aa), shorter exon 4, (SEQ ID NO: 62),

GRRKASELRIHDLEDDLEMSSDASDASGE...

Alignment with P35269 (517 aa; SEQ ID NO: 63):

  1 MAALGPSSQNVTEYVVRVPKNTTKKYNIMAFNAADKVNFATWNQARLERDLSNKKIYQ
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MAALGPSSQNVTEYVVRVPKNTTKKYNIMAFNAADKVNFATWNQARLERDLSNKKIYQ

 59 EEEMPESGAGSEFNRKLREEARRKKYGIVLKEFRPEDQPWLLRVNGKSGRKFKGIKKGGV
    ||||||||||||||||||||||                           |||||||||||
 61 EEEMPESGAGSEFNRKLREEAR                           RKFKGIKKGGV

119 TENTSYYIFTQCPDGAFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNKVLNHFSIMQ
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
 94 TENTSYYIFTQCPDGAFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNKVLNHFSIMQ

179 QRRLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMSSDASDASGEEGGRVPKAKKKA
    ||||||||||||||||||||||||||||||||||||||||||||||||
154 QRRLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMSSDASDASGE

2. Products amplified with primers AGGCCTGGGCGTCTGTTTG (S5; SEQ ID NO: 64) and GCACGGCATCCTCAGTCAC (AS3; SEQ ID NO: 65)

a) 437 aa: shorter/different C-terminus after 364 aa; new product (intron retentions between exons 10 and 11, 11 and 12), (SEQ ID NO: 66)

LNHFSIMQQRRLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMSSDASDASGEEGGRVPKAKKKAPLAKGGRKKKKKKGSDDEAFEDSDDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEEGPKGVDEQSDSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSEESDIDSEASSALFMAVRPSPVAGEAWASVCRLTHLPTLTSAEEEDATQERAEAVGRELKGQQPPRHAQRRGWQHLLHPAGGCQQTRAR*

Alignment with P35269 (517 aa; SEQ ID NO: 67):

1 MAALGPSSQNVTEYVVRVPKNTTKKYNIMAFNAADKVNFATWNQARLERDLSNKKIYQEE

I I I I I I I I I I I I I I I I I I I I 11 I I I I I I I I 11 I I I I I I 11 I I I I I I I I I I I I I I I I I I I I 1 MAALGPSSQNVTEYVVRVPKNTTKKYNIMAFNAADKVNFATWNQARLERDLSNKKIYQEE

61 EMPESGAGSEFNRKLREEARRKKYGIVLKEFRPEDQPWLLRVNGKSGRKFKGIKKGGVTE

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Il I Il I I I I I I I I I I I I I I I I I I I I 61 EMPESGAGSEFNRKLREEARRKKYGIVLKEFRPEDQPWLLRVNGKSGRKFKGIKKGGVTE

121 NTSYYIFTQCPDGAFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNKVLNHFSIMQQR

I I I I I I I I I I I I I I I 11 I I I I I I 11 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

121 NTSYYIFTQCPDGAFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNKVLNHFSIMQQR

181 RLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMSSDASDASGEEGGRVPKAKKKAPL

III IMI III Il III Il Il Il I I Il Il Il Il Il IMI III III III Il III Il Il Il I I I

181 RLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMSSDASDASGEEGGRVPKAKKKAPL

241 AKGGRKKKKKKGSDDEAFEDSDDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEEGPKG

Mill Mill III Il Il Mill I Il Il Il III Il III III III Mil Mill Il Il M I I

241 AKGGRKKKKKKGSDDEAFEDSDDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEEGPKG

301 VDEQSDSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSEESDIDSEASSA

361 LFMAKKKTPPKRERKPSGGSSRGNSRPGTPSAEGGSTSSTLRAAASKLEQGKRVSEHPAA

Il I I : .1 I I . I . I 361 LFMAVRPSPVAGEAWASVCRLTHLPTLTSAEEEDATQERAEAVGRELKGQQPPRHAQRRG

421 KRLRLDTGPQSLSGKSTPQPPSGKTTPNSGDVQVTEDAVRRYLTRKPMTTKDLLKKFQTK

421 WQHLLHPAGGCQQTRAR

b) 481 aa: shorter C-terminus, different sequence after 360 aa; new product (intron retention between exons 10 and 11); (SEQ ID NO: 68)

MAALGPSSQNVTEYWRVPKNTTKKYNIMAFNAADKVNFATWNQARLERDLSNKKIYQEEEMPESGAGSEFNRKLREEARRKKYG LNHFSIMQQRRLKDQ DQDEDEEEKEKRGRRKASELRIHDLEDDLEMSSDAS DASGEEGGRVPKAKKKAPLAKGGRKKKKKKGSDD

EAFEDSDDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEEGPKGVDEQSDSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKD SSEESDSSEESDIDSEASSAFFMAVRPSPVAGEAWASVCRLTHLPTLTSAEEEDATQERAEAVGRELKGQQPPRHAQRRGWQHLL HPAGGCQQTRAREAGERDACSQAVAAGHGTPEPVWEVDTPATIRQDNTQQRRRAGD*

Alignment with P35269 (517 aa) (SEQ ID NO: 69) :

301 VDEQSDSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSEESDIDSEASSA

I I I I III I Il I I I Il I I I M I I I I I I I I I I I I I I I Il I Il I Il I M I I I I I I I M I I I I I 301 VDEQSDSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSEESDIDSEASSA

361 LFMAKKKTPPKRERKPSGGSSRGNSRPGTPSAEGGSTSSTLRAAASKLEQGKRVSEMPAA

Il I : ■ I I I . I . I

361 FFMAVRPSPVAGEAWASVCRLTHLPTLTSAEEEDATQERAEAVGRELKGQQPPRHAQRRG

421 KRLRLDTGPQSLSGKSTPQPPSGKTTPNSGDVQVTEDAVRRYLTRKPMTTKDLLKKFQTK

. 1 :. . I : I I . . .: .

421 WQHLLHPAGGCQQTRAREAGERDACSQAVAAGHGTPEPVWEVDTPATIRQDNTQQRRRAG

481 KTGLSSEQTVNVLAQILKRLNPERKMINDKMHFSLKE

481 D

3. Product amplified with primers TACCAAGAGGAGGAGAAGGAG (S7; SEQ ID NO: 70) and TCCTCTGAGCTGTCCGACTC (AS4; SEQ ID NO: 71)

385 aa: internal 132 aa missing (59-191 aa), (Δ(4)-(6)); (SEQ ID NO: 72)

...MAALGPSSQNVTEYVVRVPKNTTKKYNIMAFNAADKVNFATWNQARLERDLSNKKIYQEEEKEKRGRRKASELRIHDLEDDLEM DEQSDSSEE...

Alignment with P35269 (517 aa) , (SEQ ID NO: 73):

1 ..MAALGPSSQNVTEYVVRVPKNTTKKYNIMAFNAADKVNFATWNQARLERDLSNKKIYQ

Il I I MM Il I I I Il I M M I I I I I I I I I I I I Il I I M Il I Il I Il I I I I I Il Il I I I

1 MAALGPSSQNVTEYVVRVPKNTTKKYNIMAFNAADKVNFATWNQARLERDLSNKKIYQ

59 EEEMPESGAGSEFNRKLREEARRKKYGIVLKEFRPEDQPWLLRVNGKSGRKFKGIKKGGV

61

119 TENTSYYIFTQCPDGAFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNKVLNHFSIMQ

61

179 QRRLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMSSDASDASGEEGGRVPKAKKKA

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Il I I I I I I I I I I I I I I I 61 EEEKEKRGRRKASELRIHDLEDDLEMSSDASDASGEEGGRVPKAKKKA

239 PLAKGGRKKKKKKGSDDEAFEDSDDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEEGP

II III I Il I III I I Il I I Il Il I I I I I I Il I Il Il I Il I III III III Il Il I Il Il I Il 109 PLAKGGRKKKKKKGSDDEAFEDSDDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEEGP

299 KGVDEQSDSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSEESDIDSEAS

I I I Il I I I I Il I

169 KGVDEQSDSSEE

4. Product amplified with primers CAGAGAACACGTCCTACTAC (S2; SEQ ID NO: 74) and CAGAGAACACGTCCTACTAC (AS2; SEQ ID NO: 75)

360 aa: shorter protein, different C-terminus after 318 aa; new product (Δ(9)-(13)); (SEQ ID NO: 76)

...MQQRRLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMSSDASDASGEEGGRVPKAKKKAPLAKGGRKKKKKKGSDDEAFED HFPKSLFSCDLSTT*

Alignment with P35269 (517 aa); (SEQ ID NO: 77):

121 NTSYYIFTQCPDGAFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNKVLNHFSIMQQR

I I I I

1 MQQR

181 RLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMSSDASDASGEEGGRVPKAKKKAPL

III IMI III Il III Il Il I I I I Il Il Il Il Il III I Il I Il I Il I Il Il I I I Il Il I I I

5 RLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMSSDASDASGEEGGRVPKAKKKAPL

241 AKGGRKKKKKKGSDDEAFEDSDDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEEGPKG

II I I I I I I I I Il I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Il I Il I I I I I I I I I I 65 AKGGRKKKKKKGSDDEAFEDSDDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEEGPKG

301 VDEQSDSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSEESDIDSEASSA

I I 11 I I 11 I I I I I I I I I I . I

125 VDEQSDSSEESEEEKPPEKPPPGSASLTI₁TKGLCCPLGNFYSSPFHFPKSLFSCDLSTT.
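The internal deletions reported above (for example, "internal 132 aa missing (59-191 aa)") can be located by comparing a variant with its reference from both ends. The following is only a minimal sketch, assuming a single contiguous deletion and otherwise identical sequences; the function name and the toy variant are hypothetical illustrations and are not part of the disclosed sequences. The reference fragment is the first 20 residues of P35269 as shown in the alignments above.

```python
def find_internal_deletion(reference: str, variant: str):
    """Locate a single internal deletion in `variant` relative to `reference`.

    Returns (first_missing, last_missing, n_missing) using 1-based positions
    in the reference, or None if the variant is not explained by one deletion.
    """
    if len(variant) >= len(reference):
        return None
    # Length of the shared N-terminal part.
    prefix = 0
    while prefix < len(variant) and reference[prefix] == variant[prefix]:
        prefix += 1
    # Length of the shared C-terminal part, not overlapping the prefix.
    suffix = 0
    max_suffix = len(variant) - prefix
    while suffix < max_suffix and reference[-1 - suffix] == variant[-1 - suffix]:
        suffix += 1
    if prefix + suffix != len(variant):
        return None  # more than one difference; not a simple deletion
    n_missing = len(reference) - len(variant)
    return prefix + 1, prefix + n_missing, n_missing


# Toy example (hypothetical deletion of reference residues 6-12).
reference_fragment = "MAALGPSSQNVTEYVVRVPK"
toy_variant = reference_fragment[:5] + reference_fragment[12:]
print(find_internal_deletion(reference_fragment, toy_variant))
```

Where the residues flanking a deletion repeat, the reported boundaries can shift by a few positions, so the result should be read as one valid placement of the gap rather than the only one.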

TBP ASSOCIATED FACTORS

10. TAF1 (NM_004606)

1. Products amplified with primers AGACACGGACAGCGACGAA (S2; SEQ ID NO: 78) and ACCACCTGAAGCTTGCCTC (AS2; SEQ ID NO: 79)

a) 1417 aa: internal 455 aa missing (105-560 aa), new product (Δ(3)-(11)); (SEQ ID NO: 80)

...MEGESVLDDECKKHLAGLGALGLGSLITELTANEELTGTDGALVNDEGWVRSTEVKDPWNLSNDEYYYPKQQGLRGTFGGNIIQ HSIPAVELRQPFFPTHMGPIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAEMREQERQASGG...

Alignment with P21675-1 (1872 aa) (SEQ ID NO: 81 and 82) :

1 MGPGCDLLLRTAATITAAAIMSDTDSDEDSAGGGPFSLAGFLFGNINGAGQLEGESVLDD

I I I I I I I I I 1 LEGESVLDD

61 ECKKHLAGLGALGLGSLITELTANEELTGTDGALVNDEGWVRSTEDAVDYSDINEVAEDE

I I III I I Il I Il Il Il I I Il Il I I I I I I Il Il I Il III Mil Il

10 ECKKHLAGLGALGLGSLITELTANEELTGTDGALVNDEGWVRST

121 SRRYQQTMGSLQPLCHSDYDEDDYDADCEDIDCKLMPPPPPPPGPMKKDKDQDSITGEKV

54

181 DFSSSSDSESEMGPQEATQAESEDGKLTLPLAGIMQHDATKLLPSVTELFPEFRPGKVLR

54

241 FLRLFGPGKNVPSVWRSARRKRKKKHRELIQEEQIQEVECSVESEVSQKSLWNYDYAPPP

54

301 PPEQCLSDDEITMMAPVESKFSQSTGDIDKVTDTKPRVAEWRYGPARLWYDMLGVPEDGS

54

361 GFDYGFKLRKTEHEPVIKSRMIEEFRKLEENNGTDLLADENFLMVTQLHWEDDIIWDGED

54

421 VKHKGTKPQRASLAGWLPSSMTRNAMAYNVQQGFAATLDDDKPWYSIFPIDNEDLVYGRW

54

481 EDNIIWDAQAMPRLLEPPVLTLDPNDENLILEIPDEKEEATSNSPSKESKKESSLKKSRI

54

541 LLGKTGVIKEEPQQNMSQPEVKDPWNLSNDEYYYPKQQGLRGTFGGNIIQHSIPAVELRQ

I Il Il Il I! Il Il III III III III III III Il Il Il Il Il 54 EVKDPWNLSNDEYYYPKQQGLRGTFGGNIIQHSIPAVELRQ

601 PFFPTHMGPIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGGG I I I I I I I 1 I I I I I I I I I I I I I ₁ I I I I I I I I I I I I I I I I I I I I I I I I I . I I I I I I I I I I I

95 PFFPTHMGPIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAEMREQERQASGG.

b) 1375 aa: internal 497 aa missing (113-610 aa), new product (Δ(3)-(12)); (SEQ ID NO: 83)

...MEGESVLDDECKKHLAGLGALGLGSLITELTANEELTGTDGALVNDEGWVRSTEDAVDYSDIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHI KKKAKMREQERQASGG...

Alignment with P21675-1 (1872 aa); (SEQ ID NO: 81 and 84):

1 MGPGCDLLLRTAATITAAAIMSDTDSDEDSAGGGPFSLAGFLFGNINGAGQLEGESVLDD

:l Il Il I Il

1 MEGESVLDD

61 ECKKHLAGLGALGLGSLITELTANEELTGTDGALVNDEGWVRSTEDAVDYSDINEVAEDE

III Mil III Il III Il Il Il Il Il Il I I Il Il II! Ml III III III Il Il 10 ECKKHLAGLGALGLGSLITELTANEELTGTDGALVNDEGWVRSTEDAVDYSD

121 SRRYQQTMGSLQPLCHSDYDEDDYDADCEDIDCKLMPPPPPPPGPMKKDKDQDSITGEKV

62

181 DFSSSSDSESEMGPQEATQAESEDGKLTLPLAGIMQHDATKLLPSVTELFPEFRPGKVLR

62

241 FLRLFGPGKNVPSVWRSARRKRKKKHRELIQEEQIQEVECSVESEVSQKSLWNYDYAPPP

62

301 PPEQCLSDDEITMMAPVESKFSQSTGDIDKVTDTKPRVAEWRYGPARLWYDMLGVPEDGS

62

361 GFDYGFKLRKTEHEPVIKSRMIEEFRKLEENNGTDLLADENFLMVTQLHWEDDIIWDGED

62

421 VKHKGTKPQRASLAGWLPSSMTRNAMAYNVQQGFAATLDDDKPWYSIFPIDNEDLVYGRW

62

481 EDNIIWDAQAMPRLLEPPVLTLDPNDENLILEIPDEKEEATSNSPSKESKKESSLKKSRI

62

541 LLGKTGVIKEEPQQNMSQPEVKDPWNLSNDEYYYPKQQGLRGTFGGNIIQHSIPAVELRQ

62

601 PFFPTHMGPIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGGG

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Il I Il I I I I I I I I I I I I I I

62 IKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGG.

c) 1341 aa: internal 531 aa missing (81-612 aa); new product (Δ(2)-(12)); (SEQ ID NO: 84)

MEGESVLDDECKKHLAGLGALGLGSLITELRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGG

Alignment with P21675-1 (1872 aa); (SEQ ID NO: 81):

1 MGPGCDLLLRTAATITAAAIMSDTDSDEDSAGGGPFSLAGFLFGNINGAGQLEGESVLDD

: I I I I I I I I

1 MEGESVLDD

61 ECKKHLAGLGALGLGSLITELTANEELTGTDGALVNDEGWVRSTEDAVDYSDINEVAEDE

I I I I I I I I I I I I I I I I I I I I 10 ECKKHLAGLGALGLGSLITE

121 SRRYQQTMGSLQPLCHSDYDEDDYDADCEDIDCKLMPPPPPPPGPMKKDKDQDSITGEKV

30

181 DFSSSSDSESEMGPQEATQAESEDGKLTLPLAGIMQHDATKLLPSVTELFPEFRPGKVLR 30

241 FLRLFGPGKNVPSVWRSARRKRKKKHRELIQEEQIQEVECSVESEVSQKSLWNYDYAPPP 30

301 PPEQCLSDDEITMMAPVESKFSQSTGDIDKVTDTKPRVAEWRYGPARLWYDMLGVPEDGS

30 361 GFDYGFKLRKTEHEPVIKSRMIEEFRKLEENNGTDLLADENFLMVTQLHWEDDIIWDGED

30

421 VKHKGTKPQRASLAGWLPSSMTRNAMAYNVQQGFAATLDDDKPWYSIFPIDNEDLVYGRW

30

481 EDNIIWDAQAMPRLLEPPVLTLDPNDENLILEIPDEKEEATSNSPSKESKKESSLKKSRI 30

541 LLGKTGVIKEEPQQNMSQPEVKDPWNLSNDEYYYPKQQGLRGTFGGNIIQHSIPAVELRQ 30

601 PFFPTHMGPIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGGG

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

30 LRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGG.

2. Product amplified with primers GAGCTTTCTGGATGATGTAAAC (S1; SEQ ID NO: 85) and CTCCTCATCATCATACCCTTC (AS4; SEQ ID NO: 86):

1906 aa: extra internal domain (1688-1722 aa), (extra exon between exons 35 and 36); (SEQ ID NO: 87)

..MDDVNLILANSVKYNGPESQYTKTAQEIVNVCYQTLTEYDEHLTQLEKDICTAKEAALEEAELESLDPMTPGPYTPQPPDLYDT NTSLSMSRDASVFQDESNMSVLDIPSATPEKQVTQMRQGRGRLGEEDSDVDIEGYDDEEEDGKPKTPAPEGEDGDGDLADEEEGT VQQPQASVLYEDLLMSEGEDDEEDAGSDEEGDNPFS...

Alignment with P21675-1 (1872 aa); (SEQ ID NO: 88):

1561 KYQSRESFLDDVNLILANSVKYNGPESQYTKTAQEIVNVCYQTLTEYDEHLTQLEKDICT

M ll l ll ll l l l l l l ll l l l l l l l l l l ll l ll l ll l ll l l l l l l l l l l l l l l

.MDDVNLILANSVKYNGPESQYTKTAQEIVNVCYQTLTEYDEHLTQLEKDICT

1621 AKEAALEEAELESLDPMTPGPYTPQPPDLYDTNTSLSMSRDASVFQDESNMSVLDIPSAT

Il I Ii I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Il I Il I I I I I I I I I I I I I I I 53 AKEAALEEAELESLDPMTPGPYTPQPPDLYDTNTSLSMSRDASVFQDESNMSVLDIPSAT

1681 PEKQVTQ EGEDGDGDLADEEEGTVQQ

mini 11 Ii 111 Ii i Ii M Ii 111

113 PEKQVTQMRQGRGRLGEEDSDVDIEGYDDEEEDGKPKTPAPEGEDGDGDLADEEEGTVQQ

1707 PQASVLYEDLLMSEGEDDEEDAGSDEEGDNPFSAIQLSESGSDSDVGSGGIRPKQPRMLQ

I l I I I I I I M M I l I I I I I l I l I l I I l I M l I I

173 PQASVLYEDLLMSEGEDDEEDAGSDEEGDNPFS

3. Products amplified with primers GAGCTTTCTGGATGATGTAAAC (S3; SEQ ID NO: 89) and CTCCTCATCATCATACCCTTC (AS3; SEQ ID NO: 90):

a) 1466 aa: shorter protein (extra exon after exon 28); (SEQ ID NO: 91)

LEIPDEKEEATSNSPSKESKKESSLKKSRILLGKTGVIKEEPQQNHSQPEVKDPWNLSNDEYYYPKQQGLRGTFGGNIIQHSIPA VELRQPFFPTHMGPIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGGGEMFFMRTPQDLTGKDGDLIL AEYSEENGPLMMQVGMATKIKNYYKRKPGKDPGAPDCKYGETVYCHTSPFLGSLHPGQLLQAFENNLFRAPIYLHKMPETDFLII

TVHCDYLNRPHKSIHRRRTDPMVTLSSILESIINDMRDLPNTYPFHTPVNAKVVKDYYKIITRPMDLQTLRENVRKRLYPSREEF REHLELIVKNSATYNAGSFSI*

Alignment with P21675-1 (1872 aa); (SEQ ID NO: 92):

1 MGPGCDLLLRTAATITAAAIMSDTDSDEDSAGGGPFSLAGFLFGNINGAGQLEGESVLDD

I I I I I I I I I I I I I I M I I I I I I I I I I I I I I Il I I I I I I I I I I I I I I I I I I Il I I I I I I I I

1 MGPGCDLLLRTAATITAAAIMSDTDSDEDSAGGGPFSLAGFLFGNINGAGQLEGESVLDD

61 ECKKHLAGLGALGLGSLITELTANEELTGTDGALVNDEGWVRSTEDAVDYSDINEVAEDE

I I I M Il I I I I I I I I I I I I I I I I I I I I I I I Il I I I I I I I I I I I I I I I I I I I I I Il Il I I I

61 ECKKHLAGLGALGLGSLITELTANEELTGTDGALVNDEGWVRSTEDAVDYSDINEVAEDE 121 SRRYQQTMGSLQPLCHSDYDEDDYDADCEDIDCKLMPPPPPPPGPMKKDKDQDSITGEKV

I Il M Il I Il I I I I I Il I I I I I I Il Il I I I I I I I I I I Il I I I I I I I I I I I I I I Il I I I I I

121 SRRYQQTMGSLQPLCHSDYDEDDYDADCEDIDCKLMPPPPPPPGPMKKDKDQDSITGEKV

181 DFSSSSDSESEMGPQEATQAESEDGKLTLPLAGIMQHDATKLLPSVTELFPEFRPGKVLR I 11 M I I 11 I 1111 I I I I I I I I I I I I I I I I I I I 11 I I I I I 11 I 11 I I I I I I I I I I I I I I I

181 DFSSSSDSESEMGPQEATQAESEDGKLTLPLAGIMQHDATKLLPSVTELFPEFRPGKVLR

241 FLRLFGPGKNVPSVWRSARRKRKKKHRELIQEEQIQEVECSVESEVSQKSLWNYDYAPPP

I I Mil I Il I Il I I I I I I I I I I I I I I I I I I I Il Il I I I I I Il I Il I I I I I I I I I I I I I I I 241 FLRLFGPGKNVPSVWRSARRKRKKKHRELIQEEQIQEVECSVESEVSQKSLWNYDYAPPP

301 PPEQCLSDDEITMMAPVESKFSQSTGDIDKVTDTKPRVAEWRYGPARLWYDMLGVPEDGS

I I III I Il I Il I I Il Il I Il Il I Il Il I Il Il I Il I Il I Il Il I Il Il III I I Il Il I Il 301 PPEQCLSDDEITMMAPVESKFSQSTGDIDKVTDTKPRVAEWRYGPARLWYDMLGVPEDGS

361 GFDYGFKLRKTEHEPVIKSRMIEEFRKLEENNGTDLLADENFLMVTQLHWEDDIIWDGED

Mil I llll I Il Il I I I I Il Il Il Il Il Il III Il III III I Il Il Il Il I I I I I I I I I I

361 GFDYGFKLRKTEHEPVIKSRMIEEFRKLEENNGTDLLADENFLMVTQLHWEDDIIWDGED 421 VKHKGTKPQRASLAGWLPSSMTRNAMAYNVQQGFAATLDDDKPWYSIFPIDNEDLVYGRW

II III III I Il I I I III I I I I I I I I I I I I I I I I Il I Il I I Il I I I I I I Il Il I Il I I I Il

421 VKHKGTKPQRASLAGWLPSSMTRNAMAYNVQQGFAATLDDDKPWYSIFPIDNEDLVYGRW

481 EDNIIWDAQAMPRLLEPPVLTLDPNDENLILEIPDEKEEATSNSPSKESKKESSLKKSRI

I 1 Il 1111 Il 1111111111111111111111111 Il I Il 11111111111111111111

481 EDNIIWDAQAMPRLLEPPVLTLDPNDENLILEIPDEKEEATSNSPSKESKKESSLKKSRI 541 LLGKTGVIKEEPQQNMSQPEVKDPWNLSNDEYYYPKQQGLRGTFGGNIIQHSIPAVELRQ

III I I I I Il I Il I I I Il I I I I I I I I I I I I I I I I I Il I I I I Il I Il I I I I I I I I I I I I I I I 541 LLGKTGVIKEEPQQNMSQPEVKDPWNLSNDEYYYPKQQGLRGTFGGNIIQHSIPAVELRQ

601 PFFPTHMGPIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGGG I I III III Mil Il Il I I Il I I I I I I I I Il I Il Il I Il I III III III Il Il I Il Il I Il

601 PFFPTHMGPIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGGG

661 EMFFMRTPQDLTGKDGDLILAEYSEENGPLMMQVGMATKIKNYYKRKPGKDPGAPDCKYG

I I I I I I I I I I I I I I I I I I I I I I I I Il I I I I I I I Il I I I I I I I I I I Il I Il I I I I I I I I Il 661 EMFFMRTPQDLTGKDGDLILAEYSEENGPLMMQVGMATKIKNYYKRKPGKDPGAPDCKYG

721 ETVYCHTSPFLGSLHPGQLLQAFENNLFRAPIYLHKMPETDFLIIRTRQGYYIRELVDIF

I I III I I I I I Il Il I I I I I I I I I I I I I I I I I I I Il I Il I Il I Il I Il I I I I I I I I I I I I I

721 ETVYCHTSPFLGSLHPGQLLQAFENNLFRAPIYLHKMPETDFLIIRTRQGYYIRELVDIF

781 VVGQQCPLFEVPGPNSKRANTHIRDFLQVFIYRLFWKSKDRPRRIRMEDIKKAFPSHSES

I I III I I I I I I I I I I I I I I I I I I I I I I I Il I I I Il I Il I I I I I I I Il I Il Il I I I I I I Il

781 VVGQQCPLFEVPGPNSKRANTHIRDFLQVFIYRLFWKSKDRPRRIRMEDIKKAFPSHSES 841 SIRKRLKLCADFKRTGMDSNWWVLKSDFRLPTEEEIRAMVSPEQCCAYYSMIAAEQRLKD

I Il I I I Il I I I I I I I I I I Il Il I I I I I I Il Il I I I I I I Il I I I I I I I I I I Il I I I I I I I I

841 SIRKRLKLCADFKRTGMDSNWWVLKSDFRLPTEEEIRAMVSPEQCCAYYSMIAAEQRLKD

901 AGYGEKSFFAPEEENEEDFQMKIDDEVRTAPWNTTRAFIAAMKGKCLLEVTGVADPTGCG I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

901 AGYGEKSFFAPEEENEEDFQMKIDDEVRTAPWNTTRAFIAAMKGKCLLEVTGVADPTGCG

961 EGFSYVKIPNKPTQQKDDKEPQPVKKTVTGTDADLRRLSLKNAKQLLRKFGVPEEEIKKL

I I I Il I I I I I I I I I I I I I Il Il I I I I I I Il I I I Il Il I Il I I I I I Il I I I Il I I I I I I I I 961 EGFSYVKIPNKPTQQKDDKEPQPVKKTVTGTDADLRRLSLKNAKQLLRKFGVPEEEIKKL

1021 SRWEVIDVVRTMSTEQARSGEGPMSKFARGSRFSVAEHQERYKEECQRIFDLQNKVLSST

I I I I I Il I Il I I I Il I I I I I I I I I I I I I I I I I I I I Il I I I I I I I I I I I I I I I I I I I I I I I

1021 SRWEVIDVVRTMSTEQARSGEGPMSKFARGSRFSVAEHQERYKEECQRIFDLQNKVLSST

1081 EVLSTDTDSSSAEDSDFEEMGKNIENMLQNKKTSSQLSREREEQERKELQRMLLAAGSAA

I I I I I Il I I I Il I Il I I I I I I I I Il Il I I M I I I I I I I I I Il I Il I I I I I I I I I I I I I I I

1081 EVLSTDTDSSSAEDSDFEEMGKNIENMLQNKKTSSQLSREREEQERKELQRMLLAAGSAA

1141 SGNNHRDDDTASVTSLNSSATGRCLKIYRTFRDEEGKEYVRCETVRKPAVIDAYVRIRTT

I I I I I Il I Il I I I Il I I I I I I I I Il Il I I I I I I I I I I I I I Il I Il I I I I I I I I Il Il I I I

1141 SGNNHRDDDTASVTSLNSSATGRCLKIYRTFRDEEGKEYVRCETVRKPAVIDAYVRIRTT

1201 KDEEFIRKFALFDEQHREEMRKERRRIQEQLRRLKRNQEKEKLKGPPEKKPKKMKERPDL I 11 I I I I M I 11 I I I I I I I I I I I I I I I I I I I I I I I I I I I I 11 I 11 I 1111 I I I I I I I I I I

1201 KDEEFIRKFALFDEQHREEMRKERRRIQEQLRRLKRNQEKEKLKGPPEKKPKKMKERPDL

1261 KLKCGACGAIGHMRTNKFCPLYYQTNAPPSNPVAMTEEQEEELEKTVIHNDNEELIKVEG

I I I Il I I I I I I I I I I I I I I I I I I I I I I I Il I I I Il I I I I I I I I I I Il I Il Il I I I I I I I I 1261 KLKCGACGAIGHMRTNKFCPLYYQTNAPPSNPVAMTEEQEEELEKTVIHNDNEELIKVEG

1321 TKIVLGKQLIESADEVRRKSLVLKFPKQQLPPKKKRRVGTTVHCDYLNRPHKSIHRRRTD

II I III I I Il Il I Il Il Il I I I I I I I I I I I I I I I Il I I I I Il I Il I Il I I I I I Il Il I I I 1321 TKIVLGKQLIESADEVRRKSLVLKFPKQQLPPKKKRRVGTTVHCDYLNRPHKSIHRRRTD

1381 PMVTLSSILESIINDMRDLPNTYPFHTPVNAKVVKDYYKIITRPMDLQTLRENVRKRLYP

Il I Il I I I I Il I I I M I I Il I I I I I I I I Il I I I Il I I I Il I I I I I Il I I I Il I I I I I I I I

1381 PMVTLSSILESIINDMRDLPNTYPFHTPVNAKVVKDYYKIITRPMDLQTLRENVRKRLYP

1441 SREEFREHLELIVKNSATYNGPKHSLTQISQSMLDLCDEKLKEKEDKLARLEKAINPLLD

Il III III III I I Il I I I Il I:

1441 SREEFREHLELIVKNSATYNAGSFSI

1501 DDDQVAFSFILDNIVTQKMMAVPDSWPFHHPVNKKFVPDYYKVIVNPMDLETIRKNISKH

1467

1561 KYQSRESFLDDVNLILANSVKYNGPESQYTKTAQEIVNVCYQTLTEYDEHLTQLEKDICT 1467

1621 AKEAALEEAELESLDPMTPGPYTPQPPDLYDTNTSLSMSRDASVFQDESNMSVLDIPSAT 1467

1681 PEKQVTQEGEDGDGDLADEEEGTVQQPQASVLYEDLLMSEGEDDEEDAGSDEEGDNPFSA

1467

1741 IQLSESGSDSDVGSGGIRPKQPRMLQEHTRMDMEHEESMMSYEGDGGEASHGLEDSNISY

1467

1801 GSYEEPDPKSHTQDTSFSSIGGYEVSEEEEDEEEEEQRSGPSVLSQVHLSEDEEDSEDFH

1467

1861 SIAGDSDLDSDE

1467

b) 1488 aa: shorter protein, different sequence after 1450 aa (extra exon after exon 28, 44 bp shorter exon 28); (SEQ ID NO: 93)

ELTGTDGALVNDEGWVRSTEDAVDYSDINEVAEDESRRYQQTMGSLQPLCHSDYDEDDYDADCEDIDCKLMPPPPPPPGPMKKDK RSARRKRKKKHRELIQEEQIQEVECSVESEVSQKSLWNYDYAPPPPPEQCLSDDEITMMAPVESKFSQSTGDIDKVTDTKPRVAE WRYGPARLWYDMLGVPEDGSGFDYGFKLRKTEHEPVIKSRMIEEFRKLEENNGTDLLADENFLMVTQLHWEDDIIWDGE-DVKHKG

VELRQPFFPTHMGPIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGGGEMFFMRTPQDLTGKDGDLIL AEYSEENGPLMMQVGMATKIKNYYKRKPGKDPGAPDCKYGETVYCHTSPFLGSLHPGQLLQAFENNLFRAPIYLHKMPETDFLII RTRQGYYIRELVDIFWGQQCPLFEVPGPNSKRANTHIRDFLQVFIYRLFWKSKDRPRRIRMEDIKKAFPSHSESSIRKRLKLCA DFKRTGMDSNWWVLKSDFRLPTEEEIRAMVSPEQCCAYYSMIAAEQRLKDAGYGEKSFFAPEEENEEDFQMKIDDEVRTAPWNTT RAFIAAMKGKCLLEVTGVADPTGCGEGFSYVKIPNKPTQQKDDKEPQPVKKTVTGTDADLRRLSLKNAKQLLRKFGVPEEEIKKL

IDAYVRIRTTKDEEFIRKFALFDEQHREEMRKERRRIQEQLRRLKRNQEKEKLKGPPEKKPKKMKERPDLKLKCGACGAIGHMRT NKFCPLYYQTNAPPSNPVAMTEEQEEELEKTVIHNDNEELIKVEGTKIVLGKQLIESADEVRRKSLVLKFPKQQLPPKKKRRVGT TVHCDYLNRPHKSIHRRRTDPMVTLSSILESIINDMRDLPNTYPFHTPVNAKWKDYYKIITRPMDLQTLRENVRKRLYPSREEF REHLDDRWRPCLKKKKKEEETWLSEYAFHKPTRGCSLPTQSQF*

Alignment with P21675-1 (1872 aa); (SEQ ID NO: 92 and 93):

1 MGPGCDLLLRTAATITAAAIMSDTDSDEDSAGGGPFSLAGFLFGNINGAGQLEGESVLDD

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Il I I I I I I I I I I I I I I I I I I I I I I I I I I I I

1 MGPGCDLLLRTAATITAAAIMSDTDSDEDSAGGGPFSLAGFLFGNINGAGQLEGESVLDD

61 ECKKHLAGLGALGLGSLITELTANEELTGTDGALVNDEGWVRSTEDAVDYSDINEVAEDE

II I III III III Il I I Il Il Il I I Il I I Il Il I Il III Mil Il I Il III Il I I I I Il I I 61 ECKKHLAGLGALGLGSLITELTANEELTGTDGALVNDEGWVRSTEDAVDYSDINEVAEDE

121 SRRYQQTMGSLQPLCHSDYDEDDYDADCEDIDCKLMPPPPPPPGPMKKDKDQDSITGEKV

Il I I I I I I I I I I I I I I I I Il I I I I I I I I Il I I I Il I I I I I I I I I I Il I I I Il I I I I I I I I

121 SRRYQQTMGSLQPLCHSDYDEDDYDADCEDIDCKLMPPPPPPPGPMKKDKDQDSITGEKV 181 DFSSSSDSESEMGPQEATQAESEDGKLTLPLAGIMQHDATKLLPSVTELFPEFRPGKVLR

II I Il I I I I I I I I I I I I I I I I I I I I I I I I I Il M I Il I Il I I I I I I I I I I I I I I I I I I I I 181 DFSSSSDSESEMGPQEATQAESEDGKLTLPLAGIMQHDATKLLPSVTELFPEFRPGKVLR

241 FLRLFGPGKNVPSVWRSARRKRKKKHRELIQEEQIQEVECSVESEVSQKSLWNYDYAPPP I I I I I 11 I 11 I I I 1111 I I I I I I I I I I I I I I I I I 11 I I I I 11 I 11 I I I I I I I I I I I I I I I

241 FLRLFGPGKNVPSVWRSARRKRKKKHRELIQEEQIQEVECSVESEVSQKSLWNYDYAPPP

301 PPEQCLSDDEITMMAPVESKFSQSTGDIDKVTDTKPRVAEWRYGPARLWYDMLGVPEDGS

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 301 PPEQCLSDDEITMMAPVESKFSQSTGDIDKVTDTKPRVAEWRYGPARLWYDMLGVPEDGS

361 GFDYGFKLRKTEHEPVIKSRMIEEFRKLEENNGTDLLADENFLMVTQLHWEDDIIWDGED

III Mil Il III I I I I Il Il Il I I Il I I Il Il III III Mil I Il Il III Il I I I I I I I I 361 GFDYGFKLRKTEHEPVIKSRMIEEFRKLEENNGTDLLADENFLMVTQLHWEDDIIWDGED

421 VKHKGTKPQRASLAGWLPSSMTRNAMAYNVQQGFAATLDDDKPWYSIFPIDNEDLVYGRW

I I I Il I I I I I Il I I I I I I I I I I I I I I I I Il I I I Il I I I Il I I I I I Il I I I Il I I I I I I I I

421 VKHKGTKPQRASLAGWLPSSMTRNAMAYNVQQGFAATLDDDKPWYSIFPIDNEDLVYGRW

481 EDNIIWDAQAMPRLLEPPVLTLDPNDENLILEIPDEKEEATSNSPSKESKKESSLKKSRI

481 EDNIIWDAQAMPRLLEPPVLTLDPNDENLILEIPDEKEEATSNSPSKESKKESSLKKSRI

541 LLGKTGVIKEEPQQNMSQPEVKDPWNLSNDEYYYPKQQGLRGTFGGNIIQHSIPAVELRQ

I l I I I I I I I I I I I I I I I I I I I I I I I I I I I I I l I I I I I I I l I I I I I I l I I I I l I I I I I I I I

541 LLGKTGVIKEEPQQNMSQPEVKDPWNLSNDEYYYPKQQGLRGTFGGNIIQHSIPAVELRQ

601 PFFPTHMGPIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGGG

Il IM I I I I I Il Il I Il I I I Il I I I Il I I M Il III I Mill III Il I Il I I I I I Il I Il

601 PFFPTHMGPIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGGG

661 EMFFMRTPQDLTGKDGDLILAEYSEENGPLMMQVGMATKIKNYYKRKPGKDPGAPDCKYG

I I I I I Il I I I I I I Il Il I I I I I I I I I I I I I I I I I Il I I I I Il I Il I I I I I I I I Il I I I I I 661 EMFFMRTPQDLTGKDGDLILAEYSEENGPLMMQVGMATKIKNYYKRKPGKDPGAPDCKYG

721 ETVYCHTSPFLGSLHPGQLLQAFENNLFRAPIYLHKMPETDFLIIRTRQGYYIRELVDIF

II I Il I I Il I Il I I I Il I I I I I I I I I I I I I I I I I I I I I I I Il I Il I I I I I I I I I I I I I I I 721 ETVYCHTSPFLGSLHPGQLLQAFENNLFRAPIYLHKMPETDFLIIRTRQGYYIRELVDIF

781 VVGQQCPLFEVPGPNSKRANTHIRDFLQVFIYRLFWKSKDRPRRIRMEDIKKAFPSHSES

I I I I I I I I I I I I I I I I I I Il I I I I I I I I Il I I I Il I I I I I I I I I I Il I I I I I I I I I I I I I

781 VVGQQCPLFEVPGPNSKRANTHIRDFLQVFIYRLFWKSKDRPRRIRMEDIKKAFPSHSES

841 SIRKRLKLCADFKRTGMDSNWWVLKSDFRLPTEEEIRAMVSPEQCCAYYSMIAAEQRLKD

I I I I I I I l I I I I I I I I M I l I I I I I I I I I l I I I I l I I I I I I I I I I I l I I I I I I I I I I I I I

841 SIRKRLKLCADFKRTGMDSNWWVLKSDFRLPTEEEIRAMVSPEQCCAYYSMIAAEQRLKD

901 AGYGEKSFFAPEEENEEDFQMKIDDEVRTAPWNTTRAFIAAMKGKCLLEVTGVADPTGCG

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

901 AGYGEKSFFAPEEENEEDFQMKIDDEVRTAPWNTTRAFIAAMKGKCLLEVTGVADPTGCG

961 EGFSYVKIPNKPTQQKDDKEPQPVKKTVTGTDADLRRLSLKNAKQLLRKFGVPEEEIKKL

I I Il I III I I I I I I I I I I Il Il I I I I I I Il I I I Il I I I Il I I I I I Il I I I Il I I I I I I I I

961 EGFSYVKIPNKPTQQKDDKEPQPVKKTVTGTDADLRRLSLKNAKQLLRKFGVPEEEIKKL

1021 SRWEVIDVVRTMSTEQARSGEGPMSKFARGSRFSVAEHQERYKEECQRIFDLQNKVLSST

IMIIMIMIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII

1021 SRWEVIDVVRTMSTEQARSGEGPMSKFARGSRFSVAEHQERYKEECQRIFDLQNKVLSST

1081 EVLSTDTDSSSAEDSDFEEMGKNIENMLQNKKTSSQLSREREEQERKELQRMLLAAGSAA

I I I Il I I I I I I I I I I I I I I I I I I I I I I I Il Il I I I Il I Il M I M I I I I I I I I I I I I I I I

1081 EVLSTDTDSSSAEDSDFEEMGKNIENMLQNKKTSSQLSREREEQERKELQRMLLAAGSAA

1141 SGNNHRDDDTASVTSLNSSATGRCLKIYRTFRDEEGKEYVRCETVRKPAVIDAYVRIRTT

I I I I I Il I Il I I I Il Il I I I I I I Il Il I I I I I I I Il I Il I Il I Il I I I I I I I I I I I I I I I

1141 SGNNHRDDDTASVTSLNSSATGRCLKIYRTFRDEEGKEYVRCETVRKPAVIDAYVRIRTT

1201 KDEEFIRKFALFDEQHREEMRKERRRIQEQLRRLKRNQEKEKLKGPPEKKPKKMKERPDL

I I I Il I I I I I Il Il I I I I I I I I I I I I I I I I I I I Il I Il I I I I I I I Il I Il I I I I I I I I I I

1201 KDEEFIRKFALFDEQHREEMRKERRRIQEQLRRLKRNQEKEKLKGPPEKKPKKMKERPDL

1261 KLKCGACGAIGHMRTNKFCPLYYQTNAPPSNPVAMTEEQEEELEKTVIHNDNEELIKVEG

I I I Il Il I Il I I I I I I I I Il M I I I I I I I I I I I I I I I I I I I I I I I Il I I I I I I I I I I I I I

1261 KLKCGACGAIGHMRTNKFCPLYYQTNAPPSNPVAMTEEQEEELEKTVIHNDNEELIKVEG

1321 TKIVLGKQLIESADEVRRKSLVLKFPKQQLPPKKKRRVGTTVHCDYLNRPHKSIHRRRTD

I I III I I I I I Il Il I I I I I I I I I I I I I I I I I I I Il I Il I Il I Il I Il I Il I I I I I I I I Il

1321 TKIVLGKQLIESADEVRRKSLVLKFPKQQLPPKKKRRVGTTVHCDYLNRPHKSIHRRRTD

1381 PMVTLSSILESIINDMRDLPNTYPFHTPVNAKVVKDYYKIITRPMDLQTLRENVRKRLYP

III I I Il Il I Il I I I I I I I I I I I I I I I I I I I Il Il I I I I I Il I I I I I I Il I I I I I I I I I I

1381 PMVTLSSILESIINDMRDLPNTYPFHTPVNAKVVKDYYKIITRPMDLQTLRENVRKRLYP

1441 SREEFREHLELIVKNSATYNGPKHSLTQISQSMLDLCDEKLKEKEDKLARLEKAINPLLD

I I I I I I I I I : : ■ ■ : .

1441 SREEFREHLDDRWRPCLKKKKKEEETWLSEYAFHKPTRGCSLPTQSQF

1501 DDDQVAFSFILDNIVTQKMMAVPDSWPFHHPVNKKFVPDYYKVIVNPMDLETIRKNISKH

1489

1561 KYQSRESFLDDVNLILANSVKYNGPESQYTKTAQEIVNVCYQTLTEYDEHLTQLEKDICT

1489

1621 AKEAALEEAELESLDPMTPGPYTPQPPDLYDTNTSLSMSRDASVFQDESNMSVLDIPSAT

1489

1681 PEKQVTQEGEDGDGDLADEEEGTVQQPQASVLYEDLLMSEGEDDEEDAGSDEEGDNPFSA

1489

1741 IQLSESGSDSDVGSGGIRPKQPRMLQEHTRHDMEHEESMMSYEGDGGEASHGLEDSNISY

1489

1801 GSYEEPDPKSNTQDTSFSSIGGYEVSEEEEDEEEEEQRSGPSVLSQVHLSEDEEDSEDFH

1489

1861 SIAGDSDLDSDE

1489

c) 1185 aa: shorter protein, different sequence after 1461 aa (extra exon after exon 28); (SEQ ID NO: 94)

...MVTLSSILESIINDMRDLPNTYPFHTPVNAKVVKDYYKIITRPMDLQTLRENVRKRLYPSREEFREHLELIVKNSATYNGKNQM FRDCKGHCSDPYSLLALHSD*

Alignment with P21675-1 (1872 aa); (SEQ ID NO: 95):

1321 TKIVLGKQLIESADEVRRKSLVLKFPKQQLPPKKKRRVGTTVHCDYLNRPHKSIHRRRTD

1

1381 PMVTLSSILESIINDMRDLPNTYPFHTPVNAKVVKDYYKIITRPMDLQTLRENVRKRLYP

I IMItMIII Il UII Il Il Il Il Il Il III Il III UII III HIM Il HUH Il

1 .MVTLSSILESIINDMRDLPNTYPFHTPVNAKVVKDYYKIITRPMDLQTLRENVRKRLYP

1441 SREEFREHLELIVKNSATYNGPKHSLTQISQSMLDLCDEKLKEKEDKLARLEKAINPLLD

III I III III Il I Il Il Il Il I :

60 SREEFREHLELIVKNSATYNGKNQMFRDCKGHCSDPYSLLALHSD

1501 DDDQVAFSFILDNIVTQKMMAVPDSWPFHHPVNKKFVPDYYKVIVNPMDLETIRKNISKH

105

1561 KYQSRESFLDDVNLILANSVKYNGPESQYTKTAQEIVNVCYQTLTEYDEHLTQLEKDICT

105

1621 AKEAALEEAELESLDPMTPGPYTPQPPDLYDTNTSLSMSRDASVFQDESNMSVLDIPSAT

105

1681 PEKQVTQEGEDGDGDLADEEEGTVQQPQASVLYEDLLMSEGEDDEEDAGSDEEGDNPFSA

105

1741 IQLSESGSDSDVGSGGIRPKQPRMLQENTRMDMENEESMHSYEGDGGEASHGLEDSNISY

105

1801 GSYEEPDPKSHTQDTSFSSIGGYEVSEEEEDEEEEEQRSGPSVLSQVHLSEDEEDSEDFH

105

1861 SIAGDSDLDSDE

105

d) 1460 aa, shorter protein (shorter exon 28, extra exon after exon 28); (SEQ ID NO: 96)

MGPGCDLLLRTAATITAAAIMSDTDSDEDSAGGGPFSLAGFLFGNINGAGQLEGESVLDDECKKHLAGLGALGLGSLITELTANE ELTGTDGALVNDEGWVRSTEDAVDYSDINEVAEDESRRYQQTHGSLQPLCHSDYDEDDYDADCEDIDCKLMPPPPPPPGPHKKDK DQDSITGEKVDFSSSSDSESEMGPQEATQAESEDGKLTLPLAGIMQHDATKLLPSVTELFPEFRPGKVLRFLRLFGPGKNVPSVW RSARRKRKKKHRELIQEEQIQEVECSVESEVSQKSLWNYDYAPPPPPEQCLSDDEITMMAPVESKFSQSTGDIDKVTDTKPRVAE WRYGPARLWYDMLGVPEDGSGFDYGFKLRKTEHEPVIKSRMIEEFRKLEENNGTDLIADENFLMVTQLHWEDDIIWDGEDVKHKG TKPQRASLAGWLPSSMTRNAMAYNVQQGFAATLDDDKPWYSIFPIDNEDLVYGRWEDNIIWDAQAMPRLLEPPVLTLDPNDENLI LEIPDEKEEATSNSPSKESKKESSLKKSRILLGKTGVIKEEPQQNMSQPEVKDPWNLSNDEYYYPKQQGLRGTFGGNIIQHSIPA VELRQPFFPTHMGPIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGGGEMFFMRTPQDLTGKDGDLIL

AEYSEENGPLMMQVGMATKIKNYYKRKPGKDPGAPDCKYGETVYCHTSPFLGSLHPGQLLQAFENNLFRAPIYLHKMPETDFLII RTRQGYYIRELVDIFVVGQQCPLFEVPGPNSKRANTHIRDFLQVFIYRLFWKSKDRPRRIRMEDIKKAFPSHSESSIRKRLKLCA DFKRTGMDSNWWVLKSDFRLPTEEEIRAMVSPEQCCAYYSMIAAEQRLKDAGYGEKSFFAPEEENEEDFQMKIDDEVRTAPWNTT RAFIAAMKGKCLLEVTGVADPTGCGEGFSYVKIPNKPTQQKDDKEPQPVKKTVTGTDADLRRLSLKNAKQLLRKFGVPEEEIKKL SRWEVIDVVRTMSTEQARSGEGPMSKFARGSRFSVAEHQERYKEECQRIFDLQNKVLSSTEVLSTDTDSSSAEDSDFEEMGKNIE NMLQNKKTSSQLSREREEQERKELQRMLLAAGSAASGNNHRDDDTASVTSLNSSATGRCLKIYRTFRDEEGKEYVRCETVRKPAV IDAYVRIRTTKDEEFIRKFALFDEQHREEMRKERRRIQEQLRRLKRNQEKEKLKGPPEKKPKKMKERPDLKLKCGACGAIGHMRT NKFCPLYYQTNAPPSNPVAMTEEQEEELEKTVIHNDNEELIKVEGTKIVLGKQLIESADEVRRKSLVLKFPKQQLPPKKKRRVGT TVHCDYLNRPHKSIHRRRTDPMVTLSSILESIINDMRDLPNTYPFHTPAWMTDGDPVSKRKKKKKKRGFQSMLSTSPLGVALCPH RANSEWRGLPPRSLL*

e) 1485 aa, shorter protein (longer exon 28, extra exon after exon 28); (SEQ ID NO: 97)

DQDSITGEKVDFSSSSDSESEMGPQEATQAESEDGKLTLPLAGIMQHDATKLLPSVTELFPEFRPGKVLRFLRLFGPGKNVPSVW RSARRKRKKKHRELIQEEQIQEVECSVESEVSQKSLWNYDYAPPPPPEQCLSDDEITMMAPVESKFSQSTGDIDKVTDTKPRVAE WRYGPARLWYDMLGVPEDGSGFDYGFKLRKTEHEPVIKSRMIEEFRKLEENNGTDLLADENFLMVTQLHWEDDIIWDGEDVKHKG TKPQRASLAGWLPSSMTRNAMAYNVQQGFAATLDDDKPWYSIFPIDNEDLVYGRWEDNIIWDAQAMPRLLEPPVLTLDPNDENLI LEIPDEKEEATSNSPSKESKKESSLKKSRILLGKTGVIKEEPQQNMSQPEVKDPWNLSNDEYYYPKQQGLRGTFGGNI IQHSIPA VELRQPFFPTHMGPIKLRQFHRPPLKKYSFGALSQPGPHSVQPLLKHIKKKAKMREQERQASGGGEMFFMRTPQDLTGKDGDLIL

DFKRTGMDSNWWVLKSDFRLPTEEEIRAMVSPEQCCAYYSMIAAEQRLKDAGYGEKSFFAPEEENEEDFQMKIDDEVRTAPWNTT

IDAYVRIRTTKDEEFIRKFALFDEQHREEMRKERRRIQEQLRRLKRNQEKEKLKGPPEKKPKKMKERPDLKLKCGACGAIGHMRT

NKFCPLYYQTNAPPSNPVAMTEEQEEELEKTVIHNDNEELI

TVHCDYLNRPHKSIHRRRTDPMVTLSSILESIINDMRDLPN

REHLELIVKNSATYNGKNQMFRDCKGHCSDPYSLLALNSD*
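The lengths stated for the TAF1 internal-deletion and insertion products can be cross-checked against the 1872 aa length of P21675-1 by simple arithmetic. The sketch below is illustrative only: the function and table names are hypothetical, the missing-residue counts are those stated in the text, and the insertion size of 34 residues for product 2 is an assumption computed as the difference of the stated range (1688-1722), in the same way that the deletion ranges above are quoted (for example, 105-560 for 455 missing residues).

```python
REFERENCE_LENGTH = 1872  # length of P21675-1 as given in the alignment headers


def expected_variant_length(reference_length: int, missing: int = 0, extra: int = 0) -> int:
    """Length of a variant that lacks `missing` residues and gains `extra` residues."""
    return reference_length - missing + extra


# (product label, residues missing, residues added, length stated in the text)
STATED_PRODUCTS = [
    ("1a", 455, 0, 1417),
    ("1b", 497, 0, 1375),
    ("1c", 531, 0, 1341),
    ("2", 0, 34, 1906),  # 34 = 1722 - 1688, an assumption drawn from the stated range
]


def check_stated_lengths():
    """Map each product label to True when the stated length matches the arithmetic."""
    return {
        label: expected_variant_length(REFERENCE_LENGTH, missing, extra) == stated
        for label, missing, extra, stated in STATED_PRODUCTS
    }


print(check_stated_lengths())
```

The same check does not apply directly to the products described as having a different C-terminus, because their lengths depend on the new reading frame rather than on a simple count of removed residues.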

11. TAF2 (NM_003184)

Product amplified with primers CCACTAGAACCTGGTCAAATAC (S1; SEQ ID NO: 98) and GACTGAGAGTGGAGCGCTTG (AS1; SEQ ID NO: 99)

1251 aa: extra 52 aa inside the protein, extra exon between exons 23 and 24, (SEQ ID NO: 100)

...LSRPSCLPLPELGLVLNLKEKKAVLNPTIIPESVAGNQEAANNPSSHPQLVGFQNPEPDHLAKEASCNISAHQQGVKRKSDTPL GSPLEPGQILEKMEDSSKVKLKIRFSSSQDEEEIDMDTVHDSQAFISHHLNMLERPSTPGLSKYRPASSRSALIPQHSAGCDSTP TTKPQWSLELARKGTGKEQAPLEMSMHPAASAPLSVFTKESTASKHSDHHHHHHHEHKKKKKKHKHKHKHKHKHDSKEKDKEPFT FSSPASGRSIRSPSLSD*

Alignment with O60668 (1199 aa); (SEQ ID NO: 101):

961 TSHDWRLRCGAVDLYFTLFGLSRPSCLPLPELGLVLNLKEKKAVLNPTIIPESVAGNQEA

961 TSHDWRLRCGAVDLYFTLFGLSRPSCLPLPELGLVLNLKEKKAVLNPTIIPESVAGNQEA

1021 ANNPSSHPQLVGFQNP

I I III I Il Il I I I I I I

1021 ANNPSSHPQLVGFQNPEDDHLAKEASCNISAHQQGVKRKSDTPLGSPLEPGQILEKNEDS

1037 FSSSQDEEEIDMDTVHDSQAFISHHLNMLERPSTPGLSKYRPASSRSALIPQ

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

1081 SKVKLKIRFSSSQDEEEIDMDTVHDSQAFISHHLNMLERPSTPGLSKYRPASSRSALIPQ

1089 HSAGCDSTPTTKPQWSLELARKGTGKEQAPLEMSMHPAASAPLSVFTKESTASKHSDHHH

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Il I I I I I I I I I I I I I Il I I I I I I I I I I

1141 HSAGCDSTPTTKPQWSLELARKGTGKEQAPLEMSMHPAASAPLSVFTKESTASKHSDHHH

1149 HHHHEHKKKKKKHKHKHKHKHKHDSKEKDKEPFTFSSPASGRSIRSPSLSD

I I I I I Il I Il I Il I I I I I Il I I I I I I I I Il Il I Il I I I I I I I I I I Il I I I I

1201 HHHHEHKKKKKKHKHKHKHKHKHDSKEKDKEPFTFSSPASGRSIRSPSLSD

12. TAF4 (Y11354)

Product amplified with primers ATCTGCTGGACGAGGTCTTCT (S1; SEQ ID NO: 102) and TATGGTAGTTGGGGTCACCTG (AS1; SEQ ID NO: 103):

653 aa: internal 432 aa missing (22-454 aa), deletion in exon 1, 65-1355; (SEQ ID NO: 104)

MAAGSDLLDEVFFNSEVDEKVGMVLVRSENGQLLMIPQQALAQMQAQAHAQPQTTMAPRPATPTSAPPVQISTVQAPGTPIIARQ VTPTTIIKQVSQAQTTVQPSATLQRSPGVQPQLVLGGAAQTASLGTATAVQTGTPQRTVPGATTTSSAATETMENVKKCKNFLST LIKLASSGKQSTETAANVKELVQNLLDGKIEAEDFTSRLYRELNSSPQPYLVPFLKRSLPALRQLTPDSAAFIQQSQQQPPPPTS QATTALTAVVLSSSVQRTAGKTAATVTSALQPPVLSLTQPTQVGVGKQGQPTPLVIQQPPKPGALIRPPQVTLTQTPMVALRQPH NRIMLTTPQQIQLNPLQPVPVVKPAVLPGTKALSAVSAQAAAAQKNKLKEPGGGSFRDDDDINDVASMAGVNLSEESARILATNS ELVGTLTRSCKDETFLLQAPLQRRILEIGKKHGITELHPDVVSYVSHATQQRLQNLVEKISETAQQKNFSYKDDDRYEQASDVRA QLKFFEQLDQIEKQRKDEQEREILMRAAKSRSRQEDPEQLRLKQKAKEMQQQELAQMRQRDANLTALAAIGPRKKRKVDCPGPGS GAEGSGPGSVVPGSSGVGTPRQFTRQRITRVNLRDLIFCLENERETSHSLLLYKAFLK*

Alignment with O00268 (1085 aa); (SEQ ID NO: 105):

1 MAAGSDLLDEVFFNSEVDEKVVSDLVGSLESQLAASAAHHHHLAPRTPEVRAAAAGALGN

I I I I I I I I I I I I I I I I I I I I I 1 MAAGSDLLDEVFFNSEVDEKV

61 HVVSGSPAGAAGAGPAAPAEGAPGAAPEPPPAGRARPGGGGPQRPGPPSPRRPLVPAGPA

22

121 PPAAKLRPPPEGSAGSCAPVPAAAAVAAGPEPAPAGPAKPAGPAALAARAGPGPGPGPGP

22

181 GPGPGPGKPAGPGAAQTLNGSAALLNSHHAAAPAVSLVNNGPAALLPLPKPAAPGTVIQT

22

241 PPFVGAAAPPAPAAPSPPAAPAPAAPAAAPPPPPPAPATLARPPGHPAGPPTAAPAVPPP

22

301 AAAQNGGSAGAAPAPAPAAGGPAGVSGQPGPGAAAAAPAPGVKAESPKRVVQAAPPAAQT

22

361 LAASGPASTAASMVIGPTMQGALPSPAAVPPPAPGTPTGLPKGAAGAVTQSLSRTPTATT

22

421 SGIRATLTPTVLAPRLPQPPQNPTNIQNFQLPPGMVLVRSENGQLLMIPQQALAQMQAQA

Il I I I I I I I I I I I I I Il I I I I I I I I I I

22 GMVLVRSENGQLLMIPQQALAQMQAQA

481 HAQPQTTMAPRPATPTSAPPVQISTVQAPGTPIIARQVTPTTIIKQVSQAQTTVQPSATL

111111111111111111111111111111111111111111111111111111111111

49 HAQPQTTMAPRPATPTSAPPVQISTVQAPGTPIIARQVTPTTIIKQVSQAQTTVQPSATL

541 QRSPGVQPQLVLGGAAQTASLGTATAVQTGTPQRTVPGATTTSSAATETMENVKKCKNFL

I 11 I I 11 I I I I I I I I I I I 1111 I I I I I I 11 I I I 11 I I I I I I I I I I 11 I 11 I I I I I I I I 11 109 QRSPGVQPQLVLGGAAQTASLGTATAVQTGTPQRTVPGATTTSSAATETMENVKKCKNFL

601 STLIKLASSGKQSTETAANVKELVQNLLDGKIEAEDFTSRLYRELNSSPQPYLVPFLKRS

169 STLIKLASSGKQSTETAANVKELVQNLLDGKIEAEDFTSRLYRELNSSPQPYLVPFLKRS

661 LPALRQLTPDSAAFIQQSQQQPPPPTSQATTALTAVVLSSSVQRTAGKTAATVTSALQPP

229 LPALRQLTPDSAAFIQQSQQQPPPPTSQATTALTAVVLSSSVQRTAGKTAATVTSALQPP

721 VLSLTQPTQVGVGKQGQPTPLVIQQPPKPGALIRPPQVTLTQTPMVALRQPHNRIMLTTP

I I I I I Il I I Il I I I I I I I I I I I I I I I I I I I I Il Il I I I I I Il I Il I I I Il I I I I I I I I I I

289 VLSLTQPTQVGVGKQGQPTPLVIQQPPKPGALIRPPQVTLTQTPMVALRQPHNRIMLTTP

781 QQIQLNPLQPVPVVKPAVLPGTKALSAVSAQAAAAQKNKLKEPGGGSFRDDDDINDVASM

I Il I Il I Il I Il Il I I I I I I I I I I I I I I I I I I I I I I I I I I Il I Il I Il I I I I I I I I I I I I

349 QQIQLNPLQPVPVVKPAVLPGTKALSAVSAQAAAAQKNKLKEPGGGSFRDDDDINDVASM

841 AGVNLSEESARILATNSELVGTLTRSCKDETFLLQAPLQRRILEIGKKHGITELHPDVVS

I I I I I IM III Il Il III Il Il I Il I I I Il Il III III III I Il III III M I I I I I I M

409 AGVNLSEESARILATNSELVGTLTRSCKDETFLLQAPLQRRILEIGKKHGITELHPDVVS

901 YVSHATQQRLQNLVEKISETAQQKNFSYKDDDRYEQASDVRAQLKFFEQLDQIEKQRKDE Il I Il I I I I I Il Il I I I I I I I I I I I I I I I I I Il Il I I I I I Il I Il I I I Il I I I I I I I I I I

469 YVSHATQQRLQNLVEKISETAQQKNFSYKDDDRYEQASDVRAQLKFFEQLDQIEKQRKDE

961 QEREILMRAAKSRSRQEDPEQLRLKQKAKEMQQQELAQMRQRDANLTALAAIGPRKKRKV

I I I I I I I I Il I I I I I I I I I I I I I I I I I I I I Il Il I Il I Il I I I I I I I I I I I I I I I I I I I I 529 QEREILMRAAKSRSRQEDPEQLRLKQKAKEMQQQELAQMRQRDANLTALAAIGPRKKRKV

1021 DCPGPGSGAEGSGPGSVVPGSSGVGTPRQFTRQRITRVNLRDLIFCLENERETSHSLLLY

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Il I I I I I I Il I I I I I Il I I I Il I I I I I I I I

589 DCPGPGSGAEGSGPGSVVPGSSGVGTPRQFTRQRITRVNLRDLIFCLENERETSHSLLLY

1081 KAFLK

I I I I I

649 KAFLK

13. TAF5L (NM_014409)

Product amplified with primers CAGTCATGAAACGAGTGCGTA (S1; SEQ ID NO: 106) and GAATCTCTCATCTACAGACAAC (AS4; SEQ ID NO: 107): 85 aa, shorter protein (similar to BE646579 but only one extra exon after exon 3) (SEQ ID NO: 108)

MKRVRTEQIQMAVSCYLKRRQYVDSDGPLKQGLRLSQTAEEMAANLTVQSESGCANIVSAAPCQAEPQQYEVQFGRLRNFLTGCL

Alignment with O75529 (589 aa); (SEQ ID NO: 109):

1 MKRVRTEQIQMAVSCYLKRRQYVDSDGPLKQGLRLSQTAEEMAANLTVQSESGCANIVSA

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

1 MKRVRTEQIQMAVSCYLKRRQYVDSDGPLKQGLRLSQTAEEMAANLTVQSESGCANIVSA

61 APCQAEPQQYEVQFGRLRNFLTDSDSQHSHEVMPLLYPLFVYLHLNLVQNSPKSTVESFY

||||||||||||||||||||||

61 APCQAEPQQYEVQFGRLRNFLTGCL
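The point at which this product diverges from the reference protein can be checked programmatically. The following is a minimal sketch, assuming the 85 aa product and the first 120 residues of the reference exactly as printed in this section; the helper name `shared_prefix_length` is illustrative and not part of the original disclosure.

```python
def shared_prefix_length(a: str, b: str) -> int:
    """Return the number of leading residues that two sequences share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# 85 aa product (SEQ ID NO: 108), as printed above.
isoform = (
    "MKRVRTEQIQMAVSCYLKRRQYVDSDGPLKQGLRLSQTAEEMAANLTVQSESGCANIVSA"
    "APCQAEPQQYEVQFGRLRNFLTGCL"
)
# First 120 residues of the reference protein, as printed in the alignment.
reference_head = (
    "MKRVRTEQIQMAVSCYLKRRQYVDSDGPLKQGLRLSQTAEEMAANLTVQSESGCANIVSA"
    "APCQAEPQQYEVQFGRLRNFLTDSDSQHSHEVMPLLYPLFVYLHLNLVQNSPKSTVESFY"
)

shared = shared_prefix_length(isoform, reference_head)
unique_tail = isoform[shared:]
print(len(isoform), shared, unique_tail)
```

Under these assumptions the product shares its first 82 residues with the reference and ends in a 3-residue isoform-specific tail (GCL), consistent with the "shorter protein" description.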

14. TAF6L (NM_006473)

Product amplified with primers CCTATTTCGTAATCCGCACCT (S2; SEQ ID NO: 110) and ACTGACTCAGAGCGGCAAGTA (AS2; SEQ ID NO: 111): different C-terminus after 460 aa (new product, deletion in exon 11); (SEQ ID NO: 112)

. .MCLGPYVRCLVGSVLYCVLEPLAASINPLNDHWTLRDGAALLLSHIFWTHGDLVSGLYQHI LLSLQKILADPVRPLCCHYGAVV

DSLLFQESSSGGGAEPSFGSGLPLPPGGAGPEDPSLSVTLADIYRELYAFFGDSLATRFGTGIALRAETAHDRPYQPPRPFVGALGLIAVLAALSQ

Alignment with Q9Y6J9 (622 aa); (SEQ ID NO: 113):

181 QDLQTNSKIGALLPYFVYVVSGVKSVSHDLEQLHRLLQVARSLFRNPHLCLGPYVRCLVG

:|||||||||||

1 MCLGPYVRCLVG

241 SVLYCVLEPLAASINPLNDHHTLRDGAALLLSHIFWTHGDLVSGLYQHILLSLQKILADP

I I I Il I I I I I Il Il I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Il I I I Il I I I I I I I I I I

13 SVLYCVLEPLAASINPLNDHWTLRDGAALLLSHIFWTHGDLVSGLYQHILLSLQKILADP

301 VRPLCCHYGAVVGLHALGWKAVERVLYPHLSTYWTNLQAVLDDYSVSNAQVKADGHKVYG

73 VRPLCCHYGAVVGLHALGWKAVERVLYPHLSTYWTNLQAVLDDYSVSNAQVKADGHKVYG

361 AILVAVERLLKMKAQAAEPNRGGPGGRGCRRLDDLPWDSLLFQESSSGGGAEPSFGSGLP Il I Il I I I I I Il I I I I I I I I I I I I I I I I I I I I I I I I I Il I Il I I I I I I I I I I I I I I I I I I

133 AILVAVERLLKMKAQAAEPNRGGPGGRGCRRLDDLPWDSLLFQESSSGGGAEPSFGSGLP

421 LPPGGAGPEDPSLSVTLADIYRELYAFFGDSLATRFGTGQPAPTAPRPPGDKKEPAAAPD

Il I Il I I I I I Il I I I Il Il I I I I I I I I I I I I I I I I Il I I . 1 193 LPPGGAGPEDPSLSVTLADIYRELYAFFGDSLATRFGTGLALRAETAHDRPYQPPRPPVG

481 SVRKHPQLTASAIVSPHGDESPRGSGGGGPASASGPAASESRPLPRVHRARGAPRQQGPG

.. : I I . 253 ALGLLAVLAALSQ

15. TAF7L (NM_024885)

Products amplified with primers AGACATGAGTGAAAGCCAGGA (S1; SEQ ID NO: 114) and CATAAGGCAACTGAAGGGACA (AS1; SEQ ID NO: 115):

302 aa: 74 internal aa missing (232-304 aa) , Δ10, alternative 1st exon, (SEQ ID NO: 116) ,

MSESQDEVPDEVENQFILRLPLEHACTVRNLARSQSVKMKDKLKIDLLPDGRHAVVEVEDVPLAAKLVDLPCVIESLRTLDKKTF YKTADISQMLVCTADGDIHLSPEEPAASTDPNIVRKKERGREEKCVWKHGITPPLKNVRKKRFRKTQKKVPDVKEMEKSSFTEYI ESPDVENEVKRLLRSDAEAVSTRWEVIAEDGTKEIESQGSIPGFLISSGMSSHKQGHTSSVMEIQKQIEKKEKKLHKIQNKAQRQ KDLIHKVENLTLKNHFQSVLEQLELQEKQKNEKLISLQEQLQRFLKK*

Alignment with Q5H9L6 (462 aa) ; (SEQ ID NO: 117):

1 MSESQDEVPDEVENQFILRLPLEHACTVRNLARSQSVKMKDKLKIDLLPDGRHAVVEVED

I I I I I Il I Il I I I Il I I I I I I I I I I I I I I I Il Il I Il I Il I I I I I I I I I I Il I I I I I I I I

1 MSESQDEVPDEVENQFILRLPLEHACTVRNLARSQSVKMKDKLKIDLLPDGRHAVVEVED

61 VPLAAKLVDLPCVIESLRTLDKKTFYKTADISQMLVCTADGDIHLSPEEPAASTDPNIVR

61 VPLAAKLVDLPCVIESLRTLDKKTFYKTADISQMLVCTADGDIHLSPEEPAASTDPNIVR

121 KKERGREEKCVWKHGITPPLKNVRKKRFRKTQKKVPDVKEMEKSSFTEYIESPDVENEVK I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I

121 KKERGREEKCVWKHGITPPLKNVRKKRFRKTQKKVPDVKEMEKSSFTEYIESPDVENEVK

181 RLLRSDAEAVSTRWEVIAEDGTKEIESQGSIPGFLISSGMSSHKQGHTSSEYDMLREMFS

I I III I Il I I Il I I I I I I I I I I I I I I I I Il I I I III I I I I Il I Il Il I Il 181 RLLRSDAEAVSTRWEVIAEDGTKEIESQGSIPGFLISSGMSSHKQGHTSS

241 DSRSNNDDDEDEDDEDEDEDEDEDEDEDKEEEEEDCSEEYLERQLQAEFIESGQYRANEG

231

301 TSSIVMEIQKQIEKKEKKLHKIQNKAQRQKDLIMKVENLTLKNHFQSVLEQLELQEKQKN

231 VMEIQKQIEKKEKKLHKIQNKAQRQKDLIMKVENLTLKNHFQSVLEQLELQEKQKN

361 EKLISLQEQLQRFLKK

I I I Il I I I I I I I I I Il

287 EKLISLQEQLQRFLKK

16. TAF8 (AF465841)

Product amplified with primers CACTACGCCAGAACAAGATGG (S1; SEQ ID NO: 118) and GTTTGCTTCCGTGTGTGTCTT (AS1; SEQ ID NO: 119):

214 aa: shorter/different C-terminus (after 164 aa) ; exons 6-8 spliced out; (SEQ ID NO: 120)

ENTSVLQQNPSLSGSRNGEENIIDNPYLRPVKKPKIRRKKPDTF*

Alignment with Q7Z7C8 (338 aa); (SEQ ID NO: 121):

1 MADAAATAGAGGSGTRSGSKQSTNPADNYHLARRRTLQWVSSLLTEAGFESAEKASVET

5 Il I Il I I I I Il I I I I I I I I I I I I I I I I I I I I Il III I Il I Il I Il I Il Il I I I Il Il I I I

1 MADAAATAGAGGSGTRSGSKQSTNPADNYHLARRRTLQWVSSLLTEAGFESAEKASVET

61 LTEMLQSVISEIGRSAKSYCEHTARTQPTLSDIWTLVEMGFNVDTLPAYAKRSQRMVIT

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Il I I I I I I I I I I I I I I I I I I I I I I I 10 61 LTEMLQSYISEIGRSAKSYCEHTARTQPTLSDIWTLVEMGFNVDTLPAYAKRSQRMVIT

121 APPVTNQPVTPKALTAGQNRPHPPHIPSHFPEFPDPHTYIKTPTYREPVSDYQVLREKAA

II I I I I I I Il I I I I I I I I I I I I I I I I I I I I Il M I Il I Il I I I : . :. 121 APPVTNQPVTPKALTAGQNRPHPPHIPSHFPEFPDPHTYIKTPEDSGAEKENTSVLQQNP

T D

181 SQRRDVERALTRFMAKTGETQSLFKDDVSTFPLIAARPFTIPYLTALLPSELEMQQMEET

I I 181 SLSGSRNGEENIIDMPYLRPVKKPKIRRKKPDTF

241 DSSEQDEQTDTENLALHISMIESRSVTQAGVQWQDLGSLQPPPPGFKRFSSLSLLSSWNY

215

301 RRILEPRRRTPLSCSRTPPCRVAGMGRRTSSITLICGR


215

17. TAF15 (NM_139215)

Product amplified with primers TTGATGACCCTCCTTCAGCTA (S2; SEQ ID NO: 122) and GCAAAACTCTGGCAATTTCAC (AS2; SEQ ID NO: 123):

457 aa: different C-terminus after 393aa (Δ15); (SEQ ID NO: 124);

MSDSGSYGQSGGEQQSYSTYGNPGSQGYGQASQSYSGYGQTTDSSYGQNYSGYSSYGQSQSGYSQSYGGYENQKQSSYSQQPYNN

GEDNRGYGGSQGGGRGRGGYDKDGRGPMTGSSGGDRGGFKNFGGHRDYGPRTDADSESDNSDNNTIFVQGLGEGVSTDQVGEFFK QIGIIKTNKKTGKPMINLYTDKDTGKPKGEATV RGRGGFQGRGGDPKSGDWVCPNPSCGNMNFARR LPAAFLVASSWVVKLSDIWIFIWVGGLGQFFF*

Alignment with Q92804-1 (592 aa) ; (SEQ ID NO: 125)

1 MSDSGSYGQSGGEQQSYSTYGNPGSQGYGQASQSYSGYGQTTDSSYGQNYSGYSSYGQSQ

45 11 I I I I I I 11 I I I 11 I I I I I I I I I I I I I I I I I I I I 11 I 11 I I I I I I I I I I I I I I I I I I I I

1 MSDSGSYGQSGGEQQSYSTYGNPGSQGYGQASQSYSGYGQTTDSSYGQNYSGYSSYGQSQ

61 SGYSQSYGGYENQKQSSYSQQPYNNQGQQQNMESSGSQGGRAPSYDQPDYGQQDSYDQQS

I I I I I Il I I I I I I I I I I I I I I I I I I I I I I I Il I I I Il I Il I I I I I I I I I I I I I I I I I I I I

50 61 SGYSQSYGGYENQKQSSYSQQPYNNQGQQQNMESSGSQGGRAPSYDQPDYGQQDSYDQQS

121 GYDQHQGSYDEQSNYDQQHDSYSQNQQSYHSQRENYSHHTQDDRRDVSRYGEDNRGYGGS

I I Il I Il I Il I I I Il I I I Il Il I I I I I I Il Il III Il I Il I I I I I Il Ml Il I I I I I I Il 121 GYDQHQGSYDEQSNYDQQHDSYSQNQQSYHSQRENYSHHTQDDRRDVSRYGEDNRGYGGS

181 QGGGRGRGGYDKDGRGPMTGSSGGDRGGFKNFGGHRDYGPRTDADSESDNSDNNTIFVQG

II I I I Il I Il I I I I I I I I I I I I I I I I I I I I Il I I I Il I Il I I I I I I I Il I I I I I I I I I I I 181 QGGGRGRGGYDKDGRGPMTGSSGGDRGGFKNFGGHRDYGPRTDADSESDNSDNNTIFVQG

241 LGEGVSTDQVGEFFKQIGIIKTNKKTGKPMINLYTDKDTGKPKGEATVSFDDPPSAKAAI

Il I Il I I I Il I I I I I I I I I I I I I I I I I I I I Il I I I Il I Il I I I I I I I I I I Il I I I I Il I I

241 LGEGVSTDQVGEFFKQIGIIKTNKKTGKPMINLYTDKDTGKPKGEATVSFDDPPSAKAAI

301 DWFDGKEFHGNIIKVSFATRRPEFMRGGGSGGGRRGRGGYRGRGGFQGRGGDPKSGDWVC

65 1 1 I 1 1 I I I 1 1 I I I I I I I I 1 1 1 1 I I I I I I 1 1 I I I 1 1 I I I I I I I I I I 1 1 I I I I I I I I I I I I I

301 DWFDGKEFHGNIIKVSFATRRPEFMRGGGSGGGRRGRGGYRGRGGFQGRGGDPKSGDWVC

361 PNPSCGNMNFARRNSCNQCNEPRPEDSRPSGGDFRGRGYGGERGYRGRGGRGGDRGGYGG I I l I I I l I I M I I I I I I I I I I I I I I I I I I l I I :

361 PNPSCGNMNFARRNSCNQCNEPRPEDSRPSGGETTTEMISATDHTDDCFECSFVSDMIHS

421 DRSGGGYGGDRSSGGGYSGDRSGGGYGGDRSGGGYGGDRGGGYGGDRGGGYGGDRGGGYG : . . : I : 421 EIARVLPAAFLVASSWVVKLSDIWIFIWVGGLGQFFF

481 GDRGGYGGDRGGGYGGDRGGYGGDRGGYGGDRGGYGGDRGGYGGDRSRGGYGGDRGGGSG 458

541 YGGDRSGGYGGDRSGGGYGGDRGGGYGGDRGGYGGKMGGRNDYRNDQRNRPY

458

MEDIATOR COMPLEX COMPONENTS

Product amplified with primers GAAAATGGCTGCGTCTTCG (Sl; SEQ ID NO: 126) and CTCATCTCTAAATCAGTTGGG (AS; SEQ ID NO: 127)

224 aa: 46 aa shorter N-terminus, shorter 1st exon; (SEQ ID NO: 128);

MLAISRNQKLLQAGEENQVLELLIHRDGEFQELMKLALNQGKIHHEMQVLEKEVEKRDSDIQQLQKQLKEAEQI LATAVYQAKEK LKS IEKARKGAI SSEEIIKYAHRISASNAVCAPLTWVPGDPRRPYPTDLEM...

Alignment with Q9NPJ6 (270 aa) ; (SEQ ID NO: 129):

1 MAASSSGEKEKERLGGGLGVAGGNSTRERLLSALEDLEVLSRELIEMLAISRNQKLLQAG

||||||||||||||

1 MLAISRNQKLLQAG

61 EENQVLELLIHRDGEFQELMKLALNQGKIHHEMQVLEKEVEKRDSDIQQLQKQLKEAEQI

Il I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Il I I I I I I I I I I Il I Il I I I I I I I I I I

15 EENQVLELLIHRDGEFQELMKLALNQGKIHHEMQVLEKEVEKRDSDIQQLQKQLKEAEQI

121 LATAVYQAKEKLKSIEKARKGAISSEEIIKYAHRISASNAVCAPLTWVPGDPRRPYPTDL I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

75 LATAVYQAKEKLKSIEKARKGAISSEEIIKYAHRISASNAVCAPLTWVPGDPRRPYPTDL

181 EMRSGLLGQMNNPSTNGVNGHLPGDALAAGRLPDVLAPQYPWQSNDMSMNMLPPNHSSDF

I I 135 EM

241 LLEPPGHNKENEDDVEIMSTDSSSSSSESD 137

19. Med8 (NM_001001651)

Product amplified with primers CAAATGGCTCAGGCAGGTC (S1; SEQ ID NO: 130) and TGTAGACAATCATGAAGCCACG (AS1; SEQ ID NO: 131):

165 aa: shorter, four different aa in C-terminus, shorter 7th exon, two additional 3' exons from gene ELOVL1 (elongation of very long chain fatty acids); (SEQ ID NO: 132);

PTDTNALVAAVAFGKGLSNWRPSGSSGPGQAGQPGAGTILAGTSGLQQVQMAGAPSQQQPMLSGVQMAQAGQPGKCQVE*

Alignment with Q96G25 (268 aa); (SEQ ID NO: 133):

1 MRQTEGRVPVFSHEVVPDHLRTKPDPEVEEQEKQLTTDAARIGADAAQKQIQSLNKMCSN

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

1 MRQTEGRVPVFSHEVVPDHLRTKPDPEVEEQEKQLTTDAARIGADAAQKQIQSLNKMCSN

61 LLEKISKEERESESGGLRPNKQTFNPTDTNALVAAVAFGKGLSNWRPSGSSGPGQAGQPG I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

61 LLEKISKEERESESGGLRPNKQTFNPTDTNALVAAVAFGKGLSNWRPSGSSGPGQAGQPG

121 AGTILAGTSGLQQVQMAGAPSQQQPMLSGVQMAQAGQPGKMPSGIKTNIKSASMHPYQR

II I I I I I I I I I I I Il I I I I I I I I Il I I I I I I I I I I I I I I I 121 AGTILAGTSGLQQVQMAGAPSQQQPMLSGVQMAQAGQPGKCQVE

20. Med14 (NM_004229)

1. Product amplified with primers TCAGGATGCTCGAAGAAGGTC (S1; SEQ ID NO: 134) and CAGACACTTGAGGAGATCCTG (AS1; SEQ ID NO: 135):

1410 aa: internal 44 aa missing (1045-1089 aa), Δ24, (SEQ ID NO: 136)

MNMFVDSNQDARRRSVNEDDNPPSPIGGDMMDSLISQLQPPPQQQPFPKQPGTSGAYPLTSPPTSYHSTVNQSPSMMHTQSPGTLDPSSPYTMVSPSGRAGNWPGSPQVSGP .

Alignment with O60244 (1454 aa); (SEQ ID NO: 137):

781 LEFARSLPDIPAHLNIFSEVRVYNYRKLILCYGTTKGSSISIQWNSIHQKFHISLGTVGP

1

841 NSGCSNCHNTILHQLQEMFNKTPNVVQLLQVLFDTQAPLNAINKLPTVPMLGLTQRTNTA i

901 YQCFSILPQSSTHIRLAFRNMYCIDIYCRSRGVVAIRDGAYSLFDNSKLVEGFYPAPGLK 1 ..MNMFVDSNQDARRRSVNEDDNPPSPIGGDMMDSLISQLQPPPQQQPFPKQPGTSGAYP

961 TFLNMFVDSNQDARRRSVNEDDNPPSPIGGDMMDSLISQLQPPPQQQPFPKQPGTSGAYP

59 LTSPPTSYHSTVNQSPSMMHTQSP I I I I I I I I I I 11 I I I I I I I I I I I I

1021 LTSPPTSYHSTVNQSPSMMHTQSPGNLHAASSPSGALRAPSPASFVPTPPPSSHGISIGP

83 GTLDPSSPYTMVSPSGRAGNWPGSPQVSGP

Ill Il III Il Il Il I Il I Il Il Il III III 1081 GASFASPHGTLDPSSPYTMVSPSGRAGNWPGSPQVSGPSPAARMPGMSPANPSLHSPVPD

113

1141 ASHSPRAGTSSQTMPTNMPPPRKLPQRSWAASIPTILTHSALNILLLPSPTPGLVPGLAG

113

1201 SYLCSPLERFLGSVIMRRHLQRIIQQETLQLINSNEPGVIMF

2. Product amplified with primers GCTCTGCCGATCGACTTCC (S2; SEQ ID NO: 138) and AGGCGATCAGCAGTGTCCAC (AS2; SEQ ID NO: 139):

1382 aa: missing 73 aa at N-terminus, same as CN282118 (alternative 1st exon) ; (SEQ ID NO: 140);

MPRKSDVERKIEIVQFASRTRQLFVRLLALVKWANNAGKVEKCAMISSFLDQQAILFVDTADRLASLARDALVHARLPSFAIPYA VPWRLLKLEILVEDKETGDGRALVHSMQISFIHQLVQSRLFADEKPL ..

Alignment with O60244 (1454 aa); (SEQ ID NO: 141):

1 MAPVQLENHQLVPPGGGGGGSGGPPSAPAPPPPGAAVAAAAAAAASPGYRLSTLIEFLLH

1 MPRKSDVERKIEIVQFASRTRQLFVRLLALVKWANNAGKVEKCAMISS

Ul l l l l l l l l ll ll ll l l l l l l ll l l l l l l l ll l l l l l l l l l l l l l l 61 RAYSELMVLTDLLPRKSDVERKIEI VQFASRTRQLFVRLLALVKWANNAGKVEKCAMISS 49 FLDQQAI LFVDTADRLASLARDALVHARLPSFAIPYAIDVLTTGSYPRLPTCIRDKI IPP

I I I I I Il I Il I I I I I I I I I I I I I I I I I I Il I I I Il I I I I I I I I I I Il I Il I I I I I I I I I I

121 FLDQQAILFVDTADRLASLARDALVHARLPSFAIPYAIDVLTTGSYPRLPTCIRDKIIPP

109 DPITKIEKQATLHQLNQILRHRLVTTDLPPQLANLTVANGRVKFRVEGEFEATLTVMGDD

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

181 DPITKIEKQATLHQLNQILRHRLVTTDLPPQLANLTVANGRVKFRVEGEFEATLTVMGDD

169 PDVPWRLLKLEILVEDKETGDGRALVHSMQISFIHQLVQSRLFADEKPL

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Il I Il I I I I 241 PDVPWRLLKLEILVEDKETGDGRALVHSMQISFIHQLVQSRLFADEKPLQDMYNCLHSFC

218

301 LSLQLEVLHSQTLMLIRERWGDLVQVERYHAGKCLSLSVWNQQVLGRKTGTASVHKVTIK

218

361 IDENDVSKPLQIFHDPPLPASDSKLVERAMKIDHLSIEKLLIDSVHARAHQKLQELKAIL 218

21. Med15 (NM_001003891)

Product amplified with primers TGGATGTAAGATGAGATTGGG (S2; SEQ ID NO: 142) and AATGAGCCTGGCCACGAGA (AS2; SEQ ID NO: 143):

762 aa: 26 aa shorter N-terminus, alternative 1st exon; (SEQ ID NO: 144) ;

MRKAGVAHSKSSKDMESHVFLKAKTRDEYLSLVARLIIHFRDIHNKKSQASVSDPMNALQSLTGGPAAGAAGIGMPPRGPGQSLG GMGSL..

Alignment with Q96RN5 (788 aa) ; (SEQ ID NO: 145):

1 MDVSGQETDWRSTAFRQKLVSQIEDAMRKAGVAHSKSSKDMESHVFLKAKTRDEYLSLVA

||||||||||||||||||||||||||||||||||

1 MRKAGVAHSKSSKDMESHVFLKAKTRDEYLSLVA

61 RLIIHFRDIHNKKSQASVSDPMNALQSLTGGPAAGAAGIGMPPRGPGQSLGGMGSLGAMG

II I Mil I Il I Il Il I I I I I I I I Il Il I I I Il Il I II III III III I Il Il I I I I I

35 RLIIHFRDIHNKKSQASVSDPMNALQSLTGGPAAGAAGIGMPPRGPGQSLGGMGSL

121 QPMSLSGQPPPGTSGMAPHSMAVVSTATPQTQLQLQQVALQQQQQQQQFQQQQQAALQQQ

Product amplified with primers CACGAATCTGATCACACACTAC (S; SEQ ID NO: 146) and CTGGGTGGTCTGGACTATG (AS1; SEQ ID NO: 147):

244 aa: 50 aa longer C-terminus; (SEQ ID NO: 148)

PKKKNKHKHKQSRTQDPVPPETPSDSDHKKKKKKKEEDPERKRKKKEKKKKKNRHSPDHPGMGSSQASSSSSLR*

Alignment with Q8IV02 (194 aa); (SEQ ID NO: 149 and 150):

121 SHDNSSLRSLIEKPPILSSSFNPITGTMLAGFRLHTGPLPEQCRLMHIQPPKKKNKHKHK

I I I Il I I Il Il I I I I I I I I I I I I I I I I I Il Il I I I I I Il I Il I I I Il I I I Il I I I I I I I I 54 SHDNSSLRSLIEKPPILSSSFNPITGTMLAGFRLHTGPLPEQCRLMHIQPPKKKNKHKHK

181 QSRTQDPVPPGKPS 1111111111 M

114 QSRTQDPVPPETPSDSDHKKKKKKKEEDPERKRKKKEKKKKKNRHSPDHPGMGSSQASSSSSLR

23. TRAP100 (NM_014815)

Product amplified with primers CGTTACTAGAGCAGGCCATG (S2; SEQ ID NO: 151) and TACCCTCGGGAGACTCAATGA (AS4; SEQ ID NO: 152):

1008 aa: longer protein (extra 19 aa inside the protein after 188 aa); (SEQ ID NO: 153)

MIGPSPNPLILSYLKYAISSQMVSYSSVLTAISKFDDFSRDLCVQALLDIMDMFCDRLSCHGKAEECIGLCRALLSALHWLLRCTAASAERLREGLEAGTPAAGEKQLAMCLQRLEKTLSSTKNRALLHIAKLEEASLHTSQGLGQGGTRANQPTASWTAIEHSLLKLGEILANLSNPQLRSQAEQCGTLIRSIPTMLSVHAEQMHKTGFPTVHAVILLEGTMNLTGETQSLVEQLTMVKRMQHIPTPLFVLEIWKACFVGLIESPEG

Alignment with O75448 (989 aa); (SEQ ID NO: 154):

1 MKVVNLKQAILQAWKERWSDYQWAINMKKFFPKGATWDILNLADALLEQAMIGPSPNPLI

I I I I I I I I I I

1 MIGPSPNPLI

61 LSYLKYAISSQMVSYSSVLTAISKFDDFSRDLCVQALLDIMDMFCDRLSCHGKAEECIGL I Il I I Il I Il I Il I I I I I Il I I I I I I I I I I Il I I I I I I Il I I I I I Il I I I Il I I I I I I Il

11 LSYLKYAISSQMVSYSSVLTAISKFDDFSRDLCVQALLDIMDMFCDRLSCHGKAEECIGL

121 CRALLSALHWLLRCTAASAERLREGLEAGTPAAGEKQLAMCLQRLEKTLSSTKNRALLHI

I I I Il I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Il I I I Il I I I I I I I I I I

71 CRALLSALHWLLRCTAASAERLREGLEAGTPAAGEKQLAMCLQRLEKTLSSTKNRALLHI

181 AKLEEAS-------------------SWTAIEHSLLKLGEILANLSNPQLRSQAEQCGTL

|||||||                   ||||||||||||||||||||||||||||||||||

131 AKLEEASLHTSQGLGQGGTRANQPTASWTAIEHSLLKLGEILANLSNPQLRSQAEQCGTL

222 IRSIPTMLSVHAEQMHKTGFPTVHAVILLEGTMNLTGETQSLVEQLTMVKRMQHIPTPLF

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 11 I I I I I I I I I I I I I 11 I I I I I I I I I I 191 IRSIPTMLSVHAEQMHKTGFPTVHAVILLEGTMNLTGETQSLVEQLTMVKRMQHIPTPLF

282 VLEIWKACFVGLIESPEGTEELKWTAFTFLKIPQVLVKLKKYSHGDKDFTEDVNCAFEFL

I 1 I I I I I I I I 11 I I I I I I

251 VLEIWKACFVGLIESPEG

CHROMATIN REMODELING COMPLEX (SWI/SNF)

24. SMARCA1 (NM_003069)

Product amplified with primers AGATGACTCGCTTGCTGGATA (S3; SEQ ID NO: 155) and AGGTTAATTCCGAGACCTCCA (AS2; SEQ ID NO: 156) :

955 aa: internal 12 aa missing (467 - 477 aa) , Δ13; (SEQ ID NO: 157)

TSNVCIRFEVSPSYVKGGPLRDYQIRGLNWLISLYENGVNGILADEMGLGKTLQTIALLGYLKHYRNIPGPHMVLVPKSTLHNWH

NRLLLTGTPLQNNLHELWALLNFLLPDVFNSADDFDSWFDTKNCLGDQKLVERLHAVLKPFLLRRIKTDVEKSLPPKKEIKIYLG LSKMQREWYTKILMKDIDVLNSSGKMDKMRLLNILMQLRKCCNHPYLFDGAEPGPPYTTDEHIVSNSGKMVVLDKLLAKLKEQGS

DLQAMDRAHRIGQKKPVRVFRLITDNTVEERIVERAEIKLRLDSIVIQQGRLIDQQSNKLAKEEMLQMIRHGATHVFASKESELT

EPKIPKAPRPPKQPNVQDFQFFPPRLFELLEKEILYYRKTIGYKVPRNPDIPNPALAQREEQKKIDGAEPLTPEETEEKEKLLTQ GFTNWTKRDFNQFIKANEKYGRDDIDNIAREVEGKSPEEVMEYSAVFWERCNELQDIEKIMAQIERGEARIQRRISIKKALDAKI ARYKAPFHQLRIQYGTSKGKNYTEEEDRFLICMLHKMGFDRENVYEELRQCVRNAPQFRFDWFIKSRTAMEFQRRCNTLISLIEK ENMEIEERERAKKKK..

Alignment with P28370 (967 aa); (SEQ ID NO: 158 and 159):

361 NSSGKMDKMRLLNILMQLRKCCNHPYLFDGAEPGPPYTTDEHI VSNSGKMWLDKLLAKL

I I I I l I I I I I I I I I l I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I

361 NSSGKMDKMRLLNILMQLRKCCNHPYLFDGAEPGPPYTTDEHIVSNSGKMWLDKLLAKL 421 KEQGSRVLIFSQMTRLLDILEDYCMWRGYEYCRLDGQTPHEEREDKFLEVEFLGQREAIE

I I I I I Il I Il I I I I I I I I Il Il I I I I I I Il I I I Il I I I Il I I I I I I I I

421 KEQGSRVLIFSQMTRLLDILEDYCMWRGYEYCRLDGQTPHEERE EAIE

481 AFNAPNSSKFIFMLSTRAGGLGINLASADVVILYDSDWNPQVDLQAMDRAHRIGQKKPVR

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

469 AFNAPNSSKFIFMLSTRAGGLGINLASADVVILYDSDWNPQVDLQAMDRAHRIGQKKPVR

25. SMARCA4 (NM_003072)

Products amplified with primers ATCATGGCCTACAAGATGCTG (S2; SEQ ID NO: 160) and ATCCGCTCGTTCTCTTTCTTC (AS2; SEQ ID NO: 161):

a) 1438 aa: internal 209 aa missing (198-407 aa); new product (Δ(4)-(7)); (SEQ ID NO: 162)

MAYKMIARGQPLPDHLLNFQRQLRQEVVVCMRRDTALETALNAKAYKRSKRQSLREARITEKLEKQQKIEQERKRRQKHQEYLN SILQHAKDFKEYHRSVTGKIQKLTKAVATYHANTEREQKKENER...

Alignment with P5153 (1647 aa) ; (SEQ ID NO: 163):

121 PLGGSEHASSPVPASGPSSGPQMSSGPGGAPLDGADPQALGQQNRGPTPFNQNQLHQLRA i

181 QIMAYKHLARGQPLPDHLQMAVQGKRPMPGMQQCMPTLPPPSVSATGPGPGPGPGPGPGP

I IUI IUI I I Il I I i ..MAYKMLARGQPLPDH

241 GPAPPNYSRPHGMGGPNMPPPGPSGVPPGMPGQPPGGPPKPWPEGPMANAAAPTSTPQKL 16

301 IPPQPTGRPSPAPPAVPPAASPVMPPQTQSPGQPAQPAPMVPLHQKQSRITPIQKPRGLD 16 361 PVEILQEREYRLQARIAHRIQELENLPGSLAGDLRTKATIELKALRLLNFQRQLRQEVVV

I I I I I I I I I I I I I I 16 LLNFQRQLRQEVVV

421 CMRRDTALETALNAKAYKRSKRQSLREARITEKLEKQQKIEQERKRRQKHQEYLNSILQH 1111111 Il 11111 I Il Il I 11 I I I Il Il I I Il 111111111 Il I 11 I 1111 I 1111 I I I

30 CMRRDTALETALNAKAYKRSKRQSLREARITEKLEKQQKIEQERKRRQKHQEYLNSILQH

481 AKDFKEYHRSVTGKIQKLTKAVATYHANTEREQKKENERIEKERMRRLMAEDEEGYRKLI

|||||||||||||||||||||||||||||||||||||||

90 AKDFKEYHRSVTGKIQKLTKAVATYHANTEREQKKENER

b) 1388 aa: internal 259 aa missing (203-462 aa), new product (Δ(4)-(8)); (SEQ ID NO: 164)

.MAYKMLARGQPLPDHLQMAVQERKRRQKHQEYLNSILQHAKDFKEYHRSVTGKIQKLTKAVATYHANTEREQKKENER .

Alignment with P5153 (1647 aa); (SEQ ID NO: 165):

181 QIMAYKMLARGQPLPDHLQMAVQGKRPMPGMQQQMPTLPPPSVSATGPGPGPGPGPGPGP

I I I I I I I I Il I Il Il I I I I I 1 ..MAYKMLARGQPLPDHLQMAV

241 GPAPPNYSRPHGMGGPNMPPPGPSGVPPGMPGQPPGGPPKPWPEGPMANAAAPTSTPQKL

21

301 IPPQPTGRPSPAPPAVPPAASPVMPPQTQSPGQPAQPAPMVPLHQKQSRITPIQKPRGLD 2i

361 PVEILQEREYRLQARIAHRIQELENLPGSLAGDLRTKATIELKALRLLNFQRQLRQEVVV 21

421 CMRRDTALETALNAKAYKRSKRQSLREARITEKLEKQQKIEQERKRRQKHQEYLNSILQH

I Il I Il I Il I I I Il Il I I I 21 QERKRRQKHQEYLNSILQH

481 AKDFKEYHRSVTGKIQKLTKAVATYHANTEREQKKENERIEKERMRRLMAEDEEGyRKLI

I I Il I Il I Il I I I I I I I I I I Il I I I I I I I I Il I I I I I I I 40 AKDFKEYHRSVTGKIQKLTKAVATYHANTEREQKKENER

26. SMARCC2 (NM_003075)

Products amplified with primers GCTCGGCAAGAACTACAAGAA (S1; SEQ ID NO: 166) and CGGACACTTTGTTCCAGTCAT (AS2; SEQ ID NO: 167):

a) 749 aa: internal 465 aa missing (67-532 aa), new product (Δ(2)-(17)); (SEQ ID NO: 168);

...MLGKNYKKYIQAEPPTNKSLSSLVVQLLQFQEEVFGKHVLADTPSGLVPLQPKTPQQTSASQQMLNFPDKGKEKPTDMQNFGLR TDMYTKKNVPSKSKAAASATREWTEQETLLLLEALEMYKDDWNKVS...

Aligned with Q8TAQ2 (1214 aa) ; (SEQ ID NO: 169):

1 MAVRKKDGGPNVKYYEAADTVTQFDNVRLWLGKNYKKYIQAEPPTNKSLSSLVVQLLQFQ

I I I I I I I I I I I l I I l I I I I I I I I I I I I I I I 1 MLGKNYKKYIQAEPPTNKSLSSLVVQLLQFQ 61 EEVFGKHVSNAPLTKLPIKCFLDFKAGGSLCHILAAAYKFKSDQGWRRYDFQNPSRMDRN

I I III I 32 EEVFGK

121 VEMFMTIEKSLVQNNCLSRPNIFLCPEIEPKLLGKLKDIIKRHQGTVTEDKNNASHVVYP

38

181 VPGNLEEEEWVRPVMKRDKQVLLHWGYYPDSYDTWIPASEIEASVEDAPTPEKPRKVHAK 38

241 WILDTDTFNEWMNEEDYEVNDDKNPVSRRKKISAKTLTDEVNSPDSDRRDKKGGNYKKRK

38

301 RSPSPSPTPEAKKKNAKKGPSTPYTKSKRGHREEEQEDLTKDMDEPSPVPNVEEVTLPKT

38 361 VNTKKDSESAPVKGGTMTDLDEQEDESMETTGKDEDENSTGNKGEQTKNPDLHEDNVTEQ

38

421 THHIIIPSYAAWFDYNSVHAIERRALPEFFNGKNKSKTPEIYLAYRNFMIDTYRLNPQEY

38

481 LTSTACRRNLAGDVCAIMRVHAFLEQWGLINYQVDAESRPTPMGPPPTSHFHVLADTPSG

IMMIIII 3B HVLADTPSG

541 LVPLQPKTPQQTSASQQMLNFPDKGKEKPTDMQNFGLRTDMYTKKNVPSKSKAAASATRE

I I I Il I Il I I I I I I I I I I I I I I I I I I I I I I Il Il I Il I Il I I I I Il I I I I I I I I I I I I I I

47 LVPLQPKTPQQTSASQQMLNFPDKGKEKPTDMQNFGLRTDMYTKKNVPSKSKAAASATRE

601 WTEQETLLLLEALEMYKDDWNKVSEHVGSRTQDECILHFLRLPIEDPYLEDSEASLGPLA

||||||||||||||||||||||||

107 WTEQETLLLLEALEMYKDDWNKVS

b) 847 aa: internal 367 aa missing (97-464 aa), new product (Δ(3)-(16)); (SEQ ID NO: 170);

...SSLVVQLLQFQEEVFGKHVSNAPLTKLPIKCFLDFKAGGSLCHILAAAYRNFMIDTYRLNPQEYLTSTACRRNLAGDVCAIMRV NVPSKSKAAASATREWTEQETLLLLEALEMYKDDWNKVS...

Aligned with Q8TAQ2 (1214 aa); (SEQ ID NO: 169 and 170):

1 MAVRKKDGGPNVKYYEAADTVTQFDNVRLWLGKNYKKYIQAEPPTNKSLSSLVVQLLQFQ

I I l I I I I I I I I

1 SSLVVQLLQFQ

61 EEVFGKHVSNAPLTKLPIKCFLDFKAGGSLCHILAAAYKFKSDQGWRRYDFQNPSRMDRN

I I I I I I I I I I Il I I I Il I I I I I I Il Il I I I I I I I M 12 EEVFGKHVSNAPLTKLPIKCFLDFKAGGSLCHILAA 121 VEMFMTIEKSLVQNNCLSRPNIFLCPEIEPKLLGKLKDIIKRHQGTVTEDKNNASHVVYP

48

181 VPGNLEEEEWVRPVMKRDKQVLLHWGYYPDSYDTWIPASEIEASVEDAPTPEKPRKVHAK

48

241 WILDTDTFNEWMNEEDYEVNDDKNPVSRRKKISAKTLTDEVNSPDSDRRDKKGGNYKKRK 4β

301 RSPSPSPTPEAKKKNAKKGPSTPYTKSKRGHREEEQEDLTKDMDEPSPVPHVEEVTLPKT 48

361 VNTKKDSESAPVKGGTMTDLDEQEDESMETTGKDEDENSTGNKGEQTKNPDLHEDNVTEQ 48 421 THHIIIPSYAAWFDYNSVHAIERRALPEFFNGKNKSKTPEIYLAYRNFMIDTYRLNPQEY

III I I I I I I I Il I I I I I 48 AYRNFMIDTYRLNPQEY

481 LTSTACRRNLAGDVCAIMRVHAFLEQWGLINYQVDAESRPTPMGPPPTSHFHVLADTPSG I I I I I I 11 I I I I I I I I I I 11 I I I I I I M 11 I I I 11 I I I I I I I I I I 11 I 11 I I I I I I I I I I 65 LTSTACRRNLAGDVCAIMRVHAFLEQWGLINYQVDAESRPTPMGPPPTSHFHVLADTPSG

541 LVPLQPKTPQQTSASQQMLNFPDKGKEKPTDMQNFGLRTDMYTKKNVPSKSKAAASATRE

I I I I I Il I Il I I I Il I I I I I I I I Il Il Il I I I I I I Il I Il I I I I I I I I I I I I I I I I I I I I 125 LVPLQPKTPQQTSASQQMLNFPDKGKEKPTDMQNFGLRTDMYTKKNVPSKSKAAASATRE

601 WTEQETLLLLEALEMYKDDWNKVSEHVGSRTQDECILHFLRLPIEDPYLEDSEASLGPLA

II I I Il I I I I Il I I I Il I I I I Il I

185 WTEQETLLLLEALEMYKDDWNKVS

c) 734 aa: internal 480 aa missing (83-563 aa), new product (Δ(3)-(19)); (SEQ ID NO: 171);

s.

Aligned with Q8TAQ2 (1214 aa) ; (SEQ ID NO: 169 and 171)

1 MAVRKKDGGPNVKYYEAADTVTQFDNVRLWLGKNYKKYIQAEPPTNKSLSSLVVQLLQFQ : I I I I I I I I I I I

1 MSSLVVQLLQFQ

61 EEVFGKHVSNAPLTKLPIKCFLDFKAGGSLCHILAAAYKFKSDQGWRRYDFQNPSRMDRN

I I I Il I I I I I Il I I I I I I I I I I 13 EEVFGKHVSNAPLTKLPIKCFL

121 VEMFMTIEKSLVQNNCLSRPNIFLCPEIEPKLLGKLKDIIKRHQGTVTEDKNNASHVVYP 3⁵

181 VPGNLEEEEWVRPVMKRDKQVLLHWGYYPDSYDTWIPASEIEASVEDAPTPEKPRKVHAK

35 241 WILDTDTFNEWMNEEDYEVNDDKNPVSRRKKISAKTLTDEVNSPDSDRRDKKGGNYKKRK

35

301 RSPSPSPTPEAKKKNAKKGPSTPYTKSKRGHREEEQEDLTKDMDEPSPVPNVEEVTLPKT

35 361 VNTKKDSESAPVKGGTMTDLDEQEDESMETTGKDEDENSTGNKGEQTKNPDLHEDNVTEQ 35

421 THHIIIPSYAAWFDYNSVHAIERRALPEFFNGKNKSKTPEIYLAYRNFMIDTYRLNPQEY

35 . 481 LTSTACRRNLAGDVCAIMRVHAFLEQWGLINYQVDAESRPTPMGPPPTSHFHVLADTPSG

35

541 LVPLQPKTPQQTSASQQMLNFPDKGKEKPTDMQNFGLRTDMYTKKNVPSKSKAAASATRE I I I I I I I I I I I I I I I I I I Il I Il I I I I I I I I I I I I I I I

35 DKGKEKPTDMQNFGLRTDMYTKKNVPSKSKAAASATRE

601 WTEQETLLLLEALEMYKDDWNKVSEHVGSRTQDECILHFLRLPIEDPYLEDSEASLGPLA

11 I 11 I I I I I I I I I I I I I I I I I I I 73 WTEQETLLLLEALEMYKDDWNKVS

27. SMARCE1 (NM_003079)

Product amplified with primers GCGGTGTCTCAGATTCATTC (S2; SEQ ID NO: 172) and TTGCCGGATGCTGTAATAGTTG (AS4; SEQ ID NO: 173):

77 aa: shorter protein, different C-term after 52aa, long exon 4; (SEQ ID NO: 174) ;

MSKRPSYAPPPTPAPATQMPSTPGFVGYNPYSHIAYNNYRLGGNPGTNSRVTVGESTITASGKQLELTRNAFRIRSF*

Alignment with Q969G3 (411 aa); (SEQ ID NO: 175):

1 MSKRPSYAPPPTPAPATQMPSTPGFVGYNPYSHLAYNNYRLGGNPGTNSRVTASSGITIP

I I I I I Il I I I I I I I I I I I I I I I I I I I I I I I Il I I I I I I Il I I I I I I I I I I I I 1 MSKRPSYAPPPTPAPATQMPSTPGFVGYNPYSHLAYNNYRLGGNPGTNSRVTVGESTITA

61 KPPKPPDKPLMPYMRYSRKVWDQVKASNPDLKLWEIGKIIGGMWRDLTDEEKQEYLNEYE . . i

61 SGKQLELTRNAFRIRSF

121 AEKIEYNESMKAYHNSPAYLAYIHAKSRAEAALEEESRQRQSRMEKGEPYMSIQPAEDPD

Peptides

We generated small libraries of CPP-NLS-interfering peptides that potentially interact with melanoma-expressed TFCs containing isoforms of the co-regulator proteins BAF57 and TRAP100. Initial screening of these libraries identified the following two peptides, which were analyzed further.

(1) BAF57 P12 - PKKRKVRRRRRRRNDRLSDGDSKYSQTSHKLVQLL (SEQ ID NO: 6)

(2) TRAP100 P05 - PKKRKVRRRRRRRPQMQQNVFQYPGAGMVPQGEANF (SEQ ID NO: 5)

Red - NLS, blue - CPP and black - mimicking domains

CPP - cell penetrating peptide

NLS - nucleus localizing signal

RESULTS

Isoforms of transcriptional co-regulators

We have conducted an extensive in silico analysis of components of transcriptional co-regulators and designed PCR primers to identify novel isoforms with altered function (activity). Identified isoforms are presented in Table 1.

Based on the known assembly and composition of TFCs and the function of individual components of TFCs, we predicted changes in TFCs that contain isoforms of MED24. Since these isoforms and the corresponding TFCs are expressed specifically in melanoma cells, these TFCs represent a suitable target for drug development. We therefore designed peptides that interact with a melanoma-specific TFC and in this way disrupt its function, leading to cell death (apoptosis) and/or cessation of cell proliferation.

Example 2: Effect of interfering peptides on proliferation and apoptosis of melanoma cells

Modeling of TFCs that contain isoforms of BAF57 and TRAP100 identified specific interactions that enabled us to synthesize small peptide libraries. Screening of these libraries using the melanoma cell line SK-MEL-28 resulted in two peptides, denoted by us as BAF57 P12 and TRAP100 P05, that were found to stimulate apoptosis and inhibit growth of melanoma cells in vitro.

Amino acid sequences of SMARCE1/BAF57 and TRAP100 isoforms. Unique, isoform specific sequences are underlined.

SMARCE1/BAF57 (SEQ ID NO: 10)

MSKRPSYAPPPTPAPATQMPSTPGFVGYNPYSH:

ASNPDLKLWEIGKIIGGMWRDLTDEEKQEYLNE' MSIQPAEDPDDYDDGFSMKHTATARFQRNHRLISEILSESVVPDVRSWTTARMQVLKRQVQSLMVHQRKLEAELLQIEERHQEK

ETEETHLEETTESQQNGEEGTSTPEDKESGQEGVDSMAEEGTSDSNTGSESNSATVEEPPTDPIPEDEKKE

SMARCE1/BAF57 isoform 1 (SEQ ID NO: 11)

EEKQEYLNEYEAEKIEYNESMKAYHNSPAYLAYINAKSRAEAALEEESRQRQSRMEKGEPYMSIQPAEDPDDYDDGFSMKHTATA

RFQRNHRLISEILSESWPDVRSVVTTARMQVLKRQVQSLMVHQRKLEAELLQIEERHQEKKRKFLESTDSFNNELKRLCGLKVE

VDMEKIAAEIAQAEEQARKRQEEREKEAAEQAERSQSSIVPEEEQAAh EDKESGQEGVDSMAEEGTSDSNTGSESNSATVEEPPTDPIPEDEKKE

TRAP100 (SEQ ID NO: 12)

MKWNLKQAILQAWKERWSYYQWAIl DDFSRDLCVQALLDIMDMFCDRLSCHGKAEECIGLCRALLSALHWLLRCTAASAERLREGLEAGTPAAGEKQLAMCLQRLEKTLS

ELLYSIFCLDMQQVTLVLLGHILPGLLTDSSKWHSLMDPPGTALAKLAVWCALSSYSSHKGQASTRQKKRHREDIEDYISLFPLD DVQPSKLMRLLSSNEDDANILSSPTDRSMSSSLSASQLHTVNMRDPLNRVLANLFLLISSILGSRTAGPHTQFVQWFMEECVDCL EQGGRGSVLQFMPFTTVSELVKVSAMSSPKVVLAITDLSLPLGRQVAAKAIAAL

TRAP100 isoform 1 (SEQ ID NO: 13)

MKWNLKQAILQAWKERWSYYQWAINMKKFFPKGATWDILNLADALLEQAMIGPS PNPLILSYLKYAI SSQMVSYSSVLTAISKF DDFSRDLCVQALLDIMDMFCDRLSCHGKAEECIGLCRALLSALHWLLRCT AAS AERLREGLEAGTPAAGEKQLAMCLQRLEKTLS STKNRALLHIAKLEEACPHQALLVGSKTSTSQTRKKLEDKTSTVSIIVFVSMLLIAWKQMTLVFECYLKCSSWTAIEHSLLKLGE ILTNLSNPQLRSQAEQCGTLIRS I PTMLSVHAEQMHKTGFPTVHAVILLEGTMNLTGETQSLVEQLTMVKRMQHI PTPLFVLEIW KACFVGLIESPEGTEELKWTAFTFLKIPQVLVKLKKYSHGDKDFTEDVNCAFEFLLKLTPLLDKADQRCNCDCTNFLLQECGKQG

VLAITDLSLPLGRQVAAKAIAAL

Effects of peptide drug candidates BAF57 P12 and TRAP100 P05 on cell proliferation and apoptosis were analyzed using human melanoma cell lines SK-MEL-28 and WM 266-4. Therapeutic peptides were added at a concentration of 10 μM directly to culture media. Internalization and translocation of the therapeutic peptide(s) into the cell nucleus were studied using fluorescein-labeled peptides. Peptides showed prominent nuclear localization following 8 hours of incubation with cells. This pattern remained unchanged for 7 days in cells that did not become apoptotic. Scrambled peptides were used as controls. Results of these experiments (Table 2) show that peptides BAF57 P12 and TRAP100 P05 significantly suppress proliferation and induce apoptosis. Simultaneous incubation of melanoma cells with both peptides caused complete inhibition of proliferation and induction of apoptosis in almost all treated cells.

Table 2. Effect of peptides BAF57 P12 and TRAP100 P05 on proliferation and apoptosis of human melanoma SK-MEL-28 and WM 266-4 cells.

The results of testing and validation of our peptide drug candidates demonstrated that our therapeutic peptides are viable drug candidates for treatment of melanoma in situ as well as metastatic melanoma.

Example 3: Treatment of human melanoma xenografts using CSTC-targeting peptides

This example demonstrates the effect of therapeutic peptides on development of human melanomas in 4-week-old BALB/cOlaHsd-nu mice (Harlan, UK). Seven days after injection of melanoma cells, mice were randomly divided into 2 groups of 10 animals each. Control animals received intravenous (tail vein) injections of 50 microliters of phosphate buffer solution (PBS) every other day for 3 weeks. Test animals received intravenous (tail vein) injections of peptides BAF57 P12 and TRAP100 P05 (together) at a concentration of 0.5 mM each in 50 microliters of PBS every other day for 3 weeks. The last 2 injections were given using peptides labeled with fluorescein.
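The amount of peptide delivered per injection follows directly from the stated concentration and volume. A minimal worked sketch, assuming only the figures given above (0.5 mM of each peptide in 50 microliters); the variable names are illustrative.

```python
# Stated dosing parameters for each peptide (from the protocol above).
concentration_mM = 0.5      # millimolar = mmol per liter
injection_volume_ul = 50.0  # microliters

# mmol/L * L = mmol; convert to nanomoles (1 mmol = 1e6 nmol).
volume_l = injection_volume_ul * 1e-6
amount_nmol = concentration_mM * volume_l * 1e6

print(f"{amount_nmol:.1f} nmol of each peptide per injection")
```

Under these assumptions each injection carries about 25 nmol of each peptide.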

Therapeutic Peptides

BAF57 P12 - PKKRKVRRRRRRRNDRLSDGDSKYSQTSHKLVQLL (SEQ ID NO: 6)

TRAP100 P05 - PKKRKVRRRRRRRPQMQQNVFQYPGAGMVPQGEANF (SEQ ID NO: 5)

First portion - NLS, underlined - CPP; last portion - mimicking domains

NLS - nucleus localizing signal; CPP - cell penetrating peptide
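The modular layout of the two peptides can be made explicit in code. This is a minimal sketch, assuming that the NLS is the leading PKKRKV motif and that the CPP is the arginine run that immediately follows it; these boundaries are inferred from the sequences and the legend above, not stated residue by residue in the text. The function name `split_peptide` is hypothetical.

```python
import re

# Assumed module boundaries: NLS = leading "PKKRKV", CPP = following run of R.
PEPTIDE_PATTERN = re.compile(r"^(PKKRKV)(R+)(.+)$")

def split_peptide(sequence: str) -> dict:
    """Split a peptide into NLS, CPP and mimicking-domain segments."""
    match = PEPTIDE_PATTERN.match(sequence)
    if match is None:
        raise ValueError("sequence does not follow the assumed NLS-CPP layout")
    nls, cpp, domain = match.groups()
    return {"nls": nls, "cpp": cpp, "domain": domain}

peptides = {
    "BAF57 P12": "PKKRKVRRRRRRRNDRLSDGDSKYSQTSHKLVQLL",
    "TRAP100 P05": "PKKRKVRRRRRRRPQMQQNVFQYPGAGMVPQGEANF",
}

for name, seq in peptides.items():
    parts = split_peptide(seq)
    print(name, parts["nls"], parts["cpp"], parts["domain"])
```

Both peptides share the same delivery modules under this reading and differ only in the C-terminal mimicking domain, which is the part expected to interact with the targeted complex.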

One day following the last injection, animals were sacrificed and subcutaneous tumors were removed, weighed and measured. Tumor tissue samples were obtained and subjected to molecular analysis.

RESULTS

It was found that in SCID mice bearing cutaneous human melanomas, intravenous (systemic) treatment with our peptide drug candidates reduced the weight and size of melanoma tumors by 57 ± 18% (33 - 85%) compared to matched control animals receiving intravenous injections of PBS (Table 3).

Table 3. Effect of BAF57 P12 and TRAP100 P05 on tumor growth in vivo.

Control tumors                    Peptide treated tumors

Animal    Tumor weight (g)        Animal    Tumor weight (g)    Reduction (%)

C1        0.6                     T1        0.2                 33

C2        0.5                     T2        0.1                 85

C3        0.9                     T3        0.4                 56

C4        0.6                     T4        0.3                 50

C5        0.4                     T5        0.1                 75

C6        0.7                     T6        0.3                 57

C7        0.9                     T7        0.6                 34

C8        died                    T8        0.4

C9        0.9                     T9        0.5                 46

C10       0.4                     T10       0.1                 75

Mean±SD   6.8 ± .                           3.0 ± 1.8 g         57 ± 18%
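The summary figure quoted in the Results (57 ± 18%, range 33 - 85%) can be recomputed from the per-pair values in the Reduction (%) column of Table 3. The sketch below assumes the reported spread is a sample standard deviation (n - 1 denominator); the text does not state which estimator was used.

```python
from statistics import mean, stdev

# Per-pair reduction values (%) as listed in Table 3.
# Pair 8 has no value because control animal C8 died.
reductions = [33, 85, 56, 50, 75, 57, 34, 46, 75]

summary = {
    "mean": mean(reductions),
    "sd": stdev(reductions),  # sample standard deviation (assumption)
    "min": min(reductions),
    "max": max(reductions),
}

print(
    f"{summary['mean']:.0f} ± {summary['sd']:.0f}% "
    f"({summary['min']} - {summary['max']}%)"
)
```

Rounded to whole percentages, the nine listed values give the same mean, standard deviation and range as the summary row.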

Treatment of immune-compromised mice bearing cutaneous human melanomas with our two peptide drug candidates, BAF57 P12 and TRAP100 P05, demonstrated that said therapeutic peptides are viable drug candidates for treatment of melanoma in situ as well as metastatic melanoma.

Example 4: Expression of BAF57 and MED24 isoforms in different cancer types

This example demonstrates that the cancer-specific isoforms of BAF57 and MED24 described herein are not limited to melanoma but are also expressed in other types of cancer, including colorectal, breast and brain cancers. In this study, RNA was isolated from surgically removed tumor samples from 21 melanoma, 25 colorectal cancer, 27 breast cancer and 11 glioblastoma patients, and expression of BAF57 and MED24 isoforms was analyzed by RT-PCR. Results of the analysis are presented in Table 4.

RNA was isolated from surgically removed tumor samples using an RNA isolation kit (Qiagen). RT-PCR was used to identify isoforms of co-regulators. First strand cDNAs were synthesized with reverse transcriptase (SuperScript II, Life Technologies Inc.) using 5-10 μg of mRNA. PCR reactions were performed in a volume of 25 μl containing one tenth of the RT reaction as template, using the GC-Rich PCR System or the Expand™ Long Distance PCR System kit (Roche) in accordance with the manufacturer's instructions.

Table 4. Expression of isoforms of BAF57 and MED24 in tumor samples.

Cancer type     # of samples    # of samples with    # of samples with
                                BAF57 isoform        MED24 isoform

Melanoma        21              12                   16

Colorectal      25              3                    15

Breast          27              1                    1

Glioblastoma    11              8                    9
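The counts in Table 4 can be expressed as the percentage of samples of each cancer type in which an isoform was detected. The following is an illustrative sketch only: the function name is hypothetical, and the counts are transcribed from Table 4 as printed.

```python
# Counts as printed in Table 4:
# (number of samples, samples with BAF57 isoform, samples with MED24 isoform)
table4 = {
    "Melanoma": (21, 12, 16),
    "Colorectal": (25, 3, 15),
    "Breast": (27, 1, 1),
    "Glioblastoma": (11, 8, 9),
}


def isoform_percentages(table):
    """Map each cancer type to (% with BAF57 isoform, % with MED24 isoform)."""
    return {
        cancer: (100 * baf57 / total, 100 * med24 / total)
        for cancer, (total, baf57, med24) in table.items()
    }


percentages = isoform_percentages(table4)
for cancer, (baf57_pct, med24_pct) in percentages.items():
    print(f"{cancer}: BAF57 isoform {baf57_pct:.0f}%, MED24 isoform {med24_pct:.0f}%")
```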

REFERENCES

King R, et al. 1999. Am J Pathol. 155(3):731-8.

Opdecamp K, et al. 1997. Development. 124(12):2377-86.

Salti GI, et al. 2000. Cancer Res. 60(18):5012-6.

Chang KL, Folpe AL. 2001. Adv Anat Pathol. 8(5):273-5.

Miettinen M, et al. 2001. Am J Surg Pathol. (2):205-11.

He TC, et al. 1998. Science. 281(5382):1509-12.

Tetsu O, McCormick F. 1999. Nature. 398(6726):422-6.

Shtutman M, et al. 1999. Proc Natl Acad Sci U S A. 96(10):5522-7.

Goldberg SF, et al. 2003. Cancer Res. 63(2):432-40.

Roeder RG. 1996. Trends Biochem Sci. 21(9):327-35.

Malik S, Roeder RG. 2005. Trends Biochem Sci. 30(5):256-63.

Kalinichenko VV, et al. 2004. Genes Dev. 18(7):830-50.

Gail R, et al. 2005. J Biol Chem. 280(8):7107-17.

Rothbard JB, et al. 2000. Nat Med. 6(11):1253-7.

Chang J, et al. 2003. Cancer. 97(3):545-53.

Perou CM, et al. 2000. Nature. 406(6797):747-52.

Hedenfalk I, et al. 2001. N Engl J Med. 344(8):539-48.

West M, et al. 2001. Proc Natl Acad Sci U S A. 98:11462-11467.

Zajchowski DA, et al. 2001. Cancer Res. 61(13):5168-78.

van 't Veer LJ, et al. 2002. Nature. 415(6871):530-6.

van de Vijver MJ, et al. 2002. N Engl J Med. 347(25):1999-2009.

Wang Z, et al. 2003. Cancer Res. 63(3):655-7.

Porter DC, Keyomarsi K. 2000. Nucleic Acids Res. 28(23):E101.

Leroy H, et al. 2005. Leukemia. 19(3):329-34.

Keyomarsi K, et al. 2002. N Engl J Med. 347(20):1566-75. Erratum in: N Engl J Med 2003 Jan 9;348(2):186.

Qin C, et al. 2001. Clin Cancer Res. 7(4):818-23.

Xia and Barr. 2005. Eur J Cancer. 41(16):2513-27.

From the foregoing it will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention. Accordingly, the invention is not limited except as by the appended claims.

Claims

What is claimed is:

1. A molecule that specifically binds to the cancer-specific transcription complex (CSTC) that is bound by a peptide having the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 2.

2. The molecule of claim 1, which is a small molecule.

3. The molecule of claim 1, which is a peptide.

4. The peptide of claim 3 that comprises the amino acid sequence of SEQ ID NO: 1 or a conservatively modified variant thereof.

5. The peptide of claim 3 that comprises the amino acid sequence of SEQ ID NO: 2 or a conservatively modified variant thereof.

6. The molecule of claim 1, further comprising a cell penetrating peptide (CPP).

7. The molecule of claim 6, wherein the CPP has the amino acid sequence RRRRRRR.

8. The molecule of claim 1, further comprising a nuclear localizing signal (NLS).

9. The molecule of claim 8, wherein the NLS has the amino acid sequence PKKRKV.

10. The molecule of claim 3, wherein the peptide has the amino acid sequence PKKRKVRRRRRRRPQMQQNVFQYPGAGMVPQGEANF (TRAP100 P05; SEQ ID NO: 5) or PKKRKVRRRRRRRNDRLSDGDSKYSQTSHKLVQLL (BAF57 P12; SEQ ID NO: 6).

11. The molecule of any of the preceding claims, which disrupts the biological activity of a transcription factor complex (TFC).

12. The molecule of claim 11, which induces apoptosis, inhibits proliferation of cancer cells and/or inhibits tumor growth.

13. A method of disrupting the biological activity of a TFC comprising contacting a cancer cell with the molecule of claim 11.

14. A method of inducing apoptosis in a cancer cell comprising contacting the cancer cell with the molecule of claim 12.

15. A method of inhibiting proliferation of cancer cells, comprising contacting the cancer cells with the molecule of claim 12.

16. A method of inhibiting tumor growth comprising contacting the tumor with the molecule of claim 12.

17. The method of any one of claims 13 to 16, wherein the contacting comprises delivering the molecule to the nucleus of the cancer cells.

18. An oligonucleotide that encodes a peptide according to any one of claims 3 to 10.

19. A vector comprising the oligonucleotide of claim 18.

20. A composition comprising the molecule of claim 10 or the oligonucleotide of claim 18, and a pharmaceutically acceptable carrier.

21. A method of disrupting the biological activity of a TFC comprising contacting a cancer cell with the composition of claim 20.

22. The method of claim 21, wherein the molecule is an siRNA molecule.

23. The method of any of the preceding method claims, wherein the cancer cell is a melanoma cell.

23. The method of any of the preceding method claims, wherein the cancer cell is a human cell.

24. A method for treating cancer in a subject comprising administering to the subject the composition of claim 20.

25. A method for detecting cancer in a tissue specimen, comprising contacting a tissue specimen with a detectable molecule that specifically binds a CSTC and detecting binding of the detectable molecule, wherein binding of the detectable molecule is indicative of cancer.

26. The method of claim 25, wherein the cancer is melanoma.

27. An isoform of a transcriptional co-regulator selected from the isoforms shown in Table 1.