HUMAN TRANSGLUTAMINASES
RELATED APPLICATION
This application claims the benefit of U.S. Provisional Application No. 60/257,754, filed December 21, 2000, the entire teachings of which are incorporated herein by reference.
BACKGROUND OF THE INVENTION
Transglutaminases are calcium-dependent enzymes that catalyze the crosslinking of proteins by forming isopeptide bonds between the gamma-carboxamide group of a glutamine and the epsilon-amino group of a lysine. Such crosslinks may be either intra- or intermolecular. Transglutaminases also catalyze the conjugation of polyamines to proteins and, in skin, the formation of the lipid envelope by covalently linking omega-hydroxyceramides through an ester bond to glutaminyl residues of involucrin and other cornified envelope proteins. Six human transglutaminases have been described previously, including keratinocyte transglutaminase (TGM1 or TGLK), tissue transglutaminase (TGM2 or TGLC), hair follicle transglutaminase (TGM3 or TGLE), prostatic transglutaminase (TGM4), transglutaminase X (TGM5), and the catalytic subunit of the clotting factor XIIIa (F13A). These enzymes are widely distributed in various organs, tissues and body fluids. Keratinocyte transglutaminase is a membrane-bound enzyme found in mammalian epidermis that is important for the formation of the cornified cell envelope. Tissue transglutaminase is a monomeric, ubiquitous enzyme located in the cytoplasm, while hair follicle transglutaminase (a proenzyme that is activated by proteolysis) is responsible for the later stages of cornified cell envelope formation in the epidermis and hair follicle. Both prostatic transglutaminase and transglutaminase X were found initially in prostate. Perhaps the best studied transglutaminase is blood coagulation factor XIII, a tetrameric plasma protein composed of two catalytic A transglutaminase subunits and two noncatalytic B subunits. Factor XIII crosslinks fibrin chains, thus stabilizing the fibrin clot. Erythrocytes contain a membrane protein related to transglutaminases, membrane band 4.2 protein. This protein probably plays a role in regulating the shape of erythrocytes and their mechanical properties. Although highly related in amino acid sequence to the known transglutaminases, band 4.2 protein lacks transglutaminase catalytic activity.
Although the overall amino acid sequences of the transglutaminases are quite different, they share a common amino acid sequence at the active site (Y-G-Q-C-W-V-F-A) (SEQ ID NO: 1) and a strict calcium dependence for their enzymatic activity. The differences in amino acid sequence of the various transglutaminases are probably responsible for their different substrate specificities. As a group, these enzymes play roles in such diverse physiological processes as epidermal differentiation, seminal fluid coagulation and fertilization, blood coagulation and cell death or apoptosis.
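By way of non-limiting illustration, the shared active-site sequence recited above can be located in a candidate protein sequence with a simple string scan. The following is a minimal sketch that assumes one-letter amino acid codes; the function name `find_active_site` and the example peptide are hypothetical and are not taken from any sequence of the invention.

```python
# Minimal sketch: scan a protein sequence for the conserved
# transglutaminase active-site motif quoted in the text (SEQ ID NO: 1).
ACTIVE_SITE_MOTIF = "YGQCWVFA"  # Y-G-Q-C-W-V-F-A in one-letter code


def find_active_site(protein: str, motif: str = ACTIVE_SITE_MOTIF) -> list:
    """Return the 0-based start position of every exact motif occurrence."""
    protein = protein.upper()
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start)
        start = protein.find(motif, start + 1)
    return positions


# Hypothetical peptide, used only to show the call.
example_peptide = "MAEDLYGQCWVFAGVLNT"
print(find_active_site(example_peptide))
```

An exact match is used here for simplicity; a scan that tolerates substitutions would need a scoring scheme that the text does not specify.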
SUMMARY OF THE INVENTION
The present invention relates to isolated nucleic acid molecules comprising a human transglutaminase, transglutaminase-A (TGMA) or transglutaminase-B (TGMB). In one embodiment, the isolated nucleic acid molecule comprises a nucleotide sequence encoding transglutaminase-A (TGMA), such as a sequence selected from the group consisting of SEQ ID NO: 2, the complement of SEQ ID NO: 2, SEQ ID NO: 4 and the complement of SEQ ID NO: 4. In another embodiment, the isolated nucleic acid molecule comprises a nucleotide sequence encoding transglutaminase-B (TGMB), such as a sequence selected from the group consisting of SEQ ID NO: 5, the complement of SEQ ID NO: 5, SEQ ID NO: 7 and the complement of SEQ ID NO: 7. In yet another embodiment, the isolated nucleic acid molecule comprises a nucleotide sequence encoding a splicing isoform of transglutaminase-B (TGMB), such as a sequence selected from the group consisting of SEQ ID NO: 8 and the complement of SEQ ID NO: 8.
The invention further relates to a nucleic acid molecule which hybridizes under high stringency conditions to a nucleotide sequence selected from the group consisting of SEQ ID NO: 2, the complement of SEQ ID NO: 2, SEQ ID NO: 5, the complement of SEQ ID NO: 5, SEQ ID NO: 8 and the complement of SEQ ID NO: 8.
The invention further provides a method for assaying the presence of a nucleic acid molecule encoding all or a portion of TGMA or TGMB (or its isoform) in a sample, comprising contacting said sample with a second nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 2 and the complement of SEQ ID NO: 2 (for TGMA) or selected from the group consisting of SEQ ID NO: 5 and the complement of SEQ ID NO: 5 (for TGMB), or selected from the group consisting of SEQ ID NO: 8 and the complement of SEQ ID NO: 8 (for the isoform of TGMB) under conditions appropriate for selective hybridization.
The invention also relates to a vector comprising an isolated nucleic acid molecule of the invention operatively linked to a regulatory sequence, as well as to a recombinant host cell comprising the vector. The invention also provides a method for preparing a polypeptide encoded by an isolated nucleic acid molecule of the invention, comprising culturing a recombinant host cell of the invention under conditions suitable for expression of said nucleic acid molecule.
The invention further provides an isolated polypeptide encoded by isolated nucleic acid molecules of the invention (e.g., TGMA, TGMB or an isoform of TGMB). In a particular embodiment, the polypeptide comprises the amino acid sequence of SEQ ID NO: 3 (TGMA), SEQ ID NO: 6 (TGMB) or SEQ ID NO: 9 (isoform of TGMB). The invention also relates to an isolated polypeptide comprising an amino acid sequence which is greater than about 90 percent identical to the amino acid sequence of SEQ ID NO: 3, SEQ ID NO: 6 and/or SEQ ID NO: 9. The invention also relates to an antibody, or an antigen-binding fragment thereof, which selectively binds to a polypeptide of the invention, as well as to a method for assaying the presence of a polypeptide encoded by an isolated nucleic acid molecule of the invention in a sample, comprising contacting said sample with an antibody which specifically binds to the encoded polypeptide.
The invention further relates to an assay for identifying agents which alter (e.g., enhance or inhibit) the activity or expression of TGMA or TGMB. For example, a cell, cellular fraction, or solution containing TGMA or TGMB, or an isoform, an active fragment or derivative thereof, can be contacted with an agent to be tested, and the level of transglutaminase expression or activity can be assessed. Agents that enhance or inhibit transglutaminase expression or activity are also included in the current invention, as are methods of altering (enhancing or inhibiting) transglutaminase expression or activity by contacting a cell containing the transglutaminase, or by contacting the transglutaminase, with an agent that enhances or inhibits expression or activity of the transglutaminase.
Additionally, the invention pertains to pharmaceutical compositions comprising the nucleic acids of the invention, the polypeptides of the invention, and/or the agents that alter activity of the transglutaminase. The invention further pertains to methods of treating Huntington's Disease, by administering therapeutic agents, such as nucleic acids of the invention, the agents that alter activity of transglutaminase, or compositions comprising the nucleic acids and/or the agents that alter activity of the transglutaminase.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 depicts the predicted mRNA (cDNA) sequence (SEQ ID NO: 2) encoding transglutaminase-A (TGMA).
Fig. 2 depicts the predicted amino acid sequence (SEQ ID NO: 3) of transglutaminase-A (TGMA).
Figs. 3A-3G depict the gene (SEQ ID NO: 4) encoding transglutaminase-A (TGMA).
Fig. 4 depicts the predicted mRNA (cDNA) sequence (SEQ ID NO:5) encoding transglutaminase-B (TGMB).
Fig. 5 depicts the predicted amino acid sequence (SEQ ID NO: 6) of transglutaminase-B (TGMB).
Figs. 6A-6C2 depict the gene (SEQ ID NO: 7) encoding transglutaminase-B (TGMB).
Fig. 7 depicts the predicted mRNA (cDNA) sequence (SEQ ID NO: 8) encoding a splicing isoform of transglutaminase-B (TGMB).
Fig. 8 depicts the predicted amino acid sequence (SEQ ID NO: 9) of the splicing isoform of transglutaminase-B (TGMB).
DETAILED DESCRIPTION OF THE INVENTION
As described herein, Applicants have identified two human brain transglutaminase genes: transglutaminase-A (TGMA), residing on chromosome 15, and transglutaminase-B (TGMB), residing on chromosome 20. In addition, Applicants have identified two isoforms of transglutaminase-B.
NUCLEIC ACIDS OF THE INVENTION
Accordingly, the invention pertains to an isolated nucleic acid molecule comprising a human transglutaminase, TGMA or TGMB. The term "TGMA," as used herein, refers to an isolated nucleic acid molecule on chromosome 15q15, which encodes a transglutaminase, and also to an isolated nucleic acid molecule (e.g., mRNA, cDNA or the gene) that encodes TGMA (e.g., the polypeptide having SEQ ID NO: 3). In a preferred embodiment, the isolated nucleic acid molecule comprises SEQ ID NO: 2 or SEQ ID NO: 4. The term "TGMB," as used herein, refers to an isolated nucleic acid molecule on chromosome 20, which encodes a transglutaminase, and also to an isolated nucleic acid molecule (e.g., mRNA, cDNA or the gene) that encodes TGMB (e.g., the polypeptide having SEQ ID NO: 6). The term TGMB also refers to an isolated nucleic acid molecule that encodes a splicing isoform of TGMB (e.g., the polypeptide having SEQ ID NO: 9). In a preferred embodiment, the isolated nucleic acid molecule comprises SEQ ID NO: 5, SEQ ID NO: 7 or SEQ ID NO: 8. The term "the transglutaminase of interest," as used herein, refers to either TGMA or TGMB (including the splicing isoform).
The isolated nucleic acid molecules of the present invention can be RNA, for example, mRNA, or DNA, such as cDNA and genomic DNA. DNA molecules can be double-stranded or single-stranded; single-stranded RNA or DNA can be either the coding, or sense, strand or the non-coding, or antisense, strand. The nucleic acid molecule can include all or a portion of the coding sequence of the gene and can further comprise additional non-coding sequences such as introns and non-coding 3' and 5' sequences (including regulatory sequences, for example). Additionally, the nucleic acid molecule can be fused to a marker sequence, for example, a sequence that encodes a polypeptide to assist in isolation or purification of the polypeptide. Such sequences include, but are not limited to, those which encode a glutathione-S-transferase (GST) fusion protein and those which encode a hemagglutinin A (HA) polypeptide marker from influenza.
An "isolated" nucleic acid molecule, as used herein, is one that is separated from nucleic acids which normally flank the gene or nucleotide sequence (as in genomic sequences) and/or has been completely or partially purified from other transcribed sequences (e.g., as in an RNA library). For example, an isolated nucleic acid of the invention may be substantially isolated with respect to the complex cellular milieu in which it naturally occurs, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. In some instances, the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix. In other circumstances, the material may be purified to essential homogeneity, for example as determined by PAGE or column chromatography such as HPLC. Preferably, an isolated nucleic acid molecule comprises at least about 50, 80 or 90% (on a molar basis) of all macromolecular species present. With regard to genomic DNA, the term "isolated" also can refer to nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. For example, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotides which flank the nucleic acid molecule in the genomic DNA of the cell from which the nucleic acid molecule is derived.
The nucleic acid molecule can be fused to other coding or regulatory sequences and still be considered isolated. Thus, recombinant DNA contained in a vector is included in the definition of "isolated" as used herein. Also, isolated nucleic acid molecules include recombinant DNA molecules in heterologous host cells, as well as partially or substantially purified DNA molecules in solution. "Isolated" nucleic acid molecules also encompass in vivo and in vitro RNA transcripts of the DNA molecules of the present invention. An isolated nucleic acid molecule or nucleotide sequence can include a nucleic acid molecule or nucleotide sequence which is synthesized chemically or by recombinant means. Therefore, recombinant DNA contained in a vector is included in the definition of "isolated" as used herein. Also, isolated nucleotide sequences include recombinant DNA molecules in heterologous organisms, as well as partially or substantially purified DNA molecules in solution. In vivo and in vitro RNA transcripts of the DNA molecules of the present invention are also encompassed by "isolated" nucleotide sequences. Such isolated nucleotide sequences are useful in the manufacture of the encoded polypeptide, as probes for isolating homologous sequences (e.g., from other mammalian species), for gene mapping (e.g., by in situ hybridization with chromosomes), or for detecting expression of the gene in tissue (e.g., human tissue), such as by Northern blot analysis.
The present invention also pertains to nucleic acid molecules which are not necessarily found in nature but which encode TGMA, TGMB or the isoform of TGMB (e.g., a polypeptide having the amino acid sequence of SEQ ID NO: 3, 6, or 9, respectively). Thus, for example, DNA molecules which comprise a sequence that is different from the naturally-occurring nucleotide sequence but which, due to the degeneracy of the genetic code, encode TGMA or TGMB are also the subject of this invention. The invention also encompasses variants of the nucleotide sequences of the invention, such as those encoding isoforms, portions, analogues or derivatives of TGMA or TGMB. Such variants can be naturally-occurring, such as in the case of allelic variation, or non-naturally-occurring, such as those induced by various mutagens and mutagenic processes. Intended variations include, but are not limited to, addition, deletion and substitution of one or more nucleotides which can result in conservative or non-conservative amino acid changes, including additions and deletions. Preferably the nucleotide (and/or resultant amino acid) changes are silent or conserved; that is, they do not alter the characteristics or activity of the transglutaminase of interest.
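As a non-limiting illustration of the degeneracy of the genetic code, the sketch below translates two different DNA sequences and shows that both encode the same peptide. It assumes the standard genetic code. The two coding sequences are hypothetical examples constructed for this illustration (they are not excerpts of SEQ ID NO: 2, 4, 5, 7 or 8), and `translate` is an illustrative helper name.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding strand, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical coding sequences that differ at several codon positions.
seq_a = "".join(["TAT", "GGT", "CAA", "TGT", "TGG", "GTT", "TTT", "GCT"])
seq_b = "".join(["TAC", "GGC", "CAG", "TGC", "TGG", "GTG", "TTC", "GCC"])

print(translate(seq_a), translate(seq_b), seq_a == seq_b)
```

Both sequences translate to the same eight residues although their nucleotide sequences differ, which is the situation the paragraph above describes.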
Other alterations of the nucleic acid molecules of the invention can include, for example, labeling, methylation, internucleotide modifications such as uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates), charged linkages (e.g., phosphorothioates, phosphorodithioates), pendent moieties (e.g., polypeptides), intercalators (e.g., acridine, psoralen), chelators, alkylators, and modified linkages (e.g., alpha anomeric nucleic acids). Also included are synthetic molecules that mimic nucleic acid molecules in the ability to bind to a designated sequence via hydrogen bonding and other chemical interactions. Such molecules include, for example, those in which peptide linkages substitute for phosphate linkages in the backbone of the molecule.
The invention also pertains to nucleic acid molecules which hybridize under high stringency hybridization conditions, such as for selective hybridization, to a nucleotide sequence described herein (e.g., nucleic acid molecules which specifically hybridize to a nucleotide sequence encoding polypeptides described herein, and, optionally, have an activity of the polypeptide). In one embodiment, the invention includes variants described herein which hybridize under high stringency hybridization conditions (e.g., for selective hybridization) to a nucleotide sequence comprising a nucleotide sequence selected from SEQ ID NO: 2 or the complement of SEQ ID NO: 2 (for TGMA), or comprising a nucleotide sequence selected from SEQ ID NO: 5 or the complement of SEQ ID NO: 5 (for TGMB), or SEQ ID NO: 8 or the complement of SEQ ID NO: 8 (for the isoform of TGMB).
Such nucleic acid molecules can be detected and/or isolated by specific hybridization (e.g., under high stringency conditions). "Stringency conditions" for hybridization is a term of art which refers to the incubation and wash conditions, e.g., conditions of temperature and buffer concentration, which permit hybridization of a particular nucleic acid to a second nucleic acid; the first nucleic acid may be perfectly (i.e., 100%) complementary to the second, or the first and second may share some degree of complementarity which is less than perfect (e.g., 70%, 75%, 85%, 95%). For example, certain high stringency conditions can be used which distinguish perfectly complementary nucleic acids from those of less complementarity. "High stringency conditions", "moderate stringency conditions" and "low stringency conditions" for nucleic acid hybridizations are explained on pages 2.10.1-2.10.16 and pages 6.3.1-6.3.6 in Current Protocols in Molecular Biology (Ausubel, F.M. et al., "Current Protocols in Molecular Biology", John Wiley & Sons, (1998), the entire teachings of which are incorporated by reference herein). The exact conditions which determine the stringency of hybridization depend not only on ionic strength (e.g., 0.2X SSC, 0.1X SSC), temperature (e.g., room temperature, 42°C, 68°C) and the concentration of destabilizing agents such as formamide or denaturing agents such as SDS, but also on factors such as the length of the nucleic acid sequence, base composition, percent mismatch between hybridizing sequences and the frequency of occurrence of subsets of that sequence within other non-identical sequences. Thus, equivalent conditions can be determined by varying one or more of these parameters while maintaining a similar degree of identity or similarity between the two nucleic acid molecules. Typically, conditions are used such that sequences at least about 60%, at least about 70%, at least about 80%, at least about 90% or at least about 95% or more identical to each other remain hybridized to one another. By varying hybridization conditions from a level of stringency at which no hybridization occurs to a level at which hybridization is first observed, conditions which will allow a given sequence to hybridize (e.g., selectively) with the most similar sequences in the sample can be determined.
Exemplary conditions are described in Krause, M.H. and S.A. Aaronson, Methods in Enzymology, 200:546-556 (1991). See also Ausubel et al., "Current Protocols in Molecular Biology", John Wiley & Sons, (1998), which describes the determination of washing conditions for moderate or low stringency conditions. Washing is the step in which conditions are usually set so as to determine a minimum level of complementarity of the hybrids. Generally, starting from the lowest temperature at which only homologous hybridization occurs, each °C by which the final wash temperature is reduced (holding SSC concentration constant) allows an increase by 1% in the maximum extent of mismatching among the sequences that hybridize. Generally, doubling the concentration of SSC results in an increase in Tm of ~17°C. Using these guidelines, the washing temperature can be determined empirically for high, moderate or low stringency, depending on the level of mismatch sought.
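The two rules of thumb stated above can be expressed as a short calculation, offered here only as a non-limiting sketch. The function names are hypothetical. Applying the ~17°C per doubling figure to fold changes other than exactly two, by scaling with the base-2 logarithm, is an assumption of this sketch and is not stated in the text. Actual wash temperatures are determined empirically, as noted above.

```python
import math


def extra_mismatch_percent(reference_temp_c: float, wash_temp_c: float) -> float:
    """Additional mismatch (%) tolerated when the final wash is run below the
    reference temperature, using the stated rule of 1% per degree C at
    constant SSC concentration."""
    return max(0.0, reference_temp_c - wash_temp_c) * 1.0


def tm_shift_for_ssc_change(fold_change: float, per_doubling_c: float = 17.0) -> float:
    """Approximate change in Tm (degrees C) for a fold change in SSC
    concentration, using the stated ~17 degrees C per doubling.
    Scaling by log2 for other fold changes is an assumption of this sketch."""
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    return per_doubling_c * math.log2(fold_change)


# Lowering the wash from 68 to 58 degrees C at constant SSC:
print(extra_mismatch_percent(68.0, 58.0))
# Going from 0.1X to 0.2X SSC (a two-fold increase):
print(tm_shift_for_ssc_change(2.0))
```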
For example, a low stringency wash can comprise washing in a solution containing 0.2X SSC/0.1% SDS for 10 min at room temperature; a moderate stringency wash can comprise washing in a prewarmed (42°C) solution containing 0.2X SSC/0.1% SDS for 15 min at 42°C; and a high stringency wash can comprise washing in a prewarmed (68°C) solution containing 0.1X SSC/0.1% SDS for 15 min at 68°C. Furthermore, washes can be performed repeatedly or sequentially to obtain a desired result as known in the art. Equivalent conditions can be determined by varying one or more of the parameters given as an example, as known in the art, while maintaining a similar degree of identity or similarity between the target nucleic acid molecule and the primer or probe used.
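For convenience, the three exemplary washes recited above can be captured as a small lookup table. This is a non-limiting sketch; the class and field names are hypothetical, and room temperature is stored as `None` because the text gives no numeric value for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Wash:
    ssc_x: float              # SSC concentration, in multiples of 1X
    sds_percent: float        # SDS concentration, percent
    minutes: int              # wash duration
    temp_c: Optional[float]   # None means room temperature (not quantified)


# Values transcribed from the exemplary conditions in the text.
WASHES = {
    "low": Wash(ssc_x=0.2, sds_percent=0.1, minutes=10, temp_c=None),
    "moderate": Wash(ssc_x=0.2, sds_percent=0.1, minutes=15, temp_c=42.0),
    "high": Wash(ssc_x=0.1, sds_percent=0.1, minutes=15, temp_c=68.0),
}


def describe(level: str) -> str:
    """Return a one-line description of the exemplary wash for a level."""
    w = WASHES[level]
    temp = "room temperature" if w.temp_c is None else f"{w.temp_c:g}°C"
    return f"{w.ssc_x}X SSC / {w.sds_percent}% SDS for {w.minutes} min at {temp}"


for level in ("low", "moderate", "high"):
    print(level, "->", describe(level))
```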
The percent identity of two nucleotide or amino acid sequences can be determined by aligning the sequences for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first sequence). The nucleotides or amino acids at corresponding positions are then compared, and the percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = (# of identical positions / total # of positions) x 100). In certain embodiments, the length of a sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 60%, and even more preferably at least 70%, 80%, 90% or 95% of the length of the reference sequence. The actual comparison of the two sequences can be accomplished by well-known methods, for example, using a mathematical algorithm. A preferred, non-limiting example of such a mathematical algorithm is described in Karlin et al., Proc. Natl. Acad. Sci. USA, 90:5873-5877 (1993). Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) as described in Altschul et al., Nucleic Acids Res., 25:3389-3402 (1997). When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., NBLAST) can be used. See http://www.ncbi.nlm.nih.gov. In one embodiment, parameters for sequence comparison can be set at score=100, wordlength=12, or can be varied (e.g., W=5 or W=20).
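The percent identity formula above can be illustrated with the following non-limiting sketch, which operates on two sequences that have already been aligned (for example, by one of the programs named above). It assumes that "total # of positions" means the number of alignment columns, that gaps are written as "-", and that a gap never counts as identical. The function names and example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """% identity = (# identical positions / total # of positions) x 100,
    computed over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


def aligned_fraction(aligned_a: str, aligned_b: str,
                     reference_length: int, gap: str = "-") -> float:
    """Fraction of the reference length covered by columns in which both
    sequences carry a residue (used for the 30%-95% length criteria)."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    paired = sum(1 for x, y in zip(aligned_a, aligned_b)
                 if x != gap and y != gap)
    return paired / reference_length


# Hypothetical six-column alignment with one gap and one mismatch.
a, b = "ACGT-A", "ACCTTA"
print(round(percent_identity(a, b), 1), round(aligned_fraction(a, b, 6), 2))
```

Other conventions for the denominator (for example, the length of the shorter sequence) give different values, so the convention used should be stated whenever a percent identity is reported.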
Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, CABIOS (1989). Such an algorithm is incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. Additional algorithms for sequence analysis are known in the art and include ADVANCE and ADAM as described in Torellis and Robotti (1994) Comput. Appl. Biosci., 10:3-5; and FASTA described in Pearson and Lipman (1988) PNAS, 85:2444-8.
In another embodiment, the percent identity between two amino acid sequences can be determined using the GAP program in the GCG software package (available at http://www.gcg.com) using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 12, 10, 8, 6, or 4 and a length weight of 2, 3, or 4. In yet another embodiment, the percent identity between two nucleic acid sequences can be determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a gap weight of 50 and a length weight of 3.
The present invention also provides isolated nucleic acid molecules that contain a fragment or portion that hybridizes under highly stringent conditions to a nucleotide sequence comprising a nucleotide sequence selected from SEQ ID NO: 2 and the complement of SEQ ID NO: 2 (for TGMA) or that hybridizes under highly stringent conditions to a nucleotide sequence comprising a nucleotide sequence selected from SEQ ID NO: 5 and the complement of SEQ ID NO: 5 (for TGMB) or selected from SEQ ID NO: 8 and the complement of SEQ ID NO: 8 (for the isoform of TGMB). The nucleic acid fragments of the invention are at least about 15, preferably at least about 18, 20, 23 or 25 nucleotides, and can be 30, 40, 50, 100, 200 or more nucleotides in length. Longer fragments, for example, 30 or more nucleotides in length, which encode antigenic polypeptides described herein are particularly useful, such as for the generation of antibodies as described below.
In a related aspect, the nucleic acid fragments of the invention are used as probes or primers in assays such as those described herein. "Probes" are oligonucleotides that hybridize in a base-specific manner to a complementary strand of nucleic acid molecules. Such probes include polypeptide nucleic acids, as described in Nielsen et al., Science, 254, 1497-1500 (1991). Typically, a probe comprises a region of nucleotide sequence that hybridizes under highly stringent conditions to at least about 15, typically about 20-25, and more typically about 40, 50 or 75, consecutive nucleotides of a nucleic acid molecule comprising a nucleotide sequence selected from SEQ ID NO: 2 and the complement of SEQ ID NO: 2 (TGMA), or consecutive nucleotides of a nucleic acid molecule comprising a nucleotide sequence selected from SEQ ID NO: 5 and the complement of SEQ ID NO: 5 (TGMB), or consecutive nucleotides of a nucleic acid molecule comprising a nucleotide sequence selected from SEQ ID NO: 8 and the complement of SEQ ID NO: 8 (isoform of TGMB). More typically, the probe further comprises a label, e.g., radioisotope, fluorescent compound, enzyme, or enzyme co-factor.
As used herein, the term "primer" refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis using well-known methods (e.g., PCR, LCR) including, but not limited to those described herein. The appropriate length of the primer depends on the particular use, but typically ranges from about 15 to 30 nucleotides.
The nucleic acid molecules of the invention such as those described above can be identified and isolated using standard molecular biology techniques and the sequence information provided in SEQ ID NO: 2, SEQ ID NO: 5 or SEQ ID NO: 8. For example, nucleic acid molecules can be amplified and isolated by the polymerase chain reaction using synthetic oligonucleotide primers designed based on one or more of the sequences provided in SEQ ID NO: 2 and/or the complement of SEQ ID NO: 2, or SEQ ID NO: 5 and/or the complement of SEQ ID NO: 5, or SEQ ID NO: 8 and/or the complement of SEQ ID NO: 8. See generally PCR Technology: Principles and Applications for DNA Amplification (ed. H.A. Erlich, Freeman Press, NY, NY, 1992); PCR Protocols: A Guide to Methods and Applications (eds. Innis et al., Academic Press, San Diego, CA, 1990); Mattila et al., Nucleic Acids Res., 19:4967 (1991); Eckert et al., PCR Methods and Applications, 1:17 (1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Patent 4,683,202. The nucleic acid molecules can be amplified using cDNA, mRNA or genomic DNA as a template, cloned into an appropriate vector and characterized by DNA sequence analysis.
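As a non-limiting illustration of deriving primers from a sequence and its complement, the sketch below takes the forward primer from the 5' end of a template strand and the reverse primer as the reverse complement of its 3' end. The function names and the template are hypothetical; the template is an arbitrary example and not a sequence of the invention. The sketch checks primer length only, whereas practical primer design also weighs factors such as melting temperature and self-complementarity, which are not modeled here.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return dna.translate(_COMPLEMENT)[::-1]


def design_primer_pair(template: str, length: int = 20):
    """Pick a forward primer from the 5' end of the template and a reverse
    primer that anneals to its 3' end. Length is restricted to the
    15-30 nucleotide range given in the text for typical primers."""
    if not 15 <= length <= 30:
        raise ValueError("primer length should be about 15 to 30 nucleotides")
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse


# Hypothetical 48-nt template, for illustration only.
template = "ATGGCCAAGCTTGGATCCGAATTCCTGCAGGTCGACTCTAGAGGTACC"
fwd, rev = design_primer_pair(template, length=15)
print(fwd, rev)
```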
Other suitable amplification methods include the ligase chain reaction (LCR) (see Wu and Wallace, Genomics, 4:560 (1989); Landegren et al., Science, 241:1077 (1988)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA, 86:1173 (1989)), self-sustained sequence replication (Guatelli et al., Proc. Natl. Acad. Sci. USA, 87:1874 (1990)) and nucleic acid based sequence amplification (NASBA). The latter two amplification methods involve isothermal reactions based on isothermal transcription, which produce both single stranded RNA (ssRNA) and double stranded DNA (dsDNA) as the amplification products in a ratio of about 30 or 100 to 1, respectively. The amplified DNA can be radiolabelled and used as a probe for screening a cDNA library derived from human cells, mRNA in zap express, ZIPLOX or other suitable vector. Corresponding clones can be isolated, DNA can be obtained following in vivo excision, and the cloned insert can be sequenced in either or both orientations by art recognized methods to identify the correct reading frame encoding a polypeptide of the appropriate molecular weight. For example, the direct analysis of the nucleotide sequence of nucleic acid molecules of the present invention can be accomplished using well-known methods that are commercially available. See, for example, Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd Ed., CSHP, New York 1989); Zyskind et al., Recombinant DNA Laboratory Manual (Acad. Press, 1988). Using these or similar methods, the polypeptide and the DNA encoding the polypeptide can be isolated, sequenced and further characterized.
Antisense nucleic acid molecules of the invention can be designed using the nucleotide sequences of SEQ ID NO: 2 and/or the complement of SEQ ID NO: 2, and/or a portion of SEQ ID NO: 2 or the complement of SEQ ID NO: 2, and constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. Alternatively, antisense nucleic acid molecules of the invention can be designed using the nucleotide sequences of SEQ ID NO: 5 and/or the complement of SEQ ID NO: 5, and/or a portion of SEQ ID NO: 5 or the complement of SEQ ID NO: 5; or antisense nucleic acid molecules of the invention can be designed using the nucleotide sequences of SEQ ID NO: 8 and/or the complement of SEQ ID NO: 8, and/or a portion of SEQ ID NO: 8 or the complement of SEQ ID NO: 8. For example, an antisense nucleic acid molecule (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. Alternatively, the antisense nucleic acid molecule can be produced biologically using an expression vector into which a nucleic acid molecule has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid molecule will be of an antisense orientation to a target nucleic acid of interest).
In general, the isolated nucleic acid sequences of the invention can be used as molecular weight markers on Southern gels, and as chromosome markers which are labeled to map related gene positions. The nucleic acid sequences can also be used as probes, such as to hybridize and discover related DNA sequences or to subtract out known sequences from a sample. The nucleic acid sequences can further be used to derive primers for genetic fingerprinting, to raise anti-polypeptide antibodies using DNA immunization techniques, and as an antigen to raise anti-DNA antibodies or elicit immune responses. Portions or fragments of the nucleotide sequences identified herein (and the corresponding complete gene sequences) can be used in numerous ways as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, and thus locate gene regions associated with genetic disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. Additionally, the nucleotide sequences of the invention can be used to identify and express recombinant polypeptides for analysis, characterization or therapeutic use, or as markers for tissues in which the corresponding polypeptide is expressed, either constitutively, during tissue differentiation, or in diseased states. The nucleic acid sequences can additionally be used in the generation of transglutaminases for use in identification of agents which alter the activity of transglutaminases, as described in detail below.
Another aspect of the invention pertains to nucleic acid constructs containing a nucleic acid molecule selected from the group consisting of SEQ ID NO: 2, the complement of SEQ ID NO: 2, SEQ ID NO: 4, the complement of SEQ ID NO: 4, SEQ ID NO: 5, the complement of SEQ ID NO: 5, SEQ ID NO: 7, the complement of SEQ ID NO: 7, SEQ ID NO: 8, the complement of SEQ ID NO: 8, or a portion of any one of the sequences. The constructs comprise a vector (e.g., an expression vector) into which a sequence of the invention has been inserted in a sense or antisense orientation. As used herein, the term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a "plasmid", which refers to a circular double stranded DNA loop into which additional DNA segments can be ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors, expression vectors, are capable of directing the expression of genes to which they are operably linked. In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
However, the invention is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses) that serve equivalent functions.
Preferred recombinant expression vectors of the invention comprise a nucleic acid molecule of the invention in a form suitable for expression of the nucleic acid molecule in a host cell. This means that the recombinant expression vectors include one or more regulatory sequences, selected on the basis of the host cells to be used
for expression, which are operably linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, "operably linked" is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). The term "regulatory sequence" is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Such regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cell and those which direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed and the level of expression of polypeptide desired. The expression vectors of the invention can be introduced into host cells to thereby produce polypeptides, including fusion polypeptides, encoded by nucleic acid molecules as described herein.
The recombinant expression vectors of the invention can be designed for expression of a polypeptide of the invention in prokaryotic or eukaryotic cells, e.g., bacterial cells such as E. coli, insect cells (using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, supra. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.
Another aspect of the invention pertains to host cells into which a recombinant expression vector of the invention has been introduced. The terms "host cell" and "recombinant host cell" are used interchangeably herein. It is understood that such terms refer not only to the particular subject cell but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences,
such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
A host cell can be any prokaryotic or eukaryotic cell. For example, a nucleic acid molecule of the invention can be expressed in bacterial cells (e.g., E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in the art.
Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques. As used herein, the terms "transformation" and "transfection" are intended to refer to a variety of art-recognized techniques for introducing a foreign nucleic acid molecule (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation. Suitable methods for transforming or transfecting host cells can be found in Sambrook, et al. (supra), and other laboratory manuals. For stable transfection of mammalian cells, it is known that, depending upon the expression vector and transfection technique used, only a small fraction of cells may integrate the foreign DNA into their genome. In order to identify and select these integrants, a gene that encodes a selectable marker (e.g., for resistance to antibiotics) is generally introduced into the host cells along with the gene of interest. Preferred selectable markers include those that confer resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid molecules encoding a selectable marker can be introduced into a host cell on the same vector as the nucleic acid molecule of the invention or can be introduced on a separate vector. Cells stably transfected with the introduced nucleic acid molecule can be identified by drug selection (e.g., cells that have incorporated the selectable marker gene will survive, while the other cells die).
A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, can be used to produce (i.e., express) a polypeptide of the invention. Accordingly, the invention further provides methods for producing a polypeptide using the host cells of the invention. In one embodiment, the method comprises culturing the host cell of the invention (into which a recombinant expression vector encoding a polypeptide of the invention has been introduced) in a suitable medium such that the polypeptide is produced. In another embodiment, the method further comprises isolating the polypeptide from the medium or the host cell.
The host cells of the invention can also be used to produce nonhuman transgenic animals. For example, in one embodiment, a host cell of the invention is a fertilized oocyte or an embryonic stem cell into which a nucleic acid molecule of the invention has been introduced. Such host cells can then be used to create non-human transgenic animals in which exogenous nucleotide sequences have been introduced into the genome or homologous recombinant animals in which endogenous nucleotide sequences have been altered. Such animals are useful for studying the function and/or activity of the nucleotide sequence and polypeptide encoded by the sequence and for identifying and/or evaluating modulators of their activity. As used herein, a "transgenic animal" is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens and amphibians. A transgene is exogenous DNA which is integrated into the genome of a cell from which a transgenic animal develops and which remains in the genome of the mature animal, thereby directing the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal. As used herein, an "homologous recombinant animal" is a non-human animal, preferably a mammal, more preferably a mouse, in which an endogenous gene has been altered by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.
Methods for generating transgenic animals via embryo manipulation and microinjection, particularly animals such as mice, have become conventional in the art and are described, for example, in U.S. Patent Nos. 4,736,866 and 4,870,009, U.S. Patent No. 4,873,191 and in Hogan, Manipulating the Mouse Embryo (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Methods for constructing homologous recombination vectors and homologous recombinant
animals are described further in Bradley (1991) Current Opinion in Bio/Technology, 2:823-829 and in PCT Publication Nos. WO 90/11354, WO 91/01140, WO 92/0968, and WO 93/04169. Clones of the non-human transgenic animals described herein can also be produced according to the methods described in Wilmut et al. (1997) Nature, 385: 810-813 and PCT Publication Nos. WO 97/07668 and WO 97/07669.
POLYPEPTIDES OF THE INVENTION
The present invention also pertains to isolated TGMA polypeptides and isolated TGMB polypeptides, e.g., proteins, and isoforms and variants thereof, as well as polypeptides encoded by nucleotide sequences described herein (e.g., other splicing variants). The term "polypeptide" refers to a polymer of amino acids, and not to a specific length; thus, peptides, oligopeptides and proteins are included within the definition of a polypeptide. As used herein, a polypeptide is said to be "isolated" or "purified" when it is substantially free of cellular material when it is isolated from recombinant and non-recombinant cells, or free of chemical precursors or other chemicals when it is chemically synthesized. A polypeptide, however, can be joined to another polypeptide with which it is not normally associated in a cell and still be "isolated" or "purified."
The polypeptides of the invention can be purified to homogeneity. It is understood, however, that preparations in which the polypeptide is not purified to homogeneity are useful. The critical feature is that the preparation allows for the desired function of the polypeptide, even in the presence of considerable amounts of other components. Thus, the invention encompasses various degrees of purity. In one embodiment, the language "substantially free of cellular material" includes preparations of the polypeptide having less than about 30% (by dry weight) other proteins (i.e., contaminating protein), less than about 20% other proteins, less than about 10% other proteins, or less than about 5% other proteins.
When a polypeptide is recombinantly produced, it can also be substantially free of culture medium, i.e., culture medium represents less than about 20%, less than about 10%, or less than about 5% of the volume of the polypeptide preparation. The language "substantially free of chemical precursors or other chemicals" includes preparations of the polypeptide in which it is separated from chemical precursors or other chemicals that are involved in its synthesis. In one embodiment, the language "substantially free of chemical precursors or other chemicals" includes preparations of the polypeptide having less than about 30% (by dry weight) chemical precursors or other chemicals, less than about 20% chemical precursors or other chemicals, less than about 10% chemical precursors or other chemicals, or less than about 5% chemical precursors or other chemicals.
In one embodiment, a polypeptide comprises an amino acid sequence encoded by a nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 2 and complements and portions thereof, and SEQ ID NO: 4 and complements and portions thereof, e.g., SEQ ID NO: 3 or a portion thereof. In another embodiment, a polypeptide comprises an amino acid sequence encoded by a nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 5 and complements and portions thereof, and SEQ ID NO: 7 and complements and portions thereof, e.g., SEQ ID NO: 6 or a portion thereof. In yet another embodiment, a polypeptide comprises an amino acid sequence encoded by a nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 8 and complements and portions thereof, e.g., SEQ ID NO: 9 or a portion thereof. However, the invention also encompasses sequence variants. Variants include a substantially homologous polypeptide encoded by the same genetic locus in an organism, i.e., an allelic variant, as well as other splicing variants. Variants also encompass polypeptides derived from other genetic loci in an organism, but having substantial homology to a polypeptide encoded by a nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 2 and complements and portions thereof, SEQ ID NO: 4 and complements and portions thereof, SEQ ID NO: 5 and complements and portions thereof, SEQ ID NO: 7 and complements and portions thereof, and SEQ ID NO: 8 and complements and portions thereof. Variants also include polypeptides substantially homologous or identical to these polypeptides but derived from another organism, i.e., an ortholog. Variants also include polypeptides that are substantially homologous or identical to these polypeptides that are produced by chemical synthesis. Variants also include polypeptides that are substantially homologous or identical to these polypeptides that are produced by recombinant methods.
As used herein, two polypeptides (or a region of the polypeptides) are substantially homologous or identical when the amino acid sequences are at least about 45-55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically about 90% or more homologous or identical. A substantially homologous amino acid sequence, according to the present invention, will be encoded by a nucleic acid molecule hybridizing to SEQ ID NO: 2, or a portion thereof, or hybridizing to SEQ ID NO: 5, or a portion thereof, or hybridizing to SEQ ID NO: 8, or a portion thereof, under stringent conditions as more particularly described above.
To determine the percent homology or identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of one polypeptide or nucleic acid molecule for optimal alignment with the other polypeptide or nucleic acid molecule). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in one sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the other sequence, then the molecules are homologous at that position. As used herein, amino acid or nucleic acid "homology" is equivalent to amino acid or nucleic acid "identity". The percent homology between the two sequences is a function of the number of identical positions shared by the sequences (i.e., percent homology equals the number of identical positions/total number of positions times 100).
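The percent homology calculation described above can be sketched in Python as follows. This is a minimal illustration only: the function name, the gap character, and the example peptides are assumptions introduced here, and the sequences are taken to be already aligned by some other method.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Implements: identical positions / total positions * 100.
    Assumption: a position where either sequence has a gap counts toward
    the total number of positions but never counts as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * identical / len(aligned_a)


# Toy alignment of two hypothetical peptides (not sequences of the invention).
seq_a = "MKT-AYIAKQ"
seq_b = "MKTLAYLAKQ"
identity = percent_identity(seq_a, seq_b)  # 8 identical positions out of 10
```

How gap positions are counted is a design choice; other conventions (for example, excluding gapped columns from the denominator) would give different values for the same alignment.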
The invention also encompasses polypeptides having a lower degree of identity but having sufficient similarity so as to perform one or more of the same functions performed by a polypeptide encoded by a nucleic acid molecule of the invention. Similarity is determined by conservative amino acid substitution. Such substitutions are those that substitute a given amino acid in a polypeptide by another amino acid of like characteristics. Conservative substitutions are likely to be phenotypically silent. Typically seen as conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. Guidance concerning which amino acid changes are likely to be phenotypically silent is found in Bowie et al., Science 247:1306-1310 (1990).
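The conservative exchange groups listed above can be expressed as a simple lookup. The following Python sketch is illustrative only: the function names and the example sequences are hypothetical, the two sequences are assumed to be ungapped and of equal length, and any substitution outside the listed groups is simply labeled non-conservative.

```python
# Conservative exchange groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    frozenset("AVLI"),  # aliphatic: Ala, Val, Leu, Ile
    frozenset("ST"),    # hydroxyl: Ser, Thr
    frozenset("DE"),    # acidic: Asp, Glu
    frozenset("NQ"),    # amide: Asn, Gln
    frozenset("KR"),    # basic: Lys, Arg
    frozenset("FY"),    # aromatic: Phe, Tyr
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues fall in the same exchange group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


def classify_substitutions(reference: str, variant: str):
    """List (1-based position, reference residue, variant residue, label)
    for every position at which the two equal-length sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    results = []
    for position, (ref, var) in enumerate(zip(reference, variant), start=1):
        if ref != var:
            label = "conservative" if is_conservative(ref, var) else "non-conservative"
            results.append((position, ref, var, label))
    return results


# Hypothetical five-residue example: L2I and D4E are conservative, F5W is not.
changes = classify_substitutions("ALKDF", "AIKEW")
```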
A variant polypeptide can differ in amino acid sequence by one or more substitutions, deletions, insertions, inversions, fusions, and truncations or a combination of any of these. Further, variant polypeptides can be fully functional or can lack function in one or more activities. Fully functional variants typically contain only conservative variation or variation in non-critical residues or in non-critical regions. Functional variants can also contain substitution of similar amino acids that result in no change or an insignificant change in function. Alternatively, such substitutions may positively or negatively affect function to some degree. Non-functional variants typically contain one or more non-conservative amino acid substitutions, deletions, insertions, inversions, or truncation or a substitution, insertion, inversion, or deletion in a critical residue or critical region. Amino acids that are essential for function can be identified by methods known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham et al., Science, 244:1081-1085 (1989)). The latter procedure introduces single alanine mutations at every residue in the molecule. The resulting mutant molecules are then tested for biological activity in vitro, or in vitro proliferative activity. Sites that are critical for polypeptide activity can also be determined by structural analysis such as crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith et al., J. Mol. Biol., 224:899-904 (1992); de Vos et al., Science, 255:306-312 (1992)). Representative amino acids that are essential for function in TGMA and TGMB include those at the active site (SEQ ID NO: 1).
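By way of illustration, the set of single-alanine mutants used in alanine-scanning mutagenesis can be enumerated in silico. The Python sketch below is not taken from the cited reference: the function name, the mutation label format, and the five-residue example are assumptions, and residues that are already alanine are skipped because replacing them would not yield a mutant.

```python
def alanine_scan(sequence: str) -> list[tuple[str, str]]:
    """Return (label, mutant_sequence) for each single-alanine mutant.

    Labels use the common form <original residue><1-based position>A.
    Assumption: positions that already hold alanine are skipped.
    """
    mutants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue
        label = f"{residue}{index + 1}A"
        mutant = sequence[:index] + "A" + sequence[index + 1:]
        mutants.append((label, mutant))
    return mutants


# Hypothetical five-residue peptide, used only to show the output shape.
mutants = alanine_scan("GQCWA")
```

Each mutant sequence would then be expressed and tested for activity as described above; the enumeration itself says nothing about which residues are essential.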
The invention also includes polypeptide fragments of the polypeptides of the invention. Fragments can be derived from a polypeptide encoded by a nucleic acid molecule comprising SEQ ID NO: 2 or 4, or a portion thereof and the complements thereof, or from a polypeptide encoded by a nucleic acid molecule comprising SEQ ID NO: 5, 7, or 8, or a portion thereof and the complements thereof. However, the invention also encompasses fragments of the variants of the polypeptides described herein. As used herein, a fragment comprises at least 6 contiguous amino acids. Useful fragments include those that retain one or more of the biological activities of the polypeptide as well as fragments that can be used as an immunogen to generate polypeptide-specific antibodies.
Biologically active fragments (peptides which are, for example, 6, 9, 12, 15, 16, 20, 30, 35, 36, 37, 38, 39, 40, 50, 100 or more amino acids in length) can comprise a domain, segment, or motif that has been identified by analysis of the polypeptide sequence using well-known methods, e.g., signal peptides, extracellular domains, one or more transmembrane segments or loops, ligand binding regions, zinc finger domains, DNA binding domains, acylation sites, glycosylation sites, or phosphorylation sites. In a preferred embodiment, a biologically active fragment comprises the transglutaminase active site (SEQ ID NO: 1). In another preferred embodiment, a biologically active fragment includes amino terminal or carboxyl terminal truncated forms of the transglutaminase protein that retain the active site motif (SEQ ID NO: 1) and exhibit enzymatic activity in assays of transglutaminase function, such as those described in detail below. These biologically active fragments can be used, for example, in enzymatic assays such as those described below.
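As an illustration of the truncated forms described above, candidate amino terminal and carboxyl terminal truncations that still contain a given motif can be enumerated as follows. This Python sketch rests on several assumptions: the function name is hypothetical, the motif and sequence are placeholders (they are not SEQ ID NO: 1 or any sequence of the invention), only the first occurrence of the motif is considered, and the default minimum length of 6 residues is an illustrative parameter. Whether a given truncation exhibits enzymatic activity would still have to be determined by assay.

```python
def truncations_retaining_motif(sequence: str, motif: str, min_length: int = 6):
    """Enumerate N- and/or C-terminally truncated fragments containing motif.

    Returns (start, end, fragment) tuples with 1-based inclusive coordinates.
    The full-length sequence itself is excluded from the result.
    """
    motif_start = sequence.find(motif)
    if motif_start < 0:
        raise ValueError("motif not found in sequence")
    motif_end = motif_start + len(motif)

    fragments = []
    # Trim zero or more residues from the N-terminus, up to the motif.
    for n_cut in range(motif_start + 1):
        # Keep the C-terminus anywhere from the motif end to the full length.
        for c_end in range(motif_end, len(sequence) + 1):
            fragment = sequence[n_cut:c_end]
            if min_length <= len(fragment) < len(sequence):
                fragments.append((n_cut + 1, c_end, fragment))
    return fragments


# Placeholder sequence and motif, chosen only to keep the example small.
fragments = truncations_retaining_motif("AAGQCWKK", "GQCW")
```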
Fragments can be discrete (not fused to other amino acids or polypeptides) or can be within a larger polypeptide. Further, several fragments can be comprised within a single larger polypeptide. In one embodiment a fragment designed for expression in a host can have heterologous pre- and pro-polypeptide regions fused to the amino terminus of the polypeptide fragment and an additional region fused to the carboxyl terminus of the fragment.
The invention thus provides chimeric or fusion polypeptides. These comprise a polypeptide of the invention operatively linked to a heterologous protein or polypeptide having an amino acid sequence not substantially homologous to the polypeptide. "Operatively linked" indicates that the polypeptide and the heterologous protein are fused in-frame. The heterologous protein can be fused to the N-terminus or C-terminus of the polypeptide. In one embodiment the fusion polypeptide does not affect function of the polypeptide per se. For example, the fusion polypeptide can be a GST-fusion polypeptide in which the polypeptide sequences are fused to the C-terminus of the GST sequences. Other types of fusion polypeptides include, but are not limited to, enzymatic fusion polypeptides, for example β-galactosidase fusions, yeast two-hybrid GAL fusions, poly-His fusions and Ig fusions. Such fusion polypeptides, particularly poly-His fusions, can facilitate the purification of recombinant polypeptide. In certain host cells (e.g., mammalian host cells), expression and/or secretion of a polypeptide can be increased by using a heterologous signal sequence. Therefore, in another embodiment, the fusion polypeptide contains a heterologous signal sequence at its N-terminus.
EP-A-0 464 533 discloses fusion proteins comprising various portions of immunoglobulin constant regions. The Fc is useful in therapy and diagnosis and thus results, for example, in improved pharmacokinetic properties (EP-A 0232 262). In drug discovery, for example, human proteins have been fused with Fc portions for the purpose of high-throughput screening assays to identify antagonists (Bennett et al., Journal of Molecular Recognition, 8:52-58 (1995) and Johanson et al., The Journal of Biological Chemistry, 270,16:9459-9471 (1995)). Thus, this invention also encompasses soluble fusion polypeptides containing a polypeptide of the invention and various portions of the constant regions of heavy or light chains of immunoglobulins of various subclass (IgG, IgM, IgA, IgE).
A chimeric or fusion polypeptide can be produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different polypeptide sequences are ligated together in-frame in accordance with conventional techniques. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of nucleic acid fragments can be carried out using anchor primers which give rise to complementary overhangs between two consecutive nucleic acid fragments which can subsequently be annealed and re-amplified to generate a chimeric nucleic acid sequence (see Ausubel et al., Current Protocols in Molecular Biology, 1992). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST protein). A nucleic acid molecule encoding a polypeptide of the invention can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the polypeptide.
The isolated polypeptide can be purified from cells that naturally express it, purified from cells that have been altered to express it (recombinant), or synthesized using known protein synthesis methods. In one embodiment, the polypeptide is produced by recombinant DNA techniques. For example, a nucleic acid molecule encoding the polypeptide is cloned into an expression vector, the expression vector introduced into a host cell and the polypeptide expressed in the host cell. The polypeptide can then be isolated from the cells by an appropriate purification scheme using standard protein purification techniques.
In general, polypeptides of the present invention can be used as a molecular weight marker on SDS-PAGE gels or on molecular sieve gel filtration columns using art-recognized methods. The polypeptides of the present invention can be used to raise antibodies or to elicit an immune response. The polypeptides can also be used as a reagent, e.g., a labeled reagent, in assays to quantitatively determine levels of the polypeptide or a molecule to which it binds (e.g., a receptor or a ligand) in biological fluids. The polypeptides can also be used as markers for cells or tissues in which the corresponding polypeptide is preferentially expressed, either constitutively, during tissue differentiation, or in a diseased state. The polypeptides can be used to isolate a corresponding binding partner, e.g., receptor, target, substrate, or ligand, such as, for example, in an interaction trap assay, and to screen for peptide or small molecule antagonists or agonists of the binding interaction. The polypeptides can additionally be used to identify agents which alter transglutaminase activity, as discussed in detail below.
ANTIBODIES OF THE INVENTION
In another aspect, the invention provides antibodies to the polypeptides and polypeptide fragments of the invention, e.g., having an amino acid sequence of SEQ ID NO: 3 (TGMA), SEQ ID NO: 6 (TGMB), or SEQ ID NO: 9 (isoform of TGMB), or a portion thereof, or having an amino acid sequence encoded by a nucleic acid molecule comprising all or a portion of SEQ ID NO: 2, SEQ ID NO: 5, or SEQ ID NO: 8. The term "antibody" as used herein refers to immunoglobulin molecules and immunologically active portions of immunoglobulin molecules, i.e., molecules that contain an antigen binding site that specifically binds an antigen. A molecule that specifically binds to a polypeptide of the invention is a molecule that binds to that polypeptide or a fragment thereof, but does not substantially bind other molecules in a sample, e.g., a biological sample, which naturally contains the polypeptide. Examples of immunologically active portions of immunoglobulin molecules include F(ab) and F(ab')2 fragments which can be generated by treating the antibody with an enzyme such as pepsin. The invention provides polyclonal and monoclonal antibodies that bind to a polypeptide of the invention. The term "monoclonal antibody" or "monoclonal antibody composition", as used herein, refers to a population of antibody molecules that contain only one species of an antigen binding site capable of immunoreacting with a particular epitope of a polypeptide of the invention. A monoclonal antibody composition thus typically displays a single binding affinity for a particular polypeptide of the invention with which it immunoreacts. Polyclonal antibodies can be prepared as described above by immunizing a suitable subject with a desired immunogen, e.g., polypeptide of the invention or fragment thereof. The antibody titer in the immunized subject can be monitored over time by standard techniques, such as with an enzyme linked immunosorbent assay (ELISA) using immobilized polypeptide.
If desired, the antibody molecules directed against the polypeptide can be isolated from the mammal (e.g., from the blood) and further purified by well-known techniques, such as protein A chromatography to obtain the IgG fraction. At an appropriate time after immunization, e.g., when the antibody titers are highest, antibody-producing cells can be obtained from the subject and used to prepare monoclonal antibodies by standard techniques, such as the hybridoma technique originally described by Kohler and Milstein (1975) Nature, 256:495-497, the human B cell hybridoma technique (Kozbor et al. (1983) Immunol. Today, 4:72), the EBV-hybridoma technique (Cole et al. (1985), Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96) or trioma techniques. The technology for producing hybridomas is well known (see generally Current Protocols in Immunology (1994) Coligan et al. (eds.) John Wiley & Sons, Inc., New York, NY). Briefly, an immortal cell line (typically a myeloma) is fused to lymphocytes (typically splenocytes) from a mammal immunized with an immunogen as described above, and the culture supernatants of the resulting hybridoma cells are screened to identify a hybridoma producing a monoclonal antibody that binds a polypeptide of the invention. Any of the many well known protocols used for fusing lymphocytes and immortalized cell lines can be applied for the purpose of generating a monoclonal antibody to a polypeptide of the invention (see, e.g., Current Protocols in Immunology, supra; Galfre et al. (1977) Nature, 266:550-52; R.H. Kenneth, in Monoclonal Antibodies: A New Dimension In Biological Analyses, Plenum Publishing Corp., New York, New York (1980); and Lerner (1981) Yale J. Biol. Med., 54:387-402). Moreover, the ordinarily skilled worker will appreciate that there are many variations of such methods that also would be useful.
As an alternative to preparing monoclonal antibody-secreting hybridomas, a monoclonal antibody to a polypeptide of the invention can be identified and isolated by screening a recombinant combinatorial immunoglobulin library (e.g., an antibody phage display library) with the polypeptide to thereby isolate immunoglobulin library members that bind the polypeptide. Kits for generating and screening phage display libraries are commercially available (e.g., the Pharmacia Recombinant Phage Antibody System, Catalog No. 27-9400-01; and the Stratagene SurfZAP™ Phage Display Kit, Catalog No. 240612). Additionally, examples of methods and reagents particularly amenable for use in generating and screening antibody display libraries can be found in, for example, U.S. Patent No. 5,223,409; PCT Publication No. WO 92/18619; PCT Publication No. WO 91/17271; PCT Publication No. WO 92/20791; PCT Publication No. WO 92/15679; PCT Publication No. WO 93/01288; PCT Publication No. WO 92/01047; PCT Publication No. WO 92/09690; PCT Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology, 9:1370-1372; Hay et al. (1992) Hum. Antibod. Hybridomas, 3:81-85; Huse et al. (1989) Science, 246:1275-1281; Griffiths et al. (1993) EMBO J., 12:725-734.
Additionally, recombinant antibodies, such as chimeric and humanized monoclonal antibodies, comprising both human and non-human portions, which can be made using standard recombinant DNA techniques, are within the scope of the invention. Such chimeric and humanized monoclonal antibodies can be produced by recombinant DNA techniques known in the art.
In general, antibodies of the invention (e.g., a monoclonal antibody) can be used to isolate a polypeptide of the invention by standard techniques, such as affinity chromatography or immunoprecipitation. A polypeptide-specific antibody can facilitate the purification of natural polypeptide from cells and of recombinantly produced polypeptide expressed in host cells. Moreover, an antibody specific for a polypeptide of the invention can be used to detect the polypeptide (e.g., in a cellular lysate, cell supernatant, or tissue sample) in order to evaluate the abundance and pattern of expression of the polypeptide. Antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include 125I, 131I, 35S or 3H.
SCREENING ASSAYS OF THE INVENTION
Another aspect of the invention pertains to monitoring the influence of agents (e.g., drugs, compounds or other agents) on the expression or activity of polypeptides of the invention. These and other agents are described in further detail in the following sections.
SCREENING ASSAYS AND AGENTS IDENTIFIED THEREBY
The invention provides methods (also referred to herein as "screening assays") for identifying agents (e.g., antisense, polypeptides, peptidomimetics, small molecules or other drugs) which alter (e.g., increase or decrease) the activity of the polypeptides described herein. For example, such agents can be agents which bind to nucleic acid molecules or polypeptides described herein; which have a stimulatory or inhibitory effect on, for example, expression or activity of the nucleic acid molecules or polypeptides of the invention; or which change (e.g., enhance or inhibit) the ability of the polypeptides of the invention to interact with transglutaminase targets or substrates.
In one embodiment, the invention provides assays for screening candidate or test agents that bind to or modulate the activity of polypeptides described herein (or biologically active portion(s) thereof). The test agents can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the 'one-bead one-compound' library method; and synthetic library methods using affinity chromatography selection. The biological library approach is limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K.S. (1997) Anticancer Drug Des. 12:145).
In one embodiment, to identify agents which alter the activity of a transglutaminase of interest (TGMA or TGMB, or the isoform of TGMB, or more than one of these transglutaminases), a cell, cell lysate, or solution containing or expressing a polypeptide of the invention (e.g., SEQ ID NO: 3 (TGMA) or SEQ ID NO: 6 (TGMB) or SEQ ID NO: 9 (isoform of TGMB)), or an active fragment or derivative thereof (as described above), can be contacted with an agent to be tested; alternatively, the polypeptide can be contacted directly with the agent to be tested. The level (amount) of the activity of the transglutaminase of interest is assessed and compared with the level of activity in a control (i.e., the level of activity of the polypeptide or active fragment or derivative thereof in the absence of the agent to be tested). If the level of the activity in the presence of the agent differs, by an amount that is statistically significant, from the level of the activity in the absence of the agent, then the agent is an agent that alters the activity of the transglutaminase of interest. An increase in the level of transglutaminase activity relative to a control indicates that the agent enhances (is an agonist of) transglutaminase activity. Similarly, a decrease in the level of transglutaminase activity relative to a control indicates that the agent inhibits (is an antagonist of) transglutaminase activity. In another embodiment, the level of activity of a polypeptide of the invention (or derivative or fragment thereof) in the presence of the agent to be tested is compared with a control level that has previously been established. A level of the activity in the presence of the agent that differs from the control level by an amount that is statistically significant indicates that the agent alters transglutaminase activity.
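By way of illustration only, the comparison of activity in the presence and absence of a test agent can be sketched as follows. The sketch assumes replicate activity readings in arbitrary units and uses an exact two-sided permutation test as one possible way of assessing statistical significance; the function names, the example readings, and the 0.05 threshold are hypothetical and are not prescribed by the assays described herein.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(control, treated):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(control) + list(treated)
    n_treated = len(treated)
    n_control = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)
    hits = 0
    count = 0
    # Enumerate every way of relabelling the pooled readings as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        t_sum = sum(pooled[i] for i in idx)
        diff = abs(t_sum / n_treated - (total - t_sum) / n_control)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
        count += 1
    return hits / count


def classify_agent(control, treated, alpha=0.05):
    """Label the agent from the direction and significance of the change."""
    p = permutation_p_value(control, treated)
    if p >= alpha:
        return "no significant effect", p
    if mean(treated) > mean(control):
        return "enhancer (agonist)", p
    return "inhibitor (antagonist)", p


# Hypothetical replicate readings (arbitrary activity units).
control_activity = [100, 98, 103, 101, 99]
with_agent = [60, 65, 58, 62, 61]
label, p_value = classify_agent(control_activity, with_agent)
print(label, round(p_value, 4))
```

Exhaustive enumeration is practical only for small numbers of replicates; for larger data sets a conventional parametric or resampling test would ordinarily be substituted.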
The present invention also relates to an assay for identifying agents which alter the expression of a transglutaminase of interest (TGMA or TGMB, an isoform of TGMB, or a combination thereof). For example, a cell, cell lysate, or solution containing a nucleic acid encoding transglutaminase of interest (e.g., a polypeptide of the invention, such as TGMA or TGMB or an isoform thereof) can be contacted with an agent to be tested. The level and/or pattern of transglutaminase expression (e.g., the level and/or pattern of mRNA or of protein expressed) is assessed, and is compared with the level and/or pattern of expression in a control (i.e., the level and/or pattern of the expression of the transglutaminase of interest in the absence of the agent to be tested). If the level and/or pattern in the presence of the agent differs, by an amount or in a manner that is statistically significant, from the level and/or pattern in the absence of the agent, then the agent is an agent that alters the expression of the transglutaminase of interest. Enhancement of transglutaminase
expression indicates that the agent is an agonist of transglutaminase activity. Similarly, inhibition of transglutaminase expression indicates that the agent is an antagonist of transglutaminase activity. In another embodiment, the level and/or pattern of expression of a transglutaminase of interest in the presence of the agent to be tested is compared with a control level and/or pattern that has previously been established. A level and/or pattern in the presence of the agent that differs from the control level and/or pattern by an amount or in a manner that is statistically significant indicates that the agent alters expression of the transglutaminase of interest.
For example, in one embodiment of the invention, assays can be used to assess the impact of a test agent on the activity of a transglutaminase of interest in relation to a transglutaminase substrate. For example, a transglutaminase substrate (e.g., soluble forms of Htt protein containing an expanded polyglutamine tract; synthetic peptides containing amine acceptors such as EAQQQQQQEV (SEQ ID NO: 10) (see Kahlem et al., Proc. Natl. Acad. Sci. USA 93:14580-14585 (1996)); or polyamines such as putrescine) is contacted with a transglutaminase of interest (e.g., TGMA or TGMB) in the presence of a test agent, and the ability of the test agent to alter the interaction between the transglutaminase and its substrate is determined. Alternatively, a cell lysate or a solution containing the transglutaminase substrate can be used. An agent which binds to the transglutaminase or its substrate can alter the interaction by interfering with, or enhancing, the ability of the transglutaminase to bind to, associate with, or otherwise interact with the substrate.
Determining the ability of the test agent to bind to the transglutaminase or its substrate can be accomplished, for example, by coupling the test agent with a radioisotope or enzymatic label such that binding of the test agent to the polypeptide can be determined by detecting the labeled agent. For example, test agents can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, test agents can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product. It is also within the scope of this invention to determine the ability of a test agent to interact with the polypeptide without the labeling of any of the
interactants. For example, a microphysiometer can be used to detect the interaction of a test agent with the transglutaminase or the transglutaminase substrate without the labeling of the test agent, transglutaminase, or substrate (McConnell, H.M. et al. (1992) Science 257:1906-1912). As used herein, a "microphysiometer" (e.g., Cytosensor™) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between ligand and polypeptide.
In the above assay methods of the present invention, it may be desirable to immobilize either the transglutaminase, the substrate, or other components of the assay on a solid support, in order to facilitate analysis of the assay results, as well as to accommodate automation of the assay. Binding of a test agent to the transglutaminase, or interaction of the transglutaminase and the substrate in the presence and absence of a test agent, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtitre plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein (e.g., a glutathione-S-transferase fusion protein) can be provided which adds a domain that allows the transglutaminase or the substrate to be bound to a matrix or other solid support.
In another embodiment, modulators of expression of nucleic acid molecules of the invention are identified in a method wherein a cell, cell lysate, or solution containing a nucleic acid encoding a transglutaminase of interest (e.g., TGMA or TGMB) is contacted with a test agent and the expression of appropriate mRNA or polypeptide in the cell, cell lysate, or solution is determined. The level of expression of appropriate mRNA or polypeptide in the presence of the test agent is compared to the level of expression of mRNA or polypeptide in the absence of the test agent. The test agent can then be identified as a modulator of expression based on this comparison. For example, when expression of mRNA or polypeptide is greater (statistically significantly greater) in the presence of the test agent than in its absence, the test agent is identified as a stimulator or enhancer of the mRNA or polypeptide expression. Alternatively, when expression of the mRNA or
polypeptide is less (statistically significantly less) in the presence of the test agent than in its absence, the test agent is identified as an inhibitor of the mRNA or polypeptide expression. The level of mRNA or polypeptide expression in the cells can be determined by methods described herein for detecting mRNA or polypeptide.
Representative assays for assessing transglutaminase activity are described, for example, by Jeon et al. (Analytical Biochemistry 182:170-175 (1989)) (colorimetric assay); Slaughter et al. (Analytical Biochemistry 205:166-171 (1992)) (microtiter plate assay); and Lajemi et al. (Histochemical Journal 29:593-606 (1997)) (in situ analysis). Representative inhibitors of other transglutaminases are described by Lorand et al. (Exp. Eye Res. 66:531-536 (1998)); Matteucci et al. (Biol. Chem. 379:921-924 (1998)); Pliura et al. (J. Enzyme Inhibition 6:181-194 (1992)); Goldsmith et al. (J. Invest. Dermatol. 97:156-158 (1991)); and Lee et al. (J. Biol. Chem. 260(27):14689-14694 (1985)). All of these references are incorporated herein by reference in their entirety. The methods described in these references can be used for the assessment and identification of agents which alter transglutaminase activity of the polypeptides of the invention (e.g., TGMA and TGMB).
This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein in an appropriate animal model. For example, an agent identified as described herein (e.g., a test agent that is a modulating agent, an antisense nucleic acid molecule, a specific antibody, or a polypeptide-binding partner) can be used in an animal model to determine the efficacy, toxicity, or side effects of treatment with such an agent. Alternatively, an agent identified as described herein can be used in an animal model to determine the mechanism of action of such an agent. Furthermore, this invention pertains to uses of novel agents identified by the above-described screening assays for treatments as described in detail below.
PHARMACEUTICAL COMPOSITIONS
The present invention also pertains to pharmaceutical compositions comprising nucleic acids described herein, particularly nucleotides encoding the polypeptides described herein; comprising polypeptides described herein (e.g., SEQ ID NO: 3 and/or SEQ ID NO: 6 and/or SEQ ID NO: 9); and/or comprising the agent that alters (e.g., enhances or inhibits) transglutaminase activity described herein. For instance, a polypeptide, protein, fragment, fusion protein or prodrug thereof, or a nucleotide or nucleic acid construct (vector) comprising a nucleotide of the present invention, or an agent that alters transglutaminase activity, can be formulated with a physiologically acceptable carrier or excipient to prepare a pharmaceutical composition. The carrier and composition can be sterile. The formulation should suit the mode of administration. Suitable pharmaceutically acceptable carriers include but are not limited to water, salt solutions (e.g., NaCl), saline, buffered saline, alcohols, glycerol, ethanol, gum arabic, vegetable oils, benzyl alcohols, polyethylene glycols, gelatin, carbohydrates such as lactose, amylose or starch, dextrose, magnesium stearate, talc, silicic acid, viscous paraffin, perfume oil, fatty acid esters, hydroxymethylcellulose, polyvinyl pyrrolidone, etc., as well as combinations thereof. The pharmaceutical preparations can, if desired, be mixed with auxiliary agents, e.g., lubricants, preservatives, stabilizers, wetting agents, emulsifiers, salts for influencing osmotic pressure, buffers, coloring, flavoring and/or aromatic substances and the like which do not deleteriously react with the active agents. The composition, if desired, can also contain minor amounts of wetting or emulsifying agents, or pH buffering agents. The composition can be a liquid solution, suspension, emulsion, tablet, pill, capsule, sustained release formulation, or powder. The composition can be formulated as a suppository, with traditional binders and carriers such as triglycerides.
Oral formulations can include standard carriers such as pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, polyvinyl pyrrolidone, sodium saccharin, cellulose, magnesium carbonate, etc.
Methods of introduction of these compositions include, but are not limited to, intracranial, intradermal, intramuscular, intraperitoneal, intraocular, intravenous, subcutaneous, topical, oral and intranasal. Other suitable methods of introduction can also include gene therapy (as described below), rechargeable or biodegradable devices, particle acceleration devices ("gene guns") and slow release polymeric devices. The pharmaceutical compositions of this invention can also be administered as part of a combinatorial therapy with other agents.
The composition can be formulated in accordance with the routine procedures as a pharmaceutical composition adapted for administration to human beings. For example, compositions for intravenous administration typically are solutions in sterile isotonic aqueous buffer. Where necessary, the composition may also include a solubilizing agent and a local anesthetic to ease pain at the site of the injection. Generally, the ingredients are supplied either separately or mixed together in unit dosage form, for example, as a dry lyophilized powder or water free concentrate in a hermetically sealed container such as an ampule or sachette indicating the quantity of active agent. Where the composition is to be administered by infusion, it can be dispensed with an infusion bottle containing sterile pharmaceutical grade water, saline or dextrose/water. Where the composition is administered by injection, an ampule of sterile water for injection or saline can be provided so that the ingredients may be mixed prior to administration.
For topical application, nonsprayable forms, viscous to semi-solid or solid forms comprising a carrier compatible with topical application and having a dynamic viscosity preferably greater than water, can be employed. Suitable formulations include but are not limited to solutions, suspensions, emulsions, creams, ointments, powders, enemas, lotions, sols, liniments, salves, aerosols, etc., which are, if desired, sterilized or mixed with auxiliary agents, e.g., preservatives, stabilizers, wetting agents, buffers or salts for influencing osmotic pressure, etc. The agent may be incorporated into a cosmetic formulation. For topical application, also suitable are sprayable aerosol preparations wherein the active ingredient, preferably in combination with a solid or liquid inert carrier material, is packaged in a squeeze bottle or in admixture with a pressurized volatile, normally gaseous propellant, e.g., pressurized air.
Agents described herein can be formulated as neutral or salt forms. Pharmaceutically acceptable salts include those formed with free amino groups such as those derived from hydrochloric, phosphoric, acetic, oxalic, tartaric acids, etc., and those formed with free carboxyl groups such as those derived from sodium, potassium, ammonium, calcium, ferric hydroxides, isopropylamine, triethylamine, 2-ethylamino ethanol, histidine, procaine, etc.
The agents are administered in a therapeutically effective amount. The amount of agents which will be therapeutically effective in the treatment of a particular disorder or condition will depend on the nature of the disorder or condition, and can be determined by standard clinical techniques. In addition, in vitro or in vivo assays may optionally be employed to help identify optimal dosage ranges. The precise dose to be employed in the formulation will also depend on the route of administration and the seriousness of the symptoms of the disease, and should be decided according to the judgment of a practitioner and each patient's circumstances. Effective doses may be extrapolated from dose-response curves derived from in vitro or animal model test systems.
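As a non-limiting illustration of reading a dose off a dose-response curve, the following sketch estimates the dose giving a chosen response by linear interpolation on a logarithmic dose axis. The function name, the data points, and the choice of a 50% target response are assumptions made solely for the example; in practice a fitted model and clinical judgment would govern any dose selection.

```python
import math


def interpolate_dose(doses, responses, target=50.0):
    """Return the dose at which the response crosses `target`.

    Interpolates linearly in log10(dose) between the two measured points
    that bracket the target; returns None if no pair brackets it.
    """
    points = sorted(zip(doses, responses))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 != r1 and min(r0, r1) <= target <= max(r0, r1):
            frac = (target - r0) / (r1 - r0)
            log_d = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_d
    return None


# Hypothetical in vitro data: dose (arbitrary units) vs. percent inhibition.
example_doses = [1, 10, 100]
example_responses = [20, 40, 60]
print(interpolate_dose(example_doses, example_responses))
```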
The invention also provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the pharmaceutical compositions of the invention. Optionally associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration. The pack or kit can be labeled with information regarding mode of administration, sequence of drug administration (e.g., separately, sequentially or concurrently), or the like. The pack or kit may also include means for reminding the patient to take the therapy. The pack or kit can be a single unit dosage of the combination therapy or it can be a plurality of unit dosages. In particular, the agents can be separated, mixed together in any combination, or present in a single vial or tablet. Agents assembled in a blister pack or other dispensing means are preferred. For the purpose of this invention, unit dosage is intended to mean a dosage that is dependent on the individual pharmacodynamics of each agent and administered in FDA approved dosages in standard time courses.
METHODS OF THERAPY
Because of the role of transglutaminases in Huntington's disease and other neurodegenerative diseases, inhibitors of transglutaminases may be particularly useful in the treatment of such diseases.
Role of Transglutaminases in Huntington's Disease and Other Neurodegenerative Disorders
Huntington's disease (HD) is a progressive neurodegenerative disorder characterized by choreiform movements, cognitive loss, changes in personality, and other psychiatric disturbances. HD is inherited as an autosomal dominant trait and affects close to 30,000 persons in the United States. The disease has a subtle onset in the fourth to fifth decade of life and gradually worsens over a course of 10 to 20 years, eventually leading to death. The neuropathology of HD is distinctive, involving selective neuron loss predominantly in the caudate and putamen, two structures important for the control of movement as well as other brain functions. The genetic defect causing HD is a mutation of a gene on the distal tip of human chromosome 4, assigned cytogenetically to band 4p16.3 (The Huntington's Disease Collaborative Research Group, Cell 72:971-983 (1993)). Disease is due to an unstable trinucleotide repeat, (CAG)n, which on normal chromosomes is 11-34 repeat units in length but expands to 37-180 repeat units in length in HD (Duyao et al. 1993). HD is one of a number of human diseases, all of which cause movement disorders, that are due to expansion of unstable (CAG)n repeats. These include many different spinocerebellar ataxias (spinocerebellar ataxia types 1, 2, 3 (Machado-Joseph disease), and SCA6, as well as dentatorubral pallidoluysian atrophy) and X-linked spinobulbar muscular atrophy (SBMA or Kennedy's disease). These diseases show in common the genetic phenomenon of anticipation, wherein the age of onset becomes earlier in successive generations of an affected family following paternal transmission of the gene. This is due to further expansion of the (CAG)n repeat in successive generations, with a strong correlation found between repeat length and earlier age of onset across the set of (CAG)n repeat diseases.
In HD, (CAG)n repeat lengths of 40-50 are found in the majority of cases with adult onset, whereas (CAG)n repeat lengths in juvenile cases are usually 70 and above.
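Purely as an illustration of the repeat-length ranges quoted above, the following sketch counts the longest uninterrupted run of CAG units in a nucleotide sequence and compares it with the 11-34 (normal) and 37-180 (HD) ranges stated in the text. The function names and the example sequence are hypothetical, and the sketch is not a diagnostic method.

```python
import re

# Repeat-unit ranges as quoted in the surrounding text.
NORMAL_RANGE = (11, 34)
HD_RANGE = (37, 180)


def longest_cag_run(sequence):
    """Return the number of units in the longest uninterrupted (CAG)n run."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_repeat(n_units):
    """Place a repeat count in the ranges quoted in the text."""
    if NORMAL_RANGE[0] <= n_units <= NORMAL_RANGE[1]:
        return "normal range"
    if HD_RANGE[0] <= n_units <= HD_RANGE[1]:
        return "expanded (HD range)"
    return "outside the ranges quoted in the text"


# Hypothetical sequence fragment containing 45 CAG units.
fragment = "ATG" + "CAG" * 45 + "CCGCCA"
units = longest_cag_run(fragment)
print(units, classify_repeat(units))
```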
In all of these diseases, the (CAG)n repeat is contained within the coding sequence of a protein expressed in brain neurons. The (CAG)n repeat encodes a polyglutamine tract that is increased in length in individuals affected by disease. Although the proteins containing polyglutamine tracts are unrelated and mostly of unknown function (SBMA is due to a (CAG)n repeat expansion in the androgen receptor gene), it seems likely that the mutations act through a similar mechanism. The HD gene encodes a large protein of about 348 kDa (called huntingtin) that is widely expressed in brain and other tissues. The (CAG)n repeat is located in exon 1. Disease is thought to be caused by abnormal protein-protein interactions precipitated by the presence of the expanded polyglutamine tract. Poly-L-glutamines form pleated sheets of beta-strands held together by hydrogen bonds between their amides. Perutz et al. (Proc. Natl. Acad. Sci. USA 91:5355-5358 (1994)) therefore proposed that the expanded polyglutamine tract in the Huntington's disease protein may function as a polar zipper, thereby joining huntingtin (Htt) proteins together. This could result in the precipitation of Htt as an insoluble aggregate in specific neurons, thereby causing the selective pattern of neuron loss seen in HD.
Evidence for such insoluble aggregates of Htt recently has been found in both HD patients and in transgenic mice expressing mutant Htt protein or fragments thereof containing an expanded polyglutamine tract. In the transgenic mice, exon 1 of the HD gene carrying 115-156 (CAG)n repeat units was expressed under the control of the human HD gene promoter (Mangiarini et al., Cell 87:493-506 (1996)). The transgenic mice develop a neurological phenotype resembling HD. Neuropathological analysis shows the presence of distinct, intranuclear inclusions comprising an N-terminal fragment of Htt (Davies et al., Cell 90:537-548 (1997)). Similar nuclear inclusions are seen in biopsy material from cerebral cortex and caudate nucleus of HD patients. The intranuclear inclusions in both transgenic mice and patients are decorated with ubiquitin, suggesting dysfunction of the ubiquitin-proteasome pathway for protein degradation.
The N-terminal fragment of Htt, when expressed and purified, forms insoluble, high molecular weight protein aggregates, but only when the polyglutamine tract length is in the pathogenic range (Scherzinger et al., Cell 90:549-558 (1997)). The protein aggregates into amyloid-like fibrils and ribbons that bind the dye Congo Red and give an X-ray diffraction pattern indicative of pleated beta-sheet. Expression of full-length or N-terminal fragments of Htt in neuroblastoma cells also gives rise to the formation of cytoplasmic and nuclear inclusions containing aggregated Htt (Lunkes and Mandel, Human Molecular Genetics 7(9):1355-1361 (1998)). As in vitro, aggregation of Htt in cells is dependent upon the length of the polyglutamine tract and is seen only when the tract length is in the pathogenic range.
Proteins containing polyglutamine tracts generally are excellent substrates for crosslinking reactions catalyzed by transglutaminases. Involucrin, the major substrate for crosslinking by epidermal transglutaminases during terminal differentiation of keratinocytes into the cornified epithelium, is a 347 amino acid long protein that contains 12 tracts of 4-5 adjacent glutamine residues. Synthetic peptide substrates containing polyglutamine are excellent substrates for tissue transglutaminase (TGLC) purified from guinea pig liver and also for uncharacterized transglutaminases obtained in brain extracts (Kahlem et al, Proc. Natl. Acad. Sci. USA 93:14580-14585 (1996)). Lengthening the polyglutamine sequence increased reactivity of each glutamine residue suggesting that the polyglutamine repeat expansion seen in HD and other diseases may increase reactivity of the mutant protein with transglutaminases in brain. Polyglutamine containing peptides can be crosslinked with brain proteins by transglutaminase in vitro to form insoluble aggregates.
Purified fragments of Htt containing either normal or pathological polyglutamine tracts are substrates for crosslinking by tissue type transglutaminase (Karpuj et al., Proc. Natl. Acad. Sci. USA 96:7388-7393 (1999)). More transglutaminase-crosslinked aggregates form when the polyglutamine domain of Htt is in the pathological range of length. Reactivity of Htt with guinea pig liver TGLC increases with the length of the polyglutamine repeat over a range of an order of magnitude (Kahlem et al., Molecular Cell 1:595-601 (1998)). Htt fragments containing a tract of 80 glutamines are efficiently crosslinked by guinea pig TGLC into high molecular weight aggregates.
The effect of known inhibitors of transglutaminases on crosslinking of Htt or other polyglutamine-containing proteins has been investigated both in vitro and in cell models. Cystamine, a potent and specific inhibitor of most types of transglutaminases, completely prevents the crosslinking of Htt containing 80 glutamines into high molecular weight aggregates by either guinea pig TGLC or by brain extracts containing uncharacterized transglutaminases (Kahlem et al., Molecular Cell 1:595-601 (1998)). Formation of polyglutamine-containing aggregates can also be prevented in cells when the cells are treated in culture with either cystamine or monodansyl cadaverine (Igarashi et al., Nature Genetics 18:111-117 (1998)).
Methods of Treatment
Thus, the present invention also pertains to methods of treatment (prophylactic and/or therapeutic) for Huntington's disease and other neurodegenerative disorders in which a transglutaminase contributes to the pathology of the disorder. The methods use a transglutaminase therapeutic agent. A "transglutaminase therapeutic agent" is an agent that alters (e.g., inhibits) activity or expression of a transglutaminase of interest described herein (e.g., TGMA and/or TGMB). The therapy is designed to inhibit activity of a transglutaminase of the invention in an individual (for example, by administering a nucleic acid that is antisense to a nucleic acid encoding a polypeptide of the invention or a derivative or active fragment thereof; and/or by administering an agent that alters the activity of the polypeptide of the invention or derivative or fragment of the polypeptide). The transglutaminase therapeutic agent can be a nucleic acid (e.g., cDNA, mRNA or oligonucleotide); a protein, polypeptide, peptide, or peptidomimetic, an antibody (e.g., an antibody to TGMA or TGMB, as described above); a ribozyme; a small molecule or other agent that inhibits activity and/or gene expression of the transglutaminase of interest (e.g., which downregulates expression of TGMA and/or TGMB). More than one transglutaminase therapeutic agent can be used concurrently, if desired.
The transglutaminase therapeutic agent(s) are administered in a therapeutically effective amount (i.e., an amount that is sufficient to treat the disease, such as by ameliorating symptoms associated with the disease, preventing or delaying the onset of the disease, and/or lessening the severity or frequency of symptoms of the disease). The amount which will be therapeutically effective in the treatment of a particular individual's disorder or condition will depend on the symptoms and severity of the disease, and can be determined by standard clinical techniques. In addition, in vitro or in vivo assays may optionally be employed to help identify optimal dosage ranges. The precise dose to be employed in the formulation will also depend on the route of administration and the seriousness of the disease or disorder, and should be decided according to the judgment of a practitioner and each patient's circumstances. Effective doses may be extrapolated from dose-response curves derived from in vitro or animal model test systems.
For example, in one embodiment of the invention, a nucleic acid of the invention; a nucleic acid complementary to a nucleic acid of the invention; or a portion of such a nucleic acid (e.g., an oligonucleotide as described below), can be used in "antisense" therapy, in which a nucleic acid (e.g., an oligonucleotide) which specifically hybridizes to the mRNA and/or genomic DNA of TGMA and/or TGMB is administered or generated in situ. The antisense nucleic acid that specifically hybridizes to the mRNA and/or DNA inhibits expression of the transglutaminase of interest, e.g., by inhibiting translation and/or transcription. Binding of the antisense nucleic acid can be by conventional base pair complementarity, or, for example, in the case of binding to DNA duplexes, through specific interaction in the major groove of the double helix.
An antisense construct of the present invention can be delivered, for example, as an expression plasmid as described above. When the plasmid is transcribed in the cell, it produces RNA which is complementary to a portion of the mRNA and/or DNA which encodes the transglutaminase of interest (e.g., TGMA or TGMB). Alternatively, the antisense construct can be an oligonucleotide probe which is generated ex vivo and introduced into cells; it then inhibits expression by hybridizing with the mRNA and/or genomic DNA of TGMA and/or TGMB. In one embodiment, the oligonucleotide probes are modified oligonucleotides which are resistant to endogenous nucleases, e.g., exonucleases and/or endonucleases, thereby rendering them stable in vivo. Exemplary nucleic acid molecules for use as antisense oligonucleotides are phosphoramidate, phosphorothioate and methylphosphonate analogs of DNA (see also U.S. Pat. Nos. 5,176,996; 5,264,564; and 5,256,775). Additionally, general approaches to constructing oligomers useful in antisense therapy are also described, for example, by Van der Krol et al. ((1988) Biotechniques 6:958-976); and Stein et al. ((1988) Cancer Res 48:2659-2668). With respect to antisense DNA, oligodeoxyribonucleotides derived from the translation initiation site, e.g., between the -10 and +10 regions of the TGMA or the TGMB sequence, are preferred.
To perform antisense therapy, oligonucleotides (mRNA, cDNA or DNA) are designed that are complementary to mRNA encoding a transglutaminase of interest, such as TGMA and/or TGMB (i.e., complementary to SEQ ID NO: 2 and/or SEQ ID NO: 5 and/or SEQ ID NO: 8). The antisense oligonucleotides bind to mRNA transcripts of the transglutaminase of interest and prevent translation. Absolute complementarity, although preferred, is not required. A sequence "complementary" to a portion of an RNA, as referred to herein, indicates that a sequence has sufficient complementarity to be able to hybridize with the RNA, forming a stable duplex; in the case of double-stranded antisense nucleic acids, a single strand of the duplex DNA may thus be tested, or triplex formation may be assayed. The ability to hybridize will depend on both the degree of complementarity and the length of the antisense nucleic acid, as described in detail above. Generally, the longer the hybridizing nucleic acid, the more base mismatches with an RNA it may contain and still form a stable duplex (or triplex, as the case may be). One skilled in the art can ascertain a tolerable degree of mismatch by use of standard procedures.
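For illustration only, the design of an oligodeoxyribonucleotide complementary to the -10 to +10 region around a translation initiation site can be sketched as follows. The transcript shown is a made-up placeholder and is not SEQ ID NO: 2, 5 or 8; the function names are hypothetical, and position +1 is assumed to be the A of the initiation codon, with no position 0.

```python
# DNA/RNA bases mapped to their DNA complements (U pairs with A).
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(sequence):
    """Return the reverse complement of a sequence as DNA."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def initiation_site_antisense(transcript, start_index, upstream=10, downstream=10):
    """Antisense DNA oligo spanning positions -upstream..+downstream.

    start_index is the 0-based index of the A of the initiation codon.
    """
    lo = max(0, start_index - upstream)
    hi = min(len(transcript), start_index + downstream)
    return reverse_complement(transcript[lo:hi])


def count_mismatches(oligo, target_region):
    """Count positions where the oligo is not complementary to the target."""
    perfect = reverse_complement(target_region)
    return sum(a != b for a, b in zip(oligo, perfect))


# Placeholder transcript fragment (hypothetical, for demonstration only).
transcript = "GGCCGCCACCATGGCGTCTAGA"
start = transcript.find("ATG")
oligo = initiation_site_antisense(transcript, start)
print(oligo, count_mismatches(oligo, transcript[start - 10:start + 10]))
```

The mismatch count is included because absolute complementarity is not required; what degree of mismatch is tolerable would still be determined by standard hybridization procedures.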
The oligonucleotides used in antisense therapy can be DNA, RNA, or chimeric mixtures or derivatives or modified versions thereof, single-stranded or double-stranded. The oligonucleotides can be modified at the base moiety, sugar moiety, or phosphate backbone, for example, to improve stability of the molecule, hybridization, etc. The oligonucleotides can include other appended groups such as
peptides (e.g. for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT International Publication No. WO 88/09810) or the blood-brain barrier (see, e.g., PCT International Publication No. WO 89/10134), or hybridization-triggered cleavage agents (see, e.g., Van der Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, hybridization-triggered cleavage agent).
The antisense molecules are delivered to cells which express the transglutaminase of interest in vivo. A number of methods can be used for delivering antisense DNA or RNA to cells; e.g., antisense molecules can be injected directly into the tissue site, or modified antisense molecules, designed to target the desired cells (e.g., antisense linked to peptides or antibodies that specifically bind targets or antigens expressed on the target cell surface) can be administered systemically. Alternatively, in a preferred embodiment, a recombinant DNA construct is utilized in which the antisense oligonucleotide is placed under the control of a strong promoter (e.g., pol III or pol II). The use of such a construct to transfect target cells in the patient results in the transcription of sufficient amounts of single stranded RNAs that will form complementary base pairs with the corresponding endogenous transglutaminase transcripts and thereby prevent translation of the corresponding transglutaminase mRNA. For example, a vector can be introduced in vivo such that it is taken up by a cell and directs the transcription of an antisense RNA. Such a vector can remain episomal or become chromosomally integrated, as long as it can be transcribed to produce the desired antisense RNA. Such vectors can be constructed by recombinant DNA technology methods standard in the art and described above. For example, a plasmid, cosmid, YAC or viral vector can be used to prepare the recombinant DNA construct which can be introduced directly into the tissue site. Alternatively, viral vectors can be used which selectively
infect the desired tissue, in which case administration may be accomplished by another route (e.g., systemically).
Endogenous expression of the transglutaminase of interest (TGMA and/or TGMB) can also be reduced by inactivating or "knocking out" the transglutaminase of interest or its promoter using targeted homologous recombination (e.g., see Smithies et al. (1985) Nature 317:230-234; Thomas & Capecchi (1987) Cell 51:503-512; Thompson et al. (1989) Cell 56:313-321). For example, a mutant, non-functional transglutaminase (or a completely unrelated DNA sequence) flanked by DNA homologous to the endogenous transglutaminase (e.g., either the coding regions or regulatory regions of TGMA or TGMB) can be used, with or without a selectable marker and/or a negative selectable marker, to transfect cells that express the transglutaminase of interest in vivo. Insertion of the DNA construct, via targeted homologous recombination, results in inactivation of the targeted transglutaminase (TGMA or TGMB). The recombinant DNA constructs can be directly administered or targeted to the required site in vivo using appropriate vectors, as described above. Alternatively, endogenous transglutaminase expression can be reduced by targeting deoxyribonucleotide sequences complementary to the regulatory region of TGMA or TGMB (i.e., the TGMA or TGMB promoter and/or enhancers) to form triple helical structures that prevent transcription of TGMA or TGMB in target cells in the body. (See generally, Helene, C. (1991) Anticancer Drug Des. 6(6):569-84; Helene, C., et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) BioEssays 14(12):807-15). Likewise, the antisense constructs described herein, by antagonizing the normal biological activity of one of the transglutaminase proteins, can be used in the manipulation of tissue, e.g. tissue differentiation, both in vivo and/or in ex vivo tissue cultures. Furthermore, the anti-sense techniques (e.g.
microinjection of antisense molecules, or transfection with plasmids whose transcripts are anti-sense with regard to TGMA and/or TGMB mRNA or gene sequence) can be used to investigate the role of TGMA or TGMB in developmental events, as well as the normal cellular function of TGMA or TGMB in adult tissue. Such techniques can be utilized in cell culture, but can also be used in the creation of transgenic animals.
In yet another embodiment of the invention, polypeptides and/or agents that inhibit activity of the transglutaminase polypeptides of the invention, as described herein, and/or antibodies to the polypeptides of the invention, as described herein, can be used in the treatment or prevention of Huntington's disease or other neurodegenerative disorders in which transglutaminase plays a role in the pathology of the disorders. The polypeptides, agents and/or antibodies can be delivered in a composition, as described above, or by themselves. They can be administered systemically, or can be targeted to a particular tissue. The polypeptides, agents, and/or antibodies can be produced by a variety of means, including, for example, chemical synthesis; recombinant production; and in vivo production (e.g., in a transgenic animal, such as described in U.S. Pat. No. 4,873,316 to Meade et al.), and can be isolated using standard means such as those described herein.
A combination of any of the above methods of treatment (e.g., administration of antibody to TGMA and/or TGMB, in conjunction with antisense therapy targeting TGMA and/or TGMB mRNA), can also be used.
The invention will be further described by the following non-limiting examples. The teachings of all publications cited herein are incorporated herein by reference in their entirety.
EXEMPLIFICATION
Identification of Novel Transglutaminase Genes
Example 1 Identification of novel transglutaminase gene on human chromosome 15q15
TBLASTN was used to search the protein sequences of TGM1, TGM2, TGM3, TGM4, TGM5 and Factor XIIIA against the high-throughput human genome sequence publicly deposited in GenBank. This revealed a novel transglutaminase gene in a contiguous genomic sequence of approximately 184,000 base pairs that also contained the TGM5 and erythrocyte band 4.2 protein (EPB42) genes. The EPB42 gene has previously been mapped to human chromosome 15q15-q21, suggesting that all three genes map to the same location on 15q15. The protein
encoded by the novel transglutaminase gene contains a canonical transglutaminase active site motif (Y-G-Q-C-W-V-F-A) (SEQ ID NO: 1) and is most closely related to TGM5. The novel gene contains at least 11 exons with an intron/exon pattern like that of the TGM5 gene. It encodes a predicted protein of at least 706 amino acids. This is roughly the size of the novel brain transglutaminase purified by Ohashi et al. (J. Biochem. 118:1271-1278 (1995)).
Experiments were performed to detect expression in brain of mRNA transcripts from the novel transglutaminase gene on human chromosome 15q15. The polymerase chain reaction was used to determine if the novel transglutaminase gene was transcribed in brain. This was the case as shown by specific amplification of a PCR product of the expected size from a brain cDNA library. The sequence of the amplified PCR product was confirmed by direct sequencing and corresponded to the DNA sequence predicted (SEQ ID NO: 27). Details of the experimental protocol were as follows. Gene specific primers were designed within exons 10 and 11, flanking intron
10, using the Primer3 program (http://www.genome.wi.mit.edu/) with the default settings, except that length was 23-28 nucleotides, GC content 50-70% and Tm >65°C.
Sense (Forward) primer: ctcctggagtctgggggtcttaggg (SEQ ID NO: 11)
Antisense (Reverse) primer: agacaagtggggaggctccagacag (SEQ ID NO: 12)
PCR was performed on Clontech Marathon-Ready cDNA library from human whole brain (Clontech cat no. #7400-1) and human CEPH genomic DNA. Each PCR reaction included 18 μl H2O, 3 μl 10x PCR reaction buffer (Clontech), 3 μl of 2 mM dNTP mix, 0.6 μl Advantage 2 polymerase mix (50x; Clontech), 1.2 μl each Forward and Reverse primer and either 3 μl of cDNA (0.1 ng/μl) or 3 μl of CEPH genomic DNA (10 ng/μl). The total reaction volume was 30 μl. The PCR was run on an MJR thermal cycler at 94°C for 12 minutes, followed by 35 cycles of 94°C for 30 seconds and 68°C for 4 minutes.
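As a consistency check on the reaction described above, the following Python sketch tallies the component volumes, the resulting dilutions and the programmed cycling time. The variable names are illustrative assumptions; all numeric values are taken from the protocol text, and the time estimate ignores temperature ramping.

```python
# Volumes in microliters, copied from the reaction described above.
reaction_ul = {
    "water": 18.0,
    "10x PCR buffer": 3.0,
    "2 mM dNTP mix": 3.0,
    "50x polymerase mix": 0.6,
    "forward primer": 1.2,
    "reverse primer": 1.2,
    "template": 3.0,
}

total_ul = sum(reaction_ul.values())

# Final strength of each stock after dilution into the full reaction volume.
buffer_x = 10 * reaction_ul["10x PCR buffer"] / total_ul
polymerase_x = 50 * reaction_ul["50x polymerase mix"] / total_ul
dntp_mM = 2.0 * reaction_ul["2 mM dNTP mix"] / total_ul

# Template mass per reaction (ng) for the two template types.
template_ng = {"cDNA": 3.0 * 0.1, "CEPH genomic DNA": 3.0 * 10.0}

# Programmed hold time in minutes: initial step plus 35 two-step cycles.
program_min = 12 + 35 * (0.5 + 4.0)

print(f"total {total_ul:.1f} ul, buffer {buffer_x:.1f}x, polymerase {polymerase_x:.1f}x, "
      f"dNTP mix {dntp_mM:.2f} mM, program {program_min} min")
```

The tally reproduces the stated 30 μl total and shows that the buffer and polymerase stocks are each diluted to 1x.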
An aliquot of the PCR product was loaded onto a 2.5% Metaphor Agarose (BMA) gel, electrophoresed for 45 minutes at 80V, 135 mA, stained with ethidium bromide and photographed. DNA bands visible on the agarose gel were recovered and run through DNA gel extraction columns (Millipore) for 10 min at 5100 x g. This was
followed by a 10 minute ligation reaction including 2 μl DNA from columns, 0.5 μl TOPO TA cloning vector (Invitrogen) and 0.5 μl salt solution (Invitrogen). This ligation was used to transform 50 μl TOP10 cells (Invitrogen). They were incubated for 30 minutes on ice, followed by heat shock at 42°C for 30 sec. Then the cells were incubated 2 min on ice before adding 200 μl SOC medium (Invitrogen). This was followed by 60 min incubation at 37°C before the cells were spread on solid LB medium containing ampicillin (100 μg/ml), X-gal and IPTG. After incubation overnight, white colonies were picked and inoculated into 3 ml LB medium with ampicillin. The plasmid was isolated according to standard miniprep protocol (Autogen
740 robot). 3 μl of template was then used in a 10 μl cycle sequencing reaction containing 2 μl BigDye mix (ABI Biosystems) and either 25 pmol M13 Forward or M13 Reverse primers. The cycle sequencing reaction was run on an MJR thermal cycler for 30 cycles of 94°C for 10 seconds, 50°C for 5 seconds and 60°C for 4 minutes. Before running the reaction on an ABI3700 automatic sequencer, the reactions were cleaned by running through Sephadex P-50 columns spun at 3000 rpm for 3 minutes.
The nucleotide sequences were determined using the ABI Sequencing Analysis software. The sequence of the PCR amplimer was compared to the cDNA and genomic sequences and found to be an exact match.
Example 2 Identification of a novel transglutaminase gene on human chromosome 20
As described below, PCR primers were used to amplify a single transcript from prostate cDNA of an expected size and nucleotide sequence. This PCR product also was amplified from brain. In addition, a second, shorter PCR product (lacking exon 12) was amplified only from brain. They were identified as transglutaminase-B and an isoform of transglutaminase-B, respectively.
Novel transglutaminase gene on chromosome 20
The program TBLASTN was used to search the publicly available High Throughput Genomic (HTG) Sequences division of GenBank using the protein
sequences of TGM1, TGM2, TGM3, TGM4 and TGM5 as queries. The results revealed a human genomic working draft sequence (HTG phase 1) of approximately 271,000 base pairs in 9 unordered pieces from BAC clone RP11-128M1 (AL121899.34) on chromosome 20 that contained the TGM3 gene, and seemed to contain a novel transglutaminase gene as well. Initially, the approximate locations of 8 exons in the new gene were fairly easily discernible due to a high similarity to TGM3 (37-80% identities and 58-82% positives in the TBLASTN output), including an exon with a canonical transglutaminase active site motif (Y-G-Q-C-W-V-F-A) (SEQ ID NO: 1). However, when the sequence was retrieved from GenBank, it turned out to have been recently updated, and the updated sequence was considerably shorter (about 139,000 bp) and did not contain the TGM3 or novel TGM genes. A completely sequenced (HTG phase 3) BAC clone (RP4-816K17) of about 116,000 bp was then found in GenBank (AL031678) which contained the TGM3 gene and part of the novel TGM gene. The new gene and 8 of its exons were annotated in the GenBank file. Detailed analysis of the sequence using BLAST and GeneMiner2 (including the gene and exon finding programs Genscan, FGENES, HMMgene, MZEF and Xpound) seemed to confirm those 8 exons and did not reveal any additional exons. Subsequently, another completely sequenced (HTG phase 3) BAC clone (RP4-734P14) of about 90,000 bp was found in GenBank (AL049650) which seemed to contain the 3' end of the novel TGM gene, with 4 additional exons annotated in the GenBank file. Detailed analysis of the sequence as before seemed to confirm those 4 exons and did not reveal any additional exons. These 12 exons were joined together to produce a predicted mRNA (cDNA) sequence, which was then translated. The +3 reading frame coded for a predicted protein of 703 amino acids (roughly the size of the novel brain transglutaminase purified by Ohashi et al. (J. Biochem.
118:1271-1278 (1995))), with a canonical transglutaminase active site motif (Y-G-Q-C-W-V-F-A) (SEQ ID NO: 1). The sequence did not begin with a methionine, indicating that the N-terminal part of the protein was missing. Efforts to find additional 5' exons were unsuccessful. A BLASTP search of the non-redundant amino acid database from GenBank revealed that this is a novel protein sequence, with the closest match, apart from CAB37632 (the predicted protein
sequence of 442 aa from the 8 exons on AL031678) and CAB46716 (the predicted protein sequence of 260 aa from the 4 exons on AL049650), being TGM3 (about 50% identities and about 60% positives in the BLASTP output). The program Megalign (part of the DNASTAR package) was used to align the predicted partial protein sequence with the sequences of the known TGMs using the Clustal method with the PAM250 residue weight table, revealing that it exhibited considerable similarity to them, especially TGM3. The weight of the evidence strongly indicated that this was indeed a novel transglutaminase, most closely related to TGM3, as annotated in the GenBank files.
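The step of joining predicted exons, translating the result in each forward reading frame and checking for the active site motif can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the two "exons" are a short synthetic sequence built to encode the motif in the +3 frame (not the actual TGMB exons), and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

ACTIVE_SITE_MOTIF = "YGQCWVFA"  # Y-G-Q-C-W-V-F-A


def translate(cdna: str, frame: int) -> str:
    """Translate a cDNA in forward frame 1, 2 or 3; unknown codons become 'X'."""
    seq = cdna.upper().replace("U", "T")
    start = frame - 1
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def frames_with_motif(exons, motif=ACTIVE_SITE_MOTIF):
    """Join exons in order and report the forward frames whose translation contains the motif."""
    cdna = "".join(exons)
    return [frame for frame in (1, 2, 3) if motif in translate(cdna, frame)]


# Synthetic toy exons (assumption): two leading bases shift the motif into frame +3.
toy_exons = ["CCTATGGTCAA", "TGTTGGGTTTTTGCT"]
print(frames_with_motif(toy_exons))
```

With real data, the exon list would hold the annotated exon sequences in genomic order, and the frame carrying the motif without premature stop codons would be the candidate coding frame.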
Identification of a splicing isoform of the novel transglutaminase gene on chromosome 20
Gene specific primers were designed within exons 11 and 13 using the Primer3 program (http://www.genome.wi.mit.edu/) with the default settings, except that length was 22-28 nucleotides, GC content 50-70% and Tm >65°C.
Sense (Forward) primer: ctgttggctgccatgtgccttgtc (SEQ ID NO: 13)
Antisense (Reverse) primer: cgatcacaaagcccttgatgtccg (SEQ ID NO: 14)
Nested Forward primer: gaagcttctggtggagaaggac (SEQ ID NO: 15)
Nested Reverse primer: aagtgagggcttacaaggtcca (SEQ ID NO: 16)
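The length and GC-content constraints stated for primer design can be checked against the four primers listed above with a short Python sketch. The dictionary keys and function names are illustrative assumptions; the sequences are those given in the text. Melting temperature is deliberately not computed, since the Tm model used by Primer3 is not reproduced here.

```python
# Primer sequences as listed above (SEQ ID NOs 13-16); the labels are illustrative.
primers = {
    "forward": "ctgttggctgccatgtgccttgtc",
    "reverse": "cgatcacaaagcccttgatgtccg",
    "nested_forward": "gaagcttctggtggagaaggac",
    "nested_reverse": "aagtgagggcttacaaggtcca",
}


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a primer."""
    seq = seq.lower()
    return 100 * (seq.count("g") + seq.count("c")) / len(seq)


def meets_constraints(seq: str, min_len=22, max_len=28, min_gc=50.0, max_gc=70.0) -> bool:
    """Check the stated length window and GC window (both inclusive)."""
    return min_len <= len(seq) <= max_len and min_gc <= gc_percent(seq) <= max_gc


for name, seq in primers.items():
    print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%, ok={meets_constraints(seq)}")
```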
PCR was performed on Clontech Marathon-Ready cDNA library from human whole brain (Clontech cat no. #7400-1) and human CEPH genomic DNA. Each PCR reaction included 18 μl H2O, 3 μl 10x PCR reaction buffer (Clontech), 3 μl of 2 mM dNTP mix, 0.6 μl Advantage 2 polymerase mix (50x; Clontech), 1.2 μl each Forward and Reverse primer and either 3 μl of cDNA (0.1 ng/μl) or 3 μl of
CEPH genomic DNA (10 ng/μl). The total reaction volume was 30 μl. The PCR was run on an MJR thermal cycler at 94°C for 12 minutes, followed by 35 cycles of 94°C for 30 seconds and 68°C for 4 minutes. A nested PCR was then performed under the same conditions, except that the nested primers were used with 1 μl of the first PCR product as DNA template.
An aliquot of the nested PCR product was loaded onto a 2.5% Metaphor Agarose (BMA) gel, electrophoresed for 45 minutes at 80V, 135 mA, stained with ethidium bromide and photographed.
DNA bands visible on the gel electrophoretogram were recovered and run through DNA gel extraction columns (Millipore) for 10 min at 5100 x g. This was followed by a 10 minute ligation reaction including 2 μl DNA from columns, 0.5 μl TOPO TA cloning vector (Invitrogen) and 0.5 μl salt solution (Invitrogen). This ligation was used to transform 50 μl TOP10 cells (Invitrogen). They were incubated for 30 minutes on ice, followed by heat shock at 42°C for 30 sec. Then the cells were incubated 2 min on ice before adding 200 μl SOC medium (Invitrogen). This was followed by 60 min incubation at 37°C before the cells were spread on solid LB medium containing ampicillin (100 μg/ml), X-gal and IPTG. After incubation overnight, white colonies were picked and inoculated into 3 ml LB medium with ampicillin. The plasmid was isolated according to standard miniprep protocol (Autogen
740 robot). 3 μl of template was then used in a 10 μl cycle sequencing reaction containing 2 μl BigDye mix (ABI Biosystems) and either 25 pmol M13 Forward or M13 Reverse primers. The cycle sequencing reaction was run on an MJR thermal cycler for 30 cycles of 94°C for 10 seconds, 50°C for 5 seconds and 60°C for 4 minutes. Before running the reaction on an ABI3700 automatic sequencer, the reactions were cleaned by running through Sephadex P-50 columns spun at 3000 rpm for 3 minutes.
The nucleotide sequences were determined using the ABI Sequencing Analysis software. The sequence of the PCR amplimer was compared to the cDNA and genomic sequences and found to be an exact match.
While this invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.