WO2003004682A2 - Nucleic acid regulatory sequences and uses thereof - Google Patents

Nucleic acid regulatory sequences and uses thereof

Publication number
WO2003004682A2
Authority
WO
WIPO (PCT)
Prior art keywords
seq
sequence
nucleic acid
vector
cni
Prior art date
Application number
PCT/US2002/021228
Other languages
French (fr)
Other versions
WO2003004682A3 (en)
Inventor
Donald C. Lo
James B. Antczak
Shawn Barney
Howard M. Bomze
Original Assignee
Cogent Neuroscience, Inc.
Priority date
Filing date
Publication date
Application filed by Cogent Neuroscience, Inc.
Priority to AU2002316557A (AU2002316557A1)
Publication of WO2003004682A2
Publication of WO2003004682A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85: Vectors or expression systems specially adapted for eukaryotic hosts for animal cells

Definitions

  • the present invention relates to nucleic acid regulatory sequences that modulate (e.g., promote, enhance, suppress, repress, or silence) expression of a nucleic acid of interest in a cell.
  • the present invention relates to nucleic acid regulatory sequences referred to herein as the CNI-01142 regulatory sequence, the CNI-01080 regulatory sequence, the CNI-01104 regulatory sequence, the CNI-01120 regulatory sequence, the CNI-01125 regulatory sequence, the CNI-01131 regulatory sequence, and transcription-modulating sequences thereof. In a specific embodiment, the present invention relates to the CNI-01142 regulatory sequence, or the CNI-01080 regulatory sequence, or the CNI-01104 regulatory sequence, or the CNI-01120 regulatory sequence, or the CNI-01125 regulatory sequence, or the CNI-01131 regulatory sequence, or portions thereof, that promote or enhance transcription of nucleic acids of interest in cells, in particular cells of the nervous system, including, but not limited to, cells in the central nervous system (CNS), such as neurons and glia in the brain.
  • the present invention also relates to vectors and cells engineered to contain such regulatory sequences.
  • the present invention still further relates to methods of using the regulatory sequences of the invention to modulate expression of a nucleic acid of interest in cells, preferably cells of the nervous system.

2. BACKGROUND OF THE INVENTION

  • Promoters of nervous system-specific genes have been used to direct the expression of heterologous genes to nervous system-derived cells in culture or in transgenic animals.
  • the use of nervous system-specific promoters to express heterologous genes in cell culture or in transgenic mice has allowed the creation of disease models (Sturchler-Pierrat & Sommer, Rev. Neurosci. 10(1):15-24 (1999);
  • Nervous system-specific promoters have also been used to deliver therapeutic genes to the CNS to correct genetic deficiencies in vitro and in vivo (Kaplitt et al, Nature Genet. 8(2): 148-54 (1994); Miyao et al, Jpn. J. Cancer Res. 88(7):678-86 (1997); Hayward, Chem. Senses 20(2):261-9 (1995)).
  • mutational analysis and sequence analysis have been used to identify and map the cis-acting regulatory regions and trans-acting factors that impart tissue-specificity and regulatory characteristics to the promoter.
  • the invention relates to an isolated nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • the invention relates to an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • the invention relates to an isolated nucleic acid regulatory sequence, wherein the isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • the invention relates to an isolated nucleic acid regulatory sequence, wherein the isolated nucleic acid regulatory sequence is created by nuclease digestion of a nucleic acid molecule comprising SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • the invention relates to an isolated nucleic acid molecule of claim 1, wherein the isolated nucleic acid molecule is operably linked to a nucleic acid molecule comprising a coding sequence.
  • the invention relates to an isolated nucleic acid regulatory sequence molecule of any one of the preceding, wherein the isolated nucleic acid regulatory sequence molecule is operably linked to a nucleic acid molecule comprising a coding sequence.
  • the invention relates to an isolated nucleic acid molecule comprising the reverse complement of the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • the invention relates to an isolated nucleic acid regulatory sequence molecule comprising the reverse complement of the nucleotide sequence of the nucleic acid regulatory sequence. In another specific embodiment, the invention relates to an isolated nucleic acid regulatory sequence molecule, wherein the transcription activating nucleotide sequence comprises at least about 50 contiguous nucleotides of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • the invention relates to an isolated nucleic acid regulatory sequence molecule, wherein the transcription activating nucleotide sequence comprises at least about 100 contiguous nucleotides of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. In another specific embodiment, the invention relates to an isolated nucleic acid regulatory sequence molecule, wherein the transcription activating nucleotide sequence comprises at least about 200 contiguous nucleotides of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
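The reverse-complement and "at least about N contiguous nucleotides" language above can be illustrated with a minimal Python sketch. The function names and the toy sequences in the usage note are hypothetical and are not taken from the source; the actual SEQ ID NO sequences are not reproduced here.

```python
# Illustrative helpers only; names are assumptions, not part of the patent text.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G, reversed)."""
    return seq.translate(_COMPLEMENT)[::-1]


def contiguous_fragments(seq: str, length: int):
    """Yield every run of exactly `length` contiguous nucleotides found in `seq`."""
    for start in range(len(seq) - length + 1):
        yield seq[start:start + length]


def contains_fragment(candidate: str, reference: str, min_length: int) -> bool:
    """True if `candidate` contains at least `min_length` contiguous nucleotides of `reference`."""
    if min_length <= 0:
        raise ValueError("min_length must be positive")
    return any(frag in candidate
               for frag in contiguous_fragments(reference, min_length))
```

For example, `reverse_complement("ATGC")` gives `"GCAT"`, and `contains_fragment("TTACGTT", "GACGA", 3)` is true because the two toy strings share the run `ACG`.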
  • the invention also provides nucleic acid sequences that hybridize to SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • the invention relates to an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or the complement thereof.
  • the invention also provides for vectors containing a regulatory sequence of the invention.
  • the invention relates to a vector comprising the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • the invention relates to a vector comprising at least 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000 or 1250 nucleotides of the nucleotide sequence of SEQ ID NO: 1; at least 20, 30, 40, 50 or 60 nucleotides of the nucleotide sequence of SEQ ID NO: 2; at least 20, 30, 40, 50, 75, 100 or 125 nucleotides of the nucleotide sequence of SEQ ID NO: 3; at least 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1500, 2000 or 2500 nucleotides of the nucleotide sequence of SEQ ID NO: 4; at least 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000 or 1500 nucleotides of the nucleotide sequence of SEQ ID NO: 5; and/or at least 20, 30, 40, 50, 75, 100, 200, 300, 400, 500 or
  • the invention relates to a vector containing an isolated nucleic acid regulatory sequence that hybridizes along its entire length to the sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • the invention relates to a vector further comprising a coding sequence operably linked to the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6, or a subsequence thereof.
  • the invention relates to a vector comprising a coding sequence operably linked to the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6, or a subsequence thereof, wherein the coding sequence is heterologous to the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • any of the vectors described above further comprises a multiple cloning site (MCS), wherein when a nucleotide sequence is inserted into the MCS, the nucleotide sequence is operably linked to SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a subsequence thereof.
  • MCS multiple cloning site
  • the invention relates to a vector further comprising an internal ribosomal entry site (IRES).
  • the invention relates to a vector further comprising a multiple cloning site (MCS), wherein when a nucleotide sequence is inserted into the MCS, the nucleotide sequence is operably linked to SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • this vector comprises an IRES.
  • the invention relates to a vector comprising SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a subsequence thereof, and an MCS, wherein when a coding sequence is present within the MCS, the coding sequence is operably linked to the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a subsequence thereof. In another specific embodiment, the invention relates to a vector comprising SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a subsequence thereof, and an MCS, wherein when a coding sequence is present within the MCS, the coding sequence is operably linked to the transcription activating sequence
  • any of the above vectors contains a coding sequence within the MCS.
  • said coding sequence is a reporter gene sequence.
  • said reporter gene sequence encodes β-galactosidase, a fluorescent protein, chloramphenicol acetyltransferase, luciferase or an antigenic marker.
  • said coding sequence is a neuroprotective sequence.
  • the invention provides a vector comprising a promoter and an MCS operably linked in an upstream-to-downstream order, and the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a transcription activating nucleotide sequence thereof.
  • this vector further comprises an internal ribosomal entry site (IRES).
  • any of the vectors of the invention can be adapted for transfer to a eukaryotic host cell, including a human host cell.
  • the eukaryotic host cell is a nervous system cell.
  • the nervous system cell is a nervous system cell line, glial cell, astrocyte, oligodendrocyte, mesencephalic neuron, hypothalamic neuron or cortical neuron.
  • the vectors above are adapted for transfer to a prokaryotic host cell.
  • the invention further provides for host cells, or progeny thereof, containing the vectors above.
  • said host cell is a eukaryotic cell, including a human host cell.
  • said host cell is a nervous system cell. In another specific embodiment, said host cell is a prokaryotic cell.
  • the invention also provides for kits containing one or more of the vectors and/or host cells of the invention in one or more containers, and, preferably, further containing instructions for use.
  • the present invention also relates to transgenic non-human animals engineered to contain a nucleic acid regulatory sequence of the invention.
  • the nucleic acid regulatory sequence can be contained within an episome or, alternatively, the sequence can be integrated within the genome of the transgenic animal. Genomic insertion can be by either homologous or non-homologous recombination.
  • the invention further provides a method of expressing a coding sequence in a host cell in cell culture. In one embodiment, the method comprises culturing a host cell containing a vector of the invention that contains a coding sequence under conditions effective to allow expression of the coding sequence by said host cell.
  • the method comprises culturing a host cell of the invention wherein the nucleic acid regulatory sequence controls expression of a coding sequence endogenously present in the genome of said host cell, under conditions effective to allow expression of the coding sequence by said host cell.
  • the invention also provides a method of producing a peptide or polypeptide comprising maintaining a host cell of the invention that contains a coding sequence that encodes a peptide or polypeptide under conditions effective to allow expression of said coding sequence, and to allow translation of the resulting mRNA, such that a peptide or polypeptide is expressed.
  • the coding sequence is present as part of a vector of the invention.
  • the host cell has been engineered such that a nucleic acid regulatory sequence of the invention controls expression of a coding sequence endogenously present in the genome of said host cell, under conditions effective to allow expression of the coding sequence by said host cell.
  • the vector is present in the genome of said host cell.
  • the invention also provides a method of identifying a modulator of a nucleic acid regulatory sequence of the invention, comprising: (a) contacting a host cell containing a nucleic acid regulatory sequence of the invention operably linked to a reporter gene sequence with a test compound; and (b) assaying expression of the reporter gene, such that, if a change in reporter gene expression relative to its expression in the absence of the test compound is detected, a modulator of the nucleic acid regulatory sequence is identified. In a particular embodiment, the host cell is a nervous system cell.
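The comparison in step (b) can be sketched in Python as a fold-change test on reporter readouts measured with and without the test compound. This is only an illustration: the source requires that a change be "detected" but gives no numeric criterion, so the 1.5-fold threshold, the function names, and the readout values in the usage note are all assumptions.

```python
from statistics import mean


def fold_change(treated, untreated):
    """Ratio of mean reporter signal with the test compound to mean signal without it."""
    baseline = mean(untreated)
    if baseline <= 0:
        raise ValueError("untreated reporter signal must be positive")
    return mean(treated) / baseline


def classify_compound(treated, untreated, threshold=1.5):
    """Label a test compound by the direction of the reporter change.

    `threshold` is a hypothetical cutoff for a detectable change;
    the source does not specify one.
    """
    ratio = fold_change(treated, untreated)
    if ratio >= threshold:
        return "activator"
    if ratio <= 1 / threshold:
        return "inhibitor"
    return "no detectable change"
```

With made-up readouts, `classify_compound([30, 34, 32], [10, 11, 9])` returns `"activator"`. In practice a statistical test across replicates would likely replace the fixed threshold.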
  • an "isolated nucleic acid" is a nucleic acid outside its normal biological context (i.e., outside an intact chromosome).
  • GenBank accession numbers U62317 or AC005226 see Section 6.2
  • GenBank accession number HS941F9 see Section 6.3
  • GenBank accession number AC000079 see Sections 6.4 and 6.5
  • GenBank accession number AL031186 see Section 6.6
  • isolated nucleic acid refers to any other full-length sequence disclosed in GenBank.
  • an isolated nucleic acid molecule of the invention contains no more than up to about 5,000 to 10,000 nucleotides of sequence that would endogenously flank SEQ ID NO: 1, or SEQ ID NO: 2, or SEQ ID NO: 3, or SEQ ID NO: 4, or SEQ ID NO: 5, or SEQ ID NO: 6.
  • isolated nucleic acid refers to either the single-stranded or double-stranded form of the nucleic acid molecule.
  • a "nucleic acid regulatory sequence" or "regulatory sequence" comprises a nucleotide sequence that, when operably linked to a nucleic acid of interest, modulates (e.g., activates (promotes, enhances) or inhibits (suppresses, represses, silences)) transcription of the nucleic acid of interest, particularly in a cell.
  • a nucleotide sequence is considered "transcription activating" if, when operably linked to a nucleic acid whose expression may be monitored, and placed in a cell (e.g., a nervous system cell in cell culture) under conditions under which expression may take place, promotes or enhances the expression of the nucleic acid detectably above the expression of the same nucleic acid in the absence of the nucleotide sequence operably linked thereto.
  • a nucleic acid regulatory sequence "promotes" transcription of a nucleic acid of interest if, when operably linked to the nucleic acid of interest, it elicits a detectable level of expression of the nucleic acid of interest.
  • a nucleic acid regulatory sequence "enhances" transcription of a nucleic acid of interest if, when operably linked to the nucleic acid of interest, it increases the detectable level of expression relative to expression of the nucleic acid of interest in the absence of the nucleic acid regulatory sequence operably linked thereto.
  • a nucleic acid regulatory sequence is considered to enhance transcription of the nucleic acid of interest when said nucleic acid is already expressed to some detectable level (e.g., is controlled by a promoter sequence) that is increased by the nucleic acid regulatory sequence.
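One way to read the "promotes" and "enhances" definitions above is as a comparison of expression with and without the operably linked regulatory sequence. The Python sketch below encodes that reading; the function name, the `detection_limit` parameter, and the returned labels are assumptions made for illustration, and the source does not define a numeric detection limit.

```python
def describe_activity(with_seq: float, without_seq: float,
                      detection_limit: float) -> str:
    """Classify a regulatory sequence from two expression measurements.

    with_seq:        expression when the regulatory sequence is operably linked
    without_seq:     expression of the same nucleic acid without it
    detection_limit: hypothetical assay floor below which expression
                     counts as undetectable
    """
    if with_seq <= detection_limit:
        return "no detectable expression"
    if without_seq <= detection_limit:
        # Expression becomes detectable only with the sequence present.
        return "promotes"
    if with_seq > without_seq:
        # Already-detectable expression (e.g., from a promoter) is increased.
        return "enhances"
    return "not transcription activating"
```

For instance, with a detection limit of 1.0, measurements of 8.0 (with) and 0.2 (without) would be labeled `"promotes"`, while 8.0 and 3.0 would be labeled `"enhances"`.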
  • a nucleotide sequence e.g., a nucleic acid regulatory sequence
  • a nucleic acid regulatory sequence is "operably linked" to a nucleic acid of interest if said nucleotide sequence is present in a cis configuration relative to said nucleic acid of interest, i.e., the nucleotide sequence is attached via a covalent linkage (e.g., a phosphodiester linkage) to the same nucleic acid molecule that comprises the nucleic acid of interest.
  • a nucleic acid regulatory sequence can be adjacent to a nucleic acid of interest or to a promoter sequence that promotes expression of the nucleic acid of interest.
  • the nucleic acid regulatory sequence can be placed upstream (i.e., 5') of the sequence whose expression is to be activated (promoted, enhanced) or inhibited. Additionally, in particular where the regulatory sequence has enhancer or silencer activity, the nucleic acid regulatory sequence can be placed within (e.g., in an intron) or downstream (i.e., 3') of the sequence whose expression is to be modulated.
  • a "coding sequence" is a nucleotide sequence that, when transcribed, yields an RNA molecule.
  • a coding sequence comprises an open reading frame (ORF) that can be translated into a peptide or polypeptide sequence.
  • ORF open reading frame
  • a coding sequence comprises a nucleotide sequence that, when transcribed, yields a tRNA, rRNA, antisense RNA or enzymatically active RNA molecule.
  • a first nucleic acid sequence is considered "heterologous" to a second nucleic acid sequence when the sequences are not endogenously present contiguous to each other, or when neither sequence is endogenously contained within the other.
  • a "vector" is any nucleic acid that is self-replicating in at least one host cell, and is capable of containing the isolated nucleic acid for storage, replication, or propagation of the isolated nucleic acid, or for expression of a coding sequence operably linked to the isolated nucleic acid.
  • a "nervous system cell" can refer to a cell of the central nervous system (CNS), such as neurons, e.g., cortical, hippocampal, mesencephalic or medullary neurons, and glia in the brain, as well as to eye, spinal cord, and olfactory bulb cells, and to cells in the peripheral nervous system (PNS).
  • CNS central nervous system
  • PNS peripheral nervous system
  • a "peptide" refers to a macromolecule of from two to about nineteen amino acids covalently linked, e.g., covalently linked via peptide bonds.
  • a "polypeptide" refers to a macromolecule of at least about twenty amino acids covalently linked, e.g., covalently linked via peptide bonds.
  • FIG. 1 is a diagram of the plasmid pCOGENTl containing CNI-01142.
  • pCOGENTl contains a basal promoter (i.e., a TATA box) between the BamHI and ClaI sites.
  • the negative control plasmid for expression experiments is pCOGENTl containing only a basal promoter and no regulatory sequence insert; the positive control is pCOGENTl (E), which is pCOGENTl containing the CMV enhancer region inserted into the MCS upstream of the basal promoter.
  • FIG. 2 depicts the DNA sequence of CNI-01142.
  • FIG. 3 depicts the UCSC linkage map of a region of human chromosome 22 containing the nucleotide sequence of CNI-01142.
  • the position of the sequence of CNI-01142 in the map is from base position 47478467 to 47479777.
  • the position in the UCSC linkage map (University of California-Santa Cruz, October 7, 2000 freeze) corresponds to positions 34371702 to 34373014 in the nucleotide sequence of human chromosome 22 as reported in CHR22_19_05_2000 version 2 (The Sanger Centre, Cambridge, England).
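The two coordinate ranges quoted for CNI-01142 can be compared with a few lines of Python. The sketch assumes both ranges are 1-based and inclusive, which the source does not state; the function and constant names are illustrative only.

```python
def interval_length(start: int, end: int) -> int:
    """Length of an interval, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinates quoted in the text for CNI-01142.
UCSC_RANGE = (47478467, 47479777)     # UCSC map, October 7, 2000 freeze
SANGER_RANGE = (34371702, 34373014)   # CHR22_19_05_2000 version 2

ucsc_length = interval_length(*UCSC_RANGE)
sanger_length = interval_length(*SANGER_RANGE)

# Offset between the two numbering schemes at each end of the interval.
start_offset = UCSC_RANGE[0] - SANGER_RANGE[0]
end_offset = UCSC_RANGE[1] - SANGER_RANGE[1]
```

Under the inclusive assumption the UCSC range spans 1311 bases and the Sanger range 1313, so the start and end offsets differ by two. A check like this is a quick way to flag ranges that may need to be verified against the sequence listing.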
  • Base Position Chromosomal coordinates, numbered from the telomere of the short arm of human chromosome 22.
  • Chromosome Band Light and dark blocks show traditional cytological bands seen with Giemsa staining. STS Markers:
  • Gaps spanned by mRNA and paired reads have a white horizontal line through the black box to indicate bridging.
  • Coverage In dense display, the level of gray gives level of coverage: White/Clear: no coverage (gap); Light Gray: predraft (less than 4x shotgun); Medium Gray: draft (at least 4x shotgun); Dark Gray: multiple draft, covered by more than one draft clone; Finished: covered by a finished clone.
  • YourSeq Position of the DNA sequence of CNI-01142 relative to other sequences or features in the linkage map.
  • Known Genes from full length mRNAs: Known protein coding genes from LocusLink.
  • Exons are represented by black boxes; thin horizontal lines represent introns. In the full view, the arrows on the introns indicate direction of transcription.
  • Affymetrix Gene Predictions (excluding known genes): Gene predictions based on ESTs, mRNA, codon usage, and splice site statistics by the proprietary program Genie, supplied by David Kulp at Affymetrix.
  • Ensembl Genes Gene predictions from Ensembl, an automatic DNA sequence assembly and tracking system developed by the European Bioinformatics Institute and the Sanger Center (Cambridgeshire, Wellcome Trust Genome Campus, UK). Arrows on the introns, when present, indicate direction of transcription.
  • Fgenesh++ Gene Predictions Fgenesh++ predictions based on Softberry's gene finding software (Salamov & Solovyev, Genome Research 10(5):516-522 (2000)).
  • Full mRNAs Shows aligning regions as black boxes or vertical lines connected by horizontal lines for gaps. In full display, arrows on the introns indicate the direction of transcription.
  • Human ESTs That Have Been Spliced Shows spliced human ESTs. This track may suggest alternative splicings.
  • Human ESTs In dense mode the level of gray in this track represents the number of ESTs that align at that region.
  • RNA Genes Indicated location of non-protein coding RNA genes and pseudogenes, including tRNAs, rRNAs, SRPs, C/D box methylation guide snoRNAs, U6 etc. RNA-like, HBII, and hVS-like elements.
  • FIG. 4 shows the locations of transcription-factor binding motifs in CNI-01142.
  • the names of the factors that bind the motifs are displayed above the diagram.
  • the nucleotide positions of the motifs, as they occur sequentially 5' to 3' in the nucleotide sequence of CNI-01142, are indicated to the right of the transcription factor name.
  • FIG. 5 depicts quantitative measurement of cell transfection in coronal brain slices prepared from three regions along the rostral-caudal axis. Cells were transfected with pCOGENTl containing CNI-01142, negative control DNA, or positive control DNA. The number of cells expressing the reporter gene in each slice is determined visually.
  • FIG. 6 shows images of coronal brain slices transfected with (a) pCOGENTl containing CNI-01142; (b) negative control plasmid, pCOGENTl; or (c) positive control DNA, pCOGENTl (E).
  • FIG. 7 is a diagram of the plasmid pCOGENTl containing CNI-01080. Regulatory sequences of the present invention are indicated as "UNIQUE SEQUENCE
  • pCOGENTl contains a basal promoter (i.e., a TATA box) between the BamHI and ClaI sites.
  • FIG. 8 depicts the DNA sequence of CNI-01080.
  • FIG. 9 depicts the UCSC linkage map of a region of human chromosome 22 containing the nucleotide sequence of CNI-01080. The position of the sequence of CNI-01080 in the map is the complement of base positions 42509227 to 42509291.
  • the position in the UCSC linkage map corresponds to the complement of positions 29402464 to 29402528 in the nucleotide sequence of human chromosome 22 as reported in CHR22_19_05_2000 version 2 (The Sanger Centre, Cambridge, England).
  • Base Position Chromosomal coordinates, numbered from the telomere of the short arm of human chromosome 22.
  • Chromosome Band Light and dark blocks show traditional cytological bands seen with Giemsa staining.
  • STS Markers Location of markers from genetic, RH, YAC, and FISH maps.
  • Mouse Synteny Syntenic chromosomal region in mouse if known.
  • GC Percent Darker shades of gray correspond to higher % GC figured for a window of 20 kbp.
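The GC Percent track described above reports the percentage of G and C bases per window. A minimal Python sketch of such a calculation follows; it assumes non-overlapping windows (the source states only the 20 kbp window size) and uses illustrative function names.

```python
def gc_percent(seq: str) -> float:
    """Percentage of bases in `seq` that are G or C (case-insensitive)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    upper = seq.upper()
    return 100.0 * (upper.count("G") + upper.count("C")) / len(upper)


def windowed_gc(seq: str, window: int = 20_000) -> list:
    """GC percentage for consecutive, non-overlapping windows of `window` bases.

    The final window may be shorter than `window` if the sequence
    length is not an exact multiple of it.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [gc_percent(seq[i:i + window]) for i in range(0, len(seq), window)]
```

On a toy input, `windowed_gc("GGCCATAT", window=4)` returns `[100.0, 0.0]`; each value would map to a darker or lighter shade in the track.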
  • FPC Contigs Large dark blocks correspond to fingerprint map contigs at the Washington University (Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri). Assembly: An individual box shows the assembly extracted from a single clone fragment. Gaps in the path appear as white space between the boxes at an adequate zoom level, with a thin horizontal line connecting boxes bridged over a gap. Gap: Shows locations of gaps in the assembly with black boxes or vertical lines. Small gaps may have artefactually coalesced in the graphic. Gaps spanned by mRNA and paired reads have a white horizontal line through the black box to indicate bridging.
  • Affymetrix Gene Predictions (excluding known genes): Gene predictions based on ESTs, mRNA, codon usage, and splice site statistics by the proprietary program Genie, supplied by David Kulp at Affymetrix.
  • Ensembl Genes Gene predictions from Ensembl, an automatic DNA sequence assembly and tracking system developed by the European Bioinformatics Institute and the Sanger Center (Cambridgeshire, Wellcome Trust Genome Campus, UK). Arrows on the introns, when present, indicate direction of transcription.
  • Fgenesh++ Gene Predictions Fgenesh++ predictions based on Softberry's gene finding software (Salamov & Solovyev, Genome Research 10(5):516-522 (2000)).
  • RNA Genes Indicated location of non-protein coding RNA genes and pseudogenes, including tRNAs, rRNAs, SRPs, C/D box methylation guide snoRNAs, U6 etc. RNA-like, HBII, and hVS-like elements.
  • FIG. 10 shows the locations of transcription-factor binding motifs in CNI-01080.
  • the names of the factors that bind the motifs are displayed above the diagram.
  • the nucleotide positions of the motifs, as they occur sequentially 5' to 3' in the nucleotide sequence of CNI-01080, are indicated to the right of the transcription factor name.
  • FIG. 11 depicts quantitative measurement of cell transfection in coronal brain slices prepared from three regions along the rostral-caudal axis. Cells were transfected with pCOGENTl containing CNI-01080, negative control DNA, or positive control DNA. The number of cells expressing the reporter gene in each slice is determined visually.
  • FIG. 12 shows images of coronal brain slices transfected with (a) pCOGENTl containing CNI-01080; (b) negative control plasmid, pCOGENTl; or (c) positive control DNA, pCOGENTl (E).
  • FIG. 13 is a diagram of the plasmid pCOGENTl containing CNI-01104. Regulatory sequences of the present invention are indicated as "UNIQUE SEQUENCE CNI-01104".
  • pCOGENTl contains a basal promoter (i.e., a TATA box) between the BamHI and ClaI sites.
  • FIG. 14 depicts the DNA sequence of CNI-01104.
  • FIG. 15 depicts the UCSC linkage map of a region of human chromosome 22 containing the nucleotide sequence of CNI-01104.
  • the position of the sequence of CNI-01104 in the map is from base position 16340620 to 16340766.
  • the position in the UCSC linkage map corresponds to positions 3266028 to 3266173 in the nucleotide sequence of human chromosome 22 as reported in CHR22_19_05_2000 version 2 (The Sanger Centre, Cambridge, England).
  • Base Position Chromosomal coordinates, numbered from the telomere of the short arm of human chromosome 22.
  • Chromosome Band Light and dark blocks show traditional cytological bands seen with Giemsa staining.
  • STS Markers Location of markers from genetic, RH, YAC, and FISH maps.
  • Mouse Synteny Syntenic chromosomal region in mouse if known.
  • GC Percent Darker shades of gray correspond to higher % GC figured for a window of 20 kbp.
  • FPC Contigs Large dark blocks correspond to fingerprint map contigs at the Washington University (Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri). Assembly: An individual box shows the assembly extracted from a single clone fragment.
  • Gaps in the path appear as white space between the boxes at an adequate zoom level, with a thin horizontal line connecting boxes bridged over a gap.
  • Gap Shows locations of gaps in the assembly with black boxes or vertical lines. Small gaps may have artefactually coalesced in the graphic.
  • Gaps spanned by mRNA and paired reads have a white horizontal line through the black box to indicate bridging.
  • Coverage In dense display, the level of gray gives level of coverage: White/Clear: no coverage (gap); Light Gray: predraft (less than 4x shotgun); Medium Gray: draft (at least 4x shotgun); Dark Gray: multiple draft, covered by more than one draft clone; Finished: covered by a finished clone.
  • Known Genes from full length mRNAs: Known protein coding genes from LocusLink. Exons are represented by black boxes; thin horizontal lines represent introns. In the full view, the arrows on the introns indicate direction of transcription.
  • Affymetrix Gene Predictions (excluding known genes): Gene predictions based on ESTs, mRNA, codon usage, and splice site statistics by the proprietary program Genie, supplied by David Kulp at Affymetrix.
  • Ensembl Genes Gene predictions from Ensembl, an automatic DNA sequence assembly and tracking system developed by the European Bioinformatics Institute and the Sanger Center (Cambridgeshire, Wellcome Trust Genome Campus, UK). Arrows on the introns, when present, indicate direction of transcription.
  • Fgenesh++ Gene Predictions Fgenesh++ predictions based on Softberry's gene finding software (Salamov & Solovyev, Genome Research 10(5):516-522 (2000)).
  • RNA Genes Indicated location of non-protein coding RNA genes and pseudogenes, including tRNAs, rRNAs, SRPs, C/D box methylation guide snoRNAs, U6 etc. RNA-like, HBII, and hVS-like elements.
  • FIG. 16 shows the locations of transcription-factor binding motifs in CNI-01104.
  • the names of the factors that bind the motifs are displayed above the diagram.
  • the nucleotide position of the motifs as they occur sequentially 5' to 3' in the nucleotide sequence of CNI-01104 are indicated to the right of the transcription factor name.
  • FIG. 17 depicts quantitative measurement of cell transfection in coronal brain slices prepared from three regions along the rostral-caudal axis. Cells were transfected with the reporter-gene plasmid pCOGENT1 containing CNI-01104, negative control DNA, or positive control DNA.
  • FIG. 18 shows images of coronal brain slices transfected with (a) pCOGENT1 containing CNI-01104; (b) negative control plasmid, pCOGENT1; or (c) positive control DNA, pCOGENT1(E).
  • FIG. 19 is a diagram of the plasmid pCOGENT1 containing CNI-01120.
  • pCOGENT1 contains a basal promoter (i.e., a TATA box) between the BamHI and ClaI sites.
  • The negative control plasmid for expression experiments is pCOGENT1 containing only a basal promoter and no regulatory sequence insert.
  • The positive control is pCOGENT1(E), which is pCOGENT1 containing the CMV enhancer region inserted into the MCS upstream of the basal promoter.
  • FIG. 20 depicts the DNA sequence of CNI-01120.
  • FIG. 21 depicts the UCSC linkage map of a region of human chromosome 22 containing the nucleotide sequence of CNI-01120.
  • The position of the sequence of CNI-01120 in the map is from base position 16340761 to 16343529.
  • This position in the UCSC linkage map corresponds to positions 3266168 to 3268936 in the nucleotide sequence of human chromosome 22 as reported in CHR22_19_05_2000 version 2 (The Sanger Centre, Cambridge, England).
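  • By way of illustration only, the correspondence between the two coordinate systems for CNI-01120 can be checked as a constant offset. The following Python sketch uses only the four positions quoted above; the function name `coordinate_offset` is hypothetical and introduced here for illustration, and the interval length assumes that both end positions are inclusive.

```python
def coordinate_offset(ucsc_range, sanger_range):
    """Return the single offset that maps sanger_range onto ucsc_range.

    Raises ValueError when the start and end positions are shifted by
    different amounts, i.e. when the two ranges differ in length.
    """
    start_shift = ucsc_range[0] - sanger_range[0]
    end_shift = ucsc_range[1] - sanger_range[1]
    if start_shift != end_shift:
        raise ValueError("ranges differ in length; no single offset applies")
    return start_shift


# Positions quoted in the text for CNI-01120.
cni_01120_ucsc = (16340761, 16343529)
cni_01120_sanger = (3266168, 3268936)

offset = coordinate_offset(cni_01120_ucsc, cni_01120_sanger)
# Interval length, assuming inclusive end positions (an assumption).
span = cni_01120_ucsc[1] - cni_01120_ucsc[0] + 1
print(offset, span)
```

  • A check of this kind fails loudly if the two reported intervals do not have the same length, which is a quick way to catch a transcription error in either coordinate pair.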
  • Base Position Chromosomal coordinates, numbered from the telomere of the short arm of human chromosome 22.
  • Chromosome Band Light and dark blocks show traditional cytological bands seen with Giemsa staining.
  • STS Markers Location of markers from genetic, RH, YAC, and FISH maps.
  • Mouse Synteny Syntenic chromosomal region in mouse if known.
  • GC Percent Darker shades of gray correspond to higher % GC figured for a window of 20 kbp.
  • FPC Contigs Large dark blocks correspond to fingerprint map contigs at the Washington University (Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri). Assembly: An individual box shows the assembly extracted from a single clone fragment.
  • Gaps in the path appear as white space between the boxes at an adequate zoom level, with a thin horizontal line connecting boxes bridged over a gap.
  • Gap Shows locations of gaps in the assembly with black boxes or vertical lines. Small gaps may have artefactually coalesced in the graphic.
  • Gaps spanned by mRNA and paired reads have a white horizontal line through the black box to indicate bridging.
  • Coverage In dense display, the level of gray gives level of coverage: White/Clear: no coverage (gap); Light Gray: predraft (less than 4x shotgun); Medium Gray: draft (at least 4x shotgun); Dark Gray: multiple draft, covered by more than one draft clone; Finished: covered by a finished clone.
  • Known Genes from full length mRNAs: Known protein coding genes from LocusLink. Exons are represented by black boxes; thin horizontal lines represent introns. In the full view, the arrows on the introns indicate direction of transcription.
  • Affymetrix Gene Predictions (excluding known genes): Gene predictions based on ESTs, mRNA, codon usage, and splice site statistics by the proprietary program Genie, supplied by David Kulp at Affymetrix.
  • Ensembl Genes Gene predictions from Ensembl, an automatic DNA sequence assembly and tracking system developed by the European Bioinformatics Institute and the Sanger Center (Wellcome Trust Genome Campus, Cambridgeshire, UK). Arrows on the introns, when present, indicate direction of transcription.
  • Fgenesh++ Gene Predictions Fgenesh++ predictions based on Softberry's gene finding software (Salamov & Solovyev, Genome Research 10(5):516-522 (2000)).
  • Full mRNAs Shows aligning regions as black boxes or vertical lines connected by horizontal lines for gaps. In full display, arrows on the introns indicate the direction of transcription.
  • Human ESTs That Have Been Spliced Shows spliced human ESTs.
  • RNA Genes Indicated location of non-protein coding RNA genes and pseudogenes, including tRNAs, rRNAs, SRPs, C/D box methylation guide snoRNAs, U6 etc. RNA-like, HBII, and hVS-like elements.
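  • By way of illustration only, the GC Percent track described above reports the percentage of G and C bases within fixed windows (20 kbp in the map). The following Python sketch shows one way such a windowed percentage could be computed; the function name `gc_percent_windows` and the short test sequence are hypothetical, and this is not asserted to be the method used to generate the map.

```python
def gc_percent_windows(seq, window):
    """Return the percentage of G/C bases in each consecutive window of seq.

    The final window may be shorter than `window` if the sequence length
    is not an exact multiple of the window size.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    percentages = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window].upper()
        gc_count = chunk.count("G") + chunk.count("C")
        percentages.append(100.0 * gc_count / len(chunk))
    return percentages


# Hypothetical 8-nt sequence with a 4-nt window, for illustration only;
# the map itself uses a 20 kbp window.
example = gc_percent_windows("GGCCATAT", 4)
print(example)
```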
  • FIG. 22 shows the locations of transcription-factor binding motifs in CNI-01120.
  • The names of the factors that bind the motifs are displayed above the diagram.
  • The nucleotide positions of the motifs as they occur sequentially 5' to 3' in the nucleotide sequence of CNI-01120 are indicated to the right of the transcription factor name.
  • FIG. 23 depicts quantitative measurement of cell transfection in coronal brain slices prepared from three regions along the rostral-caudal axis.
  • Cells were transfected with pCOGENT1 containing CNI-01120, negative control DNA, or positive control DNA. Cells were co-transfected with a plasmid causing high-level expression of cyan fluorescent protein (CFP), which served as an internal control for transfection.
  • the number of cells expressing the reporter gene (GFP) in each slice is determined visually, and is expressed as a percentage of the CFP-expressing cells ("% GFP-expressing
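  • By way of illustration only, the normalization described above (reporter-expressing cells expressed as a percentage of CFP-expressing cells) can be sketched in Python as follows. The function name `percent_gfp_expressing` is hypothetical, and the cell counts are invented for the example; they are not data from the experiments.

```python
def percent_gfp_expressing(gfp_cells, cfp_cells):
    """Express the GFP-positive cell count as a percentage of CFP-positive cells.

    CFP-positive cells serve as the internal control for transfection, so a
    slice with no CFP-positive cells cannot be normalized.
    """
    if cfp_cells <= 0:
        raise ValueError("no CFP-expressing cells; slice cannot be normalized")
    if gfp_cells < 0:
        raise ValueError("cell counts cannot be negative")
    return 100.0 * gfp_cells / cfp_cells


# Hypothetical counts for a single slice (not experimental data).
example_percent = percent_gfp_expressing(gfp_cells=12, cfp_cells=48)
print(example_percent)
```

  • Normalizing to the co-transfected CFP control in this way makes slices with different overall transfection efficiencies comparable.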
  • FIG. 24 shows images of coronal brain slices transfected with (a) pCOGENT1 containing CNI-01120; (b) negative control plasmid, pCOGENT1; or (c) positive control DNA, pCOGENT1(E).
  • FIG. 25 is a diagram of the plasmid pCOGENT1 containing CNI-01125. Regulatory sequences of the present invention are indicated as "UNIQUE SEQUENCE CNI-01125".
  • pCOGENT1 contains a basal promoter (i.e., a TATA box) between the BamHI and ClaI sites.
  • The negative control plasmid for expression experiments is pCOGENT1 containing only a basal promoter and no regulatory sequence insert.
  • FIG. 26 depicts the DNA sequence of CNI-01125.
  • FIG. 27 depicts the UCSC linkage map of a region of human chromosome 22 containing the nucleotide sequence of CNI-01125. The position of the sequence of CNI-01125 in the map is from base position 26309737 to 26311433.
  • This position in the UCSC linkage map corresponds to positions 13202009 to 13203706 in the nucleotide sequence of human chromosome 22 as reported in CHR22_19_05_2000 version 2 (The Sanger Centre, Cambridge, England).
  • Base Position Chromosomal coordinates, numbered from the telomere of the short arm of human chromosome 22.
  • Chromosome Band Light and dark blocks show traditional cytological bands seen with Giemsa staining.
  • STS Markers Location of markers from genetic, RH, YAC, and FISH maps.
  • Mouse Synteny Syntenic chromosomal region in mouse if known.
  • GC Percent Darker shades of gray correspond to higher % GC figured for a window of 20 kbp.
  • FPC Contigs Large dark blocks correspond to fingerprint map contigs at the Washington University (Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri). Assembly: An individual box shows the assembly extracted from a single clone fragment. Gaps in the path appear as white space between the boxes at an adequate zoom level, with a thin horizontal line connecting boxes bridged over a gap. Gap: Shows locations of gaps in the assembly with black boxes or vertical lines. Small gaps may have artefactually coalesced in the graphic. Gaps spanned by mRNA and paired reads have a white horizontal line through the black box to indicate bridging.
  • Known Genes from full length mRNAs: Known protein coding genes from LocusLink. Exons are represented by black boxes; thin horizontal lines represent introns. In the full view, the arrows on the introns indicate direction of transcription.
  • Affymetrix Gene Predictions (excluding known genes): Gene predictions based on ESTs, mRNA, codon usage, and splice site statistics by the proprietary program Genie, supplied by David Kulp at Affymetrix.
  • Sanger Chromosome 22 Annotation Known and predicted genes on human chromosome 22 based on information supplied by the Sanger Center (Cambridge, UK).
  • Ensembl Genes Gene predictions from Ensembl, an automatic DNA sequence assembly and tracking system developed by the European Bioinformatics Institute and the Sanger Center (Wellcome Trust Genome Campus, Cambridgeshire, UK). Arrows on the introns, when present, indicate direction of transcription.
  • Fgenesh++ Gene Predictions Fgenesh++ predictions based on Softberry's gene finding software (Salamov & Solovyev, Genome Research 10(5):516-522 (2000)).
  • Full mRNAs Shows aligning regions as black boxes or vertical lines connected by horizontal lines for gaps. In full display, arrows on the introns indicate the direction of transcription.
  • Human ESTs That Have Been Spliced Shows spliced human ESTs.
  • RNA Genes Indicated location of non-protein coding RNA genes and pseudogenes, including tRNAs, rRNAs, SRPs, C/D box methylation guide snoRNAs, U6 etc. RNA-like, HBII, and hVS-like elements.
  • FIG. 28 shows the locations of transcription-factor binding motifs in CNI-01125.
  • The names of the factors that bind the motifs are displayed above the diagram.
  • The nucleotide positions of the motifs as they occur sequentially 5' to 3' in the nucleotide sequence of CNI-01125 are indicated to the right of the transcription factor name.
  • FIG. 29 depicts quantitative measurement of cell transfection in coronal brain slices prepared from three regions along the rostral-caudal axis. Cells were transfected with pCOGENT1 containing CNI-01125, negative control DNA, or positive control DNA. The number of cells expressing the reporter gene in each slice is determined visually.
  • FIG. 30 shows images of coronal brain slices transfected with (a) pCOGENT1 containing CNI-01125; (b) negative control plasmid, pCOGENT1; or (c) positive control
  • FIG. 31 is a diagram of the plasmid pCOGENT1 containing CNI-01131. Regulatory sequences of the present invention are indicated as "UNIQUE SEQUENCE CNI-01131".
  • pCOGENT1 contains a basal promoter (i.e., a TATA box) between the BamHI and ClaI sites.
  • The negative control plasmid for expression experiments is pCOGENT1 containing only a basal promoter and no regulatory sequence insert.
  • The positive control is pCOGENT1(E), which is pCOGENT1 containing the CMV enhancer region inserted into the MCS upstream of the basal promoter.
  • FIG. 32 depicts the DNA sequence of CNI-01131.
  • FIG. 33 depicts the UCSC linkage map of a region of human chromosome 22 containing the nucleotide sequence of CNI-01131.
  • The position of the sequence of CNI-01131 in the map is from base position 34059978 to 34060733.
  • This position in the UCSC linkage map corresponds to positions 20952250 to 20953005 in the nucleotide sequence of human chromosome 22 as reported in CHR22_19_05_2000 version 2 (The Sanger Centre, Cambridge, England).
  • Base Position Chromosomal coordinates, numbered from the telomere of the short arm of human chromosome 22.
  • Chromosome Band Light and dark blocks show traditional cytological bands seen with Giemsa staining.
  • STS Markers Location of markers from genetic, RH, YAC, and FISH maps.
  • Mouse Synteny Syntenic chromosomal region in mouse if known.
  • GC Percent Darker shades of gray correspond to higher % GC figured for a window of 20 kbp.
  • FPC Contigs Large dark blocks correspond to fingerprint map contigs at the Washington University (Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri). Assembly: An individual box shows the assembly extracted from a single clone fragment.
  • Gaps in the path appear as white space between the boxes at an adequate zoom level, with a thin horizontal line connecting boxes bridged over a gap.
  • Gap Shows locations of gaps in the assembly with black boxes or vertical lines. Small gaps may have artefactually coalesced in the graphic. Gaps spanned by mRNA and paired reads have a white horizontal line through the black box to indicate bridging.
  • Coverage In dense display, the level of gray gives level of coverage: White/Clear: no coverage (gap); Light Gray: predraft (less than 4x shotgun); Medium Gray: draft (at least 4x shotgun); Dark Gray: multiple draft, covered by more than one draft clone; Finished: covered by a finished clone. YourSeq:
  • Known Genes from full length mRNAs: Known protein coding genes from LocusLink. Exons are represented by black boxes; thin horizontal lines represent introns. In the full view, the arrows on the introns indicate direction of transcription.
  • Affymetrix Gene Predictions (excluding known genes): Gene predictions based on ESTs, mRNA, codon usage, and splice site statistics by the proprietary program Genie, supplied by David Kulp at Affymetrix.
  • Sanger Chromosome 22 Annotation Known and predicted genes on human chromosome 22 based on information supplied by the Sanger Center (Cambridge, UK).
  • Ensembl Genes Gene predictions from Ensembl, an automatic DNA sequence assembly and tracking system developed by the European Bioinformatics Institute and the Sanger Center (Wellcome Trust Genome Campus, Cambridgeshire, UK). Arrows on the introns, when present, indicate direction of transcription.
  • Fgenesh++ Gene Predictions Fgenesh++ predictions based on Softberry's gene finding software (Salamov & Solovyev, Genome Research 10(5):516-522 (2000)).
  • Full mRNAs Shows aligning regions as black boxes or vertical lines connected by horizontal lines for gaps. In full display, arrows on the introns indicate the direction of transcription.
  • Human ESTs That Have Been Spliced Shows spliced human ESTs.
  • RNA Genes Indicated location of non-protein coding RNA genes and pseudogenes, including tRNAs, rRNAs, SRPs, C/D box methylation guide snoRNAs, U6 etc. RNA-like, HBII, and hVS-like elements.
  • FIG. 34 shows the locations of transcription-factor binding motifs in CNI-01131.
  • The names of the factors that bind the motifs are displayed above the diagram.
  • The nucleotide positions of the motifs as they occur sequentially 5' to 3' in the nucleotide sequence of CNI-01131 are indicated to the right of the transcription factor name.
  • FIG. 35 depicts quantitative measurement of cell transfection in coronal brain slices prepared from three regions along the rostral-caudal axis. Cells were transfected with pCOGENT1 containing CNI-01131, negative control DNA, or positive control DNA. The number of cells expressing the reporter gene in each slice is determined visually.
  • FIG. 36 shows images of coronal brain slices transfected with (a) pCOGENT1 containing CNI-01131; (b) negative control plasmid, pCOGENT1; or (c) positive control
  • nucleic acid molecules that represent nucleic acid regulatory sequence molecules of the invention:
  • SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 each modulate, promote or enhance gene expression in the nervous system.
  • The sequence of CNI-01142 is located near a known gene, arylsulfatase A (ARSA) (see Section 6.2).
  • The sequence of CNI-01080 is located within an intron of the known gene, FBLN1, which encodes the extracellular matrix protein, fibulin-1 (see Section 6.3).
  • The sequence of CNI-01104 lies within an intron of a known gene, HIRA, which encodes a putative transcription factor, TUPLE1 (see Section 6.4).
  • The sequence of CNI-01120 is also located within an intron of the known gene HIRA (see Section 6.5).
  • The sequence of CNI-01125 overlaps intronic and exonic sequences of two known genes, EWSR1 (Ewing sarcoma breakpoint region 1) and C22ORF3 (see Section 6.6).
  • The present invention also relates to isolated nucleic acid regulatory sequences comprising a transcription activating nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • Such nucleic acid regulatory sequences may be restriction fragments of the full-length sequences disclosed.
  • The isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 1.
  • A nucleic acid regulatory sequence of the invention is the PstI-PstI fragment represented by nucleotides 5-1310 of SEQ ID NO: 1.
  • A nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 1-494 of SEQ ID NO: 1.
  • A nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 494-1306 of SEQ ID NO: 1.
  • A nucleic acid regulatory sequence of the invention is the AccB1I-AccB1I fragment represented by nucleotides 116-601 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the AccB1I-AccB1I fragment represented by nucleotides 601-1001 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BbuI-BbuI fragment represented by nucleotides 309-736 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BbuI-BbuI fragment represented by nucleotides 736-1240 of SEQ ID NO: 1.
  • A nucleic acid regulatory sequence of the invention is the EcoO109I-EcoO109I fragment represented by nucleotides 125-544 of SEQ ID NO: 1.
  • A nucleic acid regulatory sequence of the invention is the EcoO109I-EcoO109I fragment represented by nucleotides 544-907 of SEQ ID NO: 1.
  • A nucleic acid regulatory sequence of the invention is the AspI-EarI fragment represented by nucleotides 62-265 of SEQ ID NO: 1.
  • A nucleic acid regulatory sequence of the invention is the EarI-GsuI fragment represented by nucleotides 265-859 of SEQ ID NO: 1.
  • A nucleic acid regulatory sequence of the invention is the AIN-NspI fragment represented by nucleotides 416-736 of SEQ ID NO: 1.
  • A nucleic acid regulatory sequence of the invention is the AlwNI-SfcI fragment represented by nucleotides 910-1306 of SEQ ID NO: 1.
  • The isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 2.
  • A nucleic acid regulatory sequence of the invention is the BamHI-BamHI fragment represented by nucleotides 1-60 of SEQ ID NO: 2.
  • A nucleic acid regulatory sequence of the invention is the XhoII-XhoII fragment represented by nucleotides 1-60 of SEQ ID NO: 2.
  • A nucleic acid regulatory sequence of the invention is the BamHI-AflIII fragment represented by nucleotides 1-44 of SEQ ID NO: 2.
  • A nucleic acid regulatory sequence of the invention is the AflIII-BamHI fragment represented by nucleotides 44-60 of SEQ ID NO: 2.
  • A nucleic acid regulatory sequence of the invention is the BamHI-NspI fragment represented by nucleotides 1-48 of SEQ ID NO: 2.
  • A nucleic acid regulatory sequence of the invention is the NspI-BamHI fragment represented by nucleotides 48-60 of SEQ ID NO: 2.
  • The isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 3.
  • A nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 1-141 of SEQ ID NO: 3.
  • A nucleic acid regulatory sequence of the invention is the PstI-SfcI fragment represented by nucleotides 5-141 of SEQ ID NO: 3.
  • A nucleic acid regulatory sequence of the invention is the PstI-PstI fragment represented by nucleotides 5-145 of SEQ ID NO: 3.
  • A nucleic acid regulatory sequence of the invention is the SfcI-EcoO109I fragment represented by nucleotides 1-24 of SEQ ID NO: 3.
  • A nucleic acid regulatory sequence of the invention is the EcoO109I-SfcI fragment represented by nucleotides 24-141 of SEQ ID NO: 3.
  • A nucleic acid regulatory sequence of the invention is the SfcI-ErhI fragment represented by nucleotides 1-28 of SEQ ID NO: 3.
  • a nucleic acid regulatory sequence of the invention is the
  • A nucleic acid regulatory sequence of the invention is the PstI-EcoO109I fragment represented by nucleotides 5-24 of SEQ ID NO: 3.
  • A nucleic acid regulatory sequence of the invention is the EcoO109I-PstI fragment represented by nucleotides 24-145 of SEQ ID NO: 3.
  • A nucleic acid regulatory sequence of the invention is the PstI-ErhI fragment represented by nucleotides 5-28 of SEQ ID NO: 3.
  • A nucleic acid regulatory sequence of the invention is the ErhI-PstI fragment represented by nucleotides 28-145 of SEQ ID NO: 3.
  • The isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 4.
  • A nucleic acid regulatory sequence of the invention is the BpmI-BpmI fragment represented by nucleotides 177-940 of SEQ ID NO: 4.
  • A nucleic acid regulatory sequence of the invention is the BpmI-BpmI fragment represented by nucleotides 940-1753 of SEQ ID NO: 4.
  • A nucleic acid regulatory sequence of the invention is the BsgI-BsgI fragment represented by nucleotides 563-1063 of SEQ ID NO: 4. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BsgI-BsgI fragment represented by nucleotides 1063-1518 of SEQ ID NO: 4. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BsgI-BsgI fragment represented by nucleotides 1518-1897 of SEQ ID NO: 4. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BspMI-BspMI fragment represented by nucleotides 8-1691 of SEQ ID NO: 4.
  • A nucleic acid regulatory sequence of the invention is the Eco88I-Eco88I fragment represented by nucleotides 1963-2606 of SEQ ID NO: 4.
  • A nucleic acid regulatory sequence of the invention is the EcoO109I-EcoO109I fragment represented by nucleotides 161-383 of SEQ ID NO: 4.
  • A nucleic acid regulatory sequence of the invention is the BanI-BanI fragment represented by nucleotides 331-649 of SEQ ID NO: 4.
  • A nucleic acid regulatory sequence of the invention is the Bsp⁇J-Ecd⁇J fragment represented by nucleotides 739-1247 of SEQ ID NO: 4.
  • A nucleic acid regulatory sequence of the invention is the AspHI-Bsp1407I fragment represented by nucleotides 1282-2285 of SEQ ID NO: 4.
  • A nucleic acid regulatory sequence of the invention is the BseRI-PstI fragment represented by nucleotides 32-2608 of SEQ ID NO: 4.
  • The isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 5.
  • A nucleic acid regulatory sequence of the invention is the SmaI-BspMI fragment represented by nucleotides 71-1695 of SEQ ID NO: 5.
  • A nucleic acid regulatory sequence of the invention is the SmaI-BspMI fragment represented by nucleotides 498-1695 of SEQ ID NO: 5.
  • A nucleic acid regulatory sequence of the invention is the Eco52I-BamHI fragment represented by nucleotides 112-746 of SEQ ID NO: 5.
  • A nucleic acid regulatory sequence of the invention is the BamHI-SalI fragment represented by nucleotides 746-1552 of SEQ ID NO: 5.
  • A nucleic acid regulatory sequence of the invention is the ErhI-ErhI fragment represented by nucleotides 852-1448 of SEQ ID NO: 5.
  • A nucleic acid regulatory sequence of the invention is the BsaI-BsaI fragment represented by nucleotides 335-1442 of SEQ ID NO: 5.
  • A nucleic acid regulatory sequence of the invention is the KasI-KasI fragment represented by nucleotides 529-801 of SEQ ID NO: 5.
  • A nucleic acid regulatory sequence of the invention is the ⁇ONI-⁇NI fragment represented by nucleotides 629-1103 of SEQ ID NO: 5.
  • A nucleic acid regulatory sequence of the invention is the roNI- roNI fragment represented by nucleotides 1103-1527 of SEQ ID NO: 5.
  • A nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 1-232 of SEQ ID NO: 5.
  • A nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 232-1692 of SEQ ID NO: 5.
  • A nucleic acid regulatory sequence of the invention is the XmaI-XmaI fragment represented by nucleotides 69-496 of SEQ ID NO: 5.
  • The isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 6.
  • A nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 1-347 of SEQ ID NO: 6.
  • A nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 347-751 of SEQ ID NO: 6.
  • A nucleic acid regulatory sequence of the invention is the EcoO109I-EcoO109I fragment represented by nucleotides 189-331 of SEQ ID NO: 6. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the EcoO109I-EcoO109I fragment represented by nucleotides 331-431 of SEQ ID NO: 6. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the Van91I-Van91I fragment represented by nucleotides 81-250 of SEQ ID NO: 6. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the Van91I-Van91I fragment represented by nucleotides 250-630 of SEQ ID NO: 6.
  • A nucleic acid regulatory sequence of the invention is the PstI-PstI fragment represented by nucleotides 5-755 of SEQ ID NO: 6.
  • A nucleic acid regulatory sequence of the invention is the Ksp22I-EarI fragment represented by nucleotides 8-655 of SEQ ID NO: 6.
  • A nucleic acid regulatory sequence of the invention is the Ksp22I-BanI fragment represented by nucleotides 8-484 of SEQ ID NO: 6.
  • A nucleic acid regulatory sequence of the invention is the BanI-BcgI fragment represented by nucleotides 484-754 of SEQ ID NO: 6.
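  • By way of illustration only, a fragment defined by a nucleotide range such as those listed above can be extracted from a sequence string as sketched below in Python. The sketch assumes that the quoted positions are 1-based and inclusive, which the text does not state explicitly; the function name `subsequence` is hypothetical, and the placeholder sequence is synthetic and is not any of SEQ ID NOs: 1-6.

```python
def subsequence(seq, start, end):
    """Return nucleotides start..end of seq, counting from 1, both ends inclusive."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("range lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return seq[start - 1:end]


# Synthetic 1600-nt placeholder; NOT a sequence disclosed in the document.
placeholder = "ACGT" * 400

# Same coordinates as the first fragment listed for SEQ ID NO: 1
# (nucleotides 5-1310), applied here to the placeholder only.
fragment = subsequence(placeholder, 5, 1310)
print(len(fragment))
```

  • Under the inclusive convention assumed here, two fragments that share a boundary position (for example 1-494 and 494-1306) overlap by exactly one nucleotide.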
  • Nucleic acid regulatory sequences of the invention may also comprise part or all of the reverse complement of the full-length sequences disclosed.
  • The nucleic acid regulatory sequence of the invention comprises an isolated nucleic acid molecule that is the reverse complement of the nucleotide sequence of SEQ ID NO: 1. In another embodiment, the nucleic acid regulatory sequence of the invention comprises an isolated nucleic acid molecule that is the reverse complement of the nucleotide sequence of SEQ ID NO: 2. In another embodiment, the nucleic acid regulatory sequence of the invention comprises an isolated nucleic acid molecule that is the reverse complement of the nucleotide sequence of SEQ ID NO: 3. In another embodiment, the nucleic acid regulatory sequence of the invention comprises an isolated nucleic acid molecule that is the reverse complement of the nucleotide sequence of SEQ ID NO: 4.
  • The nucleic acid regulatory sequence of the invention comprises an isolated nucleic acid molecule that is the reverse complement of the nucleotide sequence of SEQ ID NO: 5. In another embodiment, the nucleic acid regulatory sequence of the invention comprises an isolated nucleic acid molecule that is the reverse complement of the nucleotide sequence of SEQ ID NO: 6.
  • The invention also provides regulatory sequences that comprise all or part of the reverse complement of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • The nucleic acid regulatory sequence of the invention comprises the reverse complement of the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 1.
  • The nucleic acid regulatory sequence of the invention comprises the reverse complement of the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 2.
  • The nucleic acid regulatory sequence of the invention comprises the reverse complement of the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 3.
  • The nucleic acid regulatory sequence of the invention comprises the reverse complement of the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 4. In another embodiment, the nucleic acid regulatory sequence of the invention comprises the reverse complement of the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 5. In another embodiment, the nucleic acid regulatory sequence of the invention comprises the reverse complement of the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 6.
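  • By way of illustration only, the reverse complement of a DNA sequence is obtained by complementing each base (A with T, C with G) and reversing the order of the result. A minimal Python sketch follows; the function name `reverse_complement` and the six-nucleotide example are hypothetical, and the sketch handles only the four unambiguous bases.

```python
# Translation table for the four unambiguous DNA bases, in both cases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical six-nucleotide example, not taken from the disclosed sequences.
example_rc = reverse_complement("ATGCCG")
print(example_rc)
```

  • Applying the operation twice returns the original sequence, which provides a simple sanity check.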
  • The transcription activating sequences may additionally be discrete fragments of the full-length sequences disclosed.
  • The transcription activating nucleotide sequence comprises at least about 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, or 1300 contiguous nucleotides of SEQ ID NO: 1 or the reverse complement thereof.
  • A nucleic acid regulatory sequence of the invention may be, but is not limited to, the nucleotide sequence nn1-nn50, nn51-nn100, nn101-nn150, nn151-nn200, nn201-nn250, nn251-nn300, nn301-nn350, nn351-nn400, nn451-nn500, nn501-nn550, nn551-nn600, nn601-nn650, nn651-nn700, nn701-nn750, nn751-nn800, nn801-nn850, nn851-nn900, nn901-
  • nnX-nnY means nucleotide X to nucleotide Y of the specific SEQ ID NO.
  • nn1-nn50 of SEQ ID NO: 1 means contiguous nucleotides 1-50 of SEQ ID NO: 1.
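  • By way of illustration only, the window notation described above (nucleotide X to nucleotide Y, counted from 1) can be generated programmatically. The following Python sketch lists consecutive fixed-size windows in that notation; the function name `window_labels` and the example lengths are hypothetical and do not correspond to any particular SEQ ID NO.

```python
def window_labels(total_length, step):
    """List consecutive windows of `step` nucleotides in nnX-nnY notation.

    Positions are 1-based and inclusive; the last window is shortened when
    total_length is not an exact multiple of step.
    """
    if total_length <= 0 or step <= 0:
        raise ValueError("total_length and step must be positive")
    labels = []
    for start in range(1, total_length + 1, step):
        end = min(start + step - 1, total_length)
        labels.append(f"nn{start}-nn{end}")
    return labels


# Hypothetical 200-nt sequence divided into 50-nt windows.
labels_200 = window_labels(200, 50)
print(labels_200)
```

  • A contiguous combination of adjacent windows, for example the first two windows above, corresponds to nucleotides 1-100 in the same notation.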
  • The transcription activating sequence comprises at least about 10, 20, 30, 40, 50, or 60 contiguous nucleotides of SEQ ID NO: 2.
  • A nucleic acid regulatory sequence of the invention may be, but is not limited to, the nucleotide sequence nn1-nn10, nn61-nn65, or any contiguous combination thereof, of SEQ ID NO: 2 or the reverse complement thereof.
  • The transcription activating nucleotide sequence comprises at least about 20, 40, 60, 80, 100, 120, or 140 contiguous nucleotides of SEQ ID NO: 3 or the reverse complement thereof.
  • A nucleic acid regulatory sequence of the invention is, but is not limited to, the nucleotide sequence nn1-nn20, nn21-nn40, nn41-nn60, nn61-nn80, nn81-nn100, nn101-nn120, nn121-nn140, nn141-nn146, or any contiguous combination thereof, of SEQ ID NO: 3 or the reverse complement thereof.
  • The transcription activating nucleotide sequence comprises at least about 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, or 2750 contiguous nucleotides of SEQ ID NO: 4 or the reverse complement thereof.
  • A nucleic acid regulatory sequence of the invention is, but is not limited to, the nucleotide sequence nn1-nn100, nn101-nn200, nn201-nn300, nn301-nn400, nn401-nn500, nn501-nn600, nn601-nn700, nn701-nn800, nn801-nn900, nn901-nn1000, nn1001-nn1100, nn1101-nn1200, nn1201-nn1300, nn1301-nn1400, nn1401-nn1500, nn1501-nn1600, nn1601-
  • The transcription activating nucleotide sequence comprises at least about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, 1500, or 1650 contiguous nucleotides of SEQ ID NO: 5 or the reverse complement thereof.
  • A nucleic acid regulatory sequence of the invention is, but is not limited to, the nucleotide sequence nn1-nn100, nn101-nn200, nn201-nn300, nn301-nn400, nn401-nn500, nn501-nn600, nn601-nn700, nn701-nn800, nn801-nn900, nn901-nn1000, nn1001-nn1100, nn1101-nn1200, nn1201-nn1300, nn1301-nn1400, nn1401-nn1500, nn1501-nn1600, nn1601-
  • The transcription activating nucleotide sequence comprises at least about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, or 750 contiguous nucleotides of SEQ ID NO: 6 or the reverse complement thereof.
  • A nucleic acid regulatory sequence of the invention is, but is not limited to, the nucleotide sequence nn1-nn100, nn101-nn200, nn201-nn300, nn301-nn400, nn401-nn500, nn501-nn600, nn601-nn700, nn701-nn800, nn801-nn900, nn901-nn1000, nn1001-nn1100, nn1101-nn1200, nn1201-nn1300, nn1301-nn1400, nn1401-nn1500, nn1501-nn1600, nn
  • the invention provides for sequences that hybridize to the full-length sequences or reverse complements thereof.
  • the invention provides for an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 1, to a transcriptional activating sequence of SEQ ID NO: 1, or to a complement or reverse complement thereof.
  • the invention provides for an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 2, to a transcriptional activating sequence of SEQ ID NO: 2, or to a complement or reverse complement thereof. In another embodiment, the invention provides for an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 3, to a transcriptional activating sequence of SEQ ID NO: 3, or to a complement or reverse complement thereof.
  • the invention provides for an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 4, to a transcriptional activating sequence of SEQ ID NO: 4, or to a complement or reverse complement thereof.
  • the invention provides for an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 5, to a transcriptional activating sequence of SEQ ID NO: 5, or to a complement or reverse complement thereof.
  • restriction fragments and discrete subsequences enumerated above represent sequences that hybridize along their entire lengths to the disclosed full-length sequences or their complements.
  • the invention provides for an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 6, to a transcriptional activating sequence of SEQ ID NO: 6, or to a complement or reverse complement thereof.
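The embodiments above refer repeatedly to the reverse complement of a disclosed sequence. As a minimal sketch, assuming plain DNA strings over A, C, G, T (plus N for an unknown base), the reverse complement can be computed as follows. The function name is an illustrative choice.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]
```

A palindromic recognition site such as GAATTC is its own reverse complement, which is a quick sanity check on the function.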
  • Hybridizing conditions can be of low or high stringency. Such stringency conditions are well known to those of skill in the art.
  • sequences that hybridize under low stringency conditions are ones that would hybridize under conditions as follows (see also Shilo and Weinberg, Proc. Natl. Acad. Sci. U.S.A. 78:6789-6792 (1981)): Filters containing DNA are pretreated for 6 h at 40°C in a solution containing 35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 µg/ml denatured salmon sperm DNA.
  • Hybridizations are carried out in the same solution with the following modifications: 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 µg/ml salmon sperm DNA, 10% (wt/vol) dextran sulfate, and 5-20 X 10^6 cpm 32P-labeled probe is used. Filters are incubated in hybridization mixture for 18-20 h at 40°C, and then washed for 1.5 h at 55°C in a solution containing 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS. The wash solution is replaced with fresh solution and incubated an additional 1.5 h at 60°C. Filters are blotted dry and exposed for autoradiography. If necessary, filters are washed for a third time at 65-68°C and re-exposed to film.
  • sequences that hybridize under highly stringent conditions are ones that hybridize under such conditions of high stringency as follows. Prehybridization of filters containing DNA is carried out for 8 h to overnight at 65°C in buffer composed of 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA, and 500 µg/ml denatured salmon sperm DNA. Filters are hybridized for 48 h at 65°C in prehybridization mixture containing 100 µg/ml denatured salmon sperm DNA and 5-20 X 10^6 cpm of 32P-labeled probe.
  • Hybridization conditions are said to be "highly stringent" or "high stringency" when said conditions are at least as stringent as those disclosed in this paragraph.
  • Stringency can also be determined by calculating the Tm of the hybridization.
  • nucleic acid molecules of the invention are deoxyoligonucleotides ("oligos") which hybridize under highly stringent or moderately stringent conditions to the nucleic acid molecules described above.
  • Tm melting temperature
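The text above notes that stringency can be determined by calculating the Tm, but this passage does not give a formula. As an illustration only, the sketch below uses the widely taught Wallace rule of thumb for short oligonucleotides, Tm ≈ 2(A+T) + 4(G+C) in degrees Celsius. This rule is a swapped-in approximation, not the calculation used in the specification, and the function name is hypothetical.

```python
def wallace_tm(oligo):
    """Approximate melting temperature (deg C) of a short DNA oligo.

    Uses the Wallace rule, Tm = 2*(A+T) + 4*(G+C), a rough estimate
    generally applied only to oligos of roughly 14-20 nt.
    """
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if at + gc != len(s):
        raise ValueError("oligo must contain only A, C, G, T")
    return 2 * at + 4 * gc
```

Longer probes and defined salt or formamide concentrations call for other formulas; this sketch does not attempt them.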
  • modifications of the regulatory nucleotide sequences of the invention that do not substantially affect their transcriptional activities. Such modifications include additions, deletions and substitutions.
  • the present invention also relates to the nucleic acid regulatory sequences of the invention operably linked to a nucleic acid molecule comprising a coding sequence.
  • the invention also provides for the control of gene expression using modifications of CNI-01142 (SEQ ID NO: 1) or a nucleic acid regulatory sequence thereof that substantially alters the transcriptional promotion activities of CNI-01142.
  • the invention provides CNI-01142 sequences that act as stronger modulators than full-length CNI-01142.
  • the invention provides such sequences that are weaker promoters than CNI-01142.
  • the invention provides such sequences that act to suppress or depress the level of transcription below that of the full-length CNI-01142.
  • the invention also provides for the control of gene expression using modifications of CNI-01080 (SEQ ID NO: 2) or a nucleic acid regulatory sequence thereof that substantially alters the transcriptional promotion activities of CNI-01080.
  • CNI-01080 (SEQ ID NO: 2)
  • the invention provides CNI-01080 sequences that act as stronger modulators than full-length CNI-01080.
  • the invention provides such sequences that are weaker promoters than CNI-01080.
  • the invention provides such sequences that act to suppress or depress the level of transcription below that of the full-length CNI-01080.
  • the invention also provides for the control of gene expression using modifications of CNI-01104 (SEQ ID NO: 3) or a nucleic acid regulatory sequence thereof that substantially alters the transcriptional promotion activities of CNI-01104.
  • the invention provides CNI-01104 sequences that act as stronger modulators than full-length CNI-01104.
  • the invention provides such sequences that are weaker promoters than CNI-01104.
  • the invention provides such sequences that act to suppress or depress the level of transcription below that of the full-length CNI-01104.
  • the invention also provides for the control of gene expression using modifications of CNI-01120 (SEQ ID NO: 4) or a nucleic acid regulatory sequence thereof that substantially alters the transcriptional promotion activities of CNI-01120.
  • the invention provides CNI-01120 sequences that act as stronger modulators than full-length CNI-01120.
  • the invention provides such sequences that are weaker promoters than CNI-01120.
  • the invention provides such sequences that act to suppress or depress the level of transcription below that of the full-length CNI-01120.
  • the invention also provides for the control of gene expression using modifications of CNI-01125 (SEQ ID NO: 5) or a nucleic acid regulatory sequence thereof that substantially alters the transcriptional promotion activities of CNI-01125.
  • the invention provides CNI-01125 sequences that act as stronger modulators than full-length CNI-01125.
  • the invention provides such sequences that are weaker promoters than CNI-01125.
  • the invention provides such sequences that act to suppress or depress the level of transcription below that of the full-length CNI-01125.
  • the invention also provides for the control of gene expression using modifications of CNI-01131 (SEQ ID NO: 6) or a nucleic acid regulatory sequence thereof that substantially alters the transcriptional promotion activities of CNI-01131.
  • the invention provides CNI-01131 sequences that act as stronger modulators than full-length CNI-01131.
  • the invention provides such sequences that are weaker promoters than CNI-01131.
  • the invention provides such sequences that act to suppress or depress the level of transcription below that of the full-length CNI-01131.
  • Once a restriction map is generated, the determination of those regions of the nucleic acid regulatory sequences of the invention strongest in promoting or enhancing gene expression is a straightforward task.
  • the region is first digested with restriction endonucleases that produce the desired fragments.
  • the restriction endonucleases are commercially available, and recognize six-nucleotide sequences.
  • these restriction endonucleases utilize sites that are also present in the MCS of an expression vector, to facilitate cloning the fragments in such a way that they are operably linked to a gene to be expressed, the level of expression of which indicates the strength of promotion or enhancement of gene expression.
  • the region is segregated into subregions representing progressively longer deletions from the 5' end, or from the 3' end; internal sequences may be deleted, as well.
  • those fragments that result in the most production of gene product are the strongest promoters; those that produce the least above background are the weakest.
  • This example is not meant to be limiting, as there are other means to generate fragments in order to map promoter, enhancer or silencer regions; for example, exonuclease digestion.
  • the same procedure may be used for regulatory sequence fragments created by exonuclease digestion. Typically, an exonuclease is contacted with the regulatory sequence, and treatment is allowed to continue for varying periods of time, thus generating fragments of various sizes.
  • the fragments are size-separated, for example, on a sizing column or in an agarose gel.
  • the fragments can then either be blunt-end ligated into an expression vector, or can be tailed with linkers to facilitate cloning into such a vector.
  • the resulting constructs are then analyzed for insert sequence and for the insert's ability to promote expression of the reporter gene.
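The deletion-mapping procedure described above (progressively longer deletions from either end, followed by ranking fragments by reporter output) can be sketched in silico as follows. This is a minimal illustration under stated assumptions: the step size, fragment names, and signal values are hypothetical, and real deletions from exonuclease digestion are not evenly spaced.

```python
def five_prime_deletions(seq, step):
    """Fragments with progressively longer deletions from the 5' end."""
    return [seq[i:] for i in range(0, len(seq), step)]


def three_prime_deletions(seq, step):
    """Fragments with progressively longer deletions from the 3' end."""
    return [seq[:len(seq) - i] for i in range(0, len(seq), step)]


def rank_by_expression(measurements, background):
    """Order fragment names from strongest to weakest reporter signal.

    `measurements` maps a fragment name to its reporter signal; fragments
    at or below `background` are dropped as showing no activity.
    """
    above = {name: value for name, value in measurements.items()
             if value > background}
    return sorted(above, key=above.get, reverse=True)
```

The first name returned by `rank_by_expression` corresponds to the strongest promoter fragment, and the last to the weakest fragment that still exceeds background.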
  • the ability of sequences or fragments of the regulatory sequences of the invention to promote or enhance transcription can be assessed in two kinds of plasmid vectors.
  • the regulatory sequence or subfragments thereof is cloned into a site, typically part of an MCS, that places the regulatory sequence upstream of, and operably linked to, a reporter gene whose expression can be monitored.
  • the vector prior to insertion of the regulatory sequence, has no promoter of its own that can drive expression of the reporter gene.
  • Expression of the reporter sequence over that seen with a no-insert control indicates that the regulatory sequence acts as a promoter of transcription.
  • a second vector contains a promoter operably linked to the reporter gene.
  • the putative regulatory sequence is inserted upstream of the promoter, again typically into an MCS. If there is an additional increase in reporter gene expression above that seen in a promoter-only control, the regulatory sequence has enhancer activity.
  • the above two vectors may additionally be used to discover other regulatory sequences, for example, homologous or analogous regulatory sequences that drive expression in the nervous systems of other species.
  • one may design sets of primers based upon the nucleotide sequence of the regulatory sequence of the invention, and perform PCR under moderately-stringent conditions well known to those of skill in the art on genomic DNA derived from a non-human species. PCR products are then cloned directly into one of the above two vectors. PCR products driving expression in the vector containing a promoter operably linked to the reporter gene have enhancer activity, while PCR products driving expression in the promoterless vector have promoter activity.
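The two-vector logic described above reduces to two comparisons: signal from the promoterless vector against its no-insert control, and signal from the promoter-containing vector against its promoter-only control. A minimal sketch follows. The function and argument names are hypothetical, and a simple greater-than comparison stands in for whatever statistical criterion a laboratory would actually apply.

```python
def classify_activity(promoterless_insert, promoterless_control,
                      promoter_insert, promoter_control):
    """Infer regulatory activities of an insert from reporter signals.

    Returns a set that may contain "promoter" (expression above the
    no-insert control in the promoterless vector) and/or "enhancer"
    (expression above the promoter-only control in the second vector).
    """
    activities = set()
    if promoterless_insert > promoterless_control:
        activities.add("promoter")
    if promoter_insert > promoter_control:
        activities.add("enhancer")
    return activities
```

An empty set means the insert showed neither activity in this assay.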
  • Alterations in the regulatory sequences can be generated using a variety of chemical and enzymatic methods which are well known to those skilled in the art. For example, regions of the sequences defined by restriction sites can be deleted.
  • Oligonucleotide-directed mutagenesis can be employed to alter the sequence in a defined way and/or to introduce restriction sites in specific regions within the sequence. Additionally, deletion mutants can be generated using DNA nucleases such as Bal31 or ExoIII and S1 nuclease. Progressively larger deletions in the regulatory sequences are generated by incubating the DNA with nucleases for increased periods of time (see
  • the altered sequences are evaluated for their ability to direct expression of heterologous coding sequences in appropriate host cells, e.g., nervous system cells. It is within the scope of the present invention that any altered regulatory sequences which retain their ability to direct expression of a coding sequence be incorporated into recombinant expression vectors for further use.
  • the regulatory nucleic acid sequences of the invention can routinely be analyzed for the presence of transcription elements by various publicly available computer programs. Putative transcription elements are located, for example, by means of comparing the sequence to known or known consensus transcription factor binding sequences, and determining that the percent identity between the two is significant.
  • the CNI-01104 regulatory sequence likely responds to a variety of exogenous factors and environmental stimuli.
  • Computer analysis of the CNI-01120 sequence performed as described above revealed a number of transcriptional elements, e.g. binding sites for transcription factors (see FIG. 22).
  • the CNI-01120 regulatory sequence likely responds to a variety of exogenous factors and environmental stimuli.
  • Computer analysis of the CNI-01125 sequence performed as described above revealed a number of transcriptional elements, e.g. binding sites for transcription factors (see FIG. 28).
  • the CNI-01125 regulatory sequence likely responds to a variety of exogenous factors and environmental stimuli.
  • Computer analysis of the CNI-01131 sequence performed as described above revealed a number of transcriptional elements, e.g. binding sites for transcription factors (see FIG. 34).
  • the CNI-01131 regulatory sequence likely responds to a variety of exogenous factors and environmental stimuli.
  • the invention also provides regulatory sequences containing binding sites for various transcription factors.
  • the transcription modulating nucleic acid sequence contains at least 20 contiguous nucleotides of SEQ ID NO: 1 and at least one of the transcription factor binding sites of FIG. 4.
  • the transcription modulating nucleic acid sequence contains at least 20 contiguous nucleotides of SEQ ID NO: 2, and at least one of the transcription factor binding sites of FIG. 10.
  • the transcription modulating nucleic acid sequence contains at least 20 contiguous nucleotides of SEQ ID NO: 3, and at least one of the transcription factor binding sites of FIG. 16.
  • the transcription modulating nucleic acid sequence contains at least 20 contiguous nucleotides of SEQ ID NO: 4, and at least one of the transcription factor binding sites of FIG. 22. In another embodiment, the transcription modulating nucleic acid sequence contains at least 20 contiguous nucleotides of SEQ ID NO: 5, and at least one of the transcription factor binding sites of FIG. 28. In another embodiment, the transcription modulating nucleic acid sequence contains at least 20 contiguous nucleotides of SEQ ID NO: 6, and at least one of the transcription factor binding sites of FIG. 34.
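The computer analysis described above compares a regulatory sequence to known consensus binding sequences and asks whether the percent identity is significant. The sketch below shows one simple way to do this: slide an IUPAC consensus along the sequence and report windows whose fraction of matching positions meets a threshold. The threshold, the function name, and the example consensus are illustrative assumptions, and the programs actually used for the figures are not identified in this passage.

```python
# IUPAC nucleotide codes mapped to the bases each one permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def scan_consensus(seq, consensus, min_identity=1.0):
    """Find windows of `seq` matching an IUPAC `consensus`.

    Returns (1-based start, identity) pairs for every window whose
    fraction of matching positions is at least `min_identity`.
    Only the given strand is scanned.
    """
    seq, consensus = seq.upper(), consensus.upper()
    n = len(consensus)
    hits = []
    for i in range(len(seq) - n + 1):
        window = seq[i:i + n]
        matches = sum(1 for base, code in zip(window, consensus)
                      if base in IUPAC[code])
        identity = matches / n
        if identity >= min_identity:
            hits.append((i + 1, identity))
    return hits
```

To cover both strands, the same scan can be repeated on the reverse complement of the sequence.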
  • Regulatory sequences can also be physically mapped using restriction endonucleases to create restriction maps, which can easily be constructed. Such maps may be constructed by restricting the sequence with a variety of restriction enzymes, separating the resulting fragments on an agarose gel, and therefrom determining the relative positions of the restriction enzyme recognition sequences. Alternatively, since the recognition sequences of most restriction enzymes are well known to those of skill in the art, a restriction map may be generated once the nucleotide sequence of the promoter or regulatory sequence is determined. Finer mapping of regulatory sequences can routinely be accomplished using site-directed mutagenesis, using variants of the fragments of the present invention.
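Generating a restriction map from a known nucleotide sequence, as described above, amounts to locating each recognition sequence and computing the resulting fragment sizes. The sketch below does this for a linear molecule using three common six-base cutters whose sites and top-strand cut positions are well established (EcoRI G^AATTC, BamHI G^GATCC, HindIII A^AGCTT). The function names and the choice of enzymes are illustrative and are not drawn from the specification.

```python
# Enzyme name -> (recognition sequence, top-strand cut offset within the site).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "HindIII": ("AAGCTT", 1),
}


def cut_positions(seq, enzyme):
    """Return top-strand cut positions for `enzyme` on a linear sequence.

    Each position is the number of nucleotides lying 5' of the cut.
    """
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    return cuts


def fragment_lengths(seq, cuts):
    """Lengths of the fragments produced by cutting a linear `seq` at `cuts`."""
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]
```

The fragment lengths computed this way are what one would compare against band sizes on an agarose gel.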
  • Site-specific mutagenesis is a technique useful in the preparation of mutant promoter regions useful in identifying important promoter elements. The technique further provides a ready ability to prepare and test sequence variants, for example, incorporating one or more of the foregoing considerations, by introducing one or more nucleotide sequence changes into the DNA.
  • Site-specific mutagenesis allows the production of mutants through the use of specific oligonucleotide sequences which encode the DNA sequence of the desired mutation, as well as a sufficient number of adjacent nucleotides, to provide a primer sequence of sufficient size and sequence complexity to form a stable duplex on both sides of the mismatch junction being traversed.
  • a primer of about 17 to 25 nucleotides in length is preferred, with about 5 to 10 residues on both sides of the junction of the sequence being altered.
  • the technique of site-specific mutagenesis is well known in the art as exemplified by publications (Adelman et al., DNA 2:183 (1983)).
  • the technique typically employs a phage vector which exists in both a single stranded and double stranded form.
  • Typical vectors useful in site-directed mutagenesis include vectors such as the M13 phage (Messing et al., Meth. Enzymol. 101:20 (1981)). These phage are readily commercially available and their use is generally well known to those skilled in the art.
  • Double stranded plasmids are also routinely employed in site directed mutagenesis which eliminates the step of transferring the gene of interest from a plasmid to a phage.
  • site-directed mutagenesis in accordance herewith is performed by first obtaining a single-stranded vector or melting apart the two strands of a double stranded vector which includes within its sequence any of the nucleic acid regulatory sequences of the invention.
  • An oligonucleotide primer bearing the desired mutated sequence is prepared, generally synthetically, for example by the method of Crea et al., Proc. Natl. Acad. Sci. U.S.A. 75:5765-5769 (1978).
  • Primer sequences are, of course, based on the nucleotide sequences of the regulatory sequences of the invention, i.e., SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
  • This primer is then annealed with the single-stranded vector, and subjected to DNA polymerizing enzymes such as E. coli polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing strand.
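The primer design guideline above (a primer of about 17 to 25 nucleotides, with several matching residues on both sides of the altered position) can be sketched for the simplest case of a single-base substitution. The function name, the default flank of 10 nucleotides, and the example template are illustrative assumptions. The primer is returned in the same sense as the template string, so the strand that anneals to a given single-stranded vector may be its reverse complement.

```python
def mutagenic_primer(template, position, new_base, flank=10):
    """Build a primer carrying a single-base substitution.

    `position` is the 1-based template position to change; `flank` is the
    number of unchanged nucleotides kept on each side of it, so the primer
    length is 2 * flank + 1 (21 nt for the default flank of 10).
    """
    idx = position - 1
    if not 0 <= idx < len(template):
        raise ValueError("position lies outside the template")
    if idx - flank < 0 or idx + flank >= len(template):
        raise ValueError("not enough flanking sequence around position")
    if template[idx].upper() == new_base.upper():
        raise ValueError("new_base does not change the template")
    return template[idx - flank:idx] + new_base + template[idx + 1:idx + flank + 1]
```

A flank of 8 gives a 17-nucleotide primer, at the lower end of the preferred range.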
  • The preparation of sequence variants of the nucleic acid regulatory sequences of the invention using site-directed mutagenesis is provided as a means of producing useful regulatory sequence variants and is not meant to be limiting, as there are other ways in which sequence variants of the regulatory sequences of the invention may be obtained, such as chemical mutagenesis.
  • recombinant vectors containing the desired regulatory sequence may be treated with mutagenic agents to obtain sequence variants (see, e.g., a method described by Eichenlaub et al., J. Bacteriol. 138(2):559-566 (1979) for the mutagenesis of plasmid DNA using hydroxylamine).
  • the present invention also provides for fragments, i.e., subsequences, of the CNI-01142 (SEQ ID NO: 1) regulatory sequence, which fragments need not be transcription activating.
  • Such fragments can be used to detect the CNI-01142 (SEQ ID NO: 1) sequence in a human cell sample, or in a cell sample derived from one of the animal sources listed above.
  • Such fragments may be at least about 5, 10, 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, or 1300 nucleotides in length, or no more than about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, or 1300 nucleotides in length.
  • the present invention also provides for fragments, i.e., subsequences, of the CNI-01080 (SEQ ID NO: 2) regulatory sequence, which fragments need not be transcription activating.
  • Such fragments can be used to detect the CNI-01080 (SEQ ID NO: 2) sequence in a human cell sample, or in a cell sample derived from one of the animal sources listed above.
  • Such fragments may be at least about 5, 10, 20, 30, 40, 50, or 60 nucleotides in length, or no more than about 20, 30, 40, 50, or 60 nucleotides in length.
  • the present invention also provides for fragments, i.e., subsequences, of the CNI-01104 (SEQ ID NO: 3) regulatory sequence, which fragments need not be transcription activating.
  • Such fragments can be used to detect the CNI-01104 (SEQ ID NO: 3) sequence in a human cell sample, or in a cell sample derived from one of the animal sources listed above.
  • Such fragments may be at least about 5, 10, 20, 30, 40, 50, 75, 100, 125, or 140 nucleotides in length, or no more than about 20, 30, 40, 50, 75, 100, 125, or 140 nucleotides in length.
  • the present invention also provides for fragments, i.e., subsequences, of the CNI-01120 (SEQ ID NO: 4) regulatory sequence, which fragments need not be transcription activating. Such fragments can be used to detect the CNI-01120 (SEQ ID
  • fragments may be at least about 5, 10, 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 2000, 2500, or 2750 nucleotides in length, or no more than about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 2000, 2500, or 2750 nucleotides in length.
  • the present invention also provides for fragments, i.e., subsequences, of the CNI-01125 (SEQ ID NO: 5) regulatory sequence, which fragments need not be transcription activating. Such fragments can be used to detect the CNI-01125 (SEQ ID NO: 5) sequence in a human cell sample, or in a cell sample derived from one of the animal sources listed above.
  • Such fragments may be at least about 5, 10, 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, 1500, or 1650 nucleotides in length, or no more than about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, 1500, or 1650 nucleotides in length.
  • the present invention also provides for fragments, i.e., subsequences, of the CNI-01131 (SEQ ID NO: 6) regulatory sequence, which fragments need not be transcription activating.
  • Such fragments can be used to detect the CNI-01131 (SEQ ID NO: 6) sequence in a human cell sample, or in a cell sample derived from one of the animal sources listed above.
  • Such fragments may be at least about 5, 10, 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, or 750 nucleotides in length, or no more than about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, or 750 nucleotides in length.
  • nucleic acid regulatory sequences of the invention can be generated using techniques well known to those of skill in the art.
  • the sequences may be generated from nucleic acids derived from natural sources or from publicly available cloned sequences by any one of a number of means known in the art, i.e., cleavage by one or more restriction endonucleases; DNase I treatment; exonuclease treatment; or mechanical shearing.
  • Such fragments may also be constructed artificially.
  • fragments may be synthesized chemically, or may be generated by means of the polymerase chain reaction (PCR).
  • Preparing a nucleic acid segment that includes a contiguous sequence from the genomic sequence region may alternatively be described as preparing a nucleic acid fragment.
  • fragments may also be obtained by other techniques such as, e.g., by mechanical shearing, exonuclease treatment or by restriction enzyme digestion.
  • Small nucleic acid segments or fragments may be readily prepared by, for example, directly synthesizing the fragment by chemical means, as is commonly practiced using an automated oligonucleotide synthesizer.
  • fragments may be obtained by application of nucleic acid reproduction technology, such as the PCR technology of U.S. Pat. No. 4,603,102 (incorporated herein by reference), by introducing selected sequences into recombinant vectors for recombinant production, and by other recombinant DNA techniques generally known to those of skill in the art of molecular biology.
  • sequence of a particular regulatory sequence may be determined by a number of means well known in the art, including but not limited to the method of Maxam and Gilbert (Meth. Enzymol. 65:499-560 (1980)), the Sanger dideoxy method (Sanger, F., et al., Proc. Natl. Acad. Sci. U.S.A. 74:5463 (1977)), the use of T7 DNA polymerase (Tabor and Richardson, U.S. Patent No. 4,795,699), or use of an automated DNA sequencer (e.g., Applied Biosystems, Foster City, CA).
  • the labels used in sequencing may be radioactive or fluorescent.
  • a vector suitable for maintenance and gene expression in a host cell is constructed, whereby the vector contains a reporter gene operably linked to the particular regulatory sequence or transcription activating sequence of the invention.
  • the vector containing the regulatory sequence or transcription activating sequence is then introduced into a cell, preferably a neural cell or cell derived from the brain.
  • the amount of the reporter gene product is assessed. For example, if the reporter gene product is GFP, the amount of GFP is determined by assessing the amount of fluorescence emitted by the cell.
  • a nucleotide sequence that modulates reporter gene expression according to the invention is one that causes a detectable difference of the level of expression of the reporter gene, and/or amount of the reporter gene product, when compared to a control cell containing the vector and reporter gene, but lacking the regulatory sequence or transcriptional activating sequence.
  • the difference is an increase in the expression of the reporter gene over that of the control.
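The reporter comparison described above can be expressed as a fold change of the reporter signal (for example, GFP fluorescence) in cells carrying the regulatory sequence relative to control cells carrying the vector and reporter alone. The sketch below is a minimal illustration. The two-fold cutoff used to call a "detectable difference" is a hypothetical assumption, since the text does not fix a threshold, and the function names are illustrative.

```python
from statistics import mean


def fold_change(test_readings, control_readings):
    """Mean reporter signal with the regulatory sequence divided by the
    mean signal of the control lacking it."""
    control = mean(control_readings)
    if control <= 0:
        raise ValueError("control signal must be positive")
    return mean(test_readings) / control


def modulates_expression(test_readings, control_readings, min_fold=2.0):
    """True if the signal rises or falls by at least `min_fold` (assumed cutoff)."""
    change = fold_change(test_readings, control_readings)
    return change >= min_fold or change <= 1 / min_fold
```

In practice, replicate variability would also be taken into account before calling a difference detectable; this sketch compares means only.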
  • the regulatory sequences of the present invention each promote or enhance gene expression in cells derived from the nervous system; thus, each of these regulatory sequences or nucleic acid regulatory sequences thereof is useful for the expression of a coding sequence in cells, particularly in nervous system cells.
  • the invention further provides vectors comprising a nucleic acid regulatory molecule of the invention.
  • the invention provides a vector comprising the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 or SEQ ID NO: 6. Additionally, the invention further provides vectors comprising two or more of the nucleotide sequences of these SEQ ID NOs.
  • the vector comprises the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 1 or the reverse complement of SEQ ID NO: 1.
  • the transcription activating sequence of SEQ ID NO: 1 may be at least about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, or 1300 nucleotides in length.
  • the transcription activating sequence of SEQ ID NO: 2 may be at least about 20, 30, 40, 50, or 60 nucleotides in length.
  • the transcription activating sequence of SEQ ID NO: 3 may be at least about 20, 30, 40, 50, 75, 100, 125, or 140 nucleotides in length.
  • the transcription activating sequence of SEQ ID NO: 4 may be at least about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 2000, 2500, or 2750 nucleotides in length.
  • the transcription activating sequence of SEQ ID NO: 5 may be at least about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, 1500, or 1650 nucleotides in length.
  • the transcription activating sequence of SEQ ID NO: 6 may be at least about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, or 750 nucleotides in length.
  • the vector may also include a nucleic acid that hybridizes along its entire length to SEQ ID NO: 1, or SEQ ID NO: 2, or SEQ ID NO: 3, or SEQ ID NO: 4, or SEQ ID NO: 5, or SEQ ID NO: 6.
  • the vector further comprises a coding sequence operably linked to a nucleic acid regulatory sequence of the invention.
  • the coding sequence is heterologous to the nucleic acid regulatory sequence of the invention.
  • the coding sequence can encode a peptide or a polypeptide and can comprise, e.g., a reporter gene sequence or a neuroprotective sequence.
  • a reporter gene sequence can encode, for example, β-galactosidase, a fluorescent protein (e.g., a green, red, blue, or cyan fluorescent protein), chloramphenicol acetyltransferase, luciferase or an antigenic marker.
  • the vector further comprises a multiple cloning site (MCS), wherein when a nucleotide sequence is inserted into the MCS, the nucleotide sequence is operably linked to a nucleic acid regulatory sequence of the invention.
  • MCS multiple cloning site
  • a vector of the invention can further comprise a coding sequence within the MCS.
  • a vector of the invention can further comprise an internal ribosomal entry site (IRES).
  • IRES internal ribosomal entry site
  • the invention further provides that any vector of the invention can also contain regulatory sequence (e.g., promoter sequence) in addition to the nucleic acid regulatory sequence of the invention.
  • the invention also provides for the enhancement of expression of a nucleotide sequence of interest in a vector containing the nucleotide sequence operably linked to a promoter sequence heterologous to the nucleic acid molecule of the invention. In this regard, in one embodiment, the invention provides a vector comprising a nucleic acid regulatory sequence of the invention, a promoter, and an MCS operably linked in an upstream-to-downstream order, such that when the nucleotide sequence of interest is present within the MCS, expression of the nucleotide sequence of interest is enhanced relative to its expression from the vector in the absence of the nucleic acid regulatory sequence of the invention.
  • the vector further comprises an IRES.
  • the vector further comprises a coding sequence within the MCS.
  • the coding sequence is heterologous to the nucleic acid regulatory sequence of the invention.
  • the coding sequence can encode a peptide or a polypeptide and can comprise, e.g., a reporter gene sequence or a neuroprotective sequence.
  • a reporter gene sequence can encode, for example, β-galactosidase, a fluorescent protein (e.g., a green, red, blue, or cyan fluorescent protein), chloramphenicol acetyltransferase, luciferase or an antigenic marker.
  • any of the vectors of the invention can be adapted for transfer to a eukaryotic host cell, including a human host cell.
  • the eukaryotic host cell is a nervous system cell.
  • the nervous system cell is a nervous system cell line, glial cell, astrocyte, oligodendrocyte, mesencephalic neuron, hypothalamic neuron or cortical neuron.
  • the vectors above are adapted for transfer to a prokaryotic host cell.
  • heterologous gene sequences can be expressed under the control of the nucleic acid regulatory sequences of the invention.
  • gene sequences include, but are not limited to, sequences encoding neuroprotective sequences, reporter gene products, toxic gene products, potentially toxic gene products, and antiproliferation or cytostatic gene products.
  • Reporter genes can also be expressed, including those encoding enzymes (e.g., chloramphenicol acetyltransferase (CAT), beta-galactosidase, luciferase), light-emitting proteins such as those encoded by luxAB, fluorescent proteins such as a green, red, blue, or cyan fluorescent protein, or antigenic markers.
  • nucleic acid regulatory sequences of the invention can be used to modulate the expression of a gene contained in an expression vector that either possesses or lacks a promoter.
  • an expression vector typically possesses a multiple cloning site upstream of the start codon of a gene.
  • the vector may or may not possess a promoter between the MCS and the gene.
  • if the plasmid lacks a promoter, an increase in the expression of the gene indicates that the cloned genomic fragment has promoter activity, or promoter and enhancer activities.
  • if the plasmid possesses a promoter, an increase in the expression of the gene indicates that the cloned fragment possesses at least enhancer activity.
  • genomic fragment may be cloned in either orientation, the method of generating the fragment permitting.
  • genomic fragments generated by DNase I treatment, shearing, or restriction with a single restriction endonuclease may be inserted in either orientation.
  • Fragments generated by filling-in and/or digestion with a single-strand nuclease, thereby generating blunt-ended fragments, can be inserted in either orientation.
  • directional cloning can be achieved by restriction with a pair of restriction endonucleases, each having a different recognition sequence.
  • the genomic fragment representing a regulatory sequence may be inserted in multiple copies upstream of a gene to be expressed, perhaps improving the regulatory activities.
  • the copies of the regulatory sequence or fragment thereof need not be placed in an adjacent conformation and may be separated by numerous random nucleotides and still retain their improved regulatory and promotion capability.
  • the regulatory sequences and transcription activating fragments thereof of the present invention may be used to induce expression of a heterologous gene in cells derived from the nervous system, such as neurons, including cortical neurons, hippocampal neurons, mesencephalic neurons, medullary neurons, and glial cells.
  • the invention further provides for host cells, or progeny thereof, containing the vectors above. In a more specific embodiment, said host cell is a eukaryotic cell, including a human host cell. In a more specific embodiment, said host cell is a nervous system cell. In another specific embodiment, said host cell is a prokaryotic cell.
  • the induction of a cytotoxic product by the regulatory sequences of the present invention may be used as a form of cancer gene therapy.
  • antisense, antigene, or aptameric oligonucleotides may be delivered to cells using the presently described expression constructs.
  • Ribozymes or single-stranded RNA can also be expressed in a cell to inhibit the expression of a particular gene of interest.
  • the target genes for these antisense or ribozyme molecules should be those encoding gene products that are essential for cell maintenance.
  • the regulatory sequences disclosed herein may be inserted into a variety of expression vectors for introduction into host cells.
  • the invention further provides for host cells, or progeny thereof, containing the vectors above.
  • said host cell is a eukaryotic cell, including a human host cell.
  • said host cell is a nervous system cell.
  • said host cell is a prokaryotic cell.
  • "host cells" means both cells, generally prokaryotic, used to maintain genetic constructs comprising the regulatory sequences of the present invention and a gene of interest that this region controls, as well as cells, generally eukaryotic, in which expression of the gene of interest is desired.
  • the expression vector or the nucleic acid regulatory sequence of the invention is engineered to be stably integrated into the eukaryotic host cell genome.
  • the invention further provides a method of expressing a coding sequence in a host cell in cell culture. In one embodiment, the method comprises culturing a host cell containing a vector of the invention that contains a coding sequence under conditions effective to allow expression of the coding sequence by said host cell. In another embodiment, the method comprises culturing a host cell of the invention wherein the nucleic acid regulatory sequence controls expression of a coding sequence endogenously present in the genome of said host cell, under conditions effective to allow expression of the coding sequence by said host cell.
  • the invention also provides a method of producing a peptide or polypeptide comprising maintaining a host cell of the invention that contains a coding sequence that encodes a peptide or polypeptide under conditions effective to allow expression of said coding sequence, and to allow translation of the resulting mRNA, such that a peptide or polypeptide is expressed.
  • the coding sequence is present as part of a vector of the invention.
  • the host cell has been engineered such that a nucleic acid regulatory sequence of the invention controls expression of a coding sequence endogenously present in the genome of said host cell, under conditions effective to allow expression of the coding sequence by said host cell.
  • the vector is present in the genome of said host cell.
  • a number of expression vectors may be advantageously selected depending upon the use intended for the expressed product; the promoter or regulatory sequences contained therein can be replaced by one or more of the regulatory sequences of the present invention, i.e., SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, or transcription regulating sequences thereof.
  • Such vectors include, but are not limited to, the E. coli expression vector pUR278 (Ruther et al, EMBO J.
  • pGEX vectors may also be used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In general, such fusion proteins are soluble and can easily be purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione.
  • the pGEX vectors are designed to include thrombin or factor Xa protease cleavage sites so that the cloned polypeptide of interest can be released from the GST moiety.
  • in yeast, a number of vectors containing constitutive or inducible promoters can be replaced by the regulatory sequence of the invention and fragments thereof (CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, Vol. 2, Ed. Ausubel et al, Greene Publish. Assoc. & Wiley Interscience, Ch. 13 (1988); Grant et al, Expression and Secretion Vectors for Yeast, in METHODS IN ENZYMOLOGY, Eds. Wu & Grossman, Acad. Press, N.Y., Vol. 153, pp. 516-544 (1987); Glover, DNA CLONING, Vol. II, IRL Press, Wash., D.C., Ch. 3
  • a host cell strain may be chosen that modulates the expression of the inserted sequences, or modifies and processes the gene product in the specific fashion desired. Such modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may be important for the function of the protein. Appropriate cell lines or host systems can be chosen to ensure the correct modification and processing of the foreign protein expressed. To this end, eukaryotic host cells that possess the cellular machinery for proper processing of the primary transcript, glycosylation, and phosphorylation of the gene product may be used.
  • the host cells may be derived from the nervous system itself, and grown in culture, or may be established neuronal or neuron-like cell lines.
  • many neuronal clones exist which have been used extensively as model systems of development since they retain electrophysiological activity with appropriate surface receptors, specific neurotransmitters, synapse forming properties and the ability to differentiate morphologically and biochemically into normal neurons.
  • Such cells are described in the following references: Kimhi et al, Proc. Natl. Acad. Sci. USA 73:462-466 (1976); In: EXCITABLE CELLS IN TISSUE CULTURE, Nelson, P. G.
  • the expression vectors that contain the nucleic acid regulatory sequences of the invention may contain a gene encoding a selectable marker.
  • a number of selection systems may be used, including but not limited to the herpes simplex virus thymidine kinase (Wigler et al, Cell 11:223 (1977)), hypoxanthine-guanine phosphoribosyltransferase (Szybalska & Szybalski, Proc. Natl. Acad. Sci. USA 48:2026 (1962)), and adenine phosphoribosyltransferase (Lowy et al, Cell 22:817 (1980)) genes, which can be employed in tk-, hgprt- or aprt- cells, respectively.
  • antimetabolite resistance can be used as the basis of selection for dhfr, which confers resistance to methotrexate (Wigler et al, Proc. Natl. Acad. Sci. USA 77:3567 (1980); O'Hare et al, Proc. Natl. Acad. Sci. USA 78:1527 (1981)); gpt, which confers resistance to mycophenolic acid (Mulligan & Berg, Proc. Natl. Acad. Sci. USA 78:2072 (1981)); neo, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin et al, J. Mol. Biol.
  • hygro which confers resistance to hygromycin (Santerre, et al, Gene 30:147 (1984)) genes.
  • Additional selectable genes include trpB, which allows cells to utilize indole in place of tryptophan; hisD, which allows cells to utilize histidinol in place of histidine (Hartman & Mulligan, Proc. Natl. Acad. Sci.
  • ODC ornithine decarboxylase
  • DFMO McConlogue, In: CURRENT COMMUNICATIONS IN MOLECULAR BIOLOGY, 1987, Cold Spring Harbor Laboratory ed.
  • glutamine synthetase Bebbington et al, Biotech 10:169 (1992)
  • introduction of the nucleic acid, comprising the nucleic acid regulatory sequence and, optionally, the coding sequence to be expressed, into the cell is accomplished by such methods as electroporation, lipofection, calcium phosphate mediated transfection, viral infection, cell fusion, chromosome-mediated gene transfer, microcell-mediated gene transfer, spheroplast fusion, etc.
  • Numerous techniques are known in the art for the introduction of foreign genes into cells (see, e.g., Loeffler and Behr, Meth. Enzymol. 217: 599-618 (1993); Cohen et al, Meth. Enzymol. 217: 618-644 (1993); Cline, Pharmac. Ther.
  • the chosen technique preferably provides for the stable transfer of the nucleic acid to the cell, so that the nucleic acid is expressible by the cell and is heritable and expressible by its cell progeny.
  • the invention also provides a method of identifying a modulator of a nucleic acid regulatory sequence of the invention, comprising: (a) contacting a host cell containing a nucleic acid regulatory sequence of the invention operably linked to a reporter gene sequence with a test compound; and (b) assaying expression of the reporter gene, such that, if a change in reporter gene expression relative to its expression in the absence of the test compound, is detected, a modulator of the nucleic acid regulatory sequence is identified.
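The comparison in step (b) above can be sketched in a few lines of Python. This is only an illustration: the function names, the use of a mean over replicate wells, and the two-fold threshold are assumptions made for the example and are not taken from the specification, which requires only that a change in reporter gene expression relative to the untreated condition be detected.

```python
from statistics import mean


def fold_change(treated, untreated):
    """Ratio of mean reporter signal with the test compound to that without it."""
    baseline = mean(untreated)
    if baseline <= 0:
        raise ValueError("untreated reporter signal must be positive")
    return mean(treated) / baseline


def is_modulator(treated, untreated, threshold=2.0):
    """Flag a test compound as a candidate modulator.

    The threshold is a hypothetical cut-off: a compound is flagged if the
    reporter signal rises at least `threshold`-fold (stimulation) or falls
    to at most 1/`threshold` of the untreated signal (inhibition).
    """
    fc = fold_change(treated, untreated)
    return fc >= threshold or fc <= 1.0 / threshold
```

In practice the cut-off would be chosen from the assay's own variability rather than fixed in advance.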
  • the host cell is a nervous system cell.
  • the genetically-engineered cell lines of Section 5.3., supra may be used to screen for peptides, polypeptides, small molecules, natural and synthetic compounds or other cell bound or soluble molecules that cause a stimulation or inhibition of transcriptional activities of the regulatory sequences of the invention. Such compounds may, for example, be used to control gene expression in cells in vitro that is mediated by a regulatory sequence of the present invention.
  • Random peptide libraries consisting of all possible combinations of amino acids attached to a solid phase support may be used to identify peptides that are able to activate or inhibit the activities of the regulatory sequences of the invention (Lam et al, Nature 354: 82-84 (1991)).
  • the screening of peptide libraries may have therapeutic value in the discovery of pharmaceutical agents that stimulate or inhibit gene expression mediated or controlled by one or more of the regulatory sequences of the invention.
  • combinatorial chemistry libraries can also be screened.
  • An example of an in vitro screening assay is described below. About 10,000 cells per well are plated in 96-well plates in a total volume of 100 µl, using medium appropriate for each cell line. A reporter plasmid is used or constructed whereby the expression of a gene for luciferase is placed under the control of one or more of the regulatory sequences of the invention.
  • this reporter plasmid is transfected into the cells, using 50 ng plasmid per well in the presence of LipofectAmine cationic lipid transfection reagent (Gibco) at 16 µg/ml. Final volume of the transfection mix is 100 µl.
  • Potential inhibitors of gene expression controlled by one or more of the regulatory sequences of the invention can also be added to the cells at this time. The effect of such inhibitors can be determined by measuring the response of the luciferase reporter gene driven by the regulatory sequence(s). After 6 hr.
  • the reporter can also be a fluorescent protein such as green fluorescent protein (GFP). This assay can easily be set up in a high-throughput screening mode for evaluation of compound libraries in a 96-well format.
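The per-well quantities stated for the example assay (about 10,000 cells, 50 ng plasmid, cationic lipid at 16 µg/ml in a 100 µl transfection mix) scale to whole plates by simple arithmetic. The sketch below works that out; the function name and the optional `overage` allowance for pipetting losses are assumptions added for illustration and do not appear in the specification.

```python
# Per-well values stated in the example assay.
WELLS_PER_PLATE = 96
CELLS_PER_WELL = 10_000
PLASMID_NG_PER_WELL = 50
LIPID_UG_PER_ML = 16
TRANSFECTION_VOLUME_UL = 100


def plate_requirements(n_plates=1, overage=0.0):
    """Total cells, plasmid, lipid, and mix volume for n_plates 96-well plates.

    `overage` is a hypothetical fractional excess (e.g. 0.1 for 10 % extra).
    """
    scale = WELLS_PER_PLATE * n_plates * (1.0 + overage)
    lipid_ug_per_well = LIPID_UG_PER_ML * TRANSFECTION_VOLUME_UL / 1000
    return {
        "cells": CELLS_PER_WELL * scale,
        "plasmid_ug": PLASMID_NG_PER_WELL * scale / 1000,
        "lipid_ug": lipid_ug_per_well * scale,
        "mix_volume_ml": TRANSFECTION_VOLUME_UL * scale / 1000,
    }
```

With no overage, one plate works out to 960,000 cells, 4.8 µg plasmid, 153.6 µg lipid, and 9.6 ml of transfection mix.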
  • the invention provides means for promoting or increasing the activity of the regulatory sequences, and thereby increasing or promoting the expression of a gene or genes controlled by one or more sequences of the invention.
  • the invention further provides for inhibiting the regulatory activity of the regulatory sequences, and thereby inhibiting the expression of a gene or genes controlled by one or more sequences of the invention.
  • oligonucleotides complementary to the regulatory sequences may be designed and delivered to cells that contain a gene under the control of a regulatory sequence of the present invention. Such oligonucleotides anneal to the regulatory sequence, and prevent activation of transcription.
  • the regulatory sequence or portions thereof may be delivered to cells in saturating concentrations to compete for transcription factor binding.
  • the nucleic acid is directly administered in vivo into a target cell.
  • This can be accomplished by any methods known in the art, e.g., by constructing it as part of an appropriate nucleic acid expression vector and administering it so that it becomes intracellular, e.g., by infection using a defective or attenuated retroviral or other viral vector (see U.S. Patent No.
  • microparticle bombardment e.g., a gene gun; Biolistic, Dupont
  • lipids or cell-surface receptors or transfecting agents by encapsulation in liposomes, microparticles, or microcapsules, by administering it in linkage to a peptide known to enter the nucleus, or by administering it in linkage to a ligand subject to receptor-mediated endocytosis (see, e.g., Wu and Wu, 1987, J. Biol. Chem.
  • nucleic acid-ligand complex can be formed in which the ligand comprises a fusogenic viral peptide to disrupt endosomes, allowing the nucleic acid to avoid lysosomal degradation.
  • the nucleic acid can be targeted in vivo for cell specific uptake and expression, by targeting a specific receptor
  • nucleic acid can be introduced intracellularly and incorporated within host cell DNA for expression, by homologous recombination (Koller and Smithies, Proc. Natl. Acad. Sci. USA 86:8932-8935 (1989); Zijlstra et al, Nature 342:435-438 (1989)).
  • the oligonucleotide may comprise at least one modified base moiety which is selected from the group including, but not limited to, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosyl
  • Endogenous target gene expression can also be reduced by inactivating or "knocking out" a regulatory sequence using targeted homologous recombination (e.g., see Smithies, et al, Nature 317:230-234 (1985); Thomas and Capecchi, Cell 51:503-512 (1987); Thompson et al, Cell 56:313-321 (1989); each of which is incorporated by reference herein in its entirety).
  • a non-functional target sequence (or a completely unrelated DNA sequence) flanked by DNA homologous to the specific regulatory sequence can be used, with or without a selectable marker and/or a negative selectable marker, to transfect cells that express the target gene in vivo.
  • This approach can be adapted for use in humans provided the recombinant DNA constructs are directly administered or targeted to the required site in vivo using appropriate vectors.
  • endogenous target gene expression can be reduced by targeting deoxyribonucleotide sequences complementary to the regulatory sequence of the target gene (i.e., the target gene promoter and/or enhancers) to form triple helical structures that prevent transcription of the target gene in target cells in the body (see generally, Helene, Anticancer Drug Des., 6(6):569-584 (1991); Helene et al., Ann. NY. Acad. Sci., 660:27-36 (1992); and Maher, Bioessays 14(12):807-815 (1992)).
  • Nucleic acid molecules to be used in triple helix formation for the inhibition of transcription should be single stranded and composed of deoxynucleotides.
  • the base composition of these oligonucleotides must be designed to promote triple helix formation via Hoogsteen base pairing rules, which generally require sizeable stretches of either purines or pyrimidines to be present on one strand of a duplex.
  • Nucleotide sequences may be pyrimidine-based, which will result in TAT and CGC triplets across the three associated strands of the resulting triple helix.
  • the pyrimidine-rich molecules provide base complementarity to a purine-rich region of a single strand of the duplex in a parallel orientation to that strand.
  • nucleic acid molecules may be chosen that are purine-rich, for example, contain a stretch of G residues.
  • the potential sequences that can be targeted for triple helix formation may be increased by creating a so-called "switchback" nucleic acid molecule.
  • Switchback molecules are synthesized in an alternating 5'-3', 3'-5' manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.
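For conventional (non-switchback) triplex-forming molecules, the requirement for a sizeable stretch of purines on one strand of the duplex suggests a simple first-pass scan of a regulatory sequence for candidate target sites. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the default minimum tract length of 15 is an arbitrary illustrative value, since the text says only "sizeable".

```python
import re


def homopurine_tracts(strand, min_len=15):
    """Find runs of purines (A or G) at least min_len bases long.

    Returns a list of (start, end, tract) tuples using 0-based,
    end-exclusive coordinates on the strand as given (5' to 3').
    A pyrimidine-only run on this strand corresponds to a purine run
    on the complementary strand and is not reported here.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [(m.start(), m.end(), m.group())
            for m in pattern.finditer(strand.upper())]
```

For example, `homopurine_tracts("CTAGGAGGAAGAGGAAGGTC", min_len=10)` reports the single 16-base purine run starting at position 2.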
  • the anti-sense RNA and DNA molecules and triple helix molecules of the invention may be prepared by any method known in the art for the synthesis of nucleic acid molecules.
  • RNA molecules may be generated by in vitro and in vivo transcription of DNA sequences encoding the RNA molecule.
  • antisense cDNA constructs that synthesize antisense RNA constitutively or inducibly, depending on the promoter used, can be introduced stably into cell lines.
  • modifications to the DNA molecules may be introduced as a means of increasing intracellular stability and half-life. Possible modifications include, but are not limited to, the addition of flanking sequences of ribo- or deoxy-nucleotides to the 5' and/or 3' ends of the molecule or the use of phosphorothioate or 2' O-methyl rather than phosphodiester linkages within the oligodeoxyribonucleotide backbone.
  • downregulation of genes not operably linked to one of the disclosed regulatory sequences can be accomplished by use of antisense nucleic acids.
  • the regulatory sequences promote or enhance the expression of a nucleotide sequence that has exact or substantial complementarity to a gene whose expression is to be down regulated.
  • downregulation of non-cis-linked genes by a regulatory sequence of the invention may be accomplished by using the regulatory sequence to drive the production of mRNA that folds into a ribozyme, which is able to cleave the mRNA produced by the gene whose downregulation is sought.
  • Antisense RNA and DNA molecules act to directly block the translation of mRNA by hybridizing to targeted mRNA and preventing protein translation.
  • Antisense approaches involve the design of oligonucleotides which are complementary to a protective sequence mRNA.
  • the antisense oligonucleotides will bind to the complementary sequence in mRNA transcripts and prevent translation. Absolute complementarity, although preferred, is not required.
  • a sequence "complementary" to a portion of an RNA means a sequence having sufficient complementarity to be able to hybridize with the RNA, forming a stable duplex; in the case of double-stranded antisense nucleic acids, a single strand of the duplex DNA may thus be tested, or triplex formation may be assayed.
  • the ability to hybridize will depend on both the degree of complementarity and the length of the antisense nucleic acid. Generally, the longer the hybridizing nucleic acid, the more base mismatches with an RNA it may contain and still form a stable duplex (or triplex, as the case may be).
  • One skilled in the art can ascertain a tolerable degree of mismatch by use of standard procedures to determine the melting point of the hybridized complex.
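The text above refers to determining the melting point experimentally. As a rough computational complement, the melting temperature of a short, perfectly matched DNA oligonucleotide is often estimated with the Wallace rule, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C). The sketch below applies that rule; it is an approximation for short oligonucleotides only, it is not a method described in the specification, and it does not model mismatches, salt concentration, or RNA targets.

```python
def wallace_tm(oligo):
    """Estimate melting temperature (deg C) of a short DNA oligo.

    Uses the Wallace rule: 2 * (A + T) + 4 * (G + C).
    U is treated as T so RNA-style input is accepted.
    """
    seq = oligo.upper().replace("U", "T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T/U")
    return 2 * at + 4 * gc
```

A measured drop in melting point relative to such an estimate for the perfect match gives one indication of how much mismatch a given hybrid tolerates.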
  • oligonucleotides complementary to non-coding regions of a gene to be downregulated could be used in an antisense approach to inhibit translation of endogenous mRNA.
  • Antisense nucleic acids should be at least six nucleotides in length, and are preferably oligonucleotides ranging from 6 to about 50 nucleotides in length.
  • the oligonucleotide is at least 10 nucleotides, at least 17 nucleotides, at least 25 nucleotides or at least 50 nucleotides.
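As an illustration of the complementarity and length constraints above, the following sketch derives a perfectly complementary DNA antisense oligonucleotide from a chosen window of a target mRNA. The function name and the hard limit of 6 to 50 nucleotides are assumptions for the example; the text gives 6 as a minimum and "about 50" as a preferred upper bound, and also contemplates imperfect complementarity, which this sketch does not model.

```python
# Watson-Crick partners, written as DNA bases for the antisense strand.
_COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna, start, length):
    """Return the DNA antisense oligo (5' to 3') for mrna[start:start+length].

    `start` is a 0-based position on the mRNA read 5' to 3'.
    """
    if not 6 <= length <= 50:
        raise ValueError("length must be between 6 and 50 nucleotides")
    target = mrna.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("window extends past the end of the mRNA")
    # Reverse the target, then complement each base.
    return "".join(_COMPLEMENT[base] for base in reversed(target))
```

For instance, the first ten bases of the hypothetical mRNA `AUGGCCAUUGUAAUGGGCCGC` give the antisense oligonucleotide `CAATGGCCAT`.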
  • in vitro studies are first performed to quantitate the ability of the antisense oligonucleotide to inhibit protective sequence expression. It is preferred that these studies utilize controls that distinguish between antisense gene inhibition and nonspecific biological effects of oligonucleotides. It is also preferred that these studies compare levels of the cerebral RNA or protein with that of an internal control RNA or protein. Additionally, it is envisioned that results obtained using the antisense oligonucleotide are compared with those obtained using a control oligonucleotide.
  • it is preferred that the control oligonucleotide is of approximately the same length as the test oligonucleotide and that the nucleic acid of the oligonucleotide differs from the antisense sequence no more than is necessary to prevent specific hybridization to the target sequence.
  • the oligonucleotides can be DNA or RNA or chimeric mixtures or derivatives or modified versions thereof, single-stranded or double-stranded.
  • the oligonucleotide can be modified at the base moiety, sugar moiety, or phosphate backbone, for example, to improve stability of the molecule, hybridization, etc.
  • the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger, et al, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556 (1988); Lemaitre, et al, Proc. Natl. Acad. Sci. U.S.A. 84:648-652 (1987); U.S. Patent No. 4,904,582) or the blood-brain barrier (see, e.g., PCT Publication No.
  • the oligonucleotide may be conjugated to another molecule, e.g., a peptide, hybridization triggered cross-linking agent, transport agent, hybridization-triggered cleavage agent, etc.
  • the antisense oligonucleotide may comprise at least one modified base moiety which is selected from the group including but not limited to 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-
  • the antisense oligonucleotide may also comprise at least one modified sugar moiety selected from the group including but not limited to arabinose, 2-fluoroarabinose, xylulose, and hexose.
  • the antisense oligonucleotide comprises at least one modified phosphate backbone selected from the group consisting of a phosphorothioate, a phosphorodithioate, a phosphoramidothioate, a phosphoramidate, a phosphordiamidate, a methylphosphonate, an alkyl phosphotriester, and a formacetal or analog thereof.
  • the antisense oligonucleotide is an α-anomeric oligonucleotide.
  • An α-anomeric oligonucleotide forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gautier, et al, Nucl. Acids Res. 15:6625-6641 (1987)).
  • the oligonucleotide is a 2'-O-methylribonucleotide (Inoue et al, Nucl. Acids Res.
  • Oligonucleotides of the invention may be synthesized by standard methods known in the art, e.g., by use of an automated DNA synthesizer (such as are commercially available from Biosearch, Applied Biosystems, etc.). As examples, phosphorothioate oligonucleotides may be synthesized by the method of Stein, et al. (Nucl. Acids Res.
  • methylphosphonate oligonucleotides can be prepared by use of controlled pore glass polymer supports (Sarin, et al, Proc. Natl. Acad. Sci. U.S.A. 85:7448-7451 (1988)), etc.
  • While antisense nucleotides complementary to the coding region sequence of the gene to be downregulated are useful, antisense nucleotides complementary to the transcribed, untranslated region are most preferred.
  • Antisense molecules should be delivered to cells that express the gene to be down regulated in vivo.
  • a number of methods have been developed for delivering antisense DNA or RNA to cells; e.g., antisense molecules can be injected directly into the tissue site, or modified antisense molecules, designed to target the desired cells (e.g., antisense linked to peptides or antibodies which specifically bind receptors or antigens expressed on the target cell surface) can be administered systemically.
  • a preferred approach to achieve intracellular concentrations of the antisense sufficient to suppress translation of endogenous mRNAs utilizes a recombinant DNA construct in which the antisense oligonucleotide is placed under the control of a strong promoter.
  • RNAs which will form complementary base pairs with the endogenous protective sequence transcripts and thereby prevent translation of the protective sequence mRNA.
  • a vector can be introduced e.g., such that it is taken up by a cell and directs the transcription of an antisense RNA.
  • Such a vector can remain episomal or become chromosomally integrated, as long as it can be transcribed to produce the desired antisense RNA.
  • Such vectors can be constructed by recombinant DNA technology methods standard in the art. Vectors can be plasmid, viral, or others known in the art, used for replication and expression in mammalian cells.
  • plasmid, cosmid, YAC or viral vector can be used to prepare the recombinant DNA construct that can be introduced directly into the tissue site.
  • viral vectors can be used that selectively infect the desired tissue, in which case administration may be accomplished by another route (e.g., systemically).
  • Ribozyme molecules designed to catalytically cleave target gene mRNA transcripts can also be used to prevent translation of target gene mRNA and, therefore, expression of target gene product (see, e.g., PCT International Publication WO90/11364, published October 4, 1990; Sarver et al, Science 247:1222-1225 (1990)).
  • Ribozymes are enzymatic RNA molecules capable of catalyzing the specific cleavage of RNA (for a review, see Rossi, Current Biology 4:469-471 (1990)).
  • the mechanism of ribozyme action involves sequence specific hybridization of the ribozyme molecule to complementary target RNA, followed by an endonucleolytic cleavage event.
  • the composition of ribozyme molecules must include one or more sequences complementary to the target gene mRNA, and must include the well known catalytic sequence responsible for mRNA cleavage. For this sequence, see, e.g., U.S. Patent No. 5,093,246, which is incorporated herein by reference in its entirety.
  • ribozymes that cleave mRNA at site-specific recognition sequences can be used to destroy target gene mRNAs
  • the use of hammerhead ribozymes is preferred.
  • Hammerhead ribozymes cleave mRNAs at locations dictated by flanking regions which form complementary base pairs with the target mRNA. The sole requirement is that the target mRNA have the following sequence of two bases: 5'-UG-3'.
  • the construction and production of hammerhead ribozymes is well known in the art and is described more fully in Myers, MOLECULAR BIOLOGY AND BIOTECHNOLOGY: A COMPREHENSIVE DESK REFERENCE, VCH Publishers, New York (1995) (see especially FIG. 4, page 833) and in Haseloff and Gerlach, Nature, 334:585-591 (1988), which is incorporated herein by reference in its entirety.
  • the ribozyme is engineered so that the cleavage recognition site is located near the 5' end of the target gene mRNA, i.e., to increase efficiency and minimize the intracellular accumulation of non-functional mRNA transcripts.
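Taking the two-base requirement exactly as stated above (5'-UG-3'), together with the preference for a cleavage site near the 5' end of the target mRNA, candidate sites can be listed with a short scan. This is a minimal sketch: the function name is hypothetical, and it does nothing beyond locating the dinucleotide, so design of the flanking complementary arms and any further site-selection criteria are outside its scope.

```python
def ug_sites(mrna):
    """Return 0-based positions of every 5'-UG-3' dinucleotide in an mRNA.

    Positions are in ascending order, so sites nearest the 5' end of the
    transcript come first. T is treated as U so cDNA-style input works.
    """
    seq = mrna.upper().replace("T", "U")
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "UG"]
```

Because the list is ordered from the 5' end, its first entries are the sites favoured by the preference stated above.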
  • the ribozymes of the present invention also include RNA endoribonucleases (hereinafter "Cech-type ribozymes") such as the one which occurs naturally in Tetrahymena thermophila (known as the IVS, or L-19 IVS RNA) and which has been extensively described by Thomas Cech and collaborators (Zaug, et al, Science, 224:574-578 (1984); Zaug & Cech, Science, 231:470-475 (1986); Zaug, et al, Nature, 324:429-433 (1986); U.S. Patent No. 4,987,071; Been & Cech, Cell, 47:207-216 (1986)).
  • the Cech-type ribozymes have an eight nucleotide active site that hybridizes to a target RNA sequence, whereafter cleavage of the target RNA takes place.
  • the invention encompasses those Cech-type ribozymes that target eight nucleotide active site sequences that are present in the target gene.
  • the ribozymes can be composed of modified oligonucleotides (e.g., for improved stability, targeting, etc.) and should be delivered to cells that express the target gene in vivo.
  • a preferred method of delivery involves using a DNA construct "encoding" the ribozyme under the control of a strong constitutive promoter, so that transfected cells will produce sufficient quantities of the ribozyme to destroy endogenous target gene messages and inhibit translation. Because ribozymes, unlike antisense molecules, are catalytic, a lower intracellular concentration is required for efficiency.
  • the nucleic acid regulatory sequences of the invention can be used to direct expression of a coding sequence in animals by transgenic technology.
  • Animals of any species including, but not limited to, mice, rats, rabbits, guinea pigs, pigs, micro-pigs, goats, sheep, and non-human primates, e.g., baboons, monkeys, and chimpanzees may be used to generate transgenic animals.
  • "transgenic" refers to animals expressing coding sequences from a different species (e.g., mice expressing human gene sequences), as well as animals that have been genetically engineered to overexpress endogenous (i.e., same species) sequences or animals that have been genetically engineered to no longer express endogenous gene sequences (i.e., "knock-out" animals), and their progeny.
  • Any technique known in the art may be used to introduce a transgene under the control of a regulatory sequence of the invention into animals to produce the founder lines of transgenic animals.
  • Such techniques include, but are not limited to, pronuclear microinjection (Hoppe and Wagner U.S. Patent No. 4,873,191); retrovirus-mediated gene transfer into germ lines (Van der Putten, et al, Proc. Natl. Acad. Sci, USA 82:6148-6152 (1985)); gene targeting in embryonic stem cells (Thompson, et al, Cell 56:313-321 (1989)); electroporation of embryos (Lo, Mol. Cell Biol.
  • transgenic animal clones containing a transgene for example, nuclear transfer into enucleated oocytes of nuclei from cultured embryonic, fetal or adult cells induced to quiescence (Campbell, et al, Nature 380:64-66 (1996); Wilmut, et al, Nature 385:810-813 (1997)).
  • the present invention provides for transgenic animals that carry a transgene such as a reporter gene under the control of a regulatory sequence of the invention or transcription modulating sequences thereof in all their cells, as well as animals that carry the transgene in some, but not all their cells, i.e., mosaic animals.
  • the transgene may be integrated as a single transgene or in concatamers, e.g., head-to-head tandems or head-to-tail tandems.
  • the transgene may also be selectively introduced into and activated in a particular cell type by following, for example, the teaching of Lasko et al. (Proc. Natl. Acad. Sci. U.S.A 89:6232-6236 (1992)).
  • the expression characteristics of an endogenous gene within a cell, cell line or microorganism may be modified by inserting a regulatory sequence of the invention or transcription modulating sequence thereof, into the genome of a cell, stable cell line or cloned microorganism, by nonhomologous recombination, such that the inserted regulatory element is operatively linked with the endogenous gene and controls, modulates or activates the endogenous gene.
  • endogenous genes that are normally "transcriptionally silent," i.e., genes that are normally not expressed or are expressed only at very low levels in a cell line or microorganism, may be activated by inserting a regulatory sequence of the invention, or transcription activating sequence thereof, which is capable of promoting the expression of a normally expressed gene product in that cell line or microorganism.
  • a heterologous regulatory element may be inserted into a stable cell line or cloned microorganism, such that it is operatively linked with and activates expression of endogenous genes, using techniques, such as targeted homologous recombination, which are well known to those of skill in the art, and described, e.g., in Chappel, U.S. Pat. No.
  • Once transgenic animals have been generated, the transcriptional activities of the specific regulatory sequence may be assayed utilizing standard techniques. Initial screening may be accomplished by Southern blot analysis or PCR techniques to analyze animal tissues to assay whether integration of the transgene has taken place. The level of mRNA expression of the transgene in the tissues of the transgenic animals may also be assessed using techniques that include, but are not limited to, northern blot analysis of tissue samples obtained from the animal, in situ hybridization analysis, and RT-PCR. Samples of transgene-expressing tissue may also be evaluated immunocytochemically using antibodies specific for the transgene product. Such animals may be used as an in vivo system for the screening of agents that activate or inhibit the activities of the regulatory sequence.
  • DNA sequences that regulate cell-, tissue- or organ-specific transcription may be used therapeutically or prophylactically. Such sequences can be inserted into a DNA vector and used to control cell-, tissue-, region- or nervous system-specific transcription of an introduced gene or DNA sequence, or an antisense form of a gene, in order to alter the expression of endogenous cellular genes, or to cause expression of factors (e.g., secreted cytokines) that will alter the properties of other cells.
  • it may be possible to use neuron-specific regulatory sequences to express the antisense forms of factors responsible for the excess process outgrowth in neurons that is associated with epilepsy.
  • categories of genes associated with nerve regeneration can be placed under control of inducible promoters associated with regions of DNA that regulate neuron-specific expression.
  • Other applications may be the prophylactic or therapeutic expression of factors that would confer resistance to the effects of chronic infectious agents such as viruses or bacteria that harm cells in the CNS.
  • synthetic antisense molecules e.g., phosphorothioate oligodeoxynucleotides
  • because HIV infection can affect the CNS, it may be possible to replace damaged nervous system tissue with nervous system stem cells stably expressing an antisense RNA against HIV mRNAs under the control of a neuron-specific regulatory sequence.
  • Antisense nucleic acids expressed under the control of the regulatory sequences of the present invention can be used to treat disorders of a cell type that expresses, or preferably overexpresses, the particular mRNA to which the antisense nucleic acid is directed.
  • a disorder is an overexpression of a neurotransmitter.
  • a single-stranded DNA antisense TCAP oligonucleotide is used.
  • Cell types which express or overexpress a particular mRNA can be identified by various methods known in the art. Such methods include but are not limited to hybridization with a nucleic acid to the gene of interest (e.g.
  • RNA from the cell type can be translated in vitro into the specific protein produced by the gene, immunoassay, etc.
  • primary tissue from a patient can be assayed for protein expression prior to treatment, e.g., by immunocytochemistry or in situ hybridization.
  • the amount of antisense nucleic acid that will be effective in the treatment of a particular disorder or condition will depend on the nature of the disorder or condition, and can be determined by standard clinical techniques. Where possible, it is desirable to determine the antisense cytotoxicity to the cell type to be treated in vitro, and then in useful animal model systems prior to testing and use in humans.
  • compositions comprising antisense nucleic acids are administered via liposomes, microparticles, or microcapsules.
  • it may be desirable to utilize liposomes targeted via antibodies to specific identifiable tumor antigens (Leonetti et al, Proc. Natl. Acad. Sci U.S.A. 87:2448-2451 (1990); Renneisen et al, J. Biol Chem. 265:16337-16342 (1990)).
  • nucleotide sequences described herein may also be used as diagnostic tools, where a particular condition or disease state is correlated with polymorphisms among individuals in the CNI-01142, CNI-01080, CNI-01104, CNI-01120, CNI-01125, or CNI-01131 regulatory sequence. Sequence polymorphisms are the DNA sequence variations that occur between different individuals at the same genetic loci. Polymorphisms can be single nucleotide polymorphisms (SNPs), as well as larger-scale sequence deletions, insertions, or inversions that vary between individuals.
  • Sequence polymorphisms that occur within regulatory DNA sequence can alter the relative levels of gene expression, which in turn can result in a disease condition, susceptibility to a disease, or alter the response of an individual to drug prophylaxis, drug therapy, or other medical treatments.
  • identifying regulatory sequences and the sequence polymorphisms that occur within them can be used to diagnose a disease or condition, predict the likelihood of developing a disease condition or susceptibility to a condition, predict the likelihood of transmitting an inheritable susceptibility to offspring, or predict the responses of individuals to drug prophylaxis, drug therapies, or other medical treatments.
  • Methods for detecting SNPs are well known in the art, and generally rely on differential hybridization, i.e., the ability to distinguish between a nucleic acid with full complementarity to a regulatory sequence and a nucleic acid with a single mismatch.
  • the methods can either involve a simple determination of hybridization or lack thereof, or can involve a determination of failure of PCR to produce a product, where the mismatched primer is designed to be mismatched at the more critical 3' end of the primer.
  • Conventional techniques for detecting SNPs include, e.g., conventional dot blot analysis, single stranded conformational polymorphism (SSCP) analysis (see, e.g., Orita et al, Proc. Natl. Acad.
  • DGGE denaturing gradient gel electrophoresis
  • heteroduplex analysis mismatch cleavage detection
  • Other methods are known in the art, for example, solid phase arrays using primer-guided nucleotide incorporation procedures (e.g., Kornher, et al, Nucl. Acids Res. 17:7779-7784 (1989); Sokolov, Nucl. Acids Res.
  • biallelic SNPs or biallelic markers which have two alleles, both of which are present at a fairly high frequency in a population.
  • preferred methods of detecting and mapping SNPs involve microsequencing techniques wherein an SNP site in a target DNA is detected by a single nucleotide primer extension reaction (see, e.g., Goelet et al, U.S. Patent No. 6,004,744; Mundy, U.S. Patent No. 4,656,127; Vary and Diamond, U.S. Patent No. 4,851,331; Cohen et al, PCT Publication No. WO91/02087;
  • the present invention further provides methods for the use of the nucleic acid regulatory sequences of the invention.
  • DNA fragments that are found to promote or enhance gene expression may be used to find genes not previously known to be expressed in the nervous system; such genes may include previously unknown genes.
  • the method comprises sequencing the fragment in question, followed by a deduction of the gene or gene-like sequences that the fragment appears to regulate by comparison of the sequence to known genomic sequences using the search algorithms described above.
  • the regulatory sequence, or fragments thereof, as provided by the present invention may also be used to discover new transcription factors. Though thousands of transcription factors are predicted to exist in humans (see Venter et al, Science 291:1304-1350 (2001)), only a few hundred have been discovered; far fewer have been described as regulating gene expression in the nervous system. Transcription factors binding to the regulatory sequences provided herein may be discovered by any means known to those in the art.
  • fragments of the regulatory sequence can be separated on a non-denaturing agarose or polyacrylamide gel, under conditions allowing for binding of transcription factors to appropriate DNA recognition sequences or elements, in the presence or absence of extracts of cells derived from the nervous system; a shift in the mobility of a particular fragment in the presence of cell extracts indicates that the fragment is being bound by a protein that may regulate transcription.
  • a column can be constructed, comprising a packing material having a fragment of the regulatory sequence available for binding to cell extract components passed through the column, followed by washing of the column with a buffer that allows for DNA-protein interactions; proteins binding to the fragment, including potential new transcription factors, can thereupon be eluted and characterized.
  • the nucleic acid regulatory sequences of the invention can also be used to aid in the construction of microarrays that allow the simultaneous assessment of the binding of specific transcription factors to a plurality of regulatory DNA sequences.
  • a microarray has been reported in the yeast genetic system (Ren et al, Science 290:2306-2309 (2001)), and the techniques utilized therein can be readily utilized in the construction of such microarrays.
  • using the regulatory sequences provided herein in addition to known regulatory sequences, one can construct a similar microarray for human regulatory DNA sequences in order to profile transcription factor utilization in different cells, tissues, or between different physiological conditions or disease states.
  • Human chromosome 22 DNA libraries were prepared by cloning fragments of BamHI- or PstI-digested human chromosome 22 DNA sequences into the unique BamHI or PstI sites present in the multiple cloning site (MCS) of a plasmid vector constructed at
  • This plasmid contains a multiple cloning site (MCS) containing unique restriction enzyme sites for BamHI, EcoRI and
  • Downstream of the MCS, the vector also contains a basal promoter sequence containing a "TATA" box and a sequence encoding green fluorescent protein.
  • the vector also contains an ampicillin resistance gene, and a pMB1-derived origin of DNA replication.
  • a positive control plasmid, pCOGENT1(E), was created by inserting an approximately 400 nucleotide DNA fragment containing the strong transcription enhancer from the CMV immediate early (IE) gene promoter (Boshart et al., Cell 41(2):521-30 (1985)) into the unique EcoRI site in the MCS of pCOGENT1.
  • plasmid DNA was extracted using Promega DNA extraction kits. Purified plasmid DNA was introduced into mammalian nervous system cells.
  • The ability of the nucleotide sequences of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 5, and SEQ ID NO: 6 to modulate the expression of a nucleic acid of interest in a cell was measured quantitatively in the following manner. Plasmid DNAs from human DNA library clones (see Section 6.1.1) containing the sequences of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 5, or SEQ ID NO: 6 were introduced by transfection into cells in rat brain slices and maintained in culture. The number of cells expressing the reporter gene, green fluorescent protein (GFP), was determined by fluorescence microscopy.
  • nucleotide sequences of SEQ ID NO: 3 and SEQ ID NO: 4 were measured quantitatively in the following manner. Plasmid DNAs from human DNA library clones (see Section 6.1.1) containing the sequences of SEQ ID NO: 3 or SEQ ID NO: 4 were co-transfected into cells in rat brain slices in culture with the plasmid pECFP (Clontech, Palo Alto, CA). pECFP contains the gene for cyan fluorescent protein (CFP) under the control of a strong CMV promoter, which results in high-level expression of CFP in mammalian cells.
  • the numbers of cells expressing GFP and CFP were determined separately by fluorescence microscopy using different excitation and emission wavelengths for each fluorescent protein. The percentage of cells expressing CFP that also expressed GFP was calculated [(GFP-expressing cells/CFP-expressing cells) x 100]. Individual library clones were judged to contain nucleotide sequences capable of modulating gene expression if they caused a higher percentage of cells to express GFP (relative to CFP) than the percentage of cells expressing GFP (relative to CFP) in brain slices that were similarly co-transfected with the negative control plasmid pCOGENT1 and pECFP. The percentage of GFP-expressing cells similarly co-transfected with the positive control plasmid pCOGENT1(E) and pECFP was also determined.
  • the nucleotide sequence of a DNA insert that was selected for its ability to cause detectable expression of the reporter gene when introduced into cells was determined using the ABI Big Dye terminator Cycle Sequencing Ready Reaction Kit followed by subsequent analysis on the ABI3700 capillary sequencing machine (PE Biosystems, Foster City, CA). Plasmid DNA was annealed with oligonucleotide primers complementary to regions upstream (forward primer) and downstream (reverse primer) of the MCS. Cycle sequencing reactions were carried out in a thermocycler (PCR machine) using standard methods. The extension products from the sequencing reaction were purified by precipitation using isopropanol and analyzed on the ABI3700 sequencer according to the manufacturer's protocol.
  • the eukaryotic transcription factors and DNA motifs from the Transcription Factor Database are located on the Internet, via file transfer protocol, at ncbi.nlm.nih.gov/repository/TFD. Information present in the University of California, Santa Cruz (UCSC), draft assembly of the human genome (available on the Internet at genome.ucsc.edu/goldenPath/octTracks.html) was used to position the regulatory sequence on human chromosome 22.
  • the nearest known or predicted gene to the sequence of CNI-01142 is a known gene, arylsulfatase A (ARSA).
  • the sequence of the complement of the gene extends from position 47481553 to 47484673.
  • the sequence of arylsulfatase A precursor is in the opposite orientation of the sequence of CNI-01142.
  • the 3' end of the sequence of the gene is approximately 1,776 base pairs "downstream" from the 3' end of the sequence of CNI-01142.
  • CNI-01142 nucleotide sequence was analyzed for transcription factor recognition sites (FIG. 4).
  • Genomic clone CNI-01142 caused expression of the reporter gene above the level for the negative control pCOGENT1 (FIG. 5). Expression in the middle region of the brain was greater than in caudal and rostral regions. Nervous system cells transfected with CNI-01142 clearly show expression in brain slices (FIGS. 6A-6C).
  • the sequence of the 65 nucleotide CNI-01080 regulatory sequence is shown in FIG. 8.
  • a BLAST analysis showed the highest homology to GenBank accession number HS941F9, a human DNA sequence from clone CTA-941F9 on chromosome 22q13.
  • the sequence of CNI-01080 is located within an intron of the known gene, FBLN1, which encodes the extracellular matrix protein, fibulin-1.
  • the sequence of FBLN1 extends from sequence positions 42448246 to 42545953.
  • the sequence of FBLN1 is in the opposite orientation of the sequence of CNI-01080.
  • The 5' end of FBLN1 is approximately 60,981 base pairs "upstream" from the 3' end of the sequence of CNI-01080.
  • Fibulin-1 is reported to be expressed in brain exclusively in neurons (to the exclusion of astrocytes or microglia), and is implicated in the pathogenesis of Alzheimer's disease as a consequence of its ability to bind to amyloid precursor protein (Ohsawa I., et al. J Neurochem. 76(5):1411-20 (2001) "Fibulin-1 binds the amino-terminal head of beta- amyloid precursor protein and modulates its physiological function").
  • the CNI-01080 nucleotide sequence was analyzed for transcription factor recognition sites (FIG.
  • Genomic clone CNI-01080 caused expression of the reporter gene above the level for the negative control pCOGENT1 (FIG. 11).
  • the expression of the reporter gene is higher in the caudal region than in the middle region, and lower in the rostral region.
  • Nervous system cells transfected with CNI-01080 clearly show expression in brain slices (FIG. 12).
  • FIG. 15 (UCSC linkage map of a region of human chromosome 22), two genes are positioned near the sequence of CNI-01104.
  • the sequence of CNI-01104 lies within an intron of a known gene, HIRA.
  • the sequence of the complement of the HIRA gene extends from position 16258356 to 16359351.
  • the sequence of the HIRA gene is in the opposite orientation of the sequence of CNI-01104.
  • the 5' end of the sequence of the HIRA gene is approximately 18,585 base pairs "downstream" from the 3' end of the sequence of CNI-01104.
  • a second gene (not shown in FIG.
  • NLVCF nuclear localization Velo-cardio-facial syndrome
  • TUPLE 1 TUP-like enhancer of Split 1
  • Hira is expressed in the developing neural plate, the neural tube, the neural crest, and the mesenchyme of the head and branchial arch structures (Roberts et al, Hu. Mol. Genet. 6: 237-245, (1997), "Cloning and developmental expression analysis of chick Hira [Chira], a candidate gene for DiGeorge syndrome").
  • the NLVCF gene aligns with the same region of chromosome 22 deleted in patients with Velo-cardio-facial and DiGeorge syndromes.
  • NLVCF is expressed at high levels in the brain during development and may be co-regulated with HIRA, the murine homolog of which displays a similar expression pattern in mouse embryos as does the murine Nlvcf gene (Funke et al, Genomics 53:146-54 (1998), "Isolation and characterization of a human gene containing a nuclear localization signal from the critical region for velo-cardio-facial syndrome on 22q11").
  • the CNI-01104 nucleotide sequence was analyzed for transcription factor recognition sites (FIG. 16).
  • Genomic clone CNI-01104 caused expression of the reporter gene above the level for the negative control pCOGENT1 (FIG. 17). The expression of the reporter gene is highest in the middle region of the brain, with lower expression in the rostral and caudal regions. Nervous system cells transfected with CNI-01104 clearly show expression in brain slices (FIG. 18).
  • the sequence of CNI-01120 is located within the first intron of the known gene, HIRA.
  • the sequence of the complement of the gene extends from position 16258356 to 16359351.
  • the sequence of HIRA is in the opposite orientation of the sequence of CNI-01120.
  • the 5' end of the predicted gene is approximately 15,823 base pairs "downstream" from the 3' end of the sequence of CNI-01120.
  • a second gene, NLVCF, extends from position 16360198 to 16363730 (UCSC linkage map, December 12, 2000 freeze).
  • the sequence of the NLVCF gene is in the same orientation as the sequence of CNI-01120.
  • the 5' end of the NLVCF gene is approximately 16,670 nucleotides
  • the HIRA gene histone cell cycle regulation defective, S. cerevisiae homolog A, also known as DiGeorge critical region gene 1 [DGCR1]
  • DGCR1 DiGeorge critical region gene 1
  • TUPLE 1 TUP-like enhancer of Split 1
  • NLVCF nuclear localization Velo-cardio-facial syndrome
  • NLVCF is expressed at high levels in the brain during development and may be co-regulated with HIRA, the murine homolog of which displays a similar expression pattern in mouse embryos as does the murine Nlvcf gene (Funke et al, Genomics 53:146-54 (1998), "Isolation and characterization of a human gene containing a nuclear localization signal from the critical region for velo-cardio-facial syndrome on 22q11").
  • the CNI-01120 nucleotide sequence was analyzed for transcription factor recognition sites (FIG. 22).
  • Genomic clone CNI-01120 caused expression of the reporter gene above the level for the negative control pCOGENT1 (FIG. 23). The expression of the reporter gene is higher in the middle region of the brain than in the rostral and caudal regions. Nervous system cells transfected with CNI-01120 show expression in brain slices (FIG. 24).
  • The sequence of the 1697 nucleotide CNI-01125 regulatory sequence is shown in FIG. 26.
  • a BLAST analysis showed the highest homology to GenBank accession number AL031186, a human DNA sequence from clone CTA-984G1 on chromosome 22q12.1-12.2 that contains the 5' part of the EWSR1 gene for Ewing sarcoma breakpoint region 1 protein.
  • the sequence of CNI-01125 overlaps intronic and exonic sequence of two known genes, EWSR1 and C22ORF3.
  • the sequence of EWSR1 extends from position 26310272 to 26342152, in the same orientation as the sequence of CNI-01125.
  • the 5' end of the sequence of the gene is approximately 535 base pairs "downstream" from the 5' end of the sequence of CNI-01125.
  • the sequence of CNI-01125 extends approximately 1,162 bases further "downstream" into the sequence of EWSR1.
  • sequence of the complement of C22ORF3 extends from 26301845 to 26309915, in the opposite orientation of the sequence of CNI-01125.
  • the 5' end of the sequence of C22ORF3 is approximately 178 bases "downstream" of the 5' end of the sequence of
  • EWSR1 (Ewing sarcoma breakpoint region 1) is a gene that lies at the point of chromosome 22 at which translocations with chromosome 11 associated with Ewing sarcoma typically occur (Plougastel et al, Genomics 18:609-15 (1993), "Genomic structure of the EWS gene and its relationship to EWSR1, a site of tumor-associated chromosome translocation").
  • Patients with other kinds of cancer, including peripheral neuroepithelioma, display identical translocations between several different chromosomes and 22 (Whang-Peng et al., New Eng. J. Med.
  • the CNI-01125 nucleotide sequence was analyzed for transcription factor recognition sites (FIG. 28).
  • Genomic clone CNI-01125 caused significant expression of the reporter gene above the level for the negative control pCOGENT1 (FIG. 29).
  • the expression of the reporter gene is higher in the caudal region of the brain than in the rostral and middle regions. In particular, expression in the caudal region was substantially higher than that which occurred in the positive control.
  • Nervous system cells transfected with CNI-01125 show expression in brain slices (FIG. 30).
  • A BLAST analysis showed the highest homology to GenBank accession number AL022314, a human DNA sequence from clone RP5-1170K4 on chromosome 22q12.2-13.1 that contains several novel genes, one of which codes for a trypsin family protein with class A LDL receptor domains, and the IL2RB gene for Interleukin 2 Receptor, Beta (IL-2 Receptor, CD122).
  • the sequence of CNI-01131 encompasses an exon and part of the adjacent intron of the known gene, IL2RB.
  • the sequence of the complement of the gene extends from position 34050482 to 34074557.
  • the sequence of IL2RB is in the opposite orientation of the sequence of CNI-01131.
  • the 5' end of the sequence of the gene is approximately 13,824 base pairs "downstream" from the 3' end of the sequence of CNI-01131.
  • IL2RB encodes the beta chain of the IL2 receptor, which together with the alpha and gamma chains forms the high affinity IL2 receptor.
  • IL2, the prototypical T cell growth factor and immunoregulatory cytokine produced by lymphocytes, has been implicated as a brain neurotrophic factor and neuromodulator (Shimojo et al, Neurosci. Lett. 151:170-3 (1993), "Interleukin-2 enhances the viability of primary cultured rat neocortical neurons"). Additionally, IL2 has been reported to influence inflammatory processes in the brain such as encephalomyelitis (Petitto et al, Neurosci. Lett. 285:66-70 (2000), "Interleukin-2 gene deletion produces a robust reduction in susceptibility to experimental autoimmune encephalomyelitis in C57BL/6 mice"), learning and memory (Petitto et al, J. Neurosci. Res. 56:441-6 (1999), "Impaired learning and memory and altered hippocampal neurodevelopment resulting from interleukin-2 gene deletion"), and the control of tumor growth (Sampath et al, Cancer Res. 59:2107-14 (1999), "Paracrine immunotherapy with interleukin-2 and local chemotherapy is synergistic in the treatment of experimental brain tumors").
  • the CNI-01131 nucleotide sequence was analyzed for transcription factor recognition sites (FIG. 34).
  • Genomic clone CNI-01131 caused significant expression of the reporter gene above the level for the negative control pCOGENT1 (FIG. 35).
  • the expression of the reporter gene is higher in the middle region of the brain than in the rostral and caudal regions. Nervous system cells transfected with CNI-01131 clearly show expression in brain slices (FIG. 32).
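
The co-transfection readout described above for the SEQ ID NO: 3 and SEQ ID NO: 4 clones, [(GFP-expressing cells/CFP-expressing cells) x 100] compared against the negative control plasmid, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the cell counts are hypothetical and do not come from the specification, and the comparison is the simple "higher percentage than the negative control" criterion stated in the text, with no statistical test implied.

```python
def coexpression_percentage(gfp_cells: int, cfp_cells: int) -> float:
    """Percentage of CFP-expressing cells that also express GFP.

    Implements (GFP-expressing cells / CFP-expressing cells) x 100.
    """
    if cfp_cells <= 0:
        raise ValueError("at least one CFP-expressing cell is required")
    if gfp_cells < 0:
        raise ValueError("cell counts cannot be negative")
    return gfp_cells / cfp_cells * 100.0


def modulates_expression(clone_pct: float, negative_control_pct: float) -> bool:
    """True if the clone gives a higher GFP/CFP percentage than the negative control."""
    return clone_pct > negative_control_pct


# Hypothetical counts, for illustration only (not data from the specification).
clone_pct = coexpression_percentage(gfp_cells=30, cfp_cells=120)
negative_pct = coexpression_percentage(gfp_cells=6, cfp_cells=120)

print(f"library clone: {clone_pct:.1f}% of CFP-positive cells express GFP")
print(f"negative control: {negative_pct:.1f}% of CFP-positive cells express GFP")
print("clone scored as modulating expression:",
      modulates_expression(clone_pct, negative_pct))
```

Normalizing to CFP-expressing cells corrects for slice-to-slice differences in transfection efficiency, which is why the comparison is made on percentages rather than on raw GFP counts.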

Abstract

The present invention is directed to the nucleotide sequences of the CNI-01142, CNI-01080, CNI-01104, CNI-01120, CNI-01125, or CNI-01131 regulatory sequences, and to transcription activating regulatory molecules derived therefrom. The invention is further directed to vectors comprising these sequences, and to host cells containing the vectors. The invention further provides methods for the expression of a nucleotide sequence, or producing a polypeptide, of interest using the CNI-01142, CNI-01080, CNI-01104, CNI-01120, CNI-01125, or CNI-01131 regulatory sequences, in vitro and in vivo. Also provided is a method of identifying a regulator of the CNI-01142, CNI-01080, CNI-01104, CNI-01120, CNI-01125, or CNI-01131 regulatory sequences. Kits and non-human transgenic animals containing the CNI-01142, CNI-01080, CNI-01104, CNI-01120, CNI-01125, or CNI-01131 regulatory sequences are also provided.

Description

NUCLEIC ACID REGULATORY SEQUENCES AND USES THEREFOR
[0001] This application claims benefit of U.S. Provisional Application No. 60/303,398, filed July 6, 2001; U.S. Provisional Application No. 60/305,261, filed July 13, 2001; U.S. Provisional Application No. 60/307,394, filed July 24, 2001; U.S. Provisional Application No. 60/307,395, filed July 24, 2001; U.S. Provisional Application No. 60/307,666, filed July 25, 2001; and U.S. Provisional Application No. 60/309,885, filed August 3, 2001, each of which is incorporated by reference herein in its entirety.
1. INTRODUCTION
[0002] The present invention relates to nucleic acid regulatory sequences that modulate (e.g., promote, enhance, suppress, repress, or silence) expression of a nucleic acid of interest in a cell. In particular, the present invention relates to nucleic acid regulatory sequences referred to herein as the CNI-01142 regulatory sequence, the CNI-01080 regulatory sequence, the CNI-01104 regulatory sequence, the CNI-01120 regulatory sequence, the CNI-01125 regulatory sequence, the CNI-01131 regulatory sequence, and transcription-modulating sequences thereof. In a specific embodiment, the present invention relates to the CNI-01142 regulatory sequence, or the CNI-01080 regulatory sequence, or the CNI-01104 regulatory sequence, or the CNI-01120 regulatory sequence, or the CNI-01125 regulatory sequence, or the CNI-01131 regulatory sequence, or portions thereof, that promote or enhance transcription of nucleic acids of interest in cells, in particular cells of the nervous system, including, but not limited to, cells in the central nervous system (CNS), such as neurons and glia in the brain. The present invention also relates to vectors and cells engineered to contain such regulatory sequences. The present invention still further relates to methods of using the regulatory sequences of the invention to modulate expression of a nucleic acid of interest in cells, preferably cells of the nervous system.
2. BACKGROUND OF THE INVENTION
[0003] The molecular basis of nervous system-specific gene expression is relatively poorly understood, mainly because of the large number of diverse cell types in the mammalian nervous system. Although many of the genes expressed in the nervous system are "housekeeping" genes, expressed in a variety of tissues, a significant number of genes have been identified that are expressed exclusively in neurons or glia. What is known regarding the molecular basis of gene expression in the nervous system has been reviewed elsewhere (Twyman & Jones, J. Neurogenet. 10(2):67-101 (1995); Quinn, Prog. Neurobiol 50(4):373-79 (1995); Grant, in MOLECULAR BIOLOGY OF THE NEURON Davies & Morris, eds., Bios Scientific Publishers, Oxford (1996)).
[0004] Promoters of nervous system-specific genes have been used to direct the expression of heterologous genes to nervous system-derived cells in culture or in transgenic animals. For example, the use of nervous system-specific promoters to express heterologous genes in cell culture or in transgenic mice has allowed the creation of disease models (Sturchler-Pierrat & Sommer, Rev. Neurosci. 10(1):15-24 (1999); Brenner, Brain Pathol. 4(3):245-57 (1994)), permitted the characterization of individual gene function (Caroni, J. Neurosci. Meth. 71(1):3-9 (1997)), and defined the minimum promoter and enhancer sequences necessary for tissue-specific expression (Chin et al, J. Biol. Chem. 269(28):18507-18513 (1994); Liu et al, Brain Res. Mol Brain Res. 50(1-2):33-42 (1997); Whyte et al, Mol. Endocrinol. 9(4):467-477 (1995); Min et al, Brain Res. Mol. Brain Res. 27(2):281-9 (1994)). Nervous system-specific promoters have also been used to deliver therapeutic genes to the CNS to correct genetic deficiencies in vitro and in vivo (Kaplitt et al, Nature Genet. 8(2):148-54 (1994); Miyao et al, Jpn. J. Cancer Res. 88(7):678-86 (1997); Hayward, Chem. Senses 20(2):261-9 (1995)). In addition, in vitro binding assays, mutational analysis and sequence analysis have been used to identify and map the cis-acting regulatory regions and trans-acting factors that impart tissue-specificity and regulatory characteristics to the promoter.
[0005] Although the identification and characterization of promoters and enhancers functional in nervous system cells has given us fundamental insights into the regulation of gene expression in the nervous system, the picture is far from complete. Thus, there continues to be a need for the discovery of additional regulatory sequences that are functional in nervous system cells and especially a need for information serving to specifically identify and characterize them in terms of their DNA sequence.
3. SUMMARY OF THE INVENTION
[0006] The present invention relates to nucleic acid regulatory sequences that modulate (e.g., promote, enhance, suppress, repress, or silence) expression of a nucleic acid of interest in a cell. In particular, the invention relates to an isolated nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. In one embodiment, then, the invention relates to an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. In specific embodiments, the invention relates to an isolated nucleic acid regulatory sequence, wherein the isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. In another specific embodiment, the invention relates to an isolated nucleic acid regulatory sequence, wherein the isolated nucleic acid regulatory sequence is created by nuclease digestion of a nucleic acid molecule comprising SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. In another specific embodiment, the invention relates to an isolated nucleic acid molecule of claim 1, wherein the isolated nucleic acid molecule is operably linked to a nucleic acid molecule comprising a coding sequence. In another specific embodiment, the invention relates to an isolated nucleic acid regulatory sequence molecule of any one of the preceding, wherein the isolated nucleic acid regulatory sequence molecule is operably linked to a nucleic acid molecule comprising a coding sequence.
[0007] In another embodiment, the invention relates to an isolated nucleic acid molecule comprising the reverse complement of the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. In another specific embodiment, the invention relates to an isolated nucleic acid regulatory sequence molecule comprising the reverse complement of the nucleotide sequence of the nucleic acid regulatory sequence. In another specific embodiment, the invention relates to an isolated nucleic acid regulatory sequence molecule, wherein the transcription activating nucleotide sequence comprises at least about 50 contiguous nucleotides of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. In another specific embodiment, the invention relates to an isolated nucleic acid regulatory sequence molecule, wherein the transcription activating nucleotide sequence comprises at least about 100 contiguous nucleotides of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. In another specific embodiment, the invention relates to an isolated nucleic acid regulatory sequence molecule, wherein the transcription activating nucleotide sequence comprises at least about 200 contiguous nucleotides of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
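By way of illustration only, the reverse complement and the contiguous-nucleotide fragments referred to above can be computed as in the following minimal Python sketch. The function names and the eight-nucleotide example sequence are hypothetical and are not part of the disclosure; the example sequence is a placeholder and is not any of SEQ ID NO: 1 through SEQ ID NO: 6.

```python
from typing import List

# Translation table for Watson-Crick pairing (A<->T, C<->G), both cases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def contiguous_fragments(seq: str, length: int) -> List[str]:
    """Return every window of `length` contiguous nucleotides in `seq`."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# Hypothetical placeholder sequence, NOT a sequence of the invention.
example = "ATGCGTAC"
print(reverse_complement(example))
print(contiguous_fragments(example, 6))
```

The same windowing function could be applied with a length of 50, 100 or 200 to enumerate candidate fragments of a longer sequence; whether any such fragment is transcription activating would still have to be determined experimentally.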
[0008] The invention also provides nucleic acid sequences that hybridize to SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. Thus, in one embodiment, the invention relates to an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or the complement thereof.
[0009] The invention also provides for vectors containing a regulatory sequence of the invention. Thus, in one embodiment, the invention relates to a vector comprising the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. In specific embodiments, the invention relates to a vector comprising at least 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000 or 1250 nucleotides of the nucleotide sequence of SEQ ID NO: 1; at least 20, 30, 40, 50 or 60 nucleotides of the nucleotide sequence of SEQ ID NO: 2; at least 20, 30, 40, 50, 75, 100 or 125 nucleotides of the nucleotide sequence of SEQ ID NO: 3; at least 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1500, 2000 or 2500 nucleotides of the nucleotide sequence of SEQ ID NO: 4; at least 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000 or 1500 nucleotides of the nucleotide sequence of SEQ ID NO: 5; and/or at least 20, 30, 40, 50, 75, 100, 200, 300, 400, 500 or 750 nucleotides of the nucleotide sequence of SEQ ID NO: 6. In another specific embodiment, the invention relates to a vector comprising the reverse complement of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6, or a transcription activating nucleotide sequence thereof. In another specific embodiment, the invention relates to a vector containing an isolated nucleic acid regulatory sequence that hybridizes along its entire length to the sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. In another specific embodiment, the invention relates to a vector further comprising a coding sequence operably linked to the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6, or a subsequence thereof. In a more specific embodiment, the invention relates to a vector comprising a coding sequence operably linked to the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6, or a subsequence thereof, wherein the coding sequence is heterologous to the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. In another specific embodiment, any of the vectors described above further comprises a multiple cloning site (MCS), wherein when a nucleotide sequence is inserted into the MCS, the nucleotide sequence is operably linked to SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a subsequence thereof. In another more specific embodiment, the invention relates to a vector further comprising an internal ribosomal entry site (IRES). In another specific embodiment, the invention relates to a vector further comprising a multiple cloning site (MCS), wherein when a nucleotide sequence is inserted into the MCS, the nucleotide sequence is operably linked to SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. In another more specific embodiment, this vector comprises an IRES.
In another specific embodiment, the invention relates to a vector comprising SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a subsequence thereof, and an MCS, wherein when a coding sequence is present within the MCS, the coding sequence is operably linked to the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a subsequence thereof. In another specific embodiment, the invention relates to a vector comprising SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a subsequence thereof, and an MCS, wherein when a coding sequence is present within the MCS, the coding sequence is operably linked to the transcription activating sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a subsequence thereof. In another specific embodiment, any of the above vectors contains a coding sequence within the MCS. In another specific embodiment, in any of the above vectors that contains a coding sequence, said coding sequence is a reporter gene sequence. In a more specific embodiment, said reporter gene sequence encodes β-galactosidase, a fluorescent protein, chloramphenicol acetyltransferase, luciferase or an antigenic marker. In another specific embodiment, said coding sequence is a neuroprotective sequence. In another specific embodiment, the invention provides a vector comprising a promoter and an MCS operably linked in an upstream-to-downstream order, and the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a transcription activating nucleotide sequence thereof. In a more specific embodiment, this vector further comprises an internal ribosomal entry site (IRES).
[0010] Any of the vectors of the invention can be adapted for transfer to a eukaryotic host cell, including a human host cell. In a more specific embodiment, the eukaryotic host cell is a nervous system cell. In a more specific embodiment, the nervous system cell is a nervous system cell line, glial cell, astrocyte, oligodendrocyte, mesencephalic neuron, hypothalamic neuron or cortical neuron. In another embodiment, the vectors above are adapted for transfer to a prokaryotic host cell.
[0011] The invention further provides for host cells, or progeny thereof, containing the vectors above. In a more specific embodiment, said host cell is a eukaryotic cell, including a human host cell. In a more specific embodiment, said host cell is a nervous system cell. In another specific embodiment, said host cell is a prokaryotic cell. The invention also provides for kits containing one or more of the vectors and/or host cells of the invention in one or more containers, and, preferably, further containing instructions for use.
[0012] The present invention also relates to transgenic non-human animals engineered to contain a nucleic acid regulatory sequence of the invention. The nucleic acid regulatory sequence can be contained within an episome or, alternatively, the sequence can be integrated within the genome of the transgenic animal. Genomic insertion can be by either homologous or non-homologous recombination.
[0013] The invention further provides a method of expressing a coding sequence in a host cell in cell culture. In one embodiment, the method comprises culturing a host cell containing a vector of the invention that contains a coding sequence under conditions effective to allow expression of the coding sequence by said host cell. In another embodiment, the method comprises culturing a host cell of the invention wherein the nucleic acid regulatory sequence controls expression of a coding sequence endogenously present in the genome of said host cell, under conditions effective to allow expression of the coding sequence by said host cell.
[0014] The invention also provides a method of producing a peptide or polypeptide comprising maintaining a host cell of the invention that contains a coding sequence that encodes a peptide or polypeptide under conditions effective to allow expression of said coding sequence, and to allow translation of the resulting mRNA, such that a peptide or polypeptide is expressed. In one embodiment, the coding sequence is present as part of a vector of the invention. In another embodiment, the host cell has been engineered such that a nucleic acid regulatory sequence of the invention controls expression of a coding sequence endogenously present in the genome of said host cell, under conditions effective to allow expression of the coding sequence by said host cell. In a more specific embodiment, the vector is present in the genome of said host cell.
[0015] The invention also provides a method of identifying a modulator of a nucleic acid regulatory sequence of the invention, comprising: (a) contacting a host cell containing a nucleic acid regulatory sequence of the invention operably linked to a reporter gene sequence with a test compound; and (b) assaying expression of the reporter gene, such that, if a change in reporter gene expression relative to its expression in the absence of the test compound is detected, a modulator of the nucleic acid regulatory sequence is identified. In a particular embodiment, the host cell is a nervous system cell. As used herein, an "isolated nucleic acid" is a nucleic acid outside its normal biological context (i.e., outside an intact chromosome). The term is also not intended to refer to nucleotide sequences consisting of the sequences disclosed in GenBank accession numbers U62317 or AC005226 (see Section 6.2), or in GenBank accession number HS941F9 (see Section 6.3), or in GenBank accession number AC000079 (see Sections 6.4 and 6.5), or in GenBank accession number AL031186 (see Section 6.6), or in GenBank accession number AL022314 (see Section 6.7). Finally, the term "isolated nucleic acid" as used herein is also not intended to refer to any other full-length sequence disclosed in GenBank. Further, an isolated nucleic acid molecule of the invention contains no more than up to about 5,000 to 10,000 nucleotides of sequence that would endogenously flank SEQ ID NO: 1, or SEQ ID NO: 2, or SEQ ID NO: 3, or SEQ ID NO: 4, or SEQ ID NO: 5, or SEQ ID NO: 6. Additionally, the term "isolated nucleic acid" refers to either the single-stranded or double-stranded form of the nucleic acid molecule. Furthermore, the isolated nucleic acid molecule may consist of DNA or RNA, and may contain base analogs.
[0016] A "nucleic acid regulatory sequence" or "regulatory sequence" comprises a nucleotide sequence that, when operably linked to a nucleic acid of interest, modulates (e.g., activates (promotes, enhances) or inhibits (suppresses, represses, silences)) transcription of the nucleic acid of interest, particularly in a cell. A nucleotide sequence is considered "transcription activating" if, when operably linked to a nucleic acid whose expression may be monitored, and placed in a cell (e.g., a nervous system cell in cell culture) under conditions under which expression may take place, it promotes or enhances the expression of the nucleic acid detectably above the expression of the same nucleic acid in the absence of the nucleotide sequence operably linked thereto. A nucleic acid regulatory sequence "promotes" transcription of a nucleic acid of interest if, when operably linked to the nucleic acid of interest, it elicits a detectable level of expression of the nucleic acid of interest.
A nucleic acid regulatory sequence "enhances" transcription of a nucleic acid of interest if, when operably linked to the nucleic acid of interest, it increases the detectable level of expression relative to expression of the nucleic acid of interest in the absence of the nucleic acid regulatory sequence operably linked thereto. Generally, a nucleic acid regulatory sequence is considered to enhance transcription of the nucleic acid of interest when said nucleic acid is already expressed to some detectable level (e.g., is controlled by a promoter sequence) that is increased by the nucleic acid regulatory sequence.
[0017] A nucleotide sequence, e.g., a nucleic acid regulatory sequence, is "operably linked" to a nucleic acid of interest if said nucleotide sequence is present in a cis configuration relative to said nucleic acid of interest, i.e., the nucleotide sequence is attached via a covalent linkage (e.g., a phosphodiester linkage) to the same nucleic acid molecule that comprises the nucleic acid of interest. In one embodiment, a nucleic acid regulatory sequence can be adjacent to a nucleic acid of interest or to a promoter sequence that promotes expression of the nucleic acid of interest. The nucleic acid regulatory sequence can be placed upstream (i.e., 5') of the sequence whose expression is to be activated (promoted, enhanced) or inhibited. Additionally, in particular where the regulatory sequence has enhancer or silencer activity, the nucleic acid regulatory sequence can be placed within (e.g., in an intron) or downstream (i.e., 3') of the sequence whose expression is to be modulated.
[0018] A "coding sequence" is a nucleotide sequence that, when transcribed, yields an RNA molecule. In a preferred embodiment, a coding sequence comprises an open reading frame (ORF) that can be translated into a peptide or polypeptide sequence. In another preferred embodiment, a coding sequence comprises a nucleotide sequence that, when transcribed, yields a tRNA, rRNA, antisense RNA or enzymatically active RNA molecule. [0019] A first nucleic acid sequence is considered "heterologous" to a second nucleic acid sequence when the sequences are not endogenously present contiguous to each other, or when neither sequence is endogenously contained within the other. [0020] A "vector" is any nucleic acid that is self-replicating in at least one host cell, and is capable of containing the isolated nucleic acid for storage, replication, or propagation of the isolated nucleic acid, or for expression of a coding sequence operably linked to the isolated nucleic acid.
[0021] A "nervous system cell" can refer to a cell of the central nervous system (CNS), such as neurons, e.g., cortical, hippocampal, mesencephalic or medullary neurons, and glia in the brain, as well as to eye, spinal cord, and olfactory bulb cells, and to cells in the peripheral nervous system (PNS). [0022] A "peptide" refers to a macromolecule of from two to about nineteen amino acids covalently linked, e.g., covalently linked via peptide bonds. A "polypeptide" refers to a macromolecule of at least about twenty amino acids covalently linked, e.g., covalently linked via peptide bonds.
4. BRIEF DESCRIPTION OF THE DRAWINGS
[0023] FIG. 1 is a diagram of the plasmid pCOGENTl containing CNI-01142. Regulatory sequences of the present invention are indicated as "UNIQUE SEQUENCE CNI-01142". pCOGENTl contains a basal promoter (i.e., a TATA box) between the BamHI and ClaI sites. The negative control plasmid for expression experiments is pCOGENTl containing only a basal promoter and no regulatory sequence insert; the positive control is pCOGENTl(E), which is pCOGENTl containing the CMV enhancer region inserted into the MCS upstream of the basal promoter.
[0024] FIG. 2 depicts the DNA sequence of CNI-01142.
[0025] FIG. 3 depicts the UCSC linkage map of a region of human chromosome 22 containing the nucleotide sequence of CNI-01142. The position of the sequence of CNI-01142 in the map is from base position 47478467 to 47479777. The position in the UCSC linkage map (University of California-Santa Cruz, October 7, 2000 freeze) corresponds to positions 34371702 to 34373014 in the nucleotide sequence of human chromosome 22 as reported in CHR22_19_05_2000 version 2 (The Sanger Centre, Cambridge, England). Base Position: Chromosomal coordinates, numbered from the telomere of the short arm of human chromosome 22. Chromosome Band: Light and dark blocks show traditional cytological bands seen with Giemsa staining. STS Markers: Location of markers from genetic, RH, YAC, and FISH maps. Mouse Synteny: Syntenic chromosomal region in mouse if known. GC Percent: Darker shades of gray correspond to higher % GC figured for a window of 20 kbp. FPC Contigs: Large dark blocks correspond to fingerprint map contigs at the Washington University (Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri). Assembly: An individual box shows the assembly extracted from a single clone fragment. Gaps in the path appear as white space between the boxes at an adequate zoom level, with a thin horizontal line connecting boxes bridged over a gap. Gap: Shows locations of gaps in the assembly with black boxes or vertical lines. Small gaps may have artefactually coalesced in the graphic. Gaps spanned by mRNA and paired reads have a white horizontal line through the black box to indicate bridging. Coverage: In dense display, the level of gray gives level of coverage: White/Clear: no coverage (gap); Light Gray: predraft (less than 4x shotgun); Medium Gray: draft (at least 4x shotgun); Dark Gray: multiple draft, covered by more than one draft clone; Finished: covered by a finished clone. YourSeq: Position of the DNA sequence of CNI-01142 relative to other sequences or features in the linkage map. Known Genes (from full length mRNAs): Known protein coding genes from LocusLink. Exons are represented by black boxes; thin horizontal lines represent introns. In the full view, the arrows on the introns indicate direction of transcription. Affymetrix Gene Predictions (excluding known genes): Gene predictions based on ESTs, mRNA, codon usage, and splice site statistics by the proprietary program Genie, supplied by David Kulp at Affymetrix. Ensembl Genes: Gene predictions from Ensembl, an automatic DNA sequence assembly and tracking system developed by the European Bioinformatics Institute and the Sanger Center (Cambridgeshire, Wellcome Trust Genome Campus, UK).
Arrows on the introns, when present, indicate direction of transcription. Fgenesh++ Gene Predictions: Fgenesh++ predictions based on Softberry's gene finding software (Salamov & Solovyev, Genome Research 10(5):516-522 (2000)). Full mRNAs: Shows aligning regions as black boxes or vertical lines connected by horizontal lines for gaps. In full display, arrows on the introns indicate the direction of transcription. Human ESTs That Have Been Spliced: Shows spliced human ESTs. This track may suggest alternative splicings. Human ESTs: In dense mode the level of gray in this track represents the number of ESTs that align at that region. RNA Genes: Indicated location of non-protein coding RNA genes and pseudogenes, including tRNAs, rRNAs, SRPs, C/D box methylation guide snoRNAs, U6 etc. RNA-like, HBII, and hVS-like elements.
[0026] FIG. 4 shows the locations of transcription-factor binding motifs in CNI-01142. The names of the factors that bind the motifs are displayed above the diagram. The nucleotide positions of the motifs, as they occur sequentially 5' to 3' in the nucleotide sequence of CNI-01142, are indicated to the right of the transcription factor name.
[0027] FIG. 5 depicts quantitative measurement of cell transfection in coronal brain slices prepared from three regions along the rostral-caudal axis. Cells were transfected with pCOGENTl containing CNI-01142, negative control DNA, or positive control DNA. The number of cells expressing the reporter gene in each slice is determined visually.
[0028] FIG. 6 shows images of coronal brain slices transfected with (a) pCOGENTl containing CNI-01142; (b) negative control plasmid, pCOGENTl; or (c) positive control DNA, pCOGENTl(E).
[0029] FIG. 7 is a diagram of the plasmid pCOGENTl containing CNI-01080. Regulatory sequences of the present invention are indicated as "UNIQUE SEQUENCE CNI-01080". pCOGENTl contains a basal promoter (i.e., a TATA box) between the
BamHI and ClaI sites. The negative control plasmid for expression experiments is pCOGENTl containing only a basal promoter and no regulatory sequence insert. The positive control is pCOGENTl(E), which is pCOGENTl containing the CMV enhancer region inserted into the MCS upstream of the basal promoter.
[0030] FIG. 8 depicts the DNA sequence of CNI-01080.
[0031] FIG. 9 depicts the UCSC linkage map of a region of human chromosome 22 containing the nucleotide sequence of CNI-01080. The position of the sequence of CNI-01080 in the map is the complement of base positions 42509227 to 42509291. The position in the UCSC linkage map (University of California-Santa Cruz) corresponds to the complement of positions 29402464 to 29402528 in the nucleotide sequence of human chromosome 22 as reported in CHR22_19_05_2000 version 2 (The Sanger Centre, Cambridge, England). Base Position: Chromosomal coordinates, numbered from the telomere of the short arm of human chromosome 22. Chromosome Band: Light and dark blocks show traditional cytological bands seen with Giemsa staining. STS Markers: Location of markers from genetic, RH, YAC, and FISH maps. Mouse Synteny: Syntenic chromosomal region in mouse if known. GC Percent: Darker shades of gray correspond to higher % GC figured for a window of 20 kbp. FPC Contigs: Large dark blocks correspond to fingerprint map contigs at the Washington University (Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri). Assembly: An individual box shows the assembly extracted from a single clone fragment. Gaps in the path appear as white space between the boxes at an adequate zoom level, with a thin horizontal line connecting boxes bridged over a gap. Gap: Shows locations of gaps in the assembly with black boxes or vertical lines. Small gaps may have artefactually coalesced in the graphic. Gaps spanned by mRNA and paired reads have a white horizontal line through the black box to indicate bridging.
Coverage: In dense display, the level of gray gives level of coverage: White/Clear: no coverage (gap); Light Gray: predraft (less than 4x shotgun); Medium Gray: draft (at least 4x shotgun); Dark Gray: multiple draft, covered by more than one draft clone; Finished: covered by a finished clone. YourSeq: Position of the DNA sequence of CNI-01080 relative to other sequences or features in the linkage map. Known Genes (from full length mRNAs): Known protein coding genes from LocusLink. Exons are represented by black boxes; thin horizontal lines represent introns. In the full view, the arrows on the introns indicate direction of transcription.
Affymetrix Gene Predictions (excluding known genes): Gene predictions based on ESTs, mRNA, codon usage, and splice site statistics by the proprietary program Genie, supplied by David Kulp at Affymetrix. Ensembl Genes: Gene predictions from Ensembl, an automatic DNA sequence assembly and tracking system developed by the European Bioinformatics Institute and the Sanger Center (Cambridgeshire, Wellcome Trust Genome Campus, UK). Arrows on the introns, when present, indicate direction of transcription. Fgenesh++ Gene Predictions: Fgenesh++ predictions based on Softberry's gene finding software (Salamov & Solovyev, Genome Research 10(5):516-522 (2000)). Full mRNAs: Shows aligning regions as black boxes or vertical lines connected by horizontal lines for gaps. In full display, arrows on the introns indicate the direction of transcription. Human ESTs That Have Been Spliced: Shows spliced human ESTs. This track may suggest alternative splicings. Human ESTs: In dense mode the level of gray in this track represents the number of ESTs that align at that region. RNA Genes: Indicated location of non-protein coding RNA genes and pseudogenes, including tRNAs, rRNAs, SRPs, C/D box methylation guide snoRNAs, U6 etc. RNA-like, HBII, and hVS-like elements.
[0032] FIG. 10 shows the locations of transcription-factor binding motifs in CNI-01080. The names of the factors that bind the motifs are displayed above the diagram. The nucleotide positions of the motifs, as they occur sequentially 5' to 3' in the nucleotide sequence of CNI-01080, are indicated to the right of the transcription factor name.
[0033] FIG. 11 depicts quantitative measurement of cell transfection in coronal brain slices prepared from three regions along the rostral-caudal axis. Cells were transfected with pCOGENTl containing CNI-01080, negative control DNA, or positive control DNA. The number of cells expressing the reporter gene in each slice is determined visually.
[0034] FIG. 12 shows images of coronal brain slices transfected with (a) pCOGENTl containing CNI-01080; (b) negative control plasmid, pCOGENTl; or (c) positive control DNA, pCOGENTl(E).
[0035] FIG. 13 is a diagram of the plasmid pCOGENTl containing CNI-01104. Regulatory sequences of the present invention are indicated as "UNIQUE SEQUENCE CNI-01104". pCOGENTl contains a basal promoter (i.e., a TATA box) between the BamHI and ClaI sites. The negative control plasmid for expression experiments is pCOGENTl containing only a basal promoter and no regulatory sequence insert. The positive control is pCOGENTl(E), which is pCOGENTl containing the CMV enhancer region inserted into the MCS upstream of the basal promoter.
[0036] FIG. 14 depicts the DNA sequence of CNI-01104.
[0037] FIG. 15 depicts the UCSC linkage map of a region of human chromosome 22 containing the nucleotide sequence of CNI-01104. The position of the sequence of CNI- 01104 in the map is from base position 16340620 to 16340766. The position in the UCSC linkage map (University of California-Santa Cruz, October 7, 2000 freeze) corresponds to positions 3266028 to 3266173 in the nucleotide sequence of human chromosome 22 as reported in CHR22_19_05_2000 version 2 (The Sanger Centre, Cambridge, England). Base Position: Chromosomal coordinates, numbered from the telomere of the short arm of human chromosome 22. Chromosome Band: Light and dark blocks show traditional cytological bands seen with Giemsa staining. STS Markers: Location of markers from genetic, RH, YAC, and FISH maps. Mouse Synteny: Syntenic chromosomal region in mouse if known. GC Percent: Darker shades of gray correspond to higher % GC figured for a window of 20 kbp. FPC Contigs: Large dark blocks correspond to fingerprint map contigs at the Washington University (Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri). Assembly: An individual box shows the assembly extracted from a single clone fragment. Gaps in the path appear as white space between the boxes at an adequate zoom level, with a thin horizontal line connecting boxes bridged over a gap. Gap: Shows locations of gaps in the assembly with black boxes or vertical lines. Small gaps may have artefactually coalesced in the graphic. Gaps spanned by mRNA and paired reads have a white horizontal line through the black box to indicate bridging. Coverage: In dense display, the level of gray gives level of coverage: White/Clear: no coverage (gap); Light Gray: predraft (less than 4x shotgun); Medium Gray: draft (at least 4x shotgun); Dark Gray: multiple draft, covered by more than one draft clone; Finished: covered by a finished clone. 
YourSeq: Position of the DNA sequence of CNI-01104 relative to other sequences or features in the linkage map. Known Genes (from full length mRNAs): Known protein coding genes from LocusLink. Exons are represented by black boxes; thin horizontal lines represent introns. In the full view, the arrows on the introns indicate direction of transcription.
Affymetrix Gene Predictions (excluding known genes): Gene predictions based on ESTs, mRNA, codon usage, and splice site statistics by the proprietary program Genie, supplied by David Kulp at Affymetrix. Ensembl Genes: Gene predictions from Ensembl, an automatic DNA sequence assembly and tracking system developed by the European Bioinformatics Institute and the Sanger Center (Cambridgeshire, Wellcome Trust Genome Campus, UK). Arrows on the introns, when present, indicate direction of transcription. Fgenesh++ Gene Predictions: Fgenesh++ predictions based on Softberry's gene finding software (Salamov & Solovyev, Genome Research 10(5):516-522 (2000)). Full mRNAs: Shows aligning regions as black boxes or vertical lines connected by horizontal lines for gaps. In full display, arrows on the introns indicate the direction of transcription. Human ESTs That Have Been Spliced: Shows spliced human ESTs. This track may suggest alternative splicings. Human ESTs: In dense mode the level of gray in this track represents the number of ESTs that align at that region. RNA Genes: Indicated location of non-protein coding RNA genes and pseudogenes, including tRNAs, rRNAs, SRPs, C/D box methylation guide snoRNAs, U6 etc. RNA-like, HBII, and hVS-like elements.
[0038] FIG. 16 shows the locations of transcription-factor binding motifs in CNI-01104. The names of the factors that bind the motifs are displayed above the diagram. The nucleotide positions of the motifs, as they occur sequentially 5' to 3' in the nucleotide sequence of CNI-01104, are indicated to the right of the transcription factor name.
[0039] FIG. 17 depicts quantitative measurement of cell transfection in coronal brain slices prepared from three regions along the rostral-caudal axis. Cells were transfected with the reporter-gene plasmid pCOGENTl containing CNI-01104, negative control DNA, or positive control.
Cells were co-transfected with a plasmid causing high-level expression of cyan fluorescent protein (CFP), which served as an internal control for transfection. The number of cells expressing the reporter gene (GFP) in each slice is determined visually, and is expressed as a percentage of the CFP-expressing cells ("% GFP-expressing cells").
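The normalization described for FIG. 17 can be illustrated with a minimal sketch. The function name and the cell counts below are hypothetical and are not taken from the specification; the sketch only shows how a "% GFP-expressing cells" value would follow from a GFP count and a CFP count for one slice.

```python
def percent_gfp_expressing(gfp_cells: int, cfp_cells: int) -> float:
    """Express the GFP-positive cell count as a percentage of CFP-positive cells.

    CFP-expressing cells serve as the internal transfection control, so they
    form the denominator. Both counts are assumed to come from the same slice.
    """
    if cfp_cells <= 0:
        raise ValueError("at least one CFP-expressing cell is needed to normalize")
    if gfp_cells < 0:
        raise ValueError("cell counts cannot be negative")
    return 100.0 * gfp_cells / cfp_cells


# Hypothetical counts for a single slice (illustration only).
example = percent_gfp_expressing(gfp_cells=30, cfp_cells=120)
```

Normalizing to the co-transfected CFP count means that slices with different overall transfection efficiency can still be compared on the same scale.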
[0040] FIG. 18 shows images of coronal brain slices transfected with (a) pCOGENT1 containing CNI-01104; (b) negative control plasmid, pCOGENT1; or (c) positive control DNA, pCOGENT1(E).
[0041] FIG. 19 is a diagram of the plasmid pCOGENT1 containing CNI-01120. Regulatory sequences of the present invention are indicated as "UNIQUE SEQUENCE CNI-01120". pCOGENT1 contains a basal promoter (i.e., a TATA box) between the BamHI and ClaI sites. The negative control plasmid for expression experiments is pCOGENT1 containing only a basal promoter and no regulatory sequence insert. The positive control is pCOGENT1(E), which is pCOGENT1 containing the CMV enhancer region inserted into the MCS upstream of the basal promoter.
[0042] FIG. 20 depicts the DNA sequence of CNI-01120.
[0043] FIG. 21 depicts the UCSC linkage map of a region of human chromosome 22 containing the nucleotide sequence of CNI-01120. The position of the sequence of CNI-01120 in the map is from base position 16340761 to 16343529. The position in the UCSC linkage map (University of California-Santa Cruz, October 7, 2000 freeze) corresponds to positions 3266168 to 3268936 in the nucleotide sequence of human chromosome 22 as reported in CHR22_19_05_2000 version 2 (The Sanger Centre, Cambridge, England). Base Position: Chromosomal coordinates, numbered from the telomere of the short arm of human chromosome 22. Chromosome Band: Light and dark blocks show traditional cytological bands seen with Giemsa staining. STS Markers: Location of markers from genetic, RH, YAC, and FISH maps. Mouse Synteny: Syntenic chromosomal region in mouse if known. GC Percent: Darker shades of gray correspond to higher % GC figured for a window of 20 kbp. FPC Contigs: Large dark blocks correspond to fingerprint map contigs at the Washington University (Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri). Assembly: An individual box shows the assembly extracted from a single clone fragment. Gaps in the path appear as white space between the boxes at an adequate zoom level, with a thin horizontal line connecting boxes bridged over a gap. Gap: Shows locations of gaps in the assembly with black boxes or vertical lines. Small gaps may have artefactually coalesced in the graphic. Gaps spanned by mRNA and paired reads have a white horizontal line through the black box to indicate bridging. Coverage: In dense display, the level of gray gives level of coverage: White/Clear: no coverage (gap); Light Gray: predraft (less than 4x shotgun); Medium Gray: draft (at least 4x shotgun); Dark Gray: multiple draft, covered by more than one draft clone; Finished: covered by a finished clone.
YourSeq: Position of the DNA sequence of CNI-01120 relative to other sequences or features in the linkage map. Known Genes (from full length mRNAs): Known protein coding genes from LocusLink. Exons are represented by black boxes; thin horizontal lines represent introns. In the full view, the arrows on the introns indicate direction of transcription. Affymetrix Gene Predictions (excluding known genes): Gene predictions based on ESTs, mRNA, codon usage, and splice site statistics by the proprietary program Genie, supplied by David Kulp at Affymetrix. Ensembl Genes: Gene predictions from Ensembl, an automatic DNA sequence assembly and tracking system developed by the European Bioinformatics Institute and the Sanger Center (Cambridgeshire, Wellcome Trust Genome Campus, UK). Arrows on the introns, when present, indicate direction of transcription. Fgenesh++ Gene Predictions: Fgenesh++ predictions based on Softberry's gene finding software (Salamov & Solovyev, Genome Research 10(5):516-522 (2000)). Full mRNAs: Shows aligning regions as black boxes or vertical lines connected by horizontal lines for gaps. In full display, arrows on the introns indicate the direction of transcription. Human ESTs That Have Been Spliced: Shows spliced human ESTs. This track may suggest alternative splicings. Human ESTs: In dense mode the level of gray in this track represents the number of ESTs that align at that region. RNA Genes: Indicated location of non-protein coding RNA genes and pseudogenes, including tRNAs, rRNAs, SRPs, C/D box methylation guide snoRNAs, U6 etc. RNA-like, HBII, and hVS-like elements.
[0044] FIG. 22 shows the locations of transcription-factor binding motifs in CNI-01120. The names of the factors that bind the motifs are displayed above the diagram. The nucleotide position of the motifs as they occur sequentially 5' to 3' in the nucleotide sequence of CNI-01120 are indicated to the right of the transcription factor name.
[0045] FIG. 23 depicts quantitative measurement of cell transfection in coronal brain slices prepared from three regions along the rostral-caudal axis. Cells were transfected with pCOGENT1 containing CNI-01120, negative control DNA, or positive control DNA. Cells were co-transfected with a plasmid causing high-level expression of cyan fluorescent protein (CFP), which served as an internal control for transfection. The number of cells expressing the reporter gene (GFP) in each slice is determined visually, and is expressed as a percentage of the CFP-expressing cells ("% GFP-expressing cells").
[0046] FIG. 24 shows images of coronal brain slices transfected with (a) pCOGENT1 containing CNI-01120; (b) negative control plasmid, pCOGENT1; or (c) positive control DNA, pCOGENT1(E).
[0047] FIG. 25 is a diagram of the plasmid pCOGENT1 containing CNI-01125. Regulatory sequences of the present invention are indicated as "UNIQUE SEQUENCE CNI-01125". pCOGENT1 contains a basal promoter (i.e., a TATA box) between the BamHI and ClaI sites. The negative control plasmid for expression experiments is pCOGENT1 containing only a basal promoter and no regulatory sequence insert. The positive control is pCOGENT1(E), which is pCOGENT1 containing the CMV enhancer region inserted into the MCS upstream of the basal promoter.
[0048] FIG. 26 depicts the DNA sequence of CNI-01125.
[0049] FIG. 27 depicts the UCSC linkage map of a region of human chromosome 22 containing the nucleotide sequence of CNI-01125. The position of the sequence of CNI-01125 in the map is from base position 26309737 to 26311433. The position in the UCSC linkage map (University of California-Santa Cruz, October 7, 2000 freeze) corresponds to positions 13202009 to 13203706 in the nucleotide sequence of human chromosome 22 as reported in CHR22_19_05_2000 version 2 (The Sanger Centre, Cambridge, England). Base Position: Chromosomal coordinates, numbered from the telomere of the short arm of human chromosome 22. Chromosome Band: Light and dark blocks show traditional cytological bands seen with Giemsa staining. STS Markers: Location of markers from genetic, RH, YAC, and FISH maps. Mouse Synteny: Syntenic chromosomal region in mouse if known. GC Percent: Darker shades of gray correspond to higher % GC figured for a window of 20 kbp. FPC Contigs: Large dark blocks correspond to fingerprint map contigs at the Washington University (Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri). Assembly: An individual box shows the assembly extracted from a single clone fragment.
Gaps in the path appear as white space between the boxes at an adequate zoom level, with a thin horizontal line connecting boxes bridged over a gap. Gap: Shows locations of gaps in the assembly with black boxes or vertical lines. Small gaps may have artefactually coalesced in the graphic. Gaps spanned by mRNA and paired reads have a white horizontal line through the black box to indicate bridging. Coverage: In dense display, the level of gray gives level of coverage: White/Clear: no coverage (gap); Light Gray: predraft (less than 4x shotgun); Medium Gray: draft (at least 4x shotgun); Dark Gray: multiple draft, covered by more than one draft clone; Finished: covered by a finished clone. YourSeq:
Position of the DNA sequence of CNI-01125 relative to other sequences or features in the linkage map. Known Genes (from full length mRNAs): Known protein coding genes from LocusLink. Exons are represented by black boxes; thin horizontal lines represent introns. In the full view, the arrows on the introns indicate direction of transcription. Affymetrix Gene Predictions (excluding known genes): Gene predictions based on ESTs, mRNA, codon usage, and splice site statistics by the proprietary program Genie, supplied by David Kulp at Affymetrix. Sanger Chromosome 22 Annotation: Known and predicted genes on human chromosome 22 based on information supplied by the Sanger Center (Cambridge, UK). Ensembl Genes: Gene predictions from Ensembl, an automatic DNA sequence assembly and tracking system developed by the European Bioinformatics Institute and the Sanger Center (Cambridgeshire, Wellcome Trust Genome Campus, UK). Arrows on the introns, when present, indicate direction of transcription. Fgenesh++ Gene Predictions: Fgenesh++ predictions based on Softberry's gene finding software (Salamov & Solovyev, Genome Research 10(5):516-522 (2000)). Full mRNAs: Shows aligning regions as black boxes or vertical lines connected by horizontal lines for gaps. In full display, arrows on the introns indicate the direction of transcription. Human ESTs That Have Been Spliced: Shows spliced human ESTs. This track may suggest alternative splicings. Human ESTs: In dense mode the level of gray in this track represents the number of ESTs that align at that region. RNA Genes: Indicated location of non-protein coding RNA genes and pseudogenes, including tRNAs, rRNAs, SRPs, C/D box methylation guide snoRNAs, U6 etc. RNA-like, HBII, and hVS-like elements.
[0050] FIG. 28 shows the locations of transcription-factor binding motifs in CNI-01125. The names of the factors that bind the motifs are displayed above the diagram. The nucleotide position of the motifs as they occur sequentially 5' to 3' in the nucleotide sequence of CNI-01125 are indicated to the right of the transcription factor name.
[0051] FIG. 29 depicts quantitative measurement of cell transfection in coronal brain slices prepared from three regions along the rostral-caudal axis. Cells were transfected with pCOGENT1 containing CNI-01125, negative control DNA, or positive control DNA. The number of cells expressing the reporter gene in each slice is determined visually.
[0052] FIG. 30 shows images of coronal brain slices transfected with (a) pCOGENT1 containing CNI-01125; (b) negative control plasmid, pCOGENT1; or (c) positive control
DNA, pCOGENT1(E).
[0053] FIG. 31 is a diagram of the plasmid pCOGENT1 containing CNI-01131. Regulatory sequences of the present invention are indicated as "UNIQUE SEQUENCE CNI-01131". pCOGENT1 contains a basal promoter (i.e., a TATA box) between the BamHI and ClaI sites. The negative control plasmid for expression experiments is pCOGENT1 containing only a basal promoter and no regulatory sequence insert. The positive control is pCOGENT1(E), which is pCOGENT1 containing the CMV enhancer region inserted into the MCS upstream of the basal promoter.
[0054] FIG. 32 depicts the DNA sequence of CNI-01131.
[0055] FIG. 33 depicts the UCSC linkage map of a region of human chromosome 22 containing the nucleotide sequence of CNI-01131. The position of the sequence of CNI-01131 in the map is from base position 34059978 to 34060733. The position in the UCSC linkage map (University of California-Santa Cruz, October 7, 2000 freeze) corresponds to positions 20952250 to 20953005 in the nucleotide sequence of human chromosome 22 as reported in CHR22_19_05_2000 version 2 (The Sanger Centre, Cambridge, England). Base Position: Chromosomal coordinates, numbered from the telomere of the short arm of human chromosome 22. Chromosome Band: Light and dark blocks show traditional cytological bands seen with Giemsa staining. STS Markers: Location of markers from genetic, RH, YAC, and FISH maps. Mouse Synteny: Syntenic chromosomal region in mouse if known. GC Percent: Darker shades of gray correspond to higher % GC figured for a window of 20 kbp. FPC Contigs: Large dark blocks correspond to fingerprint map contigs at the Washington University (Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri). Assembly: An individual box shows the assembly extracted from a single clone fragment. Gaps in the path appear as white space between the boxes at an adequate zoom level, with a thin horizontal line connecting boxes bridged over a gap.
Gap: Shows locations of gaps in the assembly with black boxes or vertical lines. Small gaps may have artefactually coalesced in the graphic. Gaps spanned by mRNA and paired reads have a white horizontal line through the black box to indicate bridging. Coverage: In dense display, the level of gray gives level of coverage: White/Clear: no coverage (gap); Light Gray: predraft (less than 4x shotgun); Medium Gray: draft (at least 4x shotgun); Dark Gray: multiple draft, covered by more than one draft clone; Finished: covered by a finished clone. YourSeq:
Position of the DNA sequence of CNI-01131 relative to other sequences or features in the linkage map. Known Genes (from full length mRNAs): Known protein coding genes from LocusLink. Exons are represented by black boxes; thin horizontal lines represent introns. In the full view, the arrows on the introns indicate direction of transcription. Affymetrix Gene Predictions (excluding known genes): Gene predictions based on ESTs, mRNA, codon usage, and splice site statistics by the proprietary program Genie, supplied by David Kulp at Affymetrix. Sanger Chromosome 22 Annotation: Known and predicted genes on human chromosome 22 based on information supplied by the Sanger Center (Cambridge, UK). Ensembl Genes: Gene predictions from Ensembl, an automatic DNA sequence assembly and tracking system developed by the European Bioinformatics Institute and the Sanger Center (Cambridgeshire, Wellcome Trust Genome Campus, UK). Arrows on the introns, when present, indicate direction of transcription. Fgenesh++ Gene Predictions: Fgenesh++ predictions based on Softberry's gene finding software (Salamov & Solovyev, Genome Research 10(5):516-522 (2000)). Full mRNAs: Shows aligning regions as black boxes or vertical lines connected by horizontal lines for gaps. In full display, arrows on the introns indicate the direction of transcription. Human ESTs That Have Been Spliced: Shows spliced human ESTs. This track may suggest alternative splicings. Human ESTs: In dense mode the level of gray in this track represents the number of ESTs that align at that region. RNA Genes: Indicated location of non-protein coding RNA genes and pseudogenes, including tRNAs, rRNAs, SRPs, C/D box methylation guide snoRNAs, U6 etc. RNA-like, HBII, and hVS-like elements.
[0056] FIG. 34 shows the locations of transcription-factor binding motifs in CNI-01131. The names of the factors that bind the motifs are displayed above the diagram. The nucleotide position of the motifs as they occur sequentially 5' to 3' in the nucleotide sequence of CNI-01131 are indicated to the right of the transcription factor name.
[0057] FIG. 35 depicts quantitative measurement of cell transfection in coronal brain slices prepared from three regions along the rostral-caudal axis. Cells were transfected with pCOGENT1 containing CNI-01131, negative control DNA, or positive control DNA. The number of cells expressing the reporter gene in each slice is determined visually.
[0058] FIG. 36 shows images of coronal brain slices transfected with (a) pCOGENT1 containing CNI-01131; (b) negative control plasmid, pCOGENT1; or (c) positive control
DNA, pCOGENT1(E).
5. DETAILED DESCRIPTION OF THE INVENTION
5.1. THE REGULATORY SEQUENCES
[0059] Using plasmid pCOGENT1 (FIG. 1, FIG. 7, FIG. 13, FIG. 19, FIG. 25, FIG. 31), sequences have been identified that modulate the expression of a reporter sequence in nervous system cells. The present invention therefore relates to the following nucleic acid molecules that represent nucleic acid regulatory sequence molecules of the invention:
TABLE 1 FULL-LENGTH REGULATORY NUCLEIC ACID MOLECULES
Figure imgf000024_0001
[0060] In particular, SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 each modulate, promote or enhance gene expression in the nervous system.
[0061] As depicted in FIG. 3 (UCSC linkage map of a region of human chromosome 22), the sequence of CNI-01142 is located near a known gene, arylsulfatase A (ARSA) (see Section 6.2). As depicted in FIG. 9 (UCSC linkage map of a region of human chromosome 22), the sequence of CNI-01080 is located within an intron of the known gene, FBLN1, which encodes the extracellular matrix protein, fibulin-1 (see Section 6.3). As depicted in FIG. 15 (UCSC linkage map of a region of human chromosome 22), the sequence of CNI-01104 lies within an intron of a known gene, HIRA, which encodes a putative transcription factor, TUPLE1 (see Section 6.4). As depicted in FIG. 21 (UCSC linkage map of a region of human chromosome 22), the sequence of CNI-01120 is also located within an intron of the known gene HIRA (see Section 6.5). As depicted in FIG. 27 (UCSC linkage map of a region of human chromosome 22), the sequence of CNI-01125 overlaps intronic and exonic sequences of two known genes, EWSR1 (Ewing sarcoma breakpoint region 1) and C22ORF3 (see Section 6.6). As depicted in FIG. 33 (UCSC linkage map of a region of human chromosome 22), the sequence of CNI-01131 encompasses an exon and part of the adjacent intron of the known gene, IL2RB (IL2 receptor, beta chain) (see Section 6.7).
[0062] The present invention also relates to isolated nucleic acid regulatory sequences comprising a transcription activating nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. Such nucleic acid regulatory sequences may be restriction fragments of the full-length sequences disclosed. For example, in one embodiment, the isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 1. Thus, in a specific embodiment, a nucleic acid regulatory sequence of the invention is the PstI-PstI fragment represented by nucleotides 5-1310 of SEQ ID NO: 1.
In another specific embodiment, a nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 1-494 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 494-1306 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the AccB1I-AccB1I fragment represented by nucleotides 116-601 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the AccB1I-AccB1I fragment represented by nucleotides 601-1001 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BbuI-BbuI fragment represented by nucleotides 309-736 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BbuI-BbuI fragment represented by nucleotides 736-1240 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the EcoO109I-EcoO109I fragment represented by nucleotides 125-544 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the EcoO109I-EcoO109I fragment represented by nucleotides 544-907 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the AspI-EarI fragment represented by nucleotides 62-265 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the EarI-GsuI fragment represented by nucleotides 265-859 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the ApoI-NspI fragment represented by nucleotides 416-736 of SEQ ID NO: 1. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the AlwNI-SfcI fragment represented by nucleotides 910-1306 of SEQ ID NO: 1.
[0063] In another embodiment, the isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 2. For example, in a specific embodiment, a nucleic acid regulatory sequence of the invention is the BamHI-BamHI fragment represented by nucleotides 1-60 of SEQ ID NO: 2. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the XhoII-XhoII fragment represented by nucleotides 1-60 of SEQ ID NO: 2. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BamHI-AflIII fragment represented by nucleotides 1-44 of SEQ ID NO: 2. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the AflIII-BamHI fragment represented by nucleotides 44-60 of SEQ ID NO: 2. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BamHI-NspI fragment represented by nucleotides 1-48 of SEQ ID NO: 2. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the NspI-BamHI fragment represented by nucleotides 48-60 of SEQ ID NO: 2.
[0064] In another embodiment, the isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 3. For example, in a specific embodiment, a nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 1-141 of SEQ ID NO: 3. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the PstI-SfcI fragment represented by nucleotides 5-141 of SEQ ID NO: 3. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the PstI-PstI fragment represented by nucleotides 5-145 of SEQ ID NO: 3. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the SfcI-EcoO109I fragment represented by nucleotides 1-24 of SEQ ID NO: 3. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the EcoO109I-SfcI fragment represented by nucleotides 24-141 of SEQ ID NO: 3. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the SfcI-ErhI fragment represented by nucleotides 1-28 of SEQ ID NO: 3. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the
ErhI-SfcI fragment represented by nucleotides 28-141 of SEQ ID NO: 3. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the PstI-EcoO109I fragment represented by nucleotides 5-24 of SEQ ID NO: 3. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the EcoO109I-PstI fragment represented by nucleotides 24-145 of SEQ ID NO: 3. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the PstI-ErhI fragment represented by nucleotides 5-28 of SEQ ID NO: 3. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the ErhI-PstI fragment represented by nucleotides 28-145 of SEQ ID NO: 3.
[0065] In another embodiment, the isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 4. For example, in a specific embodiment, a nucleic acid regulatory sequence of the invention is the BpmI-BpmI fragment represented by nucleotides 177-940 of SEQ ID NO: 4. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BpmI-BpmI fragment represented by nucleotides 940-1753 of SEQ ID NO: 4. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BsgI-BsgI fragment represented by nucleotides 563-1063 of SEQ ID NO: 4. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BsgI-BsgI fragment represented by nucleotides 1063-1518 of SEQ ID NO: 4. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BsgI-BsgI fragment represented by nucleotides 1518-1897 of SEQ ID NO: 4. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BspMI-BspMI fragment represented by nucleotides 8-1691 of SEQ ID NO: 4.
In another specific embodiment, a nucleic acid regulatory sequence of the invention is the Eco88I-Eco88I fragment represented by nucleotides 1963-2606 of SEQ ID NO: 4. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the EcoO109I-EcoO109I fragment represented by nucleotides 161-383 of SEQ ID NO: 4. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BanI-BanI fragment represented by nucleotides 331-649 of SEQ ID NO: 4. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BspΑJ-EcdΑJ fragment represented by nucleotides 739-1247 of SEQ ID NO: 4. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the AspHI-Bsp1407I fragment represented by nucleotides 1282-2285 of SEQ ID NO: 4. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BseRI-PstI fragment represented by nucleotides 32-2608 of SEQ ID NO: 4.
[0066] In another embodiment, the isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 5. For example, in a specific embodiment, a nucleic acid regulatory sequence of the invention is the SmaI-BspMI fragment represented by nucleotides 71-1695 of SEQ ID NO: 5. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the SmaI-BspMI fragment represented by nucleotides 498-1695 of SEQ ID NO: 5. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the Eco52I-BamHI fragment represented by nucleotides 112-746 of SEQ ID NO: 5. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BamHI-SalI fragment represented by nucleotides 746-1552 of SEQ ID NO: 5. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the ErhI-ErhI fragment represented by nucleotides 852-1448 of SEQ ID NO: 5. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BsaI-BsaI fragment represented by nucleotides 335-1442 of SEQ ID NO: 5. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the KasI-KasI fragment represented by nucleotides 529-801 of SEQ ID NO: 5. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the ΓONI- ΌNI fragment represented by nucleotides 629-1103 of SEQ ID NO: 5. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the roNI- roNI fragment represented by nucleotides 1103-1527 of SEQ ID NO: 5. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 1-232 of SEQ ID NO: 5. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 232-1692 of SEQ ID NO: 5.
In another specific embodiment, a nucleic acid regulatory sequence of the invention is the XmaI-XmaI fragment represented by nucleotides 69-496 of SEQ ID NO: 5.
[0067] In another embodiment, the isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 6. For example, in a specific embodiment, a nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 1-347 of SEQ ID NO: 6. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the SfcI-SfcI fragment represented by nucleotides 347-751 of SEQ ID NO: 6. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the EcoO109I-EcoO109I fragment represented by nucleotides 189-331 of SEQ ID NO: 6. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the EcoO109I-EcoO109I fragment represented by nucleotides 331-431 of SEQ ID NO: 6. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the Van91I-Van91I fragment represented by nucleotides 81-250 of SEQ ID NO: 6. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the Van91I-Van91I fragment represented by nucleotides 250-630 of SEQ ID NO: 6. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the PstI-PstI fragment represented by nucleotides 5-755 of SEQ ID NO: 6. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the Ksp22I-EarI fragment represented by nucleotides 8-655 of SEQ ID NO: 6. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the Ksp22I-BanI fragment represented by nucleotides 8-484 of SEQ ID NO: 6. In another specific embodiment, a nucleic acid regulatory sequence of the invention is the BanI-BcgI fragment represented by nucleotides 484-754 of SEQ ID NO: 6.
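Each fragment above is identified by a nucleotide range within a SEQ ID NO. As a minimal sketch of how such a range maps onto a sequence string, the helper below assumes the ranges are 1-based and inclusive at both ends. The function name and the toy sequence are hypothetical; the toy sequence is not any sequence disclosed herein.

```python
def fragment(sequence: str, first: int, last: int) -> str:
    """Return nucleotides first..last of sequence (1-based, inclusive).

    Assumption: a phrase such as "nucleotides 5-755" counts the first
    nucleotide of the sequence as position 1 and includes both endpoints.
    """
    if not (1 <= first <= last <= len(sequence)):
        raise ValueError("range must lie within the sequence")
    # Python slices are 0-based and exclude the end index, hence first - 1.
    return sequence[first - 1:last]


# Toy sequence for illustration only.
toy = "ACGTACGTAC"
sub = fragment(toy, 3, 6)  # nucleotides 3-6 of the toy sequence
```

Under this convention a range X-Y always yields Y - X + 1 nucleotides.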
[0068] It will be clear to a person of skill in the art that other restriction fragments of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 may be generated using other restriction enzymes and used as regulatory sequences, e.g., as transcription activating nucleic acid sequences. The above examples are not meant to limit the invention to any particular restriction fragment or fragments.
[0069] Nucleic acid regulatory sequences of the invention may also comprise part or all of the reverse complement of the full-length sequences disclosed. In another embodiment, the nucleic acid regulatory sequence of the invention comprises an isolated nucleic acid molecule that is the reverse complement of the nucleotide sequence of SEQ ID NO: 1. In another embodiment, the nucleic acid regulatory sequence of the invention comprises an isolated nucleic acid molecule that is the reverse complement of the nucleotide sequence of SEQ ID NO: 2. In another embodiment, the nucleic acid regulatory sequence of the invention comprises an isolated nucleic acid molecule that is the reverse complement of the nucleotide sequence of SEQ ID NO: 3. In another embodiment, the nucleic acid regulatory sequence of the invention comprises an isolated nucleic acid molecule that is the reverse complement of the nucleotide sequence of SEQ ID NO: 4. In another embodiment, the nucleic acid regulatory sequence of the invention comprises an isolated nucleic acid molecule that is the reverse complement of the nucleotide sequence of SEQ ID NO: 5. In another embodiment, the nucleic acid regulatory sequence of the invention comprises an isolated nucleic acid molecule that is the reverse complement of the nucleotide sequence of SEQ ID NO: 6.
[0070] The invention also provides regulatory sequences that comprise all or part of the reverse complement of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. Thus, in another embodiment, the nucleic acid regulatory sequence of the invention comprises the reverse complement of the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 1. In another embodiment, the nucleic acid regulatory sequence of the invention comprises the reverse complement of the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 2. In another embodiment, the nucleic acid regulatory sequence of the invention comprises the reverse complement of the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 3. In another embodiment, the nucleic acid regulatory sequence of the invention comprises the reverse complement of the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 4. In another embodiment, the nucleic acid regulatory sequence of the invention comprises the reverse complement of the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 5. In another embodiment, the nucleic acid regulatory sequence of the invention comprises the reverse complement of the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 6.
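For illustration, the reverse complement of a DNA sequence is obtained by complementing each base (A with T, C with G) and reading the result in the opposite direction. The sketch below is a minimal example; the function name is hypothetical, the input is a toy sequence rather than any disclosed sequence, and only unambiguous bases (A, C, G, T) are handled.

```python
# Translation table pairing each base with its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'.

    Only A, C, G and T (either case) are complemented; other characters,
    such as ambiguity codes, pass through unchanged in this sketch.
    """
    return sequence.translate(_COMPLEMENT)[::-1]


rc = reverse_complement("ATGC")  # toy input for illustration
```

Applying the operation twice returns the original sequence, which gives a quick sanity check for any implementation.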
[0071] The transcription activating sequence may additionally be discrete fragments of the full-length sequences disclosed. For example, in a more specific embodiment, the transcription activating nucleotide sequence comprises at least about 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, or 1300 contiguous nucleotides of SEQ ID NO: 1 or the reverse complement thereof. For example, a nucleic acid regulatory sequence of the invention may be, but is not limited to, the nucleotide sequence nn1-nn50, nn51-nn100, nn101-nn150, nn151-nn200, nn201-nn250, nn251-nn300, nn301-nn350, nn351-nn400, nn401-nn450, nn451-nn500, nn501-nn550, nn551-nn600, nn601-nn650, nn651-nn700, nn701-nn750, nn751-nn800, nn801-nn850, nn851-nn900, nn901-nn950, nn951-nn1000, nn1001-nn1050, nn1051-nn1100, nn1101-nn1150, nn1151-nn1200, nn1201-nn1250, nn1251-nn1300, nn1301-nn1311, or any contiguous combination thereof, of SEQ ID NO: 1 or the reverse complement of any of the foregoing. In this and in following examples, "nnX-nnY" means nucleotide X to nucleotide Y of the specific SEQ ID NO. For example, "nucleotides nn1-nn50 of SEQ ID NO: 1" means contiguous nucleotides 1-50 of SEQ ID NO: 1. This format applies, of course, to subsequences of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
[0072] In another specific embodiment, the transcription activating sequence comprises at least about 10, 20, 30, 40, 50, or 60 contiguous nucleotides of SEQ ID NO: 2. For example, a nucleic acid regulatory sequence of the invention may be, but is not limited to, the nucleotide sequence nn1-nn10, nn11-nn20, nn21-nn30, nn31-nn40, nn41-nn50, nn51-nn60, nn61-nn65, or any contiguous combination thereof, of SEQ ID NO: 2 or the reverse complement thereof.
[0073] In another specific embodiment, the transcription activating nucleotide sequence comprises at least about 20, 40, 60, 80, 100, 120, or 140 contiguous nucleotides of SEQ ID NO: 3 or the reverse complement thereof. For example, in a specific embodiment, a nucleic acid regulatory sequence of the invention is, but is not limited to, the nucleotide sequence nn1-nn20, nn21-nn40, nn41-nn60, nn61-nn80, nn81-nn100, nn101-nn120, nn121-nn140, nn141-nn146, or any contiguous combination thereof, of SEQ ID NO: 3 or the reverse complement thereof.
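The "nnX-nnY" notation used in these paragraphs can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the sequence lengths are those stated in the specification (e.g., 1311 nucleotides for SEQ ID NO: 1, 65 for SEQ ID NO: 2, 146 for SEQ ID NO: 3), and positions are treated as 1-based and inclusive, as the notation implies.

```python
def windows(length: int, step: int) -> list[tuple[int, int]]:
    """Tile a sequence of `length` nucleotides into 1-based, inclusive
    (X, Y) pairs of at most `step` nucleotides each; the last pair is
    truncated at the end of the sequence."""
    return [
        (start, min(start + step - 1, length))
        for start in range(1, length + 1, step)
    ]


def subsequence(seq: str, x: int, y: int) -> str:
    """Return nucleotides nnX-nnY (1-based, inclusive) of `seq`."""
    return seq[x - 1:y]


def label(x: int, y: int) -> str:
    """Format a pair in the nnX-nnY style used in the text."""
    return f"nn{x}-nn{y}"
```

Under these assumptions, `windows(1311, 50)` starts at (1, 50) and ends at (1301, 1311), and adjacent windows can be merged to give the "contiguous combinations" mentioned in the text.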
[0074] In another specific embodiment, the transcription activating nucleotide sequence comprises at least about 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, or 2750 contiguous nucleotides of SEQ ID NO: 4 or the reverse complement thereof. For example, in a specific embodiment, a nucleic acid regulatory sequence of the invention is, but is not limited to, the nucleotide sequence nn1-nn100, nn101-nn200, nn201-nn300, nn301-nn400, nn401-nn500, nn501-nn600, nn601-nn700, nn701-nn800, nn801-nn900, nn901-nn1000, nn1001-nn1100, nn1101-nn1200, nn1201-nn1300, nn1301-nn1400, nn1401-nn1500, nn1501-nn1600, nn1601-nn1700, nn1701-nn1800, nn1801-nn1900, nn1901-nn2000, nn2001-nn2100, nn2101-nn2200, nn2201-nn2300, nn2301-nn2400, nn2401-nn2500, nn2501-nn2600, nn2601-nn2700, nn2701-nn2768, or any contiguous combination thereof, of SEQ ID NO: 4 or the reverse complement thereof.
[0075] In another specific embodiment, the transcription activating nucleotide sequence comprises at least about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, 1500, or 1650 contiguous nucleotides of SEQ ID NO: 5 or the reverse complement thereof. For example, in a specific embodiment, a nucleic acid regulatory sequence of the invention is, but is not limited to, the nucleotide sequence nn1-nn100, nn101-nn200, nn201-nn300, nn301-nn400, nn401-nn500, nn501-nn600, nn601-nn700, nn701-nn800, nn801-nn900, nn901-nn1000, nn1001-nn1100, nn1101-nn1200, nn1201-nn1300, nn1301-nn1400, nn1401-nn1500, nn1501-nn1600, nn1601-nn1700, nn1701-nn1800, nn1801-nn1900, nn1901-nn2000, nn2001-nn2100, nn2101-nn2200, nn2201-nn2300, nn2301-nn2400, nn2401-nn2500, nn2501-nn2600, nn2601-nn2700, nn2701-nn2800, nn2801-nn2900, nn2901-nn3000, nn3001-nn3100, nn3101-nn3200, nn3201-nn3300, nn3301-nn3400, nn3401-nn3500, nn3501-nn3600, nn3601-nn3700, or nn3701-nn3747, or any contiguous combination thereof, of SEQ ID NO: 5 or the reverse complement thereof.
[0076] In another specific embodiment, the transcription activating nucleotide sequence comprises at least about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, or 750 contiguous nucleotides of SEQ ID NO: 6 or the reverse complement thereof. For example, in a specific embodiment, a nucleic acid regulatory sequence of the invention is, but is not limited to, the nucleotide sequence nn1-nn100, nn101-nn200, nn201-nn300, nn301-nn400, nn401-nn500, nn501-nn600, nn601-nn700, nn701-nn800, nn801-nn900, nn901-nn1000, nn1001-nn1100, nn1101-nn1200, nn1201-nn1300, nn1301-nn1400, nn1401-nn1500, nn1501-nn1600, nn1601-nn1700, nn1701-nn1800, nn1801-nn1900, nn1901-nn2000, nn2001-nn2100, nn2101-nn2200, nn2201-nn2300, nn2301-nn2400, nn2401-nn2500, nn2501-nn2600, nn2601-nn2700, nn2701-nn2800, nn2801-nn2900, nn2901-nn3000, nn3001-nn3100, nn3101-nn3200, nn3201-nn3300, nn3301-nn3400, nn3401-nn3500, nn3501-nn3600, nn3601-nn3700, or nn3701-nn3747, or any contiguous combination thereof, of SEQ ID NO: 6 or the reverse complement thereof.
[0077] It will be readily apparent to one of skill in the art that one can derive transcription activating nucleotide sequences of different lengths in a like manner for SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6, where the sequence may be at least 20 nucleotides in length, up to the length of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. [0078] In another embodiment, the invention provides for sequences that hybridize to the full-length sequences or reverse complements thereof. For example, in a specific embodiment, the invention provides for an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 1, to a transcriptional activating sequence of SEQ ID NO: 1, or to a complement or reverse complement thereof. In another embodiment, the invention provides for an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 2, to a transcriptional activating sequence of SEQ ID NO: 2, or to a complement or reverse complement thereof. In another embodiment, the invention provides for an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 3, to a transcriptional activating sequence of SEQ ID NO: 3, or to a complement or reverse complement thereof. In another embodiment, the invention provides for an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 4, to a transcriptional activating sequence of SEQ ID NO: 4, or to a complement or reverse complement thereof. In another embodiment, the invention provides for an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 5, to a transcriptional activating sequence of SEQ ID NO: 5, or to a complement or reverse complement thereof. 
The restriction fragments and discrete subsequences enumerated above represent sequences that hybridize along their entire lengths to the disclosed full-length sequences or their complements. In another embodiment, the invention provides for an isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 6, to a transcriptional activating sequence of SEQ ID NO: 6, or to a complement or reverse complement thereof. [0079] Hybridizing conditions can be of low or high stringency. Such stringency conditions are well known to those of skill in the art. By way of example and not limitation, sequences that hybridize under low stringency conditions are ones that would hybridize under conditions as follows (see also Shilo and Weinberg, Proc. Natl. Acad. Sci. U.S.A. 78:6789-6792 (1981)): Filters containing DNA are pretreated for 6 h at 40°C in a solution containing 35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 μg/ml denatured salmon sperm DNA. Hybridizations are carried out in the same solution with the following modifications: 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 μg/ml salmon sperm DNA, 10% (wt/vol) dextran sulfate, and 5-20 × 10^6 cpm of 32P-labeled probe is used. Filters are incubated in hybridization mixture for 18-20 h at 40°C, and then washed for 1.5 h at 55°C in a solution containing 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS. The wash solution is replaced with fresh solution and incubated an additional 1.5 h at 60°C. Filters are blotted dry and exposed for autoradiography. If necessary, filters are washed for a third time at 65-68°C and re-exposed to film.
[0080] Likewise, by way of example and not limitation, sequences that hybridize under highly stringent conditions are ones that hybridize under such conditions of high stringency as follows. Prehybridization of filters containing DNA is carried out for 8 h to overnight at 65°C in buffer composed of 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA, and 500 μg/ml denatured salmon sperm DNA. Filters are hybridized for 48 h at 65°C in prehybridization mixture containing 100 μg/ml denatured salmon sperm DNA and 5-20 × 10^6 cpm of 32P-labeled probe. Washing of filters is done at 37°C for 1 h in a solution containing 2X SSC, 0.01% PVP, 0.01% Ficoll, and 0.01% BSA. This is followed by a wash in 0.1X SSC at 50°C for 45 min before autoradiography. Hybridization conditions are said to be "highly stringent" or "high stringency" when said conditions are at least as stringent as those disclosed in this paragraph. [0081] Stringency can also be determined by calculating the Tm of the hybridization. Among the nucleic acid molecules of the invention are deoxyoligonucleotides ("oligos") which hybridize under highly stringent or moderately stringent conditions to the nucleic acid molecules described above. In general, for probes between 14 and 70 nucleotides in length the melting temperature (Tm) is calculated using the formula: Tm (°C) = 81.5 + 16.6(log[monovalent cations (molar)]) + 0.41(% G+C) - (500/N), where N is the length of the probe. If the hybridization is carried out in a solution containing formamide, the melting temperature is calculated using the equation Tm (°C) = 81.5 + 16.6(log[monovalent cations (molar)]) + 0.41(% G+C) - 0.61(% formamide) - (500/N), where N is the length of the probe. In general, hybridization is carried out at about 20-25 degrees below Tm (for DNA-DNA hybrids) or 10-15 degrees below Tm (for RNA-DNA hybrids). For example, the Tm decreases approximately 1°C for every 1% of base pairs that are mismatched. For hybrids shorter than 20 base pairs, the Tm decreases by approximately 5°C for every mismatched base pair. Stringent conditions, therefore, are those where the hybridization temperature is Tm-25°C (where the maximum hybridization rate is observed) to Tm-5°C (maximum stringency). [0082] Also encompassed within the scope of the invention are modifications of the regulatory nucleotide sequences of the invention that do not substantially affect their transcriptional activities. Such modifications include additions, deletions and substitutions. When operably linked to the coding region for a heterologous gene, such modifications of the 1311 nucleotide CNI-01142 (SEQ ID NO: 1) regulatory sequence, the 65 nucleotide CNI-01080 (SEQ ID NO: 2) regulatory sequence, the 146 nucleotide CNI-01104 (SEQ ID NO: 3) regulatory sequence, the 2768 nucleotide CNI-01120 (SEQ ID NO: 4) regulatory sequence, the 1697 nucleotide CNI-01125 (SEQ ID NO: 5) regulatory sequence, the 756 nucleotide CNI-01131 (SEQ ID NO: 6) regulatory sequence, or nucleic acid regulatory sequences thereof, are sufficient to modulate expression of the operatively linked heterologous gene in a cell.
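The melting temperature formulas given above for probes of 14 to 70 nucleotides can be worked through with a short Python sketch. The function and parameter names are hypothetical, and the sketch assumes that "log" in the formula denotes the base-10 logarithm, which the text does not state explicitly.

```python
import math


def melting_temperature(monovalent_molar: float, gc_percent: float,
                        length: int, formamide_percent: float = 0.0) -> float:
    """Tm (deg C) = 81.5 + 16.6*log10[M+] + 0.41*(%G+C)
    - 0.61*(%formamide) - 500/N, as written in the text.

    Assumption: the logarithm is base 10. With formamide_percent = 0
    the expression reduces to the formamide-free formula."""
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length)


def stringent_window(tm: float) -> tuple[float, float]:
    """Return (Tm - 25, Tm - 5): the hybridization temperature range the
    text describes as running from maximum hybridization rate to
    maximum stringency."""
    return (tm - 25.0, tm - 5.0)
```

As a worked example under these assumptions, a 50-nucleotide probe with 50% G+C in 1 M monovalent cation gives 81.5 + 0 + 20.5 - 10 = 92.0°C; adding 35% formamide lowers the estimate by 0.61 × 35 = 21.35°C.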
[0083] The present invention also relates to the nucleic acid regulatory sequences of the invention operably linked to a nucleic acid molecule comprising a coding sequence. Thus, the invention also provides for the control of gene expression using modifications of CNI-01142 (SEQ ID NO: 1) or a nucleic acid regulatory sequence thereof that substantially alters the transcriptional promotion activities of CNI-01142. In one embodiment, the invention provides CNI-01142 sequences that act as stronger modulators than full-length CNI-01142. In another embodiment, the invention provides such sequences that are weaker promoters than CNI-01142. In yet another embodiment, the invention provides such sequences that act to suppress or depress the level of transcription below that of the full-length CNI-01142. [0084] The invention also provides for the control of gene expression using modifications of CNI-01080 (SEQ ID NO: 2) or a nucleic acid regulatory sequence thereof that substantially alters the transcriptional promotion activities of CNI-01080. In one embodiment, the invention provides CNI-01080 sequences that act as stronger modulators than full-length CNI-01080. In another embodiment, the invention provides such sequences that are weaker promoters than CNI-01080. In yet another embodiment, the invention provides such sequences that act to suppress or depress the level of transcription below that of the full-length CNI-01080. [0085] The invention also provides for the control of gene expression using modifications of CNI-01104 (SEQ ID NO: 3) or a nucleic acid regulatory sequence thereof that substantially alters the transcriptional promotion activities of CNI-01104. In one embodiment, the invention provides CNI-01104 sequences that act as stronger modulators than full-length CNI-01104. In another embodiment, the invention provides such sequences that are weaker promoters than CNI-01104. 
In yet another embodiment, the invention provides such sequences that act to suppress or depress the level of transcription below that of the full-length CNI-01104. [0086] The invention also provides for the control of gene expression using modifications of CNI-01120 (SEQ ID NO: 4) or a nucleic acid regulatory sequence thereof that substantially alters the transcriptional promotion activities of CNI-01120. In one embodiment, the invention provides CNI-01120 sequences that act as stronger modulators than full-length CNI-01120. In another embodiment, the invention provides such sequences that are weaker promoters than CNI-01120. In yet another embodiment, the invention provides such sequences that act to suppress or depress the level of transcription below that of the full-length CNI-01120. [0087] The invention also provides for the control of gene expression using modifications of CNI-01125 (SEQ ID NO: 5) or a nucleic acid regulatory sequence thereof that substantially alters the transcriptional promotion activities of CNI-01125. In one embodiment, the invention provides CNI-01125 sequences that act as stronger modulators than full-length CNI-01125. In another embodiment, the invention provides such sequences that are weaker promoters than CNI-01125. In yet another embodiment, the invention provides such sequences that act to suppress or depress the level of transcription below that of the full-length CNI-01125. [0088] The invention also provides for the control of gene expression using modifications of CNI-01131 (SEQ ID NO: 6) or a nucleic acid regulatory sequence thereof that substantially alters the transcriptional promotion activities of CNI-01131. In one embodiment, the invention provides CNI-01131 sequences that act as stronger modulators than full-length CNI-01131. In another embodiment, the invention provides such sequences that are weaker promoters than CNI-01131. 
In yet another embodiment, the invention provides such sequences that act to suppress or depress the level of transcription below that of the full-length CNI-01131. [0089] If a restriction map is generated, the determination of those regions of the nucleic acid regulatory sequences of the invention strongest in promoting or enhancing gene expression is a straightforward task. The region is first digested with restriction endonucleases that produce the desired fragments. Preferably, the restriction endonucleases are commercially available, and recognize six-nucleotide sequences. Preferably, too, these restriction endonucleases utilize sites that are also present in the MCS of an expression vector, to facilitate cloning the fragments in such a way that they are operably linked to a gene to be expressed, the level of expression of which indicates the strength of promotion or enhancement of gene expression. Typically, the region is segregated into subregions representing progressively longer deletions from the 5' end, or from the 3' end; internal sequences may be deleted, as well. In general, those fragments that result in the most production of gene product are the strongest promoters; those that produce the least above background are the weakest. This example is not meant to be limiting, as there are other means to generate fragments in order to map promoter, enhancer or silencer regions; for example, exonuclease digestion. [0090] The same procedure may be used for regulatory sequence fragments created by exonuclease digestion. Typically, an exonuclease is contacted with the regulatory sequence, and treatment is allowed to continue for varying periods of time, thus generating fragments of various sizes. The fragments are size-separated, for example, on a sizing column or in an agarose gel. The fragments can then either be blunt-end ligated into an expression vector, or can be tailed with linkers to facilitate cloning into such a vector. 
The resulting constructs are then analyzed for insert sequence and for the insert's ability to promote expression of the reporter gene. [0091] The ability of sequences or fragments of the regulatory sequences of the invention to promote or enhance transcription can be assessed in two kinds of plasmid vectors. In one vector, the regulatory sequence or a subfragment thereof is cloned into a site, typically part of an MCS, that places the regulatory sequence upstream of, and operably linked to, a reporter gene whose expression can be monitored. The vector, prior to insertion of the regulatory sequence, has no promoter of its own that can drive expression of the reporter gene. Expression of the reporter sequence over that seen with a no-insert control indicates that the regulatory sequence acts as a promoter of transcription. A second vector contains a promoter operably linked to the reporter gene. Here, the putative regulatory sequence is inserted upstream of the promoter, again typically into an MCS. If there is an additional increase in expression of the reporter gene above that seen in a promoter-only control, the regulatory sequence has enhancer activity.
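The two-vector readout described above can be summarized as a simple decision rule, sketched here in Python. Everything in the sketch is an illustrative assumption: the function names, the use of a fold-over-control ratio, and in particular the 2-fold threshold are hypothetical and are not specified in the text, which states only that expression must exceed the corresponding control.

```python
def fold_over_control(signal: float, control: float) -> float:
    """Reporter signal expressed as a multiple of the control signal."""
    if control <= 0:
        raise ValueError("control signal must be positive")
    return signal / control


def classify_insert(promoterless_with_insert: float,
                    promoterless_no_insert: float,
                    promoter_with_insert: float,
                    promoter_only: float,
                    threshold: float = 2.0) -> dict[str, bool]:
    """Apply the two-vector logic to one insert.

    promoter activity: reporter expression in the promoterless vector
        exceeds the no-insert control.
    enhancer activity: reporter expression in the promoter-containing
        vector exceeds the promoter-only control.
    `threshold` is a hypothetical fold cutoff chosen for illustration."""
    return {
        "promoter": fold_over_control(
            promoterless_with_insert, promoterless_no_insert) >= threshold,
        "enhancer": fold_over_control(
            promoter_with_insert, promoter_only) >= threshold,
    }
```

In practice the cutoff would be chosen from replicate measurements and the variability of the controls rather than fixed in advance.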
[0092] It will be apparent to those of skill in the art that the above two vectors may additionally be used to discover other regulatory sequences, for example, homologous or analogous regulatory sequences that drive expression in the nervous systems of other species. For example, one may design sets of primers based upon the nucleotide sequence of the regulatory sequence of the invention, and perform PCR under moderately-stringent conditions well known to those of skill in the art on genomic DNA derived from a non-human species. PCR products are then cloned directly into one of the above two vectors. PCR products driving expression in the vector containing a promoter operably linked to the reporter gene have enhancer activity, while PCR products driving expression in the promoterless vector have promoter activity. [0093] Alterations in the regulatory sequences can be generated using a variety of chemical and enzymatic methods which are well known to those skilled in the art. For example, regions of the sequences defined by restriction sites can be deleted.
Oligonucleotide-directed mutagenesis can be employed to alter the sequence in a defined way and/or to introduce restriction sites in specific regions within the sequence. Additionally, deletion mutants can be generated using DNA nucleases such as Bal31 or ExoIII and S1 nuclease. Progressively larger deletions in the regulatory sequences are generated by incubating the DNA with nucleases for increased periods of time (see Ausubel, et al., CURRENT PROTOCOLS FOR MOLECULAR BIOLOGY (1989), for a review of mutagenesis techniques). [0094] The altered sequences are evaluated for their ability to direct expression of heterologous coding sequences in appropriate host cells, e.g., nervous system cells. It is within the scope of the present invention that any altered regulatory sequences which retain their ability to direct expression of a coding sequence be incorporated into recombinant expression vectors for further use.
[0095] The regulatory nucleic acid sequences of the invention can routinely be analyzed for the presence of transcription elements by various publicly available computer programs. Putative transcription elements are located, for example, by comparing the sequence to known or consensus transcription factor binding sequences, and determining that the percent identity between the two is significant.
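The comparison described above can be sketched in Python as a sliding-window scan that reports every window whose percent identity to a consensus site meets a cutoff. This is a minimal sketch, not the program used to produce the figures: the function names are hypothetical, the 80% default cutoff is an arbitrary illustration of "significant", and the TATA-box-like consensus in the usage note is a generic example rather than a site taken from the figures.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def scan_for_site(seq: str, consensus: str,
                  min_identity: float = 80.0) -> list[tuple[int, str, float]]:
    """Slide `consensus` along `seq`; return (1-based start, window,
    percent identity) for every window at or above `min_identity`.

    The cutoff is a hypothetical stand-in for a significance test."""
    width = len(consensus)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        identity = percent_identity(window, consensus)
        if identity >= min_identity:
            hits.append((i + 1, window, identity))
    return hits
```

For instance, `scan_for_site("GGCTATAAAGG", "TATAAA", 100.0)` reports a single exact match starting at nucleotide 4. Scanning the reverse complement of the sequence as well would cover sites on the opposite strand.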
[0096] Computer analysis of the CNI-01142 sequence performed as described above revealed a number of transcriptional elements, e.g. binding sites for transcription factors (see FIG. 4). Thus, the CNI-01142 regulatory sequence likely responds to a variety of exogenous factors and environmental stimuli. Computer analysis of the CNI-01080 sequence performed as described above revealed a number of transcriptional elements, e.g. binding sites for transcription factors (see FIG. 10). Thus, the CNI-01080 regulatory sequence likely responds to a variety of exogenous factors and environmental stimuli. Computer analysis of the CNI-01104 sequence performed as described above revealed a number of transcriptional elements, e.g. binding sites for transcription factors (see FIG. 16). Thus, the CNI-01104 regulatory sequence likely responds to a variety of exogenous factors and environmental stimuli. Computer analysis of the CNI-01120 sequence performed as described above revealed a number of transcriptional elements, e.g. binding sites for transcription factors (see FIG. 22). Thus, the CNI-01120 regulatory sequence likely responds to a variety of exogenous factors and environmental stimuli. Computer analysis of the CNI-01125 sequence performed as described above revealed a number of transcriptional elements, e.g. binding sites for transcription factors (see FIG. 28). Thus, the CNI-01125 regulatory sequence likely responds to a variety of exogenous factors and environmental stimuli. Computer analysis of the CNI-01131 sequence performed as described above revealed a number of transcriptional elements, e.g. binding sites for transcription factors (see FIG. 34). Thus, the CNI-01131 regulatory sequence likely responds to a variety of exogenous factors and environmental stimuli. [0097] The invention also provides regulatory sequences containing binding sites for various transcription factors. 
Thus, in one embodiment, the transcription modulating nucleic acid sequence contains at least 20 contiguous nucleotides of SEQ ID NO: 1 and at least one of the transcription factor binding sites of FIG. 4. In another embodiment, the transcription modulating nucleic acid sequence contains at least 20 contiguous nucleotides of SEQ ID NO: 2, and at least one of the transcription factor binding sites of FIG. 10. In another embodiment, the transcription modulating nucleic acid sequence contains at least 20 contiguous nucleotides of SEQ ID NO: 3, and at least one of the transcription factor binding sites of FIG. 16. In another embodiment, the transcription modulating nucleic acid sequence contains at least 20 contiguous nucleotides of SEQ ID NO: 4, and at least one of the transcription factor binding sites of FIG. 22. In another embodiment, the transcription modulating nucleic acid sequence contains at least 20 contiguous nucleotides of SEQ ID NO: 5, and at least one of the transcription factor binding sites of FIG. 28. In another embodiment, the transcription modulating nucleic acid sequence contains at least 20 contiguous nucleotides of SEQ ID NO: 6, and at least one of the transcription factor binding sites of FIG. 34.
[0098] Regulatory sequences can also be physically mapped using restriction endonucleases to create restriction maps, which can easily be constructed. Such maps may be constructed by restricting the sequence with a variety of restriction enzymes, separating the resulting fragments on an agarose gel, and therefrom determining the relative positions of the restriction enzyme recognition sequences. Alternatively, since the recognition sequences of most restriction enzymes are well known to those of skill in the art, a restriction map may be generated once the nucleotide sequence of the promoter or regulatory sequence is determined. [0099] Finer mapping of regulatory sequences can routinely be accomplished using site-directed mutagenesis, using variants of the fragments of the present invention. Site-specific mutagenesis is a technique useful in the preparation of mutant promoter regions useful in identifying important promoter elements. The technique further provides a ready ability to prepare and test sequence variants, for example, incorporating one or more of the foregoing considerations, by introducing one or more nucleotide sequence changes into the DNA. Site-specific mutagenesis allows the production of mutants through the use of specific oligonucleotide sequences which encode the DNA sequence of the desired mutation, as well as a sufficient number of adjacent nucleotides, to provide a primer sequence of sufficient size and sequence complexity to form a stable duplex on both sides of the mismatch junction being traversed. Typically, a primer of about 17 to 25 nucleotides in length is preferred, with about 5 to 10 residues on both sides of the junction of the sequence being altered.
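The primer geometry described above for site-specific mutagenesis (one altered position flanked on both sides by matching template sequence) can be sketched in Python. The sketch handles only a single-base substitution; the function name, the default flank of 10 nucleotides, and the test template are hypothetical. A flank of 8 to 10 on each side gives a 17 to 21 nucleotide primer, within the range the text describes as typical.

```python
def mutagenic_primer(template: str, position: int, new_base: str,
                     flank: int = 10) -> str:
    """Build a primer carrying a single-base substitution.

    `position` is 1-based. The primer consists of `flank` template
    nucleotides, the substituted base, then `flank` more template
    nucleotides, so its length is 2 * flank + 1."""
    if len(new_base) != 1:
        raise ValueError("new_base must be a single nucleotide")
    idx = position - 1
    if not 0 <= idx < len(template):
        raise ValueError("position lies outside the template")
    if idx - flank < 0 or idx + flank >= len(template):
        raise ValueError("not enough flanking sequence on both sides")
    if template[idx].upper() == new_base.upper():
        raise ValueError("substitution does not change the template")
    return template[idx - flank:idx] + new_base + template[idx + 1:idx + 1 + flank]
```

The primer is written here in the same orientation as the template strand supplied; to anneal to that strand, its reverse complement would be synthesized instead.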
[0100] In general, the technique of site-specific mutagenesis is well known in the art as exemplified by publications (Adelman et al., DNA 2:183 (1983)). As will be appreciated, the technique typically employs a phage vector which exists in both a single stranded and double stranded form. Typical vectors useful in site-directed mutagenesis include vectors such as the M13 phage (Messing et al., Meth. Enzymol. 101:20 (1981)). These phage are readily commercially available and their use is generally well known to those skilled in the art. Double stranded plasmids are also routinely employed in site-directed mutagenesis, which eliminates the step of transferring the gene of interest from a plasmid to a phage. [0101] In general, site-directed mutagenesis in accordance herewith is performed by first obtaining a single-stranded vector, or melting apart the two strands of a double stranded vector, which includes within its sequence any of the nucleic acid regulatory sequences of the invention. An oligonucleotide primer bearing the desired mutated sequence is prepared, generally synthetically, for example by the method of Crea et al., Proc. Natl. Acad. Sci. U.S.A. 75:5765-5769 (1978). Primer sequences are, of course, based on the nucleotide sequences of the regulatory sequences of the invention, i.e., SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. This primer is then annealed with the single-stranded vector, and subjected to DNA polymerizing enzymes such as E. coli polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing strand. Thus, a heteroduplex is formed wherein one strand encodes the original non-mutated sequence and the second strand bears the desired mutation. This heteroduplex vector is then used to transform appropriate cells, such as E. coli cells, and clones are selected which include recombinant vectors bearing the mutated sequence arrangement. 
[0102] The preparation of sequence variants of the nucleic acid regulatory sequences of the invention using site-directed mutagenesis is provided as a means of producing useful regulatory sequence variants and is not meant to be limiting, as there are other ways in which sequence variants of the regulatory sequences of the invention may be obtained, such as chemical mutagenesis. For example, recombinant vectors containing the desired regulatory sequence may be treated with mutagenic agents to obtain sequence variants (see, e.g., a method described by Eichenlaub et al., J. Bact. 138(2):559-566 (1979) for the mutagenesis of plasmid DNA using hydroxylamine).
[0103] The present invention also provides for fragments, i.e., subsequences, of the CNI-01142 (SEQ ID NO: 1) regulatory sequence, which fragments need not be transcription activating. Such fragments can be used to detect the CNI-01142 (SEQ ID NO: 1) sequence in a human cell sample, or in a cell sample derived from one of the animal sources listed above. Such fragments may be at least about 5, 10, 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, or 1300 nucleotides in length, or no more than about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, or 1300 nucleotides in length. [0104] The present invention also provides for fragments, i.e., subsequences, of the CNI-01080 (SEQ ID NO: 2) regulatory sequence, which fragments need not be transcription activating. Such fragments can be used to detect the CNI-01080 (SEQ ID NO: 2) sequence in a human cell sample, or in a cell sample derived from one of the animal sources listed above. Such fragments may be at least about 5, 10, 20, 30, 40, 50, or 60 nucleotides in length, or no more than about 20, 30, 40, 50, or 60 nucleotides in length.
[0105] The present invention also provides for fragments, i.e., subsequences, of the CNI-01104 (SEQ ID NO: 3) regulatory sequence, which fragments need not be transcription activating. Such fragments can be used to detect the CNI-01104 (SEQ ID NO: 3) sequence in a human cell sample, or in a cell sample derived from one of the animal sources listed above. Such fragments may be at least about 5, 10, 20, 30, 40, 50, 75, 100, 125, or 140 nucleotides in length, or no more than about 20, 30, 40, 50, 75, 100, 125, or 140 nucleotides in length.
[0106] The present invention also provides for fragments, i.e., subsequences, of the CNI-01120 (SEQ ID NO: 4) regulatory sequence, which fragments need not be transcription activating. Such fragments can be used to detect the CNI-01120 (SEQ ID NO: 4) sequence in a human cell sample, or in a cell sample derived from one of the animal sources listed above. Such fragments may be at least about 5, 10, 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 2000, 2500, or 2750 nucleotides in length, or no more than about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 2000, 2500, or 2750 nucleotides in length.
[0107] The present invention also provides for fragments, i.e., subsequences, of the CNI-01125 (SEQ ID NO: 5) regulatory sequence, which fragments need not be transcription activating. Such fragments can be used to detect the CNI-01125 (SEQ ID NO: 5) sequence in a human cell sample, or in a cell sample derived from one of the animal sources listed above. Such fragments may be at least about 5, 10, 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, 1500, or 1650 nucleotides in length, or no more than about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, 1500, or 1650 nucleotides in length.
[0108] The present invention also provides for fragments, i.e., subsequences, of the CNI-01131 (SEQ ID NO: 6) regulatory sequence, which fragments need not be transcription activating. Such fragments can be used to detect the CNI-01131 (SEQ ID NO: 6) sequence in a human cell sample, or in a cell sample derived from one of the animal sources listed above. Such fragments may be at least about 5, 10, 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, or 750 nucleotides in length, or no more than about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, or 750 nucleotides in length.
[0109] The nucleic acid regulatory sequences of the invention can be generated using techniques well known to those of skill in the art. For example, the sequences may be generated from nucleic acids derived from natural sources or from publicly available cloned sequences by any one of a number of means known in the art, e.g., cleavage by one or more restriction endonucleases, DNase I treatment, exonuclease treatment, or mechanical shearing. Such fragments may also be constructed artificially. For example, fragments may be synthesized chemically, or may be generated by means of the polymerase chain reaction (PCR).
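The fragment definitions above (contiguous subsequences of a stated length) can be illustrated with a minimal sketch. The function name `fragments` and the 12-nucleotide example string are hypothetical stand-ins chosen for illustration; the actual SEQ ID NO sequences are not reproduced here.

```python
def fragments(sequence, length):
    """Return every contiguous subsequence (fragment) of exactly `length` nucleotides."""
    if length < 1:
        raise ValueError("length must be a positive integer")
    # A sequence of n nucleotides has n - length + 1 windows of the given length;
    # range() is empty when the sequence is shorter than the requested length.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical 12-nt stand-in sequence, not one of the SEQ ID NOs.
example = "ATGCGTACCTGA"
for frag in fragments(example, 5):
    print(frag)
```

Calling the function once per length of interest (for example 20, 30, or 40) would give the candidate probe fragments for each length class.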
[0110] The process of selecting and preparing a nucleic acid segment that includes a contiguous sequence from the genomic sequence region may alternatively be described as preparing a nucleic acid fragment. Of course, fragments may also be obtained by other techniques such as, e.g., by mechanical shearing, exonuclease treatment or by restriction enzyme digestion. Small nucleic acid segments or fragments may be readily prepared by, for example, directly synthesizing the fragment by chemical means, as is commonly practiced using an automated oligonucleotide synthesizer. Also, fragments may be obtained by application of nucleic acid reproduction technology, such as the PCR technology of U.S. Pat. No. 4,603,102, (incorporated herein by reference), by introducing selected sequences into recombinant vectors for recombinant production, and by other recombinant DNA techniques generally known to those of skill in the art of molecular biology.
[0111] The sequence of a particular regulatory sequence may be determined by a number of means well known in the art, including but not limited to the method of Maxam and Gilbert (Meth. Enzymol. 65:499-560 (1980)), the Sanger dideoxy method (Sanger, F., et al, Proc. Natl. Acad. Sci. U.S.A. 74:5463 (1977)), the use of T7 DNA polymerase (Tabor and Richardson, U.S. Patent No. 4,795,699), or use of an automated DNA sequencer (e.g., Applied Biosystems, Foster City, CA). The labels used in sequencing may be radioactive or fluorescent.
[0112] Determining the ability of any of the foregoing sequences to modulate, activate or enhance gene expression in a cell is straightforward. A vector suitable for maintenance and gene expression in a host cell is constructed, whereby the vector contains a reporter gene operably linked to the particular regulatory sequence or transcription activating sequence of the invention. The vector containing the regulatory sequence or transcription activating sequence is then introduced into a cell, preferably a neural cell or cell derived from the brain. After culturing for a period of time suitable for the reporter gene to express the reporter gene product, the amount of the reporter gene product is assessed. For example, if the reporter gene product is GFP, the amount of GFP is determined by assessing the amount of fluorescence emitted by the cell. A nucleotide sequence that modulates reporter gene expression according to the invention is one that causes a detectable difference in the level of expression of the reporter gene, and/or amount of the reporter gene product, when compared to a control cell containing the vector and reporter gene, but lacking the regulatory sequence or transcriptional activating sequence. In a preferred embodiment, the difference is an increase in the expression of the reporter gene over that of the control.
5.2. VECTORS AND REGULATION OF GENE EXPRESSION
[0113] The present invention provides the CNI-01142, CNI-01080, CNI-01104, CNI-01120, CNI-01125 and CNI-01131 regulatory sequences, or transcription modulating sequences thereof, contained in a vector. The regulatory sequences of the present invention each promote or enhance gene expression in cells derived from the nervous system; thus, each of these regulatory sequences or nucleic acid regulatory sequences thereof is useful for the expression of a coding sequence in cells, particularly in nervous system cells.
[0114] The invention further provides vectors comprising a nucleic acid regulatory molecule of the invention. In this regard, in one embodiment, the invention provides a vector comprising the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 or SEQ ID NO: 6. Additionally, the invention further provides vectors comprising two or more of the nucleotide sequences of these SEQ ID NOs.
[0115] In another embodiment, the vector comprises the nucleotide sequence of a transcription activating sequence of SEQ ID NO: 1 or the reverse complement of SEQ ID NO: 1. For example, the transcription activating sequence of SEQ ID NO: 1 may be at least about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, or 1300 nucleotides in length. The transcription activating sequence of SEQ ID NO: 2 may be at least about 20, 30, 40, 50, or 60 nucleotides in length. The transcription activating sequence of SEQ ID NO: 3 may be at least about 20, 30, 40, 50, 75, 100, 125, or 140 nucleotides in length. The transcription activating sequence of SEQ ID NO: 4 may be at least about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 2000, 2500, or 2750 nucleotides in length. The transcription activating sequence of SEQ ID NO: 5 may be at least about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000, 1250, 1500, or 1650 nucleotides in length.
The transcription activating sequence of SEQ ID NO: 6 may be at least about 20, 30, 40, 50, 75, 100, 200, 300, 400, 500, or 750 nucleotides in length. The vector may also include a nucleic acid that hybridizes along its entire length to SEQ ID NO: 1, or SEQ ID NO: 2, or SEQ ID NO: 3, or SEQ ID NO: 4, or SEQ ID NO: 5, or SEQ ID NO: 6.
[0116] In another embodiment, the vector further comprises a coding sequence operably linked to a nucleic acid regulatory sequence of the invention. In a more specific embodiment, the coding sequence is heterologous to the nucleic acid regulatory sequence of the invention. For example, the coding sequence can encode a peptide or a polypeptide and can comprise, e.g., a reporter gene sequence or a neuroprotective sequence.
[0117] With respect to a reporter gene sequence, such a sequence can encode, for example, β-galactosidase, a fluorescent protein (e.g., a green, red, blue, or cyan fluorescent protein), chloramphenicol acetyltransferase, luciferase or an antigenic marker.
[0118] In another embodiment, the vector further comprises a multiple cloning site (MCS), wherein when a nucleotide sequence is inserted into the MCS, the nucleotide sequence is operably linked to a nucleic acid regulatory sequence of the invention. In yet another embodiment, a vector of the invention can further comprise a coding sequence within the MCS. In yet another embodiment, a vector of the invention can further comprise an internal ribosomal entry site (IRES). The invention further provides that any vector of the invention can also contain a regulatory sequence (e.g., a promoter sequence) in addition to the nucleic acid regulatory sequence of the invention.
[0119] The invention also provides for the enhancement of expression of a nucleotide sequence of interest in a vector containing the nucleotide sequence operably linked to a promoter sequence heterologous to the nucleic acid molecule of the invention. In this regard, in one embodiment, the invention provides a vector comprising a nucleic acid regulatory sequence of the invention, a promoter, and an MCS operably linked in an upstream-to-downstream order, such that when the nucleotide sequence of interest is present within the MCS, expression of the nucleotide sequence of interest is enhanced relative to its expression from the vector in the absence of the nucleic acid regulatory sequence of the invention. In one embodiment, the vector further comprises an IRES. In another embodiment, the vector further comprises a coding sequence within the MCS. In a more specific embodiment, the coding sequence is heterologous to the nucleic acid regulatory sequence of the invention. For example, the coding sequence can encode a peptide or a polypeptide and can comprise, e.g., a reporter gene sequence or a neuroprotective sequence.
[0120] With respect to a reporter gene sequence, such a sequence can encode, for example, β-galactosidase, a fluorescent protein (e.g., a green, red, blue, or cyan fluorescent protein), chloramphenicol acetyltransferase, luciferase or an antigenic marker.
[0121] Any of the vectors of the invention can be adapted for transfer to a eukaryotic host cell, including a human host cell. In a more specific embodiment, the eukaryotic host cell is a nervous system cell. In a more specific embodiment, the nervous system cell is a nervous system cell line, glial cell, astrocyte, oligodendrocyte, mesencephalic neuron, hypothalamic neuron or cortical neuron. In another embodiment, the vectors above are adapted for transfer to a prokaryotic host cell.
[0122] A wide variety of heterologous gene sequences can be expressed under the control of the nucleic acid regulatory sequences of the invention. Such gene sequences include, but are not limited to, sequences encoding neuroprotective gene products, reporter gene products, toxic gene products, potentially toxic gene products, and antiproliferation or cytostatic gene products. Reporter genes can also be expressed, including those encoding enzymes (e.g., chloramphenicol acetyltransferase (CAT), beta-galactosidase, luciferase), light-emitting proteins such as those encoded by luxAB, fluorescent proteins such as a green, red, blue, or cyan fluorescent protein, or antigenic markers.
[0123] A person of skill in the art would understand that the nucleic acid regulatory sequences of the invention can be used to modulate the expression of a gene contained in an expression vector that either possesses or lacks a promoter. Such an expression vector typically possesses a multiple cloning site upstream of the start codon of a gene. The vector may or may not possess a promoter between the MCS and the gene. Where the plasmid lacks a promoter, an increase in the expression of the gene indicates that the cloned genomic fragment has promoter activity, or promoter and enhancer activities. Where the plasmid possesses a promoter, an increase in the expression of the gene indicates that the cloned fragment possesses at least enhancer activity. It will be apparent to one of skill in the art that the genomic fragment may be cloned in either orientation, the method of generating the fragment permitting. For example, genomic fragments generated by DNase I treatment, shearing, or restriction with a single restriction endonuclease may be inserted in either orientation. Fragments generated by filling-in and/or digestion with a single-strand nuclease, thereby generating blunt-ended fragments, can be inserted in either orientation.
Alternatively, directional cloning can be achieved by restriction with a pair of restriction endonucleases, each having a different recognition sequence.
[0124] The genomic fragment representing a regulatory sequence may be inserted in multiple copies upstream of a gene to be expressed, perhaps improving the regulatory activities. Furthermore, the copies of the regulatory sequence or fragment thereof need not be placed adjacent to one another; they may be separated by numerous random nucleotides and still retain their improved regulatory and promotion capability.
[0125] The regulatory sequences and transcription activating fragments thereof of the present invention may be used to induce expression of a heterologous gene in cells derived from the nervous system, such as neurons, including cortical neurons, hippocampal neurons, mesencephalic neurons, medullary neurons, and glial cells. The invention further provides for host cells, or progeny thereof, containing the vectors above. In a more specific embodiment, said host cell is a eukaryotic cell, including a human host cell. In a more specific embodiment, said host cell is a nervous system cell. In another specific embodiment, said host cell is a prokaryotic cell. In cases where such cells are tumor cells, the induction of a cytotoxic product by the regulatory sequences of the present invention may be used as a form of cancer gene therapy. Additionally, antisense, antigene, or aptameric oligonucleotides may be delivered to cells using the presently described expression constructs. Ribozymes or single-stranded RNA can also be expressed in a cell to inhibit the expression of a particular gene of interest. The target genes for these antisense or ribozyme molecules should be those encoding gene products that are essential for cell maintenance.
5.3. GENETICALLY ENGINEERED HOST CELLS
[0126] The regulatory sequences disclosed herein may be inserted into a variety of expression vectors for introduction into host cells. Thus, the invention further provides for host cells, or progeny thereof, containing the vectors above. In a more specific embodiment, said host cell is a eukaryotic cell, including a human host cell. In a more specific embodiment, said host cell is a nervous system cell. In another specific embodiment, said host cell is a prokaryotic cell. In this context, "host cells" means both cells, generally prokaryotic, used to maintain genetic constructs comprising the regulatory sequences of the present invention and a gene of interest that this region controls, as well as cells, generally eukaryotic, in which expression of the gene of interest is desired. In a preferred embodiment, the expression vector or the nucleic acid regulatory sequence of the invention is engineered to be stably integrated into the eukaryotic host cell genome.
[0127] The invention further provides a method of expressing a coding sequence in a host cell in cell culture. In one embodiment, the method comprises culturing a host cell containing a vector of the invention that contains a coding sequence under conditions effective to allow expression of the coding sequence by said host cell. In another embodiment, the method comprises culturing a host cell of the invention wherein the nucleic acid regulatory sequence controls expression of a coding sequence endogenously present in the genome of said host cell, under conditions effective to allow expression of the coding sequence by said host cell.
[0128] The invention also provides a method of producing a peptide or polypeptide comprising maintaining a host cell of the invention that contains a coding sequence that encodes a peptide or polypeptide under conditions effective to allow expression of said coding sequence, and to allow translation of the resulting mRNA, such that a peptide or polypeptide is expressed. In one embodiment, the coding sequence is present as part of a vector of the invention. In another embodiment, the host cell has been engineered such that a nucleic acid regulatory sequence of the invention controls expression of a coding sequence endogenously present in the genome of said host cell, under conditions effective to allow expression of the coding sequence by said host cell. In a more specific embodiment, the vector is present in the genome of said host cell.
[0129] In bacterial systems a number of expression vectors may be advantageously selected depending upon the use intended for the expressed product; the promoter or regulatory sequences contained therein can be replaced by one or more of the regulatory sequences of the present invention, i.e., SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, or transcription regulating sequences thereof. Such vectors include, but are not limited to, the E. coli expression vector pUR278 (Ruther et al, EMBO J. 2:1791 (1983)), in which a coding sequence may be ligated into the vector in frame with the lacZ coding region so that a hybrid protein is produced; pIN vectors (Inouye & Inouye, Nucleic Acids Res. 13:3101-3109 (1985); Van Heeke & Schuster, J. Biol. Chem. 264:5503-5509 (1989)); and the like. pGEX vectors may also be used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In general, such fusion proteins are soluble and can easily be purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione. The pGEX vectors are designed to include thrombin or factor Xa protease cleavage sites so that the cloned polypeptide of interest can be released from the GST moiety.
[0130] In yeast, a number of vectors containing constitutive or inducible promoters can be used, with the promoter replaced by the regulatory sequence of the invention or fragments thereof (CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, Vol. 2, Ed. Ausubel et al, Greene Publish. Assoc. & Wiley Interscience, Ch. 13 (1988); Grant et al, Expression and Secretion Vectors for Yeast, in METHODS IN ENZYMOLOGY, Eds. Wu & Grossman, Acad. Press, N.Y., Vol. 153, pp. 516-544 (1987); Glover, DNA CLONING, Vol. II, IRL Press, Wash., D.C., Ch. 3 (1986); and Bitter, Heterologous Gene Expression in Yeast, METHODS IN ENZYMOLOGY, Eds. Berger & Kimmel, Acad. Press, N.Y., Vol. 152, pp. 673-684 (1987); and THE MOLECULAR BIOLOGY OF THE YEAST SACCHAROMYCES, Eds. Strathern et al, Cold Spring Harbor Press, Vols. I and II (1982)).
[0131] In mammalian host cells, a number of commercially available vectors can be engineered to insert the regulatory sequence of the invention (Clontech, Palo Alto, CA).
[0132] In addition, a host cell strain may be chosen that modulates the expression of the inserted sequences, or modifies and processes the gene product in the specific fashion desired. Such modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may be important for the function of the protein. Appropriate cell lines or host systems can be chosen to ensure the correct modification and processing of the foreign protein expressed. To this end, eukaryotic host cells that possess the cellular machinery for proper processing of the primary transcript, glycosylation, and phosphorylation of the gene product may be used.
[0133] For expression in nervous system-specific host cells, the host cells may be derived from the nervous system itself, and grown in culture, or may be established neuronal or neuron-like cell lines. In reference to neuronal cell lines, many neuronal clones exist which have been used extensively as model systems of development since they retain electrophysiological activity with appropriate surface receptors, specific neurotransmitters, synapse forming properties and the ability to differentiate morphologically and biochemically into normal neurons. Such cells are described in the following references: Kimhi et al, Proc. Natl. Acad. Sci. USA 73:462-466 (1976); In: EXCITABLE CELLS IN TISSUE CULTURE, Nelson, P. G. et al, eds., Plenum Press, New York, pp. 173-245 (1977); Prasad, K. M. et al, In: CONTROL OF PROLIFERATION OF ANIMAL CELLS, Clarkson, B.
et al, eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 581-594 (1974); Puro et al, Proc. Natl. Acad. Sci. USA 73:3544-3548 (1976); Notter et al, Devel. Brain Res. 26:59-68 (1986); Schubert et al, Proc. Natl. Acad. Sci. USA 67:247-254 (1970); Kaplan et al, In: BASIC AND CLINICAL ASPECTS OF MOLECULAR NEUROBIOLOGY, Guffrida-Stella, A. M. et al, eds., Milano Fondozione International Manarini (1982); see also U.S. Pat. No. 6,020,197 (describing methods of culturing neuroblasts).
[0134] The expression vectors that contain the nucleic acid regulatory sequences of the invention may contain a gene encoding a selectable marker. A number of selection systems may be used, including but not limited to the herpes simplex virus thymidine kinase (Wigler et al, Cell 11:223 (1977)), hypoxanthine-guanine phosphoribosyltransferase (Szybalska & Szybalski, Proc. Natl. Acad. Sci. USA 48:2026 (1962)), and adenine phosphoribosyltransferase (Lowy et al, Cell 22:817 (1980)) genes, which can be employed in tk-, hgprt- or aprt- cells, respectively. Also, antimetabolite resistance can be used as the basis of selection for the following genes: dhfr, which confers resistance to methotrexate (Wigler et al, Proc. Natl. Acad. Sci. USA 77:3567 (1980); O'Hare et al, Proc. Natl. Acad. Sci. USA 78:1527 (1981)); gpt, which confers resistance to mycophenolic acid (Mulligan & Berg, Proc. Natl. Acad. Sci. USA 78:2072 (1981)); neo, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin et al, J. Mol. Biol. 150:1 (1981)); and hygro, which confers resistance to hygromycin (Santerre, et al, Gene 30:147 (1984)). Additional selectable genes include trpB, which allows cells to utilize indole in place of tryptophan; hisD, which allows cells to utilize histidinol in place of histidine (Hartman & Mulligan, Proc. Natl. Acad. Sci. USA 85:8047 (1988)); ODC (ornithine decarboxylase), which confers resistance to the ornithine decarboxylase inhibitor, 2-(difluoromethyl)-DL-ornithine, DFMO (McConlogue, In: CURRENT COMMUNICATIONS IN MOLECULAR BIOLOGY, 1987, Cold Spring Harbor Laboratory ed.); and glutamine synthetase (Bebbington et al, Biotech 10:169 (1992)).
[0135] Introduction of the nucleic acid, comprising the nucleic acid regulatory sequence and, optionally, the coding sequence to be expressed, into the cell is accomplished by such methods as electroporation, lipofection, calcium phosphate mediated transfection, viral infection, cell fusion, chromosome-mediated gene transfer, microcell-mediated gene transfer, spheroplast fusion, etc. Numerous techniques are known in the art for the introduction of foreign genes into cells (see, e.g., Loeffler and Behr, Meth. Enzymol. 217:599-618 (1993); Cohen et al, Meth. Enzymol. 217:618-644 (1993); Cline, Pharmac. Ther. 29:69-92 (1985)) and may be used in accordance with the present invention, provided that the necessary developmental and physiological functions of the recipient cells are not disrupted. The chosen technique preferably provides for the stable transfer of the nucleic acid to the cell, so that the nucleic acid is expressible by the cell and is heritable and expressible by its cell progeny.
5.4. SCREENING FOR MODULATORS
[0136] The invention also provides a method of identifying a modulator of a nucleic acid regulatory sequence of the invention, comprising: (a) contacting a host cell containing a nucleic acid regulatory sequence of the invention operably linked to a reporter gene sequence with a test compound; and (b) assaying expression of the reporter gene, such that, if a change in reporter gene expression relative to its expression in the absence of the test compound is detected, a modulator of the nucleic acid regulatory sequence is identified. In a particular embodiment, the host cell is a nervous system cell.
[0137] In a specific embodiment of the invention, the genetically-engineered cell lines of Section 5.3., supra, may be used to screen for peptides, polypeptides, small molecules, natural and synthetic compounds or other cell bound or soluble molecules that cause a stimulation or inhibition of transcriptional activities of the regulatory sequences of the invention. Such compounds may, for example, be used to control gene expression in cells in vitro that is mediated by a regulatory sequence of the present invention.
[0138] Random peptide libraries consisting of all possible combinations of amino acids attached to a solid phase support may be used to identify peptides that are able to activate or inhibit the activities of the regulatory sequences of the invention (Lam et al, Nature 354:82-84 (1991)). The screening of peptide libraries may have therapeutic value in the discovery of pharmaceutical agents that stimulate or inhibit gene expression mediated or controlled by one or more of the regulatory sequences of the invention. In addition, combinatorial chemistry libraries can also be screened.
[0139] An example of an in vitro screening assay is described below. About 10,000 cells per well are plated in 96-well plates in a total volume of 100 μl, using medium appropriate for each cell line.
A reporter plasmid is used or constructed whereby the expression of a gene for luciferase is placed under the control of one or more of the regulatory sequences of the invention. On the following day, this reporter plasmid is transfected into the cells, using 50 ng plasmid per well in the presence of LipofectAmine cationic lipid transfection reagent (Gibco) at 16 μg/ml. Final volume of the transfection mix is 100 μl. Potential inhibitors of gene expression controlled by one or more of the regulatory sequences of the invention can also be added to the cells at this time. The effect of such inhibitors can be determined by measuring the response of the luciferase reporter gene driven by the regulatory sequence(s). After 6 hr. incubation, 100 μl DMEM medium + 2.5% fetal bovine serum (FBS) is added to the cells to give a 1.25% final serum concentration, and the cells are incubated for a total of 24 hr (18 hr more). At 24 hr, the plates are washed with PBS, blot dried, and frozen at -80°C. The plates are thawed the next day and 200 μg luciferin (LucLite, Packard) reagent is added to each well. The plates are counted in a TopCount scintillation counter to determine RLU (relative luciferase units). In the above assay, the reporter can also be a fluorescent protein such as green fluorescent protein (GFP). This assay can easily be set up in a high-throughput screening mode for evaluation of compound libraries in a 96-well format.
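The readout of the assay above can be sketched as a simple calculation. The following is only an illustrative sketch: the function names, the two-fold threshold, the example RLU values, and the assumption that the 100 μl transfection mix is serum-free are all hypothetical and are not stated in the text.

```python
from statistics import mean


def final_concentration(vol_a, conc_a, vol_b, conc_b):
    """Concentration after mixing two volumes (same units for both volumes)."""
    return (vol_a * conc_a + vol_b * conc_b) / (vol_a + vol_b)


def fold_change(treated_rlu, control_rlu):
    """Mean RLU of compound-treated wells divided by mean RLU of untreated wells."""
    return mean(treated_rlu) / mean(control_rlu)


def classify(treated_rlu, control_rlu, threshold=2.0):
    """Label a test compound; the 2-fold threshold is an assumed cutoff."""
    fc = fold_change(treated_rlu, control_rlu)
    if fc >= threshold:
        return "activator"
    if fc <= 1.0 / threshold:
        return "inhibitor"
    return "no call"


# Assuming a serum-free 100 ul transfection mix plus 100 ul medium with 2.5% FBS.
print(final_concentration(100, 0.0, 100, 2.5))

# Hypothetical replicate RLU readings for one compound and the untreated control.
print(classify([30, 40, 50], [100, 110, 90]))
```

In a 96-well screen the same comparison would be repeated per compound, with the control wells taken from the same plate to limit plate-to-plate variation.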
5.5. MODIFICATION OF GENE EXPRESSION
5.5.1. MODIFICATION OF REGULATORY SEQUENCE-CONTROLLED GENE EXPRESSION
[0140] Under certain circumstances, it is desirable to modify the expression of a gene controlled in cis by the regulatory sequences of the invention. This modification can constitute increasing the activity of the regulatory sequences, or inhibiting their activity. Thus, the invention provides means for promoting or increasing the activity of the regulatory sequences, and thereby increasing or promoting the expression of a gene or genes controlled by one or more sequences of the invention. The invention further provides for inhibiting the regulatory activity of the regulatory sequences, and thereby inhibiting the expression of a gene or genes controlled by one or more sequences of the invention.
[0141] The endogenous counterparts of the regulatory sequences of the invention may be targeted to specifically down-regulate expression of the genes under their control. For example, oligonucleotides complementary to the regulatory sequences may be designed and delivered to cells that contain a gene under the control of a regulatory sequence of the present invention. Such oligonucleotides anneal to the regulatory sequence, and prevent activation of transcription. Alternatively, the regulatory sequence or portions thereof may be delivered to cells in saturating concentrations to compete for transcription factor binding. For general reviews of the methods of gene therapy, see Goldspiel et al, Clinical Pharmacy 12:488-505 (1993); Wu and Wu, Biotherapy 3:87-95 (1991); Tolstoshev, Ann. Rev. Pharmacol. Toxicol. 32:573-596 (1993); Mulligan, Science
260:926-932 (1993); and Morgan and Anderson, Ann. Rev. Biochem. 62:191-217 (1993); also TIBTECH 11(5):155-215 (1993). Methods commonly known in the art of recombinant DNA technology that can be used are described in Ausubel et al. (eds.), 1993, CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, NY; and Kriegler, 1990, GENE TRANSFER AND EXPRESSION, A LABORATORY MANUAL, Stockton Press, NY.
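The design of oligonucleotides complementary to a regulatory sequence, as described in paragraph [0141], can be sketched as taking the reverse complement of successive windows of the target strand. This is a minimal sketch only: the function names, the window length, and the 10-nucleotide example target are hypothetical, and no claim is made about which windows would work in practice.

```python
# Translation table for Watson-Crick complements of the four DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def complementary_oligos(target, length):
    """Oligos (5' to 3') complementary to each window of `length` nt in `target`."""
    return [
        reverse_complement(target[i:i + length])
        for i in range(len(target) - length + 1)
    ]


# Hypothetical 10-nt target region, not one of the SEQ ID NOs.
target = "ATGCGTACCT"
for oligo in complementary_oligos(target, 8):
    print(oligo)
```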
[0142] In a specific embodiment, the nucleic acid is directly administered in vivo into a target cell. This can be accomplished by any methods known in the art, e.g., by constructing it as part of an appropriate nucleic acid expression vector and administering it so that it becomes intracellular, e.g., by infection using a defective or attenuated retroviral or other viral vector (see U.S. Patent No. 4,980,286), by direct injection of naked DNA, by use of microparticle bombardment (e.g., a gene gun; Biolistic, Dupont), by coating with lipids or cell-surface receptors or transfecting agents, by encapsulation in liposomes, microparticles, or microcapsules, by administering it in linkage to a peptide known to enter the nucleus, or by administering it in linkage to a ligand subject to receptor-mediated endocytosis (see, e.g., Wu and Wu, 1987, J. Biol. Chem. 262:4429-4432), which can be used to target cell types specifically expressing the receptors. In another embodiment, a nucleic acid-ligand complex can be formed in which the ligand comprises a fusogenic viral peptide to disrupt endosomes, allowing the nucleic acid to avoid lysosomal degradation. In yet another embodiment, the nucleic acid can be targeted in vivo for cell specific uptake and expression, by targeting a specific receptor
(see, e.g., PCT Publications WO 92/06180, published April 16, 1992; WO 92/22635, published December 23, 1992; WO 92/20316, published November 26, 1992; WO 93/14188, published July 22, 1993; WO 93/20221, published October 14, 1993). Alternatively, the nucleic acid can be introduced intracellularly and incorporated within host cell DNA for expression, by homologous recombination (Koller and Smithies, Proc. Natl. Acad. Sci. USA 86:8932-8935 (1989); Zijlstra et al, Nature 342:435-438 (1989)).
[0143] The oligonucleotide may comprise at least one modified base moiety which is selected from the group including, but not limited to, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine.
[0144] Endogenous target gene expression can also be reduced by inactivating or "knocking out" a regulatory sequence using targeted homologous recombination (e.g., see Smithies, et al, Nature 317:230-234 (1985); Thomas and Capecchi, Cell 51:503-512 (1987); Thompson et al, Cell 5:313-321 (1989); each of which is incorporated by reference herein in its entirety).
For example, a non-functional target sequence (or a completely unrelated DNA sequence) flanked by DNA homologous to the specific regulatory sequence can be used, with or without a selectable marker and/or a negative selectable marker, to transfect cells that express the target gene in vivo. Insertion of the DNA construct, via targeted homologous recombination, results in inactivation of the specific regulatory sequence (Chappel, 1993, U.S. Patent No. 5,272,071). This approach can be adapted for use in humans provided the recombinant DNA constructs are directly administered or targeted to the required site in vivo using appropriate vectors.
[0145] Alternatively, endogenous target gene expression can be reduced by targeting deoxyribonucleotide sequences complementary to the regulatory sequence of the target gene (i.e., the target gene promoter and/or enhancers) to form triple helical structures that prevent transcription of the target gene in target cells in the body (see generally, Helene, Anticancer Drug Des. 6(6):569-584 (1991); Helene et al., Ann. N.Y. Acad. Sci. 660:27-36 (1992); and Maher, Bioessays 14(12):807-815 (1992)).
[0146] Nucleic acid molecules to be used in triple helix formation for the inhibition of transcription should be single stranded and composed of deoxynucleotides. The base composition of these oligonucleotides must be designed to promote triple helix formation via Hoogsteen base pairing rules, which generally require sizeable stretches of either purines or pyrimidines to be present on one strand of a duplex. Nucleotide sequences may be pyrimidine-based, which will result in TAT and CGC triplets across the three associated strands of the resulting triple helix. The pyrimidine-rich molecules provide base complementarity to a purine-rich region of a single strand of the duplex in a parallel orientation to that strand. In addition, nucleic acid molecules may be chosen that are purine-rich, for example, molecules that contain a stretch of G residues. These molecules will form a triple helix with a DNA duplex that is rich in GC pairs, in which the majority of the purine residues are located on a single strand of the targeted duplex, resulting in GGC triplets across the three strands in the triplex.
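By way of illustration only, the requirement for a sizeable purine or pyrimidine stretch on one strand of the duplex can be screened for computationally. The following is a minimal sketch, not a method disclosed herein: the function name find_homopolymer_tracts and the default minimum tract length of 15 nucleotides are assumptions chosen for the example, since the text above specifies only that the stretch be "sizeable."

```python
import re


def find_homopolymer_tracts(seq, min_len=15):
    """Return sorted (start, end, kind, tract) tuples for runs made up
    only of purines (A/G) or only of pyrimidines (C/T) on the given strand.

    min_len is an illustrative threshold, not a value from the specification.
    Coordinates are 0-based, end-exclusive.
    """
    seq = seq.upper()
    tracts = []
    for kind, base_class in (("purine", "[AG]"), ("pyrimidine", "[CT]")):
        # Builds a pattern such as "[AG]{15,}": a run of at least min_len bases.
        pattern = f"{base_class}{{{min_len},}}"
        for match in re.finditer(pattern, seq):
            tracts.append((match.start(), match.end(), kind, match.group()))
    return sorted(tracts)


if __name__ == "__main__":
    # Hypothetical regulatory fragment containing one 10-nt purine tract.
    fragment = "CTCT" + "AGGAGAAGGA" + "TCAC"
    for tract in find_homopolymer_tracts(fragment, min_len=8):
        print(tract)
```

A tract reported as "purine" marks a candidate site for a pyrimidine-rich third strand; a tract reported as "pyrimidine" indicates that the purine-rich target lies on the complementary strand.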
[0147] Alternatively, the potential sequences that can be targeted for triple helix formation may be increased by creating a so-called "switchback" nucleic acid molecule. Switchback molecules are synthesized in an alternating 5'-3', 3'-5' manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.
[0148] The anti-sense RNA and DNA molecules and triple helix molecules of the invention may be prepared by any method known in the art for the synthesis of nucleic acid molecules. These include techniques for chemically synthesizing oligodeoxyribonucleotides well known in the art such as, for example, solid phase phosphoramidite chemical synthesis. Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA sequences encoding the RNA molecule. Alternatively, antisense cDNA constructs that synthesize antisense RNA constitutively or inducibly, depending on the promoter used, can be introduced stably into cell lines.
[0149] Various modifications to the DNA molecules may be introduced as a means of increasing intracellular stability and half-life. Possible modifications include, but are not limited to, the addition of flanking sequences of ribo- or deoxy-nucleotides to the 5' and/or 3' ends of the molecule or the use of phosphorothioate or 2' O-methyl rather than phosphodiester linkages within the oligodeoxyribonucleotide backbone.
5.5.2. MODIFICATION OF EXPRESSION OF NON-cis-LINKED GENES USING REGULATORY SEQUENCES OF THE INVENTION
[0150] The expression of genes not operably linked to one of the disclosed regulatory sequences can be accomplished by use of antisense nucleic acids. In this regard, the regulatory sequences promote or enhance the expression of a nucleotide sequence that has exact or substantial complementarity to a gene whose expression is to be down regulated. Alternatively, downregulation of non-cis-linked genes by a regulatory sequence of the invention may be accomplished by using the regulatory sequence to drive the production of mRNA that folds into a ribozyme, which is able to cleave the mRNA produced by the gene whose downregulation is sought. Antisense RNA and DNA molecules act to directly block the translation of mRNA by hybridizing to targeted mRNA and preventing protein translation. Antisense approaches involve the design of oligonucleotides which are complementary to a protective sequence mRNA. The antisense oligonucleotides will bind to the complementary sequence in mRNA transcripts and prevent translation. Absolute complementarity, although preferred, is not required.
[0151] A sequence "complementary" to a portion of an RNA, as referred to herein, means a sequence having sufficient complementarity to be able to hybridize with the RNA, forming a stable duplex; in the case of double-stranded antisense nucleic acids, a single strand of the duplex DNA may thus be tested, or triplex formation may be assayed. The ability to hybridize will depend on both the degree of complementarity and the length of the antisense nucleic acid. Generally, the longer the hybridizing nucleic acid, the more base mismatches with an RNA it may contain and still form a stable duplex (or triplex, as the case may be). One skilled in the art can ascertain a tolerable degree of mismatch by use of standard procedures to determine the melting point of the hybridized complex.
[0152] In one embodiment, oligonucleotides complementary to non-coding regions of a gene to be downregulated could be used in an antisense approach to inhibit translation of endogenous mRNA. Antisense nucleic acids should be at least six nucleotides in length, and are preferably oligonucleotides ranging from 6 to about 50 nucleotides in length. In specific aspects, the oligonucleotide is at least 10 nucleotides, at least 17 nucleotides, at least 25 nucleotides or at least 50 nucleotides.
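As an illustrative sketch only, the relationship between a target mRNA segment, its antisense oligodeoxynucleotide, the length limits stated above, and a rough melting-point estimate can be expressed as follows. The function names are hypothetical. The melting estimate uses the Wallace rule (2 °C per A/T and 4 °C per G/C), a rule of thumb for short oligonucleotides that is not taken from this specification and that ignores salt concentration, mismatches, and RNA-DNA hybrid effects.

```python
# Maps each mRNA (or cDNA) base to its DNA complement.
_DNA_COMPLEMENT = str.maketrans("ACGUT", "TGCAA")


def antisense_dna(mrna_segment):
    """Return the antisense oligodeoxynucleotide (5'->3') for an mRNA segment."""
    return mrna_segment.upper().translate(_DNA_COMPLEMENT)[::-1]


def length_ok(oligo, minimum=6, maximum=50):
    """Check the 6 to about 50 nucleotide range given in the text."""
    return minimum <= len(oligo) <= maximum


def wallace_tm(oligo):
    """Rough melting temperature in deg C: 2*(A+T) + 4*(G+C) (Wallace rule)."""
    oligo = oligo.upper()
    at = sum(oligo.count(base) for base in "ATU")
    gc = sum(oligo.count(base) for base in "GC")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    target = "AUGGCC"  # hypothetical six-nucleotide target segment
    oligo = antisense_dna(target)
    print(oligo, length_ok(oligo), wallace_tm(oligo))
```

An experimentally determined melting point, as described in paragraph [0151], remains the appropriate way to establish the tolerable degree of mismatch; the estimate here is only a first-pass filter.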
[0153] Regardless of the choice of target sequence, it is preferred that in vitro studies are first performed to quantitate the ability of the antisense oligonucleotide to inhibit protective sequence expression. It is preferred that these studies utilize controls that distinguish between antisense gene inhibition and nonspecific biological effects of oligonucleotides. It is also preferred that these studies compare levels of the cerebral RNA or protein with that of an internal control RNA or protein. Additionally, it is envisioned that results obtained using the antisense oligonucleotide are compared with those obtained using a control oligonucleotide. It is preferred that the control oligonucleotide is of approximately the same length as the test oligonucleotide and that the nucleic acid of the oligonucleotide differs from the antisense sequence no more than is necessary to prevent specific hybridization to the target sequence. [0154] The oligonucleotides can be DNA or RNA or chimeric mixtures or derivatives or modified versions thereof, single-stranded or double-stranded. The oligonucleotide can be modified at the base moiety, sugar moiety, or phosphate backbone, for example, to improve stability of the molecule, hybridization, etc. The oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger, et al, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556 (1988); Lemaitre, et al, Proc. Natl. Acad. Sci. U.S.A. 84:648-652 (1987); U.S. Patent No. 4,904,582) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134, published April 25, 1988), hybridization-triggered cleavage agents (see, e.g., Krol et al, BioTechniques 6:958-976 (1988)) or intercalating agents (see, e.g., Zon, Pharm. Res. 5:539-549 (1988)). 
To this end, the oligonucleotide may be conjugated to another molecule, e.g., a peptide, hybridization triggered cross-linking agent, transport agent, hybridization-triggered cleavage agent, etc.
[0155] The antisense oligonucleotide may comprise at least one modified base moiety which is selected from the group including but not limited to 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine.
[0156] The antisense oligonucleotide may also comprise at least one modified sugar moiety selected from the group including but not limited to arabinose, 2-fluoroarabinose, xylulose, and hexose.
[0157] In yet another embodiment, the antisense oligonucleotide comprises at least one modified phosphate backbone selected from the group consisting of a phosphorothioate, a phosphorodithioate, a phosphoramidothioate, a phosphoramidate, a phosphordiamidate, a methylphosphonate, an alkyl phosphotriester, and a formacetal or analog thereof.
[0158] In yet another embodiment, the antisense oligonucleotide is an α-anomeric oligonucleotide. An α-anomeric oligonucleotide forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gautier et al., Nucl. Acids Res. 15:6625-6641 (1987)). The oligonucleotide is a 2'-O-methylribonucleotide (Inoue et al., Nucl. Acids Res. 15:6131-6148 (1987)), or a chimeric RNA-DNA analogue (Inoue et al., FEBS Lett. 215:327-330 (1987)).
[0159] Oligonucleotides of the invention may be synthesized by standard methods known in the art, e.g., by use of an automated DNA synthesizer (such as are commercially available from Biosearch, Applied Biosystems, etc.). As examples, phosphorothioate oligonucleotides may be synthesized by the method of Stein et al. (Nucl. Acids Res. 16:3209 (1988)), methylphosphonate oligonucleotides can be prepared by use of controlled pore glass polymer supports (Sarin et al., Proc. Natl. Acad. Sci. U.S.A. 85:7448-7451 (1988)), etc.
[0160] While antisense nucleotides complementary to the coding region sequence of the gene to be downregulated are useful, antisense nucleotides complementary to the transcribed, untranslated region are most preferred.
[0161] Antisense molecules should be delivered to cells that express the gene to be downregulated in vivo. A number of methods have been developed for delivering antisense DNA or RNA to cells; e.g., antisense molecules can be injected directly into the tissue site, or modified antisense molecules, designed to target the desired cells (e.g., antisense linked to peptides or antibodies which specifically bind receptors or antigens expressed on the target cell surface) can be administered systemically.
[0162] A preferred approach to achieve intracellular concentrations of the antisense sufficient to suppress translation of endogenous mRNAs utilizes a recombinant DNA construct in which the antisense oligonucleotide is placed under the control of a strong promoter. The use of such a construct to transfect target cells in a patient will result in the transcription of sufficient amounts of single stranded RNAs which will form complementary base pairs with the endogenous protective sequence transcripts and thereby prevent translation of the protective sequence mRNA. For example, a vector can be introduced such that it is taken up by a cell and directs the transcription of an antisense RNA. Such a vector can remain episomal or become chromosomally integrated, as long as it can be transcribed to produce the desired antisense RNA. Such vectors can be constructed by recombinant DNA technology methods standard in the art. Vectors can be plasmid, viral, or others known in the art, used for replication and expression in mammalian cells. Any type of plasmid, cosmid, YAC or viral vector can be used to prepare the recombinant DNA construct that can be introduced directly into the tissue site. Alternatively, viral vectors can be used that selectively infect the desired tissue, in which case administration may be accomplished by another route (e.g., systemically).
[0163] Ribozyme molecules designed to catalytically cleave target gene mRNA transcripts can also be used to prevent translation of target gene mRNA and, therefore, expression of target gene product (see, e.g., PCT International Publication WO 90/11364, published October 4, 1990; Sarver et al., Science 247:1222-1225 (1990)).
[0164] Ribozymes are enzymatic RNA molecules capable of catalyzing the specific cleavage of RNA (for a review, see Rossi, Current Biology 4:469-471 (1994)). The mechanism of ribozyme action involves sequence specific hybridization of the ribozyme molecule to complementary target RNA, followed by an endonucleolytic cleavage event. The composition of ribozyme molecules must include one or more sequences complementary to the target gene mRNA, and must include the well known catalytic sequence responsible for mRNA cleavage. For this sequence, see, e.g., U.S. Patent No. 5,093,246, which is incorporated herein by reference in its entirety.
[0165] While ribozymes that cleave mRNA at site-specific recognition sequences can be used to destroy target gene mRNAs, the use of hammerhead ribozymes is preferred. Hammerhead ribozymes cleave mRNAs at locations dictated by flanking regions which form complementary base pairs with the target mRNA. The sole requirement is that the target mRNA have the following sequence of two bases: 5'-UG-3'. The construction and production of hammerhead ribozymes is well known in the art and is described more fully in Myers, MOLECULAR BIOLOGY AND BIOTECHNOLOGY: A COMPREHENSIVE DESK REFERENCE, VCH Publishers, New York (1995) (see especially FIG. 4, page 833) and in Haseloff and Gerlach, Nature 334:585-591 (1988), which is incorporated herein by reference in its entirety.
[0166] Preferably the ribozyme is engineered so that the cleavage recognition site is located near the 5' end of the target gene mRNA, i.e., to increase efficiency and minimize the intracellular accumulation of non-functional mRNA transcripts.
[0167] The ribozymes of the present invention also include RNA endoribonucleases (hereinafter "Cech-type ribozymes") such as the one which occurs naturally in Tetrahymena thermophila (known as the IVS, or L-19 IVS RNA) and which has been extensively described by Thomas Cech and collaborators (Zaug et al., Science 224:574-578 (1984); Zaug & Cech, Science 231:470-475 (1986); Zaug et al., Nature 324:429-433 (1986); U.S. Patent No. 4,987,071; Been & Cech, Cell 47:207-216 (1986)). The Cech-type ribozymes have an eight nucleotide active site that hybridizes to a target RNA sequence, after which cleavage of the target RNA takes place. The invention encompasses those Cech-type ribozymes that target eight nucleotide active site sequences that are present in the target gene.
[0168] As in the antisense approach, the ribozymes can be composed of modified oligonucleotides (e.g., for improved stability, targeting, etc.) and should be delivered to cells that express the target gene in vivo. A preferred method of delivery involves using a DNA construct "encoding" the ribozyme under the control of a strong constitutive promoter, so that transfected cells will produce sufficient quantities of the ribozyme to destroy endogenous target gene messages and inhibit translation. Because ribozymes, unlike antisense molecules, are catalytic, a lower intracellular concentration is required for efficiency.
5.6. TRANSGENIC ANIMALS
[0169] The nucleic acid regulatory sequences of the invention can be used to direct expression of a coding sequence in animals by transgenic technology. Animals of any species, including, but not limited to, mice, rats, rabbits, guinea pigs, pigs, micro-pigs, goats, sheep, and non-human primates, e.g., baboons, monkeys, and chimpanzees may be used to generate transgenic animals. The term "transgenic," as used herein, refers to animals expressing coding sequences from a different species (e.g., mice expressing human gene sequences), as well as animals that have been genetically engineered to overexpress endogenous (i.e., same species) sequences or animals that have been genetically engineered to no longer express endogenous gene sequences (i.e., "knock-out" animals), and their progeny.
[0170] Any technique known in the art may be used to introduce a transgene under the control of a regulatory sequence of the invention into animals to produce the founder lines of transgenic animals. Such techniques include, but are not limited to, pronuclear microinjection (Hoppe and Wagner, U.S. Patent No. 4,873,191); retrovirus-mediated gene transfer into germ lines (Van der Putten et al., Proc. Natl. Acad. Sci. USA 82:6148-6152 (1985)); gene targeting in embryonic stem cells (Thompson et al., Cell 56:313-321 (1989)); electroporation of embryos (Lo, Mol. Cell Biol. 3:1803-1814 (1983)); and sperm-mediated gene transfer (Lavitrano et al., Cell 57:717-723 (1989)) (see also Gordon, Transgenic Animals, Intl. Rev. Cytol. 115:171-229 (1989)).
[0171] Any technique known in the art may be used to produce transgenic animal clones containing a transgene, for example, nuclear transfer into enucleated oocytes of nuclei from cultured embryonic, fetal or adult cells induced to quiescence (Campbell et al., Nature 380:64-66 (1996); Wilmut et al., Nature 385:810-813 (1997)).
[0172] The present invention provides for transgenic animals that carry a transgene such as a reporter gene under the control of a regulatory sequence of the invention or transcription modulating sequences thereof in all their cells, as well as animals that carry the transgene in some, but not all their cells, i.e., mosaic animals. The transgene may be integrated as a single transgene or in concatamers, e.g., head-to-head tandems or head-to-tail tandems. The transgene may also be selectively introduced into and activated in a particular cell type by following, for example, the teaching of Lasko et al. (Proc. Natl. Acad. Sci. U.S.A. 89:6232-6236 (1992)). In one embodiment, the expression characteristics of an endogenous gene within a cell, cell line or microorganism may be modified by inserting a regulatory sequence of the invention or transcription modulating sequence thereof, into the genome of a cell, stable cell line or cloned microorganism, by nonhomologous recombination, such that the inserted regulatory element is operatively linked with the endogenous gene and controls, modulates or activates the endogenous gene. For example, endogenous genes that are normally "transcriptionally silent," i.e., genes that are normally not expressed, or are expressed only at very low levels in a cell line or microorganism, may be activated by inserting a regulatory sequence of the invention, or transcription activating sequence thereof which is capable of promoting the expression of a normally expressed gene product in that cell line or microorganism.
[0173] A heterologous regulatory element may be inserted into a stable cell line or cloned microorganism, such that it is operatively linked with and activates expression of endogenous genes, using techniques, such as targeted homologous recombination, which are well known to those of skill in the art, and described, e.g., in Chappel, U.S. Pat. No. 5,272,071; PCT publication No. WO 91/06667, published May 16, 1991; Skoultchi, U.S. Pat. No. 5,981,214; Treco et al., U.S. Pat. No. 5,968,502 and PCT publication No. WO 94/12650, published June 9, 1994. Alternatively, non-targeted (e.g., non-homologous) recombination techniques which are well known to those of skill in the art and described, e.g., in PCT publication No. WO 99/15650, published April 1, 1999, may be used.
[0174] Once transgenic animals have been generated, the transcriptional activities of the specific regulatory sequence may be assayed utilizing standard techniques. Initial screening may be accomplished by Southern blot analysis or PCR techniques to analyze animal tissues to assay whether integration of the transgene has taken place. The level of mRNA expression of the transgene in the tissues of the transgenic animals may also be assessed using techniques that include, but are not limited to, northern blot analysis of tissue samples obtained from the animal, in situ hybridization analysis, and RT-PCR. Samples of transgene-expressing tissue may also be evaluated immunocytochemically using antibodies specific for the transgene product. Such animals may be used as an in vivo system for the screening of agents that activate or inhibit the activities of the regulatory sequence.
5.7. THERAPEUTICS AND DIAGNOSTICS
5.7.1. THERAPEUTIC USES OF REGULATORY SEQUENCES
[0175] DNA sequences that regulate cell-, tissue- or organ-specific transcription may be used therapeutically or prophylactically. Such sequences can be inserted into a DNA vector and used to control cell-, tissue-, region- or nervous system-specific transcription of an introduced gene or DNA sequence, or an antisense form of a gene, in order to alter the expression of endogenous cellular genes, or to cause expression of factors (e.g., secreted cytokines) that will alter the properties of other cells. For example, it may be possible to use neuron-specific regulatory sequences to express the antisense forms of factors responsible for the excess process outgrowth in neurons that is associated with epilepsy. Alternatively, categories of genes associated with nerve regeneration can be placed under control of inducible promoters associated with regions of DNA that regulate neuron-specific expression. Other applications may be the prophylactic or therapeutic expression of factors that would confer resistance to the effects of chronic infectious agents such as viruses or bacteria that harm cells in the CNS. For example, synthetic antisense molecules (e.g., phosphorothioate oligodeoxynucleotides) are known to suppress HIV infection in vitro, but toxicity has prevented these compounds from progressing through clinical trials. Since HIV infection can affect the CNS, it may be possible to replace damaged nervous system tissue with nervous system stem cells stably expressing an antisense RNA against HIV mRNAs under the control of a neuron-specific regulatory sequence.
[0176] Antisense nucleic acids expressed under the control of the regulatory sequences of the present invention can be used to treat disorders of a cell type that expresses, or preferably overexpresses, the particular mRNA to which the antisense nucleic acid is directed. In a specific embodiment, such a disorder is an overexpression of a neurotransmitter. In a preferred embodiment, a single-stranded DNA antisense TCAP oligonucleotide is used.
[0177] Cell types which express or overexpress a particular mRNA can be identified by various methods known in the art. Such methods include but are not limited to hybridization with a nucleic acid to the gene of interest (e.g., by northern hybridization, dot blot hybridization, in situ hybridization), observing the ability of RNA from the cell type to be translated in vitro into the specific protein produced by the gene, immunoassay, etc. In a preferred aspect, primary tissue from a patient can be assayed for protein expression prior to treatment, e.g., by immunocytochemistry or in situ hybridization.
[0178] The amount of antisense nucleic acid that will be effective in the treatment of a particular disorder or condition will depend on the nature of the disorder or condition, and can be determined by standard clinical techniques. Where possible, it is desirable to determine the antisense cytotoxicity to the cell type to be treated in vitro, and then in useful animal model systems prior to testing and use in humans.
[0179] In a specific embodiment, pharmaceutical compositions comprising antisense nucleic acids are administered via liposomes, microparticles, or microcapsules. In various embodiments of the invention, it may be useful to use such compositions to achieve sustained release of the antisense nucleic acids. In a specific embodiment, it may be desirable to utilize liposomes targeted via antibodies to specific identifiable tumor antigens (Leonetti et al., Proc. Natl. Acad. Sci. U.S.A. 87:2448-2451 (1990); Renneisen et al., J. Biol. Chem. 265:16337-16342 (1990)).
5.7.2. DIAGNOSTIC USES OF NUCLEIC ACID REGULATORY SEQUENCES
[0180] The nucleotide sequences described herein may also be used as diagnostic tools, where a particular condition or disease state is correlated with polymorphisms among individuals in the CNI-01142, CNI-01080, CNI-01104, CNI-01120, CNI-01125, or CNI-01131 regulatory sequence. Sequence polymorphisms are the DNA sequence variations that occur between different individuals at the same genetic loci. Polymorphisms can be single nucleotide polymorphisms (SNPs), as well as larger-scale sequence deletions, insertions, or inversions that vary between individuals. Sequence polymorphisms that occur within regulatory DNA sequence can alter the relative levels of gene expression, which in turn can result in a disease condition, susceptibility to a disease, or alter the response of an individual to drug prophylaxis, drug therapy, or other medical treatments. Thus, identifying regulatory sequences and the sequence polymorphisms that occur within them can be used to diagnose a disease or condition, predict the likelihood of developing a disease condition or susceptibility to a condition, predict the likelihood of transmitting an inheritable susceptibility to offspring, or predict the responses of individuals to drug prophylaxis, drug therapies, or other medical treatments.
[0181] Methods for detecting SNPs are well known in the art, and generally rely on differential hybridization, i.e., the ability to distinguish between a nucleic acid with full complementarity to a regulatory sequence and a nucleic acid with a single mismatch. The methods can either involve a simple determination of hybridization or lack thereof, or can involve a determination of failure of PCR to produce a product, where the mismatched primer is designed to be mismatched at the more critical 3' end of the primer. Conventional techniques for detecting SNPs include, e.g., conventional dot blot analysis, single stranded conformational polymorphism (SSCP) analysis (see, e.g., Orita et al., Proc. Natl. Acad. Sci. USA 86:2766-2770 (1989)), denaturing gradient gel electrophoresis (DGGE), heteroduplex analysis, mismatch cleavage detection, and other routine techniques well known in the art (see, e.g., Sheffield et al., Proc. Natl. Acad. Sci. U.S.A. 86:5855-5892 (1989); Grompe, Nature Genetics 5:111-117 (1993)). Other methods are known in the art, for example, solid phase arrays using primer-guided nucleotide incorporation procedures (e.g., Kornher et al., Nucl. Acids Res. 17:7779-7784 (1989); Sokolov, Nucl. Acids Res. 18:3671 (1990); Syvanen et al., Genomics 8:684-692 (1990); Kuppuswamy et al., Proc. Natl. Acad. Sci. U.S.A. 88:1143-1147 (1991); Prezant et al., Hum. Mutat. 1:159-164 (1992); Ugozzoli et al., GATA 9:107-112 (1992); Nyren et al., Anal. Biochem. 208:171-175 (1993); and Wallace, WO 89/10414). Other methods well known in the art may be used to identify single nucleotide polymorphisms (SNPs), including biallelic SNPs or biallelic markers which have two alleles, both of which are present at a fairly high frequency in a population. Alternative, preferred methods of detecting and mapping SNPs involve microsequencing techniques wherein an SNP site in a target DNA is detected by a single nucleotide primer extension reaction (see, e.g., Goelet et al., U.S. Patent No. 6,004,744; Mundy, U.S. Patent No. 4,656,127; Vary and Diamond, U.S. Patent No. 4,851,331; Cohen et al., PCT Publication No. WO 91/02087; Chee et al., PCT Publication No. WO 95/11995; Landegren et al., Science 241:1077-1080 (1988); Nickerson et al., Proc. Natl. Acad. Sci. U.S.A. 87:8923-8927 (1990); Pastinen et al., Genome Res. 7:606-614 (1997); Pastinen et al., Clin. Chem. 42:1391-1397 (1996); Jalanko et al., Clin. Chem. 38:39-43 (1992); Shumaker et al., Hum. Mutation 7:346-354 (1996); Caskey et al., PCT Publication No. WO 95/00669).
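The PCR-based approach described above, in which a primer is mismatched at its 3' end against one allele, can be illustrated with the following minimal sketch. It is not a validated assay design: the function names, the primer length, and the example sequence are hypothetical, and the sketch considers only whether the 3'-terminal base matches, not annealing temperature or secondary structure.

```python
def allele_specific_primers(sequence, snp_index, alleles, primer_len=20):
    """Build one forward primer per allele, each ending exactly at the SNP
    so that the 3'-terminal base of the primer is the allele itself.

    snp_index is the 0-based position of the polymorphic base in sequence.
    """
    start = snp_index - primer_len + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for the requested primer length")
    upstream = sequence[start:snp_index].upper()
    return {allele.upper(): upstream + allele.upper() for allele in alleles}


def primer_matches(primer, template, snp_index):
    """True if the primer's 3'-terminal base equals the template base at the SNP,
    i.e. the case in which PCR would be expected to yield a product."""
    return primer[-1] == template[snp_index].upper()


if __name__ == "__main__":
    template = "ACGTACGTAC"  # hypothetical sample carrying G at index 6
    primers = allele_specific_primers(template, 6, ("G", "A"), primer_len=5)
    for allele, primer in primers.items():
        print(allele, primer, primer_matches(primer, template, 6))
```

In use, each primer would be paired with a common reverse primer; amplification with only one of the two allele-specific primers indicates which allele the sample carries.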
5.8. OTHER USES
[0182] The present invention further provides methods for the use of the nucleic acid regulatory sequences of the invention. In one embodiment, DNA fragments that are found to promote or enhance gene expression may be used to find genes not previously known to be expressed in the nervous system; such genes may include previously unknown genes. The method comprises sequencing the fragment in question, followed by a deduction of the gene or gene-like sequences that the fragment appears to regulate by comparison of the sequence to known genomic sequences using the search algorithms described above. In another embodiment, one can determine the gene associated with a particular regulatory sequence based on sequence homology with a cognate regulatory sequence in another organism, wherein the cognate regulatory sequence in another organism possesses a sequence substantially similar to that of the human regulatory sequence. Such a degree of conservation has been demonstrated for the GAP-43 promoter, known to be found in organisms as evolutionarily diverse as mammals and fish (Reinhard et al., Development 120:1767-1775 (1994)).
[0183] The regulatory sequence, or fragments thereof, as provided by the present invention may also be used to discover new transcription factors. Though thousands of transcription factors are predicted to exist in humans (see Venter et al., Science 291:1304-1350 (2001)), only a few hundred have been discovered; far fewer have been described as regulating gene expression in the nervous system. Transcription factors binding to the regulatory sequences provided herein may be discovered by any means known to those in the art. For example, fragments of the regulatory sequence can be separated on a non-denaturing agarose or polyacrylamide gel, under conditions allowing for binding of transcription factors to appropriate DNA recognition sequences or elements, in the presence or absence of extracts of cells derived from the nervous system; a shift in the mobility of a particular fragment in the presence of cell extracts indicates that the fragment is being bound by a protein that may regulate transcription. Alternatively, a column can be constructed, comprising a packing material having a fragment of the regulatory sequence available for binding to cell extract components passed through the column, followed by washing of the column with a buffer that allows for DNA-protein interactions; proteins binding to the fragment, including potential new transcription factors, can thereupon be eluted and characterized.
[0184] The nucleic acid regulatory sequences of the invention can also be used to aid in the construction of microarrays that allow the simultaneous assessment of the binding of specific transcription factors to a plurality of regulatory DNA sequences. Such a microarray has been reported in the yeast genetic system (Ren et al., Science 290:2306-2309 (2001)), and the techniques utilized therein can be readily utilized in the construction of such microarrays. Using the regulatory sequences provided herein, in addition to known regulatory sequences, one can construct a similar microarray for human regulatory DNA sequences in order to profile transcription factor utilization in different cells or tissues, or between different physiological conditions or disease states.
6. EXAMPLES
6.1. IDENTIFICATION AND ANALYSIS OF THE REGULATORY SEQUENCES
6.1.1. DNA PREPARATION
[0185] Human chromosome 22 DNA libraries were prepared by cloning fragments of BamHI- or PstI-digested human chromosome 22 DNA sequences into the unique BamHI or PstI sites present in the multiple cloning site (MCS) of a plasmid vector constructed at Cogent Neuroscience, Inc. (pCOGENT1) (FIG. 1). This plasmid contains an MCS with unique restriction enzyme sites for BamHI, EcoRI and PstI. Downstream of the MCS, the vector also contains a basal promoter sequence containing a "TATA" box and a sequence encoding green fluorescent protein. The vector also contains an ampicillin resistance gene and a pMB1-derived origin of DNA replication. A positive control plasmid, pCOGENT1(E), was created by inserting an approximately 400 nucleotide DNA fragment containing the strong transcription enhancer from the CMV immediate early (IE) gene promoter (Boshart et al., Cell 41(2):521-30 (1985)) into the unique EcoRI site in the MCS of pCOGENT1. The vector pCOGENT1, with no library insert, was used as the negative control.
[0186] Library transformants were plated and grown on LB agar (DIFCO Laboratories) bioassay plates with 0.2 mg/ml ampicillin at 37°C for 24 hours. Single colonies were then used to inoculate deep-well blocks containing 1.5 ml LB broth containing 0.2 mg/ml ampicillin. Inoculated cultures were incubated at 37°C with agitation at 150-200 rpm for 18-24 hours. Replicate plates were created from the cultures by adding 20 μl of culture to 80 μl of LB broth containing 18% glycerol and 0.2 mg/ml ampicillin, and were stored at -80°C. The remaining bacterial cells were inoculated into 15-150 ml of fresh LB broth containing 0.2 mg/ml ampicillin. Following incubation at 37°C with agitation at 150-200 rpm for 18-24 hours, plasmid DNA was extracted using Promega DNA extraction kits. Purified plasmid DNA was introduced into mammalian nervous system cells.
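The replicate-plate step in paragraph [0186] implies a final glycerol concentration that the text does not state. The following is a minimal sketch of that arithmetic, assuming the bacterial culture itself contributes no glycerol and that concentrations mix linearly by volume; the function name `mix_concentration` is illustrative only.

```python
def mix_concentration(volumes_ul, concentrations):
    """Concentration after pooling liquids, weighting each by its volume."""
    total_volume = sum(volumes_ul)
    solute = sum(v * c for v, c in zip(volumes_ul, concentrations))
    return solute / total_volume

# 20 ul of culture (assumed 0% glycerol) added to 80 ul of LB with 18% glycerol.
glycerol_pct = mix_concentration([20, 80], [0.0, 18.0])

# Both the culture and the glycerol broth contain 0.2 mg/ml ampicillin,
# so the ampicillin concentration is unchanged by mixing.
ampicillin_mg_per_ml = mix_concentration([20, 80], [0.2, 0.2])

print(f"final glycerol: {glycerol_pct:.1f}%")
print(f"final ampicillin: {ampicillin_mg_per_ml:.1f} mg/ml")
```

Under these assumptions the frozen stocks contain about 14.4% glycerol and 0.2 mg/ml ampicillin.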
6.1.2. EVALUATION OF MODULATORY ACTIVITY OF CLONED SEQUENCES
[0187] The ability of the nucleotide sequences of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 5, and SEQ ID NO: 6 to modulate the expression of a nucleic acid of interest in a cell was measured quantitatively in the following manner. Plasmid DNAs from human DNA library clones (see Section 6.1.1) containing the sequences of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 5, or SEQ ID NO: 6 were introduced by transfection into cells in rat brain slices and maintained in culture. The number of cells expressing the reporter gene, green fluorescent protein (GFP), was determined by fluorescence microscopy. Individual library clones were judged to contain nucleotide sequences capable of modulating gene expression if they caused a greater number of cells to express GFP than the number of cells in brain slices that were similarly transfected with only the negative control plasmid (pCOGENT1). Expression of the positive control plasmid pCOGENT1(E), similarly transfected, was also determined.
[0188] The ability of the nucleotide sequences of SEQ ID NO: 3 and SEQ ID NO: 4 to modulate the expression of a nucleic acid of interest in a cell was measured quantitatively in the following manner. Plasmid DNAs from human DNA library clones (see Section 6.1.1) containing the sequences of SEQ ID NO: 3 or SEQ ID NO: 4 were co-transfected into cells in rat brain slices in culture with the plasmid pECFP (Clontech, Palo Alto, CA). pECFP contains the gene for cyan fluorescent protein (CFP) under the control of a strong CMV promoter, which results in high-level expression of CFP in mammalian cells. The numbers of cells expressing GFP and CFP were determined separately by fluorescence microscopy using different excitation and emission wavelengths for each fluorescent protein. The percentage of cells expressing CFP that also expressed GFP was calculated [(GFP-expressing cells/CFP-expressing cells) x 100]. Individual library clones were judged to contain nucleotide sequences capable of modulating gene expression if they caused a higher percentage of cells to express GFP (relative to CFP) than the percentage of cells expressing GFP (relative to CFP) in brain slices that were similarly co-transfected with the negative control plasmid pCOGENT1 and pECFP. The percentage of GFP-expressing cells similarly co-transfected with the positive control plasmid pCOGENT1(E) and pECFP was also determined.
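The CFP-normalized scoring rule of paragraph [0188] can be sketched as follows. This is an illustration only: the cell counts are invented for the example and are not data from the specification, and the function names are hypothetical.

```python
def percent_gfp_relative_to_cfp(gfp_cells, cfp_cells):
    """(GFP-expressing cells / CFP-expressing cells) x 100."""
    if cfp_cells <= 0:
        raise ValueError("at least one CFP-expressing cell is needed to normalize")
    return 100.0 * gfp_cells / cfp_cells

def modulates_expression(clone_counts, negative_control_counts):
    """True if the clone gives a higher GFP/CFP percentage than the negative control."""
    return (percent_gfp_relative_to_cfp(*clone_counts)
            > percent_gfp_relative_to_cfp(*negative_control_counts))

# Hypothetical (GFP, CFP) cell counts, for illustration only.
clone = (45, 150)             # library clone co-transfected with pECFP
negative_control = (6, 120)   # pCOGENT1 co-transfected with pECFP

clone_pct = percent_gfp_relative_to_cfp(*clone)
control_pct = percent_gfp_relative_to_cfp(*negative_control)
is_hit = modulates_expression(clone, negative_control)
print(clone_pct, control_pct, is_hit)
```

Dividing by the CFP count corrects for slice-to-slice differences in transfection efficiency, which is why this assay compares percentages rather than raw GFP counts.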
6.1.3. DNA SEQUENCING
[0189] The nucleotide sequence of a DNA insert that was selected for its ability to cause detectable expression of the reporter gene when introduced into cells was determined using the ABI Big Dye terminator Cycle Sequencing Ready Reaction Kit followed by subsequent analysis on the ABI3700 capillary sequencing machine (PE Biosystems, Foster City, CA). Plasmid DNA was annealed with oligonucleotide primers complementary to regions upstream (forward primer) and downstream (reverse primer) of the MCS. Cycle sequencing reactions were carried out in a thermocycler (PCR machine) using standard methods. The extension products from the sequencing reaction were purified by precipitation using isopropanol and analyzed on the ABI3700 sequencer according to the manufacturer's protocol.
6.1.4. SEQUENCE ANALYSIS
[0190] The sequence data for the nucleic acid regulatory sequences was compared using the BLAST 2.0 algorithm (Altschul et al., Nucleic Acids Res. 25:3389 (1997)) against known sequences in the GenBank sequence database maintained by NCBI (National Center for Biotechnology Information). This program uses the two-hit method to find homology within the database. The BLAST nucleotide searches were performed with the "BLAST N" program (word length = 11).
[0191] Predictions of transcription factor binding sites were made using GeneTools software from BioTools, Inc. (BTI). The eukaryotic transcription factors and DNA motifs from the Transcription Factor Database (TFD) are located on the Internet, via file transfer protocol, at ncbi.nlm.nih.gov/repository/TFD. Information present in the University of California, Santa Cruz (UCSC), draft assembly of the human genome (available on the Internet at genome.ucsc.edu/goldenPath/octTracks.html) was used to position the regulatory sequence on human chromosome 22.
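To illustrate what the word length parameter in paragraph [0190] controls, the sketch below finds exact 11-nucleotide words shared between a query and a subject sequence. It is a simplified illustration and not the BLAST implementation: it omits seed extension, scoring, and statistics, and the sequences and function names are invented for the example.

```python
from collections import defaultdict

WORD_LENGTH = 11  # word length quoted for the nucleotide searches

def index_words(subject, k=WORD_LENGTH):
    """Map every k-letter word in the subject to the offsets where it occurs."""
    index = defaultdict(list)
    for i in range(len(subject) - k + 1):
        index[subject[i:i + k]].append(i)
    return index

def find_seeds(query, subject, k=WORD_LENGTH):
    """Return (query_offset, subject_offset) pairs for exact shared words."""
    index = index_words(subject, k)
    seeds = []
    for q in range(len(query) - k + 1):
        for s in index.get(query[q:q + k], []):
            seeds.append((q, s))
    return seeds

# Toy sequences sharing a 13-nucleotide stretch (illustrative only).
shared = "ACGTACGGTCAAG"
subject = "TTTT" + shared + "CCCC"
query = "GG" + shared + "AA"

seeds = find_seeds(query, subject)
print(seeds)
```

A shared stretch of 13 nucleotides yields three overlapping 11-nucleotide words, so shorter word lengths find more seeds at the cost of more chance matches.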
6.2. NUCLEIC ACID REGULATORY SEQUENCE CNI-01142
[0192] The sequence of the 1311 nucleotide CNI-01142 regulatory sequence is shown in FIG. 2. A BLAST analysis showed homology to GenBank accession number U62317, which is the sequence of human chromosome 22q13 BAC clone CIT987SK-384D8, and GenBank accession number AC005226, which is human PAC clone RP4-683L10 from chromosome 14q24.3.
[0193] As depicted in FIG. 3 (UCSC linkage map of a region of human chromosome 22), the nearest known or predicted gene to the sequence of CNI-01142 is a known gene, arylsulfatase A (ARSA). The sequence of the complement of the gene extends from position 47481553 to 47484673. The sequence of arylsulfatase A precursor is in the opposite orientation to the sequence of CNI-01142. The 3' end of the sequence of the gene is approximately 1,776 base pairs "downstream" from the 3' end of the sequence of CNI-01142. A multiplicity of mutations in the gene yielding unstable mRNA, or incomplete, unstable, or enzymatically-deficient protein products lead to a sulfatide lipidosis alternatively named metachromatic leukodystrophy, metachromatic leukoencephalopathy, cerebral sclerosis of the diffuse metachromatic form, and others. Arylsulfatase A deficiency leads to an accumulation of cerebroside sulfates in white matter, cerebrospinal fluid, kidney, and urinary sediment. Symptoms include motor disturbances, rigidity, mental deterioration, convulsions, and progressive physical and mental deterioration. Symptoms can begin as early as a few months after birth, with death occurring by age 5. Regulatory elements dictating the expression pattern or level of this gene can affect the development of this neurological disease. The CNI-01142 nucleotide sequence was analyzed for transcription factor recognition sites (FIG. 4).
[0194] Genomic clone CNI-01142 caused expression of the reporter gene above the level for the negative control pCOGENT1 (FIG. 5). Expression in the middle region of the brain was greater than in caudal and rostral regions. Nervous system cells transfected with CNI-01142 clearly show expression in brain slices (FIGS. 6A-6C).
6.3. NUCLEIC ACID REGULATORY SEQUENCE CNI-01080
[0195] The sequence of the 65 nucleotide CNI-01080 regulatory sequence is shown in FIG. 8. A BLAST analysis showed the highest homology to GenBank accession number HS941F9, a human DNA sequence from clone CTA-941F9 on chromosome 22q13.
[0196] As depicted in FIG. 9 (UCSC linkage map of a region of human chromosome 22), the sequence of CNI-01080 is located within an intron of the known gene, FBLN1, which encodes the extracellular matrix protein, fibulin-1. The sequence of FBLN1 extends from sequence positions 42448246 to 42545953. The sequence of FBLN1 is in the opposite orientation to the sequence of CNI-01080. The 5' end of FBLN1 is approximately 60,981 base pairs "upstream" from the 3' end of the sequence of CNI-01080. Fibulin-1 is reported to be expressed in brain exclusively in neurons (to the exclusion of astrocytes or microglia), and is implicated in the pathogenesis of Alzheimer's disease as a consequence of its ability to bind to amyloid precursor protein (Ohsawa I., et al., J. Neurochem. 76(5):1411-20 (2001), "Fibulin-1 binds the amino-terminal head of beta-amyloid precursor protein and modulates its physiological function"). The CNI-01080 nucleotide sequence was analyzed for transcription factor recognition sites (FIG. 10).
[0197] Genomic clone CNI-01080 caused expression of the reporter gene above the level for the negative control pCOGENT1 (FIG. 11). The expression of the reporter gene is higher in the caudal region than in the middle region, and lower in the rostral region. Nervous system cells transfected with CNI-01080 clearly show expression in brain slices (FIG. 12).
6.4. NUCLEIC ACID REGULATORY SEQUENCE CNI-01104
[0198] The sequence of the 146 nucleotide CNI-01104 regulatory sequence is shown in FIG. 14. A BLAST analysis showed the highest homology to GenBank accession number AC000079, a human chromosome 22q11.2 sequence found in cosmid clone 49c12.
[0199] As depicted in FIG. 15 (UCSC linkage map of a region of human chromosome 22), two genes are positioned near the sequence of CNI-01104. The sequence of CNI-01104 lies within an intron of a known gene, HIRA. The sequence of the complement of the HIRA gene extends from position 16258356 to 16359351. The sequence of the HIRA gene is in the opposite orientation to the sequence of CNI-01104. The 5' end of the sequence of the HIRA gene is approximately 18,585 base pairs "downstream" from the 3' end of the sequence of CNI-01104. A second gene (not shown in FIG. 15), NLVCF (nuclear localization Velo-cardio-facial syndrome), extends from position 16360198 to 16363730 (UCSC linkage map, December 12, 2000 freeze). The sequence of the NLVCF gene is in the same orientation as the sequence of CNI-01104. The 5' end of the NLVCF gene is approximately 19,432 nucleotides "downstream" from the 3' end of the sequence of CNI-01104. The HIRA gene (histone cell cycle regulation defective, S. cerevisiae homolog A, also known as DiGeorge critical region gene 1 [DCGR1]) encodes a putative transcription factor, TUPLE 1 (TUP-like enhancer of Split 1), which aligns with a region of chromosome 22q11 deleted in patients with Velo-cardio-facial and DiGeorge syndromes. These syndromes are congenital disorders characterized by craniofacial anomalies, conotruncal heart defects, immune deficiencies, and learning disabilities (Halford et al., Hu. Mol. Genet. 2:1577-1582 (1993), "Isolation of a gene expressed during early embryogenesis from the region of 22q11 commonly deleted in DiGeorge syndrome"). In early chick embryos, Hira is expressed in the developing neural plate, the neural tube, the neural crest, and the mesenchyme of the head and branchial arch structures (Roberts et al., Hu. Mol. Genet. 6:237-245 (1997), "Cloning and developmental expression analysis of chick Hira [Chira], a candidate gene for DiGeorge syndrome"). The NLVCF gene aligns with the same region of chromosome 22 deleted in patients with Velo-cardio-facial and DiGeorge syndromes. NLVCF is expressed at high levels in the brain during development and may be co-regulated with HIRA, the murine homolog of which displays a similar expression pattern in mouse embryos as does the murine Nlvcf gene (Funke et al., Genomics 53:146-54 (1998), "Isolation and characterization of a human gene containing a nuclear localization signal from the critical region for velo-cardio-facial syndrome on 22q11"). The CNI-01104 nucleotide sequence was analyzed for transcription factor recognition sites (FIG. 16).
[0200] Genomic clone CNI-01104 caused expression of the reporter gene above the level for the negative control pCOGENT1 (FIG. 17). The expression of the reporter gene is highest in the middle region of the brain, with lower expression in the rostral and caudal regions. Nervous system cells transfected with CNI-01104 clearly show expression in brain slices (FIG. 18).
6.5. NUCLEIC ACID REGULATORY SEQUENCE CNI-01120
[0201] The sequence of the 2768 nucleotide CNI-01120 regulatory sequence is shown in FIG. 20. A BLAST analysis showed the highest homology to GenBank accession number AC000079, a human chromosome 22q11.2 sequence found in cosmid clone 49c12.
[0202] As depicted in FIG. 21 (UCSC linkage map of a region of human chromosome 22), the sequence of CNI-01120 is located within the first intron of the known gene, HIRA. The sequence of the complement of the gene extends from position 16258356 to 16359351. The sequence of HIRA is in the opposite orientation to the sequence of CNI-01120. The 5' end of the predicted gene is approximately 15,823 base pairs "downstream" from the 3' end of the sequence of CNI-01120. A second gene, NLVCF, extends from position 16360198 to 16363730 (UCSC linkage map, December 12, 2000 freeze). The sequence of the NLVCF gene is in the same orientation as the sequence of CNI-01120. The 5' end of the NLVCF gene is approximately 16,670 nucleotides "downstream" from the 3' end of the sequence of CNI-01120. The HIRA gene (histone cell cycle regulation defective, S. cerevisiae homolog A, also known as DiGeorge critical region gene 1 [DCGR1]) encodes a putative transcription factor, TUPLE 1 (TUP-like enhancer of Split 1), which aligns with a region of chromosome 22q11 deleted in patients with Velo-cardio-facial and DiGeorge syndromes. These syndromes are congenital disorders characterized by craniofacial anomalies, conotruncal heart defects, immune deficiencies, and learning disabilities (Halford et al., Hu. Mol. Genet. 2:1577-1582 (1993), "Isolation of a gene expressed during early embryogenesis from the region of 22q11 commonly deleted in DiGeorge syndrome"). In early chick embryos, Hira is expressed in the developing neural plate, the neural tube, the neural crest, and the mesenchyme of the head and branchial arch structures (Roberts et al., Hu. Mol. Genet. 6:237-245 (1997), "Cloning and developmental expression analysis of chick Hira [Chira], a candidate gene for DiGeorge syndrome"). The NLVCF (nuclear localization Velo-cardio-facial syndrome) gene aligns with the same region of chromosome 22 deleted in patients with Velo-cardio-facial and DiGeorge syndromes. NLVCF is expressed at high levels in the brain during development and may be co-regulated with HIRA, the murine homolog of which displays a similar expression pattern in mouse embryos as does the murine Nlvcf gene (Funke et al., Genomics 53:146-54 (1998), "Isolation and characterization of a human gene containing a nuclear localization signal from the critical region for velo-cardio-facial syndrome on 22q11"). The CNI-01120 nucleotide sequence was analyzed for transcription factor recognition sites (FIG. 22).
[0203] Genomic clone CNI-01120 caused expression of the reporter gene above the level for the negative control pCOGENT1 (FIG. 23). The expression of the reporter gene is higher in the middle region of the brain than in the rostral and caudal regions. Nervous system cells transfected with CNI-01120 show expression in brain slices (FIG. 24).
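The distances quoted in Sections 6.4 and 6.5 can be cross-checked against the gene coordinates. The sketch below assumes that both clones read toward increasing map coordinates, so that the 5' end of the complement-strand gene is its higher coordinate and the 5' end of NLVCF is its lower coordinate. The implied 3' end positions of the clones are derived under that assumption and are not stated in the text.

```python
# Gene coordinates quoted in the text (UCSC map, December 12, 2000 freeze).
GENE_A_START, GENE_A_END = 16258356, 16359351  # intron-containing gene, complement strand
NLVCF_START, NLVCF_END = 16360198, 16363730    # same orientation as the clones

# Assumption: clones read toward increasing coordinates.
GENE_A_5PRIME = GENE_A_END   # complement strand: 5' end is the higher coordinate
NLVCF_5PRIME = NLVCF_START   # forward strand: 5' end is the lower coordinate

# Stated distances (bp) from each clone's 3' end to the two 5' ends.
STATED = {
    "CNI-01104": {"gene_a": 18585, "nlvcf": 19432},
    "CNI-01120": {"gene_a": 15823, "nlvcf": 16670},
}

results = {}
for clone, distance in STATED.items():
    # Position of the clone's 3' end implied by the first stated distance.
    clone_3prime = GENE_A_5PRIME - distance["gene_a"]
    # Distance to NLVCF that follows from that position.
    implied_nlvcf = NLVCF_5PRIME - clone_3prime
    results[clone] = (clone_3prime, implied_nlvcf)
    print(clone, clone_3prime, implied_nlvcf, implied_nlvcf == distance["nlvcf"])
```

Under this assumption the two stated distances for each clone agree with each other, differing by the 847 bp that separate the two 5' ends.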
6.6. NUCLEIC ACID REGULATORY SEQUENCE CNI-01125
[0204] The sequence of the 1697 nucleotide CNI-01125 regulatory sequence is shown in FIG. 26. A BLAST analysis showed the highest homology to GenBank accession number AL031186, a human DNA sequence from clone CTA-984G1 on chromosome 22q12.1-12.2 that contains the 5' part of the EWSR1 gene for Ewing sarcoma breakpoint region 1 protein.
[0205] As depicted in FIG. 27 (UCSC linkage map of a region of human chromosome 22), the sequence of CNI-01125 overlaps intronic and exonic sequence of two known genes, EWSR1 and C22ORF3. The sequence of EWSR1 extends from position 26310272 to 26342152, in the same orientation as the sequence of CNI-01125. The 5' end of the sequence of the gene is approximately 535 base pairs "downstream" from the 5' end of the sequence of CNI-01125. The sequence of CNI-01125 extends approximately 1,162 bases further "downstream" into the sequence of EWSR1. The sequence of the complement of C22ORF3 extends from 26301845 to 26309915, in the opposite orientation to the sequence of CNI-01125. The 5' end of the sequence of C22ORF3 is approximately 178 bases "downstream" of the 5' end of the sequence of CNI-01125. EWSR1 (Ewing sarcoma breakpoint region 1) is a gene that lies at the point of chromosome 22 at which translocations with chromosome 11 associated with Ewing sarcoma typically occur (Plougastel et al., Genomics 18:609-15 (1993), "Genomic structure of the EWS gene and its relationship to EWSR1, a site of tumor-associated chromosome translocation"). Patients with other kinds of cancer, including peripheral neuroepithelioma, display identical translocations between several different chromosomes and 22 (Whang-Peng et al., New Eng. J. Med. 311:584-585 (1984), "Chromosome translocation in peripheral neuroepithelioma"). In each case the result is the creation of a chimeric protein in which the 5' end of EWSR1 is fused to portions of different transcription factors (Aman et al., Genomics 37:1-8 (1996), "Expression patterns of the human sarcoma-associated genes FUS and EWS and the genomic structure of FUS"). The resulting chimeric protein abnormally regulates gene expression in ways that lead to the various cancers associated with the translocations. That the 5' promoter region of the EWSR1 gene remains intact in each case highlights the importance of regulatory elements in this region in driving the development of several distinct cancers. The CNI-01125 nucleotide sequence was analyzed for transcription factor recognition sites (FIG. 28).
[0206] Genomic clone CNI-01125 caused significant expression of the reporter gene above the level for the negative control pCOGENT1 (FIG. 29). The expression of the reporter gene is higher in the caudal region of the brain than in the rostral and middle regions. In particular, expression in the caudal region was substantially higher than that which occurred in the positive control. Nervous system cells transfected with CNI-01125 show expression in brain slices (FIG. 30).
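The overlap figures in paragraph [0205] follow from the clone length and the gene coordinates. The sketch below reproduces that arithmetic, assuming CNI-01125 reads toward increasing map coordinates, so that the 5' end of EWSR1 is its lower coordinate and the 5' end of the complement-strand C22ORF3 is its higher coordinate. The clone's own 5' coordinate is derived under that assumption and is not stated in the text.

```python
CLONE_LENGTH = 1697  # nucleotides in CNI-01125

# Coordinates quoted in the text.
EWSR1_START, EWSR1_END = 26310272, 26342152      # same orientation as the clone
C22ORF3_START, C22ORF3_END = 26301845, 26309915  # complement strand

# Assumption: the clone reads toward increasing coordinates.
EWSR1_5PRIME = EWSR1_START
C22ORF3_5PRIME = C22ORF3_END

# The 5' end of EWSR1 is stated to lie 535 bp downstream of the clone's 5' end.
OFFSET_TO_EWSR1 = 535
clone_5prime = EWSR1_5PRIME - OFFSET_TO_EWSR1

# Portion of the clone that extends into EWSR1.
bases_into_ewsr1 = CLONE_LENGTH - OFFSET_TO_EWSR1

# Offset of the C22ORF3 5' end from the clone's 5' end.
offset_to_c22orf3 = C22ORF3_5PRIME - clone_5prime

print(clone_5prime, bases_into_ewsr1, offset_to_c22orf3)
```

The derived values, 1,162 bases into EWSR1 and 178 bases to the C22ORF3 5' end, match the figures given in the paragraph.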
6.7. NUCLEIC ACID REGULATORY SEQUENCE CNI-01131
[0207] The sequence of the 756 nucleotide CNI-01131 regulatory sequence is shown in FIG. 32. A BLAST analysis showed the highest homology to GenBank accession number AL022314, a human DNA sequence from clone RP5-1170K4 on chromosome 22q12.2-13.1 that contains several novel genes, one of which codes for a trypsin family protein with class A LDL receptor domains, and the IL2RB gene for Interleukin 2 Receptor, Beta (IL-2 Receptor, CD122).
[0208] As depicted in FIG. 33 (UCSC linkage map of a region of human chromosome 22), the sequence of CNI-01131 encompasses an exon and part of the adjacent intron of the known gene, IL2RB. The sequence of the complement of the gene extends from position 34050482 to 34074557. The sequence of IL2RB is in the opposite orientation to the sequence of CNI-01131. The 5' end of the sequence of the gene is approximately 13,824 base pairs "downstream" from the 3' end of the sequence of CNI-01131. IL2RB encodes the beta chain of the IL2 receptor, which, with the alpha and gamma chains, makes up the high-affinity IL2 receptor. All three subunits are expressed in multiple tissues, including brain (Petitto et al., Brain Res. 650:140-5 (1994), "Molecular cloning of a partial cDNA of the interleukin-2 receptor-beta in normal mouse brain: in situ localization in the hippocampus and expression by neuroblastoma cells"; Petitto et al., Brain Res. Mol. Brain Res. 53:152-62 (1998), "Molecular cloning of the cDNA coding sequence of IL-2 receptor-gamma (gammac) from human and murine forebrain: expression in the hippocampus in situ and by brain cells in vitro"). IL2, the prototypical T cell growth factor and immunoregulatory cytokine produced by lymphocytes, has been implicated as a brain neurotrophic factor and neuromodulator (Shimojo et al., Neurosci. Lett. 151:170-3 (1993), "Interleukin-2 enhances the viability of primary cultured rat neocortical neurons"). Additionally, IL2 has been reported to influence inflammatory processes in the brain such as encephalomyelitis (Petitto et al., Neurosci. Lett. 285:66-70 (2000), "Interleukin-2 gene deletion produces a robust reduction in susceptibility to experimental autoimmune encephalomyelitis in C57BL/6 mice"), learning and memory (Petitto et al., J. Neurosci. Res. 56:441-6 (1999), "Impaired learning and memory and altered hippocampal neurodevelopment resulting from interleukin-2 gene deletion"), and the control of tumor growth (Sampath et al., Cancer Res. 59:2107-14 (1999), "Paracrine immunotherapy with interleukin-2 and local chemotherapy is synergistic in the treatment of experimental brain tumors"). The CNI-01131 nucleotide sequence was analyzed for transcription factor recognition sites (FIG. 34).
[0209] Genomic clone CNI-01131 caused significant expression of the reporter gene above the level for the negative control pCOGENT1 (FIG. 35). The expression of the reporter gene is higher in the middle region of the brain than in the rostral and caudal regions. Nervous system cells transfected with CNI-01131 clearly show expression in brain slices (FIG. 32).

Claims

WHAT IS CLAIMED IS:
1. An isolated nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
2. An isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
3. The isolated nucleic acid regulatory sequence molecule of claim 2, wherein the isolated nucleic acid regulatory sequence molecule is a restriction fragment of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
4. The isolated nucleic acid regulatory sequence molecule of claim 2, wherein the isolated nucleic acid regulatory sequence is created by nuclease digestion of a nucleic acid molecule comprising SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.
5. The isolated nucleic acid molecule of claim 1, wherein the isolated nucleic acid molecule is operably linked to a nucleic acid molecule comprising a coding sequence.
6. The isolated nucleic acid regulatory sequence molecule of any one of claims 2-4, wherein the isolated nucleic acid regulatory sequence molecule is operably linked to a nucleic acid molecule comprising a coding sequence.
7. An isolated nucleic acid molecule comprising the reverse complement of the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 or SEQ ID NO: 6.
8. An isolated nucleic acid regulatory sequence molecule comprising the reverse complement of the nucleotide sequence of the nucleic acid regulatory sequence molecule of claim 2.
9. The isolated nucleic acid regulatory sequence molecule of claim 2, wherein the transcription activating nucleotide sequence comprises at least about 50 contiguous nucleotides of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 or SEQ ID NO: 6.
10. An isolated nucleic acid regulatory sequence molecule comprising a transcription activating nucleotide sequence that hybridizes over its entire length to the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 or SEQ ID NO: 6 or the complement thereof.
11. A vector comprising the nucleotide sequence of claim 1 or claim 2.
12. The vector of claim 11 further comprising a coding sequence operably linked to the nucleotide sequence.
13. The vector of claim 12, wherein the coding sequence is heterologous to the nucleotide sequence.
14. The vector of claim 11 further comprising a multiple cloning site (MCS), wherein when a coding sequence is inserted into the MCS, the coding sequence is operably linked to the nucleotide sequence.
15. The vector of claim 11, further comprising an internal ribosomal entry site (IRES).
16. The vector of claim 12, wherein the coding sequence is heterologous to the nucleotide sequence.
17. The vector of any one of claims 12 or 14, wherein said coding sequence is a reporter gene sequence.
18. The vector of any one of claims 12 or 14, wherein said coding sequence is a neuroprotective sequence.
19. The vector of claim 17, wherein said reporter gene sequence encodes β-galactosidase, a fluorescent protein, chloramphenicol acetyltransferase, luciferase or an antigenic marker.
20. A vector comprising a promoter and an MCS operably linked in an upstream-to-downstream order, and the nucleotide sequence of claim 1 or claim 2 or a transcription activating nucleotide sequence thereof.
21. The vector of claim 20, further comprising an internal ribosomal entry site (IRES).
22. The vector of claim 20, wherein when a coding sequence is present within the MCS, the coding sequence is operably linked to said promoter sequence and to the nucleotide sequence or transcription activating nucleotide sequence thereof.
23. The vector of claim 20, wherein said promoter is heterologous to the coding sequence.
24. The vector of claim 20, wherein the vector is adapted for transfer to a eukaryotic host cell.
25. The vector of claim 24, wherein the eukaryotic host cell is a nervous system cell.
26. The vector of claim 25, wherein the nervous system cell is a nervous system cell line, glial cell, astrocyte, oligodendrocyte, mesencephalic neuron, hypothalamic neuron or cortical neuron.
27. The vector of claim 20, wherein said vector is adapted for transfer to a prokaryotic host cell.
28. A host cell, or progeny thereof, comprising the vector of claim 11.
29. The host cell of claim 28, wherein said host cell is a eukaryotic cell.
30. The host cell of claim 29, wherein said host cell is a nervous system cell.
31. The host cell of claim 28, wherein said host cell is a prokaryotic cell.
32. A kit comprising the vector of claim 11, 22 or 25.
33. A kit comprising the host cell of claim 28.
34. A kit comprising the host cell of claim 31.
35. A transgenic non-human animal comprising the nucleotide sequence of claim 1 or claim 2, wherein the nucleotide sequence is heterologous to said nonhuman animal.
36. The transgenic animal of claim 35, wherein said nucleotide sequence is contained within an episome.
37. The transgenic animal of claim 35, wherein said nucleotide sequence is inserted into the genome of said animal by homologous recombination.
38. The transgenic animal of claim 35, wherein said nucleotide sequence is inserted into the genome of said animal by nonhomologous recombination.
39. The transgenic animal of claim 37 or 38 wherein said nucleotide sequence promotes or enhances expression of a coding sequence in the genome of said animal.
40. A method of expressing a coding sequence in a host cell in cell culture, comprising culturing a host cell of claim 28 under conditions effective to allow expression of the coding sequence by said host cell.
41. The method of claim 40, wherein said host cell is a nervous system cell.
42. The method of claim 40, wherein said vector exists within said host cell as an episome.
43. The method of claim 40, wherein said vector is present in the genome of said host cell.
44. The method of claim 43, wherein said vector is introduced into the genome of said host cell by homologous recombination.
45. The method of claim 43, wherein said vector is introduced into the genome of said host cell by nonhomologous recombination.
46. The method of claim 43, wherein said nucleic acid sequence controls expression of a coding sequence endogenously present in the genome of said host cell.
47. A method of producing a polypeptide comprising:
(a) introducing the vector of claim 11 into a host cell such that a nucleotide sequence contained within said vector promotes or enhances the expression of a coding sequence; and
(b) maintaining said host cell under conditions effective to allow expression of said coding sequence, and to allow translation of mRNA, wherein said expression of said coding sequence produces a polypeptide.
48. The method of claim 47, wherein the vector is present in the genome of said host cell.
49. The method of claim 48, wherein said vector is introduced into the genome of said host cell by homologous recombination.
50. The method of claim 48, wherein said vector is introduced into the genome of said host cell by nonhomologous recombination.
51. A method of identifying a modulator of a regulatory sequence active in nervous system-derived host cells comprising:
(a) contacting the nervous system-derived host cell containing the vector of claim 11 with a test compound; and
(b) detecting a change of expression of the reporter gene, relative to its expression in the absence of the test compound, such that, if a change is detected, a modulator of the nucleic acid regulatory sequence is identified.
52. The method of claim 51, wherein said regulatory sequence active in nervous system-derived cells is SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a transcription activating sequence thereof.
53. The method of claim 51, wherein said reporter gene encodes β-galactosidase, a fluorescent protein, chloramphenicol acetyltransferase, luciferase or an antigenic marker.
54. A method of constructing a transgenic animal comprising introducing the nucleic acid molecule of claim 1 or claim 2 into an embryonic host cell.
PCT/US2002/021228 2001-07-06 2002-07-03 Nucleic acid regulatory sequences and uses thereof WO2003004682A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2002316557A AU2002316557A1 (en) 2001-07-06 2002-07-03 Nucleic acid regulatory sequences and uses thereof

Applications Claiming Priority (12)

Application Number Priority Date Filing Date Title
US30339801P 2001-07-06 2001-07-06
US60/303,398 2001-07-06
US30526101P 2001-07-13 2001-07-13
US60/305,261 2001-07-13
US30739501P 2001-07-24 2001-07-24
US30739401P 2001-07-24 2001-07-24
US60/307,395 2001-07-24
US60/307,394 2001-07-24
US30766601P 2001-07-25 2001-07-25
US60/307,666 2001-07-25
US30988501P 2001-08-03 2001-08-03
US60/309,885 2001-08-03

Publications (2)

Publication Number Publication Date
WO2003004682A2 true WO2003004682A2 (en) 2003-01-16
WO2003004682A3 WO2003004682A3 (en) 2004-02-05

Family

ID=27559656

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2002/021228 WO2003004682A2 (en) 2001-07-06 2002-07-03 Nucleic acid regulatory sequences and uses thereof

Country Status (2)

Country Link
AU (1) AU2002316557A1 (en)
WO (1) WO2003004682A2 (en)

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DATABASE GENBANK [Online] 03 February 2000 LOFTUS B.J.: 'Genome duplications and other features in 12 Mb of DNA sequence from human chromosome 16p and 16q.', XP002971449 Database accession no. u62317 *
DATABASE GENBANK [Online] 26 February 2003 ADAMS, M. D., XP002971448 Database accession no. l49235 *
LOFTUS B.J. ET AL: 'Genome duplications and other features in 12 Mb of DNA sequence from human chromosome 16p and 16q.' GENOMICS vol. 60, no. 3, 1999, pages 295 - 308, XP002931311 *

Also Published As

Publication number Publication date
AU2002316557A1 (en) 2003-01-21
WO2003004682A3 (en) 2004-02-05

Similar Documents

Publication Publication Date Title
AU756620B2 (en) Mutations in the myostatin gene cause double-muscling in mammals
KR102620328B1 (en) Targeted augmentation of nuclear gene output
US5800998A (en) Assays for diagnosing type II diabetes in a subject
AU6952498A (en) Novel human delta3 compositions and therapeutic and diagnostic uses therefor
US5807708A (en) Conservin nucleic acid molecules and compositions
EP1532246B1 (en) Genetic suppression and replacement
US6399760B1 (en) RP compositions and therapeutic and diagnostic uses therefor
WO1998021239A2 (en) Therapeutic compositions and methods and diagnostic assays for type ii diabetes involving hnf-1
US6833239B1 (en) Methods to identify modulators of FKHL7 DNA-binding activity
KR102624979B1 (en) B4GALT1 variants and their uses
US20010041353A1 (en) Novel SSP-1 compositions and therapeutic and diagnostic uses therefor
US5972609A (en) Utrophin gene promotor
EP1474434B1 (en) The eaat2 promoter and uses thereof
JP2002510508A (en) Glaucoma treatment and diagnostic agents
US20030037351A1 (en) Nucleic acid regulatory sequences and uses therefor
WO1998046748A1 (en) Therapeutic compositions and diagnostic assays for diseases involving trbp
WO2003004682A2 (en) Nucleic acid regulatory sequences and uses thereof
US6303370B1 (en) Tissue-specific regulatory elements
Zheng et al. Transcriptional precision in photoreceptor development and diseases–Lessons from 25 years of CRX research
AU3642399A (en) Glaucoma therapeutics and diagnostics based on a novel human transcription factor
WO2023060132A1 (en) Allele specific editing to treat fus-induced neurodegeneration
RU2805557C2 (en) B4GALT1 variants and their uses
Rose III et al. Expression of myosin heavy chain gene in the sea urchin: Coregulation with muscle actin transcription in early development
WO1998021363A1 (en) Compositions and methods for treating type ii diabetes involving hnf-4
US6825035B1 (en) Compositions and methods for modulating expression within smooth muscle cells

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AU CA JP

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SK TR

121 Ep: the epo has been informed by wipo that ep was designated in this application
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 69(1) EPC

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP