WO2009134814A2 - Identification of centromere sequences and uses thereof - Google Patents

Identification of centromere sequences and uses thereof

Info

Publication number
WO2009134814A2
Authority
WO
WIPO (PCT)
Prior art keywords
seq
centromere
sequence
nucleic acid
algal
Prior art date
Application number
PCT/US2009/041998
Other languages
English (en)
Other versions
WO2009134814A3 (fr
Inventor
Helge Zieler
Robert Christopher Brown
Toby Howard Richardson
Douglas Gillette Smith
Original Assignee
Synthetic Genomics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Synthetic Genomics, Inc. filed Critical Synthetic Genomics, Inc.
Publication of WO2009134814A2 publication Critical patent/WO2009134814A2/fr
Publication of WO2009134814A3 publication Critical patent/WO2009134814A3/fr

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6804: Nucleic acid analysis using immunogens

Definitions

  • the present invention relates to the identification of centromeres that are useful, for example, in constructing artificial chromosomes and cells comprising such artificial chromosomes.
  • centromere is an important element in an artificial chromosome, mediating faithful chromosome segregation between the two daughter cells in a cell division. Accordingly, the isolation and identification of functional centromere sequences is an essential part of constructing artificial chromosomes for any specific organism.
  • Eukaryotic centromeres vary greatly in size, ranging from 120-200 bp in budding yeasts to tens of megabases in plants and animals. They are also very diverse in structure and sequence, with centromeres in higher eukaryotes often composed of large tracts of tandem satellite repeats, interspersed with retrotransposons and other sequences, including in some cases functional genes.
  • centromere function i.e., establishment of centromere function from naked DNA introduced into a cell
  • sequences from a related organism may not work efficiently in establishing centromere function.
  • the high amount of species specificity of centromere sequences correlates with the observation that centromere sequences evolve very rapidly and can lose all homology between related species within several million years of evolution (e.g., centromere repeat sequences within the genus Arabidopsis).
  • CenH3 (known as CENP-A in humans) is a variant of the nucleosome protein histone H3 that is preferentially associated with centromeric chromatin. This protein differs from histone H3 in having longer and divergent N-terminal sequences.
  • Antibodies raised against the unique N-terminal sequences of CenH3 have been used in some strategies for isolating centromere sequences from some species, for example, using chromatin immunoprecipitation ("ChIP").
  • centromere sequences that can quickly process and specifically identify centromere sequences (as distinguished from nonspecific sequences) among large pools of nucleic acid molecules, when there are no known centromeres for comparison, for example in several algal species where centromere identification has been particularly difficult.
  • identifying a centromere sequence includes: immunoprecipitating protein-DNA complexes from fragmented chromatin derived from a cell using an antibody to a centromere-associated protein; isolating nucleic acid molecules from the immunoprecipitated protein-DNA complexes; and sequencing the isolated nucleic acid molecules to identify a centromere sequence.
  • methods for identifying a centromere sequence in which the methods include: immunoprecipitating protein-DNA complexes from fragmented chromatin isolated from a cell using an antibody to a centromere-associated protein; separately sequencing individual nucleic acid molecules of a population of nucleic acid molecules isolated from the protein-DNA complexes; calculating the frequency of occurrence of each nucleic acid sequence in the population of nucleic acid molecules isolated from the protein-DNA complexes; and identifying a nucleic acid molecule sequence which has an increased frequency of occurrence in the population as a centromere sequence.
  • the methods of the invention in some preferred embodiments use chromatin isolated from one or more cells of an algal, fungal, or protist species.
  • An algal cell used in the methods is at least one green, yellow-green, brown, golden brown, or red algal cell, such as an alga of any of the Rhodophyta, Euglenophyta, Cryptophyta, Pyrrophyta, Raphidophyta, Haptophyta, Chrysophyta, Xanthophyta, Eustigmatophyta, Phaeophyta (Fucophyta), Prasinophyta, Bacillariophyta, Glaucophyta, or Chlorophyta phyla, and in some embodiments is a cell of an alga of the Chlorophyceae class.
  • individual nucleic acid molecules of a population of nucleic acid molecules isolated from immunoprecipitated protein-DNA complexes are sequenced separately using a machine that performs high-throughput parallel sequencing.
  • separate sequencing of individual nucleic molecules is performed using a machine that isolates single nucleic acid molecules of a population of nucleic acid molecules prior to sequencing, such as a high-throughput parallel sequencing machine, that performs, for example, at least 10,000 sequencing reactions simultaneously.
  • the methods disclosed herein do not include addition of a cross-linking agent prior to immunoprecipitating protein-DNA complexes from the fragmented chromatin.
  • the methods provided herein do not include hybridizing a nucleic acid molecule isolated from the immunoprecipitated protein-DNA complexes to one or more known centromere-associated sequences, or comparing the sequence of a nucleic acid molecule isolated from the immunoprecipitated protein-DNA complexes to one or more known centromere sequences.
  • the methods of identifying a centromere sequence do not include hybridizing a nucleic acid molecule isolated from the immunoprecipitated protein-DNA complexes to one or more repetitive sequences known in the organism from which the chromatin is isolated.
  • immunoprecipitation can use an antibody that specifically binds any centromere-associated protein, including without limitation a centromere protein, a centromere protein-recruiting protein, or a kinetochore protein.
  • chromatin immunoprecipitation is performed with an antibody that specifically binds a centromere protein, such as for example, an antibody that specifically binds to CENP-A/CenH3 or a homolog of CENP-A/CenH3.
  • an antibody used for chromatin immunoprecipitation specifically binds to the N terminus of CENP-A/CenH3 or a homolog of CENP-A/CenH3.
  • the method includes amplifying the nucleic acid molecules isolated from the immunoprecipitated protein-DNA complexes prior to sequencing the isolated nucleic acid molecules.
  • individual nucleic acid molecules isolated from the immunoprecipitated protein-DNA complexes are amplified separately prior to sequencing the nucleic acid molecules.
  • the methods include, prior to sequencing the nucleic acid molecules, separately amplifying individual nucleic acid molecules of the population of immunoprecipitated nucleic acid molecules to generate single nucleic acid molecule amplification products corresponding to individual nucleic acid molecules of the immunoprecipitated nucleic acid molecule population using a machine that isolates single nucleic acid molecules from a population of nucleic acid molecules prior to amplification.
  • a high throughput parallel sequencing system isolates single nucleic acid molecules from a population of nucleic acid molecules prior to amplification, performs amplification reactions on the isolated individual nucleic acid molecules to generate isolated amplification products of the individual nucleic acid molecules of the population, and performs parallel sequencing reactions on the isolated amplification products of the individual nucleic acid molecules of the population to provide sequences of the individual molecules of the population.
  • the methods further include performing one or more assays to evaluate the centromere sequence.
  • an assay can be performed for stable heritability of an artificial chromosome comprising the centromere sequence in which the presence of the centromere sequence or a nucleic acid sequence linked thereto on an artificial chromosome is detected.
  • An assay for centromere function in some embodiments detects the presence of a selectable or nonselectable marker on an artificial chromosome comprising the centromere sequence.
  • recombinant nucleic acid molecules comprising centromere sequences identified by the methods of the invention, in which the centromere sequence is not adjacent to one or more sequences positioned adjacent to the centromere sequence in the genome from which the centromere sequence is derived.
  • the recombinant nucleic acid molecule can include sequences adjacent to the identified centromere sequence that are derived from the same organism or species from which the centromere sequence is derived, can be adjacent to sequences derived from another organism or species, or can include synthetic sequences.
  • nucleic acid molecules that comprise a sequence having at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to at least 30, between 30 and 40, between 40 and 50, between 50 and 60, between 60 and 70, between 70 and 80, between 80 and 90, between 90 and 100, or at least 100 bp, between 100 and 125 bp, between about 125 bp and about 150 bp, between about 150 bp and about 200 bp, between about 200 bp and about 300 bp, between about 300 bp and about 400 bp, between about 400 bp and about 500 bp, between about 500 bp and about 1 Kb, between about 1 Kb and about 2 Kb, between about 2 Kb and about 3 Kb, between about 3 Kb and about 4 Kb, between about 4 Kb and about 5 Kb, between about 5 Kb and about 6 Kb, between about 6 Kb and
  • the invention further includes a recombinant nucleic acid molecule comprising an algal centromere sequence having at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity, to at least 30, between 30 and 40, between 40 and 50, between 50 and 60, between 60 and 70, between 70 and 80, between 80 and 90, between 90 and 100, or at least 100 bp, between 100 and 125 bp, between about 125 bp and about 150 bp, between about 150 bp and about 200 bp, between about 200 bp and about 300 bp, between about 300 bp and about 400 bp, between about 400 bp and about 500 bp, between about 500 bp and about 1 Kb, between about 1 Kb and about 2 Kb, between about 2 Kb and about 3 Kb, between about 3 Kb and about 4 Kb, between about 4 Kb and about 5 Kb, between about 5 Kb and about 6 Kb,
  • the artificial chromosome can include at least one selectable or nonselectable marker.
  • an artificial chromosome that includes a centromere sequence identified by the methods of the invention or a sequence derived therefrom includes at least one gene encoding a structural protein, a regulatory protein, an enzyme, a ribozyme, an antisense RNA, or an RNA that participates in gene silencing, such as but not limited to an shRNA, or an siRNA.
  • Also included in the invention are cells that comprise artificial chromosomes as disclosed herein.
  • An artificial chromosome can be introduced into a cell by any feasible transformation method, or an artificial chromosome can be transmitted to a cell by means of sexual or asexual reproduction.
  • the terms “about” or “approximately” when referring to any numerical value are intended to mean a value of plus or minus 10% of the stated value.
  • “about 50 degrees C” (or “approximately 50 degrees C”) encompasses a range of temperatures from 45 degrees C to 55 degrees C, inclusive.
  • “about 100 mM” (or “approximately 100 mM”) encompasses a range of concentrations from 90 mM to 110 mM, inclusive.
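The plus-or-minus 10% convention above can be checked with a short Python sketch. The function names `about_range` and `is_about` are illustrative only and do not appear in the source.

```python
def about_range(value: float) -> tuple[float, float]:
    """Return the inclusive (low, high) range covered by "about <value>"."""
    delta = abs(value) / 10  # plus or minus 10% of the stated value
    return (value - delta, value + delta)


def is_about(measured: float, stated: float) -> bool:
    """True if `measured` lies within "about <stated>", bounds inclusive."""
    low, high = about_range(stated)
    return low <= measured <= high
```

For instance, `about_range(100)` gives the 90 to 110 range stated for "about 100 mM".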
  • a “homolog” of a gene or protein refers to its functional equivalent in another species.
  • a “variant” of a gene or protein sequence is a sequence having at least 65% identity with the referenced gene or protein sequence, and can include one or more base deletions, additions, or substitutions with respect to the referenced sequence.
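As a rough illustration of the 65% identity threshold, the sketch below computes percent identity over two sequences that are assumed to be already aligned to equal length, with `-` marking gaps. The source does not say how identity is calculated, so the alignment requirement, the treatment of gaps as mismatches, and the function names are all assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"  # assumption: a gap column never counts as a match
    )
    return 100.0 * matches / len(aligned_a)


def is_variant(aligned_ref: str, aligned_candidate: str, min_identity: float = 65.0) -> bool:
    """True if the candidate meets the minimum identity to the reference."""
    return percent_identity(aligned_ref, aligned_candidate) >= min_identity
```

A production comparison would first align the sequences with a dedicated alignment tool; this sketch only covers the final scoring step.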
  • centromere is used herein to mean a nucleic acid sequence that confers the apportionment of a nucleic acid molecule that comprises the sequence to daughter cells during cell division.
  • a centromere can be a naturally occurring sequence, a variant of a naturally-occurring sequence, or a fully synthetic sequence.
  • a centromere may be derived from an organism other than the organism in which it promotes stable transmission of a nucleic acid molecule comprising the centromere sequence.
  • a centromere as identified by the methods herein and used in compositions as disclosed herein, such as artificial chromosomes, can confer stable transmission of a nucleic acid molecule to between about 50 and about 100% of daughter cells, for example, to about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 90%, about 95% or greater than 95% of daughter cells.
  • the centromere may confer stable segregation of a nucleic acid sequence, including a recombinant construct comprising the centromere, through mitotic or meiotic divisions, including through both meiotic and mitotic divisions.
  • the invention also relates to centromeres identified using the disclosed methods, and recombinant nucleic acid molecules that include centromere sequences and variants thereof.
  • the invention includes artificial chromosomes that include centromeres.
  • an "artificial chromosome" is a recombinant linear or circular DNA molecule that is able to replicate in a cell and is stably inherited by the progeny of the cell.
  • An artificial chromosome typically includes: (1) an origin of replication, for initiation of DNA replication (which in some embodiments can be present within a centromere sequence), (2) a centromere (which provides for the partitioning of the replicated chromosomes into daughter cells at mitosis or meiosis), and (3) if the chromosome is linear, telomeres (specialized DNA structures at the ends of linear chromosomes that function to stabilize the ends and facilitate the complete replication of the extreme termini of the DNA molecule).
  • An artificial chromosome optionally includes one or more additional genes, regulatory elements, or chromatin organizing regions.
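The component list above can be pictured as a small data structure. The following Python sketch is only a bookkeeping aid under stated assumptions: the class name, field names, and the `missing_elements` check are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ArtificialChromosome:
    centromere: str                               # centromere sequence
    origin_of_replication: Optional[str] = None   # separate origin, if any
    origin_within_centromere: bool = False        # origin may lie inside the centromere
    topology: str = "circular"                    # "circular" or "linear"
    telomeres: Optional[Tuple[str, str]] = None   # required only when linear
    extra_elements: List[str] = field(default_factory=list)  # optional genes, regulatory elements

    def missing_elements(self) -> List[str]:
        """List the typical elements that this design lacks."""
        missing = []
        if not self.centromere:
            missing.append("centromere")
        if not self.origin_of_replication and not self.origin_within_centromere:
            missing.append("origin of replication")
        if self.topology == "linear" and self.telomeres is None:
            missing.append("telomeres")
        return missing
```

A circular design with a centromere and an origin reports nothing missing, while a linear design without telomeres reports `["telomeres"]`.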
  • the invention includes methods of identifying a centromere sequence that include immunoprecipitating protein-DNA complexes from chromatin isolated from a cell using an antibody to a centromere-associated protein; isolating nucleic acid molecules from the immunoprecipitated protein-DNA complexes; and sequencing the isolated nucleic acid molecules to identify a centromere sequence.
  • the nucleic acid molecules isolated from immunoprecipitated protein-DNA complexes are amplified prior to sequencing.
  • the identification of a centromere sequence does not rely on the use of previously identified sequences.
  • the methods of the invention do not include hybridization of nucleic acid molecules isolated from immunoprecipitated protein-DNA complexes (or nucleic acid molecules amplified therefrom) to confirmed or putative centromere sequences or clones, such as sequences having a repeated sequence motif, and do not include comparison of sequences obtained by sequencing of affinity-captured products to sequences previously identified as putative centromere sequences or centromere-proximal sequences.
  • one or more centromere sequences is identified by methods that include: immunoprecipitating protein-DNA complexes from chromatin isolated from a cell using an antibody to a centromere-associated protein; separately sequencing individual nucleic acid molecules of a population of nucleic acid molecules isolated from the protein-DNA complexes; calculating the frequency of occurrence of each nucleic acid sequence in the population of nucleic acid molecules isolated from the protein-DNA complexes; and identifying a nucleic acid molecule sequence which has an increased frequency of occurrence in the population as a centromere sequence.
  • a high frequency of occurrence of a sequence in a population of sequences isolated using chromatin precipitation with specific binding members that bind centromere-associated proteins is an indication of a high specificity of binding.
  • individual nucleic acid molecules or amplified products thereof are isolated from one another and sequenced separately, such that each independently obtained sequence correlates to a single molecule of a population of nucleic acid molecules isolated from immunoprecipitated protein-DNA complexes.
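The frequency-of-occurrence step can be sketched in Python as follows. This is a simplified illustration, not the procedure of the source: it treats reads as identical only when their strings match exactly, and it takes the median read count as the reference level for "increased" frequency. Both choices, along with the function names and the `min_fold` parameter, are assumptions.

```python
from collections import Counter
from statistics import median


def sequence_frequencies(reads):
    """Map each distinct read sequence to its fraction of all reads."""
    counts = Counter(read.upper() for read in reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def enriched_sequences(reads, min_fold=2.0):
    """Return sequences whose count is at least `min_fold` times the median count."""
    counts = Counter(read.upper() for read in reads)
    if not counts:
        return []
    reference = median(counts.values())  # assumption: median count as reference level
    return sorted(seq for seq, n in counts.items() if n >= min_fold * reference)
```

In practice reads would usually be mapped to a genome or clustered by similarity before counting, since exact string matching misses overlapping fragments of the same locus.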
  • Separate sequencing of isolated individual nucleic molecules is preferably performed by a high-throughput parallel sequencing system that performs, for example, at least 10,000, at least 20,000, at least 50,000, at least 100,000, or at least 200,000 nucleic acid sequencing reactions simultaneously.
  • the methods of the invention in some preferred embodiments use chromatin isolated from one or more cells of an algal, fungal, or protist species, where a centromere sequence identified using the methods of the invention can be an algal, fungal, or protist centromere sequence.
  • An algal species can be any algal species, including, without limitation, a species of green, yellow-green, brown, golden brown, or red algae, a diatom species or a dinoflagellate species.
  • a centromere sequence identified using the methods provided herein is a centromere sequence of an alga of the Chlorophyceae class, such as of the Dunaliellales, Volvocales, Chlorococcales, Oedogoniales, Sphaeropleales, Chaetophorales, Microsporales, or Tetrasporales order.
  • an algal cell can be a cell of an Amphora, Ankistrodesmus, Asteromonas, Botryococcus, Chaetoceros, Chlamydomonas, Chlorococcum, Chlorella, Cricosphaera, Crypthecodinium, Cyclotella, Dunaliella, Emiliania, Euglena, Haematococcus, Halocafeteria, Isochrysis, Monoraphidium, Nannochloris, Nannochloropsis, Navicula, Neochloris, Nitzschia, Ochromonas, Oedogonium, Oocystis, Ostreococcus, Pavlova, Phaeodactylum, Pleurochrysis, Pleurococcus, Pyramimonas, Scenedesmus, Skeletonema, Stichococcus, Tetraselmis, Thalassiosira or Volvox species.
  • the cell used for isolation of chromatin is a fungal cell, for example, a cell of a chytrid, blastocladiomycete, neocallimastigomycete, zygomycete, trichomycete, glomeromycete, ascomycete, or basidiomycete.
  • the methods of the invention are used to identify centromeres of protists, including members of the Labyrinthulomycota group (such as but not limited to thraustochytrids), water molds, slime molds (myxomycota), and protozoans (e.g., members of the rhizopoda, apicomplexa, and ciliophora).
  • a Schizochytrium or Thraustochytrium species is used in the methods of the invention. Organisms from the orders Chlorophyta, Bacillariophyta, Prymnesiophyceae, Chrysophyta, Prasinophyceae are contemplated for use in the invention.
  • the methods are used to identify a centromere of a microorganism, such as a eukaryotic microalga, protist, or fungus.
  • a microorganism is collected or cultured prior to isolation of chromatin.
  • the microorganism can be cultured on liquid, solid, or semi-solid media, such as, for example, agar plates.
  • nuclei are isolated to provide a source of chromatin.
  • nuclei and/or chromatin can be isolated using osmotic shock or homogenization, and/or using enzymes that degrade the cell wall, coat, or membrane of an organism, and/or one or more detergents.
  • Chromatin isolation and chromatin immunoprecipitation can be performed under a variety of conditions (see, for example, US 6,410,233; US 6,410,243; Wang et al. The Plant J. 32: 831-843 (2002)), some of which are disclosed herein. Buffers, detergents, and fragmentation conditions, where used, can be altered to increase specificity and allow for high quality sequencing of nucleic acid molecules isolated from immunoprecipitated complexes.
  • the methods disclosed herein do not include addition of a cross-linking agent prior to immunoprecipitating protein-DNA complexes from the fragmented chromatin.
  • one or more specific binding partners for one or more proteins that associate with the centromere can be used for affinity capture of protein-DNA complexes that include centromere sequences.
  • one protein that participates in a centromere protein complex can be used as a specific binding member for capture of another member of the complex that directly binds the centromere.
  • Immunoprecipitation or affinity capture can be performed in any format, and can include, for example, capture to a solid support, such as a matrix, bead, particle, fiber, membrane, filter, or chip.
  • Proteins useful as targets for immunoprecipitation or affinity capture of chromatin to isolate or identify centromere sequences include centromere-associated proteins, or proteins that directly or indirectly bind the centromere of a chromosome, and include, without limitation, centromere proteins (proteins that directly bind the centromere), centromere protein-recruiting proteins, and kinetochore proteins (Vos et al. Biochem. Cell Biol. 84: 619-639 (2006)).
  • Centromere proteins include, without limitation, CENP-A/CenH3, CENP-B, CenH3, CENP-C, CENP-G, CENP-H, CENP-I, CENP-U(50), Mis12, PARP-1, and PARP-2, and homologs thereof.
  • Centromere protein-recruiting proteins include, without limitation, RbAp46 and RbAp48 and homologs thereof.
  • Kinetochore proteins include, without limitation, PMF1, DC8, c20orf172, Zwint-1, Zw10, Rod, Zwilch, Dynein, p150(Glued), Ndc80/Hec1, Nuf2, Spc24, Spc25, KNL-3, KNL-1, Bub1, Bub3, BubR1, Mad1, Mad2, or homologs thereof.
  • Immunoprecipitation or affinity capture can use antibodies or specific binding members that bind to more than one centromere-associated protein.
  • chromatin immunoprecipitation is performed with an antibody that specifically binds a centromere protein, such as for example, an antibody that specifically binds to CENP-A/CenH3 or a homolog of CENP-A/CenH3.
  • an antibody used for chromatin immunoprecipitation specifically binds to the N terminus of CENP-A/CenH3 or a homolog of CENP-A/CenH3.
  • the chromatin is fragmented prior to sequencing of the nucleic acid molecules of the captured protein-DNA complexes.
  • the chromatin may be fragmented to some extent during the course of the chromatin isolation procedure, and no separate fragmentation step is performed.
  • the fragmentation can be performed prior to immunoprecipitation (or affinity capture), after immunoprecipitation (or affinity capture), or both.
  • Chromatin can be fragmented by physical (mechanical) or chemical means, for example, by sonicating, shearing, enzymatic digestion, or chemical cleavage of DNA.
  • nucleic acid molecules are individually sequenced using any nucleic acid sequencing techniques that provide accurate sequences of a large number of individual nucleic acid molecules.
  • solid phase sequencing performed by a high throughput parallel sequencing system can be used to sequence at least 10,000, at least 20,000, at least 50,000, at least 100,000, or at least 200,000 or more, nucleic acid molecules in parallel.
  • separate sequencing of individual nucleic molecules is performed using a high throughput parallel sequencing machine that isolates single nucleic acid molecules of a population of nucleic acid molecules prior to sequencing.
  • Such machines or "Next Generation sequencing systems" include, without limitation, sequencing machines developed by Illumina and Solexa (the Genome Analyzer), sequencing machines developed by Applied Biosystems, Inc. (the SOLiD Sequencer), sequencing systems developed by Roche (e.g., the 454 GS FLX sequencer), and others.
  • sequences of a large number of the individual nucleic acid molecules of the population are determined (or as many as can be determined with high accuracy), for example, 10,000 or more, 20,000 or more, 50,000 or more, 100,000 or more, 200,000 or more, 500,000 or more, 1,000,000 or more, 2,000,000 or more, 5,000,000 or more, or 10,000,000 or more.
  • a baseline frequency of the occurrence of a non-centromere sequence in the immunoprecipitated population is determined by mapping the sequences onto the genome of the organism, if available, and computing the average sequence coverage in regions of the genome, excluding peaks of high coverage that may represent centromere sequences. Averaging of sequence coverage may be done across entire chromosomes excluding peaks of high coverage, or across specific chromosomal regions. Sequences occurring at greater than a selected frequency above background, such as above a frequency that is 2-fold, between 2- and 5-fold, 5-fold, between 5- and 10-fold, 10-fold, or more than 10-fold the background frequency in the population of nucleic acid molecules isolated from immunoprecipitated protein-DNA complexes, are identified as centromere sequences.
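The baseline-and-cutoff logic can be illustrated with a short Python sketch that works on per-window read coverage along a chromosome. The source does not say how high-coverage peaks are recognized before averaging, so the rule used here (a window is a peak if its coverage exceeds `peak_factor` times the median) is an assumption, as are the function and parameter names.

```python
from statistics import mean, median


def baseline_coverage(window_coverage, peak_factor=5.0):
    """Average coverage over windows that are not high-coverage peaks."""
    if not window_coverage:
        raise ValueError("window_coverage must not be empty")
    # Assumption: windows above peak_factor x median are treated as peaks.
    cutoff = peak_factor * median(window_coverage)
    background = [c for c in window_coverage if c <= cutoff]
    return mean(background)


def candidate_windows(window_coverage, fold=2.0, peak_factor=5.0):
    """Indices of windows whose coverage exceeds `fold` times the baseline."""
    base = baseline_coverage(window_coverage, peak_factor)
    return [i for i, c in enumerate(window_coverage) if c > fold * base]
```

With coverage values `[10, 12, 8, 10, 200, 180, 9, 11]`, the two high windows are excluded from the average, the baseline is 10, and a 10-fold cutoff flags windows 4 and 5.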
  • a further normalization step can be performed in which the frequency of sequences across the genomic locus corresponding to the obtained sequence frequency peak is normalized to reflect equal representation of repetitive and nonrepetitive sequence across the locus.
  • identifying a high frequency occurrence sequence as a centromere sequence also includes identifying one or more regions of higher than average A+T content of the genome. In some methods, identifying a high frequency occurrence sequence as a centromere sequence also includes identifying one or more repeated sequences within the high frequency occurrence sequence. In some embodiments, a repeated sequence ("motif") found in one or more high frequency occurrence sequences is used in identifying further putative centromere sequences. In some cases, a repeated sequence is at least 10 base pairs in length, such as between about 10 base pairs and about 1 Kb, or between about 10 base pairs and about 500 base pairs, or between about 25 base pairs and about 350 base pairs, or between about 50 base pairs and about 250 base pairs.
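The A+T comparison mentioned above reduces to a simple ratio. In the Python sketch below, ambiguous bases such as N are ignored, and the genome-wide A+T fraction is supplied by the caller. Both choices, and the function names, are assumptions made for illustration.

```python
def at_fraction(seq: str) -> float:
    """Fraction of called bases (A, C, G, T) that are A or T."""
    called = [b for b in seq.upper() if b in "ACGT"]  # assumption: skip N and other codes
    if not called:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return sum(b in "AT" for b in called) / len(called)


def is_at_rich(seq: str, genome_at_fraction: float) -> bool:
    """True if the sequence has higher A+T content than the genome average."""
    return at_fraction(seq) > genome_at_fraction
```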
  • a repeated sequence motif identified within a high frequency occurrence sequence is less than 10 bp, such as a dinucleotide repeat, a trinucleotide repeat, a tetranucleotide repeat, a pentanucleotide repeat, a sextanucleotide repeat, a heptanucleotide repeat, an octonucleotide repeat, or a nonanucleotide repeat.
  • a repeated sequence motif identified within a high frequency occurrence sequence is a dinucleotide repeat or a trinucleotide repeat.
  • a repeated sequence of greater than 10 base pairs can be present in 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, between 20 and 30, between 30 and 40, between 40 and 50, between 50 and 60, between 60 and 70, between 70 and 80, between 80 and 90, between 90 and 100, between 100 and 125, between 125 and 150, between 150 and 200, between 250 and 300, between 300 and 350, between 350 and 400, between 400 and 450, between 450 and 500, between 500 and 1000 copies at a locus identified using the present methods.
  • a repeated sequence of less than 10 base pairs, such as, for example, a dinucleotide or trinucleotide repeat, is in some cases found in repeats of 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, between 20 and 30, between 30 and 40, between 40 and 50, between 50 and 60, between 60 and 70, between 70 and 80, between 80 and 90, between 90 and 100, between 100 and 125, between 125 and 150, between 150 and 200, between 250 and 300, between 300 and 350, between 350 and 400, between 400 and 450, between 450 and 500, between 500 and 1000 copies at a locus identified using the present methods.
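Short tandem repeats of the kind described above (units of 2 to 9 bp) can be located with a backreference regular expression. This Python sketch only finds perfect, uninterrupted repeats; the function name, the `min_copies` parameter, and the preference for the shortest unit are assumptions for illustration.

```python
import re

# A unit of 2-9 bases (shortest unit tried first) followed by one or more exact copies.
_TANDEM = re.compile(r"([ACGT]{2,9}?)\1+")


def find_short_tandem_repeats(seq: str, min_copies: int = 2):
    """Return (unit, copies, start) for each perfect tandem repeat found."""
    hits = []
    for match in _TANDEM.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        if copies >= min_copies:
            hits.append((unit, copies, match.start()))
    return hits
```

Note that a homopolymer run such as `AAAA` is reported as a repeat of the unit `AA`, and repeats with mismatches or longer units need a dedicated repeat finder.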
  • the cutoff frequency level above which a sequence is identified as a centromere can take into account the expected number of centromeres in the organism used for chromatin isolation. Selection of a cutoff frequency level above which a sequence is identified as a centromere in some embodiments takes into account the percentage of A+T in sequences that are above or below a proposed cutoff level. Selection of a cutoff value can in some embodiments take into account the presence or absence of repeated sequence motifs within individual nucleic acid molecule sequences above a frequency value, such as the presence or absence of repeated dinucleotide or trinucleotide sequence motifs, or the presence or absence of satellite sequences within individual nucleic acid molecule sequences above a frequency value.
  • the methods provided herein do not include hybridizing a nucleic acid molecule isolated from the immunoprecipitated protein-DNA complexes to one or more known centromere sequences or centromere-linked sequences. In some preferred embodiments, the methods do not include hybridizing a nucleic acid molecule isolated from the immunoprecipitated protein-DNA complexes to one or more repetitive sequences previously known in the organism from which the chromatin is isolated. In some embodiments, the method includes amplifying the nucleic acid molecules isolated from the immunoprecipitated protein-DNA complexes prior to sequencing the isolated nucleic acid molecules.
  • individual nucleic acid molecules isolated from the immunoprecipitated protein-DNA complexes are amplified separately prior to sequencing the nucleic acid molecules.
  • individual nucleic acid molecules of a population of nucleic acid molecules isolated from immunoprecipitated protein-DNA complexes are sequenced separately using a machine that performs high-throughput parallel sequencing.
  • a high-throughput parallel sequencing system isolates single nucleic acid molecules from a population of nucleic acid molecules prior to amplification, performs amplification reactions on the isolated individual nucleic acid molecules to generate isolated amplification products of the individual nucleic acid molecules of the population, and performs parallel sequencing reactions on the isolated amplification products of the individual nucleic acid molecules of the population to provide sequences of the individual molecules of the population.
  • the methods further include performing one or more assays to evaluate the centromere sequence.
  • an assay can be performed for nonintegration into chromosomes and for stable heritability of a nucleic acid construct introduced into a cell, that is, for a nucleic acid construct that includes the sequence to behave as an artificial chromosome.
  • An artificial chromosome vector of the present invention minimally includes a centromere for conferring stable heritability of the artificial chromosome and an origin of replication or "autonomous replication sequence” (ARS) allowing for continuing synthesis of the artificial chromosome, which in some cases may be included in the centromere sequences.
  • An artificial chromosome may optionally also contain any of a variety of elements, including one or more exogenous nucleic acids, including, for example, genes that can be expressed in the host organism (including but not limited to marker genes); a bacterial or yeast plasmid backbone for propagation of the plasmid in bacteria; sequences that function as telomeres in the host organism, where the artificial chromosome is not configured as a circular molecule; cloning sites, such as restriction enzyme recognition sites or sequences that serve as recombination sites; and "chromatin packaging sequences" such as cohesin and condensin binding sites or matrix attachment regions (MARs). Other sequences may be used to intervene between genes or other genetic elements on the artificial chromosome.
  • An assay for centromere function in some embodiments detects the presence of a selectable or nonselectable marker on an artificial chromosome comprising the centromere sequence, or detects the presence of the centromere sequence or a nucleic acid sequence linked thereto on an artificial chromosome.
  • a nucleic acid molecule construct that includes a sequence as identified by the invention or a variant thereof can be introduced into cells using any feasible method, including, without limitation, microparticle bombardment, electroporation, calcium phosphate precipitation of DNA, liposome-mediated transfection, the use of lipid-based transfection agents (such as but not limited to, cationic lipid transfection agents) (e.g., US 7,479,573; US 7,145,039), the use of glass beads or metal "whiskers” with or without agitation, etc., and the cells or nucleic acids isolated from the cells can be examined to determine whether the nucleic acid molecule construct is an autonomous DNA molecule, or whether it is integrated into the chromosomes of the cells.
  • the host cells can be of any species, for example, algal cells, fungal cells, cells of protists, or cells of plants, such as but not limited to higher plants.
  • the host cells will be of the same species or class of organism from which the centromere sequence is derived, although this is not a requirement of the invention.
  • identified sequences can be tested for their ability to function as centromeres in species other than the species from which the sequence was derived.
  • Methods used for functional analysis of centromeres include, but are not limited to the following techniques: 1) Detection of marker protein expression by microscopy, flow cytometry, fluorimetry, enzymatic assays, cell staining or any other technique that allows the detection of a marker protein having a specific enzymatic activity, or conferring a specific color or fluorescence or emission property, or other observable property, onto the cells.
  • a cell line has been selected for containing an artificial chromosome by selecting for the function of a resistance gene encoded by the artificial chromosome, and if a marker protein is also encoded by the artificial chromosome, then expression of this marker protein in the selected cells is an indication of the presence of the entire artificial chromosome, and could indicate autonomy of this artificial chromosome from the cell's other chromosomes.
  • genomic DNA isolated from the cells, tissues or organisms can be fractionated by gel electrophoresis, either intact or following digestion with restriction endonucleases or homing endonucleases, allowing the detection of an artificial chromosome or a fragment of an artificial chromosome.
  • genomic DNA extracted from the cells, tissues or organisms can be digested, fractionated by agarose gel electrophoresis, blotted onto a DNA-binding membrane, and probed with labeled DNA sequences corresponding to sequences present on the artificial chromosome to detect specific fragments of artificial chromosome DNA, thus allowing the determination of the autonomous or integrated structure of the artificial chromosome.
  • Cytological techniques for directly visualizing the artificial chromosome in the transformed cells such as staining of cells with DNA-binding dyes or in situ hybridization with labeled DNA probes corresponding to sequences present on the artificial chromosome.
  • markers present on an autonomous artificial chromosome will segregate independently from markers on the arms of the host chromosomes in a population of F2 progeny generated from a cross between a line carrying an artificial chromosome and a second marked line that doesn't carry the artificial chromosome.
  • the artificial chromosome contains an antibiotic resistance marker for E. coli and an E. coli origin of replication, then DNA extracts from a cell in which the artificial chromosome is present in an autonomous state will be expected to form antibiotic-resistant colonies when transformed into E. coli, and the structure and sequence of the resulting plasmid in E. coli will partially or completely resemble the structure and sequence of the artificial chromosome, whereas DNA extracted from a cell with an integrated copy of the same DNA will not give rise to such colonies, and/or the structure and sequence of any colonies that should arise would provide clear indication of the DNA having been in an integrated state in that cell.
  • an optical map of an organism transformed with an autonomous artificial chromosome would be expected to result in a physical map of that organism's genome showing an extra chromosome, unlinked to the other chromosomes, compared to the untransformed organism or compared to an organism with an integrated copy of the same DNA.
  • Markers that can be used in the nucleic acid constructs include but are not limited to: visible markers conferring a visible characteristic to the plant; selectable markers, conferring resistance to an antibiotic, herbicide, or other toxic compound; enzymatic markers, conferring an enzymatic activity that can be assayed in the plant or in extracts made from the plant; protein markers, allowing the specific detection of a protein expressed in the plant; molecular markers, such as restriction fragment length polymorphisms, amplified fragment length polymorphisms, short sequence repeat (microsatellite) markers, presence of certain sequences in the DNA of the plant as detected by the polymerase chain reaction, single nucleotide polymorphisms or cleavable amplified polymorphic sites.
  • the inheritance of artificial chromosomes can also be measured through one or more cell divisions. After isolating cells that contain the artificial chromosome (for example, by selection for the presence of a marker present on the nucleic acid construct that includes the centromere sequence), the population of cells is allowed to grow (either with or without selection), and the presence of the artificial chromosome is monitored as the cells divide.
  • Artificial chromosomes can be detected in cells by a variety of methods, including but not limited to: detection of fluorescence or any other visual characteristic arising from a marker protein gene present on the artificial chromosome; resistance to an antibiotic, herbicide, toxic metal, salt, mineral or other substance, or abiotic stress as outlined above (isolating cells containing artificial chromosomes); staining of cells with DNA-binding molecules to allow detection of an additional chromosome; in situ hybridization with labeled DNA probes corresponding to sequences present on the artificial chromosome; Southern blots or dot blots of DNA extracted from the cell population and probed with labeled DNA sequences corresponding to sequences present on the artificial chromosome; expression of a marker enzyme encoded by a gene present on the artificial chromosome (e.g., luciferase, alkaline phosphatase, beta-galactosidase, etc.) that can be assayed in the cells or in an extract made from the
  • the percentage of cells containing the chromosome is determined at regular intervals during this growth phase.
  • the change in the fraction of cells harboring the artificial chromosome, divided by the number of cell divisions, represents the average artificial chromosome loss rate.
  • Artificial chromosomes with the lowest loss rates have the highest level of inheritance.
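The loss-rate calculation described above can be sketched as follows. This is an illustrative example only: the function names and the dictionary of measurements are hypothetical, and the fractions are assumed to come from whichever detection method is used to score cells for the artificial chromosome.

```python
def average_loss_rate(fraction_start: float, fraction_end: float,
                      divisions: float) -> float:
    """Average loss rate per cell division.

    Computed as the change in the fraction of cells carrying the
    artificial chromosome divided by the number of cell divisions.
    """
    if divisions <= 0:
        raise ValueError("divisions must be positive")
    return (fraction_start - fraction_end) / divisions


def rank_by_inheritance(measurements: dict) -> list:
    """Order construct names from lowest to highest loss rate.

    measurements maps a construct name to a tuple of
    (fraction_start, fraction_end, divisions).
    """
    rates = {name: average_loss_rate(*m) for name, m in measurements.items()}
    return sorted(rates, key=rates.get)
```

For instance, a construct present in 100% of cells at the start and 80% after 20 divisions has an average loss rate of 0.01 per division; constructs at the front of the ranked list show the highest level of inheritance.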
  • centromere on an artificial chromosome can be detected by a variety of methods relating to the presence of proteins normally found associated with centromeres.
  • proteins include but are not limited to CenH3, CenpA, CenpB and other proteins normally found associated with the centromere or kinetochore.
  • Methods for detecting such proteins to demonstrate centromere function include but are not limited to immunocytochemistry, chromatin immunoprecipitation (ChIP) followed by selective hybridization, PCR or sequencing to demonstrate enriched presence of particular sequences, fluorescence activated chromosome sorting or other methods of fractionating a cell's genome followed by immunocytochemistry or chromatin immunoprecipitation (ChIP).
  • Recovery of artificial chromosomes from cells can be achieved by any of a variety of techniques, including, but not limited to, the following: 1) Extracting the genomic DNA of transformed cells and introducing that DNA into E. coli, other bacteria or yeast and selecting for the antibiotic resistance genes present on the artificial chromosome.
  • the resulting artificial chromosomes recovered after being passaged through host cells in this way may differ from their parental molecules in total size, size of the centromere, presence or absence of additional sequences, and overall arrangement of the sequences.
  • These procedures allow the isolation of DNA molecules capable of replicating and segregating in cells of an organism of interest, such as an alga, fungus, or protist, without having to test artificial chromosomes individually.
  • introducing pools of artificial chromosomes, or pools of centromere clones, into algal cells and recovering them by the methods listed above facilitates the selection of specific artificial chromosomes or centromere clones that remain autonomous in algal cells.
  • pools of centromere clones can be delivered into cells of an organism followed by recovery of the ones that successfully replicate and persist, such that the recovered clones can guide the design of optimal artificial chromosome constructs.
  • the invention includes recombinant nucleic acid molecules comprising centromere sequences identified by the methods of the invention, in which the centromere sequence is no longer adjacent to one or more sequences positioned adjacent to the centromere sequence in the genome from which the centromere sequence is derived.
  • a centromere sequence identified using the methods provided herein is a centromere sequence derived from an alga, such as an alga of the Chlorophyceae class, such as a centromere sequence of an alga of the Dunaliellales, Volvocales, Chlorococcales, Oedogoniales, Sphaeropleales, Chaetophorales, Microsporales, or Tetrasporales order.
  • an algal cell can be a cell of an Amphora, Ankistrodesmus, Asteromonas, Botryococcus, Chaetoceros, Chlamydomonas, Chlorococcum, Chlorella, Cricosphaera, Crypthecodinium, Cyclotella, Dunaliella, Emiliania, Euglena, Haematococcus, Halocafeteria, Isochrysis, Monoraphidium, Nannochloris, Nannochloropsis, Navicula, Neochloris, Nitzschia, Ochromonas, Oedogonium, Oocystis, Ostreococcus, Pavlova, Phaeodactylum, Pleurochrysis, Pleurococcus, Pyramimonas, Scenedesmus, Skeletonema, Stichococcus, Tetraselmis, Thalassiosira or Volvox species.
  • a recombinant nucleic acid molecule comprises a centromere sequence derived from a fungal or protist cell.
  • nucleic acid molecules that comprise centromere sequences in some embodiments comprise one or more copies of a repeated sequence of greater than 10 base pairs; for example, a repeated motif of between about 10 and about 500 base pairs can be present in 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, between 20 and 30, between 30 and 40, between 40 and 50, between 50 and 60, between 60 and 70, between 70 and 80, between 80 and 90, between 90 and 100, between 100 and 125, between 125 and 150, between 150 and 200, between 250 and 300, between 300 and 350, between 350 and 400, between 400 and 450, between 450 and 500, or between 500 and 1000 copies at a locus identified using the present methods.
  • a repeated motif is the 184 base pair sequence of Table 7, for example, SEQ ID NO:168, SEQ ID NO:169, or SEQ ID NO:170, as disclosed in Example 10.
  • the invention includes an algal centromere sequence that comprises two or more copies of the sequence of SEQ ID NO:168, SEQ ID NO:169, or SEQ ID NO:170, as well as algal centromeres having two or more copies of sequences having at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to SEQ ID NO: 168.
  • the invention includes an artificial chromosome that comprises an algal centromere sequence that comprises two or more copies of the sequence of SEQ ID NO: 168, as well as algal artificial chromosomes having two or more copies of sequences having at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to SEQ ID NO: 168.
  • the artificial chromosome is a Chlamydomonas artificial chromosome.
  • the invention further includes an algal cell, such as a Chlamydomonas cell, that includes an artificial chromosome having a centromere that comprises two or more copies of sequences having at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to SEQ ID NO: 168.
  • a repeated motif is the 111 or 112 base pair sequence of Table 8, for example SEQ ID NO:171, SEQ ID NO:172, SEQ ID NO:173, SEQ ID NO:174, SEQ ID NO: 175, or SEQ ID NO: 176, as disclosed in Example 10.
  • the invention includes an algal centromere sequence that comprises two or more copies of the sequence of SEQ ID NO:171, SEQ ID NO:172, SEQ ID NO:173, SEQ ID NO:174, SEQ ID NO:175, or SEQ ID NO:176, as well as algal centromeres having two or more copies of sequences having at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to SEQ ID NO:171, SEQ ID NO:172, SEQ ID NO:173, SEQ ID NO:174, SEQ ID NO:175, or SEQ ID NO:176.
  • the invention includes an artificial chromosome that comprises an algal centromere sequence that comprises two or more copies of the sequence of SEQ ID NO:171, SEQ ID NO:172, SEQ ID NO:173, SEQ ID NO:174, SEQ ID NO:175, or SEQ ID NO:176, as well as algal artificial chromosomes having two or more copies of sequences having at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to SEQ ID NO:171, SEQ ID NO:172, SEQ ID NO:173, SEQ ID NO:174, SEQ ID NO:175, or SEQ ID NO:176.
  • the artificial chromosome is a Chlamydomonas artificial chromosome.
  • the invention further includes an algal cell, such as a Chlamydomonas cell, that includes an artificial chromosome having a centromere that comprises two or more copies of sequences having at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to SEQ ID NO:171, SEQ ID NO:172, SEQ ID NO:173, SEQ ID NO:174, SEQ ID NO:175, or SEQ ID NO:176.
  • Short repeated sequences of less than ten base pairs are also identified at genomic loci using the present methods for identifying centromeres.
  • a short repeated sequence can be, for example, a dinucleotide or trinucleotide repeat, and is in some cases found in repeats of 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, between 20 and 30, between 30 and 40, between 40 and 50, between 50 and 60, between 60 and 70, between 70 and 80, between 80 and 90, between 90 and 100, between 100 and 125, between 125 and 150, between 150 and 200, between 250 and 300, between 300 and 350, between 350 and 400, between 400 and 450, between 450 and 500, or between 500 and 1000 copies at a locus identified using the present methods.
  • a repeated motif is the dinucleotide sequence GA, AT, CT, CA, GT (or, reading from the opposite strand, TC, TA, AG, TG, AC), as disclosed in Example 10.
  • the invention includes an algal centromere sequence that comprises two or more copies of any of the dinucleotide sequences of Table 9, between two and ten copies of a dinucleotide sequence of Table 9, or ten or more copies of any of the dinucleotide sequences of Table 9.
  • the invention includes an artificial chromosome having an algal centromere that comprises two or more copies of any of the dinucleotide sequences of Table 9, between two and ten copies of a dinucleotide sequence of Table 9, or ten or more copies of any of the dinucleotide sequences of Table 9.
  • the artificial chromosome is a Chlamydomonas artificial chromosome.
  • the invention further includes an algal cell, such as a Chlamydomonas cell, that includes an artificial chromosome having a centromere that comprises two or more copies, for example between two and ten copies, or ten or more tandemly repeated copies, of any of the dinucleotide sequences of Table 9.
  • a repeated motif is the tandemly repeated trinucleotide sequence AAT, TAT, TAA, CAA, CCA, GCT, AGG, or CGT (or, reading from the opposite strand, ATT, ATA, TTA, TTG, TGG, AGC, CCT, or CAG), as disclosed in Example 10.
  • the invention includes an algal centromere sequence that comprises two or more copies of any of the trinucleotide sequences of Table 9, between two and ten copies of a trinucleotide sequence of Table 9, or ten or more copies of any of the trinucleotide sequences of Table 9.
  • the invention includes an artificial chromosome having an algal centromere that comprises two or more repeats of any of the trinucleotide sequences of Table 9, between two and ten repeats of a trinucleotide sequence of Table 9, or ten or more repeats of any of the trinucleotide sequences of Table 9.
  • the artificial chromosome is a Chlamydomonas artificial chromosome.
  • the invention further includes a Chlamydomonas cell that includes an artificial chromosome having a centromere that comprises two or more copies, between two and ten copies, or ten or more tandemly repeated copies of any of the trinucleotide sequences of Table 9.
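Counting tandem copies of a short motif at a locus, as in the copy-number ranges above, can be sketched with a regular expression. This is a hedged illustration rather than the analysis used in Example 10; the function name is hypothetical, and only exact, uninterrupted copies on the given strand are counted.

```python
import re


def longest_tandem_run(seq: str, motif: str) -> int:
    """Return the largest number of consecutive exact copies of motif in seq."""
    if not motif:
        raise ValueError("motif must be non-empty")
    # Match one or more back-to-back copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    runs = [len(m.group(0)) // len(motif)
            for m in pattern.finditer(seq.upper())]
    return max(runs, default=0)
```

A locus such as `CCGAGAGAGATTGAGA` contains a run of four GA copies followed later by a run of two, so the function reports four. To cover both strands, the same call can be repeated with the motif as read from the opposite strand.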
  • the invention includes recombinant nucleic acid molecules comprising a centromere sequence identified by the methods of the invention, in which the centromere sequence is not adjacent to one or more sequences that are positioned next to the centromere sequence in the genome from which the centromere sequence is derived.
  • the invention includes recombinant nucleic acid molecules comprising a centromere sequence identified using the methods of the invention, in which the centromere sequence is adjacent to one or more sequences not positioned adjacent to the centromere sequence in the genome from which the centromere sequence is derived.
  • a recombinant nucleic acid molecule that includes a centromere sequence can include sequences adjacent to the identified centromere sequence that are derived from the same organism or species from which the centromere sequence is derived (but are not adjacent to the centromere sequences in the naturally-occurring genome), can be adjacent to sequences derived from another organism or species, or can include synthetic sequences that are adjacent to the centromere sequence.
  • nucleic acid molecules that comprise a sequence having at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to at least 30, between 30 and 40, between 40 and 50, between 50 and 60, between 60 and 70, between 70 and 80, between 80 and 90, between 90 and 100, or at least 100 bp, between 100 and 125 bp, between about 125 bp and about 150 bp, between about 150 bp and about 200 bp, between about 200 bp and about 300 bp, between about 300 bp and about 400 bp, between about 400 bp and about 500 bp, between about 500 bp and about 1 Kb, between about 1 Kb and about 2 Kb, between about 2 Kb and about 3 Kb, between about 3 Kb and about 4 Kb, between about 4 Kb and about 5 Kb, between about 5 Kb and about 6 Kb, between about 6 Kb and
  • centromere nucleic acid sequences include any of SEQ ID NOs 21-167 (sequences of Table 6), variants, fragments, or variants of fragments of any of SEQ ID Nos 21-167 (sequences of Table 6), such as fragments or variants of SEQ ID NOs 21-167 that retain the ability to segregate during mitotic or meiotic division as described herein.
  • Variants of such sequences include artificially produced modifications as described herein and modifications produced via passaging through one or more bacterial, plant or other host cells as described herein.
  • a variant sequence has at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to at least 30, between 30 and 40, between 40 and 50, between 50 and 60, between 60 and 70, between 70 and 80, between 80 and 90, between 90 and 100, or at least 100 bp, between 100 and 125 bp, between about 125 bp and about 150 bp, between about 150 bp and about 200 bp, between about 200 bp and about 300 bp, between about 300 bp and about 400 bp, between about 400 bp and about 500 bp, between about 500 bp and about 1 Kb, between about 1 Kb and about 2 Kb, between about 2 Kb and about 3 Kb, between about 3 Kb and about 4 Kb, between about 4 Kb and about 5 Kb, between about 5 Kb and about 6 Kb, between about 6 Kb and about 7 Kb, between about 7 Kb and about 8 K
  • a centromere in a recombinant nucleic acid molecule or artificial chromosome of the present invention may comprise novel repeating centromeric sequences.
  • Nucleic acid constructs, including artificial chromosome constructs, that comprise one, two, three, four, five, six, seven, eight, nine, ten, 15, or 20 or more of the elements contained in any of the exemplary vectors described in the examples below are also contemplated.
  • the invention specifically contemplates the alternative use of fragments or variants (mutants) of any of the nucleic acids described herein that retain the desired activity, including nucleic acids that function as centromeres, nucleic acids that function as promoters or other regulatory control sequences, or exogenous nucleic acids.
  • Variants may have one or more additions, substitutions or deletions of nucleotides within the original nucleotide sequence or consensus sequence.
  • Variants include nucleic acid sequences that are at least 50%, 55%, 60, 65, 70, 75, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100% identical to the original nucleic acid sequence.
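As a rough illustration of an identity threshold, the sketch below computes percent identity between two sequences of equal length that are assumed to be already aligned without gaps. This is a simplification made for the example: the text does not specify an alignment method, real comparisons would normally align the sequences first, and the function names and the 65% default are illustrative only.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_identity(original: str, candidate: str,
                   min_identity: float = 65.0) -> bool:
    """True if candidate is at least min_identity percent identical."""
    return percent_identity(original, candidate) >= min_identity
```

Two 10-base sequences that differ at two positions are 80% identical, so they would satisfy an 80% threshold but not an 85% threshold.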
  • Genes used in constructs of the invention may be modified to accommodate the codon usage of the intended host organism, to insert preferred motifs near the translation initiation ATG codon, to remove sequences recognized by the host organism as 5' or 3' splice sites, or to better reflect the GC/AT content of the host organism.
  • the nucleotide sequence of genes can be altered to reflect the codon bias or GC content of the intended host organism.
  • Genes used in constructs of the invention may include a promoter, a coding region and a terminator sequence, which may be separated from each other by restriction endonuclease sites or recombination sites or both. Genes may also include introns, which may be present in any number and at any position within the transcribed portion of the gene, including the 5' untranslated sequence, the coding region and the 3' untranslated sequence. Introns may be natural introns derived from any species, or artificial introns based on the splice site consensus that has been defined for the host species or a related species.
  • the exogenous nucleic acid may include a transcriptional terminator, non-translated leader sequences that enhance expression, a minimal promoter, or a signal sequence controlling the targeting of gene products to plant compartments or organelles such as but not limited to the chloroplast of an algal host cell.
  • the coding regions of the genes can encode any protein, including but not limited to visible marker genes (for example, fluorescent protein genes, other genes conferring a visible phenotype to the plant) or other screenable or selectable marker genes (for example, conferring resistance to antibiotics, herbicides or other toxic compounds or encoding a protein that confers a growth advantage to the cell expressing the protein) or genes which confer some commercial or environmental remediation value to the organism.
  • genes can be placed on the same mini-chromosome vector, limited only by the number of restriction endonuclease sites or site-specific recombination sites present in the vector.
  • the genes may be separated from each other by restriction endonuclease sites, homing endonuclease sites, recombination sites or any combinations thereof. Any number of genes can be present, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10 genes may be present on an artificial chromosome.
  • the artificial chromosome vector may also contain a bacterial plasmid backbone for propagation of the plasmid in bacteria such as E. coli.
  • the plasmid backbone may be that of a low-copy vector or in other embodiments it may be desirable to use a mid to high level copy backbone. In one embodiment of the invention, this backbone contains the replicon of the F' plasmid of E. coli.
  • other plasmid replicons such as the bacteriophage P1 replicon, or other low-copy plasmid systems such as the RK2 replication origin, may also be used.
  • the backbone may include one or several antibiotic-resistance genes conferring resistance to a specific antibiotic to the bacterial cell in which the plasmid is present.
  • Bacterial antibiotic-resistance genes include but are not limited to kanamycin-, ampicillin-, chloramphenicol-, streptomycin-, spectinomycin-, tetracycline- and gentamycin-resistance genes.
  • the artificial chromosome vector may optionally also contain telomeres.
  • Telomeres are specialized DNA structures at the ends of linear chromosomes that function to stabilize the ends and facilitate the complete replication of the extreme termini of the DNA molecule.
  • An exemplary telomere sequence identified in the green unicellular alga Chlamydomonas reinhardtii is TTTTAGGG or its complement (Petracek et al. Proceedings of the National Academy of Sciences 87: 8222-8226 (1990)).
  • the artificial chromosome vector may contain "stuffer DNA" sequences that serve to separate the various components on the artificial chromosome (centromere, genes, telomeres) from each other.
  • the stuffer DNA may be of any origin, prokaryotic or eukaryotic, and from any genome or species, plant, animal, microbe or organelle, or may be of synthetic origin.
  • the stuffer DNA can range from 100 bp to 10 Mb in length and can be repetitive in sequence, with unit repeats from 10 to 1,000,000 bp.
  • repetitive sequences that can be used as stuffer DNAs include but are not limited to: rDNA, satellite repeats, retroelements, transposons, pseudogenes, transcribed genes, microsatellites, tDNA genes, short sequence repeats and combinations thereof.
  • the stuffer DNA can consist of unique, non-repetitive DNA of any origin or sequence.
  • the stuffer sequences may also include DNA with the ability to form boundary domains, such as but not limited to scaffold attachment regions (SARs) or matrix attachment regions (MARs).
  • the stuffer DNA may be entirely synthetic, composed of random sequence.
  • the stuffer DNA may have any base composition, or any A/T or G/C content.
  • the G/C content of the stuffer DNA could resemble that of the organism or could be much lower or much higher.
  • the stuffer sequences could be synthesized to contain an excess of any given nucleotide such as A, C, G or T.
  • Different synthetic stuffers of different compositions may also be combined with each other. For example a fragment with low G/C content may be flanked or abutted by a fragment of medium or high G/C content, or vice versa.
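A synthetic stuffer of random sequence with a chosen G/C content could be generated along the following lines. This is a minimal sketch under stated assumptions: the function name, the fixed random seed, and the choice to hit the target composition exactly and then shuffle are illustrative and are not taken from the disclosure.

```python
import random


def make_stuffer(length: int, gc_fraction: float, seed: int = 0) -> str:
    """Return a random DNA sequence with the requested G/C fraction."""
    if length < 0 or not 0.0 <= gc_fraction <= 1.0:
        raise ValueError("length must be >= 0 and gc_fraction within [0, 1]")
    rng = random.Random(seed)  # seeded so the output is reproducible
    n_gc = round(length * gc_fraction)
    # Draw the required number of G/C and A/T bases, then shuffle positions.
    bases = [rng.choice("GC") for _ in range(n_gc)]
    bases += [rng.choice("AT") for _ in range(length - n_gc)]
    rng.shuffle(bases)
    return "".join(bases)


# A low-G/C fragment abutted by a high-G/C fragment, as in the text above.
combined = make_stuffer(200, 0.3, seed=1) + make_stuffer(200, 0.7, seed=2)
```

Fragments of different composition are combined here by simple concatenation; any screening of the result for unwanted restriction sites or repeats would be a separate step.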
  • the artificial chromosome has a circular structure without telomeres. In another embodiment, the artificial chromosome has a circular structure with telomeres.
  • the artificial chromosome has a linear structure with telomeres, as would result if a "linear" structure were to be cut with a unique endonuclease, exposing the telomeres at the ends of a DNA molecule that contains all of the sequence contained in the original, closed construct with the exception of an antibiotic-resistance gene.
  • the telomeres could be placed in such a manner that the bacterial replicon, backbone sequences, antibiotic-resistance genes and any other sequences of bacterial origin and present for the purposes of propagation of the artificial chromosome in bacteria, can be removed from the plant-expressed genes, the centromere, telomeres, and other sequences by cutting the structure with a unique endonuclease. This results in an artificial chromosome from which much of, or preferably all, bacterial sequences have been removed.
  • bacterial sequence present between or among the plant-expressed genes or other artificial chromosome sequences would be excised prior to removal of the remaining bacterial sequences by cutting the artificial chromosome with a homing endonuclease and re-ligating the structure such that the antibiotic-resistance gene has been lost.
  • the unique endonuclease site may be the recognition sequence of a homing endonuclease.
  • the endonucleases and their sites can be replaced with any specific DNA cutting mechanism and its specific recognition site such as rare-cutting endonuclease or recombinase and its specific recognition site, as long as that site is present in the artificial chromosomes only at the indicated positions.
  • mini-chromosome elements can be oriented with respect to each other.
  • a centromere can be placed on an artificial chromosome either between genes or outside a cluster of genes next to one telomere or next to the other telomere.
  • Stuffer DNAs can be combined with these configurations to place the stuffer sequences inside the telomeres, around the centromere, between genes, or any combination thereof.
  • a large number of alternative artificial chromosome structures are possible, depending on the relative placement of centromere DNA, genes, stuffer DNAs, bacterial (or yeast) sequences, telomeres, and other sequences.
  • the sequence content of each of these variants is the same, but their structure may be different depending on how the sequences are placed.
  • the invention further includes a recombinant nucleic acid molecule comprising an algal centromere sequence having at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to at least 30, between 30 and 40, between 40 and 50, between 50 and 60, between 60 and 70, between 70 and 80, between 80 and 90, between 90 and 100, or at least 100 bp, between 100 and 125 bp, between about 125 bp and about 150 bp, between about 150 bp and about 200 bp, between about 200 bp and about 300 bp, between about 300 bp and about 400 bp, between about 400 bp and about 500 bp, between about 500 bp and about 1 Kb, between about 1 Kb and about 2 Kb, between about 2 Kb and about 3 Kb, between about 3 Kb and about 4 Kb, between about 4 Kb and about 5 Kb, between about 5 Kb and about 6 Kb
  • Artificial chromosomes as disclosed herein can include at least one selectable or nonselectable marker.
  • an artificial chromosome that includes a centromere sequence identified by the methods of the invention or a sequence derived therefrom includes at least one gene encoding a structural protein, a regulatory protein, an enzyme, a ribozyme, an antisense RNA, or an RNA that functions in gene silencing, such as but not limited to an shRNA, or an siRNA.
  • an artificial chromosome can be introduced into a cell by any feasible transformation method, or an artificial chromosome can be transmitted to a cell by means of sexual or asexual reproduction.
  • Chlamydomonas reinhardtii centromere sequences were isolated and identified by immunoprecipitation of sheared, native chromatin with antisera raised against epitopes present in the N-terminal part of Chlamydomonas reinhardtii CenH3, and characterized by sequencing.
  • centromere-specific histone H3 from the recently sequenced genome of Chlamydomonas reinhardtii was compared with centromere-specific genes from other species. Antibodies to this protein were used to immunoprecipitate the centromere region in Chlamydomonas reinhardtii.
  • the peptide RTKQSPARPGRKAQAEAC (SEQ ID NO:2) was synthesized and conjugated to keyhole limpet hemocyanin carrier protein. A cysteine was added to the C-terminus for coupling purposes and the peptide was acetylated at its N-terminus. The peptide was injected into rabbits at ProSci Incorporated (Poway, CA). Each rabbit was immunized over an 8 week period.
  • Serum was collected at week 8 and purified by IgG affinity chromatography; 25 ml of serum yielded 2.9 mg at a concentration of 1 mg ml⁻¹. The data indicated that the sera and the polyclonal IgG had very good affinity for the immunized peptide.
  • M2 buffer: M1 buffer with 10 mM MgCl2, 0.5% Triton X-100
  • M3 buffer: M1 buffer without 2-methyl-2,4-pentanediol
  • Sonication buffer: 10 mM potassium phosphate, pH 7.0
  • 1x IP buffer: 50 mM Hepes, pH 7.5
  • a 1:20 dilution culture of Chlamydomonas reinhardtii strain CC-1690 (21 gr mt+) was grown for 5 days in 200 ml TAP medium in a 2 L flask, under lights with shaking (100-150 rpm). The cells were collected from 180 ml total culture volume by centrifugation in 50 ml tubes at 3000 rpm for 5 minutes. The supernatant was discarded and cells were combined into a single 50 ml tube and washed twice with M1 buffer.
  • the cell pellet was resuspended in 5 ml M1 buffer and poured into liquid nitrogen in a mortar; another 5 ml M1 buffer was added to the tube, swirled to collect the remaining cells, and also added to the mortar.
  • the cells were ground for 5 minutes to a very fine powder, and then the ground cells were added to 150 ml M1 buffer in a beaker, stirred briefly to melt and suspend all cells, and filtered through a 40 µm plastic netting (all material passed through the netting).
  • the filtrate was poured into 50 ml centrifuge tubes and spun at 3000 rpm for 10 min at 4 °C.
  • the pellet was washed four times with 50 ml each of M2 buffer then washed once with M3 buffer; about half of the pigment was removed from the pellet in the process but significant pigment remained, possibly indicating semi-intact cells with intact chloroplasts.
  • the pellet was resuspended in 10 ml of sonication buffer and was sonicated with a probe sonicator (Fisher Model 60) at full power (power setting 20) for 4x 20 seconds with extensive chilling in between sonications.
  • Sonication caused the liquid to froth extensively, effectively absorbing all of the sonication energy so that no sound was audible during sonication; chilling in between sonications allowed the froth to settle somewhat.
  • the mixture was kept on ice after sonication to allow the froth to settle; the suspension was distributed among six 2 ml Eppendorf tubes and spun at 12,000 x g, 4 °C for 10 minutes. The clear but green supernatant was removed and transferred into a 15 ml tube for immunoprecipitation.
  • the immunoprecipitated DNA was purified by addition of SDS to 1% and extracted once with phenol-chloroform and once with chloroform.
  • the DNA was precipitated in ethanol and dissolved in TE buffer.
  • the ends of the DNA were repaired by treatment with T4 DNA polymerase, and the DNA molecules were then treated with Taq polymerase in the presence of deoxynucleotide triphosphates to allow nucleotide addition to the ends of the DNA.
  • the DNA fragments were cloned using T/A topoisomerase cloning into pCR4-TOPO (Invitrogen).
  • the topoisomerase ligation products were transformed into E. coli, and transformants were selected on LB-kanamycin plates.
  • Chlamydomonas reinhardtii cells of strains CC503 (cw92 mt+) and CC3491 (cw15 mt-) were inoculated from plate cultures into 100 ml TAP medium in a 500 ml flask and grown for 4 days, then spun down and resuspended in fresh medium and grown for another 1.5 days under lights with shaking (100 rpm). Cells (400 ml total for each strain) were collected by centrifugation in 500 ml centrifuge bottles at 9000 rpm for 15 minutes.
  • the supernatant was discarded and the cells were resuspended gently in approximately 5 ml TAP medium, then the resuspended cells were added dropwise to liquid nitrogen with a 5 ml pipet to flash freeze the cells in small pellets. The centrifuge bottles were then rinsed with another 2 ml TAP medium which was then frozen in the same manner.
  • the frozen cell pellets were transferred into 50 ml polycarbonate tubes which had been prechilled in liquid nitrogen, each tube containing one 1/2 in stainless steel ball, then two 3/8 inch diameter stainless steel balls were added to each tube and on top of the cell pellets and the frozen drops were fragmented by shaking in a Spex GenoGrinder 6x for 1 min each at 1,500 rpm with re-cooling in liquid nitrogen in between shaking cycles.
  • the pellets were resuspended in 40 ml MPDB buffer (1 M 2-methyl-2,4-pentanediol, 10 mM PIPES KOH, 10 mM MgCl2, 10 mM sodium metabisulfite, 0.5% (w/v) sodium diethyldithiocarbamate, 0.2% (v/v) β-mercaptoethanol, 1% (v/v) Triton X-100, pH 7.0 with NaOH) in a 50 ml tube, then passed through a 40 ml dounce homogenizer for 15 strokes to break up remaining cell clumps.
  • the cells were respun and washed with 50 ml each of MPDB buffer and then spun again; the last spin was done at 3,000 rpm for 10 minutes.
  • Each pellet was resuspended in 40 ml of sonication buffer without detergent (the pellets did not resuspend completely; but there was no visible lysis of the nuclei) and the cells were respun at 3,000 rpm for 10 minutes.
  • each pellet then was resuspended in 2 ml sonication buffer without detergent by pipetting up and down with a 1 ml pipet tip, and 1 ml of each resuspension was transferred to a 15 ml tube containing 5 ml sonication buffer (10 mM potassium phosphate, pH 7.0, 0.1 mM NaCl, 10 mM EDTA; protease inhibitor cocktail without EDTA (Roche Cat # 04693159001) was added to the buffer just before use at the manufacturer's recommended concentration, and either N-lauryl sarcosine (NLS) or sodium deoxycholate (DOC) was added to the buffer just before use at 0.1%). The 6 ml total volume for each of 4 samples was sonicated with a Fisher Scientific Model 60 sonicator fitted with a 1/8 in tip point probe at full power (power setting 20) for 3x 30 seconds with chilling on ice in between sonications.
  • the immunoprecipitate was mixed with 100 µl BcMag Protein G Beads (BioClone Inc.) in 1x binding buffer (58 mM Na2HPO4, 42 mM NaH2PO4, pH 7.0) and left to bind for one hour at room temperature and then for an additional 12 hrs at 4 °C with moderate agitation.
  • the bound complex was placed on the magnetic separator and the supernatant removed (a sample of the supernatant was retained for analysis).
  • the beads were then washed with 10 volumes (1 ml) of wash buffer (57.7 mM Na2HPO4, 42.3 mM NaH2PO4, pH 7.0) by placing on a roller mixer for 10 min, then placing on the magnetic separator and removing the supernatant. This was repeated four times.
  • the washed bead slurry (100 µl) with the IgG:centromeric DNA complex was then subjected to DNA purification.
  • Method 1: To wash the samples bound to magnetic beads, each immunoprecipitated sample was resuspended in 0.5 ml 1x phosphate buffered saline, the samples were placed on a magnetic particle collector, the beads were collected, and the supernatant was removed and discarded. This was repeated three times for a total of four washes. Like samples were combined at the final resuspension step.
  • each sample was suspended in 150 µl of 10 mM Tris pH 8.0, 0.1 mM EDTA (TE) with 0.75% SDS and 100 µg/ml proteinase K.
  • the samples were incubated at 50 °C with mild agitation for four hours.
  • the samples were then briefly vortexed, then placed on a magnetic particle separator. Supernatants were removed and transferred to fresh tubes. 1/10 volume (15 µl) of 3.5 M sodium acetate was added to each sample.
  • Each sample was extracted 1x with phenol/chloroform 1:1, pH 8.0, and after centrifuging samples at 10,000 rpm for 10 minutes to separate the phases, the aqueous phases were transferred to fresh tubes. The samples were then extracted 1x with chloroform and centrifuged again for 10 minutes at 10,000 rpm to separate the phases. The aqueous phases were transferred to fresh tubes.
  • the supernatants were again carefully removed, then discarded.
  • the pellets were dried in a rotovap with no heat. Once the pellets were dry, they were suspended in 50 µl of TE.
  • the resuspended samples were quantified by Qubit (Invitrogen) and characterized for size on the Agilent Bioanalyzer 2100 microcapillary electrophoresis apparatus.
  • the DNA was fragmented to the optimal size range for 454 sequencing using a Covaris sonicator.
  • the sheared DNA was subjected to titanium 454 sequencing (Roche) essentially according to the manufacturer's protocols.
  • Method 2: After washing the bead-bound samples as in Method 1, above, like samples were combined, and each sample was suspended in 500 µl of CNET buffer (2% CTAB (cetyl trimethylammonium bromide), 1.4 M NaCl, 40 mM EDTA, 100 mM Tris 8.5, 140 mM beta-mercaptoethanol (added just before use)). The samples were suspended by mixing on a rotating wheel for 10 min at RT. Proteinase K was then added to 200 µg/ml and the samples were incubated for two hours at 50 °C with mild agitation.
  • the phases were mixed and separated as before.
  • the aqueous phases were transferred to fresh tubes.
  • the tube with the organic phase was set aside for back extraction.
  • the aqueous phases were then extracted a second time with chloroform.
  • the phases were mixed and separated as before.
  • the aqueous phases were transferred to fresh tubes.
  • the tube with the organic phase was set aside for back extraction.
  • the pellets were washed with 100% ethanol and spun again to collect the pellets. After removal of the supernatants, the pellets were dried in a vacuum concentrator with no heat. Once the pellets were dry, they were suspended in 50 µl of 10 mM Tris, 0.1 mM EDTA pH 8.0. The samples were quantified by Qubit (Invitrogen) and characterized for size on the Agilent Bioanalyzer 2100 microcapillary electrophoresis apparatus. The DNA was further fragmented to the optimal size range for 454 sequencing using a Covaris sonicator and the samples were sequenced using the Roche GS FLX Titanium series pyrosequencer.
  • the percentages of reads from extractions 1 and 2 that mapped to the reference genome were 84% and 76%, respectively.
  • a normalized coverage score was computed by counting the number of sequenced reads mapped to that position. For example, reads that mapped to a unique locus in the reference genome contributed a score of 1 to each position they covered, and reads that mapped to multiple loci in the genome contributed a score of 1 / no. of loci (1 divided by the number of loci) to each position they covered.
  • Coverage peaks were defined as loci with a normalized coverage score of 25 or greater. The peaks were then extended in both directions as long as the normalized coverage score was 5 or greater to define the start and end loci of each peak.
  • the peak coverage was defined as the maximal normalized coverage score of any locus between the start and end loci of each peak.
  • the average coverage was defined as the average normalized coverage score of all loci between the start and end positions of each peak.
  • the length was defined as the distance in bp between the start and end loci of each peak.
  • the coverage area was defined as the sum of normalized coverage scores of all loci between the start and end positions of each peak.
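
The coverage and peak definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the patent: the function names `normalized_coverage` and `call_peaks`, the single-sequence coordinate system, the half-open `(start, end)` read intervals, and the reading of "length" as `end - start` are all hypothetical choices made for the example. Only the weighting rule (1 divided by the number of loci) and the thresholds of 25 and 5 come from the text.

```python
def normalized_coverage(reads, genome_length):
    """Per-position normalized coverage for one reference sequence.

    reads: list of reads; each read is a list of (start, end) half-open
    intervals, one per locus the read maps to (assumed input layout).
    A read mapped to n loci adds 1/n to every position it covers.
    """
    cov = [0.0] * genome_length
    for loci in reads:
        if not loci:
            continue  # unmapped read contributes nothing
        weight = 1.0 / len(loci)
        for start, end in loci:
            for pos in range(start, end):
                cov[pos] += weight
    return cov


def call_peaks(cov, seed=25.0, extend=5.0):
    """Find peaks seeded at score >= seed and extended while score >= extend."""
    peaks = []
    n = len(cov)
    i = 0
    while i < n:
        if cov[i] < seed:
            i += 1
            continue
        # Extend in both directions from the seed position.
        start = i
        while start > 0 and cov[start - 1] >= extend:
            start -= 1
        end = i
        while end + 1 < n and cov[end + 1] >= extend:
            end += 1
        scores = cov[start:end + 1]
        peaks.append({
            "start": start,
            "end": end,
            "length": end - start,  # assumption: distance between start and end loci
            "peak": max(scores),
            "average": sum(scores) / len(scores),
            "area": sum(scores),
        })
        i = end + 1  # continue scanning after this peak
    return peaks
```

As a usage example, `call_peaks(normalized_coverage(reads, genome_length))` returns one dictionary per peak holding the start and end loci together with the peak coverage, average coverage, length, and coverage area described above. A real analysis would more likely derive the per-read locus counts from an aligner's output than from hand-built interval lists.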

Abstract

The invention relates to methods for identifying centromeres and to centromeres identified by such methods. Centromeres of organisms such as algae, fungi, and protists can be used, for example, to construct artificial chromosomes and cells containing such artificial chromosomes.
PCT/US2009/041998 2008-04-28 2009-04-28 Identification of centromere sequences and uses therefor WO2009134814A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US4850608P 2008-04-28 2008-04-28
US61/048,506 2008-04-28

Publications (2)

Publication Number Publication Date
WO2009134814A2 (fr) 2009-11-05
WO2009134814A3 (fr) 2010-01-28

Family

ID=41255738

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/041998 WO2009134814A2 (fr) 2008-04-28 2009-04-28 Identification of centromere sequences and uses therefor

Country Status (2)

Country Link
US (1) US9758810B2 (fr)
WO (1) WO2009134814A2 (fr)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011109031A1 (fr) * 2010-03-05 2011-09-09 Synthetic Genomics, Inc. Methods for cloning and manipulating genomes
WO2012061481A3 (fr) * 2010-11-05 2012-06-28 Chromatin, Inc. Identification of centromere sequences using centromere associated proteins and uses thereof
GB2482209B (en) * 2010-12-10 2013-01-02 Porvair Filtration Group Ltd Chromatin immunoprecipitation assay
US9267132B2 (en) 2007-10-08 2016-02-23 Synthetic Genomics, Inc. Methods for cloning and manipulating genomes
US9273310B2 (en) 2007-10-08 2016-03-01 Synthetic Genomics, Inc. Methods for cloning and manipulating genomes
US9523681B2 (en) 2010-12-10 2016-12-20 Porvair Filtration Group Limited Method of performing a chromatin immunoprecipitation assay
US10818378B2 (en) 2009-10-30 2020-10-27 Codex Dna, Inc. Encoding text into nucleic acid sequences

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130288364A1 (en) * 2010-10-28 2013-10-31 Whitehead Institute For Biomedical Research Engineered kinetochores and uses thereof
US20140357498A1 (en) * 2011-05-27 2014-12-04 The Children's Hospital Of Philadelphia Compositions and Methods for the Detection of DNA Cleavage Complexes
WO2016134311A1 (fr) * 2015-02-19 2016-08-25 Synthetic Genomics, Inc. Highly efficient method for transforming algae
CN111154723A (zh) * 2020-01-15 2020-05-15 宁夏医科大学 Research method for siRNA-targeted inhibition of Zwilch and its application

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040081982A1 (en) * 2000-12-21 2004-04-29 Choo Kong-Hong Andy Neocentromere-based mini-chromosomes or artificial chromosomes
US20050221455A1 (en) * 2003-10-21 2005-10-06 Cargill, Inc. Production of monatin and monatin precursors
US20050268359A1 (en) * 1997-06-03 2005-12-01 University Of Chicago Plant centromere compositions
US20070178451A1 (en) * 2001-08-02 2007-08-02 Molian Deng Nucleic acid sequences from Chlorella sarokiniana and uses thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1489098A4 (fr) * 2002-01-31 2006-01-25 Japan Science & Tech Agency Production of a hybridoma producing a monoclonal antibody directed against peptides of the human CENP-A protein, and corresponding method of use
US20080194414A1 (en) * 2006-04-24 2008-08-14 Albert Thomas J Enrichment and sequence analysis of genomic regions

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
BARSKI ET AL.: 'High-Resolution Profiling of Histone Methylations in the Human Genome.' CELL vol. 129, no. 4, 18 May 2007, pages 823 - 837 *
'Chlorophyceae' WIKIPEDIA, [Online] 14 February 2008, Retrieved from the Internet: <URL:http://web.archive.org/web/20080307203024/http://en.wikipedia.org/wiki/Chlorophyceae> [retrieved on 2009-11-19] *
LEE ET AL.: 'Chromatin immunoprecipitation cloning reveals rapid evolutionary patterns of centromeric DNA in Oryza species.' PNAS vol. 102, no. 33, 16 August 2005, pages 11793 - 11798 *
MARDIS: 'The impact of next-generation sequencing technology on genetics' TRENDS IN GENETICS vol. 24, no. 3, 2007, pages 133 - 141 *
NAGAKI ET AL.: 'Sequencing of rice centromere uncovers active genes.' NATURE GENETICS vol. 36, no. 2, February 2004, pages 138 - 145 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9267132B2 (en) 2007-10-08 2016-02-23 Synthetic Genomics, Inc. Methods for cloning and manipulating genomes
US9273310B2 (en) 2007-10-08 2016-03-01 Synthetic Genomics, Inc. Methods for cloning and manipulating genomes
US10975378B2 (en) 2007-10-08 2021-04-13 Codex Dna, Inc. Methods for cloning and manipulating genomes
US10818378B2 (en) 2009-10-30 2020-10-27 Codex Dna, Inc. Encoding text into nucleic acid sequences
WO2011109031A1 (fr) * 2010-03-05 2011-09-09 Synthetic Genomics, Inc. Methods for cloning and manipulating genomes
CN102939380A (zh) * 2010-03-05 2013-02-20 合成基因组股份有限公司 Methods for cloning and manipulating genomes
WO2012061481A3 (fr) * 2010-11-05 2012-06-28 Chromatin, Inc. Identification of centromere sequences using centromere associated proteins and uses thereof
GB2482209B (en) * 2010-12-10 2013-01-02 Porvair Filtration Group Ltd Chromatin immunoprecipitation assay
US9523681B2 (en) 2010-12-10 2016-12-20 Porvair Filtration Group Limited Method of performing a chromatin immunoprecipitation assay
US9950280B2 (en) 2010-12-10 2018-04-24 Porvair Filtration Group Limited Methods and devices for chromatin immunoprecipitation assays

Also Published As

Publication number Publication date
WO2009134814A3 (fr) 2010-01-28
US9758810B2 (en) 2017-09-12
US20100041035A1 (en) 2010-02-18

Similar Documents

Publication Publication Date Title
US9758810B2 (en) Identification of centromere sequences and uses therefor
EP3055423B1 (fr) Methods for detecting nucleic acid sequences of interest using a TALEN-type protein
CN108473981B (zh) Engineered nucleic-acid-targeting nucleic acids
JP4886784B2 (ja) Compositions and methods for the detection of Candida species
Bendich Structural analysis of mitochondrial DNA molecules from fungi and plants using moving pictures and pulsed-field gel electrophoresis
US5426027A (en) Nucleic acid probes and methods for detecting Candida DNA cells in blood
Maurya et al. Evaluation of salt-out method for the isolation of DNA from whole blood: a pathological approach of DNA based diagnosis
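JPH04502107A (ja) Method for enrichment and cloning of DNA having an insertion or corresponding to a deletion
ES2336657T3 (es) Nucleic acids for the identification of fungi and methods for using them.
CN107119048B (zh) Pseudocercospora mori rDNA and its application in molecular detection of Pseudocercospora mori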
Lee et al. Genetic dissection of the co‐expression of genes encoding the two isoforms of oleosins in the oil bodies of maize kernel
US20120115132A1 (en) Identification of centromere sequences using centromere associated proteins and uses thereof
Owari et al. Subcellular localization of minicircle DNA in the dinoflagellate Amphidinium massartii
US20100055703A1 (en) Organism-Specific Hybridizable Nucleic Acid Molecule
Formanová et al. High-resolution mapping of the Brassica napus Rfp restorer locus using Arabidopsis-derived molecular markers
Wang et al. The plastid-encoded psbA gene in the dinoflagellate Gonyaulax is not encoded on a minicircle
Fourcade-Peronnet et al. A nuclear single-stranded-DNA binding factor interacts with the long terminal repeats of the 1731 Drosophila retrotransposon
US7659067B2 (en) Method for identification of medically relevant fungi
KR100980809B1 (ko) Junction sequences and primer pairs for the detection of porcine endogenous retrovirus
Urban Gene Expression in Dwarf Mistletoe Related to Explosive Seed Dispersal with special Attention to Aquaporins
Malenica et al. Grapevine variety determination from herbarium and archeological specimens
KR20130100090A (ko) Polymerase chain reaction primers for detecting Phytophthora strains occurring in fruit trees or seedlings, and detection kit and method using the same
WO2022212610A1 (fr) Rapid and automated method for the characterization of non-specific pathogens in a sample
尾張智美 Intracellular Localization and Sequence Variation of the Dinoflagellate Minicircle DNA
CN117904355A (zh) 一种基于RPA-CRISPR/Cas12a的植物茎基腐病病原菌的快速检测方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09739616

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09739616

Country of ref document: EP

Kind code of ref document: A2