US20200392574A1 - DNA array - Google Patents

Info

Publication number
US20200392574A1
Authority
US
United States
Prior art keywords
dna
array
molecules
individual
probes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US16/994,343
Inventor
Radoje Drmanac
Matthew J. Callow
Snezana Drmanac
Brian K. Hauser
George Yeung
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Complete Genomics Inc
Original Assignee
Complete Genomics, Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
First worldwide family litigation filed: https://patents.darts-ip.com/?family=37571035&utm_source=google_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=US20200392574(A1) ("Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.)
Application filed by Complete Genomics, Inc
Priority to US16/994,343
Publication of US20200392574A1
Priority to US17/522,708 (US20220162694A1)
Current legal status: Abandoned

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869Methods for sequencing
    • C12Q1/6874Methods for sequencing involving nucleic acid arrays, e.g. sequencing by hybridisation
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07HSUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H21/00Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids
    • C07H21/04Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids with deoxyribosyl as saccharide radical
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K1/00General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
    • C07K1/04General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length on carriers
    • C07K1/047Simultaneous synthesis of different peptide species; Peptide libraries
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813Hybridisation assays
    • C12Q1/6816Hybridisation assays characterised by the detection means
    • C12Q1/682Signal amplification
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813Hybridisation assays
    • C12Q1/6834Enzymatic or biochemical coupling of nucleic acids to a solid phase
    • C12Q1/6837Enzymatic or biochemical coupling of nucleic acids to a solid phase using probe arrays or probe chips
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869Methods for sequencing
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00Investigating characteristics of particles; Investigating permeability, pore-volume, or surface-area of porous materials
    • G01N15/10Investigating individual particles
    • G01N15/14Electro-optical investigation, e.g. flow cytometers
    • G01N15/1404Fluid conditioning in flow cytometers, e.g. flow cells; Supply; Control of flow
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00Investigating characteristics of particles; Investigating permeability, pore-volume, or surface-area of porous materials
    • G01N15/10Investigating individual particles
    • G01N15/14Electro-optical investigation, e.g. flow cytometers
    • G01N15/1434Electro-optical investigation, e.g. flow cytometers using an analyser being characterised by its optical arrangement
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2525/00Reactions involving modified oligonucleotides, nucleic acids, or nucleotides
    • C12Q2525/10Modifications characterised by
    • C12Q2525/151Modifications characterised by repeat or repeated sequences, e.g. VNTR, microsatellite, concatemer
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2525/00Reactions involving modified oligonucleotides, nucleic acids, or nucleotides
    • C12Q2525/30Oligonucleotides characterised by their secondary structure
    • C12Q2525/313Branched oligonucleotides
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2531/00Reactions of nucleic acids characterised by
    • C12Q2531/10Reactions of nucleic acids characterised by the purpose being amplify/increase the copy number of target nucleic acid
    • C12Q2531/125Rolling circle
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2565/00Nucleic acid analysis characterised by mode or means of detection
    • C12Q2565/50Detection characterised by immobilisation to a surface
    • C12Q2565/513Detection characterised by immobilisation to a surface characterised by the pattern of the arrayed oligonucleotides
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S977/00Nanotechnology
    • Y10S977/70Nanostructure
    • Y10S977/778Nanostructure within specified host or matrix material, e.g. nanocomposite films
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S977/00Nanotechnology
    • Y10S977/70Nanostructure
    • Y10S977/788Of specified organic or carbon-based composition
    • Y10S977/789Of specified organic or carbon-based composition in array format
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S977/00Nanotechnology
    • Y10S977/70Nanostructure
    • Y10S977/788Of specified organic or carbon-based composition
    • Y10S977/789Of specified organic or carbon-based composition in array format
    • Y10S977/79Of specified organic or carbon-based composition in array format with heterogeneous nanostructures
    • Y10S977/791Molecular array
    • Y10S977/792Nucleic acid array, e.g. human genome array
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S977/00Nanotechnology
    • Y10S977/84Manufacture, treatment, or detection of nanostructure
    • Y10S977/88Manufacture, treatment, or detection of nanostructure with arrangement, process, or apparatus for testing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S977/00Nanotechnology
    • Y10S977/84Manufacture, treatment, or detection of nanostructure
    • Y10S977/882Assembling of separate components, e.g. by attaching

Definitions

  • Such arrays play a key role in most currently available, or emerging, large-scale genetic analysis and proteomic techniques, including those for single nucleotide polymorphism detection, copy number assessment, nucleic acid sequencing, and the like, e.g. Kennedy et al (2003), Nature Biotechnology, 21: 1233-1237; Gunderson et al (2005), Nature Genetics, 37: 549-554; Pinkel and Albertson (2005), Nature Genetics Supplement, 37: 511-517; Leamon et al (2003), Electrophoresis, 24: 3769-3777; Shendure et al (2005), Science, 309: 1728-1732; Cowie et al (2004), Human Mutation, 24: 261-271; and the like.
  • compositions of the invention in one form include random arrays of a plurality of different single molecules disposed on a surface, where the single molecules each comprise a macromolecular structure and at least one analyte, such that each macromolecular structure comprises a plurality of attachment functionalities that are capable of forming bonds with one or more functionalities on the surface.
  • the analyte is a component of the macromolecular structure, and in another aspect, the analyte is attached to the macromolecular structure by a linkage between a unique functionality on such structure and a reactive group or attachment moiety on the analyte.
  • compositions of the invention include random arrays of single molecules disposed on a surface, where the single molecules each comprise a concatemer of at least one target polynucleotide and each is attached to the surface by linkages formed between one or more functionalities on the surface and complementary functionalities on the concatemer.
  • compositions of the invention include random arrays of single molecules disposed on a surface, where the single molecules each comprise a concatemer of at least one target polynucleotide and at least one adaptor oligonucleotide and each is attached to such surface by the formation of duplexes between capture oligonucleotides on the surface and the attachment oligonucleotides in the concatemer.
  • substantially equivalent means a substantially circular region having a diameter that is one half or less than the total length of the polymer; or in another embodiment one tenth or less; or in another embodiment, one hundredth or less.
  • the invention includes an array of polynucleotide molecules comprising: (a) a support having a surface; and (b) a plurality of polynucleotide molecules attached to the surface, wherein each polynucleotide molecule has a random coil state and comprises a concatemer of multiple copies of a target sequence such that the polynucleotide molecule is attached to the surface within a region substantially equivalent to a projection of the random coil on the surface and randomly disposed at a density such that at least thirty percent of the polynucleotide molecules have a nearest neighbor distance of at least fifty nm.
  • the invention provides an array of single molecules comprising: (a) a support having a planar surface having a regular array of discrete spaced apart regions, wherein each discrete spaced apart region has an area of less than 1 μm² and contains reactive functionalities attached thereto; and (b) a plurality of single molecules attached to the surface, wherein each single molecule comprises a macromolecular structure and at least one analyte having an attachment moiety, such that each macromolecular structure comprises a unique functionality and a plurality of attachment functionalities that are capable of forming linkages with the reactive functionalities of the discrete spaced apart regions, and such that the analyte is attached to the macromolecular structure by a linkage between the unique functionality and the attachment moiety of the analyte, wherein the plurality of single molecules are randomly disposed on the discrete spaced apart regions such that at least a majority of the discrete spaced apart regions contain only one single molecule.
  • the invention provides an array of polynucleotide molecules comprising: (a) a support having a surface with capture oligonucleotides attached thereto; and (b) a plurality of polynucleotide molecules attached to the surface, wherein each polynucleotide molecule comprises a concatemer of multiple copies of a target sequence and an adaptor oligonucleotide such that the polynucleotide molecule is attached to the surface by one or more complexes formed between capture oligonucleotides and adaptor oligonucleotides, the polynucleotide molecules being randomly disposed on the surface at a density such that at least a majority of the polynucleotide molecules have a nearest neighbor distance of at least fifty nm.
  • the surface is a planar surface having an array of discrete spaced apart regions, wherein each discrete spaced apart region has a size equivalent to that of the polynucleotide molecule and contains the capture oligonucleotides attached thereto and wherein substantially all such regions have at most one of the polynucleotide molecules attached.
  • the invention further includes a method of making an array of polynucleotide molecules comprising the following steps: (a) generating a plurality of polynucleotide molecules each comprising a concatemer of a DNA fragment from a source DNA and an adaptor oligonucleotide; and (b) disposing the plurality of polynucleotide molecules onto a support having a surface with capture oligonucleotides attached thereto so that the polynucleotide molecules are fixed to the surface by one or more complexes formed between capture oligonucleotides and adaptor oligonucleotides and so that the polynucleotide molecules are randomly distributed on the surface at a density such that a majority of the polynucleotide molecules have a nearest neighbor distance of at least fifty nm, thereby forming the array of polynucleotide molecules.
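As a rough illustration of the density criterion in the array and method statements above, the following sketch places points uniformly at random on a square patch and counts the fraction whose nearest neighbor is at least fifty nm away. The patch size, the molecule count, the wrap-around boundary, and the treatment of each molecule as a point are assumptions made for this example, not values taken from the specification.

```python
import math
import random


def fraction_with_min_spacing(n_molecules, side_nm, min_distance_nm, seed=0):
    """Estimate the fraction of randomly placed points whose nearest
    neighbor lies at least min_distance_nm away (center to center)."""
    rng = random.Random(seed)
    points = [(rng.uniform(0, side_nm), rng.uniform(0, side_nm))
              for _ in range(n_molecules)]
    isolated = 0
    for i, (xi, yi) in enumerate(points):
        nearest = math.inf
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            # Wrap distances around the patch edges to avoid edge effects.
            dx = abs(xi - xj)
            dy = abs(yi - yj)
            dx = min(dx, side_nm - dx)
            dy = min(dy, side_nm - dy)
            nearest = min(nearest, math.hypot(dx, dy))
        if nearest >= min_distance_nm:
            isolated += 1
    return isolated / n_molecules


# Hypothetical loading: 900 molecules on a 3 um x 3 um patch
# (100 molecules per square micrometer), fifty nm minimum spacing.
print(fraction_with_min_spacing(900, 3000.0, 50.0))
```

Lowering the number of molecules per unit area raises the returned fraction, which is one way to explore what loading might satisfy a "majority" criterion under these simplified assumptions.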
  • the invention provides a method of determining a nucleotide sequence of a target polynucleotide, the method comprising the steps of: (a) generating a plurality of target concatemers from the target polynucleotide, each target concatemer comprising multiple copies of a fragment of the target polynucleotide and the plurality of target concatemers including a number of fragments that substantially covers the target polynucleotide; (b) forming a random array of target concatemers fixed to a surface at a density such that at least a majority of the target concatemers are optically resolvable; (c) identifying a sequence of at least a portion of each fragment in each target concatemer; and (d) reconstructing the nucleotide sequence of the target polynucleotide from the identities of the sequences of the portions of fragments of the concatemers.
  • the step of identifying includes the steps of (a) hybridizing one or more probes from a first set of probes to the random array under conditions that permit the formation of perfectly matched duplexes between the one or more probes and complementary sequences on target concatemers; (b) hybridizing one or more probes from a second set of probes to the random array under conditions that permit the formation of perfectly matched duplexes between the one or more probes and complementary sequences on target concatemers; (c) ligating probes from the first and second sets hybridized to a target concatemer at contiguous sites; (d) identifying the sequences of the ligated first and second probes; and (e) repeating steps (a) through (d) until the sequence of the target polynucleotide can be determined from the identities of the sequences of the ligated probes.
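The hybridize-and-ligate logic of steps (a) through (d) can be sketched in a few lines. This is a simplified model, assuming perfect-match hybridization only and ignoring hybridization kinetics and ligase preferences; the function names, probe sequences, and target sequence are hypothetical and chosen only for illustration.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def ligatable_pairs(target, first_set, second_set):
    """List (probe1, probe2, position) for probe pairs that both form
    perfectly matched duplexes with the target at contiguous sites."""
    hits = []
    for p1, p2 in product(first_set, second_set):
        # A ligated product p1+p2 (5' to 3') pairs with its reverse
        # complement on the target strand.
        site = reverse_complement(p1 + p2)
        start = target.find(site)
        while start != -1:
            hits.append((p1, p2, start))
            start = target.find(site, start + 1)
    return hits


target = "ACGTTGCAAGGC"            # hypothetical fragment sequence
first_set = ["CTTG", "AAAA"]       # hypothetical first probe set
second_set = ["CAAC", "TTTT"]      # hypothetical second probe set
print(ligatable_pairs(target, first_set, second_set))
```

Each reported pair reveals the target sequence across both probe sites, which is the information that repeated cycles would accumulate.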
  • the invention includes kits for making random arrays of the invention and for implementing applications of the random arrays of the invention, particularly high-throughput analysis of one or more target polynucleotides.
  • the methods of the invention provide flexibility in making and using an array of structured random arrays for more efficient haplotype and splice variant determination, analysis of multiple samples in parallel, staggered sequencing reactions to eliminate the idle time of CCD detectors, and parallel probing cycles to shorten the sequencing completion time of longer DNA fragments.
  • the present invention provides a significant advance in the microarray field by providing arrays of single molecules comprising linear and/or branched polymer structures that may incorporate or have attached target analyte molecules.
  • such single molecules are concatemers of target polynucleotides arrayed at densities that permit efficient high resolution analysis of mammalian-sized genomes, including sequence determination of all or substantial parts of such genomes, sequence determination of tagged fragments from selected regions of multiple genomes, digital readouts of gene expression, and genome-wide assessments of copy number patterns, methylation patterns, chromosomal stability, individual genetic variation, and the like.
  • FIG. 1A, FIG. 1B, FIG. 1C, FIG. 1D, FIG. 1E, FIG. 1F, FIG. 1G, FIG. 1H and FIG. 1I illustrate various embodiments of the methods and compositions of the invention.
  • FIGS. 2A-2B illustrate methods of circularizing genomic DNA fragments for generating concatemers of polynucleotide analytes.
  • FIG. 3 is an image of a glass surface containing a disposition of concatemers of E. coli fragments.
  • FIG. 4 is an image of concatemers derived from two different organisms that are selectively labeled using oligonucleotide probes.
  • FIG. 5 is an image of concatemers of DNA fragments that contain a degenerate base, each of which is identified by a specific ligation probe.
  • FIG. 6 is an image of concatemers of DNA fragments that contain a segment of degenerate bases, pairs of which are identified by specific probes.
  • FIG. 7 is a scheme for identifying sequence differences between reference sequences and test sequences using enzymatic mismatch detection and for constructing DNA circles therefrom.
  • FIG. 8 is another scheme for identifying sequence differences between a reference sequence and a test sequence using enzymatic mismatch detection and for constructing DNA circles therefrom.
  • FIG. 9 shows general elements of the universal nano-ball probe template single stranded DNA circle.
  • FIG. 10 shows three images overlaid with slight shifts using the MetaMorph software.
  • the blue colored image corresponds to the result of hybridization of the BrPrb3 (the adaptor probe) to the array.
  • the red colored image, shifted slightly above the blue image, corresponds to the result of hybridization of the Ba3 probe to the array.
  • the green colored image, shifted slightly below the blue image, corresponds to the result of hybridization of the Yp3 probe to the array.
  • the circle denoted with ‘A’ indicates the position of one of the spots that co-hybridize with both the adaptor probe and the Ba3 probe.
  • the circle denoted with ‘B’ indicates the position of one of the spots that co-hybridize with both the adaptor probe and the Yp3 probe.
  • these arrays are produced by attaching DNA nano-balls without any size selection to a glass surface covered with a carpet of capture oligonucleotides.
  • FIG. 11 shows five images overlaid with slight shifts using the MetaMorph software.
  • the blue colored image corresponds to the result of hybridization of the BrPrb3 (the adaptor probe) to the array.
  • the red image corresponds to hybridization with the A-specific ligation probe pair (T1Aa9 and T1Ab9).
  • the green image corresponds to hybridization with the C-specific ligation probe pair (T1Aa10 and T1Ab9).
  • the yellow image corresponds to hybridization with the G-specific ligation probe pair (T1Aa11 and T1Ab9).
  • the cyan image corresponds to hybridization with the T-specific ligation probe pair (T1Aa12 and T1Ab9).
  • the circle denoted with ‘A’ indicates the position of one of the spots that co-hybridize with both the adaptor probe and the A-specific ligation probe pair, and similarly for the circles denoted with ‘C’, ‘G’ and ‘T’.
  • these arrays are produced by attaching DNA nano-balls without any size selection to a glass surface covered with a carpet of capture oligonucleotides.
  • nano-printing or surface patterning by photochemistry technologies to produce a glass substrate containing a grid of DNA nano-ball binding sites where each site is about 0.25-0.50 micrometer in size and surrounded by 0.75 micron or 0.50 micron of surface that does not bind DNA. Only one DNA nano-ball will be able to attach to such a binding site. This will produce a regular grid of individual submicron DNA spots of similar size.
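The grid geometry above implies a site density that is easy to estimate. The sketch below assumes a square grid and reads the non-binding width as the gap between the edges of adjacent sites, so that the pitch is the site size plus the gap. That reading and the function name are assumptions for illustration, not statements from the specification.

```python
def sites_per_mm2(site_um, gap_um):
    """Binding sites per square millimeter for a square grid whose pitch
    is the site size plus the non-binding gap (both in micrometers)."""
    pitch_um = site_um + gap_um
    sites_per_mm = 1000.0 / pitch_um
    return sites_per_mm ** 2


# The two size/gap pairings mentioned in the text.
for site_um, gap_um in ((0.25, 0.75), (0.50, 0.50)):
    print(site_um, gap_um, sites_per_mm2(site_um, gap_um))
```

Under this reading both pairings give a 1 micrometer pitch, or one million binding sites per square millimeter.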
  • FIG. 13 shows an image of randomly distributed concatemers hybridized to capture oligonucleotides. Sequences were detected with a TAMRA-labeled probe to adapter sequences.
  • FIG. 14 shows a circle formation schema: Panel A. Ligation of an adapter to 5′ end of genomic fragment via universal template. Panel B. Closing of the adapter-modified fragment having 3′-polyA tail using a bridging template. Gel tests: Panel C. Preservation of DNA circles (top band) with Exonuclease V digestion. Panel D. In the presence of Phi29 DNA polymerase high molecular weight DNA molecules are observed, indicating the success of the rolling circle amplification.
  • FIG. 15 shows that PCR amplification with tailed primers ( 1 ) is followed by strand removal or strand separation and the addition of a bridging oligonucleotide ( 2 ). Circle formation proceeds utilizing the bridge and DNA ligase ( 3 ).
  • FIG. 21 shows Method 11 for production, capture and amplification of DNA mismatches.
  • the practice of the present invention may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and immunology, which are within the skill of the art.
  • Such conventional techniques include polymer array synthesis, hybridization, ligation, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the example herein below. However, other equivalent conventional procedures can, of course, also be used.
  • Such conventional techniques and descriptions can be found in standard laboratory manuals such as Genome Analysis: A Laboratory Manual Series ( Vols.
  • DNA/RNA and their derivatives or peptides or protein and other array products including processes for their preparation and uses, that are based on applying mixtures of detecting molecules of partially or fully known primary structure or polymer sequence, preferably as concatemers of the same molecule, on substrates with a pattern of high density small binding sites separated by non-binding surface, followed by determining which detecting molecule from the mixture is attached at which binding site.
  • macromolecular structures may be selected to satisfy a variety of design objectives in particular embodiments. For example, in some embodiments, it may be advantageous to maintain an analyte molecule as far from the surface as possible, e.g. by providing an inflexible molecular spacer as part of a unique linkage. As another example, reactive functionalities may be selected as having a size that effectively prevents attachment of multiple macromolecular structures to one discrete spaced apart region. As still another example, macromolecular structures may be provided with other functionalities for a variety of other purposes, e.g. enhancing solubility, promoting formation of secondary structures via hydrogen bonding, and the like.
  • macromolecular structures of the invention are single stranded polynucleotides comprising concatemers of a target sequence or fragment.
  • polynucleotides may be concatemers of a target sequence and an adaptor oligonucleotide.
  • Treatment ( 1001 ) usually entails fragmentation by a conventional technique, such as chemical fragmentation, enzymatic fragmentation, or mechanical fragmentation, followed by denaturation to produce single stranded DNA fragments.
  • Adaptor oligonucleotides ( 1004 ), in this example, are used to form ( 1008 ) a population ( 1010 ) of DNA circles by the method illustrated in FIG. 2A.
  • each member of population ( 1010 ) has an adaptor with an identical primer binding site and a DNA fragment from source nucleic acid ( 1000 ).
  • the adapter also may have other functional elements including, but not limited to, tagging sequences, attachment sequences, palindromic sequences, restriction sites, functionalization sequences, and the like.
  • classes of DNA circles may be created by providing adaptors having different primer binding sites.
  • a primer and rolling circle replication (RCR) reagents may be added to generate ( 1011 ) in a conventional RCR reaction a population ( 1012 ) of concatemers ( 1015 ) of the complements of the adaptor oligonucleotide and DNA fragments, which population can then be isolated using conventional separation techniques.
  • RCR may be implemented by successive ligation of short oligonucleotides, e.g. 6-mers, from a mixture containing all possible sequences, or if circles are synthetic, a limited mixture of oligonucleotides having selected sequences for circle replication.
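The concatemer generation described above can be pictured with a small string model. This is only a sketch: it treats the RCR product as tandem copies of the complement of a circle made of one adaptor and one fragment, and it ignores where on the circle the primer binds. The sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def rolling_circle_product(adaptor, fragment, copies):
    """Model the RCR product as tandem copies of the complement of a
    circle consisting of one adaptor and one DNA fragment."""
    circle = adaptor + fragment
    return reverse_complement(circle) * copies


adaptor = "GATTACA"     # hypothetical adaptor sequence
fragment = "CCGGTTAA"   # hypothetical genomic fragment
concatemer = rolling_circle_product(adaptor, fragment, copies=3)
print(concatemer)
# Each repeat carries one copy of the adaptor complement, which is what
# capture oligonucleotides or probes could hybridize to.
print(concatemer.count(reverse_complement(adaptor)))
```

Because every repeat contains the adaptor complement, a single concatemer offers multiple sites for attachment or probing, in line with the attachment schemes discussed in the surrounding text.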
  • surface ( 1018 ) may have attached capture oligonucleotides that form complexes, e.g. double stranded duplexes, with a segment of the adaptor oligonucleotide, such as the primer binding site or other elements.
  • capture oligonucleotides may comprise oligonucleotide clamps, or like structures, that form triplexes with adaptor oligonucleotides, e.g. Gryaznov et al, U.S. Pat. No. 5,473,060.
  • surface ( 1018 ) may have reactive functionalities that react with complementary functionalities on the concatemers to form a covalent linkage, e.g.
  • DNA molecules, e.g. several hundred nucleotides or larger, may also be efficiently attached to hydrophobic surfaces, such as a clean glass surface that has a low concentration of various reactive functionalities, such as -OH groups. Concatemers of DNA fragments may be further amplified in situ after disposition on a surface.
  • concatemer may be cleaved by reconstituting a restriction site in adaptor sequences by hybridization of an oligonucleotide, after which the fragments are circularized as described below and amplified in situ by a RCR reaction.
  • a random coil polymer such as single stranded DNA
  • a root mean square of the end-to-end distance is roughly a measure of the diameter of the randomly coiled structure.
  • Such diameter referred to herein as a “random coil diameter”
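As a rough illustration of the random coil diameter described above, the following sketch uses the ideal (freely jointed) chain model, in which the root mean square end-to-end distance equals the segment length times the square root of the number of segments. The choice of model, the function name, and the example segment length and nucleotides per segment are assumptions made for illustration; none of them is taken from this disclosure.

```python
import math

def random_coil_diameter_nm(n_nucleotides, segment_length_nm, nt_per_segment):
    """RMS end-to-end distance of an ideal freely jointed chain, in nm.

    All three parameters are caller-supplied assumptions. The ideal-chain
    model ignores excluded volume, salt effects, and surface interactions.
    """
    if n_nucleotides <= 0 or segment_length_nm <= 0 or nt_per_segment <= 0:
        raise ValueError("all arguments must be positive")
    n_segments = n_nucleotides / nt_per_segment
    return segment_length_nm * math.sqrt(n_segments)

# Hypothetical example: a 100,000-nt concatemer modeled as 1.5 nm segments
# of 2.5 nt each (illustrative values only).
print(random_coil_diameter_nm(100_000, 1.5, 2.5))
```

Because the estimate grows only with the square root of length, a fourfold longer concatemer roughly doubles the estimated diameter under this model.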
  • Additional size measures of macromolecular structures of the invention include molecular weight, e.g. in Daltons, and total polymer length, which in the case of a branched polymer is the sum of the lengths of all its branches.
  • single stranded polynucleotides fill a flattened spheroidal volume that on average is bounded by a region ( 1107 ) defined by dashed circles ( 1108 ) having a diameter ( 1110 ), which is approximately equivalent to the diameter of a concatemer in random coil configuration.
  • macromolecular structures e.g. concatemers, and the like, are attached to surface ( 1102 ) within a region that is substantially equivalent to a projection of its random coil state onto surface ( 1102 ), for example, as illustrated by dashed circles ( 1108 ).
  • a variety of distance metrics may be employed for measuring the closeness of single molecules on a surface, including center-to-center distance of regions ( 1107 ), edge-to-edge distance of regions ( 1107 ), and the like. Usually, center-to-center distances are employed herein.
  • the selection of these parameters in fabricating arrays of the invention depends in part on the signal generation and detection systems used in the analytical processes. Generally, densities of single molecules are selected that permit at least twenty percent, or at least thirty percent, or at least forty percent, or at least a majority of the molecules to be resolved individually by the signal generation and detection systems used. In one aspect, a density is selected that permits at least seventy percent of the single molecules to be individually resolved.
  • a density is selected such that at least a majority of single molecules have a nearest neighbor distance of 300 nm or greater; and in another aspect, such density is selected to ensure that at least seventy percent of single molecules have a nearest neighbor distance of 300 nm or greater, or 400 nm or greater, or 500 nm or greater, or 600 nm or greater, or 700 nm or greater, or 800 nm or greater.
  • a density is selected such that at least a majority of single molecules have a nearest neighbor distance of at least twice the minimal feature resolution power of the microscope.
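The density criteria above are stated as the fraction of single molecules whose nearest neighbor lies at or beyond a threshold distance. A minimal sketch of that check follows, assuming molecule centers are available as (x, y) coordinates in nanometers. The function name and the example coordinates are hypothetical.

```python
import math

def fraction_resolved(centers_nm, min_distance_nm):
    """Fraction of centers whose nearest neighbor is at least min_distance_nm away.

    Brute-force O(n^2) comparison; adequate for a sketch, slow for large arrays.
    """
    n = len(centers_nm)
    if n == 0:
        return 0.0
    if n == 1:
        return 1.0
    resolved = 0
    for i, (xi, yi) in enumerate(centers_nm):
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(centers_nm)
            if j != i
        )
        if nearest >= min_distance_nm:
            resolved += 1
    return resolved / n

# Hypothetical centers (nm): two molecules crowd each other, two are isolated.
centers = [(0, 0), (100, 0), (1000, 0), (1000, 1000)]
fraction = fraction_resolved(centers, 300)
print(fraction, "meets 70% criterion:", fraction >= 0.70)
```

For arrays with many molecules, a spatial index would replace the pairwise loop. The brute-force form is kept here for clarity.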
  • polymer molecules of the invention are disposed on a surface so that the density of separately detectable polymer molecules is at least 1000 per μm², or at least 10,000 per μm², or at least 100,000 per μm².
  • the likelihood of having only one single molecule per discrete spaced apart region may be increased by selecting a density of reactive functionalities or capture oligonucleotides that results in fewer such moieties than their respective complements on single molecules.
  • a single molecule will “occupy” all linkages to the surface at a particular discrete spaced apart region, thereby reducing the chance that a second single molecule will also bind to the same region.
  • substantially all the capture oligonucleotides in a discrete spaced apart region hybridize to adaptor oligonucleotides of a single macromolecular structure.
  • the lengths of capture oligonucleotides are in a range of from 6 to 30 nucleotides, and in another aspect, within a range of from 8 to 30 nucleotides, or from 10 to 24 nucleotides.
  • Lengths and sequences of capture oligonucleotides are selected (i) to provide effective binding of macromolecular structures to a surface, so that losses of macromolecular structures are minimized during steps of analytical operations, such as washing, etc., and (ii) to avoid interference with analytical operations on analyte molecules, particularly when analyte molecules are DNA fragments in a concatemer.
  • a discrete spaced apart region may contain more than one kind of capture oligonucleotide, and each different capture oligonucleotide may have a different length and sequence.
  • sequences of capture oligonucleotides are selected so that capture oligonucleotides at nearest neighbor regions have different sequences. In a rectilinear array, such configurations are achieved by rows of alternating sequence types.
  • a surface may have a plurality of subarrays of discrete spaced apart regions wherein each different subarray has capture oligonucleotides with distinct nucleotide sequences different from those of the other subarrays.
  • a plurality of subarrays may include 2 subarrays, or 4 or fewer subarrays, or 8 or fewer subarrays, or 16 or fewer subarrays, or 32 or fewer subarrays, or 64 or fewer subarrays.
  • a surface may include 5000 or fewer subarrays.
  • capture oligonucleotides are attached to the surface of an array by a spacer molecule, e.g. polyethylene glycol, or like inert chain, as is done with microarrays, in order to minimize undesired effects of surface groups or interactions with the capture oligonucleotides or other reagents.
  • center-to-center distances of nearest neighbors of regions ( 1122 ) are in the range of from 0.25 μm to 20 μm; and in another aspect, such distances are in the range of from 1 μm to 10 μm, or in the range from 50 to 1000 nm.
  • regions ( 1120 ) may be arranged on surface ( 1018 ) in virtually any pattern in which regions ( 1122 ) have defined locations, i.e. in any regular array, which makes signal collection and data analysis functions more efficient.
  • Such patterns include, but are not limited to, concentric circles of regions ( 1122 ), spiral patterns, rectilinear patterns, hexagonal patterns, and the like.
  • regions ( 1122 ) are arranged in a rectilinear or hexagonal pattern.
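The rectilinear and hexagonal patterns mentioned above can be sketched as lists of region centers generated from a chosen pitch. The function names, grid sizes, and the use of micrometers are assumptions for illustration only.

```python
import math

def rectilinear_centers(rows, cols, pitch_um):
    """Centers of a square grid; nearest neighbors are pitch_um apart."""
    return [(c * pitch_um, r * pitch_um) for r in range(rows) for c in range(cols)]

def hexagonal_centers(rows, cols, pitch_um):
    """Centers of a hexagonal (triangular-lattice) grid.

    Odd rows are shifted by half a pitch and rows are spaced
    pitch * sqrt(3) / 2 apart, so every nearest neighbor is pitch_um away.
    """
    row_step = pitch_um * math.sqrt(3) / 2
    return [
        (c * pitch_um + (pitch_um / 2 if r % 2 else 0.0), r * row_step)
        for r in range(rows)
        for c in range(cols)
    ]

def min_spacing(centers):
    """Smallest center-to-center distance in a layout (brute force)."""
    return min(
        math.dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:]
    )

# Hypothetical 4 x 4 layouts at 1 um pitch.
print(min_spacing(rectilinear_centers(4, 4, 1.0)))
print(min_spacing(hexagonal_centers(4, 4, 1.0)))
```

At equal pitch, a hexagonal layout holds about 15% more regions per unit area than a rectilinear one (a factor of 2/√3), which is one reason to prefer it when density matters.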
  • DNA circles prepared from source nucleic acid ( 1200 ) need not include an adaptor oligonucleotide.
  • source nucleic acid ( 1200 ) is fragmented and denatured ( 1202 ) to form a population of single strand fragments ( 1204 ), preferably in the size range of from about 50 to 600 nucleotides, and more preferably in the size range of from about 300 to 600 nucleotides, after which they are circularized in a non-template driven reaction with circularizing ligase, such as CircLigase (Epicentre Biotechnologies, Madison, Wis.), or the like.
  • concatemers are generated by providing a mixture of primers that bind to selected sequences.
  • the mixture of primers may be selected so that only a subset of the total number of DNA circles ( 1206 ) generate concatemers.
  • after concatemers are generated ( 1208 ), they are isolated and applied to surface ( 1210 ) to form a random array of the invention.
  • single molecules of the invention comprise an attachment portion and an analyte portion such that the attachment portion comprises a macromolecular structure that provides multivalent attachment of the single molecule to a surface.
  • macromolecular structures may be concatemers made by an RCR reaction in which the DNA circles in the reaction are synthetic.
  • An analyte portion of a single molecule is then attached by way of a unique functionality on the concatemer.
  • Synthetic DNA circles of virtually any sequence can be produced using well-known techniques, conveniently, in sizes up to several hundred nucleotides, e.g. 200, and with more difficulty, in sizes of many hundreds of nucleotides, e.g. up to 500, e.g. Kool, U.S.
  • Synthetic DNA circles ( 1300 ) that comprise primer binding sites ( 1301 ) are combined with primer ( 1302 ) in an RCR reaction ( 1306 ) to produce concatemers ( 1308 ).
  • all circles have the same sequence, although different sequences can be employed, for example, for directing subsets of concatemers to preselected regions of an array via complementary attachment moieties, such as adaptor sequences and capture oligonucleotides.
  • concatemers ( 1308 ) may be combined with analytes ( 1312 ) so that attachment moieties and unique functionalities can react to form a linkage, after which the resulting conjugate is applied to array ( 1310 ).
  • there is abundant guidance in the literature for selecting appropriate attachment moieties and unique functionalities for linking concatemers ( 1308 ) and many classes of analyte.
  • many homo- and heterobifunctional reagents are available commercially (e.g. Pierce) and are disclosed in references such as Hermanson, Bioconjugate Techniques (Academic Press, New York, 1996), which is incorporated by reference.
  • Suitable complementary functionalities on analytes include amino groups, sulfhydryl groups, carbonyl groups, which may occur naturally on analytes or may be added by reaction with a suitable homo- or heterobifunctional reagent.
  • Analyte molecules may also be attached to macromolecular structures by way of non-covalent linkages, such as biotin-streptavidin linkages, the formation of complexes, e.g. duplexes, between a first oligonucleotide attached to a concatemer and a complementary oligonucleotide attached to, or forming part of, an analyte, or like linkages.
  • Analytes include biomolecules, such as nucleic acids, for example, DNA or RNA fragments, polysaccharides, proteins, and the like.
  • macromolecular structures of the invention may comprise branched polymers as well as linear polymers, such as concatemers of DNA fragments.
  • Exemplary branched polymer structures are illustrated in FIGS. 1F and 1G .
  • a branched DNA structure is illustrated that comprises a backbone polynucleotide ( 1400 ) and multiple branch polynucleotides ( 1402 ) each connected to backbone polynucleotide ( 1400 ) by their 5′ ends to form a comb-like structure that has all 3′ ends, except for a single 5′ end ( 1404 ) on backbone polynucleotide ( 1400 ), which is derivatized to have a unique functionality.
  • such unique functionality may be a reactive chemical group, e.g. a protected or unprotected amine, sulfhydryl, or the like, or it may be an oligonucleotide having a unique sequence for capturing an analyte having an oligonucleotide with a complementary sequence thereto.
  • such unique functionality may be a capture moiety, such as biotin, or the like.
  • Such branched DNA structures are synthesized using known techniques, e.g. Gryaznov, U.S. Pat. No. 5,571,677; Urdea et al, U.S. Pat. No. 5,124,246; Seeman et al, U.S. Pat. No.
  • a dendrimer structure is illustrated that comprises oligonucleotide ( 1406 ), which is derivatized with multiple tri-valent linking groups ( 1408 ) that each have two functionalities ( 1410 , designated by “R”) by which additional polymers ( 1407 ), e.g. polynucleotides, can be attached to form a linkage to oligonucleotide ( 1406 ) thereby forming macromolecular structure ( 1409 ), which, in turn, if likewise derivatized with multivalent linkers, can form a nucleic acid dendrimer.
  • Trivalent linkers ( 1408 ) for use with oligonucleotides are disclosed in Iyer et al, U.S. Pat. No.
  • analyte is a polynucleotide ( 1440 ) with a free 3′ end, as shown in FIG. 1I , such end may be extended in an in situ RCR reaction to form either concatemers of target sequences or other sequences for further additions.
  • polynucleotide analytes may be extended by ligation using conventional techniques.
  • fragments are generated from at least 10 genome-equivalents of DNA; and in another aspect, fragments are generated from at least 30 genome-equivalents of DNA; and in another aspect, fragments are generated from at least 60 genome-equivalents of DNA.
  • Genomic DNA is obtained using conventional techniques, for example, as disclosed in Sambrook et al., supra, 1999; Current Protocols in Molecular Biology, Ausubel et al., eds. (John Wiley and Sons, Inc., NY, 1999), or the like.
  • Important factors for isolating genomic DNA include the following: 1) the DNA is free of DNA processing enzymes and contaminating salts; 2) the entire genome is equally represented; and 3) the DNA fragments are between about 5,000 and 100,000 bp in length. In many cases, no digestion of the extracted DNA is required because shear forces created during lysis and extraction will generate fragments in the desired range. In another embodiment, shorter fragments (1-5 kb) can be generated by enzymatic fragmentation using restriction endonucleases.
  • fragments may be derived from either an entire genome or a selected subset of a genome.
  • Many techniques are available for isolating or enriching fragments from a subset of a genome, as exemplified by the following references that are incorporated by reference: Kandpal et al (1990), Nucleic Acids Research, 18: 1789-1795; Callow et al, U.S. patent publication 2005/0019776; Zabeau et al, U.S. Pat. No. 6,045,994; Deugau et al, U.S. Pat. No. 5,508,169; Sibson, U.S. Pat. No.
  • an initial fragmentation of genomic DNA can be achieved by digestion with one or more “rare” cutting restriction endonucleases, such as Not I, Asc I, Bae I, CspC I, Pac I, Fse I, Sap I, Sfi I, Psr I, or the like.
  • the resulting fragments can be used directly, or for genomes that have been sequenced, specific fragments may be isolated from such digested DNA for subsequent processing as illustrated in FIG. 2B .
  • Genomic DNA ( 230 ) is digested ( 232 ) with a rare cutting restriction endonuclease to generate fragments ( 234 ), after which the fragments ( 234 ) are further digested for a short period (i.e.
  • primer extension from a genomic DNA template is used to generate a linear amplification of selected sequences greater than 10 kilobases surrounding genomic regions of interest. For example, to create a population of defined-sized targets, 20 cycles of linear amplification are performed with a forward primer followed by 20 cycles with a reverse primer. Before applying the second primer, the first primer is removed with a standard column for long DNA purification or degraded if a few uracil bases are incorporated. A greater number of reverse strands are generated relative to forward strands, resulting in a population of double stranded molecules and single stranded reverse strands.
  • the reverse primer may be biotinylated for capture to streptavidin beads, which can be heated to melt any double stranded homoduplexes and keep them from being captured. All attached molecules will be single stranded and represent one strand of the original genomic DNA.
  • the products produced can be fragmented to 0.2-2 kb in size, or more preferably, 0.3-0.6 kb in size (effectively releasing them from the solid support) and circularized for an RCR reaction.
  • genomic DNA 200
  • denatured 202
  • single stranded DNA fragments 204
  • a terminal transferase 206
  • attach poly dA tails 208
  • ligation 212
  • the free ends intra-molecularly with the aid of bridging oligonucleotide ( 210 ).
  • Duplex region ( 214 ) of bridging oligonucleotide ( 210 ) contains at least a primer binding site for RCR and, in some embodiments, sequences that provide complements to a capture oligonucleotide, which may be the same or different from the primer binding site sequence, or which may overlap the primer binding site sequence.
  • the lengths of capture oligonucleotides may vary widely. In one aspect, capture oligonucleotides and their complements in a bridging oligonucleotide have lengths in the range of from 10 to 100 nucleotides; and more preferably, in the range of from 10 to 40 nucleotides.
  • duplex region ( 214 ) may contain additional elements, such as an oligonucleotide tag, for example, for identifying the source nucleic acid from which its associated DNA fragment came.
  • circles or adaptor ligation or concatemers from different source nucleic acids may be prepared separately during which a bridging adaptor containing a unique tag is used, after which they are mixed for concatemer preparation or application to a surface to produce a random array.
  • the associated fragments may be identified on such a random array by hybridizing a labeled tag complement to its corresponding tag sequences in the concatemers, or by sequencing the entire adaptor or the tag region of the adaptor.
  • Circular products ( 218 ) may be conveniently isolated by a conventional purification column, digestion of non-circular DNA by one or more appropriate exonucleases, or both.
  • DNA fragments of the desired size range can also be circularized using circularizing enzymes, such as CircLigase, a single stranded DNA ligase that circularizes single stranded DNA without the need of a template.
  • CircLigase is used in accordance with the manufacturer's instructions (Epicentre, Madison, Wis.).
  • a preferred protocol for forming single stranded DNA circles comprising a DNA fragment and one or more adapters is to use a standard ligase such as T4 ligase for ligating an adapter to one end of a DNA fragment and then to use CircLigase to close the circle, as described more fully below.
  • An exemplary protocol for generating a DNA circle comprising an adaptor oligonucleotide and a target sequence using T4 ligase.
  • the target sequence is a synthetic oligo T1N (sequence: 5′-NNNNNNGCATANCACGANGTCATNATCGTNCAAACGTCAGTCCANGAATCNAGATCCACTTAGANTGNCGNNNNNN-3′) (SEQ ID NO: 1).
  • the adaptor is made up of 2 separate oligos.
  • the adaptor oligo that joins to the 5′ end of T1N is BR2-ad (sequence: 5′-TATCATCTGGATGTTAGGAAGACAAAAGGAAGCTGAGGACATTAACGGAC-3′) (SEQ ID NO: 2) and the adaptor oligo that joins to the 3′ end of T1N is UR3-ext (sequence: 5′-ACCTTCAGACCAGAT-3′) (SEQ ID NO: 3). UR3-ext contains a type IIs restriction enzyme site (Acu I: CTTCAG) to provide a way to linearize the DNA circle for insertion of a second adaptor.
  • BR2-ad is annealed to BR2-temp (sequence 5′-NNNNNNGTCCGTTAATGTCCTCAG-3′) (SEQ ID NO: 4) to form the double-stranded BR2 adaptor.
  • UR3-ext is annealed to biotinylated UR3-temp (sequence 5′-[BIOTIN]-ATCTGGTCTGAAGGTNNNNNNNNN-3′) (SEQ ID NO: 5) to form the double-stranded UR3 adaptor.
  • 1 pmol of target T1N is ligated to 25 pmol of BR2 adaptor and 10 pmol of UR3 adaptor in a single ligation reaction containing 50 mM Tris-Cl, pH 7.8, 10% PEG, 1 mM ATP, 50 mg/L BSA, 10 mM MgCl2, 0.3 unit/μl T4 DNA ligase (Epicentre Biotechnologies, WI) and 10 mM DTT, in a final volume of 10 μl.
  • the ligation reaction is incubated in a temperature cycling program of 15° C. for 11 min, 37° C. for 1 min repeated 18 times. The reaction is terminated by heating at 70° C. for 10 min.
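The quantities in the ligation protocol above can be checked with a few lines of arithmetic. The sketch below uses only values stated in the protocol; the variable names are illustrative, and treating "repeated 18 times" as 18 complete two-step cycles is an assumption.

```python
# Amounts stated in the protocol (pmol).
target_pmol = 1
br2_adaptor_pmol = 25
ur3_adaptor_pmol = 10

# Molar excess of each adaptor over the target.
br2_excess = br2_adaptor_pmol / target_pmol
ur3_excess = ur3_adaptor_pmol / target_pmol

# Temperature cycling: (15 C for 11 min + 37 C for 1 min), assumed to run
# as 18 complete cycles, followed by heat inactivation at 70 C for 10 min.
cycle_min = 11 + 1
cycling_min = cycle_min * 18
total_min = cycling_min + 10

# Total T4 DNA ligase in a 10 ul reaction at 0.3 unit/ul.
ligase_units = 0.3 * 10

print(f"BR2 adaptor excess: {br2_excess:.0f}x, UR3 adaptor excess: {ur3_excess:.0f}x")
print(f"cycling: {cycling_min} min, with inactivation: {total_min} min")
print(f"ligase: {ligase_units:.1f} units")
```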
  • Elution buffer (10 mM Tris HCl pH 7.5) is pre-warmed to 70° C., 10 μl of which is added to the beads at 70° C. for 5 min. After magnetic separation, the supernatant is retained as primary purified sample.
  • This sample is further purified by removing the excess UR3 adaptors with magnetic beads pre-bound with a biotinylated oligo BR-rc-bio (sequence: 5′-[BIOTIN]CTTTTGTCTTCCTAACATCC-3′) (SEQ ID NO: 6) that is reverse complementary to BR2-ad similarly as described above.
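The base-pairing relationships among the oligos in this protocol can be checked computationally. The following sketch takes the sequences as given in the protocol, written without internal whitespace, and tests each stated pairing with a reverse-complement helper. The helper and variable names are illustrative, and treating N as pairing with N is a simplifying assumption.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence; N (any base) is mapped to N."""
    return seq.upper().translate(COMPLEMENT)[::-1]

# Sequences from the protocol (5' to 3'), without internal whitespace.
BR2_AD = "TATCATCTGGATGTTAGGAAGACAAAAGGAAGCTGAGGACATTAACGGAC"
BR2_TEMP = "NNNNNNGTCCGTTAATGTCCTCAG"
UR3_EXT = "ACCTTCAGACCAGAT"
UR3_TEMP = "ATCTGGTCTGAAGGTNNNNNNNNN"
BR_RC_BIO = "CTTTTGTCTTCCTAACATCC"

# The defined part of BR2-temp pairs with the 3' end of BR2-ad.
br2_duplex = BR2_AD.endswith(reverse_complement(BR2_TEMP.lstrip("N")))
# The defined part of UR3-temp pairs with all of UR3-ext.
ur3_duplex = reverse_complement(UR3_TEMP.rstrip("N")) == UR3_EXT
# BR-rc-bio is the reverse complement of an internal stretch of BR2-ad.
br_rc_binds = reverse_complement(BR_RC_BIO) in BR2_AD
# UR3-ext carries the CTTCAG site quoted in the protocol.
has_acu_site = "CTTCAG" in UR3_EXT

print(br2_duplex, ur3_duplex, br_rc_binds, has_acu_site)
```

The degenerate N positions of the template oligos extend past the duplex as single-stranded overhangs, which is why they are stripped before comparison.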
  • the concentration of the adaptor-target ligated product in the final purified sample is estimated by urea polyacrylamide gel electrophoresis analysis.
  • single molecules comprise concatemers of polynucleotides, usually polynucleotide analytes, i.e. target sequences, that have been produced in a conventional rolling circle replication (RCR) reaction.
  • Guidance for selecting conditions and reagents for RCR reactions is available in many references available to those of ordinary skill, as evidenced by the following that are incorporated by reference: Kool, U.S. Pat. No. 5,426,180; Lizardi, U.S. Pat. Nos. 5,854,033 and 6,143,495; Landegren, U.S. Pat. No. 5,871,921; and the like.
  • RCR reaction components comprise single stranded DNA circles, one or more primers that anneal to DNA circles, a DNA polymerase having strand displacement activity to extend the 3′ ends of primers annealed to DNA circles, nucleoside triphosphates, and a conventional polymerase reaction buffer. Such components are combined under conditions that permit primers to anneal to DNA circles and be extended by the DNA polymerase to form concatemers of DNA circle complements.
  • An exemplary RCR reaction protocol is as follows: In a 50 μL reaction mixture, the following ingredients are assembled: 2-50 pmol circular DNA, 0.5 units/μL phage φ29 DNA polymerase, 0.2 μg/μL BSA, 3 mM dNTP, 1× φ29 DNA polymerase reaction buffer (Amersham). The RCR reaction is carried out at 30° C. for 12 hours. In some embodiments, the concentration of circular DNA in the polymerase reaction may be selected to be low (approximately 10-100 billion circles per ml, or 10-100 circles per picoliter) to avoid entanglement and other intermolecular interactions.
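The unit conversions behind the circle concentrations quoted above can be sketched as follows. The function names are illustrative. The only physical constants used are Avogadro's number and the definition of a picoliter.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)
PL_PER_ML = 1e9           # picoliters per milliliter

def circles_per_pl(circles_per_ml):
    """Convert a number concentration from per-milliliter to per-picoliter."""
    return circles_per_ml / PL_PER_ML

def pmol_to_circles_per_ml(pmol, volume_ul):
    """Number concentration (circles/ml) of `pmol` of circles in `volume_ul`."""
    molecules = pmol * 1e-12 * AVOGADRO
    return molecules / (volume_ul * 1e-3)

# 10-100 billion circles per ml expressed per picoliter.
print(circles_per_pl(10e9), circles_per_pl(100e9))

# Lower end of the exemplary recipe: 2 pmol of circles in 50 ul.
print(f"{pmol_to_circles_per_ml(2, 50):.2e} circles per ml")
```

By this arithmetic, 2 pmol of circles in 50 μL is on the order of 10^13 circles per ml, so the low-concentration embodiment would correspond to a considerably more dilute reaction than the exemplary recipe.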
  • concatemers produced by RCR are approximately uniform in size; accordingly, in some embodiments, methods of making arrays of the invention may include a step of size selecting concatemers.
  • concatemers are selected that as a population have a coefficient of variation in molecular weight of less than about 30%; and in another embodiment, less than about 20%.
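The coefficient of variation criterion above can be evaluated as the sample standard deviation divided by the mean. The sketch below uses concatemer length in kilobases as a stand-in for molecular weight, which assumes roughly uniform base composition. The size values are hypothetical, not measured data.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return statistics.stdev(values) / mean

# Hypothetical concatemer sizes in kb (illustrative only).
sizes_kb = [80, 100, 120, 90, 110]
cv = coefficient_of_variation(sizes_kb)
print(f"CV = {cv:.1%}; below 30%: {cv < 0.30}; below 20%: {cv < 0.20}")
```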
  • size uniformity is further improved by adding low concentrations of chain terminators, such as ddNTPs, to the RCR reaction mixture to reduce the presence of very large concatemers, e.g. produced by DNA circles that are synthesized at a higher rate by polymerases.
  • concentrations of ddNTPs are used that result in an expected concatemer size in the range of from 50-250 Kb, or in the range of from 50-100 Kb.
  • concatemers may be enriched for a particular size range using conventional separation techniques, e.g. size-exclusion chromatography, membrane filtration, or the like.
  • macromolecular structures comprise polymers having at least one unique functionality, which for polynucleotides is usually a functionality at a 5′ or 3′ end, and a plurality of complementary functionalities that are capable of specifically reacting with reactive functionalities of the surface of a solid support.
  • Macromolecular structures comprising branched polymers, especially branched polynucleotides, may be synthesized in a variety of ways, as disclosed by Gryaznov (cited above), Urdea (cited above), and like references.
  • branched polymers of the invention include comb-type branched polymers, which comprise a linear polymeric unit with one or more branch points located at interior monomers and/or linkage moieties.
  • Branched polymers of the invention also include fork-type branched polymers, which comprise a linear polymeric unit with one or two branch points located at terminal monomers and/or linkage moieties.
  • Macromolecular structures of the invention also include assemblies of linear and/or branched polynucleotides bound together by one or more duplexes or triplexes. Such assemblies may be self-assembled from component linear polynucleotide, e.g. as disclosed by Goodman et al, Science, 310: 1661-1665 (2005); Birac et al, J. Mol. Graph Model, (Apr. 18, 2006); Seeman et al, U.S. Pat. No.
  • linear polymeric units of the invention have the form: “-(M-L) n -” wherein L is a linker moiety and M is a monomer that may be selected from a wide range of chemical structures to provide a range of functions from serving as an inert non-sterically hindering spacer moiety to providing a reactive functionality which can serve as a branching point to attach other components, a site for attaching labels; a site for attaching oligonucleotides or other binding polymers for hybridizing or binding to amplifier strands or structures, e.g. as described by Urdea et al, U.S. Pat. No.
  • M is a straight chain, cyclic, or branched organic molecular structure containing from 1 to 20 carbon atoms and from 0 to 10 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
  • M is alkyl, alkoxy, alkenyl, or aryl containing from 1 to 16 carbon atoms; heterocyclic having from 3 to 8 carbon atoms and from 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur; glycosyl; or nucleosidyl.
  • the haloacyl- or haloalkylamino groups are haloacetylamino groups; and more preferably, the haloacetylamino groups are bromoacetylamino groups.
  • the acyl or alkyl moieties of the haloacyl- or haloalkylamino groups contain from 1 to 12 carbon atoms; and more preferably, such moieties contain from 1 to 8 carbon atoms.
  • the reaction may take place in a wide range of solvent systems; but generally, the assembly reaction takes place under liquid aqueous conditions or in a frozen state in ice, e.g. obtained by lowering the temperature of a liquid aqueous reaction mixture.
  • the thio- or dithiophosphorylacyl- or thio- or dithiophosphorylalkylamino bridges are preferred because they can be readily and selectively cleaved by oxidizing agents, such as silver nitrate, potassium iodide, and the like.
  • the bridges are cleaved with potassium iodide, KI₃, at a concentration equivalent to about a hundred-fold molar excess of the bridges.
  • KI₃ is employed at a concentration of about 0.1 M.
  • a 5′-haloacetylamino derivatized oligonucleotide 3 is reacted with a 3′-monophosphorothioate oligonucleotide 4 according to the following scheme:
  • oligonucleotide 4 can be prepared as described by Thuong and Asseline (cited above). Oligonucleotides 1 and 4 and 2 and 3 may be reacted to form polymeric units having either two 5′ termini or two 3′ termini, respectively.
  • Amino functionalities may also be introduced by a protected hydroxyamine phosphoramidite commercially available from Clontech Laboratories (Palo Alto, Calif.) as Aminomodifier II™.
  • amino functionalities are introduced by generating a derivatized phosphoramidate linkage by oxidation of a phosphite linkage with I₂ and an alkyldiamine, e.g. as taught by Agrawal et al, Nucleic Acids Research, 18:5419-5423 (1990); and Jager et al, Biochemistry, 27:7237-7246 (1988).
  • haloacyl- or haloalkylamino derivatized polymeric units be prepared separately from the phosphorothioate derivatized polymeric units, otherwise the phosphorothioate moieties require protective groups.
  • supports are rigid solids that have a surface, preferably a substantially planar surface so that single molecules to be interrogated are in the same plane. The latter feature permits efficient signal collection by detection optics, for example.
  • solid supports of the invention are nonporous, particularly when random arrays of single molecules are analyzed by hybridization reactions requiring small volumes. Suitable solid support materials include materials such as glass, polyacrylamide-coated glass, ceramics, silica, silicon, quartz, various plastics, and the like.
  • the area of a planar surface may be in the range of from 0.5 to 4 cm 2 .
  • the solid support is glass or quartz, such as a microscope slide, having a surface that is uniformly silanized.
  • This may be accomplished using conventional protocols, e.g. acid treatment followed by immersion in a solution of 3-glycidoxypropyl trimethoxysilane, N,N-diisopropylethylamine, and anhydrous xylene (8:1:24 v/v) at 80° C., which forms an epoxysilanized surface.
  • a solution e.g. Beattie et al (1995), Molecular Biotechnology, 4: 213.
  • Such a surface is readily treated to permit end-attachment of capture oligonucleotides, e.g.
  • capture oligonucleotides may comprise non-natural nucleosidic units and/or linkages that confer favorable properties, such as increased duplex stability; such compounds include, but are not limited to, peptide nucleic acids (PNAs), locked nucleic acids (LNA), oligonucleotide N3′→P5′ phosphoramidates, oligo-2′-O-alkylribonucleotides, and the like.
  • surfaces containing a plurality of discrete spaced apart regions are fabricated by photolithography.
  • a commercially available, optically flat, quartz substrate is spin coated with a 100-500 nm thick layer of photo-resist.
  • the photo-resist is then baked on to the quartz substrate.
  • An image of a reticle with a pattern of regions to be activated is projected onto the surface of the photo-resist, using a stepper. After exposure, the photo-resist is developed, removing the areas of the projected pattern which were exposed to the UV source. This is accomplished by plasma etching, a dry developing technique capable of producing very fine detail.
  • the substrate is then baked to strengthen the remaining photo-resist. After baking, the quartz wafer is ready for functionalization.
  • surfaces containing a plurality of discrete spaced apart regions are fabricated by nano-imprint lithography (NIL).
  • a quartz substrate is spin coated with a layer of resist, commonly called the transfer layer.
  • a second type of resist is then applied over the transfer layer, commonly called the imprint layer.
  • the master imprint tool then makes an impression on the imprint layer.
  • the overall thickness of the imprint layer is then reduced by plasma etching until the low areas of the imprint reach the transfer layer. Because the transfer layer is harder to remove than the imprint layer, it remains largely untouched.
  • the imprint and transfer layers are then hardened by heating.
  • the substrate is then put into a plasma etcher until the low areas of the imprint reach the quartz.
  • the substrate is then derivatized by vapor deposition as described above.
  • a high density array of capture oligonucleotide spots of sub micron size is prepared using a printing head or imprint-master prepared from a bundle, or bundle of bundles, of about 10,000 to 100 million optical fibers with a core and cladding material.
  • a unique material is produced that has about 50-1000 nm cores separated by a similar or 2-5 fold smaller or larger size cladding material.
  • by differential etching (dissolving) of cladding material, a nano-printing head is obtained having a very large number of nano-sized posts.
  • This printing head may be used for depositing oligonucleotides or other biological (proteins, oligopeptides, DNA, aptamers) or chemical compounds such as silane with various active groups.
  • the glass fiber tool is used as a patterned support to deposit oligonucleotides or other biological or chemical compounds. In this case only posts created by etching may be contacted with material to be deposited. Also, a flat cut of the fused fiber bundle may be used to guide light through cores and allow light-induced chemistry to occur only at the tip surface of the cores, thus eliminating the need for etching.
  • the same support may then be used as a light guiding/collection device for imaging fluorescence labels used to tag oligonucleotides or other reactants.
  • This device provides a large field of view with a large numerical aperture (potentially >1).
  • Stamping or printing tools that perform active material or oligonucleotide deposition may be used to print 2 to 100 different oligonucleotides in an interleaved pattern. This process requires precise positioning of the print head to about 50-500 nm.
  • This type of oligonucleotide array may be used for attaching 2 to 100 different DNA populations such as different source DNA. They also may be used for parallel reading from sub-light resolution spots by using DNA specific anchors or tags.
  • “inert” concatemers are used to prepare a surface for attachment of test concatemers.
  • the surface is first covered by capture oligonucleotides complementary to the binding site present on two types of synthetic concatemers; one is a capture concatemer, the other is a spacer concatemer.
  • the spacer concatemers do not have DNA segments complementary to the adapter used in preparation of test concatemers and they are used in about 5- to 50-fold, preferably 10-fold, excess to capture concatemers.
  • the surface with capture oligonucleotide is “saturated” with a mix of synthetic concatemers (prepared by chain ligation or by RCR) in which the spacer concatemers are used in about 10-fold (or 5 to 50-fold) excess to capture concatemers.
  • multiple arrays of the invention may be placed on a single surface.
  • patterned array substrates may be produced to match the standard 96 or 384 well plate format.
  • a production format can be an 8 × 12 pattern of 6 mm × 6 mm arrays at 9 mm pitch or a 16 × 24 pattern of 3.33 mm × 3.33 mm arrays at 4.5 mm pitch, on a single piece of glass, plastic, or other optically compatible material.
  • each 6 mm × 6 mm array consists of 36 million 250-500 nm square regions at 1 micrometer pitch. Hydrophobic or other surface or physical barriers may be used to prevent mixing different reactions between unit arrays.
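The site count quoted above follows from the array side length and the pitch. A minimal sketch of that arithmetic, assuming a square grid that fills the whole array (the function names are illustrative and do not come from the source):

```python
def sites_per_array(side_mm: float, pitch_um: float) -> int:
    """Count grid sites on a square array, assuming sites fill the full side length."""
    per_side = round(side_mm * 1000 / pitch_um)  # mm -> micrometers, then sites per edge
    return per_side * per_side


def sites_per_plate(rows: int, cols: int, side_mm: float, pitch_um: float) -> int:
    """Total sites across a rows x cols pattern of identical unit arrays."""
    return rows * cols * sites_per_array(side_mm, pitch_um)


if __name__ == "__main__":
    print(sites_per_array(6, 1.0))         # 36000000, the 36 million regions stated above
    print(sites_per_plate(8, 12, 6, 1.0))  # 3456000000 for the 8 x 12 format
```

The same functions can be applied to the 16 × 24 format, although the source does not state the pitch used within the 3.33 mm arrays.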
  • binding sites i.e. discrete spaced apart regions
  • binding sites for DNA samples are prepared by silanization of lithographically defined sites on silicon dioxide on silicon, quartz, or glass surfaces with 3-aminopropyldimethylethoxysilane or similar silanization agent followed by derivatization with p-phenylenediisothiocyanate or similar derivatization agent.
  • the binding sites may be square, circular or regular/irregular polygons produced by photolithography, direct-write electron beam, or nano-imprint lithography.
  • Minimization of non-specific binding in regions between binding sites: the wettability (hydrophobic vs. hydrophilic) and reactivity of the field surrounding the binding sites can be controlled to prevent DNA samples from binding in the field; that is, in places other than the binding sites.
  • the field may be prepared with hexamethyldisilazane (HMDS), or a similar agent covalently bonded to the surface, to be hydrophobic and hence unsuitable to hydrophilic bonding of the DNA samples.
  • the field may be coated with a chemical agent such as a fluorine-based carbon compound that renders it unreactive to DNA samples.
  • 3-aminopropyldimethylethoxysilane may be used as a replacement for 3-aminopropyltriethoxysilane because it forms a mono-layer on the glass surface.
  • the monolayer surface provides a lower background.
  • the silanization agent may also be applied using vapor deposition.
  • 3-aminopropyltriethoxysilane tends to form more of a polymeric surface when deposited in solution phase.
  • the amino modified silane is then terminated with a thiocyanate group.
  • Capture oligonucleotides are bound to the surface of the cover slide by applying a solution of 10-50 micromolar capture oligonucleotide in 100 millimolar sodium bicarbonate in water to the surface. The solution is allowed to dry, and is then washed in water.
  • random arrays are prepared using nanometer-sized beads.
  • Sub-micron glass or other types of beads e.g. in the 20-50 nm range
  • a short oligonucleotide e.g. 6-30 nucleotides, complementary to an adaptor oligonucleotide in the circles used to generate concatemers.
  • the number of oligonucleotides on the bead and the length of the sequence can be controlled to weakly bind the concatemers in solution. Reaction rate of the beads should be much faster than that of the solid support alone. After binding concatemers, the beads are then allowed to settle on the surface of an array substrate.
  • One way to control the density of capture probes is to mix in, in this case, about 8 times more of a 2-4 base long oligonucleotide with the same attachment chemistry as the capture probe. Also, much smaller nano-beads (20-50 nm) may be used. 2. Reaction conditions (temperature, pH, salt concentration) are adjusted so that concatemers with over 300 copies will attach to nanobeads in significant numbers. 3.
  • system hardware may be described as consisting of five major components; the robotic fluid handling system, the reaction chamber, the temperature control system, the illumination system and the detection system.
  • Each DNA array segment may be housed in a separate flow cell, allowing cycles to be run asynchronously.
  • Each flow cell provides temperature control, physically indexes the substrate, and creates a fluid path over the active area of the substrate.
  • the active area of a flow cell may be determined by how many unit sub-arrays each flow cell contains. For an eight flow-cell system, each flow cell may contain an active area of 48 × 4 square millimeters, or 192 square millimeters in a 6 × 8 arrangement of unit sub-arrays. Similarly, in a 16 flow-cell system, each flow cell may have a 1 cm × 1.5 cm substrate with 4 × 6 unit subarrays.
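The two flow-cell layouts can be checked with a few lines of arithmetic. This sketch assumes that the "48 × 4 square millimeters" figure means 48 unit sub-arrays of 4 square millimeters each; that reading, and the function names, are assumptions rather than statements from the source:

```python
def unit_subarrays(rows: int, cols: int) -> int:
    """Number of unit sub-arrays in one flow cell."""
    return rows * cols


def active_area_mm2(rows: int, cols: int, unit_area_mm2: float) -> float:
    """Active area of one flow cell, given the area of one unit sub-array."""
    return unit_subarrays(rows, cols) * unit_area_mm2


if __name__ == "__main__":
    print(active_area_mm2(6, 8, 4.0))   # 192.0 square millimeters per flow cell
    print(8 * unit_subarrays(6, 8))     # 384 sub-arrays in the eight flow-cell system
    print(16 * unit_subarrays(4, 6))    # 384 sub-arrays in the 16 flow-cell system
```

Under that reading both systems carry 384 unit sub-arrays in total, which matches the 384-well format mentioned earlier.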
  • the thin cross section of fluid between the array substrate and optical window may cool or heat to room temperature relatively quickly. Creating a pocket above the optical window may allow filling the area directly above the window with optical oil. This oil may act as a thermal transfer medium connecting the top of the thin optical window to the temperature controlled flow cell body.
  • Solexa (5) is attempting sequencing by synthesis on random array substrates with non-amplified or in-situ amplified DNA. Cycles of fluorescent nucleotide addition result in read lengths of about 25 bases that are then used to assemble and align the final sequence to a reference sequence.
  • researchers (6, 7) and companies such as Helicos Biosciences are also attempting sequencing by synthesis from non-amplified templates. The main limitations of these methods are short read lengths leading to incomplete sequence determination. Furthermore, the ability to read only one base per DNA per cycle, combined with random attachment of DNA that requires larger array surfaces and large numbers of CCD pixels per DNA sample, leads to higher genome sequencing costs.
  • This configuration images RCR concatemers bound randomly to a substrate (non-ordered array).
  • Imaging speed may be improved by decreasing the objective magnification power, using grid patterned arrays and increasing the number of pixels of data collected in each image. For example, up to four or more cameras may be used, preferably in the 10-16 megapixel range. Multiple band pass filters and dichroic mirrors may also be used to collect pixel data across up to four or more emission spectra. To compensate for the lower light collecting power of the decreased magnification objective, the power of the excitation light source can be increased. Throughput can be increased by using one or more flow chambers with each camera, so that the imaging system is not idle while the samples are being hybridized/reacted. Because the probing of arrays can be non-sequential, more than one imaging system can be used to collect data from a set of arrays, further decreasing assay time.
  • the substrate must remain in focus.
  • Some key factors in maintaining focus are the flatness of the substrate, orthogonality of the substrate to the focus plane, and mechanical forces on the substrate that may deform it.
  • Substrate flatness can be well controlled; glass plates that have better than 1/4 wave flatness are readily obtained. Uneven mechanical forces on the substrate can be minimized through proper design of the hybridization chamber.
  • Orthogonality to the focus plane can be achieved by a well-adjusted, high precision stage.
  • Auto focus routines generally take additional time to run, so it is desirable to run them only if necessary. After each image is acquired, it will be analyzed using a fast algorithm to determine if the image is in focus. If the image is out of focus, the auto focus routine will run. It will then store the objective's Z position information to be used upon return to that section of that array during the next imaging cycle. By mapping the objective's Z position at various locations on the substrate, we will reduce the time required for substrate image acquisition.
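The focus strategy described above amounts to caching a Z position per stage location and running the slow routine only when the fast check fails. A minimal sketch follows, with the stage, the focus check, and the autofocus routine replaced by hypothetical callables (none of these names come from the source):

```python
from typing import Callable, Dict, Tuple

Position = Tuple[int, int]


class FocusMap:
    """Caches the objective Z position per stage location; autofocus runs only on a miss."""

    def __init__(self, acquire: Callable, in_focus: Callable, autofocus: Callable,
                 default_z: float = 0.0) -> None:
        self.acquire = acquire      # (pos, z) -> image
        self.in_focus = in_focus    # image -> bool, the fast focus check
        self.autofocus = autofocus  # pos -> z, the slow routine
        self.default_z = default_z
        self.z_map: Dict[Position, float] = {}
        self.autofocus_runs = 0

    def image(self, pos: Position):
        z = self.z_map.get(pos, self.default_z)
        img = self.acquire(pos, z)
        if not self.in_focus(img):
            z = self.autofocus(pos)      # slow path, taken only when needed
            self.autofocus_runs += 1
            img = self.acquire(pos, z)
        self.z_map[pos] = z              # remembered for the next imaging cycle
        return img


def demo() -> int:
    """Simulate three imaging cycles over two locations; return autofocus run count."""
    true_z = {(0, 0): 0.0, (0, 1): 2.0}  # made-up focal planes for the simulation
    fm = FocusMap(
        acquire=lambda pos, z: (pos, z),
        in_focus=lambda img: abs(img[1] - true_z[img[0]]) <= 0.5,
        autofocus=lambda pos: true_z[pos],
    )
    for _cycle in range(3):
        for pos in true_z:
            fm.image(pos)
    return fm.autofocus_runs
```

In the simulated run only one location ever needs the slow routine, and only in the first cycle; later cycles reuse the stored Z position.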
  • a suitable illumination and detection system for fluorescence-based signal is a Zeiss Axiovert 200 equipped with a TIRF slider coupled to an 80 milliwatt 532 nm solid state laser.
  • the slider illuminates the substrate through the objective at the correct TIRF illumination angle.
  • TIRF can also be accomplished without the use of the objective by illuminating the substrate through a prism optically coupled to the substrate.
  • Planar wave guides can also be used to implement TIRF on the substrate.
  • Epi illumination can also be employed.
  • the light source can be rastered, spread beam, coherent, incoherent, and originate from a single or multi-spectrum source.
  • One embodiment for the imaging system contains a 20× lens with a 1.25 mm field of view, with detection being accomplished with a 10 megapixel camera. Such a system images approximately 1.5 million concatemers attached to the patterned array at 1 micron pitch. Under this configuration there are approximately 6.4 pixels per concatemer.
  • the number of pixels per concatemer can be adjusted by increasing or decreasing the field of view of the objective. For example, a 1 mm field of view would yield a value of 10 pixels per concatemer and a 2 mm field of view would yield a value of 2.5 pixels per concatemer.
  • the field of view may be adjusted relative to the magnification and NA of the objective to yield the lowest pixel count per concatemer that is still capable of being resolved by the optics, and image analysis software.
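The pixel budget per concatemer can be reproduced from the field of view, the array pitch, and the camera size. This is a sketch that assumes a square field of view whose side equals the stated value; the function names are illustrative:

```python
def concatemers_in_view(fov_mm: float, pitch_um: float = 1.0) -> float:
    """Array sites inside a square field of view of the given side length."""
    per_side = fov_mm * 1000 / pitch_um
    return per_side ** 2


def pixels_per_concatemer(camera_pixels: float, fov_mm: float, pitch_um: float = 1.0) -> float:
    """Camera pixels available per array site."""
    return camera_pixels / concatemers_in_view(fov_mm, pitch_um)


if __name__ == "__main__":
    for fov in (1.0, 1.25, 2.0):
        print(fov, concatemers_in_view(fov), pixels_per_concatemer(10e6, fov))
```

With a 10 megapixel camera this gives about 1.56 million sites and 6.4 pixels per site at 1.25 mm, 10 pixels at 1 mm, and 2.5 pixels at 2 mm, in line with the figures in the text.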
  • Both TIRF and EPI illumination allow for almost any light source to be used.
  • One illumination schema is to share a common set of monochromatic illumination sources (about 4 lasers for 6-8 colors) amongst imagers. Each imager collects data at a different wavelength at any given time and the light sources would be switched to the imagers via an optical switching system.
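One simple way to share sources so that no two imagers use the same one at the same time is a rotating (round-robin) assignment. The sketch below is an assumption about how such an optical switch could be scheduled, not a description of the actual switching system:

```python
from typing import List


def illumination_schedule(n_imagers: int, n_sources: int) -> List[List[int]]:
    """Return one row per time step; row[i] is the source index routed to imager i.

    At step t, imager i receives source (i + t) mod n_sources, so the sources
    in any one row are all different as long as n_imagers <= n_sources.
    """
    if n_imagers > n_sources:
        raise ValueError("need at least as many sources as imagers")
    return [[(i + t) % n_sources for i in range(n_imagers)] for t in range(n_sources)]


if __name__ == "__main__":
    for step, row in enumerate(illumination_schedule(4, 4)):
        print(step, row)
```

After `n_sources` steps every imager has been served by every source exactly once.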
  • the illumination source preferably produces at least 6, but more preferably 8 different wavelengths.
  • Such sources include gas lasers, multiple diode pumped solid state lasers combined through a fiber coupler, filtered Xenon Arc lamps, tunable lasers, or the more novel Spectralum Light Engine, soon to be offered by Tidal Photonics.
  • the Spectralum Light Engine uses a prism to spectrally separate light.
  • the spectrum is projected onto a Texas Instruments Digital Light Processor, which can selectively reflect any portion of the spectrum into a fiber or optical connector.
  • This system is capable of monitoring and calibrating the power output across individual wavelengths to keep them constant so as to automatically compensate for intensity differences as bulbs age or between bulb changes.
  • a preferred instrument design is 4 imager modules each serving 4 flow cells (16 flow cells total).
  • the above described imaging schema assumes that each imager has a CCD detector with 10 million pixels and is used with an exposure time of roughly 300 milliseconds. This should be an acceptable method for collecting data for 6 fluorophore labels.
  • One possible drawback to this imaging technique is that certain fluorophores may be unintentionally photo bleached by the light source while other fluorophores are being imaged. Keeping the illumination power low and exposure times to a minimum would greatly reduce photo bleaching.
  • intensified CCDs (ICCDs)
  • a one megapixel ICCD can acquire ten or more images in the time a standard CCD acquires a single image. Used in conjunction with fast filter wheels and a high speed flow cell stage, a one megapixel ICCD should be able to collect the same amount of data as a 10 megapixel standard CCD.
  • the reaction efficiency on the concatemer and other random DNA arrays may depend on the efficient use of probes, anchors or primers and enzymes. This may be achieved by mixing liquids (such as pooling liquid back and forth in the flow through chamber), applying agitations or using horizontal or vertical electric fields to bring DNA from different parts of the reaction volume in the proximity of the surface.
  • One approach for efficient low cost assay reaction is to apply reaction mixes in a thin layer such as droplets or layers of about one to a few microns, but preferably less than 10 microns, in size/thickness. In a 1 × 1 × 1 micron volume designated for a 1 × 1 micron spot area, at 1 pmol/1 μl (1 μM concentration) there would be about 1000 molecules of probe in close proximity to 1-1000 copies of DNA. Using up to 100-300 molecules of probes would not significantly reduce the probe concentration and it would provide enough reacted probes to get significant signal. This approach may be used in an open reaction chamber that may stay open or closed for removal and washing of the probes and enzyme.
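The probe count in a cubic micron can be estimated from the concentration and Avogadro's number. A minimal sketch (the function name is illustrative); note that the direct calculation gives roughly 600 molecules, the same order of magnitude as the "about 1000" quoted above:

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_in_volume(conc_molar: float, volume_um3: float) -> float:
    """Number of solute molecules in a volume given in cubic micrometers."""
    litres = volume_um3 * 1e-15  # 1 cubic micrometer = 1e-15 litre
    return conc_molar * litres * AVOGADRO


if __name__ == "__main__":
    print(molecules_in_volume(1e-6, 1.0))  # roughly 600 for 1 micromolar in 1 cubic micron
```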
  • the physical makeup of the machine will include a number of additions to the standard microscope.
  • a large area automated plate stage may be added to the microscope. This stage will accommodate the two substrates needed for each decoding assay.
  • Another possibility is to use two smaller substrates that can fit in the standard plate stage.
  • Each substrate will be fitted into a cassette and those cassettes will be fitted on to the stage.
  • the cassette will index the substrate to the stage and provide a method to contain fluids over the assay substrate.
  • Cassettes will have ports to facilitate the addition and removal of large volumes of buffer. They will also provide a means to control the temperature of the substrate, through a connection with a temperature control subsystem with ability to maintain temperature in the range from about 5-95° C.
  • reagents and substrates may be contained on a microfluidic chip.
  • the control software may run assay cycles asynchronously, allowing each imager to run continuously throughout the assay.
  • Flow cells are connected to a temperature control system with one heater and one chiller allowing for heating or cooling on demand of each flow cell or 2-4 blocks of cells independently.
  • Each flow cell temperature may be monitored, and if a flow cell temperature drops below a set threshold, a valve may open to a hot water recirculation. Likewise, if a flow cell temperature is above the set threshold a valve may open to a cold water recirculation. If a flow cell is within a set temperature range neither valve may open.
  • the hot and cold recirculation water runs through the aluminum flow cell body, but remains separate and isolated from the assay buffers and reagents.
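The valve behavior described above is a three-state threshold controller. A minimal sketch, assuming a symmetric tolerance band around a setpoint; the band width and all names are hypothetical:

```python
from enum import Enum


class Valve(Enum):
    HOT = "hot"    # open the hot water recirculation
    COLD = "cold"  # open the cold water recirculation
    NONE = "none"  # within range, neither valve opens


def select_valve(temp_c: float, setpoint_c: float, tolerance_c: float = 0.5) -> Valve:
    """Choose which recirculation valve to open for one flow cell."""
    if temp_c < setpoint_c - tolerance_c:
        return Valve.HOT
    if temp_c > setpoint_c + tolerance_c:
        return Valve.COLD
    return Valve.NONE


if __name__ == "__main__":
    for reading in (20.0, 37.2, 45.0):
        print(reading, select_valve(reading, setpoint_c=37.0))
```

Because each flow cell is evaluated on its own reading, the same function can serve individual cells or blocks of cells.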
  • random arrays of biomolecules such as genomic DNA fragments or cDNA fragments
  • SAGE serial analysis of gene expression
  • massively parallel signature sequencing e.g. Velculescu, et al, (1995), Science 270, 484-487; and Brenner et al (2000), Nature Biotechnology, 18: 630-634.
  • Such genome-wide measurements include, but are not limited to, determination of polymorphisms, including nucleotide substitutions, deletions, and insertions, inversions, and the like, determination of methylation patterns, copy number patterns, and the like, such as could be carried out by a wide range of assays known to those with ordinary skill in the art, e.g. Syvanen (2005), Nature Genetics Supplement, 37: S5-S10; Gunderson et al (2005), Nature Genetics, 37: 549-554; Fan et al (2003), Cold Spring Harbor Symposia on Quantitative Biology, LXVIII: 69-78; and U.S. Pat. Nos. 4,883,750; 6,858,412; 5,871,921; 6,355,431; and the like, which are incorporated herein by reference.
  • a variety of sequencing methodologies can be used with random arrays of the invention, including, but not limited to, hybridization-based methods, such as disclosed in Drmanac, U.S. Pat. Nos. 6,864,052; 6,309,824; and 6,401,267; and Drmanac et al, U.S. patent publication 2005/0191656, which are incorporated by reference, sequencing by synthesis methods, e.g. Nyren et al, U.S. Pat. No. 6,210,891; Ronaghi, U.S. Pat. No. 6,828,100; Ronaghi et al (1998), Science, 281: 363-365; Balasubramanian, U.S. Pat. No.
  • DNA arrays may be prepared from DNA samples about 100-1000 bases in length where, in one segment or two segments close to adapters/primers/anchors sequences are determined by positional methods (SBS and others). The same DNA array is subjected to probe hybridization or probe-probe combinatorial ligation on the entire DNA or a part that is still in the form of ssDNA.
  • “substantially covers” means that the amount of DNA analyzed contains an equivalent of at least two copies of the target polynucleotide, or in another aspect, at least ten copies, or in another aspect, at least twenty copies, or in another aspect, at least 100 copies.
  • Target polynucleotides may include DNA fragments, including genomic DNA fragments and cDNA fragments, and RNA fragments.
  • Guidance for the step of reconstructing target polynucleotide sequences can be found in the following references, which are incorporated by reference: Lander et al, Genomics, 2: 231-239 (1988); Vingron et al, J. Mol. Biol., 235: 1-12 (1994); and like references.
  • the sequence identity of each attached DNA concatemer may be determined by a “signature” approach. About 50 to 100 or possibly 200 probes are used such that about 25-50% or in some applications 10-30% of attached concatemers will have a full match sequence for each probe.
  • This type of data allows each amplified DNA fragment within a concatemer to be mapped to the reference sequence. For example, by such a process one can score 64 4-mers (i.e. 25% of all possible 256 4-mers) using 16 hybridization/stripoff cycles in a 4 colors labeling schema. On a 60-70 base fragment amplified in a concatemer about 16 of 64 probes will be positive since there are 64 possible 4-mers present in a 64 base long sequence (i.e.
  • Unrelated 60-70 base fragments will have a very different set of about 16 positive decoding probes.
  • a combination of 16 probes out of 64 probes has a random chance of occurrence in 1 of every one billion fragments which practically provides a unique signature for that concatemer.
  • Scoring 80 probes in 20 cycles and generating 20 positive probes creates a signature even more likely to be unique: occurrence by chance is 1 in a billion billion.
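One rough way to gauge how distinctive such signatures are is to count the possible sets of positive probes and to estimate the positive rate for a random fragment. The sketch below assumes independent, uniformly distributed bases and equally likely probe subsets, which is a simplification not made in the source:

```python
from math import comb


def signature_count(n_probes: int, n_positive: int) -> int:
    """Number of distinct sets of n_positive probes drawn from n_probes."""
    return comb(n_probes, n_positive)


def expected_positive_fraction(fragment_len: int, k: int = 4) -> float:
    """Chance that a given k-mer occurs at least once in a random fragment."""
    windows = fragment_len - k + 1          # overlapping k-mer positions
    return 1 - (1 - 1 / 4 ** k) ** windows  # each window matches with probability 4^-k


if __name__ == "__main__":
    print(signature_count(64, 16))         # possible 16-of-64 signatures
    print(signature_count(80, 20))         # possible 20-of-80 signatures
    print(expected_positive_fraction(64))  # positive rate for a 64-base fragment
```

Under this simplified model the 16-of-64 count is far larger than one billion and the 20-of-80 count exceeds a billion billion, so the chance figures quoted above look conservative. The estimated positive rate for a 64-base fragment comes out a little above 20%, close to the 25% used in the text.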
  • a “signature” approach was used to select novel genes from cDNA libraries.
  • An implementation of a signature approach is to sort obtained intensities of all tested probes and select up to a predefined (expected) number of probes that satisfy the positive probe threshold.
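The selection step described above can be sketched as a sort followed by a capped, thresholded pick. The probe names, intensity scale, and threshold value here are made up for illustration:

```python
from typing import Dict, List


def call_signature(intensities: Dict[str, float], threshold: float,
                   max_positive: int) -> List[str]:
    """Return up to max_positive probes, brightest first, that meet the threshold."""
    ranked = sorted(intensities.items(), key=lambda item: item[1], reverse=True)
    return [probe for probe, value in ranked[:max_positive] if value >= threshold]


if __name__ == "__main__":
    spot = {"ACGT": 0.9, "TTTT": 0.1, "GGCA": 0.7, "CCAT": 0.6}
    print(call_signature(spot, threshold=0.5, max_positive=2))
```

Capping the list at the expected number of positives keeps a noisy spot from producing an oversized signature, while the threshold keeps a dim spot from being padded with background probes.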
  • a preferred way to score 4-mers is to ligate pairs of probes, for example: N(5-7)BBB with BN(7-9), where B is the defined base and N is a degenerate base.
  • more unique bases will be used. For example, a 25% positive rate in a fragment 1000 bases in length would be achieved by N(4-6)BBBB and BBN(6-8). Note that longer fragments need the same number of about 60-80 probes (15-20 ligation cycles using 4 colors).
  • decoding or sequencing probes with large numbers of Ns may be prepared from multiple syntheses of subsets of sequences at degenerate bases to minimize differences in efficiency. Each subset is added to the mix at a proper concentration. Also, some subsets may have more degenerate positions than others. For example, each of 64 probes from the set N(5-7)BBB may be prepared in 4 different syntheses. The first is the regular synthesis with all 5-7 bases fully degenerate; the second is N(0-3)(A,T)5BBB; the third is N(0-2)(A,T)(G,C)(A,T)(G,C)(A,T)BBB; and the fourth is N(0-2)(G,C)(A,T)(G,C)(A,T)(G,C)BBB.
  • arrays are formed as random distributions of unique 8 to 20 base recognition sequences in the form of DNA concatemers.
  • the probes need to be decoded to determine the sequence of the 8-20 base probe region.
  • At least two options are available to do this and the following example describes the process for a 12-mer. In the first, one half of the sequence is determined by utilizing the hybridization specificity of short probes and the ligation specificity of fully matched hybrids. Six to ten bases adjacent to the 12-mer are predefined and act as a support for a 6-mer to 10-mer oligonucleotide. This short 6-mer will ligate at its 3-prime end to one of 4 labeled 6-mers to 10-mers.
  • decoding probes consist of a pool of 4 oligonucleotides in which each oligonucleotide consists of 4-9 degenerate bases and 1 defined base. This oligonucleotide will also be labeled with one of four fluorescent labels. Each of the 4 possible bases A, C, G, or T will therefore be represented by a fluorescent dye.
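With one dye per defined base, reading a position reduces to picking the brightest of four channels. The sketch below adds a simple ambiguity check; that check, the ratio value, and the intensity numbers are assumptions for illustration only:

```python
from typing import Dict


def call_base(intensities: Dict[str, float], min_ratio: float = 2.0) -> str:
    """Call the base whose dye channel is brightest, or 'N' if the call is ambiguous.

    intensities maps each base (A, C, G, T) to the signal of its assigned dye.
    """
    ranked = sorted(intensities, key=intensities.get, reverse=True)
    best, second = ranked[0], ranked[1]
    # Hypothetical quality gate: require the top channel to clearly dominate.
    if intensities[second] > 0 and intensities[best] / intensities[second] < min_ratio:
        return "N"
    return best


if __name__ == "__main__":
    print(call_base({"A": 900.0, "C": 100.0, "G": 80.0, "T": 60.0}))
    print(call_base({"A": 500.0, "C": 450.0, "G": 80.0, "T": 60.0}))
```

The first spot is called as A; the second is left as N because two channels are nearly equal.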
  • Six or more bases can be sequenced with additional probe pools.
  • the 6-mer oligonucleotide can be positioned further into the 12-mer sequence. This will necessitate the incorporation of degenerate bases into the 3-prime end of the non-labeled oligonucleotide to accommodate the shift. This is an example of decoding probes for positions 6 and 7 in the 12-mer.
  • One way to discriminate which anchor is positive from the pool is to mix specific probes with distinct hybrid stability (maybe different number of Ns in addition).
  • Anchors may be also tagged to determine which anchor from the pool is hybridized to a spot. Tags, as additional DNA segment, may be used for adjustable displacement as a detection method.
  • anchors labeled or tagged with multiple colors may be ligated to unlabeled N7-N10 supporter oligonucleotides.
  • Tagging probes with DNA tags: for larger multiplexing, decoding or sequence determination probes can be tagged with different oligonucleotide sequences made of natural bases or new synthetic bases (such as isoG and isoC).
  • Tags can be designed to have very precise binding efficiency with their anti-tags using different oligonucleotide lengths (about 6-24 bases) and/or sequence including GC content. For example 4 different tags may be designed that can be recognized with specific anti-tags in 4 consecutive cycles or in one hybridization cycle followed by a discriminative wash. In the discriminative wash initial signal is reduced to 95-99%, 30-40%, 10-20% and 0-5% for each tag, respectively.
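Discrimination by wash can be sketched as classifying each spot by the fraction of signal that survives. The retention ranges below are the ones quoted above; the tag names and the nearest-midpoint rule are assumptions for illustration:

```python
# Fraction of initial signal retained after the discriminative wash, per tag.
RETENTION_RANGES = {
    "tag1": (0.95, 0.99),
    "tag2": (0.30, 0.40),
    "tag3": (0.10, 0.20),
    "tag4": (0.00, 0.05),
}


def classify_tag(signal_before: float, signal_after: float) -> str:
    """Assign the tag whose retention range midpoint is closest to the observed retention."""
    if signal_before <= 0:
        raise ValueError("signal_before must be positive")
    retained = signal_after / signal_before
    return min(
        RETENTION_RANGES,
        key=lambda tag: abs(retained - sum(RETENTION_RANGES[tag]) / 2),
    )


if __name__ == "__main__":
    for after in (970.0, 350.0, 150.0, 20.0):
        print(after, classify_tag(1000.0, after))
```

A real decoder would probably also reject spots whose retention falls between ranges; this sketch always returns the nearest tag.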
  • this high image acquisition rate is achieved by using four ten-megapixel cameras, each imaging the emission spectra of a different fluorophore.
  • the cameras are coupled to the microscope through a series of dichroic beam splitters.
  • the autofocus routine which takes extra time, runs only if an acquired image is out of focus. It will then store the Z axis position information to be used upon return to that section of that array during the next imaging cycle. By mapping the autofocus position for each location on the substrate we will drastically reduce the time required for image acquisition.
  • Each array requires about 12-24 cycles to decode.
  • Each cycle consists of a hybridization, wash, array imaging, and strip-off step. These steps, in that order, may take for the above example 5, 2, 12, and 5 minutes each, for a total of 24 minutes each cycle, or roughly 5-10 hours for each array, if the operations were performed linearly.
  • the time to decode each array can be reduced by a factor of two by allowing the system to image constantly. To accomplish this, the imaging of two separate substrates on each microscope is staggered. While one substrate is being reacted, the other substrate is imaged.
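The timing figures above can be checked with a small model. It assumes two substrates share one imager and that imaging one substrate fully overlaps the chemistry steps of the other, which holds here because imaging (12 minutes) equals the other steps combined (5 + 2 + 5 minutes); the names are illustrative:

```python
STEP_MINUTES = {"hybridization": 5, "wash": 2, "imaging": 12, "strip_off": 5}
CYCLE_MINUTES = sum(STEP_MINUTES.values())  # 24 minutes per cycle


def linear_hours(cycles: int) -> float:
    """Time to decode one array when every step runs one after another."""
    return cycles * CYCLE_MINUTES / 60


def staggered_pair_hours(cycles: int) -> float:
    """Time to decode two arrays on one microscope with staggered imaging.

    The second substrate starts one imaging slot later, so the pair finishes
    one imaging time after the first substrate would finish alone.
    """
    imaging = STEP_MINUTES["imaging"]
    if 2 * imaging > CYCLE_MINUTES:
        raise ValueError("imager would be the bottleneck; this simple model does not apply")
    return (cycles * CYCLE_MINUTES + imaging) / 60


if __name__ == "__main__":
    for n in (12, 24):
        print(n, linear_hours(n), staggered_pair_hours(n) / 2)
```

For 12 to 24 cycles the linear time is 4.8 to 9.6 hours per array, and the staggered time per array is roughly half of that.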
  • oligonucleotide probes of the invention can be labeled in a variety of ways, including the direct or indirect attachment of radioactive moieties, fluorescent moieties, colorimetric moieties, chemiluminescent moieties, and the like.
  • Many comprehensive reviews of methodologies for labeling DNA and constructing DNA adaptors provide guidance applicable to constructing oligonucleotide probes of the present invention. Such reviews include Kricka, Ann. Clin. Biochem., 39: 114-129 (2002); Schaferling et al, Anal. Bioanal. Chem., (Apr. 12, 2006); Matthews et al, Anal. Biochem., Vol 169, pgs.
  • one or more fluorescent dyes are used as labels for the oligonucleotide probes, e.g. as disclosed by Menchen et al, U.S. Pat. No. 5,188,934 (4,7-dichlorofluorescein dyes); Bergot et al, U.S. Pat. No. 5,366,860 (spectrally resolvable rhodamine dyes); Lee et al, U.S. Pat. No. 5,847,162 (4,7-dichlororhodamine dyes); Khanna et al, U.S. Pat. No. 4,318,846 (ether-substituted fluorescein dyes); Lee et al, U.S. Pat. No.
  • fluorescent signal generating moiety means a signaling means which conveys information through the fluorescent absorption and/or emission properties of one or more molecules.
  • fluorescent properties include fluorescence intensity, fluorescence life time, emission spectrum characteristics, energy transfer, and the like.
  • fluorophores available for post-synthetic attachment include, inter alia, Alexa Fluor® 350, Alexa Fluor® 532, Alexa Fluor® 546, Alexa Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 647, BODIPY 493/503, BODIPY FL, BODIPY R6G, BODIPY 530/550, BODIPY TMR, BODIPY 558/568, BODIPY 564/570, BODIPY 576/589, BODIPY 581/591, BODIPY 630/650, BODIPY 650/665, Cascade Blue, Cascade Yellow, Dansyl, lissamine rhodamine B, Marina Blue, Oregon Green 488, Oregon Green 514, Pacific Blue, rhodamine 6G, rhodamine green, rhodamine red, tetramethylrhod
  • Digoxigenin may be incorporated as a label and subsequently bound by a detectably labeled anti-digoxigenin antibody (e.g. fluoresceinated anti-digoxigenin).
  • An aminoallyl-dUTP residue may be incorporated into a detection oligonucleotide and subsequently coupled to an N-hydroxy succinimide (NHS) derivatized fluorescent dye, such as those listed supra.
  • any member of a conjugate pair may be incorporated into a detection oligonucleotide provided that a detectably labeled conjugate partner can be bound to permit detection.
  • the term antibody refers to an antibody molecule of any class, or any subfragment thereof, such as an Fab.
  • suitable labels for detection oligonucleotides may include fluorescein (FAM), digoxigenin, dinitrophenol (DNP), dansyl, biotin, bromodeoxyuridine (BrdU), hexahistidine (6×His), phospho-amino acids (e.g. P-tyr, P-ser, P-thr), or any other suitable label.
  • the following hapten/antibody pairs are used for detection, in which each of the antibodies is derivatized with a detectable label: biotin/α-biotin, digoxigenin/α-digoxigenin, dinitrophenol (DNP)/α-DNP, 5-Carboxyfluorescein (FAM)/α-FAM.
  • probes may also be indirectly labeled, especially with a hapten that is then bound by a capture agent, e.g. as disclosed in Holtke et al, U.S. Pat. Nos. 5,344,757; 5,702,888; and 5,354,657; Huber et al, U.S. Pat. No. 5,198,537; Miyoshi, U.S. Pat. No. 4,849,336; Misiura and Gait, PCT publication WO 91/17160; and the like. Many different hapten-capture agent pairs are available for use with the invention.
  • haptens include biotin, des-biotin and other derivatives, dinitrophenol, dansyl, fluorescein, CY5, and other dyes, digoxigenin, and the like.
  • a capture agent may be avidin, streptavidin, or antibodies.
  • Antibodies may be used as capture agents for the other haptens (many dye-antibody pairs being commercially available, e.g. Molecular Probes).
  • kits for construction of random arrays of the invention include, but are not limited to, kits for determining the nucleotide sequence of a target polynucleotide, kits for large-scale identification of differences between reference DNA sequences and test DNA sequences, kits for profiling exons, and the like.
  • a kit typically comprises at least one support having a surface and one or more reagents necessary or useful for constructing a random array of the invention or for carrying out an application therewith.
  • Such reagents include, without limitation, nucleic acid primers, probes, adaptors, enzymes, and the like, and are each packaged in a container, such as, without limitation, a vial, tube or bottle, in a package suitable for commercial distribution, such as, without limitation, a box, a sealed pouch, a blister pack and a carton.
  • the package typically contains a label or packaging insert indicating the uses of the packaged materials.
  • packaging materials includes any article used in the packaging for distribution of reagents in a kit, including without limitation containers, vials, tubes, bottles, pouches, blister packaging, labels, tags, instruction sheets and package inserts.
  • the invention provides a kit for making a random array of concatemers of DNA fragments from a source nucleic acid comprising the following components: (i) a support having a surface; and (ii) at least one adaptor oligonucleotide for ligating to each DNA fragment and forming a DNA circle therewith, each DNA circle capable of being replicated by a rolling circle replication reaction to form a concatemer that is capable of being randomly disposed on the surface.
  • the surface may be a planar surface having an array of discrete spaced apart regions, wherein each discrete spaced apart region has a size equivalent to that of said concatemers.
  • the discrete spaced apart regions may form a regular array with a nearest neighbor distance in the range of from 0.1 to 20 μm.
  • the concatemers on the discrete spaced apart regions may have a nearest neighbor distance such that they are optically resolvable.
  • the discrete spaced apart regions may have capture oligonucleotides attached and the adaptor oligonucleotides may each have a region complementary to the capture oligonucleotides such that the concatemers are capable of being attached to the discrete spaced apart regions by formation of complexes between the capture oligonucleotides and the complementary regions of the adaptor oligonucleotides.
  • kits may further comprise (a) a terminal transferase for attaching a homopolymer tail to said DNA fragments to provide a binding site for a first end of said adaptor oligonucleotide, (b) a ligase for ligating a strand of said adaptor oligonucleotide to ends of said DNA fragment to form said DNA circle, (c) a primer for annealing to a region of the strand of said adaptor oligonucleotide, and (d) a DNA polymerase for extending the primer annealed to the strand in a rolling circle replication reaction.
  • the above adaptor oligonucleotide may have a second end having a number of degenerate bases in the range of from 4 to 12.
  • kits for sequencing a target polynucleotide comprising the following components: (i) a support having a planar surface having an array of optically resolvable discrete spaced apart regions, wherein each discrete spaced apart region has an area of less than 1 μm²; (ii) a first set of probes for hybridizing to a plurality of concatemers randomly disposed on the discrete spaced apart regions, the concatemers each containing multiple copies of a DNA fragment of the target polynucleotide; and (iii) a second set of probes for hybridizing to the plurality of concatemers such that whenever a probe from the first set hybridizes contiguously to a probe from the second set, the probes are ligated.
  • kits may further include a ligase, a ligase buffer, and a hybridization buffer.
  • the discrete spaced apart regions may have capture oligonucleotides attached and the concatemers may each have a region complementary to the capture oligonucleotides such that said concatemers are capable of being attached to the discrete spaced apart regions by formation of complexes between the capture oligonucleotides and the complementary regions of said concatemers.
  • kits for constructing a single molecule array comprising the following components: (i) a support having a surface having reactive functionalities; and (ii) a plurality of macromolecular structures each having a unique functionality and multiple complementary functionalities, the macromolecular structures being capable of being attached randomly on the surface wherein the attachment is formed by one or more linkages formed by reaction of one or more reactive functionalities with one or more complementary functionalities; and wherein the unique functionality is capable of selectively reacting with a functionality on an analyte molecule to form the single molecule array.
  • the surface is a planar surface having an array of discrete spaced apart regions containing said reactive functionalities and wherein each discrete spaced apart region has an area less than 1 μm².
  • the discrete spaced apart regions form a regular array with a nearest neighbor distance in the range of from 0.1 to 20 μm.
  • the concatemers on the discrete spaced apart regions have a nearest neighbor distance such that they are optically resolvable.
  • the macromolecular structures may be concatemers of one or more DNA fragments and wherein the unique functionalities are at a 3′ end or a 5′ end of the concatemers.
  • kits for circularizing DNA fragments comprising the components: (a) at least one adaptor oligonucleotide for ligating to one or more DNA fragments and forming DNA circles therewith (b) a terminal transferase for attaching a homopolymer tail to said DNA fragments to provide a binding site for a first end of said adaptor oligonucleotide, (c) a ligase for ligating a strand of said adaptor oligonucleotide to ends of said DNA fragment to form said DNA circle, (d) a primer for annealing to a region of the strand of said adaptor oligonucleotide, and (e) a DNA polymerase for extending the primer annealed to the strand in a rolling circle replication reaction.
  • the above adaptor oligonucleotide may have a second end having a number of degenerate bases in the range of from 4 to 12.
  • the above kit may further include reaction buffers for the terminal transferase, ligase, and DNA polymerase.
  • the invention includes a kit for circularizing DNA fragments using a Circligase enzyme (Epicentre Biotechnologies, Madison, Wis.), which kit comprises a volume exclusion polymer.
  • such kit further includes the following components: (a) reaction buffer for controlling pH and providing an optimized salt composition for Circligase, and (b) Circligase cofactors.
  • a reaction buffer for such kit comprises 0.5 M MOPS (pH 7.5), 0.1 M KCl, 50 mM MgCl2, and 10 mM DTT.
  • such kit includes Circligase, e.g. 10-100 μL Circligase solution (at 100 units/μL).
  • Exemplary volume exclusion polymers are disclosed in U.S. Pat. No. 4,886,741, which is incorporated by reference, and include polyethylene glycol, polyvinylpyrrolidone, dextran sulfate, and like polymers.
  • polyethylene glycol (PEG) is 50% PEG4000.
  • a kit for circle formation includes the following:
  • (amount, component, final concentration): Circligase 10X reaction buffer, 1x; 0.5 μL, 1 mM ATP, 25 μM; 0.5 μL, 50 mM MnCl2, 1.25 mM; 4 μL, 50% PEG4000, 10%; 2 μL, Circligase ssDNA ligase (100 units/μL), 10 units/μL; single stranded DNA template, 0.5-10 pmol/μL; sterile water.
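  • The listed final concentrations are consistent with simple dilution (final = stock × volume added / total volume) if the total reaction volume is taken to be 20 μL. That total is an assumption made only for this check; the text does not state it. A minimal sketch of the arithmetic:

```python
# Sketch: check stock -> final concentrations by simple dilution.
# ASSUMPTION: total reaction volume of 20 uL (not stated in the text).
TOTAL_UL = 20.0

def final_conc(stock, added_ul, total_ul=TOTAL_UL):
    """Final concentration, in the same unit as `stock`."""
    return stock * added_ul / total_ul

# name -> (stock concentration, volume added in uL), values from the list above
components = {
    "ATP (uM)": (1000.0, 0.5),            # 1 mM stock expressed in uM
    "MnCl2 (mM)": (50.0, 0.5),
    "PEG4000 (%)": (50.0, 4.0),
    "Circligase (units/uL)": (100.0, 2.0),
}

finals = {name: final_conc(s, v) for name, (s, v) in components.items()}
for name, value in finals.items():
    print(f"{name}: {value:g}")
```

Under the 20 μL assumption the computed values match the listed final concentrations (25 μM, 1.25 mM, 10%, 10 units/μL).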
  • Arrays and sequencing methods of the invention may be used for large-scale identification of polymorphisms using mismatch cleavage techniques.
  • Several approaches to mutation detection employ a heteroduplex in which the mismatch itself is utilized for cleavage recognition.
  • Chemical cleavage with piperidine at mismatches modified with hydroxylamine or osmium tetroxide provides one approach to release a cleaved fragment.
  • the enzymes T7 endonuclease I or T4 endonuclease VII have been used in the enzyme mismatch cleavage (EMC) techniques, e.g. Youil et al, Proc. Natl. Acad.
  • Cleavase is used in the cleavage fragments length polymorphism (CFLP) technique which has been commercialized by Third Wave Technologies. When single stranded DNA is allowed to fold and adopt a secondary structure the DNA will form internal hairpin loops at locations dependent upon the base sequence of the strand.
  • Cleavase will cut single stranded DNA five-prime of the loop and the fragments can then be separated by PAGE or similar size resolving techniques.
  • Mismatch binding proteins such as Mut S and Mut Y also rely upon the formation of heteroduplexes for their ability to identify mutation sites. Mismatches are usually repaired but the binding action of the enzymes can be used for the selection of fragments through a mobility shift in gel electrophoresis or by protection from exonucleases, e.g. Ellis et al (cited above).
  • Templates for heteroduplex formation are prepared by primer extension from genomic DNA. For the same genomic region of the reference DNA, an excess of the opposite strand is prepared in the same way as the test DNA but in a separate reaction. The test DNA strand produced is biotinylated and is attached to a streptavidin support. Homoduplex formation is prevented by heating and removal of the complementary strand. The reference preparation is now combined with the single stranded test preparation and annealed to produce heteroduplexes. This heteroduplex is likely to contain a number of mismatches.
  • Residual DNA is washed away before the addition of the mismatch endonuclease, which, if there is a mismatch every 1 kb would be expected to produce about 10 fragments for a 10 kb primer extension. After cleavage, each fragment can bind an adapter at each end and enter the mismatch-fragment circle selection process.
  • the 5-10 kb genomic fragments prepared from large genomic fragments as described above are biotinylated by the addition of a biotinylated dideoxy nucleotide at the 3-prime end with terminal transferase, and excess biotinylated nucleotide is removed by filtration.
  • a reference BAC clone that covers the same region of sequence is digested with the same six-base cutter to match the fragments generated from the test DNA.
  • the biotinylated genomic fragments are heat denatured in the presence of the BAC reference DNA and slowly annealed to generate biotinylated heteroduplexes.
  • the reference BAC DNA is in large excess to the genomic DNA so the majority of biotinylated products will be heteroduplexes.
  • the biotinylated DNA can then be attached to the surface for removal of the reference DNA. Residual DNA is washed away before the addition of the mismatch endonuclease. After cleavage, each fragment can bind an adapter at each end and enter the mismatch circle selection process as follows. See FIG. 20 .
  • heteroduplexes generated above can be used for selection of small DNA circles, as illustrated in FIGS. 7 and 8 .
  • heteroduplex ( 700 ) of a sample is treated with the mismatch enzyme to create products cleaved on both strands ( 704 and 706 ) surrounding the mutation site ( 702 ) to produce fragments ( 707 ) and ( 705 ).
  • T7 endonuclease I or similar enzyme cleaves 5-prime of the mutation site to reveal a 5-prime overhang of varying length on both strands surrounding the mutation.
  • the next phase is to capture the cleaved products in a form suitable for amplification and sequencing.
  • Adapter ( 710 ) is ligated to the overhang produced by the mismatch cutting (only fragment ( 705 ) shown), but because the nature of the overhang is unknown, at least three adapters are needed and each adapter is synthesized with degenerate bases to accommodate all possible ends.
  • the adapter can be prepared with an internal biotin ( 708 ) on the non-circularizing strand to allow capture for buffer exchange and sample cleanup, and also for direct amplification on the surface if desired.
  • because the intervening sequence between mutations does not need to be sequenced and would reduce the sequencing capacity of the system, it is removed when studying genomic-derived samples. Reduction of sequence complexity is accomplished by a type IIs enzyme that cuts the DNA at a point away from the enzyme recognition sequence. In doing so, the cut site and resultant overhangs will be a combination of all base variants. Enzymes that can be used include MmeI (20 bases with 2 base 3′ overhang) and EcoP15I (with 25 bases and 2 base 5′ overhang).
  • the adapter is about 50 bp in length to provide sequences for initiation of rolling circle amplification and also provide stiffer sequence for circle formation, as well as recognition site ( 715 ) for a type IIs restriction endonuclease. Once the adapter has been ligated to the fragment, the DNA is digested ( 720 ) with the type IIs restriction enzyme to release all but 20-25 bases of sequence containing the mutation site that remains attached to the adapter.
  • the adaptered DNA fragment is now attached to a streptavidin support for removal of excess fragment DNA. Excess adapter that did not ligate to mismatch cleaved ends will also bind to the streptavidin solid support.
  • the new degenerate end created by the type IIs enzyme can now be ligated to a second adapter through the phosphorylation of one strand of the second adapter. The other strand is non-phosphorylated and blocked at the 3-prime end with a dideoxy nucleotide.
  • the structure formed is essentially the genomic fragment of interest captured between two different adapters. To create a circle from this structure would simply require both ends of the molecule coming together and ligating, e.g.
  • If the second adapter has been attached, then it will be protected from digestion because there is no 5-prime phosphate available. If only the first adapter is attached to the surface, then the 5-prime phosphate is exposed for degradation of the lower strand of the adapter. This will lead to loss of excess first adapter from the surface.
  • the 5-prime end of the top strand of the first adapter is prepared for ligation to the 3-prime end of the second adapter.
  • This can be achieved by introducing a restriction enzyme site into the adapters so that re-circularization of the molecule can occur with ligation.
  • Amplification of DNA captured into the circular molecules proceeds by a rolling circle amplification to form long linear concatemer copies of the circle. If extension initiates 5-prime of the biotin, the circle and newly synthesized strand are released into solution.
  • Complementary oligonucleotides on the surface are responsible for condensation and provide sufficient attachment for downstream applications.
  • One strand is a closed circle and acts as the template. The other strand, with an exposed 3-prime end, acts as an initiating primer and is extended.
  • the adapter can be prepared with a 3-prime biotin ( 808 ) on the non-circularized strand to allow capture for buffer exchange and sample cleanup.
  • the adapter—DNA fragment can be attached to a streptavidin support for removal of excess fragment DNA.
  • a polynucleotide can be added to the 3-prime end with terminal transferase to create a sequence for one half of a bridge oligonucleotide ( 818 ) to hybridize to, shown as polyA tail ( 816 ). The other half will bind to sequences in the adapter.
  • an adapter can be added to the end generated by the 4-base cutter which will provide sequence for the bridge to hybridize to after removal of one strand by exonuclease.
  • a key aspect of this selection procedure is the ability to select the strand for circularization and amplification. This ensures that only the strand with the original mutation (from the 5-prime overhang) and not the strand from the adapter is amplified.
  • mismatch-derived small circular DNA molecules may be amplified by other means such as PCR. Common primer binding sites can be incorporated into the adapter sequences.
  • the amplified material can be used for mutation detection by methods such as Sanger sequencing or array based sequencing.
  • the first step of this procedure will involve ligating a gene specific oligonucleotide directed to the 5-prime end with a poly dA sequence for binding to the poly dT sequence of the 3-prime end of the cDNA.
  • This oligonucleotide acts as a bridge to allow T4 DNA ligase to ligate the two ends and form a circle.
  • the second step of the reaction is to use a primer, or the bridging oligonucleotide, for a strand displacing polymerase such as Phi 29 polymerase to create a concatemer of the circle.
  • the long linear molecules will then be diluted and arrayed in 1536 well plates such that wells with single molecules can be selected. To ensure that about 10% of the wells contain 1 molecule, approximately 90% would have to be sacrificed as having no molecules.
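  • The 10%/90% split is what limiting dilution predicts if the number of molecules per well is modeled as Poisson distributed. The Poisson model and the mean of 0.1 molecules per well are assumptions introduced for this sketch; they are not stated in the text.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k molecules in a well when the mean is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

WELLS = 1536   # plate format named in the text
lam = 0.1      # ASSUMED mean number of molecules per well

p_empty = poisson_pmf(0, lam)
p_single = poisson_pmf(1, lam)
p_multi = 1.0 - p_empty - p_single

print(f"empty wells:    {p_empty:.3f}  (~{p_empty * WELLS:.0f} of {WELLS})")
print(f"single wells:   {p_single:.3f}  (~{p_single * WELLS:.0f} of {WELLS})")
print(f"multiple wells: {p_multi:.4f}")
```

With this mean, roughly 90% of wells are empty, about 9% hold a single molecule, and fewer than 1% hold more than one, which is why most wells are sacrificed to keep multi-molecule wells rare.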
  • a dendrimer that recognizes a universal sequence in the target is hybridized to generate 10K-100K dye molecules per molecule of target. Excess dendrimer is removed through hybridization to biotinylated capture oligos. The wells are analyzed with a fluorescent plate reader and the presence of DNA scored. Positive wells are then re-arrayed to consolidate the clones into plates with complete wells for further amplification.
  • the process described is based on random DNA arrays and “smart” probe pools for the identification and quantification of expression levels of thousands of genes and their splice variants.
  • spliceosomes interact with splice sites on the primary transcript to excise out the introns, e.g. Maniatis et al, Nature, 418: 236-243 (2002).
  • mutations that alter the splice site sequences, or external factors that affect spliceosome interaction with splice sites, alternative splice sites, or cryptic splice sites could be selected resulting in expression of protein variants encoded by mRNA with different sets of exons.
  • methods of the invention permit large-scale measurement of splice variants with the following steps: (a) Prepare full length first strand cDNA for targeted or all mRNAs. (b) Circularize the generated full length (or all) first strand cDNA molecules by incorporating an adapter sequence. (c) Using a primer complementary to the adapter sequence, perform rolling circle replication (RCR) of cDNA circles to form concatemers with over 100 copies of initial cDNA.
  • This system provides a complete analysis of the exon pattern on a single transcript, instead of merely providing information on the ratios of exon usage or quantification of splicing events over the entire population of transcribed genes using the current expression arrays hybridized with labeled mRNA/cDNA. At the maximum limit of its sensitivity, it allows a detailed analysis down to a single molecule of an mRNA type present in only one in hundreds of other cells; this would provide unique potential for early diagnosis of cancer cells.
  • cDNA is prepared for those specific genes only.
  • gene-specific primers are used, therefore for 1000 genes, 1000 primers are used.
  • the location of the priming site for the reverse transcription is selected with care, since it is not reasonable to expect the synthesis of cDNA >2 kb to be of high efficiency. It is quite common that the last exon would consist of the end of the coding sequence and a long 3′ untranslated region.
  • the 3′ UTR comprises 3 kb, while the coding region is only 2.2 kb.
  • the logical location of the reverse transcription primer site is usually immediately downstream of the end of the coding sequence.
  • the alternative exons are often clustered together as a block to create a region of variability.
  • Tenascin C variants 8.5 kb
  • the most common isoform has a block of 8 extra exons, and there is evidence to suggest that there is variability in exon usage in that region. So for Tenascin C, the primer will be located just downstream of that region. Because of the concern of synthesizing cDNA with length >2 kb, for long genes, it might be necessary to divide the exons into blocks of 2 kb with multiple primers.
  • Reverse transcription reactions may be carried out with commercial systems, e.g. SuperScript III system from Invitrogen (Carlsbad, Calif.) and the StrataScript system from Stratagene (La Jolla, Calif.).
  • a single template oligonucleotide that is complementary to both the adaptor sequence and the universal tag can be used to ligate the adaptor to all the target molecules, without using the template oligonucleotide with degenerate bases.
  • the 3′ end of the cDNA (5′ end of the mRNA), which is usually ill-defined, may be treated like a random sequence end of a genomic fragment. Similar methods of adding a polyA tail will be applied, thus the same circle closing reaction may also be used.
  • Reverse transcriptases are prone to terminate prematurely to create truncated cDNAs. Severely truncated cDNAs probably will not have enough probe binding sites to be identified with a gene assignment, and thus would not be analyzed. cDNA molecules that are close to, but not quite, full-length may show up as splice variants with missing 5′ exons. If there is no corroborating evidence from a sequence database to support such variants, they may be discounted. A way to avoid such a problem is to select for only the full-length cDNA (or those with the desired 3′ end) to be compatible with the circle closing reaction, so that any truncated molecules will be neither circularized nor replicated.
  • a dideoxy-cytosine residue can be added to the 3′ end of all the cDNA to block ligation, then by using a mismatch oligo targeting the desired sequence, a new 3′ end can be generated by enzyme mismatch cleavage using T4 endonuclease VII. With the new 3′ end, the cDNA can proceed with the addition of a poly-dA tail and with the standard protocols of circularization and replication.
  • Analysis of replicated and arrayed concatemers of the exon fragments may be carried out using combinatorial SBH, as described above.
  • the algorithm of the following steps may be used to select 5-mer and 6-mer probes for use in the technique:
  • Step 1: Select 1000-2000 shortest exons (total about 20-50 kb), and find matching sequences for each of the 1024 available labeled 5-mers. On average each 5-mer will occur 20 times over 20 kb, but some may occur over 50 or over 100 times. By selecting the most frequent 5-mer, the largest number of short exons will be detected with a single labeled probe. A goal would be to detect about 50-100 short exons (10%-20% of 500 exons) per cycle. Thus fewer than 10 labeled probes and 50-100 unlabeled 6-mers would be sufficient. A small number of labeled probes is favorable because it minimizes overall fluorescent background.
  • Step 2: Find all 6-mers that are contiguous with all sites in all 1000 genes that are complementary to the 10 selected 5-mers. On average 20 such sites will exist in each 2 kb gene. The total number of sites would be about 20,000, i.e., each 6-mer on average will occur 5 times. Sort the 6-mers by hit frequency. The most frequent may have over 20 hits, e.g. such a 6-mer will detect 20 genes through combinations with the 10 labeled probes. Thus, to get a single probe pair for each of the 500 genes, a minimum of 25 6-mer probes would be required. Realistically, 100 to 200 6-mers may be required.
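  • The frequency-based selection in Step 1 can be sketched as follows. This is an illustration only, not the procedure of the invention: the greedy choice (always take the 5-mer that detects the most still-undetected exons) is one assumed reading of "selecting the most frequent 5-mer", the toy exon sequences and function names are hypothetical, and strand complementarity and the contiguous 6-mer partner of Step 2 are ignored.

```python
from collections import defaultdict

def kmers(seq, k):
    """Set of distinct k-mers occurring in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def pick_labeled_probes(exons, k=5, max_probes=10):
    """Greedily pick k-mers that each detect the most still-undetected exons."""
    hits = defaultdict(set)              # k-mer -> indices of exons containing it
    for idx, seq in enumerate(exons):
        for km in kmers(seq, k):
            hits[km].add(idx)
    uncovered = set(range(len(exons)))
    chosen = []
    while hits and uncovered and len(chosen) < max_probes:
        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = max(sorted(hits), key=lambda km: len(hits[km] & uncovered))
        gained = hits[best] & uncovered
        if not gained:
            break
        chosen.append((best, sorted(gained)))
        uncovered -= gained
    return chosen, sorted(uncovered)

# Expected occurrences of one specific 5-mer in 20 kb of random sequence
expected_hits = 20_000 / 4 ** 5          # about 19.5, i.e. "20 times over 20 kb"

# Hypothetical toy exons, for illustration only
toy_exons = ["ACGTACGTTT", "TTACGTAGGA", "GGGACGTACC", "CCCCCTTTTT"]
chosen, leftover = pick_labeled_probes(toy_exons)
print(chosen)
print(leftover)
```

In the toy data a single 5-mer detects three of the four exons, which mirrors the point of Step 1: a few frequent labeled probes can cover many short exons.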
  • probe pools are readily prepared with about 200 probes per pool using established pipetting robotics.
  • the information generated is equivalent to having over 3 probes per exon, therefore the use of 8000 5-mers and 6-mers effectively replaces the 30,000 longer exon-specific probes required for a single set of 1000 genes.
  • Exon profiling: The profiling of exons can be performed in two phases: the gene identification phase and the exon identification phase.
  • In the gene identification phase, each concatemer on the array can be uniquely identified with a particular gene.
  • 10 probe pools or hybridization cycles will be enough to identify 1000 genes using the following scheme.
  • Each gene is assigned a unique binary code. The number of binary digits thus depends on the total number of genes: 3 digits for 8 genes, 10 digits for 1024 genes.
  • Each probe pool is designed to correspond to a digit of the binary code and would contain probes that would hit a unique combination of half of the genes and one hit per gene only. Thus for each hybridization cycle, a unique half of the genes will score a 1 for that digit and the other half will score zero.
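  • The binary coding scheme can be sketched as follows. Gene names, the sequential code assignment, and the function names are hypothetical; the sketch only illustrates how one pool per binary digit identifies each gene after all cycles.

```python
def bits_needed(n_genes):
    """Number of binary digits needed to give each gene a unique code."""
    return max(1, (n_genes - 1).bit_length())

def build_pools(genes):
    """Assign each gene a binary code; pool d holds genes with a 1 at digit d."""
    n_bits = bits_needed(len(genes))
    codes = {g: format(i, f"0{n_bits}b") for i, g in enumerate(genes)}
    pools = [[g for g in genes if codes[g][d] == "1"] for d in range(n_bits)]
    return codes, pools

def decode(signals, codes):
    """Map per-cycle hybridization calls (True/False) back to a gene."""
    observed = "".join("1" if s else "0" for s in signals)
    by_code = {code: gene for gene, code in codes.items()}
    return by_code.get(observed)

genes = [f"gene{i}" for i in range(1024)]     # hypothetical gene names
codes, pools = build_pools(genes)

# A spot that lights up in exactly the cycles where gene37's code has a 1
spot_signals = [digit == "1" for digit in codes["gene37"]]
print(len(pools), decode(spot_signals, codes))
```

With exactly 1024 genes every pool contains 512 genes, i.e. a unique half per cycle; with 1000 genes ten pools still suffice but the halves are only approximately equal.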
  • Exon identification phase: After identifying each ampliot with a gene assignment, its exon pattern will be profiled in the exon identification phase. For the exon identification phase, one exon per gene in all or most of the genes is tested per hybridization cycle. In most cases 10-20 exon identification cycles should be sufficient. Thus, in the case of using 20 exon identification cycles we will obtain information of 2 probes per each of 10 exons in each gene. For genes with more than 20 exons, methods can be developed so that 2 exons per gene can be probed at the same cycle. One possibility is using multiple fluorophores of different colors, and another possibility is to exploit differential hybrid stabilities of different ligation probe pairs.
  • a total of about 40 assay cycles will provide sufficient information to obtain gene identity at each spot and to provide three matching probe-pairs for each of 10,000 exons with enough informational redundancy to provide accurate identification of missing exons due to alternative splicing or chromosomal deletions.
  • the DNase digested DNA (26 ng/ul) was reacted with Terminal deoxynucleotide transferase (0.66 U/ul) from New England Biolabs (NEB) in reaction buffer supplied by NEB.
  • the reaction contained dATP (2 mM) and was performed at 37 C for 30 min and then heat inactivated at 70 C for 10 min.
  • the DNA sample was then heated to 95 C for 5 min before rapid cooling on ice.
  • a synthetic DNA adapter was then ligated to the 5′ end of the genomic DNA by first forming a hybrid of a 65-base oligonucleotide (TATCATCTACTGCACTGACCGGATGTTAGGAAGACAAAAGGAAGCT GAGGGTCACATTAACGGAC)(SEQ ID NO: 8) with a second oligonucleotide (NNNNNNNGTCCGTTAA TGTGAC 3′ 2′3′ddC) (SEQ ID NO: 9) at the 3′ end of the 65mer in which the 7 “Ns” form an overhang.
  • the shorter oligo will act as a splint for ligation of the 65mer to the 5′ end of the genomic fragments.
  • T4 DNA Ligase (0.3 U/ul) was combined with genomic DNA (17 ng/ul) and adapter-splint (0.5 uM) in 1× ligase reaction buffer supplied by NEB. The ligation proceeded at 15 C for 30 min and 20 C for 30 min, and the ligase was then inactivated at 70 C for 10 min.
  • a second splint molecule (AGATGATATTTTTTTT 3′ 2′3′ddC) (SEQ ID NO: 10) (0.6 uM) was then added to the reaction and the mix was supplemented with more ligase buffer and T4 DNA ligase (0.3 U/ul). The reaction proceeded at 15 C for 30 min and then at 20 C for 30 min before inactivation for 10 min at 70 C.
  • the ligation mix was then treated with exonuclease I (NEB) (1 U/ul) at 37 C for 60 min, followed by inactivation at 80 C for 20 min.
  • Rolling circle replication was performed in reaction buffer supplied by NEB with BSA (0.1 ug/ul), 0.2 mM each dNTP, an initiating primer (TCAGCTTCCTTTTGTCTTCCTAAC) (SEQ ID NO: 11) at 2 fmol/ul, exonuclease treated ligation of genomic DNA at 24 pg/ul, and Phi 29 polymerase (0.2 U/ul).
  • the reaction was performed for 1 hr at 30 C and then heat inactivated at 70 C for 10 min.
  • RCR reaction products were attached to the surface of cover slips by first attaching amine modified oligonucleotides to the surface of the cover slips.
  • a capture probe ([AMINOC6][SP-C18][SP-C18]GGATGTTAGGAAGACAAAAGGAAGCTGAGG) (SEQ ID NO: 12) (50 uM) was added to the DITC derivatized cover slips in 0.1 uM NaHCO3 and allowed to dry at 40 C for about 30 min.
  • the cover slips were rinsed in DDI water for 15 min and dried.
  • RCR reaction products (4.5 ul) were then combined with 0.5 ul of 20× SSPE and added to the center of the slide.
  • RCR reaction products were formed from a single stranded 80mer synthetic DNA target (N N NGCATANCACGANGTCATNATCGTNCAAACGTCAGTCCANGAATC NAGATCCACTTAGANTAAAAAAAAAAAA) (SEQ ID NO: 13) as above but without poly A addition with TDT.
  • the RCR reaction contained target molecules at an estimated 12.6 fmol/ul.
  • Reaction products (5 ul) were combined with SSPE (2×) and SDS (0.3%) in a total reaction volume of 20 ul.
  • the sample was applied to a cover-slip in which lines of capture probe ([AMINOC6][SP-C18][SP-C18]GGATGTTAGGAAGACAAAAGGAAGCTGAGG), deposited in a solution of 50 uM with 0.1 uM NaHCO3, were dried onto the surface and left in a humid chamber for 30 min. The solution was then washed off in 3× SSPE for 10 min and then briefly in water. Various reaction components were tested for their effect upon RCR product formation. The addition of Phi 29 to the RCR reaction at a final concentration of 0.1 U/ul rather than 0.2 U/ul was found to create a greater proportion of RCR products that were of larger intensity after detection probe hybridization.
  • initiating primer at a 10- to 100-fold molar ratio relative to estimated target concentration was also found to be optimal. Increased extension times produced more intense fluorescent signals but tended to produce more diffuse concatemers. With the current attachment protocols a 2 hr extension time produced enhanced signals relative to a 1 hr incubation with minimal detrimental impact upon RCR product morphology.
  • RCR products may be attached by method 2.
  • Attachment method 1. RCR reaction products (4.5 ul) were combined with 0.5 ul of 20× SSPE and added to the center of the slide. The sample was allowed to air dry and non-attached material was washed off for 10 min in 3× SSPE and then briefly in DDI water. The slide was then dried before assembly on the microscope. Attached RCR products were visualized by hybridizing an 11mer TAMRA labeled probe that is complementary to a region of the adapter. Attachment method 2. RCR reaction products (1 ul) were combined with 50 ul of 3× SSPE and added to the center of the cover slip with capture probe attached. Addition of SDS (0.3%) was found to promote specific attachment to the capture probes and not to the derivatized surface.
  • PCR products from diagnostic regions of Bacillus anthracis and Yersinia pestis were converted into single stranded DNA and attached to a universal adaptor. These two samples were then mixed and replicated together using RCR and deposited onto a glass surface as a random array. Successive hybridization with amplicon specific probes showed that each spot on the array corresponded uniquely to either one of the two sequences and that they can be identified specifically with the probes, as illustrated in FIG. 4 . This result demonstrates sensitivity and specificity of identifying DNA present in submicron sized DNA concatemers having about 100-1000 copies of a DNA fragment generated by the RCR reaction. A 155 bp amplicon sequence from B. anthracis and a 275 bp amplicon sequence from Y. pestis were amplified using standard PCR techniques with PCR primers in which one primer of the pair was phosphorylated.
  • a single stranded form of the PCR products was generated by degradation of the phosphorylated strand using lambda exonuclease.
  • the 5′ end of the remaining strand was then phosphorylated using T4 DNA polynucleotide kinase to allow ligation of the single stranded product to the universal adaptor.
  • the universal adaptor was ligated using T4 DNA ligase to the 5′ end of the target molecule, assisted by a template oligonucleotide complementary to the 5′ end of the targets and 3′ end of the universal adaptor.
  • the adaptor ligated targets were then circularized using bridging oligonucleotides with bases complementary to the adaptor and to the 3′ end of the targets.
  • Linear DNA molecules were removed by treating with exonuclease I.
  • RCR products were generated by mixing the single-stranded samples and using Phi29 polymerase to replicate around the circularized adaptor-target molecules with the bridging oligonucleotides as the initiating primers.
  • cover slips for attaching amine-modified oligonucleotides were first cleaned in a potassium/ethanol solution followed by rinsing and drying. They were then treated with a solution of 3-aminopropyldimethylethoxysilane, acetone, and water for 45 minutes and cured in an oven at 100° C. for 1 hour. As a final step, the cover slips were treated with a solution of p-phenylenediisothiocyanate (PDC), pyridine, and dimethylformamide for 2 hours.
  • the capture oligonucleotide (sequence 5′-GGATGTTAGGAAGACAAAAGGAAGCTGAGG-3′) (SEQ ID NO: 14) is complementary to the universal adaptor sequence and is modified at the 5′ end with an amine group and 2 C-18 linkers.
  • 10 μl of the capture oligo at 10 μM in 0.1 M NaHCO3 was spotted onto the center of the derivatized cover slip, dried for 10 minutes in a 70° C. oven and rinsed with water.
  • the RCR reaction containing the DNA concatemers was diluted 10-fold with 3× SSPE, 20 μl of which was then deposited over the immobilized capture oligonucleotides on the cover slip surface for 30 minutes in a moisture saturated chamber.
  • the cover slip with the DNA concatemers was then assembled into a reaction chamber and was rinsed with 2 ml of 3× SSPE.
  • pestis PCR amplicons were probed sequentially with TAMRA-labeled oligomer: probe BrPrb3 (sequence: 5′-CATTAACGGAC-3′ (SEQ ID NO: 15), specifically complementary to the universal adaptor sequence), probe Ba3 (sequence: 5′-TGAGCGATTCG-3′ (SEQ ID NO: 16), specifically complementary to the Ba3 amplicon sequence), probe Yp3 (sequence: 5′-GGTGTCATGGA-3′, specifically complementary to the Yp3 amplicon sequence).
  • the probes were hybridized to the array at a concentration of 0.1 μM for 20 min in 3× SSPE at room temperature. Excess probes were washed off with 2 ml of 3× SSPE. Images were taken with the TIRF microscope. The probes were then stripped off with 1 ml of 3× SSPE at 80° C. for 5 minutes to prepare the arrayed target molecules for the next round of hybridization.
  • Example 4 Decoding a Base Position in Arrayed Concatemers Created From a Synthetic 80-Mer Oligonucleotide Containing a Degenerated Base
  • Individual molecules of a synthetic oligonucleotide containing a degenerate base can be divided into 4 sub-populations, each may have either an A, C, G or T base at that particular position.
  • An array of concatemers created from this synthetic DNA may have about 25% of spots with each of the bases.
  • Successful identification of these sub-populations of concatemers was demonstrated by four successive hybridization and ligation of pairs of probes, specific to each of the 4 bases, as shown in FIG. 5 .
  • a 5′ phosphorylated, 3′ TAMRA-labeled pentamer oligonucleotide was paired with one of the four hexamer oligonucleotides.
  • Each of these 4 ligation probe pairs should hybridize to either an A, C, G or T containing version of the target.
  • Discrimination scores of greater than 3 were obtained for most targets, demonstrating the ability to identify single base differences between the nanoball targets.
  • the discrimination score is the highest spot score divided by the average of the other 3 base-specific signals of the same spot.
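  • The discrimination score defined above can be computed per spot as sketched below. The intensity values are hypothetical and serve only to illustrate the calculation.

```python
def discrimination_score(signals):
    """Return (called base, score) for one spot.

    `signals` maps each base to its base-specific signal for the spot.
    The score is the highest signal divided by the mean of the other three.
    """
    called = max(signals, key=signals.get)
    others = [value for base, value in signals.items() if base != called]
    return called, signals[called] / (sum(others) / len(others))

# Hypothetical intensities for a single spot, one value per ligation probe pair
spot = {"A": 120.0, "C": 900.0, "G": 150.0, "T": 90.0}
base, score = discrimination_score(spot)
print(base, score)
```

A spot like this one would be called as C, and its score is well above the threshold of 3 mentioned above.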
  • the results further demonstrate the ability to determine partial or complete sequences of DNA present in concatemers by increasing the number of consecutive probe cycles or by using 4 or more probes labeled with different dyes per each cycle.
  • Synthetic oligonucleotide (T1A: 5′-GCATANCACGANGTCATNATCGTNCAAACGTCAGTCCANGAATCNAGATCCACTTAGANTAAAAAA AAAAAA-3′) (SEQ ID NO: 13) contains at position 32 a degenerate base. Universal adaptor was ligated to this oligonucleotide and the adaptor-T1A DNA was circularized as described before. DNA concatemers made using the rolling circle replication (RCR) reaction on this target were arrayed onto the random array.
  • DNA in a particular arrayed spot would contain either an A, or a C, or a G, or a T at positions corresponding to position 32 of T1A.
  • a set of 4 ligation probes specific to each of the 4 bases was used.
  • a 5′ phosphorylated, 3′ TAMRA-labeled pentamer oligonucleotide corresponding to position 33-37 of T1A with sequence CAAAC (probe T1A9b) was paired with one of the following hexamer oligonucleotides corresponding to position 27-32: ACTGTA (probe T1A9a), ACTGTC (probe T1A10a), ACTGTG (probe T1A11a), ACTGTT (probe T1A12a).
  • the probes were incubated with the array in a ligation/hybridization buffer containing T4 DNA ligase at 20° C. for 5 minutes. Excess probes were washed off at 20° C. and images were taken with a TIRF microscope. Bound probes were stripped to prepare for the next round of hybridization.
  • An adaptor specific probe (BrPrb3) was hybridized to the array to establish the positions of all the spots.
  • the same synthetic oligonucleotide described above contains 8 degenerate bases at the 5′ end to simulate random genomic DNA ends.
  • the concatemers created from this oligonucleotide may have these 8 degenerate bases placed directly next to the adaptor sequence.
  • a 12-mer oligonucleotide (UKO-12 sequence 5′-ACATTAACGGAC-3′) (SEQ ID NO: 17) with a specific sequence to hybridize to the 3′ end of the adaptor sequence was used as the anchor, and a set of 16 TAMRA-labeled oligonucleotides in the form of BBNNNNNN were used as the sequence-reading probes.
  • spots on the concatemer array created from targets that specifically bind to one of these 4 probes could be identified, with an average full match/mismatch ratio of over 20, as shown in FIG. 6 .
  • the nucleic acid hybridization process is used widely for characterization of a DNA/RNA sample.
  • Antibodies or other proteins or compounds are used in various binding assays for characterization of protein samples.
  • arrays of gene/genomic fragments or synthetic oligonucleotides are prepared in various ways.
  • individual fragments are usually prepared in separate tubes/wells and then deposited on the substrate. This process is too laborious for preparing large numbers of samples (e.g. close to or more than one million) and/or does not allow preparation of an array of small, high density spots, especially below 10 micrometer dot size.
  • DNA/RNA and their derivatives or peptides or protein and other array products including processes for their preparation and uses, that are based on applying mixtures of detecting molecules of partially or fully known primary structure or polymer sequence, preferably as concatemers of the same molecule, on substrates with a pattern of high density small binding sites separated by non-binding surface, followed by determining which detecting molecule from the mixture is attached at which binding site.
  • the saDNA platform utilizes attached nano-balls of concatenated DNA/RNA as detecting molecules (DMs) for hybridization to a solution phase, labeled DNA or RNA target. Since no specific DMs are attached to specific binding sites on the substrate they must first pass through a full or partial sequencing, re-sequencing or signature identification.
  • High density DNA nano-ball probe arrays are prepared from source nucleic acids (NA) that can be derived from
  • the source NA may originate from one species or from multiple species.
  • it is preferable to have all of the DNA probe segments of the sample in a similar number of copies to avoid over-representation of individual sequences in the array.
  • DNA from multiple individuals of one species may be mixed to get the best representation of every part of the genome.
  • Some important or control DNA probe segments may be intentionally added in higher or lower amount than other fragments. Too many DNA probes having the same, or significantly overlapped DNA, may reduce the sensitivity of detection by competing for target DNA in solution.
  • DNA for probe generation may be fragmented to the preferred length of 30-100 bases, although sizes of about 10-2000 bases in length may also be used, and longer DNA may provide better sensitivity. For example, twenty labeled target DNA fragments of 100 bases in length can hybridize to one 2000 base attached DNA probe template thereby increasing the label density per probe site.
  • the preferred DNA length may be selected by various separation methods including size exclusion matrices or gel electrophoresis.
  • DNA for probe arrays may also be generated from synthetic DNA that has all sequence variants within a given length of eight to twenty bases.
  • the short probes will create a universal chip for DNA sequencing by representing all possible sequences of 8 to 20 bases within the array.
  • Selection of 10,000 to 1 million or more specific genomic DNA fragments 20-2000 (preferably 100-1000) bases in length may be performed for preparing sequence-specific DNA nano-ball probe arrays.
  • a large number of specific primers could be synthesized and used individually or in pools for selecting subsets of genomic DNA by primer extension or PCR.
  • Another option is to make a universal library of all 6-mers or 7-mers with and without 5 to 10 degenerate bases at the 5′ end and a universal tail further 5-prime of the degenerate bases.
  • BBBBBBB and U20N5-10BBBBBBB where B represents defined bases in the synthesis, U represents a sequence present in all primers and N represents degenerate bases in the synthesis).
  • primers can be used directly to amplify selected DNA segments in viral or bacterial genomes, in one to three consecutive amplification steps, and with the possibility of using nested pairs of primers. Ligation of two 6-mers or two 7-mers (or 6-mer+7-mer) may generate a more specific primer that can also be used for genomes of higher complexity, including human. Several pairs of primers could be created in one reaction tube using selected 7-mer templates from a library of all 7-mers. Because there is no need to produce a large quantity of DNA, 14-mers with a universal primer tail may be sufficient. Nested 14-mer primers may also be used to assure amplification of the region of interest.
  • Preparation of concatenated detector molecules (DNA nano-balls) requires the formation of circular DNA molecules.
  • DNA is initially heat denatured and one end is ligated to an adapter.
  • the second end of the probe template is ligated to the free end of the adapter to complete the circle formation.
  • the adapter may include short palindromic sequences (e.g. ATCGATCGAT) to induce intra-molecular hybrid formation between adapter replicas, e.g. -ATCGATCGAT-TAGCTAGCTA-, and compaction of the concatemer.
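  • As a minimal sketch of what "palindromic" means for the adapter motif, the following Python snippet checks whether a sequence equals its own reverse complement, which is the property that lets replicas of the motif on the same strand pair with each other. The helper names are illustrative and not part of the described method.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq):
    """True if the sequence reads the same as its reverse complement."""
    return seq.upper() == reverse_complement(seq)


adapter_motif = "ATCGATCGAT"  # example motif given in the text
print(reverse_complement(adapter_motif))
print(is_palindromic(adapter_motif))
```

  • A non-palindromic sequence such as the hexamer ACTGTA fails the same check, so it would not self-pair between adapter replicas in this way.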
  • the RCR reaction may result in products of varying length so removal of small nano-balls may be important for good quantification of target in the hybridization assay.
  • Selection of small DNA nano-balls could occur by size exclusion methodologies or complementary concatenated blocker molecules that will hybridize to all adapter molecules in short molecules. Longer molecules will have excess adapter molecules that can be hybridized to a capture molecule on a solid support, whereas shorter molecules will be blocked from binding to the support.
  • a continuous amplification of selected DNA fragments may be performed by cutting concatemers by hybridizing an oligonucleotide to the adapter region that generates a restriction enzyme site. New circles are formed by ligation of the free ends and a second RCR reaction is performed.
  • One-billion fold amplification of the original DNA is possible to achieve in three to four rounds of RCR (1) and would provide enough DNA for making millions of arrays.
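  • A short calculation shows what per-round amplification the one-billion-fold figure implies. This sketch assumes every round gives the same fold increase, which is a simplification; the function name is hypothetical.

```python
def fold_per_round(total_fold, rounds):
    """Per-round amplification needed to reach total_fold in equal rounds."""
    return total_fold ** (1.0 / rounds)


for rounds in (3, 4):
    print(rounds, round(fold_per_round(1e9, rounds)))
```

  • Under this assumption, three rounds need roughly 1000-fold amplification each and four rounds need a little under 200-fold each, both within the 100-1000 fold range this document gives for a single linear RCR amplification.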
  • coli genomic DNA at an estimated concentration of less than 150 nano-balls/picoliter of RCR reaction which assumes maximal efficiency of circle DNA formation and polymerase extension. This approach of attachment will help to minimize the binding and association of two nano-balls with complementary DNA because of no surface mixing of millions of nano-balls over already attached nano-balls.
  • it may also be desirable to perform additional in situ DNA amplification, which requires cutting the attached concatemerized DNA, recircularization (preferably by using a different adapter DNA) and RCR. This could be achieved with two different capture probes present at the oligonucleotide attachment site such that DNA concatenated with both adapters can be captured at the site.
  • Another method for in situ amplification is to use capture oligonucleotides as primers for a strand displacing polymerase. These methods could achieve 10,000 to 100,000 or more copies per attachment site.
  • 200 to 300 bacterial species will have about 10⁹ bases of DNA, so 10⁷ × 100-base-long fragments will cover more than 50% of bases, with occasional rare gaps longer than 1 kb appearing.
  • arrays for 10 meaningful groups of 20-30 bacterial species and all different isolates are more than sufficient for screening 10 specific human or other samples (e.g. blood, urine, saliva, skin), each on a specific array.
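  • The "more than 50% of bases" estimate can be checked with a standard Poisson coverage approximation. This is a sketch under the assumption, not stated in the text, that fragments start at uniformly random positions; the function name is illustrative.

```python
import math


def expected_fraction_covered(total_bases, n_fragments, fragment_len):
    """Expected fraction of bases hit by at least one random fragment.

    Assumes fragment start positions are uniform and independent, so the
    number of fragments covering a base is approximately Poisson with mean
    equal to the average depth.
    """
    depth = n_fragments * fragment_len / total_bases
    return 1.0 - math.exp(-depth)


# 10^9 bases of bacterial DNA represented by 10^7 fragments of 100 bases.
fraction = expected_fraction_covered(1e9, 1e7, 100)
print(f"{fraction:.3f}")
```

  • With an average depth of one, this model gives about 63% of bases covered, consistent with the statement above.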
  • About 10⁸ bases of human coding DNA may be represented by 10 DNA nano-balls having 50-200 base fragments. All long exons and almost every short exon will be represented with at least one fragment and every gene will be represented by about 30-3000 fragments.
  • each attached DNA nano-ball may be determined by a “signature” approach. About 50 to 100 or possibly 200 probes will be used such that about 25-50% or in some applications 10-30% of attached nano-balls will have a full match sequence for each probe. This type of data will allow each amplified DNA fragment within the nano-ball to be mapped to the reference sequence.
  • One example of this process would be to score 64 4-mers (i.e. 25% of all possible 256 4-mers) using 16 hybridization/stripoff cycles in a 4 colors labeling schema. On a 60-70 base fragment amplified in the DNA nano-ball about 16 of 64 probes will be positive since there are 64 possible 4mers present in a 64 base long sequence (ie one quarter of all possible 4mers).
  • Unrelated 60-70 base fragments will have a very different set of about 16 positive decoding probes.
  • a combination of 16 probes out of 64 probes has a random chance of occurrence in 1 of every one billion fragments which practically provides a unique signature for that nano-ball.
  • Scoring 80 probes in 20 cycles and generating 20 positive probes would create an even more unique signature: occurrence by chance is 1 in billions.
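  • The signature arithmetic above can be reproduced with a small Python sketch. It uses an idealized model that is an assumption of this example: every window of a fragment is treated as a distinct 4-mer, and every subset of positive probes is treated as equally likely. Function names are hypothetical.

```python
import math


def expected_positive_probes(n_probes, fragment_kmers, k=4):
    """Expected number of scored probes with a full match in a fragment.

    Idealized: each k-mer in the fragment is assumed distinct, so the
    fragment holds fragment_kmers / 4**k of all possible k-mers.
    """
    return n_probes * fragment_kmers / 4 ** k


def signature_count(n_probes, n_positive):
    """Number of distinct sets of n_positive probes out of n_probes."""
    return math.comb(n_probes, n_positive)


print(expected_positive_probes(64, 64))
for n_probes, n_positive in [(64, 16), (80, 20)]:
    print(n_probes, n_positive, f"{signature_count(n_probes, n_positive):.2e}")
```

  • In this simplified model the number of possible 16-of-64 signatures is far larger than one billion, and the 20-of-80 scheme gives more still, so the chance figures quoted above are conservative relative to the raw subset count.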
  • a “signature” approach was used to select novel genes from cDNA libraries (3)
  • An implementation of a signature approach is to sort obtained intensities of all tested probes and select up to a predefined (expected) number of probes that satisfy the positive probe threshold.
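  • The selection step just described might look like the following sketch: intensities are sorted, probes above a positive threshold are kept, and the list is capped at the expected number of positives. The probe names, threshold and intensities are made up for illustration.

```python
def call_signature(intensities, threshold, max_positive):
    """Return the set of probes called positive for one nano-ball.

    intensities maps probe name to measured signal. Probes are ranked by
    signal, only those at or above threshold are kept, and at most
    max_positive (the expected number of positives) are returned.
    """
    ranked = sorted(intensities.items(), key=lambda item: item[1], reverse=True)
    positives = [probe for probe, value in ranked if value >= threshold]
    return frozenset(positives[:max_positive])


# Hypothetical signals for five decoding probes on one spot.
spot = {"ACGT": 910.0, "TTGA": 45.0, "GGCA": 780.0, "CATG": 130.0, "AATC": 620.0}
print(sorted(call_signature(spot, threshold=500.0, max_positive=2)))
```

  • Returning a frozenset makes the signature directly usable as a dictionary key when matching spots against signatures predicted from a reference sequence.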
  • N(5-7)BBB may be prepared in 4 different syntheses.
  • decoding probes consist of a pool of 4 oligonucleotides in which each oligonucleotide consists of 4-9 degenerate bases and 1 defined base. This oligonucleotide will also be labeled with one of four fluorescent labels. Each of the 4 possible bases A, C, G, or T will therefore be represented by a fluorescent dye.
  • Six or more bases can be sequenced with additional probe pools.
  • the 6mer oligonucleotide can be positioned further into the 12mer sequence. This will necessitate the incorporation of degenerate bases into the 3-prime end of the non-labeled oligonucleotide to accommodate the shift. This is an example of decoding probes for position 6 and 7 in the 12-mer.
  • the 6 bases from the right side of the 12mer can be decoded by using a fixed oligonucleotide and 5-prime labeled probes.
  • 6 cycles are required to define 6 bases of one side of the 12mer. With redundant cycle analysis of bases distant to the ligation site this may increase to 7 or 8 cycles. In total then, complete sequencing of the 12mer could be accomplished with 12-16 cycles of ligation.
  • one set has probes of the general type N3$134-6 (anchors) that are ligated with the first 2 or 3 or 4 probes/probe pools from the set BN6-8, NBN5-7, N2BN4-6, and N3BN3-5.
  • the process is:
  • Anchors may be also tagged to determine which anchor from the pool is hybridized to a spot.
  • Tags, as additional DNA segments, may be used for adjustable displacement as a detection method. For example, EEEEEEEENNNAAAAA and FFFFFFFFNNNCCCCC probes can, after hybridization or hybridization and ligation, be differentially removed with two corresponding displacers: EEEEEEEENNNNN and FFFFFFNNNNNNNN, where the second is more efficient. Separate cycles may be used just to determine which anchor is positive. For this purpose anchors labeled or tagged with multiple colors may be ligated to unlabeled N7-N10 supporter oligonucleotides.
  • Another benefit of having many different tags even if they are consecutively decoded (or 2-16 at a time labeled with 2-16 distinct colors) is the ability to use a large number of individually recognizable probes in one assay reaction. This way a 4-64 times longer assay time (that may provide more specific or stronger signal) may be affordable if the probes are decoded in short incubation and removal reactions.
  • a key component of successful array production is having a cost-effective methodology for decoding each array.
  • Decoding arrays during production simplifies assays for the end user.
  • Our decoding methodology includes a fast, automated imaging and assay platform designed specifically to optimize this task.
  • patterned array substrates are produced to match the standard 96 or 384 well plate format.
  • Our production format will be an 8 × 12 pattern of 6 mm × 6 mm arrays at 9 mm pitch or 16 × 24 of 3.33 mm ×
  • each 6 mm × 6 mm array consists of 36 million 250-500 nm square activated regions at 1 micrometer pitch. Throughout the production process, our arrays will be manipulated in this array of arrays format.
  • the rate limiting step for the production process may be array decoding. While arrays can be printed and hybridized at an astonishing rate through the use of processes derived from the semiconductor industry, they must be decoded at the rate of image acquisition.
  • the decoding process described in other sections of this document, will require the use of 48-96 or more decoding probes. These pools will be further combined into 12-24 or more pools by encoding them with four fluorophores, each having different emission spectra. Additional tagging may be used as described in the biochemistry of decoding.
  • each 6 mm × 6 mm array may require roughly 30 images for full coverage using a 10 megapixel camera. Each 1 micrometer array area will be read by about 8 pixels.
  • Our prior experience suggests that each image could be acquired in 250 milliseconds, 150 ms for exposure and 100 ms to move the stage. Using this fast acquisition it will take ~7.5 seconds to image each array, or 12 minutes to image the complete set of 96 arrays on each substrate.
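  • The imaging-time figures above follow from simple arithmetic, sketched here in Python. The tile count is an idealized estimate from the pixel budget alone (no overlap between neighboring images is assumed), and the function names are illustrative.

```python
import math


def images_per_array(side_mm=6.0, pitch_um=1.0, pixels_per_spot=8,
                     camera_pixels=10_000_000):
    """Images needed to cover one square array, ignoring tile overlap."""
    spots_per_side = side_mm * 1000.0 / pitch_um
    spots = spots_per_side ** 2  # 36 million for a 6 mm array at 1 um pitch
    return math.ceil(spots * pixels_per_spot / camera_pixels)


def imaging_seconds(n_images, exposure_s=0.150, stage_move_s=0.100):
    """Time to acquire n_images at one exposure plus one stage move each."""
    return n_images * (exposure_s + stage_move_s)


tiles = images_per_array()            # estimate from pixel count alone
per_array_s = imaging_seconds(30)     # the text's rounded figure of 30 images
per_substrate_min = per_array_s * 96 / 60.0
print(tiles, round(per_array_s, 2), round(per_substrate_min, 1))
```

  • The pixel-budget estimate lands just under the "roughly 30 images" quoted above, and 30 images at 250 ms each reproduce the 7.5 seconds per array and 12 minutes per 96-array substrate.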
  • the autofocus routine which takes extra time, will run only if an acquired image is out of focus. It will then store the Z axis position information to be used upon return to that section of that array during the next imaging cycle. By mapping the autofocus position for each location on the substrate we will drastically reduce the time required for image acquisition.
  • Each array will require about 12-24 cycles to decode.
  • Each cycle consists of a hybridization, wash, array imaging, and strip-off step. These steps, in their respective order, may take for the above example 5, 2, 12, and 5 minutes each, for a total of 24 minutes each cycle, or roughly 5-10 hours for each array, if the operations were performed linearly.
  • the time to decode each array can be reduced by a factor of two by allowing the system to image constantly. To accomplish this, we will stagger the imaging of two separate substrates on each microscope. While one substrate is being reacted, the other substrate will be imaged.
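  • The cycle timing and the factor-of-two gain from staggering can be sketched as follows. The model assumes two substrates alternate on one microscope, so that the effective time per substrate per cycle is set by the longer of the imaging step and the combined chemistry steps; the function names are hypothetical.

```python
def minutes_per_cycle(hybridize=5, wash=2, image=12, strip=5, staggered=False):
    """Effective microscope minutes per substrate for one decoding cycle."""
    chemistry = hybridize + wash + strip
    if staggered:
        # One substrate is imaged while the other is being reacted, so
        # throughput is limited by whichever phase takes longer.
        return max(image, chemistry)
    return chemistry + image


def decode_hours(n_cycles, staggered=False):
    """Total effective hours per substrate for n_cycles decoding cycles."""
    return n_cycles * minutes_per_cycle(staggered=staggered) / 60.0


for n_cycles in (12, 24):
    print(n_cycles, decode_hours(n_cycles), decode_hours(n_cycles, staggered=True))
```

  • With the example step times, imaging and chemistry each take 12 minutes, which is why interleaving two substrates keeps the microscope busy continuously and halves the effective decode time.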
  • Image each array preferably with a mid power (20×) microscope objective optically coupled to a high pixel count, high sensitivity CCD camera, or cameras.
  • Plate stage moves chambers (or perhaps flow-cells with input funnels) over object, or objective-optics assembly moves under chamber.
  • Certain optical arrangements, using di-chroic mirrors/beam-splitters can be employed to collect multi-spectral images simultaneously, thus decreasing image acquisition time.
  • Arrays can be imaged in sections or whole, depending on array/image size/pixel density.
  • Sections can be assembled by aligning images using statistically significant empty regions pre-coded onto substrate (during active site creation) or can be made using a multi step nano-printing technique, for example sites (grid of activated sites) can be printed using specific capture probe, leaving empty regions in the grid. Then print a different pattern or capture probe in that region using separate print head.
  • Drain chamber and replace with probe strip buffer then heat chamber to probe stripoff temperature (60-90° C.).
  • High pH buffer may be used in the strip-off step to reduce stripoff temperature. Wait for the specified time.
  • the substrate may be sectioned off and divided into strips to accommodate fluid flow/capillary effects caused by sandwiching.
  • the substrate may be made of thicker glass to resist flexing in the chamber, reducing reliance on autofocus.
  • the substrate may be housed in an “open air”/“open face” chamber to promote even flow of the buffers over the substrate by eliminating capillary flow effects.
  • During the imaging process, the substrate must remain in focus. Some key factors in maintaining focus are the flatness of the substrate, orthogonality of the substrate to the focus plane, and mechanical forces on the substrate that may deform it. Substrate flatness can be well controlled; glass plates which have better than ¼ wave flatness are readily obtained. Uneven mechanical forces on the substrate can be minimized through proper design of the hybridization chamber. Orthogonality to the focus plane can be achieved by a well adjusted, high precision stage. Even when all these issues are addressed, it is likely that some auto focus methodology will have to be used during substrate imaging. Auto focus routines generally take additional time to run, so it is desirable to run them only if necessary. After each image is acquired, it will be analyzed using a fast algorithm to determine if the image is in focus.
  • the auto focus routine will run. It will then store the objective's Z position information to be used upon return to that section of that array during the next imaging cycle. By mapping the objective's Z position at various locations on the substrate, we will reduce the time required for substrate image acquisition.
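  • The focus-mapping strategy described above can be sketched as a small cache of Z positions keyed by image location. Everything here is hypothetical: the class name, the callbacks standing in for the camera, focus check and autofocus routine, and the simulated stage values are assumptions made for illustration only.

```python
class FocusMap:
    """Remember the objective Z position for each imaged tile."""

    def __init__(self, is_in_focus, run_autofocus):
        self._is_in_focus = is_in_focus      # fast focus check on an image
        self._run_autofocus = run_autofocus  # slow routine returning a new Z
        self._z_by_tile = {}
        self.autofocus_runs = 0

    def acquire(self, tile, capture, default_z=0.0):
        """Image a tile at its stored Z; autofocus only if out of focus."""
        z = self._z_by_tile.get(tile, default_z)
        image = capture(tile, z)
        if not self._is_in_focus(image):
            z = self._run_autofocus(tile)
            self.autofocus_runs += 1
            image = capture(tile, z)
        self._z_by_tile[tile] = z
        return image


# Simulated hardware: an "image" is just the focus error in micrometers.
true_z = {(0, 0): 0.0, (0, 1): 1.2, (1, 0): -0.8}
capture = lambda tile, z: abs(z - true_z[tile])
focus_map = FocusMap(is_in_focus=lambda err: err < 0.5,
                     run_autofocus=lambda tile: true_z[tile])

for cycle in range(2):
    for tile in true_z:
        focus_map.acquire(tile, capture)
    print(cycle, focus_map.autofocus_runs)
```

  • In this simulation the slow routine runs only during the first pass over the tiles that start out of focus; the second imaging cycle reuses the stored positions and triggers no further autofocus runs.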
  • the current system uses a Zeiss TIRF slider coupled to an 80 milliwatt 532 nm solid state laser.
  • the slider illuminates the substrate through the objective at the correct TIRF illumination angle.
  • TIRF can also be accomplished without the use of the objective by illuminating the substrate through a prism optically coupled to the substrate.
  • Planar wave guides can also be used to implement TIRF on the substrate. Epi illumination can also be employed.
  • the light source can be rastered, spread beam, coherent, incoherent, and originate from a single or multi-spectrum source.
  • Our current microscope can do standard epi illumination on the entire plate substrate.
  • Our current system successfully detects hybridization on DNA nano-balls with both TIRF and epi fluorescence.
  • a preferred embodiment for the imaging system will contain a 20× lens with a 1.25 mm field of view, with detection being accomplished with a 10 megapixel camera. Such a system would image approx 1.5 million nano-balls attached to the patterned array at 1 micron pitch. Under this configuration there are approximately 6.4 pixels per nano-ball.
  • the number of pixels per nano-ball can be adjusted by increasing or decreasing the field of view of the objective. For example a 1 mm field of view would yield a value of 10 pixels per nano-ball and a 2 mm field of view would yield a value of 2.5 pixels per nanoball.
  • the field of view will be adjusted relative to the magnification and NA of the objective to yield the lowest pixel count per nano-ball that is still capable of being resolved by the optics, and image analysis software.
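  • The pixels-per-nano-ball values quoted above can be reproduced with the following sketch. It assumes a square field of view that is mapped onto the full sensor and one nano-ball per grid position; the function name is illustrative.

```python
def pixels_per_nanoball(fov_mm, camera_pixels=10_000_000, pitch_um=1.0):
    """Camera pixels available per nano-ball for a square field of view."""
    spots_per_side = fov_mm * 1000.0 / pitch_um
    return camera_pixels / spots_per_side ** 2


for fov_mm in (1.0, 1.25, 2.0):
    print(fov_mm, round(pixels_per_nanoball(fov_mm), 2))
```

  • Because the spot count grows with the square of the field of view, doubling the field from 1 mm to 2 mm cuts the pixel budget per nano-ball by a factor of four.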
  • Our current 3 axis robotic gantry pipetting system can be scaled up to serve more than one microscope.
  • the system has one pipette head. If the number of chambers becomes too great for a single pipetter to service efficiently, multiple pipetting channels can be added to the pipetter head, each head individually accessible via a simple linear extension system, increasing robot efficiency by increasing the service potential for each robot move. It may be more efficient or cost effective to implement a non-gantry style robot, such as a scara style robot to perform certain operations.
  • a larger than standard plate stage may be needed to image more than one plate sized substrate per microscope.
  • the plate stage should be designed for rigidity, positional accuracy, and repeatability.
  • the DNA nano-ball probe arrays can be used for sequence (for example genes, exons, promoters, diagnostic sites, SNPs, mutations) identification in amplified or possibly non-amplified target samples.
  • the reduction of human DNA contamination may be achieved by using affinity columns or beads directed to Alu or LINE repeats in the human genome. Sample DNA of 1-10 kb length could be hybridized to these affinity columns and the un-bound fraction collected and fragmented to the final preferred length before amplification or direct hybridization to the nano-ball probe arrays. It may be important to quantify the amount of isolated DNA.
  • “whole genome” methods of amplification could be employed.
  • One approach could be to form single stranded DNA circles (50-500 bases in length) using a 20-100 base adapter and amplify by RCR 100-1000 fold in a linear amplification from the original copy.
  • Concatemers can then be fragmented by a restriction enzyme after hybridizing a complementary oligonucleotide to the adapter such that a double stranded cutting site is formed.
  • it may be beneficial to randomly fragment the target DNA to about 50-200 bases using DNase at a pre-tested enzyme dilution for a specific incubation time and depending upon the amount of DNA in the sample. Fragmentation has the benefit of improving hybridization kinetics and decreasing negative repellent forces of the molecules. It is also a more efficient use of sample DNA and less likely to build chaining of two DNA fragments from solution that may cause false signals. It may also be beneficial to develop an internal control target that reports the degree of fragmentation, for example through the separation of quenching dyes.
  • Each target DNA that is prepared by RCR will be a single stranded concatemer of sample target and adapter.
  • the adapter sequence portion of the RCR concatemers allows for the hybridization of labeled dendrimers in which a single arm of the dendrimer is complementary to a portion of the adapter used to form DNA circles.
  • Non-amplified DNA may be labeled by poly-C or poly-A tailing using terminal transferase and then hybridized to dendrimers with a single complementary arm.
  • An alternative may be to ligate on each end of single stranded DNA a complement to a different dendrimer arm. Longer DNA with multiple attached dendrimers may be ligated to each end or other standard labeling procedures may be used. The excess of label may be hybridized or ligated to a biotinylated oligonucleotide, and removed with streptavidin-coated beads.
  • a target DNA sample may also be prepared by utilizing the detection DNA nano-ball array itself for sample isolation. In this procedure the target DNA would be collected in a small volume, then fragmented and denatured. The target DNA is then hybridized to an array of nano-ball sequences complementary to those desired from the sample and any excess un-hybridized DNA would then be washed away. Captured target DNA could be amplified on the surface of the array by covalently ligating fragmented concatemers with the capture oligonucleotide as a bridging support, followed by RCR. Alternatively, an adapter with a tag or label can be ligated to the hybridized DNA and detected.
  • RNA may also serve as the target with or without conversion to DNA.
  • Sample DNA or RNA amplification may not be required due to: 1) extensive miniaturization, low reaction volume and effective reaction mixing to allow DNA or RNA fragments to find complementary nano-ball probes; 2) longer DNA detector molecules in the array enable efficient and specific hybridization in complex mixtures. This also allows the use of bulky signal amplification molecules; 3) the use of multiplicity of different DNA fragments for each DNA region of each sample reduces experimental noise (it also allows finding of specific DNA fragments for detecting given gene or genomic region with no cross talk to other DNA in the sample); 4) signal amplification for example with the application of dendrimers or concatenated detector DNA as labeling methods for the target.
  • the longer lengths of the DNA nano-ball probes allows for stringent washing conditions which can improve specificity of the probes and targets. Temperature gradients and obtaining several measurements per spot in the wash steps could also be employed to increase specificity and this may be important for shorter probes to detect single or a few base changes. Detection of hybrid formation without washing excess target from the reaction chamber (homogenous assay) is also an option by focusing the CCD on the surface. This could be especially applicable if a longer hybridization is performed to deplete labeled DNA from solution.
  • One or more images of an array will be generated preferably using a CCD camera.
  • Raw signals will be determined by image analysis and assigned to each spot and associated with provided information about identity of detector molecule at each spot. Empty or other control dots may be used to assure proper assignments of spot signals to detector molecules.
  • the array with different identified DNA per spot is used to create programmable connections and nano-wires between neighboring spots by providing a bridging oligonucleotide . . . PPPPPPPPPPPPPPPPPPPPPPPP SSSSSSSSSSSSSSSSSSS . . . PPPPPPPPPBBBBB . . . BBBBSSS SSSSSSS.
  • controllable switches can be generated using temperature as the trigger.
  • the connector may be designed to stay in one of two connected spots for reconnecting.
  • nano-ball probe arrays vs. in-situ prepared probe arrays include: longer probe lengths (50-5000 bases), which allow for increased specificity and sensitivity; ultra-high density of probe nano-balls, which allows for higher sensitivity and lower assay cost; and low production cost compared with existing array technologies.
  • nano-balls are arrayed on the surface and can be quantitated by counting the occurrence of specific nano-balls.
  • nano-ball probes attached to the surface are used to quantitate solution phase labeled target through relative intensity levels of the label at the nano-ball.
  • a further advantage of the saDNA platform over array of DNA from the test sample is that only 10-100 million nano-balls need be scored instead of 1-10 billion for quantitative representation of all informative fragments.
  • One limitation of saDNA however is that it may be difficult to identify low frequency targets such as gene duplications or deletions in one of every 10-10,000 tumor cells.
  • duplications and deletions in tumor samples 100 million (maybe only 30 million after removing repeats in one 96-well) × 100-1000 bases: one every 30 bases on average; can detect 1000 base deletions;
  • RCR products of synthetic or natural DNA fragments of about 30-3000 bases initiated with a primer that has RNA polymerase promoter extension are used to produce long RNA and in vitro translated protein with multiple copies of the same peptide with an adapter (used for forming DNA circles) coded spacer peptide.
  • the resulting protein with 100 to 10000 amino acids may be folded maybe initiated by the spacer protein to form several to hundreds of almost independently folded unit peptides.
  • Each peptide may form several domains for binding different molecules like antibodies, oligo peptides, single or double-stranded oligonucleotides or other chemical compounds that can be used to identify given peptide.
  • These protein balls may be attached to binding sites of a support having a peptide or other molecule that binds to spacer peptide or by using other general protein binding chemistry.
  • The small size of active binding sites surrounded by non-binding support allows only one (the first to bind) protein nano-ball to attach, either by binding saturation of all available binding molecules in the binding site or by physically preventing other protein nano-balls from interacting with the same binding site.
  • To minimize double or multiple occupancy, protein nano-balls smaller than a given size may be removed by size separation or saturation of spacer protein.
  • PCR products from diagnostic regions of Bacillus anthracis and Yersinia pestis were converted into single stranded DNA and were attached with a universal adaptor. These 2 samples were then mixed and replicated together using the rolling circle replication method and deposited onto the random array. Successive hybridization with amplicon specific probes showed that each spot on the array corresponded uniquely to either one of the 2 amplicon sequences and they can be identified specifically with the probes. This result demonstrates sensitivity and specificity of identifying DNA present in submicron size spots created by attaching DNA nano-balls having about 100-1000 copies of a DNA fragment generated by RCR reaction.
  • Amplicons Ba3 (a 155 bp amplicon sequence 5′-TCCCAATACATATGAGCGATTCGCCTTTAT AAACGACGTATTCCTTTGAACTCGTTATGACACTCATTACTCAACTCCCCTTTTCTACTAAAATAGCGTTTTTGTTT GGTTTTTTTCTTCACATAATC CGTCCTATTTGATTTTTACATACCACC-3′ from B.
  • Yp3 a 275 bp amplicon sequence 5′ TGTAGCCGCTAAGCACTACCATCCTCAAGGTTATTGACGGTATCGAGTAG GGTTAGGTGGGCATCATTGTCCATTTCATGGCGGTAATATCGGGATGAGATAACGCGGGTGTCATGGACGTATGG CGGGTCAACAAAATGAAGCGTTGAAACTGTGTCATGGTCTAACATGCATTGGACGGCATCACGATTCTCTACCAAA ACGCCCTCGAATCGCTGGCCAACTGCTGCCAAGTTTTCAGGCATCCTTGCCCAAAGGTGTTGAGCTGTTGCC-3′ from Y.
  • the 5′ end of the remaining strand was phosphorylated using T4 DNA polynucleotide kinase (Epicenter) to allow ligation to the universal adaptor.
  • the universal adaptor (sequence 5′-TATCATCTACTGCACTGACCGGATGTTAGGAAGAC AAAAGGAAGCTGAGGGTCACATTAACG GAC-3′) was ligated using T4 DNA ligase (Epicenter) to the 5′ end of the target molecule assisted by a template oligonucleotide (Ba3-5 end 5′-ATTGGGAGTCCGTTAATGTGAC-3′ for amplicon Ba3, Yp3-5 end 5′-GGCTACAGTCCGTTAATGTGAC-3′ for amplicon Yp3) specifically complementary to the 5′ end of the targets and 3′ end of the universal adaptor.
  • the adaptor ligated targets were then circularized using bridging oligonucleotides (Ba3-3 end 5′-AGATGATAGGTGGTAT-3′ for amplicon Ba3, Yp3-3 end 5′-AGATGATAGGCAACAG-3′ for amplicon Yp3) with bases complementary to the adaptor and to the 3′ end of the targets.
  • Linear DNA molecules were removed by treating with exonuclease I (New England Biolabs) at 37° C. for 4 hours under standard reaction conditions.
  • Rolling circle replication (RCR) products were generated by mixing the single-stranded samples of Ba3 and Yp3 together, and using Phi29 polymerase (New England Biolabs) to replicate around the circularized adaptor-target molecules with the bridging oligos as the initiating primers. Specifically, 0.1 to 0.5 pmol of the circularized DNA was incubated with 5 units of Phi29 DNA polymerase, 2 pmol of the bridging oligos, 0.4 mM dNTP, 0.2 mg/ml BSA and 1× standard Phi29 DNA polymerase buffer at 30° C. for 2 hours, followed by 1 hour incubation at 55° C. with 1 µg/µl proteinase K.
  • the RCR products were captured on the glass slide via a capture oligo (sequence 5′-GGATGTTAGGAAGACAAAAGGAAGCTGAGG-3′) that is complementary to the universal adaptor sequence and is attached to derivatized glass coverslips (Corning).
  • the coverslips were first cleaned in a potassium/ethanol solution followed by rinsing and drying. They were then treated with a solution of 3-aminopropyldimethylethoxysilane, acetone, and water for 45 minutes and cured in an oven at 100° C. for 1 hour. As a final step, the coverslips were treated with a solution of p-phenylenediisothiocyanate (PDC), pyridine, and dimethylformamide for 2 hours.
  • the capture oligo is modified at the 5′ end with an amine group and 2 C-18 linkers.
  • FIG. 10 shows that most of the arrayed molecules that hybridized with the adaptor probe (blue spots) hybridized to only one of the Ba3 probe (red spots) or the Yp3 probe (green spots), with very few hybridizing to both.
  • This specific hybridization pattern demonstrates that each spot on the array contains only one type of sequence, either the Ba3 amplicon or the Yp3 amplicon. It also demonstrates that the rSBH process is able to distinguish target molecules of different sequences deposited onto the array by using sequence specific probes.
  • Individual molecules of a synthetic oligo containing a degenerate base can be divided into 4 sub-populations, each having an A, C, G or T base at that particular position.
  • An array of DNA nano-balls created from this synthetic DNA will have about 25% of spots with each of the bases.
  • the results demonstrate ability to determine partial or complete sequence of DNA present in DNA nano-balls by increasing number of consecutive probe cycles or by using 4 or more probes labeled with different dyes per each cycle.
  • a synthetic oligo (T1A: 5′-NNNNNNNNGCATANCACGANGTCATNATCGTNCAAACGTCAGTCCA NGAATCNAGATCCAC TTAGANT-3′) contains at position 32 a degenerate base. Universal adaptor was ligated to this oligo and the adaptor-T1A DNA was circularized as described before. DNA nano-balls made using the rolling circle replication (RCR) reaction on this target were arrayed onto the random array. Because each spot on this random array corresponded to tandemly replicated copies originating from a single molecule of T1A, the DNA in a particular arrayed spot would contain either an A, a C, a G, or a T at positions corresponding to position 32 of T1A.
  • a set of 4 ligation probes specific to each of the 4 bases was used.
  • a 5′ phosphorylated, 3′ TAMRA-labeled pentamer oligo corresponding to position 33-37 of T1A with sequence CAAAC (probe T1A9b) was paired with one of the following hexamer oligos corresponding to position 27-32: ACTGTA (probe T1A9a), ACTGTC (probe T1A10a), ACTGTG (probe T1A11a), ACTGTT (probe T1A12a).
  • Each of these 4 ligation probe pairs should hybridize to either an A-, C-, G- or T-containing version of T1A.
  • the adaptor specific probe BrPrb3 at 0.1 pM was hybridized to the array to establish the positions of all the spots (shown as blue in FIG. 11 ).
  • the 4 ligation probe pairs, at 0.4 NM, were then hybridized successively to the array: the spots hybridized to the A-specific ligation probe pair are shown as red in FIG. 11 , the C-specific spots are green, G-specific spots are yellow and the T-specific spots are cyan.
  • circle A indicates the position of one of the spots hybridized to both the adaptor probe and the A-specific ligation probe pair, suggesting that the DNA arrayed at this spot is derived from a molecule of T1A that contains an A at position 32. It is clear that most of the spots are associated with only one of the 4 ligation probe pairs, and thus the nature of the base at position 32 of T1A can be determined specifically.
  • spots were identified using the images taken for the hybridization cycle using the adaptor probe. The same spots were also identified and the fluorescent signals were quantified for subsequent cycles with the base-specific ligation probes. An instrument background of 205 was subtracted from each signal. A discrimination score was calculated for each signal: each base-specific signal of a spot was divided by the average of the other 3 base-specific signals of the same spot. For each spot, the highest of the 4 base-specific discrimination scores was compared with the second highest score, and if the ratio of the two was above 1.8, the base corresponding to the maximum discrimination score was selected as the base call. In this analysis over 500 spots were successfully base-called and the average discrimination score was 3.34.
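  • The base-calling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the example intensities, and the `floor` value (used here so that background-subtracted signals stay positive and the ratios remain defined) are hypothetical and do not come from the original analysis software.

```python
BASES = "ACGT"

def discrimination_scores(raw_signals, background=205.0, floor=1.0):
    """Score each base as its signal divided by the mean of the other three."""
    # Subtract the instrument background. The floor is an assumption of this
    # sketch: it keeps every corrected signal positive so ratios stay defined.
    corrected = {b: max(raw_signals[b] - background, floor) for b in BASES}
    scores = {}
    for base in BASES:
        others = [corrected[o] for o in BASES if o != base]
        scores[base] = corrected[base] / (sum(others) / len(others))
    return scores

def call_base(raw_signals, background=205.0, min_ratio=1.8):
    """Return the called base, or None if the top two scores are too close."""
    scores = discrimination_scores(raw_signals, background)
    ranked = sorted(BASES, key=lambda b: scores[b], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if scores[best] / scores[runner_up] > min_ratio:
        return best
    return None

# Hypothetical raw intensities for two spots (not measured data).
clear_spot = {"A": 477.0, "C": 285.0, "G": 290.0, "T": 289.0}
mixed_spot = {"A": 355.0, "C": 345.0, "G": 225.0, "T": 225.0}
print(call_base(clear_spot))
print(call_base(mixed_spot))
```

  • In this sketch a spot with one dominant signal is called, while a spot with two comparable signals is left uncalled, which mirrors the 1.8 ratio threshold in the text.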
  • the average full match signal is 272, while the average single mismatch signal (signals from the un-selected bases) is 83.2, thus the full match/mismatch ratio is 3.27.
  • the image background noise was calculated by quantifying signals from randomly selected empty spots and the average signal of these empty spots is 82.9, thus the full match/background noise ratio is 3.28. In these experiments the mismatch discrimination is limited by the low full match signal relative to the background.
  • a solution of 3-aminopropyldimethoxysilane, trimethylethoxysilane, acetone and water is used in the second step.
  • the trimethylethoxysilane is used in various ratios to the 3-aminopropyldimethylethoxysilane to control the density of 3-aminopropyldimethylethoxysilane on the surface.
  • silanes that have longer or shorter alcohol groups on the silane molecule.
  • the core of this new approach involves the creation and efficient analysis of high-density random arrays containing millions of DNA molecules.
  • Such random arrays eliminate the costly, time-consuming steps of arraying probes on the substrate surface and the need for individual preparation of thousands of sequencing templates. Instead they provide a fast and cost effective way to analyze complex DNA mixtures.
  • DNA molecules are arrayed at a density of about one molecule per square micron of substrate.
  • a 3×3 mm array has the capacity to hold 1-10 million fragments, or approximately 1-10 billion DNA bases, the upper limit being the equivalent of three human genomes.
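  • The capacity figures above follow from simple area arithmetic, sketched below. The fragment length of 1000 bases is an assumption inferred from the ratio of fragments to bases quoted in the text, and the haploid human genome size of about 3 billion bases is a commonly used approximation; neither value is stated explicitly in this passage.

```python
def array_capacity(side_mm, molecules_per_um2, bases_per_fragment):
    """Return (fragments, bases) for a square array of the given side length."""
    area_um2 = (side_mm * 1000.0) ** 2          # 1 mm = 1000 micrometers
    fragments = area_um2 * molecules_per_um2
    return fragments, fragments * bases_per_fragment

HUMAN_GENOME_BASES = 3.0e9   # approximate haploid genome size (assumption)
BASES_PER_FRAGMENT = 1000    # assumed average fragment length

fragments, bases = array_capacity(3.0, 1.0, BASES_PER_FRAGMENT)
genome_equivalents = bases / HUMAN_GENOME_BASES
print(f"{fragments:.0f} fragments, {bases:.2e} bases, "
      f"{genome_equivalents:.1f} genome equivalents")
```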
  • rSBH random sequencing by hybridization
  • the rSBH process preserves all the advantages of combinatorial SBH demonstrated on our HyChip product, including the high specificity of the ligation process.
  • DNA attachment creates the possibility of using random DNA arrays with much greater capacity than regular probe arrays, and allows detection by ligation of two labeled probes in solution.
  • having both probe modules in solution allows us to expand our informative probe pool (IPP) strategy to both probe sets, which was not possible on the HyChip™ format.
  • the rSBH process allows for the identification of unknown-sequence targets bound to the surface substrate either through full sequencing or by partial sequence signatures.
  • the targets are “random” because they are distributed randomly at attachment sites on the array surface.
  • the system hardware consists of three major components: the illumination system, the reaction chamber, and the detector system. These components work together to provide single fluorophore detection sensitivity.
  • TIRFM creates a 100-500 nm thick evanescent field at the interface of two optically different materials (4).
  • the evanescent field is an extension of the beam energy that reaches beyond the glass/water interface by a few hundred nanometers (generally between 100-500 nm). This field can be used to excite fluorophores close to the glass-water boundary and virtually eliminates background from the excitation source.
  • the substrate once attached to the reaction chamber, forms the bottom section of a hybridization chamber.
  • This chamber controls the hybridization temperature, provides ports for the addition of probe pools or targets to the chamber, removal of the probe pools or targets and substrate washing.
  • a fluorescently labeled solution is introduced into the chamber and is given time to hybridize with the attached DNA.
  • a high sensitivity CCD camera capable of single photon detection is used to detect fluorescent hybridization/ligation events.
  • multiple solution phase probe pools may be cycled through the chamber. After image acquisition the chamber may be flushed to remove all probes and the next probe pool is introduced. This process is repeated 256-512 times until all probe pools have been assayed.
  • the detection instrument has been fully assembled with features including: adjustable laser power, electronic shutter, auto focus, and operating software.
  • the system was optimized and tested using arrays of individual Tamra dye molecules and arrays of dendrimers with 350 and 50 dye molecules. Detection of a single molecule of dye is achieved in about 10% of cases demonstrating projected sensitivity of a TIRFM-based strong illumination/low background process coupled with a sensitive CCD camera. This result also demonstrates that detection of single molecules is statistically inefficient due to various physical and chemical factors and that a target DNA amplification schema is required. Dendrimers with only 50 dye molecules produce signal that is 50 fold stronger than background indicating that about 100 fold target amplification would be sufficient.
  • Effective glass activation chemistry has been developed that creates a monolayer of isothiocyanate reactive groups for attaching amine modified primers or DNA capture oligonucleotides. This monolayer chemistry reduces trapping of labeled probe and thus dramatically reduces the assay's background.
  • One observation with RCR generated concatemers has been a range of feature sizes produced from a homogeneous target population (see FIG. 13). This may be a result of extension initiating at different times on different circular templates, or of different rates of extension for individual polymerase molecules. It is believed that one polymerase molecule is responsible for the continuous extension of a primer (5), although we are not aware of any studies describing an upper limit to the size of product produced by a single polymerase molecule. To create more uniform sizes of the amplified targets we will incorporate dideoxy nucleotides as a very small proportion of the total dNTP concentration (e.g. 1 in 50,000).
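  • A rough way to see how a small dideoxy fraction caps product length is to treat each nucleotide incorporation as an independent trial that terminates the chain with probability equal to the dideoxy fraction. The sketch below rests on that simplifying assumption (including equal incorporation efficiency for dideoxy and normal nucleotides, which may not hold for a given polymerase); the function names are illustrative only.

```python
import math

def mean_product_length(ddntp_fraction):
    """Mean number of incorporations up to and including the terminator."""
    return 1.0 / ddntp_fraction

def fraction_longer_than(length_nt, ddntp_fraction):
    """Probability that the first length_nt incorporations are all non-terminating."""
    return (1.0 - ddntp_fraction) ** length_nt

p = 1.0 / 50000   # dideoxy fraction quoted in the text
print(mean_product_length(p))
print(fraction_longer_than(100000, p))
```

  • Under this model the mean product is about 50,000 nucleotides, and only a minority of products would run past 100,000 nucleotides.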
  • a universal adapter that also serves as the binding site for capture probes and RCR primer is ligated to the 5′ end of the target molecule using a universal template DNA containing degenerate bases for binding to all genomic sequences.
  • the 3′ end of the target molecule is modified by addition of a poly-dA tail using terminal transferase.
  • the modified target is then circularized using a bridging template complementary to the adapter and to the oligo-dA tail ( FIG. 14 ).
  • Single stranded PCR products can be prepared by exonuclease digestion of one of the strands or by strand separation with high temperature and rapid cooling. Primer sequences will be incorporated into the 5-prime ends of the primers to allow for the hybridization of a bridge oligonucleotide for circularization ( FIG. 15 ).
  • This approach can be utilized for genomic fragment capture with adapters ligated to restriction enzyme fragmented genomic DNA. With two adapters, approximately 50% of fragments will possess a different adapter at each end, which can then be used for strand removal and circle formation.
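  • The 50% figure can be checked by enumerating the possible end combinations. The sketch assumes that each fragment end receives either adapter with equal probability and independently of the other end; the adapter labels "A" and "B" are placeholders.

```python
from itertools import product

def fraction_with_different_ends(adapters=("A", "B")):
    """Fraction of end-pair combinations that carry two different adapters."""
    pairs = list(product(adapters, repeat=2))   # (left end, right end)
    mixed = [pair for pair in pairs if pair[0] != pair[1]]
    return len(mixed) / len(pairs)

print(fraction_with_different_ends())
```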
  • Capture sequences in the bridge will be the same for each molecule but probe binding sequences for sequence identification will vary. Circularization of the molecule proceeds with a bridging oligonucleotide of about 20-30 bases in length that will bring the two ends into juxtaposition for ligation by T4 DNA ligase. The bridging molecule can now act as the primer for extension. Amplification of DNA captured into the circular molecules proceeds by a rolling circle replication to form long linear concatemer copies of the circle.
  • the novelty of the method proposed here is that millions of single DNA molecules, randomly arrayed on an optically clear surface, serve as templates for hybridization and ligation of fluorescent-tagged probe pairs. Pairs of probe pools, at least one of which is labeled with a fluorophore, are mixed with DNA ligase and presented to the random array. When probes hybridize to adjacent sites on a target fragment they are ligated together, forming a stable hybrid. A sensitive mega pixel CCD camera with advanced optics is used to simultaneously detect millions of these individual hybridization/ligation events on the entire array. Once signals from the first pool are detected, the probes are removed and successive ligation cycles are used to test different probe combinations. The fixed position of the CCD camera relative to the array ensures accurate tracking of consecutive hybridizations to individual target molecules. The entire sequence of each DNA fragment is compiled based on fluorescent signals generated by hundreds of independent hybridization/ligation events.
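  • As a toy illustration of how a sequence can be compiled from many positive probe signals, the sketch below rebuilds a short sequence from the set of overlapping k-mers that it contains. It is not the assembly algorithm of the method described here: it assumes error-free signals, a known starting k-mer, and no repeated (k-1)-mers, and every name in it is hypothetical.

```python
def kmer_spectrum(sequence, k):
    """Set of all k-mers present in the sequence (the 'positive probes')."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

def assemble(spectrum, start):
    """Extend from the start k-mer while exactly one overlapping k-mer fits."""
    k = len(start)
    remaining = set(spectrum) - {start}
    sequence = start
    while True:
        overlap = sequence[-(k - 1):]
        candidates = [kmer for kmer in remaining if kmer.startswith(overlap)]
        if len(candidates) != 1:   # stop at the end or at an ambiguity
            break
        sequence += candidates[0][-1]
        remaining.remove(candidates[0])
    return sequence

target = "GCATACCACGA"
spectrum = kmer_spectrum(target, 5)
rebuilt = assemble(spectrum, target[:5])
print(rebuilt)
```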
  • TAMRA labeled 5mer probes can be used in conjunction with a pair of 6mers to identify the base sequence at the mutated site.
  • the proposal is to structure random DNA arrays into a high density grid, such that each DNA binding site is only 100-300 nm in size and each binding site contains only a single DNA fragment.
  • This approach should minimize cross hybridization between DNA targets, while at the same time substantially decreasing the size of each binding site and thus increasing the density of binding sites per array.
  • the significance of being able to efficiently and inexpensively make such “perfect” random DNA arrays is tremendous. Maximizing the number of DNA segments per surface area will enable scientists to analyze a complex genome on one small glass chip, about 1 cm² in size or less.
  • a CCD chip can be perfectly aligned with the DNA array to provide a one to one correspondence between each CCD pixel and DNA binding site, maximizing reading efficiency.
  • DNA random arrays in the form of “perfect” high density grids with sub-micron spots will provide the basis for daily sequencing of multiple human genomes using affordable 10 mega pixel CCD detectors. These whole genome DNA arrays have over 1000 times more DNA spots than the current high density probe arrays. Because a one cm² chip can hold over one billion DNA fragments (>100 billion bases or over 30 human genomes), an automated process can be developed such that the total sequencing reaction volume for 100 interrogation cycles would be only 1 ml, reducing sequencing cost to less than $1000 per genome.
  • the proposed high density structured random DNA array chip will have capture oligonucleotides concentrated in small, segregated capture cells aligned into a rectangular grid formation ( FIG. 16 ). Most importantly, each capture cell or binding site will be surrounded by an inert surface and will have a sufficient but limited number of capture molecules (100-400). Each capture molecule will bind one copy of the matching adaptor sequence on the RCR produced DNA concatemer. Since each concatemer contains over 1000 copies of the adapter sequence, it will quickly saturate the binding site upon contact and prevent other concatemers from binding, resulting in exclusive attachment of one RCR product per binding site or spot. The proper concentration of RCR products and sufficient reaction time will ensure that almost every spot on the array contains one and only one unique DNA target.
  • RCR “molecular cloning” allows the application of the saturation/exclusion (single occupancy) principle in making random arrays. The exclusion process is not feasible in making single molecule arrays if an in situ amplification is alternatively applied.
  • RCR concatemers provide an optimal size to form small non-mixed DNA spots. Each concatemer of about 100 kb is expected to occupy a space of about 0.1×0.1×0.1 µm. This indicates that RCR products can fit into the 100 nm capture cells.
  • Another advantage of RCR products is that the single stranded DNA is ready for hybridization and is very flexible for forming a randomly coiled ball of DNA. It is important to note that 1000 copies of DNA target produced by RCR provide much higher specificity than analysis of single molecule. Thus, RCRs provide several important advantages without any serious penalties.
  • Having 125-250 nm DNA sites in a regular grid with 250-500 nm center-to-center spacing will provide 20-80 times more DNA samples per surface than arrays with random attached DNA with spots of about 1000 nm in size and 20% usable occupancy. This will result in 20-80 fold lower reagent consumption and 20-80 fold faster readout. Furthermore, attaching RCR products onto this dense grid of capture probe spots ensures that each DNA ball is concentrated on a much smaller surface, increasing the signal and the speed of biochemical assays. Overall, the reduction of DNA attachment spots from 500 nm to 125 nm in size will result in up to 16 fold higher signal intensities. In short, the proposed DNA arrays will provide an order of magnitude lower cost, higher throughput and higher sensitivity than standard random DNA arrays.
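  • The 20-80 fold and 16 fold figures above can be reproduced with a short calculation. The sketch assumes that the random array is modeled as one 1000 nm spot per square micron with 20% usable occupancy, that the structured grid is fully occupied, and that signal scales inversely with spot area; these modeling choices are ours, made only to show where the numbers come from.

```python
def sites_per_um2(pitch_nm, usable_fraction=1.0):
    """Usable DNA sites per square micron for a square grid of the given pitch."""
    pitch_um = pitch_nm / 1000.0
    return usable_fraction / (pitch_um ** 2)

random_array = sites_per_um2(1000, usable_fraction=0.20)
gain_500 = sites_per_um2(500) / random_array   # 500 nm center-to-center grid
gain_250 = sites_per_um2(250) / random_array   # 250 nm center-to-center grid

# Same amount of DNA concentrated on a spot with 1/4 the linear size.
signal_gain = (500 / 125) ** 2

print(gain_500, gain_250, signal_gain)
```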
  • a long term goal is to develop a structured array of 384 unit arrays, each 3.33×3.33 mm in size (10 mm²), spaced at standard 384-well plate dimensions of 4.5 mm well to well distance.
  • This composite array can have 384×100 million DNA spots spaced at 333 nm center to center, a density that provides 10 million spots per mm², 1 billion spots per cm², or a total of 38.4 billion DNA spots.
  • To analyze these arrays at the speed of 100 million spots per second (one unit array per second) will require a 30-100 mega-pixel CCD detector and it will take 6.5 minutes per cycle.
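  • The composite array figures can be checked with the following arithmetic sketch. The constants are taken from the text; the variable names are illustrative, and the computed density per mm² comes out slightly below the rounded value of 10 million quoted above.

```python
UNIT_ARRAYS = 384
SIDE_MM = 3.33            # side of one unit array
PITCH_NM = 333            # center-to-center spot spacing
SPOTS_PER_SECOND = 100_000_000

spots_per_side = round(SIDE_MM * 1e6 / PITCH_NM)   # 1 mm = 1e6 nm
spots_per_unit = spots_per_side ** 2
total_spots = UNIT_ARRAYS * spots_per_unit
spots_per_mm2 = (1e6 / PITCH_NM) ** 2
minutes_per_cycle = total_spots / SPOTS_PER_SECOND / 60

print(spots_per_unit, total_spots)
print(f"{spots_per_mm2:.2e} spots per mm^2")
print(f"{minutes_per_cycle:.1f} minutes per cycle")
```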
  • the goal for the first usable system based on the composite structured arrays would be to produce DNA features that are spaced at 1 micrometer center to center and total of up to 3.8 billion spots (10 million per unit array) that can be read in about 5 minutes with a 10 mega pixel CCD detector.
  • One billion binding sites with 100 base long DNA fragments can hold an equivalent of 30 human genomes at 1× coverage.
  • Composite arrays of hundreds of smaller unit arrays have many advantages over a single large array. For example, a subset of genes instead of entire genomes can be selectively amplified in a multiplex reaction and sequenced in hundreds of individuals at the same time on one composite array. Another very important application of array of arrays is to determine whole chromosome sequence and haplotypes using our novel two-level fragmentation method. This method represents an enabling technology that provides mapping information for assembling chromosomal haplotypes and alternatively spliced mRNAs for any analysis based on random DNA fragmentation.
  • genomic sample DNA is first prepared in the form of about 5, 10, 100, or 200 kb length fragments.
  • a small subset of these fragments is placed at random in discrete wells of multi-well plates or similar accessories. For example, a plate with 96, 384 or 1536 wells can be used for these fragment subsets.
  • An optimal way to create these DNA aliquots is to take only 10-30 cells, isolate the DNA, fragment in long segments and then split the entire preparation into 384 wells. This will assure that all chromosomal regions are represented with the same coverage.
  • the DNA aliquots will contain a few to 10, 20 or more fragments.
  • the fragment subset's complexity is determined by the capacity of unit arrays and by statistical requirements.
  • the goal is to minimize cases where any two overlapping fragments from the same region of chromosome or any two mRNA molecules transcribed from the same gene are placed in the same subset, e.g. the same plate well.
  • By forming 384 fractions in a standard 384-well plate there is only about 1/400 chance that two overlapping fragments will end up in the same well. Even if some matching fragments are placed in the same well, the other overlapping fragments from each chromosomal region will provide the necessary unique mapping information.
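  • The "about 1/400" estimate corresponds to the chance that two given fragments land in the same one of 384 wells. The sketch below computes that value and, as an extension that is our own and not part of the text, the chance that several overlapping fragments all land in distinct wells. It assumes each fragment is assigned to a well uniformly and independently.

```python
def same_well_probability(wells):
    """Chance that two specific fragments fall into the same well."""
    return 1.0 / wells

def all_in_distinct_wells(copies, wells):
    """Chance that `copies` overlapping fragments occupy pairwise different wells."""
    probability = 1.0
    for already_placed in range(copies):
        probability *= (wells - already_placed) / wells
    return probability

print(same_well_probability(384))
print(all_in_distinct_wells(2, 384))
```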
  • the prepared groups of long fragments are further cut to the final fragment size of about 200 to 2000 bases.
  • the DNA in each well may be amplified before final cutting using well-developed whole genome amplification methods. All short fragments from one well will then be arrayed and sequenced on one separate unit array or in one section of a larger continuous matrix.
  • the above described composite array of 384 unit arrays is ideal for parallel analysis of these groups of fragments.
  • the algorithm will use the critical information that short fragments detected in one unit array belong to a limited number of longer continuous segments, each representing a discrete portion of one chromosome or one mRNA molecule in the case of analyzing expressed sequences.
  • homologous chromosomal segments will be analyzed on different unit arrays. Long continuous initial segments form a tiling pattern and provide sufficient mapping information to assemble each parental chromosome separately, as depicted below, by relying on about 100 polymorphic sites per 100 kb of DNA. Dots represent 100-1000 consecutive bases that are identical in corresponding segments.
  • Random arrays prepared by two-level DNA fragmenting combine the advantages of both BAC sequencing and shotgun sequencing in a simple and efficient way.
  • this innovation will extend use of random DNA arrays for de novo sequencing of complex genomes or mixtures of genomes, e.g. all bacterial and protozoan genomes in a drop of sea water.
  • the high density structured random DNA arrays and array of arrays will provide 20-80 times more DNA binding sites per surface area than random attachment, resulting in several advantages:
  • One limitation of rSBH and SBH in general is the difficulty in determining the exact length of long simple repeats (ACACACACACAC . . . ), usually longer than about 10 bases.
  • Special probes can be used for extending the read length of such sequences. For example, for reading (A)ₙ repeats, probes (C,G,T)₃A₆₋₁₀ and A₆₋₁₀(C,G,T)₃, alone or in combination with (A)₇₋₂₀ spacers, can be used in 10-15 additional ligation cycles to extend the read length of simple repeats to about 30 bases.
  • Primer extension from a genomic DNA template may be used to generate a linear amplification of greater than 10 kilobases of sequence surrounding the genomic region of interest.
  • 20 cycles of linear amplification will be performed with the forward primer followed by 20 cycles with the reverse primer ( FIG. 17 ).
  • the first primer can be removed with a standard column for long DNA purification or degraded if a few uracil bases are incorporated.
  • a greater number of reverse strands are generated relative to forward strands resulting in a population of double stranded molecules and single stranded reverse strands.
  • the reverse primer for the test DNA is biotinylated for capture to streptavidin beads which can be heated to melt any double stranded homoduplexes from being captured. All attached molecules will be single stranded and representing one strand of the original genomic DNA. Although full long-range PCR is an option here, the chance of introducing base changes by polymerase mis-incorporation and selective amplification of deleted products is minimized by avoiding exponential amplification. In addition, the amount of sample required for downstream random DNA array applications will still be ample with a linear amplification approach.
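  • The difference between the two-stage linear amplification described above and exponential PCR can be illustrated with a simple count. The sketch assumes every cycle is 100% efficient, that the forward primer is fully removed before the reverse primer is used, and that the original genomic strands are ignored as templates for the reverse primer; real yields would be lower and the function name is hypothetical.

```python
def linear_amplification(templates, forward_cycles=20, reverse_cycles=20):
    """Count strands after forward-primer cycles followed by reverse-primer cycles."""
    forward = templates * forward_cycles           # one new copy per template per cycle
    reverse = forward * reverse_cycles             # each forward copy is copied every cycle
    duplexes = min(forward, reverse)               # each forward strand can pair once
    single_stranded_reverse = reverse - duplexes
    return forward, reverse, single_stranded_reverse

forward, reverse, single_reverse = linear_amplification(1)
exponential_copies = 2 ** 20   # ideal 20-cycle PCR, for comparison only

print(forward, reverse, single_reverse, exponential_copies)
```

  • Under these assumptions one template yields 20 forward strands and 400 reverse strands, most of them single stranded, compared with about a million copies from 20 ideal exponential cycles.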
  • the 10 kb products produced can be fragmented to 0.2-2 kb in size (effectively releasing them from the solid support) and used for RCR and random array production or RCR and solution phase target production for saDNA.
  • single stranded DNA fragments are first treated with terminal transferase to attach a poly dA tail to the 3-prime end. This is then followed by ligation of the free end intra-molecularly with the aid of another bridging oligonucleotide (see section 3.5.1. for a description of the procedure).
  • a primer for a strand displacing polymerase such as Phi 29 polymerase can be used to create a long, linear concatemer of the circle.
  • the concatemers may then be attached to the surface of a glass slide for detection with fluorescent probes or labeled and used as targets on an array.
  • Sequence specific probes may be designed for specific regions of the attached concatemer, spaced about 1 kb apart, within the 10 kb of original sequence. In effect this process will “count” the number of individual molecules that were produced through a positive or negative hybrid formation to probes for targeted regions or non-targeted regions.
  • Rare-cutting restriction enzymes may be mapped to the genomic regions of interest to predict fragment size and the sequences at the ends of the molecules.
  • a possible enzyme includes NotI, which cleaves the human genome on average every 130 kb and so would be a suitable enzyme. Although methylation could affect the cutting efficiency, if the genomic DNA is from a homogeneous source then digestion patterns should be complete and not partial.
  • lambda exonuclease This enzyme degrades bases from the 5-prime end of double stranded templates possessing a 5-prime phosphate. The strand shortening will be controlled to degrade approximately 50 to 100 bases of one strand from each end. The single stranded sequence that is revealed can act as a region of hybridization for a tagged primer for selection.
  • the primer will be extended with the Stoffel fragment of Taq polymerase which will extend the strand until it is adjacent to the 5-prime end of the degraded strand.
  • the newly synthesized strand can then be ligated to the undigested portion to complete the strand with a thermostable ligase.
  • One half of the primer (3-prime) is used for sequence specific extension of the primer and reconstruction of the strand.
  • the other half (5-prime tagged end) of the primer contains at least 20 bases of sequence for hybridization to a complementary sequence attached to the surface of a microplate well.
  • the sample will first be filtered to remove small DNA fragments.
  • the sample is then hybridized via one end to the surface and non-attached sequences are removed by gentle washing. Capture of the molecule via one end allows one level of selection, but release of the captured molecule and re-capture via the other end provides a second and higher level of purification and selection.
  • After the final release of the 100 kb DNA fragment from the surface, the sample will be digested with a six-base restriction enzyme. These fragments of 5 to 10 kb can be used for subsequent DNase fragmentation and circle formation.
  • Cleavase is used in the Cleavase fragment length polymorphism (CFLP) technique (9), which has been commercialized by Third Wave Technologies.
  • Cleavase will cut single stranded DNA five-prime of the loop and the fragments can then be separated by PAGE or similar size resolving techniques.
  • Mismatch binding proteins such as Mut S and Mut Y also rely upon the formation of heteroduplexes for their ability to identify mutation sites. Mismatches are usually repaired but the binding action of the enzymes can be used for the selection of fragments through a mobility shift in gel electrophoresis or by protection from exonucleases (10).
  • With T7 endonuclease I, a population of molecules with 5-prime phosphorylated overhangs surrounding the site of the mutation will be created, while T4 endonuclease VII cuts 3-prime of the mismatch.
  • a range of overhang types will therefore be generated depending on the position of the cut sites.
  • Gel analysis will display the efficiency of cutting and re-ligation will display the nature of the overhang.
  • Templates for heteroduplex formation will be prepared by primer extension from genomic DNA. For the same genomic region of the reference DNA, an excess of the opposite strand is prepared in the same way from the test DNA but in a separate reaction. The test DNA strand produced is biotinylated and will be attached to a streptavidin support. Homoduplex formation is prevented by heating and removal of the complementary strand. The reference preparation is now combined with the single stranded test preparation and annealed to produce heteroduplexes ( FIG. 18 ). This heteroduplex is likely to contain a number of mismatches.
  • the 5-10 kb genomic fragments prepared from large genomic fragments in section 4.1.2 will be biotinylated by the addition of a biotinylated dideoxy nucleotide at the 3-prime end with terminal transferase and excess biotinylated nucleotide will be removed by filtration.
  • a reference BAC clone that covers the same region of sequence will be digested with the same six-base cutter to match the fragments generated from the test DNA.
  • the biotinylated genomic fragments will be heat denatured in the presence of the BAC reference DNA and slowly annealed to generate biotinylated heterohybrids ( FIG. 21 ).
  • the reference BAC DNA is in large excess to the genomic DNA so the majority of biotinylated products will be heteroduplexes.
  • the biotinylated DNA can then be attached to the surface for removal of the reference DNA. Residual DNA is washed away before the addition of the mismatch endonuclease. After cleavage, each fragment can bind an adapter at each end and enter the mismatch circle selection process as outlined in FIG. 13 and section 4.3.2.
  • mismatch cleavage of DNA nanoball probes and hybridized target may be used to identify single base mutations. Cleaved mismatch hybrids could be identified through detection of the newly formed DNA ends at the cleavage site by end specific labeling.
  • the heteroduplexes generated in section 4.2.1. can be used for selection of small DNA circles.
  • the sample is treated with the mismatch enzyme to create products cleaved on both strands surrounding the mutation site ( FIG. 20 ).
  • T7 endonuclease I or similar enzyme will cleave 5-prime of the mutation site to reveal a 5-prime overhang of varying length on both strands surrounding the mutation.
  • the next phase is to capture the cleaved products into a form (see Panel A) suitable for amplification and sequencing.
  • An adapter is ligated to the overhang produced by the mismatch cutting, but because the nature of the overhang is unknown, at least three adapters will be needed and each adapter will be synthesized with degenerate bases to accommodate all possible ends.
  • the adapter can be prepared with an internal biotin on the non-circularizing strand to allow capture for buffer exchange and sample cleanup, and also for direct amplification on the surface if desired.
  • Because the intervening sequence between mutations does not need to be sequenced and reduces the sequencing capacity of the system, it will be removed when studying genomic derived samples.
  • Reduction of sequence complexity will utilize a type IIs enzyme that cuts the DNA at a point away from the enzyme recognition sequence. In doing so, the cut site and resultant overhangs will be a combination of all base variants.
  • a possible enzyme to use in this case is MmeI (20 bases with 2 base 3′ overhang) or EcoP15I (with 25 bases and 2 base 5′ overhang).
  • the adapter will be about 50 bp in length to provide sequences for initiation of rolling circle amplification and also provide stuffer sequence for circle formation. Once the adapter has been ligated to the fragment, the DNA is digested with the type IIs restriction enzyme to release all but 20-25 bases of sequence containing the mutation site that remains attached to the adapter.
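The effect of the type IIs digestion on sequence complexity can be pictured with a small string model. This is only a sketch under stated assumptions: it tracks the top strand alone, it assumes the recognition site sits at the very end of the adapter so that the reach is counted from the adapter/fragment junction, and the names `TypeIIsEnzyme` and `digest` as well as the placeholder sequences are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeIIsEnzyme:
    name: str
    reach: int         # bases of fragment retained next to the adapter (top strand)
    overhang: int      # length of the single-stranded overhang left by the cut
    overhang_end: str  # "3'" or "5'"


# Values taken from the text: 20 bases / 2 base 3' overhang, 25 bases / 2 base 5' overhang.
MMEI = TypeIIsEnzyme("MmeI", 20, 2, "3'")
ECOP15I = TypeIIsEnzyme("EcoP15I", 25, 2, "5'")


def digest(adapter: str, fragment: str, enzyme: TypeIIsEnzyme) -> tuple[str, str]:
    """Return (retained, released) top-strand sequences after the cut.

    Assumption: the enzyme's reach is measured from the adapter/fragment junction.
    """
    tag = fragment[:enzyme.reach]
    return adapter + tag, fragment[enzyme.reach:]


# Placeholder sequences for illustration only (not real adapter or genomic sequence).
adapter = "A" * 50
fragment = "ACGT" * 15  # 60 bases of mismatch-cleaved fragment
retained_mmei, released_mmei = digest(adapter, fragment, MMEI)
retained_eco, released_eco = digest(adapter, fragment, ECOP15I)
print(len(retained_mmei), len(retained_eco))
```

In this model the adapter keeps only a 20 or 25 base tag adjacent to the mismatch cleavage site, and the rest of the fragment is released.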
  • the adapter (A) DNA fragment can now be attached to a streptavidin support for removal of excess fragment DNA. Excess adapter that did not ligate to mismatch cleaved ends will also bind to the streptavidin solid support. The new degenerate end created by the type IIs enzyme can now be ligated to adapter B through the phosphorylation of one strand of adapter B. The other strand is non-phosphorylated and blocked at the 3-prime end with a dideoxy nucleotide. The structure formed is essentially the genomic fragment of interest captured between two different adapters. To create a circle from this structure would simply require both ends of the molecule coming together and ligating.
  • An alternative and preferred strategy to maximize the efficiency of circle formation without inter-molecular ligation events occurring is to block excess (A) adapters on the surface. This can be achieved by using Lambda exonuclease to digest the lower strand. If adapter B has been attached then it will be protected from digestion because there is no 5-prime phosphate available. If only adapter A is attached to the surface then the 5-prime phosphate is exposed for degradation of the lower strand of adapter A. This will lead to loss of excess adapter A from the surface.
  • the 5-prime end of the top strand of adapter A is prepared for ligation to the 3-prime end of adapter B. This can be achieved by introducing a restriction enzyme site into the adapters so that re-circularization of the molecule can occur with ligation.
  • Amplification of DNA captured into the circular molecules proceeds by a rolling circle amplification to form long linear concatemer copies of the circle. If extension initiates 5-prime of the biotin, the circle and newly synthesized strand is released into solution. Complementary oligonucleotides on the surface are responsible for condensation and provide sufficient attachment for downstream applications.
  • One strand is a closed circle and acts as the template. The other strand, with an exposed 3-prime end, acts as an initiating primer and is extended.
  • the adapter can be prepared with a 3-prime biotin on the non-circularized strand to allow capture for buffer exchange and sample cleanup.
  • Amplification of DNA captured into the circular molecules proceeds by a rolling circle amplification to form linear concatemer copies of the circle.
  • the mis-match derived small circular DNA molecules may be amplified by other means such as PCR. Common primer binding sites can be incorporated into the adapter sequences.
  • the amplified material can be used for mutation detection by methods such as Sanger sequencing or array based sequencing.
  • the first step of this procedure will involve ligating a gene specific oligonucleotide directed to the 5-prime end with a poly dA sequence for binding to the poly dT sequence of the 3-prime end of the cDNA.
  • This oligonucleotide acts as a bridge to allow T4 DNA ligase to ligate the two ends and form a circle.
  • the second step of the reaction is to use a primer, or the bridging oligonucleotide, for a strand displacing polymerase such as Phi 29 polymerase to create a concatemer of the circle.
  • the long linear molecules will then be diluted and arrayed in 1536 well plates such that wells with single molecules can be selected. To ensure about 10% of the wells contain 1 molecule approximately 90% would have to be sacrificed as having no molecules.
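The trade-off between empty wells and single-molecule wells can be checked numerically. The sketch below assumes that molecules distribute into wells independently, so that the number of molecules per well follows a Poisson distribution; that model and the function names are assumptions made for illustration, not part of the described procedure.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a well receives exactly k molecules at the given mean loading."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def well_occupancy(mean: float, wells: int = 1536) -> dict:
    """Fractions of empty, single-molecule and multi-molecule wells on one plate."""
    empty = poisson_pmf(0, mean)
    single = poisson_pmf(1, mean)
    return {
        "empty_fraction": empty,
        "single_fraction": single,
        "multiple_fraction": 1.0 - empty - single,
        "expected_single_wells": single * wells,
    }


# Mean loading that leaves 90% of the wells empty: exp(-mean) = 0.90.
mean_loading = -math.log(0.90)
occupancy = well_occupancy(mean_loading)
for name, value in occupancy.items():
    print(f"{name}: {value:.4f}")
```

Under this model, a dilution that leaves about 90% of wells empty puts a single molecule in slightly under 10% of wells, with well under 1% of wells holding more than one molecule.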
  • To detect the wells that are positive we plan to hybridize a dendrimer that recognizes a universal sequence in the target to generate 10K-100K dye molecules per molecule of target. Excess dendrimer could be removed through hybridization to biotinylated capture oligos.
  • the wells will be analyzed with a fluorescent plate reader and the presence of DNA scored. Positive wells will then be re-arrayed to consolidate the clones into plates with complete wells for further amplification.
  • RCR rolling circle replication
  • This system provides a complete analysis of the exon pattern on a single transcript, instead of merely providing information on the ratios of exon usage or quantification of splicing events over the entire population of transcribed genes using the current expression arrays hybridized with labeled mRNA/cDNA. At the maximum limit of its sensitivity, it should be able to allow a detailed analysis down to a single molecule of an mRNA type present in only one in hundreds of other cells; this would provide unique potentials for early diagnosis of cancer cells.
  • the analysis provides simultaneously 1) detection of each specific splice variant, 2) quantification of expression of wild type and alternatively spliced mRNAs. It can also be used to monitor gross chromosomal alterations based on the detection of gene deletions and gene translocations by loss of heterozygosity and presence of two sub-sets of exons from two genes in the same transcript on a single spot on the random array.
  • the proposed splice variant profiling process is equivalent to high throughput sequencing of individual full length cDNA clones; rSBH throughput can reach one billion cDNA molecules profiled in a 4-8 hour assay.
  • This system will provide a powerful tool to monitor changes in expression levels of various splice variants during disease emergence and progression. It can enable discovery of novel splice variants or validate known splice variants to serve as biomarkers to monitor cancer progression. It can also provide a means of further understanding the roles of alternative splice variants and their possible uses as therapeutic targets. The universal nature and flexibility of this low cost and high throughput assay provide great commercial opportunities for cancer research and diagnostics and in all other biomedical areas. This high capacity system is ideal for service providing labs or companies.
  • Exon sequences will be cloned into the multiple cloning sites (MCS) of plasmid pBluescript.
  • For genes that are shorter than 1 kb, it should not be difficult to generate PCR products from cDNA using gene specific oligos for the full length sequence.
  • the easiest approach would be to generate PCR products of about 500 bp corresponding to contiguous blocks of exons and order the fragments by cloning into appropriate cloning sites in the MCS of pBluescript. This will also be the approach for cloning the alternative spliced versions, since the desired variant might not be present in the cDNA source used for PCR.
  • the last site of the MCS will be used to insert a string of 40 A's to simulate the polyA tails of cellular mRNA. This is to control for the possibility that the polyA tail might interfere with the sample preparation step described below, although it is not expected to be a problem since a poly-dA tail is actually incorporated into our standard methods for the sample preparation of genomic fragments as described in section C.
  • RNA generated will be purified with the standard methods.
  • Because probe pools are designed for specific genes, cDNA will be prepared for those specific genes only.
  • gene-specific primers will be used, therefore for 1000 genes, 1000 primers will be used.
  • the location of the priming site for the reverse transcription will be selected with care, since it is not reasonable to expect the synthesis of cDNA >2 kb to be of high efficiency. It is quite common that the last exon would consist of the end of the coding sequence and a long 3′ untranslated region. In the case of CD44 for example, although the full-length mRNA is about 5.7 kb, the 3′ UTR comprises 3 kb, while the coding region is only 2.2 kb. Therefore the logical location of the reverse transcription primer site would be immediately downstream of the end of the coding sequence. For some splice variants, the alternative exons are often clustered together as a block to create a region of variability.
  • a dideoxy-cytosine residue can be added to the 3′ end of all the cDNA to block ligation, then by using a mismatch oligo targeting the desired sequence, a new 3′ end can be generated by enzyme mismatch cleavage using T4 endonuclease VII (13, 14). With the new 3′ end, the cDNA can proceed with the addition of a poly-dA tail and with the standard protocols of circularization and replication.
  • the rolling circle replication will initiate from an oligo priming at the adaptor sequence, and the replication around the circular cDNA molecule will be carried out by Phi29 polymerase, whose high processivity allows many tandem copies to be made from circular templates.
  • the exon probe will actually consist of a pair of oligos ligated together upon hybridization to the target (see description on combinatorial probe ligation chemistry).
  • One of the pair will be selected from a library of 4096 6-mer oligos and the other will be from a library of 1024 TAMRA-labeled 5-mer oligos.
  • a software program can be developed for preparing optimized pools of 6-mer and 5-mer probes for a given set of 1000 genes and about 10,000 exons. The goal is to keep the number of individual probes in a pool that will detect 500 genes and one exon per gene to be less than 200.
  • the algorithm will consist of two main steps:
  • Step 2: Find out all 6-mers that are contiguous with all sites in all 1000 genes that are complementary to 10 selected 5-mers. On average 20 such sites will exist in each 2 kb gene. Total number of sites would be about 20,000, i.e., each 6-mer on average will occur 5 times. Sort 6-mers by the hit frequency. The most frequent may have over 20 hits, i.e., such a 6-mer will detect 20 genes through combinations with 10 labeled probes. Thus, to get a single probe pair for each of the 500 genes a minimum of 25 6-mer probes would be required. Realistically, 100 to 200 6-mers may be required.
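The 6-mer counting and sorting in Step 2 can be sketched as follows. Several details here are assumptions rather than the described method: the 6-mer probe is taken to pair with the six target bases immediately 5′ of each 5-mer site, the final choice of probes uses a simple greedy cover, and the gene sequences and function names are toy, hypothetical examples.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def sixmer_hits(genes: dict[str, str], labeled_5mers: list[str]) -> dict[str, set[str]]:
    """Map each candidate 6-mer probe to the set of genes it could detect.

    A site is a 5-base stretch of the target complementary to a labeled 5-mer.
    Assumption: the partner 6-mer pairs with the 6 target bases just 5' of the site.
    """
    sites = {reverse_complement(p) for p in labeled_5mers}
    hits: dict[str, set[str]] = defaultdict(set)
    for gene, seq in genes.items():
        for i in range(6, len(seq) - 4):
            if seq[i:i + 5] in sites:
                hits[reverse_complement(seq[i - 6:i])].add(gene)
    return dict(hits)


def select_sixmers(hits: dict[str, set[str]], target_genes: set[str]):
    """Greedy cover: repeatedly take the 6-mer that hits the most uncovered genes."""
    uncovered = set(target_genes)
    chosen = []
    while uncovered:
        best = max(hits, key=lambda p: len(hits[p] & uncovered), default=None)
        if best is None or not hits[best] & uncovered:
            break  # the remaining genes have no usable site
        chosen.append(best)
        uncovered -= hits[best]
    return chosen, uncovered


# Toy targets: each contains one CCCCC site, complementary to the labeled probe GGGGG.
genes = {
    "geneA": "TGATTGACCCCCAT",
    "geneB": "GGGATTGACCCCCT",
    "geneC": "ACGTAGCCCCCGG",
}
hits = sixmer_hits(genes, ["GGGGG"])
ranked = sorted(hits, key=lambda p: len(hits[p]), reverse=True)
chosen, uncovered = select_sixmers(hits, set(genes))
print(ranked, chosen, uncovered)
```

Sorting by hit frequency puts the 6-mer shared by two toy genes first, and the greedy pass then adds one more probe to cover the remaining gene.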
  • the profiling of exons can be performed in two phases: the gene identification phase and the exon identification phase.
  • In the gene identification phase, each concatemer on the array can be uniquely identified with a particular gene.
  • 10 probe pools or hybridization cycles will be enough to identify 1000 genes using the following scheme.
  • Each gene is assigned a unique binary code. The number of binary digits thus depends on the total number of genes: 3 digits for 8 genes, 10 digits for 1024 genes.
  • Each probe pool is designed to correspond to a digit of the binary code and would contain probes that would hit a unique combination of half of the genes and one hit per gene only. Thus for each hybridization cycle, a unique half of the genes will score a 1 for that digit and the other half will score zero.
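The binary coding scheme can be illustrated with a short sketch. Gene names, function names, and the choice to number genes from zero are assumptions for illustration; the sketch only shows how one pool per binary digit lets the pattern of scores across cycles identify a gene.

```python
def assign_codes(genes: list[str]) -> tuple[dict[str, str], int]:
    """Give each gene a unique binary code with just enough digits."""
    n_digits = max(1, (len(genes) - 1).bit_length())
    codes = {gene: format(i, f"0{n_digits}b") for i, gene in enumerate(genes)}
    return codes, n_digits


def build_pools(codes: dict[str, str], n_digits: int) -> list[list[str]]:
    """Pool d contains the genes whose code has a 1 at digit d."""
    return [[g for g, c in codes.items() if c[d] == "1"] for d in range(n_digits)]


def decode(scores: list[bool], codes: dict[str, str]):
    """Turn the 0/1 scores observed over the cycles back into a gene name."""
    key = "".join("1" if s else "0" for s in scores)
    lookup = {code: gene for gene, code in codes.items()}
    return lookup.get(key)


genes = [f"gene{i}" for i in range(1000)]  # hypothetical gene names
codes, n_digits = assign_codes(genes)
pools = build_pools(codes, n_digits)

# Simulate one spot carrying gene613: it scores 1 in every pool that contains it.
scores = [("gene613" in pool) for pool in pools]
identified = decode(scores, codes)
print(n_digits, identified)
```

With 1000 genes the pools are close to, but not exactly, half of the gene set each. Note that in this numbering the first gene has an all-zero code and would score in no cycle, so a practical design might reserve that code.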
  • Exon identification phase: After identifying each ampliot with a gene assignment, its exon pattern will be profiled in the exon identification phase. For the exon identification phase, one exon per gene in all or most of the genes is tested per hybridization cycle. In most cases 10-20 exon identification cycles should be sufficient. Thus, in the case of using 20 exon identification cycles we will obtain information of 2 probes per each of 10 exons in each gene. For genes with more than 20 exons, methods can be developed so that 2 exons per gene can be probed at the same cycle. One possibility is using multiple fluorophores of different colors, and another possibility is to exploit differential hybrid stabilities of different ligation probe pairs.
  • a total of about 40 assay cycles will provide sufficient information to obtain gene identity at each spot and to provide three matching probe-pairs for each of 10,000 exons with enough informational redundancy to provide accurate identification of missing exons due to alternative splicing or chromosomal deletions.
  • Amplicon means the product of a polynucleotide amplification reaction. That is, it is a population of polynucleotides, usually double stranded, that are replicated from one or more starting sequences. The one or more starting sequences may be one or more copies of the same sequence, or it may be a mixture of different sequences. Amplicons may be produced by a variety of amplification reactions whose products are multiple replicates of one or more target nucleic acids. Generally, amplification reactions producing amplicons are “template-driven” in that base pairing of reactants, either nucleotides or oligonucleotides, have complements in a template polynucleotide that are required for the creation of reaction products.
  • template-driven reactions are primer extensions with a nucleic acid polymerase or oligonucleotide ligations with a nucleic acid ligase.
  • Such reactions include, but are not limited to, polymerase chain reactions (PCRs), linear polymerase reactions, nucleic acid sequence-based amplification (NASBAs), rolling circle amplifications, and the like, disclosed in the following references that are incorporated herein by reference: Mullis et al, U.S. Pat. Nos. 4,683,195; 4,965,188; 4,683,202; 4,800,159 (PCR); Gelfand et al, U.S. Pat. No.
  • amplicons of the invention are produced by PCRs.
  • An amplification reaction may be a “real-time” amplification if a detection chemistry is available that permits a reaction product to be measured as the amplification reaction progresses, e.g.
  • reaction mixture means a solution containing all the necessary reactants for performing a reaction, which may include, but not be limited to, buffering agents to maintain pH at a selected level during a reaction, salts, co-factors, scavengers, and the like.
  • “Complementary or substantially complementary” refers to the hybridization or base pairing or the formation of a duplex between nucleotides or nucleic acids, such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single stranded nucleic acid.
  • Complementary nucleotides are, generally, A and T (or A and U), or C and G.
  • Two single stranded RNA or DNA molecules are said to be substantially complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand, usually at least about 90% to 95%, and more preferably from about 98 to 100%.
  • substantial complementarity exists when an RNA or DNA strand will hybridize under selective hybridization conditions to its complement.
  • selective hybridization will occur when there is at least about 65% complementary over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementary. See, M. Kanehisa Nucleic Acids Res. 12:203 (1984), incorporated herein by reference.
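The percentage thresholds above can be made concrete with a small calculation. The sketch below is a simplification: it compares two equal-length strands in antiparallel orientation without gaps, whereas the definition above allows optimal alignment with insertions or deletions. The function names and example sequences are hypothetical.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def percent_complementary(strand: str, other: str) -> float:
    """Percentage of positions that pair when two 5'->3' strands are aligned antiparallel.

    Assumption: equal lengths and no gaps (insertions/deletions are not modeled).
    """
    if not strand or len(strand) != len(other):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        (x, y) in WATSON_CRICK
        for x, y in zip(strand.upper(), reversed(other.upper()))
    )
    return 100.0 * paired / len(strand)


def is_substantially_complementary(strand: str, other: str, threshold: float = 80.0) -> bool:
    """Apply the 'at least about 80%' criterion quoted in the text."""
    return percent_complementary(strand, other) >= threshold


perfect = percent_complementary("ATGCCTGAAC", "GTTCAGGCAT")
one_mismatch = percent_complementary("ATGCCTGAAC", "GTTCAGGCAA")
print(perfect, one_mismatch)
```

A 10-base duplex with a single mismatch scores 90% here, which satisfies the 80% criterion but not the more preferred range of about 98 to 100%.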
  • Duplex means at least two oligonucleotides and/or polynucleotides that are fully or partially complementary undergo Watson-Crick type base pairing among all or most of their nucleotides so that a stable complex is formed.
  • annealing and “hybridization” are used interchangeably to mean the formation of a stable duplex.
  • Perfectly matched in reference to a duplex means that the poly- or oligonucleotide strands making up the duplex form a double stranded structure with one another such that every nucleotide in each strand undergoes Watson Crick basepairing with a nucleotide in the other strand.
  • duplex comprehends the pairing of nucleoside analogs, such as deoxyinosine, nucleosides with 2-aminopurine bases, PNAs, and the like, that may be employed.
  • a “mismatch” in a duplex between two oligonucleotides or polynucleotides means that a pair of nucleotides in the duplex fails to undergo Watson-Crick bonding.
  • Genetic locus in reference to a genome or target polynucleotide, means a contiguous subregion or segment of the genome or target polynucleotide.
  • genetic locus, or locus may refer to the position of a nucleotide, a gene, or a portion of a gene in a genome, including mitochondrial DNA, or it may refer to any contiguous portion of genomic sequence whether or not it is within, or associated with, a gene.
  • a genetic locus refers to any portion of genomic sequence, including mitochondrial DNA, from a single nucleotide to a segment of few hundred nucleotides, e.g. 100-300, in length.
  • Genetic variant means a substitution, inversion, insertion, or deletion of one or more nucleotides at genetic locus, or a translocation of DNA from one genetic locus to another genetic locus.
  • genetic variant means an alternative nucleotide sequence at a genetic locus that may be present in a population of individuals and that includes nucleotide substitutions, insertions, and deletions with respect to other members of the population.
  • insertions or deletions at a genetic locus comprises the addition or the absence of from 1 to 10 nucleotides at such locus, in comparison with the same locus in another individual of a population.
  • Hybridization refers to the process in which two single-stranded polynucleotides bind non-covalently to form a stable double-stranded polynucleotide.
  • the term “hybridization” may also refer to triple-stranded hybridization.
  • the resulting (usually) double-stranded polynucleotide is a “hybrid” or “duplex.”
  • “Hybridization conditions” will typically include salt concentrations of less than about 1M, more usually less than about 500 mM and less than about 200 mM.
  • a “hybridization buffer” is a buffered salt solution such as 5× SSPE, or the like.
  • Hybridization temperatures can be as low as 5° C., but are typically greater than 22° C., more typically greater than about 30° C., and preferably in excess of about 37° C.
  • Hybridizations are usually performed under stringent conditions, i.e. conditions under which a probe will hybridize to its target subsequence. Stringent conditions are sequence-dependent and are different in different circumstances. Longer fragments may require higher hybridization temperatures for specific hybridization. As other factors may affect the stringency of hybridization, including base composition and length of the complementary strands, presence of organic solvents and extent of base mismatching, the combination of parameters is more important than the absolute measure of any one alone. Generally, stringent conditions are selected to be about 5° C.
  • Exemplary stringent conditions include salt concentration of at least 0.01 M to no more than 1 M Na ion concentration (or other salts) at a pH 7.0 to 8.3 and a temperature of at least 25° C.
  • Conditions of 5× SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30° C. are suitable for allele-specific probe hybridizations.
  • For stringent conditions, see for example, Sambrook, Fritsch and Maniatis, “Molecular Cloning: A Laboratory Manual” 2nd Ed.
  • Hybridizing specifically to or “specifically hybridizing to” or like expressions refer to the binding, duplexing, or hybridizing of a molecule substantially to or only to a particular nucleotide sequence or sequences under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA.
  • “Ligation” means to form a covalent bond or linkage between the termini of two or more nucleic acids, e.g. oligonucleotides and/or polynucleotides, in a template-driven reaction.
  • the nature of the bond or linkage may vary widely and the ligation may be carried out enzymatically or chemically.
  • ligations are usually carried out enzymatically to form a phosphodiester linkage between a 5′ carbon of a terminal nucleotide of one oligonucleotide with 3′ carbon of another oligonucleotide.
  • Enzymatic ligation usually takes place in a ligase buffer, which is a buffered salt solution containing any required divalent cations, cofactors, and the like, for the particular ligase employed.
  • “Microarray” or “array” refers to a solid phase support having a surface, usually planar or substantially planar, which carries an array of sites containing nucleic acids, such that each member site of the array comprises identical copies of immobilized oligonucleotides or polynucleotides and is spatially defined and not overlapping with other member sites of the array; that is, the sites are spatially discrete.
  • sites of a microarray may also be spaced apart as well as discrete; that is, different sites do not share boundaries, but are separated by inter-site regions, usually free of bound nucleic acids.
  • Spatially defined hybridization sites may additionally be “addressable” in that its location and the identity of its immobilized oligonucleotide are known or predetermined, for example, prior to its use.
  • the oligonucleotides or polynucleotides are single stranded and are covalently attached to the solid phase support, usually by a 5′-end or a 3′-end.
  • oligonucleotides or polynucleotides are attached to the solid phase support non-covalently, e.g. by a biotin-streptavidin linkage, hybridization to a capture oligonucleotide that is covalently bound, and the like.
  • random array or “random microarray” refers to a microarray whose spatially discrete regions of oligonucleotides or polynucleotides are not spatially addressed. That is, the identity of the attached oligonucleotides or polynucleotides is not discernible, at least initially, from its location, but may be determined by a particular operation on the array, e.g.
  • Random microarrays are frequently formed from a planar array of microbeads, e.g. Brenner et al, Nature Biotechnology, 18: 630-634 (2000); Tulley et al, U.S. Pat. No. 6,133,043; Stuelpnagel et al, U.S. Pat. No. 6,396,995; Chee et al, U.S. Pat. No. 6,544,732; and the like.
  • mismatch means a base pair between any two of the bases A, T (or U for RNA), G, and C other than the Watson-Crick base pairs G-C and A-T.
  • the eight possible mismatches are A-A, T-T, G-G, C-C, T-G, C-A, T-C, and A-G.
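The count of eight mismatches follows from enumerating unordered base pairs and removing the two Watson-Crick pairs. The following sketch reproduces that enumeration; the function name is hypothetical.

```python
from itertools import combinations_with_replacement

# The two Watson-Crick pairs named in the definition, stored as unordered pairs.
WATSON_CRICK_PAIRS = {frozenset("GC"), frozenset("AT")}


def possible_mismatches(bases: str = "ATGC") -> list[str]:
    """List every unordered base pair that is not a Watson-Crick pair."""
    return [
        f"{x}-{y}"
        for x, y in combinations_with_replacement(bases, 2)
        if frozenset((x, y)) not in WATSON_CRICK_PAIRS
    ]


mismatches = possible_mismatches()
print(len(mismatches), mismatches)
```

Four bases give ten unordered pairs; removing G-C and A-T leaves the eight mismatches listed above (written here in a different order).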
  • “Mutation” and “polymorphism” are usually used somewhat interchangeably to mean a DNA molecule, such as a gene, that differs in nucleotide sequence from a reference DNA sequence, or wild type sequence, or normal tissue sequence, by one or more bases, insertions, and/or deletions.
  • the usage of Cotton is followed in that a mutation is understood to be any base change whether pathological to an organism or not, whereas a polymorphism is usually understood to be a base change with no direct pathological consequences.
  • Nucleoside as used herein includes the natural nucleosides, including 2′-deoxy and 2′-hydroxyl forms, e.g. as described in Kornberg and Baker, DNA Replication, 2nd Ed. (Freeman, San Francisco, 1992). “Analogs” in reference to nucleosides includes synthetic nucleosides having modified base moieties and/or modified sugar moieties, e.g. described by Scheit, Nucleotide Analogs (John Wiley, New York, 1980); Uhlman and Peyman, Chemical Reviews, 90: 543-584 (1990), or the like, with the proviso that they are capable of specific hybridization.
  • Such analogs include synthetic nucleosides designed to enhance binding properties, reduce complexity, increase specificity, and the like.
  • Polynucleotides comprising analogs with enhanced hybridization or nuclease resistance properties are described in Uhlman and Peyman (cited above); Crooke et al, Exp. Opin. Ther. Patents, 6: 855-870 (1996); Mesmaeker et al, Current Opinion in Structural Biology, 5: 343-355 (1995); and the like.
  • Exemplary types of polynucleotides that are capable of enhancing duplex stability include oligonucleotide N3′→P5′ phosphoramidates (referred to herein as “amidates”), peptide nucleic acids (referred to herein as “PNAs”), oligo-2′-O-alkylribonucleotides, polynucleotides containing C-5 propynylpyrimidines, locked nucleic acids (LNAs), and like compounds.
  • Such oligonucleotides are either available commercially or may be synthesized using methods described in the literature.
  • PCR Polymerase chain reaction
  • PCR is a reaction for making multiple copies or replicates of a target nucleic acid flanked by primer binding sites, such reaction comprising one or more repetitions of the following steps: (i) denaturing the target nucleic acid, (ii) annealing primers to the primer binding sites, and (iii) extending the primers by a nucleic acid polymerase in the presence of nucleoside triphosphates.
  • the reaction is cycled through different temperatures optimized for each step in a thermal cycler instrument.
  • a double stranded target nucleic acid may be denatured at a temperature >90° C., primers annealed at a temperature in the range 50-75° C., and primers extended at a temperature in the range 72-78° C.
  • PCR encompasses derivative forms of the reaction, including but not limited to, RT-PCR, real-time PCR, nested PCR, quantitative PCR, multiplexed PCR, and the like. Reaction volumes range from a few hundred nanoliters, e.g. 200 nL, to a few hundred μL, e.g. 200 μL.
  • Reverse transcription PCR or “RT-PCR,” means a PCR that is preceded by a reverse transcription reaction that converts a target RNA to a complementary single stranded DNA, which is then amplified, e.g. Tecott et al, U.S. Pat. No. 5,168,038, which patent is incorporated herein by reference.
  • Real-time PCR means a PCR for which the amount of reaction product, i.e. amplicon, is monitored as the reaction proceeds.
  • Nested PCR means a two-stage PCR wherein the amplicon of a first PCR becomes the sample for a second PCR using a new set of primers, at least one of which binds to an interior location of the first amplicon.
  • initial primers in reference to a nested amplification reaction mean the primers used to generate a first amplicon
  • secondary primers mean the one or more primers used to generate a second, or nested, amplicon.
  • Multiplexed PCR means a PCR wherein multiple target sequences (or a single target sequence and one or more reference sequences) are simultaneously amplified in the same reaction mixture, e.g. Bernard et al, Anal. Biochem., 273: 221-228 (1999)(two-color real-time PCR). Usually, distinct sets of primers are employed for each sequence being amplified.
  • Quantitative PCR means a PCR designed to measure the abundance of one or more specific target sequences in a sample or specimen. Quantitative PCR includes both absolute quantitation and relative quantitation of such target sequences. Quantitative measurements are made using one or more reference sequences that may be assayed separately or together with a target sequence.
  • the reference sequence may be endogenous or exogenous to a sample or specimen, and in the latter case, may comprise one or more competitor templates.
  • Typical endogenous reference sequences include segments of transcripts of the following genes: β-actin, GAPDH, β2-microglobulin, ribosomal RNA, and the like.
  • Polynucleotide or “oligonucleotide” are used interchangeably and each mean a linear polymer of nucleotide monomers. As used herein, the terms may also refer to double stranded forms. Monomers making up polynucleotides and oligonucleotides are capable of specifically binding to a natural polynucleotide by way of a regular pattern of monomer-to-monomer interactions, such as Watson-Crick type of base pairing, base stacking, Hoogsteen or reverse Hoogsteen types of base pairing, or the like, to form duplex or triplex forms. Such monomers and their internucleosidic linkages may be naturally occurring or may be analogs thereof, e.g.
  • Non-naturally occurring analogs may include PNAs, phosphorothioate internucleosidic linkages, bases containing linking groups permitting the attachment of labels, such as fluorophores, or haptens, and the like.
  • Whenever the use of an oligonucleotide or polynucleotide requires enzymatic processing, such as extension by a polymerase, ligation by a ligase, or the like, one of ordinary skill would understand that oligonucleotides or polynucleotides in those instances would not contain certain analogs of internucleosidic linkages, sugar moieties, or bases at any or some positions, when such analogs are incompatible with enzymatic reactions.
  • Polynucleotides typically range in size from a few monomeric units, e.g. 5-40, when they are usually referred to as “oligonucleotides,” to several thousand monomeric units.
  • Whenever a polynucleotide or oligonucleotide is represented by a sequence of letters (upper or lower case), such as “ATGCCTG,” it will be understood that the nucleotides are in 5′→3′ order from left to right and that “A” denotes deoxyadenosine, “C” denotes deoxycytidine, “G” denotes deoxyguanosine, and “T” denotes thymidine, “I” denotes deoxyinosine, “U” denotes uridine, unless otherwise indicated or obvious from context.
  • polynucleotides comprise the four natural nucleosides (e.g. deoxyadenosine, deoxycytidine, deoxyguanosine, deoxythymidine for DNA or their ribose counterparts for RNA) linked by phosphodiester linkages; however, they may also comprise non-natural nucleotide analogs, e.g. including modified bases, sugars, or internucleosidic linkages.
  • Primer means an oligonucleotide, either natural or synthetic, that is capable, upon forming a duplex with a polynucleotide template, of acting as a point of initiation of nucleic acid synthesis and being extended from its 3′ end along the template so that an extended duplex is formed.
  • the sequence of nucleotides added during the extension process is determined by the sequence of the template polynucleotide.
  • primers are extended by a DNA polymerase. Primers usually have a length in the range of from 9 to 40 nucleotides, or in some embodiments, from 14 to 36 nucleotides.
  • Readout means a parameter, or parameters, which are measured and/or detected that can be converted to a number or value.
  • readout may refer to an actual numerical representation of such collected or recorded data.
  • a readout of fluorescent intensity signals from a microarray is the position and fluorescence intensity of a signal being generated at each hybridization site of the microarray; thus, such a readout may be registered or stored in various ways, for example, as an image of the microarray, as a table of numbers, or the like.
  • a reference population of DNA may comprise a cDNA library or genomic library from a known cell type or tissue source.
  • a reference population of DNA may comprise a cDNA library or a genomic library derived from the tissue of a healthy individual and a test population of DNA may comprise a cDNA library or genomic library derived from the same tissue of a diseased individual.
  • Reference populations of DNA may also comprise an assembled collection of individual polynucleotides, cDNAs, genes, or exons thereof, e.g. genes or exons encoding all or a subset of known p53 variants, genes of a signal transduction pathway, or the like.
  • “Specific” or “specificity” in reference to the binding of one molecule to another molecule, such as a labeled target sequence for a probe, means the recognition, contact, and formation of a stable complex between the two molecules, together with substantially less recognition, contact, or complex formation of that molecule with other molecules.
  • “specific” in reference to the binding of a first molecule to a second molecule means that to the extent the first molecule recognizes and forms a complex with other molecules in a reaction or sample, it forms the largest number of the complexes with the second molecule. Preferably, this largest number is at least fifty percent.
  • molecules involved in a specific binding event have areas on their surfaces or in cavities giving rise to specific recognition between the molecules binding to each other.
  • specific binding examples include antibody-antigen interactions, enzyme-substrate interactions, formation of duplexes or triplexes among polynucleotides and/or oligonucleotides, receptor-ligand interactions, and the like.
  • contact in reference to specificity or specific binding means two molecules are close enough that weak noncovalent chemical interactions, such as van der Waals forces, hydrogen bonding, base-stacking interactions, ionic and hydrophobic interactions, and the like, dominate the interaction of the molecules.
  • Tm is used in reference to the “melting temperature.”
  • the melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands.
  • Other references e.g., Allawi, H. T. & SantaLucia, J., Jr., Biochemistry 36, 10581-94 (1997)
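  • As a rough illustration of the melting temperature defined above, the following Python sketch estimates Tm for a short oligonucleotide with the Wallace rule (2 °C per A or T, 4 °C per G or C). This rule of thumb is swapped in for brevity and is not the nearest-neighbor thermodynamic method of the reference cited above. The function name `wallace_tm` is hypothetical, and the estimate is only reasonable for short oligonucleotides (roughly 14 to 20 nucleotides) under typical salt conditions.

```python
def wallace_tm(oligo: str) -> int:
    """Rough Tm estimate in degrees C using the Wallace rule.

    Assumption: a short DNA oligonucleotide written with A, C, G, T only.
    This is a rule of thumb, not a nearest-neighbor calculation.
    """
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("expected a non-empty sequence of A, C, G, T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    # 2 degrees per A/T base pair, 4 degrees per G/C base pair
    return 2 * at_count + 4 * gc_count


if __name__ == "__main__":
    print(wallace_tm("ACGTACGTACGTAC"))
```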
  • sample usually means a quantity of material from a biological, environmental, medical, or patient source in which detection, measurement, or labeling of target nucleic acids is sought.
  • a specimen or culture e.g., microbiological cultures
  • a sample may include a specimen of synthetic origin.
  • Biological samples may be animal, including human, fluid, solid (e.g., stool) or tissue, as well as liquid and solid food and feed products and ingredients such as dairy items, vegetables, meat and meat by-products, and waste.
  • Biological samples may include materials taken from a patient including, but not limited to cultures, blood, saliva, cerebral spinal fluid, pleural fluid, milk, lymph, sputum, semen, needle aspirates, and the like. Biological samples may be obtained from all of the various families of domestic animals, as well as feral or wild animals, including, but not limited to, such animals as ungulates, bear, fish, rodents, etc. Environmental samples include environmental material such as surface matter, soil, water and industrial samples, as well as samples obtained from food and dairy processing instruments, apparatus, equipment, utensils, disposable and non-disposable items. These examples are not to be construed as limiting the sample types applicable to the present invention.

Abstract

Random arrays of single molecules are provided for carrying out large scale analyses, particularly of biomolecules, such as genomic DNA, cDNAs, proteins, and the like. In one aspect, arrays of the invention comprise concatemers of DNA fragments that are randomly disposed on a regular array of discrete spaced apart regions, such that substantially all such regions contain no more than a single concatemer.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of U.S. application Ser. No. 16/425,846; which is a continuation of U.S. application Ser. No. 15/442,659, filed Feb. 25, 2017, now U.S. Pat. No. 10,351,909; which is a continuation of U.S. application Ser. No. 14/714,133 filed May 15, 2015, now U.S. Pat. No. 9,650,673; which is a continuation of U.S. application Ser. No. 12/882,880, filed Sep. 15, 2010; which is a continuation of U.S. application Ser. No. 11/451,691, filed Jun. 13, 2006, now U.S. Pat. No. 8,445,194; which claims priority from U.S. provisional application Nos. 60/776,415, filed Feb. 24, 2006, 60/725,116, filed Oct. 7, 2005, and 60/690,771 filed Jun. 15, 2005, each of which is hereby incorporated by reference in its entirety.
  • GOVERNMENT INTERESTS
  • This invention was made with Government support under grant No. 1 U01AI057315-01 awarded by the National Institutes of Health. The Government has certain rights in the invention.
  • REFERENCE TO SUBMISSION OF A SEQUENCE LISTING
  • The instant application contains a Sequence Listing which has been submitted via EFS-Web and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Aug. 14, 2020, is named 092171-1204079-5004-US12_SL.txt, and is 13,173 bytes in size.
  • FIELD OF THE INVENTION
  • The present invention relates to methods and compositions for high-throughput analysis of populations of individual molecules, and more particularly, to methods and compositions related to fabrication of single molecule arrays and applications thereof, especially in high-throughput nucleic acid sequencing and genetic analysis.
  • BACKGROUND
  • Large-scale molecular analysis is central to understanding a wide range of biological phenomena related to states of health and disease both in humans and in a host of economically important plants and animals, e.g. Collins et al (2003), Nature, 422: 835-847; Hirschhorn et al (2005), Nature Reviews Genetics, 6: 95-108; National Cancer Institute, Report of Working Group on Biomedical Technology, “Recommendation for a Human Cancer Genome Project,” (February, 2005). Miniaturization has proved to be extremely important for increasing the scale and reducing the costs of such analyses, and an important route to miniaturization has been the use of microarrays of probes or analytes. Such arrays play a key role in most currently available, or emerging, large-scale genetic analysis and proteomic techniques, including those for single nucleotide polymorphism detection, copy number assessment, nucleic acid sequencing, and the like, e.g. Kennedy et al (2003), Nature Biotechnology, 21: 1233-1237; Gunderson et al (2005), Nature Genetics, 37: 549-554; Pinkel and Albertson (2005), Nature Genetics Supplement, 37: 511-517; Leamon et al (2003), Electrophoresis, 24: 3769-3777; Shendure et al (2005), Science, 309: 1728-1732; Cowie et al (2004), Human Mutation, 24: 261-271; and the like. However, the scale of microarrays currently used in such techniques still falls short of that required to meet the goals of truly low cost analyses that would make practical such operations as personal genome sequencing, environmental sequencing to use changes in complex microbial communities as an indicator of states of health, either personal or environmental, and studies that associate genomic features with complex traits, such as susceptibilities to cancer, diabetes, cardiovascular disease, and the like, e.g. Collins et al (cited above); Hirschhorn et al (cited above); Tringe et al (2005), Nature Reviews Genetics, 6: 805-814; Service (2006), Science, 311: 1544-1546.
  • Increasing the scale of analysis in array-based schemes for DNA sequencing is particularly challenging as the feature size of the array is decreased to molecular levels, since most schemes require not only a procedure for forming high density arrays, but also repeated cycles of complex biochemical steps that complicate the problems of array integrity, signal generation, signal detection, and the like, e.g. Metzker (2005), Genome Research, 15: 1767-1776; Shendure et al (2004), Nature Reviews Genetics, 5: 335-344; Weiss (1999), Science, 283: 1676-1683. Some approaches have employed high density arrays of unamplified target sequences, which present serious signal-to-noise challenges, when “sequencing by synthesis” chemistries have been used, e.g. Balasubramanian et al, U.S. Pat. No. 6,787,308. Other approaches have employed in situ amplification of randomly disposed target sequences, followed by application of “sequencing by synthesis” chemistries. Such approaches also have given rise to various difficulties, including (i) significant variability in the size of target sequence clusters, (ii) gradual loss of phase in extension steps carried out by polymerases, (iii) lack of sequencing cycle efficiency that inhibits read lengths, and the like, e.g. Kartalov et al, Nucleic Acids Research, 32: 2873-2879 (2004); Mitra et al, Anal. Biochem. 320: 55-65 (2003); Metzker (cited above).
  • In view of the above, it would be advantageous for the medical, life science, and agricultural fields if there were available molecular arrays and arraying techniques that permitted efficient and convenient analysis of large numbers of individual molecules, such as DNA fragments covering substantially an entire mammalian-sized genome, in parallel in a single analytical operation.
  • SUMMARY OF THE INVENTION
  • In one aspect, the invention provides high density single molecule arrays, methods of making and using such compositions, and kits for implementing such methods. Compositions of the invention in one form include random arrays of a plurality of different single molecules disposed on a surface, where the single molecules each comprise a macromolecular structure and at least one analyte, such that each macromolecular structure comprises a plurality of attachment functionalities that are capable of forming bonds with one or more functionalities on the surface. In one aspect, the analyte is a component of the macromolecular structure, and in another aspect, the analyte is attached to the macromolecular structure by a linkage between a unique functionality on such structure and a reactive group or attachment moiety on the analyte. In another aspect, compositions of the invention include random arrays of single molecules disposed on a surface, where the single molecules each comprise a concatemer of at least one target polynucleotide and each is attached to the surface by linkages formed between one or more functionalities on the surface and complementary functionalities on the concatemer. In another form, compositions of the invention include random arrays of single molecules disposed on a surface, where the single molecules each comprise a concatemer of at least one target polynucleotide and at least one adaptor oligonucleotide and each is attached to such surface by the formation of duplexes between capture oligonucleotides on the surface and the attachment oligonucleotides in the concatemer. 
In still another form, compositions of the invention include random arrays of single molecules disposed on a surface, where each single molecule comprises a bifunctional macromolecular structure having a unique functionality and a plurality of complementary functionalities, and where each single molecule is attached to the surface by linkages between one or more functionalities on the surface and complementary functionalities on the bifunctional macromolecular structure, the unique functionality having an orthogonal chemical reactivity with respect to the complementary functionalities and being capable of forming a covalent linkage with an analyte. In regard to the above compositions, in another aspect, such single molecules are disposed in a planar array randomly distributed onto discrete spaced apart regions having defined positions. Preferably, in this aspect, the discrete spaced apart regions each have an area that permits the capture of no more than a single molecule and each is surrounded by an inter-regional space that is substantially free of other single molecules.
  • In one aspect, the invention includes an array of polymer molecules comprising: (a) a support having a surface; and (b) a plurality of polymer molecules attached to the surface, wherein each polymer molecule has a random coil state and comprises a branched or linear structure of multiple copies of one or more linear polymeric units, such that the polymer molecule is attached to the surface within a region substantially equivalent to a projection of the random coil on the surface and randomly disposed at a density such that at least thirty percent of the polymer molecules are separately detectable. As discussed more fully below, whenever the polymer molecules are linear, in one embodiment, “substantially equivalent” in reference to the above projection means a substantially circular region with a diameter equal to the root mean square of the end-to-end distance of such linear polymer. In another embodiment, for linear or branched polymers, “substantially equivalent” means a substantially circular region having a diameter that is one half or less than the total length of the polymer; or in another embodiment one tenth or less; or in another embodiment, one hundredth or less.
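  • The root-mean-square end-to-end distance mentioned above can be sketched numerically. The Python example below assumes an ideal freely jointed chain, for which the RMS end-to-end distance equals the square root of (contour length × Kuhn length). The chain model, the function name `rms_end_to_end_nm`, and the parameter values in the usage line (about 0.6 nm per nucleotide and a Kuhn length of about 1.5 nm for single stranded DNA) are illustrative assumptions, not values taken from this document.

```python
import math


def rms_end_to_end_nm(n_units: int, unit_length_nm: float, kuhn_length_nm: float) -> float:
    """RMS end-to-end distance of an ideal freely jointed chain, in nm.

    Assumption: ideal chain statistics, i.e. <R^2> = L * b, where L is the
    contour length and b is the Kuhn length. Real concatemers on a surface
    may deviate from this (excluded volume, surface attachment, salt).
    """
    if n_units <= 0 or unit_length_nm <= 0 or kuhn_length_nm <= 0:
        raise ValueError("all arguments must be positive")
    contour_length_nm = n_units * unit_length_nm
    return math.sqrt(contour_length_nm * kuhn_length_nm)


if __name__ == "__main__":
    # Hypothetical 10,000-nucleotide single stranded concatemer.
    diameter = rms_end_to_end_nm(10_000, 0.6, 1.5)
    contour = 10_000 * 0.6
    print(f"RMS diameter {diameter:.1f} nm, ratio to contour length {diameter / contour:.4f}")
```

  • Under these assumed parameters the coil diameter is a small fraction of the total polymer length, which is the sense in which the “one tenth or less” and “one hundredth or less” embodiments describe a compact region.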
  • In another aspect, the invention includes an array of polynucleotide molecules comprising: (a) a support having a surface; and (b) a plurality of polynucleotide molecules attached to the surface, wherein each polynucleotide molecule has a random coil state and comprises a concatemer of multiple copies of a target sequence such that the polynucleotide molecule is attached to the surface within a region substantially equivalent to a projection of the random coil on the surface and randomly disposed at a density such that at least thirty percent of the polynucleotide molecules have a nearest neighbor distance of at least fifty nm.
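  • The nearest neighbor criterion above can be checked for a given set of molecule positions. The following Python sketch is a minimal brute-force check under the assumption that molecule centers are available as (x, y) coordinates in nanometers; the function name `fraction_well_separated` and the example coordinates are hypothetical.

```python
import math


def fraction_well_separated(points_nm, min_distance_nm=50.0):
    """Fraction of points whose nearest neighbor is at least min_distance_nm away.

    Assumption: points_nm is a sequence of (x, y) centers in nm. The search is
    O(n^2), which is fine for a sketch but not for millions of spots.
    """
    if len(points_nm) < 2:
        raise ValueError("need at least two points to define a nearest neighbor")
    separated = 0
    for i, p in enumerate(points_nm):
        nearest = min(math.dist(p, q) for j, q in enumerate(points_nm) if j != i)
        if nearest >= min_distance_nm:
            separated += 1
    return separated / len(points_nm)


if __name__ == "__main__":
    centers = [(0, 0), (10, 0), (200, 0), (400, 0)]  # made-up positions
    print(fraction_well_separated(centers))
```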
  • Also provided is a method of making arrays of polymer molecules, wherein each polymer molecule has a random coil or other three-dimensional state and comprises a branched or linear structure of multiple copies of one or more linear polymeric units, such that the polymer molecule is attached to the surface within a region substantially equivalent to a projection of the random coil on the surface, or a region having a size that is one half or less, one tenth or less, or one hundredth or less of the total length of the polymer, and the polymer molecules are randomly disposed at a density such that at least twenty or at least thirty percent of the polymer molecules are separately detectable.
  • In still another aspect, the invention provides an array of single molecules comprising: (a) a support having a planar surface having a regular array of discrete spaced apart regions, wherein each discrete spaced apart region has an area of less than 1 μm² and contains reactive functionalities attached thereto; and (b) a plurality of single molecules attached to the surface, wherein each single molecule comprises a macromolecular structure and at least one analyte having an attachment moiety, such that each macromolecular structure comprises a unique functionality and a plurality of attachment functionalities that are capable of forming linkages with the reactive functionalities of the discrete spaced apart regions, and such that the analyte is attached to the macromolecular structure by a linkage between the unique functionality and the attachment moiety of the analyte, wherein the plurality of single molecules are randomly disposed on the discrete spaced apart regions such that at least a majority of the discrete spaced apart regions contain only one single molecule.
  • In another aspect, the invention provides an array of polynucleotide molecules comprising: (a) a support having a surface with capture oligonucleotides attached thereto; and (b) a plurality of polynucleotide molecules attached to the surface, wherein each polynucleotide molecule comprises a concatemer of multiple copies of a target sequence and an adaptor oligonucleotide such that the polynucleotide molecule is attached to the surface by one or more complexes formed between capture oligonucleotides and adaptor oligonucleotides, the polynucleotide molecules being randomly disposed on the surface at a density such that at least a majority of the polynucleotide molecules have a nearest neighbor distance of at least fifty nm. In one embodiment of this aspect, the surface is a planar surface having an array of discrete spaced apart regions, wherein each discrete spaced apart region has a size equivalent to that of the polynucleotide molecule and contains the capture oligonucleotides attached thereto and wherein substantially all such regions have at most one of the polynucleotide molecules attached.
  • The invention further includes a method of making an array of polynucleotide molecules comprising the following steps: (a) generating a plurality of polynucleotide molecules each comprising a concatemer of a DNA fragment from a source DNA and an adaptor oligonucleotide; and (b) disposing the plurality of polynucleotide molecules onto a support having a surface with capture oligonucleotides attached thereto so that the polynucleotide molecules are fixed to the surface by one or more complexes formed between capture oligonucleotides and adaptor oligonucleotides and so that the polynucleotide molecules are randomly distributed on the surface at a density such that a majority of the polynucleotide molecules have a nearest neighbor distance of at least fifty nm, thereby forming the array of polynucleotide molecules.
  • In another aspect, the invention provides a method of determining a nucleotide sequence of a target polynucleotide, the method comprising the steps of: (a) generating a plurality of target concatemers from the target polynucleotide, each target concatemer comprising multiple copies of a fragment of the target polynucleotide and the plurality of target concatemers including a number of fragments that substantially covers the target polynucleotide; (b) forming a random array of target concatemers fixed to a surface at a density such that at least a majority of the target concatemers are optically resolvable; (c) identifying a sequence of at least a portion of each fragment in each target concatemer; and (d) reconstructing the nucleotide sequence of the target polynucleotide from the identities of the sequences of the portions of fragments of the concatemers. In one embodiment of this aspect, the step of identifying includes the steps of (a) hybridizing one or more probes from a first set of probes to the random array under conditions that permit the formation of perfectly matched duplexes between the one or more probes and complementary sequences on target concatemers; (b) hybridizing one or more probes from a second set of probes to the random array under conditions that permit the formation of perfectly matched duplexes between the one or more probes and complementary sequences on target concatemers; (c) ligating probes from the first and second sets hybridized to a target concatemer at contiguous sites; (d) identifying the sequences of the ligated first and second probes; and (e) repeating steps (a) through (d) until the sequence of the target polynucleotide can be determined from the identities of the sequences of the ligated probes.
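  • The probe ligation logic of the identifying step can be sketched in a highly simplified form. The Python example below assumes that each probe is represented by the target-strand sequence it pairs with (so strand complementarity is abstracted away), that hybridization is perfect-match only, and that two probes ligate only when their footprints abut on the target. The function name `ligated_pairs` and the example sequences are hypothetical; this illustrates the contiguity condition only and is not the procedure of the invention.

```python
def ligated_pairs(target, first_set, second_set):
    """Return (p1, p2) probe pairs that would abut on the target and ligate.

    Assumptions (for illustration only):
      - probes are written as the target-strand sequence they pair with;
      - only perfectly matched duplexes form;
      - p1 sits immediately 5' of p2 on the target, with no gap.
    Under these assumptions a pair ligates exactly when the concatenation
    p1 + p2 occurs as a substring of the target.
    """
    hits = []
    for p1 in first_set:
        for p2 in second_set:
            if p1 + p2 in target:
                hits.append((p1, p2))
    return hits


if __name__ == "__main__":
    fragment = "ACGTTGCA"  # made-up fragment sequence
    print(ligated_pairs(fragment, ["ACG", "GGG"], ["TTG", "TGC"]))
```

  • Each detected pair reveals a stretch of the fragment equal to the combined probe length; repeating the cycle with other probes accumulates overlapping stretches from which a longer sequence can be reconstructed.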
  • In another aspect, the invention includes kits for making random arrays of the invention and for implementing applications of the random arrays of the invention, particularly high-throughput analysis of one or more target polynucleotides.
  • Among other advantages, the methods of the invention provide flexibility in making and using an array of structured random arrays for more efficient haplotype and splice variant determination, analysis of multiple samples in parallel, staggered sequencing reactions to eliminate the idle time of CCD detectors, and parallel probing cycles to shorten the sequencing completion time of longer DNA fragments.
  • The present invention provides a significant advance in the microarray field by providing arrays of single molecules comprising linear and/or branched polymer structures that may incorporate or have attached target analyte molecules. In one form, such single molecules are concatemers of target polynucleotides arrayed at densities that permit efficient high resolution analysis of mammalian-sized genomes, including sequence determination of all or substantial parts of such genomes, sequence determination of tagged fragments from selected regions of multiple genomes, digital readouts of gene expression, and genome-wide assessments of copy number patterns, methylation patterns, chromosomal stability, individual genetic variation, and the like.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1A, FIG. 1B, FIG. 1C, FIG. 1D, FIG. 1E, FIG. 1F, FIG. 1G, FIG. 1H and FIG. 1I illustrate various embodiments of the methods and compositions of the invention.
  • FIGS. 2A-2B illustrate methods of circularizing genomic DNA fragments for generating concatemers of polynucleotide analytes.
  • FIG. 3 is an image of a glass surface containing a disposition of concatemers of E. coli fragments.
  • FIG. 4 is an image of concatemers derived from two different organisms that are selectively labeled using oligonucleotide probes.
  • FIG. 5 is an image of concatemers of DNA fragments that contain a degenerate base, each of which is identified by a specific ligation probe.
  • FIG. 6 is an image of concatemers of DNA fragments that contain a segment of degenerate bases, pairs of which are identified by specific probes.
  • FIG. 7 is a scheme for identifying sequence differences between reference sequences and test sequences using enzymatic mismatch detection and for constructing DNA circles therefrom.
  • FIG. 8 is another scheme for identifying sequence differences between a reference sequence and a test sequence using enzymatic mismatch detection and for constructing DNA circles therefrom.
  • FIG. 9 shows general elements of the universal nano-ball probe template single stranded DNA circle.
  • FIG. 10 illustrates three images that were overlaid with slight shifts using the MetaMorph software. The blue colored image corresponds to the result of hybridization of BrPrb3 (the adaptor probe) to the array. The red colored image, shifted slightly above the blue image, corresponds to the result of hybridization of the Ba3 probe to the array. The green colored image, shifted slightly below the blue image, corresponds to the result of hybridization of the Yp3 probe to the array. The circle denoted with ‘A’ indicates the position of one of the spots that co-hybridize with both the adaptor probe and the Ba3 probe, while the circle denoted with ‘B’ indicates the position of one of the spots that co-hybridize with both the adaptor probe and the Yp3 probe. Note: these arrays are produced by attaching DNA nano-balls without any size selection to a glass surface covered with a carpet of capture oligonucleotides. We are working on applying nano-printing or surface patterning by photochemistry technologies to produce a glass substrate containing a grid of DNA nano-ball binding sites where each site is about 0.25-0.50 micrometer in size and surrounded by 0.75 micron or 0.50 micron of surface that does not bind DNA. Only one DNA nano-ball will be able to attach to such a binding site. This will produce a regular grid of individual submicron DNA spots of similar size.
  • FIG. 11 illustrates five images that were overlaid with slight shifts using the MetaMorph software. The blue colored image corresponds to the result of hybridization of BrPrb3 (the adaptor probe) to the array. The red image corresponds to hybridization with the A-specific ligation probe pair (T1Aa9 and T1Ab9), the green image corresponds to hybridization with the C-specific ligation probe pair (T1Aa10 and T1Ab9), the yellow image corresponds to hybridization with the G-specific ligation probe pair (T1Aa11 and T1Ab9), and the cyan image corresponds to hybridization with the T-specific ligation probe pair (T1Aa12 and T1Ab9). The circle denoted with ‘A’ indicates the position of one of the spots that co-hybridize with both the adaptor probe and the A-specific ligation probe pair, and similarly for the circles denoted with ‘C’, ‘G’ and ‘T’. Note: these arrays are produced by attaching DNA nano-balls without any size selection to a glass surface covered with a carpet of capture oligonucleotides. We are working on applying nano-printing or surface patterning by photochemistry technologies to produce a glass substrate containing a grid of DNA nano-ball binding sites where each site is about 0.25-0.50 micrometer in size and surrounded by 0.75 micron or 0.50 micron of surface that does not bind DNA. Only one DNA nano-ball will be able to attach to such a binding site. This will produce a regular grid of individual submicron DNA spots of similar size.
  • FIG. 12 shows attachment of single stranded concatemers to a glass surface. An RCA-generated concatemer of a 94mer was incubated on capture-probe coated glass in the presence of a TAMRA labeled probe. Panel a, initially attached single molecule concatemers with partial attachment and extension of one molecule; Panel b, final attachment and condensation of the molecule due to hybridization to capture probes; Panel c, image showing concatemer threads on glass with no capture probes. Random arrays were imaged on our rSBH instrument.
  • FIG. 13 shows an image of randomly distributed concatemers hybridized to capture oligonucleotides. Sequences were detected with a TAMRA labeled probe to adapter sequences.
  • FIG. 14 shows a circle formation schema: Panel A. Ligation of an adapter to 5′ end of genomic fragment via universal template. Panel B. Closing of the adapter-modified fragment having 3′-polyA tail using a bridging template. Gel tests: Panel C. Preservation of DNA circles (top band) with Exonuclease V digestion. Panel D. In the presence of Phi29 DNA polymerase high molecular weight DNA molecules are observed, indicating the success of the rolling circle amplification.
  • FIG. 15 shows that PCR amplification with tailed primers (1) is followed by strand removal or strand separation and the addition of a bridging oligonucleotide (2). Circle formation proceeds utilizing the bridge and DNA ligase (3).
  • FIG. 16 shows a comparison of structured and standard random DNA arrays made by attaching RCR products. On the left is a standard random array with capture oligonucleotides spread over the entire glass slide (black bars shown at the side-view at the top-left). RCR concatemer products are randomly attached at a low spot density (bottom panel) in order to prevent co-localization of multiple DNA products per spot (blue and green concatemer chains). The spots vary in size (up to 1 um) due to
  • FIG. 17 shows that single stranded DNA is amplified and captured to a solid support through biotinylated reverse primers.
  • FIG. 18 shows that mismatches are formed along the 10 kb heteroduplex from test and reference DNA (Panel A). After cleavage each fragment can bind two adapter molecules that will release a large proportion of the genomic DNA and capture the mutated regions (Panel B).
  • FIG. 19 shows that biotinylated (b) test DNA is mixed with a reference DNA and, after heat denaturation and annealing, produces a population of biotinylated heteroduplexes and non-biotinylated homoduplexes. Heteroduplexes containing polymorphisms are attached to the surface with streptavidin (S) for isolation from reference DNA (Panel A). Mismatches are formed along the 10 kb heteroduplex from test and reference DNA (Panel B). After cleavage each fragment can bind two adapter molecules that will release a large proportion of the genomic DNA and capture the mutated regions (Panel C).
  • FIG. 20 shows production, capture and amplification of DNA mismatches. Structures A-G show various steps. (Structure A) DNA is cleaved on both sides of the mismatch. (Structure B) 5-prime overhangs are generated that can be ligated. 3′ overhangs are also created by digesting with an appropriate restriction endonuclease having a four base recognition site. (Structure C) An adapter is introduced that contains an active overhang at one side. (Structure D) An adapter is ligated to each of the two generated fragments (only ligation to the right from the 5′ phosphate after addition of sequences to the 3′ end of the top strand). The molecule is phosphorylated and a bridging oligonucleotide (Structure F) is used to ligate the two ends of the single stranded molecule. (Structure G) After circularization, a concatemer is generated by extending a primer in a RCR reaction.
  • FIG. 21 shows Method 11 for production, capture and amplification of DNA mismatches.
  • DETAILED DESCRIPTION OF THE INVENTION
  • The practice of the present invention may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and immunology, which are within the skill of the art. Such conventional techniques include polymer array synthesis, hybridization, ligation, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the example herein below. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Genome Analysis: A Laboratory Manual Series (Vols. I-IV), Using Antibodies: A Laboratory Manual, Cells: A Laboratory Manual, PCR Primer: A Laboratory Manual, and Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press), Stryer, L. (1995) Biochemistry (4th Ed.) Freeman, New York, Gait, “Oligonucleotide Synthesis: A Practical Approach” 1984, IRL Press, London, Nelson and Cox (2000), Lehninger, Principles of Biochemistry 3rd Ed., W. H. Freeman Pub., New York, N.Y. and Berg et al. (2002) Biochemistry, 5th Ed., W. H. Freeman Pub., New York, N.Y., all of which are herein incorporated in their entirety by reference for all purposes.
  • The invention provides random single molecule arrays for large-scale parallel analysis of populations of molecules, particularly DNA fragments, such as genomic DNA fragments. Generally, single molecules of the invention comprise an attachment portion and an analyte portion. The attachment portion comprises a macromolecular structure that provides for multivalent attachment to a surface, particularly a compact or restricted area on a surface so that signals generated from it or an attached analyte are concentrated. That is, the macromolecular structure occupies a compact and limited region of the surface. Macromolecular structures of the invention may be bound to a surface in a variety of ways. Multi-valent bonds may be covalent or non-covalent. Non-covalent bonds include formation of duplexes between capture oligonucleotides on the surface and complementary sequences in the macromolecular structure, and adsorption to a surface by attractive noncovalent interactions, such as van der Waals forces, hydrogen bonding, ionic and hydrophobic interactions, and the like. Multi-valent covalent bonding may be accomplished, as described more fully below, by providing reactive functionalities on the surface that can react with a plurality of complementary functionalities in the macromolecular structures. An analyte portion may be attached to a macromolecular structure by way of a unique linkage or it may form a part of, and be integral with, the macromolecular structure. Single molecules of the invention are disposed randomly on a surface of a support material, usually from a solution; thus, in one aspect, single molecules are uniformly distributed on a surface in close approximation to a Poisson distribution. In another aspect, single molecules are disposed on a surface that contains discrete spaced apart regions in which single molecules are attached. 
Preferably, macromolecular structures, preparation methods, and areas of such discrete spaced apart regions are selected so that substantially all such regions contain at most only one single molecule. Preferably, single molecules of the invention, particularly concatemers, are roughly in a random coil configuration on a surface and are confined to the area of a discrete spaced apart region. In one aspect, the discrete spaced apart regions have defined locations in a regular array, which may correspond to a rectilinear pattern, hexagonal pattern, or the like. A regular array of such regions is advantageous for detection and data analysis of signals collected from the arrays during an analysis. Also, single molecules confined to the restricted area of a discrete spaced apart region provide a more concentrated or intense signal, particularly when fluorescent probes are used in analytical operations, thereby providing higher signal-to-noise values. Single molecules of the invention are randomly distributed on the discrete spaced apart regions so that a given region usually is equally likely to receive any of the different single molecules. In other words, the resulting arrays are not spatially addressable immediately upon fabrication, but may be made so by carrying out an identification or decoding operation. That is, the identities of the single molecules are discernable, but not known. As described more fully below, in some embodiments, there are subsets of discrete spaced apart regions that receive single molecules only from corresponding subsets, for example, as defined by complementary sequences of capture oligonucleotides and adaptor oligonucleotides.
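  • The reference above to a Poisson distribution can be made concrete with a small calculation. The Python sketch below assumes purely random, independent loading of molecules onto equal-sized regions with a given mean number of molecules per region; the function name `occupancy_probabilities` and the loading value are hypothetical. It is offered only to show why unconstrained random loading leaves some regions empty and some multiply occupied, which is the situation the size-limited discrete regions described above are meant to avoid.

```python
import math


def occupancy_probabilities(mean_per_region: float) -> dict:
    """Poisson probabilities that a region holds 0, exactly 1, or 2+ molecules.

    Assumption: molecules land independently and at random, so the count per
    region is Poisson with the given mean. Size exclusion by an already
    attached molecule is deliberately NOT modeled here.
    """
    if mean_per_region < 0:
        raise ValueError("mean_per_region must be non-negative")
    p_empty = math.exp(-mean_per_region)
    p_single = mean_per_region * p_empty
    p_multiple = 1.0 - p_empty - p_single
    return {"empty": p_empty, "single": p_single, "multiple": p_multiple}


if __name__ == "__main__":
    for name, value in occupancy_probabilities(1.0).items():
        print(f"{name}: {value:.3f}")
```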
  • In one aspect the invention provides products and processes for making them. For example, in one approach, preparation of DNA detection and quantification arrays includes providing a mixture of DNA fragments of 10, 20, 50, 100 or more bases and shorter than 25, or 50, or 100, or 500, or 1000, or 2000 or 5000 or 10,000 bases from a source DNA. In embodiments, DNA arrays are formed by attaching concatemers of the same fragment or by in-situ amplification of a single DNA molecule. In embodiments, the DNA in each spot is identified by hybridization signature or partial or complete sequence determination. Some embodiments comprise RCR-based formation of DNA concatemers with or without a sequence complementary to the support-bound capture oligonucleotide. Some embodiments utilize a support with a grid of regions with DNA capture chemistry separated by surface without DNA capture chemistry, each region being 0.1-10 micrometers in size with a center-to-center distance of about 0.2 to 20 μm. In some embodiments, the source DNA is all sequence variants of a given length of 8 to 20 bases. In some embodiments, the methods comprise identifying nano-ball sequence by ligation of two adapter-dependent or adapter-independent oligonucleotides, and use individual probes or pools of probes with 0 to about 8 informative bases. In some embodiments, the invention comprises highly multiplexed DNA detection and quantification methods consisting of providing a DNA array containing more than 100,000, more than one million, or more than ten million DNA spots identified by hybridization signature or partial or complete sequence; hybridizing a target sample comprising labeled or tagged (or able to be labeled or tagged) DNA fragments under conditions allowing the formation of complementary DNA hybrids; detecting bound labels/tags or bound DNA in array spots; and analyzing data to detect and quantify DNA molecules in the sample substantially complementary to one or more DNAs on the array. 
In some embodiments, DNA arrays are prepared using RCR-based formation of DNA concatemers with or without a sequence complementary to the support-bound capture oligonucleotide. Some embodiments include a washing step before a detecting step to remove non-hybridized DNA. Some embodiments include a stringent washing step before a detecting step to remove non-hybridized DNA and DNA hybridized to targets with a larger number of mismatches. Some embodiments include performing multiple detection steps during washes of increasing stringency (for example, higher temperature or higher pH). Some embodiments include determining gene expression and/or alternative splicing; gene deletion or duplication; pathogen detection, quantification and characterization; SNP detection; mutation discovery; microbe detection and quantification in natural sources; DNA sequencing; and industrial use in agriculture, food pathogens, medical diagnostics, and cancer samples. In some embodiments, labeling or tagging of sample molecules is done after binding them to the detector molecules in the array. In one aspect the invention provides a support with spots of DNA/RNA having natural or analog bases in a grid or random spot array, with informative single stranded DNA longer than 15, or 25, or 50, or 75 or 100 or 125, or 150, or 200, or 250, or 300, or 400, or 500, or 750, or 1000 bases and more than 10,000 or 100,000 or 1 million spots per mm² containing multiple copies of the same DNA per spot, wherein more than 1000 or 10,000 or 100,000 different DNAs are present in the array and which DNA is at which spot is determined after DNA attachment. In some embodiments, more than 50, 60, 70, 80, 90 or 95% of spots in the grid have a single informative DNA species, excluding errors produced by amplification. In some embodiments the invention provides a plate with 2, 4, 6, 8, 10, 12, 16, 24, 32, 48, 64, 96, 192, 384 or more such DNA arrays, where in most cases the same DNA is in different spots in the individual arrays. 
In some embodiments, an array containing DNA fragments from multiple species (2-2000, 10-2000, 20-2000, 50-2000, 100-2000, 100-10,000, or 500-10,000 species) is provided. In some embodiments, an array contains DNA fragments that have SNP or other differences between individuals or species. In some embodiments, DNA copies per spot are produced by RCR before attachment. In some embodiments, the DNA is isolated from natural sources. In some embodiments, the identity or sequence of DNA/RNA or other detector molecule in usable spots is inferred by matching hybridization or other binding signature or partial or complete polymer sequence to a reference database of signatures or sequences.
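The idea of inferring the identity of a spot by matching its hybridization signature to a reference database can be sketched as follows. This is a toy model under stated assumptions: each probe is written as the site it detects, a signature is a tuple of 0/1 hits, and the names `signature`, `decode_spot`, and the example sequences are hypothetical.

```python
def signature(sequence, probes):
    """Binary signature: 1 if the site detected by a probe occurs in the sequence."""
    return tuple(int(probe in sequence) for probe in probes)

def decode_spot(observed, reference, probes):
    """Return the reference entry whose expected signature has the fewest mismatches."""
    def mismatches(name):
        expected = signature(reference[name], probes)
        return sum(a != b for a, b in zip(observed, expected))
    return min(reference, key=mismatches)

# Hypothetical probe set and reference database of known fragments.
probes = ["ACG", "TTA", "GGC", "CAT"]
reference = {"frag_A": "ACGTTAGG", "frag_B": "GGCATCAT", "frag_C": "TTACATAA"}

observed = (0, 0, 1, 0)  # one probe signal assumed lost to noise
print(decode_spot(observed, reference, probes))
```

Choosing the closest signature rather than requiring an exact match is one simple way to tolerate a missed or spurious probe signal; a real decoding scheme would need many more probes than this example uses.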
  • Described herein are DNA/RNA and their derivatives or peptides or protein and other array products, including processes for their preparation and uses, that are based on applying mixtures of detecting molecules of partially or fully known primary structure or polymer sequence, preferably as concatemers of the same molecule, on substrates with a pattern of high density small binding sites separated by non-binding surface, followed by determining which detecting molecule from the mixture is attached at which binding site.
  • Macromolecular structures of the invention comprise polymers, either branched or linear, and may be synthetic, e.g. branched DNA, or may be derived from natural sources, e.g. linear DNA fragments from a patient's genomic DNA. Usually, macromolecular structures comprise concatemers of linear single stranded DNA fragments that can be synthetic, derived from natural sources, or can be a combination of both. As used herein, the term “target sequence” refers to either a synthetic nucleic acid or a nucleic acid derived from a natural source, such as a patient specimen, or the like. Usually, target sequences are part of a concatemer generated by methods of the invention, e.g. by RCR, but may also be part of other structures, such as dendrimers, and other branched structures. When target sequences are synthetic or derived from natural sources, they are usually replicated by various methods in the process of forming macromolecular structures or single molecules of the invention. It is understood that such methods can introduce errors into copies, which nonetheless are encompassed by the term “target sequence.”
  • Particular features or components of macromolecular structures may be selected to satisfy a variety of design objectives in particular embodiments. For example, in some embodiments, it may be advantageous to maintain an analyte molecule as far from the surface as possible, e.g. by providing an inflexible molecular spacer as part of a unique linkage. As another example, reactive functionalities may be selected as having a size that effectively prevents attachment of multiple macromolecular structures to one discrete spaced apart region. As still another example, macromolecular structures may be provided with other functionalities for a variety of other purposes, e.g. enhancing solubility, promoting formation of secondary structures via hydrogen bonding, and the like.
  • In one aspect, macromolecular structures are sufficiently large that their size, e.g. a linear dimension (such as a diameter) of a volume occupied in a conventional physiological saline solution, is approximately equivalent to that of a discrete spaced apart region. For macromolecular structures that are linear polynucleotides, in one aspect, sizes may range from a few thousand nucleotides, e.g. 10,000, to several hundred thousand nucleotides, e.g. 100-200 thousand. As explained more fully below, in several embodiments, such macromolecular structures are made by generating circular DNAs and then replicating them in a rolling circle replication reaction to form concatemers of complements of the circular DNAs.
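To give a feel for how the linear dimension of a single stranded polynucleotide in solution scales with its length, the sketch below uses the ideal (freely jointed) chain estimate of the root mean square end-to-end distance. The contour length per nucleotide (0.6 nm) and Kuhn length (1.5 nm) are assumed illustrative values, not figures from this disclosure, and real values depend on buffer conditions.

```python
import math

def random_coil_diameter_nm(n_nucleotides, nm_per_nt=0.6, kuhn_length_nm=1.5):
    """RMS end-to-end distance of an ideal chain: b * sqrt(L / b).

    nm_per_nt and kuhn_length_nm are assumed parameters for single stranded DNA.
    """
    contour_nm = n_nucleotides * nm_per_nt
    segments = contour_nm / kuhn_length_nm
    return kuhn_length_nm * math.sqrt(segments)

for n in (10_000, 100_000, 200_000):
    print(n, "nt ->", round(random_coil_diameter_nm(n)), "nm")
```

The useful point is the square-root scaling: a tenfold longer concatemer is only about three times wider, so molecule length can be tuned to match a chosen region size.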
  • The above concepts are illustrated more fully in the embodiments shown schematically in FIGS. 1A-1G. After describing these Figures, elements of the invention are disclosed in additional detail and examples are given. As mentioned above, in one aspect, macromolecular structures of the invention are single stranded polynucleotides comprising concatemers of a target sequence or fragment. In particular, such polynucleotides may be concatemers of a target sequence and an adaptor oligonucleotide. For example, source nucleic acid (1000) is treated (1001) to form single stranded fragments (1006), preferably in the range of from 50 to 600 nucleotides, and more preferably in the range of from 300 to 600 nucleotides, which are then ligated to adaptor oligonucleotides (1004) to form a population of adaptor-fragment conjugates (1002). Source nucleic acid (1000) may be genomic DNA extracted from a sample using conventional techniques, or a cDNA or genomic library produced by conventional techniques, or synthetic DNA, or the like. Treatment (1001) usually entails fragmentation by a conventional technique, such as chemical fragmentation, enzymatic fragmentation, or mechanical fragmentation, followed by denaturation to produce single stranded DNA fragments. Adaptor oligonucleotides (1004), in this example, are used to form (1008) a population (1010) of DNA circles by the method illustrated in FIG. 2A. In one aspect, each member of population (1010) has an adaptor with an identical primer binding site and a DNA fragment from source nucleic acid (1000). The adapter also may have other functional elements including, but not limited to, tagging sequences, attachment sequences, palindromic sequences, restriction sites, functionalization sequences, and the like. In other embodiments, classes of DNA circles may be created by providing adaptors having different primer binding sites. 
After DNA circles (1010) are formed, a primer and rolling circle replication (RCR) reagents may be added to generate (1011) in a conventional RCR reaction a population (1012) of concatemers (1015) of the complements of the adaptor oligonucleotide and DNA fragments, which population can then be isolated using conventional separation techniques. Alternatively, RCR may be implemented by successive ligation of short oligonucleotides, e.g. 6-mers, from a mixture containing all possible sequences, or if circles are synthetic, a limited mixture of oligonucleotides having selected sequences for circle replication. Concatemers may also be generated by ligation of target DNA in the presence of a bridging template DNA complementary to both the beginning and the end of the target molecule. A population of different target DNAs may be converted into concatemers by a mixture of corresponding bridging templates. Isolated concatemers (1014) are then disposed (1016) onto support surface (1018) to form a random array of single molecules. Attachment may also include wash steps of varying stringencies to remove incompletely attached single molecules or other reagents present from earlier preparation steps whose presence is undesirable or that are nonspecifically bound to surface (1018). Concatemers (1020) can be fixed to surface (1018) by a variety of techniques, including covalent attachment and non-covalent attachment. In one embodiment, surface (1018) may have attached capture oligonucleotides that form complexes, e.g. double stranded duplexes, with a segment of the adaptor oligonucleotide, such as the primer binding site or other elements. In other embodiments, capture oligonucleotides may comprise oligonucleotide clamps, or like structures, that form triplexes with adaptor oligonucleotides, e.g. Gryaznov et al, U.S. Pat. No. 5,473,060. 
In another embodiment, surface (1018) may have reactive functionalities that react with complementary functionalities on the concatemers to form a covalent linkage, e.g. by way of the same techniques used to attach cDNAs to microarrays, e.g. Smirnov et al (2004), Genes, Chromosomes & Cancer, 40: 72-77; Beaucage (2001), Current Medicinal Chemistry, 8: 1213-1244, which are incorporated herein by reference. Long DNA molecules, e.g. several hundred nucleotides or larger, may also be efficiently attached to hydrophobic surfaces, such as a clean glass surface that has a low concentration of various reactive functionalities, such as -OH groups. Concatemers of DNA fragments may be further amplified in situ after disposition on a surface. For example, after disposition, a concatemer may be cleaved by reconstituting a restriction site in adaptor sequences by hybridization of an oligonucleotide, after which the fragments are circularized as described below and amplified in situ by an RCR reaction.
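The relationship between a DNA circle and its RCR product can be pictured with a short string model: the concatemer is a tandem repeat of the complement of the adaptor-fragment circle. This is only a sketch; the sequences are made up, the circle is linearized at an arbitrary point, and the priming position and copy number are not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complementary strand of seq, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def rcr_concatemer(adaptor, fragment, copies):
    """Model an RCR product as tandem copies of the complement of the circle."""
    circle = adaptor + fragment  # circle opened at an arbitrary position
    return reverse_complement(circle) * copies

# Hypothetical adaptor and target fragment, replicated three times.
product = rcr_concatemer("AACC", "GATTACA", 3)
print(product)
```

Because every repeat unit carries the complement of the adaptor, each concatemer presents many identical sites for capture oligonucleotides and probes, which is what makes multivalent attachment possible.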
  • FIG. 1B illustrates a section (1102) of a surface of a random array of single molecules, such as single stranded polynucleotides. Such molecules under conventional conditions (a conventional DNA buffer, e.g. TE, SSC, SSPE, or the like, at room temperature) form random coils that roughly fill a spherical volume in solution having a diameter of from about 100 to 300 nm, which depends on the size of the DNA and buffer conditions, in a manner well known in the art, e.g. Edvinsson, “On the size and shape of polymers and polymer complexes,” Dissertation 696 (University of Uppsala, 2002). One measure of the size of a random coil polymer, such as single stranded DNA, is a root mean square of the end-to-end distance, which is roughly a measure of the diameter of the randomly coiled structure. Such diameter, referred to herein as a “random coil diameter,” can be measured by light scatter, using instruments, such as a Zetasizer Nano System (Malvern Instruments, UK), or like instrument. Additional size measures of macromolecular structures of the invention include molecular weight, e.g. in Daltons, and total polymer length, which in the case of a branched polymer is the sum of the lengths of all its branches. Upon attachment to a surface, depending on the attachment chemistry, density of linkages, the nature of the surface, and the like, single stranded polynucleotides fill a flattened spheroidal volume that on average is bounded by a region (1107) defined by dashed circles (1108) having a diameter (1110), which is approximately equivalent to the diameter of a concatemer in random coil configuration. Stated another way, in one aspect, macromolecular structures, e.g. concatemers, and the like, are attached to surface (1102) within a region that is substantially equivalent to a projection of its random coil state onto surface (1102), for example, as illustrated by dashed circles (1108). 
An area occupied by a macromolecular structure can vary, so that in some embodiments, an expected area may be within the range of from 2-3 times the area of projection (1108) to some fraction of such area, e.g. 25-50 percent. As mentioned elsewhere, preserving the compact form of the macromolecular structure on the surface allows a more intense signal to be produced by probes, e.g. fluorescently labeled oligonucleotides, specifically directed to components of a macromolecular structure or concatemer. The size of diameter (1110) of regions (1107) and distance (1106) to the nearest neighbor region containing a single molecule are two quantities of interest in the fabrication of arrays. A variety of distance metrics may be employed for measuring the closeness of single molecules on a surface, including center-to-center distance of regions (1107), edge-to-edge distance of regions (1107), and the like. Usually, center-to-center distances are employed herein. The selection of these parameters in fabricating arrays of the invention depends in part on the signal generation and detection systems used in the analytical processes. Generally, densities of single molecules are selected that permit at least twenty percent, or at least thirty percent, or at least forty percent, or at least a majority of the molecules to be resolved individually by the signal generation and detection systems used. In one aspect, a density is selected that permits at least seventy percent of the single molecules to be individually resolved. In one aspect, whenever scanning electron microscopy is employed, for example, with molecule-specific probes having gold nanoparticle labels, e.g. Nie et al (2006), Anal. 
Chem., 78: 1528-1534, which is incorporated by reference, a density is selected such that at least a majority of single molecules have a nearest neighbor distance of 50 nm or greater; and in another aspect, such density is selected to ensure that at least seventy percent of single molecules have a nearest neighbor distance of 100 nm or greater. In another aspect, whenever optical microscopy is employed, for example with molecule-specific probes having fluorescent labels, a density is selected such that at least a majority of single molecules have a nearest neighbor distance of 200 nm or greater; and in another aspect, such density is selected to ensure that at least seventy percent of single molecules have a nearest neighbor distance of 200 nm or greater. In still another aspect, whenever optical microscopy is employed, for example with molecule-specific probes having fluorescent labels, a density is selected such that at least a majority of single molecules have a nearest neighbor distance of 300 nm or greater; and in another aspect, such density is selected to ensure that at least seventy percent of single molecules have a nearest neighbor distance of 300 nm or greater, or 400 nm or greater, or 500 nm or greater, or 600 nm or greater, or 700 nm or greater, or 800 nm or greater. In still another embodiment, whenever optical microscopy is used, a density is selected such that at least a majority of single molecules have a nearest neighbor distance of at least twice the minimal feature resolution power of the microscope. In another aspect, polymer molecules of the invention are disposed on a surface so that the density of separately detectable polymer molecules is at least 1000 per μm², or at least 10,000 per μm², or at least 100,000 per μm².
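The link between surface density and nearest neighbor distance can be illustrated with a minimal sketch, assuming the molecules behave as an ideal two-dimensional Poisson point pattern (point-like molecules, no exclusion between neighbors). Under that assumption the probability that a molecule has no neighbor within distance r is exp(-density × π × r²). The function names are hypothetical.

```python
import math

def fraction_with_neighbor_beyond(density_per_um2, min_distance_nm):
    """Fraction of molecules whose nearest neighbor is at least min_distance_nm away."""
    r_um = min_distance_nm / 1000.0
    return math.exp(-density_per_um2 * math.pi * r_um ** 2)

def max_density_per_um2(min_distance_nm, fraction_resolved):
    """Highest density at which the requested fraction keeps that neighbor distance."""
    r_um = min_distance_nm / 1000.0
    return -math.log(fraction_resolved) / (math.pi * r_um ** 2)

# Example: a majority at 200 nm or greater, and seventy percent at 300 nm or greater.
for distance_nm, fraction in ((200, 0.5), (300, 0.7)):
    density = max_density_per_um2(distance_nm, fraction)
    print(f"{fraction:.0%} at >= {distance_nm} nm: {density:.2f} molecules per square micron")
```

This model covers only purely random placement; arrays with discrete spaced apart regions set the neighbor distance by design rather than by dilution.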
  • In another aspect of the invention, illustrated for a particular embodiment in FIG. 1C, the requirement of selecting densities of randomly disposed single molecules to ensure desired nearest neighbor distances is obviated by providing on a surface discrete spaced apart regions that are substantially the sole sites for attaching single molecules. That is, in such embodiments the regions on the surface between the discrete spaced apart regions, referred to herein as “inter-regional areas,” are inert in the sense that concatemers, or other macromolecular structures, do not bind to such regions. In some embodiments, such inter-regional areas may be treated with blocking agents, e.g. DNAs unrelated to concatemer DNA, other polymers, and the like. As in FIG. 1A, source nucleic acids (1000) are fragmented and adaptored (1002) for circularization (1010), after which concatemers are formed by RCR (1012). Isolated concatemers (1014) are then applied to surface (1120) that has a regular array of discrete spaced apart regions (1122) that each have a nearest neighbor distance (1124) that is determined by the design and fabrication of surface (1120). As described more fully below, arrays of discrete spaced apart regions (1122) having micron and submicron dimensions for derivatizing with capture oligonucleotides or reactive functionalities can be fabricated using conventional semiconductor fabrication techniques, including electron beam lithography, nano imprint technology, photolithography, and the like. Generally, the area of discrete spaced apart regions (1122) is selected, along with attachment chemistries, macromolecular structures employed, and the like, to correspond to the size of single molecules of the invention so that when single molecules are applied to surface (1120) substantially every region (1122) is occupied by no more than one single molecule. 
The likelihood of having only one single molecule per discrete spaced apart region may be increased by selecting a density of reactive functionalities or capture oligonucleotides that results in fewer such moieties than their respective complements on single molecules. Thus, a single molecule will “occupy” all linkages to the surface at a particular discrete spaced apart region, thereby reducing the chance that a second single molecule will also bind to the same region. In particular, in one embodiment, substantially all the capture oligonucleotides in a discrete spaced apart region hybridize to adaptor oligonucleotides of a single macromolecular structure. In one aspect, a discrete spaced apart region contains a number of reactive functionalities or capture oligonucleotides that is from about ten percent to about fifty percent of the number of complementary functionalities or adaptor oligonucleotides of a single molecule. The length and sequence(s) of capture oligonucleotides may vary widely, and may be selected in accordance with well-known principles, e.g. Wetmur, Critical Reviews in Biochemistry and Molecular Biology, 26: 227-259 (1991); Britten and Davidson, chapter 1 in Hames et al, editors, Nucleic Acid Hybridization: A Practical Approach (IRL Press, Oxford, 1985). In one aspect, the lengths of capture oligonucleotides are in a range of from 6 to 30 nucleotides, and in another aspect, within a range of from 8 to 30 nucleotides, or from 10 to 24 nucleotides. Lengths and sequences of capture oligonucleotides are selected (i) to provide effective binding of macromolecular structures to a surface, so that losses of macromolecular structures are minimized during steps of analytical operations, such as washing, etc., and (ii) to avoid interference with analytical operations on analyte molecules, particularly when analyte molecules are DNA fragments in a concatemer. 
In regard to (i), in one aspect, sequences and lengths are selected to provide duplexes between capture oligonucleotides and their complements that are sufficiently stable so that they do not dissociate in a stringent wash. In regard to (ii), if DNA fragments are from a particular species of organism, then databases, when available, may be used to screen potential capture sequences that may form spurious or undesired hybrids with DNA fragments. Other factors in selecting sequences for capture oligonucleotides are similar to those considered in selecting primers, hybridization probes, oligonucleotide tags, and the like, for which there is ample guidance, as evidenced by the references cited below in the Definitions section. In some embodiments, a discrete spaced apart region may contain more than one kind of capture oligonucleotide, and each different capture oligonucleotide may have a different length and sequence. In one aspect of embodiments employing regular arrays of discrete spaced apart regions, sequences of capture oligonucleotides are selected so that capture oligonucleotides at nearest neighbor regions have different sequences. In a rectilinear array, such configurations are achieved by rows of alternating sequence types. In other embodiments, a surface may have a plurality of subarrays of discrete spaced apart regions wherein each different subarray has capture oligonucleotides with distinct nucleotide sequences different from those of the other subarrays. A plurality of subarrays may include 2 subarrays, or 4 or fewer subarrays, or 8 or fewer subarrays, or 16 or fewer subarrays, or 32 or fewer subarrays, or 64 or fewer subarrays. In still other embodiments, a surface may include 5000 or fewer subarrays. In one aspect, capture oligonucleotides are attached to the surface of an array by a spacer molecule, e.g. 
polyethylene glycol, or like inert chain, as is done with microarrays, in order to minimize undesired effects of surface groups or interactions with the capture oligonucleotides or other reagents.
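The ten-to-fifty-percent guideline for capture oligonucleotides per region can be turned into a quick calculation. The sketch below assumes a concatemer built from repeating fragment-plus-adaptor units; the 50-nucleotide adaptor length and the function names are hypothetical values chosen for illustration.

```python
def adaptor_copies(concatemer_nt, fragment_nt, adaptor_nt):
    """Number of adaptor copies in a concatemer of repeating fragment+adaptor units."""
    return concatemer_nt // (fragment_nt + adaptor_nt)

def capture_oligo_range(copies, low=0.10, high=0.50):
    """Capture oligonucleotides per region at 10 to 50 percent of the adaptor copies."""
    return round(copies * low), round(copies * high)

# Assumed example: 100,000-nucleotide concatemer, 450-nt fragment, 50-nt adaptor.
copies = adaptor_copies(concatemer_nt=100_000, fragment_nt=450, adaptor_nt=50)
print(copies, capture_oligo_range(copies))
```

Keeping the number of capture sites below the number of adaptor copies means one bound concatemer can occupy every site in a region, leaving none free for a second molecule.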
  • In one aspect, the area of discrete spaced apart regions (1122) is less than 1 μm²; and in another aspect, the area of discrete spaced apart regions (1122) is in the range of from 0.04 μm² to 1 μm²; and in still another aspect, the area of discrete spaced apart regions (1122) is in the range of from 0.2 μm² to 1 μm². In another aspect, when discrete spaced apart regions are approximately circular or square in shape so that their sizes can be indicated by a single linear dimension, the size of such regions is in the range of from 125 nm to 250 nm, or in the range of from 200 nm to 500 nm. In one aspect, center-to-center distances of nearest neighbors of regions (1122) are in the range of from 0.25 μm to 20 μm; and in another aspect, such distances are in the range of from 1 μm to 10 μm, or in the range of from 50 to 1000 nm. In one aspect, regions (1120) may be arranged on surface (1018) in virtually any pattern in which regions (1122) have defined locations, i.e. in any regular array, which makes signal collection and data analysis functions more efficient. Such patterns include, but are not limited to, concentric circles of regions (1122), spiral patterns, rectilinear patterns, hexagonal patterns, and the like. Preferably, regions (1122) are arranged in a rectilinear or hexagonal pattern.
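For a regular array, the number of regions per unit area follows directly from the center-to-center distance and the pattern. The sketch below compares rectilinear and hexagonal packing; the function name `regions_per_mm2` and the example pitches are illustrative.

```python
import math

def regions_per_mm2(pitch_um, pattern="rectilinear"):
    """Regions per square millimeter for a given center-to-center pitch in microns."""
    per_um2 = 1.0 / pitch_um ** 2
    if pattern == "hexagonal":
        # Hexagonal packing fits 2/sqrt(3) times more sites at the same pitch.
        per_um2 *= 2.0 / math.sqrt(3.0)
    elif pattern != "rectilinear":
        raise ValueError(f"unknown pattern: {pattern}")
    return per_um2 * 1e6

for pitch in (0.25, 1.0, 10.0):
    rect = regions_per_mm2(pitch)
    hexa = regions_per_mm2(pitch, "hexagonal")
    print(f"pitch {pitch} um: rectilinear {rect:,.0f}, hexagonal {hexa:,.0f} per mm2")
```

At the same nearest neighbor distance a hexagonal pattern holds about 15 percent more regions than a rectilinear one, at the cost of a slightly less simple image grid.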
  • As illustrated in FIG. 1D, in certain embodiments, DNA circles prepared from source nucleic acid (1200) need not include an adaptor oligonucleotide. As before, source nucleic acid (1200) is fragmented and denatured (1202) to form a population of single strand fragments (1204), preferably in the size range of from about 50 to 600 nucleotides, and more preferably in the size range of from about 300 to 600 nucleotides, after which they are circularized in a non-template driven reaction with circularizing ligase, such as CircLigase (Epicentre Biotechnologies, Madison, Wis.), or the like. After formation of DNA circles (1206), concatemers are generated by providing a mixture of primers that bind to selected sequences. The mixture of primers may be selected so that only a subset of the total number of DNA circles (1206) generate concatemers. After concatemers are generated (1208), they are isolated and applied to surface (1210) to form a random array of the invention.
  • As mentioned above, single molecules of the invention comprise an attachment portion and an analyte portion such that the attachment portion comprises a macromolecular structure that provides multivalent attachment of the single molecule to a surface. As illustrated in FIG. 1E, macromolecular structures may be concatemers made by an RCR reaction in which the DNA circles in the reaction are synthetic. An analyte portion of a single molecule is then attached by way of a unique functionality on the concatemer. Synthetic DNA circles of virtually any sequence can be produced using well-known techniques, conveniently, in sizes up to several hundred nucleotides, e.g. 200, and with more difficulty, in sizes of many hundreds of nucleotides, e.g. up to 500, e.g. Kool, U.S. Pat. No. 5,426,180; Dolinnaya et al (1993), Nucleic Acids Research, 21: 5403-5407; Rubin et al (1995), Nucleic Acids Research, 23: 3547-3553; and the like, which are incorporated herein by reference. Synthetic DNA circles (1300) that comprise primer binding sites (1301) are combined with primer (1302) in an RCR reaction (1306) to produce concatemers (1308). Usually, in this embodiment, all circles have the same sequence, although different sequences can be employed, for example, for directing subsets of concatemers to preselected regions of an array via complementary attachment moieties, such as adaptor sequences and capture oligonucleotides. Primer (1302) is synthesized with a functionality (1304, designated as “R”) at its 5′ end that is capable of reacting with a complementary functionality on an analyte to form a covalent linkage. Exemplary functionalities include amino groups, sulfhydryl groups, and the like, that can be attached with commercially available chemistries (e.g. Glen Research). 
Concatemers (1308) are applied to surface (1310) to form an array (1314), after which analytes (1312) having an attachment moiety are applied to array (1310) where a linkage is formed with a concatemer by reaction of unique functionalities, R (1311) and attachment moiety (1312). Alternatively, prior to application to array (1310), concatemers (1308) may be combined with analytes (1312) so that attachment moieties and unique functionalities can react to form a linkage, after which the resulting conjugate is applied to array (1310). There is abundant guidance in the literature in selecting appropriate attachment moieties and unique functionalities for linking concatemers (1308) and many classes of analyte. In one aspect, for linking protein or peptide analytes to concatemers, many homo- and heterobifunctional reagents are available commercially (e.g. Pierce) and are disclosed in references such as Hermanson, Bioconjugate Techniques (Academic Press, New York, 1996), which is incorporated by reference. For example, whenever the unique functionality is an amino group, then concatemers (1308) can be linked to a sulfhydryl group on an analyte using N-succinimidyl 3-(2-pyridyldithio)propionate (SPDP), succinimidyloxycarbonyl-α-methyl-α-(2-pyridyldithio)toluene (SMPT), succinimidyl-4-(N-maleimidomethyl)cyclohexane-1-carboxylate (SMCC), m-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS), N-succinimidyl(4-iodoacetyl)aminobenzoate (SIAB), succinimidyl 6-((iodoacetyl)amino)hexanoate (SIAX), and like reagents. Suitable complementary functionalities on analytes include amino groups, sulfhydryl groups, and carbonyl groups, which may occur naturally on analytes or may be added by reaction with a suitable homo- or heterobifunctional reagent. Analyte molecules may also be attached to macromolecular structures by way of non-covalent linkages, such as biotin-streptavidin linkages, the formation of complexes, e.g. 
duplexes, between a first oligonucleotide attached to a concatemer and a complementary oligonucleotide attached to, or forming part of, an analyte, or like linkages. Analytes include biomolecules, such as nucleic acids, for example, DNA or RNA fragments, polysaccharides, proteins, and the like.
  • As mentioned above, macromolecular structures of the invention may comprise branched polymers as well as linear polymers, such as concatemers of DNA fragments. Exemplary branched polymer structures are illustrated in FIGS. 1F and 1G. In FIG. 1F, a branched DNA structure is illustrated that comprises a backbone polynucleotide (1400) and multiple branch polynucleotides (1402) each connected to backbone polynucleotide (1400) by their 5′ ends to form a comb-like structure that has all 3′ ends, except for a single 5′ end (1404) on backbone polynucleotide (1400), which is derivatized to have a unique functionality. As mentioned below, such unique functionality may be a reactive chemical group, e.g. a protected or unprotected amine, sulfhydryl, or the like, or it may be an oligonucleotide having a unique sequence for capturing an analyte having an oligonucleotide with a complementary sequence thereto. Likewise, such unique functionality may be a capture moiety, such as biotin, or the like. Such branched DNA structures are synthesized using known techniques, e.g. Gryaznov, U.S. Pat. No. 5,571,677; Urdea et al, U.S. Pat. No. 5,124,246; Seeman et al, U.S. Pat. No. 6,255,469; and the like, which are incorporated herein by reference. Whenever such macromolecular structures are polynucleotides, the sequences of components thereof may be selected for facile self-assembly, or they may be linked by way of specialized linking chemistries, e.g. as disclosed below, in which case sequences are selected based on other factors, including, in some embodiments, avoidance of self-annealing, facile binding to capture oligonucleotides on a surface, and the like. In FIG. 1G, a dendrimer structure is illustrated that comprises oligonucleotide (1406), which is derivatized with multiple tri-valent linking groups (1408) that each have two functionalities (1410, designated by “R”) by which additional polymers (1407), e.g. 
polynucleotides, can be attached to form a linkage to oligonucleotide (1406) thereby forming macromolecular structure (1409), which, in turn, if likewise derivatized with multivalent linkers, can form a nucleic acid dendrimer. Trivalent linkers (1408) for use with oligonucleotides are disclosed in Iyer et al, U.S. Pat. No. 5,916,750, which is incorporated herein by reference. As illustrated in FIG. 1H, once such dendrimeric or branched structures (1411) are constructed, they can be attached to array (1420) as described above for linear polynucleotides, after which analytes (1430) can be attached via unique functionalities (1410). Optionally, unreacted unique functionalities (1422) may be capped using conventional techniques. Alternatively, dendrimeric or branched structures (1411) may be combined with analytes (1430) first, e.g. in solution, so that conjugates are formed, and then the conjugates are disposed on array (1420). When the analyte is a polynucleotide (1440) with a free 3′ end, as shown in FIG. 1I, such end may be extended in an in situ RCR reaction to form either concatemers of target sequences or other sequences for further additions. Likewise, polynucleotide analytes may be extended by ligation using conventional techniques.
  • Source Nucleic Acids and Circularization of Target Sequences
  • In one aspect of the invention, macromolecular structures comprise concatemers of polynucleotide analytes, i.e. target sequences, which are extracted or derived from a sample, such as genomic DNA or cDNAs from a patient, an organism of economic interest, or the like. Random arrays of the invention comprising such single molecules are useful in providing genome-wide analyses, including sequence determination, SNP measurement, allele quantitation, copy number measurements, and the like. For mammalian-sized genomes, preferably fragmentation is carried out in at least two stages, a first stage to generate a population of fragments in a size range of from about 100 kilobases (Kb) to about 250 kilobases, and a second stage, applied separately to each 100-250 Kb fragment, to generate fragments in the size range of from about 50 to 600 nucleotides, and more preferably in the range of from about 300 to 600 nucleotides, for generating concatemers for a random array. In some aspects of the invention, the first stage of fragmentation may also be employed to select a predetermined subset of such fragments, e.g. fragments containing genes that encode proteins of a signal transduction pathway, or the like. The amount of genomic DNA required for constructing arrays of the invention can vary widely. In one aspect, for mammalian-sized genomes, fragments are generated from at least 10 genome-equivalents of DNA; and in another aspect, fragments are generated from at least 30 genome-equivalents of DNA; and in another aspect, fragments are generated from at least 60 genome-equivalents of DNA.
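  • The fragment-size ranges and genome-equivalent amounts described above can be put into rough numbers. The following is a minimal sketch only; the genome size (3.1e9 bp) and the average mass per base pair (650 g/mol) are assumptions chosen for illustration and are not values stated in this disclosure.

```python
# Rough arithmetic for a two-stage fragmentation of a mammalian-sized genome.
# ASSUMPTIONS (not from the disclosure): genome size and mass per base pair.
AVOGADRO = 6.02214076e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0         # assumed average mass of one base pair
GENOME_BP = 3.1e9                 # assumed mammalian-sized haploid genome


def genome_equivalent_mass_pg(genome_bp=GENOME_BP):
    """Mass of one genome-equivalent of double stranded DNA, in picograms."""
    return genome_bp * BP_MASS_G_PER_MOL / AVOGADRO * 1e12


def fragments_per_genome(genome_bp, fragment_len):
    """Number of fragments of a given length needed to cover one genome once."""
    return genome_bp / fragment_len


if __name__ == "__main__":
    pg = genome_equivalent_mass_pg()
    for equivalents in (10, 30, 60):
        print(f"{equivalents} genome-equivalents: about {equivalents * pg:.0f} pg")
    # First stage: 100-250 Kb fragments; second stage: 300-600 nucleotides.
    for length in (250_000, 100_000, 600, 300):
        n = fragments_per_genome(GENOME_BP, length)
        print(f"{length} nt fragments per genome: about {n:,.0f}")
```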
  • Genomic DNA is obtained using conventional techniques, for example, as disclosed in Sambrook et al., supra, 1999; Current Protocols in Molecular Biology, Ausubel et al., eds. (John Wiley and Sons, Inc., NY, 1999), or the like. Important factors for isolating genomic DNA include the following: 1) the DNA is free of DNA processing enzymes and contaminating salts; 2) the entire genome is equally represented; and 3) the DNA fragments are between about 5,000 and 100,000 bp in length. In many cases, no digestion of the extracted DNA is required because shear forces created during lysis and extraction will generate fragments in the desired range. In another embodiment, shorter fragments (1-5 kb) can be generated by enzymatic fragmentation using restriction endonucleases. In one embodiment, 10-100 genome-equivalents of DNA ensure that the population of fragments covers the entire genome. In some cases, it is advantageous to provide carrier DNA, e.g. unrelated circular synthetic double-stranded DNA, to be mixed and used with the sample DNA whenever only small amounts of sample DNA are available and there is danger of losses through nonspecific binding, e.g. to container walls and the like.
  • In generating fragments in either stage, fragments may be derived from either an entire genome or a selected subset of a genome. Many techniques are available for isolating or enriching fragments from a subset of a genome, as exemplified by the following references that are incorporated by reference: Kandpal et al (1990), Nucleic Acids Research, 18: 1789-1795; Callow et al, U.S. patent publication 2005/0019776; Zabeau et al, U.S. Pat. No. 6,045,994; Deugau et al, U.S. Pat. No. 5,508,169; Sibson, U.S. Pat. No. 5,728,524; Guilfoyle et al, U.S. Pat. No. 5,994,068; Jones et al, U.S. patent publication 2005/0142577; Gullberg et al, U.S. patent publication 2005/0037356; Matsuzaki et al, U.S. patent publication 2004/0067493; and the like.
  • For mammalian-sized genomes, an initial fragmentation of genomic DNA can be achieved by digestion with one or more “rare” cutting restriction endonucleases, such as Not I, Asc I, Bae I, CspC I, Pac I, Fse I, Sap I, Sfi I, Psr I, or the like. The resulting fragments can be used directly, or for genomes that have been sequenced, specific fragments may be isolated from such digested DNA for subsequent processing as illustrated in FIG. 2B. Genomic DNA (230) is digested (232) with a rare cutting restriction endonuclease to generate fragments (234), after which the fragments (234) are further digested for a short period (i.e. the reaction is not allowed to run to completion) with a 5′ single stranded exonuclease, such as λ exonuclease, to expose sequences (237) adjacent to restriction site sequences at the end of the fragments. Such exposed sequences will be unique for each fragment. Accordingly, biotinylated primers (241) specific for the ends of desired fragments can be annealed to a capture oligonucleotide for isolation; or alternatively, such fragments can be annealed to a primer having a capture moiety, such as biotin, and extended with a DNA polymerase that does not have strand displacement activity, such as Taq polymerase Stoffel fragment. After such extension, the 3′ ends of primers (241) abut the top strand of fragments (242) such that they can be ligated to form a continuous strand. The latter approach may also be implemented with a DNA polymerase that does have strand displacement activity and replaces the top strand (242) by synthesis. In either approach, the biotinylated fragments may then be isolated (240) using a solid support (239) derivatized with streptavidin.
  • In another aspect, primer extension from a genomic DNA template is used to generate a linear amplification of selected sequences greater than 10 kilobases surrounding genomic regions of interest. For example, to create a population of defined-sized targets, 20 cycles of linear amplification are performed with a forward primer followed by 20 cycles with a reverse primer. Before applying the second primer, the first primer is removed with a standard column for long DNA purification or degraded if a few uracil bases are incorporated. A greater number of reverse strands are generated relative to forward strands, resulting in a population of double stranded molecules and single stranded reverse strands. The reverse primer may be biotinylated for capture to streptavidin beads, which can be heated to melt any double stranded homoduplexes that are captured. All attached molecules will then be single stranded and will represent one strand of the original genomic DNA.
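  • The excess of reverse strands over forward strands described above follows from simple counting. The sketch below is an idealized model only: it assumes every template is copied once per cycle (the `efficiency` parameter is hypothetical) and counts only reverse strands templated by forward-primer products.

```python
def linear_amplification(templates, cycles, efficiency=1.0):
    """Copies made by one primer: each template is copied once per cycle.

    Linear (not exponential) because the products are not themselves
    templates for the same primer.
    """
    return templates * cycles * efficiency


# One genomic template strand, 20 forward cycles, then 20 reverse cycles.
forward_strands = linear_amplification(templates=1, cycles=20)
# Each forward strand serves as a template for the reverse primer.
reverse_strands = linear_amplification(templates=forward_strands, cycles=20)

if __name__ == "__main__":
    print(f"forward strands per template: {forward_strands:.0f}")
    print(f"reverse strands per template: {reverse_strands:.0f}")
    print(f"reverse/forward ratio: {reverse_strands / forward_strands:.0f}")
```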
  • The products produced can be fragmented to 0.2-2 kb in size, or more preferably, 0.3-0.6 kb in size (effectively releasing them from the solid support) and circularized for an RCR reaction. In one method of circularization, illustrated in FIG. 2A, after genomic DNA (200) is fragmented and denatured (202), single stranded DNA fragments (204) are first treated with a terminal transferase (206) to attach poly dA tails (208) to 3-prime ends. This is then followed by ligation (212) of the free ends intra-molecularly with the aid of bridging oligonucleotide (210), which is complementary to the poly dA tail at one end and complementary to any sequence at the other end by virtue of a segment of degenerate nucleotides. Duplex region (214) of bridging oligonucleotide (210) contains at least a primer binding site for RCR and, in some embodiments, sequences that provide complements to a capture oligonucleotide, which may be the same or different from the primer binding site sequence, or which may overlap the primer binding site sequence. The length of capture oligonucleotides may vary widely. In one aspect, capture oligonucleotides and their complements in a bridging oligonucleotide have lengths in the range of from 10 to 100 nucleotides; and more preferably, in the range of from 10 to 40 nucleotides. In some embodiments, duplex region (214) may contain additional elements, such as an oligonucleotide tag, for example, for identifying the source nucleic acid from which its associated DNA fragment came. That is, in some embodiments, circles or adaptor ligation or concatemers from different source nucleic acids may be prepared separately during which a bridging adaptor containing a unique tag is used, after which they are mixed for concatemer preparation or application to a surface to produce a random array. 
The associated fragments may be identified on such a random array by hybridizing a labeled tag complement to its corresponding tag sequences in the concatemers, or by sequencing the entire adaptor or the tag region of the adaptor. Circular products (218) may be conveniently isolated by a conventional purification column, digestion of non-circular DNA by one or more appropriate exonucleases, or both.
  • As mentioned above, DNA fragments of the desired size range, e.g. 50-600 nucleotides, can also be circularized using circularizing enzymes, such as CircLigase, a single stranded DNA ligase that circularizes single stranded DNA without the need of a template. CircLigase is used in accordance with the manufacturer's instructions (Epicentre, Madison, Wis.). A preferred protocol for forming single stranded DNA circles comprising a DNA fragment and one or more adapters is to use a standard ligase such as T4 ligase for ligating an adapter to one end of a DNA fragment and then to use CircLigase to close the circle, as described more fully below.
  • An exemplary protocol for generating a DNA circle comprising an adaptor oligonucleotide and a target sequence using T4 ligase is as follows. The target sequence is a synthetic oligo T1N (sequence: 5′-NNNNNNNNGCATANCACGANGTCATNATCGTNCAAACGTCAGTCCANGAATCNAGATCCACTTAGANTGNCGNNNNNNNN-3′) (SEQ ID NO: 1). The adaptor is made up of 2 separate oligos. The adaptor oligo that joins to the 5′ end of T1N is BR2-ad (sequence: 5′-TATCATCTGGATGTTAGGAAGACAAAAGGAAGCTGAGGACATTAACGGAC-3′) (SEQ ID NO: 2) and the adaptor oligo that joins to the 3′ end of T1N is UR3-ext (sequence: 5′-ACCTTCAGACCAGAT-3′) (SEQ ID NO: 3). UR3-ext contains a type IIs restriction enzyme site (Acu I: CTTCAG) to provide a way to linearize the DNA circle for insertion of a second adaptor. BR2-ad is annealed to BR2-temp (sequence 5′-NNNNNNNNGTCCGTTAATGTCCTCAG-3′) (SEQ ID NO: 4) to form a double-stranded BR2 adaptor. UR3-ext is annealed to biotinylated UR3-temp (sequence 5′-[BIOTIN]-ATCTGGTCTGAAGGTNNNNNNNNN-3′) (SEQ ID NO: 5) to form a double-stranded UR3 adaptor. 1 pmol of target T1N is ligated to 25 pmol of BR2 adaptor and 10 pmol of UR3 adaptor in a single ligation reaction containing 50 mM Tris-Cl, pH7.8, 10% PEG, 1 mM ATP, 50 mg/L BSA, 10 mM MgCl2, 0.3 unit/μl T4 DNA ligase (Epicentre Biotechnologies, WI) and 10 mM DTT in a final volume of 10 μl. The ligation reaction is incubated in a temperature cycling program of 15° C. for 11 min, 37° C. for 1 min repeated 18 times. The reaction is terminated by heating at 70° C. for 10 min. Excess BR2 adaptors are removed by capturing the ligated products with streptavidin magnetic beads (New England Biolabs, MA). 3.3 μl of 4× binding buffer (2M NaCl, 80 mM Tris HCl pH7.5) is added to the ligation reaction which is then combined with 15 μg of streptavidin magnetic beads in 1× binding buffer (0.5M NaCl, 20 mM Tris HCl pH7.5). 
After 15 min incubation at room temperature, the beads are washed twice with 4 volumes of low salt buffer (0.15M NaCl, 20 mM Tris HCl pH7.5). Elution buffer (10 mM Tris HCl pH7.5) is pre-warmed to 70° C., 10 μl of which is added to the beads at 70° C. for 5 min. After magnetic separation, the supernatant is retained as primary purified sample. This sample is further purified by removing the excess UR3 adaptors with magnetic beads pre-bound with a biotinylated oligo BR-rc-bio (sequence: 5′-[BIOTIN]CTTTTGTCTTCCTAACATCC-3′) (SEQ ID NO: 6) that is reverse complementary to BR2-ad, similarly as described above. The concentration of the adaptor-target ligated product in the final purified sample is estimated by urea polyacrylamide gel electrophoresis analysis. The circularization is carried out by phosphorylating the ligation products using 0.2 unit/μl T4 polynucleotide kinase (Epicentre Biotechnologies) in 1 mM ATP and standard buffer provided by the supplier, and circularizing with ten-fold molar excess of a splint oligo UR3-closing-88 (sequence 5′-AGATGATAATCTGGTC-3′) (SEQ ID NO: 7) using 0.3 unit/μl of T4 DNA ligase (Epicentre Biotechnologies) and 1 mM ATP. The circularized product is validated by performing RCR reactions as described below.
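  • The base-pairing relationships among the oligonucleotides of the exemplary protocol above can be checked computationally. The following is an illustrative sketch, not part of the protocol: the sequences are copied from the text (with the line-break spaces removed), while the helper name `revcomp` and the assumption that the 16-nucleotide splint pairs with 8 nucleotides on each side of the junction are introduced here for illustration.

```python
# Reverse-complement checks on the adaptor, template, and splint oligos.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of a DNA sequence (N pairs with N)."""
    return seq.translate(COMPLEMENT)[::-1]


BR2_AD = "TATCATCTGGATGTTAGGAAGACAAAAGGAAGCTGAGGACATTAACGGAC"  # SEQ ID NO: 2
UR3_EXT = "ACCTTCAGACCAGAT"                                     # SEQ ID NO: 3
BR2_TEMP = "NNNNNNNNGTCCGTTAATGTCCTCAG"                         # SEQ ID NO: 4
UR3_TEMP = "ATCTGGTCTGAAGGTNNNNNNNNN"                           # SEQ ID NO: 5
BR_RC_BIO = "CTTTTGTCTTCCTAACATCC"                              # SEQ ID NO: 6
SPLINT = "AGATGATAATCTGGTC"                                     # SEQ ID NO: 7

checks = {
    # Defined part of BR2-temp pairs with the 3' end of BR2-ad.
    "BR2-temp/BR2-ad": BR2_AD.endswith(revcomp(BR2_TEMP.lstrip("N"))),
    # Defined part of UR3-temp pairs with all of UR3-ext.
    "UR3-temp/UR3-ext": revcomp(UR3_TEMP.rstrip("N")) == UR3_EXT,
    # BR-rc-bio is reverse complementary to an internal stretch of BR2-ad.
    "BR-rc-bio/BR2-ad": revcomp(BR_RC_BIO) in BR2_AD,
    # Splint bridges the 3' end of UR3-ext and the 5' end of BR2-ad
    # (assumed split: 8 nt on each side of the junction).
    "splint/junction": revcomp(SPLINT) == UR3_EXT[-8:] + BR2_AD[:8],
    # The motif given in the text for Acu I is present in UR3-ext.
    "Acu I motif": "CTTCAG" in UR3_EXT,
}

# Total time of the ligation cycling program: 18 x (11 min + 1 min).
cycling_minutes = 18 * (11 + 1)

if __name__ == "__main__":
    for name, ok in checks.items():
        print(f"{name}: {'ok' if ok else 'MISMATCH'}")
    print(f"ligation cycling time: {cycling_minutes} min")
```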
  • Generating Polynucleotide Concatemers by Rolling Circle Replication
  • In one aspect of the invention, single molecules comprise concatemers of polynucleotides, usually polynucleotide analytes, i.e. target sequences, that have been produced in a conventional rolling circle replication (RCR) reaction. Guidance for selecting conditions and reagents for RCR reactions is available in many references available to those of ordinary skill, as evidenced by the following that are incorporated by reference: Kool, U.S. Pat. No. 5,426,180; Lizardi, U.S. Pat. Nos. 5,854,033 and 6,143,495; Landegren, U.S. Pat. No. 5,871,921; and the like. Generally, RCR reaction components comprise single stranded DNA circles, one or more primers that anneal to DNA circles, a DNA polymerase having strand displacement activity to extend the 3′ ends of primers annealed to DNA circles, nucleoside triphosphates, and a conventional polymerase reaction buffer. Such components are combined under conditions that permit primers to anneal to DNA circles and be extended by the DNA polymerase to form concatemers of DNA circle complements. An exemplary RCR reaction protocol is as follows: In a 50 μL reaction mixture, the following ingredients are assembled: 2-50 pmol circular DNA, 0.5 units/μL phage φ29 DNA polymerase, 0.2 μg/μL BSA, 3 mM dNTP, 1× φ29 DNA polymerase reaction buffer (Amersham). The RCR reaction is carried out at 30° C. for 12 hours. In some embodiments, the concentration of circular DNA in the polymerase reaction may be selected to be low (approximately 10-100 billion circles per ml, or 10-100 circles per picoliter) to avoid entanglement and other intermolecular interactions.
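  • The equivalence between the two concentration units quoted above (circles per ml and circles per picoliter) can be confirmed with a short calculation. This is a sketch for illustration; the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
PL_PER_ML = 1e9            # 1 ml = 1e-3 L and 1 pl = 1e-12 L


def circles_per_pl(circles_per_ml):
    """Convert a concentration in circles per ml to circles per picoliter."""
    return circles_per_ml / PL_PER_ML


def pmol_to_molecules(pmol):
    """Number of molecules in a given amount expressed in picomoles."""
    return pmol * 1e-12 * AVOGADRO


if __name__ == "__main__":
    for per_ml in (10e9, 100e9):
        print(f"{per_ml:.0e} circles/ml = {circles_per_pl(per_ml):.0f} circles/pl")
    print(f"1 pmol = {pmol_to_molecules(1):.3e} molecules")
```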
  • Preferably, concatemers produced by RCR are approximately uniform in size; accordingly, in some embodiments, methods of making arrays of the invention may include a step of size selecting concatemers. For example, in one aspect, concatemers are selected that as a population have a coefficient of variation in molecular weight of less than about 30%; and in another embodiment, less than about 20%. In one aspect, size uniformity is further improved by adding low concentrations of chain terminators, such as ddNTPs, to the RCR reaction mixture to reduce the presence of very large concatemers, e.g. produced by DNA circles that are synthesized at a higher rate by polymerases. In one embodiment, concentrations of ddNTPs are used that result in an expected concatemer size in the range of from 50-250 Kb, or in the range of from 50-100 Kb. In another aspect, concatemers may be enriched for a particular size range using conventional separation techniques, e.g. size-exclusion chromatography, membrane filtration, or the like.
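  • The coefficient of variation criterion above can be evaluated as the standard deviation divided by the mean. The sketch below uses made-up concatemer sizes purely for illustration and assumes that size in Kb is proportional to molecular weight, so that either measure gives the same coefficient of variation; the use of the sample (rather than population) standard deviation is also a choice made here.

```python
import statistics


def coefficient_of_variation(sizes):
    """Sample standard deviation divided by the mean (dimensionless)."""
    return statistics.stdev(sizes) / statistics.mean(sizes)


# Hypothetical concatemer sizes in Kb, not measured data.
sizes_kb = [60, 75, 80, 90, 100]
cv = coefficient_of_variation(sizes_kb)

meets_30_percent = cv < 0.30
meets_20_percent = cv < 0.20

if __name__ == "__main__":
    print(f"CV = {cv:.1%}")
    print(f"below 30%: {meets_30_percent}, below 20%: {meets_20_percent}")
```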
  • Generation of Macromolecular Structures Comprising Branched Polymers and DNA Assemblies
  • In one aspect of the invention, macromolecular structures comprise polymers having at least one unique functionality, which for polynucleotides is usually a functionality at a 5′ or 3′ end, and a plurality of complementary functionalities that are capable of specifically reacting with reactive functionalities of the surface of a solid support. Macromolecular structures comprising branched polymers, especially branched polynucleotides, may be synthesized in a variety of ways, as disclosed by Gryaznov (cited above), Urdea (cited above), and like references. In one aspect, branched polymers of the invention include comb-type branched polymers, which comprise a linear polymeric unit with one or more branch points located at interior monomers and/or linkage moieties. Branched polymers of the invention also include fork-type branched polymers, which comprise a linear polymeric unit with one or two branch points located at terminal monomers and/or linkage moieties. Macromolecular structures of the invention also include assemblies of linear and/or branched polynucleotides bound together by one or more duplexes or triplexes. Such assemblies may be self-assembled from component linear polynucleotides, e.g. as disclosed by Goodman et al, Science, 310: 1661-1665 (2005); Birac et al, J. Mol. Graph Model, (Apr. 18, 2006); Seeman et al, U.S. Pat. No. 6,255,469; and the like, which are incorporated herein by reference. In one aspect, linear polymeric units of the invention have the form: “-(M-L)n-” wherein L is a linker moiety and M is a monomer that may be selected from a wide range of chemical structures to provide a range of functions from serving as an inert non-sterically hindering spacer moiety to providing a reactive functionality which can serve as a branching point to attach other components, a site for attaching labels; a site for attaching oligonucleotides or other binding polymers for hybridizing or binding to amplifier strands or structures, e.g. 
as described by Urdea et al, U.S. Pat. No. 5,124,246 or Wang et al, U.S. Pat. No. 4,925,785; a site for attaching “hooks”, e.g. as described in Whiteley et al, U.S. Pat. No. 4,883,750; or as a site for attaching other groups for affecting solubility, promotion of duplex and/or triplex formation, such as intercalators, alkylating agents, and the like. The following references disclose several phosphoramidite and/or hydrogen phosphonate monomers suitable for use in the present invention and provide guidance for their synthesis and inclusion into oligonucleotides: Newton et al, Nucleic Acids Research, 21:1155-1162 (1993); Griffin et al, J. Am. Chem. Soc., 114:7976-7982 (1992); Jaschke et al, Tetrahedron Letters, 34:301-304 (1992); Ma et al, International application PCT/CA92/00423; Zon et al, International application PCT/US90/06630; Durand et al, Nucleic Acids Research, 18:6353-6359 (1990); Salunkhe et al, J. Am. Chem. Soc., 114:8768-8772 (1992); Urdea et al, U.S. Pat. No. 5,093,232; Ruth, U.S. Pat. No. 4,948,882; Cruickshank, U.S. Pat. No. 5,091,519; Haralambidis et al, Nucleic Acids Research, 15:4857-4876 (1987); and the like. More particularly, M is a straight chain, cyclic, or branched organic molecular structure containing from 1 to 20 carbon atoms and from 0 to 10 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. Preferably, M is alkyl, alkoxy, alkenyl, or aryl containing from 1 to 16 carbon atoms; heterocyclic having from 3 to 8 carbon atoms and from 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur; glycosyl; or nucleosidyl. More preferably, M is alkyl, alkoxy, alkenyl, or aryl containing from 1 to 8 carbon atoms; glycosyl; or nucleosidyl. Preferably, L is a phosphorus(V) linking group which may be phosphodiester, phosphotriester, methyl or ethyl phosphonate, phosphorothioate, phosphorodithioate, phosphoramidate, or the like. 
Generally, linkages derived from phosphoramidite or hydrogen phosphonate precursors are preferred so that the linear polymeric units of the invention can be conveniently synthesized with commercial automated DNA synthesizers, e.g. Applied Biosystems, Inc. (Foster City, Calif.) model 394, or the like. The value of n may vary significantly depending on the nature of M and L. Usually, n varies from about 3 to about 100. When M is a nucleoside or analog thereof or a nucleoside-sized monomer and L is a phosphorus(V) linkage, then n varies from about 12 to about 100. Preferably, when M is a nucleoside or analog thereof or a nucleoside-sized monomer and L is a phosphorus(V) linkage, then n varies from about 12 to about 40. Polymeric units are assembled by forming one or more covalent bridges among them. In one aspect, bridges are formed by reacting thiol, phosphorothioate, or phosphorodithioate groups on one or more components with haloacyl- or haloalkylamino groups on one or more other components to form one or more thio- or dithiophosphorylacyl or thio- or dithiophosphorylalkyl bridges. Generally, such bridges have one of the following forms: —NHRSP(═Z)(O)— or —NHRS—, wherein R is alkyl or acyl and Z is sulfur or oxygen. The assembly reaction may involve from 2 to 20 components depending on the particular embodiment; but preferably, it involves from 2 to 8 components; and more preferably, it involves from 2 to 4 components. Preferably, the haloacyl- or haloalkylamino groups are haloacetylamino groups; and more preferably, the haloacetylamino groups are bromoacetylamino groups. The acyl or alkyl moieties of the haloacyl- or haloalkylamino groups contain from 1 to 12 carbon atoms; and more preferably, such moieties contain from 1 to 8 carbon atoms. The reaction may take place in a wide range of solvent systems; but generally, the assembly reaction takes place under liquid aqueous conditions or in a frozen state in ice, e.g. 
obtained by lowering the temperature of a liquid aqueous reaction mixture. Alternatively, formation of thiophosphorylacetylamino bridges in DMSO/H2O has been reported by Thuong et al, Tetrahedron Letters, 28:4157-4160 (1987); and Francois et al, Proc. Natl. Acad. Sci., 86:9702-9706 (1989). Typical aqueous conditions include 4 μM of reactants in 25 mM NaCl and 15 mM phosphate buffer (pH 7.0). The thio- or dithiophosphorylacyl- or thio- or dithiophosphorylalkylamino bridges are preferred because they can be readily and selectively cleaved by oxidizing agents, such as silver nitrate, potassium iodide, and the like. Preferably, the bridges are cleaved with potassium iodide, KI3, at a concentration equivalent to about a hundred-fold molar excess of the bridges. Usually, KI3 is employed at a concentration of about 0.1M. The facile cleavage of these bridges is a great advantage in synthesis of complex macromolecular structures, as it provides a convenient method for analyzing final products and for confirming that the structure of the final product is correct. A 3′-haloacyl- or haloalkylamino (in this example, haloacetylamino) derivatized oligonucleotide 1 is reacted with a 5′-phosphorothioate derivatized oligonucleotide 2 according to the following scheme:

  • 5′-BBB . . . B—NHC(═O)CH2X +  (1)

  • S—P(═O)(O−)O-BBB . . . B-3′ →  (2)

  • 5′-BBB . . . B—NHC(═O)CH2SP(═O)(O−)O-BBB . . . B-3′
  • wherein X is halo and B is a nucleotide. It is understood that the nucleotides are merely exemplary of the more general polymeric units, (M-L), described above. Compound 1 can be prepared by reacting N-succinimidyl haloacetate in N,N-dimethylformamide (DMF) with a 3′-aminodeoxyribonucleotide precursor in a sodium borate buffer at room temperature. After about 35 minutes the mixture is diluted (e.g. with H2O), desalted, and purified, e.g. by reverse phase HPLC. The 3′-aminodeoxyribonucleotide precursor can be prepared as described in Gryaznov and Letsinger, Nucleic Acids Research, 20:3403-3409 (1992). Briefly, after deprotection, the 5′ hydroxyl of a deoxythymidine linked to a support via a standard succinyl linkage is phosphitylated by reaction with chloro-(diisopropylethylamino)-methoxyphosphine in an appropriate solvent, such as dichloromethane/diisopropylethylamine. After activation with tetrazole, the 5′-phosphitylated thymidine is reacted with a 5′-trityl-O-3′-amino-3′-deoxynucleoside to form a nucleoside-thymidine dimer wherein the nucleoside moieties are covalently joined by a phosphoramidate linkage. The remainder of the oligonucleotide is synthesized by standard phosphoramidite chemistry. After cleaving the succinyl linkage, the oligonucleotide with a 3′ terminal amino group is generated by cleaving the phosphoramidate link by acid treatment, e.g. 80% aqueous acetic acid for 18-20 hours at room temperature. 5′-monophosphorothioate oligonucleotide 2 is formed as follows: A 5′ monophosphate is attached to the 5′ end of an oligonucleotide either chemically or enzymatically with a kinase, e.g. Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Edition (Cold Spring Harbor Laboratory, New York, 1989). 
Preferably, as a final step in oligonucleotide synthesis, a monophosphate is added by chemical phosphorylation as described by Thuong and Asseline, Chapter 12 in, Eckstein, editor, Oligonucleotides and Analogues (IRL Press, Oxford, 1991) or by Horn and Urdea, Tetrahedron Lett., 27:4705 (1986) (e.g. using commercially available reagents such as 5′ Phosphate-ON™ from Clontech Laboratories (Palo Alto, Calif.)). The 5′-monophosphate is then sulfurized using conventional sulfurizing agents, e.g. treatment with a 5% solution of S8 in pyridine/CS2 (1:1, v/v, 45 minutes at room temperature); or treatment with sulfurizing agent described in U.S. Pat. Nos. 5,003,097; 5,151,510; or 5,166,387. Monophosphorodithioates are prepared by analogous procedures, e.g. Froehler et al, European patent publication 0 360 609 A2; Caruthers et al, International application PCT/US89/02293; and the like. Likewise to the above, a 5′-haloacetylamino derivatized oligonucleotide 3 is reacted with a 3′-monophosphorothioate oligonucleotide 4 according to the following scheme:

  • 3′-BBB . . . B—NHC(═O)CH2X +  (3)

  • S—P(═O)(O−)O-BBB . . . B-5′ →  (4)

  • 3′-BBB . . . B—NHC(═O)CH2SP(═O)(O−)O-BBB . . . B-5′
  • wherein the symbols are defined the same as above, except that the nucleotide monomers of the j- and k-mers are in opposite orientations. In this case, Compound 3 can be prepared by reacting N-succinimidyl haloacetate in N,N-dimethylformamide (DMF) with a 5′-aminodeoxyribonucleotide precursor in a sodium borate buffer at room temperature, as described above for the 3′-amino oligonucleotide. 5′-aminodeoxynucleosides are prepared in accordance with Glinski et al, J. Chem. Soc. Chem. Comm., 915-916 (1970); Miller et al, J. Org. Chem. 29:1772 (1964); Ozols et al, Synthesis, 7:557-559 (1980); and Azhayev et al, Nucleic Acids Research, 6:625-643 (1979); which are incorporated by reference. The 3′-monophosphorothioate oligonucleotide 4 can be prepared as described by Thuong and Asseline (cited above). Oligonucleotides 1 and 4 and 2 and 3 may be reacted to form polymeric units having either two 5′ termini or two 3′ termini, respectively.
  • Reactive functionalities for the attachment of branches may be introduced at a variety of sites. Preferably, amino functionalities are introduced on a polymeric unit or loop at selected monomers or linking moieties which are then converted to haloacetylamino groups as described above. Amino-derivatized bases of nucleoside monomers may be introduced as taught by Urdea et al, U.S. Pat. No. 5,093,232; Ruth U.S. Pat. No. 4,948,882; Haralambidis et al, Nucleic Acids Research, 15:4857-4876 (1987); or the like. Amino functionalities may also be introduced by a protected hydroxyamine phosphoramidite commercially available from Clontech Laboratories (Palo Alto, Calif.) as Aminomodifier II™. Preferably, amino functionalities are introduced by generating a derivatized phosphoramidate linkage by oxidation of a phosphite linkage with I2 and an alkyldiamine, e.g. as taught by Agrawal et al, Nucleic Acids Research, 18:5419-5423 (1990); and Jager et al, Biochemistry, 27:7237-7246 (1988). Generally, for the above procedures, it is preferable that the haloacyl- or haloalkylamino derivatized polymeric units be prepared separately from the phosphorothioate derivatized polymeric units, otherwise the phosphorothioate moieties require protective groups.
  • Solid Phase Surfaces for Constructing Random Arrays
  • A wide variety of supports may be used with the invention. In one aspect, supports are rigid solids that have a surface, preferably a substantially planar surface so that single molecules to be interrogated are in the same plane. The latter feature permits efficient signal collection by detection optics, for example. In another aspect, solid supports of the invention are nonporous, particularly when random arrays of single molecules are analyzed by hybridization reactions requiring small volumes. Suitable solid support materials include materials such as glass, polyacrylamide-coated glass, ceramics, silica, silicon, quartz, various plastics, and the like. In one aspect, the area of a planar surface may be in the range of from 0.5 to 4 cm2. In one aspect, the solid support is glass or quartz, such as a microscope slide, having a surface that is uniformly silanized. This may be accomplished using conventional protocols, e.g. acid treatment followed by immersion in a solution of 3-glycidoxypropyl trimethoxysilane, N,N-diisopropylethylamine, and anhydrous xylene (8:1:24 v/v) at 80° C., which forms an epoxysilanized surface, e.g. Beattie et al (1995), Molecular Biotechnology, 4: 213. Such a surface is readily treated to permit end-attachment of capture oligonucleotides, e.g. by providing capture oligonucleotides with a 3′ or 5′ triethylene glycol phosphoryl spacer (see Beattie et al, cited above) prior to application to the surface. Many other protocols may be used for adding reactive functionalities to glass and other surfaces, as evidenced by the disclosure in Beaucage (cited above).
  • Whenever enzymatic processing is not required, capture oligonucleotides may comprise non-natural nucleosidic units and/or linkages that confer favorable properties, such as increased duplex stability; such compounds include, but are not limited to, peptide nucleic acids (PNAs), locked nucleic acids (LNA), oligonucleotide N3′→P5′ phosphoramidates, oligo-2′-O-alkylribonucleotides, and the like.
  • In embodiments of the invention in which patterns of discrete spaced apart regions are required, photolithography, electron beam lithography, nano imprint lithography, and nano printing may be used to generate such patterns on a wide variety of surfaces, e.g. Pirrung et al, U.S. Pat. No. 5,143,854; Fodor et al, U.S. Pat. No. 5,774,305; Guo, (2004) Journal of Physics D: Applied Physics, 37: R123-141; which are incorporated herein by reference.
  • In one aspect, surfaces containing a plurality of discrete spaced apart regions are fabricated by photolithography. A commercially available, optically flat, quartz substrate is spin coated with a 100-500 nm thick layer of photo-resist. The photo-resist is then baked on to the quartz substrate. An image of a reticle with a pattern of regions to be activated is projected onto the surface of the photo-resist, using a stepper. After exposure, the photo-resist is developed, removing the areas of the projected pattern which were exposed to the UV source. This is accomplished by plasma etching, a dry developing technique capable of producing very fine detail. The substrate is then baked to strengthen the remaining photo-resist. After baking, the quartz wafer is ready for functionalization. The wafer is then subjected to vapor-deposition of 3-aminopropyldimethylethoxysilane. The density of the amino functionalized monomer can be tightly controlled by varying the concentration of the monomer and the time of exposure of the substrate. Only areas of quartz exposed by the plasma etching process may react with and capture the monomer. The substrate is then baked again to cure the monolayer of amino-functionalized monomer to the exposed quartz. After baking, the remaining photo-resist may be removed using acetone. Because of the difference in attachment chemistry between the resist and silane, aminosilane-functionalized areas on the substrate may remain intact through the acetone rinse. These areas can be further functionalized by reacting them with p-phenylenediisothiocyanate in a solution of pyridine and N,N-dimethylformamide. The substrate is then capable of reacting with amine-modified oligonucleotides. Alternatively, oligonucleotides can be prepared with a 5′-carboxy-modifier-c10 linker (Glen Research). This technique allows the oligonucleotide to be attached directly to the amine modified support, thereby avoiding additional functionalization steps.
  • In another aspect, surfaces containing a plurality of discrete spaced apart regions are fabricated by nano-imprint lithography (NIL). For DNA array production, a quartz substrate is spin coated with a layer of resist, commonly called the transfer layer. A second type of resist is then applied over the transfer layer, commonly called the imprint layer. The master imprint tool then makes an impression on the imprint layer. The overall thickness of the imprint layer is then reduced by plasma etching until the low areas of the imprint reach the transfer layer. Because the transfer layer is harder to remove than the imprint layer, it remains largely untouched. The imprint and transfer layers are then hardened by heating. The substrate is then put into a plasma etcher until the low areas of the imprint reach the quartz. The substrate is then derivatized by vapor deposition as described above.
  • In another aspect, surfaces containing a plurality of discrete spaced apart regions are fabricated by nano printing. This process uses photo, imprint, or e-beam lithography to create a master mold, which is a negative image of the features required on the print head. Print heads are usually made of a soft, flexible polymer such as polydimethylsiloxane (PDMS). This material, or layers of materials having different properties, are spin coated onto a quartz substrate. The mold is then used to emboss the features onto the top layer of resist material under controlled temperature and pressure conditions. The print head is then subjected to a plasma based etching process to improve the aspect ratio of the print head, and eliminate distortion of the print head due to relaxation over time of the embossed material. Random array substrates are manufactured using nano-printing by depositing a pattern of amine modified oligonucleotides onto a homogenously derivatized surface. These oligonucleotides would serve as capture probes for the RCR products. One potential advantage to nano-printing is the ability to print interleaved patterns of different capture probes onto the random array support. This would be accomplished by successive printing with multiple print heads, each head having a differing pattern, and all patterns fitting together to form the final structured support pattern. Such methods allow for some positional encoding of DNA elements within the random array. For example, control concatemers containing a specific sequence can be bound at regular intervals throughout a random array.
  • In still another aspect, a high density array of capture oligonucleotide spots of sub micron size is prepared using a printing head or imprint-master prepared from a bundle, or bundle of bundles, of about 10,000 to 100 million optical fibers with a core and cladding material. By pulling and fusing fibers a unique material is produced that has about 50-1000 nm cores separated by a similar or 2-5 fold smaller or larger size cladding material. By differential etching (dissolving) of cladding material a nano-printing head is obtained having a very large number of nano-sized posts. This printing head may be used for depositing oligonucleotides or other biological (proteins, oligopeptides, DNA, aptamers) or chemical compounds such as silane with various active groups. In one embodiment the glass fiber tool is used as a patterned support to deposit oligonucleotides or other biological or chemical compounds. In this case only posts created by etching may be contacted with material to be deposited. Also, a flat cut of the fused fiber bundle may be used to guide light through cores and allow light-induced chemistry to occur only at the tip surface of the cores, thus eliminating the need for etching. In both cases, the same support may then be used as a light guiding/collection device for imaging fluorescence labels used to tag oligonucleotides or other reactants. This device provides a large field of view with a large numerical aperture (potentially >1). Stamping or printing tools that perform active material or oligonucleotide deposition may be used to print 2 to 100 different oligonucleotides in an interleaved pattern. This process requires precise positioning of the print head to about 50-500 nm. This type of oligonucleotide array may be used for attaching 2 to 100 different DNA populations such as different source DNA. They also may be used for parallel reading from sub-light resolution spots by using DNA specific anchors or tags. 
Information can be accessed by DNA specific tags, e.g. 16 specific anchors for 16 DNAs and read 2 bases by a combination of 5-6 colors and using 16 ligation cycles or one ligation cycle and 16 decoding cycles. This way of making arrays is efficient if limited information (e.g. a small number of cycles) is required per fragment, thus providing more information per cycle or more cycles per surface.
  • In one embodiment “inert” concatemers are used to prepare a surface for attachment of test concatemers. The surface is first covered by capture oligonucleotides complementary to the binding site present on two types of synthetic concatemers; one is a capture concatemer, the other is a spacer concatemer. The spacer concatemers do not have DNA segments complementary to the adapter used in preparation of test concatemers and they are used in about 5-50×, preferably 10×, excess to capture concatemers. The surface with capture oligonucleotide is “saturated” with a mix of synthetic concatemers (prepared by chain ligation or by RCR) in which the spacer concatemers are used in about 10-fold (or 5 to 50-fold) excess to capture concatemers. Because of the ~10:1 ratio between spacer and capture concatemers, the capture concatemers are mostly individual islands in a sea of spacer concatemers. The 10:1 ratio provides that two capture concatemers are on average separated by two spacer concatemers. If concatemers are about 200 nm in diameter, then two capture concatemers are at about 600 nm center-to-center spacing. This surface is then used to attach test concatemers or other molecular structures that have a binding site complementary to a region of the capture concatemers but not present on the spacer concatemers. Capture concatemers may be prepared to have fewer copies than the number of binding sites in test concatemers to assure single test concatemer attachment per capture concatemer spot. Because the test DNA can bind only to capture concatemers, an array of test concatemers may be prepared that have high site occupancy without congregation. Due to random attachment, some areas on the surface may not have any concatemers attached, but these areas with free capture oligonucleotide may not be able to bind test concatemers since they are designed not to have binding sites for the capture oligonucleotide. 
An array of individual test concatemers as described would not be arranged in a grid pattern. An ordered grid pattern should simplify data collection because fewer pixels and less sophisticated image analysis systems are needed.
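  • The spacing estimate given above for the 10:1 spacer-to-capture ratio can be sketched numerically. The following is a minimal sketch, not the method of the disclosure: it assumes a close-packed two-dimensional monolayer in which one site in (ratio + 1) holds a capture concatemer, so that the mean center-to-center spacing scales with the square root of the number of sites per capture concatemer. The function names are hypothetical.

```python
import math


def capture_spacing_nm(diameter_nm: float, spacer_to_capture: float) -> float:
    """Estimate the mean center-to-center spacing of capture concatemers.

    Assumption (not from the text): concatemers tile the surface as a
    monolayer, and one site in (spacer_to_capture + 1) is a capture
    concatemer, so spacing grows as the square root of sites per capture.
    """
    sites_per_capture = spacer_to_capture + 1.0
    return diameter_nm * math.sqrt(sites_per_capture)


def spacers_between(diameter_nm: float, spacer_to_capture: float) -> float:
    """Approximate number of spacer concatemers lying between two captures."""
    return capture_spacing_nm(diameter_nm, spacer_to_capture) / diameter_nm - 1.0


# Values from the text: 200 nm concatemers at a 10:1 spacer-to-capture ratio.
spacing = capture_spacing_nm(200.0, 10.0)
gap = spacers_between(200.0, 10.0)
print(f"spacing ~{spacing:.0f} nm, ~{gap:.1f} spacer concatemers in between")
```

Under these assumptions the estimate lands in the 600-700 nm range with roughly two intervening spacer concatemers, consistent with the approximate figures stated above.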
  • In one aspect, multiple arrays of the invention may be placed on a single surface. For example, patterned array substrates may be produced to match the standard 96 or 384 well plate format. A production format can be an 8×12 pattern of 6 mm×6 mm arrays at 9 mm pitch or a 16×24 pattern of 3.33 mm×3.33 mm arrays at 4.5 mm pitch, on a single piece of glass, plastic, or other optically compatible material. In one example each 6 mm×6 mm array consists of 36 million 250-500 nm square regions at 1 micrometer pitch. Hydrophobic or other surface or physical barriers may be used to prevent mixing of different reactions between unit arrays.
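  • The site counts implied by the plate formats above follow from simple geometry. The sketch below is illustrative only; the function names are hypothetical, and it assumes square unit arrays with binding sites on a square grid.

```python
def sites_per_array(side_mm: float, site_pitch_um: float) -> int:
    """Number of binding sites on a square unit array with a square site grid."""
    sites_per_side = round(side_mm * 1000.0 / site_pitch_um)
    return sites_per_side * sites_per_side


def plate_footprint_mm(rows: int, cols: int, array_side_mm: float, array_pitch_mm: float):
    """Outer dimensions (width, height) spanned by a rows x cols grid of unit arrays."""
    width = (cols - 1) * array_pitch_mm + array_side_mm
    height = (rows - 1) * array_pitch_mm + array_side_mm
    return width, height


# 96-well style format from the text: 8 x 12 arrays of 6 mm x 6 mm at 9 mm pitch,
# each patterned at 1 micrometer site pitch.
per_array = sites_per_array(6.0, 1.0)
total_sites = per_array * 8 * 12
print(per_array, total_sites, plate_footprint_mm(8, 12, 6.0, 9.0))
```

With a 1 micrometer site pitch, a 6 mm×6 mm array holds 6000×6000 sites, which matches the 36 million regions quoted in the example.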
  • By way of example, binding sites (i.e. discrete spaced apart regions) for DNA samples are prepared by silanization of lithographically defined sites on silicon dioxide on silicon, quartz, or glass surfaces with 3-aminopropyldimethylethoxysilane or similar silanization agent followed by derivatization with p-phenylenediisothiocyanate or similar derivatization agent. For example, the binding sites may be square, circular or regular/irregular polygons produced by photolithography, direct-write electron beam, or nano-imprint lithography. To minimize non-specific binding in regions between binding sites, the wettability (hydrophobic vs. hydrophilic) and reactivity of the field surrounding the binding sites can be controlled to prevent DNA samples from binding in the field; that is, in places other than the binding sites. For example, the field may be prepared with hexamethyldisilazane (HMDS), or a similar agent covalently bonded to the surface, to be hydrophobic and hence unsuitable to hydrophilic bonding of the DNA samples. Similarly, the field may be coated with a chemical agent such as a fluorine-based carbon compound that renders it unreactive to DNA samples.
  • For the three surface fabrication processes listed in the prior paragraph, the following exemplary steps are followed. For photolithography:
      • 1) Clean glass wafer
      • 2) Prime surface with HMDS
      • 3) Pattern binding sites in photoresist
      • 4) Reactive ion etch binding site surface with oxygen to remove HMDS
      • 5) Silanize with 0.3% 3-aminopropyldimethylethoxysilane
      • 6) Coat with photoresist to protect wafer during sawing
      • 7) Saw wafer into chips
      • 8) Strip photoresist
      • 9) Derivatize binding sites with solution of 10% pyridine and 90% N,N-Dimethylformamide (DMF) using 2.25 mg p-phenylenediisothiocyanate (PDC) per ml of solution for 2 h followed by methanol, acetone, and water rinses.
  • For direct write electron beam surface fabrication:
      • 1) Clean glass wafer
      • 2) Prime surface with HMDS
      • 3) Pattern binding sites in PMMA with electron beam
      • 4) Reactive ion etch binding site surface with oxygen to remove HMDS
      • 5) Silanize with 0.3% 3-aminopropyldimethylethoxysilane
      • 6) Coat with photoresist to protect wafer during sawing
      • 7) Saw wafer into chips
      • 8) Strip photoresist
      • 9) Derivatize binding sites with solution of 10% pyridine and 90% N,N-Dimethylformamide (DMF) using 2.25 mg p-phenylenediisothiocyanate (PDC) per ml of solution for 2 h followed by methanol, acetone, and water rinses.
  • For nano imprint lithography surface fabrication:
      • 1) Clean glass wafer
      • 2) Prime surface with HMDS
      • 3) Coat wafer with transfer layer
      • 4) Contact print pattern with nano imprint template and photopolymer on top of transfer layer
      • 5) Dry etch pattern into transfer layer
      • 6) Reactive ion etch binding site surface with oxygen to remove HMDS
      • 7) Silanize with 0.3% 3-aminopropyldimethylethoxysilane
      • 8) Coat with photoresist to protect wafer during sawing
      • 9) Saw wafer into chips
      • 10) Strip photoresist
      • 11) Derivatize binding sites with solution of 10% pyridine and 90% N,N-Dimethylformamide (DMF) using 2.25 mg p-phenylenediisothiocyanate (PDC) per ml of solution for 2 h followed by methanol, acetone, and water rinses.
  • As mentioned above, a glass surface may also be used for constructing random arrays of the invention. For example, a suitable glass surface may be constructed from microscope cover slips. Microscope cover slips (22 mm square, 170 um thick) are placed in Teflon racks. They are soaked in 3 molar KOH in 95% ethanol/water for 2 minutes. They are then rinsed in water, followed by an acetone rinse. This removes surface contamination and prepares the glass for silanization. Plasma cleaning is an alternative to KOH cleaning. Fused silica or quartz may also be substituted for glass. The clean, dry cover slips are immersed in 0.3% 3-aminopropyldimethylethoxysilane, 0.3% water, in acetone. They are left to react for 45 minutes. They are then rinsed in acetone and cured at 100° C. for 1 hour. 3-aminopropyldimethylethoxysilane may be used as a replacement for 3-aminopropyltriethoxysilane because it forms a mono-layer on the glass surface. The monolayer surface provides a lower background. The silanization agent may also be applied using vapor deposition. 3-aminopropyltriethoxysilane tends to form more of a polymeric surface when deposited in solution phase. The amino modified silane is then terminated with a thiocyanate group. This is done in a solution of 10% pyridine and 90% N,N-Dimethylformamide (DMF) using 2.25 mg p-phenylenediisothiocyanate (PDC) per ml of solution. The reaction is run for 2 hours, then the slide is washed in methanol, followed by acetone, and water rinses. The cover slips are then dried and ready to bind probe. There are additional chemistries that can be used to modify the amino group at the end of the silanization agent. For example, glutaraldehyde can be used to modify the amino group at the end of the silanization agent to an aldehyde group which can be coupled to an amino modified oligonucleotide. 
Capture oligonucleotides are bound to the surface of the cover slide by applying a solution of 10-50 micromolar capture oligonucleotide in 100 millimolar sodium bicarbonate in water to the surface. The solution is allowed to dry, and is then washed in water.
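  • The PDC derivatization solution described above (10% pyridine, 90% DMF, 2.25 mg PDC per ml) can be scaled to any batch volume. The following minimal sketch assumes the percentages are volume/volume, which the text does not state explicitly; the function name and dictionary keys are hypothetical.

```python
def pdc_solution(total_ml: float) -> dict:
    """Reagent amounts for a batch of PDC derivatization solution.

    Assumes 10% pyridine / 90% DMF by volume (an assumption) and
    2.25 mg of PDC per ml of final solution (from the text).
    """
    if total_ml <= 0:
        raise ValueError("total_ml must be positive")
    return {
        "pyridine_ml": 0.10 * total_ml,
        "dmf_ml": 0.90 * total_ml,
        "pdc_mg": 2.25 * total_ml,
    }


# Example: a 20 ml batch.
print(pdc_solution(20.0))
```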
  • It may be beneficial to avoid terminating the 3-amino group with PDC and perform a direct conjugation (of the 3-amino end) to the capture oligonucleotide which has been modified with either a carboxyl group or an aldehyde group at the 5′ end. In the case of the carboxyl group, the oligonucleotide is applied in a solution that contains EDC (1-Ethyl-3-(3-dimethylaminopropyl)-carbodiimide). In the case of the aldehyde group, the oligo is kept wet for 5-10 minutes then the surface is treated with a 1% solution of sodium borohydride.
  • In another aspect of the invention, random arrays are prepared using nanometer-sized beads. Sub-micron glass or other types of beads (e.g. in the 20-50 nm range) are used which are derivatized with a short oligonucleotide, e.g. 6-30 nucleotides, complementary to an adaptor oligonucleotide in the circles used to generate concatemers. The number of oligonucleotides on the bead and the length of the sequence can be controlled to weakly bind the concatemers in solution. Reaction rate of the beads should be much faster than that of the solid support alone. After binding concatemers, the beads are then allowed to settle on the surface of an array substrate. The array substrate has longer, more stable, more numerous oligonucleotides, such that conditions may be selected to permit preferential binding to the surface, thereby forming a spaced array of concatemers. If the beads are magnetic, a magnetic field can be used to pull them to the surface; it may also be used to move them around the surface. Alternatively, a centrifuge may be used to concentrate the beads on the surface. An exemplary protocol is as follows: 1. A preparation of 20 ul of concatemer solution with one million concatemers per 1 ul is mixed with 20 million nano-beads with about 500 capture oligonucleotides about 8 bases in length (6-16 bases may be used under different conditions). The surface of a 100 nm nano-bead is approximately 40,000 nm² and can hold up to 4000 short oligonucleotides. One way to control the density of capture probes is to mix in, in this case, about 8 times more of a 2-4 base long oligonucleotide having the same attachment chemistry as the capture probe. Also, much smaller nano-beads (20-50 nm) may be used. 2. Reaction conditions (temperature, pH, salt concentration) are adjusted so that concatemers with over 300 copies will attach to nanobeads in significant numbers. 3. 
The reaction is applied under the same stringent conditions to a support with 4×4 mm of patterned surface with 16 million active sites about 200 nm in size, and nanobeads are allowed or forced to settle on the substrate surface bringing large concatemers with them. The largest distance that a nano-bead-concatemer has to travel is about 1 mm. The vertical movement of beads minimizes the number of potential concatemer-concatemer encounters. The reaction solution may be applied in aliquots, e.g. 4 applications of 5 ul each. In this case the thickness of the applied solution (i.e. the nano-bead maximal travel distance) is only about 250 microns. 4. Further increase stringency of the reaction to release concatemers from nano-beads and attach them to active sites on the support with ~300 capture oligonucleotides 20-50 bases in length. 5. Concatemers attached to nano-beads will predominantly settle initially between active sites on the support because there is 25 times more inactive than active surface. A slight horizontal movement force (e.g. substrate tilting, and other forces) may be applied to move nano-bead-concatemers about one to a few microns around.
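  • The geometric figures quoted in the exemplary protocol above (liquid layer thickness, site pitch, and the ratio of inactive to active surface) can be checked with a short calculation. This is a sketch under stated assumptions: the liquid is taken to spread as a uniform film over the square patterned area, active sites are taken to be squares on a square grid, and all function names are hypothetical.

```python
import math


def film_thickness_um(volume_ul: float, side_mm: float) -> float:
    """Thickness of a uniform liquid film over a square area (1 ul = 1 mm^3)."""
    return volume_ul / (side_mm * side_mm) * 1000.0


def site_pitch_um(side_mm: float, n_sites: float) -> float:
    """Center-to-center site pitch for n_sites on a square grid."""
    return side_mm * 1000.0 / math.sqrt(n_sites)


def inactive_to_active_ratio(pitch_um: float, site_um: float) -> float:
    """Ratio of inactive to active area, assuming square active sites."""
    active = site_um * site_um
    return (pitch_um * pitch_um - active) / active


# Values from the text: 4 x 4 mm support, 16 million sites of about 200 nm.
print(film_thickness_um(20.0, 4.0))   # full 20 ul applied at once
print(film_thickness_um(5.0, 4.0))    # one 5 ul aliquot
pitch = site_pitch_um(4.0, 16e6)
print(pitch, inactive_to_active_ratio(pitch, 0.2))
```

Under these assumptions the full 20 ul gives a layer on the order of 1 mm, a 5 ul aliquot gives roughly 300 microns, and the inactive-to-active ratio is 24, in line with the approximate values of 1 mm, 250 microns, and 25 times stated above.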
  • Detection Instrumentation
  • As mentioned above, signals from single molecules on random arrays made in accordance with the invention are generated and detected by a number of detection systems, including, but not limited to, scanning electron microscopy, near field scanning optical microscopy (NSOM), total internal reflection fluorescence microscopy (TIRFM), and the like. Abundant guidance is found in the literature for applying such techniques for analyzing and detecting nanoscale structures on surfaces, as evidenced by the following references that are incorporated by reference: Reimer et al, editors, Scanning Electron Microscopy: Physics of Image Formation and Microanalysis, 2nd Edition (Springer, 1998); Nie et al, Anal. Chem., 78: 1528-1534 (2006); Hecht et al, Journal Chemical Physics, 112: 7761-7774 (2000); Zhu et al, editors, Near-Field Optics: Principles and Applications (World Scientific Publishing, Singapore, 1999); Drmanac, International patent publication WO 2004/076683; Lehr et al, Anal. Chem., 75: 2414-2420 (2003); Neuschafer et al, Biosensors & Bioelectronics, 18: 489-497 (2003); Neuschafer et al, U.S. Pat. No. 6,289,144; and the like. Of particular interest is TIRFM, for example, as disclosed by Neuschafer et al, U.S. Pat. No. 6,289,144; Lehr et al (cited above); and Drmanac, International patent publication WO 2004/076683. In one aspect, instruments for use with arrays of the invention comprise three basic components: (i) a fluidics system for storing and transferring detection and processing reagents, e.g. probes, wash solutions, and the like, to an array; (ii) a reaction chamber, or flow cell, holding or comprising an array and having flow-through and temperature control capability; and (iii) an illumination and detection system. In one embodiment, a flow cell has a temperature control subsystem with ability to maintain temperature in the range from about 5-95° C., or more specifically 10-85° C., and can change temperature with a rate of about 0.5-2° C. per second.
  • In some cases, the system hardware may be described as consisting of five major components: the robotic fluid handling system, the reaction chamber, the temperature control system, the illumination system and the detection system.
  • Reaction flow cell. Each DNA array segment may be housed in a separate flow cell, allowing cycles to be run asynchronously. Each flow cell provides temperature control, physically indexes the substrate, and creates a fluid path over the active area of the substrate. The active area of a flow cell may be determined by how many unit sub-arrays each flow cell contains. For an eight flow-cell system, each flow cell may contain an active area of 48×4 square millimeters, or 192 square millimeters in a 6×8 arrangement of unit sub-arrays. Similarly, in a 16 flow-cell system, each flow cell may have a 1 cm×1.5 cm substrate with 4×6 unit subarrays. A side port is connected to a dedicated syringe pump, which “pulls” or “pushes” fluid from the flow cell (see A.4.). A thin optical window may be installed in the flow cell. This window may allow the top surface of the substrate to be imaged. The DNB array substrate cannot be imaged through the bottom due to the required substrate thickness. The placement of the optical window over the DNB array creates thermal regulation difficulties. The thin cross section of fluid between the array substrate and optical window may cool or heat to room temperature relatively quickly. Creating a pocket above the optical window may allow filling the area directly above the window with optical oil. This oil may act as a thermal transfer medium connecting the top of the thin optical window to the temperature controlled flow cell body. The use of optical oil as the thermal transfer medium may also allow designing a lens system with a numerical aperture (NA) better than 1.0. Various other solutions are possible and may be explored.
  • Each DNA array segment may be housed in a separate flow cell, allowing cycles to be run asynchronously. Each flow cell provides temperature control, physically indexes the substrate, and creates a fluid path over the active area of the substrate. The active area of a flowcell may be determined by how many unit sub arrays each flowcell contains. For an eight flowcell system each flowcell may contain an active area of 48×4 square millimeters, or 192 square millimeters in a 6×8 arrangement of unit subarrays. Similarly, in a 16 flowcell system each cell may have a 1 cm×1.5 cm substrate with 4×6 unit subarrays. A side port is connected to a dedicated syringe pump, which “pulls” or “pushes” fluid from the flow cell. A second port is connected to a funnel like mixing chamber that is equipped with a liquid level sensor. The solutions are dispensed into the mixing chamber, mixed if needed, then drawn into the flow cell. When the level sensor detects air in the funnel's connection to the flow cell, the pump is reversed a known amount to back the fluid up to the funnel. This prevents air from entering the flow cell. This design has worked well for the small test substrates and may be scaled up for the random array substrates. A thin optical window may be installed in the flow cell. This window may allow the top surface of the substrate to be imaged. The DNB array substrate cannot be imaged through the bottom due to required substrate thickness. The placement of the optical window over the DNB array creates thermal regulation difficulties. The thin cross section of fluid between the array substrate and optical window may cool or heat to room temperature relatively quickly. Creating a pocket above the optical window may allow filling the area directly above the window with optical oil. This oil may act as a thermal transfer medium connecting the top of the thin optical window to the temperature controlled flow cell body. 
Solexa (5) is attempting sequencing by synthesis on random array substrates with non-amplified or in-situ amplified DNA. Cycles of fluorescent nucleotide addition result in read lengths of about 25 bases that are then used to assemble and align the final sequence to a reference sequence. Researchers (6, 7) and companies such as Helicos Biosciences are also attempting sequencing by synthesis from non-amplified templates. The main limitations of these methods are short read lengths leading to incomplete sequence determination. Furthermore, reading only one base per DNA per cycle with random attachment of DNA requires larger array surfaces and large numbers of CCD pixels per DNA sample, leading to higher genome sequencing costs.
  • In one aspect, a flow cell can be used for 1″ square, 170 micrometer thick cover slips that have been derivatized to bind macromolecular structures of the invention. The cell encloses the “array” by sandwiching the glass and a gasket between two planes. One plane has an opening of sufficient size to permit imaging, and an indexing pocket for the cover slip. The other plane has an indexing pocket for the gasket, fluid ports, and a temperature control system. One fluid port is connected to a syringe pump which “pulls” or “pushes” fluid from the flow cell; the other port is connected to a funnel-like mixing chamber. The chamber, in turn, is equipped with a liquid level sensor. The solutions are dispensed into the funnel, mixed if needed, then drawn into the flow cell. When the level sensor reads air in the funnel's connection to the flow cell, the pump is reversed a known amount to back the fluid up to the funnel. This prevents air from entering the flow cell. The cover slip surface may be sectioned off and divided into strips to accommodate fluid flow/capillary effects caused by sandwiching. Such a substrate may be housed in an “open air”/“open face” chamber to promote even flow of the buffers over the substrate by eliminating capillary flow effects. Imaging may be accomplished with a 100× objective using TIRF or epi illumination and a 1.3 megapixel Hamamatsu ORCA-ER-AG on a Zeiss Axiovert 200, or like system. This configuration images RCR concatemers bound randomly to a substrate (non-ordered array). Imaging speed may be improved by decreasing the objective magnification power, using grid patterned arrays and increasing the number of pixels of data collected in each image. For example, up to four or more cameras may be used, preferably in the 10-16 megapixel range. Multiple band pass filters and dichroic mirrors may also be used to collect pixel data across up to four or more emission spectra. 
To compensate for the lower light collecting power of the decreased magnification objective, the power of the excitation light source can be increased. Throughput can be increased by using one or more flow chambers with each camera, so that the imaging system is not idle while the samples are being hybridized/reacted. Because the probing of arrays can be non-sequential, more than one imaging system can be used to collect data from a set of arrays, further decreasing assay time.
  • During the imaging process, the substrate must remain in focus. Some key factors in maintaining focus are the flatness of the substrate, orthogonality of the substrate to the focus plane, and mechanical forces on the substrate that may deform it. Substrate flatness can be well controlled; glass plates which have better than ¼ wave flatness are readily obtained. Uneven mechanical forces on the substrate can be minimized through proper design of the hybridization chamber. Orthogonality to the focus plane can be achieved by a well-adjusted, high precision stage. Auto focus routines generally take additional time to run, so it is desirable to run them only if necessary. After each image is acquired, it will be analyzed using a fast algorithm to determine if the image is in focus. If the image is out of focus, the auto focus routine will run. It will then store the objective's Z position information to be used upon return to that section of that array during the next imaging cycle. By mapping the objective's Z position at various locations on the substrate, we will reduce the time required for substrate image acquisition.
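  • The focus-on-demand strategy described above can be sketched as a small cache of objective Z positions keyed by stage location. This is a hypothetical illustration, not the instrument's control software: the `FocusMap` class, its callbacks, and the simulated stage in `demo` are all assumptions, and a real system would substitute hardware calls and an actual image sharpness test.

```python
from typing import Callable, Dict, Tuple

Position = Tuple[int, int]


class FocusMap:
    """Caches the objective Z position per stage location (hypothetical helper)."""

    def __init__(
        self,
        acquire: Callable[[Position, float], float],
        is_in_focus: Callable[[float], bool],
        autofocus: Callable[[Position], float],
        default_z: float = 0.0,
    ) -> None:
        self.acquire = acquire
        self.is_in_focus = is_in_focus
        self.autofocus = autofocus
        self.default_z = default_z
        self.z_by_position: Dict[Position, float] = {}
        self.autofocus_runs = 0

    def image(self, position: Position) -> float:
        # Start from the Z stored during an earlier cycle, if any.
        z = self.z_by_position.get(position, self.default_z)
        img = self.acquire(position, z)
        # Run the slow autofocus routine only when the fast check fails.
        if not self.is_in_focus(img):
            z = self.autofocus(position)
            self.autofocus_runs += 1
            img = self.acquire(position, z)
        self.z_by_position[position] = z
        return img


def demo() -> Tuple[int, int]:
    """Simulated stage: an 'image' is reduced to its defocus in microns."""
    true_z = {(0, 0): 0.0, (0, 1): 1.5, (1, 0): -2.0}
    focus_map = FocusMap(
        acquire=lambda pos, z: abs(z - true_z[pos]),
        is_in_focus=lambda defocus: defocus <= 0.5,
        autofocus=lambda pos: true_z[pos],
    )
    for pos in true_z:                      # first imaging cycle
        focus_map.image(pos)
    first_cycle_runs = focus_map.autofocus_runs
    for pos in true_z:                      # second cycle reuses stored Z values
        focus_map.image(pos)
    return first_cycle_runs, focus_map.autofocus_runs - first_cycle_runs
```

In the simulated run, autofocus is triggered only at locations where the default Z is out of tolerance during the first cycle, and not at all during the second cycle, which is the time saving the paragraph above describes.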
  • A suitable illumination and detection system for fluorescence-based signal is a Zeiss Axiovert 200 equipped with a TIRF slider coupled to an 80 milliwatt 532 nm solid state laser. The slider illuminates the substrate through the objective at the correct TIRF illumination angle. TIRF can also be accomplished without the use of the objective by illuminating the substrate through a prism optically coupled to the substrate. Planar wave guides can also be used to implement TIRF on the substrate. Epi illumination can also be employed. The light source can be rastered, spread beam, coherent, incoherent, and originate from a single or multi-spectrum source.
  • One embodiment for the imaging system contains a 20× lens with a 1.25 mm field of view, with detection being accomplished with a 10 megapixel camera. Such a system images approx 1.5 million concatemers attached to the patterned array at 1 micron pitch. Under this configuration there are approximately 6.4 pixels per concatemer. The number of pixels per concatemer can be adjusted by increasing or decreasing the field of view of the objective. For example a 1 mm field of view would yield a value of 10 pixels per concatemer and a 2 mm field of view would yield a value of 2.5 pixels per concatemer. The field of view may be adjusted relative to the magnification and NA of the objective to yield the lowest pixel count per concatemer that is still capable of being resolved by the optics, and image analysis software.
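  • The pixel budget figures above follow from dividing the camera pixel count by the number of concatemers in the field of view. The sketch below assumes a square field of view fully covered by the sensor and a square grid of concatemers; the function name is hypothetical.

```python
def pixels_per_concatemer(camera_pixels: float, fov_mm: float, pitch_um: float) -> float:
    """Camera pixels available per concatemer for a square field of view."""
    concatemers_per_side = fov_mm * 1000.0 / pitch_um
    return camera_pixels / (concatemers_per_side * concatemers_per_side)


# 10 megapixel camera, 1 micron pitch, fields of view from the text.
for fov in (1.0, 1.25, 2.0):
    print(fov, pixels_per_concatemer(10e6, fov, 1.0))
```

For a 10 megapixel camera at 1 micron pitch this reproduces the values quoted above: about 6.4 pixels per concatemer at 1.25 mm, 10 at 1 mm, and 2.5 at 2 mm.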
  • Both TIRF and EPI illumination allow for almost any light source to be used. One illumination schema is to share a common set of monochromatic illumination sources (about 4 lasers for 6-8 colors) amongst imagers. Each imager collects data at a different wavelength at any given time and the light sources would be switched to the imagers via an optical switching system. In such an embodiment, the illumination source preferably produces at least 6, but more preferably 8 different wavelengths. Such sources include gas lasers, multiple diode pumped solid state lasers combined through a fiber coupler, filtered Xenon Arc lamps, tunable lasers, or the more novel Spectralum Light Engine, soon to be offered by Tidal Photonics. The Spectralum Light Engine uses a prism to spectrally separate light. The spectrum is projected onto a Texas Instruments Digital Light Processor, which can selectively reflect any portion of the spectrum into a fiber or optical connector. This system is capable of monitoring and calibrating the power output across individual wavelengths to keep them constant so as to automatically compensate for intensity differences as bulbs age or between bulb changes.
  • The following table presents examples of possible lasers, dyes and filters. A dash marks a cell for which no value is given.
    Excitation laser   Excitation filter   Emission filter   Dye               Dye excitation/emission
    407 nm             405/12              436/12            Alexa-405         401/421
    407 nm             405/12              546/10            Cascade Yellow    409/558
    488 nm             488/10              514/11            Alexa-488         492/517
    543 nm             546/10              540/565           Tamra             540/565
    543 nm             546/10              620/12            Bodipy 577/618    577/618
    -                  546/10              620/12            Alexa-594         594/613
    635 nm             635/11              650/11            Alexa-635         632/647
    635 nm             635/11              -                 Alexa-700         702/723
  • Successfully scoring 6 billion concatemers through ~350 (~60 per color) images per region over 24 hours may require a combination of parallel image acquisition, increased image acquisition speed, and increased field of view for each imager. Additionally, the imager may support between six and eight colors. Commercially available microscopes commonly image a ~1 mm field of view at 20× magnification with an NA of 0.8. At the proposed concatemer pitch of 0.5 micron, this translates into roughly 4 million concatemers per image. This yields approximately 1,500 images for 6 billion spots per hybridization cycle, or 0.5 million images for 350 imaging cycles. In a large scale sequencing operation, each imager preferably acquires 200,000 images per day, based on a 300 millisecond exposure time to a 16 mega pixel CCD. Thus, a preferred instrument design is 4 imager modules each serving 4 flow cells (16 flow cells total). The above described imaging schema assumes that each imager has a CCD detector with 10 million pixels and is used with an exposure time of roughly 300 milliseconds. This should be an acceptable method for collecting data for 6 fluorophore labels. One possible drawback to this imaging technique is that certain fluorophores may be unintentionally photo bleached by the light source while other fluorophores are being imaged. Keeping the illumination power low and exposure times to a minimum would greatly reduce photo bleaching. By using intensified CCDs (ICCDs), data of roughly the same quality could be collected with illumination intensities and exposure times that are orders of magnitude lower than those of standard CCDs. ICCDs are generally available in the 1-1.4 megapixel range. Because they require much shorter exposure times, a one megapixel ICCD can acquire ten or more images in the time a standard CCD acquires a single image. 
Used in conjunction with fast filter wheels, and a high speed flow cell stage, a one mega pixel ICCD should be able to collect the same amount of data as a 10 megapixel standard CCD.
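  • The image counts quoted in the throughput discussion above can be reproduced with a few lines of arithmetic. The sketch below assumes a square field of view and a square grid of concatemers; the function names are hypothetical, and the per-image time budget is simply the length of a day divided by the target image count, not a measured instrument figure.

```python
import math


def concatemers_per_image(fov_mm: float, pitch_um: float) -> int:
    """Concatemers in one square field of view at the given pitch."""
    per_side = round(fov_mm * 1000.0 / pitch_um)
    return per_side * per_side


def images_per_cycle(total_spots: float, fov_mm: float, pitch_um: float) -> int:
    """Images needed to cover all spots once."""
    return math.ceil(total_spots / concatemers_per_image(fov_mm, pitch_um))


def total_images(total_spots: float, fov_mm: float, pitch_um: float, cycles: int) -> int:
    """Images needed over all imaging cycles."""
    return images_per_cycle(total_spots, fov_mm, pitch_um) * cycles


def seconds_per_image(images_per_day: int) -> float:
    """Average time budget per image (exposure plus overhead) over 24 hours."""
    return 24 * 3600 / images_per_day


# Values from the text: 6 billion spots, ~1 mm field of view, 0.5 micron pitch, 350 cycles.
print(concatemers_per_image(1.0, 0.5))
print(images_per_cycle(6e9, 1.0, 0.5))
print(total_images(6e9, 1.0, 0.5, 350))
print(total_images(6e9, 1.0, 0.5, 350) / 4)   # share per imager with 4 imagers
print(seconds_per_image(200_000))
```

With these inputs the total is 525,000 images (the "0.5 million" above), or about 131,000 per imager when split across four imagers, which sits below the 200,000 images per day per imager target.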
  • Optics capable of imaging larger fields of view with high numerical apertures can be manufactured as custom lens assemblies. Indications are that 20× optics capable of imaging a 3 mm field of view with an NA >0.9 can be fabricated. Two such imaging systems, in combination with high pixel count CCDs or CCD mosaic arrays, should be able to image the complete eight flow cell assay in roughly 14 hours. As described, further gains can be realized by using 16 flow cells. Doubling the number of flow cells would reduce imaging time to 9 hours by reducing the number of images for each field of view.
  • The reaction efficiency on the concatemer and other random DNA arrays may depend on the efficient use of probes, anchors or primers and enzymes. This may be achieved by mixing liquids (such as pooling liquid back and forth in the flow through chamber), applying agitations or using horizontal or vertical electric fields to bring DNA from different parts of the reaction volume in the proximity of the surface. One approach for efficient low cost assay reaction is to apply reaction mixes in a thin layer such as droplets or layers of about one to a few microns, but preferably less than 10 microns, in size/thickness. In a 1×1×1 micron volume designated for a 1×1 micron spot area, in 1 pmol/1 ul (1 uM concentration) there would be about 1000 molecules of probe in close proximity to 1-1000 copies of DNA. Using up to 100-300 molecules of probes would not significantly reduce the probe concentration and it would provide enough reacted probes to get significant signal. This approach may be used in an open reaction chamber that may stay open or closed for removal and washing of the probes and enzyme.
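  • The number of probe molecules available in a thin reaction layer can be estimated directly from the concentration and the volume. The following is a minimal sketch with illustrative names; it assumes an ideal, well-mixed solution.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def molecules_in_cube(concentration_molar, side_um):
    """Number of solute molecules in a cube with the given edge length in micrometers."""
    volume_liters = (side_um * 1e-6) ** 3 * 1000.0   # cubic meters converted to liters
    return concentration_molar * volume_liters * AVOGADRO

n = molecules_in_cube(1e-6, 1.0)   # 1 uM probe in a 1 x 1 x 1 micron volume
print(round(n))  # 602
```

  • The result, about 600 molecules, is of the same order of magnitude as the figure of about 1000 molecules quoted above.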
  • The physical makeup of the machine will include a number of additions to the standard microscope. A large area automated plate stage may be added to the microscope. This stage will accommodate the two substrates needed for each decoding assay. Another possibility is to use two smaller substrates that can fit in the standard plate stage. Each substrate will be fitted into a cassette and those cassettes will be fitted onto the stage. The cassette will index the substrate to the stage and provide a method to contain fluids over the assay substrate. Cassettes will have ports to facilitate the addition and removal of large volumes of buffer. They will also provide a means to control the temperature of the substrate through a connection with a temperature control subsystem that can maintain temperature in the range from about 5-95° C. (or more specifically 10-85° C.) and can change temperature during the cycle at about 0.5-2° C. per second. Another key component is the 3 axis robot gantry, which will be equipped with a syringe pump actuated pipetting head. This robotic pipetter will be used to add the probe pools to each cassette. Syringe pumps will be used to pump buffers into and out of each cassette. In another embodiment, the robotic pipetting may be replaced with pump and valve based automation of decoding probe pool delivery. In yet another embodiment all reagents and substrates may be contained on a microfluidic chip.
  • As mentioned above, higher throughput can be achieved by using multiple cameras and multiple flow cells. A single robotic liquid handling gantry may service, for example, 16 flow cells. In addition, all components of the system may share a common temperature control system, and set of reagents. For combinatorial SBH sequencing operations, the robot may prepare probe pools and ligation buffers to be dispensed into the flow cell funnels. Dedicated syringe pumps may dispense wash and hybridization buffers directly into the funnel ports for each flow cell. Each imager may service a group of 2-4 flow cells. Each group of flow cells may be positioned on an XY motion platform, similar to the automated plate stages commonly found on research microscopes. System control and coordination between all system components may be performed via software running on a master computer. The control software may run assay cycles asynchronously, allowing each imager to run continuously throughout the assay. Flow cells are connected to a temperature control system with one heater and one chiller allowing for heating or cooling on demand of each flow cell or 2-4 blocks of cells independently. Each flow cell temperature may be monitored, and if a flow cell temperature drops below a set threshold, a valve may open to a hot water recirculation. Likewise, if a flow cell temperature is above the set threshold a valve may open to a cold water recirculation. If a flow cell is within a set temperature range neither valve may open. The hot and cold recirculation water runs through the aluminum flow cell body, but remains separate and isolated from the assay buffers and reagents.
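  • The valve logic described above for the hot and cold recirculation loops can be sketched as a simple three-state controller. This is an illustration only: the tolerance band, the flow cell names, and the function names are assumptions and are not taken from the specification.

```python
def valve_command(temp_c, setpoint_c, tolerance_c=0.5):
    """Return which recirculation valve to open: 'hot', 'cold', or 'none'."""
    if temp_c < setpoint_c - tolerance_c:
        return "hot"     # too cold: open the hot water recirculation valve
    if temp_c > setpoint_c + tolerance_c:
        return "cold"    # too warm: open the cold water recirculation valve
    return "none"        # within the set range: neither valve opens

def control_step(flow_cell_temps, setpoints, tolerance_c=0.5):
    """Evaluate every monitored flow cell independently."""
    return {cell: valve_command(temp, setpoints[cell], tolerance_c)
            for cell, temp in flow_cell_temps.items()}

commands = control_step(
    {"fc1": 18.0, "fc2": 25.2, "fc3": 31.0},
    {"fc1": 25.0, "fc2": 25.0, "fc3": 25.0},
)
print(commands)  # {'fc1': 'hot', 'fc2': 'none', 'fc3': 'cold'}
```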
  • Sequence Analysis of Random Arrays of Target Sequence Concatemers
  • As mentioned above, random arrays of biomolecules, such as genomic DNA fragments or cDNA fragments, provide a platform for large scale sequence determination and for genome-wide measurements based on counting sequence tags, in a manner similar to measurements made by serial analysis of gene expression (SAGE) or massively parallel signature sequencing, e.g. Velculescu, et al, (1995), Science 270, 484-487; and Brenner et al (2000), Nature Biotechnology, 18: 630-634. Such genome-wide measurements include, but are not limited to, determination of polymorphisms, including nucleotide substitutions, deletions, and insertions, inversions, and the like, determination of methylation patterns, copy number patterns, and the like, such as could be carried out by a wide range of assays known to those with ordinary skill in the art, e.g. Syvanen (2005), Nature Genetics Supplement, 37: S5-S10; Gunderson et al (2005), Nature Genetics, 37: 549-554; Fan et al (2003), Cold Spring Harbor Symposia on Quantitative Biology, LXVIII: 69-78; and U.S. Pat. Nos. 4,883,750; 6,858,412; 5,871,921; 6,355,431; and the like, which are incorporated herein by reference.
  • A variety of sequencing methodologies can be used with random arrays of the invention, including, but not limited to, hybridization-based methods, such as disclosed in Drmanac, U.S. Pat. Nos. 6,864,052; 6,309,824; and 6,401,267; and Drmanac et al, U.S. patent publication 2005/0191656, which are incorporated by reference, sequencing by synthesis methods, e.g. Nyren et al, U.S. Pat. No. 6,210,891; Ronaghi, U.S. Pat. No. 6,828,100; Ronaghi et al (1998), Science, 281: 363-365; Balasubramanian, U.S. Pat. No. 6,833,246; Quake, U.S. Pat. No. 6,911,345; Li et al, Proc. Natl. Acad. Sci., 100: 414-419 (2003), which are incorporated by reference, and ligation-based methods, e.g. Shendure et al (2005), Science, 309: 1728-1739, which is incorporated by reference.
  • Combination of probe hybridization or probe-probe ligation with other DNA array based short read sequencing methods. There are many approaches to determine about 10-100 bases of sequence per DNA sample on an array of DNA samples. There are various sequencing by synthesis (SBS) methods (Solexa, 454) including primer extension methods, ligation based methods (4) or degradation/ligation based methods (Lynx). All of these methods may be combined with probe hybridization or probe-probe ligation data to provide longer read lengths with small numbers of cycles and to get higher accuracy. DNA arrays may be prepared from DNA samples about 100-1000 bases in length in which sequences in one segment, or in two segments close to adapters/primers/anchors, are determined by positional methods (SBS and others). The same DNA array is subjected to probe hybridization or probe-probe combinatorial ligation on the entire DNA or on a part that is still in the form of ssDNA.
  • In one aspect, a method of determining a nucleotide sequence of a target polynucleotide in accordance with the invention comprises the following steps: (a) generating a plurality of target concatemers from the target polynucleotide, each target concatemer comprising multiple copies of a fragment of the target polynucleotide and the plurality of target concatemers including a number of fragments that substantially covers the target polynucleotide; (b) forming a random array of target concatemers fixed to a surface at a density such that at least a majority of the target concatemers are optically resolvable; (c) identifying a sequence of at least a portion of each fragment in each target concatemer; and (d) reconstructing the nucleotide sequence of the target polynucleotide from the identities of the sequences of the portions of fragments of the concatemers. Usually, “substantially covers” means that the amount of DNA analyzed contains an equivalent of at least two copies of the target polynucleotide, or in another aspect, at least ten copies, or in another aspect, at least twenty copies, or in another aspect, at least 100 copies. Target polynucleotides may include DNA fragments, including genomic DNA fragments and cDNA fragments, and RNA fragments. Guidance for the step of reconstructing target polynucleotide sequences can be found in the following references, which are incorporated by reference: Lander et al, Genomics, 2: 231-239 (1988); Vingron et al, J. Mol. Biol., 235: 1-12 (1994); and like references.
  • In one aspect, a sequencing method for use with the invention for determining sequences in a plurality of DNA or RNA fragments comprises the following steps: (a) generating a plurality of polynucleotide molecules each comprising a concatemer of a DNA or RNA fragment; (b) forming a random array of polynucleotide molecules fixed to a surface at a density such that at least a majority of the target concatemers are optically resolvable; and (c) identifying a sequence of at least a portion of each DNA or RNA fragment in resolvable polynucleotides using at least one chemical reaction of an optically detectable reactant. In one embodiment, such optically detectable reactant is an oligonucleotide. In another embodiment, such optically detectable reactant is a nucleoside triphosphate, e.g. a fluorescently labeled nucleoside triphosphate that may be used to extend an oligonucleotide hybridized to a concatemer. In another embodiment, such optically detectable reagent is an oligonucleotide formed by ligating a first and second oligonucleotides that form adjacent duplexes on a concatemer. In another embodiment, such chemical reaction is synthesis of DNA or RNA, e.g. by extending a primer hybridized to a concatemer. In yet another embodiment, the above optically detectable reactant is a nucleic acid binding oligopeptide or polypeptide or protein.
  • In one aspect, parallel sequencing of polynucleotide analytes of concatemers on a random array is accomplished by combinatorial SBH (cSBH), as disclosed by Drmanac in the above-cited patents. In one aspect, first and second sets of oligonucleotide probes are provided, wherein each set has member probes that comprise oligonucleotides having every possible sequence for the defined length of probes in the set. For example, if a set contains probes of length six, then it contains 4096 (=4^6) probes. In another aspect, first and second sets of oligonucleotide probes comprise probes having selected nucleotide sequences designed to detect selected sets of target polynucleotides. Sequences are determined by hybridizing one probe or pool of probes, hybridizing a second probe or a second pool of probes, ligating probes that form perfectly matched duplexes on their target sequences, identifying those probes that are ligated to obtain sequence information about the target sequence, repeating the steps until all the probes or pools of probes have been hybridized, and determining the nucleotide sequence of the target from the sequence information accumulated during the hybridization and identification steps.
  • For sequencing operation, in some embodiments, the sets may be divided into subsets that are used together in pools, as disclosed in U.S. Pat. No. 6,864,052. Probes from the first and second sets may be hybridized to target sequences either together or in sequence, either as entire sets or as subsets, or pools. In one aspect, lengths of the probes in the first or second sets are in the range of from 5 to 10 nucleotides, and in another aspect, in the range of from 5 to 7 nucleotides, so that when ligated they form ligation products with a length in the range of from 10 to 20, and from 10 to 14, respectively.
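  • The size of a complete probe set and the ligation products expected on a given target can be illustrated in a few lines. The following is a simplified sketch and not the disclosed method: each probe is represented by the target-strand sequence it reads (complementarity is ignored), hybridization is treated as exact substring matching, and all names are illustrative.

```python
from itertools import product

def all_probes(k):
    """Every possible sequence of length k over A, C, G, T."""
    return ["".join(bases) for bases in product("ACGT", repeat=k)]

def ligated_pairs(target, k1, k2):
    """Probe pairs (first, second) that sit adjacent and perfectly matched on the target."""
    pairs = set()
    span = k1 + k2
    for i in range(len(target) - span + 1):
        window = target[i:i + span]
        pairs.add((window[:k1], window[k1:]))
    return pairs

probes = all_probes(6)
pairs = ligated_pairs("ACGTACGTTAGC", 3, 3)
print(len(probes))   # 4096
print(len(pairs))    # 7
```

  • In this toy example a 12-base target yields seven distinct ligation products of length six, each of which reports one overlapping 6-mer of the target.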
  • A nice feature of the probe-probe assay is that only a small subset of all 6-mers needs to be scored for each 300-600 base fragment to allow an efficient mapping of DNA fragments. In addition, redundant base reading with all 6-mers is achieved by combining data from overlapping DNA fragments. This is especially efficient when combined with 40-base reads obtained by probe-anchor ligation (see algorithm description in D.2.7). It is possible to score 1/16 of all 6-mers for each DNB by creating 16 subsets of 256 6-mers (i.e. a total of 4096 6-mers) and 16 array sections from a 3-6 billion whole genome DNB array. These 16 array sections comprised of 24 2×2 mm unit arrays (see array preparation above) may be analyzed in parallel in 16 reaction chambers, each with a different subset of 256 6-mers. The 16 6-mer subsets may be scored by a combinatorial ligation of 16 N5B3 and 16 B3N5-tail probes each. For 500-base fragments, out of 256 6-mers scored, about 32 may be positive on average, i.e. ~200 bases may be read for each fragment. This is more than enough for mapping 500-base fragments, especially in combination with 40-base end sequence and the hierarchical fragmentation schema (see algorithm description in C.2.7.). In addition, because there are 250-500 overlapping 500-base fragments covering each base in the genome, all 6-mers may be scored ~20 times in these fragments, providing ~120 6-mer reads for each base with 6 overlapping 6-mers. The required 16 chambers are also good for optics and imaging, since multiple chambers with reactions staggered in time allow simple continuous use of multiple CCD cameras.
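  • The estimate of about 32 positive 6-mers per 500-base fragment follows from a simple expectation. The following is a minimal sketch, assuming a random sequence and ignoring repeated 6-mers and overlap between positive probes; the names are illustrative.

```python
def expected_positive_probes(fragment_len, k, subset_size):
    """Expected number of scored k-mers that occur in a random fragment."""
    windows = fragment_len - k + 1            # k-mer positions in the fragment
    fraction_scored = subset_size / 4 ** k    # share of all k-mers in the subset
    return windows * fraction_scored

positives = expected_positive_probes(500, 6, 256)
bases_read = positives * 6
print(positives, bases_read)  # 30.9375 185.625
```

  • The result, roughly 31 positive probes covering up to about 186 bases, is consistent with the approximate figures of 32 probes and 200 bases given above.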
  • In another aspect, using such techniques, the sequence identity of each attached DNA concatemer may be determined by a “signature” approach. About 50 to 100 or possibly 200 probes are used such that about 25-50% or in some applications 10-30% of attached concatemers will have a full match sequence for each probe. This type of data allows each amplified DNA fragment within a concatemer to be mapped to the reference sequence. For example, by such a process one can score 64 4-mers (i.e. 25% of all possible 256 4-mers) using 16 hybridization/stripoff cycles in a 4-color labeling schema. On a 60-70 base fragment amplified in a concatemer about 16 of 64 probes will be positive since there are 64 possible 4-mers present in a 64 base long sequence (i.e. one quarter of all possible 4-mers). Unrelated 60-70 base fragments will have a very different set of about 16 positive decoding probes. A combination of 16 probes out of 64 probes has a random chance of occurrence in 1 of every one billion fragments, which practically provides a unique signature for that concatemer. Scoring 80 probes in 20 cycles and generating 20 positive probes creates a signature even more likely to be unique: occurrence by chance is 1 in a billion billion. Previously, a “signature” approach was used to select novel genes from cDNA libraries. An implementation of a signature approach is to sort the obtained intensities of all tested probes and select up to a predefined (expected) number of probes that satisfy the positive probe threshold. These probes will be mapped to sequences of all DNA fragments (a sliding window of a longer reference sequence may be used) expected to be present in the array. The sequence that has all or a statistically sufficient number of the selected positive probes is assigned as the sequence of the DNA fragment in the given concatemer. In another approach an expected signal can be defined for all used probes using their pre-measured full match and mismatch hybridization/ligation efficiency. In this case a measure similar to the correlation factor can be calculated.
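  • The signature implementation outlined above (sort intensities, keep up to an expected number of probes above a threshold, then map them to a sliding window of the reference) can be sketched as follows. This is a simplified illustration: probes are matched as exact substrings of one reference strand, and the threshold, window size, and names are hypothetical.

```python
def select_positive_probes(intensities, threshold, max_positive):
    """Sort probes by intensity and keep up to max_positive that pass the threshold."""
    ranked = sorted(intensities.items(), key=lambda item: item[1], reverse=True)
    return {probe for probe, value in ranked[:max_positive] if value >= threshold}

def map_signature(positive, reference, window, min_fraction=1.0):
    """Start positions of reference windows that contain enough of the positive probes."""
    hits = []
    needed = min_fraction * len(positive)
    for start in range(len(reference) - window + 1):
        segment = reference[start:start + window]
        matched = sum(1 for probe in positive if probe in segment)
        if positive and matched >= needed:
            hits.append(start)
    return hits

intensities = {"ACGT": 0.9, "TGCA": 0.8, "ATGG": 0.7, "GGGG": 0.1, "TTTT": 0.05}
positive = select_positive_probes(intensities, threshold=0.5, max_positive=4)
reference = "AAAAAAAAAA" + "CGTTGCATGG" + "CCCCCCCCCC"
hits = map_signature(positive, reference, window=12)
print(sorted(positive), hits)  # ['ACGT', 'ATGG', 'TGCA'] [8, 9]
```

  • Lowering `min_fraction` below 1.0 corresponds to accepting a statistically sufficient number of positive probes rather than all of them.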
  • A preferred way to score 4-mers is to ligate pairs of probes, for example: N(5-7)BBB with BN(7-9), where B is the defined base and N is a degenerate base. For generating signatures on longer DNA concatemer probes, more unique bases will be used. For example, a 25% positive rate in a fragment 1000 bases in length would be achieved by N(4-6) BBBB and BBN(6-8). Note that longer fragments need the same number of about 60-80 probes (15-20 ligation cycles using 4 colors).
  • In one embodiment all probes of a given length (e.g. 4096 N2-4BBBBBBN2-4) or all ligation pairs may be used to determine complete sequence of the DNA in a concatemer. For example, 1024 combinations of N(5-7)B3 and BBN(6-8) may be scored (256 cycles if 4 colors are used) to determine sequence of DNA fragments of up to about 250 bases, preferably up to about 100 bases.
  • The decoding or sequencing probes with large numbers of Ns may be prepared from multiple syntheses of subsets of sequences at the degenerate bases to minimize differences in efficiency. Each subset is added to the mix at a proper concentration. Also, some subsets may have more degenerate positions than others. For example, each of 64 probes from the set N(5-7)BBB may be prepared in 4 different syntheses. The first is the regular synthesis, with all 5-7 bases fully degenerate; the second is N0-3(A,T)5BBB; the third is N0-2(A,T)(G,C)(A,T)(G,C)(A,T)BBB; and the fourth is N0-2(G,C)(A,T)(G,C)(A,T)(G,C)BBB.
  • Oligonucleotide preparations from the three specific syntheses are added to the regular synthesis in experimentally determined amounts to increase hybrid generation with target sequences that have, in front of the BBB sequence, an AT-rich sequence (e.g. AATAT) or a sequence alternating between (A or T) and (G or C) (e.g. ACAGT or GAGAC). These sequences are expected to be less efficient in forming a hybrid. All 1024 target sequences can be tested for the efficiency of hybrid formation with N0-3NNNNNBBB probes, and those types that give the weakest binding may be prepared in about 1-10 additional syntheses and added to the basic probe preparation.
  • Decoding by Signatures: a smaller number of probes may be used for a small number of distinct samples; 5-7 positive out of 20 probes (5 cycles using 4 colors) has the capacity to distinguish about 10-100 thousand distinct fragments.
  • Decoding of 8-20mer RCR products. In this application arrays are formed as random distributions of unique 8 to 20 base recognition sequences in the form of DNA concatemers. The probes need to be decoded to determine the sequence of the 8-20 base probe region. At least two options are available to do this and the following example describes the process for a 12 mer. In the first, one half of the sequence is determined by utilizing the hybridization specificity of short probes and the ligation specificity of fully matched hybrids. Six to ten bases adjacent to the 12 mer are predefined and act as a support for a 6mer to 10-mer oligonucleotide. This short 6mer will ligate at its 3-prime end to one of 4 labeled 6-mers to 10-mers. These decoding probes consist of a pool of 4 oligonucleotides in which each oligonucleotide consists of 4-9 degenerate bases and 1 defined base. This oligonucleotide will also be labeled with one of four fluorescent labels. Each of the 4 possible bases A, C, G, or T will therefore be represented by a fluorescent dye. For example these 5 groups of 4 oligonucleotides and one universal oligonucleotide (Us) can be used in the ligation assays to sequence first 5 bases of 12-mers: B=each of 4 bases associated with a specific dye or tag at the end:
    • UUUUUUUU.BNNNNNNN*
    • UUUUUUUU.NBNNNNNN
    • UUUUUUUU.NNBNNNNN
    • UUUUUUUU.NNNBNNNN
    • UUUUUUUU.NNNNBNNN
  • Six or more bases can be sequenced with additional probe pools. To improve discrimination at positions near the center of the 12mer, the 6mer oligonucleotide can be positioned further into the 12mer sequence. This will necessitate the incorporation of degenerate bases into the 3-prime end of the non-labeled oligonucleotide to accommodate the shift. This is an example of decoding probes for positions 6 and 7 in the 12-mer.
    • UUUUUUNN.NNNBNNNN
    • UUUUUUNN.NNNNBNNN
  • In a similar way the 6 bases from the right side of the 12mer can be decoded by using a fixed oligonucleotide and 5-prime labeled probes. In the above described system 6 cycles are required to define 6 bases of one side of the 12mer. With redundant cycle analysis of bases distant to the ligation site this may increase to 7 or 8 cycles. In total then, complete sequencing of the 12mer could be accomplished with 12-16 cycles of ligation. Partial or complete sequencing of arrayed DNA by combining two distinct types of libraries of detector probes. In this approach one set has probes of the general type N3-8B4-6 (anchors) that are ligated with the first 2 or 3 or 4 probes/probe pools from the set BN6-8, NBN5-7, N2BN4-6, and N3BN3-5. The main requirement is to test in a few cycles a probe from the first set with 2-4 or even more probes from the second set to read a longer continuous sequence, such as 5-6 + 3-4 = 8-10 bases, in just 3-4 cycles. In one example, the process is:
  • 1) Hybridize 1-4 4-mers or more 5-mer anchors to obtain 70-80% 1 or 2 anchors per DNA. One way to discriminate which anchor is positive from the pool is to mix specific probes with distinct hybrid stability (maybe different number of Ns in addition). Anchors may be also tagged to determine which anchor from the pool is hybridized to a spot. Tags, as additional DNA segment, may be used for adjustable displacement as a detection method.
  • For example, certain probes can be used after hybridization or hybridization and ligation differentially removed with two corresponding displacers:
  • Separate cycles may be used just to determine which anchor is positive. For this purpose anchors labeled or tagged with multiple colors may be ligated to unlabeled N7-N10 supporter oligonucleotides.
  • 2) Hybridize BNNNNNNNN probe with 4 colors corresponding to 4 bases; wash discriminatively (or displace by complement to the tag) to read which of two scored bases is associated with which anchor if two anchors are positive in one DNA. Thus, two 7-10 base sequences can be scored at the same time.
  • In 2-4 cycles, extend the 4-6 base anchor by an additional 2-4 bases. Run 16 different anchors on each array (32-64 physical cycles if 4 colors are used) to determine about 16 possible 8-mers (~100 bases total) for each fragment. This is more than enough to map the fragment to the reference: the probability that a 100-mer will have a given set of 10 8-mers is less than 1 in a trillion trillion (10^-28). By combining data from different anchors scored in parallel on the same fragment in another array, the complete sequence of that fragment, and by extension of entire genomes, may be generated from overlapping 7-10-mers.
  • Tagging probes with DNA tags for larger multiplex of decoding or sequence determination probes. Instead of directly labeling probes, they can be tagged with different oligonucleotide sequences made of natural bases or new synthetic bases (such as isoG and isoC). Tags can be designed to have very precise binding efficiency with their anti-tags using different oligonucleotide lengths (about 6-24 bases) and/or sequence including GC content. For example, 4 different tags may be designed that can be recognized with specific anti-tags in 4 consecutive cycles or in one hybridization cycle followed by a discriminative wash. In the discriminative wash the initial signal is reduced to 95-99%, 30-40%, 10-20% and 0-5% for each tag, respectively. In this case, by obtaining two images, 4 measurements are obtained, assuming that probes with different tags will rarely hybridize to the same dot. Another benefit of having many different tags, even if they are consecutively decoded (or 2-16 at a time labeled with 2-16 distinct colors), is the ability to use a large number of individually recognizable probes in one assay reaction. This way a 4-64 times longer assay time (that may provide more specific or stronger signal) may be affordable if the probes are decoded in short incubation and removal reactions.
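  • The two-image readout with a discriminative wash can be illustrated by classifying each spot from the fraction of signal that remains after the wash. The following is a minimal sketch: the tag names are hypothetical, the retention ranges are the four ranges quoted above, and assigning a spot to the nearest range is an assumption made here for illustration.

```python
# Fraction of initial signal remaining after the discriminative wash, per tag
TAG_RETENTION = {
    "tag1": (0.95, 0.99),
    "tag2": (0.30, 0.40),
    "tag3": (0.10, 0.20),
    "tag4": (0.00, 0.05),
}

def classify_tag(signal_before, signal_after):
    """Assign a spot to the tag whose retention range is closest to the observed ratio."""
    if signal_before <= 0:
        return None                      # no initial signal: spot cannot be classified
    retained = signal_after / signal_before

    def distance(bounds):
        low, high = bounds
        if retained < low:
            return low - retained
        if retained > high:
            return retained - high
        return 0.0                       # ratio falls inside the range

    return min(TAG_RETENTION, key=lambda tag: distance(TAG_RETENTION[tag]))

calls = [classify_tag(1000, after) for after in (970, 350, 150, 20)]
print(calls)  # ['tag1', 'tag2', 'tag3', 'tag4']
```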
  • The decoding process requires the use of 48-96 or more decoding probes. These pools will be further combined into 12-24 or more pools by encoding them with four fluorophores, each having different emission spectra. Using a 20× objective, each 6 mm×6 mm array may require roughly 30 images for full coverage by using a 10 mega pixel camera with. Each of 1 micrometer array areas is read by about 8 pixels. Each image is acquired in 250 milliseconds, 150 ms for exposure and 100 ms to move the stage. Using this fast acquisition it will take ˜7.5 seconds to image each array, or 12 minutes to image the complete set of 96 arrays on each substrate. In one embodiment of an imaging system, this high image acquisition rate is achieved by using four ten-megapixel cameras, each imaging the emission spectra of a different fluorophore. The cameras are coupled to the microscope through a series of dichroic beam splitters. The autofocus routine, which takes extra time, runs only if an acquired image is out of focus. It will then store the Z axis position information to be used upon return to that section of that array during the next imaging cycle. By mapping the autofocus position for each location on the substrate we will drastically reduce the time required for image acquisition.
  • Each array requires about 12-24 cycles to decode. Each cycle consists of a hybridization, wash, array imaging, and strip-off step. These steps, in their respective order, may take for the above example 5, 2, 12, and 5 minutes each, for a total of 24 minutes per cycle, or roughly 5-10 hours for each array, if the operations were performed linearly. The time to decode each array can be reduced by a factor of two by allowing the system to image constantly. To accomplish this, the imaging of two separate substrates on each microscope is staggered. While one substrate is being reacted, the other substrate is imaged.
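  • The timing figures in the two preceding paragraphs can be reproduced with a short calculation. The following is a minimal sketch with illustrative names, assuming the steps run strictly one after another.

```python
def array_image_seconds(images_per_array, exposure_ms, stage_move_ms):
    """Time to image one array when each image costs exposure plus stage movement."""
    return images_per_array * (exposure_ms + stage_move_ms) / 1000.0

def decode_hours(cycles, step_minutes):
    """Total decoding time when every cycle runs its steps sequentially."""
    return cycles * sum(step_minutes) / 60.0

per_array_s = array_image_seconds(30, 150, 100)     # 30 images at 250 ms each
per_substrate_min = per_array_s * 96 / 60.0         # 96 arrays on each substrate
steps = (5, 2, 12, 5)                               # hybridize, wash, image, strip off
low_h, high_h = decode_hours(12, steps), decode_hours(24, steps)
print(per_array_s, per_substrate_min, low_h, high_h)  # 7.5 12.0 4.8 9.6
```

  • The 4.8 to 9.6 hour range corresponds to the roughly 5-10 hours stated above; staggering two substrates on each microscope keeps the imager busy during the reaction steps.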
  • An exemplary decoding cycle using cSBH includes the following steps: (i) set temperature of array to hybridization temperature (usually in the range 5-25° C.); (ii) use robot pipetter to pre mix a small amount of decoding probe with the appropriate amount of hybridization buffer; (iii) pipette mixed reagents into hybridization chamber; (iv) hybridize for predetermined time; (v) drain reagents from chamber using pump (syringe or other); (vi) add a buffer to wash mismatches of non-hybrids; (vii) adjust chamber temperature to appropriate wash temp (about 10-40° C.); (viii) drain chamber; (ix) add more wash buffer if needed to improve imaging; (x) image each array, preferably with a mid power (20×) microscope objective optically coupled to a high pixel count high sensitivity CCD camera, or cameras; plate stage moves chambers (or perhaps flow-cells with input funnels) over object, or objective-optics assembly moves under chamber; certain optical arrangements, using di-chroic mirrors/beam-splitters can be employed to collect multi-spectral images simultaneously, thus decreasing image acquisition time; arrays can be imaged in sections or whole, depending on array/image size/pixel density; sections can be assembled by aligning images using statistically significant empty regions pre-coded onto substrate (during active site creation) or can be made using a multi-step nano-printing technique, for example sites (grid of activated sites) can be printed using specific capture probe, leaving empty regions in the grid; then print a different pattern or capture probe in that region using separate print head; (xi) drain chamber and replace with probe strip buffer (or use the buffer already loaded) then heat chamber to probe stripoff temperature (60-90° C.); high pH buffer may be used in the strip-off step to reduce stripoff temperature; wait for the specified time; (xii) remove buffer; (xiii) start next cycle with next decoding probe pool in set.
  • Labels and Signal Generation by Probes Directed to Polynucleotides on Arrays of the Invention
  • The oligonucleotide probes of the invention can be labeled in a variety of ways, including the direct or indirect attachment of radioactive moieties, fluorescent moieties, colorimetric moieties, chemiluminescent moieties, and the like. Many comprehensive reviews of methodologies for labeling DNA and constructing DNA adaptors provide guidance applicable to constructing oligonucleotide probes of the present invention. Such reviews include Kricka, Ann. Clin. Biochem., 39: 114-129 (2002); Schaferling et al, Anal. Bioanal. Chem., (Apr. 12, 2006); Matthews et al, Anal. Biochem., Vol 169, pgs. 1-25 (1988); Haugland, Handbook of Fluorescent Probes and Research Chemicals, Tenth Edition (Invitrogen/Molecular Probes, Inc., Eugene, 2006); Keller and Manak, DNA Probes, 2nd Edition (Stockton Press, New York, 1993); and Eckstein, editor, Oligonucleotides and Analogues: A Practical Approach (IRL Press, Oxford, 1991); Wetmur, Critical Reviews in Biochemistry and Molecular Biology, 26: 227-259 (1991); Hermanson, Bioconjugate Techniques (Academic Press, New York, 1996); and the like. Many more particular methodologies applicable to the invention are disclosed in the following sample of references: Fung et al, U.S. Pat. No. 4,757,141; Hobbs, Jr., et al U.S. Pat. No. 5,151,507; Cruickshank, U.S. Pat. No. 5,091,519; (synthesis of functionalized oligonucleotides for attachment of reporter groups); Jablonski et al, Nucleic Acids Research, 14: 6115-6128 (1986)(enzyme-oligonucleotide conjugates); Ju et al, Nature Medicine, 2: 246-249 (1996); Bawendi et al, U.S. Pat. No. 6,326,144 (derivatized fluorescent nanocrystals); Bruchez et al, U.S. Pat. No. 6,274,323 (derivatized fluorescent nanocrystals); and the like.
  • In one aspect, one or more fluorescent dyes are used as labels for the oligonucleotide probes, e.g. as disclosed by Menchen et al, U.S. Pat. No. 5,188,934 (4,7-dichlorofluorescein dyes); Bergot et al, U.S. Pat. No. 5,366,860 (spectrally resolvable rhodamine dyes); Lee et al, U.S. Pat. No. 5,847,162 (4,7-dichlororhodamine dyes); Khanna et al, U.S. Pat. No. 4,318,846 (ether-substituted fluorescein dyes); Lee et al, U.S. Pat. No. 5,800,996 (energy transfer dyes); Lee et al, U.S. Pat. No. 5,066,580 (xanthene dyes); Mathies et al, U.S. Pat. No. 5,688,648 (energy transfer dyes); and the like. Labeling can also be carried out with quantum dots, as disclosed in the following patents and patent publications, incorporated herein by reference: 6,322,901; 6,576,291; 6,423,551; 6,251,303; 6,319,426; 6,426,513; 6,444,143; 5,990,479; 6,207,392; 2002/0045045; 2003/0017264; and the like. As used herein, the term “fluorescent signal generating moiety” means a signaling means which conveys information through the fluorescent absorption and/or emission properties of one or more molecules. Such fluorescent properties include fluorescence intensity, fluorescence life time, emission spectrum characteristics, energy transfer, and the like.
  • Commercially available fluorescent nucleotide analogues readily incorporated into the labeling oligonucleotides include, for example, Cy3-dCTP, Cy3-dUTP, Cy5-dCTP, Cy5-dUTP (Amersham Biosciences, Piscataway, N.J., USA), fluorescein-12-dUTP, tetramethylrhodamine-6-dUTP, Texas Red®-5-dUTP, Cascade Blue®-7-dUTP, BODIPY® FL-14-dUTP, BODIPY® R-14-dUTP, BODIPY® TR-14-dUTP, Rhodamine Green™-5-dUTP, Oregon Green® 488-5-dUTP, Texas Red®-12-dUTP, BODIPY® 630/650-14-dUTP, BODIPY® 650/665-14-dUTP, Alexa Fluor® 488-5-dUTP, Alexa Fluor® 532-5-dUTP, Alexa Fluor® 568-5-dUTP, Alexa Fluor® 594-5-dUTP, Alexa Fluor® 546-14-dUTP, fluorescein-12-UTP, tetramethylrhodamine-6-UTP, Texas Red®-5-UTP, Cascade Blue®-7-UTP, BODIPY® FL-14-UTP, BODIPY® TMR-14-UTP, BODIPY® TR-14-UTP, Rhodamine Green™-5-UTP, Alexa Fluor® 488-5-UTP, Alexa Fluor® 546-14-UTP (Molecular Probes, Inc. Eugene, Oreg., USA). Other fluorophores available for post-synthetic attachment include, inter alia, Alexa Fluor® 350, Alexa Fluor® 532, Alexa Fluor® 546, Alexa Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 647, BODIPY 493/503, BODIPY FL, BODIPY R6G, BODIPY 530/550, BODIPY TMR, BODIPY 558/568, BODIPY 558/568, BODIPY 564/570, BODIPY 576/589, BODIPY 581/591, BODIPY 630/650, BODIPY 650/665, Cascade Blue, Cascade Yellow, Dansyl, lissamine rhodamine B, Marina Blue, Oregon Green 488, Oregon Green 514, Pacific Blue, rhodamine 6G, rhodamine green, rhodamine red, tetramethylrhodamine, Texas Red (available from Molecular Probes, Inc., Eugene, Oreg., USA), and Cyt, Cy3.5, Cy5.5, and Cy7 (Amersham Biosciences, Piscataway, N.J. USA, and others). FRET tandem fluorophores may also be used, such as PerCP-Cy5.5, PE-Cy5, PE-Cy5.5, PE-Cy7, PE-Texas Red, and APC-Cy7; also, PE-Alexa dyes (610, 647, 680) and APC-Alexa dyes. Biotin, or a derivative thereof, may also be used as a label on a detection oligonucleotide, and subsequently bound by a detectably labeled avidin/streptavidin derivative (e.g. 
phycoerythrin-conjugated streptavidin), or a detectably labeled anti-biotin antibody. Digoxigenin may be incorporated as a label and subsequently bound by a detectably labeled anti-digoxigenin antibody (e.g. fluoresceinated anti-digoxigenin). An aminoallyl-dUTP residue may be incorporated into a detection oligonucleotide and subsequently coupled to an N-hydroxy succinimide (NHS) derivatized fluorescent dye, such as those listed supra. In general, any member of a conjugate pair may be incorporated into a detection oligonucleotide provided that a detectably labeled conjugate partner can be bound to permit detection. As used herein, the term antibody refers to an antibody molecule of any class, or any subfragment thereof, such as an Fab. Other suitable labels for detection oligonucleotides may include fluorescein (FAM), digoxigenin, dinitrophenol (DNP), dansyl, biotin, bromodeoxyuridine (BrdU), hexahistidine (6×His), phospho-amino acids (e.g. P-tyr, P-ser, P-thr), or any other suitable label. In one embodiment the following hapten/antibody pairs are used for detection, in which each of the antibodies is derivatized with a detectable label: biotin/α-biotin, digoxigenin/α-digoxigenin, dinitrophenol (DNP)/α-DNP, 5-Carboxyfluorescein (FAM)/α-FAM. As described in schemes below, probes may also be indirectly labeled, especially with a hapten that is then bound by a capture agent, e.g. as disclosed in Holtke et al, U.S. Pat. Nos. 5,344,757; 5,702,888; and 5,354,657; Huber et al, U.S. Pat. No. 5,198,537; Miyoshi, U.S. Pat. No. 4,849,336; Misiura and Gait, PCT publication WO 91/17160; and the like. Many different hapten-capture agent pairs are available for use with the invention. Exemplary haptens include biotin, des-biotin and other derivatives, dinitrophenol, dansyl, fluorescein, Cy5 and other dyes, digoxigenin, and the like. For biotin, a capture agent may be avidin, streptavidin, or antibodies. Antibodies may be used as capture agents for the other haptens (many dye-antibody pairs being commercially available, e.g. Molecular Probes).
  • Kits of the Invention
  • In the commercialization of the methods described herein, certain kits for construction of random arrays of the invention and for using the same for various applications are particularly useful. Kits for applications of random arrays of the invention include, but are not limited to, kits for determining the nucleotide sequence of a target polynucleotide, kits for large-scale identification of differences between reference DNA sequences and test DNA sequences, kits for profiling exons, and the like. A kit typically comprises at least one support having a surface and one or more reagents necessary or useful for constructing a random array of the invention or for carrying out an application therewith. Such reagents include, without limitation, nucleic acid primers, probes, adaptors, enzymes, and the like, and are each packaged in a container, such as, without limitation, a vial, tube or bottle, in a package suitable for commercial distribution, such as, without limitation, a box, a sealed pouch, a blister pack and a carton. The package typically contains a label or packaging insert indicating the uses of the packaged materials. As used herein, “packaging materials” includes any article used in the packaging for distribution of reagents in a kit, including without limitation containers, vials, tubes, bottles, pouches, blister packaging, labels, tags, instruction sheets and package inserts.
  • In one aspect, the invention provides a kit for making a random array of concatemers of DNA fragments from a source nucleic acid comprising the following components: (i) a support having a surface; and (ii) at least one adaptor oligonucleotide for ligating to each DNA fragment and forming a DNA circle therewith, each DNA circle capable of being replicated by a rolling circle replication reaction to form a concatemer that is capable of being randomly disposed on the surface. In such kits, the surface may be a planar surface having an array of discrete spaced apart regions, wherein each discrete spaced apart region has a size equivalent to that of said concatemers. The discrete spaced apart regions may form a regular array with a nearest neighbor distance in the range of from 0.1 to 20 μm. The concatemers on the discrete spaced apart regions may have a nearest neighbor distance such that they are optically resolvable. The discrete spaced apart regions may have capture oligonucleotides attached and the adaptor oligonucleotides may each have a region complementary to the capture oligonucleotides such that the concatemers are capable of being attached to the discrete spaced apart regions by formation of complexes between the capture oligonucleotides and the complementary regions of the adaptor oligonucleotides.
In some embodiments, the concatemers are randomly distributed on said discrete spaced apart regions and the nearest neighbor distance is in the range of from 0.3 to 3 μm. Such kits may further comprise (a) a terminal transferase for attaching a homopolymer tail to said DNA fragments to provide a binding site for a first end of said adaptor oligonucleotide, (b) a ligase for ligating a strand of said adaptor oligonucleotide to ends of said DNA fragment to form said DNA circle, (c) a primer for annealing to a region of the strand of said adaptor oligonucleotide, and (d) a DNA polymerase for extending the primer annealed to the strand in a rolling circle replication reaction. The above adaptor oligonucleotide may have a second end having a number of degenerate bases in the range of from 4 to 12.
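  • As a rough illustration of what the nearest neighbor distances above imply for array density, the following sketch converts a pitch in micrometres into sites per square millimetre. It assumes a square grid, which is a simplification chosen for illustration and is not stated in the text; the function name `sites_per_mm2` is hypothetical.

```python
def sites_per_mm2(pitch_um: float) -> float:
    """Sites per square millimetre for a square grid of the given pitch (um).

    Assumption: regions sit on a square lattice, so the nearest neighbor
    distance equals the grid pitch. A hexagonal lattice would pack more.
    """
    sites_per_mm = 1000.0 / pitch_um  # 1 mm = 1000 um
    return sites_per_mm ** 2


if __name__ == "__main__":
    # Pitches spanning the ranges quoted in the text (0.1-20 um, 0.3-3 um).
    for pitch in (0.1, 0.3, 1.0, 3.0, 20.0):
        print(f"{pitch:>5} um pitch: {sites_per_mm2(pitch):,.0f} sites per mm^2")
```

Under this assumption, a 1 μm pitch corresponds to one million sites per mm² and a 20 μm pitch to 2,500 sites per mm².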
  • In another aspect the invention provides kits for sequencing a target polynucleotide comprising the following components: (i) a support having a planar surface having an array of optically resolvable discrete spaced apart regions, wherein each discrete spaced apart region has an area of less than 1 μm²; (ii) a first set of probes for hybridizing to a plurality of concatemers randomly disposed on the discrete spaced apart regions, the concatemers each containing multiple copies of a DNA fragment of the target polynucleotide; and (iii) a second set of probes for hybridizing to the plurality of concatemers such that whenever a probe from the first set hybridizes contiguously to a probe from the second set, the probes are ligated. Such kits may further include a ligase, a ligase buffer, and a hybridization buffer. In some embodiments, the discrete spaced apart regions may have capture oligonucleotides attached and the concatemers may each have a region complementary to the capture oligonucleotides such that said concatemers are capable of being attached to the discrete spaced apart regions by formation of complexes between the capture oligonucleotides and the complementary regions of said concatemers.
  • In still another aspect, the invention provides kits for constructing a single molecule array comprising the following components: (i) a support having a surface having reactive functionalities; and (ii) a plurality of macromolecular structures each having a unique functionality and multiple complementary functionalities, the macromolecular structures being capable of being attached randomly on the surface wherein the attachment is formed by one or more linkages formed by reaction of one or more reactive functionalities with one or more complementary functionalities; and wherein the unique functionality is capable of selectively reacting with a functionality on an analyte molecule to form the single molecule array. In some embodiments of such kits, the surface is a planar surface having an array of discrete spaced apart regions containing said reactive functionalities and wherein each discrete spaced apart region has an area less than 1 μm². In further embodiments, the discrete spaced apart regions form a regular array with a nearest neighbor distance in the range of from 0.1 to 20 μm. In further embodiments, the concatemers on the discrete spaced apart regions have a nearest neighbor distance such that they are optically resolvable. In still further embodiments, the macromolecular structures may be concatemers of one or more DNA fragments and wherein the unique functionalities are at a 3′ end or a 5′ end of the concatemers.
  • In another aspect, the invention includes kits for circularizing DNA fragments comprising the components: (a) at least one adaptor oligonucleotide for ligating to one or more DNA fragments and forming DNA circles therewith, (b) a terminal transferase for attaching a homopolymer tail to said DNA fragments to provide a binding site for a first end of said adaptor oligonucleotide, (c) a ligase for ligating a strand of said adaptor oligonucleotide to ends of said DNA fragment to form said DNA circle, (d) a primer for annealing to a region of the strand of said adaptor oligonucleotide, and (e) a DNA polymerase for extending the primer annealed to the strand in a rolling circle replication reaction. In an embodiment of such kit, the above adaptor oligonucleotide may have a second end having a number of degenerate bases in the range of from 4 to 12. The above kit may further include reaction buffers for the terminal transferase, ligase, and DNA polymerase. In still another aspect, the invention includes a kit for circularizing DNA fragments using a Circligase enzyme (Epicentre Biotechnologies, Madison, Wis.), which kit comprises a volume exclusion polymer. In another aspect, such kit further includes the following components: (a) reaction buffer for controlling pH and providing an optimized salt composition for Circligase, and (b) Circligase cofactors. In another aspect, a reaction buffer for such kit comprises 0.5 M MOPS (pH 7.5), 0.1 M KCl, 50 mM MgCl2, and 10 mM DTT. In another aspect, such kit includes Circligase, e.g. 10-100 μL Circligase solution (at 100 units/μL). Exemplary volume exclusion polymers are disclosed in U.S. Pat. No. 4,886,741, which is incorporated by reference, and include polyethylene glycol, polyvinylpyrrolidone, dextran sulfate, and like polymers. In one aspect, polyethylene glycol (PEG) is 50% PEG4000. In one aspect, a kit for circle formation includes the following:
  • Amount    Component                                  Final Conc.
      2 μL    Circligase 10X reaction buffer             1X
    0.5 μL    1 mM ATP                                   25 μM
    0.5 μL    50 mM MnCl2                                1.25 mM
      4 μL    50% PEG4000                                10%
      2 μL    Circligase ssDNA ligase (100 units/μL)     10 units/μL
              single stranded DNA template               0.5-10 pmol/μL
              sterile water
  • Final reaction volume: 20 μL. The above components are used in the following protocol:
      • Heat DNA at 60-96° C. depending on the length of the DNA (ssDNA templates that have a 5′-phosphate and a 3′-hydroxyl group).
      • Preheat 2.2× reaction mix at 60° C. for about 5-10 min.
      • If the DNA was preheated to 96° C., cool it down to 60° C.
      • Mix DNA and buffer at 60° C. without cooling it down and incubate for 2-3 h.
      • Heat inactivate the enzyme to stop the ligation reaction.
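  • The final concentrations listed for the circle formation reaction follow from simple dilution into the 20 μL reaction volume (final = stock × volume added / total volume). The sketch below checks that arithmetic using the amounts and stock concentrations given in the table; the variable and function names are illustrative only.

```python
REACTION_VOLUME_UL = 20.0  # final reaction volume stated in the text

# (component, volume added in uL, stock concentration, unit of the stock)
COMPONENTS = [
    ("Circligase reaction buffer", 2.0, 10.0, "X"),
    ("ATP", 0.5, 1000.0, "uM"),  # 1 mM stock expressed in uM
    ("MnCl2", 0.5, 50.0, "mM"),
    ("PEG4000", 4.0, 50.0, "%"),
    ("Circligase ssDNA ligase", 2.0, 100.0, "units/uL"),
]


def final_concentration(volume_ul, stock, total_ul=REACTION_VOLUME_UL):
    """Dilution formula: stock concentration scaled by the volume fraction."""
    return stock * volume_ul / total_ul


FINAL = {name: final_concentration(vol, stock) for name, vol, stock, _ in COMPONENTS}

if __name__ == "__main__":
    for name, _, _, unit in COMPONENTS:
        print(f"{name}: {FINAL[name]:g} {unit}")
```

The computed values (1X buffer, 25 μM ATP, 1.25 mM MnCl2, 10% PEG4000, 10 units/μL ligase) agree with the final concentration column of the table.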
  • Large-Scale Mutation Discovery by Mismatch Enzyme Cleavage
  • Arrays and sequencing methods of the invention may be used for large-scale identification of polymorphisms using mismatch cleavage techniques. Several approaches to mutation detection employ a heteroduplex in which the mismatch itself is utilized for cleavage recognition. Chemical cleavage with piperidine at mismatches modified with hydroxylamine or osmium tetroxide provides one approach to release a cleaved fragment. In a similar way the enzymes T7 endonuclease I or T4 endonuclease VII have been used in the enzyme mismatch cleavage (EMC) techniques, e.g. Youil et al, Proc. Natl. Acad. Sci., 92: 87-91 (1995); Mashal et al, Nature Genetics, 9: 177-183 (1995); Babon et al, Molecular Biotechnology, 23: 73-81 (2003); Ellis et al, Nucleic Acids Research, 22: 2710-2711 (1994); and the like, which are incorporated herein by reference. Cleavase is used in the cleavage fragment length polymorphism (CFLP) technique, which has been commercialized by Third Wave Technologies. When single stranded DNA is allowed to fold and adopt a secondary structure, the DNA will form internal hairpin loops at locations dependent upon the base sequence of the strand. Cleavase will cut single stranded DNA five-prime of the loop and the fragments can then be separated by PAGE or similar size resolving techniques. Mismatch binding proteins such as MutS and MutY also rely upon the formation of heteroduplexes for their ability to identify mutation sites. Mismatches are usually repaired, but the binding action of the enzymes can be used for the selection of fragments through a mobility shift in gel electrophoresis or by protection from exonucleases, e.g. Ellis et al (cited above).
  • Templates for heteroduplex formation are prepared by primer extension from genomic DNA. For the same genomic region of the reference DNA, an excess of the opposite strand is prepared in the same way as the test DNA but in a separate reaction. The test DNA strand produced is biotinylated and is attached to a streptavidin support. Homoduplex formation is prevented by heating and removal of the complementary strand. The reference preparation is now combined with the single stranded test preparation and annealed to produce heteroduplexes. This heteroduplex is likely to contain a number of mismatches. Residual DNA is washed away before the addition of the mismatch endonuclease, which, if there is a mismatch every 1 kb, would be expected to produce about 10 fragments for a 10 kb primer extension. After cleavage, each fragment can bind an adapter at each end and enter the mismatch-fragment circle selection process.
  • Capture of mismatch cleaved DNA from large genomic fragments. The 5-10 kb genomic fragments prepared from large genomic fragments as described above are biotinylated by the addition of a biotinylated dideoxy nucleotide at the 3-prime end with terminal transferase, and excess biotinylated nucleotide is removed by filtration. A reference BAC clone that covers the same region of sequence is digested with the same six-base cutter to match the fragments generated from the test DNA. The biotinylated genomic fragments are heat denatured in the presence of the BAC reference DNA and slowly annealed to generate biotinylated heteroduplexes. The reference BAC DNA is in large excess to the genomic DNA so the majority of biotinylated products will be heteroduplexes. The biotinylated DNA can then be attached to the surface for removal of the reference DNA. Residual DNA is washed away before the addition of the mismatch endonuclease. After cleavage, each fragment can bind an adapter at each end and enter the mismatch circle selection process as follows. See FIG. 20.
  • Circle Formation from Mismatch Cleavage Products
  • Method I. The heteroduplexes generated above can be used for selection of small DNA circles, as illustrated in FIGS. 7 and 8. As shown in FIG. 7, in this process, heteroduplex (700) of a sample is treated with the mismatch enzyme to create products cleaved on both strands (704 and 706) surrounding the mutation site (702) to produce fragments (707) and (705). T7 endonuclease I or similar enzyme cleaves 5-prime of the mutation site to reveal a 5-prime overhang of varying length on both strands surrounding the mutation. The next phase is to capture the cleaved products in a form suitable for amplification and sequencing. Adapter (710) is ligated to the overhang produced by the mismatch cutting (only fragment (705) shown), but because the nature of the overhang is unknown, at least three adapters are needed and each adapter is synthesized with degenerate bases to accommodate all possible ends. The adapter can be prepared with an internal biotin (708) on the non-circularizing strand to allow capture for buffer exchange and sample cleanup, and also for direct amplification on the surface if desired.
  • Because the intervening sequence between mutations does not need to be sequenced and reduces the sequencing capacity of the system, it is removed when studying genomic-derived samples. Reduction of sequence complexity is accomplished by a type IIs enzyme that cuts the DNA at a point away from the enzyme recognition sequence. In doing so, the cut site and resultant overhangs will be a combination of all base variants. Enzymes that can be used include MmeI (20 bases with 2 base 3′ overhang) and EcoP15I (with 25 bases and 2 base 5′ overhang). The adapter is about 50 bp in length to provide sequences for initiation of rolling circle amplification and also provide stuffer sequence for circle formation, as well as recognition site (715) for a type IIs restriction endonuclease. Once the adapter has been ligated to the fragment, the DNA is digested (720) with the type IIs restriction enzyme to release all but 20-25 bases of sequence containing the mutation site that remains attached to the adapter.
  • The adaptered DNA fragment is now attached to a streptavidin support for removal of excess fragment DNA. Excess adapter that did not ligate to mismatch cleaved ends will also bind to the streptavidin solid support. The new degenerate end created by the type IIs enzyme can now be ligated to a second adapter through the phosphorylation of one strand of the second adapter. The other strand is non-phosphorylated and blocked at the 3-prime end with a dideoxy nucleotide. The structure formed is essentially the genomic fragment of interest captured between two different adapters. To create a circle from this structure would simply require both ends of the molecule coming together and ligating, e.g. via formation of staggered ends by digesting at restriction sites (722) and (724), followed by intra-molecular ligation. Although this event should happen efficiently, there is also the possibility that the end of an alternative molecule could ligate at the other end of the molecule, creating a dimer molecule or greater multiples of each unit molecule. One way to minimize this is to perform the ligation under dilute conditions so that only intra-molecular ligation is favored, and then to re-concentrate the sample for future steps. An alternative strategy to maximize the efficiency of circle formation without inter-molecular ligation is to block excess adapters on the surface. This can be achieved by using lambda exonuclease to digest the lower strand. If the second adapter has been attached, then it will be protected from digestion because there is no 5-prime phosphate available. If only the first adapter is attached to the surface, then the 5-prime phosphate is exposed for degradation of the lower strand of the adapter. This will lead to loss of excess first adapter from the surface.
  • After lambda exonuclease treatment, the 5-prime end of the top strand of the first adapter is prepared for ligation to the 3-prime end of the second adapter. This can be achieved by introducing a restriction enzyme site into the adapters so that re-circularization of the molecule can occur with ligation. Amplification of DNA captured into the circular molecules proceeds by a rolling circle amplification to form long linear concatemer copies of the circle. If extension initiates 5-prime of the biotin, the circle and newly synthesized strand are released into solution. Complementary oligonucleotides on the surface are responsible for condensation and provide sufficient attachment for downstream applications. One strand is a closed circle and acts as the template. The other strand, with an exposed 3-prime end, acts as an initiating primer and is extended.
  • Method II. This method, illustrated in FIG. 8, is similar to the procedure above with the following modifications. 1) The adapter can be prepared with a 3-prime biotin (808) on the non-circularized strand to allow capture for buffer exchange and sample cleanup. 2) Reduction of sequence complexity of the 10 kb heteroduplex fragments described above occurs through the use of 4-base cutting restriction enzymes, e.g. with restriction sites (810), (812), and (814). Use of 2 or 3 enzymes in the one reaction could reduce the genomic fragment size down to about 100 bases. The adapter-DNA fragment can be attached to a streptavidin support for removal of excess fragment DNA. Excess adapter that did not ligate to mismatch cleaved ends will also bind to the streptavidin solid support. The biotinylated and phosphorylated strand can now be removed by lambda exonuclease, which will degrade from the 5-prime end but leave the non-phosphorylated strand intact. To create a circle from this structure now requires both ends of the molecule coming together and ligating to form the circle. Several approaches are available to form the circle using a bridging oligonucleotide, as described above. A polynucleotide can be added to the 3-prime end with terminal transferase to create a sequence for one half of a bridge oligonucleotide (818) to hybridize to, shown as polyA tail (816). The other half will bind to sequences in the adapter. Alternatively, before addition of the exonuclease, an adapter can be added to the end generated by the 4-base cutter, which will provide sequence for the bridge to hybridize to after removal of one strand by exonuclease. A key aspect of this selection procedure is the ability to select the strand for circularization and amplification. This ensures that only the strand with the original mutation (from the 5-prime overhang) and not the strand from the adapter is amplified. If the 3-prime recessed strand were amplified, then a mismatch from the adapter could create a false base call at or near the site of the mutation. Amplification of DNA captured into the circular molecules proceeds by a rolling circle amplification to form linear concatemer copies of the circle.
  • Alternative applications of mis-match derived circles. The mis-match derived small circular DNA molecules may be amplified by other means such as PCR. Common primer binding sites can be incorporated into the adapter sequences. The amplified material can be used for mutation detection by methods such as Sanger sequencing or array based sequencing.
  • Cell-free clonal selection of cDNAs. Traditional methods of cloning have several drawbacks, including the propensity of bacteria to exclude sequences from plasmid replication and the time consuming and reagent-intensive protocols required to generate clones of individual cDNA molecules. Linear single-stranded DNA can be made from amplifications of DNA molecules that have been closed into a circular form. These large concatemeric, linear forms arise from a single molecule and can act as efficient, isolated targets for PCR when separated into a single reaction chamber, in much the same way a bacterial colony is picked to retrieve the cDNA-containing plasmid. We plan to develop this approach as a means to select cDNA clones without having to pass through a cell-based clonal selection step. The first step of this procedure will involve ligating a gene specific oligonucleotide directed to the 5-prime end with a poly dA sequence for binding to the poly dT sequence of the 3-prime end of the cDNA. This oligonucleotide acts as a bridge to allow T4 DNA ligase to ligate the two ends and form a circle.
  • The second step of the reaction is to use a primer, or the bridging oligonucleotide, for a strand displacing polymerase such as Phi 29 polymerase to create a concatemer of the circle. The long linear molecules will then be diluted and arrayed in 1536-well plates such that wells with single molecules can be selected. To ensure that about 10% of the wells contain 1 molecule, approximately 90% would have to be sacrificed as having no molecules. To detect the wells that are positive, a dendrimer that recognizes a universal sequence in the target is hybridized to generate 10K-100K dye molecules per molecule of target. Excess dendrimer is removed through hybridization to biotinylated capture oligos. The wells are analyzed with a fluorescent plate reader and the presence of DNA scored. Positive wells are then re-arrayed to consolidate the clones into plates with complete wells for further amplification.
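  • The trade-off between single-molecule wells and empty wells can be illustrated with a limiting-dilution calculation. The sketch below assumes that molecules distribute among wells according to a Poisson distribution, which is a standard modeling assumption and not something stated in the text; the function names are illustrative.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k molecules in a well at the given mean loading."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def occupancy(mean: float):
    """Return fractions of wells that are empty, single, or multiply occupied."""
    empty = poisson_pmf(0, mean)
    single = poisson_pmf(1, mean)
    multiple = 1.0 - empty - single
    return empty, single, multiple


if __name__ == "__main__":
    wells = 1536  # plate format mentioned in the text
    for mean in (0.05, 0.11, 0.20):
        empty, single, multiple = occupancy(mean)
        print(
            f"mean {mean:.2f}: empty {empty:.1%}, single {single:.1%}, "
            f"multiple {multiple:.1%}, about {single * wells:.0f} single wells per plate"
        )
```

Under this model, a mean loading of roughly 0.11 molecules per well gives close to 10% single-molecule wells and close to 90% empty wells, with well under 1% of wells holding more than one molecule, which is consistent with the proportions described above.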
  • Splice Variant Detection and Exon Profiling
  • The process described is based on random DNA arrays and “smart” probe pools for the identification and quantification of expression levels of thousands of genes and their splice variants. In eukaryotes, as the primary transcript emerges from the transcription complex, spliceosomes interact with splice sites on the primary transcript to excise out the introns, e.g. Maniatis et al, Nature, 418: 236-243 (2002). However, because of either mutations that alter the splice site sequences, or external factors that affect spliceosome interaction with splice sites, alternative splice sites, or cryptic splice sites, could be selected resulting in expression of protein variants encoded by mRNA with different sets of exons. Surveys of cDNA sequences from large scale EST sequencing projects indicated that over 50% of the genes have known splice variants. In a recent study using a microarray-based approach, it was estimated that as high as 75% of genes are alternatively spliced, e.g. Johnson et al, Science, 302: 2141-2144 (2003).
  • The diversity of proteins generated through alternative splicing could partially contribute to the complexity of biological processes in higher eukaryotes. This also leads to the implication that the aberrant expression of variant protein forms could be responsible for pathogenesis of diseases. Indeed, alternative splicing has been found to associate with various diseases like growth hormone deficiency, Parkinson's disease, cystic fibrosis and myotonic dystrophy, e.g. Garcia-Blanco et al, Nature Biotechnology, 22: 535-546 (2004). Because of the difficulty in isolating and characterizing novel splice variants, the evidence implicating roles of splice variants in cancer could represent the tip of the iceberg. With the availability of tools that could rapidly and reliably characterize splicing patterns of mRNA, it would help to elucidate the role of alternative splicing in cancer and in disease development in general.
  • In one aspect, methods of the invention permit large-scale measurement of splice variants with the following steps: (a) Prepare full length first strand cDNA for targeted or all mRNAs. (b) Circularize the generated full length (or all) first strand cDNA molecules by incorporating an adapter sequence. (c) Using a primer complementary to the adapter sequence, perform rolling circle replication (RCR) of cDNA circles to form concatemers with over 100 copies of the initial cDNA. (d) Prepare random arrays by attaching RCR produced “cDNA balls” to a glass surface coated with capture oligonucleotide complementary to a portion of the adapter sequence; with an advanced submicron patterned surface, one mm² can have between 1-10 million cDNA spots; note that the attachment is a molecular process and does not require robotic spotting of individual “cDNA balls” or concatemers. (e) Starting from pre-made universal libraries of 4096 6-mers and 1024 labeled 5-mers, use a sophisticated computer program and a simple robotic pipettor to create 40-80 pools of about 200 6-mers and 20 5-mers for testing all 10,000 or more exons in the targeted 1000 or more genes, up to all known genes, in the sample organism/tissue. (f) In a 4-8 hour process, hybridize/ligate all probe pools in 40-80 cycles on the same random array using an automated microscope-like instrument with a sensitive 10-mega pixel CCD detector for generating an array image for each cycle. (g) Use a computer program to perform spot signal intensity analysis to identify which cDNA is on which spot, and whether any of the expected exons is missing in any of the analyzed genes. Obtain exact expression levels for each splice variant by counting occurrences in the array.
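  • Two parts of the procedure above lend themselves to a small sketch: the sizes of the universal probe libraries in step (e), and the counting of splice variants in step (g). The example below is a minimal illustration under stated assumptions. The gene names, exon numbers and per-spot calls are hypothetical placeholders, and the sketch starts from exon calls that are already made; it does not model probe pool design or signal intensity analysis.

```python
from collections import Counter
from itertools import product

# Universal libraries: every possible 6-mer and 5-mer over the four bases.
SIX_MERS = ["".join(p) for p in product("ACGT", repeat=6)]
FIVE_MERS = ["".join(p) for p in product("ACGT", repeat=5)]

# Hypothetical reference: exons expected in the full-length transcript of each gene.
EXPECTED_EXONS = {"GENE_A": (1, 2, 3, 4), "GENE_B": (1, 2, 3)}

# Hypothetical per-spot calls: (gene assigned to the spot, exons detected on it).
SPOT_CALLS = [
    ("GENE_A", (1, 2, 3, 4)),
    ("GENE_A", (1, 2, 3, 4)),
    ("GENE_A", (1, 2, 4)),
    ("GENE_B", (1, 2, 3)),
    ("GENE_B", (1, 3)),
    ("GENE_B", (1, 3)),
]


def missing_exons(gene, detected):
    """Expected exons of the gene that were not detected on a spot."""
    return tuple(e for e in EXPECTED_EXONS[gene] if e not in detected)


# Expression level of each variant = number of spots showing that exon pattern.
VARIANT_COUNTS = Counter(SPOT_CALLS)

if __name__ == "__main__":
    print(len(SIX_MERS), "6-mers;", len(FIVE_MERS), "5-mers")
    for (gene, detected), count in sorted(VARIANT_COUNTS.items()):
        print(gene, detected, "missing:", missing_exons(gene, detected), "spots:", count)
```

Because each spot derives from a single cDNA molecule, counting spots per exon pattern gives a digital measure of variant abundance, which is the idea behind the final sentence of step (g).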
  • This system provides a complete analysis of the exon pattern on a single transcript, instead of merely providing information on the ratios of exon usage or quantification of splicing events over the entire population of transcribed genes using the current expression arrays hybridized with labeled mRNA/cDNA. At the maximum limit of its sensitivity, it allows a detailed analysis down to a single molecule of a mRNA type present in only one in hundreds of other cells; this would provide unique potential for early diagnosis of cancer cells. The combination of selective cDNA preparation with an “array of random arrays” in a standard 384-well format and with “smart” pools of universal short probes provides great flexibility in designing assays; for example, deep analysis of a small number of genes in selected samples, or more general analysis in a larger number of samples, or analysis of a large number of genes in a smaller number of samples. The analysis provides simultaneously 1) detection of each specific splice variant, and 2) quantification of expression of wild type and alternatively spliced mRNAs. It can also be used to monitor gross chromosomal alterations based on the detection of gene deletions and gene translocations by loss of heterozygosity and presence of two sub-sets of exons from two genes in the same transcript on a single spot on the random array. The exceptional capacity and informativeness of this assay is coupled with simple sample preparation from very small quantities of mRNA and a fully automated assay based on all pre-made, validated reagents, including libraries of universal labeled and unlabeled probes and primers/adapters that will be ultimately developed for all human and model organism genes. The proposed splice variant profiling process is equivalent to high throughput sequencing of individual full length cDNA clones; rSBH throughput can reach one billion cDNA molecules profiled in a 4-8 hour assay. This system will provide a powerful tool to monitor changes in expression levels of various splice variants during disease emergence and progression. It can enable discovery of novel splice variants or validate known splice variants to serve as biomarkers to monitor cancer progression. It can also provide means to further the understanding of the roles of alternative splice variants and their possible uses as therapeutic targets. The universal nature and flexibility of this low cost and high throughput assay provide great commercial opportunities for cancer research and diagnostics and in all other biomedical areas. This high capacity system is ideal for service providing labs or companies.
  • Preparation of templates for in vitro transcription. Exon sequences are cloned into the multiple cloning sites (MCS) of plasmid pBluescript, or like vector. For the purposes of demonstrating the usefulness of the probe pools, it is not necessary to clone the contiguous full-length sequence, nor to maintain the proper protein coding frame. For genes that are shorter than 1 kb, PCR products are generated from cDNA using gene specific oligos for the full length sequence. For longer genes, PCR products of about 500 bp that correspond to contiguous blocks of exons are generated, and the fragments are ordered by cloning into appropriate cloning sites in the MCS of pBluescript. This is also the approach for cloning the alternatively spliced versions, since the desired variant might not be present in the cDNA source used for PCR.
  • The last site of the MCS is used to insert a string of 40 A's to simulate the polyA tails of cellular mRNA. This is to control for the possibility that the polyA tail might interfere with the sample preparation step described below, although it is not expected to be a problem since a poly-dA tail is incorporated in sample preparation of genomic fragments as described. T7 RNA polymerase will be used to generate the run-off transcripts and the RNA generated will be purified with the standard methods.
  • Preparation of samples for arraying. Because the probe pools are designed for specific genes, cDNA is prepared for those specific genes only. For priming the reverse transcription reactions, gene-specific primers are used; therefore, for 1000 genes, 1000 primers are used. The location of the priming site for the reverse transcription is selected with care, since it is not reasonable to expect the synthesis of cDNA >2 kb to be of high efficiency. It is quite common that the last exon would consist of the end of the coding sequence and a long 3′ untranslated region. In the case of CD44, for example, although the full-length mRNA is about 5.7 kb, the 3′ UTR comprises 3 kb, while the coding region is only 2.2 kb. Therefore the logical location of the reverse transcription primer site is usually immediately downstream of the end of the coding sequence. For some splice variants, the alternative exons are often clustered together as a block to create a region of variability. In the case of Tenascin C variants (8.5 kb), the most common isoform has a block of 8 extra exons, and there is evidence to suggest that there is variability in exon usage in that region. So for Tenascin C, the primer will be located just downstream of that region. Because of the concern of synthesizing cDNA with length >2 kb, for long genes, it might be necessary to divide the exons into blocks of 2 kb with multiple primers.
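  • The idea of dividing a long target region into blocks of at most 2 kb, each served by its own reverse transcription primer, can be sketched as follows. This is a minimal illustration only: the function name `primer_blocks`, the use of equal-sized blocks, and the coordinate convention are assumptions, and a real design would place block boundaries with regard to exon structure and primer quality rather than by length alone.

```python
import math


def primer_blocks(region_start: int, region_end: int, max_block: int = 2000):
    """Split [region_start, region_end) into equal blocks no longer than max_block.

    Returns a list of (start, end) coordinates in bases. Under the scheme in the
    text, a gene-specific reverse transcription primer would sit just downstream
    of each block end (assumption: one primer per block).
    """
    length = region_end - region_start
    n_blocks = max(1, math.ceil(length / max_block))
    size = math.ceil(length / n_blocks)
    blocks = []
    start = region_start
    while start < region_end:
        end = min(start + size, region_end)
        blocks.append((start, end))
        start = end
    return blocks


if __name__ == "__main__":
    # Lengths taken from the examples in the text; coordinates are illustrative.
    for label, length in (("2.2 kb region", 2200), ("8.5 kb region", 8500)):
        blocks = primer_blocks(0, length)
        print(label, "->", len(blocks), "blocks:", blocks)
```

With a 2 kb limit, a 2.2 kb region is split into two blocks and an 8.5 kb region into five, so the number of primers grows with the length of the region to be covered.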
  • Reverse transcription reactions may be carried out with commercial systems, e.g. SuperScript III system from Invitrogen (Carlsbad, Calif.) and the StrataScript system from Stratagene (La Jolla, Calif.). Once single stranded cDNA molecules are produced, the rest of the procedure involves attaching the adaptor sequence, circularizing the molecule, and RCR as described above. The 5′ ends of the cDNAs are essentially the incorporated gene-specific primers used for initiating the reverse transcription. By incorporating a 7 base universal tag on the 5′ end of the reverse-transcription priming oligos, all the cDNA generated will carry the same 7 base sequence at the 5′ end. Thus a single template oligonucleotide that is complementary to both the adaptor sequence and the universal tag can be used to ligate the adaptor to all the target molecules, without using the template oligonucleotide with degenerate bases. As for the 3′ end of the cDNA (5′ end of the mRNA), which is usually ill-defined, it may be treated like a random sequence end of a genomic fragment. Similar methods of adding a polyA tail will be applied, thus the same circle closing reaction may also be used.
  • Reverse transcriptases are prone to terminate prematurely to create truncated cDNAs. Severely truncated cDNAs probably will not have enough probe binding sites to be identified with a gene assignment, and thus would not be analyzed. cDNA molecules that are close to, but not quite, full-length may show up as splice variants with missing 5′ exons. If there is no corroborating evidence from a sequence database to support such variants, they may be discounted. A way to avoid this problem is to select only the full-length cDNA (or those with the desired 3′ end) to be compatible with the circle closing reaction, so that any truncated molecules will be neither circularized nor replicated. First a dideoxy-cytosine residue can be added to the 3′ end of all the cDNA to block ligation, then by using a mismatch oligo targeting the desired sequence, a new 3′ end can be generated by enzyme mismatch cleavage using T4 endonuclease VII. With the new 3′ end, the cDNA can proceed with the addition of a poly-dA tail and with the standard protocols of circularization and replication.
  • Replicated and arrayed concatemers of the exon fragments may be analyzed using combinatorial SBH, as described above. The following steps may be used to select 5-mer and 6-mer probes for use in the technique:
  • Step 1: Select the 1000-2000 shortest exons (total about 20-50 kb), and find the matching sequences for each of the 1024 available labeled 5-mers. On average each 5-mer will occur 20 times over 20 kb, but some may occur over 50 or over 100 times. By selecting the most frequent 5-mer, the largest number of short exons will be detected with a single labeled probe. A goal would be to detect about 50-100 short exons (10%-20% of 500 exons) per cycle. Thus fewer than 10 labeled probes and 50-100 unlabeled 6-mers would be sufficient. A small number of labeled probes is favorable because it minimizes overall fluorescent background.
  • Step 2: Find all 6-mers that are contiguous with all sites in all 1000 genes that are complementary to the 10 selected 5-mers. On average 20 such sites will exist in each 2 kb gene. The total number of sites would be about 20,000, i.e., each 6-mer on average will occur 5 times. Sort the 6-mers by hit frequency. The most frequent may have over 20 hits, i.e., such a 6-mer will detect 20 genes through combinations with the 10 labeled probes. Thus, to get a single probe pair for each of the 500 genes, a minimum of 25 6-mer probes would be required. Realistically, 100 to 200 6-mers may be required.
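  • The frequency-based selection in Steps 1 and 2 can be read as a greedy coverage loop. The following Python sketch is illustrative only and is not taken from the source: the function names `kmers_in` and `pick_labeled_kmers`, the input format, and the tie-breaking rule are assumptions, and for simplicity k-mers are matched on the exon strand itself rather than as probe complements.

```python
from collections import Counter


def kmers_in(seq, k):
    """Return the set of distinct k-mers that occur in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def pick_labeled_kmers(exons, max_probes=10, k=5):
    """Greedily pick k-mers that each hit the most not-yet-covered exons.

    exons: dict mapping an exon name to its sequence (hypothetical format).
    Returns a list of (kmer, newly_covered_exon_names) in selection order.
    """
    uncovered = dict(exons)
    chosen = []
    while uncovered and len(chosen) < max_probes:
        # Count, for every k-mer, how many uncovered exons contain it.
        counts = Counter()
        for seq in uncovered.values():
            counts.update(kmers_in(seq, k))
        if not counts:
            break  # remaining exons are shorter than k
        # Most frequent k-mer; ties are broken alphabetically (an assumption).
        best = min(counts, key=lambda m: (-counts[m], m))
        hit = sorted(name for name, seq in uncovered.items() if best in seq)
        chosen.append((best, hit))
        for name in hit:
            del uncovered[name]
    return chosen
```

  • The same loop could be rerun with k=6 on the sites adjacent to the selected 5-mers to rank the unlabeled 6-mers by hit frequency, as Step 2 describes.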
  • Due to the benefits of combinatorial SBH, which uses pre-made libraries of 6-mer and 5-mer probes, 40 probe pools are readily prepared with about 200 probes per pool using established pipetting robotics. The information generated is equivalent to having over 3 probes per exon; therefore, the use of 8000 5-mers and 6-mers effectively replaces the 30,000 longer exon-specific probes required for a single set of 1000 genes.
  • Exon profiling. The profiling of exons can be performed in two phases: the gene identification phase and the exon identification phase. In the gene identification phase, each concatemer on the array can be uniquely identified with a particular gene. In theory, 10 probe pools or hybridization cycles will be enough to identify 1000 genes using the following scheme. Each gene is assigned a unique binary code. The number of binary digits thus depends on the total number of genes: 3 digits for 8 genes, 10 digits for 1024 genes. Each probe pool is designed to correspond to a digit of the binary code and would contain probes that would hit a unique combination of half of the genes, with one hit per gene only. Thus for each hybridization cycle, a unique half of the genes will score a 1 for that digit and the other half will score zero. Ten hybridization cycles with 10 probe pools will generate 1024 unique binary codes, enough to assign 1000 unique genes to all the concatemers on the array. To provide redundancy in the identification data, 15-20 cycles would be used. If 20 cycles are used, it would provide 1 million unique binary codes and there should be enough information to account for loss of signals due to missing exons or gene deletions. It will also be equivalent to having 10 data points per gene (20 cycles of 500 data points each give 10,000 data points total), or one positive probe-pair per exon, on average. At this point, after 20 cycles, this system is capable of assigning 1 million unique gene identities to the ampliots. Therefore by counting gene identities of the ampliots, one can determine quantitatively the expression level of all the genes (but not sub-typing of splice variants) in any given sample.
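  • The binary coding scheme of the gene identification phase can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the method of the source: the function names are hypothetical, each gene's code is simply its list index written in binary, and hybridization signals are assumed to be already thresholded to 0 or 1.

```python
def assign_codes(genes, n_bits):
    """Give each gene a distinct binary code of n_bits digits (its index)."""
    if len(genes) > 2 ** n_bits:
        raise ValueError("not enough binary digits for this many genes")
    return {gene: format(i, "0{}b".format(n_bits)) for i, gene in enumerate(genes)}


def pools_from_codes(codes, n_bits):
    """Pool k holds the genes whose code has a 1 at digit k (one pool per cycle)."""
    return [sorted(g for g, c in codes.items() if c[k] == "1")
            for k in range(n_bits)]


def decode_spot(signals, codes):
    """Map the per-cycle 0/1 signals of one spot back to a gene, or None."""
    observed = "".join(str(int(bool(s))) for s in signals)
    by_code = {c: g for g, c in codes.items()}
    return by_code.get(observed)


# Code capacity quoted in the text: 10 cycles and 20 cycles.
CAPACITY_10 = 2 ** 10   # 1024 codes, enough for 1000 genes
CAPACITY_20 = 2 ** 20   # 1,048,576 codes, i.e. about 1 million
```

  • With index-based codes and fewer genes than available codes, the pools are only approximately half of the genes each, and the all-zero code cannot be told apart from a spot with no signal; a practical design would presumably choose the codes to avoid both issues.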
  • After identifying each ampliot with a gene assignment, its exon pattern will be profiled in the exon identification phase. For the exon identification phase, one exon per gene in all or most of the genes is tested per hybridization cycle. In most cases 10-20 exon identification cycles should be sufficient. Thus, in the case of using 20 exon identification cycles we will obtain information of 2 probes per each of 10 exons in each gene. For genes with more than 20 exons, methods can be developed so that 2 exons per gene can be probed at the same cycle. One possibility is using multiple fluorophores of different colors, and another possibility is to exploit differential hybrid stabilities of different ligation probe pairs.
  • In conclusion, a total of about 40 assay cycles will provide sufficient information to obtain gene identity at each spot and to provide three matching probe-pairs for each of 10,000 exons with enough informational redundancy to provide accurate identification of missing exons due to alternative splicing or chromosomal deletions.
  • Example 1: Glass Cover Slip as Random Array Support: Derivatization Protocol
  • concatemers. The following materials are used:
    • Millipore DI water
    • 2.5 ml of 3-Aminopropyldimethylethoxysilane (Gelest)
    • 1.6 grams p-phenylenediisothiocyanate (Acros Organics/fisher)
    • 210 grams KOH (VWR)
    • Ethanol (VWR)
    • Methanol (VWR)
    • Pyridine (VWR)
    • N,N-dimethylformamide (VWR)
    • Acetone (VWR)
    • Equipment
    • 100° C. oven
    • magnetic stir plate
    • 1 2″×.5″ magnetic stir bar
    • 2 4 liter Nunc beaker
    • 7 4″×8″×4″ glass containers
    • 1 liter graduated cylinder
    • 1 100 ml graduated cylinder
    • 1 lab scale
    • 1 Mettler scale
    • 1 large weigh boat
    • 1 small weigh boat
    • 1 pair thick nitrile gloves
    • 1 large funnel
    • 1 ml pipettman with filter tips
    • 1 nalgene stir bar
    • 1 airtight container (tupperware)
  • Using the large graduated cylinder, measure 950 ml of ethanol and add it to the 4 liter Nunc beaker. Measure 50 ml of DI water in the small graduated cylinder and add to the same Nunc beaker. Measure out 210 grams of KOH pellets in a weigh boat on the lab scale. Add stir bar and KOH pellets to the beaker. Place beaker on stir plate and stir at low speed until KOH is completely dissolved. While KOH is dissolving, lay out 6 pre-washed glass containers. Fill containers 2-5 with DI water until ½ inch from top (800 ml). Fill container 6 with acetone ½″ to top. Carefully pour dissolved KOH solution into container 1 until ½″ to top. Add racked cover slips to container 1, wait 3 minutes, remove racks from container 1, and wash in containers 2-5, leaving racks in each container a minimum of 15 seconds. Submerse racks briefly in container 6. Set aside racks, dispose of the solutions from containers 1 and 2 in the basic waste container using the large funnel and thick nitrile gloves, then clean and dry labware. Lay out 7 clean and dry glass containers. Add 775 ml of acetone to container 1, then add 2.5 ml of DI water to container 1. Stir container 1 with pipette tip for 20 seconds. With a new pipette tip add 2.5 ml of 3-aminopropyldimethylethoxysilane to container 1. Stir with pipette tip for 10 seconds. Immerse all 5 racks of cover slips into container 1. Cover container 1 with polypropylene box top. Wait 45 minutes. 15 minutes prior to the completion of the reaction, fill containers 2-4 until ½″ to top with acetone, and fill container 5 with water ½″ to top. Fill container 6 until ½″ to top with acetone. Upon reaction completion (45 minutes), transfer cover slip racks 1-5 from container 1 to container 2 and wait 15 seconds. Repeat this through container 6. Place racks into empty container 7 and put in 100° C. oven. Wait one hour.
  • Lay out 7 glass containers. After racks come out of oven, use the Mettler scale to weigh out 1.6 grams of p-phenylenediisothiocyanate (PDC) in the small weigh boat. Pour 720 ml dimethylformamide into the cleaned 1 liter graduated cylinder, then fill to 800 ml with pyridine. Pour 50% of this solution into a clean glass container, then pour it back into the cylinder to mix (repeat once). Fill container 1 until ½″ to top with this solution. Add the PDC from the weigh boat to container 1. Use stir bar to mix solution. Crush PDC clumps that refuse to dissolve, then stir again. Cover slip racks should be cool by now. Place all 5 racks into container 1. Cover with polypropylene box top. Wait 2 hours. 10 minutes prior to reaction completion, fill containers 2 and 3 with methanol until ½″ from top. Fill containers 4 and 5 with acetone until ½″ from top. Fill container 6 with 65% acetone 35% water until ½″ from top. Fill container 7 with acetone. Successively transfer racks through all containers, waiting 15 seconds between each transfer.
  • Remove racks from container 7 and dump contents of containers 1-7 into organic waste drum. Return racks to container 7 and dry in oven for 15 minutes. Place dry racks into airtight container; they are now ready for attachment.
  • Example 2: Preparation of RCR Products from E. coli Genomic DNA & Disposition onto a Glass Cover Slip
  • E. coli genomic DNA (32 ug) (Sigma Chemical Co) was fragmented with 0.16 U of DnaseI (Epicentre) at 37° C. for 10 min and then heat inactivated at 95° C. for 10 min. Reaction products were distributed with an average size of 200 bp as determined by agarose gel electrophoresis. If reaction products did not meet the required size distribution they were further digested with the addition of fresh enzyme. The final concentration was 200 ng/ul of genomic DNA.
  • The Dnase digested DNA (26 ng/ul) was reacted with terminal deoxynucleotidyl transferase (0.66 U/ul) from New England Biolabs (NEB) in reaction buffer supplied by NEB. The reaction contained dATP (2 mM) and was performed at 37 C for 30 min and then heat inactivated at 70 C for 10 min. The DNA sample was then heated to 95 C for 5 min before rapid cooling on ice.
  • A synthetic DNA adapter was then ligated to the 5′ end of the genomic DNA by first forming a hybrid of a 65-base oligonucleotide (TATCATCTACTGCACTGACCGGATGTTAGGAAGACAAAAGGAAGCTGAGGGTCACATTAACGGAC) (SEQ ID NO: 8) with a second oligonucleotide (NNNNNNNGTCCGTTAATGTGAC 3′ 2′3′ddC) (SEQ ID NO: 9) at the 3′ end of the 65mer in which the 7 “Ns” form an overhang. The shorter oligo will act as a splint for ligation of the 65mer to the 5′ end of the genomic fragments. The splint molecule includes 7 degenerate bases at its 5′ end to hybridize to variable bases at the 5′ end of the genomic DNA. The adapter hybrid was formed by slowly hybridizing 1200 pmol of adapter with 1200 pmol of splint in 52 ul from 95 C to room temperature over 1 hr.
  • T4 DNA Ligase (0.3 U/ul) was combined with genomic DNA (17 ng/ul) and adapter-splint (0.5 uM) in 1× ligase reaction buffer supplied by NEB. The ligation proceeded at 15 C for 30 min, 20 C for 30 min and then inactivated at 70 C for 10 min. A second splint molecule (AGATGATATTTTTTTT 3′ 2′3′ddC) (SEQ ID NO: 10) (0.6 uM) was then added to the reaction and the mix was supplemented with more ligase buffer and T4 DNA ligase (0.3 U/ul). The reaction proceeded at 15 C for 30 min and then at 20 C for 30 min before inactivation for 10 min at 70 C.
  • The ligation mix was then treated with exonuclease I (NEB) (1 U/ul) at 37 C for 60 min, followed by inactivation at 80 C for 20 min.
  • Rolling circle replication was performed in reaction buffer supplied by NEB with BSA (0.1 ug/ul), 0.2 mM each dNTP, an initiating primer (TCAGCTTCCTTTTGTCTTCCTAAC) (SEQ ID NO: 11) at 2 fmol/ul, exonuclease treated ligation of genomic DNA at 24 pg/ul, and Phi 29 polymerase (0.2 U/ul). The reaction was performed for 1 hr at 30 C and then heat inactivated at 70 C for 10 min.
  • RCR reaction products were attached to the surface of cover slips by first attaching amine modified oligonucleotides to the surface of the cover slips. A capture probe ([AMINOC6][SP-C18][SP-C18]GGATGTTAGGAAGACAAAAGGAAGCTGAGG) (SEQ ID NO: 12) (50 uM) was added to the DITC derivatized cover slips in 0.1 uM NaHCO3 and allowed to dry at 40 C for about 30 min. The cover slips were rinsed in DDI water for 15 min and dried. RCR reaction products (4.5 ul) were then combined with 0.5 ul of 20×SSPE and added to the center of the slide. The sample was allowed to air dry and non-attached material was washed off for 10 min in 3×SSPE and then briefly in DDI water. The slide was then dried before assembly on the microscope. Attached RCR products were visualized by hybridizing an 11mer TAMRA labeled probe that is complementary to a region of the adapter.
  • RCR reaction products were formed from a single stranded 80mer synthetic DNA target (NNNNNNNNGCATANCACGANGTCATNATCGTNCAAACGTCAGTCCANGAATCNAGATCCACTTAGANTAAAAAAAAAAAA) (SEQ ID NO: 13) as above but without poly A addition with TDT. The RCR reaction contained target molecules at an estimated 12.6 fmol/ul. Reaction products (5 ul) were combined with SSPE (2×) and SDS (0.3%) in a total reaction volume of 20 ul. The sample was applied to a cover slip on which lines of capture probe ([AMINOC6][SP-C18][SP-C18]GGATGTTAGGAAGACAAAAGGAAGCTGAGG), deposited in a solution of 50 uM with 0.1 uM NaHCO3, had been dried onto the surface, and was left in a humid chamber for 30 min. The solution was then washed off in 3×SSPE for 10 min and then briefly in water. Various reaction components were tested for their effect upon RCR product formation. The addition of Phi 29 to the RCR reaction at a final concentration of 0.1 U/ul rather than 0.2 U/ul was found to create a greater proportion of RCR products that were of larger intensity after detection probe hybridization. The addition of initiating primer at 10 to 100 fold molar ratio relative to estimated target concentration was also found to be optimal. Increased extension times produced more intense fluorescent signals but tended to produce more diffuse concatemers. With the current attachment protocols, a 2 hr extension time produced enhanced signals relative to a 1 hr incubation with minimal detrimental impact upon RCR product morphology.
  • Further optimization of RCR products has been achieved by reducing the estimated concentration of synthetic and genomic targets to 0.1 to 0.25 fmol/ul in the RCR reaction. This typically results in distinct and unique RCR products on the surface of the microscope slide using method 1 for attachment. For synthetic targets in which a higher concentration of targets in the RCR reaction may be present (e.g. >5 fmol/ul), RCR products may be attached by method 2.
  • Attachment method 1. RCR reaction products (4.5 ul) were combined with 0.5 ul of 20×SSPE and added to the center of the slide. The sample was allowed to air dry and non-attached material was washed off for 10 min in 3×SSPE and then briefly in DDI water. The slide was then dried before assembly on the microscope. Attached RCR products were visualized by hybridizing an 11mer TAMRA labeled probe that is complementary to a region of the adapter. Attachment method 2. RCR reaction products (1 ul) were combined with 50 ul of 3×SSPE and added to the center of the cover slip with capture probe attached. Addition of SDS (0.3%) was found to promote specific attachment to the capture probes and not to the derivatized surface. The sample was incubated at room temperature for 30 min and non-attached material was washed off for 10 min in 3×SSPE and then briefly in DDI water. The slide was then dried before assembly on the microscope. Attached RCR products were visualized by hybridizing an 11mer TAMRA labeled probe that is complementary to a region of the adapter. The above protocols provide RCR product densities of about 1 RCR product per 2-4 micron square. An exemplary image of a resulting cover slip is shown in FIG. 3.
  • Example 3: Distinguish RCR Products on Random Arrays Using Fluorescently Labeled Probes
  • PCR products from diagnostic regions of Bacillus anthracis and Yersinia pestis were converted into single stranded DNA and attached to a universal adaptor. These two samples were then mixed and replicated together using RCR and deposited onto a glass surface as a random array. Successive hybridization with amplicon specific probes showed that each spot on the array corresponded uniquely to either one of the two sequences and that they can be identified specifically with the probes, as illustrated in FIG. 4. This result demonstrates sensitivity and specificity of identifying DNA present in submicron sized DNA concatemers having about 100-1000 copies of a DNA fragment generated by the RCR reaction. A 155 bp amplicon sequence from B. anthracis and a 275 bp amplicon sequence from Y. pestis were amplified using standard PCR techniques with PCR primers in which one primer of the pair was phosphorylated. A single stranded form of the PCR products was generated by degradation of the phosphorylated strand using lambda exonuclease. The 5′ end of the remaining strand was then phosphorylated using T4 DNA polynucleotide kinase to allow ligation of the single stranded product to the universal adaptor. The universal adaptor was ligated using T4 DNA ligase to the 5′ end of the target molecule, assisted by a template oligonucleotide complementary to the 5′ end of the targets and 3′ end of the universal adaptor. The adaptor ligated targets were then circularized using bridging oligonucleotides with bases complementary to the adaptor and to the 3′ end of the targets. Linear DNA molecules were removed by treating with exonuclease I. RCR products (DNA concatemers) were generated by mixing the single-stranded samples and using Phi29 polymerase to replicate around the circularized adaptor-target molecules with the bridging oligonucleotides as the initiating primers.
  • To prepare the cover slips for attaching amine-modified oligonucleotides, the cover slips were first cleaned in a potassium/ethanol solution followed by rinsing and drying. They were then treated with a solution of 3-aminopropyldimethylethoxysilane, acetone, and water for 45 minutes and cured in an oven at 100° C. for 1 hour. As a final step, the cover slips were treated with a solution of p-phenylenediisothiocyanate (PDC), pyridine, and dimethylformamide for 2 hours. The capture oligonucleotide (sequence 5′-GGATGTTAGGAAGACAAAAGGAAGCTGAGG-3′) (SEQ ID NO: 14) is complementary to the universal adaptor sequence and is modified at the 5′ end with an amine group and 2 C-18 linkers. For attachment, 10 μl of the capture oligo at 10 μM in 0.1M NaHCO3 was spotted onto the center of the derivatized cover slip, dried for 10 minutes in a 70° C. oven and rinsed with water. To create an array of DNA concatemers, the RCR reaction containing the DNA concatemers was diluted 10-fold with 3×SSPE, 20 μl of which was then deposited over the immobilized capture oligonucleotides on the cover slip surface for 30 minutes in a moisture saturated chamber. The cover slip with the DNA concatemers was then assembled into a reaction chamber and was rinsed with 2 ml of 3×SSPE. Arrayed target concatemer molecules derived from B. anthracis and Y. pestis PCR amplicons were probed sequentially with TAMRA-labeled oligomers: probe BrPrb3 (sequence: 5′-CATTAACGGAC-3′ (SEQ ID NO: 15), specifically complementary to the universal adaptor sequence), probe Ba3 (sequence: 5′-TGAGCGATTCG-3′ (SEQ ID NO: 16), specifically complementary to the Ba3 amplicon sequence), and probe Yp3 (sequence: 5′-GGTGTCATGGA-3′, specifically complementary to the Yp3 amplicon sequence). The probes were hybridized to the array at a concentration of 0.1 μM for 20 min in 3×SSPE at room temperature. Excess probes were washed off with 2 ml of 3×SSPE. Images were taken with the TIRF microscope. The probes were then stripped off with 1 ml of 3×SSPE at 80° C. for 5 minutes to prepare the arrayed target molecules for the next round of hybridization.
  • By overlaying the images obtained from successive hybridization of 3 probes, as shown in FIG. 4, it can be seen that most of the arrayed molecules that hybridized with the adaptor probe would only hybridize to either the amplicon 1 probe (e.g. “A” in FIG. 4) or the amplicon 2 probe (e.g. “B” in FIG. 4), with very few that would hybridize to both. This specific hybridization pattern demonstrates that each spot on the array contains only one type of sequence, either the B. anthracis amplicon or the Y. pestis amplicon.
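  • The image overlay amounts to a per-spot classification from three hybridization calls. The Python sketch below is a hypothetical illustration of that bookkeeping, not the image analysis actually used: the function names, the category labels, and the assumption that each probe signal has already been thresholded to a boolean are all introduced here.

```python
from collections import Counter


def classify_spot(adaptor, ba, yp):
    """Classify one spot from boolean calls for the adaptor, Ba3 and Yp3 probes."""
    if not adaptor:
        return "no concatemer"   # adaptor probe did not hybridize
    if ba and yp:
        return "both"            # expected to be rare on a clean array
    if ba:
        return "B. anthracis"
    if yp:
        return "Y. pestis"
    return "unassigned"          # adaptor only


def tally(spots):
    """Count spots per category; spots is an iterable of (adaptor, ba, yp)."""
    return Counter(classify_spot(*s) for s in spots)
```

  • A low count in the "both" category relative to the two single-amplicon categories corresponds to the observation that each spot contains only one type of sequence.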
  • Example 4: Decoding a Base Position in Arrayed Concatemers Created From a Synthetic 80-Mer Oligonucleotide Containing a Degenerated Base
  • Individual molecules of a synthetic oligonucleotide containing a degenerate base can be divided into 4 sub-populations, each of which has either an A, C, G or T base at that particular position. An array of concatemers created from this synthetic DNA may have about 25% of spots with each of the bases. Successful identification of these sub-populations of concatemers was demonstrated by four successive hybridizations and ligations of pairs of probes, specific to each of the 4 bases, as shown in FIG. 5. A 5′ phosphorylated, 3′ TAMRA-labeled pentamer oligonucleotide was paired with one of the four hexamer oligonucleotides. Each of these 4 ligation probe pairs should hybridize to either an A, C, G or T containing version of the target. Discrimination scores of greater than 3 were obtained for most targets, demonstrating the ability to identify single base differences between the nanoball targets. The discrimination score is the highest spot score divided by the average of the other 3 base-specific signals of the same spot. By adjusting the assay conditions (buffer composition, concentrations of all components, time and temperature of each step in the cycle) higher signal to background and full match to mismatch ratios are expected. This was demonstrated with a similar ligation assay performed on the spotted arrays of 6-mer probes. In this case the full-match/background ratio was about 50 and the average full match/mismatch ratio was 30. The results further demonstrate the ability to determine partial or complete sequences of DNA present in concatemers by increasing the number of consecutive probe cycles or by using 4 or more probes labeled with different dyes per each cycle. Synthetic oligonucleotide (T1A: 5′-GCATANCACGANGTCATNATCGTNCAAACGTCAGTCCANGAATCNAGATCCACTTAGANTAAAAAAAAAAAA-3′) (SEQ ID NO: 13) contains at position 32 a degenerate base. Universal adaptor was ligated to this oligonucleotide and the adaptor-T1A DNA was circularized as described before. DNA concatemers made using the rolling circle replication (RCR) reaction on this target were arrayed onto the random array. Because each spot on this random array corresponded to tandemly replicated copies originating from a single molecule of T1A, the DNA in a particular arrayed spot would contain either an A, or a C, or a G, or a T at positions corresponding to position 32 of T1A. To identify these sub-populations, a set of 4 ligation probes specific to each of the 4 bases was used. A 5′ phosphorylated, 3′ TAMRA-labeled pentamer oligonucleotide corresponding to position 33-37 of T1A with sequence CAAAC (probe T1A9b) was paired with one of the following hexamer oligonucleotides corresponding to position 27-32: ACTGTA (probe T1A9a), ACTGTC (probe T1A10a), ACTGTG (probe T1A11a), ACTGTT (probe T1A12a). Each of these 4 ligation probe pairs should hybridize to either an A, C, G or T containing version of T1A. For each hybridization cycle, the probes were incubated with the array in a ligation/hybridization buffer containing T4 DNA ligase at 20° C. for 5 minutes. Excess probes were washed off at 20° C. and images were taken with a TIRF microscope. Bound probes were stripped to prepare for the next round of hybridization.
  • An adaptor specific probe (BrPrb3) was hybridized to the array to establish the positions of all the spots. The 4 ligation probe pairs, at 0.4 μM, were then hybridized successively to the array with the base identifications as illustrated for four spots in FIG. 5. It is clear that most of the spots are associated with only one of the 4 ligation probe pairs, and thus the nature of the base at position 32 of T1A can be determined specifically.
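  • The discrimination score defined in this example (the highest base-specific signal of a spot divided by the average of the other three) can be computed as in the following Python sketch. The function name, the dictionary input format, and the handling of a zero or negative background are assumptions made for illustration; the signal values in the usage note are invented, not measured data.

```python
def discrimination_score(signals):
    """Return (called_base, score) for one spot.

    signals: dict mapping each of the four bases to the signal measured
    with the corresponding ligation probe pair (hypothetical format).
    score = highest signal / mean of the other three signals.
    """
    if len(signals) != 4:
        raise ValueError("expected exactly four base-specific signals")
    called = max(signals, key=signals.get)
    others = [value for base, value in signals.items() if base != called]
    mean_others = sum(others) / len(others)
    if mean_others <= 0:
        # Assumption: treat a non-positive background as perfect discrimination.
        return called, float("inf")
    return called, signals[called] / mean_others
```

  • For example, a spot with invented signals `{"A": 900, "C": 100, "G": 200, "T": 300}` is called as A with a score of 4.5, which would pass the threshold of 3 mentioned above.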
  • Example 5: Decoding Two Degenerate Bases at the End of a Synthetic 80-Mer Oligonucleotide
  • The same synthetic oligonucleotide described above contains 8 degenerate bases at the 5′ end to simulate random genomic DNA ends. The concatemers created from this oligonucleotide may have these 8 degenerate bases placed directly next to the adaptor sequence. To demonstrate the feasibility of sequencing the two unknown bases adjacent to the known adaptor sequence, a 12-mer oligonucleotide (UKO-12 sequence 5′-ACATTAACGGAC-3′) (SEQ ID NO: 17) with a specific sequence to hybridize to the 3′ end of the adaptor sequence was used as the anchor, and a set of 16 TAMRA-labeled oligonucleotides in the form of BBNNNNNN were used as the sequence-reading probes. For each hybridization cycle, 0.2 uM of UKO-12 anchor probe and 0.4 uM of the BBNNNNNN probe were incubated with the array in a ligation/hybridization buffer containing T4 DNA ligase at 20° C. for 10 minutes. Excess probes were washed off at 20° C. and images were taken with a TIRF microscope. Bound probes were stripped to prepare for the next round of hybridization.
  • Using a subset of the BBNNNNNN probe set (namely GA, GC, GG and GT in the place of BB), spots on the concatemer array created from targets that specifically bind to one of these 4 probes were identified, with an average full match/mismatch ratio of over 20, as shown in FIG. 6.
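  • The set of 16 sequence-reading probes of the form BBNNNNNN, and the four-probe subset used above, can be enumerated with a few lines of Python. This is only an illustrative sketch; the function names are hypothetical, and "N" is used as a text placeholder for a degenerate position rather than modeling the actual probe mixture.

```python
from itertools import product

BASES = "ACGT"


def bbn_probe_set(n_defined=2, n_degenerate=6):
    """List all probes with n_defined specified bases followed by N positions."""
    return ["".join(p) + "N" * n_degenerate
            for p in product(BASES, repeat=n_defined)]


def subset_with_first_base(probes, first):
    """Select the probes whose first defined base equals `first`."""
    return [p for p in probes if p.startswith(first)]


probes = bbn_probe_set()
g_subset = subset_with_first_base(probes, "G")
# g_subset == ["GANNNNNN", "GCNNNNNN", "GGNNNNNN", "GTNNNNNN"]
```

  • Two defined positions give 4 × 4 = 16 probes, so reading both unknown bases with one probe per cycle would take up to 16 hybridization cycles under this scheme.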
  • 1. Comprehensive DNA/RNA Analysis Using Ultra-High Capacity Self-Assembled DNA Nano-Array (saDNA) Chips Produced from Mixtures of Natural or Synthetic DNA Fragments
  • The nucleic acid hybridization process is used widely for characterization of a DNA/RNA sample. Antibodies or other proteins or compounds are used in various binding assays for characterization of protein samples. For efficient, extensive analysis of a sample with many hybridization assays, arrays of gene/genomic fragments or synthetic oligonucleotides are prepared in various ways. For preparing arrays of gene/genome fragments, individual fragments are usually prepared in separate tubes/wells and then deposited on the substrate. This process is too laborious for preparing a large number of samples (e.g. close to or more than one million) and/or does not allow preparation of an array of small, high density spots, especially below 10 micrometer dot size. For preparing high density arrays of about 100,000 or more oligonucleotides, in situ chemical synthesis of DNA is usually performed. We describe here DNA/RNA and their derivatives or peptides or protein and other array products, including processes for their preparation and uses, that are based on applying mixtures of detecting molecules of partially or fully known primary structure or polymer sequence, preferably as concatemers of the same molecule, on substrates with a pattern of high density small binding sites separated by non-binding surface, followed by determining which detecting molecule from the mixture is attached at which binding site.
  • 1.1 saDNA Chip Preparation
  • In one embodiment, the saDNA platform utilizes attached nano-balls of concatenated DNA/RNA as detecting molecules (DMs) for hybridization to a solution phase, labeled DNA or RNA target. Since no specific DMs are attached to specific binding sites on the substrate they must first pass through a full or partial sequencing, re-sequencing or signature identification.
  • 1.1.1. Preparation of DNA Fragments for Probe Generation
  • High density DNA nano-ball probe arrays are prepared from source nucleic acids (NA) that can be derived from
      • 1. A library of gene clones
      • 2. PCR or otherwise derived amplicons
      • 3. Selected fragments of genomic DNA
      • 4. cDNA or mRNA, siRNA or other RNA mixture
      • 5. The entire genomic DNA of one or a mixture of individuals.
  • The source NA may originate from one species or from multiple species.
  • It is preferable to have all of the DNA probe segments of the sample in a similar number of copies to avoid over representation of individual sequences in the array. DNA from multiple individuals of one species may be mixed to get the best representation of every part of the genome. Some important or control DNA probe segments may be intentionally added in higher or lower amount than other fragments. Too many DNA probes having the same, or significantly overlapped DNA, may reduce the sensitivity of detection by competing for target DNA in solution.
  • DNA for probe generation may be fragmented to the preferred length of 30-100 bases, although sizes of about 10-2000 bases in length may also be used, and longer DNA may provide better sensitivity. For example, twenty labeled target DNA fragments of 100 bases in length can hybridize to one 2000 base attached DNA probe template thereby increasing the label density per probe site. The preferred DNA length may be selected by various separation methods including size exclusion matrices or gel electrophoresis.
  • DNA for probe arrays may also be generated from synthetic DNA that has all sequence variants within a given length of eight to twenty bases. The short probes will create a universal chip for DNA sequencing by representing all possible sequences of 8 to 20 bases within the array.
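  • To give a sense of scale for such a universal chip, the number of distinct sequences of a given length is 4 raised to that length. The Python sketch below merely carries out this arithmetic; the function names are hypothetical and nothing here is specific to the array chemistry.

```python
def universal_probe_count(length):
    """Number of distinct DNA sequences of `length` bases (4 choices per base)."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return 4 ** length


def total_for_range(shortest, longest):
    """Total number of distinct sequences over all lengths in the range."""
    return sum(universal_probe_count(n) for n in range(shortest, longest + 1))


COUNT_8 = universal_probe_count(8)     # 65,536 octamers
COUNT_20 = universal_probe_count(20)   # 1,099,511,627,776 20-mers
```

  • The complete set of 8-mers is therefore well within the capacity of an array with a million or more binding sites, whereas complete sets at the longer end of the stated range are far larger.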
  • Elements of the RCR generated universal saDNA probes as shown in FIG. 9 include
      • 1. An adapter sequence (BBBBB) and an N8-20 degenerate detector oligonucleotide sequence (e.g. mixture of all oligonucleotides of given length); mixtures of oligonucleotides of variable length may also be used and none or some of the lengths may not have all possible sequences.
      • 2. A capture sequence of 20-100, more frequently 25-50 bases in length allows for the attachment of the concatenated RCR product targets to the array via hybrid formation to attached oligonucleotides on the surface.
      • 3. Primer binding site
      • 4. A probe binding sequence for QC of attachment efficiency and relative quantitation of copy number in the concatenated RCR product
      • 5. Sequences at the 5-prime and 3-prime ends of the adapter allow for ligation of the N8-20 degenerate oligonucleotide via two bridging oligonucleotides about 12-20 bases in length. Bridging oligonucleotides may have several degenerate bases that bind to the ends of detector oligonucleotides.
  • Selection of 10,000 to 1 million or more specific genomic DNA fragments 20-2000 (preferably 100-1000) bases in length may be performed for preparing sequence-specific DNA nano-ball probe arrays. A large number of specific primers could be synthesized and used individually or in pools for selecting subsets of genomic DNA by primer extension or PCR. Another option is to make a universal library of all 6-mers or 7-mers with and without 5 to 10 degenerate bases at the 5′ end and a universal tail further 5-prime of the degenerate bases, for example BBBBBBB and U20N5-10BBBBBBB (where B represents defined bases in the synthesis, U represents a sequence present in all primers and N represents degenerate bases in the synthesis). These primers can be used directly to amplify selected DNA segments in viral or bacterial genomes, in one to three consecutive amplification steps, and with the possibility of using nested pairs of primers. Ligation of two 6-mers or two 7-mers (or 6-mer+7-mer) may generate a more specific primer that can also be used for genomes of higher complexity, including human. Several pairs of primers could be created in one reaction tube using selected 7-mer templates from a library of all 7-mers. Because there is no need to produce a large quantity of DNA, 14-mers with a universal primer tail may be sufficient. Nested 14-mer primers may also be used to assure amplification of the region of interest.
  • 1.1.2. DNA Nano-Ball Preparation
  • Preparation of concatenated detector molecules (DNA nano-balls) requires the formation of circular DNA molecules. DNA is initially heat denatured and one end is ligated to an adapter. In a second ligation reaction the second end of the probe template is ligated to the free end of the adapter to complete the circle formation. The adapter may include short palindromic sequences (e.g. ATCGATCGAT) to induce intra-molecular hybrid formation between adapter replicas (e.g. -ATCGATCGAT-TAGCTAGCTA-) and compaction of the concatemer.
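  • As a minimal sketch of why the example adapter sequence can pair with other copies of itself, the check below tests whether a sequence equals its own reverse complement. The function names are illustrative and are not part of the described method.

```python
# Illustrative check (hypothetical helper names): a palindromic DNA sequence
# reads the same as its reverse complement, so adapter copies within one
# concatemer can hybridize to each other.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True when the sequence is identical to its reverse complement."""
    return seq == reverse_complement(seq)


if __name__ == "__main__":
    adapter_palindrome = "ATCGATCGAT"  # example sequence from the text
    print(adapter_palindrome.translate(COMPLEMENT))  # complement strand, read 3'->5'
    print(is_palindromic(adapter_palindrome))
```

  • The complement printed by the first statement is the TAGCTAGCTA strand shown in the example above.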
  • Rolling circle replication (RCR) then occurs with a primer that is complementary to a portion of the adapter and phi29 strand displacing polymerase. The concentration of circular DNA in the polymerase reaction may be low (approximately 10-100 billion circles per ml, or 10-100 circles per picoliter) to avoid entanglement, and incorporation of palindromes in the adapter may also minimize intermolecular interactions.
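  • The two concentration figures above are the same quantity in different units. The sketch below performs the conversion and also estimates the mean spacing between circles. The spacing estimate assumes circles are evenly dispersed, which is an illustrative simplification and not a statement from the text.

```python
# Unit conversion for the circle concentrations quoted above.
PICOLITERS_PER_ML = 1e9              # 1 ml = 10^9 pl
CUBIC_MICRONS_PER_PICOLITER = 1e3    # 1 pl = 1000 cubic micrometers


def circles_per_picoliter(circles_per_ml: float) -> float:
    """Convert a concentration in circles/ml to circles/pl."""
    return circles_per_ml / PICOLITERS_PER_ML


def mean_spacing_um(circles_per_ml: float) -> float:
    """Edge length (micrometers) of the cube of solution available per circle.

    Assumes evenly dispersed circles (illustrative simplification).
    """
    volume_per_circle = CUBIC_MICRONS_PER_PICOLITER / circles_per_picoliter(circles_per_ml)
    return volume_per_circle ** (1.0 / 3.0)


if __name__ == "__main__":
    for per_ml in (10e9, 100e9):
        print(per_ml, circles_per_picoliter(per_ml), round(mean_spacing_um(per_ml), 2))
```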
  • The RCR reaction may result in products of varying length, so removal of small nano-balls may be important for good quantification of target in the hybridization assay. Selection against small DNA nano-balls could occur by size exclusion methodologies or by complementary concatenated blocker molecules that will hybridize to all adapter molecules in short molecules. Longer molecules will have excess adapter molecules that can be hybridized to a capture molecule on a solid support, whereas shorter molecules will be blocked from binding to the support.
  • For making a large number of unit arrays a continuous amplification of selected DNA fragments may be performed by cutting concatemers by hybridizing an oligonucleotide to the adapter region that generates a restriction enzyme site. New circles are formed by ligation of the free ends and a second RCR reaction is performed. One-billion fold amplification of the original DNA is possible to achieve in three to four rounds of RCR (1) and would provide enough DNA for making millions of arrays.
  • 1.1.3. Arraying DNA Nano-Balls
  • DNA detector nano balls will be arrayed on a glass or other support with a grid of capture oligonucleotide sites. The capture oligonucleotide may be 20 to 100 bases in length and could be prepared using modified DNA such as LNA and PNA to increase hybrid stability. All attached oligonucleotide sites may have the same capture oligonucleotide and the surface between these sites may be hydrophobic to prevent binding of hydrophilic molecules. The array of capture oligonucleotides may be produced by nano-printing techniques or by creating active sites for oligonucleotide attachment using photochemistry (2). Another, among many DNA nano-ball attachment options, is to create a positively charged spot surface that binds negatively charged DNA. The attached oligonucleotide region size may vary for different applications but could range from about 0.2 microns to 2 microns in diameter. Large oligonucleotide attachment sites may be suitable for longer DNA molecules.
  • Binding of saDNA nano-ball probes may proceed at specific temperatures with or without mixing until about 80%-99% of spots are occupied. Empty sites that do not bind a DNA nano-ball may be used as barcodes for aligning the grid between different CCD camera images of the same array used for decoding and a sample assay. Another option is to stamp or spray the nano-ball solution in the form of a 10 micron solution layer or picoliter or sub-picoliter drops containing about 10 nano-balls. About 10 such drops on the same surface will be sufficient to occupy all 100 binding sites at 1 micrometer pitch on a 10×10 micrometer substrate surface. We have created DNA nano-balls from E. coli genomic DNA at an estimated concentration of less than 150 nano-balls/picoliter of RCR reaction, which assumes maximal efficiency of circle DNA formation and polymerase extension. This approach of attachment will help to minimize the binding and association of two nano-balls with complementary DNA because there is no surface mixing of millions of nano-balls over already attached nano-balls.
  • It may also be desirable to perform additional in situ DNA amplification that requires cutting the attached concatemerized DNA, recircularization (preferably by using a different adapter DNA) and RCR. This could be achieved with two different capture probes present at the oligonucleotide attachment site such that DNA concatenated with both adapters can be captured at the site. Another method for in situ amplification is to use capture oligonucleotides as primers for a strand displacing polymerase. These methods could achieve 10,000 to 100,000 or more copies per attachment site. Since 100,000 copies of a 1 kb DNA molecule that is 500 nm in length will occupy about 10% of the 500 nm×500 nm×500 nm spot volume, there would be ample space to maintain a concatemerized molecule of this size. RCR products may be fragmented after attachment using a complementary oligonucleotide to create a double-stranded DNA cutting site.
  • In one embodiment, DNA fragments can be attached to a preexisting concatemer of oligonucleotides complementary to capture oligonucleotides present in the binding sites. This attachment can be done by hybridization using an end-adapter or by ligation on one end or both ends to form a circle. In this case individual sample DNA molecules will be arrayed without RCR in solution and may be used as such or amplified in situ by using various methods including RCR. Saturation at a spot by in situ amplification, or by cutting the excess not bound to capture oligonucleotide, may be used to get almost identical copy number per spot.
  • It is estimated that a single well of a 384-well plate could accommodate on the order of 10⁶ attached DNA nano-balls and a single well of a 96-well plate 5×10⁶ DNA nano-balls. The analysis of 10⁷ bases of DNA with a 90%-99% overlap of fragments (i.e. one fragment starting every 1-10 bases on average) would require about 10⁶-10⁷ DNA nano-balls. This amount of sequence equates to 100-300 different viruses of 10-30K base genome size, so potentially a 1536-well plate will work for high throughput viral screening with 10-30 viruses represented on each unit-array. Alternatively, 200 to 300 bacterial species will have about 10⁹ bases of DNA, so 10⁷ fragments 100 bases long will cover more than 50% of bases, with occasional rare gaps longer than 1 kb appearing. In this case forming arrays for 10 meaningful groups of 20-30 bacterial species and all different isolates is more than sufficient for screening 10 specific human or other samples (e.g. blood, urine, saliva, skin), each on a specific array. About 10⁸ bases of human coding DNA may be represented by 10 DNA nano-balls having 50-200 base fragments. All long exons and almost every short exon will be represented with at least one fragment and every gene will be represented by about 30-3000 fragments.
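  • The "more than 50% of bases" figure for the bacterial example can be reproduced with a simple random-coverage estimate. The sketch below assumes fragment start positions are uniformly random (a Poisson, Lander-Waterman style approximation), which is an assumption introduced here for illustration and is not stated in the text.

```python
import math


def expected_coverage(n_fragments: int, fragment_len: int, total_bases: int) -> float:
    """Expected fraction of bases covered by at least one fragment.

    Assumes uniformly random fragment starts (Poisson approximation).
    """
    depth = n_fragments * fragment_len / total_bases  # average coverage depth
    return 1.0 - math.exp(-depth)


if __name__ == "__main__":
    # 10^7 fragments of 100 bases over about 10^9 bases of bacterial DNA
    fraction = expected_coverage(10**7, 100, 10**9)
    print(f"expected covered fraction: {fraction:.3f}")
```

  • At an average depth of one, about 63% of bases are expected to be covered under this assumption, consistent with the estimate above.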
  • 1.2. Identification of DNA Nano-Balls
  • Spot DNA identification or sequencing involves characterization with multiple decoding and sequencing probes. The identification process may also provide quantification of DNA in each spot. This information may be used in the interpretation (normalization) of the hybridization results obtained with the test DNA or RNA.
  • 1.2.1. Identification of Long DNA
  • The sequence identity of each attached DNA nano-ball may be determined by a “signature” approach. About 50 to 100 or possibly 200 probes will be used such that about 25-50%, or in some applications 10-30%, of attached nano-balls will have a full match sequence for each probe. This type of data will allow each amplified DNA fragment within the nano-ball to be mapped to the reference sequence. One example of this process would be to score 64 4-mers (i.e. 25% of all possible 256 4-mers) using 16 hybridization/stripoff cycles in a 4-color labeling scheme. On a 60-70 base fragment amplified in the DNA nano-ball about 16 of 64 probes will be positive, since about 64 4-mers are present in a 64 base long sequence (i.e. one quarter of all possible 4-mers). Unrelated 60-70 base fragments will have a very different set of about 16 positive decoding probes. A combination of 16 probes out of 64 probes has a random chance of occurrence in 1 of every one billion fragments, which practically provides a unique signature for that nano-ball. Scoring 80 probes in 20 cycles and generating 20 positive probes would create an even more unique signature: occurrence by chance is 1 in billions. Previously, a “signature” approach was used to select novel genes from cDNA libraries (3). An implementation of a signature approach is to sort obtained intensities of all tested probes and select up to a predefined (expected) number of probes that satisfy the positive probe threshold. These probes will be mapped to sequences of all DNA fragments (a sliding window of a longer reference sequence may be used) expected to be present in the array. The sequence that has all or a statistically sufficient number of the selected positive probes is assigned as the sequence of the DNA fragment in the given nano-ball. In another approach an expected signal can be defined for all used probes using their pre-measured full match and mismatch hybridization/ligation efficiency. In this case a measure similar to the correlation factor can be calculated.
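  • The signature implementation described above (rank intensities, keep the probes that pass the threshold, then find the candidate sequence containing them) can be sketched as follows. The function names, the `min_fraction` cutoff and the toy data are illustrative assumptions, and the sketch ignores the complementary strand for brevity.

```python
def positive_probes(intensities, threshold, max_positive):
    """Select up to max_positive of the brightest probes that pass the threshold."""
    ranked = sorted(intensities.items(), key=lambda item: item[1], reverse=True)
    return {probe for probe, signal in ranked[:max_positive] if signal >= threshold}


def assign_fragment(positives, candidates, min_fraction=0.9):
    """Return the candidate whose sequence contains the largest share of positives.

    min_fraction is a hypothetical stand-in for "a statistically sufficient
    number" of matching probes; None is returned when no candidate reaches it.
    """
    best_name, best_score = None, 0.0
    for name, sequence in candidates.items():
        hits = sum(1 for probe in positives if probe in sequence)
        score = hits / len(positives) if positives else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_fraction else None


if __name__ == "__main__":
    # Toy data: probe -> measured intensity for one nano-ball
    intensities = {"ACGT": 0.90, "TGCA": 0.80, "GGCT": 0.85, "CCCC": 0.10, "AAAA": 0.05}
    candidates = {"fragA": "ACGTTGCAAGGCTA", "fragB": "TTTTCCCCGGGGAAAA"}
    positives = positive_probes(intensities, threshold=0.5, max_positive=16)
    print(sorted(positives), assign_fragment(positives, candidates))
```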
  • A preferred way to score 4-mers is to ligate pairs of probes, for example N(5-7)BBB with BN(7-9), where B is a defined base and N is a degenerate base. For generating signatures on longer DNA nano-ball probes, more unique bases will be used. For example, a 25% positive rate in a fragment 1000 bases in length would be achieved by N(4-6)BBBB and BBN(6-8). Note that longer fragments need the same number of about 60-80 probes (15-20 ligation cycles using 4 colors).
  • In one embodiment all probes of a given length (e.g. 4096 N2-4BBBBBBN2-4) or all ligation pairs may be used to determine the complete sequence of the DNA in the nano-ball. For example, 1024 combinations of N(5-7)BBB and BBN(6-8) may be scored (256 cycles if 4 colors are used) to determine the sequence of DNA fragments of up to about 250 bases, preferably up to about 100 bases.
  • The decoding or sequencing probes with large numbers of Ns may be prepared from multiple syntheses of subsets of sequences at degenerate bases to minimize differences in efficiency. Each subset is added to the mix at a proper concentration. Also, some subsets may have more degenerate positions than others. For example, each of 64 probes from the set N(5-7)BBB may be prepared in 4 different syntheses.
  • Oligonucleotide preparations from the three specific syntheses would be added to the regular synthesis in experimentally determined amounts to increase hybrid generation with target sequences that have, in front of the BBB sequence, an AT rich (e.g. AATAT) or (A or T) and (G or C) alternating sequence (e.g. ACAGT or GAGAC). These sequences are expected to be less efficient in forming a hybrid. All 1024 target sequences can be tested for the efficiency to form a hybrid with N0-3NNNNNBBB probes, and those types that give the weakest binding may be prepared in about 1-10 additional syntheses and added to the basic probe preparation.
  • Decoding by signatures with a smaller number of probes, for a small number of distinct samples: 5-7 positives out of 20 probes (5 cycles using 4 colors) have the capacity to distinguish about 10-100 thousand distinct fragments.
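  • The capacity quoted for the small probe set matches the number of ways to choose the positive probes. The sketch below counts those combinations. It gives an upper bound on distinguishable signatures under the illustrative assumption that every combination is equally usable, and the function name is hypothetical.

```python
from math import comb


def signature_space(n_probes: int, n_positive: int) -> int:
    """Number of distinct sets of n_positive probes drawn from n_probes."""
    return comb(n_probes, n_positive)


if __name__ == "__main__":
    for k in (5, 6, 7):
        print(k, signature_space(20, k))
    # 5 -> 15504, 6 -> 38760, 7 -> 77520
```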
  • 1.2.2. Decoding of 8-20Mer RCR Products
  • In this application arrays are formed as random distributions of unique 8 to 20 base recognition sequences in the form of DNA nano-balls. The probes need to be decoded to determine the sequence of the 8-20 base probe region. At least two options are available to do this, and the following example describes the process for a 12-mer. In the first, one half of the sequence is determined by utilizing the hybridization specificity of short probes and the ligation specificity of fully matched hybrids. Six to ten bases adjacent to the 12-mer are predefined and act as a support for a 6-mer to 10-mer oligonucleotide. This short oligonucleotide will ligate at its 3-prime end to one of 4 labeled 6-mers to 10-mers. These decoding probes consist of a pool of 4 oligonucleotides in which each oligonucleotide consists of 4-9 degenerate bases and 1 defined base. This oligonucleotide will also be labeled with one of four fluorescent labels. Each of the 4 possible bases A, C, G, or T will therefore be represented by a fluorescent dye. For example, these 5 groups of 4 oligonucleotides and one universal oligonucleotide (U8) can be used in the ligation assays to sequence the first 5 bases of 12-mers (B = each of 4 bases associated with a specific dye or tag at the end):
    • UUUUUUUU.BNNNNNNN
    • UUUUUUUU.NBNNNNNN
    • UUUUUUUU.NNBNNNNN
    • UUUUUUUU.NNNBNNNN
    • UUUUUUUU.NNNNBNNN
  • Six or more bases can be sequenced with additional probe pools. To improve discrimination at positions near the center of the 12-mer, the 6-mer oligonucleotide can be positioned further into the 12-mer sequence. This will necessitate the incorporation of degenerate bases into the 3-prime end of the non-labeled oligonucleotide to accommodate the shift. This is an example of decoding probes for positions 6 and 7 in the 12-mer:
    • UUUUUUNN.NNNBNNNN
    • UUUUUUNN.NNNNBNNN
  • In a similar way the 6 bases from the right side of the 12mer can be decoded by using a fixed oligonucleotide and 5-prime labeled probes. In the above described system 6 cycles are required to define 6 bases of one side of the 12mer. With redundant cycle analysis of bases distant to the ligation site this may increase to 7 or 8 cycles. In total then, complete sequencing of the 12mer could be accomplished with 12-16 cycles of ligation.
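  • The probe patterns listed above follow a regular rule, sketched below. The function name and the default lengths of 8 are assumptions taken from the printed examples, where `U` is the universal support, `N` a degenerate base and `B` the interrogated base.

```python
def decoding_probe(position: int, shift: int = 0,
                   support_len: int = 8, probe_len: int = 8) -> str:
    """Build the support.labeled-probe pattern that reads one base of the 12-mer.

    position: 1-based base of the 12-mer to read.
    shift:    number of degenerate bases at the 3-prime end of the unlabeled
              support oligonucleotide (how far the probe is moved into the 12-mer).
    """
    index = position - shift - 1  # 0-based slot of B within the labeled probe
    if not 0 <= index < probe_len:
        raise ValueError("position is outside the labeled probe for this shift")
    support = "U" * (support_len - shift) + "N" * shift
    labeled = "N" * index + "B" + "N" * (probe_len - index - 1)
    return f"{support}.{labeled}"


if __name__ == "__main__":
    for pos in range(1, 6):
        print(decoding_probe(pos))           # first five bases, no shift
    for pos in (6, 7):
        print(decoding_probe(pos, shift=2))  # shifted probes for positions 6 and 7
```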
  • 1.2.3. Partial or Complete Sequencing of Arrayed DNA by Combining Two Distinct Types of Libraries of Detector Probes
  • In this approach one set has probes of the general type N3-8B4-6 (anchors) that are ligated with the first 2 or 3 or 4 probes/probe pools from the set BN6-8, NBN5-7, N2BN4-6, and N3BN3-5. The main requirement is to test in a few cycles a probe from the first set with 2-4 or even more probes from the second set to read a longer continuous sequence, such as 5-6 + 3-4 = 8-10 bases, in just 3-4 cycles. In one example, the process is:
  • 1) Hybridize 1-4 4-mers or more 5-mer anchors to obtain 70-80% 1 or 2 anchors per DNA. One way to discriminate which anchor is positive from the pool is to mix specific probes with distinct hybrid stability (maybe different number of Ns in addition). Anchors may be also tagged to determine which anchor from the pool is hybridized to a spot. Tags, as additional DNA segment, may be used for adjustable displacement as a detection method. For example, EEEEEEEENNNAAAAA and FFFFFFFFNNNCCCCC probes can be after hybridization or hybridization and ligation differentially removed with two corresponding displacers: EEEEEEEENNNNN and FFFFFFFFNNNNNNNN where the second is more efficient. Separate cycles may be used just to determine which anchor is positive. For this purpose anchors labeled or tagged with multiple colors may be ligated to unlabeled N7-N10 supporter oligonucleotides.
  • 2) Hybridize BNNNNNNNN probe with 4 colors corresponding to 4 bases; wash discriminatively (or displace by complement to the tag) to read which of two scored bases is associated with which anchor if two anchors are positive in one DNA. Thus, two 7-10 base sequences can be scored at the same time.
      • In 2-4 cycles extend to a 4-6 base anchor for an additional 2-4 bases.
      • Run 16 different anchors per array (32-64 physical cycles if 4 colors are used) to determine about 16 possible 8-mers (˜100 bases total) per fragment. This is more than enough to map the fragment to the reference: the probability that a 100-mer will have a given set of 10 8-mers is less than 1 in a trillion trillion (10⁻²⁸). By combining data from different anchors scored in parallel on the same fragment in another array, the complete sequence of that fragment, and by extension of entire genomes, may be generated from overlapping 7-10-mers.
  • 1.2.4. Tagging Probes with DNA Tags for Larger Multiplex of Decoding or Sequence Determination Probes
  • Instead of directly labeling probes they can be tagged with different oligonucleotide sequences made of natural bases or new synthetic bases (such as isoG and isoC). Tags can be designed to have very precise binding efficiency with their anti-tags using different oligonucleotide lengths (about 6-24 bases) and/or sequence including GC content. For example 4 different tags may be designed that can be recognized with specific anti-tags in 4 consecutive cycles or in one hybridization cycle followed by a discriminative wash. In the discriminative wash initial signal is reduced to 95-99%, 30-40%, 10-20% and 0-5% for each tag, respectively. In this case by obtaining two images 4 measurements are obtained assuming that probes with different tags will rarely hybridize to the same dot. Another benefit of having many different tags even if they are consecutively decoded (or 2-16 at a time labeled with 2-16 distinct colors) is the ability to use a large number of individually recognizable probes in one assay reaction. This way a 4-64 times longer assay time (that may provide more specific or stronger signal) may be affordable if the probes are decoded in short incubation and removal reactions.
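  • The two-image readout described above can be sketched as a classification of each spot by the fraction of signal retained after the discriminative wash. The tag names, the `min_signal` cutoff and the nearest-band rule are illustrative assumptions, while the retention bands are the ones given in the text.

```python
# Retained-signal bands after the discriminative wash, as fractions of the
# initial signal (values from the text; tag names are hypothetical).
TAG_BANDS = {
    "tag1": (0.95, 0.99),
    "tag2": (0.30, 0.40),
    "tag3": (0.10, 0.20),
    "tag4": (0.00, 0.05),
}


def call_tag(before: float, after: float, min_signal: float = 100.0):
    """Assign a tag from the pre-wash and post-wash intensities of one spot.

    Returns None when the pre-wash signal is too weak to call (assumed cutoff).
    """
    if before < min_signal:
        return None
    retained = after / before

    def distance(band):
        low, high = band
        if low <= retained <= high:
            return 0.0
        return min(abs(retained - low), abs(retained - high))

    # Choose the band closest to the observed retained fraction.
    return min(TAG_BANDS, key=lambda tag: distance(TAG_BANDS[tag]))


if __name__ == "__main__":
    for before, after in ((1000, 970), (1000, 350), (1000, 150), (1000, 20)):
        print(before, after, call_tag(before, after))
```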
  • 1.2.5. System for Decoding saDNA Chip Machine Introduction:
  • A key component of successful array production is having a cost-effective methodology for decoding each array. Decoding arrays during production simplifies assays for the end user. Our decoding methodology includes a fast, automated imaging and assay platform designed specifically to optimize this task. Under the currently described schema, patterned array substrates are produced to match the standard 96 or 384 well plate format. Our production format will be an 8×12 pattern of 6 mm×6 mm arrays at 9 mm pitch or a 16×24 pattern of 3.33 mm×3.33 mm arrays at 4.5 mm pitch, on a single piece of glass, plastic, or other optically compatible material. In one example each 6 mm×6 mm array consists of 36 million 250-500 nm square activated regions at 1 micrometer pitch. Throughout the production process, our arrays will be manipulated in this array of arrays format.
  • The rate limiting step for the production process may be array decoding. While arrays can be printed and hybridized at an astonishing rate through the use of processes derived from the semiconductor industry, they must be decoded at the rate of image acquisition. The decoding process, described in other sections of this document, will require the use of 48-96 or more decoding probes. These probes will be further combined into 12-24 or more pools by encoding them with four fluorophores, each having different emission spectra. Additional tagging may be used as described in the biochemistry of decoding.
  • Using a 20× objective, each 6 mm×6 mm array may require roughly 30 images for full coverage using a 10 megapixel camera. Each 1 micrometer array area will be read by about 8 pixels. Our prior experience suggests that each image could be acquired in 250 milliseconds: 150 ms for exposure and 100 ms to move the stage. Using this fast acquisition it will take ~7.5 seconds to image each array, or 12 minutes to image the complete set of 96 arrays on each substrate. In one embodiment of an imaging system, we will achieve this high image acquisition rate by using four ten-megapixel cameras, each imaging the emission spectra of a different fluorophore. The cameras will be coupled to the microscope through a series of dichroic beam splitters. The autofocus routine, which takes extra time, will run only if an acquired image is out of focus. It will then store the Z axis position information to be used upon return to that section of that array during the next imaging cycle. By mapping the autofocus position for each location on the substrate we will drastically reduce the time required for image acquisition.
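  • The imaging-time figures above follow directly from the per-image timing. The sketch below reproduces them. The function names are illustrative, and the defaults are the example values from the text.

```python
def array_imaging_seconds(images_per_array: int = 30,
                          exposure_ms: float = 150,
                          stage_move_ms: float = 100) -> float:
    """Time to image one array when each image costs exposure plus stage move."""
    return images_per_array * (exposure_ms + stage_move_ms) / 1000


def substrate_imaging_minutes(arrays: int = 96, **timing) -> float:
    """Time to image every array on one substrate, one array after another."""
    return arrays * array_imaging_seconds(**timing) / 60


if __name__ == "__main__":
    print(array_imaging_seconds())       # 7.5 seconds per array
    print(substrate_imaging_minutes())   # 12.0 minutes per 96-array substrate
```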
  • Each array will require about 12-24 cycles to decode. Each cycle consists of a hybridization, wash, array imaging, and strip-off step. These steps, in their respective order, may take for the above example 5, 2, 12, and 5 minutes, for a total of 24 minutes per cycle, or roughly 5-10 hours for each array if the operations were performed linearly. The time to decode each array can be reduced by a factor of two by allowing the system to image constantly. To accomplish this, we will stagger the imaging of two separate substrates on each microscope. While one substrate is being reacted, the other substrate will be imaged.
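  • The cycle timing and the factor-of-two gain from staggering can be checked with the sketch below. It assumes a steady state in which the microscope alternates between two substrates and ignores start-up and hand-over time. The names are illustrative.

```python
# Step durations in minutes, from the example above.
STEP_MINUTES = {"hybridize": 5, "wash": 2, "image": 12, "strip": 5}


def linear_hours(cycles: int) -> float:
    """Decode time when every step of every cycle runs one after another."""
    return cycles * sum(STEP_MINUTES.values()) / 60


def staggered_hours_per_substrate(cycles: int) -> float:
    """Effective decode time per substrate when two substrates share a microscope.

    While one substrate is imaged the other is stripped, hybridized and washed,
    so a pair completes one cycle each every 2 * max(imaging, reacting) minutes.
    """
    imaging = STEP_MINUTES["image"]
    reacting = sum(STEP_MINUTES.values()) - imaging
    pair_period = 2 * max(imaging, reacting)
    return cycles * pair_period / 2 / 60


if __name__ == "__main__":
    for cycles in (12, 24):
        print(cycles, linear_hours(cycles), staggered_hours_per_substrate(cycles))
```

  • With the example durations, imaging (12 minutes) and the three reaction steps (12 minutes) are balanced, which is why constant imaging halves the time per substrate.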
  • The physical makeup of the machine will include a number of additions to the standard microscope. A large area automated plate stage may be added to the microscope. This stage will accommodate the two substrates needed for each decoding assay. Another possibility is to use two smaller substrates that can fit in the standard plate stage. Each substrate will be fitted into a cassette and those cassettes will be fitted onto the stage. The cassette will index the substrate to the stage and provide a method to contain fluids over the assay substrate. Cassettes will have ports to facilitate the addition and removal of large volumes of buffer. They will also provide a means to control the temperature of the substrate through a connection with a temperature control subsystem that can maintain temperature in the range of about 5-95° C. (or more specifically 10-85° C.) and can change temperature during the cycle at about 0.5-2° C. per second. Another key component is the 3 axis robot gantry, which will be equipped with a syringe pump actuated pipetting head. This robotic pipetter will be used to add the probe pools to each cassette. Syringe pumps will be used to pump buffers into and out of each cassette. In another embodiment, the robotic pipetting may be replaced with pump and valve based automation of decoding probe pool delivery. In yet another embodiment all reagents and substrates may be contained on a microfluidic chip.
  • Example Cycle:
    • Set temperature of array to hybridization temperature (usually in the range 5-25° C.)
    • Use robot pipetter to pre mix a small amount of decoding probe with the appropriate amount of hybridization buffer.
    • Pipette mixed reagents into hybridization chamber
    • Hybridize for predetermined time
    • Drain reagents from chamber using pump (syringe or other)
    • Add a buffer to wash off mismatches or non-hybrids
    • Adjust chamber temperature to appropriate wash temperature (about 10-40° C.)
    • Drain chamber
    • Add more wash buffer if needed to improve imaging
  • Image each array, preferably with a mid power (20×) microscope objective optically coupled to a high pixel count, high sensitivity CCD camera or cameras. The plate stage moves the chambers (or perhaps flow-cells with input funnels) over the objective, or the objective-optics assembly moves under the chamber. Certain optical arrangements using dichroic mirrors/beam-splitters can be employed to collect multi-spectral images simultaneously, thus decreasing image acquisition time. Arrays can be imaged in sections or whole, depending on array/image size/pixel density. Sections can be assembled by aligning images using statistically significant empty regions pre-coded onto the substrate (during active site creation). Alternatively, alignment regions can be made using a multi-step nano-printing technique: for example, a grid of activated sites can be printed using a specific capture probe, leaving empty regions in the grid, and a different pattern or capture probe can then be printed in those regions using a separate print head.
  • Drain chamber and replace with probe strip buffer (or use the buffer already loaded) then heat chamber to probe stripoff temperature (60-90° C.). High pH buffer may be used in the strip-off step to reduce stripoff temperature. Wait for the specified time.
  • Remove Buffer
  • Start next cycle with next decoding probe pool in set
  • Specific solutions:
    • Hybridization chamber
    • Currently we use a flow cell for 1″ square, 170 micrometer thick coverslips that have been derivatized and activated to bind nano-balls. The cell encloses the “array” by sandwiching the glass and a gasket between two planes. One plane has an opening of sufficient size to permit imaging, and an indexing pocket for the coverslip. The other plane has an indexing pocket for the gasket, fluid ports, and a temperature control system. One fluid port is connected to a syringe pump which “pulls” or “pushes” fluid from the flow cell; the other port is connected to a funnel-like mixing chamber. The chamber, in turn, is equipped with a liquid level sensor. The solutions are dispensed into the funnel, mixed if needed, then drawn into the flow cell. When the level sensor reads air in the funnel's connection to the flow cell, the pump is reversed a known amount to back the fluid up to the funnel. This prevents air from entering the flow cell. This system has worked well for the coverslip sized samples and may be used in modified form for the larger substrates.
  • The substrate may be sectioned off and divided into strips to accommodate fluid flow/capillary effects caused by sandwiching. The substrate may be made of thicker glass to resist flexing in the chamber, reducing reliance on autofocus. The substrate may be housed in an “open air”/“open face” chamber to promote even flow of the buffers over the substrate by eliminating capillary flow effects.
  • Imaging/Imaging Speed
  • Currently imaging is accomplished with a 100× objective using TIRF or epi illumination and a 1.3 megapixel Hamamatsu ORCA-ER-AG on a Zeiss Axiovert 200. This configuration currently images nano-balls bound randomly to a substrate (non-ordered array). Imaging speed will be improved by decreasing the objective magnification power, using grid patterned arrays and increasing the number of pixels of data collected in each image.
  • We propose using up to four or more cameras, preferably in the 10-16 megapixel range. Larger pixel count cameras, such as the 81 megapixel CCD 595 from Fairchild Imaging, may be used if and when they are cost effective. We may also use multiple band pass filters and dichroic mirrors to collect pixel data across up to four or more emission spectra. To compensate for the lower light collecting power of the decreased magnification objective, we will increase the power of the excitation light source. Currently, the imaging system is idle while the samples are being hybridized/reacted. To increase throughput one or more chambers will be assayed while one or more chambers are being imaged. Because the probing of arrays can be non-sequential, more than one imaging system can be used to collect data from a set of arrays, further decreasing assay time.
  • During the imaging process, the substrate must remain in focus. Some key factors in maintaining focus are the flatness of the substrate, orthogonality of the substrate to the focus plane, and mechanical forces on the substrate that may deform it. Substrate flatness can be well controlled; glass plates which have better than ¼ wave flatness are readily obtained. Uneven mechanical forces on the substrate can be minimized through proper design of the hybridization chamber. Orthogonality to the focus plane can be achieved by a well adjusted, high precision stage. Even when all these issues are addressed, it is likely that some auto focus methodology will have to be used during substrate imaging. Auto focus routines generally take additional time to run, so it is desirable to run them only if necessary. After each image is acquired, it will be analyzed using a fast algorithm to determine if the image is in focus. If the image is out of focus, the auto focus routine will run. It will then store the objective's Z position information to be used upon return to that section of that array during the next imaging cycle. By mapping the objective's Z position at various locations on the substrate, we will reduce the time required for substrate image acquisition.
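  • The run-autofocus-only-when-needed logic above can be sketched as a small cache of Z positions. The class and the three callbacks (`acquire`, `is_in_focus`, `autofocus`) are hypothetical placeholders for hardware and image-analysis routines, and re-acquiring the image after autofocus is an assumption of this sketch.

```python
from typing import Any, Callable, Dict, Tuple

Position = Tuple[int, int]  # (array index, section index) on the substrate


class FocusMap:
    """Remember the objective's Z position per location across imaging cycles."""

    def __init__(self,
                 acquire: Callable[[Position, float], Any],
                 is_in_focus: Callable[[Any], bool],
                 autofocus: Callable[[Position], float],
                 default_z: float = 0.0) -> None:
        self.acquire = acquire
        self.is_in_focus = is_in_focus
        self.autofocus = autofocus
        self.default_z = default_z
        self.z_by_position: Dict[Position, float] = {}
        self.autofocus_runs = 0

    def capture(self, position: Position) -> Any:
        """Image one location, running the slow autofocus only when needed."""
        z = self.z_by_position.get(position, self.default_z)
        image = self.acquire(position, z)
        if not self.is_in_focus(image):       # fast focus check on the image
            z = self.autofocus(position)      # slow routine, run only on failure
            self.autofocus_runs += 1
            image = self.acquire(position, z)
        self.z_by_position[position] = z      # reuse on the next imaging cycle
        return image
```

  • On the first pass most locations may trigger the autofocus routine, while later cycles reuse the stored Z positions and typically skip it.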
  • Illumination:
  • The current system uses a Zeiss TIRF slider coupled to an 80 milliwatt 532 nm solid state laser. The slider illuminates the substrate through the objective at the correct TIRF illumination angle. TIRF can also be accomplished without the use of the objective by illuminating the substrate through a prism optically coupled to the substrate. Planar wave guides can also be used to implement TIRF on the substrate. Epi illumination can also be employed. The light source can be rastered, spread beam, coherent, incoherent, and originate from a single or multi-spectrum source.
  • Our current microscope can do standard epi illumination on the entire plate substrate. Our current system successfully detects hybridization on DNA nano-balls with both TIRF and epi fluorescence.
  • A preferred embodiment for the imaging system will contain a 20× lens with a 1.25 mm field of view, with detection being accomplished with a 10 megapixel camera. Such a system would image approx 1.5 million nano-balls attached to the patterned array at 1 micron pitch. Under this configuration there are approximately 6.4 pixels per nano-ball. The number of pixels per nano-ball can be adjusted by increasing or decreasing the field of view of the objective. For example a 1 mm field of view would yield a value of 10 pixels per nano-ball and a 2 mm field of view would yield a value of 2.5 pixels per nanoball. The field of view will be adjusted relative to the magnification and NA of the objective to yield the lowest pixel count per nano-ball that is still capable of being resolved by the optics, and image analysis software.
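  • The pixel counts per nano-ball quoted above can be reproduced as follows. The sketch assumes a square field of view mapped entirely onto the sensor, and the function name is illustrative.

```python
def pixels_per_nanoball(fov_mm: float,
                        camera_pixels: int = 10_000_000,
                        pitch_um: float = 1.0) -> float:
    """Camera pixels available per array site for a square field of view."""
    sites_per_side = fov_mm * 1000 / pitch_um
    return camera_pixels / sites_per_side ** 2


if __name__ == "__main__":
    for fov in (1.0, 1.25, 2.0):
        print(fov, pixels_per_nanoball(fov))
    # 1.0 mm -> 10.0, 1.25 mm -> 6.4, 2.0 mm -> 2.5
```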
  • Robot Gantry:
  • Our current 3 axis robotic gantry pipetting system can be scaled up to serve more than one microscope. Currently the system has one pipette head. If the number of chambers becomes too great for a single pipetter to service efficiently, multiple pipetting channels can be added to the pipetter head, each channel individually accessible via a simple linear extension system, increasing robot efficiency by increasing the service potential for each robot move. It may be more efficient or cost effective to implement a non-gantry style robot, such as a SCARA style robot, to perform certain operations.
  • Plate Stage:
  • A larger than standard plate stage may be needed to image more than one plate sized substrate per microscope. The plate stage should be designed for rigidity, positional accuracy, and repeatability.
  • 1.3. Preparation of Sample Targets for saDNA Probe Arrays
  • The DNA nano-ball probe arrays can be used for sequence (for example genes, exons, promoters, diagnostic sites, SNPs, mutations) identification in amplified or possibly non-amplified target samples. For the detection and characterization of viral and bacterial DNA collected from clinical or pre-symptomatic isolates there may be the requirement to minimize contaminating human genomic DNA. The reduction of human DNA contamination may be achieved by using affinity columns or beads directed to Alu or LINE repeats in the human genome. Sample DNA of 1-10 kb length could be hybridized to these affinity columns and the un-bound fraction collected and fragmented to the final preferred length before amplification or direct hybridization to the nano-ball probe arrays. It may be important to quantify the amount of isolated DNA.
  • Under conditions in which the DNA sample is relatively pure, “whole genome” methods of amplification could be employed. One approach could be to form single stranded DNA circles (50-500 bases in length) using a 20-100 base adapter and amplify by RCR 100-1000 fold in a linear amplification from the original copy. Concatemers can then be fragmented by a restriction enzyme after hybridizing a complementary oligonucleotide to the adapter such that a double stranded cutting site is formed.
  • It may be beneficial to randomly fragment the target DNA to about 50-200 bases using DNase at a pre-tested enzyme dilution for a specific incubation time, depending upon the amount of DNA in the sample. Fragmentation has the benefit of improving hybridization kinetics and decreasing negative repellent forces of the molecules. It is also a more efficient use of sample DNA and less likely to build chaining of two DNA fragments from solution that may cause false signals. It may also be beneficial to develop an internal control target that reports the degree of fragmentation, for example through the separation of quenching dyes.
  • Each target DNA that is prepared by RCR will be a single stranded concatemer of sample target and adapter. The adapter sequence portion of the RCR concatemers allows for the hybridization of labeled dendrimers in which a single arm of the dendrimer is complementary to a portion of the adapter used to form DNA circles. Non-amplified DNA may be labeled by poly-C or poly-A tailing using terminal transferase and then hybridized to dendrimers with a single complementary arm. An alternative may be to ligate on each end of single stranded DNA a complement to a different dendrimer arm. Longer DNA with multiple attached dendrimers may be ligated to each end or other standard labeling procedures may be used. The excess of label may be hybridized or ligated to a biotinylated oligonucleotide and removed with streptavidin-coated beads.
  • A target DNA sample may also be prepared by utilizing the detection DNA nano-ball array itself for sample isolation. In this procedure the target DNA would be collected in a small volume, then fragmented and denatured. The target DNA is then hybridized to an array of nano-ball sequences complementary to those desired from the sample and any excess un-hybridized DNA would then be washed away. Captured target DNA could be amplified on the surface of the array by covalently ligating fragmented concatemers with the capture oligonucleotide as a bridging support, followed by RCR. Alternatively, an adapter with a tag or label can be ligated to the hybridized DNA and detected.
  • RNA may also serve as the target with or without conversion to DNA. Sample DNA or RNA amplification may not be required due to: 1) extensive miniaturization, low reaction volume and effective reaction mixing to allow DNA or RNA fragments to find complementary nano-ball probes; 2) longer DNA detector molecules in the array, which enable efficient and specific hybridization in complex mixtures and also allow the use of bulky signal amplification molecules; 3) the use of a multiplicity of different DNA fragments for each DNA region of each sample, which reduces experimental noise (it also allows finding of specific DNA fragments for detecting a given gene or genomic region with no cross talk to other DNA in the sample); 4) signal amplification, for example with the application of dendrimers or concatenated detector DNA as labeling methods for the target.
  • 1.3.1. Hybridization and Data Analysis
  • The longer lengths of the DNA nano-ball probes allow for stringent washing conditions which can improve specificity of the probes and targets. Temperature gradients and obtaining several measurements per spot in the wash steps could also be employed to increase specificity, and this may be important for shorter probes to detect single or a few base changes. Detection of hybrid formation without washing excess target from the reaction chamber (homogeneous assay) is also an option by focusing the CCD on the surface. This could be especially applicable if a longer hybridization is performed to deplete labeled DNA from solution.
  • One or more images of an array will be generated preferably using a CCD camera. Raw signals will be determined by image analysis and assigned to each spot and associated with provided information about identity of detector molecule at each spot. Empty or other control dots may be used to assure proper assignments of spot signals to detector molecules.
  • 1.4. Applications
  • Some applications of the long DNA nano-ball probes include:
      • 1. Detection of gene duplications and deletions
      • 2. Detection of horizontal transfer of DNA (tumor samples, individual variation, similar species)
      • 3. Gene expression analysis
      • 4. Alternative splicing characterization
      • 5. Pathogen detection and quantification in which there is no need for sequence specific primers.
      • 6. Enrichment columns or elimination columns for example Alu binding columns for human DNA or split on two samples label one by biotin, then rehybridize them and collect. More abundant sequences will bind rapidly to the biotinylated strands and will be removed. Selection of strain specific genes could occur by removing common sequences by hybridizing to the one or more initially selected strains.
      • 7. Environmental screening/quantification for microbes
      • 8. Protein binding assays
      • 9. A SNP-detection chip in which 20-mers with mismatches are selected between multiple pairs of individual genomes. This could be performed in a 96-well plate with 10 million SNPs assayed. Genome specific tiling chips could be created through the use of random mutagenesis. Low accuracy polymerase enzymes could be used to incorporate one mutation every 10-20 bases, and arrays of 10-25-mers can be prepared with the total sequence covered by the nano-ball probes of 40 to 400 times the length of the genome.
      • 10. DNA assembly by ligation or multiple site specific mutagenesis
      • 11. Programmable nano-wiring support
  • In this application the array with different identified DNA per spot is used to create programmable connections and nano-wires between neighboring spots by providing a bridging oligonucleotide . . . PPPPPPPPPPPPPPPPPPPP SSSSSSSSSSSSSSS . . . PPPPPPPPPPPBBBBB . . . BBBBSSS SSSSSSS.
  • By designing different lengths for Ps and Ss in the above example, controllable switches can be generated using temperature as the trigger. The connector may be designed to stay in one of two connected spots for reconnecting.
  • Advantages of nano-ball probe arrays vs. in-situ prepared probe arrays include:
      • Longer probe lengths (50-5000 bases) allow for increased specificity and sensitivity
      • Ultra-high density of probe nano-balls allows for higher sensitivity and lower assay cost
      • Low production cost compared with existing array technologies
      • Very high probe accuracy even for long probes
      • 10-100× higher density of probes per surface than existing array technologies
      • Three-dimensional nature of the nano-ball improves hybridization access
      • Full flexibility in changing and upgrading content similar to mask-less in-situ synthesis
  • The use of non-identical arrays may result in some targets not being represented in the array, and many spots with the same DNA may be needed. This loses the advantage of high densities, but the accuracy of each measurement increases. Also, there may be a need for specific priming to make arrays representing only selected DNA regions of a genome.
  • We have described two platforms for sequence quantification. In the rSBH platform target (e.g. test sample) nano-balls are arrayed on the surface and can be quantitated by counting the occurrence of specific nano-balls. In the saDNA platform nano-ball probes attached to the surface are used to quantitate solution phase labeled target through relative intensity levels of the label at the nano-ball. The advantages of saDNA based quantitation versus rSBH based quantitation include: 1) A duplicated gene will produce on average a two-fold stronger signal on 10-100 representative DNA nano-ball probes. In contrast, for rSBH we would need to count a sufficient number of nano-balls to determine whether there is truly overrepresentation of one sequence over other sequences.
  • A further advantage of the saDNA platform over array of DNA from the test sample is that only 10-100 million nano-balls need be scored instead of 1-10 billion for quantitative representation of all informative fragments. One limitation of saDNA however is that it may be difficult to identify low frequency targets such as gene duplications or deletions in one of every 10-10,000 tumor cells.
  • Example: duplications and deletions in tumor samples: 100 million (maybe only 30 million after removing repeats in one 96-well)×100-1000 bases: one every 30 bases on average; can detect 1000 base deletions;
  • for full sequencing (cover every part with sufficient redundancy) 10× more DNA and 10× more probes.
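  • The "one every 30 bases" figure above can be reproduced with a short calculation. This is a rough sketch, not part of the described method: it assumes a genome of about 3 billion bases (a value not stated in the example itself) and probes whose start positions are spread evenly over the genome. The names are hypothetical.

```python
HUMAN_GENOME_BASES = 3_000_000_000  # assumption: approximate human genome size


def mean_probe_spacing(n_probes: int,
                       genome_bases: int = HUMAN_GENOME_BASES) -> float:
    """Average distance in bases between consecutive probe start sites."""
    return genome_bases / n_probes


def probes_starting_in_region(region_bases: int,
                              n_probes: int,
                              genome_bases: int = HUMAN_GENOME_BASES) -> float:
    """Expected number of probe start sites that fall inside a region."""
    return region_bases / mean_probe_spacing(n_probes, genome_bases)


if __name__ == "__main__":
    for n in (100_000_000, 30_000_000):
        print(n, mean_probe_spacing(n), probes_starting_in_region(1000, n))
```

  • With 100 million probes the spacing is 30 bases, so a 1000 base deletion spans roughly 33 probe start sites; with 30 million probes the spacing is 100 bases and the same deletion spans about 10.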
  • 1.4.1. Self-Assembled Arrays of Peptides, Proteins or Other Polymers
  • RCR products of synthetic or natural DNA fragments of about 30-3000 bases initiated with a primer that has an RNA polymerase promoter extension are used to produce long RNA and in vitro translated protein with multiple copies of the same peptide with an adapter (used for forming DNA circles) coded spacer peptide. The resulting protein with 100 to 10000 amino acids may be folded, possibly initiated by the spacer protein, to form several to hundreds of almost independently folded unit peptides. Each peptide may form several domains for binding different molecules like antibodies, oligo peptides, single or double-stranded oligonucleotides or other chemical compounds that can be used to identify a given peptide.
  • These protein balls may be attached to binding sites of a support having a peptide or other molecule that binds to the spacer peptide, or by using other general protein binding chemistry. The small size of the active binding sites, surrounded by non-binding support, allows only one (the first to bind) protein nano-ball to attach, either by binding saturation of all available binding molecules in the binding site or by physical prevention of other protein nano-balls from interacting with the same binding site. To minimize double or multiple occupancy, protein nano-balls smaller than a given size may be removed by size separation or saturation of spacer protein.
  • 2. EXAMPLES OF SPECIFIC PRODUCTS AND PROCESSES AND PROCEDURES FOR MAKING THEM
  • Preparation of DNA detection and quantification arrays comprising:
      • providing mixture of DNA fragments 10, 20, 50, 100 or more bases and shorter than 25, or 50, or 100, or 500, or 1000, or 2000 or 5000 or 10,000 bases from a source DNA
      • form DNA arrays by attaching concatemers of the same fragment or by in-situ amplification of a single DNA molecule
      • identify the DNA in each spot by hybridization signature or partial or complete sequence determination.
      • Dependent claim: RCR based formation of DNA concatemers with or without sequence complementary to the support bound capture oligonucleotide
      • Dependent claim: Utilize a support with a grid of regions with DNA capture chemistry separated by surface without DNA capture chemistry, each region being 0.1-10 micrometer with center to center distance of about 0.2 to 20 um.
      • Dependent claim: Source DNA is all sequence variants of given length 8 to 20 base
      • Dependent claim: Identity of nano-ball sequence by ligation of two adapter dependent or adapter independent oligonucleotides, and use individual probes or pools of probes with 0 to about 8 informative bases.
      • Highly multiplexed DNA detection and quantification method consisting of
      • providing a DNA array containing >100K, >1 M, >10M DNA spots identified by hybridization signature or partial or complete sequence
      • Hybridizing target sample comprising labeled or tagged (or target able to be labeled or tagged) DNA fragments under conditions allowing the formation of complementary DNA hybrids
      • Detecting bound labels/tags or bound DNA in array spots
      • Analyzing data to detect and quantify DNA molecules in the sample substantially complementary to one or more DNAs on the array
  • Dependent claim: DNA arrays prepared using RCR based formation of DNA concatemers with or without sequence complementary to the support bound capture oligo
  • Dependent claim: Add a washing step before the detecting step to remove non-hybridized DNA
  • Dependent claim: Add a stringent washing step before the detecting step to remove non-hybridized DNA and DNA hybridized to targets with a larger number of mismatches;
  • Dependent claim: performing multiple detection steps during the increased stringency (for example higher temperature, or higher pH) washes
  • Dependent claims: determining gene expression and or alternative splicing; gene deletion or duplication; pathogen detection, quantification and characterization, SNP detection; mutation discovery, microbe detection and quantification in natural sources; DNA sequencing, industrial use in agriculture, food pathogens, medical diagnostics, cancer samples;
      • Labeling or tagging of sample molecules is done after binding them to the detector molecules in the array.
      • A support with DNA/RNA with natural or analog bases spots in a grid or random spot array with informative single stranded DNA longer than 15, or 25, or 50, or 75 or 100 or 125, or 150, or 200, or 250, or 300, or 400, or 500, or 750, or 1000 bases and more than 10,000 or 100,000 or 1 million spots per mm2 containing multiple copies of the same DNA per spot, wherein more than 1000 or 10,000 or 100,000 different DNA is present in the array and which DNA is at which spot is determined after DNA attachment.
      • Dependent claim: more than 50, 60, 70, 80, 90 or 95% of spots in the grid have single informative DNA species excluding errors produced by amplification.
      • Dependent claim: a plate with 2, 4, 6, 8, 10, 12, 16, 24, 32, 48, 64, 96, 192, 384 or more such DNA arrays, where in most cases the same DNA is in different spots in the individual arrays.
      • Dependent claim: array containing DNA fragments from multiple (2-2000, 10-2000, 20-2000, 50-2000, 100-2000, 100-10,000, 500-10,000) species.
      • Dependent claims: array containing DNA fragments that have SNP or other differences between individuals or species.
      • Dependent claim for all above product claims: DNA copies per spot produced by RCR before attachment
      • Dependent claim for all above product claims: DNA isolated from natural sources.
      • Identity or sequence of DNA/RNA or other detector molecule in usable spots is inferred by matching hybridization or other binding signature or partial or complete polymer sequence to a reference database of signatures or sequences.
  • A support with protein, peptide or other polymer detector molecule spots in a grid or random spot array with informative peptide or other polymer longer than 15, or 25, or 50, or 75 or 100 or 125, or 150, or 200, or 250, or 300, or 400, or 500, or 750, or 1000 and more amino acids or other monomers, and more than 10,000 or 100,000 or 1 million spots per mm2 containing multiple copies of the same peptide or other polymer per spot, wherein more than 1000 or 10,000 or 100,000 different peptides or other polymers are present in the array and which peptide or other polymer is at which spot is determined after attachment of the peptides or other polymers to the support.
      • Identification of which peptide or other polymer is present in a spot by generating a binding signature using antibodies, oligo peptides, oligonucleotides, or sets of compounds.
      • Binding signatures developed by experimental testing of known peptides or other polymers in tubes, wells or spotted arrays with a predefined spot for each tested peptide or other polymer.
      • Expected binding signatures developed by computing binding properties of each expected peptide (or other polymer) with each binder molecule.
  • 2.1. Examples of DNA Nano-Ball Array Preparation and DNA Identification
  • 2.1.1. DNA Targets on the Random Array Derived from Different Sequences can be Specifically Identified by Sequence-Specific Probes
  • PCR products from diagnostic regions of Bacillus anthracis and Yersinia pestis were converted into single stranded DNA and were attached with a universal adaptor. These 2 samples were then mixed and replicated together using the rolling circle replication method and deposited onto the random array. Successive hybridization with amplicon specific probes showed that each spot on the array corresponded uniquely to either one of the 2 amplicon sequences and they can be identified specifically with the probes. This result demonstrates sensitivity and specificity of identifying DNA present in submicron size spots created by attaching DNA nano-balls having about 100-1000 copies of a DNA fragment generated by RCR reaction.
  • Amplicons Ba3 (a 155 bp amplicon sequence 5′-TCCCAATACATATGAGCGATTCGCCTTTAT AAACGACGTATTCCTTTGAACTCGTTATGACACTCATTACTCAACTCCCCTTTTCTACTAAAATAGCGTTTTTGTTT GGTTTTTTTCTTCACATAATC CGTCCTATTTGATTTTTACATACCACC-3′ from B. anthracis) and Yp3 (a 275 bp amplicon sequence 5′-TGTAGCCGCTAAGCACTACCATCCCCTCAAGGTTATTGACGGTATCGAGTAG GGTTAGGTGGGCATCATTGTCCATTTCATGGCGGTAATATCGGGATGAGATAACGCGGGTGTCATGGACGTATGG CGGGTCAACAAAATGAAGCGTTGAAACTGTGTCATGGTCTAACATGCATTGGACGGCATCACGATTCTCTACCAAA ACGCCCTCGAATCGCTGGCCAACTGCTGCCAAGTTTTCAGGCATCCTTGCCCAAAGGTGTTGAGCTGTTGCC-3′ from Y. pestis) were amplified using standard PCR techniques with PCR primers Ba3F (5′-TCCCAATACATATGAGCGATTCGCC-3′) and Ba3R (5′-GGTGGTATGTAAAAATCAAATAGGA-3′) for Ba3, Yp3F (5′-TGTAGCCGCTAAGCACTACCATCC-3′) and Yp3R (5′-GGCAACAGCTCAACACCTTTGG-3′) for Yp3. Ba3R and Yp3R were phosphorylated, therefore the complementary strands of the PCR products were phosphorylated at the 5′ end. Single stranded form of the PCR products was generated by degradation of the phosphorylated strand using lambda exonuclease (Epicenter). The 5′ end of the remaining strand was phosphorylated using T4 DNA polynucleotide kinase (Epicenter) to allow ligation to the universal adaptor. The universal adaptor (sequence 5′-TATCATCTACTGCACTGACCGGATGTTAGGAAGAC AAAAGGAAGCTGAGGGTCACATTAACG GAC-3′) was ligated using T4 DNA ligase (Epicenter) to the 5′ end of the target molecule assisted by a template oligonucleotide (Ba3-5 end 5′-ATTGGGAGTCCGTTAATGTGAC-3′ for amplicon Ba3, Yp3-5 end 5′-GGCTACAGTCCGTTAATGTGAC-3′ for amplicon Yp3) specifically complementary to the 5′ end of the targets and 3′ end of the universal adaptor. The adaptor ligated targets were then circularized using bridging oligonucleotides (Ba3-3 end 5′-AGATGATAGGTGGTAT-3′ for amplicon Ba3, Yp3-3 end 5′-AGATGATAGGCAACAG-3′ for amplicon Yp3) with bases complementary to the adaptor and to the 3′ end of the targets.
Linear DNA molecules were removed by treating with exonuclease I (New England Biolabs) at 37° C. for 4 hours under standard reaction conditions. Rolling circle replication (RCR) products (DNA nano-balls) were generated by mixing the single-stranded samples of Ba3 and Yp3 together, and using Phi29 polymerase (New England Biolabs) to replicate around the circularized adaptor-target molecules with the bridging oligos as the initiating primers. Specifically, 0.1 to 0.5 pmol of the circularized DNA was incubated with 5 units of Phi29 DNA polymerase, 2 pmol of the bridging oligos, 0.4 mM dNTP, 0.2 mg/ml BSA and 1× standard Phi29 DNA polymerase buffer at 30° C. for 2 hours, followed by 1 hour incubation at 55° C. with 1 ug/ul proteinase K. The RCR products were captured on the glass slide via the capture oligo (sequence 5′-GGATGTTAGGAAGACAAAAGGAAGCTGAGG-3′) attached to derivatized glass coverslips (Corning) that is complementary to the universal adaptor sequence.
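  • The template and bridging oligonucleotides listed above are each the reverse complement of a ligation junction: the 3′ end of one strand followed by the 5′ end of the next. The sketch below illustrates that relationship using only the end fragments of the quoted sequences. The helper names are hypothetical, and the fragment lengths (15 + 7 bases for the template oligos, 8 + 8 bases for the bridging oligos) are read off the quoted oligos rather than stated as a design rule in the text.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def junction_oligo(upstream_3p_end: str, downstream_5p_end: str) -> str:
    """Oligo (5' to 3') that anneals across the junction upstream-3' | 5'-downstream."""
    return reverse_complement(upstream_3p_end + downstream_5p_end)


# End fragments copied from the sequences quoted in the text (spaces removed).
ADAPTOR_5P = "TATCATCT"          # first 8 bases of the universal adaptor
ADAPTOR_3P = "GTCACATTAACGGAC"   # last 15 bases of the universal adaptor
BA3_5P, BA3_3P = "TCCCAAT", "ATACCACC"   # ends of amplicon Ba3
YP3_5P, YP3_3P = "TGTAGCC", "CTGTTGCC"   # ends of amplicon Yp3

if __name__ == "__main__":
    # Adaptor 3' end joined to target 5' end (adaptor ligation step).
    print(junction_oligo(ADAPTOR_3P, BA3_5P))
    print(junction_oligo(ADAPTOR_3P, YP3_5P))
    # Target 3' end joined to adaptor 5' end (circularization step).
    print(junction_oligo(BA3_3P, ADAPTOR_5P))
    print(junction_oligo(YP3_3P, ADAPTOR_5P))
```

  • The four printed oligos match the Ba3-5 end, Yp3-5 end, Ba3-3 end and Yp3-3 end sequences given in the text, which is a quick consistency check on the quoted adaptor and amplicon ends.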
  • To prepare the coverslips for attaching amine-modified oligonucleotides, the coverslips were first cleaned in a potassium hydroxide/ethanol solution followed by rinsing and drying. They were then treated with a solution of 3-aminopropyldimethylethoxysilane, acetone, and water for 45 minutes and cured in an oven at 100° C. for 1 hour. As a final step, the coverslips were treated with a solution of p-phenylenediisothiocyanate (PDC), pyridine, and dimethylformamide for 2 hours. The capture oligo is modified at the 5′ end with an amine group and 2 C-18 linkers. For attachment, 10 µl of the capture oligo at 10 µM in 0.1 M NaHCO3 was spotted onto the center of the derivatized coverslip, dried for 10 minutes in a 70° C. oven and rinsed with water. To create an array of DNA nano-balls, the RCR reaction containing the DNA nano-balls was diluted 10-fold with 3×SSPE, 20 µl of which was then deposited over the immobilized capture oligos on the coverslip surface for 30 minutes in a moisture saturated chamber. The coverslip with the DNA nano-balls was then assembled into the reaction chamber of the rSBH instrument and was rinsed with 2 ml of 3×SSPE.
  • The arrayed target molecules were probed sequentially with TAMRA-labeled oligomers: probe BrPrb3 (sequence: 5′-CATTAACGGAC-3′, specifically complementary to the universal adaptor sequence), probe Ba3 (sequence: 5′-TGAGCGATTCG-3′, specifically complementary to the Ba3 amplicon sequence), probe Yp3 (sequence: 5′-GGTGTCATGGA-3′, specifically complementary to the Yp3 amplicon sequence). The probes were hybridized to the array at a concentration of 0.1 µM for 20 min in 3×SSPE at room temperature. Excess probes were washed off with 2 ml of 3×SSPE. Images were taken with the TIRF microscope. The probes were then stripped off with 1 ml of 3×SSPE at 80° C. for 5 minutes to prepare the arrayed target molecules for the next round of hybridization.
  • Overlaying the images obtained from successive hybridization of these 3 probes (FIG. 10) shows that most of the arrayed molecules that hybridized with the adaptor probe (blue spots) hybridized only to either the Ba3 probe (red spots) or the Yp3 probe (green spots), with very few hybridizing to both. This specific hybridization pattern demonstrates that each spot on the array contains only one type of sequence, either the Ba3 amplicon or the Yp3 amplicon. It also demonstrates that the rSBH process is able to distinguish target molecules of different sequences deposited onto the array by using sequence specific probes.
  • 2.1.2. Decoding a Base Position in Arrayed DNA Nano-Balls Created from a Synthetic 80-Mer Oligo with a Degenerate Base
  • Individual molecules of a synthetic oligo containing a degenerate base can be divided into 4 sub-populations, each will have either an A, C, G or T base at that particular position. An array of DNA nano-balls created from this synthetic DNA will have about 25% of spots with each of the bases. We demonstrated successful identification of these sub-populations of DNA nano-balls by four successive hybridization and ligation of pairs of probes specific to each of the 4 bases. The results demonstrate ability to determine partial or complete sequence of DNA present in DNA nano-balls by increasing number of consecutive probe cycles or by using 4 or more probes labeled with different dyes per each cycle.
  • A synthetic oligo (T1A:
    5′-NNNNNNNNGCATANCACGANGTCATNATCGTNCAAACGTCAGTCCA
    NGAATCNAGATCCAC TTAGANT-3′)

    contains a degenerate base at position 32. The universal adaptor was ligated to this oligo and the adaptor-T1A DNA was circularized as described before. DNA nano-balls made using the rolling circle replication (RCR) reaction on this target were arrayed onto the random array. Because each spot on this random array corresponds to tandemly replicated copies originating from a single molecule of T1A, the DNA in a particular arrayed spot contains either an A, a C, a G, or a T at positions corresponding to position 32 of T1A. To identify these sub-populations, a set of 4 ligation probes specific to each of the 4 bases was used. A 5′ phosphorylated, 3′ TAMRA-labeled pentamer oligo corresponding to position 33-37 of T1A with sequence CAAAC (probe T1A9b) was paired with one of the following hexamer oligos corresponding to position 27-32: ACTGTA (probe T1A9a), ACTGTC (probe T1A10a), ACTGTG (probe T1A11a), ACTGTT (probe T1A12a). Each of these 4 ligation probe pairs should hybridize to either an A-, C-, G- or T-containing version of T1A.
  • For each hybridization cycle, the probes were incubated with the array in ligation/hybridization buffer (50 mM Tris-Cl, pH 7.8, 10% PEG, 1 mM ATP, 50 mg/L BSA, 10 mM MgCl2, 0.05 unit/µl T4 DNA ligase (Epicenter) and 10 mM DTT) at 20° C. for 5 minutes. Excess probes were washed off with 2 ml of wash buffer (50 mM Tris-Cl, pH 7.5, 10 mM MgCl2) at 20° C. and images were taken with the TIRF microscope. Bound probes were stripped with 10 mM Tris-Cl, pH 8.0 to prepare for the next round of hybridization.
  • The adaptor specific probe BrPrb3 at 0.1 µM was hybridized to the array to establish the positions of all the spots (shown as blue in FIG. 11). The 4 ligation probe pairs, at 0.4 µM, were then hybridized successively to the array: the spots hybridized to the A-specific ligation probe pair are shown as red in FIG. 11, the C-specific spots are green, the G-specific spots are yellow and the T-specific spots are cyan. In FIG. 11, circle A indicates the position of one of the spots hybridized to both the adaptor probe and the A-specific ligation probe pair, suggesting that the DNA arrayed at this spot is derived from a molecule of T1A that contains an A at position 32. It is clear that most of the spots are associated with only one of the 4 ligation probe pairs, and thus the nature of the base at position 32 of T1A can be determined specifically.
  • Using an in-house image analysis program, spots were identified using the images taken for the hybridization cycle using the adaptor probe. The same spots were also identified and the fluorescent signals were quantified for subsequent cycles with the base-specific ligation probes. An instrument background of 205 was subtracted from each signal. A discrimination score was calculated for each signal: each base-specific signal of each spot was divided by the average of the other 3 base-specific signals of the same spot. For each spot, the highest of the 4 base-specific discrimination scores was compared with the second highest score, and if the ratio of the two was above 1.8, then the base corresponding to the maximum discrimination score was selected for the base call. In this analysis over 500 spots were successfully base-called and the average discrimination score is 3.34. The average full match signal is 272, while the average single mismatch signal (signals from the un-selected bases) is 83.2, thus the full match/mismatch ratio is 3.27. The image background noise was calculated by quantifying signals from randomly selected empty spots and the average signal of these empty spots is 82.9, thus the full match/background noise ratio is 3.28. In these experiments the mismatch discrimination is limited by the low full match signal relative to the background.
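  • The base-calling rule described above can be sketched as follows. This is not the in-house image analysis program, only a minimal illustration of the stated rule: subtract the instrument background of 205, divide each base-specific signal by the mean of the other three, and call the top base only if its score exceeds the second highest score by a factor of more than 1.8. The function names and the small positive floor applied to background-corrected signals (to avoid division by zero) are assumptions added for the example.

```python
from typing import Dict, Optional

INSTRUMENT_BACKGROUND = 205.0   # value stated in the text
MIN_SCORE_RATIO = 1.8           # threshold stated in the text
SIGNAL_FLOOR = 1.0              # assumption: keeps corrected signals positive


def discrimination_scores(raw: Dict[str, float]) -> Dict[str, float]:
    """Score per base: corrected signal divided by the mean of the other three."""
    corrected = {b: max(s - INSTRUMENT_BACKGROUND, SIGNAL_FLOOR) for b, s in raw.items()}
    scores = {}
    for base, signal in corrected.items():
        others = [v for b, v in corrected.items() if b != base]
        scores[base] = signal / (sum(others) / len(others))
    return scores


def call_base(raw: Dict[str, float]) -> Optional[str]:
    """Return the called base, or None if the top two scores are too close."""
    ranked = sorted(discrimination_scores(raw).items(),
                    key=lambda kv: kv[1], reverse=True)
    (best_base, best), (_, second) = ranked[0], ranked[1]
    return best_base if best / second > MIN_SCORE_RATIO else None


if __name__ == "__main__":
    # Hypothetical raw signals for one spot (not measured data).
    print(call_base({"A": 477, "C": 288, "G": 290, "T": 286}))
    print(call_base({"A": 400, "C": 400, "G": 290, "T": 286}))
```

  • In the first hypothetical spot the A signal dominates and the spot is called; in the second, A and C score equally, so no call is made.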
  • By adjusting assay conditions (buffer composition including addition of NaCl, concentrations of all components and time and temperature of each step in the cycle) higher signal to background and full match to mismatch ratios are expected as demonstrated with similar ligation assay performed on our spotted arrays of 6-mer probes. In this case full match/background ratio is about 50 and average full match/mismatch ratio is 30.
  • 2.1.3. Preparation of Glass Slides for Attaching Capture Oligonucleotides
  • The cover slips are prepared by cleaning them in a solution of potassium hydroxide and ethanol. They are then rinsed and dried. After drying they are immersed in a solution of 3-aminopropyldimethylethoxysilane, acetone, and water for 45 minutes. After rinsing to remove excess reagents, they are cured in an oven at 100° C. for 1 hour. The cover slips are allowed to cool to room temperature and immersed in a solution of p-phenylenediisothiocyanate (PDC), pyridine, and dimethylformamide for 2 hours. After rinsing and drying the slides are ready to bind amine-modified oligonucleotides.
  • In another embodiment, a solution of 3-aminopropyldimethylethoxysilane, trimethylethoxysilane, acetone and water is used in the second step. The trimethylethoxysilane is used in various ratios to the 3-aminopropyldimethylethoxysilane to control the density of 3-aminopropyldimethylethoxysilane on the surface. By using a non-amino functionalized silane, we will produce fewer amino-functionalized sites on the surface, ultimately reducing the number of oligonucleotides that can bind to the surface during capture probe attachment or hybridization assays of the DNA array.
  • Under certain conditions it may be advantageous to use silanes that have longer or shorter alcohol groups on the silane molecule. For example we could use trimethylmethoxysilane in place of trimethylethoxysilane to control the activation rate of the silane molecule in solution. Mixtures of “ethoxy” and “methoxy” silanes could be used to produce better control over silane activation rates.
  • 3. ADDITIONAL EXAMPLES OF METHODS AND PROCESSES USED IN PRODUCING OR APPLICATIONS OF SADNA CHIPS OR OTHER PRODUCTS
  • 3.1. The Ordered Random Array Process
  • The core of this new approach involves the creation and efficient analysis of high-density random arrays containing millions of DNA molecules. Such random arrays eliminate the costly, time-consuming steps of arraying probes on the substrate surface and the need for individual preparation of thousands of sequencing templates. Instead they provide a fast and cost effective way to analyze complex DNA mixtures.
  • DNA molecules are arrayed at a density of about one molecule per square micron of substrate. A 3×3 mm array has the capacity to hold 1-10 million fragments, or approximately 1-10 billion DNA bases, the upper limit being the equivalent of three human genomes.
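  • The capacity figures above can be checked with a short calculation. This sketch is illustrative only: the mean fragment length of 1000 bases is an assumption inferred from the stated ratio of fragments to bases, and the function names are hypothetical.

```python
def array_capacity(width_mm: float, height_mm: float,
                   molecules_per_um2: float = 1.0) -> float:
    """Number of DNA molecules that fit on a rectangular array."""
    area_um2 = (width_mm * 1000.0) * (height_mm * 1000.0)   # mm -> micron
    return area_um2 * molecules_per_um2


def total_bases(n_fragments: float, mean_fragment_bases: float = 1000.0) -> float:
    """Total bases held on the array (fragment length is an assumed value)."""
    return n_fragments * mean_fragment_bases


if __name__ == "__main__":
    fragments = array_capacity(3, 3)
    print(fragments, total_bases(fragments))
```

  • A 3×3 mm array at one molecule per square micron holds 9 million molecules, which at an assumed 1000 bases per fragment is 9 billion bases, in line with the stated upper limit of about 10 million fragments and 10 billion bases.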
  • We describe here two broad platforms for nanoscale, ordered arrays of DNA.
  • In the first platform, random sequencing by hybridization (rSBH) utilizes attached DNA nano-ball targets and multiple cycling of probe pool solutions over the attached targets. The rSBH process preserves all the advantages of combinatorial SBH demonstrated on our HyChip product, including the high specificity of the ligation process. At the same time it adds several important benefits that result from the attachment of DNA fragments instead of probes. DNA attachment creates the possibility of using random DNA arrays with much greater capacity than regular probe arrays, and allows detection by ligation of two labeled probes in solution. In addition, having both probe modules in solution allows us to expand our informative probe pool (IPP) strategy to both probe sets, which was not possible on the HyChip™ format. The rSBH process allows for the identification of unknown-sequence targets bound to the surface substrate either through full sequencing or by partial sequence signatures. In this regard the targets are “random” because they are distributed randomly at attachment sites on the array surface. Once identification of the surface targets is enabled, either through full or partial sequence acquisition, the targets may take on the role of probes in a subsequent hybridization assay of solution phase targets. This latter process is termed Self-assembled DNA nano-arrays (saDNA) and is the basis for the second sequencing platform described here.
  • 3.2. The Instrument
  • The system hardware consists of three major components: the illumination system, the reaction chamber, and the detector system. These components work together to provide single fluorophore detection sensitivity. TIRFM creates an evanescent field at the interface of two optically different materials (4). This field is an extension of the beam energy that reaches beyond the glass/water interface by a few hundred nanometers (generally 100-500 nm). It can be used to excite fluorophores close to the glass-water boundary and virtually eliminates background from the excitation source.
  • The substrate, once attached to the reaction chamber, forms the bottom section of a hybridization chamber. This chamber controls the hybridization temperature, provides ports for the addition of probe pools or targets to the chamber, removal of the probe pools or targets and substrate washing. A fluorescently labeled solution is introduced into the chamber and is given time to hybridize with the attached DNA. A high sensitivity CCD camera capable of single photon detection is used to detect fluorescent hybridization/ligation events. For sequence determination or signature identification of attached targets multiple solution phase probe pools may be cycled through the chamber. After image acquisition the chamber may be flushed to remove all probes and the next probe pool is introduced. This process is repeated 256-512 times until all probe pools have been assayed.
  • The detection instrument has been fully assembled with features including: adjustable laser power, electronic shutter, auto focus, and operating software. The system was optimized and tested using arrays of individual Tamra dye molecules and arrays of dendrimers with 350 and 50 dye molecules. Detection of a single molecule of dye is achieved in about 10% of cases demonstrating projected sensitivity of a TIRFM-based strong illumination/low background process coupled with a sensitive CCD camera. This result also demonstrates that detection of single molecules is statistically inefficient due to various physical and chemical factors and that a target DNA amplification schema is required. Dendrimers with only 50 dye molecules produce signal that is 50 fold stronger than background indicating that about 100 fold target amplification would be sufficient. We also developed and optimized a reaction chamber with automated temperature control and liquid handling including efficient washing of the chamber. The rSBH instrument is currently fully operational for individual hybridization cycles. For handling 256-1024 hybridization cycles we have designed a robotic station that is fully integrated with both reaction chamber and detection system.
  • 3.3. Coverslip Chemistry and Design
  • Effective glass activation chemistry has been developed that creates a monolayer of isothiocyanate reactive groups for attaching amine modified primers or DNA capture oligonucleotides. This monolayer chemistry reduces trapping of labeled probe and thus dramatically reduces the assay's background.
  • 3.4. Linear Rolling Circle Replication (RCR) from Ss Circles and Surface Attachment
  • Our approach to create random arrays of single molecule sequences has been to linearly amplify the target or saDNA-probe to be sequenced or identified in the form of concatemers. This strategy creates an attached DNA molecule with many rSBH-probe binding sites resulting in higher and more sustained signal intensities than obtained with single fluorophores. We describe these condensed concatemers of linear DNA as DNA nano-balls. The long, single-stranded concatemers are generated by a rolling circle replication (RCR) process that relies upon the desired target molecule first being formed into a circular substrate.
  • Amplification of the DNA in this form has several advantages: 1) The amplification is linear, so every copy is made from the original template, which prevents mutated copies from being over-represented relative to the original template sequence. 2) All of the copies are localized to a single molecule and so are ideal for microscopic analysis with fluorescent probes. Amplification proceeds at 30° C. in the presence of phi29 polymerase and dNTPs. One strand is a closed circle and acts as the template; the other strand, with an exposed 3-prime end, acts as an initiating primer and is extended. The strand displacing activity of the enzyme creates a long single stranded molecule of hundreds or thousands of copies of the circle. Regions of the single stranded molecule (in the adapter sequence) are utilized to form stable hybrids to complementary oligonucleotides attached to the surface of the coverslip.
  • For the task of preparation of random arrays of amplified DNA, single-stranded hybridization-ready concatemers of DNA fragments were generated by RCR. The continuous strand extension creates a long, single-stranded DNA consisting of hundreds of concatemers complementary to the circle. We found that, if arrayed, these concatemers form long threads on the regular glass surface (FIG. 12, panel c). To achieve compact, dense bundles of the DNA in the form of sub-micron spots or nano-balls we utilized a region of the amplified molecule for hybridization to a capture probe attached to the glass surface (FIG. 4, panels a and b). Hundreds of capture probe molecules (spaced about 0.10 nm apart) keep hundreds of concatenated copies of a target molecule tightly bound to a glass surface area of less than 300 nm in diameter.
  • In one study, two synthetic targets were co-amplified and about one million molecules captured on the glass surface, and then probed for one of the targets. After imaging and photo-bleaching the first probe, the second target was probed. There was no evidence of co-localization of targets under these conditions. We then demonstrated that a fluorescent 11mer probe could hybridize to bound DNA and produce a strong signal equivalent to a 30mer probe. We also confirmed that the probe could be removed through heating at 70° C. and then re-hybridized to produce equally strong signals.
  • Uniform RCR Amplicon Length
  • One observed feature of RCR generated concatemers has been a range of feature sizes produced from a homogeneous target population (see FIG. 13). This may be a result of extension initiating at different times on different circular templates, or of different rates of extension for individual polymerase molecules. It is believed that one polymerase molecule is responsible for the continuous extension of a primer (5), although we are not aware of any studies describing an upper limit to the size of product produced by a single polymerase molecule. To create more uniform sizes of the amplified targets we will incorporate dideoxy nucleotides as a very small proportion of the total dNTP concentration (e.g. 1 in 50,000). This may have the effect of terminating those molecules that extend at a more rapid rate than other molecules that either initiate later or extend at slower rates. In another approach to create more uniform sizes of the amplified targets we may block short concatemers by consuming all potential binding sites with a predefined number of concatemer complementary sequences introduced before surface attachment. This could be achieved by creating ligation concatemers of the 30-40 base capture oligonucleotide attachment site.
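  • The effect of a small dideoxy fraction on product length can be illustrated with a simple chain-termination model. The sketch below assumes that every incorporation terminates independently with a probability equal to the dideoxy fraction (1 in 50,000 in the example above); this ignores any polymerase preference for or against dideoxy nucleotides, and the function names are hypothetical.

```python
import math

# Simple chain-termination model (an assumption for illustration only):
# each incorporated base is a terminator with probability ddntp_fraction.


def fraction_longer_than(length_bases, ddntp_fraction=1 / 50_000):
    """Probability that a strand extends past length_bases without terminating."""
    return (1.0 - ddntp_fraction) ** length_bases


def mean_product_length(ddntp_fraction=1 / 50_000):
    """Mean of the geometric length distribution implied by the model."""
    return 1.0 / ddntp_fraction


for length in (10_000, 50_000, 100_000, 200_000):
    print(length, round(fraction_longer_than(length), 3))
print("mean length (bases):", round(mean_product_length()))
```

  • Under this model most terminated products are capped near the mean length, so strands that extend fastest are the ones most likely to be stopped, which is the size-equalizing effect described above.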
  • 3.5. Methods for Circle Formation from Double Stranded DNA
  • 3.5.1. Method I
  • A universal adapter that also serves as the binding site for capture probes and RCR primer is ligated to the 5′ end of the target molecule using a universal template DNA containing degenerate bases for binding to all genomic sequences. The 3′ end of the target molecule is modified by addition of a poly-dA tail using terminal transferase. The modified target is then circularized using a bridging template complementary to the adapter and to the oligo-dA tail (FIG. 14).
  • 3.5.2. Method II
  • Single stranded PCR products can be prepared by exonuclease digestion of one of the strands or by strand separation with high temperature and rapid cooling. Primer sequences will be incorporated into the 5-prime ends of the primers to allow for the hybridization of a bridge oligonucleotide for circularization (FIG. 15). This approach can be utilized for genomic fragment capture with adapters ligated to restriction enzyme fragmented genomic DNA. With two adapters, approximately 50% of fragments will possess two different adapters at each end which can then be used for strand removal and circle formation.
  • Capture sequences in the bridge will be the same for each molecule but probe binding sequences for sequence identification will vary. Circularization of the molecule proceeds with a bridging oligonucleotide of about 20-30 bases in length that will bring the two ends into juxtaposition for ligation by T4 DNA ligase. The bridging molecule can now act as the primer for extension. Amplification of DNA captured into the circular molecules proceeds by a rolling circle replication to form long linear concatemer copies of the circle.
  • 3.6. rSBH-Probe Cycling
  • The novelty of the method proposed here is that millions of single DNA molecules, randomly arrayed on an optically clear surface, serve as templates for hybridization and ligation of fluorescent-tagged probe pairs. Pairs of probe pools, at least one of which is labeled with a fluorophore, are mixed with DNA ligase and presented to the random array. When probes hybridize to adjacent sites on a target fragment they are ligated together, forming a stable hybrid. A sensitive mega pixel CCD camera with advanced optics is used to simultaneously detect millions of these individual hybridization/ligation events on the entire array. Once signals from the first pool are detected, the probes are removed and successive ligation cycles are used to test different probe combinations. The fixed position of the CCD camera relative to the array ensures accurate tracking of consecutive hybridizations to individual target molecules. The entire sequence of each DNA fragment is compiled based on fluorescent signals generated by hundreds of independent hybridization/ligation events.
  • Detection of the attached concatemers can also occur with sequence specific probes. To identify specific mutations of the concatemer, TAMRA labeled 5mer probes can be used in conjunction with a pair of 6mers to identify the base sequence at the mutated site.
  • We have demonstrated the ability for probe ligation to occur with the condensed concatemers. Reactions were carried out at 20° C. for 10 min using our ligation kit followed by a brief wash of the chamber to remove excess probe.
  • 3.7. High Density Structured Random Arrays
  • The proposal is to structure random DNA arrays into a high density grid, such that each DNA binding site is only 100-300 nm in size and each binding site contains only a single DNA fragment. This approach should minimize cross hybridization between DNA targets, while at the same time substantially decreasing the size of each binding site and thus increasing the density of binding sites per array. The significance of being able to efficiently and inexpensively make such “perfect” random DNA arrays is tremendous. Maximizing the number of DNA segments per surface area will enable scientists to analyze a complex genome on one small glass chip, about 1 cm2 in size or less. A CCD chip can be perfectly aligned with the DNA array to provide a one to one correspondence between each CCD pixel and DNA binding site, maximizing reading efficiency.
  • Development of DNA random arrays in the form of “perfect” high density grids with sub-micron spots will provide the basis for daily sequencing of multiple human genomes using affordable 10 mega pixel CCD detectors. These whole genome DNA arrays have over 1000 times more DNA spots than the current high density probe arrays. Because a one cm2 chip can hold over one billion DNA fragments (>100 billion bases or over 30 human genomes), an automated process can be developed such that the total sequencing reaction volume for 100 interrogation cycles would be only 1 ml, reducing sequencing cost to less than $1000 per genome.
  • The proposed high density structured random DNA array chip will have capture oligonucleotides concentrated in small, segregated capture cells aligned into a rectangular grid formation (FIG. 16). Most importantly, each capture cell or binding site will be surrounded by an inert surface and will have a sufficient but limited number of capture molecules (100-400). Each capture molecule will bind one copy of the matching adaptor sequence on the RCR produced DNA concatemer. Since each concatemer contains over 1000 copies of the adapter sequence, it will quickly saturate the binding site upon contact and prevent other concatemers from binding, resulting in exclusive attachment of one RCR product per binding site or spot. The proper concentration of RCR products and sufficient reaction time will ensure that almost every spot on the array contains one and only one unique DNA target.
  • RCR “molecular cloning” allows the application of the saturation/exclusion (single occupancy) principle in making random arrays. The exclusion process is not feasible in making single molecule arrays if an in situ amplification is alternatively applied. RCR concatemers provide an optimal size to form small non-mixed DNA spots. Each concatemer of about 100 kb is expected to occupy a space of about 0.1×0.1×0.1 um. This indicates that RCR products can fit into the 100 nm capture cells. Another advantage of RCR products is that the single stranded DNA is ready for hybridization and is very flexible for forming a randomly coiled ball of DNA. It is important to note that 1000 copies of DNA target produced by RCR provide much higher specificity than analysis of a single molecule. Thus, RCRs provide several important advantages without any serious penalties.
  • Having 125-250 nm DNA sites in a regular grid with 250-500 nm center-to-center spacing will provide 20-80 times more DNA samples per surface than arrays with random attached DNA with spots of about 1000 nm in size and 20% usable occupancy. This will result in 20-80 fold lower reagent consumption and 20-80 fold faster readout. Furthermore, attaching RCR products onto this dense grid of capture probe spots ensures that each DNA ball is concentrated on a much smaller surface, increasing the signal and the speed of biochemical assays. Overall, the reduction of DNA attachment spots from 500 nm to 125 nm in size will result in up to 16 fold higher signal intensities. In short, the proposed DNA arrays will provide an order of magnitude lower cost, higher throughput and higher sensitivity than standard random DNA arrays.
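  • The 20-80 fold and 16 fold figures above can be checked with a few lines of arithmetic. The sketch below assumes one spot per grid cell for the structured array, and assumes that signal density scales inversely with spot area when the same amount of DNA is packed into a smaller spot. Both are simplifications, and the function names are illustrative.

```python
# Density comparison between a structured grid and a random array
# (illustrative arithmetic only).


def grid_spots_per_um2(pitch_nm):
    """Spots per square micron for a square grid with the given center-to-center pitch."""
    pitch_um = pitch_nm / 1000.0
    return 1.0 / (pitch_um ** 2)


def random_spots_per_um2(spot_nm=1000.0, usable_occupancy=0.20):
    """Usable spots per square micron for randomly attached DNA."""
    spot_um = spot_nm / 1000.0
    return usable_occupancy / (spot_um ** 2)


def signal_gain(old_spot_nm, new_spot_nm):
    """Assumed signal gain when the same DNA is concentrated into a smaller spot."""
    return (old_spot_nm / new_spot_nm) ** 2


baseline = random_spots_per_um2()
gain_500 = grid_spots_per_um2(500) / baseline
gain_250 = grid_spots_per_um2(250) / baseline
print("density gain at 500 nm pitch:", round(gain_500))
print("density gain at 250 nm pitch:", round(gain_250))
print("signal gain 500 nm -> 125 nm:", signal_gain(500, 125))
```

  • A 500 nm pitch gives 4 spots per square micron and a 250 nm pitch gives 16, against 0.2 usable spots per square micron for the random array, which is where the 20 and 80 fold bounds come from.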
  • A long term goal is to develop a structured array of 384-unit arrays each 3.33×3.33 mm in size (10 mm2) spaced at standard 384-well plate dimensions of 4.5 mm well to well distance. This composite array can have 384×100 million DNA spots spaced at 333 nm center to center, a density that provides 10 million spots per mm2, 1 billion spots per cm2 or a total of 38.4 billion DNA spots. To analyze these arrays at the speed of 100 million spots per second (one unit array per second) will require a 30-100 mega-pixel CCD detector and it will take 6.5 minutes per cycle. The goal for the first usable system based on the composite structured arrays would be to produce DNA features that are spaced at 1 micrometer center to center and a total of up to 3.8 billion spots (10 million per unit array) that can be read in about 5 minutes with a 10 mega pixel CCD detector. One billion binding sites with 100 base long DNA fragments can hold an equivalent of 30 human genomes at 1× coverage.
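  • The composite array figures can be reproduced from the unit array size and spot pitch. The sketch below is a back-of-the-envelope check only; it assumes a perfect square grid that fills each 3.33 mm unit array, and the names are illustrative.

```python
# Spot counts and read time for a composite of 384 unit arrays
# (illustrative arithmetic; assumes a perfect square grid in each unit).

UNIT_SIDE_MM = 3.33  # side of one unit array, from the text
UNIT_COUNT = 384     # unit arrays per composite, from the text


def composite_array_stats(pitch_nm, spots_read_per_second):
    """Return (spots per unit array, total spots, minutes per imaging cycle)."""
    spots_per_side = UNIT_SIDE_MM * 1e6 / pitch_nm  # 1 mm = 1e6 nm
    spots_per_unit = spots_per_side ** 2
    total_spots = spots_per_unit * UNIT_COUNT
    minutes_per_cycle = total_spots / spots_read_per_second / 60.0
    return spots_per_unit, total_spots, minutes_per_cycle


per_unit, total, minutes = composite_array_stats(333, 100e6)
print(f"spots per unit array: {per_unit:.3g}")
print(f"total spots: {total:.3g}")
print(f"minutes per cycle: {minutes:.1f}")
```

  • At a 333 nm pitch each unit array holds about 100 million spots, and reading one unit array per second takes 384 seconds, or about 6.4 minutes, in line with the roughly 6.5 minutes quoted above.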
  • Composite arrays of hundreds of smaller unit arrays have many advantages over a single large array. For example, a subset of genes instead of entire genomes can be selectively amplified in a multiplex reaction and sequenced in hundreds of individuals at the same time on one composite array. Another very important application of array of arrays is to determine whole chromosome sequence and haplotypes using our novel two-level fragmentation method. This method represents an enabling technology that provides mapping information for assembling chromosomal haplotypes and alternatively spliced mRNAs for any analysis based on random DNA fragmentation.
  • In this method, genomic sample DNA is first prepared in the form of about 5, 10, 100, or 200 kb length fragments. By proper dilution, a small subset of these fragments is placed at random in discrete wells of multi-well plates or similar accessories. For example, a plate with 96, 384 or 1536 wells can be used for these fragment subsets. An optimal way to create these DNA aliquots is to take only 10-30 cells, isolate the DNA, fragment it into long segments and then split the entire preparation into 384 wells. This will assure that all chromosomal regions are represented with the same coverage. The DNA aliquots will contain a few to 10, 20 or more fragments. The fragment subset's complexity is determined by the capacity of unit arrays and by statistical requirements. The goal is to minimize cases where any two overlapping fragments from the same region of chromosome or any two mRNA molecules transcribed from the same gene are placed in the same subset, e.g. the same plate well. For diploid genomes represented with 10× coverage there are 20 overlapping fragments on average to separate in distinct wells. By forming 384 fractions in a standard 384-well plate there is only about 1/400 chance that two overlapping fragments will end up in the same well. Even if some matching fragments are placed in the same well, the other overlapping fragments from each chromosomal region will provide the necessary unique mapping information.
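  • The well-sharing statistics can be sketched as follows, assuming each fragment lands in a well independently and uniformly at random (an idealization of the dilution step). The first function gives the chance for one specific pair of overlapping fragments; the second gives the chance that at least one pair among all overlapping fragments of a region shares a well. Function names are illustrative.

```python
# Chance that overlapping fragments share a well, assuming independent,
# uniform random placement (an idealized model of the dilution step).


def pair_same_well_probability(wells):
    """Probability that two given fragments fall into the same well."""
    return 1.0 / wells


def any_shared_well_probability(fragments, wells):
    """Probability that at least two of the fragments share a well."""
    p_all_distinct = 1.0
    for already_placed in range(fragments):
        p_all_distinct *= (wells - already_placed) / wells
    return 1.0 - p_all_distinct


p_pair = pair_same_well_probability(384)
p_any = any_shared_well_probability(20, 384)
print(f"one specific pair in the same well: {p_pair:.4f}")
print(f"any pair among 20 fragments in the same well: {p_any:.2f}")
```

  • The per-pair value is close to the 1/400 quoted above. The chance that some pair among 20 overlapping fragments coincides is considerably larger, which is why the remaining overlapping fragments in other wells matter for unambiguous mapping.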
  • The prepared groups of long fragments are further cut to the final fragment size of about 200 to 2000 bases. To obtain 10× coverage of each fragment in a group, the DNA in each well may be amplified before final cutting using well-developed whole genome amplification methods. All short fragments from one well will then be arrayed and sequenced on one separate unit array or in one section of a larger continuous matrix. The above described composite array of 384 unit arrays is ideal for parallel analysis of these groups of fragments. In the assembly of long sequences representing parental chromosomes, the algorithm will use the critical information that short fragments detected in one unit array belong to a limited number of longer continuous segments, each representing a discrete portion of one chromosome or one mRNA molecule in the case of analyzing expressed sequences. In almost all cases the homologous chromosomal segments will be analyzed on different unit arrays. Long continuous initial segments form a tiling pattern and provide sufficient mapping information to assemble each parental chromosome separately, as depicted below, by relying on about 100 polymorphic sites per 100 kb of DNA. Dots represent 100-1000 consecutive bases that are identical in corresponding segments.
  • Example
  • Well 3 T C C...G A
    Well 20 C T T...A G C...
    Well 157 T...A G C A...C...
    Well 258 ...C C...G A T G...T...
  • Wells 3 and 258 assemble mother's chromosome:
    T C C...G A T G...T...
  • Wells 20 and 157 assemble father's chromosome:
    C T T...A G C A...C...
  • Random arrays prepared by two-level DNA fragmenting combine the advantages of both BAC sequencing and shotgun sequencing in a simple and efficient way. In addition to haplotype determination, this innovation will extend use of random DNA arrays for de novo sequencing of complex genomes or mixtures of genomes, e.g. all bacterial and protozoan genomes in a drop of sea water.
  • Overall, the high density structured random DNA arrays and array of arrays will provide 20-80 times more DNA binding sites per surface area than random attachment, resulting in several advantages:
      • 1) A 20-50 fold overall increase in sequencing efficiency per array
      • 2) A 20-50 fold decrease in reagent use and sequencing time and thus an equally large cut in cost.
      • 3) Increased array reading efficiency, since each pixel of the CCD camera can be aligned to one spot on the ordered array, resulting in the largest possible density of spots per image.
      • 4) No overlaps between DNA targets, since targets will be spaced 250-500 nm center to center with 100-300 nm of inert surface space between binding sites.
      • 5) A 16 times higher signal, because DNA targets will be concentrated in a much tighter ball over dense 125 nm spots of probes.
      • 6) A very stable DNA array, since there are over a hundred attachment points for each RCR product.
      • 7) A probe and enzyme friendly array, since most of DNA is not directly attached to the glass surface, thus it would be accessible to ligase, polymerase or other DNA processing enzymes and hybridization probes.
      • 8) Flexibility in making and using an array of structured random arrays for more efficient haplotype and splice variant determination, analysis of multiple samples in parallel, staggered sequencing reaction to eliminate the idle time of CCD detectors, parallel probing cycles to shorten the sequencing completion time of longer DNA fragments.
  • Structured, high-density random arrays with submicron patterned support surfaces and RCR concatemers also have many advantages over probe arrays and DNA-on-bead arrays:
      • 1) Light or focused particle induced patterning of the surface is much easier because only one universal “mask” is required. Making an array of 20-mers by in situ synthesis requires 80 steps and 80 masks. The ease of one-step patterning allows for smaller, higher density grid cells or binding sites. Thus, in addition to containing a much higher concentration of grid cells (billions) compared to bead arrays with large grid cells or wells, this patterned surface is far simpler and cheaper to prepare.
      • 2) Amplification of DNA fragments by RCR provides ideal “molecular cloning” in solution without any segregation of individual molecules by physical barriers. The only requirement is the proper concentration of target molecules. A single reaction tube with 1000 ul of RCR solution can amplify one billion fragments, each of which is allocated to a 10×10×10 micrometer volume on average. Each concatemer is expected to occupy a space of about 0.1×0.1×0.1 um. Thus, the average distance between concatemers in RCR solution is 100 times larger than their size. This distance minimizes DNA chain entanglements between concatemers. RCR combined with a patterned surface is an inexpensive solution to make billions of DNA spots in comparison with arrays of long gene specific probes prepared by in situ synthesis of oligonucleotides.
      • 3) In comparison to probe arrays, random DNA arrays provide a better solution for sequencing complex genomes because complex genomes are broken into millions of parallel low complexity sequencing reactions. Structured arrays are especially efficient in providing over 10 billion DNA spots. The DNA array format allows accurate determination (by counting) of low frequency mRNAs or SNPs in complex sample pools. For testing 1000 SNPs in such a pool with that frequency, a unit random array with 10 million DNA fragments would be sufficient.
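  • The spacing argument in item 2 above (one billion fragments in 1000 ul of RCR solution) is a unit conversion. The sketch below works it through, assuming the fragments are evenly dispersed and treating each concatemer as a 0.1 um cube as stated in the text; the names are illustrative.

```python
# Average solution volume per concatemer in an RCR reaction
# (illustrative arithmetic; assumes evenly dispersed molecules).

UM3_PER_UL = 1e9  # 1 ul = 1 mm^3 = (1000 um)^3


def volume_per_fragment_um3(volume_ul, fragment_count):
    """Average volume available to each fragment, in cubic micrometers."""
    return volume_ul * UM3_PER_UL / fragment_count


volume_um3 = volume_per_fragment_um3(1000, 1e9)
cube_side_um = volume_um3 ** (1.0 / 3.0)
concatemer_size_um = 0.1  # from the text
spacing_ratio = cube_side_um / concatemer_size_um
print(f"volume per fragment: {volume_um3:.0f} um^3")
print(f"equivalent cube side: {cube_side_um:.1f} um")
print(f"spacing relative to concatemer size: {spacing_ratio:.0f}x")
```

  • Each fragment has about 1000 cubic micrometers on average, a cube roughly 10 um on a side, so the mean separation is about 100 times the concatemer size.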
  • There are also several advantages of rSBH over sequencing by synthesis even if the latter is done on the same structured arrays:
      • 1) It is based on an efficient and proven probe ligation biochemistry that is easy to perform in cycles
      • 2) Can analyze multiple fragments per grid cell with proper total sequence complexity
      • 3) Has longer adjustable read length from 100-1000 bases
      • 4) Allows data combination of different probes tested on different unit arrays prepared from the same DNA sample; this parallel data acquisition cuts the assay time 4-8 fold
      • 5) Allows use of a large number of dyes per cycle (many more than the maximum of 4 dyes allowed in sequencing by synthesis) to reduce the number of cycles, i.e. total assay time
      • 6) 11 reads per base by 11 overlapping 11-mers provides higher accuracy per each DNA strand;
      • 7) There is no signal degradation with each consecutive cycle or in bases following “reading stops”
      • 8) Provides partial sequence signature analysis for long DNA including entire mRNA (2-5 kb) per spot using special probe pools; an important advantage for efficient gene expression and splice variant analyses
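  • Item 6 above states that each base is read 11 times by 11 overlapping 11-mers. The short sketch below counts how many overlapping k-mers cover each position of a fragment. It shows that interior bases are covered k times, while bases within k-1 positions of either end are covered fewer times. The function name is illustrative.

```python
# Count how many overlapping k-mers cover each base of a linear fragment.


def kmer_reads_per_base(sequence_length, k=11):
    """Return a list with the number of overlapping k-mers covering each position."""
    counts = [0] * sequence_length
    for start in range(sequence_length - k + 1):
        for position in range(start, start + k):
            counts[position] += 1
    return counts


coverage = kmer_reads_per_base(100, k=11)
print("first base:", coverage[0])
print("interior base:", coverage[50])
print("last base:", coverage[-1])
```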
  • The main limitation of rSBH and SBH in general is the difficulty in determining the exact length of long simple repeats (ACACACACACAC . . . ), usually longer than about 10 bases. Special probes can be used for extending the read length of such sequences. For example, for reading (A)n repeats, probes (C,G,T)3A6-10 and A6-10(C,G,T)3 alone or in combination with (A)7-20 spacers can be used in 10-15 additional ligation cycles to extend the read length of simple repeats to about 30 bases.
  • Even though SBH does not provide direct positional information (which base is on which position), sequencing of short (100-200 bp) fragments in random DNA arrays removes the limitation of branching points for de novo sequence assembly in rSBH, because the mathematically proven de novo read length of 11-mers is over 1000 bases. We have used combinatorial probe ligation on HyChip universal arrays for successful de novo sequencing of DNA samples 100-700 bases in length.
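  • The notion of a branching point can be illustrated with a toy reconstruction from a k-mer spectrum. The sketch below is not the assembly algorithm used for rSBH; it assumes an error-free spectrum and a known starting k-mer, and the sequence and function names are hypothetical. It extends the sequence one base at a time while exactly one unused k-mer overlaps the current end, and reports a branch when more than one does.

```python
# Toy reconstruction from a k-mer spectrum (illustration only; assumes an
# error-free spectrum and a known starting k-mer).


def kmer_spectrum(sequence, k):
    """Return the set of all overlapping k-mers in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def reconstruct(spectrum, start, k):
    """Extend from start by unique (k-1)-base overlaps.

    Returns (sequence, branched); branched is True if extension stopped
    because more than one k-mer could follow.
    """
    sequence = start
    used = {start}
    while True:
        suffix = sequence[-(k - 1):]
        candidates = [suffix + base for base in "ACGT"
                      if suffix + base in spectrum and suffix + base not in used]
        if len(candidates) != 1:
            return sequence, len(candidates) > 1
        sequence += candidates[0][-1]
        used.add(candidates[0])


target = "ATGGCGTACGTTAGC"  # hypothetical short fragment

# With 5-mers every 4-base overlap is unique, so the fragment is recovered.
full, branched_5 = reconstruct(kmer_spectrum(target, 5), target[:5], 5)
# With 4-mers the overlap CGT occurs twice, which creates a branching point.
partial, branched_4 = reconstruct(kmer_spectrum(target, 4), target[:4], 4)
print(full, branched_5)
print(partial, branched_4)
```

  • Longer probes and shorter fragments both reduce the chance that a (k-1)-base overlap repeats within a fragment, which is the reason short fragments sidestep branching during assembly.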
  • 4. METHODS AND APPLICATIONS FOR RSBH AND RCR UTILIZING TECHNOLOGIES
  • 4.1. Genomic Region Isolation
  • 4.1.1. Method I. Primer Extension
  • Primer extension from a genomic DNA template may be used to generate a linear amplification of greater than 10 kilobases of sequence surrounding the genomic region of interest. To create a population of defined size targets, 20 cycles of linear amplification will be performed with the forward primer followed by 20 cycles with the reverse primer (FIG. 17). Before applying the second primer, the first primer can be removed with a standard column for long DNA purification or degraded if a few uracil bases are incorporated. A greater number of reverse strands are generated relative to forward strands resulting in a population of double stranded molecules and single stranded reverse strands. The reverse primer for the test DNA is biotinylated for capture to streptavidin beads which can be heated to melt any double stranded homoduplexes from being captured. All attached molecules will be single stranded and representing one strand of the original genomic DNA. Although full long-range PCR is an option here, the chance of introducing base changes by polymerase mis-incorporation and selective amplification of deleted products is minimized by avoiding exponential amplification. In addition, the amount of sample required for downstream random DNA array applications will still be ample with a linear amplification approach.
  • The 10 kb products produced can be fragmented to 0.2-2 kb in size (effectively releasing them from the solid support) and used for RCR and random array production or RCR and solution phase target production for saDNA. In this procedure single stranded DNA fragments are first treated with terminal transferase to attach a poly dA tail to the 3-prime end. This is then followed by ligation of the free end intra-molecularly with the aid of another bridging oligonucleotide (see section 3.5.1. for a description of the procedure).
  • Once single stranded circles have been formed, a primer for a strand displacing polymerase such as Phi 29 polymerase can be used to create a long, linear concatemer of the circle. The concatemers may then be attached to the surface of a glass slide for detection with fluorescent probes or labeled and used as targets on an array. Sequence specific probes may be designed for specific regions of the attached concatemer, spaced about 1 kb apart, within the 10 kb of original sequence. In effect this process will “count” the number of individual molecules that were produced through a positive or negative hybrid formation to probes for targeted regions or non-targeted regions.
  • 4.1.2. Method II: Large Fragment Capture
  • Rare-cutting restriction enzymes may be mapped to the genomic regions of interest to predict fragment size and the sequences at the ends of the molecules. One suitable enzyme is NotI, which cleaves the human genome on average every 130 kb. Although methylation could affect the cutting efficiency, if the genomic DNA is from a homogeneous source then digestion patterns should be complete and not partial.
  • To isolate specific fragments from genomic DNA, all released fragments will first be treated with lambda exonuclease. This enzyme degrades bases from the 5-prime end of double stranded templates possessing a 5-prime phosphate. The strand shortening will be controlled to degrade approximately 50 to 100 bases of one strand from each end. The single stranded sequence that is revealed can act as a region of hybridization for a tagged primer for selection. The primer will be extended with the Stoffel fragment of Taq polymerase which will extend the strand until it is adjacent to the 5-prime end of the degraded strand. The newly synthesized strand can then be ligated to the undigested portion to complete the strand with a thermostable ligase. One half of the primer (3-prime) is used for sequence specific extension of the primer and reconstruction of the strand. The other half (5-prime tagged end) of the primer contains at least 20 bases of sequence for hybridization to a complementary sequence attached to the surface of a microplate well. To remove excess primers the sample will first be filtered to remove small DNA fragments. The sample is then hybridized via one end to the surface and non-attached sequences are removed by gentle washing. Capture of the molecule via one end allows one level of selection, but release of the captured molecule and re-capture via the other end provides a second and higher level of purification and selection.
  • After the final release of the 100 kb DNA fragment from the surface the sample will be digested with a six-base restriction enzyme. These fragments of 5 to 10 kb can be used for subsequent DNase fragmentation and circle formation.
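  • As a rough check on the fragment sizes in this section, the expected spacing between restriction sites can be estimated from the length of the recognition sequence. The sketch below assumes a random sequence with equal base frequencies, which real genomes do not follow, so it gives only an order-of-magnitude guide; the function name is illustrative.

```python
# Expected average spacing between restriction sites in a random sequence
# with equal base frequencies (a simplifying assumption; real genomes differ).


def expected_site_spacing(site_length, base_probability=0.25):
    """Average number of bases between occurrences of one specific site."""
    return 1.0 / (base_probability ** site_length)


six_base_spacing = expected_site_spacing(6)
eight_base_spacing = expected_site_spacing(8)
print("six-base cutter:", six_base_spacing, "bases")
print("eight-base cutter:", eight_base_spacing, "bases")
```

  • Under this model a six-base cutter cuts about once every 4 kb, the same order of magnitude as the 5 to 10 kb fragments described above. For an eight-base site the model gives about 65 kb, whereas the text quotes 130 kb on average for the human genome, which shows how far a real genome can depart from the random-sequence assumption.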
  • 4.2. Mutation Discovery by Mismatch Enzyme Cleavage
  • Several approaches to mutation detection employ a heteroduplex in which the mismatch itself is utilized for cleavage recognition. Chemical cleavage with piperidine at mismatches modified with hydroxylamine or osmium tetroxide provides one approach to release a cleaved fragment. In a similar way the enzymes T7 endonuclease I or T4 endonuclease VII have been used in the enzyme mismatch cleavage (EMC) technique (6-8).
  • Cleavase is used in the cleavage fragments length polymorphism (CFLP) technique (9) which has been commercialized by Third Wave Technologies. When single stranded DNA is allowed to fold and adopt a secondary structure the DNA will form internal hairpin loops at locations dependent upon the base sequence of the strand. Cleavase will cut single stranded DNA five-prime of the loop and the fragments can then be separated by PAGE or similar size resolving techniques.
  • Mismatch binding proteins such as Mut S and Mut Y also rely upon the formation of heteroduplexes for their ability to identify mutation sites. Mismatches are usually repaired but the binding action of the enzymes can be used for the selection of fragments through a mobility shift in gel electrophoresis or by protection from exonucleases (10).
  • Various factors may affect the specificity of cutting with the mismatch enzymes, such as temperature, pH, salt and possibly sequence context (11). To demonstrate the ability of the mismatch enzymes to detect mutations under specific laboratory conditions, we will therefore use a set of synthetic targets to test for optimal conditions. Two synthetic targets will be mixed, with one containing either a single base change or a deletion of several bases. The mixture will be heat denatured and re-annealed. The re-annealed products will be treated with the mismatch detection enzymes T7 endonuclease I or T4 endonuclease VII to determine the most effective enzyme. In the case of T7 endonuclease I, a population of molecules with 5-prime phosphorylated overhangs surrounding the site of the mutation will be created, while T4 endonuclease VII cuts 3-prime of the mismatch. A range of overhang types will therefore be generated depending on the position of the cut sites. Gel analysis will display the efficiency of cutting and re-ligation will display the nature of the overhang.
  • 4.2.1. Method I: Capture of Mismatch Cleaved DNA from Primer Extended Products
  • Templates for heteroduplex formation will be prepared by primer extension from genomic DNA. For the same genomic region of the reference DNA, an excess of the opposite strand is prepared in the same way from the test DNA but in a separate reaction. The test DNA strand produced is biotinylated and will be attached to a streptavidin support. Homoduplex formation is prevented by heating and removal of the complementary strand. The reference preparation is now combined with the single stranded test preparation and annealed to produce heteroduplexes (FIG. 18). This heteroduplex is likely to contain a number of mismatches. Residual DNA is washed away before the addition of the mismatch endonuclease, which, if there is a mismatch every 1 kb, would produce around 10 fragments for a 10 kb primer extension. After cleavage, each fragment can bind an adapter at each end and enter the mismatch-fragment circle selection process (FIG. 20).
  • 4.2.2. Method II: Capture of Mismatch Cleaved DNA from Large Genomic Fragments
  • The 5-10 kb genomic fragments prepared from large genomic fragments in section 4.1.2 will be biotinylated by the addition of a biotinylated dideoxy nucleotide at the 3-prime end with terminal transferase and excess biotinylated nucleotide will be removed by filtration. A reference BAC clone that covers the same region of sequence will be digested with the same six-base cutter to match the fragments generated from the test DNA. The biotinylated genomic fragments will be heat denatured in the presence of the BAC reference DNA and slowly annealed to generate biotinylated heterohybrids (FIG. 21). The reference BAC DNA is in large excess to the genomic DNA so the majority of biotinylated products will be heteroduplexes. The biotinylated DNA can then be attached to the surface for removal of the reference DNA. Residual DNA is washed away before the addition of the mismatch endonuclease. After cleavage, each fragment can bind an adapter at each end and enter the mismatch circle selection process as outlined in FIG. 13 and section 4.3.2.
  • In addition, it may be possible to use mismatch cleavage of DNA nanoball probes and hybridized target to identify single base mutations. Cleaved mismatch hybrids could be identified through detection of the newly formed DNA ends at the cleavage site by end specific labeling.
  • 4.3. Circle Formation from Mismatch Cleavage Products
  • 4.3.1. Method I
  • The heteroduplexes generated in section 4.2.1. can be used for selection of small DNA circles. In this process the sample is treated with the mismatch enzyme to create products cleaved on both strands surrounding the mutation site (FIG. 20). T7 endonuclease I or similar enzyme will cleave 5-prime of the mutation site to reveal a 5-prime overhang of varying length on both strands surrounding the mutation. The next phase is to capture the cleaved products into a form (see Panel A) suitable for amplification and sequencing. An adapter is ligated to the overhang produced by the mismatch cutting, but because the nature of the overhang is unknown, at least three adapters will be needed and each adapter will be synthesized with degenerate bases to accommodate all possible ends. The adapter can be prepared with an internal biotin on the non-circularizing strand to allow capture for buffer exchange and sample cleanup, and also for direct amplification on the surface if desired.
  • Because the intervening sequence between mutations does not need to be sequenced and reduces the sequencing capacity of the system, it will be removed when studying genomic derived samples. Reduction of sequence complexity will utilize a type IIs enzyme that cuts the DNA at a point away from the enzyme recognition sequence. In doing so, the cut site and resultant overhangs will be a combination of all base variants. A possible enzyme to use in this case is MmeI (20 bases with 2 base 3′ overhang) or EcoP15I (with 25 bases and 2 base 5′ overhang). The adapter will be about 50 bp in length to provide sequences for initiation of rolling circle amplification and also provide stuffer sequence for circle formation. Once the adapter has been ligated to the fragment, the DNA is digested with the type IIs restriction enzyme to release all but 20-25 bases of sequence containing the mutation site that remains attached to the adapter.
  • The adapter (A)-DNA fragment can now be attached to a streptavidin support for removal of excess fragment DNA. Excess adapter that did not ligate to mismatch cleaved ends will also bind to the streptavidin solid support. The new degenerate end created by the type IIs enzyme can now be ligated to adapter B through the phosphorylation of one strand of adapter B. The other strand is non-phosphorylated and blocked at the 3-prime end with a dideoxy nucleotide. The structure formed is essentially the genomic fragment of interest captured between two different adapters. To create a circle from this structure would simply require both ends of the molecule coming together and ligating. Although this event should happen efficiently, there is also the possibility that the end of an alternative molecule could ligate at the other end of the molecule, creating a dimer molecule or greater multiples of each unit molecule. One way to minimize this is to perform the ligation under dilute conditions so that intra-molecular ligation is favored, then re-concentrate the sample for future steps. An alternative and preferred strategy to maximize the efficiency of circle formation without inter-molecular ligation events occurring is to block excess (A) adapters on the surface. This can be achieved by using lambda exonuclease to digest the lower strand. If adapter B has been attached then it will be protected from digestion because there is no 5-prime phosphate available. If only adapter A is attached to the surface then the 5-prime phosphate is exposed for degradation of the lower strand of adapter A. This will lead to loss of excess adapter A from the surface.
  • After lambda exonuclease treatment the 5-prime end of the top strand of adapter A is prepared for ligation to the 3-prime end of adapter B. This can be achieved by introducing a restriction enzyme site into the adapters so that re-circularization of the molecule can occur with ligation.
  • Amplification of DNA captured into the circular molecules proceeds by a rolling circle amplification to form long linear concatemer copies of the circle. If extension initiates 5-prime of the biotin, the circle and newly synthesized strand is released into solution. Complementary oligonucleotides on the surface are responsible for condensation and provide sufficient attachment for downstream applications. One strand is a closed circle and acts as the template. The other strand, with an exposed 3-prime end, acts as an initiating primer and is extended.
  • 4.3.2. Method II
  • This is similar to the procedure above with the following modifications as shown in FIG. 21.
  • 1) The adapter can be prepared with a 3-prime biotin on the non-circularized strand to allow capture for buffer exchange and sample cleanup.
  • 2) Reduction of sequence complexity of the 10 kb heteroduplex fragments described in section 4.2.2 occurs through the use of 4-base cutting restriction enzymes. Use of 2 or 3 enzymes in the one reaction could reduce the genomic fragment size down to about 100 bases.
  • The adapter—DNA fragment can be attached to a streptavidin support for removal of excess fragment DNA. Excess adapter that did not ligate to mismatch cleaved ends will also bind to the streptavidin solid support. The biotinylated and phosphorylated strand can now be removed by lambda exonuclease which will degrade from the 5-prime end but leave the non-phosphorylated strand intact. To create a circle from this structure now requires both ends of the molecule coming together and ligating to form the circle.
  • Several approaches are available to form the circle using a bridging oligonucleotide. A polynucleotide can be added to the 3-prime end with terminal transferase to create a sequence for one half of the bridge to hybridize to. The other half will bind to sequences in the adapter. Alternatively, before addition of the exonuclease, an adapter can be added to the end generated by the 4-base cutter which will provide sequence for the bridge to hybridize to after removal of one strand by exonuclease. A key aspect of this selection procedure is the ability to select the strand for circularization and amplification. This ensures that only the strand with the original mutation (from the 5-prime overhang) and not the strand from the adapter is amplified. If the 3-prime recessed strand was amplified then a mismatch from the adapter could create a false base call at the site of or near to the mutation.
  • Amplification of DNA captured into the circular molecules proceeds by a rolling circle amplification to form linear concatemer copies of the circle.
  • 4.3.3. Alternative Applications of Mis-Match Derived Circles
  • The mis-match derived small circular DNA molecules may be amplified by other means such as PCR. Common primer binding sites can be incorporated into the adapter sequences. The amplified material can be used for mutation detection by methods such as Sanger sequencing or array based sequencing.
  • 4.4. Cell-Free Clonal Selection of cDNAs
  • Traditional methods of cloning have several drawbacks including the propensity of bacteria to exclude sequences from plasmid replication and the time consuming and reagent-intensive protocols required to generate clones of individual cDNA molecules. We have previously demonstrated the ability to create linear single-stranded amplifications of DNA molecules that have been closed into a circular form. These large concatemeric, linear forms arise from a single molecule and can act as efficient, isolated targets for PCR when separated into a single reaction chamber, in much the same way a bacterial colony is picked to retrieve the cDNA containing plasmid. We plan to develop this approach as a means to select cDNA clones without having to pass through a cell-based clonal selection step.
  • The first step of this procedure will involve ligating a gene specific oligonucleotide directed to the 5-prime end with a poly dA sequence for binding to the poly dT sequence of the 3-prime end of the cDNA. This oligonucleotide acts as a bridge to allow T4 DNA ligase to ligate the two ends and form a circle.
  • The second step of the reaction is to use a primer, or the bridging oligonucleotide, for a strand displacing polymerase such as Phi 29 polymerase to create a concatemer of the circle. The long linear molecules will then be diluted and arrayed in 1536-well plates such that wells with single molecules can be selected. To ensure that about 10% of the wells contain 1 molecule, approximately 90% would have to be sacrificed as having no molecules. To detect the wells that are positive, we plan to hybridize a dendrimer that recognizes a universal sequence in the target to generate 10K-100K dye molecules per molecule of target. Excess dendrimer could be removed through hybridization to biotinylated capture oligos. The wells will be analyzed with a fluorescent plate reader and the presence of DNA scored. Positive wells will then be re-arrayed to consolidate the clones into plates with complete wells for further amplification.
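  • The dilution figures above can be checked with a short calculation. The sketch below assumes that molecules fall into wells independently, so that the number of molecules per well follows a Poisson distribution; this distributional assumption and the function names are illustrative and are not stated in the text.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k molecules in a well when the mean load is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def well_occupancy(lam: float) -> dict:
    """Fractions of wells that are empty, hold one molecule, or hold several."""
    p0 = poisson_pmf(0, lam)
    p1 = poisson_pmf(1, lam)
    return {"empty": p0, "single": p1, "multiple": 1.0 - p0 - p1}

# Choose the mean load so that about 90% of wells are empty (assumed target).
lam = -math.log(0.90)
occ = well_occupancy(lam)

# Expected number of single-molecule wells on one 1536-well plate.
expected_single_wells = 1536 * occ["single"]
```

  • Under this assumption, a load that leaves 90% of wells empty puts a single molecule in a little under 10% of wells and more than one molecule in well under 1%, which is consistent with the 10%/90% trade-off described above.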
  • 4.5. Exon Profiling Using Probe Pools
  • 4.5.1. Process Overview
  • The challenge in splice variant profiling remains finding technologies that are able to probe the presence of exons on separated cDNA molecules efficiently and rapidly. The system proposed in this project allows millions of individual cDNA molecules to be arrayed and probed in parallel. Together with a carefully designed pooling scheme of short probes from a universal set, high-throughput and low cost characterization of the splicing pattern of the whole transcript should be achieved. The main steps in the proposed process are:
      • Prepare full length first strand cDNA for targeted or all mRNAs
      • Circularize the generated full length (or all) first strand cDNA molecules by incorporating an adapter sequence;
      • Using a primer complementary to the adapter sequence, perform rolling circle replication (RCR) of cDNA circles to form concatemers with over 100 copies of the initial cDNA
      • Prepare random arrays by attaching RCR produced “cDNA balls” to a glass surface coated with capture oligonucleotide complementary to a portion of the adapter sequence; with an advanced submicron patterned surface one mm² can have between 1 and 10 million cDNA spots; note that the attachment is a molecular process and does not require robotic spotting of individual “cDNA balls”.
      • Starting from pre-made universal libraries of 4096 6-mers and 1024 labeled 5-mers, use a sophisticated computer program and a simple robotic pipettor to create 40-80 pools of about 200 6-mers and 20 5-mers for testing all 10,000 or more exons in 1000 or more targeted genes, up to all known genes, in the sample organism/tissue.
      • In a 4-8 hour process, hybridize/ligate all probe pools in 40-80 cycles on the same random array using an automated microscope-like instrument with a sensitive 10-mega pixel CCD detector for generating an array image for each cycle.
      • Use a sophisticated computer program to perform spot signal intensity analysis to identify which cDNA is on which spot, and if any of the expected exons is missing in any of the analyzed genes. Obtain exact expression levels for each splice variant by counting occurrences in the array.
  • 4.5.2. Advantages of Studying Alternative Splicing Using Random Arrays
  • This system provides a complete analysis of the exon pattern on a single transcript, instead of merely providing information on the ratios of exon usage or quantification of splicing events over the entire population of transcribed genes using the current expression arrays hybridized with labeled mRNA/cDNA. At the maximum limit of its sensitivity, it should be able to allow a detailed analysis down to a single molecule of an mRNA type present in only one in hundreds of other cells; this would provide unique potential for early diagnosis of cancer cells.
  • The combination of selective cDNA preparation with an “array of random arrays” in a standard 384-well format and with “smart” pools of universal short probes provides great flexibility in designing assays; for example, deep analysis of a small number of genes in selected samples, or more general analysis in a larger number of samples, or analysis of a large number of genes in a smaller number of samples.
  • The analysis provides simultaneously 1) detection of each specific splice variant, 2) quantification of expression of wild type and alternatively spliced mRNAs. It can also be used to monitor gross chromosomal alterations based on the detection of gene deletions and gene translocations by loss of heterozygosity and presence of two sub-sets of exons from two genes in the same transcript on a single spot on the random array.
  • The exceptional capacity and informativeness of this assay is coupled with simple sample preparation from very small quantities of mRNA, fully-automated assay based on all pre-made, validated reagents including libraries of universal labeled and unlabeled probes and primers/adapters that will be ultimately developed for all human and model organism genes.
  • The proposed splice variant profiling process is equivalent to high throughput sequencing of individual full length cDNA clones; rSBH throughput can reach one billion cDNA molecules profiled in a 4-8 hour assay.
  • This system will provide a powerful tool to monitor changes in expression levels of various splice variants during disease emergence and progression. It can enable discovery of novel splice variants or validate known splice variants to serve as biomarkers to monitor cancer progression. It can also provide means to further understanding the roles of alternative splice variants and their possible uses as therapeutic targets. Universal nature and flexibility of this low cost and high throughput assay provides great commercial opportunities for cancer research and diagnostics and in all other biomedical areas. This high capacity system is ideal for service providing labs or companies.
  • 4.5.3. Preparation of Templates for In Vitro Transcription
  • Exon sequences will be cloned into the multiple cloning sites (MCS) of plasmid pBluescript. For the purposes of demonstrating the usefulness of the probe pools, it is not necessary to clone the contiguous full-length sequence, nor to maintain the proper protein coding frame. For genes that are shorter than 1 kb, it should not be difficult to generate PCR products from cDNA using gene specific oligos for the full length sequence. For longer genes, the easiest approach would be to generate PCR products of about 500 bp corresponding to contiguous blocks of exons and order the fragments by cloning into appropriate cloning sites in the MCS of pBluescript. This will also be the approach for cloning the alternative spliced versions, since the desired variant might not be present in the cDNA source used for PCR.
  • The last site of the MCS will be used to insert a string of 40 A's to simulate the polyA tails of cellular mRNA. This is to control for the possibility that the polyA tail might interfere with the sample preparation step described below, although it is not expected to be a problem since a poly-dA tail is actually incorporated into our standard methods for the sample preparation of genomic fragments as described in section C.
  • Generation of in vitro transcripts will be straightforward. The plasmid will be linearized, T7 RNA polymerase will be used to generate the run-off transcripts and the RNA generated will be purified with the standard methods.
  • 4.5.4. Preparation of Samples for Arraying
  • Because the probe pools are designed for specific genes, cDNA will be prepared for those specific genes only. For priming the reverse transcription reactions, gene-specific primers will be used, therefore for 1000 genes, 1000 primers will be used.
  • The location of the priming site for the reverse transcription will be selected with care, since it is not reasonable to expect the synthesis of cDNA >2 kb to be of high efficiency. It is quite common that the last exon would consist of the end of the coding sequence and a long 3′ untranslated region. In the case of CD44 for example, although the full-length mRNA is about 5.7 kb, the 3′ UTR comprises 3 kb, while the coding region is only 2.2 kb. Therefore the logical location of the reverse transcription primer site would be immediately downstream of the end of the coding sequence. For some splice variants, the alternative exons are often clustered together as a block to create a region of variability. In the case of Tenascin C variants (8.5 kb), the most common isoform has a block of 8 extra exons, and there is evidence to suggest that there is variability in exon usage in that region (12). So for Tenascin C, the primer will be located just downstream of that region. Because of the concern of synthesizing cDNA with length >2 kb, for long genes, it might be necessary to divide the exons into blocks of 2 kb with multiple primers. Even though we will lose information on correlating splice events that are apart on the same transcripts, it is still better than generating biases of over-representing 3′ exons.
  • There are many off-the-shelf reagents for the reverse transcription reactions, the SuperScript III system from Invitrogen (Carlsbad, Calif.) and the StrataScript system from Stratagene (La Jolla, Calif.) being two of them. Once single stranded cDNA molecules are produced, the rest of the procedure involves putting on the adaptor sequence, circularization of the molecule and RCR. All of these have been extensively tested in previous development for rSBH processes and will follow protocols developed and described in earlier sections. The 5′ ends of the cDNAs are basically the incorporated gene-specific primers used for initiating the reverse transcription. By incorporating a 7 base universal tag on the 5′ end of the reverse-transcription priming oligos, all the cDNA generated will carry the same 7 base sequence at the 5′ end. Thus a single template oligo that is complementary to both the adaptor sequence and the universal tag can be used to ligate the adaptor to all the target molecules, without using the template oligo with degenerate bases. As for the 3′ end of the cDNA (5′ end of the mRNA), which is usually ill-defined, it will be treated like a random sequence end of a genomic fragment. Similar methods of adding a polyA tail will be applied, thus the same circle closing reaction will also be used.
  • Reverse transcriptases are prone to terminate prematurely to create truncated cDNAs. Severely truncated cDNAs probably will not have enough probe binding sites to be identified with a gene assignment, and thus would not be analyzed. Those cDNA molecules that are close to, but not quite, full-length will show up as splice variants with missing 5′ exons; if there is no corroborating evidence from sequence databases to support such variants, they will be discounted. A way to avoid this problem is to select for only the full-length cDNA (or those with the desired 3′ end) to be compatible with the circle closing reaction, so that truncated molecules will be neither circularized nor replicated. First a dideoxy-cytosine residue can be added to the 3′ end of all the cDNA to block ligation, then by using a mismatch oligo targeting the desired sequence, a new 3′ end can be generated by enzyme mismatch cleavage using T4 endonuclease VII (13, 14). With the new 3′ end, the cDNA can proceed with the addition of a poly-dA tail and with the standard protocols of circularization and replication.
  • The rolling circle replication will initiate from an oligo priming at the adaptor sequence, and the replication around the circular cDNA molecule will be carried out by Phi29 polymerase, whose high processivity allows many tandem copies to be made from circular templates.
  • 4.5.5. Smart Pooling Scheme for Exon Probes
  • Theoretically, to probe for 10000 exons from 1000 genes on a single array would require 10000 specific probes and 10000 cycles of hybridization. However, through a combination of the use of combinatorial probe ligation techniques developed for the HyChip platform, and a judicious pooling scheme of the probes, the number of oligo probes actually required is significantly less, while the number of hybridization cycles required would be less than 40.
  • The exon probe will actually consist of a pair of oligos ligated together upon hybridization to the target (see description on combinatorial probe ligation chemistry). One of the pair will be selected from a library of 4096 6-mer oligos and the other will be from a library of 1024 TAMRA-labeled 5-mer oligos.
  • A software program can be developed for preparing optimized pools of 6-mer and 5-mer probes for a given set of 1000 genes and about 10,000 exons. The goal is to keep the number of individual probes in a pool that will detect 500 genes and one exon per gene to be less than 200. The algorithm will consist of two main steps:
  • Step 1: Select the 1000-2000 shortest exons (total about 20-50 kb), and find matching sequences for each of the 1024 available labeled 5-mers. On average each 5-mer will occur 20 times over 20 kb, but some may occur over 50 or over 100 times. By selecting the most frequent 5-mer, the largest number of short exons will be detected with the single labeled probe. A goal would be to detect about 50-100 short exons (10%-20% of 500 exons) per cycle. Thus fewer than 10 labeled probes and 50-100 unlabeled 6-mers would be sufficient. A small number of labeled probes is favorable because it minimizes overall fluorescent background.
  • Step 2: Find all 6-mers that are contiguous with all sites in all 1000 genes that are complementary to the 10 selected 5-mers. On average 20 such sites will exist in each 2 kb gene. The total number of sites would be about 20,000, i.e., each 6-mer on average will occur 5 times. Sort the 6-mers by hit frequency. The most frequent may have over 20 hits, i.e., such a 6-mer will detect 20 genes through combinations with the 10 labeled probes. Thus, to get a single probe pair for each of the 500 genes, a minimum of 25 6-mer probes would be required. Realistically, 100 to 200 6-mers may be required.
  • Due to the benefits of combinatorial ligation that uses pre-made libraries of 6-mer and 5-mer probes, we can quickly prepare 40 probe pools with about 200 probes per pool using established pipetting robotics. Because the information generated is equivalent to having over 3 probes per exon (see later), the use of 8000 5-mers and 6-mers effectively replaces the 30,000 longer exon-specific probes required for a single set of 1000 genes. Universal short probe libraries would be sufficient to prepare pool sets for hundreds of projects examining a different specific selection of a 1000-gene set in thousands of samples.
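  • The two steps can be illustrated with a small greedy selection sketch. It is not the software program contemplated above: the function names and toy sequences are hypothetical, probes are matched directly against one strand rather than by complementarity, and the "contiguous" 6-mer is assumed to be the six bases immediately 5′ of each 5-mer site.

```python
from collections import defaultdict

def kmer_sites(seq, k):
    """Yield (position, k-mer) for every k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]

def greedy_cover(hits, max_probes):
    """Repeatedly pick the probe that detects the most not-yet-covered targets."""
    chosen, covered = [], set()
    while len(chosen) < max_probes:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(hits), key=lambda m: len(hits[m] - covered), default=None)
        if best is None or not hits[best] - covered:
            break
        chosen.append(best)
        covered |= hits[best]
    return chosen, covered

def pick_labeled_5mers(exons, max_probes):
    """Step 1: choose the 5-mers that occur in the largest number of short exons."""
    hits = defaultdict(set)                  # 5-mer -> exon ids containing it
    for exon_id, seq in exons.items():
        for _, kmer in kmer_sites(seq, 5):
            hits[kmer].add(exon_id)
    return greedy_cover(hits, max_probes)

def pick_6mers(genes, five_mers, max_probes):
    """Step 2: rank 6-mers adjacent to the chosen 5-mer sites by genes detected."""
    wanted = set(five_mers)
    hits = defaultdict(set)                  # 6-mer -> gene ids it can detect
    for gene_id, seq in genes.items():
        for pos, kmer in kmer_sites(seq, 5):
            if kmer in wanted and pos >= 6:
                hits[seq[pos - 6:pos]].add(gene_id)
    return greedy_cover(hits, max_probes)

# Toy data (hypothetical sequences, far shorter than real exons or genes).
exons = {"e1": "ACGTACGTAC", "e2": "TTACGTAGG", "e3": "GGGGGCCCCC"}
genes = {"g1": "AAAAAAACGTATT", "g2": "CCAAAAAAACGTAGG", "g3": "TTTTTTGGGGG"}

labeled, exons_hit = pick_labeled_5mers(exons, max_probes=1)
unlabeled, genes_hit = pick_6mers(genes, labeled, max_probes=5)
```

  • In this toy case one labeled 5-mer detects two of the three exons, and a single 6-mer adjacent to its sites detects two genes, which mirrors on a small scale how a few frequent short probes can cover many targets.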
  • 4.5.6. Exon Profiling
  • The profiling of exons can be performed in two phases: the gene identification phase and the exon identification phase. In the gene identification phase, each concatemer on the array can be uniquely identified with a particular gene. In theory, 10 probe pools or hybridization cycles will be enough to identify 1000 genes using the following scheme. Each gene is assigned a unique binary code. The number of binary digits thus depends on the total number of genes: 3 digits for 8 genes, 10 digits for 1024 genes. Each probe pool is designed to correspond to a digit of the binary code and would contain probes that would hit a unique combination of half of the genes and one hit per gene only. Thus for each hybridization cycle, a unique half of the genes will score a 1 for that digit and the other half will score zero. Ten hybridization cycles with 10 probe pools will generate 1024 unique binary codes, enough to assign 1000 unique genes to all the concatemers on the array. To provide redundancy in the identification data, 15-20 cycles would be used. If 20 cycles are used, it would provide 1 million unique binary codes and there should be enough information to account for loss of signals due to missing exons or gene deletions. It will also be equivalent to having 10 data points per gene (20 cycles of 500 data points each give 10,000 data points total), or one positive probe-pair per exon, on average. At this point, after 20 cycles, this system is capable of assigning 1 million unique gene identities to the ampliots. Therefore, by counting gene identities of the ampliots, one can determine quantitatively the expression level of all the genes (but not sub-typing of splice variants) in any given sample.
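  • The binary coding scheme can be sketched as follows. This is a minimal illustration under stated assumptions: genes are numbered sequentially from 1 so that no gene receives the all-zero code (which would look like an empty spot), signals are treated as error-free 0/1 calls, and the function names are hypothetical. The requirement that each pool contain real probes giving exactly one hit per gene is not modeled.

```python
def gene_codes(genes, n_cycles):
    """Give each gene a distinct n_cycles-bit code, skipping the all-zero code."""
    if len(genes) > 2 ** n_cycles - 1:
        raise ValueError("not enough cycles to give every gene a unique code")
    return {
        gene: tuple(((index + 1) >> bit) & 1 for bit in range(n_cycles))
        for index, gene in enumerate(genes)
    }

def pools_from_codes(codes, n_cycles):
    """The pool for cycle c targets every gene whose code has a 1 at digit c."""
    return [{g for g, code in codes.items() if code[c]} for c in range(n_cycles)]

def decode_spot(signals, codes):
    """Map the per-cycle 0/1 signals of one spot to a gene, or None if unknown."""
    lookup = {code: gene for gene, code in codes.items()}
    return lookup.get(tuple(signals))

# 1000 hypothetical gene names identified in 10 cycles.
genes = ["gene%d" % i for i in range(1000)]
codes = gene_codes(genes, n_cycles=10)
pools = pools_from_codes(codes, n_cycles=10)

spot_signals = codes["gene42"]            # what a gene42 concatemer would show
identified = decode_spot(spot_signals, codes)
```

  • With 20 cycles the same function has 2**20 = 1,048,576 codes available, so only a small fraction would be used for 1000 genes; the unused codes are what would allow codes to be chosen far enough apart to tolerate a lost signal.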
  • After identifying each ampliot with a gene assignment, its exon pattern will be profiled in the exon identification phase. For the exon identification phase, one exon per gene in all or most of the genes is tested per hybridization cycle. In most cases 10-20 exon identification cycles should be sufficient. Thus, in the case of using 20 exon identification cycles we will obtain information of 2 probes per each of 10 exons in each gene. For genes with more than 20 exons, methods can be developed so that 2 exons per gene can be probed at the same cycle. One possibility is using multiple fluorophores of different colors, and another possibility is to exploit differential hybrid stabilities of different ligation probe pairs.
  • In conclusion, a total of about 40 assay cycles will provide sufficient information to obtain gene identity at each spot and to provide three matching probe-pairs for each of 10,000 exons with enough informational redundancy to provide accurate identification of missing exons due to alternative splicing or chromosomal deletions.
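  • Once each spot carries a gene assignment and a list of detected exons, expression levels per splice variant follow from counting. The sketch below assumes idealized, already-decoded calls per spot; the gene and exon labels are placeholders, and the redundancy checks over multiple probe-pairs described above are not modeled.

```python
from collections import Counter

def call_variant(gene, detected_exons, expected_exons):
    """Describe one spot as (gene, tuple of expected exons that were not seen)."""
    missing = tuple(e for e in expected_exons[gene] if e not in detected_exons)
    return gene, missing

def variant_counts(spots, expected_exons):
    """Count how many spots on the array show each (gene, missing exons) pattern."""
    return Counter(call_variant(g, d, expected_exons) for g, d in spots)

# Placeholder gene model and three decoded spots.
expected = {"geneA": ["exon1", "exon2", "exon3"]}
spots = [
    ("geneA", {"exon1", "exon2", "exon3"}),   # full-length form
    ("geneA", {"exon1", "exon3"}),            # exon2 skipped
    ("geneA", {"exon1", "exon3"}),            # exon2 skipped
]
counts = variant_counts(spots, expected)
```

  • Each key of the resulting counter is one observed splice pattern, and its count is the number of individual cDNA molecules on the array showing that pattern.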
  • 5. LITERATURE CITED
    • 1. Dahl, F., Baner, J., Gullberg, M., Mendel-Hartvig, M., Landegren, U., and Nilsson, M. 2004. Circle-to-circle amplification for precise and sensitive DNA analysis. Proc Natl Acad Sci USA 101:4548-4553.
    • 2. Kwak, S. K., Lee, G. S., Ahn, D. J., and Choi, J. W. 2004. Pattern formation of cytochrome c by microcontact printing and dip-pen nanolithography. Materials Science and Engineering. C 24:151-155.
    • 3. Drmanac, S., Stavropoulos, N. A., Labat, I., Vonau, J., Hauser, B., Soares, M. B., and Drmanac, R. 1996. Gene-representing cDNA clusters defined by hybridization of 57,419 clones from infant brain libraries with short oligonucleotide probes. Genomics 37:29-40.
    • 4. Tokunaga, M., Kitamura, K., Saito, K., Iwane, A. H., and Yanagida, T. 1997. Single molecule imaging of fluorophores and enzymatic reactions achieved by objective-type total internal reflection fluorescence microscopy. Biochem Biophys Res Commun 235:47-53.
    • 5. Blanco, L., Bernad, A., Lazaro, J. M., Martin, G., Garmendia, C., and Salas, M. 1989. Highly efficient DNA synthesis by the phage phi 29 DNA polymerase. Symmetrical mode of DNA replication. J Biol Chem 264:8935-8940.
    • 6. Youil, R., Kemper, B. W., and Cotton, R. G. 1995. Screening for mutations by enzyme mismatch cleavage with T4 endonuclease VII. Proc Natl Acad Sci USA 92:87-91.
    • 7. Mashal, R. D., Koontz, J., and Sklar, J. 1995. Detection of mutations by cleavage of DNA heteroduplexes with bacteriophage resolvases. Nat Genet 9:177-183.
    • 8. Babon, J. J., McKenzie, M., and Cotton, R. G. 2003. The use of resolvases T4 endonuclease VII and T7 endonuclease I in mutation detection. Mol Biotechnol 23:73-81.
    • 9. Rossetti, S., Englisch, S., Bresin, E., Pignatti, P. F., and Turco, A. E. 1997. Detection of mutations in human genes by a new rapid method: cleavage fragment length polymorphism analysis (CFLPA). Mol Cell Probes 11:155-160.
    • 10. Ellis, L. A., Taylor, G. R., Banks, R., and Baumberg, S. 1994. MutS binding protects heteroduplex DNA from exonuclease digestion in vitro: a simple method for detecting mutations. Nucleic Acids Res 22:2710-2711.
    • 11. Golz, S., Greger, B., and Kemper, B. 1998. Enzymatic mutation detection. Phosphate ions increase incision efficiency of endonuclease VII at a variety of damage sites in DNA. Mutat Res 382:85-92.
    • 12. Dueck, M., Riedl, S., Hinz, U., Tandara, A., Moller, P., Herfarth, C., and Faissner, A. 1999. Detection of tenascin-C isoforms in colorectal mucosa, ulcerative colitis, carcinomas and liver metastases. Int J Cancer. 477-483.
    • 13. Youil, R., Kemper, B. W., and Cotton, R. G. 1995. Screening for mutations by enzyme mismatch cleavage with T4 endonuclease VII. Proc Natl Acad Sci USA 92:87-91.
    • 14. Mashal, R. D., Koontz, J., and Sklar, J. 1995. Detection of mutations by cleavage of DNA heteroduplexes with bacteriophage resolvases. Nat Genet 9:177-183.
    Definitions
  • Terms and symbols of nucleic acid chemistry, biochemistry, genetics, and molecular biology used herein follow those of standard treatises and texts in the field, e.g. Kornberg and Baker, DNA Replication, Second Edition (W.H. Freeman, New York, 1992); Lehninger, Biochemistry, Second Edition (Worth Publishers, New York, 1975); Strachan and Read, Human Molecular Genetics, Second Edition (Wiley-Liss, New York, 1999); Eckstein, editor, Oligonucleotides and Analogs: A Practical Approach (Oxford University Press, New York, 1991); Gait, editor, Oligonucleotide Synthesis: A Practical Approach (IRL Press, Oxford, 1984); and the like.
  • “Amplicon” means the product of a polynucleotide amplification reaction. That is, it is a population of polynucleotides, usually double stranded, that are replicated from one or more starting sequences. The one or more starting sequences may be one or more copies of the same sequence, or it may be a mixture of different sequences. Amplicons may be produced by a variety of amplification reactions whose products are multiple replicates of one or more target nucleic acids. Generally, amplification reactions producing amplicons are “template-driven” in that base pairing of reactants, either nucleotides or oligonucleotides, have complements in a template polynucleotide that are required for the creation of reaction products. In one aspect, template-driven reactions are primer extensions with a nucleic acid polymerase or oligonucleotide ligations with a nucleic acid ligase. Such reactions include, but are not limited to, polymerase chain reactions (PCRs), linear polymerase reactions, nucleic acid sequence-based amplification (NASBAs), rolling circle amplifications, and the like, disclosed in the following references that are incorporated herein by reference: Mullis et al, U.S. Pat. Nos. 4,683,195; 4,965,188; 4,683,202; 4,800,159 (PCR); Gelfand et al, U.S. Pat. No. 5,210,015 (real-time PCR with “taqman” probes); Wittwer et al, U.S. Pat. No. 6,174,670; Kacian et al, U.S. Pat. No. 5,399,491 (“NASBA”); Lizardi, U.S. Pat. No. 5,854,033; Aono et al, Japanese patent publ. JP 4-262799 (rolling circle amplification); and the like. In one aspect, amplicons of the invention are produced by PCRs. An amplification reaction may be a “real-time” amplification if a detection chemistry is available that permits a reaction product to be measured as the amplification reaction progresses, e.g. “real-time PCR” described below, or “real-time NASBA” as described in Leone et al, Nucleic Acids Research, 26: 2150-2155 (1998), and like references. 
As used herein, the term “amplifying” means performing an amplification reaction. A “reaction mixture” means a solution containing all the necessary reactants for performing a reaction, which may include, but not be limited to, buffering agents to maintain pH at a selected level during a reaction, salts, co-factors, scavengers, and the like.
  • “Complementary or substantially complementary” refers to the hybridization or base pairing or the formation of a duplex between nucleotides or nucleic acids, such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single stranded nucleic acid. Complementary nucleotides are, generally, A and T (or A and U), or C and G. Two single stranded RNA or DNA molecules are said to be substantially complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand, usually at least about 90% to 95%, and more preferably from about 98 to 100%. Alternatively, substantial complementarity exists when an RNA or DNA strand will hybridize under selective hybridization conditions to its complement. Typically, selective hybridization will occur when there is at least about 65% complementary over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementary. See, M. Kanehisa Nucleic Acids Res. 12:203 (1984), incorporated herein by reference.
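The percentage thresholds in the definition above can be illustrated with a short calculation. The following is a minimal sketch only, not part of the disclosure: it assumes two strands of equal length compared without gaps, whereas the definition also permits optimal alignment with insertions or deletions. The function names and the default 80% threshold parameter are hypothetical choices made for this illustration.

```python
# Illustrative sketch: ungapped complementarity between two equal-length strands.
# Both strands are written 5'->3'; the second is reversed so that the
# comparison is antiparallel, as in a duplex.
WATSON_CRICK = {"A": "T", "T": "A", "C": "G", "G": "C"}


def percent_complementary(strand_a: str, strand_b: str) -> float:
    """Return the percentage of positions in strand_a that pair with strand_b."""
    # Treat U as T so that RNA strands can be compared with the same table.
    a = strand_a.upper().replace("U", "T")
    b = strand_b.upper().replace("U", "T")[::-1]
    if not a or len(a) != len(b):
        raise ValueError("this sketch requires two non-empty strands of equal length")
    paired = sum(1 for x, y in zip(a, b) if WATSON_CRICK.get(x) == y)
    return 100.0 * paired / len(a)


def is_substantially_complementary(strand_a: str, strand_b: str,
                                   threshold: float = 80.0) -> bool:
    """Apply the 'at least about 80%' criterion (threshold is an assumption)."""
    return percent_complementary(strand_a, strand_b) >= threshold
```

For example, `percent_complementary("ATGCCTG", "CAGGCAT")` evaluates a strand against its exact reverse complement, while changing one base of the second strand lowers the result to six paired positions out of seven.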
  • “Duplex” means that at least two oligonucleotides and/or polynucleotides that are fully or partially complementary undergo Watson-Crick type base pairing among all or most of their nucleotides so that a stable complex is formed. The terms “annealing” and “hybridization” are used interchangeably to mean the formation of a stable duplex. “Perfectly matched” in reference to a duplex means that the poly- or oligonucleotide strands making up the duplex form a double stranded structure with one another such that every nucleotide in each strand undergoes Watson-Crick base pairing with a nucleotide in the other strand. The term “duplex” comprehends the pairing of nucleoside analogs, such as deoxyinosine, nucleosides with 2-aminopurine bases, PNAs, and the like, that may be employed. A “mismatch” in a duplex between two oligonucleotides or polynucleotides means that a pair of nucleotides in the duplex fails to undergo Watson-Crick bonding.
  • “Genetic locus,” or “locus” in reference to a genome or target polynucleotide, means a contiguous subregion or segment of the genome or target polynucleotide. As used herein, genetic locus, or locus, may refer to the position of a nucleotide, a gene, or a portion of a gene in a genome, including mitochondrial DNA, or it may refer to any contiguous portion of genomic sequence whether or not it is within, or associated with, a gene. In one aspect, a genetic locus refers to any portion of genomic sequence, including mitochondrial DNA, from a single nucleotide to a segment of few hundred nucleotides, e.g. 100-300, in length.
  • “Genetic variant” means a substitution, inversion, insertion, or deletion of one or more nucleotides at a genetic locus, or a translocation of DNA from one genetic locus to another genetic locus. In one aspect, genetic variant means an alternative nucleotide sequence at a genetic locus that may be present in a population of individuals and that includes nucleotide substitutions, insertions, and deletions with respect to other members of the population. In another aspect, insertions or deletions at a genetic locus comprise the addition or the absence of from 1 to 10 nucleotides at such locus, in comparison with the same locus in another individual of a population.
  • “Hybridization” refers to the process in which two single-stranded polynucleotides bind non-covalently to form a stable double-stranded polynucleotide. The term “hybridization” may also refer to triple-stranded hybridization. The resulting (usually) double-stranded polynucleotide is a “hybrid” or “duplex.” “Hybridization conditions” will typically include salt concentrations of less than about 1M, more usually less than about 500 mM and less than about 200 mM. A “hybridization buffer” is a buffered salt solution such as 5×SSPE, or the like. Hybridization temperatures can be as low as 5° C., but are typically greater than 22° C., more typically greater than about 30° C., and preferably in excess of about 37° C. Hybridizations are usually performed under stringent conditions, i.e. conditions under which a probe will hybridize to its target subsequence. Stringent conditions are sequence-dependent and are different in different circumstances. Longer fragments may require higher hybridization temperatures for specific hybridization. As other factors may affect the stringency of hybridization, including base composition and length of the complementary strands, presence of organic solvents and extent of base mismatching, the combination of parameters is more important than the absolute measure of any one alone. Generally, stringent conditions are selected to be about 5° C. lower than the Tm for the specific sequence at a defined ionic strength and pH. Exemplary stringent conditions include salt concentration of at least 0.01 M to no more than 1 M Na ion concentration (or other salts) at a pH 7.0 to 8.3 and a temperature of at least 25° C. For example, conditions of 5×SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30° C. are suitable for allele-specific probe hybridizations. For stringent conditions, see for example, Sambrook, Fritsch and Maniatis, “Molecular Cloning: A Laboratory Manual” 2nd Ed., Cold Spring Harbor Press (1989) and Anderson, “Nucleic Acid Hybridization” 1st Ed., BIOS Scientific Publishers Limited (1999), which are hereby incorporated by reference in their entirety for all purposes above. “Hybridizing specifically to” or “specifically hybridizing to” or like expressions refer to the binding, duplexing, or hybridizing of a molecule substantially to or only to a particular nucleotide sequence or sequences under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular DNA or RNA).
  • “Ligation” means to form a covalent bond or linkage between the termini of two or more nucleic acids, e.g. oligonucleotides and/or polynucleotides, in a template-driven reaction. The nature of the bond or linkage may vary widely and the ligation may be carried out enzymatically or chemically. As used herein, ligations are usually carried out enzymatically to form a phosphodiester linkage between a 5′ carbon of a terminal nucleotide of one oligonucleotide with 3′ carbon of another oligonucleotide. A variety of template-driven ligation reactions are described in the following references, which are incorporated by reference: Whitely et al, U.S. Pat. No. 4,883,750; Letsinger et al, U.S. Pat. No. 5,476,930; Fung et al, U.S. Pat. No. 5,593,826; Kool, U.S. Pat. No. 5,426,180; Landegren et al, U.S. Pat. No. 5,871,921; Xu and Kool, Nucleic Acids Research, 27: 875-881 (1999); Higgins et al, Methods in Enzymology, 68: 50-71 (1979); Engler et al, The Enzymes, 15: 3-29 (1982); and Namsaraev, U.S. patent publication 2004/0110213. Enzymatic ligation usually takes place in a ligase buffer, which is a buffered salt solution containing any required divalent cations, cofactors, and the like, for the particular ligase employed.
  • “Microarray” or “array” refers to a solid phase support having a surface, usually planar or substantially planar, which carries an array of sites containing nucleic acids, such that each member site of the array comprises identical copies of immobilized oligonucleotides or polynucleotides and is spatially defined and not overlapping with other member sites of the array; that is, the sites are spatially discrete. In some cases, sites of a microarray may also be spaced apart as well as discrete; that is, different sites do not share boundaries, but are separated by inter-site regions, usually free of bound nucleic acids. A spatially defined hybridization site may additionally be “addressable” in that its location and the identity of its immobilized oligonucleotide are known or predetermined, for example, prior to its use. In some aspects, the oligonucleotides or polynucleotides are single stranded and are covalently attached to the solid phase support, usually by a 5′-end or a 3′-end. In other aspects, oligonucleotides or polynucleotides are attached to the solid phase support non-covalently, e.g. by a biotin-streptavidin linkage, hybridization to a capture oligonucleotide that is covalently bound, and the like. Conventional microarray technology is reviewed in the following references: Schena, Editor, Microarrays: A Practical Approach (IRL Press, Oxford, 2000); Southern, Current Opin. Chem. Biol., 2: 404-410 (1998); Nature Genetics Supplement, 21: 1-60 (1999). As used herein, “random array” or “random microarray” refers to a microarray whose spatially discrete regions of oligonucleotides or polynucleotides are not spatially addressed. That is, the identity of the attached oligonucleotides or polynucleotides is not discernable, at least initially, from their location, but may be determined by a particular operation on the array, e.g. sequencing, hybridizing decoding probes, or the like. Random microarrays are frequently formed from a planar array of microbeads, e.g. Brenner et al, Nature Biotechnology, 18: 630-634 (2000); Tulley et al, U.S. Pat. No. 6,133,043; Stuelpnagel et al, U.S. Pat. No. 6,396,995; Chee et al, U.S. Pat. No. 6,544,732; and the like.
  • “Mismatch” means a base pair between any two of the bases A, T (or U for RNA), G, and C other than the Watson-Crick base pairs G-C and A-T. The eight possible mismatches are A-A, T-T, G-G, C-C, T-G, C-A, T-C, and A-G.
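The count of eight mismatches stated above can be checked by enumeration. The sketch below is illustrative only and assumes DNA bases (with U substituting for T in RNA); it lists every unordered pairing of the four bases and discards the two Watson-Crick pairs. The function name is a hypothetical choice for this example.

```python
from itertools import combinations_with_replacement

# The four bases considered in the definition (T stands in for U in RNA).
BASES = ("A", "T", "G", "C")

# The two Watson-Crick base pairs, stored as unordered sets.
WATSON_CRICK_PAIRS = {frozenset("GC"), frozenset("AT")}


def enumerate_mismatches() -> list[str]:
    """List every unordered base pairing that is not a Watson-Crick pair."""
    return [
        f"{x}-{y}"
        for x, y in combinations_with_replacement(BASES, 2)
        if frozenset((x, y)) not in WATSON_CRICK_PAIRS
    ]
```

Four bases give ten unordered pairings (four self-pairings plus six mixed pairings); removing G-C and A-T leaves the eight mismatches listed in the definition.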
  • “Mutation” and “polymorphism” are usually used somewhat interchangeably to mean a DNA molecule, such as a gene, that differs in nucleotide sequence from a reference DNA sequence, or wild type sequence, or normal tissue sequence, by one or more bases, insertions, and/or deletions. In some contexts, the usage of Cotton (Mutation Detection, Oxford University Press, Oxford, 1997) is followed in that a mutation is understood to be any base change whether pathological to an organism or not, whereas a polymorphism is usually understood to be a base change with no direct pathological consequences.
  • “Nucleoside” as used herein includes the natural nucleosides, including 2′-deoxy and 2′-hydroxyl forms, e.g. as described in Kornberg and Baker, DNA Replication, 2nd Ed. (Freeman, San Francisco, 1992). “Analogs” in reference to nucleosides includes synthetic nucleosides having modified base moieties and/or modified sugar moieties, e.g. described by Scheit, Nucleotide Analogs (John Wiley, New York, 1980); Uhlman and Peyman, Chemical Reviews, 90: 543-584 (1990), or the like, with the proviso that they are capable of specific hybridization. Such analogs include synthetic nucleosides designed to enhance binding properties, reduce complexity, increase specificity, and the like. Polynucleotides comprising analogs with enhanced hybridization or nuclease resistance properties are described in Uhlman and Peyman (cited above); Crooke et al, Exp. Opin. Ther. Patents, 6: 855-870 (1996); Mesmaeker et al, Current Opinion in Structural Biology, 5: 343-355 (1995); and the like. Exemplary types of polynucleotides that are capable of enhancing duplex stability include oligonucleotide N3′→P5′ phosphoramidates (referred to herein as “amidates”), peptide nucleic acids (referred to herein as “PNAs”), oligo-2′-O-alkylribonucleotides, polynucleotides containing C-5 propynylpyrimidines, locked nucleic acids (LNAs), and like compounds. Such oligonucleotides are either available commercially or may be synthesized using methods described in the literature.
  • “Polymerase chain reaction,” or “PCR,” means a reaction for the in vitro amplification of specific DNA sequences by the simultaneous primer extension of complementary strands of DNA. In other words, PCR is a reaction for making multiple copies or replicates of a target nucleic acid flanked by primer binding sites, such reaction comprising one or more repetitions of the following steps: (i) denaturing the target nucleic acid, (ii) annealing primers to the primer binding sites, and (iii) extending the primers by a nucleic acid polymerase in the presence of nucleoside triphosphates. Usually, the reaction is cycled through different temperatures optimized for each step in a thermal cycler instrument. Particular temperatures, durations at each step, and rates of change between steps depend on many factors well-known to those of ordinary skill in the art, e.g. exemplified by the references: McPherson et al, editors, PCR: A Practical Approach and PCR2: A Practical Approach (IRL Press, Oxford, 1991 and 1995, respectively). For example, in a conventional PCR using Taq DNA polymerase, a double stranded target nucleic acid may be denatured at a temperature >90° C., primers annealed at a temperature in the range 50-75° C., and primers extended at a temperature in the range 72-78° C. The term “PCR” encompasses derivative forms of the reaction, including but not limited to, RT-PCR, real-time PCR, nested PCR, quantitative PCR, multiplexed PCR, and the like. Reaction volumes range from a few hundred nanoliters, e.g. 200 nL, to a few hundred μL, e.g. 200 μL. “Reverse transcription PCR,” or “RT-PCR,” means a PCR that is preceded by a reverse transcription reaction that converts a target RNA to a complementary single stranded DNA, which is then amplified, e.g. Tecott et al, U.S. Pat. No. 5,168,038, which patent is incorporated herein by reference. “Real-time PCR” means a PCR for which the amount of reaction product, i.e. amplicon, is monitored as the reaction proceeds. 
There are many forms of real-time PCR that differ mainly in the detection chemistries used for monitoring the reaction product, e.g. Gelfand et al, U.S. Pat. No. 5,210,015 (“taqman”); Wittwer et al, U.S. Pat. Nos. 6,174,670 and 6,569,627 (intercalating dyes); Tyagi et al, U.S. Pat. No. 5,925,517 (molecular beacons); which patents are incorporated herein by reference. Detection chemistries for real-time PCR are reviewed in Mackay et al, Nucleic Acids Research, 30: 1292-1305 (2002), which is also incorporated herein by reference. “Nested PCR” means a two-stage PCR wherein the amplicon of a first PCR becomes the sample for a second PCR using a new set of primers, at least one of which binds to an interior location of the first amplicon. As used herein, “initial primers” in reference to a nested amplification reaction mean the primers used to generate a first amplicon, and “secondary primers” mean the one or more primers used to generate a second, or nested, amplicon. “Multiplexed PCR” means a PCR wherein multiple target sequences (or a single target sequence and one or more reference sequences) are simultaneously amplified in the same reaction mixture, e.g. Bernard et al, Anal. Biochem., 273: 221-228 (1999) (two-color real-time PCR). Usually, distinct sets of primers are employed for each sequence being amplified.
  • “Quantitative PCR” means a PCR designed to measure the abundance of one or more specific target sequences in a sample or specimen. Quantitative PCR includes both absolute quantitation and relative quantitation of such target sequences. Quantitative measurements are made using one or more reference sequences that may be assayed separately or together with a target sequence. The reference sequence may be endogenous or exogenous to a sample or specimen, and in the latter case, may comprise one or more competitor templates. Typical endogenous reference sequences include segments of transcripts of the following genes: β-actin, GAPDH, β2-microglobulin, ribosomal RNA, and the like. Techniques for quantitative PCR are well-known to those of ordinary skill in the art, as exemplified in the following references that are incorporated by reference: Freeman et al, Biotechniques, 26: 112-126 (1999); Becker-Andre et al, Nucleic Acids Research, 17: 9437-9447 (1989); Zimmerman et al, Biotechniques, 21: 268-279 (1996); Diviacco et al, Gene, 122: 3013-3020 (1992); Becker-Andre et al, Nucleic Acids Research, 17: 9437-9446 (1989); and the like.
  • “Polynucleotide” and “oligonucleotide” are used interchangeably and each means a linear polymer of nucleotide monomers. As used herein, the terms may also refer to double stranded forms. Monomers making up polynucleotides and oligonucleotides are capable of specifically binding to a natural polynucleotide by way of a regular pattern of monomer-to-monomer interactions, such as Watson-Crick type of base pairing, base stacking, Hoogsteen or reverse Hoogsteen types of base pairing, or the like, to form duplex or triplex forms. Such monomers and their internucleosidic linkages may be naturally occurring or may be analogs thereof, e.g. naturally occurring or non-naturally occurring analogs. Non-naturally occurring analogs may include PNAs, phosphorothioate internucleosidic linkages, bases containing linking groups permitting the attachment of labels, such as fluorophores, or haptens, and the like. Whenever the use of an oligonucleotide or polynucleotide requires enzymatic processing, such as extension by a polymerase, ligation by a ligase, or the like, one of ordinary skill would understand that oligonucleotides or polynucleotides in those instances would not contain certain analogs of internucleosidic linkages, sugar moieties, or bases at any or some positions, when such analogs are incompatible with enzymatic reactions. Polynucleotides typically range in size from a few monomeric units, e.g. 5-40, when they are usually referred to as “oligonucleotides,” to several thousand monomeric units. Whenever a polynucleotide or oligonucleotide is represented by a sequence of letters (upper or lower case), such as “ATGCCTG,” it will be understood that the nucleotides are in 5′→3′ order from left to right and that “A” denotes deoxyadenosine, “C” denotes deoxycytidine, “G” denotes deoxyguanosine, “T” denotes thymidine, “I” denotes deoxyinosine, and “U” denotes uridine, unless otherwise indicated or obvious from context. 
Unless otherwise noted the terminology and atom numbering conventions will follow those disclosed in Strachan and Read, Human Molecular Genetics 2 (Wiley-Liss, New York, 1999). Usually polynucleotides comprise the four natural nucleosides (e.g. deoxyadenosine, deoxycytidine, deoxyguanosine, deoxythymidine for DNA or their ribose counterparts for RNA) linked by phosphodiester linkages; however, they may also comprise non-natural nucleotide analogs, e.g. including modified bases, sugars, or internucleosidic linkages. It is clear to those skilled in the art that where an enzyme has specific oligonucleotide or polynucleotide substrate requirements for activity, e.g. single stranded DNA, RNA/DNA duplex, or the like, then selection of appropriate composition for the oligonucleotide or polynucleotide substrates is well within the knowledge of one of ordinary skill, especially with guidance from treatises, such as Sambrook et al, Molecular Cloning, Second Edition (Cold Spring Harbor Laboratory, New York, 1989), and like references.
  • “Primer” means an oligonucleotide, either natural or synthetic, that is capable, upon forming a duplex with a polynucleotide template, of acting as a point of initiation of nucleic acid synthesis and being extended from its 3′ end along the template so that an extended duplex is formed. The sequence of nucleotides added during the extension process is determined by the sequence of the template polynucleotide. Usually primers are extended by a DNA polymerase. Primers usually have a length in the range of from 9 to 40 nucleotides, or in some embodiments, from 14 to 36 nucleotides.
  • “Readout” means a parameter, or parameters, which are measured and/or detected that can be converted to a number or value. In some contexts, readout may refer to an actual numerical representation of such collected or recorded data. For example, a readout of fluorescent intensity signals from a microarray is the position and fluorescence intensity of a signal being generated at each hybridization site of the microarray; thus, such a readout may be registered or stored in various ways, for example, as an image of the microarray, as a table of numbers, or the like.
  • “Solid support”, “support”, and “solid phase support” are used interchangeably and refer to a material or group of materials having a rigid or semi-rigid surface or surfaces. In many embodiments, at least one surface of the solid support will be substantially flat, although in some embodiments it may be desirable to physically separate synthesis regions for different compounds with, for example, wells, raised regions, pins, etched trenches, or the like. According to other embodiments, the solid support(s) will take the form of beads, resins, gels, microspheres, or other geometric configurations. Microarrays usually comprise at least one planar solid phase support, such as a glass microscope slide.
  • “Reference sequence” or “reference population” of DNA refers to individual DNA sequences or a collection of DNAs (or RNAs derived from it) which is compared to a test population of DNA or RNA, (or “test DNA sequence,” or “test DNA population”) by the formation of heteroduplexes between the complementary strands of the reference DNA population and test DNA population. If perfectly matched heteroduplexes form, then the respective members of the reference and test populations are identical; otherwise, they are variants of one another. Typically, the nucleotide sequences of members of the reference population are known and the sequences typically are listed in sequence databases, such as Genbank, Embl, or the like. In one aspect, a reference population of DNA may comprise a cDNA library or genomic library from a known cell type or tissue source. For example, a reference population of DNA may comprise a cDNA library or a genomic library derived from the tissue of a healthy individual and a test population of DNA may comprise a cDNA library or genomic library derived from the same tissue of a diseased individual. Reference populations of DNA may also comprise an assembled collection of individual polynucleotides, cDNAs, genes, or exons thereof, e.g. genes or exons encoding all or a subset of known p53 variants, genes of a signal transduction pathway, or the like.
  • “Specific” or “specificity” in reference to the binding of one molecule to another molecule, such as a labeled target sequence for a probe, means the recognition, contact, and formation of a stable complex between the two molecules, together with substantially less recognition, contact, or complex formation of that molecule with other molecules. In one aspect, “specific” in reference to the binding of a first molecule to a second molecule means that to the extent the first molecule recognizes and forms a complex with other molecules in a reaction or sample, it forms the largest number of the complexes with the second molecule. Preferably, this largest number is at least fifty percent. Generally, molecules involved in a specific binding event have areas on their surfaces or in cavities giving rise to specific recognition between the molecules binding to each other. Examples of specific binding include antibody-antigen interactions, enzyme-substrate interactions, formation of duplexes or triplexes among polynucleotides and/or oligonucleotides, receptor-ligand interactions, and the like. As used herein, “contact” in reference to specificity or specific binding means two molecules are close enough that weak noncovalent chemical interactions, such as van der Waals forces, hydrogen bonding, base-stacking interactions, ionic and hydrophobic interactions, and the like, dominate the interaction of the molecules.
  • As used herein, the term “Tm” is used in reference to the “melting temperature.” The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. Several equations for calculating the Tm of nucleic acids are well known in the art. As indicated by standard references, a simple estimate of the Tm value may be calculated by the equation Tm = 81.5 + 0.41(% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (see e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)). Other references (e.g., Allawi, H. T. & SantaLucia, J., Jr., Biochemistry 36, 10581-94 (1997)) include alternative methods of computation which take structural and environmental, as well as sequence characteristics into account for the calculation of Tm.
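The simple Tm estimate above can be worked through numerically. The following is a minimal sketch, assuming the result is in degrees Celsius and that the sequence is given as a string of A, C, G, and T letters; the function names are hypothetical. The second helper applies the guideline from the “Hybridization” definition that stringent conditions are selected about 5° C. below the Tm, with the offset exposed as a parameter because the text gives it only approximately.

```python
def gc_percent(sequence: str) -> float:
    """Return the percentage of G and C bases in the sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = seq.count("G") + seq.count("C")
    return 100.0 * gc_count / len(seq)


def estimate_tm_celsius(sequence: str) -> float:
    """Simple estimate Tm = 81.5 + 0.41 * (% G+C), stated for 1 M NaCl."""
    return 81.5 + 0.41 * gc_percent(sequence)


def stringent_temperature_celsius(sequence: str, offset: float = 5.0) -> float:
    """Temperature about `offset` degrees below the estimated Tm (assumed offset)."""
    return estimate_tm_celsius(sequence) - offset
```

This estimate depends only on base composition. It does not account for strand length, salt concentrations other than 1 M NaCl, or nearest-neighbor effects, which the alternative methods cited above are intended to address.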
  • “Sample” usually means a quantity of material from a biological, environmental, medical, or patient source in which detection, measurement, or labeling of target nucleic acids is sought. On the one hand it is meant to include a specimen or culture (e.g., microbiological cultures). On the other hand, it is meant to include both biological and environmental samples. A sample may include a specimen of synthetic origin. Biological samples may be animal, including human, fluid, solid (e.g., stool) or tissue, as well as liquid and solid food and feed products and ingredients such as dairy items, vegetables, meat and meat by-products, and waste. Biological samples may include materials taken from a patient including, but not limited to cultures, blood, saliva, cerebral spinal fluid, pleural fluid, milk, lymph, sputum, semen, needle aspirates, and the like. Biological samples may be obtained from all of the various families of domestic animals, as well as feral or wild animals, including, but not limited to, such animals as ungulates, bear, fish, rodents, etc. Environmental samples include environmental material such as surface matter, soil, water and industrial samples, as well as samples obtained from food and dairy processing instruments, apparatus, equipment, utensils, disposable and non-disposable items. These examples are not to be construed as limiting the sample types applicable to the present invention.
  • The above teachings are intended to illustrate the invention and do not by their details limit the scope of the claims of the invention. While preferred illustrative embodiments of the present invention are described, it will be apparent to one skilled in the art that various changes and modifications may be made therein without departing from the invention, and it is intended in the appended claims to cover all such changes and modifications that fall within the true spirit and scope of the invention.

Claims (20)

1-73. (canceled)
74. A random DNA array comprising
a) a support,
b) a plurality of individual DNA binding regions arranged in a pattern at predetermined locations on the support, and
c) individual DNA molecules disposed on the support and immobilized at the individual DNA binding regions,
wherein the individual DNA binding regions comprise capture oligonucleotides for immobilizing individual DNA molecules, and all of the individual DNA binding regions of the DNA array comprise the same capture oligonucleotides,
wherein the individual DNA molecules are immobilized on the individual DNA binding regions by a covalent or non-covalent interaction with the capture oligonucleotides,
wherein the individual DNA molecules each comprise a DNA target sequence,
wherein a majority of the individual DNA binding regions on the support each comprise a plurality of individual DNA molecules immobilized thereon,
wherein, in each individual DNA binding region of the majority of the individual DNA binding regions on the support, multiple individual DNA molecules of the plurality of individual DNA molecules immobilized thereon comprise the same DNA target sequence,
wherein the DNA target sequences of the multiple individual DNA molecules immobilized on the individual DNA binding regions are not known,
wherein said individual DNA binding regions of the majority of the individual DNA binding regions on the support do not all comprise the same multiple individual DNA molecules, and
wherein the random DNA array is not a bead array.
75. The random DNA array of claim 74 wherein the individual DNA molecules comprise a target sequence and an adaptor sequence, and the individual DNA molecules are immobilized on the individual DNA binding regions by base pairing between the capture oligonucleotide and the adaptor sequence.
76. The random DNA array of claim 74 wherein the capture oligonucleotides are 20 to 100 bases in length.
77. The random DNA array of claim 74 wherein the immobilized individual DNA molecules comprise two different adaptor sequences.
78. The random DNA array of claim 77 wherein the individual DNA molecules are immobilized by an association between the two different adaptor sequences and two different capture oligonucleotide sequences in the individual DNA binding regions.
79. The random DNA array of claim 74, wherein the individual DNA molecules are immobilized by a capture oligonucleotide at the 5′ terminus, wherein the capture oligonucleotide is immobilized on the individual DNA binding regions, whereby the individual DNA molecules are immobilized on the individual DNA binding regions by a covalent interaction with a capture oligonucleotide.
80. The random DNA array of claim 74, wherein the DNA target sequences are from a cDNA or genomic DNA library.
81. The random DNA array of claim 74, wherein the DNA target sequences are genomic DNA sequences.
82. The random DNA array of claim 81, wherein the DNA target sequences are human genomic sequences.
83. The random DNA array of claim 74, wherein the DNA target sequences are cDNA sequences.
84. The random DNA array of claim 83, wherein the DNA target sequences are human cDNA sequences.
85. The DNA array of claim 79 wherein at least some of the individual DNA molecules comprise one copy of the DNA target sequence.
86. The DNA array of claim 74 wherein the pattern is a rectilinear pattern.
87. The DNA array of claim 74 wherein the pattern is a grid pattern.
88. The random DNA array of claim 74 wherein said individual DNA molecules disposed on the support comprise at least 10,000 different DNA target sequences.
89. The DNA array of claim 74 wherein the individual DNA binding regions are substantially circular or square in shape so that their sizes can be indicated by a single linear dimension, which is in the range of 125-250 nm.
90. The DNA array of claim 74 wherein the individual DNA binding regions are substantially circular or square in shape so that their sizes can be indicated by a single linear dimension, which is in the range of 200-500 nm.
91. The random DNA array of claim 74 wherein said individual DNA binding regions in (b) are at a density greater than 10,000 per mm2.
92. The random DNA array of claim 76 wherein said individual DNA binding regions in (b) are at a density in the range of 1 million to 10 million per mm2.
US16/994,343 2005-06-15 2020-08-14 Dna array Abandoned US20200392574A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US16/994,343 US20200392574A1 (en) 2005-06-15 2020-08-14 Dna array
US17/522,708 US20220162694A1 (en) 2005-06-15 2021-11-09 Dna array

Applications Claiming Priority (9)

Application Number Priority Date Filing Date Title
US69077105P 2005-06-15 2005-06-15
US72511605P 2005-10-07 2005-10-07
US77641506P 2006-02-24 2006-02-24
US11/451,691 US8445194B2 (en) 2005-06-15 2006-06-13 Single molecule arrays for genetic and chemical analysis
US12/882,880 US20110071053A1 (en) 2005-06-15 2010-09-15 Single Molecule Arrays for Genetic and Chemical Analysis
US14/714,133 US9650673B2 (en) 2005-06-15 2015-05-15 Single molecule arrays for genetic and chemical analysis
US15/442,659 US10351909B2 (en) 2005-06-15 2017-02-25 DNA sequencing from high density DNA arrays using asynchronous reactions
US16/425,846 US20200115748A1 (en) 2005-06-15 2019-05-29 Single Molecule Arrays for Genetic and Chemical Analysis
US16/994,343 US20200392574A1 (en) 2005-06-15 2020-08-14 Dna array

Related Parent Applications (1)

Application Number Relation Publication Number Priority Date Filing Date Title
US16/425,846 Continuation US20200115748A1 (en) 2005-06-15 2019-05-29 Single Molecule Arrays for Genetic and Chemical Analysis

Related Child Applications (1)

Application Number Relation Publication Number Priority Date Filing Date Title
US17/522,708 Continuation US20220162694A1 (en) 2005-06-15 2021-11-09 Dna array

Publications (1)

Publication Number Publication Date
US20200392574A1 (en) 2020-12-17

Family

ID=37571035

Family Applications (27)

Application Number Status Publication Number Priority Date Filing Date Title
US11/451,692 Active 2027-12-08 US7709197B2 (en) 2005-06-15 2006-06-13 Nucleic acid analysis by random mixtures of non-overlapping fragments
US11/451,691 Active 2027-05-31 US8445194B2 (en) 2005-06-15 2006-06-13 Single molecule arrays for genetic and chemical analysis
US11/981,607 Active 2027-01-18 US8133719B2 (en) 2005-06-15 2007-10-31 Methods for making single molecule arrays
US11/982,467 Active 2027-05-18 US8445197B2 (en) 2005-06-15 2007-10-31 Single molecule arrays for genetic and chemical analysis
US11/981,767 Active 2027-07-07 US8445196B2 (en) 2005-06-15 2007-10-31 Single molecule arrays for genetic and chemical analysis
US12/335,168 Active US7901891B2 (en) 2005-06-15 2008-12-15 Nucleic acid analysis by random mixtures of non-overlapping fragments
US12/882,880 Abandoned US20110071053A1 (en) 2005-06-15 2010-09-15 Single Molecule Arrays for Genetic and Chemical Analysis
US13/017,244 Active 2026-08-29 US8765379B2 (en) 2005-06-15 2011-01-31 Nucleic acid sequence analysis from combined mixtures of amplified fragments
US13/954,778 Active US8673562B2 (en) 2005-06-15 2013-07-30 Using non-overlapping fragments for nucleic acid sequencing
US13/962,893 Active US8765375B2 (en) 2005-06-15 2013-08-08 Method for sequencing polynucleotides by forming separate fragment mixtures
US13/971,797 Active 2027-12-16 US10125392B2 (en) 2005-06-15 2013-08-20 Preparing a DNA fragment library for sequencing using tagged primers
US13/971,806 Active 2027-06-08 US9637785B2 (en) 2005-06-15 2013-08-20 Tagged fragment library configured for genome or cDNA sequence analysis
US13/971,801 Active 2027-08-21 US9637784B2 (en) 2005-06-15 2013-08-20 Methods for DNA sequencing and analysis using multiple tiers of aliquots
US13/975,223 Active US8765382B2 (en) 2005-06-15 2013-08-23 Genome sequence analysis using tagged amplicons
US13/975,234 Active US8771958B2 (en) 2005-06-15 2013-08-23 Nucleotide sequence from amplicon subfragments
US13/975,215 Active US8771957B2 (en) 2005-06-15 2013-08-23 Sequencing using a predetermined coverage amount of polynucleotide fragments
US14/583,010 Abandoned US20150159204A1 (en) 2005-06-15 2014-12-24 Single Molecule Arrays for Genetic and Chemical Analysis
US14/714,133 Active US9650673B2 (en) 2005-06-15 2015-05-15 Single molecule arrays for genetic and chemical analysis
US15/425,791 Active US9944984B2 (en) 2005-06-15 2017-02-06 High density DNA array
US15/442,659 Active US10351909B2 (en) 2005-06-15 2017-02-25 DNA sequencing from high density DNA arrays using asynchronous reactions
US15/716,314 Abandoned US20180051333A1 (en) 2005-06-15 2017-09-26 Nucleic acid analysis by random mixtures of non-overlapping fragments
US16/425,846 Abandoned US20200115748A1 (en) 2005-06-15 2019-05-29 Single Molecule Arrays for Genetic and Chemical Analysis
US16/730,829 Active US11414702B2 (en) 2005-06-15 2019-12-30 Nucleic acid analysis by random mixtures of non-overlapping fragments
US16/994,343 Abandoned US20200392574A1 (en) 2005-06-15 2020-08-14 Dna array
US17/522,708 Pending US20220162694A1 (en) 2005-06-15 2021-11-09 Dna array
US17/864,913 Pending US20220411865A1 (en) 2005-06-15 2022-07-14 Labeling strategy for use in dna sequencing to facilitate assembly of sequence reads into longer fragments of a genome
US17/864,916 Pending US20220411866A1 (en) 2005-06-15 2022-07-14 Characterizing the genome of individual cells by long fragment read sequencing of oligonucleotide tagged dna fragments

Country Status (11)

Country Link
US (27) US7709197B2 (en)
EP (7) EP2463386B1 (en)
JP (4) JP5331476B2 (en)
CN (1) CN101466847B (en)
AU (1) AU2006259565B2 (en)
CA (2) CA2611743C (en)
DK (4) DK2463386T3 (en)
HK (1) HK1209795A1 (en)
IL (2) IL188142A (en)
SG (1) SG162795A1 (en)
WO (2) WO2006138284A2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023015192A1 (en) * 2021-08-03 2023-02-09 10X Genomics, Inc. Nucleic acid concatemers and methods for stabilizing and/or compacting the same
WO2023091592A1 (en) * 2021-11-19 2023-05-25 Dovetail Genomics, Llc Dendrimers for genomic analysis methods and compositions
WO2023107719A3 (en) * 2021-12-10 2023-07-20 Element Biosciences, Inc. Primary analysis in next generation sequencing
US11915444B2 (en) 2020-08-31 2024-02-27 Element Biosciences, Inc. Single-pass primary analysis

Families Citing this family (552)

Publication number Priority date Publication date Assignee Title
US7846733B2 (en) 2000-06-26 2010-12-07 Nugen Technologies, Inc. Methods and compositions for transcription-based nucleic acid amplification
DE60142709D1 (en) 2000-12-13 2010-09-09 Nugen Technologies Inc METHODS AND COMPOSITIONS FOR GENERATING A VARIETY OF COPIES OF NUCLEIC ACID SEQUENCES AND METHODS OF DETECTING THE SAME
ZA200210369B (en) 2001-03-09 2004-07-08 Nugen Technologies Inc Methods and compositions for amplification or RNA sequences.
CN1791682B (en) * 2003-02-26 2013-05-22 凯利达基因组股份有限公司 Random array DNA analysis by hybridization
WO2004092418A2 (en) 2003-04-14 2004-10-28 Nugen Technologies, Inc. Global amplification using a randomly primed composite primer
US7692219B1 (en) 2004-06-25 2010-04-06 University Of Hawaii Ultrasensitive biosensors
EP2463386B1 (en) 2005-06-15 2017-04-12 Complete Genomics Inc. Nucleic acid analysis by random mixtures of non-overlapping fragments
EP1910562B1 (en) 2005-06-23 2010-12-08 Keygene N.V. Strategies for high throughput identification and detection of polymorphisms
US11111544B2 (en) 2005-07-29 2021-09-07 Natera, Inc. System and method for cleaning noisy genetic data and determining chromosome copy number
US9424392B2 (en) 2005-11-26 2016-08-23 Natera, Inc. System and method for cleaning noisy genetic data from target individuals using genetic data from genetically related individuals
US11111543B2 (en) 2005-07-29 2021-09-07 Natera, Inc. System and method for cleaning noisy genetic data and determining chromosome copy number
WO2007030759A2 (en) 2005-09-07 2007-03-15 Nugen Technologies, Inc. Improved nucleic acid amplification procedure
CA2910861C (en) 2005-09-29 2018-08-07 Michael Josephus Theresia Van Eijk High throughput screening of mutagenized populations
US10316364B2 (en) * 2005-09-29 2019-06-11 Keygene N.V. Method for identifying the source of an amplicon
CA2624896C (en) * 2005-10-07 2017-11-07 Callida Genomics, Inc. Self-assembled single molecule arrays and uses thereof
WO2007133831A2 (en) * 2006-02-24 2007-11-22 Callida Genomics, Inc. High throughput genome sequencing on dna arrays
US7960104B2 (en) 2005-10-07 2011-06-14 Callida Genomics, Inc. Self-assembled single molecule arrays and uses thereof
WO2007120208A2 (en) * 2005-11-14 2007-10-25 President And Fellows Of Harvard College Nanogrid rolling circle dna sequencing
ES2882401T3 (en) 2005-12-22 2021-12-01 Keygene Nv High-throughput AFLP-based polymorphism detection method
SG10201405158QA (en) * 2006-02-24 2014-10-30 Callida Genomics Inc High throughput genome sequencing on dna arrays
CN1876840A (en) * 2006-04-26 2006-12-13 东南大学 Multi-copy monomolecular nucleic acid array chip
WO2008070352A2 (en) * 2006-10-27 2008-06-12 Complete Genomics, Inc. Efficient arrays of amplified polynucleotides
US20090111706A1 (en) 2006-11-09 2009-04-30 Complete Genomics, Inc. Selection of dna adaptor orientation by amplification
US20080221832A1 (en) * 2006-11-09 2008-09-11 Complete Genomics, Inc. Methods for computing positional base probabilities using experminentals base value distributions
US7629125B2 (en) * 2006-11-16 2009-12-08 General Electric Company Sequential analysis of biological samples
US20080242560A1 (en) * 2006-11-21 2008-10-02 Gunderson Kevin L Methods for generating amplified nucleic acid arrays
WO2008063135A1 (en) 2006-11-24 2008-05-29 Agency For Science, Technology And Research Apparatus for processing a sample in a liquid droplet and method of using the same
US9874501B2 (en) 2006-11-24 2018-01-23 Curiox Biosystems Pte Ltd. Use of chemically patterned substrate for liquid handling, chemical and biological reactions
US8262900B2 (en) * 2006-12-14 2012-09-11 Life Technologies Corporation Methods and apparatus for measuring analytes using large scale FET arrays
US11339430B2 (en) 2007-07-10 2022-05-24 Life Technologies Corporation Methods and apparatus for measuring analytes using large scale FET arrays
US8349167B2 (en) 2006-12-14 2013-01-08 Life Technologies Corporation Methods and apparatus for detecting molecular interactions using FET arrays
EP2653861B1 (en) 2006-12-14 2014-08-13 Life Technologies Corporation Method for sequencing a nucleic acid using large-scale FET arrays
US20090053690A1 (en) * 2007-02-02 2009-02-26 California Institute Of Technology Surface chemistry and deposition techniques
US20080194416A1 (en) * 2007-02-08 2008-08-14 Sigma Aldrich Detection of mature small rna molecules
US20080318233A1 (en) * 2007-03-30 2008-12-25 Glenn Travis C Source tagging and normalization of DNA for parallel DNA sequencing, and direct measurement of mutation rates using the same
EP2164985A4 (en) * 2007-06-01 2014-05-14 454 Life Sciences Corp System and method for identification of individual samples from a multiplex mixture
US20110105366A1 (en) * 2007-06-18 2011-05-05 Illumina, Inc. Microfabrication methods for the optimal patterning of substrates
DE102007034967A1 (en) 2007-07-26 2009-01-29 Plansee Se Fuel cell and process for its production
US8222040B2 (en) * 2007-08-28 2012-07-17 Lightspeed Genomics, Inc. Nucleic acid sequencing by selective excitation of microparticles
US8759077B2 (en) * 2007-08-28 2014-06-24 Lightspeed Genomics, Inc. Apparatus for selective excitation of microparticles
US20090061424A1 (en) * 2007-08-30 2009-03-05 Sigma-Aldrich Company Universal ligation array for analyzing gene expression or genomic variations
ITBO20070627A1 (en) * 2007-09-14 2009-03-15 Twof Inc METHOD FOR THE PREPARATION OF MICROARRAY DNA WITH HIGH LINEAR DENSITY PROBES
WO2009045344A2 (en) * 2007-09-28 2009-04-09 Pacific Biosciences Of California, Inc. Error-free amplification of dna for clonal sequencing
US8278047B2 (en) 2007-10-01 2012-10-02 Nabsys, Inc. Biopolymer sequencing by hybridization of probes to form ternary complexes and variable range alignment
US8951731B2 (en) * 2007-10-15 2015-02-10 Complete Genomics, Inc. Sequence analysis using decorated nucleic acids
US8415099B2 (en) 2007-11-05 2013-04-09 Complete Genomics, Inc. Efficient base determination in sequencing reactions
US8617811B2 (en) 2008-01-28 2013-12-31 Complete Genomics, Inc. Methods and compositions for efficient base calling in sequencing reactions
US7897344B2 (en) * 2007-11-06 2011-03-01 Complete Genomics, Inc. Methods and oligonucleotide designs for insertion of multiple adaptors into library constructs
US20090263872A1 (en) * 2008-01-23 2009-10-22 Complete Genomics Inc. Methods and compositions for preventing bias in amplification and sequencing reactions
US8518640B2 (en) * 2007-10-29 2013-08-27 Complete Genomics, Inc. Nucleic acid sequencing and process
EP2215209B1 (en) 2007-10-30 2018-05-23 Complete Genomics, Inc. Apparatus for high throughput sequencing of nucleic acids
US7988918B2 (en) * 2007-11-01 2011-08-02 Complete Genomics, Inc. Structures for enhanced detection of fluorescence
WO2009061840A1 (en) * 2007-11-05 2009-05-14 Complete Genomics, Inc. Methods and oligonucleotide designs for insertion of multiple adaptors employing selective methylation
US10725020B2 (en) 2007-11-14 2020-07-28 Curiox Biosystems Pte Ltd. High throughput miniaturized assay system and methods
WO2013114217A1 (en) * 2012-02-05 2013-08-08 Curiox Biosystems Pte Ltd. Array plates and methods for making and using same
WO2009073629A2 (en) * 2007-11-29 2009-06-11 Complete Genomics, Inc. Efficient shotgun sequencing methods
US9551026B2 (en) 2007-12-03 2017-01-24 Complete Genomics, Inc. Method for nucleic acid detection using voltage enhancement
DK2565279T3 (en) * 2007-12-05 2015-02-16 Complete Genomics Inc Efficient base determination in sequencing reactions
CA2707901C (en) * 2007-12-05 2015-09-15 Complete Genomics, Inc. Efficient base determination in sequencing reactions
US8592150B2 (en) 2007-12-05 2013-11-26 Complete Genomics, Inc. Methods and compositions for long fragment read sequencing
US20090171640A1 (en) * 2007-12-28 2009-07-02 Microsoft Corporation Population sequencing using short read technologies
US20090203531A1 (en) 2008-02-12 2009-08-13 Nurith Kurn Method for Archiving and Clonal Expansion
GB2470672B (en) 2008-03-21 2012-09-12 Nugen Technologies Inc Methods of RNA amplification in the presence of DNA
US20090270273A1 (en) * 2008-04-21 2009-10-29 Complete Genomics, Inc. Array structures for nucleic acid detection
EP2283132B1 (en) 2008-05-02 2016-10-26 Epicentre Technologies Corporation Selective 5' ligation tagging of rna
US8039817B2 (en) 2008-05-05 2011-10-18 Illumina, Inc. Compensator for multiple surface imaging
TWI460602B (en) * 2008-05-16 2014-11-11 Counsyl Inc Device for universal preconception screening
JP5667049B2 (en) 2008-06-25 2015-02-12 ライフ テクノロジーズ コーポレーション Method and apparatus for measuring analytes using large-scale FET arrays
US9650668B2 (en) 2008-09-03 2017-05-16 Nabsys 2.0 Llc Use of longitudinally displaced nanoscale electrodes for voltage sensing of biomolecules and other analytes in fluidic channels
WO2010028140A2 (en) 2008-09-03 2010-03-11 Nabsys, Inc. Use of longitudinally displaced nanoscale electrodes for voltage sensing of biomolecules and other analytes in fluidic channels
US8262879B2 (en) 2008-09-03 2012-09-11 Nabsys, Inc. Devices and methods for determining the length of biopolymers and distances between probes bound thereto
US8383345B2 (en) 2008-09-12 2013-02-26 University Of Washington Sequence tag directed subassembly of short sequencing reads into long sequencing reads
US20100137143A1 (en) 2008-10-22 2010-06-03 Ion Torrent Systems Incorporated Methods and apparatus for measuring analytes
US20100301398A1 (en) 2009-05-29 2010-12-02 Ion Torrent Systems Incorporated Methods and apparatus for measuring analytes
US9080211B2 (en) 2008-10-24 2015-07-14 Epicentre Technologies Corporation Transposon end compositions and methods for modifying nucleic acids
EP2508529B1 (en) 2008-10-24 2013-08-28 Epicentre Technologies Corporation Transposon end compositions and methods for modifying nucleic acids
US9506119B2 (en) 2008-11-07 2016-11-29 Adaptive Biotechnologies Corp. Method of sequence determination using sequence tags
US8748103B2 (en) 2008-11-07 2014-06-10 Sequenta, Inc. Monitoring health and disease status using clonotype profiles
US9365901B2 (en) 2008-11-07 2016-06-14 Adaptive Biotechnologies Corp. Monitoring immunoglobulin heavy chain evolution in B-cell acute lymphoblastic leukemia
CN102272327B (en) 2008-11-07 2015-11-25 赛昆塔公司 Methods of monitoring conditions by sequence analysis
US9528160B2 (en) 2008-11-07 2016-12-27 Adaptive Biotechnologies Corp. Rare clonotypes and uses thereof
US8628927B2 (en) 2008-11-07 2014-01-14 Sequenta, Inc. Monitoring health and disease status using clonotype profiles
WO2010083456A1 (en) 2009-01-15 2010-07-22 Imdaptive Inc. Adaptive immunity profiling and methods for generation of monoclonal antibodies
WO2010091021A2 (en) * 2009-02-03 2010-08-12 Complete Genomics, Inc. Oligomer sequences mapping
EP2394165A4 (en) * 2009-02-03 2013-12-11 Complete Genomics Inc Oligomer sequences mapping
WO2010091023A2 (en) * 2009-02-03 2010-08-12 Complete Genomics, Inc. Indexing a reference sequence for oligomer sequence mapping
US8455260B2 (en) 2009-03-27 2013-06-04 Massachusetts Institute Of Technology Tagged-fragment map assembly
EP2511843B1 (en) 2009-04-29 2016-12-21 Complete Genomics, Inc. Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence
US20130296189A1 (en) * 2009-05-08 2013-11-07 Suzhou Institute of Nano-tech and Nano-bionics, Chinese Academy of Sciences Probes utilizing universal tags, a kit comprising the same and detection methods
US8246799B2 (en) 2009-05-28 2012-08-21 Nabsys, Inc. Devices and methods for analyzing biomolecules and probes bound thereto
US20120261274A1 (en) 2009-05-29 2012-10-18 Life Technologies Corporation Methods and apparatus for measuring analytes
US8673627B2 (en) 2009-05-29 2014-03-18 Life Technologies Corporation Apparatus and methods for performing electrochemical reactions
US8776573B2 (en) 2009-05-29 2014-07-15 Life Technologies Corporation Methods and apparatus for measuring analytes
CN102459592B (en) 2009-06-15 2017-04-05 考利达基因组股份有限公司 Methods and compositions for long fragment read sequencing
US9524369B2 (en) 2009-06-15 2016-12-20 Complete Genomics, Inc. Processing and analysis of complex nucleic acid sequence data
EP2446052B1 (en) 2009-06-25 2018-08-08 Fred Hutchinson Cancer Research Center Method of measuring adaptive immunity
KR20110018763A (en) * 2009-08-18 2011-02-24 삼성전자주식회사 Method and apparatus for fixing a target molecule on a substrate
EP2467479B1 (en) 2009-08-20 2016-01-06 Population Genetics Technologies Ltd Compositions and methods for intramolecular nucleic acid rearrangement
WO2011037990A1 (en) * 2009-09-22 2011-03-31 President And Fellows Of Harvard College Entangled mate sequencing
ES2640776T3 (en) 2009-09-30 2017-11-06 Natera, Inc. Methods for non-invasively calling prenatal ploidy
JP5901530B2 (en) * 2009-11-10 2016-04-13 ネステク ソシエテ アノニム Cardiac aging biomarker and method of use
US9023769B2 (en) 2009-11-30 2015-05-05 Complete Genomics, Inc. cDNA library for nucleic acid sequencing
US9217144B2 (en) * 2010-01-07 2015-12-22 Gen9, Inc. Assembly of high fidelity polynucleotides
DE202011003570U1 (en) 2010-03-06 2012-01-30 Illumina, Inc. Systems and apparatus for detecting optical signals from a sample
US8502867B2 (en) 2010-03-19 2013-08-06 Lightspeed Genomics, Inc. Synthetic aperture optics imaging method using minimum selective excitation patterns
US9465228B2 (en) * 2010-03-19 2016-10-11 Optical Biosystems, Inc. Illumination apparatus optimized for synthetic aperture optics imaging using minimum selective excitation patterns
WO2011123246A2 (en) 2010-04-01 2011-10-06 Illumina, Inc. Solid-phase clonal amplification and related methods
US20190300945A1 (en) 2010-04-05 2019-10-03 Prognosys Biosciences, Inc. Spatially Encoded Biological Assays
US10787701B2 (en) 2010-04-05 2020-09-29 Prognosys Biosciences, Inc. Spatially encoded biological assays
ES2555106T3 (en) 2010-04-05 2015-12-29 Prognosys Biosciences, Inc. Spatially coded biological assays
US8774494B2 (en) * 2010-04-30 2014-07-08 Complete Genomics, Inc. Method and system for accurate alignment and registration of array for DNA sequencing
US20190010543A1 (en) 2010-05-18 2019-01-10 Natera, Inc. Methods for simultaneous amplification of target loci
US11332793B2 (en) 2010-05-18 2022-05-17 Natera, Inc. Methods for simultaneous amplification of target loci
US10316362B2 (en) 2010-05-18 2019-06-11 Natera, Inc. Methods for simultaneous amplification of target loci
US11939634B2 (en) 2010-05-18 2024-03-26 Natera, Inc. Methods for simultaneous amplification of target loci
CA2798758C (en) 2010-05-18 2019-05-07 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11339429B2 (en) 2010-05-18 2022-05-24 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11322224B2 (en) 2010-05-18 2022-05-03 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US9677118B2 (en) 2014-04-21 2017-06-13 Natera, Inc. Methods for simultaneous amplification of target loci
US11326208B2 (en) 2010-05-18 2022-05-10 Natera, Inc. Methods for nested PCR amplification of cell-free DNA
US11408031B2 (en) 2010-05-18 2022-08-09 Natera, Inc. Methods for non-invasive prenatal paternity testing
US11332785B2 (en) 2010-05-18 2022-05-17 Natera, Inc. Methods for non-invasive prenatal ploidy calling
EP2580353B1 (en) 2010-06-11 2015-07-29 Life Technologies Corporation Alternative nucleotide flows in sequencing-by-synthesis methods
US9353412B2 (en) 2010-06-18 2016-05-31 Illumina, Inc. Conformational probes and methods for sequencing nucleic acids
TWI569025B (en) 2010-06-30 2017-02-01 生命技術公司 Methods and apparatus for testing isfet arrays
EP2598804B1 (en) 2010-06-30 2021-10-20 Life Technologies Corporation Method to generate an output signal in an isfet-array
WO2012003363A1 (en) 2010-06-30 2012-01-05 Life Technologies Corporation Ion-sensing charge-accumulation circuits and methods
US11307166B2 (en) 2010-07-01 2022-04-19 Life Technologies Corporation Column ADC
US8653567B2 (en) 2010-07-03 2014-02-18 Life Technologies Corporation Chemically sensitive sensor with lightly doped drains
WO2012011877A2 (en) 2010-07-23 2012-01-26 Curiox Biosystems Pte Ltd Apparatus and method for multiple reactions in small volumes
AU2011291538A1 (en) * 2010-08-20 2013-03-14 The Regents Of The University Of California Small molecule arrays and methods for making and using them
US9671344B2 (en) 2010-08-31 2017-06-06 Complete Genomics, Inc. High-density biochemical array chips with asynchronous tracks for alignment correction by moiré averaging
US9880089B2 (en) 2010-08-31 2018-01-30 Complete Genomics, Inc. High-density devices with synchronous tracks for quad-cell based alignment correction
FR2964391B1 (en) * 2010-09-03 2014-04-11 Centre Nat Rech Scient BIOPUCES FOR ANALYSIS OF THE DYNAMICS OF NUCLEIC ACID MOLECULES
US9618475B2 (en) 2010-09-15 2017-04-11 Life Technologies Corporation Methods and apparatus for measuring analytes
EP2619564B1 (en) 2010-09-24 2016-03-16 Life Technologies Corporation Matched pair transistor circuits
US8715933B2 (en) 2010-09-27 2014-05-06 Nabsys, Inc. Assay methods using nicking endonucleases
US8725422B2 (en) 2010-10-13 2014-05-13 Complete Genomics, Inc. Methods for estimating genome-wide copy number variations
EP2632593B1 (en) 2010-10-27 2021-09-29 Illumina, Inc. Flow cells for biological or chemical analysis
US9074251B2 (en) 2011-02-10 2015-07-07 Illumina, Inc. Linking sequence reads using paired code tags
WO2012067911A1 (en) 2010-11-16 2012-05-24 Nabsys, Inc. Methods for sequencing a biomolecule by detecting relative positions of hybridized probes
CN103608466B (en) 2010-12-22 2020-09-18 纳特拉公司 Non-invasive prenatal paternity testing method
WO2012118555A1 (en) 2010-12-29 2012-09-07 Life Technologies Corporation Time-warped background signal for sequencing-by-synthesis operations
WO2012092455A2 (en) 2010-12-30 2012-07-05 Life Technologies Corporation Models for analyzing data from sequencing-by-synthesis operations
US20130060482A1 (en) 2010-12-30 2013-03-07 Life Technologies Corporation Methods, systems, and computer readable media for making base calls in nucleic acid sequencing
US10241075B2 (en) 2010-12-30 2019-03-26 Life Technologies Corporation Methods, systems, and computer readable media for nucleic acid sequencing
US8951781B2 (en) 2011-01-10 2015-02-10 Illumina, Inc. Systems, methods, and apparatuses to image a sample for biological or chemical analysis
EP2670863B1 (en) 2011-01-31 2018-06-27 F. Hoffmann-La Roche AG Methods of identifying multiple epitopes in cells
EP2670894B1 (en) 2011-02-02 2017-11-29 University Of Washington Through Its Center For Commercialization Massively parallel continguity mapping
WO2012109574A2 (en) 2011-02-11 2012-08-16 Nabsys, Inc. Assay methods using dna binding proteins
CN105861645B (en) 2011-04-08 2020-02-21 生命科技股份有限公司 Phase-protected reagent flow ordering for use in sequencing-by-synthesis
GB201106254D0 (en) 2011-04-13 2011-05-25 Frisen Jonas Method and product
CN103843001B (en) * 2011-04-14 2017-06-09 考利达基因组股份有限公司 Processing and analysis of complex nucleic acid sequence data
US8778848B2 (en) 2011-06-09 2014-07-15 Illumina, Inc. Patterned flow-cells useful for nucleic acid analysis
EP3624124B1 (en) 2011-08-18 2023-11-22 Life Technologies Corporation Systems for making base calls in nucleic acid sequencing
US9679103B2 (en) 2011-08-25 2017-06-13 Complete Genomics, Inc. Phasing of heterozygous loci to determine genomic haplotypes
US10704164B2 (en) 2011-08-31 2020-07-07 Life Technologies Corporation Methods, systems, computer readable media, and kits for sample identification
EP2753715A4 (en) 2011-09-09 2015-05-20 Univ Leland Stanford Junior Methods for obtaining a sequence
US10385475B2 (en) 2011-09-12 2019-08-20 Adaptive Biotechnologies Corp. Random array sequencing of low-complexity libraries
US9453258B2 (en) 2011-09-23 2016-09-27 Illumina, Inc. Methods and compositions for nucleic acid sequencing
US10378051B2 (en) 2011-09-29 2019-08-13 Illumina Cambridge Limited Continuous extension and deblocking in reactions for nucleic acids synthesis and sequencing
EP3604555A1 (en) 2011-10-14 2020-02-05 President and Fellows of Harvard College Sequencing by structure assembly
CA2853088C (en) 2011-10-21 2018-03-13 Adaptive Biotechnologies Corporation Quantification of adaptive immune cell genomes in a complex mixture of cells
WO2013063382A2 (en) 2011-10-28 2013-05-02 Illumina, Inc. Microarray fabrication system and method
CN103890161A (en) * 2011-10-31 2014-06-25 株式会社日立高新技术 Nucleic acid amplification method, nucleic acid substrate, nucleic acid analysis method, and nucleic acid analysis device
US10837879B2 (en) * 2011-11-02 2020-11-17 Complete Genomics, Inc. Treatment for stabilizing nucleic acid arrays
US9970984B2 (en) 2011-12-01 2018-05-15 Life Technologies Corporation Method and apparatus for identifying defects in a chemical sensor array
US9824179B2 (en) 2011-12-09 2017-11-21 Adaptive Biotechnologies Corp. Diagnosis of lymphoid malignancies and minimal residual disease detection
US9499865B2 (en) 2011-12-13 2016-11-22 Adaptive Biotechnologies Corp. Detection and measurement of tissue-infiltrating lymphocytes
EP2605001A1 (en) * 2011-12-15 2013-06-19 Hain Lifescience GmbH A device and method for optically measuring fluorescence of nucleic acids in test samples and use of the device and method
CA2859761C (en) 2011-12-22 2023-06-20 President And Fellows Of Harvard College Compositions and methods for analyte detection
WO2014163886A1 (en) 2013-03-12 2014-10-09 President And Fellows Of Harvard College Method of generating a three-dimensional nucleic acid containing matrix
US11021737B2 (en) 2011-12-22 2021-06-01 President And Fellows Of Harvard College Compositions and methods for analyte detection
US8821798B2 (en) 2012-01-19 2014-09-02 Life Technologies Corporation Titanium nitride as sensing layer for microwell structure
US8747748B2 (en) 2012-01-19 2014-06-10 Life Technologies Corporation Chemical sensor with conductive cup-shaped sensor surface
US9864846B2 (en) 2012-01-31 2018-01-09 Life Technologies Corporation Methods and computer program products for compression of sequencing data
US9515676B2 (en) 2012-01-31 2016-12-06 Life Technologies Corporation Methods and computer program products for compression of sequencing data
JP6012767B2 (en) 2012-02-07 2016-10-25 ヴィブラント ホールディングス リミテッド ライアビリティ カンパニー Substrates, peptide arrays, and methods
US20130217023A1 (en) * 2012-02-22 2013-08-22 454 Life Sciences Corporation System And Method For Generation And Use Of Compact Clonally Amplified Products
JP6302847B2 (en) 2012-03-05 2018-03-28 アダプティヴ バイオテクノロジーズ コーポレーション Determination of paired immunoreceptor chains from frequency matched subunits
CN108300765B (en) * 2012-03-13 2022-01-11 斯威夫特生物科学公司 Methods and compositions for size-controlled homopolymer tailing of substrate polynucleotides by nucleic acid polymerases
IN2014DN08135A (en) * 2012-03-16 2015-05-01 Life Technologies Corp
US9803239B2 (en) * 2012-03-29 2017-10-31 Complete Genomics, Inc. Flow cells for high density array chips
US20130261984A1 (en) 2012-03-30 2013-10-03 Illumina, Inc. Methods and systems for determining fetal chromosomal abnormalities
EP4219012A1 (en) 2012-04-03 2023-08-02 Illumina, Inc. Method of imaging a substrate comprising fluorescent features and use of the method in nucleic acid sequencing
US20130274148A1 (en) 2012-04-11 2013-10-17 Illumina, Inc. Portable genetic detection and analysis system and method
WO2013166517A1 (en) 2012-05-04 2013-11-07 Complete Genomics, Inc. Methods for determining absolute genome-wide copy number variations of complex tumors
PL2831276T3 (en) 2012-05-08 2016-10-31 Compositions and method for measuring and calibrating amplification bias in multiplexed pcr reactions
US9646132B2 (en) 2012-05-11 2017-05-09 Life Technologies Corporation Models for analyzing data from sequencing-by-synthesis operations
US9052963B2 (en) 2012-05-21 2015-06-09 International Business Machines Corporation Cloud computing data center machine monitor and control
US8786331B2 (en) 2012-05-29 2014-07-22 Life Technologies Corporation System for reducing noise in a chemical sensor array
WO2013184754A2 (en) 2012-06-05 2013-12-12 President And Fellows Of Harvard College Spatial sequencing of nucleic acids using DNA origami probes
US9628676B2 (en) 2012-06-07 2017-04-18 Complete Genomics, Inc. Imaging systems with movable scan mirrors
US9488823B2 (en) 2012-06-07 2016-11-08 Complete Genomics, Inc. Techniques for scanned illumination
US9012022B2 (en) 2012-06-08 2015-04-21 Illumina, Inc. Polymer coatings
US8895249B2 (en) 2012-06-15 2014-11-25 Illumina, Inc. Kinetic exclusion amplification of nucleic acid libraries
WO2014008635A1 (en) * 2012-07-11 2014-01-16 北京贝瑞和康生物技术有限公司 Detection method and detection kit for DNA fragments, and use thereof
US9977861B2 (en) 2012-07-18 2018-05-22 Illumina Cambridge Limited Methods and systems for determining haplotypes and phasing of haplotypes
US10221442B2 (en) 2012-08-14 2019-03-05 10X Genomics, Inc. Compositions and methods for sample processing
US10400280B2 (en) 2012-08-14 2019-09-03 10X Genomics, Inc. Methods and systems for processing polynucleotides
US9567631B2 (en) 2012-12-14 2017-02-14 10X Genomics, Inc. Methods and systems for processing polynucleotides
US9951386B2 (en) 2014-06-26 2018-04-24 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10752949B2 (en) 2012-08-14 2020-08-25 10X Genomics, Inc. Methods and systems for processing polynucleotides
US9701998B2 (en) 2012-12-14 2017-07-11 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10323279B2 (en) 2012-08-14 2019-06-18 10X Genomics, Inc. Methods and systems for processing polynucleotides
US9388465B2 (en) 2013-02-08 2016-07-12 10X Genomics, Inc. Polynucleotide barcode generation
CN113528634A (en) 2012-08-14 2021-10-22 10X基因组学有限公司 Microcapsule compositions and methods
US11591637B2 (en) 2012-08-14 2023-02-28 10X Genomics, Inc. Compositions and methods for sample processing
US10273541B2 (en) 2012-08-14 2019-04-30 10X Genomics, Inc. Methods and systems for processing polynucleotides
NL2017959B1 (en) 2016-12-08 2018-06-19 Illumina Inc Cartridge assembly
CA3178340A1 (en) 2012-08-20 2014-02-27 Illumina, Inc. Method and system for fluorescence lifetime based sequencing
US10006909B2 (en) 2012-09-28 2018-06-26 Vibrant Holdings, Llc Methods, systems, and arrays for biomolecular analysis
WO2014052989A2 (en) * 2012-09-28 2014-04-03 Vibrant Holdings, Llc Methods, systems, and arrays for biomolecular analysis
WO2014055561A1 (en) 2012-10-01 2014-04-10 Adaptive Biotechnologies Corporation Immunocompetence assessment by adaptive immune receptor diversity and clonality characterization
US10329608B2 (en) 2012-10-10 2019-06-25 Life Technologies Corporation Methods, systems, and computer readable media for repeat sequencing
DK2914741T3 (en) 2012-11-02 2017-11-20 Life Technologies Corp New Compositions and Methods for Improving PCR Specificity
US10286376B2 (en) 2012-11-14 2019-05-14 Vibrant Holdings, Llc Substrates, systems, and methods for array synthesis and biomolecular analysis
US10829816B2 (en) 2012-11-19 2020-11-10 Apton Biosystems, Inc. Methods of analyte detection
US20150330974A1 (en) 2012-11-19 2015-11-19 Apton Biosystems, Inc. Digital Analysis of Molecular Analytes Using Single Molecule Detection
US10533221B2 (en) 2012-12-14 2020-01-14 10X Genomics, Inc. Methods and systems for processing polynucleotides
US9914966B1 (en) 2012-12-20 2018-03-13 Nabsys 2.0 Llc Apparatus and methods for analysis of biomolecules using high frequency alternating current excitation
US9080968B2 (en) 2013-01-04 2015-07-14 Life Technologies Corporation Methods and systems for point of use removal of sacrificial material
US9841398B2 (en) 2013-01-08 2017-12-12 Life Technologies Corporation Methods for manufacturing well structures for low-noise chemical sensors
US9683230B2 (en) 2013-01-09 2017-06-20 Illumina Cambridge Limited Sample preparation on a solid support
US10294516B2 (en) 2013-01-18 2019-05-21 Nabsys 2.0 Llc Enhanced probe binding
US8962366B2 (en) 2013-01-28 2015-02-24 Life Technologies Corporation Self-aligned well structures for low-noise chemical sensors
GB2547875B (en) 2013-02-01 2017-12-13 Univ California Methods for meta-genomics analysis of microbes
US9411930B2 (en) 2013-02-01 2016-08-09 The Regents Of The University Of California Methods for genome assembly and haplotype phasing
WO2014122548A2 (en) * 2013-02-07 2014-08-14 Koninklijke Philips N.V. Processing of nucleotide sequences
CA2923379C (en) 2013-02-15 2023-01-03 Vibrant Holdings, Llc Methods and compositions for amplified electrochemiluminescence detection
US9512422B2 (en) 2013-02-26 2016-12-06 Illumina, Inc. Gel patterned surfaces
US8963216B2 (en) 2013-03-13 2015-02-24 Life Technologies Corporation Chemical sensor with sidewall spacer sensor surface
US8841217B1 (en) 2013-03-13 2014-09-23 Life Technologies Corporation Chemical sensor with protruded sensor surface
EP2969479B1 (en) 2013-03-13 2021-05-05 Illumina, Inc. Multilayer fluidic devices and methods for their fabrication
EP2970951B1 (en) 2013-03-13 2019-02-20 Illumina, Inc. Methods for nucleic acid sequencing
US20140296080A1 (en) 2013-03-14 2014-10-02 Life Technologies Corporation Methods, Systems, and Computer Readable Media for Evaluating Variant Likelihood
WO2014149779A1 (en) 2013-03-15 2014-09-25 Life Technologies Corporation Chemical device with thin conductive element
US9328382B2 (en) 2013-03-15 2016-05-03 Complete Genomics, Inc. Multiple tagging of individual long DNA fragments
JP6581074B2 (en) 2013-03-15 2019-09-25 ライフ テクノロジーズ コーポレーション Chemical sensor with consistent sensor surface area
JP6431895B2 (en) * 2013-03-15 2018-12-05 アダプティブ バイオテクノロジーズ コーポレイション Reconstituted adaptive immune receptor genes uniquely tagged in a complex gene set
WO2014142981A1 (en) 2013-03-15 2014-09-18 Illumina, Inc. Enzyme-linked nucleotides
WO2014149780A1 (en) 2013-03-15 2014-09-25 Life Technologies Corporation Chemical sensor with consistent sensor surface areas
US9116117B2 (en) 2013-03-15 2015-08-25 Life Technologies Corporation Chemical sensor with sidewall sensor surface
US9835585B2 (en) 2013-03-15 2017-12-05 Life Technologies Corporation Chemical sensor with protruded sensor surface
CA2905410A1 (en) * 2013-03-15 2014-09-25 Abbott Molecular Inc. Systems and methods for detection of genomic copy number changes
SG10201708498VA (en) 2013-04-17 2017-11-29 Agency Science Tech & Res Method for generating extended sequence reads
US20140336063A1 (en) 2013-05-09 2014-11-13 Life Technologies Corporation Windowed Sequencing
US10458942B2 (en) 2013-06-10 2019-10-29 Life Technologies Corporation Chemical sensor array having multiple sensors per well
WO2014210225A1 (en) 2013-06-25 2014-12-31 Prognosys Biosciences, Inc. Methods and systems for determining spatial patterns of biological targets in a sample
US9708657B2 (en) 2013-07-01 2017-07-18 Adaptive Biotechnologies Corp. Method for generating clonotype profiles using sequence tags
KR102070483B1 (en) 2013-07-01 2020-01-29 일루미나, 인코포레이티드 Catalyst-free surface functionalization and polymer grafting
ES2628485T3 (en) 2013-07-03 2017-08-03 Illumina, Inc. Sequencing by orthogonal synthesis
US9557318B2 (en) 2013-07-09 2017-01-31 Curiox Biosystems Pte Ltd. Array plates for washing samples
US9926597B2 (en) 2013-07-26 2018-03-27 Life Technologies Corporation Control nucleic acid sequences for use in sequencing-by-synthesis and methods for designing the same
KR102291045B1 (en) 2013-08-05 2021-08-19 트위스트 바이오사이언스 코포레이션 De novo synthesized gene libraries
DK3030645T3 (en) 2013-08-08 2023-01-30 Illumina Inc Fluid system for delivery of reagents to a flow cell
WO2015027112A1 (en) 2013-08-22 2015-02-26 Apton Biosystems, Inc. Digital analysis of molecular analytes using electrical methods
CN105637099B (en) 2013-08-23 2020-05-19 深圳华大智造科技有限公司 Long fragment de novo assembly using short reads
US10395758B2 (en) 2013-08-30 2019-08-27 10X Genomics, Inc. Sequencing methods
KR20160081896A (en) 2013-08-30 2016-07-08 일루미나, 인코포레이티드 Manipulation of droplets on hydrophilic or variegated-hydrophilic surfaces
US9352315B2 (en) 2013-09-27 2016-05-31 Taiwan Semiconductor Manufacturing Company, Ltd. Method to produce chemical pattern in micro-fluidic structure
US10577655B2 (en) 2013-09-27 2020-03-03 Natera, Inc. Cell free DNA diagnostic testing standards
US10410739B2 (en) 2013-10-04 2019-09-10 Life Technologies Corporation Methods and systems for modeling phasing effects in sequencing using termination chemistry
EP3875601A1 (en) 2013-10-17 2021-09-08 Illumina, Inc. Methods and compositions for preparing nucleic acid libraries
CA2931533C (en) 2013-12-09 2023-08-08 Illumina, Inc. Methods and compositions for targeted nucleic acid sequencing
JP6672149B2 (en) 2013-12-10 2020-03-25 イラミーナ インコーポレーテッド Biosensor for biological or chemical analysis and method of manufacturing the same
KR102379877B1 (en) 2013-12-11 2022-03-30 아큐라젠 홀딩스 리미티드 Compositions and methods for detecting rare sequence variants
US11286519B2 (en) 2013-12-11 2022-03-29 Accuragen Holdings Limited Methods and compositions for enrichment of amplification products
US11859246B2 (en) 2013-12-11 2024-01-02 Accuragen Holdings Limited Methods and compositions for enrichment of amplification products
US9824068B2 (en) 2013-12-16 2017-11-21 10X Genomics, Inc. Methods and apparatus for sorting data
JP6366719B2 (en) 2013-12-20 2018-08-01 イルミナ インコーポレイテッド Preservation of genomic connectivity information in fragmented genomic DNA samples
JP2015139373A (en) * 2014-01-27 2015-08-03 株式会社日立ハイテクノロジーズ Biomolecule analytical device, and biomolecule analyzer
WO2015134787A2 (en) 2014-03-05 2015-09-11 Adaptive Biotechnologies Corporation Methods using randomer-containing synthetic molecules
EP3122894A4 (en) * 2014-03-28 2017-11-08 GE Healthcare Bio-Sciences Corp. Accurate detection of rare genetic variants in next generation sequencing
US10066265B2 (en) 2014-04-01 2018-09-04 Adaptive Biotechnologies Corp. Determining antigen-specific T-cells
MX2016013156A (en) 2014-04-10 2017-02-14 10X Genomics Inc Fluidic devices, systems, and methods for encapsulating and partitioning reagents, and applications of same.
ES2777529T3 (en) 2014-04-17 2020-08-05 Adaptive Biotechnologies Corp Quantification of adaptive immune cell genomes in a complex mixture of cells
RU2717641C2 (en) 2014-04-21 2020-03-24 Натера, Инк. Detection of mutations and ploidy in chromosomal segments
EP3138088A4 (en) 2014-05-02 2017-12-06 Synthetic Genomics, Inc. Tamper-resistant assembly for securing valuable material
SG11201610168YA (en) 2014-05-16 2017-01-27 Illumina Inc Nucleic acid synthesis techniques
CN113249435A (en) 2014-06-26 2021-08-13 10X基因组学有限公司 Methods of analyzing nucleic acids from individual cells or cell populations
SG11201610691QA (en) 2014-06-26 2017-01-27 10X Genomics Inc Processes and systems for nucleic acid sequence assembly
CA2955356C (en) 2014-07-15 2024-01-02 Illumina, Inc. Biochemically activated electronic device
US20160032281A1 (en) * 2014-07-31 2016-02-04 Fei Company Functionalized grids for locating and imaging biological specimens and methods of using the same
EP3174980A4 (en) 2014-08-01 2018-01-17 Dovetail Genomics, LLC Tagging nucleic acids for sequence assembly
CN107076739B (en) 2014-08-21 2018-12-25 伊卢米纳剑桥有限公司 Reversible surface functionalization
US10570385B2 (en) 2014-08-26 2020-02-25 Japan Science And Technology Agency Method for non-enzymatic combination of nucleic acid chains
KR102538753B1 (en) 2014-09-18 2023-05-31 일루미나, 인코포레이티드 Methods and systems for analyzing nucleic acid sequencing data
WO2016060974A1 (en) 2014-10-13 2016-04-21 Life Technologies Corporation Methods, systems, and computer-readable media for accelerated base calling
SG10201903408VA (en) 2014-10-17 2019-05-30 Illumina Cambridge Ltd Contiguity preserving transposition
AU2015339148B2 (en) 2014-10-29 2022-03-10 10X Genomics, Inc. Methods and compositions for targeted nucleic acid sequencing
CA2966201A1 (en) 2014-10-29 2016-05-06 Adaptive Biotechnologies Corp. Highly-multiplexed simultaneous detection of nucleic acids encoding paired adaptive immune receptor heterodimers from many samples
ES2772127T3 (en) 2014-10-31 2020-07-07 Illumina Cambridge Ltd Polymers and DNA copolymer coatings
US9975122B2 (en) 2014-11-05 2018-05-22 10X Genomics, Inc. Instrument systems for integrated sample processing
WO2016073237A1 (en) 2014-11-05 2016-05-12 Illumina Cambridge Limited Reducing DNA damage during sample preparation and sequencing using siderophore chelators
US10246701B2 (en) 2014-11-14 2019-04-02 Adaptive Biotechnologies Corp. Multiplexed digital quantitation of rearranged lymphoid receptors in a complex mixture
US10900065B2 (en) 2014-11-14 2021-01-26 University Of Washington Methods and kits for labeling cellular molecules
US10954559B2 (en) 2014-11-21 2021-03-23 Mgi Tech Co., Ltd. Bubble-shaped adaptor element and method of constructing sequencing library with bubble-shaped adaptor element
CA2965988A1 (en) * 2014-11-21 2016-05-26 Research Institute At Nationwide Children's Hospital Parallel-processing systems and methods for highly scalable analysis of biological sequence data
US10233490B2 (en) 2014-11-21 2019-03-19 Metabiotech Corporation Methods for assembling and reading nucleic acid sequences from mixed populations
EP3224384A4 (en) 2014-11-25 2018-04-18 Adaptive Biotechnologies Corp. Characterization of adaptive immune response to vaccination or infection using immune repertoire sequencing
CN114438172A (en) 2014-12-15 2022-05-06 亿明达股份有限公司 Compositions and methods for single molecule placement on a substrate
EP3234575B1 (en) 2014-12-18 2023-01-25 Life Technologies Corporation Apparatus for measuring analytes using large scale fet arrays
US10077472B2 (en) 2014-12-18 2018-09-18 Life Technologies Corporation High data rate integrated circuit with power management
KR102593647B1 (en) 2014-12-18 2023-10-26 라이프 테크놀로지스 코포레이션 High data rate integrated circuit with transmitter configuration
KR102321863B1 (en) 2015-01-12 2021-11-08 10엑스 제노믹스, 인크. Method and system for preparing nucleic acid sequencing library and library prepared using same
KR20170106979A (en) 2015-01-13 2017-09-22 10엑스 제노믹스, 인크. Systems and methods for visualizing structural variation and phasing information
GB2548763B (en) * 2015-02-02 2021-07-21 Hitachi High Tech Corp Multicolor fluorescence analysis device
CA2975855A1 (en) 2015-02-04 2016-08-11 Twist Bioscience Corporation Compositions and methods for synthetic gene assembly
CA2975852A1 (en) 2015-02-04 2016-08-11 Twist Bioscience Corporation Methods and devices for de novo oligonucleic acid assembly
SG11201705996PA (en) 2015-02-09 2017-09-28 10X Genomics Inc Systems and methods for determining structural variation and phasing using variant call data
CA2976786A1 (en) * 2015-02-17 2016-08-25 Complete Genomics, Inc. DNA sequencing using controlled strand displacement
US9715573B2 (en) 2015-02-17 2017-07-25 Dovetail Genomics, Llc Nucleic acid sequence assembly
AU2016222788B2 (en) 2015-02-24 2022-03-31 Adaptive Biotechnologies Corp. Methods for diagnosing infectious disease and determining HLA status using immune repertoire sequencing
EP4286516A3 (en) 2015-02-24 2024-03-06 10X Genomics, Inc. Partition processing methods and systems
EP3936619A1 (en) 2015-02-24 2022-01-12 10X Genomics, Inc. Methods for targeted nucleic acid sequence coverage
CN114250278A (en) 2015-03-13 2022-03-29 生命技术公司 Methods, compositions and kits for capturing, detecting and quantifying small RNAs
AU2016235288B2 (en) 2015-03-24 2019-02-28 Illumina Cambridge Limited Methods, carrier assemblies, and systems for imaging samples for biological or chemical analysis
US11807896B2 (en) 2015-03-26 2023-11-07 Dovetail Genomics, Llc Physical linkage preservation in DNA storage
WO2016161273A1 (en) 2015-04-01 2016-10-06 Adaptive Biotechnologies Corp. Method of identifying human compatible T cell receptors specific for an antigenic target
EP3901281B1 (en) 2015-04-10 2022-11-23 Spatial Transcriptomics AB Spatially distinguished, multiplex nucleic acid analysis of biological specimens
WO2016172377A1 (en) 2015-04-21 2016-10-27 Twist Bioscience Corporation Devices and methods for oligonucleic acid library synthesis
US11479812B2 (en) 2015-05-11 2022-10-25 Natera, Inc. Methods and compositions for determining ploidy
EP4190912A1 (en) 2015-05-11 2023-06-07 Illumina, Inc. Platform for discovery and analysis of therapeutic agents
EP4220645A3 (en) 2015-05-14 2023-11-08 Life Technologies Corporation Barcode sequences, and related systems and methods
EP3302804B1 (en) 2015-05-29 2022-07-13 Illumina, Inc. Sample carrier and assay system for conducting designated reactions
US10545139B2 (en) 2015-06-16 2020-01-28 Curiox Biosystems Pte Ltd. Methods and devices for performing biological assays using magnetic components
CN107924121B (en) 2015-07-07 2021-06-08 亿明达股份有限公司 Selective surface patterning via nanoimprinting
US20180207920A1 (en) 2015-07-17 2018-07-26 Illumina, Inc. Polymer sheets for sequencing applications
BR112017023418A2 (en) 2015-07-30 2018-07-24 Illumina, Inc. Orthogonal deblocking of nucleotides
CN108474805A (en) 2015-08-24 2018-08-31 亿明达股份有限公司 In-line pressure accumulator and flow-control system for biological or chemical assays
US10906044B2 (en) 2015-09-02 2021-02-02 Illumina Cambridge Limited Methods of improving droplet operations in fluidic systems with a filler fluid including a surface regenerative silane
KR20180050411A (en) 2015-09-18 2018-05-14 트위스트 바이오사이언스 코포레이션 Oligonucleotide mutant library and its synthesis
CN113604546A (en) 2015-09-22 2021-11-05 特韦斯特生物科学公司 Flexible substrates for nucleic acid synthesis
EP3359693A4 (en) 2015-10-09 2019-03-06 Accuragen Holdings Limited Methods and compositions for enrichment of amplification products
SG11201803289VA (en) 2015-10-19 2018-05-30 Dovetail Genomics Llc Methods for genome assembly, haplotype phasing, and target independent nucleic acid detection
MX2018005611A (en) 2015-11-03 2018-11-09 Harvard College Method and apparatus for volumetric imaging of a three-dimensional nucleic acid containing matrix.
US10253352B2 (en) 2015-11-17 2019-04-09 Omniome, Inc. Methods for determining sequence profiles
US11371094B2 (en) 2015-11-19 2022-06-28 10X Genomics, Inc. Systems and methods for nucleic acid processing using degenerate nucleotides
WO2017095958A1 (en) 2015-12-01 2017-06-08 Twist Bioscience Corporation Functionalized surfaces and preparation thereof
CN115927547A (en) * 2015-12-03 2023-04-07 安可济控股有限公司 Methods and compositions for forming ligation products
EP3882357B1 (en) 2015-12-04 2022-08-10 10X Genomics, Inc. Methods and compositions for nucleic acid analysis
CA3008031A1 (en) 2016-01-11 2017-07-20 Illumina Singapore Pte Ltd Detection apparatus having a microfluorometer, a fluidic system, and a flow cell latch clamp module
CN108779491B (en) 2016-02-11 2021-03-09 10X基因组学有限公司 Systems, methods, and media for de novo assembly of whole genome sequence data
CA3014911A1 (en) 2016-02-23 2017-08-31 Dovetail Genomics, Llc Generation of phased read-sets for genome assembly and haplotype phasing
JP6584986B2 (en) 2016-03-18 2019-10-02 株式会社東芝 Nucleic acid detection method
CA3210120C (en) 2016-04-25 2024-04-09 President And Fellows Of Harvard College Hybridization chain reaction methods for in situ molecular detection
US10619205B2 (en) 2016-05-06 2020-04-14 Life Technologies Corporation Combinatorial barcode sequences, and related systems and methods
IL262946B2 (en) 2016-05-13 2023-03-01 Dovetail Genomics Llc Recovering long-range linkage information from preserved samples
WO2017197338A1 (en) 2016-05-13 2017-11-16 10X Genomics, Inc. Microfluidic systems and methods of use
CN109511265B (en) * 2016-05-16 2023-07-14 安可济控股有限公司 Method for improving sequencing by strand identification
KR102171865B1 (en) 2016-05-18 2020-10-29 일루미나, 인코포레이티드 Self-assembled patterning using patterned hydrophobic surfaces
EP3465506B1 (en) 2016-06-01 2024-04-03 Life Technologies Corporation Methods and systems for designing gene panels
JP6889769B2 (en) * 2016-07-18 2021-06-18 エフ.ホフマン−ラ ロシュ アーゲー (F. Hoffmann−La Roche Aktiengesellschaft) Asymmetric templates and asymmetric methods of nucleic acid sequencing
SG11201901296TA (en) * 2016-08-15 2019-03-28 Accuragen Holdings Ltd Compositions and methods for detecting rare sequence variants
WO2018038772A1 (en) 2016-08-22 2018-03-01 Twist Bioscience Corporation De novo synthesized nucleic acid libraries
JP7239465B2 (en) 2016-08-31 2023-03-14 プレジデント アンド フェローズ オブ ハーバード カレッジ Methods for preparing nucleic acid sequence libraries for detection by fluorescence in situ sequencing
CN109923216A (en) 2016-08-31 2019-06-21 哈佛学院董事及会员团体 Methods of combining the detection of biomolecules into a single assay using fluorescent in situ sequencing
US10428325B1 (en) 2016-09-21 2019-10-01 Adaptive Biotechnologies Corporation Identification of antigen-specific B cell receptors
KR102217487B1 (en) 2016-09-21 2021-02-23 트위스트 바이오사이언스 코포레이션 Nucleic acid-based data storage
WO2018064116A1 (en) 2016-09-28 2018-04-05 Illumina, Inc. Methods and systems for data compression
US11485996B2 (en) 2016-10-04 2022-11-01 Natera, Inc. Methods for characterizing copy number variation using proximity-ligation sequencing
CN111781139B (en) 2016-10-14 2023-09-12 亿明达股份有限公司 Cartridge assembly
US11667951B2 (en) 2016-10-24 2023-06-06 Geneinfosec, Inc. Concealing information present within nucleic acids
KR102639137B1 (en) 2016-11-03 2024-02-21 엠쥐아이 테크 컴퍼니 엘티디. Biosensors for biological or chemical analysis and methods of manufacturing the same
WO2018093780A1 (en) 2016-11-16 2018-05-24 Illumina, Inc. Validation methods and systems for sequence variant calls
WO2018102759A1 (en) 2016-12-01 2018-06-07 Ignite Biosciences, Inc. Methods of assaying proteins
US10011870B2 (en) 2016-12-07 2018-07-03 Natera, Inc. Compositions and methods for identifying nucleic acid molecules
EA201991262A1 (en) 2016-12-16 2020-04-07 Твист Байосайенс Корпорейшн Variant libraries of the immunological synapse and synthesis thereof
US10815525B2 (en) 2016-12-22 2020-10-27 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10550429B2 (en) 2016-12-22 2020-02-04 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10011872B1 (en) 2016-12-22 2018-07-03 10X Genomics, Inc. Methods and systems for processing polynucleotides
IL267836B2 (en) 2017-01-04 2023-09-01 Complete Genomics Inc Stepwise sequencing by non-labeled reversible terminators or natural nucleotides
GB201704754D0 (en) 2017-01-05 2017-05-10 Illumina Inc Kinetic exclusion amplification of nucleic acid libraries
EP4310183A3 (en) 2017-01-30 2024-02-21 10X Genomics, Inc. Methods and systems for droplet-based single cell barcoding
US10995333B2 (en) 2017-02-06 2021-05-04 10X Genomics, Inc. Systems and methods for nucleic acid preparation
WO2018152162A1 (en) 2017-02-15 2018-08-23 Omniome, Inc. Distinguishing sequences by detecting polymerase dissociation
CA3054303A1 (en) 2017-02-22 2018-08-30 Twist Bioscience Corporation Nucleic acid based data storage
CA3056388A1 (en) 2017-03-15 2018-09-20 Twist Bioscience Corporation Variant libraries of the immunological synapse and synthesis thereof
JP7305611B2 (en) * 2017-03-17 2023-07-10 アプトン バイオシステムズ インコーポレイテッド Methods of sequencing and high-resolution imaging
AU2018237066B2 (en) 2017-03-20 2022-09-15 Mgi Tech Co., Ltd. Biosensors for biological or chemical analysis and methods of manufacturing the same
WO2018187013A1 (en) 2017-04-04 2018-10-11 Omniome, Inc. Fluidic apparatus and methods useful for chemical and biological reactions
KR102446247B1 (en) 2017-04-05 2022-09-21 큐리옥스 바이오시스템즈 피티이 엘티디. Methods, devices and apparatus for cleaning samples on array plates
US10161003B2 (en) 2017-04-25 2018-12-25 Omniome, Inc. Methods and apparatus that increase sequencing-by-binding efficiency
EP3625715A4 (en) 2017-05-19 2021-03-17 10X Genomics, Inc. Systems and methods for analyzing datasets
US20180340169A1 (en) 2017-05-26 2018-11-29 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
EP4230746A3 (en) 2017-05-26 2023-11-01 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
US10538808B2 (en) 2017-05-26 2020-01-21 Vibrant Holdings, Llc Photoactive compounds and methods for biomolecule detection and sequencing
WO2018231864A1 (en) 2017-06-12 2018-12-20 Twist Bioscience Corporation Methods for seamless nucleic acid assembly
KR20240013290A (en) 2017-06-12 2024-01-30 트위스트 바이오사이언스 코포레이션 Methods for seamless nucleic acid assembly
JP6991504B2 (en) * 2017-07-13 2022-01-12 国立研究開発法人産業技術総合研究所 Target substance detection device and target substance detection method
WO2019023948A1 (en) 2017-08-01 2019-02-07 深圳华大智造科技有限公司 Gene sequencing reaction device, gene sequencing system, and gene sequencing reaction method
US10858701B2 (en) 2017-08-15 2020-12-08 Omniome, Inc. Scanning apparatus and method useful for detection of chemical and biological analytes
US11125748B2 (en) * 2017-09-01 2021-09-21 University Of British Columbia Method for organizing individual molecules on a patterned substrate and structures assembled thereby
GB2581620A (en) 2017-09-11 2020-08-26 Twist Bioscience Corp GPCR binding proteins and synthesis thereof
EP3685426A4 (en) 2017-09-19 2021-06-09 MGI Tech Co., Ltd. Wafer level sequencing flow cell fabrication
WO2019060771A2 (en) 2017-09-22 2019-03-28 University Of Washington In situ combinatorial labeling of cellular molecules
GB2566986A (en) 2017-09-29 2019-04-03 Evonetix Ltd Error detection during hybridisation of target double-stranded nucleic acid
US10837047B2 (en) 2017-10-04 2020-11-17 10X Genomics, Inc. Compositions, methods, and systems for bead formation using improved polymers
US11485966B2 (en) 2017-10-11 2022-11-01 Mgi Tech Co., Ltd. Method for improving loading and stability of nucleic acid
KR102362711B1 (en) 2017-10-16 2022-02-14 일루미나, 인코포레이티드 Deep Convolutional Neural Networks for Variant Classification
EP3622519B1 (en) 2017-10-16 2023-09-13 Illumina, Inc. Deep learning-based aberrant splicing detection
US11193166B2 (en) 2017-10-19 2021-12-07 Omniome, Inc. Simultaneous background reduction and complex stabilization in binding assay workflows
CN111565834B (en) 2017-10-20 2022-08-26 特韦斯特生物科学公司 Heated nanopores for polynucleotide synthesis
WO2019084043A1 (en) 2017-10-26 2019-05-02 10X Genomics, Inc. Methods and systems for nucleic acid preparation and chromatin analysis
EP4241882A3 (en) 2017-10-27 2023-12-06 10X Genomics, Inc. Methods for sample preparation and analysis
SG11201913654QA (en) 2017-11-15 2020-01-30 10X Genomics Inc Functionalized gel beads
US10829815B2 (en) 2017-11-17 2020-11-10 10X Genomics, Inc. Methods and systems for associating physical and genetic properties of biological particles
US11254980B1 (en) 2017-11-29 2022-02-22 Adaptive Biotechnologies Corporation Methods of profiling targeted polynucleotides while mitigating sequencing depth requirements
WO2019108851A1 (en) 2017-11-30 2019-06-06 10X Genomics, Inc. Systems and methods for nucleic acid preparation and analysis
CA3044782A1 (en) 2017-12-29 2019-06-29 Clear Labs, Inc. Automated priming and library loading device
EP3735459A4 (en) 2018-01-04 2021-10-06 Twist Bioscience Corporation DNA-based digital information storage
US11561196B2 (en) 2018-01-08 2023-01-24 Illumina, Inc. Systems and devices for high-throughput sequencing with semiconductor-based detection
NZ759650A (en) 2018-01-08 2022-07-01 Illumina Inc High-throughput sequencing with semiconductor-based detection
CA3065939A1 (en) 2018-01-15 2019-07-18 Illumina, Inc. Deep learning-based variant classifier
US11366303B2 (en) 2018-01-30 2022-06-21 Rebus Biosystems, Inc. Method for detecting particles using structured illumination
WO2019157529A1 (en) 2018-02-12 2019-08-15 10X Genomics, Inc. Methods characterizing multiple analytes from individual cells or cell populations
US11639928B2 (en) 2018-02-22 2023-05-02 10X Genomics, Inc. Methods and systems for characterizing analytes from individual cells or cell populations
US11203782B2 (en) 2018-03-29 2021-12-21 Accuragen Holdings Limited Compositions and methods comprising asymmetric barcoding
AU2019247652A1 (en) * 2018-04-02 2020-10-15 Enumera Molecular, Inc. Methods, systems, and compositions for counting nucleic acid molecules
WO2019195197A1 (en) 2018-04-02 2019-10-10 Dropworks, Inc. Systems and methods for serial flow emulsion processes
EP3775196A4 (en) 2018-04-04 2021-12-22 Nautilus Biotechnology, Inc. Methods of generating nanoarrays and microarrays
SG11202009889VA (en) 2018-04-06 2020-11-27 10X Genomics Inc Systems and methods for quality control in single cell processing
WO2019200338A1 (en) 2018-04-12 2019-10-17 Illumina, Inc. Variant classifier based on deep neural networks
CN108444966A (en) * 2018-04-16 2018-08-24 成都博奥晶芯生物科技有限公司 Concentration-gradient fluorescence calibration slide for a laser microarray chip scanner
AU2019255987A1 (en) 2018-04-19 2020-12-10 Pacific Biosciences Of California, Inc. Improving accuracy of base calls in nucleic acid sequencing methods
WO2019209426A1 (en) 2018-04-26 2019-10-31 Omniome, Inc. Methods and compositions for stabilizing nucleic acid-nucleotide-polymerase complexes
CN112639094A (en) 2018-05-08 2021-04-09 深圳华大智造科技股份有限公司 Single-tube bead-based DNA co-barcoding for accurate and cost-effective sequencing, haplotyping and assembly
CA3100739A1 (en) 2018-05-18 2019-11-21 Twist Bioscience Corporation Polynucleotides, reagents, and methods for nucleic acid hybridization
US11339428B2 (en) 2018-05-31 2022-05-24 Pacific Biosciences Of California, Inc. Increased signal to noise in nucleic acid sequencing
FI3810774T3 (en) 2018-06-04 2023-12-11 Illumina Inc Methods of making high-throughput single-cell transcriptome libraries
US11932899B2 (en) 2018-06-07 2024-03-19 10X Genomics, Inc. Methods and systems for characterizing nucleic acid molecules
US11703427B2 (en) 2018-06-25 2023-07-18 10X Genomics, Inc. Methods and systems for cell and bead processing
JP2020000197A (en) * 2018-06-29 2020-01-09 キヤノン株式会社 Counting method, concentration measuring apparatus, and concentration measuring system
US11525159B2 (en) 2018-07-03 2022-12-13 Natera, Inc. Methods for detection of donor-derived cell-free DNA
EP3818166A4 (en) * 2018-07-05 2022-03-30 AccuraGen Holdings Limited Compositions and methods for digital polymerase chain reaction
US20200251183A1 (en) 2018-07-11 2020-08-06 Illumina, Inc. Deep Learning-Based Framework for Identifying Sequence Patterns that Cause Sequence-Specific Errors (SSEs)
WO2020023362A1 (en) 2018-07-24 2020-01-30 Omniome, Inc. Serial formation of ternary complex species
US20200032335A1 (en) 2018-07-27 2020-01-30 10X Genomics, Inc. Systems and methods for metabolome analysis
WO2020028194A1 (en) 2018-07-30 2020-02-06 Readcoor, Inc. Methods and systems for sample processing or analysis
US11519033B2 (en) 2018-08-28 2022-12-06 10X Genomics, Inc. Method for transposase-mediated spatial tagging and analyzing genomic DNA in a biological sample
KR20210047364A (en) * 2018-09-19 2021-04-29 앱톤 바이오시스템즈, 인코포레이티드 Densely packed analyte layer and detection method
CN113705585A (en) 2018-10-15 2021-11-26 因美纳有限公司 Method and system based on neural network implementation
US20200149095A1 (en) * 2018-11-14 2020-05-14 Element Biosciences, Inc. Low binding supports for improved solid-phase dna hybridization and amplification
US10876148B2 (en) 2018-11-14 2020-12-29 Element Biosciences, Inc. De novo surface preparation and uses thereof
US10704094B1 (en) 2018-11-14 2020-07-07 Element Biosciences, Inc. Multipart reagents having increased avidity for polymerase binding
GB2613480B (en) * 2018-11-15 2023-11-22 Element Biosciences Inc Methods for generating circular nucleic acid molecules
TW202043450A (en) 2018-11-15 2020-12-01 中國商深圳華大智造科技有限公司 System and method for integrated sensor cartridge
WO2020101795A1 (en) 2018-11-15 2020-05-22 Omniome, Inc. Electronic detection of nucleic acid structure
WO2020108588A1 (en) 2018-12-01 2020-06-04 Mgi Tech Co., Ltd. Methods and structures to improve light collection efficiency in biosensors
US10710076B2 (en) 2018-12-04 2020-07-14 Omniome, Inc. Mixed-phase fluids for nucleic acid sequencing and other analytical assays
WO2020123319A2 (en) 2018-12-10 2020-06-18 10X Genomics, Inc. Methods of using master / copy arrays for spatial detection
US11459607B1 (en) 2018-12-10 2022-10-04 10X Genomics, Inc. Systems and methods for processing-nucleic acid molecules from a single cell using sequential co-partitioning and composite barcodes
US20200208214A1 (en) 2018-12-19 2020-07-02 Illumina, Inc. Methods for improving polynucleotide cluster clonality
CN113227348A (en) 2018-12-20 2021-08-06 欧姆尼欧美公司 Temperature control for analysis of nucleic acids and other analytes
US11649485B2 (en) 2019-01-06 2023-05-16 10X Genomics, Inc. Generating capture probes for spatial analysis
US11926867B2 (en) 2019-01-06 2024-03-12 10X Genomics, Inc. Generating capture probes for spatial analysis
US11845983B1 (en) 2019-01-09 2023-12-19 10X Genomics, Inc. Methods and systems for multiplexing of droplet based assays
WO2020157684A1 (en) 2019-01-29 2020-08-06 Mgi Tech Co., Ltd. High coverage stlfr
EP3924505A1 (en) 2019-02-12 2021-12-22 10X Genomics, Inc. Methods for processing nucleic acid molecules
US11851683B1 (en) 2019-02-12 2023-12-26 10X Genomics, Inc. Methods and systems for selective analysis of cellular samples
US11467153B2 (en) 2019-02-12 2022-10-11 10X Genomics, Inc. Methods for processing nucleic acid molecules
US11499189B2 (en) 2019-02-14 2022-11-15 Pacific Biosciences Of California, Inc. Mitigating adverse impacts of detection systems on nucleic acids and other biological analytes
US11680950B2 (en) 2019-02-20 2023-06-20 Pacific Biosciences Of California, Inc. Scanning apparatus and methods for detecting chemical and biological analytes
US11655499B1 (en) 2019-02-25 2023-05-23 10X Genomics, Inc. Detection of sequence elements in nucleic acid molecules
KR20210144698A (en) 2019-02-26 2021-11-30 트위스트 바이오사이언스 코포레이션 Variant Nucleic Acid Libraries for Antibody Optimization
EP3930753A4 (en) 2019-02-26 2023-03-29 Twist Bioscience Corporation Variant nucleic acid libraries for glp1 receptor
SG11202102530QA (en) 2019-03-01 2021-04-29 Illumina Inc High-throughput single-nuclei and single-cell libraries and methods of making and of using
WO2020180813A1 (en) 2019-03-06 2020-09-10 Qiagen Sciences, Llc Compositions and methods for adaptor design and nucleic acid library construction for rolony-based sequencing
EP3938537A1 (en) 2019-03-11 2022-01-19 10X Genomics, Inc. Systems and methods for processing optically tagged beads
WO2020205296A1 (en) 2019-03-21 2020-10-08 Illumina, Inc. Artificial intelligence-based generation of sequencing metadata
US11210554B2 (en) 2019-03-21 2021-12-28 Illumina, Inc. Artificial intelligence-based generation of sequencing metadata
US11783917B2 (en) 2019-03-21 2023-10-10 Illumina, Inc. Artificial intelligence-based base calling
NL2023316B1 (en) 2019-03-21 2020-09-28 Illumina Inc Artificial intelligence-based sequencing
EP3947718A4 (en) 2019-04-02 2022-12-21 Enumera Molecular, Inc. Methods, systems, and compositions for counting nucleic acid molecules
KR20220004947A (en) 2019-04-29 2022-01-12 일루미나, 인코포레이티드 Identification and Analysis of Microbial Samples by Rapid Incubation and Nucleic Acid Enrichment
US11423306B2 (en) 2019-05-16 2022-08-23 Illumina, Inc. Systems and devices for characterization and performance analysis of pixel-based sequencing
US11593649B2 (en) 2019-05-16 2023-02-28 Illumina, Inc. Base calling using convolutions
WO2020252186A1 (en) 2019-06-11 2020-12-17 Omniome, Inc. Calibrated focus sensing
CN114729342A (en) 2019-06-21 2022-07-08 特韦斯特生物科学公司 Barcode-based nucleic acid sequence assembly
WO2020264260A1 (en) * 2019-06-27 2020-12-30 Zymergen Inc. Laboratory automation system implementing efficient path for material and labware transfers
US11377655B2 (en) 2019-07-16 2022-07-05 Pacific Biosciences Of California, Inc. Synthetic nucleic acids having non-natural structures
US10656368B1 (en) 2019-07-24 2020-05-19 Omniome, Inc. Method and system for biological imaging using a wide field objective lens
US20230220466A1 (en) * 2019-09-10 2023-07-13 The Regents Of The University Of California Immune cell sequencing methods
TW202124406A (en) 2019-09-10 2021-07-01 美商歐姆尼歐美公司 Reversible modification of nucleotides
EP4045683A1 (en) 2019-10-18 2022-08-24 Omniome, Inc. Methods and compositions for capping nucleic acids
EP4025711A2 (en) 2019-11-08 2022-07-13 10X Genomics, Inc. Enhancing specificity of analyte binding
WO2021091611A1 (en) 2019-11-08 2021-05-14 10X Genomics, Inc. Spatially-tagged analyte capture agents for analyte multiplexing
BR112021019640A2 (en) 2019-12-19 2022-06-21 Illumina Inc High-throughput single cell libraries and methods of preparation and use
SG11202106899SA (en) 2019-12-23 2021-09-29 10X Genomics Inc Methods for spatial analysis using rna-templated ligation
US20210189483A1 (en) 2019-12-23 2021-06-24 Mgi Tech Co. Ltd. Controlled strand-displacement for paired end sequencing
US11732299B2 (en) 2020-01-21 2023-08-22 10X Genomics, Inc. Spatial assays with perturbed cells
US11821035B1 (en) 2020-01-29 2023-11-21 10X Genomics, Inc. Compositions and methods of making gene expression libraries
US11898205B2 (en) 2020-02-03 2024-02-13 10X Genomics, Inc. Increasing capture efficiency of spatial assays
US20230054204A1 (en) 2020-02-04 2023-02-23 Pacific Biosciences Of California, Inc. Flow cells and methods for their manufacture and use
US11732300B2 (en) 2020-02-05 2023-08-22 10X Genomics, Inc. Increasing efficiency of spatial analysis in a biological sample
US11835462B2 (en) 2020-02-11 2023-12-05 10X Genomics, Inc. Methods and compositions for partitioning a biological sample
IL295560A (en) 2020-02-20 2022-10-01 Illumina Inc Artificial intelligence-based many-to-many base calling
US11891654B2 (en) 2020-02-24 2024-02-06 10X Genomics, Inc. Methods of making gene expression libraries
US11926863B1 (en) 2020-02-27 2024-03-12 10X Genomics, Inc. Solid state single cell method for analyzing fixed biological cells
EP4114966A1 (en) 2020-03-03 2023-01-11 Pacific Biosciences Of California, Inc. Methods and compositions for sequencing double stranded nucleic acids
US11768175B1 (en) 2020-03-04 2023-09-26 10X Genomics, Inc. Electrophoretic methods for spatial analysis
WO2021188717A1 (en) * 2020-03-18 2021-09-23 Apton Biosystems, Inc. Systems and methods of detecting densely-packed analytes
WO2021216708A1 (en) 2020-04-22 2021-10-28 10X Genomics, Inc. Methods for spatial analysis using targeted rna depletion
US20230183798A1 (en) 2020-05-05 2023-06-15 Pacific Biosciences Of California, Inc. Compositions and methods for modifying polymerase-nucleic acid complexes
US11188778B1 (en) 2020-05-05 2021-11-30 Illumina, Inc. Equalization-based image processing and spatial crosstalk attenuator
US11851700B1 (en) 2020-05-13 2023-12-26 10X Genomics, Inc. Methods, kits, and compositions for processing extracellular molecules
US20230235392A1 (en) * 2020-05-20 2023-07-27 Element Biosciences, Inc. Methods for paired-end sequencing library preparation
WO2021237087A1 (en) 2020-05-22 2021-11-25 10X Genomics, Inc. Spatial analysis to detect sequence variants
EP4153775A1 (en) 2020-05-22 2023-03-29 10X Genomics, Inc. Simultaneous spatio-temporal measurement of gene expression and cellular activity
WO2021242834A1 (en) 2020-05-26 2021-12-02 10X Genomics, Inc. Method for resetting an array
EP4025692A2 (en) 2020-06-02 2022-07-13 10X Genomics, Inc. Nucleic acid library methods
WO2021247568A1 (en) 2020-06-02 2021-12-09 10X Genomics, Inc. Spatial transcriptomics for antigen-receptors
EP4162074B1 (en) 2020-06-08 2024-04-24 10X Genomics, Inc. Methods of determining a surgical margin and methods of use thereof
WO2021252617A1 (en) 2020-06-09 2021-12-16 Illumina, Inc. Methods for increasing yield of sequencing libraries
EP4165207A1 (en) 2020-06-10 2023-04-19 10X Genomics, Inc. Methods for determining a location of an analyte in a biological sample
CN116034166A (en) 2020-06-25 2023-04-28 10X基因组学有限公司 Spatial analysis of DNA methylation
US11761038B1 (en) 2020-07-06 2023-09-19 10X Genomics, Inc. Methods for identifying a location of an RNA in a biological sample
GB202010691D0 (en) * 2020-07-10 2020-08-26 Vidya Holdings Ltd Improvements in or relating to an apparatus for imaging
US20240043913A1 (en) 2020-07-17 2024-02-08 The Regents Of The University Of Michigan Materials and methods for localized detection of nucleic acids in a tissue sample
US11926822B1 (en) 2020-09-23 2024-03-12 10X Genomics, Inc. Three-dimensional spatial analysis
KR20230118570A (en) 2020-11-11 2023-08-11 노틸러스 서브시디어리, 인크. Affinity reagents with enhanced binding and detection properties
US11827935B1 (en) 2020-11-19 2023-11-28 10X Genomics, Inc. Methods for spatial analysis using rolling circle amplification and detection probes
AU2021409136A1 (en) 2020-12-21 2023-06-29 10X Genomics, Inc. Methods, compositions, and systems for capturing probes and/or barcodes
AU2022227563A1 (en) 2021-02-23 2023-08-24 10X Genomics, Inc. Probe-based analysis of nucleic acids and proteins
JP2024514405A (en) 2021-03-11 2024-04-02 ノーティラス・サブシディアリー・インコーポレイテッド Systems and methods for biomolecule retention
WO2022197754A1 (en) 2021-03-16 2022-09-22 Illumina Software, Inc. Neural network parameter quantization for base calling
WO2022198068A1 (en) 2021-03-18 2022-09-22 10X Genomics, Inc. Multiplex capture of gene and protein expression from a biological sample
CA3210451A1 (en) 2021-03-22 2022-09-29 Illumina Cambridge Limited Methods for improving nucleic acid cluster clonality
US20220336054A1 (en) 2021-04-15 2022-10-20 Illumina, Inc. Deep Convolutional Neural Networks to Predict Variant Pathogenicity using Three-Dimensional (3D) Protein Structures
US20220336052A1 (en) * 2021-04-19 2022-10-20 University Of Utah Research Foundation Systems and methods for facilitating rapid genome sequence analysis
EP4352286A1 (en) * 2021-06-11 2024-04-17 Cellanome, Inc. Systems and methods for analyzing biological samples
EP4355476A1 (en) 2021-06-15 2024-04-24 Illumina, Inc. Hydrogel-free surface functionalization for sequencing
US11859241B2 (en) 2021-06-17 2024-01-02 Element Biosciences, Inc. Compositions and methods for pairwise sequencing
US11427855B1 (en) 2021-06-17 2022-08-30 Element Biosciences, Inc. Compositions and methods for pairwise sequencing
WO2023278184A1 (en) 2021-06-29 2023-01-05 Illumina, Inc. Methods and systems to correct crosstalk in illumination emitted from reaction sites
US20230027409A1 (en) 2021-07-13 2023-01-26 Illumina, Inc. Methods and systems for real time extraction of crosstalk in illumination emitted from reaction sites
US11455487B1 (en) 2021-10-26 2022-09-27 Illumina Software, Inc. Intensity extraction and crosstalk attenuation using interpolation and adaptation for base calling
US20230116852A1 (en) 2021-07-23 2023-04-13 Illumina, Inc. Methods for preparing substrate surface for dna sequencing
WO2023034489A1 (en) 2021-09-01 2023-03-09 10X Genomics, Inc. Methods, compositions, and kits for blocking a capture probe on a spatial array
WO2023069927A1 (en) 2021-10-20 2023-04-27 Illumina, Inc. Methods for capturing library dna for sequencing
US20230167488A1 (en) 2021-11-30 2023-06-01 Nautilus Biotechnology, Inc. Particle-based isolation of proteins and other analytes
WO2023141154A1 (en) 2022-01-20 2023-07-27 Illumina Cambridge Limited Methods of detecting methylcytosine and hydroxymethylcytosine by sequencing
WO2023196572A1 (en) 2022-04-07 2023-10-12 Illumina Singapore Pte. Ltd. Altered cytidine deaminases and methods of use
WO2024015962A1 (en) 2022-07-15 2024-01-18 Pacific Biosciences Of California, Inc. Blocked asymmetric hairpin adaptors
WO2024030954A1 (en) 2022-08-03 2024-02-08 Nautilus Subsidiary, Inc. Chemical modification of antibodies and functional fragments thereof
WO2024040058A1 (en) 2022-08-15 2024-02-22 Element Biosciences, Inc. Methods for preparing nucleic acid nanostructures using compaction oligonucleotides
WO2024072614A1 (en) 2022-09-27 2024-04-04 Nautilus Subsidiary, Inc. Polypeptide capture, in situ fragmentation and identification
WO2024073047A1 (en) 2022-09-30 2024-04-04 Illumina, Inc. Cytidine deaminases and methods of use in mapping modified cytosine nucleotides
WO2024073043A1 (en) 2022-09-30 2024-04-04 Illumina, Inc. Methods of using cpg binding proteins in mapping modified cytosine nucleotides
WO2024069581A1 (en) 2022-09-30 2024-04-04 Illumina Singapore Pte. Ltd. Helicase-cytidine deaminase complexes and methods of use
CN115409174B (en) * 2022-11-01 2023-03-31 之江实验室 Base sequence filtering method and device based on DRAM memory calculation

Family Cites Families (413)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3356882A (en) 1965-10-21 1967-12-05 Ford Motor Co Spark plug having the center electrode sheath with a nickel alloy
US3958144A (en) 1973-10-01 1976-05-18 Franks Harry E Spark plug
JPS5180845A (en) 1975-01-14 1976-07-15 Teijin Ltd Method for holding 2,2-bis[3,5-dibromo-4-(2-hydroxyethoxy)phenyl]propane in a molten state
US4318846A (en) 1979-09-07 1982-03-09 Syva Company Novel ether substituted fluorescein polyamino acid compounds as fluorescers and quenchers
US4469863A (en) 1980-11-12 1984-09-04 Ts O Paul O P Nonionic nucleic acid alkyl and aryl phosphonates and processes for manufacture and use thereof
JPS5965108A (en) 1982-10-06 1984-04-13 Sumikin Kozai Kogyo Kk Principal rafter-shaped steel dam and its construction
US4994373A (en) * 1983-01-27 1991-02-19 Enzo Biochem, Inc. Method and structures employing chemically-labelled polynucleotide probes
US4605735A (en) 1983-02-14 1986-08-12 Wakunaga Seiyaku Kabushiki Kaisha Oligonucleotide derivatives
US4948882A (en) 1983-02-22 1990-08-14 Syngene, Inc. Single-stranded labelled oligonucleotides, reactive monomers and methods of synthesis
US4719179A (en) 1984-11-30 1988-01-12 Pharmacia P-L Biochemicals, Inc. Six base oligonucleotide linkers and methods for their use
US4883750A (en) 1984-12-13 1989-11-28 Applied Biosystems, Inc. Detection of specific sequences in nucleic acids
US5034506A (en) 1985-03-15 1991-07-23 Anti-Gene Development Group Uncharged morpholino-based polymers having achiral intersubunit linkages
US5235033A (en) 1985-03-15 1993-08-10 Anti-Gene Development Group Alpha-morpholino ribonucleoside derivatives and polymers thereof
US4683202A (en) 1985-03-28 1987-07-28 Cetus Corporation Process for amplifying nucleic acid sequences
US4683195A (en) 1986-01-30 1987-07-28 Cetus Corporation Process for amplifying, detecting, and/or-cloning nucleic acid sequences
US4965188A (en) 1986-08-22 1990-10-23 Cetus Corporation Process for amplifying, detecting, and/or cloning nucleic acid sequences using a thermostable enzyme
US4757141A (en) 1985-08-26 1988-07-12 Applied Biosystems, Incorporated Amino-derivatized phosphite and phosphate linking agents, phosphoramidite precursors, and useful conjugates thereof
US5093232A (en) 1985-12-11 1992-03-03 Chiron Corporation Nucleic acid probes
US4800159A (en) 1986-02-07 1989-01-24 Cetus Corporation Process for amplifying, detecting, and/or cloning nucleic acid sequences
US4925785A (en) 1986-03-07 1990-05-15 Biotechnica Diagnostics, Inc. Nucleic acid hybridization assays
US5091519A (en) 1986-05-01 1992-02-25 Amoco Corporation Nucleotide compositions with linking groups
US5151507A (en) 1986-07-02 1992-09-29 E. I. Du Pont De Nemours And Company Alkynylamino-nucleotides
US4725254A (en) 1986-11-24 1988-02-16 Allied Corporation Method for manufacturing a center electrode for a spark plug
US6270961B1 (en) * 1987-04-01 2001-08-07 Hyseq, Inc. Methods and apparatus for DNA sequencing and DNA identification
US5525464A (en) 1987-04-01 1996-06-11 Hyseq, Inc. Method of sequencing by hybridization of oligonucleotide probes
US5202231A (en) 1987-04-01 1993-04-13 Drmanac Radoje T Method of sequencing of genomes by hybridization of oligonucleotide probes
US4942124A (en) 1987-08-11 1990-07-17 President And Fellows Of Harvard College Multiplex sequencing
US5124246A (en) * 1987-10-15 1992-06-23 Chiron Corporation Nucleic acid multimers and amplified nucleic acid hybridization assays using same
US4886741A (en) 1987-12-09 1989-12-12 Microprobe Corporation Use of volume exclusion agents for the enhancement of in situ hybridization
US5354657A (en) 1988-01-12 1994-10-11 Boehringer Mannheim Gmbh Process for the highly specific detection of nucleic acids in solid
DE3813278A1 (en) 1988-01-12 1989-07-20 Boehringer Mannheim Gmbh METHOD FOR DETECTING NUCLEIC ACIDS
US5216141A (en) 1988-06-06 1993-06-01 Benner Steven A Oligonucleotide analogs containing sulfur linkages
US5168038A (en) 1988-06-17 1992-12-01 The Board Of Trustees Of The Leland Stanford Junior University In situ transcription in cells and tissues
US5066580A (en) 1988-08-31 1991-11-19 Becton Dickinson And Company Xanthene dyes that emit to the red of fluorescein
GB8822228D0 (en) 1988-09-21 1988-10-26 Southern E M Support-bound oligonucleotides
US5194599A (en) 1988-09-23 1993-03-16 Gilead Sciences, Inc. Hydrogen phosphonodithioate compositions
DE3836656A1 (en) 1988-10-27 1990-05-03 Boehringer Mannheim Gmbh NEW DIGOXIGENINE DERIVATIVES AND THEIR USE
US5091302A (en) * 1989-04-27 1992-02-25 The Blood Center Of Southeastern Wisconsin, Inc. Polymorphism of human platelet membrane glycoprotein iiia and diagnostic and therapeutic applications thereof
US5424186A (en) 1989-06-07 1995-06-13 Affymax Technologies N.V. Very large scale immobilized polymer synthesis
US6346413B1 (en) * 1989-06-07 2002-02-12 Affymetrix, Inc. Polymer arrays
US5527681A (en) 1989-06-07 1996-06-18 Affymax Technologies N.V. Immobilized molecular synthesis of systematically substituted compounds
US5143854A (en) 1989-06-07 1992-09-01 Affymax Technologies N.V. Large scale photolithographic solid phase synthesis of polypeptides and receptor binding screening thereof
US5800992A (en) 1989-06-07 1998-09-01 Fodor; Stephen P.A. Method of detecting nucleic acids
US5242974A (en) 1991-11-22 1993-09-07 Affymax Technologies N.V. Polymer reversal on solid surfaces
US6379895B1 (en) * 1989-06-07 2002-04-30 Affymetrix, Inc. Photolithographic and other means for manufacturing arrays
US5744101A (en) * 1989-06-07 1998-04-28 Affymax Technologies N.V. Photolabile nucleoside protecting groups
CA2020958C (en) 1989-07-11 2005-01-11 Daniel L. Kacian Nucleic acid sequence amplification methods
DE3924454A1 (en) 1989-07-24 1991-02-07 Cornelis P Prof Dr Hollenberg THE APPLICATION OF DNA AND DNA TECHNOLOGY FOR THE CONSTRUCTION OF NETWORKS FOR USE IN CHIP CONSTRUCTION AND CHIP PRODUCTION (DNA CHIPS)
JPH03101086A (en) 1989-09-14 1991-04-25 Ngk Spark Plug Co Ltd Spark plug for internal combustion engine
US5366860A (en) 1989-09-29 1994-11-22 Applied Biosystems, Inc. Spectrally resolvable rhodamine dyes for nucleic acid sequence determination
US5003097A (en) 1989-10-02 1991-03-26 The United States Of America As Represented By The Department Of Health And Human Services Method for the sulfurization of phosphorous groups in compounds
US5252743A (en) 1989-11-13 1993-10-12 Affymax Technologies N.V. Spatially-addressable immobilization of anti-ligands on surfaces
US5188934A (en) 1989-11-14 1993-02-23 Applied Biosystems, Inc. 4,7-dichlorofluorescein dyes as molecular probes
US5166387A (en) 1990-01-12 1992-11-24 Applied Biosystems, Inc. Method of synthesizing sulfurized oligonucleotide analogs with thiuram disulfides
US5427930A (en) * 1990-01-26 1995-06-27 Abbott Laboratories Amplification of target nucleic acids using gap filling ligase chain reaction
CA2036946C (en) 1990-04-06 2001-10-16 Kenneth V. Deugau Indexing linkers
US5151510A (en) 1990-04-20 1992-09-29 Applied Biosystems, Inc. Method of synthesizing sulfurized oligonucleotide analogs
GB9009980D0 (en) 1990-05-03 1990-06-27 Amersham Int Plc Phosphoramidite derivatives,their preparation and the use thereof in the incorporation of reporter groups on synthetic oligonucleotides
US5073562A (en) 1990-05-10 1991-12-17 G. D. Searle & Co. Alkoxy-substituted dihydrobenzopyran-2-carboxylic acids and derivatives thereof
US5091320A (en) * 1990-06-15 1992-02-25 Bell Communications Research, Inc. Ellipsometric control of material growth
WO1992001813A1 (en) 1990-07-25 1992-02-06 Syngene, Inc. Circular extension for generating multiple nucleic acid complements
US5386023A (en) 1990-07-27 1995-01-31 Isis Pharmaceuticals Backbone modified oligonucleotide analogs and preparation thereof through reductive coupling
US5602240A (en) 1990-07-27 1997-02-11 Ciba Geigy Ag. Backbone modified oligonucleotide analogs
US5210015A (en) 1990-08-06 1993-05-11 Hoffman-La Roche Inc. Homogeneous assay system using the nuclease activity of a nucleic acid polymerase
JP3080178B2 (en) 1991-02-18 2000-08-21 東洋紡績株式会社 Method for amplifying nucleic acid sequence and reagent kit therefor
US5426180A (en) 1991-03-27 1995-06-20 Research Corporation Technologies, Inc. Methods of making single-stranded circular oligonucleotides
JP3085409B2 (en) 1991-03-29 2000-09-11 東洋紡績株式会社 Method for detecting target nucleic acid sequence and reagent kit therefor
US5474796A (en) * 1991-09-04 1995-12-12 Protogene Laboratories, Inc. Method and apparatus for conducting an array of chemical reactions on a support surface
US6589726B1 (en) * 1991-09-04 2003-07-08 Metrigen, Inc. Method and apparatus for in situ synthesis on a solid support
JP2001524926A (en) 1991-09-18 2001-12-04 アフィマックス テクノロジーズ ナームロゼ フェンノートシャップ Method for synthesizing a heterogeneous library of oligomers
IL103267A (en) 1991-09-24 2004-07-25 Keygene Nv Process and kit for amplification of restriction fragment obtained from starting dna
US5632957A (en) * 1993-11-01 1997-05-27 Nanogen Molecular biological diagnostic systems including electrodes
US5981179A (en) 1991-11-14 1999-11-09 Digene Diagnostics, Inc. Continuous amplification reaction
US5550215A (en) 1991-11-22 1996-08-27 Holmes; Christopher P. Polymer reversal on solid surfaces
US5384261A (en) 1991-11-22 1995-01-24 Affymax Technologies N.V. Very large scale immobilized polymer synthesis using mechanically directed flow paths
US5324633A (en) 1991-11-22 1994-06-28 Affymax Technologies N.V. Method and apparatus for measuring binding affinity
DE69233331T3 (en) 1991-11-22 2007-08-30 Affymetrix, Inc., Santa Clara Combinatorial Polymersynthesis Strategies
US5644048A (en) 1992-01-10 1997-07-01 Isis Pharmaceuticals, Inc. Process for preparing phosphorothioate oligonucleotides
EP1382386A3 (en) 1992-02-19 2004-12-01 The Public Health Research Institute Of The City Of New York, Inc. Novel oligonucleotide arrays and their use for sorting, isolating, sequencing, and manipulating nucleic acids
US5403708A (en) * 1992-07-06 1995-04-04 Brennan; Thomas M. Methods and compositions for determining the sequence of nucleic acids
GB9214873D0 (en) 1992-07-13 1992-08-26 Medical Res Council Process for categorising nucleotide sequence populations
US6261808B1 (en) * 1992-08-04 2001-07-17 Replicon, Inc. Amplification of nucleic acid molecules via circular replicons
US5834202A (en) * 1992-08-04 1998-11-10 Replicon, Inc. Methods for the isothermal amplification of nucleic acid molecules
WO1994003624A1 (en) * 1992-08-04 1994-02-17 Auerbach Jeffrey I Methods for the isothermal amplification of nucleic acid molecules
US5583211A (en) 1992-10-29 1996-12-10 Beckman Instruments, Inc. Surface activated organic polymers useful for location - specific attachment of nucleic acids, peptides, proteins and oligosaccharides
US5593826A (en) 1993-03-22 1997-01-14 Perkin-Elmer Corporation, Applied Biosystems, Inc. Enzymatic ligation of 3'amino-substituted oligonucleotides
US5491074A (en) 1993-04-01 1996-02-13 Affymax Technologies Nv Association peptides
ATE246702T1 (en) 1993-04-12 2003-08-15 Univ Northwestern METHOD FOR PREPARING OLIGONUCLEOTIDES
US5714320A (en) * 1993-04-15 1998-02-03 University Of Rochester Rolling circle synthesis of oligonucleotides and amplification of select randomized circular oligonucleotides
US6096880A (en) * 1993-04-15 2000-08-01 University Of Rochester Circular DNA vectors for synthesis of RNA and DNA
US6077668A (en) * 1993-04-15 2000-06-20 University Of Rochester Highly sensitive multimeric nucleic acid probes
US5858659A (en) 1995-11-29 1999-01-12 Affymetrix, Inc. Polymorphism detection
US5837832A (en) * 1993-06-25 1998-11-17 Affymetrix, Inc. Arrays of nucleic acid probes on biological chips
US5473060A (en) 1993-07-02 1995-12-05 Lynx Therapeutics, Inc. Oligonucleotide clamps having diagnostic applications
WO1995001365A1 (en) * 1993-07-02 1995-01-12 Lynx Therapeutics, Inc. Synthesis of branched nucleic acids
US6401267B1 (en) 1993-09-27 2002-06-11 Radoje Drmanac Methods and compositions for efficient nucleic acid sequencing
EP0723598B1 (en) 1993-09-27 2004-01-14 Arch Development Corporation Methods and compositions for efficient nucleic acid sequencing
US5472672A (en) 1993-10-22 1995-12-05 The Board Of Trustees Of The Leland Stanford Junior University Apparatus and method for polymer synthesis using arrays
JPH09507121A (en) 1993-10-26 1997-07-22 アフィマックス テクノロジーズ ナームロゼ ベノートスハップ Nucleic acid probe array on biological chip
US6156501A (en) 1993-10-26 2000-12-05 Affymetrix, Inc. Arrays of modified nucleic acid probes and methods of use
US5429807A (en) 1993-10-28 1995-07-04 Beckman Instruments, Inc. Method and apparatus for creating biopolymer arrays on a solid support surface
US5925517A (en) 1993-11-12 1999-07-20 The Public Health Research Institute Of The City Of New York, Inc. Detectably labeled dual conformation oligonucleotide probes, assays and kits
GB2285942A (en) 1994-01-25 1995-08-02 Ford Motor Co Forming an erosion resistant coating on an electrode
US5654419A (en) 1994-02-01 1997-08-05 The Regents Of The University Of California Fluorescent labels and their use in separations
US6090555A (en) 1997-12-11 2000-07-18 Affymetrix, Inc. Scanned image alignment systems and methods
US5631734A (en) 1994-02-10 1997-05-20 Affymetrix, Inc. Method and apparatus for detection of fluorescently labeled materials
US5578832A (en) 1994-09-02 1996-11-26 Affymetrix, Inc. Method and apparatus for imaging a sample on a device
SE9400522D0 (en) 1994-02-16 1994-02-16 Ulf Landegren Method and reagent for detecting specific nucleotide sequences
US5637684A (en) 1994-02-23 1997-06-10 Isis Pharmaceuticals, Inc. Phosphoramidate and phosphorothioamidate oligomeric compounds
AU2360195A (en) 1994-05-05 1995-11-29 Beckman Instruments, Inc. Oligonucleotide repeat arrays
US5571639A (en) 1994-05-24 1996-11-05 Affymax Technologies N.V. Computer-aided engineering system for design of sequence arrays and lithographic masks
US5807522A (en) 1994-06-17 1998-09-15 The Board Of Trustees Of The Leland Stanford Junior University Methods for fabricating microarrays of biological samples
US7625697B2 (en) 1994-06-17 2009-12-01 The Board Of Trustees Of The Leland Stanford Junior University Methods for constructing subarrays and subarrays made thereby
US5641658A (en) * 1994-08-03 1997-06-24 Mosaic Technologies, Inc. Method for performing amplification of nucleic acid with two primers bound to a single solid support
US5710000A (en) 1994-09-16 1998-01-20 Affymetrix, Inc. Capturing sequences adjacent to Type-IIs restriction sites for genomic library mapping
US6013445A (en) 1996-06-06 2000-01-11 Lynx Therapeutics, Inc. Massively parallel signature sequencing by ligation of encoded adaptors
US6654505B2 (en) * 1994-10-13 2003-11-25 Lynx Therapeutics, Inc. System and apparatus for sequential processing of analytes
US5604097A (en) 1994-10-13 1997-02-18 Spectragen, Inc. Methods for sorting polynucleotides using oligonucleotide tags
US5795716A (en) 1994-10-21 1998-08-18 Chee; Mark S. Computer-aided visualization and analysis system for sequence evaluation
US5556752A (en) 1994-10-24 1996-09-17 Affymetrix, Inc. Surface-bound, unimolecular, double-stranded DNA
FR2726286B1 (en) * 1994-10-28 1997-01-17 Genset Sa SOLID PHASE NUCLEIC ACID AMPLIFICATION PROCESS AND REAGENT KIT USEFUL FOR CARRYING OUT SAID PROCESS
US5744367A (en) 1994-11-10 1998-04-28 Igen International, Inc. Magnetic particle based electrochemiluminescent detection apparatus and method
US5599695A (en) 1995-02-27 1997-02-04 Affymetrix, Inc. Printing molecular library arrays using deprotection agents solely in the vapor phase
US5866337A (en) * 1995-03-24 1999-02-02 The Trustees Of Columbia University In The City Of New York Method to detect mutations in a nucleic acid using a hybridization-ligation procedure
US5750341A (en) 1995-04-17 1998-05-12 Lynx Therapeutics, Inc. DNA sequencing by parallel oligonucleotide extensions
US5624711A (en) 1995-04-27 1997-04-29 Affymax Technologies, N.V. Derivatization of solid supports and methods for oligomer synthesis
US5648245A (en) * 1995-05-09 1997-07-15 Carnegie Institution Of Washington Method for constructing an oligonucleotide concatamer library by rolling circle replication
DE69637315T2 (en) 1995-05-12 2008-08-28 Novartis Ag PROCESS FOR THE PARALLEL DETERMINATION OF SEVERAL ANALYTS BY EVENT-RELATED LUMINESCENCE
US5774305A (en) 1995-06-07 1998-06-30 Seagate Technology, Inc. Head gimbal assembly to reduce slider distortion due to thermal stress
US5545531A (en) 1995-06-07 1996-08-13 Affymax Technologies N.V. Methods for making a device for concurrently processing multiple biological chip assays
US5675209A (en) 1995-06-19 1997-10-07 Hoskins Manufacturing Company Electrode material for a spark plug
US5968740A (en) 1995-07-24 1999-10-19 Affymetrix, Inc. Method of Identifying a Base in a Nucleic Acid
US6132580A (en) 1995-09-28 2000-10-17 The Regents Of The University Of California Miniature reaction chamber and devices incorporating same
US20020068357A1 (en) 1995-09-28 2002-06-06 Mathies Richard A. Miniaturized integrated nucleic acid processing and analysis device and method
ATE496288T1 (en) 1995-10-11 2011-02-15 Luminex Corp SIMULTANEOUS MULTIPLE ANALYSIS OF CLINICAL SAMPLES
US5658734A (en) 1995-10-17 1997-08-19 International Business Machines Corporation Process for synthesizing chemical compounds
US6143495A (en) 1995-11-21 2000-11-07 Yale University Unimolecular segment amplification and sequencing
US5854033A (en) 1995-11-21 1998-12-29 Yale University Rolling circle replication reporter systems
AU704750B2 (en) * 1995-12-05 1999-05-06 Jorn Erland Koch A cascade nucleic acid amplification reaction
US6022963A (en) 1995-12-15 2000-02-08 Affymetrix, Inc. Synthesis of oligonucleotide arrays using photocleavable protecting groups
US6660233B1 (en) 1996-01-16 2003-12-09 Beckman Coulter, Inc. Analytical biochemistry system with robotically carried bioarray
ATE305052T1 (en) * 1996-03-01 2005-10-15 Univ Dundee DRUG INVESTIGATION SYSTEM
US6013440A (en) * 1996-03-11 2000-01-11 Affymetrix, Inc. Nucleic acid affinity columns
US6458530B1 (en) 1996-04-04 2002-10-01 Affymetrix Inc. Selecting tag nucleic acids
US5800996A (en) 1996-05-03 1998-09-01 The Perkin Elmer Corporation Energy transfer dyes with enhanced fluorescence
US5847162A (en) 1996-06-27 1998-12-08 The Perkin Elmer Corporation 4, 7-Dichlororhodamine dyes
US5851804A (en) * 1996-05-06 1998-12-22 Apollon, Inc. Chimeric kanamycin resistance gene
US5981956A (en) 1996-05-16 1999-11-09 Affymetrix, Inc. Systems and methods for detection of labeled materials
ATE318327T1 (en) 1996-06-04 2006-03-15 Univ Utah Res Found FLUORESCENCE-DONOR-ACCEPTOR PAIR
US5869245A (en) * 1996-06-05 1999-02-09 Fox Chase Cancer Center Mismatch endonuclease and its use in identifying mutations in targeted polynucleotide strands
US6582921B2 (en) 1996-07-29 2003-06-24 Nanosphere, Inc. Nanoparticles having oligonucleotides attached thereto and uses thereof
EP0925494B1 (en) 1996-09-04 2001-12-19 Scandinavian Micro Biodevices A/S A micro flow system for particle separation and analysis
GB9620209D0 (en) 1996-09-27 1996-11-13 Cemu Bioteknik Ab Method of sequencing DNA
US5935793A (en) * 1996-09-27 1999-08-10 The Chinese University Of Hong Kong Parallel polynucleotide sequencing method using tagged primers
US6083697A (en) 1996-11-14 2000-07-04 Affymetrix, Inc. Chemical amplification for the synthesis of patterned arrays
US6054100A (en) * 1996-11-18 2000-04-25 Robbins Scientific Corporation Apparatus for multi-well microscale synthesis
US5916750A (en) 1997-01-08 1999-06-29 Biogenex Laboratories Multifunctional linking reagents for synthesis of branched oligomers
US6297006B1 (en) 1997-01-16 2001-10-02 Hyseq, Inc. Methods for sequencing repetitive sequences and for determining the order of sequence subfragments
US20020042048A1 (en) 1997-01-16 2002-04-11 Radoje Drmanac Methods and compositions for detection or quantification of nucleic acid species
US6309824B1 (en) 1997-01-16 2001-10-30 Hyseq, Inc. Methods for analyzing a target nucleic acid using immobilized heterogeneous mixtures of oligonucleotide probes
US5994068A (en) 1997-03-11 1999-11-30 Wisconsin Alumni Research Foundation Nucleic acid indexing
US6327410B1 (en) * 1997-03-14 2001-12-04 The Trustees Of Tufts College Target analyte sensors utilizing Microspheres
AU6846798A (en) 1997-04-01 1998-10-22 Glaxo Group Limited Method of nucleic acid sequencing
ES2563643T3 (en) 1997-04-01 2016-03-15 Illumina Cambridge Limited Nucleic acid sequencing method
US5888737A (en) * 1997-04-15 1999-03-30 Lynx Therapeutics, Inc. Adaptor-based sequence analysis
US20040229221A1 (en) 1997-05-08 2004-11-18 Trustees Of Columbia University In The City Of New York Method to detect mutations in a nucleic acid using a hybridization-ligation procedure
US5919626A (en) 1997-06-06 1999-07-06 Orchid Bio Computer, Inc. Attachment of unmodified nucleic acids to silanized solid phase surfaces
EP1908832B1 (en) 1997-07-07 2012-12-26 Medical Research Council A method for increasing the concentration of a nucleic acid molecule
US6124120A (en) 1997-10-08 2000-09-26 Yale University Multiple displacement amplification
US6485944B1 (en) 1997-10-10 2002-11-26 President And Fellows Of Harvard College Replica amplification of nucleic acid arrays
AU737174B2 (en) 1997-10-10 2001-08-09 President & Fellows Of Harvard College Replica amplification of nucleic acid arrays
US6511803B1 (en) * 1997-10-10 2003-01-28 President And Fellows Of Harvard College Replica amplification of nucleic acid arrays
US5866473A (en) * 1997-10-31 1999-02-02 Advanced Micro Devices, Inc. Method of manufacturing a polysilicon gate having a dimension below the photolithography limitation
ATE291097T1 (en) 1997-10-31 2005-04-15 Affymetrix Inc A Delaware Corp EXPRESSION PROFILES IN ADULT AND FETAL ORGANS
US6322901B1 (en) 1997-11-13 2001-11-27 Massachusetts Institute Of Technology Highly luminescent color-selective nano-crystalline materials
US5990479A (en) 1997-11-25 1999-11-23 Regents Of The University Of California Organo Luminescent semiconductor nanocrystal probes for biological applications and process for making and using such probes
US6207392B1 (en) 1997-11-25 2001-03-27 The Regents Of The University Of California Semiconductor nanocrystal probes for biological applications and process for making and using such probes
US6087102A (en) 1998-01-07 2000-07-11 Clontech Laboratories, Inc. Polymeric arrays and methods for their use in binding assays
US6428752B1 (en) 1998-05-14 2002-08-06 Affymetrix, Inc. Cleaning deposit devices that form microarrays and the like
US6269846B1 (en) 1998-01-13 2001-08-07 Genetic Microsystems, Inc. Depositing fluid specimens on substrates, resulting ordered arrays, techniques for deposition of arrays
US6287776B1 (en) 1998-02-02 2001-09-11 Signature Bioscience, Inc. Method for detecting and classifying nucleic acid hybridization
JP2002502588A (en) 1998-02-06 2002-01-29 アフィメトリックス インコーポレイテッド Quality control methods in the manufacturing process
US6136537A (en) 1998-02-23 2000-10-24 Macevicz; Stephen C. Gene expression analysis
JP3944996B2 (en) 1998-03-05 2007-07-18 株式会社日立製作所 DNA probe array
US6174687B1 (en) 1999-02-26 2001-01-16 The Burnham Institute Methods of identifying lung homing molecules using membrane dipeptidase
DE19812729A1 (en) 1998-03-24 1999-09-30 Bosch Gmbh Robert Electric motor, in particular with a fan wheel to form an axial or radial fan
DK1997909T3 (en) * 1998-03-25 2012-04-23 Olink Ab Rolling circle replication of circularized target nucleic acid fragments
US5936324A (en) 1998-03-30 1999-08-10 Genetic Microsystems Inc. Moving magnet scanner
US6004755A (en) * 1998-04-07 1999-12-21 Incyte Pharmaceuticals, Inc. Quantitative microarray hybridization assays
US6284497B1 (en) * 1998-04-09 2001-09-04 Trustees Of Boston University Nucleic acid arrays and methods of synthesis
JP4262799B2 (en) 1998-04-16 2009-05-13 平田機工株式会社 Raw tire supply method
US6355419B1 (en) * 1998-04-27 2002-03-12 Hyseq, Inc. Preparation of pools of nucleic acids based on representation in a sample
US6270831B2 (en) 1998-04-30 2001-08-07 Medquest Products, Inc. Method and apparatus for providing a conductive, amorphous non-stick coating
US6255469B1 (en) 1998-05-06 2001-07-03 New York University Periodic two and three dimensional nucleic acid structures
US6031078A (en) 1998-06-16 2000-02-29 Millennium Pharmaceuticals, Inc. MTbx protein and nucleic acid molecules and uses therefor
US5980345A (en) 1998-07-13 1999-11-09 Alliedsignal Inc. Spark plug electrode having iridium based sphere and method for manufacturing same
US6316229B1 (en) * 1998-07-20 2001-11-13 Yale University Single molecule analysis target-mediated ligation of bipartite primers
GB0002310D0 (en) 2000-02-01 2000-03-22 Solexa Ltd Polynucleotide sequencing
US20040106110A1 (en) * 1998-07-30 2004-06-03 Solexa, Ltd. Preparation of polynucleotide arrays
US6787308B2 (en) 1998-07-30 2004-09-07 Solexa Ltd. Arrayed biomolecules and their use in sequencing
WO2000006770A1 (en) 1998-07-30 2000-02-10 Solexa Ltd. Arrayed biomolecules and their use in sequencing
US6232067B1 (en) * 1998-08-17 2001-05-15 The Perkin-Elmer Corporation Adapter directed expression analysis
US6046039A (en) 1998-08-19 2000-04-04 Battelle Memorial Institute Methods for producing partially digested restriction DNA fragments and for producing a partially modified PCR product
US6653077B1 (en) * 1998-09-04 2003-11-25 Lynx Therapeutics, Inc. Method of screening for genetic polymorphism
EP1114184A2 (en) * 1998-09-15 2001-07-11 Yale University Molecular cloning using rolling circle amplification
US20020019007A1 (en) * 1998-09-18 2002-02-14 Jensen Wayne A. PCR methods and materials
US6251303B1 (en) 1998-09-18 2001-06-26 Massachusetts Institute Of Technology Water-soluble fluorescent nanocrystals
US6326144B1 (en) 1998-09-18 2001-12-04 Massachusetts Institute Of Technology Biological applications of quantum dots
US6426513B1 (en) 1998-09-18 2002-07-30 Massachusetts Institute Of Technology Water-soluble thiol-capped nanocrystals
US6235502B1 (en) * 1998-09-18 2001-05-22 Molecular Staging Inc. Methods for selectively isolating DNA using rolling circle amplification
US6610492B1 (en) 1998-10-01 2003-08-26 Variagenics, Inc. Base-modified nucleotides and cleavage of polynucleotides incorporating them
US6277628B1 (en) 1998-10-02 2001-08-21 Incyte Genomics, Inc. Linear microarrays
US7272507B2 (en) * 1999-10-26 2007-09-18 Michael Paul Strathmann Applications of parallel genomic analysis
WO2000040755A2 (en) 1999-01-06 2000-07-13 Cornell Research Foundation, Inc. Method for accelerating identification of single nucleotide polymorphisms and alignment of clones in genomic sequencing
EP2145963A1 (en) 1999-01-06 2010-01-20 Callida Genomics, Inc. Enhanced sequencing by hybridization using pools of probes
GB9901475D0 (en) 1999-01-22 1999-03-17 Pyrosequencing Ab A method of DNA sequencing
US6514768B1 (en) 1999-01-29 2003-02-04 Surmodics, Inc. Replicable probe array
EP1147229A2 (en) * 1999-02-02 2001-10-24 Bernhard O. Palsson Methods for identifying drug targets based on genomic sequence data
EP1196554B1 (en) 1999-03-18 2009-02-25 Complete Genomics AS Methods of cloning and producing fragment chains with readable information content
WO2000056937A2 (en) 1999-03-25 2000-09-28 Hyseq, Inc. Solution-based methods and materials for sequence analysis by hybridization
EP1165839A2 (en) 1999-03-26 2002-01-02 Whitehead Institute For Biomedical Research Universal arrays
ATE553219T1 (en) 1999-04-20 2012-04-15 Illumina Inc DETECTION OF NUCLEIC ACID REACTIONS ON BEAD ARRAYS
US6355431B1 (en) 1999-04-20 2002-03-12 Illumina, Inc. Detection of nucleic acid amplification reactions using bead arrays
US6521428B1 (en) * 1999-04-21 2003-02-18 Genome Technologies, Llc Shot-gun sequencing and amplification without cloning
EP1179185B1 (en) 1999-05-07 2009-08-12 Life Technologies Corporation A method of detecting an analyte using semiconductor nanocrystals
WO2000075373A2 (en) * 1999-05-20 2000-12-14 Illumina, Inc. Combinatorial decoding of random nucleic acid arrays
US6544732B1 (en) 1999-05-20 2003-04-08 Illumina, Inc. Encoding and decoding of array sensors utilizing nanocrystals
ATE338273T1 (en) 1999-05-20 2006-09-15 Illumina Inc DEVICE FOR HOLDING AND PRESENTING AT LEAST ONE MICRO SPHERE MATRIX FOR SOLUTIONS AND/OR OPTICAL IMAGING SYSTEMS
US6573369B2 (en) * 1999-05-21 2003-06-03 Bioforce Nanosciences, Inc. Method and apparatus for solid state molecular analysis
US6326719B1 (en) 1999-06-16 2001-12-04 Alliedsignal Inc. Spark plug shell having a bimetallic ground electrode spark plug incorporating the shell, and method of making same
US6818395B1 (en) 1999-06-28 2004-11-16 California Institute Of Technology Methods and apparatus for analyzing polynucleotide sequences
US7501245B2 (en) 1999-06-28 2009-03-10 Helicos Biosciences Corp. Methods and apparatuses for analyzing polynucleotide sequences
AU6387000A (en) 1999-07-29 2001-02-19 Genzyme Corporation Serial analysis of genetic alterations
US6440706B1 (en) 1999-08-02 2002-08-27 Johns Hopkins University Digital amplification
US6472156B1 (en) * 1999-08-30 2002-10-29 The University Of Utah Homogeneous multiplex hybridization analysis by color and Tm
US7211390B2 (en) 1999-09-16 2007-05-01 454 Life Sciences Corporation Method of sequencing a nucleic acid
US7244559B2 (en) 1999-09-16 2007-07-17 454 Life Sciences Corporation Method of sequencing a nucleic acid
US6274320B1 (en) 1999-09-16 2001-08-14 Curagen Corporation Method of sequencing a nucleic acid
AU7537200A (en) 1999-09-29 2001-04-30 Solexa Ltd. Polynucleotide sequencing
EP1235932A2 (en) * 1999-10-08 2002-09-04 Protogene Laboratories, Inc. Method and apparatus for performing large numbers of reactions using array assembly
US6297016B1 (en) 1999-10-08 2001-10-02 Applera Corporation Template-dependent ligation with PNA-DNA chimeric probes
US6287778B1 (en) 1999-10-19 2001-09-11 Affymetrix, Inc. Allele detection using primer extension with sequence-coded identity tags
AU783841B2 (en) 1999-11-26 2005-12-15 454 Life Sciences Corporation Nucleic acid probe arrays
CA2360342A1 (en) * 1999-12-02 2001-06-07 Molecular Staging Inc. Generation of single-strand circular dna from linear self-annealing segments
US20030186256A1 (en) 1999-12-23 2003-10-02 Achim Fischer Method for carrying out the parallel sequencing of a nucleic acid mixture on a surface
EP1255860A2 (en) 1999-12-29 2002-11-13 Mergen Ltd. Methods for amplifying and detecting multiple polynucleotides on a solid phase support
GB0002389D0 (en) 2000-02-02 2000-03-22 Solexa Ltd Molecular arrays
US6221603B1 (en) * 2000-02-04 2001-04-24 Molecular Dynamics, Inc. Rolling circle amplification assay for nucleic acid analysis
US6913884B2 (en) 2001-08-16 2005-07-05 Illumina, Inc. Compositions and methods for repetitive use of genomic DNA
DE60136166D1 (en) * 2000-02-07 2008-11-27 Illumina Inc NUCLEIC ACID PROOF METHOD WITH UNIVERSAL PRIMING
AU2001238067B2 (en) * 2000-02-07 2007-01-25 Illumina, Inc. Nucleic acid detection methods using universal priming
US6770441B2 (en) 2000-02-10 2004-08-03 Illumina, Inc. Array compositions and methods of making same
AU2001241723A1 (en) 2000-02-25 2001-09-03 Affymetrix, Inc. Methods for multi-stage solid phase amplification of nucleic acids
US20020004204A1 (en) 2000-02-29 2002-01-10 O'keefe Matthew T. Microarray substrate with integrated photodetector and methods of use thereof
US6413722B1 (en) * 2000-03-22 2002-07-02 Incyte Genomics, Inc. Polymer coated surfaces for microarray applications
AU2001253310A1 (en) * 2000-04-10 2001-10-23 Matthew Ashby Methods for the survey and genetic analysis of populations
WO2001090415A2 (en) 2000-05-20 2001-11-29 The Regents Of The University Of Michigan Method of producing a dna library using positional amplification
WO2001090821A1 (en) 2000-05-25 2001-11-29 Fujitsu Limited Toner and image forming method
DE10027651C2 (en) 2000-06-03 2002-11-28 Bosch Gmbh Robert Electrode, method for its production and spark plug with such an electrode
AU2001268468A1 (en) 2000-06-13 2001-12-24 The Trustees Of Boston University Use of nucleotide analogs in the analysis of oligonucleotide mixtures and in highly multiplexed nucleic acid sequencing
JP3930227B2 (en) * 2000-06-14 2007-06-13 ペンタックス株式会社 Surveyor equipped with magnetic encoder and magnetic encoder
JP2002085097A (en) 2000-09-12 2002-03-26 Hitachi Ltd Method for determination of dna base sequence
US6649138B2 (en) 2000-10-13 2003-11-18 Quantum Dot Corporation Surface-modified semiconductive and metallic nanoparticles having enhanced dispersibility in aqueous media
ATE380883T1 (en) 2000-10-24 2007-12-15 Univ Leland Stanford Junior DIRECT MULTIPLEX CHARACTERIZATION OF GENOMIC DNA
US20020055102A1 (en) * 2000-10-24 2002-05-09 David Stern Apparatus and method for scanning multiple arrays of biological probes
US6576291B2 (en) 2000-12-08 2003-06-10 Massachusetts Institute Of Technology Preparation of nanocrystallites
US20020074920A1 (en) 2000-12-15 2002-06-20 Chiu Randolph Kwok-Kin High efficiency, extended life spark plug having improved firing tips
AU2002239679A1 (en) * 2000-12-20 2002-07-01 The Regents Of The University Of California Rolling circle amplification detection of rna and dna
WO2002053732A2 (en) * 2000-12-28 2002-07-11 Pangenex Methods for making polynucleotide libraries, polynucleotide arrays, and cell libraries for high-throughput genomics analysis
ATE374259T1 (en) 2001-01-30 2007-10-15 Solexa Ltd PRODUCTION OF MATRICES FROM POLYNUCLEOTIDES
WO2002061143A2 (en) * 2001-01-31 2002-08-08 Ambion, Inc. Comparative analysis of nucleic acids using population tagging
EP2465943A3 (en) * 2001-03-16 2012-10-03 Kalim Mir Linear polymer display
JP4171206B2 (en) 2001-03-16 2008-10-22 株式会社デンソー Spark plug and manufacturing method thereof
US6777187B2 (en) 2001-05-02 2004-08-17 Rubicon Genomics, Inc. Genome walking by selective amplification of nick-translate DNA library and amplification from complex mixtures of templates
GB0114853D0 (en) * 2001-06-18 2001-08-08 Medical Res Council Happier Mapping
WO2003092043A2 (en) 2001-07-20 2003-11-06 Quantum Dot Corporation Luminescent nanoparticles and methods for their preparation
US7297778B2 (en) 2001-07-25 2007-11-20 Affymetrix, Inc. Complexity management of genomic DNA
AU2002322224A1 (en) 2001-07-27 2003-02-17 Mcgill University Isp-1 and ctb-1 genes and uses thereof
US6881579B2 (en) 2001-07-30 2005-04-19 Agilent Technologies, Inc. Sample processing apparatus and methods
GB2378245A (en) 2001-08-03 2003-02-05 Mats Nilsson Nucleic acid amplification method
WO2005040094A1 (en) * 2003-10-24 2005-05-06 Postech Foundation Novel dendrimer compound and a biochip using the same
US9201067B2 (en) * 2003-03-05 2015-12-01 Posco Size-controlled macromolecule
US6975943B2 (en) * 2001-09-24 2005-12-13 Seqwright, Inc. Clone-array pooled shotgun strategy for nucleic acid sequencing
US6617137B2 (en) 2001-10-15 2003-09-09 Molecular Staging Inc. Method of amplifying whole genomes without subjecting the genome to denaturing conditions
GB0126887D0 (en) 2001-11-08 2002-01-02 Univ London Method for producing and identifying soluble protein domains
WO2003044216A2 (en) 2001-11-19 2003-05-30 Parallele Bioscience, Inc. Multiplex oligonucleotide addition and target amplification
GB2382137A (en) 2001-11-20 2003-05-21 Mats Gullberg Nucleic acid enrichment
US7361310B1 (en) * 2001-11-30 2008-04-22 Northwestern University Direct write nanolithographic deposition of nucleic acids from nanoscopic tips
US7011945B2 (en) * 2001-12-21 2006-03-14 Eastman Kodak Company Random array of micro-spheres for the analysis of nucleic acids
US20040002090A1 (en) 2002-03-05 2004-01-01 Pascal Mayer Methods for detecting genome-wide sequence variations associated with a phenotype
WO2004011665A2 (en) 2002-05-17 2004-02-05 Nugen Technologies, Inc. Methods for fragmentation, labeling and immobilization of nucleic acids
DE10224339A1 (en) * 2002-05-29 2003-12-11 Axaron Bioscience Ag Method for highly parallel nucleic acid sequencing
AUPS298102A0 (en) 2002-06-13 2002-07-04 Nucleics Pty Ltd Method for performing chemical reactions
US20050019776A1 (en) 2002-06-28 2005-01-27 Callow Matthew James Universal selective genome amplification and universal genotyping system
US7046376B2 (en) 2002-07-05 2006-05-16 Therma-Wave, Inc. Overlay targets with isolated, critical-dimension features and apparatus to measure overlay
US6747331B2 (en) * 2002-07-17 2004-06-08 International Business Machines Corporation Method and packaging structure for optimizing warpage of flip chip organic packages
TW200403052A (en) 2002-07-17 2004-03-01 Novartis Ag Use of organic compounds
WO2004017376A2 (en) * 2002-08-16 2004-02-26 Miragene, Inc. Integrated system for printing, processing, and imaging protein microarrays
SI3587433T1 (en) 2002-08-23 2020-08-31 Illumina Cambridge Limited Modified nucleotides
AU2002348943A1 (en) 2002-09-11 2004-04-30 Moltech Invent S.A. Non-carbon anodes for aluminium electrowinning and other oxidation resistant components with iron oxide-containing coatings
EP1556506A1 (en) 2002-09-19 2005-07-27 The Chancellor, Masters And Scholars Of The University Of Oxford Molecular arrays and single molecule detection
AU2003279089A1 (en) 2002-09-30 2004-04-19 Parallele Bioscience, Inc. Polynucleotide synthesis and labeling by kinetic sampling ligation
US7459273B2 (en) 2002-10-04 2008-12-02 Affymetrix, Inc. Methods for genotyping selected polymorphism
US20040120861A1 (en) * 2002-10-11 2004-06-24 Affymetrix, Inc. System and method for high-throughput processing of biological probe arrays
US8859748B2 (en) 2002-10-11 2014-10-14 Jacobus Johannes Maria van Dongen Nucleic acid amplification primers for PCR-based clonality studies
US20040086892A1 (en) 2002-11-06 2004-05-06 Crothers Donald M. Universal tag assay
CA2510166A1 (en) 2002-12-20 2004-09-30 Caliper Life Sciences, Inc. Single molecule amplification and detection of dna
US6977153B2 (en) * 2002-12-31 2005-12-20 Qiagen Gmbh Rolling circle amplification of RNA
DE10300599A1 (en) 2003-01-10 2004-07-22 Jörg Dr. Sommer Multi-hull ocean-going ship with movable float body wave power plant, converts float body movement relative to columns caused by wave motion to energy available on board, including for ship's drive
EP2159285B1 (en) 2003-01-29 2012-09-26 454 Life Sciences Corporation Methods of amplifying and sequencing nucleic acids
US7575865B2 (en) 2003-01-29 2009-08-18 454 Life Sciences Corporation Methods of amplifying and sequencing nucleic acids
EP1592810A2 (en) 2003-02-12 2005-11-09 Genizon Svenska AB Methods and means for nucleic acid sequencing
CN1791682B (en) 2003-02-26 2013-05-22 凯利达基因组股份有限公司 Random array DNA analysis by hybridization
EP1604040B1 (en) 2003-03-07 2010-10-13 Rubicon Genomics, Inc. Amplification and analysis of whole genome and whole transcriptome libraries generated by a dna polymerization process
FR2852605B1 (en) 2003-03-18 2012-11-30 Commissariat Energie Atomique PROCESS FOR PREPARING DNA FRAGMENTS AND ITS APPLICATIONS
CA2520811C (en) 2003-03-28 2013-05-28 Japan As Represented By Director General Of National Rehabilitation Center For Persons With Disabilities Method of cdna synthesis
WO2004092331A2 (en) * 2003-04-08 2004-10-28 Li-Cor, Inc. Composition and method for nucleic acid sequencing
US20050181394A1 (en) 2003-06-20 2005-08-18 Illumina, Inc. Methods and compositions for whole genome amplification and genotyping
US20040259118A1 (en) * 2003-06-23 2004-12-23 Macevicz Stephen C. Methods and compositions for nucleic acid sequence analysis
AU2003263632A1 (en) * 2003-08-19 2005-03-07 Posco Novel dendrimer compound, a biochip using the same and a fabricating method thereof
GB0320337D0 (en) 2003-08-29 2003-10-01 Syrris Ltd A microfluidic system
JP5171037B2 (en) 2003-09-10 2013-03-27 アルセア・ディーエックス,インク. Expression profiling using microarrays
AT501110A1 (en) * 2003-09-16 2006-06-15 Upper Austrian Res Gmbh ARRAYS TO BIND MOLECULES
US8222005B2 (en) * 2003-09-17 2012-07-17 Agency For Science, Technology And Research Method for gene identification signature (GIS) analysis
EP1685380A2 (en) 2003-09-18 2006-08-02 Parallele Bioscience, Inc. System and methods for enhancing signal-to-noise ratios of microarray-based measurements
GB0324456D0 (en) 2003-10-20 2003-11-19 Isis Innovation Parallel DNA sequencing methods
TWM245024U (en) * 2003-10-30 2004-10-01 Tranmax Machinery Co Ltd Airtight structure for air passage of pneumatic tool
EP1682680B2 (en) 2003-10-31 2018-03-21 AB Advanced Genetic Analysis Corporation Methods for producing a paired tag from a nucleic acid sequence and methods of use thereof
JP3796607B2 (en) 2003-10-31 2006-07-12 株式会社リコー Fixing device
US7148361B2 (en) 2003-10-31 2006-12-12 North Carolina State University Synthesis of phosphono-substituted porphyrin compounds for attachment to metal oxide surfaces
US7169560B2 (en) 2003-11-12 2007-01-30 Helicos Biosciences Corporation Short cycle methods for sequencing polynucleotides
US7854108B2 (en) 2003-12-12 2010-12-21 Vision Robotics Corporation Agricultural robot system and method
US7972994B2 (en) * 2003-12-17 2011-07-05 Glaxosmithkline Llc Methods for synthesis of encoded libraries
US20050136414A1 (en) 2003-12-23 2005-06-23 Kevin Gunderson Methods and compositions for making locus-specific arrays
US20050208538A1 (en) 2003-12-29 2005-09-22 Nurith Kurn Methods for analysis of nucleic acid methylation status and methods for fragmentation, labeling and immobilization of nucleic acids
US20110059865A1 (en) * 2004-01-07 2011-03-10 Mark Edward Brennan Smith Modified Molecular Arrays
GB0400584D0 (en) * 2004-01-12 2004-02-11 Solexa Ltd Nucleic acid characterisation
ES2432040T3 (en) 2004-01-28 2013-11-29 454 Life Sciences Corporation Nucleic acid amplification with continuous flow emulsion
GB0402895D0 (en) 2004-02-10 2004-03-17 Solexa Ltd Arrayed polynucleotides
DE602005020421D1 (en) 2004-02-19 2010-05-20 Helicos Biosciences Corp METHOD FOR THE ANALYSIS OF POLYNUCLEOTIDE SEQUENCES
CN1950519A (en) 2004-02-27 2007-04-18 哈佛大学的校长及成员们 Polony fluorescent in situ sequencing beads
KR100552706B1 (en) 2004-03-12 2006-02-20 삼성전자주식회사 Method and apparatus for nucleic acid amplification
US20050214840A1 (en) 2004-03-23 2005-09-29 Xiangning Chen Restriction enzyme mediated method of multiplex genotyping
GB2413796B (en) * 2004-03-25 2006-03-29 Global Genomics Ab Methods and means for nucleic acid sequencing
EP1740719B1 (en) 2004-04-09 2011-06-22 Trustees of Boston University Method for de novo detection of sequences in nucleic acids:target sequencing by fragmentation
US20050260609A1 (en) 2004-05-24 2005-11-24 Lapidus Stanley N Methods and devices for sequencing nucleic acids
US7635562B2 (en) * 2004-05-25 2009-12-22 Helicos Biosciences Corporation Methods and devices for nucleic acid sequence determination
US20070117104A1 (en) 2005-11-22 2007-05-24 Buzby Philip R Nucleotide analogs
US7565346B2 (en) 2004-05-31 2009-07-21 International Business Machines Corporation System and method for sequence-based subspace pattern clustering
US20060024711A1 (en) 2004-07-02 2006-02-02 Helicos Biosciences Corporation Methods for nucleic acid amplification and sequence determination
US7276720B2 (en) 2004-07-19 2007-10-02 Helicos Biosciences Corporation Apparatus and methods for analyzing samples
US20060012793A1 (en) 2004-07-19 2006-01-19 Helicos Biosciences Corporation Apparatus and methods for analyzing samples
WO2006073504A2 (en) 2004-08-04 2006-07-13 President And Fellows Of Harvard College Wobble sequencing
US20060073506A1 (en) 2004-09-17 2006-04-06 Affymetrix, Inc. Methods for identifying biological samples
GB0422551D0 (en) 2004-10-11 2004-11-10 Univ Liverpool Labelling and sequencing of nucleic acids
US20060110764A1 (en) 2004-10-25 2006-05-25 Tom Tang Large-scale parallelized DNA sequencing
JP2008520975A (en) 2004-11-16 2008-06-19 ヘリコス バイオサイエンシーズ コーポレイション TIRF single molecule analysis and method for sequencing nucleic acids
CA2586228A1 (en) 2004-11-17 2006-05-26 The University Of Chicago Histone deacetylase inhibitors and methods of use
WO2006074351A2 (en) 2005-01-05 2006-07-13 Agencourt Personal Genomics Reversible nucleotide terminators and uses thereof
WO2006084132A2 (en) 2005-02-01 2006-08-10 Agencourt Bioscience Corp. Reagents, methods, and libraries for bead-based sequencing
US7393665B2 (en) 2005-02-10 2008-07-01 Population Genetics Technologies Ltd Methods and compositions for tagging and identifying polynucleotides
US20060223122A1 (en) 2005-03-08 2006-10-05 Agnes Fogo Classifying and predicting glomerulosclerosis using a proteomics approach
US8110196B2 (en) 2005-04-29 2012-02-07 Polytopas LLC Methods and compositions for polytopic vaccination
US20060263789A1 (en) * 2005-05-19 2006-11-23 Robert Kincaid Unique identifiers for indicating properties associated with entities to which they are attached, and methods for using
EP2463386B1 (en) 2005-06-15 2017-04-12 Complete Genomics Inc. Nucleic acid analysis by random mixtures of non-overlapping fragments
CA2616433A1 (en) 2005-07-28 2007-02-01 Helicos Biosciences Corporation Consecutive base single molecule sequencing
US20090291419A1 (en) * 2005-08-01 2009-11-26 Kazuaki Uekawa System of sound representation and pronunciation techniques for English and other European languages
EP1910688A4 (en) 2005-08-04 2010-03-03 Helicos Biosciences Corp Multi-channel flow cells
DK1915446T3 (en) 2005-08-11 2017-09-11 Synthetic Genomics Inc IN VITRO RECOMBINATION PROCEDURE
US7666593B2 (en) 2005-08-26 2010-02-23 Helicos Biosciences Corporation Single molecule sequencing of captured nucleic acids
CA2910861C (en) 2005-09-29 2018-08-07 Michael Josephus Theresia Van Eijk High throughput screening of mutagenized populations
AU2012216376B2 (en) 2005-10-07 2015-08-13 Complete Genomics, Inc. High throughput genome sequencing on DNA arrays
AU2013202990B2 (en) 2005-10-07 2015-08-20 Complete Genomics, Inc. High throughput genome sequencing on DNA arrays
US7960104B2 (en) 2005-10-07 2011-06-14 Callida Genomics, Inc. Self-assembled single molecule arrays and uses thereof
CA2624896C (en) 2005-10-07 2017-11-07 Callida Genomics, Inc. Self-assembled single molecule arrays and uses thereof
WO2007133831A2 (en) 2006-02-24 2007-11-22 Callida Genomics, Inc. High throughput genome sequencing on dna arrays
WO2007120208A2 (en) * 2005-11-14 2007-10-25 President And Fellows Of Harvard College Nanogrid rolling circle dna sequencing
WO2007087310A2 (en) * 2006-01-23 2007-08-02 Population Genetics Technologies Ltd. Nucleic acid analysis using sequence tokens
WO2007092538A2 (en) * 2006-02-07 2007-08-16 President And Fellows Of Harvard College Methods for making nucleotide probes for sequencing and synthesis
CN101415839B (en) 2006-02-08 2012-06-27 亿明达剑桥有限公司 Method for sequencing a polynucleotide template
US8460879B2 (en) * 2006-02-21 2013-06-11 The Trustees Of Tufts College Methods and arrays for target analyte detection and determination of target analyte concentration in solution
SG10201405158QA (en) 2006-02-24 2014-10-30 Callida Genomics Inc High throughput genome sequencing on dna arrays
CA2647786A1 (en) * 2006-03-14 2007-09-20 Genizon Biosciences Inc. Methods and means for nucleic acid sequencing
DE102006015433A1 (en) 2006-03-31 2007-10-04 Reuver, Hermannus S.F. Soil-processing of red mud comprises mixing the red mud with a soil substrate to form soil-mixture, spreading the mixture on a processing surface, adding soil processing worms into the mixture, and biologically processing by the worms
US7359610B2 (en) 2006-04-03 2008-04-15 Adc Telecommunications, Inc. Cable manager including nestable radius limiter
US7569979B2 (en) 2006-04-07 2009-08-04 Federal-Mogul World Wide, Inc. Spark plug having spark portion provided with a base material and a protective material
US20090062129A1 (en) 2006-04-19 2009-03-05 Agencourt Personal Genomics, Inc. Reagents, methods, and libraries for gel-free bead-based sequencing
EP2047910B1 (en) 2006-05-11 2012-01-11 Raindance Technologies, Inc. Microfluidic device and method
WO2008070352A2 (en) 2006-10-27 2008-06-12 Complete Genomics, Inc. Efficient arrays of amplified polynucleotides
US20090111706A1 (en) 2006-11-09 2009-04-30 Complete Genomics, Inc. Selection of dna adaptor orientation by amplification
WO2008092155A2 (en) 2007-01-26 2008-07-31 Illumina, Inc. Image data efficient genetic sequencing method and system
JP5258760B2 (en) 2007-06-08 2013-08-07 合同会社Bio−Dixam Method for amplifying methylated or unmethylated nucleic acid
EP2188386A2 (en) 2007-09-17 2010-05-26 Universite De Strasbourg Method for detecting or quantifying a truncating mutation
US8268564B2 (en) 2007-09-26 2012-09-18 President And Fellows Of Harvard College Methods and applications for stitched DNA barcodes
US7897344B2 (en) * 2007-11-06 2011-03-01 Complete Genomics, Inc. Methods and oligonucleotide designs for insertion of multiple adaptors into library constructs
US8518640B2 (en) 2007-10-29 2013-08-27 Complete Genomics, Inc. Nucleic acid sequencing and process
WO2009061840A1 (en) * 2007-11-05 2009-05-14 Complete Genomics, Inc. Methods and oligonucleotide designs for insertion of multiple adaptors employing selective methylation
US8592150B2 (en) 2007-12-05 2013-11-26 Complete Genomics, Inc. Methods and compositions for long fragment read sequencing
CN102016068A (en) 2008-01-09 2011-04-13 生命科技公司 Method of making a paired tag library for nucleic acid sequencing
US20090298131A1 (en) 2008-02-19 2009-12-03 Intelligent Biosystems, Inc. Non-Emulsion Methods And Masked Biomolecules
CA2731219A1 (en) 2008-07-18 2010-01-21 Ideapaint, Inc. Ambient cure solvent-based coatings for writable-erasable surfaces
US8407554B2 (en) 2009-02-03 2013-03-26 Complete Genomics, Inc. Method and apparatus for quantification of DNA sequencing quality and construction of a characterizable model system using Reed-Solomon codes
US9524369B2 (en) * 2009-06-15 2016-12-20 Complete Genomics, Inc. Processing and analysis of complex nucleic acid sequence data
CN102725424B (en) 2010-01-25 2014-07-09 Rd生物科技公司 Self-folding amplification of target nucleic acid
US9506112B2 (en) 2010-02-05 2016-11-29 Siemens Healthcare Diagnostics Inc. Increasing multiplex level by externalization of passive reference in polymerase chain reactions
EP2363502B1 (en) 2010-03-04 2017-02-15 miacom Diagnostics GmbH Enhanced multiplex FISH
DE202011003570U1 (en) 2010-03-06 2012-01-30 Illumina, Inc. Systems and apparatus for detecting optical signals from a sample
WO2012098599A1 (en) 2011-01-17 2012-07-26 パナソニック株式会社 Imaging device
CN103843001B (en) * 2011-04-14 2017-06-09 考利达基因组股份有限公司 The treatment and analysis of complex nucleic acid sequence data
WO2012162161A1 (en) 2011-05-20 2012-11-29 Phthisis Diagnostics Microsporidia detection system and method
US9006630B2 (en) 2012-01-13 2015-04-14 Altasens, Inc. Quality of optically black reference pixels in CMOS iSoCs
AU2013266394B2 (en) 2012-05-21 2019-03-14 The Scripps Research Institute Methods of sample preparation
US9146248B2 (en) 2013-03-14 2015-09-29 Intelligent Bio-Systems, Inc. Apparatus and methods for purging flow cells in nucleic acid sequencing instruments

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11915444B2 (en) 2020-08-31 2024-02-27 Element Biosciences, Inc. Single-pass primary analysis
WO2023015192A1 (en) * 2021-08-03 2023-02-09 10X Genomics, Inc. Nucleic acid concatemers and methods for stabilizing and/or compacting the same
WO2023091592A1 (en) * 2021-11-19 2023-05-25 Dovetail Genomics, Llc Dendrimers for genomic analysis methods and compositions
WO2023107719A3 (en) * 2021-12-10 2023-07-20 Element Biosciences, Inc. Primary analysis in next generation sequencing

Also Published As

Publication number Publication date
AU2006259565B2 (en) 2011-01-06
EP2463386A2 (en) 2012-06-13
US9650673B2 (en) 2017-05-16
WO2006138284A3 (en) 2007-12-13
US20170175184A1 (en) 2017-06-22
EP1907583B2 (en) 2019-10-23
US8765379B2 (en) 2014-07-01
US20220411865A1 (en) 2022-12-29
JP2009500004A (en) 2009-01-08
US20090137414A1 (en) 2009-05-28
CN101466847A (en) 2009-06-24
US8445196B2 (en) 2013-05-21
US20140011688A1 (en) 2014-01-09
US8765375B2 (en) 2014-07-01
IL188142A0 (en) 2008-03-20
US20220411866A1 (en) 2022-12-29
US10125392B2 (en) 2018-11-13
DK2463386T3 (en) 2017-07-31
IL188142A (en) 2015-10-29
US20220162694A1 (en) 2022-05-26
WO2006138257A3 (en) 2008-12-11
DK2620510T4 (en) 2020-03-30
US20130345069A1 (en) 2013-12-26
US20080234136A1 (en) 2008-09-25
HK1209795A1 (en) 2016-04-08
CN101466847B (en) 2014-02-19
EP3257949A1 (en) 2017-12-20
US8133719B2 (en) 2012-03-13
US8765382B2 (en) 2014-07-01
JP5331476B2 (en) 2013-10-30
EP2463386A3 (en) 2012-08-15
JP2016174608A (en) 2016-10-06
EP1907571A2 (en) 2008-04-09
EP2620510A1 (en) 2013-07-31
JP2011234723A (en) 2011-11-24
US20200115748A1 (en) 2020-04-16
EP1907571B1 (en) 2017-04-26
DK1907583T3 (en) 2017-01-16
WO2006138257A2 (en) 2006-12-28
US20110071053A1 (en) 2011-03-24
US11414702B2 (en) 2022-08-16
US8445197B2 (en) 2013-05-21
DK1907583T4 (en) 2020-01-27
EP2620510B2 (en) 2020-02-19
EP1907583A2 (en) 2008-04-09
US10351909B2 (en) 2019-07-16
US8771958B2 (en) 2014-07-08
US9637785B2 (en) 2017-05-02
EP1907571A4 (en) 2009-10-21
US20090137404A1 (en) 2009-05-28
US20130316920A1 (en) 2013-11-28
US20140018246A1 (en) 2014-01-16
EP2620510B1 (en) 2016-10-12
US8771957B2 (en) 2014-07-08
CA2611743A1 (en) 2006-12-28
JP5965108B2 (en) 2016-08-03
US20150159204A1 (en) 2015-06-11
US7709197B2 (en) 2010-05-04
US20200399695A1 (en) 2020-12-24
DK1907571T3 (en) 2017-08-21
JP2014138597A (en) 2014-07-31
US20130345071A1 (en) 2013-12-26
CA2611671A1 (en) 2006-12-28
CA2611743C (en) 2019-12-31
US20110319281A1 (en) 2011-12-29
EP3492602A1 (en) 2019-06-05
EP2463386B1 (en) 2017-04-12
US20070072208A1 (en) 2007-03-29
DK2620510T3 (en) 2017-01-30
US20170152554A1 (en) 2017-06-01
US20090311691A1 (en) 2009-12-17
SG162795A1 (en) 2010-07-29
US20070099208A1 (en) 2007-05-03
US20180051333A1 (en) 2018-02-22
US7901891B2 (en) 2011-03-08
EP2865766A1 (en) 2015-04-29
US20160017414A1 (en) 2016-01-21
US8673562B2 (en) 2014-03-18
EP1907583B1 (en) 2016-10-05
CA2611671C (en) 2013-10-08
AU2006259565A1 (en) 2006-12-28
US9944984B2 (en) 2018-04-17
EP1907583A4 (en) 2009-11-11
US9637784B2 (en) 2017-05-02
WO2006138284A2 (en) 2006-12-28
US20130310264A1 (en) 2013-11-21
US20130345070A1 (en) 2013-12-26
US20130345068A1 (en) 2013-12-26
US8445194B2 (en) 2013-05-21
IL241986A0 (en) 2015-11-30

Similar Documents

Publication Publication Date Title
US20220162694A1 (en) DNA array

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general
Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general
Free format text: SPECIAL NEW

STPP Information on status: patent application and granting procedure in general
Free format text: NON FINAL ACTION MAILED

STCB Information on status: application discontinuation
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION