EP0923650B1 - Signatures by ligation of encoded adaptors - Google Patents

Signatures by ligation of encoded adaptors

Info

Publication number
EP0923650B1
EP0923650B1 EP97929757A EP97929757A EP0923650B1 EP 0923650 B1 EP0923650 B1 EP 0923650B1 EP 97929757 A EP97929757 A EP 97929757A EP 97929757 A EP97929757 A EP 97929757A EP 0923650 B1 EP0923650 B1 EP 0923650B1
Authority
EP
European Patent Office
Prior art keywords
tag
oligonucleotide
polynucleotide
adaptors
nucleotides
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP97929757A
Other languages
German (de)
English (en)
Other versions
EP0923650A1 (fr
Inventor
Glenn Albrecht
Sydney Brenner
David H. Lloyd
Robert B. Dubridge
Michael C. Pallas
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Solexa Inc
Original Assignee
Solexa Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US08/659,453 (US5846719A)
Application filed by Solexa Inc
Publication of EP0923650A1
Application granted
Publication of EP0923650B1
Anticipated expiration
Expired - Lifetime

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12Q1/6853 Nucleic acid amplification reactions using modified primers or templates
    • C12Q1/6855 Ligating adaptors
    • C12Q1/6809 Methods for determination or identification of nucleic acids involving differential detection
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6827 Hybridisation assays for detection of mutation or polymorphism
    • C12Q1/6869 Methods for sequencing
    • C12Q1/6874 Methods for sequencing involving nucleic acid arrays, e.g. sequencing by hybridisation

Definitions

  • the invention relates generally to methods for determining the nucleotide sequence of a polynucleotide, and more particularly, to a method of identifying terminal nucleotides of a polynucleotide by specific ligation of encoded adaptors.
  • the chain termination method requires the generation of one or more sets of labeled DNA fragments, each having a common origin and each terminating with a known base.
  • the set or sets of fragments must then be separated by size to obtain sequence information.
  • the size separation is usually accomplished by high resolution gel electrophoresis, which must have the capacity of distinguishing very large fragments differing in size by no more than a single nucleotide.
  • base-by-base approaches promise the possibility of carrying out many thousands of sequencing reactions in parallel, for example, on target polynucleotides attached to microparticles or on solid phase arrays, e.g. International patent application PCT/US95/12678 (WO 96/12039).
  • WO 95/20053 discloses a further method of sequencing nucleic acids.
  • One such method employs specific hybridization of labelled adaptors, where each of four labels corresponds to a different nucleotide, followed by a cleavage step which shortens the target by one nucleotide. In this process, each round of ligation and cleavage is effective to identify one nucleotide in the target.
  • WO 96/12014 discloses the use of oligonucleotide tags, selected from a minimally cross-hybridizing set of oligonucleotides, for tracking, identifying, and/or sorting populations of molecules.
  • the method may be used to sort a population of polynucleotides onto a solid support for simultaneous sequencing.
  • a "single base sequencing methodology" involving repeated steps of ligation of labelled probe, identification, and cleavage with a nuclease, is described for sequencing.
  • EP-A2 0 630 972 discloses a further method for sequencing DNA.
  • a polynucleotide is digested with a restriction endonuclease, producing fragments having identical cleaved ends.
  • a labelled adaptor having a protruding strand complementary to these cleaved ends is then ligated to the fragments. Digestion with an exonuclease renders the end regions of the resulting construct single stranded, including portions of the fragment.
  • the fragments are then separated according to differences in the terminal sequences that follow the ligated known oligomer sequence. This is accomplished by hybridizing the constructs to probes on a solid support, each of which includes a portion of the known adaptor sequence and a variable sequence.
  • base-by-base sequencing schemes have not had widespread application because of numerous problems, such as inefficient chemistries which prevent determination of any more than a few nucleotides in a complete sequencing operation.
  • an object of our invention is to provide a DNA sequencing scheme which does not suffer the drawbacks of current base-by-base approaches.
  • Another object of our invention is to provide a method of DNA sequencing which is amenable to parallel, or simultaneous, application to thousands of DNA fragments present in a common reaction vessel.
  • a further object of our invention is to provide a method of DNA sequencing which permits the identification of a terminal portion of a target polynucleotide with minimal enzymatic steps.
  • Yet another object of our invention is to provide a set of encoded adaptors for identifying the sequence of a plurality of terminal nucleotides of one or more target polynucleotides.
  • Each encoded adaptor comprises a protruding strand and an oligonucleotide tag selected from a minimally cross-hybridizing set of oligonucleotides.
  • Encoded adaptors whose protruding strands form perfectly matched duplexes with the complementary protruding strands of the target polynucleotide are ligated. After ligation, the identity and ordering of the nucleotides in the protruding strands are determined, or "decoded,” by specifically hybridizing a labeled tag complement to its corresponding tag on the ligated adaptor.
  • an encoded adaptor with a protruding strand of four nucleotides say 5'-AGGT
  • the four complementary nucleotides, 3'-TCCA, on the polynucleotide may be identified by a unique oligonucleotide tag selected from a set of 256 such tags, one for every possible four nucleotide sequence of the protruding strands.
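The one-tag-per-overhang bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tag names (`tag_000` and so on) and the helper names are hypothetical placeholders, not sequences or terms from the patent, and real tags would be drawn from a minimally cross-hybridizing set.

```python
from itertools import product

# Watson-Crick pairing partners for the four natural nucleotides.
PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}

def paired_strand(strand: str) -> str:
    """Return the base-by-base pairing partner of a strand.

    For an adaptor overhang written 5'->3', the result is the target
    overhang written 3'->5', matching the 5'-AGGT / 3'-TCCA example.
    """
    return "".join(PAIR[base] for base in strand)

# Every possible four-nucleotide protruding strand (4**4 = 256).
ADAPTOR_OVERHANGS = ["".join(p) for p in product("ACGT", repeat=4)]

# Hypothetical tag labels: one unique tag per adaptor overhang.
TAG_OF = {oh: f"tag_{i:03d}" for i, oh in enumerate(ADAPTOR_OVERHANGS)}
OVERHANG_OF = {tag: oh for oh, tag in TAG_OF.items()}

def decode(tag: str) -> str:
    """Infer the target's protruding strand from the tag that was detected."""
    return paired_strand(OVERHANG_OF[tag])

if __name__ == "__main__":
    print(len(TAG_OF))             # number of tags needed
    print(decode(TAG_OF["AGGT"]))  # target overhang paired with 5'-AGGT
```

Detecting which tag is present on the ligated adaptor is therefore enough to read all four nucleotides of the target's protruding strand at once.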
  • Tag complements are applied to the ligated adaptors under conditions which allow specific hybridization of only those tag complements that form perfectly matched duplexes (or triplexes) with the oligonucleotide tags of the ligated adaptors.
  • the tag complements may be applied individually or as one or more mixtures to determine the identity of the oligonucleotide tags, and therefore, the sequences of the protruding strands.
  • the encoded adaptors may be used in sequence analysis either (i) to identify one or more nucleotides as a step of a process that involves repeated cycles of ligation, identification, and cleavage, as described in Brenner U.S. patent 5,599,675 and PCT Publication No. WO 95/27080, or (ii) as a "stand alone" identification method, wherein sets of encoded adaptors are applied to target polynucleotides such that each set is capable of identifying the nucleotide sequence of a different portion of a target polynucleotide; that is, in the latter embodiment, sequence analysis is carried out with a single ligation for each set followed by identification.
  • oligonucleotide tags that are members of a minimally cross-hybridizing set of oligonucleotides, e.g. as described in International patent applications PCT/US95/12791 (WO 96/12014) and PCT/US96/09513 (WO 96/41011).
  • the sequences of oligonucleotides of such a set differ from the sequences of every other member of the same set by at least two nucleotides. Thus, each member of such a set cannot form a duplex (or triplex) with the complement of any other member with fewer than two mismatches.
  • each member of a minimally cross-hybridizing set differs from every other member by as many nucleotides as possible consistent with the size of set required for a particular application.
  • the difference between members of a minimally cross-hybridizing set is preferably significantly greater than two.
  • each member of such a set differs from every other member by at least four nucleotides. More preferably, each member of such a set differs from every other member by at least six nucleotides.
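As a rough illustration of the "differs by at least n nucleotides" criterion, the following Python sketch checks a candidate tag set by pairwise comparison. It assumes equal-length tags compared position by position (Hamming distance), which is a simplification introduced here; the example tags are made up for the demonstration and are not from the patent.

```python
from itertools import combinations

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length tags differ."""
    if len(a) != len(b):
        raise ValueError("tags must have the same length")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_distance(tags: list[str]) -> int:
    """Smallest number of differences between any two tags in the set."""
    return min(hamming(a, b) for a, b in combinations(tags, 2))

def is_minimally_cross_hybridizing(tags: list[str], min_diff: int = 2) -> bool:
    """True if every pair of tags differs by at least `min_diff` nucleotides."""
    return min_pairwise_distance(tags) >= min_diff

if __name__ == "__main__":
    # Hypothetical 6-mer tags built from 2-nucleotide blocks.
    example = ["AAAAAA", "AACCCC", "CCAACC", "CCCCAA"]
    print(min_pairwise_distance(example))
    print(is_minimally_cross_hybridizing(example, min_diff=4))
```

Raising `min_diff` from two to four or six, as the preferred embodiments suggest, shrinks the number of usable tags of a given length, which is the trade-off against set size mentioned above.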
  • Complements of oligonucleotide tags are referred to herein as "tag complements."
  • Oligonucleotide tags may be single stranded and be designed for specific hybridization to single stranded tag complements by duplex formation. Oligonucleotide tags may also be double stranded and be designed for specific hybridization to single stranded tag complements by triplex formation. Preferably, the oligonucleotide tags of the encoded adaptors are double stranded and their tag complements are single stranded, such that specific hybridization of a tag with its complements occurs through the formation of a triplex structure.
  • the method of the invention comprises the following steps: (a) ligating an encoded adaptor to an end of a polynucleotide, the adaptor having an oligonucleotide tag selected from a minimally cross-hybridizing set of oligonucleotides and a protruding strand complementary to a protruding strand of the polynucleotide; and (b) identifying one or more nucleotides in the protruding strand of the polynucleotide by specifically hybridizing a tag complement to the oligonucleotide tag of the encoded adaptor.
  • encoded adaptor is used synonymously with the term “encoded probe” of priority document U.S. patent application Ser. No. 08/689,587.
  • ligation means the formation of a covalent bond between the ends of one or more (usually two) oligonucleotides.
  • the term encompasses non-enzymatic formation of phosphodiester bonds, as well as the formation of non-phosphodiester covalent bonds between the ends of oligonucleotides, such as phosphorothioate bonds, disulfide bonds, and the like.
  • a ligation reaction is usually template driven, in that the ends of oligo 1 and oligo 2 are brought into juxtaposition by specific hybridization to a template strand.
  • a special case of template-driven ligation is the ligation of two double stranded oligonucleotides having complementary protruding strands.
  • “Complement” or “tag complement” as used herein in reference to oligonucleotide tags refers to an oligonucleotide to which an oligonucleotide tag specifically hybridizes to form a perfectly matched duplex or triplex.
  • the oligonucleotide tag may be selected to be either double stranded or single stranded.
  • the term “complement” is meant to encompass either a double stranded complement of a single stranded oligonucleotide tag or a single stranded complement of a double stranded oligonucleotide tag.
  • oligonucleotide includes linear oligomers of natural or modified monomers or linkages, including deoxyribonucleosides, ribonucleosides, anomeric forms thereof, peptide nucleic acids (PNAs), and the like, capable of specifically binding to a target polynucleotide by way of a regular pattern of monomer-to-monomer interactions, such as Watson-Crick type of base pairing, base stacking, Hoogsteen or reverse Hoogsteen types of base pairing, or the like.
  • monomers are linked by phosphodiester bonds or analogs thereof to form oligonucleotides ranging in size from a few monomeric units, e.g.
  • oligonucleotide is represented by a sequence of letters, such as "ATGCCTG," it will be understood that the nucleotides are in 5'→3' order from left to right and that "A" denotes deoxyadenosine, "C" denotes deoxycytidine, "G" denotes deoxyguanosine, and "T" denotes thymidine, unless otherwise noted.
  • oligonucleotides of the invention comprise the four natural nucleotides; however, they may also comprise non-natural nucleotide analogs. It is clear to those skilled in the art when oligonucleotides having natural or non-natural nucleotides may be employed, e.g. where processing by enzymes is called for, usually oligonucleotides consisting of natural nucleotides are required.
  • Perfectly matched in reference to a duplex means that the poly- or oligonucleotide strands making up the duplex form a double stranded structure with one another such that every nucleotide in each strand undergoes Watson-Crick basepairing with a nucleotide in the other strand.
  • the term also comprehends the pairing of nucleoside analogs, such as deoxyinosine, nucleosides with 2-aminopurine bases, and the like, that may be employed.
  • the term means that the triplex consists of a perfectly matched duplex and a third strand in which every nucleotide undergoes Hoogsteen or reverse Hoogsteen association with a basepair of the perfectly matched duplex.
  • a "mismatch" in a duplex between a tag and an oligonucleotide means that a pair or triplet of nucleotides in the duplex or triplex fails to undergo Watson-Crick and/or Hoogsteen and/or reverse Hoogsteen bonding.
  • nucleoside includes the natural nucleosides, including 2'-deoxy and 2'-hydroxyl forms, e.g. as described in Kornberg and Baker, DNA Replication, 2nd Ed. (Freeman, San Francisco, 1992).
  • "Analogs” in reference to nucleosides includes synthetic nucleosides having modified base moieties and/or modified sugar moieties, e.g. described by Scheit, Nucleotide Analogs (John Wiley, New York, 1980); Uhlman and Peyman, Chemical Reviews, 90: 543-584 (1990), or the like, with the only proviso that they are capable of specific hybridization.
  • Such analogs include synthetic nucleosides designed to enhance binding properties, reduce complexity, increase specificity, and the like.
  • sequence determination or "determining a nucleotide sequence” in reference to polynucleotides includes determination of partial as well as full sequence information of the polynucleotide. That is, the term includes sequence comparisons, fingerprinting, and like levels of information about a target polynucleotide, as well as the express identification and ordering of nucleosides, usually each nucleoside, in a target polynucleotide. The term also includes the determination of the identification, ordering, and locations of one, two, or three of the four types of nucleotides within a target polynucleotide.
  • sequence determination may be effected by identifying the ordering and locations of a single type of nucleotide, e.g. cytosines, within the target polynucleotide "CATCGC " so that its sequence is represented as a binary code, e.g. "100101 ... " for "C-(not C)-(not C)-C-(not C)-C ... " and the like.
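The binary representation in this example can be reproduced with a short Python sketch. The function name `binary_signature` is an illustrative choice, not a term from the patent.

```python
def binary_signature(sequence: str, base: str = "C") -> str:
    """Encode a sequence as 1 where `base` occurs and 0 elsewhere."""
    return "".join("1" if nt == base else "0" for nt in sequence)

if __name__ == "__main__":
    # C-(not C)-(not C)-C-(not C)-C for the target "CATCGC"
    print(binary_signature("CATCGC"))
```

Such a signature carries only partial sequence information, but it can still be sufficient for fingerprinting or for comparing targets against known sequences.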
  • complexity in reference to a population of polynucleotides means the number of different species of molecule present in the population.
  • the invention involves the ligation of encoded adaptors specifically hybridized to the terminus or termini of one or more target polynucleotides. Sequence information about the region where specific hybridization occurs is obtained by "decoding" the oligonucleotide tags of the encoded adaptors thus ligated. In one aspect of the invention, multiple sets of encoded adaptors are ligated to a target polynucleotide at staggered cleavage points so that the encoded adaptors provide sequence information from each of a plurality of portions of the target polynucleotide.
  • Such portions may be disjoint, overlapping or contiguous; however, preferably, the portions are contiguous and together permit the identification of a sequence of nucleotides equal to the sum of the lengths of the individual portions.
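How portions read at staggered cleavage points combine into one sequence can be sketched as follows. This is a minimal illustration, assuming each portion is reported with a known offset from the end of the target; the `(offset, sequence)` representation, the function name, and the use of "N" for unread positions are conventions adopted here, not taken from the patent.

```python
def merge_portions(portions: list[tuple[int, str]], gap: str = "N") -> str:
    """Merge (offset, sequence) portions into a single sequence.

    Contiguous portions concatenate, overlapping portions must agree,
    and positions covered by no portion are filled with `gap`.
    """
    length = max(offset + len(seq) for offset, seq in portions)
    merged = [gap] * length
    for offset, seq in portions:
        for i, base in enumerate(seq):
            current = merged[offset + i]
            if current != gap and current != base:
                raise ValueError(f"conflicting calls at position {offset + i}")
            merged[offset + i] = base
    return "".join(merged)

if __name__ == "__main__":
    # Three contiguous 4-nucleotide portions give 12 identified nucleotides.
    print(merge_portions([(0, "ACGT"), (4, "TTGA"), (8, "CCAT")]))
    # Disjoint portions leave unread positions marked as N.
    print(merge_portions([(0, "AC"), (4, "GT")]))
```

With contiguous portions, the number of identified nucleotides equals the sum of the portion lengths, as stated above; overlapping portions identify fewer nucleotides but give a consistency check.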
  • encoded adaptors are employed as an identification step in a process involving repeated cycles of ligation, identification, and cleavage, described more fully below.
  • the invention makes use of nucleases whose recognition sites are separate from their cleavage sites.
  • nucleases are type IIs restriction endonucleases.
  • the nucleases are used to generate protruding strands on target polynucleotides to which encoded adaptors are ligated.
  • the amount of sequence information obtained in a given embodiment of the invention depends in part on how many such nucleases are employed and the length of the protruding strand produced upon cleavage.
  • An important aspect of the invention is the capability of sequencing many target polynucleotides in parallel.
  • the present invention provides a method of determining a nucleotide sequence at an end of a polynucleotide, the method comprising the steps of:
  • said method is useful for determining nucleotide sequences of a plurality of polynucleotides, the method further comprising, prior to step (a), the steps of:
  • said method is useful for identifying a population of mRNA molecules, wherein said polynucleotides are cDNA molecules, and said step (i) comprises:
  • said step of ligating includes ligating a plurality of different encoded adaptors to said end of said polynucleotide, such that said protruding strands of the plurality of different encoded adaptors are complementary to a plurality of different portions of said strand of said polynucleotide, and there is a one-to-one correspondence between said different encoded adaptors and the different portions of said strand.
  • said different portions of said strand of said polynucleotide are contiguous.
  • a more preferred embodiment of the method of the invention includes the steps of (d) cleaving said encoded adaptors from said polynucleotides with a nuclease having a nuclease recognition site separate from its cleavage site so that a new protruding strand is formed on said end of each of said polynucleotides, and (e) repeating steps (a) through (d).
  • k target polynucleotides are prepared as described below, and also in Brenner, International patent applications PCT/US95/12791 (WO 96/12014) and PCT/US96/09513 (WO 96/41011). That is, a sample is taken from a population of polynucleotides conjugated with oligonucleotide tags designated by small "t's." These tags are sometimes referred to as oligonucleotide tags for sorting, or as "first" oligonucleotide tags.
  • the tag-polynucleotide conjugates of the sample are amplified, e.g. by polymerase chain reaction (PCR) or by cloning, to give 1 through k populations of conjugates, indicated by (14)-(18) in Fig. 1A.
  • the ends of the conjugates opposite of the (small "t") tags are prepared for ligating one or more adaptors, each of which contains a recognition site for a nuclease whose cleavage site is separate from its recognition site.
  • three such adaptors referred to herein as "cleavage adaptors," are employed.
  • the number of such adaptors employed depends on several factors, including the amount of sequence information desired, the availability of type IIs nucleases with suitable reaches and cleavage characteristics, and the like.
  • the tag-polynucleotide conjugates may be cleaved with a restriction endonuclease with a high frequency of recognition sites, such as Taq I, Alu I, HinP1 I, Dpn II, Nla III, or the like.
  • a staggered end may be produced with T4 DNA polymerase, e.g.
  • an exemplary set of three cleavage adaptors may be constructed as follows: where cleavage adaptors (1), (2), and (3) are shown in capital letters with the respective recognition sites of nucleases Bbs I, Bbv I, and Bsm FI underlined and a 5' phosphate indicated as "p.”
  • the double underlined portions of the target polynucleotide indicate the positions of the protruding strands after ligation and cleavage.
  • the target polynucleotide is left with a 5' protruding strand of four nucleotides.
  • many different embodiments can be constructed using different numbers and kinds of nucleases. As discussed in Brenner, U.S. patent 5,599,675 and WO 95/27080, preferably prior to cleavage, internal Bbs I, Bbv I, and Bsm FI sites are blocked, e.g. by methylation, to prevent undesirable cleavages at internal sites of the target polynucleotide.
  • cleavage adaptors A 1 , A 2 , and A 3 are ligated (20) in a concentration ratio of 1:1:1 to the k target polynucleotides to give the conjugates shown in Fig. 1B, such that within each population of tag-polynucleotide conjugates there are approximately equal numbers of conjugates having A 1 , A 2 , and A 3 attached.
  • the target polynucleotides are successively cleaved with each of the nucleases of the cleavage adaptors and ligated to a set of encoded adaptors.
  • the target polynucleotides are cleaved (22) with the nuclease of cleavage adaptor A 1 after which a first set of encoded adaptors are ligated to the resulting protruding strands.
  • the cleavage results in about a third of the target polynucleotides of each type, i.e. t 1 , t 2 , ... t k , being available for ligation.
  • the encoded adaptors are applied as one or more mixtures of adaptors which taken together contain every possible sequence of a protruding strand.
  • Reaction conditions are selected so that only encoded adaptors whose protruding strands form perfectly matched duplexes with those of the target polynucleotide are ligated to form encoded conjugates (28), (30), and (32) (Fig. 1C).
  • the capital "T's" with subscripts indicate that unique oligonucleotide tags are carried by the encoded adaptors for labeling.
  • the oligonucleotide tags carried by encoded adaptors are sometimes referred to as tags for delivering labels to the encoded adaptors, or as "second" oligonucleotide tags.
  • single stranded oligonucleotide tags used for sorting preferably consist of only three of the four nucleotides, so that a T4 DNA polymerase "stripping" reaction, e.g. Kuijper et al (cited above), can be used to prepare target polynucleotides for loading onto solid phase supports.
  • oligonucleotide tags employed for delivering labels may consist of all four nucleotides.
  • encoded adaptors comprise a protruding strand (24) and an oligonucleotide tag (26).
  • the encoded adaptors in this example may be ligated to the target polynucleotides in one or more mixtures of a total of 768 (3 x 256) members.
  • an encoded adaptor may also comprise a spacer region, as shown in the above example where the 4 nucleotide sequence "ttct" serves as a spacer between the protruding strand and the oligonucleotide tag.
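The count of 768 encoded adaptors (three sets of 256) can be made concrete with a small Python sketch. The tag labels of the form `T1_0` and the function name are hypothetical placeholders chosen for the illustration; spacer regions and actual tag sequences are not modelled.

```python
from itertools import product

def build_encoded_adaptor_sets(n_sets: int = 3,
                               overhang_len: int = 4) -> dict[tuple[int, str], str]:
    """Map (set number, protruding strand) to a unique, hypothetical tag label.

    One set is used per cleavage adaptor, so the tag identifies both the
    overhang sequence and which portion of the target it came from.
    """
    overhangs = ["".join(p) for p in product("ACGT", repeat=overhang_len)]
    return {
        (set_no, overhang): f"T{set_no}_{index}"
        for set_no in range(1, n_sets + 1)
        for index, overhang in enumerate(overhangs)
    }

if __name__ == "__main__":
    adaptors = build_encoded_adaptor_sets()
    print(len(adaptors))           # 3 x 256 members in total
    print(adaptors[(2, "AGGT")])   # tag for overhang AGGT in the second set
```

Because the tags are unique across sets, a detected tag reports both the protruding-strand sequence and the cleavage step at which the adaptor was ligated.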
  • the tag-polynucleotide conjugates are cleaved (34) with the nuclease of cleavage adaptor A 2 , after which a second set of encoded adaptors is applied to form conjugates (36), (38), and (40) (Fig. 1D).
  • the tag-polynucleotide conjugates are cleaved (42) with the nuclease of cleavage adaptor A 3 , after which a third set of encoded adaptors is applied to form conjugates (44), (46), and (48) (Fig. 1E).
  • the mixture is loaded (50) onto one or more solid phase supports via oligonucleotide tags t 1 through t k as described more fully below, and as taught by Brenner, e.g. PCT/US95/12791 or PCT/US96/09513. If a single target polynucleotide is analyzed, then clearly multiple oligonucleotide tags, t 1 , t 2 , ... t k , are not necessary. In such an embodiment, biotin, or like moiety, can be employed to anchor the polynucleotide-encoded adaptor conjugate, as no sorting is required.
  • the ordering of the steps of cleavage, ligation, and loading onto solid phase supports depends on the particular embodiment implemented.
  • the tag-polynucleotide conjugates may be loaded onto solid phase support first, followed by ligation of cleavage adaptors, cleavage thereof, and ligation of encoded adaptors; or, the cleavage adaptors may be ligated first, followed by loading, cleavage, and ligation of encoded adaptors; and so on.
  • sequence information is obtained by successively applying labeled tag complements to the immobilized target polynucleotides, either individually or as mixtures under conditions that permit the formation of perfectly matched duplexes and/or triplexes between the oligonucleotide tags of the encoded adaptors and their respective tag complements.
  • the numbers and complexity of the mixtures depend on several factors, including the type of labeling system used, the length of the portions whose sequences are to be identified, whether complexity reducing analogs are used, and the like.
  • the tag complements are applied individually to identify the nucleotides of each of the four-nucleotide portions of the target polynucleotide (i.e., 4 tag complements for each of 12 positions for a total of 48).
  • portions of different lengths would require different numbers of tag complements, e.g. in accordance with this embodiment, a 5-nucleotide portion would require 20 tag complements, a 2-nucleotide portion would require 8 tag complements, and so on.
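The tag complement counts quoted for this embodiment follow from using one tag complement per nucleotide type per position. A minimal Python sketch of that arithmetic, with an illustrative function name:

```python
def tag_complements_needed(portion_length: int, nucleotide_types: int = 4) -> int:
    """One tag complement per nucleotide type at each position of the portion."""
    return nucleotide_types * portion_length

if __name__ == "__main__":
    print(tag_complements_needed(12))  # three 4-nucleotide portions, 12 positions
    print(tag_complements_needed(5))   # a 5-nucleotide portion
    print(tag_complements_needed(2))   # a 2-nucleotide portion
    # For comparison: one tag per complete 4-nucleotide overhang would need 4**4 tags.
    print(4 ** 4)
```

The per-position scheme grows linearly with portion length, whereas assigning one tag to every complete protruding-strand sequence grows exponentially.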
  • the tag complements are applied under conditions sufficiently stringent so that only perfectly matched duplexes are formed, signals from the fluorescent labels on the specifically hybridized tag complements are measured, and the tag complements are washed from the encoded tags so that the next mixture can be applied.
  • the 16 tag complements have a one-to-one correspondence with the following sequences of the 4-mer portions of the target sequence: ANNN, NANN, NNAN, NNNA; CNNN, NCNN, NNCN, NNNC; GNNN, NGNN, NNGN, NNNG; TNNN, NTNN, NNTN, NNNT; where "N" is any one of the nucleotides A, C, G, or T.
  • This embodiment incorporates a significant degree of redundancy (a total of 16 tag complements are used to identify 4 nucleotides) in exchange for increased reliability of nucleotide determination.
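Decoding with the 16 position-specific tag complements can be sketched as picking, at each position, the nucleotide whose tag complement gives the strongest signal. This is a simplified illustration: the `signals` dictionary, its numeric values, and the function names are assumptions made for the example, and no real instrument output format is implied.

```python
BASES = "ACGT"

def pattern(base: str, position: int, length: int = 4) -> str:
    """Build a pattern such as 'NANN' for base 'A' at position 1."""
    return "N" * position + base + "N" * (length - position - 1)

ALL_PATTERNS = [pattern(b, pos) for b in BASES for pos in range(4)]

def decode_portion(signals: dict[str, float], length: int = 4) -> str:
    """Call the base with the highest signal at each position."""
    calls = []
    for pos in range(length):
        best = max(BASES, key=lambda b: signals.get(pattern(b, pos, length), 0.0))
        calls.append(best)
    return "".join(calls)

if __name__ == "__main__":
    # Hypothetical measurements: strong signal for the true base, weak background elsewhere.
    signals = {p: 0.05 for p in ALL_PATTERNS}
    for pos, base in enumerate("TCCA"):
        signals[pattern(base, pos)] = 1.0
    print(decode_portion(signals))
```

Since four measurements compete for each position, a weak or ambiguous call at one position is visible as a small margin between the top two signals, which is one way the redundancy can improve reliability.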
  • 12 mixtures of 4 tag complements each could be applied in succession by using four spectrally distinguishable fluorescent dyes, such that there is a one-to-one correspondence between dyes and kinds of nucleotide.
  • a fourth adaptor referred to herein as a "stepping adaptor” is ligated to the ends of the target polynucleotides along with cleavage adaptors A 1 , A 2 , and A 3 , for example, in a concentration ratio of 3:1:1:1.
  • the stepping adaptor includes a recognition site for a type IIs nuclease positioned such that its reach (defined below) will permit cleavage of the target polynucleotides at the end of the sequence determined via cleavage adaptors A 1 , A 2 , and A 3 .
  • An example of a stepping adaptor that could be used with the above set of cleavage adaptors is as follows: where, as above, the recognition site of the nuclease, in this case Bpm I, is singly underlined and the nucleotides at the cleavage site are doubly underlined.
  • the target polynucleotides cleaved with the nuclease of the stepping adaptor may be ligated to a further set of cleavage adaptors A 4 , A 5 , and A 6 , which may contain nuclease recognition sites that are the same or different than those contained in cleavage adaptors A 1 , A 2 , and A 3 .
  • Whether or not an enlarged set of encoded adaptors is required depends on whether cleavage and ligation reactions can be tolerated in the signal measurement apparatus. If, as above, it is desired to minimize enzyme reactions in connection with signal measurement, then additional sets of encoded adaptors must be employed.
  • cleavage adaptors A 4 , A 5 , and A 6 with the same nuclease recognition sites as A 1 , A 2 , and A 3 , and which could be used with the stepping adaptor shown above are as follows: where the cleavage sites are indicated by double underlining. Cleavage adaptors A 4 , A 5 , and A 6 are preferably applied as mixtures, such that every possible two-nucleotide protruding strand is represented.
  • the target polynucleotides are prepared for loading onto solid phase supports, preferably microparticles, as disclosed in Brenner, International patent application PCT/US95/12791 (WO 96/12014).
  • the oligonucleotide tags for sorting are rendered single stranded using a "stripping" reaction with T4 DNA polymerase, e.g. Kuijper et al (cited above).
  • the single stranded oligonucleotide tags are specifically hybridized and ligated to their tag complements on microparticles.
  • the loaded microparticles are then analyzed in an instrument, such as described in Brenner (cited above), which permits the sequential delivery, specific hybridization, and removal of labeled tag complements to encoded adaptors.
  • encoded adaptors are ligated to a target polynucleotide (or population of target polynucleotides) only one time
  • ligation methods include, but are not limited to, those disclosed in Shabarova, Biochimie 70: 1323-1334 (1988); Dolinnaya et al, Nucleic Acids Research, 16: 3721-3738 (1988); Letsinger et al, U.S.
  • an encoded adaptor having a 3'-bromoacetylated end is reacted with a polynucleotide having a complementary protruding strand and a thiophosphoryl group at its 5' end.
  • B and B' are nucleotides and their complements
  • z, r, s, q, and t are as described below.
  • Encoded adaptors may be used in an adaptor-based method of DNA sequencing that includes repeated cycles of ligation, identification, and cleavage, such as the method described in Brenner, U.S. patent 5,599,675 and PCT Publication No. WO 95/27080.
  • such a method comprises the following steps: (a) ligating an encoded adaptor to an end of a polynucleotide, the encoded adaptor having a nuclease recognition site of a nuclease whose cleavage site is separate from its recognition site; (b) identifying one or more nucleotides at the end of the polynucleotide by the identity of the encoded adaptor ligated thereto; (c) cleaving the polynucleotide with a nuclease recognizing the nuclease recognition site of the encoded adaptor such that the polynucleotide is shortened by one or more nucleotides; and (d) repeating said steps (a) through (c) until said nucleotide sequence of the polynucleotide is determined.
  • successive sets of tag complements are specifically hybridized to the respective tags carried by encoded adaptors ligated to the ends of the target polynucleotides, as described above.
  • the type and sequence of nucleotides in the protruding strands of the polynucleotides are identified by the label carried by the specifically hybridized tag complement and the set from which the tag complement came, as described above.
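Steps (a) through (d) can be pictured with a toy model. The sketch below is an illustration only and does not reproduce the chemistry: it assumes that each cycle identifies a fixed number of nucleotides (`overhang`) at the end of the target and that cleavage then shortens the target by the same number. The function name and its parameters are hypothetical.

```python
def read_by_cycles(target: str, overhang: int = 4) -> list:
    """Toy model of repeated ligation, identification, and cleavage.

    Assumption: every cycle identifies `overhang` nucleotides at the end of
    the target and then shortens the target by that many nucleotides.
    """
    if overhang < 1:
        raise ValueError("overhang must be at least 1")
    reads = []
    remaining = target
    while len(remaining) >= overhang:
        # (a) and (b): the ligated encoded adaptor reports the protruding strand
        reads.append(remaining[:overhang])
        # (c): cleavage shortens the polynucleotide
        remaining = remaining[overhang:]
        # (d): the loop repeats the cycle
    return reads


signature = read_by_cycles("ACGTTGCAGGCT", overhang=4)
```

In this model the concatenated reads reproduce the target sequence; a real embodiment would also have to handle a final fragment shorter than the protruding strand.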
  • Oligonucleotide tags are employed for two different purposes in the preferred embodiments of the invention. First, they are employed as described in Brenner, International patent applications PCT/US95/12791 and PCT/US96/09513 (WO 96/12014 and WO 96/41011), to sort large numbers of polynucleotides, e.g. several thousand to several hundred thousand, from a mixture into uniform populations of identical polynucleotides for analysis. Second, they are employed to deliver labels to encoded adaptors that number in the range of a few tens to a few thousand. For the former use, large numbers, or repertoires, of tags are typically required, and therefore synthesis of individual oligonucleotide tags is problematic.
  • oligonucleotide tags of a minimally cross-hybridizing set may be separately synthesized, as well as synthesized combinatorially.
  • nucleotide sequences of oligonucleotides of a minimally cross-hybridizing set are conveniently enumerated by simple computer programs, such as those exemplified by the programs whose source codes are listed in Appendices I and II. Similar computer programs are readily written for listing oligonucleotides of minimally cross-hybridizing sets for any embodiment of the invention. Table I below provides guidance as to the size of sets of minimally cross-hybridizing oligonucleotides for the indicated lengths and number of nucleotide differences. The above computer programs were used to generate the numbers.
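As a rough illustration of how such an enumeration might look, the following sketch collects words greedily so that every pair differs in at least `min_diff` positions. It is not the program of Appendix I or II, it does not guarantee a set of maximal size, and the function names and the example alphabet are assumptions made for illustration.

```python
from itertools import product


def differences(a: str, b: str) -> int:
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_min_cross_hyb_set(length: int, min_diff: int, alphabet: str = "ACGT") -> list:
    """Greedily collect words that differ pairwise in at least `min_diff` positions."""
    chosen = []
    for letters in product(alphabet, repeat=length):
        word = "".join(letters)
        # Keep the word only if it is far enough from every word kept so far.
        if all(differences(word, kept) >= min_diff for kept in chosen):
            chosen.append(word)
    return chosen


# Example: 4-mers built from three nucleotides, at least three differences.
words = greedy_min_cross_hyb_set(length=4, min_diff=3, alphabet="ACT")
```

The number of positional differences between two words equals the number of mismatches formed when one word pairs with the complement of the other, which is why a simple position-by-position comparison is used here.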
  • tag complements in mixtures are selected to have similar duplex or triplex stabilities to one another so that perfectly matched hybrids have similar or substantially identical melting temperatures.
  • minimally cross-hybridizing sets may be constructed from subunits that make approximately equivalent contributions to duplex stability as every other subunit in the set.
  • the computer programs of Appendices I and II may be used to generate and list the sequences of minimally cross-hybridizing sets of oligonucleotides that are used directly (i.e. without concatenation into "sentences"). Such lists can be further screened for additional criteria, such as GC-content, distribution of mismatches, theoretical melting temperature, and the like, to form additional minimally cross-hybridizing sets.
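A screening pass of this kind could be sketched as follows. The text does not specify a melting temperature formula; the sketch assumes the simple rule of thumb Tm = 2(A+T) + 4(G+C) degrees C for short oligonucleotides, and the GC range and temperature window are hypothetical thresholds chosen only for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C nucleotides in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rule_of_thumb_tm(seq: str) -> int:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def screen(candidates, gc_range=(0.25, 0.75), tm_window=4):
    """Keep candidates inside the GC range and within tm_window of the median Tm."""
    in_gc = [s for s in candidates if gc_range[0] <= gc_fraction(s) <= gc_range[1]]
    if not in_gc:
        return []
    tms = sorted(rule_of_thumb_tm(s) for s in in_gc)
    median = tms[len(tms) // 2]
    return [s for s in in_gc if abs(rule_of_thumb_tm(s) - median) <= tm_window]


kept = screen(["ACGTACGT", "AAAAAAAA", "GGGGCCCC", "ATCGATCG"])
```

A screened list is a subset of the original, so the pairwise difference property of the starting set is preserved.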
  • subunits may be provided that have the same terminal nucleotides.
  • the sum of the base-stacking energies of all the adjoining terminal nucleotides will be the same, thereby reducing or eliminating variability in tag melting temperatures.
  • a "word" of terminal nucleotides may also be added to each end of a tag so that a perfect match is always formed between it and a similar terminal "word" on any other tag complement.
  • Such an augmented tag would have the form:
      W  W_1  W_2  ... W_(k-1)  W_k  W
      W' W_1' W_2' ... W_(k-1)' W_k' W'
    where the primed W's indicate complements.
  • for oligonucleotide tags used for sorting, preferred minimally cross-hybridizing sets are those whose subunits are made up of three of the four natural nucleotides.
  • the absence of one type of nucleotide in the oligonucleotide tags permits target polynucleotides to be loaded onto solid phase supports by use of the 5'→3' exonuclease activity of a DNA polymerase.
  • each member would form a duplex having three mismatched bases with the complement of every other member.
  • for oligonucleotide tags used for delivering labels to encoded adaptors, all four nucleotides are employed.
  • oligonucleotide tags of the invention and their complements are conveniently synthesized on an automated DNA synthesizer, e.g. an Applied Biosystems, Inc. (Foster City, California) model 392 or 394 DNA/RNA Synthesizer, using standard chemistries, such as phosphoramidite chemistry, e.g. disclosed in the following references: Beaucage and Iyer, Tetrahedron, 48: 2223-2311 (1992); Molko et al, U.S. patent 4,980,460; Koster et al, U.S. patent 4,725,677; Caruthers et al, U.S.
  • tags may comprise naturally occurring nucleotides that permit processing or manipulation by enzymes, while the corresponding tag complements may comprise non-natural nucleotide analogs, such as peptide nucleic acids, or like compounds, that promote the formation of more stable duplexes during sorting.
  • both the oligonucleotide tags and tag complements may be constructed from non-natural nucleotides, or analogs, provided ligation can take place, either chemically or enzymatically.
  • Double stranded forms of tags may be made by separately synthesizing the complementary strands followed by mixing under conditions that permit duplex formation.
  • double stranded tags may be formed by first synthesizing a single stranded repertoire linked to a known oligonucleotide sequence that serves as a primer binding site. The second strand is then synthesized by combining the single stranded repertoire with a primer and extending with a polymerase. This latter approach is described in Oliphant et al, Gene, 44: 177-183 (1986).
  • duplex tags may then be inserted into cloning vectors along with target polynucleotides for sorting and manipulation of the target polynucleotide in accordance with the invention.
  • tag complements are employed that are made up of nucleotides that have enhanced binding characteristics, such as PNAs or oligonucleotide N3'→P5' phosphoramidates.
  • sorting can be implemented through the formation of D-loops between tags comprising natural nucleotides and their PNA or phosphoramidate complements, as an alternative to the "stripping" reaction employing the 3'→5' exonuclease activity of a DNA polymerase to render a tag single stranded.
  • Oligonucleotide tags for sorting may range in length from 12 to 60 nucleotides or basepairs. Preferably, oligonucleotide tags range in length from 18 to 40 nucleotides or basepairs. More preferably, oligonucleotide tags range in length from 25 to 40 nucleotides or basepairs.
  • oligonucleotide tags for sorting are single stranded and specific hybridization occurs via Watson-Crick pairing with a tag complement.
  • repertoires of single stranded oligonucleotide tags for sorting contain at least 100 members; more preferably, repertoires of such tags contain at least 1000 members; and most preferably, repertoires of such tags contain at least 10,000 members.
  • repertoires of tag complements for delivering labels contain at least 16 members; more preferably, repertoires of such tags contain at least 64 members. Still more preferably, such repertoires of tag complements contain from 16 to 1024 members, e.g. a number for identifying nucleotides in protruding strands of from 2 to 5 nucleotides in length. Most preferably, such repertoires of tag complements contain from 64 to 256 members.
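The range of 16 to 1024 members matches the number of possible sequences of a protruding strand 2 to 5 nucleotides long. The sketch below checks that arithmetic by enumeration; it assumes one tag complement per possible protruding-strand sequence, which is one reading of the passage, and the function name is hypothetical.

```python
from itertools import product


def protruding_strands(n: int) -> list:
    """All possible sequences of an n-nucleotide protruding strand."""
    return ["".join(p) for p in product("ACGT", repeat=n)]


# Number of distinct protruding strands for lengths 2 through 5.
sizes = {n: len(protruding_strands(n)) for n in range(2, 6)}
```

Under this assumption the most preferred range of 64 to 256 members corresponds to protruding strands of 3 and 4 nucleotides.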
  • Repertoires of desired sizes are selected by directly generating sets of words, or subunits, of the desired size, e.g.
  • the length of single stranded tag complements for delivering labels is between 8 and 20. More preferably, the length is between 9 and 15.
  • coding of tag sequences follows the same principles as for duplex-forming tags; however, there are further constraints on the selection of subunit sequences.
  • third strand association via Hoogsteen type of binding is most stable along homopyrimidine-homopurine tracks in a double stranded target.
  • base triplets form in T-A*T or C-G*C motifs (where "-" indicates Watson-Crick pairing and "*" indicates Hoogsteen type of binding); however, other motifs are also possible.
  • Hoogsteen base pairing permits parallel and antiparallel orientations between the third strand (the Hoogsteen strand) and the purine-rich strand of the duplex to which the third strand binds, depending on conditions and the composition of the strands.
  • nucleoside type e.g. whether ribose or deoxyribose nucleosides are employed
  • base modifications e.g. methylated cytosine, and the like in order to maximize, or otherwise regulate, triplex stability as desired in particular embodiments, e.g. Roberts et al, Proc. Natl. Acad.
  • oligonucleotide tags of the invention employing triplex hybridization are double stranded DNA and the corresponding tag complements are single stranded. More preferably, 5-methylcytosine is used in place of cytosine in the tag complements in order to broaden the range of pH stability of the triplex formed between a tag and its complement.
  • Preferred conditions for forming triplexes are fully disclosed in the above references. Briefly, hybridization takes place in concentrated salt solution, e.g. 1.0 M NaCl, 1.0 M potassium acetate, or the like, at pH below 5.5 (or 6.5 if 5-methylcytosine is employed).
  • Hybridization temperature depends on the length and composition of the tag; however, for an 18-20-mer tag or longer, hybridization at room temperature is adequate. Washes may be conducted with less concentrated salt solutions, e.g. 10 mM sodium acetate, 100 mM MgCl2, pH 5.8, at room temperature. Tags may be eluted from their tag complements by incubation in a similar salt solution at pH 9.0.
  • Minimally cross-hybridizing sets of oligonucleotide tags that form triplexes may be generated by the computer program of Appendix II, or similar programs.
  • An exemplary set of double stranded 8-mer words is listed below in capital letters with the corresponding complements in small letters. Each such word differs from each of the other words in the set by three base pairs.
  • the encoded adaptors and cleavage adaptors are conveniently synthesized on automated DNA synthesizers using standard chemistries, such as phosphoramidite chemistry, e.g. disclosed in the following references: Beaucage and Iyer, Tetrahedron, 48: 2223-2311 (1992); Molko et al, U.S. patent 4,980,460; Koster et al, U.S. patent 4,725,677; Caruthers et al, U.S. patents 4,415,732; 4,458,066; and 4,973,679; and the like.
  • Alternative chemistries, e.g. resulting in non-natural backbone groups such as phosphorothioate, phosphoramidate, and the like, may also be employed provided that the resulting oligonucleotides are compatible with the ligation and/or cleavage reagents used in a particular embodiment.
  • the strands are combined to form a double stranded adaptor.
  • the protruding strand of an encoded adaptor may be synthesized as a mixture, such that every possible sequence is represented in the protruding portion. Such mixtures are readily synthesized using well known techniques, e.g.
  • the loop region may comprise from about 3 to 10 nucleotides, or other comparable linking moieties, e.g. alkylether groups, such as disclosed in U.S. patent 4,914,210.
  • the 5' end of the adaptor may be phosphorylated in some embodiments.
  • a 5' monophosphate can be attached to a second oligonucleotide either chemically or enzymatically with a kinase, e.g. Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Edition (Cold Spring Harbor Laboratory, New York, 1989). Chemical phosphorylation is described by Horn and Urdea, Tetrahedron Lett., 27: 4705 (1986), and reagents for carrying out the disclosed protocols are commercially available, e.g. 5' Phosphate-ON™ from Clontech Laboratories (Palo Alto, California).
  • Encoded adaptors which can be used in the method of the invention can have several embodiments depending, for example, on whether single or double stranded tags are used, whether multiple tags are used, whether a 5' protruding strand or 3' protruding strand is employed, whether a 3' blocking group is used, and the like. Formulas for several embodiments of encoded adaptors are shown below.
  • Preferred structures for encoded adaptors using one single stranded tag are as follows:
      5'-p(N)_n(N)_r(N)_s(N)_q(N)_t-3'
              z(N')_r(N')_s(N')_q-5'
    or
              p(N)_r(N)_s(N)_q(N)_t-3'
      3'-z(N)_n(N')_r(N')_s(N')_q-5'
  • N is a nucleotide and N' is its complement
  • p is a phosphate group
  • z is a 3' hydroxyl or a 3' blocking group
  • n is an integer between 2 and 6, inclusive
  • r is an integer greater than or equal to
  • s is an integer which is either between four and six whenever the encoded adaptor has a nucle
  • n is 4 or 5
  • t is between 9 and 15, inclusive.
  • an encoded adaptor contains a nuclease recognition site
  • the region of "r" nucleotide pairs is selected so that a predetermined number of nucleotides are cleaved from a target polynucleotide whenever the nuclease recognizing the site is applied.
  • the size of "r" in a particular embodiment depends on the reach of the nuclease (as the term is defined in U.S. patent 5,599,675 and WO 95/27080) and the number of nucleotides sought to be cleaved from the target polynucleotide.
  • r is between 0 and 20; more preferably, r is between 0 and 12.
  • the region of "q" nucleotide pairs is a spacer segment between the nuclease recognition site and the tag region of the encoded probe.
  • the region of "q" nucleotide pairs may include further nuclease recognition sites, labelling or signal generating moieties, or the like.
  • the single stranded oligonucleotide of "t" nucleotides is a "t-mer" oligonucleotide tag selected from a minimally cross-hybridizing set.
  • the 3' blocking group "z" may have a variety of forms and may include almost any chemical entity that precludes ligation and that does not interfere with other steps of the method, e.g. removal of the 3' blocked strand, ligation, or the like.
  • Exemplary 3' blocking groups include, but are not limited to, hydrogen (i.e. 3' deoxy), phosphate, phosphorothioate, acetyl, and the like.
  • the 3' blocking group is a phosphate because of the convenience in adding the group during the synthesis of the 3' blocked strand and the convenience in removing the group with a phosphatase to render the strand capable of ligation with a ligase.
  • An oligonucleotide having a 3' phosphate may be synthesized using the protocol described in chapter 12 of Eckstein, Editor, Oligonucleotides and Analogues: A Practical Approach (IRL Press, Oxford, 1991).
  • z is a 3' blocking group
  • it is a phosphate group and the double stranded portion of the adaptors contains a nuclease recognition site of a nuclease whose recognition site is separate from its cleavage site.
  • encoded tags of the invention preferably have the following form:
      5'-p(N)_n(N)_r(N)_s(N)_q(N)_t-3'
              z(N')_r(N')_s(N')_q(N')_t-5'
    or
              p(N)_r(N)_s(N)_q(N)_t-3'
      3'-z(N)_n(N')_r(N')_s(N')_q(N')_t-5'
  • N, N', p, q, r, s, z, and n are defined as above.
  • t is an integer in the range of 12 to 24.
  • encoded adaptors of the invention include embodiments with multiple tags, such as the following:
      5'-p(N)_n(N)_r(N)_s(N)_q(N)_t1 ... (N)_tk-3'
              z(N')_r(N')_s(N')_q(N')_t1 ... (N')_tk-5'
    or
              p(N)_r(N)_s(N)_q(N)_t1 ... (N)_tk-3'
      3'-z(N)_n(N')_r(N')_s(N')_q(N')_t1 ... (N')_tk-5'
    where the encoded adaptor includes k double stranded tags.
  • the present invention also relates to a composition of matter comprising a plurality of double stranded oligonucleotide adaptors, wherein the adaptors are of the form:
      5'-p(N)_n(N)_r(N)_s(N)_q(N)_t-3'
              z(N')_r(N')_s(N')_q-5'
    or
              p(N)_r(N)_s(N)_q(N)_t-3'
      3'-z(N)_n(N')_r(N')_s(N')_q-5'
    wherein each (N)_t is a unique single stranded oligonucleotide tag and is selected from a minimally cross-hybridizing set of oligonucleotides, such that each oligonucleotide of the set differs from every
  • the tag complements of the invention can be labeled in a variety of ways for decoding oligonucleotide tags, including the direct or indirect attachment of radioactive moieties, fluorescent moieties, colorimetric moieties, chemiluminescent moieties, and the like.
  • Many comprehensive reviews of methodologies for labeling DNA and constructing DNA adaptors provide guidance applicable to constructing adaptors of the present invention. Such reviews include Matthews et al, Anal. Biochem. , Vol 169, pgs.
  • fluorescent signal generating moiety means a signaling means which conveys information through the fluorescent absorption and/or emission properties of one or more molecules. Such fluorescent properties include fluorescence intensity, fluorescence life time, emission spectrum characteristics, energy transfer, and the like.
  • cleavage adaptors are ligated to the ends of target polynucleotides to prepare such ends for eventual ligation of encoded adaptors.
  • ligation is carried out enzymatically using a ligase in a standard protocol.
  • Many ligases are known and are suitable for use in the invention, e.g. Lehman, Science, 186: 790-797 (1974); Engler et al, DNA Ligases, pages 3-30 in Boyer, editor, The Enzymes, Vol. 15B (Academic Press, New York, 1982); and the like.
  • Preferred ligases include T4 DNA ligase, T7 DNA ligase, E.
  • ligases require that a 5' phosphate group be present for ligation to the 3' hydroxyl of an abutting strand. This is conveniently provided for at least one strand of the target polynucleotide by selecting a nuclease which leaves a 5' phosphate, e.g. Fok I.
  • a special problem may arise in dealing with either polynucleotide ends or adaptors that are capable of self-ligation, such as illustrated in Figure 2, where the four-nucleotide protruding strands of the anchored polynucleotides are complementary to one another (114).
  • This problem is especially severe in embodiments where the polynucleotides (112) to be analyzed are presented to the adaptors as uniform populations of identical polynucleotides attached to a solid phase support (110). In these situations, the free ends of the anchored polynucleotides can twist around to form perfectly matched duplexes (116) with one another.
  • the polynucleotides are readily ligated in the presence of a ligase.
  • An analogous problem also exists for double stranded adaptors. Namely, whenever their 5' strands are phosphorylated, the 5' strand of one adaptor may be ligated to the free 3' hydroxyl of another adaptor whenever the nucleotide sequences of their protruding strands are complementary. When self-ligation occurs, the protruding strands of neither the adaptors nor the target polynucleotides are available for analysis or processing.
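Whether a given protruding strand is prone to this kind of self-ligation can be checked by comparing it with its own reverse complement. The sketch below is a minimal illustration that assumes identical protruding strands of the same polarity and perfect Watson-Crick pairing; the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def can_self_ligate(overhang: str) -> bool:
    """True if two copies of the same protruding strand can pair with each other."""
    return overhang.upper() == reverse_complement(overhang)


palindromic = can_self_ligate("GATC")      # pairs with another copy of itself
non_palindromic = can_self_ligate("AACC")  # reverse complement is GGTT
```

Only self-complementary protruding strands are flagged; pairs of different but mutually complementary strands in a mixture would need a pairwise comparison.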
  • the encoded adaptors and target polynucleotides may be combined for ligation either singly or as mixtures.
  • a single kind of adaptor having a defined sequence may be combined with a single kind of polynucleotide having a common (and perhaps, unknown) nucleotide sequence; or a single kind of adaptor having a defined sequence may be combined with a mixture of polynucleotides, such as a plurality of uniform populations of identical polynucleotides attached to different solid phase supports in the same reaction vessel, e.g.
  • a 3' deoxy may be removed from a second strand by a polymerase "exchange" reaction disclosed in Kuijper et al, Gene, 112: 147-155 (1992); Aslanidis et al, Nucleic Acids Research, 18: 6069-6074 (1990); and like references.
  • the 5'→3' exonuclease activity of T4 DNA polymerase, and like enzymes, may be used to exchange nucleotides in a priming strand with their triphosphate counterparts in solution, e.g. Kuijper et al (cited above).
  • a 3' dideoxynucleotide can be exchanged with a 2'-deoxy-3'-hydroxynucleotide from a reaction mixture, which renders the second strand ligatable to the target polynucleotide after treatment with a polynucleotide kinase.
  • a preferred embodiment employing cycles of ligation and cleavage comprises the following steps: (a) ligating (220) an encoded adaptor to an end of the polynucleotide (222), the end of the polynucleotide having a dephosphorylated 5' hydroxyl, the end of the double stranded adaptor to be ligated (224) having a first strand (226) and a second strand (228), the second strand of the double stranded adaptor having a 3' blocking group (230), and the double stranded adaptor having a nuclease recognition site (250) of a nuclease whose recognition site is separate from its cleavage site; (b) removing the 3' blocking group after ligation, e.g.
  • ends of polynucleotides to be analyzed are prepared by digesting them with one or more restriction endonucleases that produce predetermined cleavages, usually having 3' or 5' protruding strands, i.e. "sticky" ends. Such digestions usually leave the 5' strands phosphorylated.
  • these 5' phosphorylated ends are dephosphorylated by treatment with a phosphatase, such as calf intestinal alkaline phosphatase, or like enzyme, using standard protocols, e.g. as described in Sambrook et al, Molecular Cloning, Second Edition (Cold Spring Harbor Laboratory, New York, 1989).
  • Nuclease as the term is used in accordance with the invention means any enzyme, combination of enzymes, or other chemical reagents, or combinations of chemical reagents and enzymes that when applied to a ligated complex, discussed more fully below, cleaves the ligated complex to produce an augmented adaptor and a shortened target polynucleotide.
  • a nuclease of the invention need not be a single protein, or consist solely of a combination of proteins.
  • a key feature of the nuclease, or of the combination of reagents employed as a nuclease, is that its (their) cleavage site be separate from its (their) recognition site.
  • the distance between the recognition site of a nuclease and its cleavage site will be referred to herein as its "reach."
  • "reach" is defined by two integers which give the number of nucleotides between the recognition site and the hydrolyzed phosphodiester bonds of each strand.
  • the recognition and cleavage properties of Fok I are typically represented as "GGATG(9/13)" because it recognizes and cuts a double stranded DNA as follows (SEQ ID NO: 2): where the bolded nucleotides are Fok I's recognition site and the N's are arbitrary nucleotides and their complements.
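The "GGATG(9/13)" notation can be turned into cut coordinates as in the sketch below. It is a simplified illustration that searches only the top strand in the forward orientation and counts both reach values from the 3' end of the recognition site along the top strand; the function name and the example sequence are hypothetical.

```python
def cut_positions(top: str, site: str = "GGATG", reach: tuple = (9, 13)):
    """Return (top_cut, bottom_cut) as 0-based indices into the top strand.

    Each value is the index of the first nucleotide after the cut, counted
    along the top strand written 5' to 3'. Returns None if the site is absent
    or the cuts would fall beyond the end of the sequence.
    """
    start = top.upper().find(site)
    if start < 0:
        return None
    end = start + len(site)
    top_cut, bottom_cut = end + reach[0], end + reach[1]
    if bottom_cut > len(top):
        return None
    return top_cut, bottom_cut


seq = "GGATG" + "A" * 9 + "CTGA" + "TTTT"
top_cut, bottom_cut = cut_positions(seq)
# Top-strand nucleotides left single stranded on the downstream fragment.
overhang = seq[top_cut:bottom_cut]
```

The difference between the two reach values, here four nucleotides, is the length of the protruding strand left on the target polynucleotide.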
  • the nuclease should cleave the target polynucleotide only after it forms a complex with its recognition site; and preferably, the nuclease leaves a protruding strand on the target polynucleotide after cleavage.
  • nucleases employed in the invention are natural protein endonucleases (i) whose recognition site is separate from its cleavage site and (ii) whose cleavage results in a protruding strand on the target polynucleotide.
  • class IIs restriction endonucleases are employed as nucleases in the invention, e.g. as described in Szybalski et al, Gene, 100: 13-26 (1991); Roberts et al, Nucleic Acids Research, 21: 3125-3137 (1993); and Livak and Brenner, U.S. patent 5,093,245.
  • Exemplary class IIs nucleases for use with the invention include Alw XI, Bsm AI, Bbv I, Bsm FI, Sts I, Hga I, Bsc AI, Bbv II, Bce fI, Bce 85I, Bcc I, Bcg I, Bsa I, Bsg I, Bsp MI, Bst 71 I, Ear I, Eco 57I, Esp 3I, Fau I, Fok I, Gsu I, Hph I, Mbo II, Mme I, Rle AI, Sap I, Sfa NI, Taq II, Tth 111II, Bco 5I, Bpu AI, Fin I, Bsr DI, and isoschizomers thereof.
  • Preferred nucleases include Bbv I, Fok I, Hga I, Ear I, and Sfa NI.
  • Bbv I is the most preferred nuclease.
  • the target polynucleotide is treated to block the recognition sites and/or cleavage sites of the nuclease being employed. This prevents undesired cleavage of the target polynucleotide because of the fortuitous occurrence of nuclease recognition sites at interior locations in the target polynucleotide. Blocking can be achieved in a variety of ways, including methylation and treatment by sequence-specific aptamers, DNA binding proteins, or oligonucleotides that form triplexes.
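Fortuitous interior sites of the kind described here can be located with a simple scan of both strands. The sketch below is an illustration under stated assumptions: it handles only exact, non-degenerate recognition sequences, reports positions in top-strand coordinates, and uses hypothetical names and a made-up example sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def internal_sites(target: str, site: str) -> list:
    """0-based start positions (top-strand coordinates) of `site` on either strand."""
    target, site = target.upper(), site.upper()
    rc_site = site.translate(COMPLEMENT)[::-1]
    hits = set()
    # A site on the bottom strand appears as its reverse complement on the top strand.
    for pattern in {site, rc_site}:
        pos = target.find(pattern)
        while pos >= 0:
            hits.add(pos)
            pos = target.find(pattern, pos + 1)
    return sorted(hits)


hits = internal_sites("TTGGATGAACATCCTT", "GGATG")
```

A target with any hits would be a candidate for the blocking treatments listed in the passage.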
  • recognition sites can be conveniently blocked by methylating the target polynucleotide with the cognate methylase of the nuclease being used. That is, for most if not all type II bacterial restriction endonucleases, there exists a so-called "cognate" methylase that methylates its recognition site. Many such methylases are disclosed in Roberts et al (cited above) and Nelson et al, Nucleic Acids Research, 21: 3139-3154 (1993), and are commercially available from a variety of sources, particularly New England Biolabs (Beverly, MA).
  • 5-methylcytosine triphosphates may be used during amplification so that the natural cytosines are replaced by methylated cytosines in the amplicon.
  • This latter approach has the added advantage of eliminating the need to treat a target polynucleotide bound to a solid phase support with another enzyme.
  • kits of the invention include encoded adaptors, cleavage adaptors, and labeled tag complements. Kits further include the nuclease reagents, the ligation reagents, and instructions for practicing the particular embodiment of the invention. In embodiments employing natural protein endonucleases and ligases, ligase buffers and nuclease buffers may be included. In some cases, these buffers may be identical. Such kits may also include a methylase and its reaction buffer. Preferably, kits also include one or more solid phase supports, e.g. microparticles carrying tag complements for sorting and anchoring target polynucleotides.
  • An important aspect of the invention is the sorting and attachment of a population of polynucleotides, e.g. from a cDNA library, to microparticles or to separate regions on a solid phase support such that each microparticle or region has substantially only one kind of polynucleotide attached.
  • This objective is accomplished by ensuring that substantially all different polynucleotides have different tags attached. This condition, in turn, is brought about by taking a sample of the full ensemble of tag-polynucleotide conjugates for analysis.
  • sampling can be carried out overtly (for example, by taking a small volume from a larger mixture) after the tags have been attached to the polynucleotides; it can be carried out inherently, as a secondary effect of the techniques used to process the polynucleotides and tags; or it can be carried out both overtly and as an inherent part of processing steps.
  • a tag repertoire is employed whose complexity, or number of distinct tags, greatly exceeds the total number of mRNAs extracted from a cell or tissue sample.
  • the complexity of the tag repertoire is at least 10 times that of the polynucleotide population; and more preferably, the complexity of the tag repertoire is at least 100 times that of the polynucleotide population.
  • a protocol is disclosed for cDNA library construction using a primer mixture that contains a full repertoire of exemplary 9-word tags. Such a mixture of tag-containing primers has a complexity of 8^9, or about 1.34 x 10^8.
  • mRNA for library construction can be extracted from as few as 10-100 mammalian cells. Since a single mammalian cell contains about 5 x 10^5 copies of mRNA molecules of about 3.4 x 10^4 different kinds, by standard techniques one can isolate the mRNA from about 100 cells, or (theoretically) about 5 x 10^7 mRNA molecules. Comparing this number to the complexity of the primer mixture shows that without any additional steps, and even assuming that mRNAs are converted into cDNAs with perfect efficiency (1% efficiency or less is more accurate), the cDNA library construction protocol results in a population containing no more than 37% of the total number of different tags.
  • the protocol inherently generates a sample that comprises 37%, or less, of the tag repertoire.
  • the probability of obtaining a double under these conditions is about 5%, which is within the preferred range.
  • the fraction of the tag repertoire sampled is reduced to only 3.7%, even assuming that all the processing steps take place at 100% efficiency.
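The 37% and 3.7% figures follow from dividing the number of mRNA molecules by the tag complexity of 8^9, for 100 and 10 cells respectively. The sketch below reproduces that arithmetic and, as an assumption not stated in this passage, estimates the chance of a double with a Poisson model in which a tag is carried by exactly two molecules; under that assumption the result for 100 cells is close to the 5% quoted. The function names are hypothetical.

```python
import math

tag_repertoire = 8 ** 9   # nine subunits, eight possible words each
mrna_per_cell = 5e5       # approximate mRNA copies per mammalian cell


def sampled_fraction(cells: int) -> float:
    """Upper bound on the fraction of the repertoire used, at 100% efficiency."""
    return cells * mrna_per_cell / tag_repertoire


def double_probability(fraction: float) -> float:
    """Poisson probability that a given tag is carried by exactly two molecules."""
    return fraction ** 2 * math.exp(-fraction) / 2


f100 = sampled_fraction(100)      # about 0.37
f10 = sampled_fraction(10)        # about 0.037
p100 = double_probability(f100)   # roughly 0.05 under the Poisson assumption
```

Because real conversion efficiencies are far below 100%, the fractions actually sampled would be smaller than these upper bounds.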
  • the efficiencies of the processing steps for constructing cDNA libraries are very low, a "rule of thumb" being that a good library should contain about 10^8 cDNA clones from mRNA extracted from 10^6 mammalian cells.
  • a tag-polynucleotide conjugate mixture potentially contains every possible pairing of tags and types of mRNA or polynucleotide.
  • overt sampling may be implemented by removing a sample volume after a serial dilution of the starting mixture of tag-polynucleotide conjugates. The amount of dilution required depends on the amount of starting material and the efficiencies of the processing steps, which are readily estimated.
  • Such a sample is readily obtained as follows: Assume that the 5 x 10^11 mRNAs are perfectly converted into 5 x 10^11 vectors with tag-cDNA conjugates as inserts and that the 5 x 10^11 vectors are in a reaction solution having a volume of 100 µl. Four 10-fold serial dilutions may be carried out by transferring 10 µl from the original solution into a vessel containing 90 µl of an appropriate buffer, such as TE. This process may be repeated for three additional dilutions to obtain a 100 µl solution containing 5 x 10^5 vector molecules per µl. A 2 µl aliquot from this solution yields 10^6 vectors containing tag-cDNA conjugates as inserts. This sample is then amplified by straightforward transformation of a competent host cell followed by culturing.
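The dilution arithmetic in this example can be checked with a few lines of code. The sketch below assumes ideal mixing and no loss of material; the function name and its parameters are hypothetical.

```python
def serial_dilution(molecules: float, volume_ul: float, steps: int,
                    transfer_ul: float = 10.0, diluent_ul: float = 90.0) -> float:
    """Concentration (molecules per microliter) after `steps` sequential dilutions."""
    conc = molecules / volume_ul
    factor = transfer_ul / (transfer_ul + diluent_ul)
    for _ in range(steps):
        conc *= factor
    return conc


# 5 x 10^11 vectors in 100 microliters, four 10-fold dilutions.
conc = serial_dilution(5e11, 100.0, steps=4)
# Number of vectors in a 2 microliter aliquot of the final dilution.
aliquot = 2 * conc
```

With these inputs the final concentration is about 5 x 10^5 vectors per microliter and the aliquot holds about 10^6 vectors, in agreement with the figures in the text.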
  • a repertoire of oligonucleotide tags can be conjugated to a population of polynucleotides in a number of ways, including direct enzymatic ligation, amplification, e.g. via PCR, using primers containing the tag sequences, and the like.
  • the initial ligating step produces a very large population of tag-polynucleotide conjugates such that a single tag is generally attached to many different polynucleotides.
  • the probability of obtaining "doubles", i.e. the same tag on two different polynucleotides, can be made negligible.
  • the larger the sample the greater the probability of obtaining a double.
  • substantially all in reference to attaching tags to molecules, especially polynucleotides, is meant to reflect the statistical nature of the sampling procedure employed to obtain a population of tag-molecule conjugates essentially free of doubles. The meaning of substantially all in terms of actual percentages of tag-molecule conjugates depends on how the tags are being employed.
  • substantially all means that at least eighty percent of the polynucleotides have unique tags attached. More preferably, it means that at least ninety percent of the polynucleotides have unique tags attached. Still more preferably, it means that at least ninety-five percent of the polynucleotides have unique tags attached. And, most preferably, it means that at least ninety-nine percent of the polynucleotides have unique tags attached.
  • oligonucleotide tags may be attached by reverse transcribing the mRNA with a set of primers preferably containing complements of tag sequences.
  • An exemplary set of such primers could have the following sequence: where "[W,W,W,C]_9" represents the sequence of an oligonucleotide tag of nine subunits of four nucleotides each and "[W,W,W,C]" represents the subunit sequences listed above, i.e. "W" represents T or A.
  • the underlined sequences identify an optional restriction endonuclease site that can be used to release the polynucleotide from attachment to a solid phase support via the biotin, if one is employed.
  • the complement attached to a microparticle could have the form: 5'-[G,W,W,W]_9TGG-linker-microparticle
  • the mRNA is removed, e.g. by RNase H digestion, and the second strand of the cDNA is synthesized using, for example, a primer of the following form (SEQ ID NO: 3): 5'-NRRGATCYNNN-3' where N is any one of A, T, G, or C; R is a purine-containing nucleotide, and Y is a pyrimidine-containing nucleotide.
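As a rough illustration of what the degenerate second strand primer 5'-NRRGATCYNNN-3' represents, the sketch below expands it into its individual sequences using the N, R, and Y definitions given in the text. The helper name `expand` and the lookup table are hypothetical; the check at the end relies on the fact that Bst YI recognizes RGATCY.

```python
from itertools import product

# Degeneracy codes as defined in the text: N = A/T/G/C, R = purine, Y = pyrimidine.
CODES = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT", "N": "ACGT"}


def expand(degenerate):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = (CODES[symbol] for symbol in degenerate)
    return ["".join(combo) for combo in product(*choices)]


primers = expand("NRRGATCYNNN")
print(len(primers))  # number of distinct primers in the mixture

# Every member carries R-GATC-Y at positions 3 to 8 (1-based), i.e. a Bst YI site.
has_site = all(p[3:7] == "GATC" and p[2] in "AG" and p[7] in "CT" for p in primers)
print(has_site)
```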
  • This particular primer creates a Bst YI restriction site in the resulting double stranded DNA which, together with the Sal I site, facilitates cloning into a vector with, for example, Bam HI and Xho I sites.
  • the polynucleotide-tag conjugates may then be manipulated using standard molecular biology techniques.
  • the above conjugate, which is actually a mixture, may be inserted into commercially available cloning vectors, e.g. Stratagene Cloning System (La Jolla, CA), and transfected into a host, such as a commercially available host bacteria, which is then cultured to increase the number of conjugates.
  • the cloning vectors may then be isolated using standard techniques, e.g. Sambrook et al, Molecular Cloning, Second Edition (Cold Spring Harbor Laboratory, New York, 1989).
  • appropriate adaptors and primers may be employed so that the conjugate population can be increased by PCR.
  • the Bst YI and Sal I digested fragments are cloned into a Bam HI-/Xho I-digested vector having the following single-copy restriction sites: This adds the Fok I site which will allow initiation of the sequencing process discussed more fully below.
  • Tags can be conjugated to cDNAs of existing libraries by standard cloning methods. cDNAs are excised from their existing vector, isolated, and then ligated into a vector containing a repertoire of tags.
  • the tag-containing vector is linearized by cleaving with two restriction enzymes so that the excised cDNAs can be ligated in a predetermined orientation.
  • the concentration of the linearized tag-containing vector is in substantial excess over that of the cDNA inserts so that ligation provides an inherent sampling of tags.
  • a general method for exposing the single stranded tag after amplification involves digesting a target polynucleotide-containing conjugate with the 3'→5' exonuclease activity of T4 DNA polymerase, or a like enzyme.
  • When used in the presence of a single deoxynucleoside triphosphate, such a polymerase will cleave nucleotides from 3' recessed ends present on the non-template strand of a double stranded fragment until a complement of the single deoxynucleoside triphosphate is reached on the template strand.
  • the technique may also be used to preferentially methylate interior Fok I sites of a target polynucleotide while leaving a single Fok I site at the terminus of the polynucleotide unmethylated.
  • the terminal Fok I site is rendered single stranded using a polymerase with deoxycytidine triphosphate.
  • the double stranded portion of the fragment is then methylated, after which the single stranded terminus is filled in with a DNA polymerase in the presence of all four nucleoside triphosphates, thereby regenerating the Fok I site.
  • this procedure can be generalized to endonucleases other than Fok I.
  • the polynucleotides are mixed with microparticles containing the complementary sequences of the tags under conditions that favor the formation of perfectly matched duplexes between the tags and their complements.
  • the hybridization conditions are sufficiently stringent so that only perfectly matched sequences form stable duplexes.
  • the polynucleotides specifically hybridized through their tags may be ligated to the complementary sequences attached to the microparticles. Finally, the microparticles are washed to remove polynucleotides with unligated and/or mismatched tags.
  • the density of tag complements on the microparticle surface is typically greater than that necessary for some sequencing operations. That is, in sequencing approaches that require successive treatment of the attached polynucleotides with a variety of enzymes, densely spaced polynucleotides may tend to inhibit access of the relatively bulky enzymes to the polynucleotides.
  • the polynucleotides are preferably mixed with the microparticles so that tag complements are present in significant excess, e.g. from 10:1 to 100:1, or greater, over the polynucleotides. This ensures that the density of polynucleotides on the microparticle surface will not be so high as to inhibit enzyme access.
  • the average inter-polynucleotide spacing on the microparticle surface is on the order of 30-100 nm.
  • Guidance in selecting ratios for standard CPG supports and Ballotini beads (a type of solid glass support) is found in Maskos and Southern, Nucleic Acids Research, 20: 1679-1684 (1992).
  • standard CPG beads of diameter in the range of 20-50 µm are loaded with about 10^5 polynucleotides.
  • GMA glycidalmethacrylate
  • Bangs Laboratories Carmel, IN
  • tag complements for sorting are synthesized on microparticles combinatorially; thus, at the end of the synthesis, one obtains a complex mixture of microparticles from which a sample is taken for loading tagged polynucleotides.
  • the size of the sample of microparticles will depend on several factors, including the size of the repertoire of tag complements, the nature of the apparatus used for observing loaded microparticles (e.g. its capacity), the tolerance for multiple copies of microparticles with the same tag complement (i.e. "bead doubles"), and the like.
  • the following table provides guidance regarding microparticle sample size, microparticle diameter, and the approximate physical dimensions of a packed array of microparticles of various diameters.
  • the probability that the sample of microparticles contains a given tag complement or is present in multiple copies is described by the Poisson distribution, as indicated in the following table.
  • Table VI. Columns: m = number of microparticles in sample (as fraction of repertoire size); 1-e^-m = fraction of repertoire of tag complements present in sample; m(e^-m) = fraction of microparticles in sample with unique tag complement attached; m^2(e^-m)/2 = fraction of microparticles in sample carrying same tag complement as one other microparticle in sample ("bead doubles").
      m        1-e^-m    m(e^-m)    m^2(e^-m)/2
      1.000    0.63      0.37       0.18
      .693     0.50      0.35       0.12
      .405     0.33      0.27       0.05
      .285     0.25      0.21       0.03
      .223     0.20      0.18       0.02
      .105     0.10      0.09       0.005
      .010     0.01      0.01
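The values of Table VI can be reproduced from the Poisson distribution with a few lines of code. This is a minimal sketch; the function name is hypothetical. Note that the tabulated values for the unique-tag column agree with m·e^-m (the Poisson probability of exactly one copy), and the doubles column with m^2·e^-m/2 (exactly two copies).

```python
import math


def poisson_fractions(m):
    """Poisson estimates for a bead sample of size m (as fraction of repertoire).

    Returns (fraction of repertoire present, unique fraction, doubles fraction).
    """
    present = 1 - math.exp(-m)           # at least one copy of a given tag complement
    unique = m * math.exp(-m)            # exactly one copy
    doubles = m ** 2 * math.exp(-m) / 2  # exactly two copies ("bead doubles")
    return present, unique, doubles


for m in (1.000, 0.693, 0.405, 0.285, 0.223, 0.105, 0.010):
    present, unique, doubles = poisson_fractions(m)
    print(f"{m:5.3f}  {present:4.2f}  {unique:4.2f}  {doubles:5.3f}")
```

The same function can be used to pick a sample size for a chosen tolerance of bead doubles, for example by scanning m until the doubles fraction drops below a threshold.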
  • the kinetics of sorting depends on the rate of hybridization of oligonucleotide tags to their tag complements which, in turn, depends on the complexity of the tags in the hybridization reaction.
  • a trade off exists between sorting rate and tag complexity, such that an increase in sorting rate may be achieved at the cost of reducing the complexity of the tags involved in the hybridization reaction.
  • the effects of this trade off may be ameliorated by "panning."
  • Specificity of the hybridizations may be increased by taking a sufficiently small sample so that both a high percentage of tags in the sample are unique and the nearest neighbors of substantially all the tags in a sample differ by at least two words.
  • This latter condition may be met by taking a sample that contains a number of tag-polynucleotide conjugates that is about 0.1 percent or less of the size of the repertoire being employed. For example, if tags are constructed with eight words selected from Table II, a repertoire of 8^8, or about 1.68 x 10^7, tags and tag complements is produced. In a library of tag-cDNA conjugates as described above, a 0.1 percent sample means that about 16,800 different tags are present.
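The repertoire arithmetic in this example is easy to verify. The sketch below assumes, as the text states, eight possible words per position and eight words per tag; the function name is hypothetical.

```python
def repertoire_size(words_available, words_per_tag):
    """Number of distinct tags when each position can hold any available word."""
    return words_available ** words_per_tag


size = repertoire_size(8, 8)
sample = round(size * 0.001)  # a 0.1 percent sample of the repertoire
print(size, sample)
```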
  • loaded microparticles may be separated from unloaded microparticles by a fluorescently activated cell sorting (FACS) instrument using conventional protocols, e.g. tag-cDNA conjugates may be fluorescently labelled in the technique described below by providing a fluorescently labelled right primer. After loading and FACS sorting, the label may be cleaved prior to ligating encoded adaptors, e.g. by Dpn I or a like enzyme that recognizes methylated sites.
  • a panning step may be implemented by providing a sample of tag-cDNA conjugates each of which contains a capture moiety at an end opposite, or distal to, the oligonucleotide tag.
  • the capture moiety is of a type which can be released from the tag-cDNA conjugates, so that the tag-cDNA conjugates can be sequenced with a single-base sequencing method.
  • moieties may comprise biotin, digoxigenin, or like ligands, a triplex binding region, or the like.
  • such a capture moiety comprises a biotin component. Biotin may be attached to tag-cDNA conjugates by a number of standard techniques.
  • biotin may be attached by using a biotinylated primer in an amplification after sampling.
  • biotin may be attached after excising the tag-cDNA conjugates by digestion with an appropriate restriction enzyme followed by isolation and filling in a protruding strand distal to the tags with a DNA polymerase in the presence of biotinylated uridine triphosphate.
  • a tag-cDNA conjugate After a tag-cDNA conjugate is captured, it may be released from the biotin moiety in a number of ways, such as by a chemical linkage that is cleaved by reduction, e.g. Herman et al, Anal. Biochem., 156: 48-55 (1986), or that is cleaved photochemically, e.g. Olejnik et al, Nucleic Acids Research, 24: 361-366 (1996), or that is cleaved enzymatically by introducing a restriction site in the PCR primer.
  • the latter embodiment can be exemplified by considering the library of tag-polynucleotide conjugates described above:
  • the following adapters may be ligated to the ends of these fragments to permit amplification by PCR where "ACTAGT" is a Spe I recognition site (which leaves a staggered cleavage ready for single base sequencing), and the X's and Z's are nucleotides selected so that the annealing and dissociation temperatures of the respective primers are approximately the same.
  • the tags of the conjugates are rendered single stranded by the exonuclease activity of T4 DNA polymerase and conjugates are combined with a sample of microparticles, e.g. a repertoire equivalent, with tag complements attached.
  • the conjugates are preferably ligated to their tag complements and the loaded microparticles are separated from the unloaded microparticles by capture with avidinated magnetic beads, or like capture technique.
  • 4-5 x 10^5 cDNAs can be accumulated by pooling the released microparticles.
  • the pooled microparticles may then be simultaneously sequenced by a single-base sequencing technique.
  • Determining how many times to repeat the sampling and panning steps, or more generally, determining how many cDNAs to analyze, depends on one's objective. If the objective is to monitor the changes in abundance of relatively common sequences, e.g. making up 5% or more of a population, then relatively small samples, i.e. a small fraction of the total population size, may allow statistically significant estimates of relative abundances. On the other hand, if one seeks to monitor the abundances of rare sequences, e.g. making up 0.1% or less of a population, then large samples are required.
  • there is a direct relationship between sample size and the reliability of the estimates of relative abundances based on the sample.
  • there is extensive guidance in the literature on determining appropriate sample sizes for making reliable statistical estimates, e.g. Koller et al, Nucleic Acids Research, 23:185-191 (1994); Good, Biometrika, 40: 16-264 (1953); Bunge et al, J. Am. Stat. Assoc., 88: 364-373 (1993); and the like.
  • a sample of at least 10^4 sequences is accumulated for analysis of each library.
  • a sample of at least 10^5 sequences is accumulated for the analysis of each library; and most preferably, a sample of at least 5 x 10^5 sequences is accumulated for the analysis of each library.
  • the number of sequences sampled is preferably sufficient to estimate the relative abundance of a sequence present at a frequency within the range of 0.1% to 5% with a 95% confidence limit no larger than 0.1% of the population size.
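One way to relate this criterion to a concrete sample size is the normal approximation to the binomial distribution. The sketch below is an illustration under stated assumptions, not the method of the cited literature: it assumes independent sampling, reads "95% confidence limit no larger than 0.1%" as a confidence-interval half-width of 0.001 in frequency units, and uses z = 1.96. The function name is hypothetical.

```python
import math


def sample_size(p, half_width, z=1.96):
    """Sequences needed so that the normal-approximation confidence interval
    for a sequence of true frequency p has the given half-width."""
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


# Frequencies at the two ends of the range named in the text.
n_common = sample_size(0.05, 0.001)  # roughly 1.8 x 10^5 sequences
n_rare = sample_size(0.001, 0.001)   # a few thousand sequences
print(n_common, n_rare)
```

Under these assumptions the 5% end of the range is the more demanding one, and the resulting figure falls between the 10^5 and 5 x 10^5 sample sizes mentioned above.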
  • An exemplary tag library is constructed as follows to form the chemically synthesized 9-word tags of nucleotides A, G, and T defined by the formula: 3'-TGGC-[^4(A,G,T)_9]-CCCCp where "[^4(A,G,T)_9]" indicates a tag mixture where each tag consists of nine 4-mer words of A, G, and T; and "p" indicates a 5' phosphate.
  • This mixture is ligated to the following right and left primer binding regions (SEQ ID NO: 4 & 5):
  • the right and left primer binding regions are ligated to the above tag mixture, after which the single stranded portion of the ligated structure is filled with DNA polymerase then mixed with the right and left primers indicated below and amplified to give a tag library.
  • the underlined portion of the left primer binding region indicates a Rsr II recognition site.
  • the left-most underlined region of the right primer binding region indicates recognition sites for Bsp 120I, Apa I, and Eco O109I, and a cleavage site for Hga I.
  • the right-most underlined region of the right primer binding region indicates the recognition site for Hga I.
  • the right or left primers may be synthesized with a biotin attached (using conventional reagents, e.g. available from Clontech Laboratories, Palo Alto, CA) to facilitate purification after amplification and/or cleavage.
  • cDNA is produced from an mRNA sample by conventional protocols using pGGCCCT_15(A or G or C) as a primer for first strand synthesis anchored at the boundary of the poly A region of the mRNAs and N_8(A or T)GATC as the primer for second strand synthesis. That is, both are degenerate primers such that the second strand primer is present in two forms and the first strand primer is present in three forms.
  • the GATC sequence in the second strand primer corresponds to the recognition site of Mbo I; other four base recognition sites could be used as well, such as those for Bam HI, Sph I, Eco RI, or the like.
  • the presence of the A and T adjacent to the restriction site of the second strand primer ensures that a stripping and exchange reaction can be used in the next step to generate a five-base 5' overhang of "GGCCC".
  • the first strand primer is annealed to the mRNA sample and extended with reverse transcriptase, after which the RNA strand is degraded by the RNase H activity of the reverse transcriptase leaving a single stranded cDNA.
  • the second strand primer is annealed and extended with a DNA polymerase using conventional protocols. After second strand synthesis, the resulting cDNAs are methylated with CpG methylase (New England Biolabs, Beverly, MA) using manufacturer's protocols.
  • the following cloning vector is constructed, e.g. starting from a commercially available plasmid, such as a Bluescript phagemid (Stratagene, La Jolla, CA)(SEQ ID NO: 6).
  • the plasmid is cleaved with Ppu MI and Pme I (to give a Rsr II-compatible end and a flush end so that the insert is oriented) and then methylated with DAM methylase.
  • the tag-containing construct is cleaved with Rsr II and then ligated to the open plasmid, after which the conjugate is cleaved with Mbo I and Bam HI to permit ligation and closing of the plasmid.
  • the plasmids are then amplified and isolated for use in accordance with the invention.
  • a segment of plasmid pGEM7Z (Promega, Madison, WI) is amplified and attached to glass beads via a double stranded DNA linker, one strand of which is synthesized directly onto (and therefore covalently linked to) the beads.
  • a mixture of encoded adaptors (1024 different adaptors in all) is applied to the target polynucleotides so that only those adaptors whose protruding strands form perfectly matched duplexes with the target polynucleotides are ligated.
  • Each of 16 fluorescently labelled tag complements are then applied to the polynucleotide-adaptor conjugates under conditions that permit hybridization of only the correct tag complements.
  • the presence or absence of fluorescent signal after washing indicates the presence or absence of a particular nucleotide at a particular location.
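The bookkeeping behind the 1024 encoded adaptors and 16 tag complements can be sketched as follows. This is a simplified model, not the patented protocol: it assumes one adaptor set (and one tag) per combination of position and nucleotide in a four-nucleotide protruding strand, with the other three positions taking every possible nucleotide, and it ignores strand orientation by pairing positions one-to-one. All names are hypothetical.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# One adaptor set (and one tag) per (position, base) pair: 4 x 4 = 16 sets.
SETS = [(pos, base) for pos in range(4) for base in BASES]


def adaptors_in_set(pos, base):
    """All 4-nt protruding strands with `base` fixed at `pos` (64 per set)."""
    others = product(BASES, repeat=3)
    return ["".join(o[:pos] + (base,) + o[pos:]) for o in others]


def ligated_sets(target_overhang):
    """Sets holding an adaptor whose protruding strand pairs perfectly."""
    wanted = "".join(COMPLEMENT[b] for b in target_overhang)
    return [s for s in SETS if wanted in adaptors_in_set(*s)]


def decode(signals):
    """Read the target overhang from the (position, base) sets giving signal."""
    called = [None] * 4
    for pos, base in signals:
        called[pos] = COMPLEMENT[base]
    return "".join(called)


total = sum(len(adaptors_in_set(pos, base)) for pos, base in SETS)
signals = ligated_sets("GATC")
print(total, signals, decode(signals))
```

In this model exactly one tag per position gives a signal, so four positive signals out of 16 tag complements identify four nucleotides per cycle.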
  • the sequencing protocol of this example is applicable to multiple target polynucleotides sorted onto one or more solid phase supports as described in Brenner, International patent applications PCT/US95/12791 and PCT/US96/09513 (WO 96/12014 and WO 96/41011).
  • a 47-mer oligonucleotide is synthesized directly on Ballotini beads (.040-.075 mm, Jencons Scientific, Bridgeville, PA) using a standard automated DNA synthesizer protocol.
  • the complementary strand to the 47-mer is synthesized separately and purified by HPLC. When hybridized the resulting duplex has a Bst XI restriction site at the end distal from the bead.
  • the complementary strand is hybridized to the attached 47-mer in the following mixture: 25 µL complementary strand at 200 pmol/µL; 20 mg Ballotini beads with the 47-mer; 6 µL New England Biolabs #3 restriction buffer (from 10x stock solution); and 25 µL distilled water.
  • the mixture is heated to 93°C and then slowly cooled to 55°C, after which 40 units of Bst XI (at 10 units/µL) is added to bring the reaction volume to 60 µL.
  • the mixture is incubated at 55°C for 2 hours after which the beads are washed three times in TE (pH 8.0).
  • the segment of pGEM7Z to be attached to the beads is prepared as follows: Two PCR primers were prepared using standard protocols (SEQ ID NO: 7 and SEQ ID NO: 8): The PCR reaction mixture consists of the following: 1 µl pGEM7Z at 1 ng/µl; 10 µl primer 1 at 10 pmol/µl; 10 µl primer 2 at 10 pmol/µl; 10 µl deoxyribonucleotide triphosphates at 2.5 mM; 10 µl 10x PCR buffer (Perkin-Elmer); 0.5 µl Taq DNA polymerase at 5 units/µl; and 58 µl distilled water to give a final volume of 100 µl.
  • the reaction mixture was subjected to 25 cycles of 93°C for 30 sec; 60°C for 15 sec; and 72°C for 60 sec, to give a 172 basepair product, which is successively digested with Bbv I (100 µl PCR reaction mixture, 12 µl 10x #1 New England Biolabs buffer, 8 µl Bbv I at 1 unit/µl, incubated at 37°C for 6 hours) and with Bst XI (to the Bbv I reaction mixture is added: 5 µl 1 M NaCl, 67 µl distilled water, and 8 µl Bst XI at 10 units/µl, and the resulting mixture is incubated at 55°C for 2 hours).
  • the Bbv I/Bst XI-restricted fragment is ligated to the double stranded linker attached to the Ballotini beads in the following mixture: 17 µl Bbv I/Bst XI-restricted fragment (10 µg).
  • top strands of the following 16 sets of 64 encoded adaptors are each separately synthesized on an automated DNA synthesizer (model 392 Applied Biosystems, Foster City) using standard methods.
  • the bottom strand which is the same for all adaptors, is synthesized separately then hybridized to the respective top strands: SEQ ID NO.
  • Each of the 16 tag complements is separately synthesized as an amino-derivatized oligonucleotide and is labelled with a fluorescein molecule (e.g. FAM, an NHS-ester of fluorescein, available from Molecular Probes, Eugene, OR) which is attached to the 5' end of the tag complement through a polyethylene glycol linker (Clontech Laboratories, Palo Alto, CA).
  • the sequences of the tag complements are simply the 12-mer complements of the tags listed above.
  • Ligation of the adaptors to the target polynucleotide is carried out in a mixture consisting of 5 µl beads (20 mg), 3 µL NEB 10x ligase buffer, 5 µL adaptor mix (25 nM), 2.5 µL NEB T4 DNA ligase (2000 units/µL), and 14.5 µL distilled water.
  • the mixture is incubated at 16°C for 30 minutes, after which the beads are washed 3 times in TE (pH 8.0).
  • the 3' phosphates of the ligated adaptors are removed by treating the polynucleotide-bead mixture with calf intestinal alkaline phosphatase (CIP) (New England Biolabs, Beverly, MA), using the manufacturer's protocol. After removal of the 3' phosphates, the CIP may be inactivated by proteolytic digestion, e.g. using Pronase™ (available from Boehringer Mannheim, Indianapolis, IN), or an equivalent protease, with the manufacturer's protocol.
  • the polynucleotide-bead mixture is then washed and treated with a mixture of T4 polynucleotide kinase and T4 DNA ligase (New England Biolabs, Beverly, MA) to add a 5' phosphate at the gap between the target polynucleotide and the adaptor to complete the ligation of the adaptors to the target polynucleotide.
  • each of the labelled tag complements is applied to the polynucleotide-bead mixture under conditions which permit the formation of perfectly matched duplexes only between the oligonucleotide tags and their respective complements, after which the mixture is washed under stringent conditions, and the presence or absence of a fluorescent signal is measured.
  • Tag complements are applied in a solution consisting of 25 nM tag complement, 50 mM NaCl, 3 mM Mg, 10 mM Tris-HCl (pH 8.5), at 20°C, incubated for 10 minutes, then washed in the same solution (without tag complement) for 10 minutes at 55°C.
  • the encoded adaptors are cleaved from the polynucleotides with Bbv I using the manufacturer's protocol. After an initial ligation and identification, the cycle of ligation, identification, and cleavage is repeated three times to give the sequence of the 16 terminal nucleotides of the target polynucleotide.
  • Figure 4 illustrates the relative fluorescence from each of four tag complements applied to identify nucleotides at positions 5 through 16 (from the most distal from the bead to the most proximal to the bead).
  • a cDNA library is constructed in which an oligonucleotide tag consisting of 8 four-nucleotide "words" is attached to each cDNA.
  • the repertoire of oligonucleotide tags of this size is sufficiently large (about 10^8) so that if the cDNAs are synthesized from a population of about 10^6 mRNAs, then there is a high probability that each cDNA will have a unique tag for sorting.
  • first strand synthesis is carried out in the presence of 5-Me-dCTP (to block certain cDNA restriction sites) and a biotinylated primer mixture containing the oligonucleotide tags.
  • the tag-cDNA conjugates are cleaved with Dpn II (which is unaffected by the 5-Me-deoxycytosines), the biotinylated portions are separated from the reaction mixture using streptavidin-coated magnetic beads, and the tag-cDNA conjugates are recovered by cleaving them from the magnetic beads via a Bsm BI site carried by the biotinylated primer.
  • the Bsm BI-Dpn II fragment containing the tag-cDNA conjugate is then inserted into a plasmid and amplified.
  • tag-cDNA conjugates are amplified out of the plasmids by PCR in the presence of 5-Me-dCTP, using biotinylated and fluorescently labelled primers containing pre-defined restriction endonuclease sites.
  • After affinity purification with streptavidin coated magnetic beads, the tag-cDNA conjugates are cleaved from the beads, treated with T4 DNA polymerase in the presence of dGTP to render the tags single stranded, and then combined with a repertoire of GMA beads having tag complements attached.
  • the GMA beads are sorted via FACS to produce an enriched population of GMA beads loaded with cDNAs.
  • the enriched population of loaded GMA beads is immobilized in a planar array in a flow chamber where base-by-base sequencing takes place using encoded adaptors.
  • poly(A+) mRNA is extracted from DBY746 yeast cells using conventional protocols.
  • First and second strand cDNA synthesis is carried out by combining 100-150 pmoles of the following primer (SEQ ID NO: 26): 5'-biotin-ACTAAT CGTCTC ACTAT TTAATTAA [W,W,W,G]_8 CC(T)_18 V-3' with the poly(A+) mRNA using a Stratagene (La Jolla, CA) cDNA Synthesis Kit in accordance with the manufacturer's protocol. This results in cDNAs whose first strand deoxycytosines are methylated at the 5-carbon position.
  • V is G, C, or A
  • [W,W,W,G] is a four-nucleotide word selected from Table II as described above
  • the single underlined portion is a Bsm BI recognition site
  • the double underlined portion is a Pac I recognition site.
  • the DNA captured by the beads is digested with Bsm BI to release the tag-cDNA conjugates for cloning into a modified pBCSK- vector (Stratagene, La Jolla, CA) using standard protocols.
  • the pBCSK- vector is modified by adding a Bbs I site by inserting the following fragment (SEQ ID NO: 27) into the Kpn I/Eco RV digested vector.
  • Bsm BI/Dpn II digested tag-cDNA conjugate is inserted in the pBCSK- which is previously digested with Bbs I and Bam HI. After ligation, the vector is transfected into the manufacturer's recommended host for amplification.
  • the tag-cDNA conjugates are amplified by PCR in the presence of 5-Me-dCTP using 20-mer primers complementary to vector sequences flanking the tag-cDNA insert.
  • the "upstream" primer, i.e. adjacent to the tag, is biotinylated and the "downstream" primer, i.e. adjacent to the cDNA, is labelled with fluorescein.
  • the PCR product is affinity purified then cleaved with Pac I to release fluorescently labelled tag-cDNA conjugates.
  • the tags of the conjugates are rendered single stranded by treating them with T4 DNA polymerase in the presence of dGTP.
  • the tag-cDNA conjugate is purified by phenol-chloroform extraction and combined with 5.5 µm GMA beads carrying tag complements, each tag complement having a 5' phosphate.
  • Hybridization is conducted under stringent conditions in the presence of a thermally stable ligase so that only tags forming perfectly matched duplexes with their complements are ligated.
  • the GMA beads are washed and the loaded beads are concentrated by FACS sorting, using the fluorescently labelled cDNAs to identify loaded GMA beads.
  • the tag-cDNA conjugates attached to the GMA beads are digested with Dpn II to remove the fluorescent label and treated with alkaline phosphatase to prepare the cDNAs for sequencing.
  • the following cleavage adaptor (SEQ ID NO. 28) is ligated to the Dpn II-digested and phosphatase treated cDNAs: after which the 3' phosphate is removed by alkaline phosphatase, the 5' strand of the cDNA is treated with T4 DNA kinase, and the nick between the cleavage adaptor and cDNA is ligated. After cleavage by Bbv I, the encoded adaptors of Example 1 are ligated to the ends of the cDNAs as described above.
  • a flow chamber (500) diagrammatically represented in Figure 5 is prepared by etching a cavity having a fluid inlet (502) and outlet (504) in a glass plate (506) using standard micromachining techniques, e.g. Ekstrom et al, International patent application PCT/SE91/00327 (WO 91/16966); Brown, U.S. patent 4,911,782; Harrison et al, Anal. Chem. 64: 1926-1932 (1992); and the like.
  • the dimensions of flow chamber (500) are such that loaded microparticles (508), e.g. GMA beads, may be disposed in cavity (510) in a closely packed planar monolayer of 100-200 thousand beads.
  • Cavity (510) is made into a closed chamber with inlet and outlet by anodic bonding of a glass cover slip (512) onto the etched glass plate (506), e.g. Pomerantz, U.S. patent 3,397,279.
  • Reagents are metered into the flow chamber from syringe pumps (514 through 520) through valve block (522) controlled by a microprocessor as is commonly used on automated DNA and peptide synthesizers, e.g. Bridgham et al, U.S. patent 4,668,479; Hood et al, U.S. patent 4,252,769; Barstow et al, U.S. patent 5,203,368; Hunkapiller, U.S. patent 4,703,913, or the like.
  • Three cycles of ligation, identification, and cleavage are carried out in flow chamber (500) to give the sequences of 12 nucleotides at the termini of each of approximately 100,000 cDNAs.
  • Nucleotides of the cDNAs are identified by hybridizing tag complements to the encoded adaptors as described in Example 1.
  • Specifically hybridized tag complements are detected by exciting their fluorescent labels with illumination beam (524) from light source (526), which may be a laser, mercury arc lamp, or the like.
  • Illumination beam (524) passes through filter (528) and excites the fluorescent labels on tag complements specifically hybridized to encoded adaptors in flow chamber (500).
  • Resulting fluorescence (530) is collected by confocal microscope (532), passed through filter (534), and directed to CCD camera (536), which creates an electronic image of the bead array for processing and analysis by workstation (538).
  • the cDNAs are treated with Pronase™ or a like enzyme.
  • Encoded adaptors and T4 DNA ligase (Promega, Madison, WI) at about 0.75 units per µL are passed through the flow chamber at a flow rate of about 1-2 µL per minute for about 20-30 minutes at 16°C, after which 3' phosphates are removed from the adaptors and the cDNAs are prepared for second strand ligation by passing a mixture of alkaline phosphatase (New England Biolabs, Beverly, MA) at 0.02 units per µL and T4 DNA kinase (New England Biolabs, Beverly, MA) at 7 units per µL through the flow chamber at 37°C with a flow rate of 1-2 µL per minute for 15-20 minutes.
  • Tag complements at 25 nM concentration are passed through the flow chamber at a flow rate of 1-2 µL per minute for 10 minutes at 20°C, after which fluorescent labels carried by the tag complements are illuminated and fluorescence is collected.
  • The tag complements are melted from the encoded adaptors by passing hybridization buffer through the flow chamber at a flow rate of 1-2 µL per minute at 55°C for 10 minutes.
  • Encoded adaptors are cleaved from the cDNAs by passing Bbv I (New England Biosciences, Beverly, MA) at 1 unit/µL at a flow rate of 1-2 µL per minute for 20 minutes at 37°C.
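
The flow rates and durations listed above imply a reagent volume for each pass through the flow chamber, and the stated totals (three cycles, 12 nucleotides) imply four nucleotides identified per cycle. The following Python sketch works through that arithmetic. The class and function names are hypothetical, and it assumes, as a simplification not stated in the text, that each listed step runs once per cycle; if hybridization and melting are repeated for each nucleotide position, the totals would be higher.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FlowStep:
    """One reagent pass through the flow chamber, with (low, high) ranges."""
    name: str
    flow_ul_per_min: Tuple[float, float]
    minutes: Tuple[float, float]

    def volume_range_ul(self) -> Tuple[float, float]:
        # Volume = flow rate x time, evaluated at both ends of the stated ranges.
        return (self.flow_ul_per_min[0] * self.minutes[0],
                self.flow_ul_per_min[1] * self.minutes[1])


FLOW = (1.0, 2.0)  # µL per minute, the range given for every step

# Durations are taken from the steps described above.
STEPS: List[FlowStep] = [
    FlowStep("ligation, 16°C", FLOW, (20.0, 30.0)),
    FlowStep("phosphatase and kinase, 37°C", FLOW, (15.0, 20.0)),
    FlowStep("tag complement hybridization, 20°C", FLOW, (10.0, 10.0)),
    FlowStep("tag complement melting, 55°C", FLOW, (10.0, 10.0)),
    FlowStep("Bbv I cleavage, 37°C", FLOW, (20.0, 20.0)),
]

CYCLES = 3
NUCLEOTIDES_READ = 12
NUCLEOTIDES_PER_CYCLE = NUCLEOTIDES_READ // CYCLES


def total_volume_range_ul(steps: List[FlowStep], cycles: int) -> Tuple[float, float]:
    # Assumption: every step is run exactly once in each cycle.
    low = cycles * sum(s.volume_range_ul()[0] for s in steps)
    high = cycles * sum(s.volume_range_ul()[1] for s in steps)
    return (low, high)


def enzyme_units_range(step: FlowStep, units_per_ul: float) -> Tuple[float, float]:
    # Enzyme consumed = concentration x volume passed through the chamber.
    low, high = step.volume_range_ul()
    return (units_per_ul * low, units_per_ul * high)


if __name__ == "__main__":
    for step in STEPS:
        print(step.name, step.volume_range_ul())
    print("nucleotides per cycle:", NUCLEOTIDES_PER_CYCLE)
    print("total volume range (µL):", total_volume_range_ul(STEPS, CYCLES))
    print("ligase units per ligation:", enzyme_units_range(STEPS[0], 0.75))
```

Under these assumptions a single ligation pass uses roughly 20 to 60 µL of reagent, which at about 0.75 units of ligase per µL corresponds to roughly 15 to 45 units.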

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Genetics & Genomics (AREA)
  • Immunology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Saccharide Compounds (AREA)
  • Peptides Or Proteins (AREA)

Claims (19)

  1. A method of determining a nucleotide sequence at an end of a polynucleotide, the method comprising the steps of:
    (a) applying a plurality of different encoded adaptors to said polynucleotide, wherein each encoded adaptor is a double-stranded deoxyribonucleic acid comprising (i) an oligonucleotide tag selected from a minimally cross-hybridizing set of oligonucleotides, and (ii) a protruding strand that corresponds, in a known manner, to said oligonucleotide tag;
    wherein each oligonucleotide of the minimally cross-hybridizing set of oligonucleotides differs from every other oligonucleotide of the set by at least two nucleotides;
    (b) ligating, to said end of the polynucleotide, encoded adaptors whose protruding strands form perfectly matched duplexes with said end; and
    (c) for each of a plurality of nucleotides of said end of the polynucleotide, specifically hybridizing a labeled tag complement to the oligonucleotide tag of each encoded adaptor ligated thereto, wherein said hybridized tag complement corresponds, in a known manner, to the identity of said nucleotide,
    thereby identifying each of said plurality of nucleotides in said end of the polynucleotide.
  2. The method of claim 1 for determining nucleotide sequences of a plurality of polynucleotides, the method further comprising, before step (a), the steps of:
    (i) attaching a first oligonucleotide tag from a repertoire of tags to each polynucleotide in a population of polynucleotides,
    wherein each first oligonucleotide tag of the repertoire is selected from a first minimally cross-hybridizing set of oligonucleotides, and each oligonucleotide of the first minimally cross-hybridizing set differs from every other oligonucleotide of the first set by at least two nucleotides;
    (ii) sampling the population of polynucleotides to form a sample of polynucleotides such that substantially all different polynucleotides in the sample have different first oligonucleotide tags attached; and
    (iii) sorting the polynucleotides of the sample by specifically hybridizing the first oligonucleotide tags to their respective complements, the respective complements being attached, as uniform populations of substantially identical oligonucleotides in spatially discrete regions, to one or more solid phase supports;
    and wherein steps (a) to (c) are carried out on each polynucleotide of the plurality.
  3. The method of claim 2 for identifying a population of mRNA molecules, wherein said polynucleotides are cDNA molecules and said step (i) comprises:
    - forming a population of cDNA molecules from the population of mRNA molecules such that each cDNA molecule has a first oligonucleotide tag attached, the first oligonucleotide tags being selected from a first minimally cross-hybridizing set of oligonucleotides, wherein each oligonucleotide of the first minimally cross-hybridizing set differs from every other oligonucleotide of the first set by at least two nucleotides;
    and further comprising the step of:
    - identifying the population of mRNA molecules by the statistical distribution of the portions of the sequences of the cDNA molecules.
  4. The method of claim 1, wherein said ligating step comprises ligating a plurality of different encoded adaptors to said end of said polynucleotide such that said protruding strands of the plurality of different encoded adaptors are complementary to a plurality of different portions of said strand of said polynucleotide, and there is a one-to-one correspondence between said different encoded adaptors and the different portions of said strand.
  5. The method of claim 4, wherein said different portions of said strand of said polynucleotide are contiguous.
  6. The method of claim 2, further comprising the steps of (d) cleaving said encoded adaptors from said polynucleotides with a nuclease having a nuclease recognition site separate from its cleavage site, so as to form a new protruding strand on said end of each of said polynucleotides, and (e) repeating steps (a) to (d).
  7. The method of claim 1, further comprising the steps of:
    (d) cleaving the encoded adaptor from the end of the polynucleotide with a nuclease having a nuclease recognition site separate from its cleavage site, so as to form a new protruding strand at the end of the polynucleotide; and
    (e) repeating steps (a) to (d).
  8. The method of claim 1 or 7, wherein said protruding strand of said encoded adaptor contains 2 to 6 nucleotides and wherein the identifying step comprises specifically hybridizing successive said tag complements to said oligonucleotide tag so as to determine successively the identity of each nucleotide in said portion of said polynucleotide.
  9. The method of any one of claims 1, 7 or 8, wherein said identifying step further comprises providing a number of sets of tag complements equal to the number of nucleotides to be identified in said portion of said polynucleotide.
  10. The method of claim 9, wherein said identifying step further comprises providing said tag complements in each of said sets that are capable of indicating the presence of a predetermined nucleotide by a signal generated by a fluorescent signal-generating moiety, each kind of nucleotide having a different fluorescent signal-generating moiety.
  11. The method of any one of claims 1 or 7 to 10, wherein said oligonucleotide tags of said encoded adaptors are single-stranded and said tag complements of said oligonucleotide tags are single-stranded, such that specific hybridization between an oligonucleotide tag and its respective tag complement occurs through Watson-Crick base pairing.
  12. The method of claim 11, wherein said encoded adaptors have the form:

    5'-p(N)n(N)r(N)s(N)q(N)t-3'
           z(N')r(N')s(N')q-5'

    or

         p(N)r(N)s(N)q(N)t-3'
    3'-z(N)n(N')r(N')s(N')q-5'

    where N is a nucleotide and N' is its complement, p is a phosphate group, z is a 3' hydroxyl group or a 3' blocking group, n is an integer between 2 and 6, inclusive, r is an integer between 0 and 18, inclusive, s is an integer which is either between four and six, inclusive, whenever the encoded adaptor has a nuclease recognition site, or is 0 whenever there is no nuclease recognition site, q is an integer greater than or equal to 0, and t is an integer greater than or equal to 8.
  13. The method of claim 12, wherein r is between 0 and 12, inclusive, t is an integer between 8 and 20, inclusive, and z is a phosphate group.
  14. The method of claim 1, wherein said oligonucleotide tags of said encoded adaptors are double-stranded and said tag complements of said oligonucleotide tags are single-stranded, such that specific hybridization between an oligonucleotide tag and its respective tag complement occurs through the formation of a Hoogsteen or reverse Hoogsteen triplex.
  15. The method of claim 14, wherein said encoded adaptors have the form:

    5'-p(N)n(N)r(N)s(N)q(N)t-3'
           z(N')r(N')s(N')q(N')t-5'

    or

         p(N)r(N)s(N)q(N)t-3'
    3'-z(N)n(N')r(N')s(N')q(N')t-5'

    where N is a nucleotide and N' is its complement, p is a phosphate group, z is a 3' hydroxyl group or a 3' blocking group, n is an integer between 2 and 6, inclusive, r is an integer between 0 and 18, inclusive, s is an integer which is either between four and six, inclusive, whenever the encoded adaptor has a nuclease recognition site, or is 0 whenever there is no nuclease recognition site, q is an integer greater than or equal to 0, and t is an integer greater than or equal to 8.
  16. The method of claim 15, wherein r is between 0 and 12, inclusive, t is an integer between 12 and 24, inclusive, and z is a phosphate group.
  17. The method of any one of claims 14 to 16, wherein the members of said minimally cross-hybridizing set differ from one another by at least six nucleotides.
  18. A composition of matter comprising a plurality of double-stranded oligonucleotide adaptors, wherein the adaptors have the form:

    5'-p(N)n(N)r(N)s(N)q(N)t-3'
           z(N')r(N')s(N')q-5'

    or

         p(N)r(N)s(N)q(N)t-3'
    3'-z(N)n(N')r(N')s(N')q-5'

    where each (N)t is a single single-stranded oligonucleotide tag and is selected from a minimally cross-hybridizing set of oligonucleotides such that each oligonucleotide of the set differs from every other oligonucleotide of the set by at least two nucleotides;
    where:
    N is a nucleotide and N' is its complement,
    p is a phosphate group,
    z is a 3' hydroxyl group or a 3' blocking group,
    n is an integer between 2 and 6, inclusive,
    r is an integer between 0 and 18, inclusive,
    s is an integer between four and six, inclusive,
    the adaptor comprises, in a double-stranded portion separate from the oligonucleotide tag, a nuclease recognition site of a nuclease whose recognition site is separate from its cleavage site,
    q is an integer greater than or equal to 0, and
    t is an integer greater than or equal to 8.
  19. A composition of matter comprising a plurality of double-stranded oligonucleotide adaptors, wherein the adaptors have the form:

    5'-p(N)n(N)r(N)s(N)q(N)t-3'
           z(N')r(N')s(N')q(N')t-5'

    or

         p(N)r(N)s(N)q(N)t-3'
    3'-z(N)n(N')r(N')s(N')q(N')t-5'

    where each (N)t, together with its complement (N')t, is a single double-stranded oligonucleotide tag selected from a minimally cross-hybridizing set of oligonucleotides such that each oligonucleotide of the set differs from every other oligonucleotide of the set by at least two base pairs;
    where:
    N is a nucleotide and N' is its complement,
    p is a phosphate group,
    z is a 3' hydroxyl group or a 3' blocking group,
    n is an integer between 2 and 6, inclusive,
    r is an integer between 0 and 18, inclusive,
    s is an integer between four and six, inclusive,
    the adaptor comprises, in a double-stranded portion separate from the oligonucleotide tag, a nuclease recognition site of a nuclease whose recognition site is separate from its cleavage site,
    q is an integer greater than or equal to 0, and
    t is an integer greater than or equal to 8.
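
Two numeric conditions recur throughout the claims above: every tag in a minimally cross-hybridizing set must differ from every other tag by a minimum number of nucleotides (two in claim 1, six in claim 17), and the adaptor indices n, r, s, q and t must fall within stated ranges. The Python sketch below checks only those counting criteria. All names are hypothetical, the example tags are invented for illustration, and the equal-length requirement on tags is an assumption of this sketch rather than something the claims state. It does not model hybridization behavior.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Sequence


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length tags differ."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes tags of equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_mismatches(tags: Sequence[str]) -> int:
    # Smallest difference over every unordered pair of tags in the set.
    return min(mismatches(a, b) for a, b in combinations(tags, 2))


def meets_difference_criterion(tags: Sequence[str], min_diff: int = 2) -> bool:
    # min_diff=2 mirrors claim 1; min_diff=6 mirrors claim 17.
    return min_pairwise_mismatches(tags) >= min_diff


@dataclass(frozen=True)
class AdaptorShape:
    """Index values n, r, s, q, t of the adaptor forms in claims 12, 15, 18, 19."""
    n: int
    r: int
    s: int
    q: int
    t: int

    def violations(self, require_nuclease_site: bool = False) -> List[str]:
        problems = []
        if not 2 <= self.n <= 6:
            problems.append("n must be between 2 and 6")
        if not 0 <= self.r <= 18:
            problems.append("r must be between 0 and 18")
        # Claims 12 and 15 allow s = 0 when no recognition site is present;
        # claims 18 and 19 require a site, so s must be 4 to 6 there.
        allowed_s = (4, 5, 6) if require_nuclease_site else (0, 4, 5, 6)
        if self.s not in allowed_s:
            problems.append("s must be 4 to 6 (or 0 without a recognition site)")
        if self.q < 0:
            problems.append("q must be at least 0")
        if self.t < 8:
            problems.append("t must be at least 8")
        return problems


# Hypothetical 8-nucleotide tags, chosen only to exercise the check.
EXAMPLE_TAGS = ["AAAAAAAA", "AAAATTTT", "TTTTAAAA", "TTTTTTTT"]

if __name__ == "__main__":
    print(min_pairwise_mismatches(EXAMPLE_TAGS))
    print(meets_difference_criterion(EXAMPLE_TAGS, min_diff=2))
    print(AdaptorShape(n=4, r=2, s=5, q=0, t=12).violations())
```

The example set passes the two-nucleotide criterion but would fail the six-nucleotide criterion of claim 17, since its closest pairs differ at only four positions.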
EP97929757A 1996-06-06 1997-06-02 Signatures par ligature d'adaptateurs codes Expired - Lifetime EP0923650B1 (fr)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US659453 1996-06-06
US08/659,453 US5846719A (en) 1994-10-13 1996-06-06 Oligonucleotide tags for sorting and identification
US68958796A 1996-08-12 1996-08-12
US689587 1996-08-12
PCT/US1997/009472 WO1997046704A1 (fr) 1996-06-06 1997-06-02 Signatures par ligature d'adaptateurs codes

Publications (2)

Publication Number Publication Date
EP0923650A1 EP0923650A1 (fr) 1999-06-23
EP0923650B1 true EP0923650B1 (fr) 2007-03-07

Family

ID=27097828

Family Applications (1)

Application Number Title Priority Date Filing Date
EP97929757A Expired - Lifetime EP0923650B1 (fr) 1996-06-06 1997-06-02 Signatures par ligature d'adaptateurs codes

Country Status (13)

Country Link
EP (1) EP0923650B1 (fr)
JP (1) JP4124377B2 (fr)
CN (1) CN1195872C (fr)
AT (1) ATE356221T1 (fr)
AU (1) AU733782B2 (fr)
CA (1) CA2256700A1 (fr)
CZ (1) CZ397998A3 (fr)
DE (1) DE69737450T2 (fr)
HK (1) HK1021206A1 (fr)
HU (1) HUP0003944A3 (fr)
NO (1) NO985698L (fr)
PL (1) PL331513A1 (fr)
WO (1) WO1997046704A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10316357B2 (en) 2014-01-31 2019-06-11 Swift Biosciences, Inc. Compositions and methods for enhanced adapter ligation

Families Citing this family (84)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6406848B1 (en) 1997-05-23 2002-06-18 Lynx Therapeutics, Inc. Planar arrays of microparticle-bound polynucleotides
USRE43097E1 (en) 1994-10-13 2012-01-10 Illumina, Inc. Massively parallel signature sequencing by ligation of encoded adaptors
CN1146668C (zh) * 1995-06-07 2004-04-21 林克斯治疗公司 用于分选和鉴定的寡核苷酸标记物
GB9620769D0 (en) * 1996-10-04 1996-11-20 Brax Genomics Ltd Nucleic acid sequencing
US5888737A (en) * 1997-04-15 1999-03-30 Lynx Therapeutics, Inc. Adaptor-based sequence analysis
US6607878B2 (en) 1997-10-06 2003-08-19 Stratagene Collections of uniquely tagged molecules
US6265163B1 (en) 1998-01-09 2001-07-24 Lynx Therapeutics, Inc. Solid phase selection of differentially expressed genes
JP3944996B2 (ja) 1998-03-05 2007-07-18 株式会社日立製作所 Dnaプローブアレー
AU754952B2 (en) 1998-06-24 2002-11-28 Illumina, Inc. Decoding of array sensors with microspheres
US6480791B1 (en) * 1998-10-28 2002-11-12 Michael P. Strathmann Parallel methods for genomic analysis
NO986133D0 (no) 1998-12-23 1998-12-23 Preben Lexow Fremgangsmåte for DNA-sekvensering
WO2000039333A1 (fr) * 1998-12-23 2000-07-06 Jones Elizabeth Louise Methode de sequençage utilisant des marques grossissantes
AU778438B2 (en) * 1999-04-06 2004-12-02 Yale University Fixed address analysis of sequence tags
US20060275782A1 (en) 1999-04-20 2006-12-07 Illumina, Inc. Detection of nucleic acid reactions on bead arrays
US6544732B1 (en) 1999-05-20 2003-04-08 Illumina, Inc. Encoding and decoding of array sensors utilizing nanocrystals
AU7569600A (en) 1999-05-20 2000-12-28 Illumina, Inc. Combinatorial decoding of random nucleic acid arrays
US8080380B2 (en) 1999-05-21 2011-12-20 Illumina, Inc. Use of microfluidic systems in the detection of target analytes using microsphere arrays
US8481268B2 (en) 1999-05-21 2013-07-09 Illumina, Inc. Use of microfluidic systems in the detection of target analytes using microsphere arrays
AU6770800A (en) * 1999-08-13 2001-03-13 Yale University Analysis of sequence tags with hairpin primers
DK1218545T3 (da) 1999-08-18 2012-02-20 Illumina Inc Fremgangsmåder til fremstilling af oligonukleotidopløsninger
US7582420B2 (en) 2001-07-12 2009-09-01 Illumina, Inc. Multiplex nucleic acid reactions
ATE492652T1 (de) 2000-02-07 2011-01-15 Illumina Inc Nukleinsäuredetektionsverfahren mit universellem priming
US6812005B2 (en) 2000-02-07 2004-11-02 The Regents Of The University Of California Nucleic acid detection methods using universal priming
EP1967595A3 (fr) 2000-02-16 2008-12-03 Illumina, Inc. Génotypage parallèle de plusieurs échantillons prélevés sur des patients
US7291460B2 (en) * 2002-05-31 2007-11-06 Verenium Corporation Multiplexed systems for nucleic acid sequencing
WO2004065000A1 (fr) 2003-01-21 2004-08-05 Illumina Inc. Moniteur de reactions chimiques
AU2004254552B2 (en) * 2003-01-29 2008-04-24 454 Life Sciences Corporation Methods of amplifying and sequencing nucleic acids
GB0308852D0 (en) * 2003-04-16 2003-05-21 Lingvitae As Method
WO2005010184A1 (fr) * 2003-07-25 2005-02-03 Takara Bio Inc. Methode servant a detecter une mutation
GB0324456D0 (en) 2003-10-20 2003-11-19 Isis Innovation Parallel DNA sequencing methods
US7622281B2 (en) 2004-05-20 2009-11-24 The Board Of Trustees Of The Leland Stanford Junior University Methods and compositions for clonal amplification of nucleic acid
EP2857523A1 (fr) 2005-02-01 2015-04-08 Applied Biosystems, LLC Procédé pour détermine un séquence dans un polynucleotide
GB0524069D0 (en) 2005-11-25 2006-01-04 Solexa Ltd Preparation of templates for solid phase amplification
CN101415839B (zh) * 2006-02-08 2012-06-27 亿明达剑桥有限公司 对多核苷酸模板进行测序的方法
WO2007097443A1 (fr) * 2006-02-20 2007-08-30 National University Corporation Hokkaido University Procede de determination de la sequence des bases d'un adn
CA2649725A1 (fr) * 2006-04-19 2007-10-25 Applera Corporation Reactifs, procedes et bibliotheques concus pour un sequencage a base de spheres sans gel
US8889348B2 (en) 2006-06-07 2014-11-18 The Trustees Of Columbia University In The City Of New York DNA sequencing by nanopore using modified nucleotides
US9605307B2 (en) 2010-02-08 2017-03-28 Genia Technologies, Inc. Systems and methods for forming a nanopore in a lipid bilayer
US8324914B2 (en) 2010-02-08 2012-12-04 Genia Technologies, Inc. Systems and methods for characterizing a molecule
US9678055B2 (en) 2010-02-08 2017-06-13 Genia Technologies, Inc. Methods for forming a nanopore in a lipid bilayer
WO2012088339A2 (fr) 2010-12-22 2012-06-28 Genia Technologies, Inc. Caractérisation de molécules individuelles d'adn, sur nanopores, à l'aide de ralentisseurs
US8962242B2 (en) 2011-01-24 2015-02-24 Genia Technologies, Inc. System for detecting electrical properties of a molecular complex
US9110478B2 (en) 2011-01-27 2015-08-18 Genia Technologies, Inc. Temperature regulation of measurement arrays
SG10201605049QA (en) * 2011-05-20 2016-07-28 Fluidigm Corp Nucleic acid encoding reactions
US10501791B2 (en) 2011-10-14 2019-12-10 President And Fellows Of Harvard College Sequencing by structure assembly
US11021737B2 (en) 2011-12-22 2021-06-01 President And Fellows Of Harvard College Compositions and methods for analyte detection
US10227639B2 (en) 2011-12-22 2019-03-12 President And Fellows Of Harvard College Compositions and methods for analyte detection
US8986629B2 (en) 2012-02-27 2015-03-24 Genia Technologies, Inc. Sensor circuit for controlling, detecting, and measuring a molecular complex
CN102766689B (zh) * 2012-04-17 2015-07-22 盛司潼 一种增加测序读长的测序方法
AU2013251701A1 (en) * 2012-04-24 2014-10-30 Gen9, Inc. Methods for sorting nucleic acids and multiplexed preparative in vitro cloning
JP6445426B2 (ja) * 2012-05-10 2018-12-26 ザ ジェネラル ホスピタル コーポレイション ヌクレオチド配列を決定する方法
US9914967B2 (en) 2012-06-05 2018-03-13 President And Fellows Of Harvard College Spatial sequencing of nucleic acids using DNA origami probes
JP2015525077A (ja) 2012-06-15 2015-09-03 ジェニア・テクノロジーズ・インコーポレイテッド チップの構成および高精度な核酸配列決定
US9605309B2 (en) 2012-11-09 2017-03-28 Genia Technologies, Inc. Nucleic acid sequencing using tags
US9759711B2 (en) 2013-02-05 2017-09-12 Genia Technologies, Inc. Nanopore arrays
EP2971184B1 (fr) 2013-03-12 2019-04-17 President and Fellows of Harvard College Procédé de génération d'une matrice tridimensionnelle contenant des acides nucléiques
US10648026B2 (en) 2013-03-15 2020-05-12 The Trustees Of Columbia University In The City Of New York Raman cluster tagged molecules for biological imaging
BR112015030491A8 (pt) 2013-06-04 2022-07-26 Harvard College método de modular a expressão de um ácido nucleico alvo em uma célula, bem como método de alterar um ácido nucleico de dna alvo em uma célula
ES2717756T3 (es) * 2013-09-30 2019-06-25 Qiagen Gmbh Moléculas adaptadoras de ADN para la preparación de genotecas de ADN y método para su producción y uso
US9551697B2 (en) 2013-10-17 2017-01-24 Genia Technologies, Inc. Non-faradaic, capacitively coupled measurement in a nanopore cell array
US9567630B2 (en) 2013-10-23 2017-02-14 Genia Technologies, Inc. Methods for forming lipid bilayers on biochips
CN109797199A (zh) 2013-10-23 2019-05-24 吉尼亚科技公司 使用纳米孔的高速分子感测
HUE051262T2 (hu) * 2014-01-28 2021-03-01 Lanzatech New Zealand Ltd Módszer rekombináns mirkoorganizmus elõállítására
US10179932B2 (en) 2014-07-11 2019-01-15 President And Fellows Of Harvard College Methods for high-throughput labelling and detection of biological features in situ using microscopy
CN106661636A (zh) * 2014-09-05 2017-05-10 凯杰有限公司 衔接子连接的扩增子的制备
ES2726149T3 (es) 2014-09-12 2019-10-02 Mgi Tech Co Ltd Oligonucleótido aislado y su uso en la secuenciación de ácidos nucleicos
US10494630B2 (en) * 2014-10-14 2019-12-03 Mgi Tech Co., Ltd. Linker element and method of using same to construct sequencing library
US20180044668A1 (en) * 2014-10-14 2018-02-15 Bgi Shenzhen Co., Limited Mate pair library construction
US10221448B2 (en) 2015-03-06 2019-03-05 Pillar Biosciences Inc. Selective amplification of overlapping amplicons
US10689690B2 (en) * 2015-08-13 2020-06-23 Centrillion Technology Holdings Corporation Library construction using Y-adapters and vanishing restriction sites
WO2017079406A1 (fr) 2015-11-03 2017-05-11 President And Fellows Of Harvard College Procédé et appareil pour imagerie volumétrique d'une matrice tridimensionnelle contenant des acides nucléiques
WO2017106777A1 (fr) 2015-12-16 2017-06-22 Fluidigm Corporation Amplification multiplex de haut niveau
US11680253B2 (en) 2016-03-10 2023-06-20 The Board Of Trustees Of The Leland Stanford Junior University Transposase-mediated imaging of the accessible genome
CA3022290A1 (fr) 2016-04-25 2017-11-02 President And Fellows Of Harvard College Procedes de reaction en chaine d'hybridation pour la detection moleculaire in situ
WO2018013837A1 (fr) * 2016-07-15 2018-01-18 The Regents Of The University Of California Procédés de production de bibliothèques d'acides nucléiques
WO2018045181A1 (fr) 2016-08-31 2018-03-08 President And Fellows Of Harvard College Procédés de génération de bibliothèques de séquences d'acides nucléiques pour la détection par séquençage fluorescent in situ
CN109923216A (zh) 2016-08-31 2019-06-21 哈佛学院董事及会员团体 将生物分子的检测组合到使用荧光原位测序的单个试验的方法
US10550425B2 (en) * 2016-11-14 2020-02-04 Agilent Technologies, Inc. Composition and method for improving signal-to-noise in hybridization to oligonucleotide arrays
CN110024037B (zh) * 2016-11-30 2023-06-27 微软技术许可有限责任公司 经由连接的dna随机存取存储系统
CN109789167A (zh) 2017-04-14 2019-05-21 哈佛学院董事及会员团体 用于产生细胞衍生的微丝网络的方法
CN107299151B (zh) * 2017-08-28 2020-06-09 华中农业大学 用于鉴别稻米脂肪酸品质的dna分子标记及其应用
CA3087001A1 (fr) 2018-01-12 2019-07-18 Claret Bioscience, Llc Procedes et compositions d'analyse d'acide nucleique
CA3100983A1 (fr) 2018-06-06 2019-12-12 The Regents Of The University Of California Procedes de production de bibliotheques d'acides nucleiques et compositions et kits pour leur mise en œuvre
SG11202101934SA (en) 2018-07-30 2021-03-30 Readcoor Llc Methods and systems for sample processing or analysis

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0630972A2 (fr) * 1993-06-25 1994-12-28 Hitachi, Ltd. Procédé d'analyse d'ADN

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5102785A (en) * 1987-09-28 1992-04-07 E. I. Du Pont De Nemours And Company Method of gene mapping
JPH02299598A (ja) * 1989-04-14 1990-12-11 Ro Inst For Molecular Genetics & Geneteic Res 微視的サイズの別個の粒子と結合している核酸試料中のごく短い配列の全部または一部のオリゴヌクレチドプローブとのハイブリダイゼーションによる決定法
GB9401200D0 (en) * 1994-01-21 1994-03-16 Medical Res Council Sequencing of nucleic acids
US5552278A (en) * 1994-04-04 1996-09-03 Spectragen, Inc. DNA sequencing by stepwise ligation and cleavage
US5604097A (en) * 1994-10-13 1997-02-18 Spectragen, Inc. Methods for sorting polynucleotides using oligonucleotide tags
EP0832287B1 (fr) * 1995-06-07 2007-10-10 Solexa, Inc Marqueurs oligonuclotidiques servant a trier et a identifier

Also Published As

Publication number Publication date
ATE356221T1 (de) 2007-03-15
CN1195872C (zh) 2005-04-06
AU3374097A (en) 1998-01-05
DE69737450D1 (de) 2007-04-19
HUP0003944A3 (en) 2003-08-28
NO985698D0 (no) 1998-12-04
HUP0003944A2 (en) 2001-03-28
CN1230226A (zh) 1999-09-29
NO985698L (no) 1999-02-08
EP0923650A1 (fr) 1999-06-23
CZ397998A3 (cs) 1999-07-14
JP4124377B2 (ja) 2008-07-23
PL331513A1 (en) 1999-07-19
JP2000515006A (ja) 2000-11-14
DE69737450T2 (de) 2007-11-29
CA2256700A1 (fr) 1997-12-11
WO1997046704A1 (fr) 1997-12-11
HK1021206A1 (en) 2000-06-02
AU733782B2 (en) 2001-05-24

Similar Documents

Publication Publication Date Title
EP0923650B1 (fr) Signatures par ligature d'adaptateurs codes
US6013445A (en) Massively parallel signature sequencing by ligation of encoded adaptors
EP0975655B1 (fr) Ameliorations apportees a une analyse de sequence sur la base d'adaptateurs
US5763175A (en) Simultaneous sequencing of tagged polynucleotides
US5780231A (en) DNA extension and analysis with rolling primers
US5962228A (en) DNA extension and analysis with rolling primers
EP0832287B1 (fr) Marqueurs oligonuclotidiques servant a trier et a identifier
US6235475B1 (en) Oligonucleotide tags for sorting and identification
US6280935B1 (en) Method of detecting the presence or absence of a plurality of target sequences using oligonucleotide tags
EP1967592B1 (fr) Procédé d'amélioration de l'efficacité de séquençage de polynucléotide
USRE43097E1 (en) Massively parallel signature sequencing by ligation of encoded adaptors
EP2110445A2 (fr) Système de marquage moléculaire
EP1017847B1 (fr) Methode d'etablissement de la cartographie des sites de restriction dans les polynucleotides
EP0840803B1 (fr) Sequencage simultane de polynucleotides marques
JPH11151092A (ja) ローリングプライマーを用いたdna伸長および分析

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19990105

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

17Q First examination report despatched

Effective date: 20031216

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: SOLEXA, INC

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: SOLEXA, INC.

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070307

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070307

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070307

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REF Corresponds to:

Ref document number: 69737450

Country of ref document: DE

Date of ref document: 20070419

Kind code of ref document: P

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070607

REG Reference to a national code

Ref country code: HK

Ref legal event code: GR

Ref document number: 1021206

Country of ref document: HK

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070618

REG Reference to a national code

Ref country code: CH

Ref legal event code: NV

Representative's name: BOVARD AG PATENTANWAELTE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070807

NLV1 Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act
ET Fr: translation filed
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20070630

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070307

26N No opposition filed

Effective date: 20071210

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070307

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070608

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20070605

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20070602

REG Reference to a national code

Ref country code: CH

Ref legal event code: PFA

Owner name: SOLEXA, INC.

Free format text: SOLEXA, INC.#25861 INDUSTRIAL BLVD.#HAYWARD, CA 94545 (US) -TRANSFER TO- SOLEXA, INC.#25861 INDUSTRIAL BLVD.#HAYWARD, CA 94545 (US)

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 20

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20160601

Year of fee payment: 20

Ref country code: DE

Payment date: 20160524

Year of fee payment: 20

Ref country code: CH

Payment date: 20160613

Year of fee payment: 20

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: BE

Payment date: 20160425

Year of fee payment: 20

Ref country code: FR

Payment date: 20160516

Year of fee payment: 20

REG Reference to a national code

Ref country code: DE

Ref legal event code: R071

Ref document number: 69737450

Country of ref document: DE

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

REG Reference to a national code

Ref country code: GB

Ref legal event code: PE20

Expiry date: 20170601

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20170601