WO2002010443A1 - Combinatorial probes and associated uses - Google Patents

Combinatorial probes and associated uses

Info

Publication number
WO2002010443A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
probes
sequences
polynucleotides
sequence
Prior art date
Application number
PCT/AU2001/000931
Other languages
English (en)
Inventor
Mark John Gibbs
Adrian John Gibbs
Roger William Brown
Original Assignee
The Australian National University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AUPQ9026A (AUPQ902600A0)
Priority claimed from AUPQ9483A (AUPQ948300A0)
Priority to US10/343,107 (US20050260574A1)
Application filed by The Australian National University
Priority to JP2002516359A (JP2004504068A)
Priority to NZ523715A (NZ523715A)
Priority to CA002416952A (CA2416952A1)
Priority to AU2001276178A (AU2001276178A1)
Priority to EP01953687A (EP1322780A4)
Publication of WO2002010443A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/20 Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6834 Enzymatic or biochemical coupling of nucleic acids to a solid phase
    • C12Q1/6837 Enzymatic or biochemical coupling of nucleic acids to a solid phase using probe arrays or probe chips
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression

Definitions

  • THIS INVENTION relates generally to novel means and methods for nucleic acid analysis and detection. More particularly, the present invention relates to a set of oligonucleotide probes, wherein two or more probes, in combination, can specifically detect a target polynucleotide and wherein different combinations of probes provide specificity for detecting and distinguishing different target polynucleotides. The invention also relates to methods for designing such combinations of oligonucleotide probes by way of gene sequence analyses that are preferably carried out using a digital computer, and to methods for interpreting the results of tests using such probe combinations.
  • nucleic acid probes used in nucleic acid hybridisations were mostly obtained empirically by isolating DNA or RNA fragments that were derived from the targeted organism(s) or gene(s).
  • the international sequence databases e.g., the GenBank and EMBL databases. These databases of known gene sequences have been increasing tenfold in size every five years for many years and now contain a representative sample of most genes and most major groups of organisms.
  • DNA micro-arrays use spots of detector oligonucleotides or probes positioned in arrays on a solid support, typically a glass wafer.
  • the probes are allowed to hybridise with sample nucleic acids, which contain the target nucleic acids and which have been fluorescently labelled.
  • the probes and target nucleic acids of the sample are allowed to hybridise under conditions that only detect exact or almost exact complementarity between the probes and the target nucleic acids. If a target nucleic acid complements and hybridises to a particular probe in the array, the spot will fluoresce. Recording the fluorescence of the spots enables one to assess which target sequences are present in the nucleic acids mixture.
  • Sequence information obtained from native RNA or DNA molecules is used to determine the sequence of the synthesised oligonucleotide probes and this information is usually stored in computer databases and manipulated using software. Each probe is synthesised so that it contains nucleotides in an order (sequence) that matches a part of a known native nucleotide sequence or the complement of a part of that sequence. Oligonucleotide probes used in conventional arrays are typically 10-25 nucleotides long. For the purposes of the present invention, and as will be more fully discussed hereinafter, the nucleic acid molecules that are to be identified in an assay or test are designated "target polynucleotides".
  • The parts or segments of these polynucleotides that match the sequence of, and hybridise to, an oligonucleotide probe are designated "target sequences". This term also includes within its scope sequences as represented in a computer datafile or some other readable form.
  • oligonucleotide probes are most commonly used in micro-arrays to identify and quantify the mRNA transcripts from genes. These micro-arrays usually contain probes representing several different target sequences from each gene sequence and these probes are usually chosen to be target specific (i.e., they hybridise with just one target polynucleotide). Thus, these micro-arrays contain many more probes than the number of target polynucleotides they are designed to detect.
  • DNA micro-arrays provide a facile and rapid means of detecting and measuring the expression of different genes. They have also been used to detect variants of well-characterised nucleic acid molecules (i.e., to detect genetic polymorphisms and genotypes).
  • RFLP restriction fragment length polymorphism
  • PCR polymerase chain reaction
  • micro-arrays for routine diagnosis appears to be slow. This is probably due to the relatively high cost of designing, developing and producing micro-arrays that could detect a large number of target polynucleotides. New methods and reagents are, therefore, required to realise this promise, and the present invention helps to meet that need.
  • the present invention provides improved nucleic acid analysis techniques as described more fully hereinafter.
  • a set of oligonucleotide probes for detecting a plurality of different target polynucleotides, wherein a respective target polynucleotide corresponds to a single polynucleotide or a group of related polynucleotides, said set including a collection of different promiscuous probes, wherein a respective promiscuous probe is capable of hybridising to a target sequence shared between at least two of said target polynucleotides, wherein at least one target polynucleotide comprises at least two target sequences shared between other target polynucleotides, and wherein a predefined combination of promiscuous probes is capable of hybridising to said at least two target sequences, said predefined combination providing specificity of detection of said at least one target polynucleotide.
  • the set of oligonucleotide probes comprises a plurality of different predefined combinations of probes, each providing specificity of detection of a different target polynucleotide.
  • the set of oligonucleotide probes further comprises at least one non-promiscuous probe that is capable of hybridising to a unique target sequence of a single target polynucleotide.
  • the set of oligonucleotide probes comprises at least one probe that is capable of hybridising to a pivot sequence, which divides two or more polynucleotides into distinct groups.
  • the set of oligonucleotide probes comprises at least one degenerate oligonucleotide probe that is capable of hybridising to a redundant target sequence.
  • the invention provides a method for detecting a plurality of different target polynucleotides using the set of oligonucleotide probes as broadly described above, said method comprising:
  • the method further comprises analysing whether any of said target polynucleotides in said test sample corresponds to a phenotype-determining target polynucleotide.
  • the method further comprises diagnosing a phenotype of a patient from which said test sample was derived based on the phenotype-determining target polynucleotide(s) present in the test sample.
  • the step of processing is performed by a programmable digital computer.
  • the invention provides a method for detecting an unknown or uncharacterised member of a polynucleotide family using the set of probes as broadly described above, said method comprising: - exposing said probes to a test sample under stringent hybridisation conditions;
  • the different combination of oligonucleotide probes corresponds to a hypothetical predefined combination of probes belonging to a predefined assemblage.
  • the hypothetical predefined combination of probes comprises at least one degenerate oligonucleotide probe that is capable of hybridising to a redundant target sequence.
  • nucleic acid sequence database comprising the sequences of a plurality of target polynucleotides for identical target sequences that are shared between two or more of said target polynucleotides to thereby obtain a subset of shared target sequences
  • the process further includes the step of:
  • said process further comprises:
  • the process preferably comprises: - searching the database for sequences that are unique to respective target polynucleotides to thereby obtain a subset of unique target sequences;
  • a target sequence from said unique subset or a combination of target sequences from said shared subset and/or said unique subset which, when hybridised by complementary or substantially complementary oligonucleotide probe(s), facilitate(s) specific detection of that target polynucleotide.
  • said process further comprises:
  • the process suitably comprises: - searching the database for target sequences that are substantially identical or conserved between related target polynucleotides;
  • a target sequence from said redundant subset or a combination of target sequences from said shared subset and/or said redundant subset which, when hybridised by complementary or substantially complementary oligonucleotide probe(s), facilitate(s) specific detection of that target polynucleotide.
  • the process comprises:
  • the process comprises:
  • said process is performed by a digital computer.
  • the invention provides a computer program product for identifying a set of target sequences for designing a set of oligonucleotide probes, as broadly described above, comprising code that receives as input sequences of target polynucleotides from one or more nucleic acid sequence databases and/or information that identifies sequences corresponding to said target polynucleotides; code that identifies potential target sequences within the target polynucleotides; code that identifies the target sequences that are shared between different target polynucleotides; optional code that identifies the target sequences that are unique to specific target polynucleotides, code that assesses every possible combination or a number of combinations of the target sequences to identify those combinations of target sequences which, when hybridised by complementary oligonucleotide probes, facilitate discrimination between different target polynucleotides; and a computer readable medium that stores the codes.
  • the computer program product further comprises code that creates a database which registers the presence or absence of possible target sequences found within respective target polynucleotides.
  • the computer program product further comprises code that identifies substantially identical or conserved sequences between the target sequences and code that identifies redundant sequence variants of said substantially identical target sequences, wherein said redundant sequence variants are registered as target sequences.
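The design steps listed above (enumerate candidate target sequences, register which target polynucleotides contain each one, then assess combinations) can be illustrated with a minimal sketch. It is not the patent's program: the function names, the fixed window length `k`, the exhaustive smallest-first search and the toy sequences are all assumptions made for illustration, and only exact matches are considered.

```python
from itertools import combinations

def subsequences(seq, k):
    """Return the set of all length-k windows (candidate target sequences) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def presence_table(targets, k):
    """Register, for each candidate target sequence, the targets that contain it."""
    table = {}
    for name, seq in targets.items():
        for sub in subsequences(seq, k):
            table.setdefault(sub, set()).add(name)
    return table

def signatures(targets, probes):
    """For each target, the subset of probes whose target sequence it contains."""
    return {name: frozenset(p for p in probes if p in seq)
            for name, seq in targets.items()}

def smallest_discriminating_set(targets, k):
    """Exhaustively search for the fewest probes giving every target a
    distinct, non-empty hybridisation signature (toy-scale only)."""
    candidates = sorted(presence_table(targets, k))
    for size in range(1, len(candidates) + 1):
        for probes in combinations(candidates, size):
            sigs = signatures(targets, probes)
            values = list(sigs.values())
            if all(values) and len(set(values)) == len(values):
                return probes, sigs
    return None

# Hypothetical toy targets, invented for this sketch.
toy_targets = {"A": "ACGTTTGA", "B": "ACGTCC", "C": "GGTTGA"}
probes, sigs = smallest_discriminating_set(toy_targets, k=4)
print(probes)  # two shared-sequence probes distinguish three targets
```

In this toy case two promiscuous probes suffice for three targets, which mirrors the situation sketched for sub-sequences X and Y in Figure 2B. An exhaustive search grows combinatorially, so a practical tool would presumably assess only "a number of combinations", as the text allows.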
  • the invention provides a computer program product for processing hybridisation data comprising code that identifies for each target polynucleotide a combination of features in an oligonucleotide array whose probes facilitate specific detection of that polynucleotide; code that receives as input hybridisation data from hybridisation reactions between sample polynucleotides and the oligonucleotide probes in the array; code that processes the hybridisation data to determine whether the sample polynucleotides comprise any of the target polynucleotides by searching for hybridisation patterns that match any of the predefined combinations or predefined assemblages of target sequences; and a computer readable medium that stores the codes.
  • said computer program product comprises code that receives as input the sequence of an oligonucleotide probe in each feature of an oligonucleotide array and code that receives as input a database that contains information on the presence or absence of target sequences in target polynucleotides.
  • the computer program product further comprises code that deduces the probability that the detected pattern of hybridisation indicates the presence of a target polynucleotide.
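As a rough illustration of the pattern-matching step described above, the following sketch compares the set of positive features with each predefined combination. The names `predefined` and `match_targets`, and the probe labels X and Y, are hypothetical; the sketch treats hybridisation as a simple yes/no signal and does not attempt the probability estimate mentioned in the text.

```python
def match_targets(predefined, positive_features):
    """Compare observed positive probes with predefined probe combinations.

    Returns (exact, consistent):
      exact      - targets whose combination equals the observed pattern
      consistent - targets whose combination is contained in the pattern
    """
    positive = set(positive_features)
    exact = [t for t, combo in predefined.items() if set(combo) == positive]
    consistent = [t for t, combo in predefined.items() if set(combo) <= positive]
    return exact, consistent

# Hypothetical predefined combinations of promiscuous probes X and Y.
predefined = {"A": {"X", "Y"}, "B": {"X"}, "C": {"Y"}}

exact, consistent = match_targets(predefined, {"X", "Y"})
print(exact)       # targets matching the whole pattern
print(consistent)  # targets that could each be present in a mixture
```

The second list shows why interpretation needs care with promiscuous probes: a pattern that matches target A exactly is also what a mixture of B and C would produce, so a real implementation would need further evidence or a probabilistic treatment to separate the two cases.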
  • Figure 1 shows a hypothetical target sequence and the set of all possible subsequences including eight or more bases derived from the target sequence.
  • Figure 2A shows a Venn diagram representing the relationships between the sub-sequences of three hypothetical target sequences (A, B and C). Some sub-sequences derived from each target sequence are unique and some are shared. Target A shares some sub-sequences with B and some with C and some with both B and C, and C and B share some that are not shared with A.
  • Figure 2B shows a Venn diagram matching Figure 2A and showing which sub-sequences (X and Y) could be used to reduce the size of the set required to detect and distinguish between targets A, B and C.
  • Figure 3 shows the sequence of the shared 'B-motif' in potyvirus polymerase genes. Positions (sites) in the sequence where variations are found are boxed, and each box lists the different nucleotides known to occur at that site.
  • Figure 4 is a diagrammatic representation of an array of oligonucleotides. Each square (feature) on the grid represents a different oligonucleotide spot on an array consisting of 256 different oligonucleotides. Every possible combination of the sequence variants shown in Figure 3 is represented in one of the 256 spots on the array. The spots on the array could be ordered so that the oligonucleotides in the rows and columns identified with arrows carry the sequence variations as shown for positions 3, 6 and 9. Oligonucleotides with variations in positions 12, 15 and 18 could be similarly identified.
  • Figure 5 is a diagrammatic representation showing the expected reactions on an array designed as shown in Figure 4 when DNAs encoding the polymerase B-motifs of the potyviruses potato virus Y (PVY) and bean yellow mosaic virus (BYMV) are used.
  • the nucleotides at variable positions 3 and 6 are shown to the left of the array and those at variable positions 9, 12 and 15 are shown above the array.
  • the reactions with cDNA generated from the RNA of three groups of potyviruses are shown: A. strains -N (GenBank code D00441), -NFR (X12456) and -PA (A08776); B. strains -Hung (M95491) and -NSW (X97895); and C.
  • Figure 6 is a diagrammatic representation depicting shared gene sequences in potyvirus genomes showing sequence variations present in those sequences, and the overlapping parts of two of those sequences that could be used combinatorially as probes in a micro-array to detect and identify potyviruses.
  • A). A region of the polymerase encoding its 'B-motif', and two sub-sequences derived from it; B).
  • Figure 7 is a diagrammatic representation depicting the pattern of permutations of variable sites in the probes designed from three conserved regions of potyvirus genomes
  • Figure 8 is a diagrammatic representation depicting hybridisation patterns obtained using copies of a hypothetical micro-array to detect cDNAs encoding the genomes of six different strains of potato virus Y and one of bean yellow mosaic virus
  • Figure 7 The virus-derived cDNAs match those in the example shown in Figure 5.
  • Figure 9 is a diagrammatic representation of a system used to carry out the instructions encoded by the storage medium of Figures 11 and 12.
  • Figure 10 depicts a flow diagram showing an embodiment of a method for designing combinatorial probes according to the present invention.
  • Figure 11 is a diagrammatic representation showing a cross section of a magnetic storage medium.
  • Figure 12 is a diagrammatic representation showing a cross section of an optically readable data storage medium.
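The array described for Figure 4 holds one oligonucleotide for every combination of the variants at the variable sites. A minimal sketch of that enumeration follows. The template and the bases allowed at each site are invented placeholders, not the potyvirus B-motif of Figure 3; they were chosen only so that variable positions 3, 6, 9, 12, 15 and 18 yield 256 combinations.

```python
from itertools import product

def expand_variants(template, variants):
    """List every probe obtained by substituting each allowed base at each
    variable site. `variants` maps a 1-based position to its allowed bases."""
    sites = sorted(variants)
    probes = []
    for choice in product(*(variants[pos] for pos in sites)):
        bases = list(template)
        for pos, base in zip(sites, choice):
            bases[pos - 1] = base
        probes.append("".join(bases))
    return probes

# Hypothetical 18-nucleotide template; N marks a variable site.
template = "GGNAANGGNCANTCNGGN"
# Hypothetical variants per site (2 * 2 * 4 * 4 * 2 * 2 = 256 combinations).
variants = {3: "AG", 6: "CT", 9: "ACGT", 12: "ACGT", 15: "AG", 18: "CT"}

probes = expand_variants(template, variants)
print(len(probes))  # 256
print(probes[0])    # GGAAACGGACAATCAGGC
```

Because `product` varies the last site fastest, consecutive probes differ first at position 18, which gives one possible row and column ordering for laying the probes out on a grid.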
  • an element means one element or more than one element.
  • Complementary refers to the topological capability or matching together of interacting surfaces of an oligonucleotide probe and its target oligonucleotide, which may be part of a larger polynucleotide.
  • the target and its probe can be described as complementary, and furthermore, the contact surface characteristics are complementary to each other.
  • Complementary includes base complementarity such as A is complementary to T or U, and C is complementary to G in the genetic code.
  • this invention also encompasses situations in which there is non-traditional base-pairing such as Hoogsteen base pairing which has been identified in certain transfer RNA molecules and postulated to exist in a triple helix. In the context of the definition of the term “complementary”, the terms “match” and “mismatch” as used herein refer to the hybridisation potential of paired nucleotides in complementary nucleic acid strands. Matched nucleotides hybridise efficiently, such as the classical A-T and G-C base pairs mentioned above. Mismatches are other combinations of nucleotides that hybridise less efficiently.
  • oligonucleotide probes refers to a set of probes having substantially similar sequences, some of which match known, preferably conserved, target sequences and some of which are similar but not identical to the same known target sequences. These latter target sequences correspond to redundant target sequences as defined herein. Oligonucleotide probes that recognise redundant target sequences contain sequence variations that exist in at least two of the known target sequences but not together in one sequence, i.e., they match one of these sequences at one nucleotide position but at least one other known target sequence at another nucleotide position. Thus, these probe sets contain potential permutations of known sequence variants that have not yet been reported but are likely to occur in nature.
  • feature refers to an area of a substrate having a collection of substantially same-sequence, surface immobilised oligonucleotide probes. Generally, one feature is different from another feature if the probes of the different features have substantially different nucleotide sequences.
  • a feature is a spatially addressable synthesis site as for example disclosed in U.S. Patent Nos. 5,384,261; 5,143,854; 5,150,270; 5,593,139; 5,634,734; and WO95/11995.
  • gene is meant a genomic nucleic acid sequence at a particular genetic locus.
  • gene family or “family of polynucleotides” refers to a set of polynucleotides or genes or the polypeptides they encode, that have statistically significant sequence homology as, for example, determined by appropriate Monte Carlo shuffling tests (Hunter and Kearney, 1983, Biol Cybern 47(2): 141-146). Such sets are related through common ancestry as a result of gene inheritance by related but separate lineages or by gene duplication or by horizontal gene transfer or an equivalent recombinational process and subsequent evolution. Such sets include nucleic acid species from related pathogens, such as different genotypes or strains of a bacterial or virus species or different bacterial or viral species belonging to a single genus.
  • Such sets also include genes that share a region that encodes a related domain.
  • Many shared sequences encoding domains are known in the art including, for example, the ATPase domain, the cadherin-like domain, the EGF domain, the immunoglobulin domain, and the fibronectin type II domain. Reference may be made in this respect to R.F. Doolittle (1995, Annu. Rev. Biochem. 64: 287-314).
  • Gene families frequently encode polypeptides sharing conserved regions, but may also include conserved regions that encode RNAs that interact with other polynucleotides, and regions that interact with proteins, such as homeobox and tymobox regions. Conserved regions may extend to those in intronic sequences and genomic regions whose functions are currently unknown.
  • polypeptides share a highly conserved region if the polypeptides have a sequence identity of at least 60% over a comparison window of ten amino acids, or if they share a sequence identity of at least 80% over a comparison window of at least five amino acids.
  • high density polynucleotide arrays and the like is meant those arrays that contain at least 400 different features per cm².
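The two thresholds in this definition can be checked with a simple sliding window, sketched below. The sketch assumes the two polypeptide sequences are already aligned, gap-free and compared position by position, and it tests a window of exactly five residues for the second rule although the text allows "at least five". All names are illustrative.

```python
# (window width in residues, minimum percent identity), taken from the definition.
RULES = ((10, 60), (5, 80))

def window_matches(a, b, start, width):
    """Count identical residues between a and b in one window."""
    return sum(x == y for x, y in zip(a[start:start + width],
                                      b[start:start + width]))

def share_highly_conserved_region(a, b):
    """True if any window meets one of the identity thresholds.
    Assumes a and b are pre-aligned, gap-free amino acid strings."""
    n = min(len(a), len(b))
    for width, min_pct in RULES:
        for start in range(n - width + 1):
            # Integer comparison avoids floating-point rounding at the boundary.
            if window_matches(a, b, start, width) * 100 >= min_pct * width:
                return True
    return False

# Hypothetical peptide fragments, invented for this sketch.
print(share_highly_conserved_region("MKTAYIAKQR", "MKTAYLLLLL"))  # True
print(share_highly_conserved_region("AAAAAAAAAA", "CCCCCCCCCC"))  # False
```

The first pair is only 50% identical over ten residues but fully identical over its first five, so it qualifies under the second rule.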
  • high discrimination hybridisation conditions refers to hybridisation conditions in which single base mismatch may be determined.
  • hybridising specifically to refers to the binding, duplexing, or hybridising of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA.
  • near-minimal number of probes is meant a number of probes that is less than the number of target polynucleotides but greater than the minimal number of probes.
  • a near-minimal number of probes would be less than 50% of the number of target polynucleotides, but more preferably less than 40%, less than 30%, less than 20%, less than 10%, or less than 5%.
  • a sample such as, for example, a polynucleotide extract is isolated from, or derived from, a particular source of the host.
  • the extract can be obtained from a tissue or a biological fluid isolated directly from the host.
  • oligonucleotide refers to a polymer composed of a multiplicity of nucleotide residues (deoxyribonucleotides or ribonucleotides, or related structural variants or synthetic analogues thereof) linked via phosphodiester bonds, or related structural variants or synthetic analogues thereof, such as 'locked nucleic acids'
  • oligonucleotide typically refers to a nucleotide polymer in which the nucleotide residues and linkages between them are naturally occurring, it will be understood that the term also includes within its scope various analogues including, but not restricted to, peptide nucleic acids (PNAs), phosphoramidates, phosphorothioates, methyl phosphonates, 2-O-methyl ribonucleic acids, and the like.
  • PNAs peptide nucleic acids
  • oligonucleotide is typically rather short in length, generally from about 8 to 30 nucleotides, more preferably from about 10 to 20 nucleotides and still more preferably from about 11 to 17 nucleotides, but the term can refer to molecules of any length, although the term "polynucleotide” or "nucleic acid” is typically used for large oligonucleotides.
  • Oligonucleotides may be prepared using any suitable method, such as, for example, the phosphotriester method as described in an article by Narang et al. (1979, Methods Enzymol 68 90) and U.S. Patent No. 4,356,270.
  • the phosphodiester method as described in Brown et al. (1979, Methods Enzymol. 68 109) may be used for such preparation.
  • Automated embodiments of the above methods may also be used.
  • diethylphosphoramidites are used as starting materials and may be synthesised as described by Beaucage et al. (1981, Tetrahedron Letters 22 1859-1862).
  • U.S. Patent Nos 4,458,066 and 4,500,707 refer to methods for synthesising oligonucleotides on a modified solid support.
  • the oligonucleotide is synthesised according to the method disclosed in U.S. Patent No. 5,424,186 (Fodor et al). This method uses lithographic techniques to synthesise a plurality of different oligonucleotides at precisely known locations on a substrate surface.
  • oligonucleotide array refers to a substrate having oligonucleotide probes with different known sequences deposited at discrete known locations associated with its surface.
  • the substrate can be in the form of a two dimensional substrate as described in U.S. Patent No. 5,424,186. Such substrate may be used to synthesise two-dimensional spatially addressed oligonucleotide (matrix) arrays.
  • the substrate may be characterised in that it forms a tubular array in which a two dimensional planar sheet is rolled into a three-dimensional tubular configuration.
  • the substrate may also be in the form of a microsphere or bead connected to the surface of an optic fibre as, for example, disclosed by Chee et al in WO 00/39587.
  • Oligonucleotide arrays have at least two different features and a density of at least 400 features per cm².
  • the arrays can have a density of about 500, at least one thousand, at least 10 thousand, at least 100 thousand, at least one million or at least 10 million features per cm².
  • the substrate may be silicon or glass and can have the thickness of a glass microscope slide or a glass cover slip, or may be composed of other synthetic polymers. Substrates that are transparent to light are useful when the method of performing an assay on the substrate involves optical detection.
  • the term also refers to a probe array and the substrate to which it is attached that form part of a wafer.
  • patient refers to patients of any animal origin, including humans, and includes any individual it is desired to examine or treat using the methods of the invention. However, it will be understood that "patient" does not imply that symptoms are present.
  • phenotype-determining target polynucleotide is meant a target polynucleotide that is associated with a particular phenotype of an organism including, but not restricted to, a disease or condition.
  • pivot sequence is used herein to refer to a target sequence that occurs in two or more of the target polynucleotides but not in all of the target polynucleotides.
  • a pivot sequence occurs in about 20% to about 80% of target polynucleotides, more preferably in about 30% to about 70%, more preferably in about 40% to about 60% and more preferably in about 45% to about 55% of the chosen target polynucleotides.
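A short sketch of how candidate pivot sequences might be picked out under this definition: count the fraction of target polynucleotides containing each sub-sequence and keep those present in at least two targets, not in all, and within a chosen range. The fixed window length `k`, the function names and the toy sequences are assumptions; the default range is the broadest one given in the text (about 20% to about 80%).

```python
def kmers(seq, k):
    """Return the set of all length-k windows of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def pivot_candidates(targets, k, low=0.20, high=0.80):
    """Map each candidate pivot sequence to the fraction of targets containing it."""
    total = len(targets)
    counts = {}
    for seq in targets.values():
        for sub in kmers(seq, k):
            counts[sub] = counts.get(sub, 0) + 1
    pivots = {}
    for sub, n in counts.items():
        fraction = n / total
        # Pivot: present in two or more targets, but not in all of them.
        if 2 <= n < total and low <= fraction <= high:
            pivots[sub] = fraction
    return pivots

# Hypothetical toy targets, invented for this sketch.
toy_targets = {"T1": "ACGTAA", "T2": "ACGTCC", "T3": "GGTTCA", "T4": "GGTTAG"}
pivots = pivot_candidates(toy_targets, k=4)
print(sorted(pivots.items()))  # [('ACGT', 0.5), ('GGTT', 0.5)]
```

A sequence found in about half of the targets splits them into two groups of similar size, which is why the narrower preferred ranges in the text centre on 50%.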
  • predefined assemblage refers to a collection of oligonucleotide probes that is made up of members which belong to two or more predefined sets of oligonucleotide probes, wherein oligonucleotides probes from these predefined sets are at least substantially complementary to, and would be expected to hybridise with, a family or group of related target polynucleotides.
  • a target polynucleotide may be indicated by hybridisation with oligonucleotide probes from several predefined sets, but it may not be known beforehand to which oligonucleotide probes in each set the target polynucleotide will hybridise.
  • a predefined assemblage preferably contains degenerate oligonucleotide probes as defined herein.
  • predefined combination refers to a combination of oligonucleotide probes that are at least substantially complementary to, or would be expected to hybridise with, target sequences of a single target polynucleotide.
  • Target sequences which are recognised by a predefined combination of probes encompass known target sequences or a potential or hypothetical combination of at least one known target sequence and at least one redundant target sequence as defined herein. Such potential combination of target sequences can be recognised by oligonucleotide probes belonging to a predefined assemblage as described hereinafter.
  • Probe refers to an oligonucleotide molecule that binds to a specific target sequence or other moiety of another nucleic acid molecule. Unless otherwise indicated, the term “probe” in the context of the present invention typically refers to an oligonucleotide probe that binds to another oligonucleotide or polynucleotide, often called the "target polynucleotide", through complementary base pairing. Probes can bind target polynucleotides lacking complete sequence complementarity with the probe, depending on the stringency of the hybridisation conditions. Oligonucleotide probes may be selected to be “substantially complementary" to a target sequence as defined herein.
  • the exact length of the oligonucleotide probe will depend on many factors including temperature and source of probe and use of the method.
  • the oligonucleotide probe may typically contain 8 to 30 nucleotides, more preferably from about 10 to 20 nucleotides and still more preferably from about 11 to 17 nucleotides capable of hybridisation to a target sequence although it may contain more or fewer such nucleotides.
  • redundant target sequence refers to a hypothetical or potential target sequence that has been deduced from substantially identical or conserved target polynucleotides.
  • the deduced sequences may therefore correspond to potential permutations of known sequence variants, which have not yet been reported but are likely to occur in nature.
  • redundant target sequences may be deduced from reference sequences of a gene family. This term also includes within its scope sequences as represented in a computer datafile or some other readable form that could be used to guide the synthesis of redundant oligonucleotide probes.
  • by reference sequence is meant a part or segment of a target polynucleotide that could be used to guide the selection of a target sequence.
  • terms used to describe sequence relationships between two or more polynucleotides or polypeptides include “comparison window”, “sequence identity”, “percentage of sequence identity” and “substantial identity”. Because two polynucleotides may each comprise (1) a sequence (i.e., only a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a “comparison window” to identify and compare local regions of sequence similarity.
  • a “comparison window” refers to a conceptual segment of at least 20 contiguous positions, usually about 20 to about 100, more usually about 100 to about 150 in which a sequence is compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned.
  • the comparison window may comprise additions or deletions (i.e., gaps) of about 20% or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
  • Optimal alignment of sequences for aligning a comparison window may be conducted by computerised implementations of algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Drive Madison, WI, USA; CLUSTAL described by Jeanmougin, F., et al, 1998, Trends Biochem. Sci. 23: 403-5) or by inspection, or using dot diagrams, and the best alignment (i.e., resulting in the highest percentage homology over the comparison window) generated by any of the various methods selected.
  • sequence identity refers to the extent that sequences are identical on a nucleotide-by-nucleotide basis or an amino acid-by-amino acid basis over a window of comparison.
  • a “percentage of sequence identity” is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, I) or the identical amino acid residue (e.g., Ala, Pro, Ser, Thr, Gly, Val, Leu, Ile, Phe, Tyr, Trp, Lys, Arg, His, Asp, Glu, Asn, Gln, Cys and Met) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity.
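The calculation just described can be sketched in a few lines. This is a minimal illustration only: it assumes the two sequences have already been optimally aligned by one of the methods named above, that gaps are written as '-', and that a gap never counts as a match. The function name and the example window are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over a comparison window.

    Both inputs are assumed to be already aligned and of equal length,
    with '-' marking gap positions (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # A position is matched when both sequences carry the same residue;
    # a gap is never counted as a match.
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    # Divide by the window size and multiply by 100.
    return 100.0 * matches / len(aligned_a)


# Hypothetical 10-position comparison window.
window_a = "ATGCGT-ACG"
window_b = "ATGCTTAACG"
identity = percent_identity(window_a, window_b)
```

Here eight of the ten positions match, so the window has 80% sequence identity.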
  • sequence identity will be understood to mean the “match percentage” calculated by an appropriate method.
  • sequence identity analysis may be carried out using the DNASIS computer program (Version 2.5 for Windows; available from Hitachi Software Engineering Co., Ltd., South San Francisco, California, USA) using standard defaults as used in the reference manual accompanying the software.
  • “Stringency” as used herein refers to the temperature and ionic strength conditions, and presence or absence of certain organic solvents, during hybridisation. The higher the stringency, the higher will be the observed degree of complementarity between immobilized polynucleotides and the labelled target polynucleotide.
  • Stringent conditions refers to temperature and ionic conditions under which only polynucleotides having a high proportion of complementary bases, preferably having exact complementarity, will hybridise.
  • the stringency required is nucleotide sequence dependent and depends upon the various components present during hybridisation, and is greatly changed when nucleotide analogues are used.
  • stringent conditions are selected to be about 10 to 20° C less than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
  • Tm is the temperature (under defined ionic strength and pH) at which 50% of a target sequence hybridises to a complementary probe.
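As a rough illustration of how a hybridisation temperature might be chosen relative to Tm, the sketch below estimates Tm for a short oligonucleotide and derives the range about 10 to 20° C below it. The source does not state how Tm is obtained; the Wallace rule used here, 2° C per A or T plus 4° C per G or C, is an assumption of this sketch and is only a coarse approximation for short probes. Function names and the example probe are hypothetical.

```python
def wallace_tm(probe: str) -> float:
    """Rough Tm estimate in degrees C for a short oligonucleotide.

    Uses the Wallace rule, 2*(A+T) + 4*(G+C). The choice of this rule
    is an assumption of the sketch, not something stated in the source.
    """
    seq = probe.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if not seq or at + gc != len(seq):
        raise ValueError("probe must be a non-empty string of A, C, G, T")
    return 2.0 * at + 4.0 * gc


def stringent_range(tm: float) -> tuple:
    """Temperatures about 10 to 20 degrees C below Tm, as (low, high)."""
    return (tm - 20.0, tm - 10.0)


probe = "ATGCATGCATGCAT"  # hypothetical 14-mer: 8 A/T and 6 G/C
tm = wallace_tm(probe)
low, high = stringent_range(tm)
```

For this probe the estimate is 40° C, giving a candidate stringent range of 20 to 30° C. In practice ionic strength, formamide and nucleotide analogues all shift Tm, as the surrounding text notes.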
  • an oligonucleotide probe will hybridise to a target sequence under at least low stringency conditions, preferably under at least medium stringency conditions and more preferably under high stringency conditions.
  • Reference herein to low stringency conditions include and encompass from at least about 1% v/v to at least about 15% v/v formamide and from at least about 1 M to at least about 2 M salt for hybridisation at 42° C, and at least about 1 M to at least about 2 M salt for washing at 42° C.
  • Low stringency conditions also may include 1% Bovine Serum Albumin (BSA), 1 mM EDTA, 0.5 M NaHPO 4 (pH 7.2), 7% SDS for hybridisation at 65° C, and (i) 2xSSC, 0.1% SDS; or (ii) 0.5% BSA, 1 mM EDTA, 40 mM NaHPO 4 (pH 7.2), 5% SDS for washing at room temperature.
  • Medium stringency conditions also may include 1% Bovine Serum Albumin (BSA), 1 mM EDTA, 0.5 M NaHPO 4 (pH 7.2), 7% SDS for hybridisation at 65° C, and (i) 2 x SSC, 0.1% SDS; or (ii) 0.5% BSA, 1 mM EDTA, 40 mM NaHPO 4 (pH 7.2), 5% SDS for washing at 42° C.
  • High stringency conditions include and encompass from at least about 31% v/v to at least about 50% v/v formamide and from at least about 0.01 M to at least about 0.15 M salt for hybridisation at 42° C, and at least about 0.01 M to at least about 0.15 M salt for washing at 42° C.
  • High stringency conditions also may include 1% BSA, 1 mM EDTA, 0.5 M NaHPO 4 (pH 7.2), 7% SDS for hybridisation at 65° C, and (i) 0.2 x SSC, 0.1% SDS; or (ii) 0.5% BSA, 1 mM EDTA, 40 mM NaHPO 4 (pH 7.2), 1% SDS for washing at a temperature in excess of 65° C.
  • Other stringent conditions are well known in the art.
  • by substantially complementary it is meant that an oligonucleotide probe is sufficiently complementary to hybridise with a target sequence. Accordingly, the nucleotide sequence of the oligonucleotide probe need not reflect the exact complementary sequence of the target sequence. In a preferred embodiment, the oligonucleotide probe contains no mismatches with the target sequence.
  • substantially similar affinities refers herein to target sequences having similar strengths of detectable hybridisation to their complementary or substantially complementary oligonucleotide probes under a chosen set of stringent conditions.
  • target polynucleotide refers to a polynucleotide of interest (e.g., a single gene or polynucleotide) or a group of polynucleotides (e.g., a family of polynucleotides, as described above).
  • the target polynucleotide can designate mRNA, RNA, cRNA, cDNA or DNA.
  • the probe is used to obtain information about the target polynucleotide: whether the target polynucleotide has affinity for a given probe.
  • Target polynucleotides may be naturally occurring or man-made nucleic acid molecules. Also, they can be employed in their unaltered state or as aggregates with other species.
  • Target polynucleotides may be associated covalently or non-covalently, to a binding member, either directly or via a specific binding substance.
  • a target polynucleotide can hybridise to a probe whose sequence is at least partially complementary to a sub-sequence of the target polynucleotide.
  • target sequence is used herein to refer to a chosen nucleotide sequence of at most 300, 250, 200, 150, 100, 75, 50, 30, 25 or at most 15 nucleotides in length.
  • Target sequences include sequences of at least 8, 10, 15, 25, 30, 35, 45, 50, 60, 70, 80, 90, 100, 120, 135, 150, 175, 200, 250 and 300 nucleotides in length.
  • target sequences include, but are not restricted to, repeat sequences such as Alu repeat sequences, conserved or non-conserved regions of gene families, introns, promoter sequences including the Hogness Box and the TATA box, signal sequences, enhancers, protein-binding domains such as a homeobox, tymobox, polymorphisms and conserved protein domains or portions thereof.
  • the genomes (i.e., the complete gene sequences) of organisms range in length from a few hundred nucleotides for viroids and viruses to a few billion for multicellular organisms.
  • Conventional oligonucleotide probes typically target sequences that are only 8-30 nucleotides long for detection purposes.
  • short stretches (substrings or sub-sequences) of the target polynucleotide sequences are considered.
  • This second technique may be used to consider a set of short aligned sub-sequences from a larger alignment. Depending on the range of length of sub-sequences that are considered, some of the possible sub-sequences will overlap or contain others (Figure 1). Conserved, substantially similar or substantially identical sequences can be found using these techniques as implemented in well-known algorithms. Longer conserved regions may also be identified if substantially identical or similar sub-sequences are found to overlap or to be adjacent or in close proximity.
  • Some sub-sequences will be unique to a target polynucleotide (i.e., not found in other target polynucleotides) but many of the shorter sub-sequences from one target polynucleotide will also be found in other target polynucleotides (shared sub-sequences). Moreover, different sets of these shorter sub-sequences will be shared between different combinations of target polynucleotides (Figure 2A) (i.e., one target polynucleotide may share some sub-sequences with another target polynucleotide but another set of sub-sequences will be shared with a third target polynucleotide and so on).
  • probes designed from the shared sub-sequences will hybridise to more than one target polynucleotide and when probes are designed from several different shared sub-sequences the pattern of hybridisation will be complex.
  • Such shared and unique sub-sequences form the basis of target sequences as described hereinafter.
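The distinction between shared and unique sub-sequences can be made concrete with a short sketch. It assumes exact matching of fixed-length sub-sequences and uses three hypothetical toy sequences that are far shorter than real target polynucleotides. The function name is illustrative.

```python
from collections import defaultdict


def subsequence_owners(targets: dict, k: int) -> dict:
    """Map each k-long sub-sequence to the set of targets containing it."""
    owners = defaultdict(set)
    for name, seq in targets.items():
        # Slide a window of length k along the sequence, one base at a time.
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)
    return dict(owners)


# Hypothetical toy target polynucleotides.
targets = {
    "A": "ACGTACGG",
    "B": "TTGTACGA",
    "C": "ACGTTTGT",
}
owners = subsequence_owners(targets, k=4)
# Shared sub-sequences occur in two or more targets; unique ones in one only.
shared = {s: t for s, t in owners.items() if len(t) > 1}
unique = {s: t for s, t in owners.items() if len(t) == 1}
```

In this toy set, ACGT is shared by A and C, GTAC and TACG by A and B, and TTGT by B and C, so different sub-sequences are shared between different combinations of targets.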
  • the present invention is predicated in part on a novel strategy for decreasing the number and/or size of oligonucleotide probes required for detecting and distinguishing between a plurality of target polynucleotides.
  • the strategy involves detecting different target polynucleotides using a set of oligonucleotide probes, which includes a collection of promiscuous probes, wherein each promiscuous probe is capable of hybridising to a predetermined sub-sequence or target sequence shared between at least two target polynucleotides.
  • the target polynucleotides to be detected comprise two or more target sequences, at least one of which is shared with one or more other target polynucleotides.
  • a particular target polynucleotide can be specifically detected by detecting hybridisation thereto of at least two promiscuous probes, wherein different target polynucleotides are identified by different combinations of such probes.
  • the instant combinatorial detection can be carried out minimally using three gene targets, e.g., targets A, B and C. These genes could be identified using three specific probes, but they could also be identified by only two probes, if these probes were designed using the sequences of two shared target sequences, x and y. A probe designed from target sequence x reacts with A, one designed from target sequence y reacts with B and both probes react with C (Figure 2B). Furthermore, the shorter an oligonucleotide is, the greater the number of gene sequences with which it is likely to hybridise; therefore, probes used in a combinatorial way can be shorter than those that are specific.
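The three-target, two-probe example can be written out as a small lookup table. The sketch below simply encodes the reactions stated in the text (x with A and C, y with B and C) and checks that the resulting hybridisation patterns are distinct. The variable and function names are illustrative.

```python
# Which targets each promiscuous probe hybridises to, following the text:
# probe x reacts with A and C, probe y reacts with B and C.
probe_hits = {
    "x": {"A", "C"},
    "y": {"B", "C"},
}
targets = ["A", "B", "C"]


def signature(target: str, hits: dict) -> tuple:
    """Sorted tuple of the probes that hybridise to the given target."""
    return tuple(sorted(p for p, reacts in hits.items() if target in reacts))


signatures = {t: signature(t, probe_hits) for t in targets}
# The probe set discriminates the targets if no two signatures coincide.
discriminates = len(set(signatures.values())) == len(targets)
# Invert the table to decode an observed hybridisation pattern.
decode = {sig: t for t, sig in signatures.items()}
```

An observed pattern in which both x and y give a signal decodes to target C, while a signal from x alone decodes to A.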
  • efficiently designed combinatorial arrays will be comprised of fewer and typically shorter probes, than those using target-specific probes.
  • a particular advantage of such arrays is that they will be less costly to produce.
  • the potential savings will depend in part on the size of the set of target sequences: the larger the target sequence set the greater the potential savings will be as the number of target sequences that are available for combinatorial detection or identification is larger.
  • the set of probes may optionally contain non-promiscuous probes each of which is capable of hybridising to a single or unique target sequence in the plurality of target polynucleotides.
  • non-promiscuous probes and combinations of promiscuous probes are used to distinguish between the plurality of different target polynucleotides. Accordingly, a respective target polynucleotide can be specifically detected by detecting hybridisation thereto of at least two promiscuous probes, or a single non-promiscuous probe.
  • the above combinatorial approach is particularly useful for designing efficient sets of probes to detect, for example, all likely members of a group of related but variable genes. Large sets of probes are required if every possible sequence is to be identified specifically. However, if a combinatorial approach is used as described herein the required specificity can be obtained by using a combination of small sets of less specific (i.e., cross hybridising) or promiscuous probes.
  • a set of probes can be designed so that a target polynucleotide would hybridise to at least two probes from the set.
  • different combinations of cross-reactive or 'promiscuous' probes only are used to discriminate between, and identify specifically, a plurality of target polynucleotides.
  • probes that hybridise to target sequences uniquely in concert with promiscuous probes are used to provide such discrimination and identification. The saving in the number of probes will depend on the variability of the target sequences.
  • sequences of the shared reference sequences may have been conserved during the evolution of the target polynucleotides (i.e., the target polynucleotides have some common ancestry) or they may be shared because coincidental sequence similarities have arisen through a process of convergence. Both types of shared sequences are useful for designing promiscuous probes according to the invention.
  • Another set of target sequences that could be used would be those that are similar to varying degrees. Different target polynucleotides should contain many such similar target sequences and because under certain conditions probes will hybridise with sequences that are almost identical but not absolutely identical, some similar target sequences could be used.
  • Useful reference sequences for guiding selection of target sequences include, but are not restricted to, those defining repeat sequences, conserved or non-conserved regions of gene families, introns or exons, promoters, signal sequences, enhancers, boxes, protein-binding domains, polymorphisms and conserved protein domains or other multinucleotide groupings of interest (e.g., homeoboxes, tymoboxes, etc.).
  • the probe set includes probes that define the degenerate set of oligonucleotides.
  • useful probes can contain inosine, other generic bases, or mixtures of A, C, T and G, especially at the third position of a codon site.
  • a reference sequence defines a polymorphism. In this instance, probes interrogate the presence of individual polymorphic variants.
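A degenerate probe with mixed bases stands for a set of concrete oligonucleotides. The sketch below expands such a probe using the standard IUPAC ambiguity codes. It is an illustration under stated assumptions: inosine and other generic bases are not modelled, and the example probe is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe: str) -> list:
    """List every concrete oligonucleotide encoded by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    # Cartesian product over the allowed bases at each position.
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical probe with mixed bases at the third position of each codon:
# GCN (any base at position 3) followed by AAR (A or G at position 3).
variants = expand_degenerate("GCNAAR")
```

The six-base example expands to 4 x 2 = 8 oligonucleotides, which is why degeneracy is usually confined to a few positions.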
  • the combinatorial method for designing reduced sets of probes could be applied to any test or device that uses two or more probes, and it will allow significant economies or cost savings in tests or devices that use larger numbers of probes and have a broad range of
  • the method could be used in one embodiment to improve the design of DNA micro-arrays that are used for gene expression studies, pathogen strain typing, genotype typing, diagnosis, forensics or any other use requiring that species or genes be detected, distinguished or identified.
  • the method could also be used to improve the design of tests or devices that are based on nucleotide hybridisation but that do not use
  • the set of probes is immobilised on one or more solid supports.
  • An oligonucleotide probe may be immobilised to the solid support using any suitable technique. For example, Holstrom et al. (1993, Anal. Biochem. 209: 278-283) exploit the
  • Another method which may be employed involves precoating of polystyrene or glass solid phases with poly-L-Lys or poly-L-Lys,
  • the oligonucleotide primers may be synthesised in situ utilising, for example, the method of Maskos and Southern (1992, Nucleic Acids Res. 20: 1679-1684) or that of Fodor et al. (supra).
  • the set of probes is in the form of a nucleic acid array, preferably a high-density nucleic acid array, which may optionally comprise a mixture of different but individually addressable microbeads.
  • oligonucleotide probes used in the invention may be immobilized either directly or indirectly.
  • a probe may be adsorbed to a surface or alternatively covalently bound to a spacer molecule, which has been covalently bound to the solid support.
  • the spacer molecule may include a latex microparticle, a protein such as bovine serum albumin (BSA) or a polymer such as dextran or poly-(ethylene glycol).
  • the spacer molecule may comprise a homo-polynucleotide tail such as, for example, oligo-dT.
  • the spacer molecule is 10 to 25 molecules in length.
  • Probes may be designed to optimise specific hybridisation to their reference sequences.
  • Drmanac et al. (U.S. Patent No. 5,972,619) describe probes containing a core 8-mer and one of three possible variations at outer positions with two variations at each end. Such probes are represented as 5'-(A, T, G, C)(A, T, G, C) N8 (A, T, G, C)-3'.
  • With this type of probe, one does not need to discriminate the non-informative end bases (two on the 5' end, and one on the 3' end) since only the internal 8-mer is read as the probe sequence.
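Read literally, the representation 5'-(A, T, G, C)(A, T, G, C) N8 (A, T, G, C)-3' places two variable bases before the core 8-mer and one after it. Under that reading, which is an interpretation made for this sketch, the flanked forms of a given core can be enumerated and the informative core recovered as follows. The core sequence and function names are hypothetical.

```python
from itertools import product

BASES = "ATGC"


def flanked_probes(core: str) -> list:
    """All probes of the form 5'-XX core X-3' for an 8-mer core."""
    if len(core) != 8:
        raise ValueError("core must be an 8-mer")
    # Two variable bases on the 5' side, one on the 3' side.
    return [f"{a}{b}{core}{c}" for a, b, c in product(BASES, repeat=3)]


def informative_core(probe: str) -> str:
    """Drop the two 5' and one 3' non-informative end bases."""
    return probe[2:-1]


probes = flanked_probes("ACGTTGCA")  # hypothetical core 8-mer
```

Each core yields 64 flanked 11-mers, all of which read back as the same internal 8-mer.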
  • the invention also contemplates a process for identifying target sequences for the preparation of a set of oligonucleotide probes as broadly defined above.
  • the process comprises searching a nucleic acid sequence database comprising the sequences of a plurality of target polynucleotides for identical target sequences that are shared between two or more of the target polynucleotides to thereby obtain a subset of shared target sequences (shared subset).
  • the process further comprises recording the positions in each polynucleotide sequence of all overlapping sub-sequences, for example between 8 and 30 nucleotides in length, within that sequence. In an alternate embodiment, the process further comprises recording the positions in each polynucleotide sequence of all unique sub-sequences within that sequence (unique subset). In yet another embodiment, the process further comprises sorting the target sequences from said subset(s) to obtain target sequences with substantially similar affinities for their complementary oligonucleotide probes.
  • Potential target sequences that are preferably identified in the sub-sequence database include, but are not restricted to:
  • Pivot sequences that preferably divide two or more target polynucleotides into two sets, one set comprising from 40-60% of the target group in which the pivot sequence is present, and the other, the remaining 60-40% of the polynucleotides, in which the pivot sequence is not present. This sorting would be done using a computational embodiment in the style of Dantzig's simplex algorithm of linear programming.
  • the process further comprises recording the positions in each polynucleotide sequence of any target sequences that divide two or more target polynucleotides into sets, thus defining a pivot sequence subset.
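The 40-60% criterion for pivot sequences can be illustrated with a simple filter over a presence table. This sketch is not the simplex-style sorting mentioned above; it is a plain threshold test substituted for illustration, and the presence table, target names and function name are hypothetical.

```python
def pivot_sequences(presence: dict, n_targets: int,
                    low_pct: int = 40, high_pct: int = 60) -> list:
    """Return sub-sequences present in 40-60% of the target group."""
    pivots = []
    for subseq, holders in presence.items():
        # Compare in integer arithmetic to avoid floating-point edge cases.
        share = 100 * len(holders)
        if low_pct * n_targets <= share <= high_pct * n_targets:
            pivots.append(subseq)
    return sorted(pivots)


# Hypothetical presence table: sub-sequence -> targets that contain it.
presence = {
    "ACGTACGT": {"t1", "t2"},
    "GGATCCAA": {"t1"},
    "TTGACCGT": {"t1", "t2", "t3"},
    "CATGCATG": {"t2", "t3", "t4", "t5"},
}
pivots = pivot_sequences(presence, n_targets=5)
```

With five targets, only the sub-sequences found in two or three of them qualify. A probe for such a sequence splits the group roughly in half, which is what makes it informative.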
  • the process further comprises recording the positions in each polynucleotide sequence of any target sequences that are substantially identical or conserved between related target polynucleotides. Redundant sequences corresponding to potential sequence variants of such target sequences can then be deduced to obtain a subset of redundant target sequences (redundant subset), which correspond to potentially unknown or uncharacterised target polynucleotides.
  • a combination of target sequences is then selected from one or more of the shared subset, the redundant subset and the pivot subset or a single target sequence is selected from the unique subset, for specifically detecting each target polynucleotide or group of target polynucleotides.
  • a predefined assemblage of target sequences is identified wherein at least one member of the combination is a redundant target sequence.
  • the unknown or uncharacterised member would, therefore, be expected to hybridise with a predefined assemblage of oligonucleotide probes, wherein at least one probe is substantially complementary to a redundant target sequence.
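One way to deduce redundant target sequences, offered here as an assumption rather than as the method of the source, is to take the bases observed at each position of a set of aligned known variants and enumerate the combinations that have not yet been reported. The known variants and function name below are hypothetical.

```python
from itertools import product


def redundant_variants(aligned_variants: list) -> list:
    """Deduce unreported permutations of known, equal-length variants."""
    length = len(aligned_variants[0])
    if any(len(v) != length for v in aligned_variants):
        raise ValueError("variants must be aligned and of equal length")
    # Bases observed at each column of the alignment.
    columns = [sorted(set(col)) for col in zip(*aligned_variants)]
    known = set(aligned_variants)
    # Every combination of observed bases, minus those already known.
    candidates = ("".join(c) for c in product(*columns))
    return [seq for seq in candidates if seq not in known]


# Hypothetical known variants of one conserved target sequence.
known_variants = ["ACGTTGCA", "ACGATGCA", "ACGTTGCC"]
redundant = redundant_variants(known_variants)
```

The three known variants differ at two positions, giving four possible permutations, of which one (ACGATGCC) has not been observed and is therefore a candidate redundant target sequence.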
  • a minimal or near minimal number of oligonucleotide probes is determined which, in different combinations, discriminate between the different target polynucleotides.
  • At least 2, more preferably at least 10, more preferably at least 50, more preferably at least 100 and still more preferably at least 1000 different combinations of target sequences are determined for specifically detecting a corresponding number of target polynucleotides.
  • sets of probes based on pivot sequences that divide the target polynucleotides in substantially all possible combinations, and that are of minimal or near minimal length, can be used to provide efficient probes for identifying target polynucleotides using micro-arrays.
  • Sets of probes based on conserved sequences can be used to provide taxonomic information since they represent regions of gene families that have been inherited from a shared ancestor.
  • Probe sequences, like those described hereinafter for potyviruses can then be deduced from such taxonomic analysis, to provide a basis for the construction of a probe array that can identify as-yet-unknown relatives of a chosen target group or family of polynucleotides. It is also envisaged that some target sequences will occur in both pivot and conserved groups, and that most of these shared sequences will be recognised as contiguous regions of shared sequences.
  • the most efficient micro-arrays will comprise mixtures of probes identified by both pivot and conserved searching techniques, pruned after tests for sequence redundancy, and expanded to include permutations of contiguous and conserved regions so as to capture likely sequence variants of gene families.
  • micro-arrays will not only identify known target sequences but also related sequences. It is further envisaged that previously unknown polynucleotides will be recognised and initially characterised by such micro-arrays, and that the probe sequences with which unknown polynucleotides are found to hybridise can be used as primers in polymerase chain reactions to further characterise and identify such unknown polynucleotides.
4. Computer related embodiments
  • the design or construction of a set of combinatorial probes of the present invention is suitably facilitated with the assistance of a computer programmed with software, which inter alia searches a nucleic acid sequence database comprising the sequences of a plurality of target polynucleotides for identical target sequences that are shared between two or more of the target polynucleotides to thereby obtain a subset of shared target sequences (shared subset).
  • the software determines subsequently for each target polynucleotide a combination of target sequences from said subset whose sequence information can be used to construct probes that can facilitate specific detection of that target polynucleotide.
  • the invention encompasses a computer for designing the sequence of a set of combinatorial probes of the invention, wherein the computer comprises: (a) a machine readable data storage medium comprising a data storage material encoded with machine readable data, wherein the machine readable data comprises a plurality of target polynucleotides (e.g., a gene database); (b) a working memory for storing instructions for processing the machine-readable data; (c) a central-processing unit coupled to the working memory and to the machine-readable data storage medium, for processing the machine-readable data to provide identical target sequences that are shared between two or more of the target polynucleotides; and (d) an output hardware coupled to the central processing unit, for receiving said identical target sequences.
  • the computer processes said machine-readable data to provide for each target polynucleotide a combination of target sequences, which when hybridised by complementary or substantially complementary oligonucleotide probes, facilitate specific detection of that target polynucleotide.
  • the computer may also process the machine-readable data to record positions in each polynucleotide sequence of all overlapping sub-sequences, for example between 8 and 30 nucleotides in length, within that sequence.
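Recording the positions of all overlapping sub-sequences amounts to building a positional index. The sketch below does this for a single sequence, using 0-based start positions. The function name is illustrative and the example sequence and length range are deliberately tiny.

```python
from collections import defaultdict


def record_positions(sequence: str, min_len: int = 8, max_len: int = 30) -> dict:
    """Map each overlapping sub-sequence to its 0-based start positions."""
    positions = defaultdict(list)
    # Consider every length in the range that fits within the sequence.
    for k in range(min_len, min(max_len, len(sequence)) + 1):
        for start in range(len(sequence) - k + 1):
            positions[sequence[start:start + k]].append(start)
    return dict(positions)


# Hypothetical 12-nucleotide sequence; real targets are much longer.
index = record_positions("ACGTACGTACGT", min_len=8, max_len=9)
```

In this example the 8-mer ACGTACGT is recorded at positions 0 and 4. For real sequences the index grows with both sequence length and the width of the length range, so a production implementation would likely need a more compact structure.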
  • the computer may process the machine-readable data to record the positions in each polynucleotide sequence of all unique sub-sequences within that sequence (unique subset).
  • the computer processes the machine-readable data to sort the target sequences in said subset(s) to obtain target sequences with substantially similar affinities for their complementary oligonucleotide probes.
  • the computer may process the machine-readable data to record the positions in each polynucleotide sequence of any target sequences that divide two or more target polynucleotides into sets, thus defining a pivot sequence subset. In an alternate embodiment, the computer may process the machine-readable data to record the positions in each polynucleotide sequence of any target sequences that are substantially identical or conserved between related target polynucleotides.
  • the computer also may process the machine-readable data to deduce redundant sequences corresponding to potential sequence variants of such target sequences to obtain a subset of redundant target sequences (redundant subset), which correspond to potentially unknown or uncharacterised target polynucleotides.
  • the invention also contemplates a computer program product for designing combinatorial probes of the present invention, comprising code that receives as input sequences of target polynucleotides from one or more nucleic acid sequence databases and/or information that identifies sequences corresponding to said target polynucleotides; code that identifies potential target sequences within the target polynucleotides; code that identifies the target sequences that are shared between different target polynucleotides; optional code that identifies the target sequences that are unique to specific target polynucleotides; code that assesses every possible combination or a number of combinations of the target sequences to identify those combinations of target sequences which, when hybridised by complementary oligonucleotide probes, facilitate discrimination between different target polynucleotides; and a computer readable medium that stores the codes.
  • the computer program product further comprises code that creates a database which registers the presence or absence of possible target sequences found within respective target polynucleotides. Additionally, or alternatively, the computer program product further comprises code that identifies substantially identical or conserved sequences between the target sequences and code that identifies redundant sequence variants of said substantially identical target sequences, wherein said redundant sequence variants are registered as target sequences.
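A database registering the presence or absence of candidate target sequences within each target polynucleotide can be pictured as a binary matrix. The sketch below builds one by exact substring search. The gene names, sequences and candidate list are hypothetical, and exact matching is an assumption of the sketch.

```python
def presence_matrix(targets: dict, candidates: list) -> dict:
    """Register, per target, whether each candidate sub-sequence occurs."""
    return {
        name: [1 if cand in seq else 0 for cand in candidates]
        for name, seq in targets.items()
    }


# Hypothetical targets and candidate target sequences.
targets = {
    "geneA": "ACGTACGGTT",
    "geneB": "TTGTACGATT",
    "geneC": "ACGTTTGTAA",
}
candidates = ["ACGT", "GTAC", "TTGT"]
matrix = presence_matrix(targets, candidates)
```

Each row is the expected hybridisation pattern of one target against the candidate probes. Rows that differ from every other row identify targets that the candidate set can discriminate.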
  • a computer 11 comprising a central processing unit ("CPU") 20, a working memory 22 which may be, e.g., RAM (random-access memory) or “core” memory, mass storage memory 24 (such as one or more disk drives or CD-ROM drives), one or more cathode-ray tube (“CRT”) display terminals 26, one or more keyboards 28, one or more input lines 30, and one or more output lines 40, all of which are interconnected by a conventional bidirectional system bus 50.
  • Input hardware 36 coupled to computer 11 by input lines 30, may be implemented in a variety of ways.
  • machine-readable data may be inputted via the use of a modem or modems 32 connected by a telephone line or dedicated data line 34.
  • the input hardware 36 may comprise CD-ROM drives or disk drives 24. In conjunction with display terminal 26, keyboard 28 may also be used as an input device.
  • Output hardware 46 coupled to computer 11 by output lines 40, may similarly be implemented by conventional devices.
  • output hardware 46 may include CRT display terminal 26 for displaying a synthetic polynucleotide sequence or a synthetic polypeptide sequence as described herein.
  • Output hardware might also include a printer 42, so that hard copy output may be produced, or a disk drive 24, to store system output for later use.
  • CPU 20 coordinates the use of the various input and output devices
  • these steps include (1) selecting a group of entities to be identified (e.g., a group of organisms, a family of related polynucleotides, etc.); (2) compiling sequence data for those entities; (3) identifying target sequences that are shared between those entities to provide a subset of shared sequences; (4) deriving potential oligonucleotide sequences (oligos), which can be used as probes for detecting and distinguishing members of the group; (5) preparing a primary “taxon x oligo” matrix; (6) deducing a meta “taxon pair - oligo” matrix; (7) identifying a “minimum set cover” of oligos using a “greedy strategy”; (8) identifying replicate sets of identical probes from oligos of step (7); and (9) evaluating the discriminatory power of the probes.
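Steps (5) to (7) can be sketched as a greedy set-cover search. The version below is one plausible reading, not the source's implementation: it assumes an oligo "covers" a pair of taxa when it is present in exactly one of the two, and it repeatedly picks the oligo that separates the most pairs not yet separated. Steps (8) and (9) are not shown. The matrix contents and function name are hypothetical.

```python
from itertools import combinations


def greedy_probe_set(taxon_oligo: dict) -> list:
    """Pick oligos that, between them, distinguish every pair of taxa.

    taxon_oligo maps taxon -> set of oligos present in that taxon,
    i.e. the primary "taxon x oligo" matrix in sparse form.
    """
    oligos = sorted(set().union(*taxon_oligo.values()))
    pairs = list(combinations(sorted(taxon_oligo), 2))
    # Meta "taxon pair - oligo" matrix: an oligo separates a pair of taxa
    # when it is present in exactly one of the two.
    separates = {
        o: {(a, b) for a, b in pairs
            if (o in taxon_oligo[a]) != (o in taxon_oligo[b])}
        for o in oligos
    }
    uncovered = set(pairs)
    chosen = []
    while uncovered:
        # Greedy step: take the oligo separating the most uncovered pairs.
        best = max(oligos, key=lambda o: len(separates[o] & uncovered))
        gained = separates[best] & uncovered
        if not gained:
            break  # remaining pairs cannot be told apart by any oligo
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical matrix: four taxa, four candidate oligos.
taxon_oligo = {
    "t1": {"o1", "o3"},
    "t2": {"o1", "o4"},
    "t3": {"o2", "o3"},
    "t4": {"o2"},
}
probes = greedy_probe_set(taxon_oligo)
```

Two oligos suffice for the four taxa in this example. A greedy search does not guarantee a true minimum cover, and this sketch lets one taxon be identified by the absence of any signal, which a practical design would probably want to exclude.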
  • Figure 11 shows a cross section of a magnetic data storage medium 100 which can be encoded with machine readable data, or set of instructions, for designing a set of probes of the invention, which can be carried out by a system such as system 10 of Figure 9.
  • Medium 100 can be a conventional floppy diskette or hard disk, having a suitable substrate
  • Medium 100 may also have an opening (not shown) for receiving the spindle of a disk drive or other data storage device 24.
  • the magnetic domains of coating 102 of medium 100 are polarised or oriented so as to encode in manner which may be conventional, machine readable data such as that described herein, for execution by a system such as system 10 of Figure 9.
  • Figure 12 shows a cross section of an optically readable data storage medium 110 which also can be encoded with such machine-readable data, or set of instructions, for designing a synthetic molecule of the invention, which can be carried out by a system such as system 10 of Figure 9.
  • Medium 110 can be a conventional compact disk read only memory (CD-ROM) or a rewritable medium such as a magneto-optical disk, which is optically readable and magneto-optically writable.
  • Medium 110 preferably has a suitable substrate 111, which may be conventional, and a suitable coating 112, which may be conventional, usually on one side of substrate 111.
  • coating 112 is reflective and is impressed with a plurality of pits 113 to encode the machine-readable data. The arrangement of pits is read by reflecting laser light off the surface of coating 112.
  • a protective coating 114, which preferably is substantially transparent, is provided on top of coating 112.
  • in the case of a magneto-optical disk, as is well known, coating 112 has no pits 113, but has a plurality of magnetic domains whose polarity or orientation can be changed magnetically when heated above a certain temperature, as by a laser (not shown).
  • the orientation of the domains can be read by measuring the polarisation of laser light reflected from coating 112.
  • the arrangement of the domains encodes the data as described above.
  • the invention also provides a method for detecting a plurality of different target polynucleotides using a set of probes as broadly described above.
  • the method comprises exposing the probes to a test sample suspected of containing one or more of said target polynucleotides under conditions favouring specific hybridisation.
  • Suitable test samples may include extracts of double or single stranded nucleic acids obtained from archaeal, eubacterial or eukaryotic origin.
  • extracts may be obtained from cells, tissues or materials derived from plants, fungi, bacteria or animals as well as materials derived from viruses, satellite viruses, viroids and similar non-cellular organisms.
  • Sample extracts of DNA or RNA may be prepared from fluid suspensions of biological materials, or by grinding biological materials, or following a cell lysis step which includes, but is not limited to, lysis effected by treatment with SDS (or other detergents), osmotic shock, guanidinium isothiocyanate and lysozyme.
  • Suitable DNA which may be used in the method of the invention, includes genomic DNA or cDNA. Such DNA may be prepared by any one of a number of commonly used protocols as for example described in CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (Ausubel, et al, eds.) (John Wiley & Sons, Inc. 1995), and MOLECULAR CLONING.
  • RNA may be prepared by any suitable protocol as for example described in CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (supra), MOLECULAR CLONING. A LABORATORY MANUAL (supra) and Chomczynski and Sacchi (1987, Anal. Biochem. 162: 156, hereby incorporated by reference).
  • RNA which may be used in the method of the invention, includes messenger RNA, complementary RNA transcribed from DNA (cRNA) or genomic or subgenomic RNA.
  • Such RNA may be prepared using standard protocols as for example described in the relevant sections of Ausubel, et al. (supra) and Sambrook, et al. (supra).
  • the genomic DNA or cDNA may be fragmented, for example, by sonication or by treatment with restriction endonucleases.
  • the genomic DNA or cDNA is fragmented such that resultant DNA fragments are of a length greater than the length of the immobilized oligonucleotide probe(s) but small enough to allow rapid access thereto under suitable hybridisation conditions.
  • fragments of genomic DNA or cDNA may be selected and amplified using a suitable nucleotide amplification technique, involving appropriate random or specific primers.
  • amplification techniques are well known to those of skill in the art and include, for example, PCR (Saiki et al, 1988, supra), Strand Displacement Amplification (SDA) (US 5,422,252, Little et al), Rolling Circle Replication (RCR) (Liu et al, 1996, J. Am. Chem. Soc. 118: 1587-1594; International
  • target polynucleotides or fragments thereof are detectably labelled so that their hybridisation to individual probes can be determined.
  • the target polynucleotides or fragments may have one or more reporter molecules associated therewith.
  • the reporter molecule may be selected from a group including a chromogen, a catalyst, an enzyme, a fluorochrome, a chemiluminescent molecule, a bioluminescent molecule, a lanthanide ion such as Europium (Eu³⁺), a radioisotope and a direct visual label.
  • in the case of a direct visual label, use may be made of a colloidal metallic or non-metallic particle, a dye particle, an enzyme or a substrate, an organic polymer, a latex particle, a liposome, or other vesicle containing a signal producing substance and the like.
  • Especially preferred labels of this type include large colloids, for example, metal colloids such as those from gold, selenium, silver, tin and titanium oxide.
  • an enzyme used as a direct visual label
  • biotinylated bases are incorporated into a target polynucleotide. Hybridisation is detected by incubation with streptavidin-reporter molecules.
  • Suitable fluorochromes include, but are not limited to, fluorescein isothiocyanate (FITC), tetramethylrhodamine isothiocyanate (TRITC), R-Phycoerythrin (RPE), and Texas Red.
  • Other exemplary fluorochromes include those discussed by Dower et al. (International Publication WO 93/06121). Reference also may be made to the fluorochromes described in U.S. Patents 5,573,909 (Singer et al), 5,326,692 (Brinkley et al). Alternatively, reference may be made to the fluorochromes described in U.S. Patent Nos.
  • fluorescent labels include, for example, fluorescein phosphoramidites such as Fluoreprime (Pharmacia), Fluoredite (Millipore) and FAM (Applied Biosystems International).
  • Radioactive reporter molecules include, for example, ³²P, which can be detected by X-ray or phosphorimager techniques.
  • the hybrid-forming step can be performed under suitable conditions for hybridising oligonucleotide probes to test nucleic acid including DNA or RNA.
  • Preferably high discrimination hybridisation conditions are used.
  • a hybridisation reaction can be performed in the presence of a hybridisation buffer that optionally includes a hybridisation optimising agent, such as an isostabilising agent, a denaturing agent and/or a renaturation accelerant.
  • isostabilising agents include, but are not restricted to, betaines and lower tetraalkyl ammonium salts.
  • Denaturing agents are compositions that lower the melting temperature of double stranded nucleic acid molecules by interfering with hydrogen bonding between bases in a double stranded nucleic acid or the hydration of nucleic acid molecules.
  • Denaturing agents include, but are not restricted to, formamide, formaldehyde, dimethylsulphoxide, tetraethyl acetate, urea, guanidinium isothiocyanate, glycerol and chaotropic salts.
  • Hybridisation accelerants include heterogeneous nuclear ribonucleoprotein (hnRNP) A1 and cationic detergents such as cetyltrimethylammonium bromide (CTAB) and dodecyl trimethylammonium bromide (DTAB), polylysine, spermine, spermidine, single stranded binding protein (SSB), phage T4 gene 32 protein and a mixture of ammonium acetate and ethanol.
  • Hybridisation buffers may include target polynucleotides at a concentration between about 0.005 nM and about 50 nM, preferably between about 0.5 nM and 5 nM, more preferably between about 1 nM and 2 nM
  • a hybridisation mixture containing the target polynucleotides is placed in contact with the array of probes and incubated at a temperature and for a time appropriate to permit hybridisation between the target sequences in the target polynucleotides and any complementary probes.
  • Contact can take place in any suitable container, for example, a dish or a cell designed to hold the solid support on which the probes are bound.
  • incubation will be at temperatures normally used for hybridisation of nucleic acids, for example, between about 20° C and about 75° C, for example, about 25° C, about 30° C, about 35° C, about 40° C, about 45° C, about 50° C, about 55° C, about 60° C, or about 65° C.
  • 20° C to 50° C is preferred.
  • lower temperatures are preferred.
  • a sample of target polynucleotides is incubated with the probes for a time sufficient to allow the desired level of hybridisation between the target sequences in the target polynucleotides and any complementary probes.
  • the hybridisation may be carried out at about 45° C +/-10° C in formamide for 1-2 days.
  • the probes are washed to remove any unbound nucleic acid with a hybridisation buffer, which can typically comprise a hybridisation optimising agent in the same range of concentrations as for the hybridisation step. This washing step leaves only bound target polynucleotides.
  • the probes are then examined to identify which probes have hybridised to a target polynucleotide.
  • a signal may be instrumentally detected by irradiating a fluorescent label with light and detecting fluorescence in a fluorimeter; by providing an enzyme system to produce a dye that can be detected using a spectrophotometer; by detecting a dye particle or a coloured colloidal metallic or non-metallic particle using a reflectometer; or, in the case of a radioactive label or chemiluminescent molecule, by employing a radiation counter or autoradiography.
  • a detection means may be adapted to detect or scan light associated with the label which light may include fluorescent, luminescent, focussed beam or laser light.
  • a charge-coupled device (CCD) or a photocell can be used to scan for emission of light from a probe:target polynucleotide hybrid from each location in the micro-array and record the data directly in a digital computer.
  • electronic detection of the signal may not be necessary. For example, with enzymatically generated colour spots associated with nucleic acid array format, as herein described, visual examination of the array will allow interpretation of the pattern on the array.
  • the detection means is preferably interfaced with pattern recognition software to convert the pattern of signals from the array into a plain language genetic profile.
  • the set of probes is in the form of a nucleic acid array and detection of a signal generated from a reporter molecule on the array is performed using a 'chip reader'.
  • a detection system that can be used by a 'chip reader' is described for example by Pirrung et al (U.S. Patent No. 5,143,854).
  • the chip reader will typically also incorporate some signal processing to determine whether the signal at a particular array position or feature is a true positive or may be a spurious signal. Exemplary chip readers are described for example by Fodor et al (U.S. Patent No. 5,925,525).
  • the reaction may be detected using flow cytometry.
  • the hybridisation data are then processed to determine which probes have formed hybrids.
  • a digital computer is employed to correlate specific positional labelling on the array with the presence of any of the target sequences for which the probes have specificity of interaction.
  • the positional information is directly converted to a database indicating what sequence interactions have occurred.
  • Data generated in hybridisation assays is most easily analysed with the use of a programmable digital computer.
  • the computer program product generally contains a readable medium that stores the codes. Certain files are devoted to memory that includes the location of each feature and all the target sequences known to contain the sequence of the oligonucleotide probe at that feature.
  • the programmable computer would contain specialist software code and register data derived from the entire sequence database, or containing that part of the entire sub-sequence database that is relevant to the particular probe array, and from the pattern of hybridisation will assess the probability that particular target sequences were present in the tested DNA sample.
  • the computer program product can also contain code that receives as input hybridisation data from a hybridisation reaction between a target sequence and an oligonucleotide probe.
  • the computer program product can also include code that processes the hybridisation data.
  • Data analysis can include the steps of determining, for example, the fluorescence intensity as a function of substrate position from the data collected, removing "outliers" (data deviating from a predetermined statistical distribution), and calculating the relative binding affinity of the target sequences from the remaining data.
  • the resulting data can be displayed as an image with colour in each region varying according to the light emission or binding affinity between target sequences and probes therein.
  • the amount of binding at each address is determined by examining the on-off rates of the hybridisation. For example, the amount of binding at each address is determined at several time points after the nucleic acid sample is contacted with the array. The amount of total hybridisation can be determined as a function of the kinetics of binding based on the amount of binding at each time point. Persons of skill in the art can easily determine the dependence of the hybridisation rate on temperature, sample agitation, washing conditions (e.g., pH, solvent characteristics, temperature) in order to maximise conditions for hybridisation rate and signal to noise.
  • the computer program product also can include code that receives instructions from a programmer as input.
  • the computer program product may also transform the data into a format for presentation.
  • the computer program product for processing hybridisation data comprises code that identifies for each target polynucleotide a combination of features in an oligonucleotide array whose probes facilitate specific detection of that polynucleotide; code that receives as input hybridisation data from hybridisation reactions between sample polynucleotides and the oligonucleotide probes in the array; code that processes the hybridisation data to determine whether the sample polynucleotides comprise any of the target polynucleotides by searching for hybridisation patterns that match any of the predefined combinations of target sequences; and a computer readable medium that stores the codes. It is not necessary to identify the sequence of respective oligonucleotide probes in each feature of the array.
  • the hybridisation analysis software only requires as input which combination of features in the array corresponds to a particular target polynucleotide.
  • the computer program product comprises code that receives as input the sequence of an oligonucleotide probe in each feature of an oligonucleotide array and code that receives as input a database that contains information on the presence or absence of target sequences in target polynucleotides.
  • the computer program product further comprises code that deduces the probability that the detected pattern of hybridisation indicates the presence of a target polynucleotide.
  • the database of target sequences would be regularly updated, and the part of it relevant to each particular set of probes forming each micro-array would also be updated for those using particular commercial applications of the invention.
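  • As an illustration only, the pattern-matching step described above (deciding which targets are present from the combination of array features that hybridised) might be sketched in Python as follows. The names `signatures`, `detect_targets` and `match_fraction`, and the integer feature addresses, are hypothetical and do not come from the specification; the fraction returned is a crude score, not the probability estimate the text refers to.

```python
def detect_targets(signatures, positive_features):
    """Return, sorted by name, the targets whose full feature combination hybridised.

    signatures: dict mapping a target name to the set of array features
                (hypothetical integer addresses) expected to hybridise to it.
    positive_features: iterable of features that gave a signal.
    """
    positive = set(positive_features)
    return sorted(t for t, feats in signatures.items() if set(feats) <= positive)


def match_fraction(signatures, positive_features):
    """Fraction of each target's expected features that gave a signal."""
    positive = set(positive_features)
    return {
        t: len(set(feats) & positive) / len(feats)
        for t, feats in signatures.items()
        if feats  # skip targets with no expected features
    }
```

  • Note that this sketch needs only the feature combination for each target, not the probe sequences themselves, which is consistent with the statement above that the analysis software only requires the combination of features as input.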
  • Illustrated in this example is the use of probe combinations to detect all members of a variable gene family using, as an example, the gene sequences of the potyviruses, the largest genus of the family Potyviridae.
  • the Potyviridae is the largest and one of the best-studied plant virus families, species of which cause significant losses in many crops throughout the world. At least 400 potyviruses are known, and they comprise about one quarter of all known plant viruses.
  • potyvirus genomes would, however, be detected more efficiently using micro-arrays designed by the combinatorial approach mentioned above and such arrays would be more informative as they will be more discriminating.
  • the presence of the conserved B-motif region of potyviruses described above could be detected by fewer shorter probes if two overlapping sub-groups of sequences derived from the 20-nucleotide long sequence were used (Figure 6A).
  • a micro-array of these two subgroups would therefore consist of 96 probes, namely about one third of the number of probes required by the full 20 nucleotide motif. When this array is used in a test, the presence of a potyvirus polymerase B-motif region will be indicated by hybridisation to at least one probe from each sub-group.
  • Arrays designed using the two or three sub-groups of B motif sequences would be less specific than an array consisting of probes with the complete 20-nucleotide long sequences. However, their specificity could be augmented, perhaps to an even greater level than the larger array, by including additional probes based on other regions of the potyvirus genome,
  • the hybridisation pattern in Figure 8 is shown between such an array and the cDNAs of the virus genes used in the example of the array with the complete 20 nucleotide long B-motif probe sequences (Figure 5).
  • the combinatorial array would be similarly capable of detecting any potyvirus cDNA but could also be used to distinguish between the PVY-Hung and NSW strains and between PVY-Co and BYMV. The larger array would not have those capabilities. It is difficult to estimate the specificity of combinatorial probe sets because of the complexity and biases of gene sequences, and because their specificity would depend in practice on the source of the cDNA, and hence the likely contaminants.
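  • The detection rule used in this example (a B-motif region is indicated by hybridisation to at least one probe from each sub-group) can be expressed in a few lines of Python. This is a sketch only; the function name `motif_present` and the probe identifiers are hypothetical.

```python
def motif_present(subgroups, hybridised):
    """True if at least one probe from every sub-group gave a signal.

    subgroups: iterable of collections of probe identifiers, one per sub-group.
    hybridised: iterable of probe identifiers that hybridised.
    """
    hits = set(hybridised)
    # An empty intersection is falsy, so a sub-group with no hit fails the test.
    return all(hits & set(group) for group in subgroups)
```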
  • Illustrated in this example is one embodiment of the process of the invention for identifying sequences useful for producing combinatorial probes for detecting a plurality of organisms.
  • Sequences to be used as combinatorial probes can be identified using known sequences (e.g., published in a nucleic acid sequence database) relating to target polynucleotides (e.g., a gene or group of genes or transcripts relating thereto) of a plurality of organisms of interest. Finding the "minimum set" of sub-sequences to cover likely variation in the target polynucleotides and to be used as a probe set is a "Nondeterministic Polynomial time (NP)-complete" problem, and algorithms for the identification of suitable target sequences can be based on principles discussed for example in: Garey, M.R. and Johnson, D.S. (1979).
  • a nucleic acid sequence database is searched for sequences of a selected genomic region present in the target set of organisms, which might define, for example, a plurality of "taxa".
  • the selected region may comprise sequences which are delimited by, and can be amplified in PCR using, a pair of redundant PCR primers (i.e., mixtures of primers that hybridise with all known species of the set), for example all the recorded polymerase genes of influenza (orthomyxo) viruses. These sequences are compiled for stage (2).
  • the compiled sequences are fragmented into sets of shorter overlapping nucleotide sequences or oligonucleotide sequences (oligos) that are, ideally, 8-12 nucleotides long, but may be 6 or more nucleotides long.
  • All oligos of a particular size are sorted into a primary "taxon x oligo" matrix; initially different matrices are constructed for each oligo size class. In each matrix is recorded the presence or absence of each kind of oligo in each of the taxa.
  • a "meta-taxon pair x oligo" matrix (or meta-matrix) is then constructed from each primary matrix by comparing all taxon pairs in the primary matrix and recording, for each pair, whether or not they are distinguished by each oligo.
  • the "minimum set" of oligos to distinguish the target sequences is then derived from the meta-matrix, using the standard "greedy strategy": (a) the oligo that distinguishes most taxa in the meta-matrix is identified by summing the number of hits for each oligo in the meta-matrix; (b) that oligo is then removed from the meta-matrix together with its "hitting set", namely all the pairs of taxa that it distinguishes; (c) this process is repeated until hitting sets that include all or most taxa have been found, usually 12 or more in number; (d).
  • the algorithm iteratively and progressively tests all possible sets to identify the best minimum set by swapping oligos at each iteration.
  • Other criteria can also be used to select the oligos that are likely (for physico-chemical reasons) to make the best probes, for example, those that are of similar composition and those that are not nested subsequences of one another.
  • Each working set of probes can use several minimum sets of oligos discovered in this way. At least 5 sets are usually required to ensure the accuracy of identification, especially as a single individual minimum set may not uniquely identify all taxa in the set.
  • a working set may also include oligos of more than one length class.
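  • The matrix construction and greedy selection described in this example can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicants' implementation: the function names and the toy sequences are hypothetical, a single oligo length is used, and the later refinements (swapping oligos, physico-chemical filtering, replicate minimum sets) are omitted.

```python
from itertools import combinations


def oligos_of(seq, k):
    """Return the set of all overlapping length-k subsequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def primary_matrix(taxa, k):
    """Primary "taxon x oligo" matrix: map each oligo to the taxa containing it."""
    matrix = {}
    for name, seq in taxa.items():
        for oligo in oligos_of(seq, k):
            matrix.setdefault(oligo, set()).add(name)
    return matrix


def meta_matrix(taxa, matrix):
    """Meta-matrix: map each oligo to the taxon pairs it distinguishes.

    A pair is distinguished when the oligo occurs in exactly one of the two taxa.
    """
    pairs = list(combinations(sorted(taxa), 2))
    return {
        oligo: {p for p in pairs if (p[0] in present) != (p[1] in present)}
        for oligo, present in matrix.items()
    }


def greedy_cover(meta):
    """Greedy set cover: repeatedly pick the oligo distinguishing most uncovered pairs."""
    uncovered = set().union(*meta.values())
    chosen = []
    while uncovered:
        # Iterating in sorted order makes tie-breaking deterministic (alphabetical).
        best = max(sorted(meta), key=lambda o: len(meta[o] & uncovered))
        chosen.append(best)
        uncovered -= meta[best]  # remove the oligo's "hitting set"
    return chosen


if __name__ == "__main__":
    toy_taxa = {"A": "ACGTAC", "B": "ACGTTT", "C": "GGGTAC"}  # hypothetical data
    meta = meta_matrix(toy_taxa, primary_matrix(toy_taxa, 3))
    print(greedy_cover(meta))
```

  • Because minimum set cover is NP-complete, the greedy choice gives an approximate rather than a guaranteed minimum set, which is one reason the text above combines several minimum sets into each working set.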

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biotechnology (AREA)
  • Analytical Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention relates to a set of oligonucleotide probes and to a method for detecting a plurality of different target polynucleotides. The set comprises a collection of different promiscuous probes, each capable of hybridising to a target sequence that is shared by two or more of the target polynucleotides. At least one of the target polynucleotides comprises at least one target sequence that is shared with at least one other target polynucleotide. A combination of promiscuous probes is capable of hybridising to target sequences of said at least one target polynucleotide, and this particular combination of probes provides the specificity of detection of that target polynucleotide. The invention also relates to methods of identifying a set of target sequences that are useful for designing the set of oligonucleotide probes of the invention.
PCT/AU2001/000931 2000-07-27 2001-07-27 Sondes combinatoires et utilisations associees WO2002010443A1 (fr)

Priority Applications (6)

Application Number Priority Date Filing Date Title
US10/343,107 US20050260574A1 (en) 2000-07-27 2001-01-27 Combinatorial probes and uses therefor
JP2002516359A JP2004504068A (ja) 2000-07-27 2001-07-27 コンビナトリアル・プローブ及びそのための用途
NZ523715A NZ523715A (en) 2000-07-27 2001-07-27 A set of oligonucleotide probes 'promiscuous' probes that hybridise to target sequences common to more than one of the target polynucleotides
CA002416952A CA2416952A1 (fr) 2000-07-27 2001-07-27 Sondes combinatoires et utilisations associees
AU2001276178A AU2001276178A1 (en) 2000-07-27 2001-07-27 Combinatorial probes and uses therefor
EP01953687A EP1322780A4 (fr) 2000-07-27 2001-07-27 Sondes combinatoires et utilisations associees

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
AUPQ9026 2000-07-27
AUPQ9026A AUPQ902600A0 (en) 2000-07-27 2000-07-27 Combinatorial probes and uses therefor
AUPQ9483 2000-08-17
AUPQ9483A AUPQ948300A0 (en) 2000-08-17 2000-08-17 Combinatorial probes and uses therefor
US22621200P 2000-08-18 2000-08-18
US60/226,212 2000-08-18

Publications (1)

Publication Number Publication Date
WO2002010443A1 (fr) 2002-02-07

Family

ID=27158234

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2001/000931 WO2002010443A1 (fr) 2000-07-27 2001-07-27 Sondes combinatoires et utilisations associees

Country Status (5)

Country Link
EP (1) EP1322780A4 (fr)
JP (1) JP2004504068A (fr)
CA (1) CA2416952A1 (fr)
NZ (1) NZ523715A (fr)
WO (1) WO2002010443A1 (fr)

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004079627A1 (fr) * 2003-03-07 2004-09-16 Kabushikikaisha Dynacom Procede de selection de facteur fonctionnel pour l'identification d'un gene
JP2005190427A (ja) * 2003-12-26 2005-07-14 Canon Inc 配列を同定するための変異要素のセット抽出方法
AU2010203517B2 (en) * 2009-01-09 2012-08-16 The Regents Of The University Of Michigan Recurrent gene fusions in cancer
US20130244884A1 (en) * 2010-05-13 2013-09-19 Gen9, Inc. Methods for Nucleotide Sequencing and High Fidelity Polynucleotide Synthesis
US8945556B2 (en) 2010-11-19 2015-02-03 The Regents Of The University Of Michigan RAF gene fusions
US9284609B2 (en) 2005-09-12 2016-03-15 The Brigham And Women's Hospital, Inc. Recurrent gene fusions in prostate cancer
US9303291B2 (en) 2007-07-06 2016-04-05 The Regents Of The University Of Michigan MIPOL1-ETV1 gene rearrangements
US9403141B2 (en) 2013-08-05 2016-08-02 Twist Bioscience Corporation De novo synthesized gene libraries
US9677067B2 (en) 2015-02-04 2017-06-13 Twist Bioscience Corporation Compositions and methods for synthetic gene assembly
US9895673B2 (en) 2015-12-01 2018-02-20 Twist Bioscience Corporation Functionalized surfaces and preparation thereof
US9926602B2 (en) 2009-09-17 2018-03-27 The Regents Of The University Of Michigan Recurrent gene fusions in prostate cancer
US9957569B2 (en) 2005-09-12 2018-05-01 The Regents Of The University Of Michigan Recurrent gene fusions in prostate cancer
US9981239B2 (en) 2015-04-21 2018-05-29 Twist Bioscience Corporation Devices and methods for oligonucleic acid library synthesis
US10053688B2 (en) 2016-08-22 2018-08-21 Twist Bioscience Corporation De novo synthesized nucleic acid libraries
CN109071590A (zh) * 2016-03-01 2018-12-21 方馨基因组学公司 用于分子探针的数据驱动设计、合成和应用的系统和方法
US10417457B2 (en) 2016-09-21 2019-09-17 Twist Bioscience Corporation Nucleic acid based data storage
US10669304B2 (en) 2015-02-04 2020-06-02 Twist Bioscience Corporation Methods and devices for de novo oligonucleic acid assembly
US10696965B2 (en) 2017-06-12 2020-06-30 Twist Bioscience Corporation Methods for seamless nucleic acid assembly
US10844373B2 (en) 2015-09-18 2020-11-24 Twist Bioscience Corporation Oligonucleic acid variant libraries and synthesis thereof
US10894959B2 (en) 2017-03-15 2021-01-19 Twist Bioscience Corporation Variant libraries of the immunological synapse and synthesis thereof
US10894242B2 (en) 2017-10-20 2021-01-19 Twist Bioscience Corporation Heated nanowells for polynucleotide synthesis
US10907274B2 (en) 2016-12-16 2021-02-02 Twist Bioscience Corporation Variant libraries of the immunological synapse and synthesis thereof
US10936953B2 (en) 2018-01-04 2021-03-02 Twist Bioscience Corporation DNA-based digital information storage with sidewall electrodes
US11332738B2 (en) 2019-06-21 2022-05-17 Twist Bioscience Corporation Barcode-based nucleic acid sequence assembly
US11377676B2 (en) 2017-06-12 2022-07-05 Twist Bioscience Corporation Methods for seamless nucleic acid assembly
US11407837B2 (en) 2017-09-11 2022-08-09 Twist Bioscience Corporation GPCR binding proteins and synthesis thereof
US11492665B2 (en) 2018-05-18 2022-11-08 Twist Bioscience Corporation Polynucleotides, reagents, and methods for nucleic acid hybridization
US11492727B2 (en) 2019-02-26 2022-11-08 Twist Bioscience Corporation Variant nucleic acid libraries for GLP1 receptor
US11492728B2 (en) 2019-02-26 2022-11-08 Twist Bioscience Corporation Variant nucleic acid libraries for antibody optimization
US11512347B2 (en) 2015-09-22 2022-11-29 Twist Bioscience Corporation Flexible substrates for nucleic acid synthesis
US11550939B2 (en) 2017-02-22 2023-01-10 Twist Bioscience Corporation Nucleic acid based data storage using enzymatic bioencryption

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6321318B2 (ja) * 2012-09-14 2018-05-09 日本碍子株式会社 標的核酸の検出方法

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5683881A (en) * 1995-10-20 1997-11-04 Biota Corp. Method of identifying sequence in a nucleic acid target using interactive sequencing by hybridization
WO2000040758A2 (fr) * 1999-01-06 2000-07-13 Hyseq Inc. Sequencage par hybridation, ameliore, utilisant des groupes de sondes

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5541308A (en) * 1986-11-24 1996-07-30 Gen-Probe Incorporated Nucleic acid probes for detection and/or quantitation of non-viral organisms
EP0769068B1 (fr) * 1994-06-24 2005-10-12 Innogenetics N.V. Detection, identification et differentiation simultanees de taxa d'eubacteriales a l'aide d'une technique d'hybridation
DE19616750A1 (de) * 1996-04-26 1997-11-06 Newlab Diagnostic Systems Gmbh Verfahren zum Nachweis von Mikroorganismen in Gemischen
WO1999022023A2 (fr) * 1997-10-29 1999-05-06 Mira Diagnostica Gmbh Procede de caracterisation de microorganismes
US6306643B1 (en) * 1998-08-24 2001-10-23 Affymetrix, Inc. Methods of using an array of pooled probes in genetic analysis

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
BEHR ET AL.: "A nested array of rRNA targeted probes for the detection and identification of enterococci by reverse hybridization", SYSTEM. APPL. MICROBIOL., vol. 23, 2000, pages 563 - 572, XP009048472 *
BORNEMAN ET AL.: "Probe selection algorithms with applications in the analysis of microbial communities", BIOINFORMATICS, vol. 17, no. SUPPL. 1, 2001, pages S39 - S48, XP002992531 *
HERWIG ET AL.: "Information theoretical probe selection for hybridisation experiments", BIOINFORMATICS, vol. 16, no. 10, 2000, pages 890 - 898, XP002298879 *
LIPSHUTZ ET AL.: "High density synthetic oligonucleotide arrays", NATURE GENETICS, vol. 21, no. 1 SUPPL., 1999, pages 20 - 24, XP000865982 *
See also references of EP1322780A4 *

Cited By (71)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004079627A1 (en) * 2003-03-07 2004-09-16 Kabushikikaisha Dynacom Method for selecting a functional factor for identifying a gene
JPWO2004079627A1 (en) * 2003-03-07 2006-06-08 株式会社ダイナコム Method for selecting action elements for identifying a gene
JP2005190427A (en) * 2003-12-26 2005-07-14 Canon Inc Method for extracting a set of variant elements for identifying a sequence
US8041512B2 (en) 2003-12-26 2011-10-18 Canon Kabushiki Kaisha Method of acquiring a set of specific elements for discriminating sequence
US9745635B2 (en) 2005-09-12 2017-08-29 The Regents Of The University Of Michigan Recurrent gene fusions in prostate cancer
US10190173B2 (en) 2005-09-12 2019-01-29 The Regents Of The University Of Michigan Recurrent gene fusions in prostate cancer
US9957569B2 (en) 2005-09-12 2018-05-01 The Regents Of The University Of Michigan Recurrent gene fusions in prostate cancer
US9284609B2 (en) 2005-09-12 2016-03-15 The Brigham And Women's Hospital, Inc. Recurrent gene fusions in prostate cancer
US10041123B2 (en) 2005-09-12 2018-08-07 The Regents Of The University Of Michigan Recurrent gene fusions in prostate cancer
US9719143B2 (en) 2007-07-06 2017-08-01 The Regents Of The University Of Michigan MIPOL1-ETV1 gene rearrangements
US10167517B2 (en) 2007-07-06 2019-01-01 The Regents Of The University Of Michigan MIPOL1-ETV1 gene rearrangements
US9303291B2 (en) 2007-07-06 2016-04-05 The Regents Of The University Of Michigan MIPOL1-ETV1 gene rearrangements
AU2010203517B2 (en) * 2009-01-09 2012-08-16 The Regents Of The University Of Michigan Recurrent gene fusions in cancer
US9926602B2 (en) 2009-09-17 2018-03-27 The Regents Of The University Of Michigan Recurrent gene fusions in prostate cancer
US9938582B2 (en) 2009-09-17 2018-04-10 The Regents Of The University Of Michigan Recurrent gene fusions in prostate cancer
US10240194B2 (en) * 2010-05-13 2019-03-26 Gen9, Inc. Methods for nucleotide sequencing and high fidelity polynucleotide synthesis
US20130244884A1 (en) * 2010-05-13 2013-09-19 Gen9, Inc. Methods for Nucleotide Sequencing and High Fidelity Polynucleotide Synthesis
US9567644B2 (en) 2010-11-19 2017-02-14 The Regents Of The University Of Michigan RAF gene fusions
US11015224B2 (en) 2010-11-19 2021-05-25 The Regents Of The University Of Michigan RAF gene fusions
US8945556B2 (en) 2010-11-19 2015-02-03 The Regents Of The University Of Michigan RAF gene fusions
US10618024B2 (en) 2013-08-05 2020-04-14 Twist Bioscience Corporation De novo synthesized gene libraries
US10632445B2 (en) 2013-08-05 2020-04-28 Twist Bioscience Corporation De novo synthesized gene libraries
US9889423B2 (en) 2013-08-05 2018-02-13 Twist Bioscience Corporation De novo synthesized gene libraries
US9403141B2 (en) 2013-08-05 2016-08-02 Twist Bioscience Corporation De novo synthesized gene libraries
US9839894B2 (en) 2013-08-05 2017-12-12 Twist Bioscience Corporation De novo synthesized gene libraries
US11452980B2 (en) 2013-08-05 2022-09-27 Twist Bioscience Corporation De novo synthesized gene libraries
US11559778B2 (en) 2013-08-05 2023-01-24 Twist Bioscience Corporation De novo synthesized gene libraries
US9833761B2 (en) 2013-08-05 2017-12-05 Twist Bioscience Corporation De novo synthesized gene libraries
US10639609B2 (en) 2013-08-05 2020-05-05 Twist Bioscience Corporation De novo synthesized gene libraries
US9555388B2 (en) 2013-08-05 2017-01-31 Twist Bioscience Corporation De novo synthesized gene libraries
US10272410B2 (en) 2013-08-05 2019-04-30 Twist Bioscience Corporation De novo synthesized gene libraries
US10773232B2 (en) 2013-08-05 2020-09-15 Twist Bioscience Corporation De novo synthesized gene libraries
US10384188B2 (en) 2013-08-05 2019-08-20 Twist Bioscience Corporation De novo synthesized gene libraries
US11185837B2 (en) 2013-08-05 2021-11-30 Twist Bioscience Corporation De novo synthesized gene libraries
US10583415B2 (en) 2013-08-05 2020-03-10 Twist Bioscience Corporation De novo synthesized gene libraries
US9409139B2 (en) 2013-08-05 2016-08-09 Twist Bioscience Corporation De novo synthesized gene libraries
US9677067B2 (en) 2015-02-04 2017-06-13 Twist Bioscience Corporation Compositions and methods for synthetic gene assembly
US10669304B2 (en) 2015-02-04 2020-06-02 Twist Bioscience Corporation Methods and devices for de novo oligonucleic acid assembly
US11697668B2 (en) 2015-02-04 2023-07-11 Twist Bioscience Corporation Methods and devices for de novo oligonucleic acid assembly
US11691118B2 (en) 2015-04-21 2023-07-04 Twist Bioscience Corporation Devices and methods for oligonucleic acid library synthesis
US9981239B2 (en) 2015-04-21 2018-05-29 Twist Bioscience Corporation Devices and methods for oligonucleic acid library synthesis
US10744477B2 (en) 2015-04-21 2020-08-18 Twist Bioscience Corporation Devices and methods for oligonucleic acid library synthesis
US11807956B2 (en) 2015-09-18 2023-11-07 Twist Bioscience Corporation Oligonucleic acid variant libraries and synthesis thereof
US10844373B2 (en) 2015-09-18 2020-11-24 Twist Bioscience Corporation Oligonucleic acid variant libraries and synthesis thereof
US11512347B2 (en) 2015-09-22 2022-11-29 Twist Bioscience Corporation Flexible substrates for nucleic acid synthesis
US9895673B2 (en) 2015-12-01 2018-02-20 Twist Bioscience Corporation Functionalized surfaces and preparation thereof
US10987648B2 (en) 2015-12-01 2021-04-27 Twist Bioscience Corporation Functionalized surfaces and preparation thereof
US10384189B2 (en) 2015-12-01 2019-08-20 Twist Bioscience Corporation Functionalized surfaces and preparation thereof
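CN109071590B (en) * 2016-03-01 2023-08-08 方馨基因组学公司 Systems and methods for data-driven design, synthesis and application of molecular probes
CN109071590A (en) * 2016-03-01 2018-12-21 方馨基因组学公司 Systems and methods for data-driven design, synthesis and application of molecular probes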
US10053688B2 (en) 2016-08-22 2018-08-21 Twist Bioscience Corporation De novo synthesized nucleic acid libraries
US10975372B2 (en) 2016-08-22 2021-04-13 Twist Bioscience Corporation De novo synthesized nucleic acid libraries
US11263354B2 (en) 2016-09-21 2022-03-01 Twist Bioscience Corporation Nucleic acid based data storage
US10417457B2 (en) 2016-09-21 2019-09-17 Twist Bioscience Corporation Nucleic acid based data storage
US10754994B2 (en) 2016-09-21 2020-08-25 Twist Bioscience Corporation Nucleic acid based data storage
US11562103B2 (en) 2016-09-21 2023-01-24 Twist Bioscience Corporation Nucleic acid based data storage
US10907274B2 (en) 2016-12-16 2021-02-02 Twist Bioscience Corporation Variant libraries of the immunological synapse and synthesis thereof
US11550939B2 (en) 2017-02-22 2023-01-10 Twist Bioscience Corporation Nucleic acid based data storage using enzymatic bioencryption
US10894959B2 (en) 2017-03-15 2021-01-19 Twist Bioscience Corporation Variant libraries of the immunological synapse and synthesis thereof
US11377676B2 (en) 2017-06-12 2022-07-05 Twist Bioscience Corporation Methods for seamless nucleic acid assembly
US11332740B2 (en) 2017-06-12 2022-05-17 Twist Bioscience Corporation Methods for seamless nucleic acid assembly
US10696965B2 (en) 2017-06-12 2020-06-30 Twist Bioscience Corporation Methods for seamless nucleic acid assembly
US11407837B2 (en) 2017-09-11 2022-08-09 Twist Bioscience Corporation GPCR binding proteins and synthesis thereof
US10894242B2 (en) 2017-10-20 2021-01-19 Twist Bioscience Corporation Heated nanowells for polynucleotide synthesis
US11745159B2 (en) 2017-10-20 2023-09-05 Twist Bioscience Corporation Heated nanowells for polynucleotide synthesis
US10936953B2 (en) 2018-01-04 2021-03-02 Twist Bioscience Corporation DNA-based digital information storage with sidewall electrodes
US11492665B2 (en) 2018-05-18 2022-11-08 Twist Bioscience Corporation Polynucleotides, reagents, and methods for nucleic acid hybridization
US11732294B2 (en) 2018-05-18 2023-08-22 Twist Bioscience Corporation Polynucleotides, reagents, and methods for nucleic acid hybridization
US11492727B2 (en) 2019-02-26 2022-11-08 Twist Bioscience Corporation Variant nucleic acid libraries for GLP1 receptor
US11492728B2 (en) 2019-02-26 2022-11-08 Twist Bioscience Corporation Variant nucleic acid libraries for antibody optimization
US11332738B2 (en) 2019-06-21 2022-05-17 Twist Bioscience Corporation Barcode-based nucleic acid sequence assembly

Also Published As

Publication number Publication date
NZ523715A (en) 2004-07-30
EP1322780A4 (fr) 2005-08-03
EP1322780A1 (fr) 2003-07-02
CA2416952A1 (fr) 2002-02-07
JP2004504068A (ja) 2004-02-12

Similar Documents

Publication Publication Date Title
WO2002010443A1 (en) Combinatorial probes and uses therefor
Reinartz et al. Massively parallel signature sequencing (MPSS) as a tool for in-depth quantitative gene expression profiling in all organisms
EP0799897B1 (en) Kits and methods for detecting target nucleic acids using marker nucleic acids
US7344831B2 (en) Methods for controlling cross-hybridization in analysis of nucleic acid sequences
EP1660674B1 (en) Expression profiling using microarrays
KR100961156B1 (en) Probe set, microarray, method and kit for detecting plant viruses
US7476519B2 (en) Strategies for gene expression analysis
EP1185699B1 (en) Microarray-based subtractive hybridization
US5639612A (en) Method for detecting polynucleotides with immobilized polynucleotide probes identified based on Tm
WO2001073134A2 (en) Gene profiling arrays
US20050260574A1 (en) Combinatorial probes and uses therefor
WO2008143640A1 (en) Influenza virus nucleic acid microarray and method of use
US20020058252A1 (en) Short shared nucleotide sequences
AU2001276178A1 (en) Combinatorial probes and uses therefor
WO2009098038A1 (en) Methods and systems for quality control metrics in hybridization assays
AU2007203577A1 (en) Combinatorial probes and uses therefor
Wang et al. Methods for genome-wide analysis of gene expression changes in polyploids
Bodrossy Diagnostic oligonucleotide microarrays for microbiology
Reinartz et al. Technique review
Farbrother et al. Comparison of probe preparation methods for DNA microarrays
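KR101487824B1 (en) Composition and kit for detecting an ionizing energy source, comprising a gene that responds specifically to the ionizing energy source or a fragment thereof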
US20110130299A1 (en) Plastidial microarray
Choudhury MASSIVELY PARALLEL SIGNATURE SEQUENCING
JEFFREY Construction and applications of gene microarrays on nylon
Dowd et al. Microarrays: Design and Use for Agricultural and Environmental Applications

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: The EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWE WIPO information: entry into national phase

Ref document number: 523715

Country of ref document: NZ

Ref document number: 2001276178

Country of ref document: AU

WWE WIPO information: entry into national phase

Ref document number: 2416952

Country of ref document: CA

WWE WIPO information: entry into national phase

Ref document number: 2001953687

Country of ref document: EP

WWP WIPO information: published in national office

Ref document number: 2001953687

Country of ref document: EP

WWE WIPO information: entry into national phase

Ref document number: 10343107

Country of ref document: US

WWP WIPO information: published in national office

Ref document number: 523715

Country of ref document: NZ

WWG WIPO information: grant in national office

Ref document number: 523715

Country of ref document: NZ

WWW WIPO information: withdrawn in national office

Ref document number: 2001953687

Country of ref document: EP

WWW WIPO information: withdrawn in national office

Ref document number: 2001953687

Country of ref document: EP