EP1831394A2 - Probes, libraries and kits for analysis of mixtures of nucleic acids and methods for constructing the same - Google Patents

Probes, libraries and kits for analysis of mixtures of nucleic acids and methods for constructing the same

Info

Publication number
EP1831394A2
Authority
EP
European Patent Office
Prior art keywords
library
probe
probes
sequences
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP05804988A
Other languages
German (de)
English (en)
Inventor
Niels Birger Ramsing
Peter Mouritzen
Søren Morgenthaler ECHWALD
Niels Tolstrup
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Exiqon AS
Original Assignee
Exiqon AS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Exiqon AS
Publication of EP1831394A2
Legal status: Withdrawn

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07H: SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H21/00: Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6809: Methods for determination or identification of nucleic acids involving differential detection

Definitions

  • the invention relates to nucleic acid probes, nucleic acid probe libraries, and kits for detecting, classifying, or quantifying components in a complex mixture of nucleic acids, such as a transcriptome, and methods of using the same.
  • With microarrays for profiling the expression of thousands of genes, such as GeneChip™ arrays (Affymetrix, Inc., Santa Clara, CA), correlations between expressed genes and cellular phenotypes may be identified at a fraction of the cost and labour necessary for traditional methods, such as Northern- or dot-blot analysis.
  • Microarrays permit the development of multiple parallel assays for identifying and validating biomarkers of disease and drug targets which can be used in diagnosis and treatment.
  • Gene expression profiles can also be used to estimate and predict metabolic and toxicological consequences of exposure to an agent (e.g., such as a drug, a potential toxin or carcinogen, etc.) or a condition (e.g., temperature, pH, etc).
  • Microarray experiments often yield redundant data, only a fraction of which has value for the experimenter. Additionally, because of the highly parallel format of microarray-based assays, conditions may not be optimal for individual capture probes. For these reasons, microarray experiments are most often followed up by, or sequentially replaced by, confirmatory studies using single-gene homogeneous assays. These are most often quantitative PCR-based methods such as the 5' nuclease assay or other types of dual labelled probe quantitative assays. However, these assays are still time-consuming, single-reaction assays that are hampered by high costs and time-consuming probe design procedures. Further, 5' nuclease assay probes are relatively large (e.g., 15-30 nucleotides). Thus, the limitations in homogeneous assay systems currently known create a bottleneck in the validation of microarray findings, and in focused target validation procedures.
  • the preferred method of quantifying mRNA by real-time PCR uses sequence-specific detection probes.
  • One approach for avoiding the problem of random amplification and the formation of primer-dimers, namely the use of generic detection probes that can detect a large number of different types of nucleic acid molecules while retaining some sequence specificity, has been described by Simeonov, et al. (Nucleic Acids Research 30(17): 91, 2002; U.S. Patent Publication 20020197630) and involves the use of a library of probes comprising more than 10% of all possible sequences of a given length (or lengths).
  • the library can include various non-natural nucleobases and other modifications to stabilize binding of probes/primers in the library to a target sequence.
  • the present invention solves the problems faced by contemporary approaches to homogeneous assays outlined above. This is done by providing a method for construction of generic multi-probes with sufficient sequence specificity - so that they are unlikely to detect a randomly amplified sequence fragment or primer-dimers - but are still capable of detecting many different target sequences each.
  • probes are usable in different assays and may be combined in small probe libraries (50 to 500 probes) that can be used to detect and/or quantify individual components in complex mixtures composed of thousands of different nucleic acids (e.g. detecting individual transcripts in the human transcriptome composed of >30,000 different nucleic acids) when combined with a target specific primer set.
  • Each multi-probe comprises two elements: 1) a detection element or detection moiety consisting of one or more labels to detect the binding of the probe to the target; and 2) a recognition element or recognition sequence tag ensuring the binding to the specific target(s) of interest.
  • the detection element can be any of a variety of detection principles used in homogeneous assays.
  • the detection of binding is either direct by a measurable change in the properties of one or more of the labels following binding to the target (e.g. a molecular beacon type assay with or without stem structure) or indirect by a subsequent reaction following binding (e.g. cleavage by the 5' nuclease activity of the DNA polymerase in 5' nuclease assays).
  • Each detection element may include a quencher selected from the quenchers disclosed in European patent applications 04078170 and 03759288.
  • the quencher preferably has formula I
  • R1, R4, R5 and R8 independently is/are a bond or selected from substituted or non-substituted amino group, which constitute(s) the linker(s) to the remainder of the oligonucleotide probe, and wherein the remaining R1 to R8 groups are each, independently, hydrogen or substituted or non-substituted hydroxy, amino, alkyl, aryl, arylalkyl or alkoxy
  • substitution of the amino group can be with an alkyl, alkylaryl or aryl group.
  • alkyl is used herein in the context of formula I to refer to a branched or unbranched, saturated or unsaturated, monovalent hydrocarbon radical, generally having from about 1-30 carbons and preferably, from 1-6 carbons.
  • Suitable alkyl radicals include, for example, structures containing one or more methylene, methine and/or methyne groups. Branched structures have a branching motif similar to iso-propyl, t-butyl, i-butyl, 2-ethylpropyl, etc.
  • the term encompasses "substituted alkyls" and "cyclic alkyl".
  • Substituted alkyl refers to alkyl as just described including one or more substituents such as, for example, C1-C6-alkyl, aryl, acyl, halogen (i.e. alkylhalos, e.g., CF3), hydroxy, amino, alkoxy, alkylamino, acylamino, thioamido, acyloxy, aryloxy, aryloxyalkyl, mercapto, thia, aza, oxo, both saturated and unsaturated cyclic hydrocarbons, heterocycles and the like. These groups may be attached to any carbon or substituent of the alkyl moiety. Additionally, these groups may be pendent from, or integral to, the alkyl chain.
  • alkylaryl in this context means a radical obtained by combining an alkyl and an aryl group.
  • Typical alkylaryl groups include phenethyl, ethyl phenyl and the like.
  • alkylamino in this context means amino substituted with alkyl.
  • the amino group is attached to the anthraquinone structure.
  • alkylarylamino in this context means amino substituted with alkylaryl.
  • the amino group is attached to the anthraquinone structure.
  • arylamino in this context means amino substituted with aryl.
  • the amino group is attached to the anthraquinone structure.
  • quenchers used in the invention include 1,4-bis-(3-hydroxy-propylamino)-anthraquinone, 1-(3-(4,4'-dimethoxy-trityloxy)propylamino)-4-(3-hydroxypropylamino)-anthraquinone, 1,5-bis-(3-hydroxy-propylamino)-anthraquinone, 1-(3-hydroxypropylamino)-5-(3-(4,4'-dimethoxy-trityloxy)propylamino)-anthraquinone, 1,4-bis-(4-(2-hydroxyethyl)phenylamino)-anthraquinone, 1-(4-(2-(4,4'-dimeth
  • One especially preferred quencher is compound 11 of Example 21, i.e. 1,4-Bis(2-hydroxyethylamino)-6-methylanthraquinone.
  • the recognition element also contributes to the novelty of the present invention. It comprises a short oligonucleotide moiety whose sequence has been selected to enable detection of a large subset of target nucleotides in a given complex sample mixture.
  • the novel probes designed to detect many different target molecules each are referred to as multi-probes.
  • the concept of designing a probe for multiple targets and exploiting the recurrence of a short recognition sequence by selecting the most frequently encountered sequences is novel and contrary to conventional probes, which are designed to be as specific as possible for a single target sequence.
  • the surrounding primers and the choice of probe sequence in combination subsequently ensure the specificity of the multi-probes.
  • nucleosidic bases or nucleotides are incorporated in the recognition element, possibly together with minor groove binders and other modifications, that all aim to stabilize the duplex formed between the probe and the target molecule so that the shortest possible probe sequence with the widest range of targets can be used.
  • the modifications are incorporation of LNA residues to reduce the length of the recognition element to 8 or 9 nucleotides while maintaining sufficient stability of the formed duplex to be detectable under ordinary assay conditions.
  • less than 20% of the oligonucleotide probes of said library have a guanidyl (G) residue in the 5' and/or 3' position of the recognition element, but it is preferred that less than 10% of the oligonucleotide probes have a G in the 5' end of the recognition element, such as less than 5%.
  • libraries where the recognition elements do not have a G in the 5' end.
  • the multi-probes are modified in order to increase the binding affinity of the probe for a target sequence by at least two-fold compared to a probe of the same sequence without the modification, under the same conditions for detection, e.g., such as PCR conditions, or stringent hybridization conditions.
  • the preferred modifications include, but are not limited to, inclusion of nucleobases, nucleosidic bases or nucleotides that have been modified by a chemical moiety or replaced by an analogue (e.g. including a ribose or deoxyribose analogue) or by using internucleotide linkages other than phosphodiester linkages (such as non-phosphate internucleotide linkages), all to increase the binding affinity.
  • the preferred modifications may also include attachment of duplex stabilizing agents e.g., such as minor-groove-binders (MGB) or intercalating nucleic acids (INA). Additionally the preferred modifications may also include addition of non-discriminatory bases e.g., such as 5-nitroindole, which are capable of stabilizing duplex formation regardless of the nucleobase at the opposing position on the target strand.
  • a preferred embodiment entails that all probes in the inventive library include at least one 5-nitroindole residue (and most preferred: all probes include one single 5-nitroindole residue).
  • multi-probes composed of a non-sugar-phosphate backbone, e.g.
  • the stabilizing modification(s) and the ensuing multi-probe will in the following also be referred to as "modified oligonucleotide". More preferably the binding affinity of the modified oligonucleotide is at least about 3-fold, 4-fold, 5-fold, or 20-fold higher than the binding of a probe of the same sequence but without the stabilizing modification(s).
  • the stabilizing modification(s) is inclusion of one or more LNA nucleotide analogs.
  • Probes of from 6 to 12 nucleotides according to the invention may comprise from 1 to 8 stabilizing nucleotides, such as LNA nucleotides. When at least two LNA nucleotides are included, these may be consecutive or separated by one or more non-LNA nucleotides.
  • LNA nucleotides are alpha and/or xylo LNA nucleotides.
  • the invention also provides an oligomer multi-probe library useful under conditions used in NASBA based assays.
  • NASBA is a specific, isothermal method of nucleic acid amplification suited for the amplification of RNA.
  • Nucleic acid isolation is achieved via lysis with guanidine thiocyanate plus Triton X-100 and ending with purified nucleic acid being eluted from silicon dioxide particles.
  • Amplification by NASBA involves the coordinated activities of three enzymes, AMV Reverse Transcriptase, RNase H, and T7 RNA Polymerase. Quantitative detection is achieved by way of internal calibrators, added at isolation, which are co-amplified and subsequently identified along with the wild type of RNA using electrochemiluminescence.
  • the invention also provides an oligomer multi-probe library comprising multi-probes comprising at least one with stabilizing modifications as defined above.
  • the probes are less than about 20 nucleotides in length and more preferably less than 12 nucleotides, and most preferably about 8 or 9 nucleotides.
  • the library comprises less than about 3000 probes and more preferably the library comprises less than 500 probes and most preferably about 100 probes.
  • the libraries containing labelled multi-probes may be used in a variety of applications depending on the type of detection element attached to the recognition element.
  • the multi-probes described above are designed together to complement each other as a predefined subset of all possible sequences of the given lengths selected to be able to detect/characterize/quantify the largest number of nucleic acids in a complex mixture using the smallest number of multi-probe sequences.
  • These predesigned small subsets of all possible sequences constitute a multi-probe library.
  • the multi-probe libraries described by the present invention attain this functionality at a greatly reduced complexity by deliberately selecting the most commonly occurring oligomers of a given length or lengths while attempting to diversify the selection to get the best possible coverage of the complex nucleic acid target population.
  • probes of the library hybridize with more than about 60% of a target population of nucleic acids, such as a population of human mRNAs. More preferably, the probes hybridize with greater than 70%, greater than 80%, greater than 90%, greater than 95% and even greater than 98% of all target nucleic acid molecules in a population of target molecules (see, e.g., Fig. 1).
  • a probe library (e.g. about 100 multi-probes) comprising about 0.1% of all possible sequences of the selected probe length(s) is capable of detecting, classifying, and/or quantifying more than 98% of mRNA transcripts in the transcriptome of any specific species, particularly mammals and more particularly humans (i.e., > 35,000 different mRNA sequences).
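The "about 0.1%" figure can be checked with simple arithmetic. The sketch below assumes the four natural bases and a single probe length; the function name `fraction_of_sequence_space` is illustrative and does not come from the patent.

```python
def fraction_of_sequence_space(num_probes: int, length: int) -> float:
    """Share of all possible sequences of `length` bases covered by `num_probes`.

    Assumes an alphabet of four bases (A, C, G, T), so there are 4**length
    possible sequences of the given length.
    """
    return num_probes / 4 ** length


# Sequence space for the preferred probe lengths of 8 and 9 nucleotides.
octamers = 4 ** 8   # 65,536 possible 8-mers
nonamers = 4 ** 9   # 262,144 possible 9-mers
```

Under these assumptions, 100 octamer probes correspond to roughly 0.15% of all 8-mers and 100 nonamer probes to roughly 0.04% of all 9-mers, which is consistent with the order of magnitude stated in the text.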
  • it is preferred that at least 85% of all target nucleic acids in a target population are covered by a multi-probe library of the invention.
  • the problems with existing homogeneous assays mentioned above are addressed by the use of a multi-probe library according to the invention consisting of a minimal set of short detection probes selected so as to recognize or detect a majority of all expressed genes in a given cell type from a given organism.
  • the library comprises probes that detect each transcript in a transcriptome of greater than about 10,000 genes, greater than about 15,000 genes, greater than about 20,000 genes, greater than about 25,000 genes, greater than about 30,000 genes or greater than about 35,000 genes or equivalent numbers of different mRNA transcripts.
  • the library comprises probes that detect mammalian transcript sequences, such as mouse, rat, rabbit, monkey, or human sequences.
  • the present invention overcomes the limitations discussed above for contemporary homogeneous assays.
  • the detection element of the multi-probes according to the invention may be singly or doubly labelled (e.g. by comprising a label at each end of the probe, or an internal position).
  • probes according to the invention can be adapted for use in 5' nuclease assays, molecular beacon assays, FRET assays, and other similar assays.
  • the detection multi-probe comprises two labels capable of interacting with each other to produce a signal or to modify a signal, such that a signal or a change in a signal may be detected when the probe hybridizes to a target sequence.
  • the two labels comprise a quencher and a reporter molecule.
  • the probe comprises a target-specific recognition segment capable of specifically hybridizing to a plurality of different nucleic acid molecules comprising the complementary recognition sequence.
  • a particular detection aspect of the invention referred to as a "molecular beacon with a stem region" is when the recognition segment is flanked by first and second complementary hairpin-forming sequences which may anneal to form a hairpin.
  • a reporter label is attached to the end of one complementary sequence and a quenching moiety is attached to the end of the other complementary sequence.
  • the stem formed when the first and second complementary sequences are hybridized keeps these two labels in close proximity to each other, causing a signal produced by the reporter to be quenched by fluorescence resonance energy transfer (FRET).
  • the proximity of the two labels is reduced when the probe is hybridized to a target sequence and the change in proximity produces a change in the interaction between the labels.
  • Hybridization of the probe thus results in a signal (e.g. fluorescence) being produced by the reporter molecule, which can be detected and/or quantified.
  • the multi-probe comprises a reporter and a quencher molecule at opposing ends of the short recognition sequence, so that these moieties are in sufficient proximity to each other, that the quencher substantially reduces the signal produced by the reporter molecule.
  • a particular detection aspect of the invention referred to as a "5' nuclease assay" is when the multi-probe may be susceptible to cleavage by the 5' nuclease activity of the DNA polymerase. This reaction may possibly result in separation of the quencher molecule from the reporter molecule and the production of a detectable signal.
  • probes can be used in amplification-based assays to detect and/or quantify the amplification process for a target nucleic acid.
  • each probe comprises a detection element and a recognition segment having a length of about 8-9 nucleotides, where some or all of the nucleobases in said oligonucleotides are substituted by non-natural bases having the effect of increasing binding affinity compared to natural nucleobases, and/or some or all of the nucleotide units of the oligonucleotide probe are modified with a chemical moiety to increase binding affinity, and/or where said oligonucleotides are modified with a chemical moiety to increase binding affinity, such that the probe has sufficient stability for binding to the target sequence under conditions suitable for detection, and wherein the number of different recognition segments comprises less than 10% of all possible segments of the given length, and wherein more than 90% of the probes can detect more than one complementary target in a target population of nucleic acids such that the library of oligon
  • the invention therefore relates to a library of oligonucleotide probes wherein each probe in the library consists of a recognition sequence tag and a detection moiety wherein at least one monomer in each oligonucleotide probe is a modified monomer analogue, increasing the binding affinity for the complementary target sequence relative to the corresponding unmodified oligonucleotide (which may e.g.
  • the invention further relates to a library of oligonucleotide probes wherein the recognition sequence tag segment of the probes in the library have been modified in at least one of the following ways: i) substitution with at least one non-naturally occurring nucleotide; and ii) substitution with at least one chemical moiety to increase the stability of the probe.
  • the invention relates to a library of oligonucleotide probes wherein the recognition sequence tag has a length of 6 to 12 nucleotides (i.e. 6, 7, 8, 9, 10, 11 or 12), and wherein the preferred length is 8 or 9 nucleotides.
  • the invention relates to recognition sequence tags that are substituted with LNA nucleotides.
  • oligonucleotide probe comprising a quencher of formula I and a 5-nitroindole residue. It is believed that such useful multi-probes are inventive in their own right. Preferred such probes are free from a 5' guanidyl residue, and in general such inventive probes are disclosed in the present specification and claims. Especially preferred probes are those set forth in Table 1, Table 1A, Fig. 13, or Fig. 14.
  • the invention relates to libraries of the invention where more than 90% of the oligonucleotide probes can bind and detect at least two target sequences in a nucleic acid population, preferably because the bound target sequences are complementary to the recognition sequence of the probes.
  • the probe is capable of detecting more than one target in a target population of nucleic acids, e.g., the probe is capable of hybridizing to a plurality of different nucleic acid molecules contained within the target population of nucleic acids.
  • the invention also provides a method, system and computer program embedded in a computer readable medium ("a computer program product") for designing multi-probes comprising at least one stabilizing nucleobase.
  • the method comprises querying a database of target sequences (e.g., such as a database of expressed sequences) and designing a small set of probes (e.g.
  • Probes are designed in silico, which comprise all possible combinations of nucleotides of a given length forming a database of virtual candidate probes.
  • These virtual probes are queried against the database of target sequences to identify probes that comprise the maximal ability to detect the most different target sequences in the database ("optimal probes"). Optimal probes so identified are removed from the virtual probe database. Additionally, target nucleic acids, which were identified by the previous set of optimal probes, are subtracted from the target nucleic acid database. The remaining probes are then queried against the remaining target sequences to identify a second set of optimal probes. The process is repeated until a set of probes is identified which can provide the desired coverage of the target sequence database. The set may be stored in a database as a source of sequences for transcriptome analysis. Multi-probes may be synthesized having recognition sequences, which correspond to those in the database to generate a library of multi-probes.
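The iterative procedure just described resembles a greedy set-cover heuristic. The following is a minimal sketch under stated assumptions, not the patent's implementation: sequences are plain 5' to 3' strings over A, C, G, T; a probe "detects" a target when the probe's reverse complement occurs in the target; one probe is picked per round; and the names `reverse_complement` and `greedy_probe_selection` are hypothetical.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def greedy_probe_selection(targets, length=8, desired_coverage=0.9):
    """Greedily pick probes until the desired share of targets is covered.

    Returns (selected_probes, achieved_coverage).
    """
    if not targets:
        return [], 0.0
    # Virtual candidate probes: every possible sequence of the given length.
    candidates = {"".join(p) for p in product("ACGT", repeat=length)}
    remaining = set(range(len(targets)))  # indices of targets not yet covered
    selected, covered = [], 0
    while remaining and covered / len(targets) < desired_coverage:
        best, best_hits = None, set()
        for probe in sorted(candidates):  # sorted for a deterministic result
            site = reverse_complement(probe)
            hits = {i for i in remaining if site in targets[i]}
            if len(hits) > len(best_hits):
                best, best_hits = probe, hits
        if best is None:  # no candidate detects any remaining target
            break
        selected.append(best)
        candidates.remove(best)   # remove the optimal probe from the candidates
        remaining -= best_hits    # subtract the targets it detects
        covered += len(best_hits)
    return selected, covered / len(targets)
```

This brute-force scan is only practical for toy inputs. Against a real expressed-sequence database one would presumably index the targets by k-mer first; that optimisation is outside what the text describes.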
  • the target sequence database comprises nucleic acid sequences corresponding to human mRNA (e.g., mRNA molecules, cDNAs, and the like).
  • the method further comprises calculating stability based on the assumption that the recognition sequence comprises at least one stabilizing nucleotide, such as an LNA molecule.
  • the calculated stability is used to eliminate probe recognition sequences with inadequate stability from the database of virtual candidate probes prior to the initial query against the database of target sequence to initiate the identification of optimal probe recognition sequences.
  • the method further comprises calculating the propensity for a given probe recognition sequence to form a duplex structure with itself based on the assumption that the recognition sequence comprises at least one stabilizing nucleotide, such as an LNA molecule.
  • the calculated propensity is used to eliminate probe recognition sequences that are likely to form probe duplexes from the database of virtual candidate probes prior to the initial query against the database of target sequence to initiate the determination of optimal probe recognition sequences.
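The text does not give the formula used to estimate self-duplex propensity, and a real calculation would need thermodynamic parameters for LNA-containing duplexes. As a rough illustration only, the sketch below swaps in a much cruder proxy: the length of the longest stretch of a probe whose reverse complement also occurs in the same probe. The function names and the `max_stretch` cut-off are assumptions made for this example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_self_complementary_stretch(seq: str) -> int:
    """Length of the longest substring whose reverse complement occurs in seq.

    A long stretch suggests that two copies of the probe could pair with
    each other. This is a sequence-only proxy, not a stability calculation.
    """
    best = 0
    n = len(seq)
    for i in range(n):
        for j in range(i + 1, n + 1):
            if j - i > best and reverse_complement(seq[i:j]) in seq:
                best = j - i
    return best


def filter_self_binding(candidates, max_stretch=4):
    """Keep candidates whose longest self-complementary stretch is small."""
    return [c for c in candidates
            if longest_self_complementary_stretch(c) <= max_stretch]
```

In a pipeline of the kind described, such a filter would run on the virtual candidate list before the first query against the target database.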
  • the method further comprises evaluating the general applicability of a given candidate probe recognition sequence for inclusion in the growing set of optimal probe candidates by both a query against the remaining target sequences as well as a query against the original set of target sequences.
  • only probe recognition sequences that are frequently found in both the remaining target sequences and in the original target sequences are added to the growing set of optimal probe recognition sequences. In a most preferred aspect this is accomplished by calculating the product of the scores from these queries and selecting the probe recognition sequence with the highest product that is still among the probe recognition sequences with the 20% best scores in the query against the current targets.
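One selection round under this product rule might look like the sketch below. It assumes the hit counts against the original and the remaining targets have already been computed and are passed in as dictionaries. It also reads "20% best score" as the top fifth of candidates ranked by remaining-target hits, which is one possible interpretation; a threshold relative to the maximum score would be another. The name `pick_next_probe` is hypothetical.

```python
def pick_next_probe(original_hits, remaining_hits, top_fraction=0.2):
    """Pick the candidate with the highest product of the two hit counts.

    original_hits:  dict mapping probe -> number of original targets detected
    remaining_hits: dict mapping probe -> number of remaining targets detected
    Only candidates ranked in the top `top_fraction` by remaining-target hits
    are eligible (assumed reading of "20% best score").
    """
    # Rank by remaining-target hits, breaking ties alphabetically.
    ranked = sorted(remaining_hits, key=lambda p: (-remaining_hits[p], p))
    keep = max(1, int(len(ranked) * top_fraction))
    shortlist = ranked[:keep]
    # Among the shortlist, maximise the product of the two scores.
    return max(shortlist, key=lambda p: original_hits[p] * remaining_hits[p])
```

The restriction to a shortlist keeps the procedure from choosing a probe that is common overall but adds little coverage of the targets that are still undetected.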
  • the invention also provides a computer program embedded in a computer readable medium comprising instructions for searching a database comprising a plurality of different target sequences and for identifying a set of probe recognition sequences capable of identifying at least about 60%, about 70%, about 80%, about 90% or about 95% of the sequences within the database.
  • the program provides instructions for executing the method described above.
  • the program provides instructions for implementing an algorithm as shown in Fig. 2.
  • the invention further provides a system wherein the system comprises a memory for storing a database comprising sequence information for a plurality of different target sequences and also comprises an application program for executing the program instructions for searching the database for a set of probe recognition sequences which is capable of hybridizing to at least about 60%, about 70%, about 80%, about 90% and about 95% of the sequences within the database.
  • an oligonucleotide probe comprising a detection element and a recognition segment each independently having a length of about 1 to 8 or 9 nucleotides, wherein some or all of the nucleotides in the oligonucleotides are substituted by non-natural bases or base analogues having the effect of increasing binding affinity compared to natural nucleobases and/or some or all of the nucleotide units of the oligonucleotide probe are modified with a chemical moiety or replaced by an analogue to increase binding affinity, and/or where said oligonucleotides are modified with a chemical moiety or is an oligonucleotide analogue to increase binding affinity, such that the probe has sufficient stability for binding to the target sequence under conditions suitable for detection, and wherein the probe is capable of detecting more than one complementary target in a target population of nucleic acids.
  • a preferred embodiment of the invention is a kit for the characterization or detection or quantification of target nucleic acids comprising samples of a library of multi-probes.
  • the kit comprises in silico protocols for their use.
  • the kit comprises information relating to suggestions for obtaining inexpensive DNA primers.
  • the probes contained within these kits may have any or all of the characteristics described above.
  • a plurality of probes comprises at least one stabilizing nucleotide, such as an LNA nucleotide.
  • the plurality of probes comprises a nucleotide coupled to or stably associated with at least one chemical moiety for increasing the stability of binding of the probe.
  • the kit comprises about 100 different probes.
  • kits according to the invention allow a user to quickly and efficiently develop an assay for thousands of different nucleic acid targets.
  • the invention further provides a multi-probe comprising one or more LNA nucleotides, which has a reduced length of about 8 or 9 nucleotides.
  • the present invention relates to an oligonucleotide multi-probe library comprising LNA-substituted octamers and nonamers of less than about 1000 sequences, preferably less than about 500 sequences, or more preferably less than about 200 sequences, such as consisting of about 100 different sequences selected so that the library is able to recognize more than about 90%, more preferably more than about 95% and more preferably more than about 98% of mRNA sequences of a target organism or target organ.
  • a recurring problem in designing real-time PCR detection assays for multiple genes is that the success-rate of these de-novo designs is less than 100%. Troubleshooting a non-functional assay can be cumbersome since ideally, a target specific template is needed for each probe, to test the functionality of the detection probe. Furthermore, a target specific template can be useful as a positive control if it is unknown whether the target is available in the test sample.
  • a limited number of detection probes in a probe library kit as described in the present invention e.g. 90
  • An important aspect of the present invention is the selection of optimal probe target sequences in order to target as many targets with as few probes as possible, given a target selection criteria. This may be achieved by deliberately selecting target sequences that occur more frequently than what would have been expected from a random distribution.
  • the invention therefore relates in one aspect to a method of selecting oligonucleotide sequences useful in a multi-probe library of the invention, the method comprising a) providing a first list of all possible oligonucleotides of a predefined number of nucleotides, N (typically an integer selected from 6, 7, 8, 9, 10, 11, and 12, preferably 8 or 9), said oligonucleotides having a melting temperature, Tm, of at least 50°C (preferably at least 60°C such as at least 62°C), b) providing a second list of target nucleic acid sequences (such as a list of a target nucleic acid population discussed herein), c) identifying and storing for each member of said first list, the number of members from said second list, which include a sequence complementary to said each member, d) selecting a member of said first list, which in the identification in step c matches the maximum number, identified in step c, of members from said second list, e) adding the member selected in
  • the method has a bias against including members in the third list that have a 5' guanidyl (G) and/or a bias against including members in the third list that have a 3' guanidyl (G).
  • guanidyl is avoided as the 5' residue in all oligonucleotide sequences in said third list
  • the first list only includes oligonucleotides incapable of self-hybridization in order to render a subsequent use of the probes less prone to false positives.
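One simple way to screen candidates for self-hybridization can be sketched as follows. This is a minimal illustration only: the function names and the `min_stem` threshold are hypothetical and are not taken from the program of Fig. 17, and a real screen would also account for LNA substitution and duplex thermodynamics.

```python
# Hypothetical sketch: flag oligonucleotides that share a stretch of at
# least `min_stem` bases with their own reverse complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_self_complementary(oligo: str, min_stem: int = 6) -> bool:
    """True if any window of `min_stem` bases also occurs in the reverse complement."""
    oligo = oligo.upper()
    rc = reverse_complement(oligo)
    return any(
        oligo[i:i + min_stem] in rc
        for i in range(len(oligo) - min_stem + 1)
    )


def filter_first_list(candidates, min_stem: int = 6):
    """Keep only the candidates that are not flagged as self-complementary."""
    return [c for c in candidates if not is_self_complementary(c, min_stem)]
```

The threshold is an assumption; a shorter `min_stem` rejects more candidates and therefore shrinks the first list.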
  • the selection method may include a number of steps after step f, but before step m: g) subtraction of all members from said second list which include a sequence complementary to the member selected in step d to obtain a revised second list, h) identification and storing of, for each member of said revised first list, the number of members from said revised second list, which include a sequence complementary to said each member, i) selecting a member of said first list, which in the identification in step h matches the maximum number, identified in step h, of members from said second list, or selecting a member of said first list which provides the maximum number obtained by multiplying the number identified in step h with the number identified in step c, j) addition of the member selected in step i to said third list, k) subtraction of the member selected in step i from said revised first list, and l) subtraction of all members from said revised second list which include a sequence complementary to the member selected in step i.
  • the above-mentioned avoidance of guanidyl as the 5' residue is preferably achieved by i) reducing the list of step a to include only those that do not include a 5' guanidyl residue, and/or ii) avoiding selection in step d and/or i of those sequences which include a 5' guanidyl residue, and/or iii) omitting step e and/or j for those sequences that include a 5' guanidyl residue.
  • step d after step c is conveniently preceded by identification of those members of said first list which hybridize to more than a selected percentage (60% or higher such as the preferred 80%) of the maximum number of members from said second list so that only those members so identified are subjected to the selection in step d.
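The step-wise selection described above (count targets per candidate, pick the candidate covering most targets, remove the covered targets, repeat) can be sketched as a greedy coverage loop. This is a minimal sketch under stated assumptions, not the program of Fig. 17: the function names are hypothetical, the Tm and self-hybridization filters are omitted, candidates are limited to n-mers that actually occur in the targets, and ties are broken alphabetically.

```python
# Hypothetical greedy sketch of the probe selection loop.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def probes_binding(target: str, n: int) -> set:
    """All n-mer probes whose complementary sequence occurs in the target."""
    target = target.upper()
    return {reverse_complement(target[i:i + n]) for i in range(len(target) - n + 1)}


def select_probes(targets, n=9, max_probes=100,
                  avoid_5prime_g=True, negative_list=frozenset()):
    """Return (third list of selected probes, indices of uncovered targets)."""
    # Step c: for each candidate probe, record which targets it can detect.
    hits = {}
    for idx, target in enumerate(targets):
        for probe in probes_binding(target, n):
            if set(probe) - set("ACGT"):
                continue  # skip windows with ambiguous bases
            if avoid_5prime_g and probe.startswith("G"):
                continue  # bias against a 5' G
            if probe in negative_list:
                continue  # probes that previously failed in an assay
            hits.setdefault(probe, set()).add(idx)

    remaining = set(range(len(targets)))
    library = []
    while remaining and hits and len(library) < max_probes:
        # Steps d and i: candidate covering the most remaining targets.
        best = max(sorted(hits), key=lambda p: len(hits[p] & remaining))
        covered = hits.pop(best) & remaining   # steps f and k
        if not covered:
            break
        library.append(best)                   # steps e and j
        remaining -= covered                   # steps g and l
    return library, remaining
```

For example, `select_probes(["AAAACCCC", "TTAAAACC", "GGGGTTTT"], n=4)` picks a probe shared by the first two targets before adding one for the third.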
  • the method of the invention can also include the feature that it is ensured that members are not entered on the third list if such members have previously failed qualitatively as useful probes. Or, in simpler terms, after design of a library, the individual members are tested for their usefulness, and probes which are found to behave suboptimally in a relevant assay are included in a "negative list" which is checked when later designing new probes and probe libraries.
  • this is preferably achieved by i) reducing the list of step a to include only those sequences that have not previously failed qualitatively, and/or ii) avoiding selection in step d or i of those sequences that have previously failed qualitatively, and/or iii) omitting step e or j for those sequences that have previously failed qualitatively
  • said first, second and third lists are stored in the memory of a computer system, preferably in a database.
  • the memory, also termed "computer readable medium", can be both volatile and non-volatile, i.e. any memory device conventionally used in computer systems: a random access memory (RAM), a read-only memory (ROM), a data storage device such as a hard disk, a CD-ROM, DVD-ROM, and any other known memory device.
  • the invention also provides a computer program product providing instructions for implementing the selection method, embedded in a computer-readable medium (defined as above). That is, the computer program may be compiled and loaded in an active computer memory, or it may be loaded on a non-volatile storage device (optionally in a compressed format) from where it can be executed. Consequently, the invention also includes a system comprising a database of target sequences and an application program for executing the computer program. A source code for such a computer program is set forth in Fig. 17.
  • N1: the complete length of the given nucleic acid population (e.g. 76,002,917 base pairs as in the June 30, 2003 release of RefSeq).
  • N2: the number of fragments comprising the nucleic acid population (e.g. 38,556 genes in the June 30, 2003 release of RefSeq).
  • N3: the length of the recognition sequence (e.g. 9 base pairs)
  • N4: the occurrence frequency
  • $N4 = \dfrac{N1 - ((N3 - 1) \times 2 \times N2)}{4^{N3}}$
  • a random 8-mer and 9-mer sequence would on average occur 1,151 and 287 times, respectively, in a random population of the described 38,556 mRNA sequences.
  • the 76,002,917 base pairs originating from 38,556 genes would correspond to an average transcript length of 1971 bp, each transcript containing 1971 - 16, or 1955, 9-mer target sequences.
  • 38,556/(1955/287), or about 5671, 9-mer probes would be needed for one probe to target each gene (the figure of 5671 is obtained with the unrounded occurrence frequency of about 287.57).
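The arithmetic behind these figures can be checked with a few lines of code. This is a sketch only: the function and variable names are illustrative, and the input values are the RefSeq figures quoted in the text.

```python
def expected_occurrences(n1: int, n2: int, n3: int) -> float:
    """N4 = (N1 - (N3 - 1) * 2 * N2) / 4**N3, the expected count of a random N3-mer."""
    return (n1 - (n3 - 1) * 2 * n2) / 4 ** n3


N1 = 76_002_917  # total length of the population in base pairs
N2 = 38_556      # number of sequences (genes) in the population

freq_8mer = expected_occurrences(N1, N2, 8)
freq_9mer = expected_occurrences(N1, N2, 9)

# Average transcript length and number of 9-mer sites per transcript,
# using the same end correction of 2 * (9 - 1) = 16 as in the text.
mean_length = round(N1 / N2)
sites_per_transcript = mean_length - 2 * (9 - 1)

# Probes needed if 9-mers were randomly distributed.
probes_needed = N2 / (sites_per_transcript / freq_9mer)

print(int(freq_8mer), int(freq_9mer), mean_length, sites_per_transcript, int(probes_needed))
```

The calculation reproduces the quoted values only when the divisions are grouped as 38,556 / (1955 / N4).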
  • the occurrence of 9-mer sequences is not randomly distributed. In fact, a small subset of sequences occurs at surprisingly high prevalence, up to over 30 times the prevalence anticipated from a random distribution.
  • the most common sequences should be selected to increase the coverage of a selected library of probe target sequences. As described previously, selection should be step-wise, such that the selection of the most common target sequences is evaluated both in the starting target population and in the population remaining after each selection step.
  • the targets for the probe library are the entire expressed transcriptome.
  • the above-mentioned target can further be restricted to only include the 1000 most proximal bases in each mRNA. This may result in the selection of another set of optimal probe target sequences for optimal coverage.
  • the above-mentioned target may be restricted to include only the 50 bp of coding region sequence flanking the introns of a gene, to ensure assays that preferably only monitor mRNA and not genomic DNA; or to include only regions not containing di-, tri- or tetra-repeat sequences, to avoid repetitive binding of probes or primers; or to include only regions not containing known allelic variation, to avoid primer or probe mis-annealing due to sequence variations in target sequences; or to exclude regions of extremely high GC content, to avoid inhibition of PCR amplification.
  • the optimal set of probes may vary, depending on the prevalence of target sequences in each target selection.
  • Human genomic: A set of genomic sequences can be extracted from a genome, such as the human genome, by dividing the genomic sequence into pieces of 500 nucleotides in length. Such a Probe Library can be used to measure any genomic sequence, including regulatory sequences, introns, repetitive sequences and other genomic sequences. The following library has been identified by means of the methods disclosed herein, cf. Fig. 17.
  • Bacteria: 199 bacterial and archaeal genomes can be downloaded from NCBI (ftp.ncbi.nih.gov). The genomes can be classified according to their use of nucleotides. Nucleotide use is even if every nucleotide (a, c, g, t) is used 25% of the time. Deviation from even usage can, for example, be taken as any usage that differs by more than 3%. Following this criterion, the 199 genomes divide into: 91 AT rich, 44 GC rich, 28 with no >3% skewness, 21 A rich, and 15 in other categories.
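The classification by nucleotide usage can be sketched as below. The exact rule used to assign the categories is not given in the text, so the rule here is an assumption: a nucleotide counts as over-used when its frequency exceeds 25% by more than 3 percentage points, and the category follows from the set of over-used nucleotides. The function name is hypothetical.

```python
from collections import Counter


def classify_genome(sequence: str, tolerance: float = 0.03) -> str:
    """Assign a coarse nucleotide-usage category (assumed rule, see lead-in)."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    freq = {base: counts[base] / total for base in "ACGT"}

    over = {b for b, f in freq.items() if f - 0.25 > tolerance}
    under = {b for b, f in freq.items() if 0.25 - f > tolerance}

    if not over and not under:
        return "no >3% skewness"
    if over == {"A", "T"}:
        return "AT rich"
    if over == {"C", "G"}:
        return "GC rich"
    if over == {"A"}:
        return "A rich"
    return "other"
```

Applied to each downloaded genome in turn, a function of this kind would yield a tally comparable to the one quoted above, although the counts depend on the rule chosen.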
  • Probes from a human probe library do not give a good coverage.
  • Designing probes for an AT rich organism is a challenge because of the low melting temperature. The probes must be longer to achieve the melting temperature, but this lowers the coverage.
  • a Probe library for mainly AT rich genomes is given in the following "bacteria table" (also identified by means of the program set forth in Fig. 17).
  • Another part of the invention relates to identification of a means for detection of a target nucleic acid, the method comprising A) inputting, into a computer system, data that uniquely identifies the nucleic acid sequence of said target nucleic acid, wherein said computer system comprises a database holding information of the composition of at least one library of nucleic acid probes of the invention, and wherein the computer system further comprises a database of target nucleic acid sequences for each probe of said at least one library and/or further comprises means for acquiring and comparing nucleic acid sequence data, B) identifying, in the computer system, a probe from the at least one library, wherein the sequence of the probe exists in the target nucleic acid sequence or a sequence complementary to the target nucleic acid sequence,
  • C) identifying, in the computer system, a primer that will amplify the target nucleic acid sequence
  • D) providing, as identification of the specific means for detection, an output that points out the probe identified in step B and the sequences of the primers identified in step C.
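Step B, finding library probes whose sequence occurs in the target or in its complementary strand, can be sketched as a plain substring search. The function names and the tuple layout of the result are hypothetical; primer design (step C) is not covered here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_probes(target: str, library):
    """Return (probe, strand, position) for every occurrence of a library probe.

    Strand "+" is the target as entered; strand "-" is its reverse complement,
    and positions on "-" are counted along that reverse-complement strand.
    """
    plus = target.upper()
    minus = reverse_complement(plus)
    hits = []
    for probe in library:
        probe = probe.upper()
        for strand, seq in (("+", plus), ("-", minus)):
            pos = seq.find(probe)
            while pos != -1:
                hits.append((probe, strand, pos))
                pos = seq.find(probe, pos + 1)
    return hits
```

A web interface of the kind described would present these hits to the user together with the primers chosen for each candidate probe.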
  • the above-outlined method has several advantages in the event it is desired to rapidly and specifically identify a particular nucleic acid. If the researcher already has acquired a suitable multi-probe library of the invention, the method makes it possible within seconds to acquire information relating to which of the probes in the library one should use for a subsequent assay, and of the primers one should synthesize. The time factor is important, since synthesis of a primer pair can be accomplished overnight, whereas synthesis of the probe would normally be quite time-consuming and cumbersome.
  • Step A then comprises inputting, into the computer system, data that identifies the at least one library of nucleic acids from which it is desired to select a member for use in the specific means for detection.
  • the preferred inputting interface is an internet-based web-interface, because the method is conveniently stored on a web server to allow access from users who have acquired a probe library of the present invention.
  • the method also would be useful as part of an installable computer application, which could be installed on a single computer or on a local area network.
  • the primers identified in step C are chosen so as to minimize the chance of amplifying genomic nucleic acids in a PCR reaction. This is of course only relevant where the sample is likely to contain genomic material.
  • One simple way to minimize the chance of amplification of genomic nucleic acids is to include, in at least one of the primers, a nucleotide sequence which in genomic DNA is interrupted by an intron. In this way, the primer will only prime amplification of transcripts where the intron has been spliced out.
  • primer pairs that cannot amplify genomic DNA or other transcripts.
  • Such primers can be identified by doing a computerized search with the primers against the genome and transcriptome, i.e. an in silico PCR. Such a search must find and filter primer pairs where the left and right primer can match the DNA within the distance of a typical amplicon length, which can be 600 nucleotides or several thousand nucleotides.
  • the left and right primer can match in four different ways: 1: The left primer and the reverse complement of the right primer. 2: The left primer and the reverse complement of the left primer. 3: The right primer and the reverse complement of the left primer. 4: The right primer and the reverse complement of the right primer.
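The four primer combinations can be checked with a small in silico PCR routine. This is a sketch under assumptions: it uses exact string matching on one strand of the searched sequence, whereas a practical search would tolerate mismatches and use an indexed genome. The names `in_silico_pcr` and `max_amplicon` are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_all(seq: str, sub: str):
    """Yield every start position of `sub` in `seq`."""
    pos = seq.find(sub)
    while pos != -1:
        yield pos
        pos = seq.find(sub, pos + 1)


def in_silico_pcr(dna: str, left: str, right: str, max_amplicon: int = 600):
    """Return (forward primer, reverse primer, start, end) for each predicted product."""
    dna = dna.upper()
    primers = (("left", left.upper()), ("right", right.upper()))
    products = []
    # Each primer can act as forward primer, paired with the reverse
    # complement of either primer: four combinations in total.
    for fwd_name, fwd in primers:
        for rev_name, rev in primers:
            rev_site = reverse_complement(rev)
            for start in find_all(dna, fwd):
                for rev_pos in find_all(dna, rev_site):
                    end = rev_pos + len(rev_site)
                    if rev_pos >= start + len(fwd) and end - start <= max_amplicon:
                        products.append((fwd_name, rev_name, start, end))
    return products
```

A primer pair would be rejected if this search, run against the genome and the transcriptome, reports any product other than the intended amplicon.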
  • a further optimization of the method is to choose the primers in step C so as to minimize the length of amplicons obtained from PCR performed on the target nucleic acid sequence and it is further also preferred to select the primers so as to optimize the GC content for performing a subsequent PCR.
  • the selection method for detection means can be provided to the end-user as a computer program product providing instructions for implementing the method, embedded in a computer-readable medium. Consequently, the invention also provides for a system comprising a database of nucleic acid probes of the invention and an application program for executing this computer program.
  • the method and the computer programs and system allow for quantitative or qualitative determination of the presence of a target nucleic acid in a sample, comprising i) identifying, by means of the detection means selection method of the invention, a specific means for detection of the target nucleic acid, where the specific means for detection comprises an oligonucleotide probe and a set of primers, ii) obtaining the primers and the oligonucleotide probe identified in step i), iii) subjecting the sample to a molecular amplification procedure in the presence of the primers and the oligonucleotide probe from step ii), and iv) determining the presence of the target nucleic acid based on the outcome of step iii).
  • primers obtained in step ii) are obtained by synthesis and it is preferred that the oligonucleotide probe is obtained from a library of the present invention.
  • the molecular amplification method is typically a PCR or a NASBA procedure, but any in vitro method for specific amplification (and, possibly, detection) of a nucleic acid is useful.
  • the preferred PCR procedure is a qPCR (also known as real-time reverse transcription PCR or kinetic RT-PCR).
  • Fig. 1 illustrates the use of conventional long probes in panel (A) as well as the properties and use of short multi-probes (B) from a library constructed according to the invention.
  • the short multi-probes comprise a recognition segment chosen so that each probe sequence may be used to detect and/or quantify several different target sequences comprising the complementary recognition sequence.
  • Fig. 1A shows a method according to the prior art.
  • Fig. 1B shows a method according to one aspect of the invention.
  • Fig. 2 is a flow chart showing a method for designing multi-probe sequences for a library according to one aspect of the invention.
  • the method can be implemented by executing instructions provided by a computer program embedded in a computer readable medium.
  • the program instructions are executed by a system, which comprises a database of sequences such as expressed sequences.
  • Fig. 3 is a graph illustrating the redundancy of probes targeting each gene within a 100-probe library according to one aspect of the invention.
  • the y-axis shows the number of genes in the human transcriptome that are targeted by different number of probes in the library. It is apparent that a majority of all genes are targeted by several probes. The average number of probes per gene is 17.4.
  • Fig. 4 shows the theoretical coverage of the human transcriptome by a selection of hyper-abundant oligonucleotides of a given length.
  • the graphs show the percentage of approximately 38,000 human mRNA sequences that can be detected by an increasing number of well-chosen short multi-probes of different length.
  • the graph illustrates the theoretical coverage of the human transcriptome by optimally chosen (i.e.
  • the Homo sapiens transcriptome sequence was obtained from European Bioinformatics Institute (EMBL-EBI). A region of 1000 nt proximal to the 3' end of each mRNA sequence was used for the analysis (from 50 nt to 1050 nt upstream from the 3' end). As the amplification of each sequence is by PCR, both strands of the amplified duplex were considered valid targets for multi-probes in the probe library. Probe sequences that even with LNA substitutions have inadequate Tm, as well as self-complementary probe sequences, are excluded.
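The target region used for this analysis can be extracted as sketched below, assuming each mRNA is given 5' to 3' as a plain string. The function name and the handling of transcripts shorter than 1050 nt (the window is simply truncated) are assumptions, not details from the source.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_window(mrna: str, skip: int = 50, length: int = 1000):
    """Return both strands of the region from `skip` to `skip + length` nt
    upstream of the 3' end; shorter transcripts give a truncated window."""
    end = len(mrna) - skip
    if end <= 0:
        return "", ""
    start = max(0, end - length)
    window = mrna[start:end].upper()
    # Both strands of the amplified duplex are valid probe targets.
    return window, reverse_complement(window)
```

Feeding both returned strands into the probe search treats the sense and antisense strands of each amplicon as equally valid targets.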
  • Fig. 6 shows representative real time PCR curves for 9-mer multi-probes detecting target sequences in a dual labelled probe assay. Results are from real time PCR reactions with 9 nt long LNA enhanced dual labelled probes targeting different 9-mer sequences within the same gene. Each of the three different dual labelled probes was analysed in PCRs generating the 469, the 570 or the 671 SSA4 amplicons (each between 81 and 95 nt long). Dual labelled probes 469, 570, and 671 are shown in Panel a, b, and c, respectively. Each probe only detects the amplicon it was designed to detect.
  • the Ct values were 23.7, 23.2, and 23.4 for the dual labelled probes 469, 570, and 671, respectively. 2 × 10^7 copies of the SSA4 cDNA were added as template. The high similarity between results, despite differences in both probe sequences and their individual primer pairs, indicates that the assays are very robust.
  • Fig. 7 shows examples of real time PCR curves for Molecular Beacons with a 9-mer and a 10-mer recognition site.
  • A similar experiment with a molecular beacon having a 9-mer recognition site detecting the 570 SSA4 amplicon is shown in panel (B). Signal was only obtained when SSA4 cDNA was added (2 × 10^7 copies).
  • Fig. 8 shows an example of a real time PCR curve for a SYBR-probe with a 9-mer recognition site targeting the 570 SSA4 amplicon. Signal was only obtained in the sample where SSA4 cDNA was added (2 × 10^7 copies), whereas no signal was detected without addition of template.
  • Fig. 9 shows a calibration curve for three different 9-mer multi-probes using a dual labelled probe assay principle. Detection of different copy number levels of the SSA4 cDNA by the three dual labelled probes. The threshold cycle number defines the cycle number at which signal was first detected for the respective PCR.
  • Fig. 10 shows the use of 9-mer dual labelled multi-probes to quantify a heat shock protein before and after exposure to heat shock in a wild type yeast strain as well as a mutant strain where the corresponding gene has been deleted.
  • Real time detection of SSA4 transcript levels in wild type (wt) yeast and in the SSA4 knockout mutant with the Dual-labelled-570 probe is shown.
  • the different strains were either cultured at 30 °C until harvest (- HS) or exposed to 40 °C for 30 minutes prior to harvest (+ HS).
  • the Dual-labelled-570 probe was used in this example.
  • the transcript was only detected in the wt strain, where it was most abundant in the + HS culture. Ct values were 26.1 and 30.3 for the + HS and the - HS culture, respectively.
  • Fig. 11 shows an example of how more than one gene can be detected by the same 9-mer probe while nucleic acid molecules without the probe target sequence (i.e. complementary to the recognition sequence) will not be detected.
  • Dual-labelled-469 detects both the SSA4 (469 amplicon) and the POL5 transcript with Ct values of 29.7 and 30.1, respectively. No signal was detected from the APG9 and HSP82 transcripts.
  • Dual-labelled-570 detects both the SSA4 (570 amplicon) and the APG9 transcript with Ct values of 31.3 and 29.2, respectively. No signal is detected from the POL5 and HSP82 transcripts.
  • Fig. 12 shows agarose gel electrophoresis of a fraction of the amplicons generated in the PCR reactions shown in the example of Fig. 11, demonstrating that the probes are specific for target sequences comprising the recognition sequence but do not hybridize to nucleic acid molecules which do not comprise the target sequence.
  • lane 1 contains the SSA4-469 amplicon (81 bp)
  • lane 2 contains the POL5 amplicon (94 bp)
  • lane 3 contains the APG9 amplicon (97 bp)
  • lane 4 contains the HSP82 amplicon (88 bp).
  • Lane M contains a 50 bp ladder as size indicator.
  • Fig. 13 Preferred target sequences.
  • Fig. 14 Further Preferred target sequences.
  • Fig. 15 Longmers (positive controls). The sequences are set forth in SEQ ID NOs. 32-46.
  • Fig. 16 Procedure for the selection of probes and the designing of primers for qPCR.
  • Fig. 17 Source code for the program used in the calculation of a multi-probe dataset.
  • Fig. 18 The result from performing real time PCR with a probe carrying the Q4 quencher together with the fluorescein dye.
  • Fig. 19 The result from performing real time PCR with a dual labelled probe carrying a 3'-nitroindole.
  • Fig. 20 The result from performing real time PCR with a probe having perfect match or a single mismatch relative to the amplified target sequence. As control, a PCR without addition of template was included in the experiment.
  • the present invention relates to short oligonucleotide probes or multi-probes, chosen and designed to detect, classify or characterize, and/or quantify many different target nucleic acid molecules.
  • These multi-probes comprise at least one non-natural modification (e.g. an LNA nucleotide) for increasing the binding affinity of the probes for a recognition sequence, which is a subsequence of the target nucleic acid molecules.
  • the target nucleic acid molecules are otherwise different outside of the recognition sequence.
  • the multi-probes comprise at least one nucleotide modified with a chemical moiety for increasing binding affinity of the probes for a recognition sequence, which is a subsequence of the target nucleic acid sequence.
  • the probes comprise both at least one non-natural nucleotide and at least one nucleotide modified with a chemical moiety.
  • the at least one non-natural nucleotide is modified by the chemical moiety.
  • the invention also provides kits, libraries and other compositions comprising the probes.
  • the invention further provides i) methods for choosing and designing suitable oligonucleotide probes for a given mixture of target sequences, ii) individual probes with these abilities, and iii) libraries of such probes chosen and designed to be able to detect, classify, and/or quantify the largest number of target nucleotides with the smallest number of probe sequences.
  • Each probe according to the invention is thus able to bind many different targets, but may be used to create a specific assay when combined with a set of specific primers in PCR assays.
  • Preferred oligonucleotides of the invention are comprised of about 8 to 9 nucleotide units, a substantial portion of which comprises stabilizing nucleotides, such as LNA nucleotides.
  • a preferred library contains approximately 100 of these probes chosen and designed to characterize a specific pool of nucleic acids, such as mRNA, cDNA or genomic DNA. Such a library may be used in a wide variety of applications, e.g., gene expression analyses, SNP detection, and the like. (See, e.g., Fig. 1).
  • a cell includes a plurality of cells, including mixtures thereof.
  • a nucleic acid molecule includes a plurality of nucleic acid molecules.
  • transcriptome refers to the complete collection of transcribed elements of the genome of any species.
  • In addition to mRNAs, it also represents non-coding RNAs which are used for structural and regulatory purposes.
  • amplicon refers to small, replicating DNA fragments.
  • sample refers to a sample of tissue or fluid isolated from an organism or organisms, including but not limited to, for example, skin, plasma, serum, spinal fluid, lymph fluid, synovial fluid, urine, tears, blood cells, organs, tumours, and also to samples of in vitro cell culture constituents (including but not limited to conditioned medium resulting from the growth of cells in cell culture medium, recombinant cells and cell components).
  • an "organism” refers to a living entity, including but not limited to, for example, human, mouse, rat, Drosophila (e.g. D. melanogaster), C. elegans, yeast, Arabidopsis (e.g. A. thaliana), zebra fish, primates (e.g. chimpanzees), domestic animals, etc.
  • By the term "SBC nucleobases" is meant "Selective Binding Complementary" nucleobases, i.e. modified nucleobases that can make stable hydrogen bonds to their complementary nucleobases, but are unable to make stable hydrogen bonds to other SBC nucleobases.
  • the SBC nucleobase A' can make a stable hydrogen bonded pair with its complementary unmodified nucleobase, T.
  • the SBC nucleobase T' can make a stable hydrogen bonded pair with its complementary unmodified nucleobase, A.
  • the SBC nucleobases A' and T' will form an unstable hydrogen bonded pair as compared to the basepairs A'-T and A-T'.
  • a SBC nucleobase of C is designated C' and can make a stable hydrogen bonded pair with its complementary unmodified nucleobase G
  • a SBC nucleobase of G is designated G' and can make a stable hydrogen bonded pair with its complementary unmodified nucleobase C
  • C' and G' will form an unstable hydrogen bonded pair as compared to the basepairs C'-G and C-G'.
  • a stable hydrogen bonded pair is obtained when 2 or more hydrogen bonds are formed, e.g. the pair between A' and T, A and T', C and G', and C' and G.
  • An unstable hydrogen bonded pair is obtained when 1 or no hydrogen bonds are formed, e.g. the pair between A' and T', and C' and G'.
  • SBC nucleobases are 2,6-diaminopurine (A', also called D) together with 2-thio-uracil (U', also called 2S U)(2-thio-4-oxo-pyrimidine) and 2-thio-thymine (T', also called 2S T)(2-thio-4-oxo-5-methyl-pyrimidine).
  • Fig. 4 illustrates that the pairs A-2S T and D-T have 2 or more than 2 hydrogen bonds whereas the D-2S T pair forms a single (unstable) hydrogen bond.
  • SBC nucleobases pyrrolo-[2,3-d]pyrimidine-2(3H)-one (C', also called PyrroloPyr) and hypoxanthine (G', also called I)(6-oxo-purine) are shown in Fig. 9 where the pairs PyrroloPyr-G and C-I have 2 hydrogen bonds each whereas the PyrroloPyr-I pair forms a single hydrogen bond.
  • SBC LNA oligomer is meant a “LNA oligomer” containing at least one "LNA unit” where the nucleobase is a "SBC nucleobase”.
  • LNA unit with an SBC nucleobase is meant a "SBC LNA monomer". Generally speaking, SBC LNA oligomers include oligomers that besides the SBC LNA monomer(s) contain other modified or naturally-occurring nucleotides or nucleosides.
  • SBC monomer is meant a non-LNA monomer with a SBC nucleobase.
  • isosequential oligonucleotide is meant an oligonucleotide with the same sequence in a Watson-Crick sense as the corresponding modified oligonucleotide, e.g. the sequence agTtcATg is equal to agTscD 2S Ug, where s is equal to the SBC DNA monomer 2-thio-t or 2-thio-u, D is equal to the SBC LNA monomer LNA-D, and 2S U is equal to the SBC LNA monomer LNA 2S U.
  • nucleic acid refers to primers, probes, oligomer fragments to be detected, oligomer controls and unlabelled blocking oligomers and shall be generic to polydeoxyribonucleotides (containing 2-deoxy-D-ribose), to polyribonucleotides (containing D-ribose), and to any other type of polynucleotide which is an N glycoside of a purine or pyrimidine base, or modified purine or pyrimidine bases.
  • nucleic acid refers only to the primary structure of the molecule. Thus, these terms include double- and single-stranded DNA, as well as double- and single stranded RNA.
  • the oligonucleotide is comprised of a sequence of approximately at least 3 nucleotides, preferably at least about 6 nucleotides, and more preferably at least about 8 - 30 nucleotides corresponding to a region of the designated nucleotide sequence. "Corresponding" means identical to or complementary to the designated sequence.
  • oligonucleotide is not necessarily physically derived from any existing or natural sequence but may be generated in any manner, including chemical synthesis, DNA replication, reverse transcription or a combination thereof.
  • oligonucleotide or nucleic acid intend a polynucleotide of genomic DNA or RNA, cDNA, semi synthetic, or synthetic origin which, by virtue of its origin or manipulation: (1) is not associated with all or a portion of the polynucleotide with which it is associated in nature; and/or (2) is linked to a polynucleotide other than that to which it is linked in nature; and (3) is not found in nature.
  • an end of an oligonucleotide is referred to as the "5' end" if its 5' phosphate is not linked to the 3' oxygen of a mononucleotide pentose ring and as the "3' end" if its 3' oxygen is not linked to a 5' phosphate of a subsequent mononucleotide pentose ring.
  • a nucleic acid sequence, even if internal to a larger oligonucleotide, also may be said to have 5' and 3' ends.
  • the 3' end of one oligonucleotide points toward the 5' end of the other; the former may be called the "upstream” oligonucleotide and the latter the "downstream” oligonucleotide.
  • primer may refer to more than one primer and refers to an oligonucleotide, whether occurring naturally, as in a purified restriction digest, or produced synthetically, which is capable of acting as a point of initiation of synthesis along a complementary strand when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is catalyzed.
  • Such conditions include the presence of four different deoxyribonucleoside triphosphates and a polymerization-inducing agent such as DNA polymerase or reverse transcriptase, in a suitable buffer ("buffer” includes substituents which are cofactors, or which affect pH, ionic strength, etc.), and at a suitable temperature.
  • the primer is preferably single-stranded for maximum efficiency in amplification.
  • As used herein, the terms "PCR reaction", "PCR amplification", "PCR" and "real-time PCR" are interchangeable terms used to signify use of a nucleic acid amplification system, which multiplies the target nucleic acids being detected. Examples of such systems include the polymerase chain reaction (PCR) system and the ligase chain reaction (LCR) system. Other methods recently described and known to the person of skill in the art are the nucleic acid sequence based amplification (NASBA™, Cangene, Mississauga, Ontario) and Q Beta Replicase systems. The products formed by said amplification reaction may or may not be monitored in real time or only after the reaction as an end point measurement.
  • PCR polymerase chain reaction
  • LCR ligase chain reaction
  • nucleic acid sequence refers to an oligonucleotide which, when aligned with the nucleic acid sequence such that the 5' end of one sequence is paired with the 3' end of the other, is in "antiparallel association."
  • Bases not commonly found in natural nucleic acids may be included in the nucleic acids of the present invention and include, for example, inosine and 7-deazaguanine. Complementarity may not be perfect; stable duplexes may contain mismatched base pairs or unmatched bases.
  • nucleic acid technology can determine duplex stability empirically considering a number of variables including, for example, the length of the oligonucleotide, percent concentration of cytosine and guanine bases in the oligonucleotide, ionic strength, and incidence of mismatched base pairs.
  • Tm melting temperature
  • the term "probe” refers to a labelled oligonucleotide, which forms a duplex structure with a sequence in the target nucleic acid, due to complementarity of at least one sequence in the probe with a sequence in the target region.
  • the probe, preferably, does not contain a sequence complementary to sequence(s) used to prime the polymerase chain reaction. Generally the 3' terminus of the probe will be "blocked" to prohibit incorporation of the probe into a primer extension product.
  • Blocking may be achieved by using non-complementary bases or by adding a chemical moiety such as biotin or even a phosphate group to the 3' hydroxyl of the last nucleotide, which, depending upon the selected moiety, may serve a dual purpose by also acting as a label.
  • label refers to any atom or molecule which can be used to provide a detectable (preferably quantifiable) signal, and which can be attached to a nucleic acid or protein. Labels may provide signals detectable by fluorescence, radioactivity, colorimetric, X-ray diffraction or absorption, magnetism, enzymatic activity, and the like.
  • 5'->3' nuclease activity or "5' to 3' nuclease activity" refers to that activity of a template-specific nucleic acid polymerase including either a 5'->3' exonuclease activity traditionally associated with some DNA polymerases whereby nucleotides are removed from the 5' end of an oligonucleotide in a sequential manner (i.e., E. coli DNA polymerase I has this activity whereas the Klenow fragment does not), or a 5'->3' endonuclease activity wherein cleavage occurs more than one nucleotide from the 5' end, or both.
  • thermo stable nucleic acid polymerase refers to an enzyme which is relatively stable to heat when compared, for example, to nucleotide polymerases from E. coli and which catalyzes the polymerization of nucleosides.
  • the enzyme will initiate synthesis at the 3'-end of the primer annealed to the target sequence, and will proceed in the 5'-direction along the template, and if possessing a 5' to 3' nuclease activity, hydrolyzing or displacing intervening, annealed probe to release both labelled and unlabelled probe fragments or intact probe, until synthesis terminates.
  • a representative thermo stable enzyme isolated from Thermus aquaticus (Taq) is described in U.S. Pat. No. 4,889,818 and a method for using it in conventional PCR is described in Saiki et al., (1988), Science 239:487.
  • nucleobase covers the naturally occurring nucleobases adenine (A), guanine (G), cytosine (C), thymine (T) and uracil (U) as well as non-naturally occurring nucleobases such as xanthine, diaminopurine, 8-oxo-N 6 -methyladenine, 7-deazaxanthine, 7-deazaguanine, N 4 ,N 4 -ethanocytosine, N 6 ,N 6 -ethano-2,6-diaminopurine, 5-methylcytosine, 5-(C 3 -C 6 )-alkynylcytosine, 5-fluorouracil, 5-bromouracil, pseudoisocytosine, 2-hydroxy-5-methyl-4-triazolopyridine, isocytosine, isoguanine, inosine and the "non-naturally occurring" nucleobases described in Benner et al
  • nucleobase thus includes not only the known purine and pyrimidine heterocycles, but also heterocyclic analogues and tautomers thereof. Further naturally and non naturally occurring nucleobases include those disclosed in U.S. Patent No. 3,687,808; in chapter 15 by Sanghvi, in Antisense Research and Application, Ed. S. T. Crooke and B.
  • nucleosidic base or “nucleobase analogue” is further intended to include heterocyclic compounds that can serve as nucleosidic bases including certain "universal bases” that are not nucleosidic bases in the most classical sense but serve as nucleosidic bases.
  • examples of universal bases are 3-nitropyrrole and 5-nitroindole.
  • Other preferred compounds include pyrene and pyridyloxazole derivatives, pyrenyl, pyrenylmethylglycerol derivatives and the like.
  • Other preferred universal bases include, pyrrole, diazole or triazole derivatives, including those universal bases known in the art.
  • universal base is meant a naturally-occurring or desirably a non-naturally occurring compound or moiety that can pair with a natural base (e.g., adenine, guanine, cytosine, uracil, and/or thymine), and that has a T m differential of 15, 12, 10, 8, 6, 4, or 2°C or less as described herein.
  • By “oligonucleotide,” “oligomer,” or “oligo” is meant a successive chain of monomers (e.g., glycosides of heterocyclic bases) connected via internucleoside linkages.
  • LNA unit is meant an individual LNA monomer (e.g., an LNA nucleoside or LNA nucleotide) or an oligomer (e.g., an oligonucleotide or nucleic acid) that includes at least one LNA monomer.
  • LNA units as disclosed in WO 99/14226 are in general particularly desirable modified nucleic acids for incorporation into an oligonucleotide of the invention.
  • the nucleic acids may be modified at either the 3' and/or 5' end by any type of modification known in the art. For example, either or both ends may be capped with a protecting group, attached to a flexible linking group, attached to a reactive group to aid in attachment to the substrate surface, etc.
  • Desirable LNA units and their method of synthesis also are disclosed in WO 00/47599, US 6,043,060, US 6,268,490, PCT/JP98/00945, WO 0107455, WO 0100641, WO 9839352, WO 0056746, WO 0056748, WO 0066604, Morita et al., Bioorg. Med. Chem. Lett. 12(1):73-76, 2002; Hakansson et al., Bioorg. Med. Chem. Lett. 11(7):935-938, 2001; Koshkin et al., J. Org. Chem.
  • LNA monomers also referred to as "oxy-LNA” are LNA monomers which include bicyclic compounds as disclosed in PCT Publication WO 03/020739 wherein the bridge between R 4' and R 2' as shown in formula (I) below together designate -CH 2 -O- (methyloxy LNA) or -CH 2 -CH 2 -O- (ethyloxy LNA, also designated ENA).
  • LNA monomers are designated "thio-LNA” or "amino-LNA” including bicyclic structures as disclosed in WO 99/14226, wherein the heteroatom in the bridge between R 4' and R 2' as shown in formula (I) below together designate -CH 2 -S-, -CH 2 -CH 2 -S-, -CH 2 -NH- or -CH 2 -CH 2 -NH-.
  • LNA modified oligonucleotide an oligonucleotide comprising at least one LNA monomeric unit of formula (I), described infra, having the below described illustrative examples of modifications: wherein X is selected from -O-, -S-, -N(R N )-, -C(R 6 R 6* )-, -O-C(R 7 R 7* )-, -C(R 6 R 6* )-O-, -S-C(R 7 R 7* )-, -C(R 6 R 6* )-S-, -N(R N* )-C(R 7 R 7* )-, -C(R 6 R 6* )-N(R N* )-, and -C(R 6 R 6* )-C(R 7 R 7* ).
  • B is selected from a modified base as discussed above, e.g. an optionally substituted carbocyclic aryl such as optionally substituted pyrene or optionally substituted pyrenylmethylglycerol, or an optionally substituted heteroalicyclic or optionally substituted heteroaromatic such as optionally substituted pyridyloxazole, optionally substituted pyrrole, optionally substituted diazole or optionally substituted triazole moieties; hydrogen, hydroxy, optionally substituted C 1-4 -alkoxy, optionally substituted C 1-4 -alkyl, optionally substituted C 1-4 -acyloxy, nucleobases, DNA intercalators, photochemically active groups, thermochemically active groups, chelating groups, reporter groups, and ligands.
  • P designates the radical position for an internucleoside linkage to a succeeding monomer, or a 5'-terminal group, such internucleoside linkage or 5'-terminal group optionally including the substituent R 5 .
  • One of the substituents R 2 , R 2* , R 3 , and R 3* is a group P* which designates an internucleoside linkage to a preceding monomer, or a 2'/3'-terminal group.
  • Each of the substituents R 1* , R 2 , R 2* , R 3 , R 4* , R 5 , R 5* , R 6 and R 6* , R 7 , and R 7* which are present and not involved in P, P* or the biradical(s), is independently selected from hydrogen, optionally substituted C 1-12 -alkyl, optionally substituted C 2-12 -alkenyl, optionally substituted C 2-12 -alkynyl, hydroxy, C 1-12 -alkoxy, C 2-12 -alkenyloxy, carboxy, C 1-12 -alkoxycarbonyl, C 1-12 -alkylcarbonyl, formyl, aryl, aryloxy-carbonyl, aryloxy, arylcarbonyl, heteroaryl, heteroaryloxy-carbonyl, heteroaryloxy, heteroarylcarbonyl, amino, mono- and di(C 1-6 -alkyl)amino
  • Exemplary 5', 3', and/or 2' terminal groups include -H, -OH, halo (e.g., chloro, fluoro, iodo, or bromo), optionally substituted aryl (e.g., phenyl or benzyl), alkyl (e.g., methyl or ethyl), alkoxy (e.g., methoxy), acyl (e.g., acetyl or benzoyl), aroyl, aralkyl, hydroxy, hydroxyalkyl, alkoxy, aryloxy, aralkoxy, nitro, cyano, carboxy, alkoxycarbonyl, aryloxycarbonyl, aralkoxycarbonyl, acylamino, aroylamino, alkylsulfonyl, arylsulfonyl, heteroarylsulfonyl, alkylsulfinyl, arylsulfinyl, heteroarylsulfinyl, alkylthio, arylthio, heteroarylthio, aralkylthio, heteroaralkylthio, amidino, amino, carbamoyl, sulfamoyl, alkene, alkyne, protecting groups (e.g., silyl, 4,4'-dimethoxytrityl, monomethoxytrityl, or
  • references herein to a nucleic acid unit, nucleic acid residue, LNA unit, or similar term are inclusive of both individual nucleoside units and nucleotide units and nucleoside units and nucleotide units within an oligonucleotide.
  • a "modified base” or other similar term refers to a composition (e.g., a non-naturally occurring nucleobase or nucleosidic base), which can pair with a natural base (e.g., adenine, guanine, cytosine, uracil, and/or thymine) and/or can pair with a non-naturally occurring nucleobase or nucleosidic base.
  • the modified base provides a T m differential of 15, 12, 10, 8, 6, 4, or 2°C or less as described herein. Exemplary modified bases are described in EP 1 072 679 and WO 97/12896.
  • chemical moiety refers to a part of a molecule.
  • Modified by a chemical moiety thus refers to a modification of the standard molecular structure by inclusion of an unusual chemical structure. The attachment of said structure can be covalent or non-covalent.
  • inclusion of a chemical moiety in an oligonucleotide probe thus refers to attachment of a molecular structure.
  • chemical moieties include but are not limited to covalently and/or non-covalently bound minor groove binders (MGB) and/or intercalating nucleic acids (INA) selected from a group consisting of asymmetric cyanine dyes, DAPI, SYBR Green I, SYBR Green II, SYBR Gold, PicoGreen, thiazole orange, Hoechst 33342, Ethidium Bromide, 1-O-(1-pyrenylmethyl)glycerol and Hoechst 33258.
  • Other chemical moieties include the modified nucleobases, nucleosidic bases or LNA modified oligonucleotides.
  • Dual labelled probe refers to an oligonucleotide with two attached labels. In one aspect, one label is attached to the 5' end of the probe molecule, whereas the other label is attached to the 3' end of the molecule.
  • a particular aspect of the invention contains a fluorescent molecule attached to one end and a molecule, which is attached to the other end and which is able to quench the fluorophore by Fluorescence Resonance Energy Transfer (FRET). 5' nuclease assay probes and some Molecular Beacons are examples of Dual labelled probes.
  • 5' nuclease assay probe refers to a dual labelled probe which may be hydrolyzed by the 5'-3' exonuclease activity of a DNA polymerase.
  • a 5' nuclease assay probe is not necessarily hydrolyzed by the 5'-3' exonuclease activity of a DNA polymerase under the conditions employed in the particular PCR assay.
  • the name "5' nuclease assay" is used regardless of the degree of hydrolysis observed and does not indicate any expectation on behalf of the experimenter.
  • "5' nuclease assay probe" and "5' nuclease assay” merely refer to assays where no particular care has been taken to avoid hydrolysis of the involved probe.
  • 5' nuclease assay probes are often referred to as "TaqMan assay probes", and the “5' nuclease assay” as “TaqMan assay”. These names are used interchangeably in this application.
  • oligonucleotide analogue refers to a nucleic acid binding molecule capable of recognizing a particular target nucleotide sequence.
  • a particular oligonucleotide analogue is peptide nucleic acid (PNA) in which the sugar phosphate backbone of an oligonucleotide is replaced by a protein like backbone.
  • nucleobases are attached to the uncharged polyamide backbone yielding a chimeric pseudopeptide-nucleic acid structure, which is homomorphous to nucleic acid forms.
  • Molecular Beacon refers to a single or dual labelled probe which is not likely to be affected by the 5'-3' exonuclease activity of a DNA polymerase. Special modifications to the probe, polymerase or assay conditions have been made to avoid separation of the labels or constituent nucleotides by the 5'-3' exonuclease activity of a DNA polymerase. The detection principle thus relies on a detectable difference in label elicited signal upon binding of the molecular beacon to its target sequence.
  • the oligonucleotide probe forms an intramolecular hairpin structure at the chosen assay temperature mediated by complementary sequences at the 5'- and the 3'-end of the oligonucleotide.
  • the oligonucleotide may have a fluorescent molecule attached to one end and a molecule attached to the other, which is able to quench the fluorophore when brought into close proximity of each other in the hairpin structure.
  • a hairpin structure is not formed based on complementary structure at the ends of the probe sequence; instead, the detected signal change upon binding may result from interaction between one or both of the labels with the formed duplex structure, from a general change of spatial conformation of the probe upon binding, or from a reduced interaction between the labels after binding.
  • a particular aspect of the molecular beacon contains a number of LNA residues to inhibit hydrolysis by the 5'-3' exonuclease activity of a DNA polymerase.
  • multi-probe refers to a probe which comprises a recognition segment which is a probe sequence sufficiently complementary to a recognition sequence in a target nucleic acid molecule to bind to the sequence under moderately stringent conditions and/or under conditions suitable for PCR, 5' nuclease assay and/or Molecular Beacon analysis (or generally any FRET-based method). Such conditions are well known to those of skill in the art.
  • the recognition sequence is found in a plurality of sequences being evaluated, e.g., such as a transcriptome.
  • a multi-probe according to the invention may comprise a non-natural nucleotide ("a stabilizing nucleotide”) and may have a higher binding affinity for the recognition sequence than a probe comprising an identical sequence but without the stabilizing modification.
  • at least one nucleotide of a multi-probe is modified by a chemical moiety (e.g., covalently or otherwise stably associated with the probe during at least hybridization stages of a PCR reaction) for increasing the binding affinity of the recognition segment for the recognition sequence.
  • a multi-probe with an increased "binding affinity" for a recognition sequence than a probe which comprises the same sequence but which does not comprise a stabilizing nucleotide refers to a probe for which the association constant (K a ) of the probe recognition segment is higher than the association constant of the complementary strands of a double- stranded molecule.
  • the association constant of the probe recognition segment is higher than the dissociation constant (K d ) of the complementary strand of the recognition sequence in the target sequence in a double stranded molecule.
  • a “multi-probe library” or “library of multi-probes” comprises a plurality of multi-probes, such that the sum of the probes in the library is able to recognise a major proportion of a transcriptome, including the most abundant sequences, such that about 60%, about 70%, about 80%, about 85%, more preferably about 90%, and still more preferably 95% of the target nucleic acids in the transcriptome are detected by the probes.
  • Monomers are referred to as being "complementary” if they contain nucleobases that can form hydrogen bonds according to Watson-Crick base-pairing rules (e.g. G with C, A with T or A with U) or other hydrogen bonding motifs such as for example diaminopurine with T, inosine with C, pseudoisocytosine with G, etc.
  • the term "succeeding monomer” relates to the neighbouring monomer in the 5'-terminal direction and the "preceding monomer” relates to the neighbouring monomer in the 3'-terminal direction.
  • target population refers to a plurality of different sequences of nucleic acids, for example the genome or other nucleic acids from a particular species including the transcriptome of the genome, wherein the transcriptome refers to the complete collection of transcribed elements of the genome of any species.
  • the number of different target sequences in a nucleic acid population is at least 100, but as will be clear the number is often much higher (more than 200, 500, 1000, and 10000 - in the case where the target population is a eukaryotic transcriptome).
  • target nucleic acid refers to any relevant nucleic acid of a single specific sequence, e.g., a biological nucleic acid, e.g., derived from a patient, an animal (a human or non-human animal), a plant, a bacterium, a fungus, an archaeon, a cell, a tissue, an organism, etc.
  • the method optionally further comprises selecting the bacteria, archae, plant, non-human animal, cell, fungi, or non-human organism based upon detection of the target nucleic acid.
  • the target nucleic acid is derived from a patient, e.g., a human patient.
  • the invention optionally further includes selecting a treatment, diagnosing a disease, or diagnosing a genetic predisposition to a disease, based upon detection of the target nucleic acid.
  • target sequence refers to a specific nucleic acid sequence within any target nucleic acid.
  • stringent conditions is the “stringency” which occurs within a range from about T m -5°C (5°C below the melting temperature (T m ) of the probe) to about 20°C to 25°C below T m .
  • the stringency of hybridization may be altered in order to identify or detect identical or related polynucleotide sequences.
  • Hybridization techniques are generally described in Nucleic Acid Hybridization, A Practical Approach, Ed. Hames, B. D. and Higgins, S. J., IRL Press, 1985; Gall and Pardue, Proc. Natl. Acad. Sci., USA 63: 378-383, 1969; and John, et al. Nature 223: 582-587, 1969.
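The stringency range defined above is simple arithmetic on T m and can be sketched as follows. The function name `stringency_window` and its `lower_offset` parameter are hypothetical; the only values taken from the text are the 5 °C upper offset and the 20 to 25 °C lower offset.

```python
def stringency_window(tm_celsius: float, lower_offset: float = 25.0) -> tuple[float, float]:
    """Return (low, high) hybridization temperatures in degrees Celsius.

    high is T m - 5; low is T m minus lower_offset, where the text allows
    an offset of about 20 to 25 degrees.
    """
    if not 20.0 <= lower_offset <= 25.0:
        raise ValueError("lower_offset is expected to lie between 20 and 25")
    return (tm_celsius - lower_offset, tm_celsius - 5.0)
```

For a probe with a T m of 65 °C, this gives a window of 40 to 60 °C with the default offset.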
  • a multi-probe according to the invention is preferably a short sequence probe which binds to a recognition sequence found in a plurality of different target nucleic acids, such that the multi-probe specifically hybridizes to the target nucleic acid but does not hybridize to any detectable level to nucleic acid molecules which do not comprise the recognition sequence.
  • a collection of multi-probes, or multi-probe library, is able to recognize a major proportion of a transcriptome, including the most abundant sequences, such that about 60%, about 70%, about 80%, about 85%, more preferably about 90%, and still more preferably 95% of the target nucleic acids in the transcriptome are detected by the probes.
  • a multi-probe according to the invention comprises a "stabilizing modification” e.g. such as a non-natural nucleotide (“a stabilizing nucleotide”) and has higher binding affinity for the recognition sequence than a probe comprising an identical sequence but without the stabilizing sequence.
  • at least one nucleotide of a multi-probe is modified by a chemical moiety (e.g., covalently or otherwise stably associated with the probe during at least hybridization stages of a PCR reaction) for increasing the binding affinity of the recognition segment for the recognition sequence.
  • a multi-probe of from 6 to 12 nucleotides comprises from 1 to 6 or even up to 12 stabilizing nucleotides, such as LNA nucleotides.
  • An LNA enhanced probe library contains short probes that recognize a short recognition sequence (e.g., 8-9 nucleotides).
  • LNA nucleobases can comprise α-LNA molecules (see, e.g., WO 00/66604) or xylo-LNA molecules (see, e.g., WO 00/56748).
  • the T m of the multi-probe when bound to its recognition sequence is between about 55°C and about 70°C.
  • the multi-probes comprise one or more modified nucleobases.
  • Modified base units may comprise a cyclic unit (e.g. a carbocyclic unit such as pyrenyl) that is joined to a nucleic unit, such as a 1'-position of a furanosyl ring, through a linker, such as a straight or branched chain alkylene or alkenylene group.
  • Alkylene groups suitably have from 1 (i.e., -CH 2 -) to about 12 carbon atoms, more typically 1 to about 8 carbon atoms, still more typically 1 to about 6 carbon atoms.
  • Alkenylene groups suitably have one, two or three carbon-carbon double bonds and from 2 to about 12 carbon atoms, more typically 2 to about 8 carbon atoms, still more typically 2 to about 6 carbon atoms.
  • Multi-probes according to the invention are ideal for performing such assays as real-time PCR as the probes according to the invention are preferably less than about 25 nucleotides, less than about 15 nucleotides, less than about 10 nucleotides, e.g., 8 or 9 nucleotides.
  • a multi-probe can specifically hybridize with a recognition sequence within a target sequence under PCR conditions and preferably the recognition sequence is found in at least about 50, at least about 100, at least about 200, at least about 500 different target nucleic acid molecules.
  • a library of multi-probes according to the invention will comprise multi-probes, which comprise non-identical recognition sequences, such that any two multi-probes hybridize to different sets of target nucleic acid molecules.
  • the sets of target nucleic acid molecules comprise some identical target nucleic acid molecules, i.e., a target nucleic acid molecule comprising a gene sequence of interest may be bound by more than one multi-probe.
  • Such a target nucleic acid molecule will contain at least two different recognition sequences which may overlap by one or more, but less than x nucleotides of a recognition sequence comprising x nucleotides.
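The overlap condition above (two recognition sequences of x nucleotides sharing at least one but fewer than x positions) can be checked with a short sketch. The names `find_sites` and `overlap_length` and the example sequences are hypothetical; the sketch treats the recognition sequence as a plain substring of the target strand and does not model hybridization.

```python
def find_sites(target: str, recognition: str) -> list[int]:
    """Return every 0-based start index of recognition in target,
    including hits that overlap one another."""
    hits = []
    start = target.find(recognition)
    while start != -1:
        hits.append(start)
        start = target.find(recognition, start + 1)
    return hits


def overlap_length(start_a: int, len_a: int, start_b: int, len_b: int) -> int:
    """Number of target positions shared by two sites (0 if disjoint)."""
    return max(0, min(start_a + len_a, start_b + len_b) - max(start_a, start_b))


# Made-up target carrying two different 7-mer recognition sequences.
target = "AACTGGAAGGAC"
site_a = find_sites(target, "CTGGAAG")[0]   # starts at index 2
site_b = find_sites(target, "GGAAGGA")[0]   # starts at index 4
shared = overlap_length(site_a, 7, site_b, 7)  # 5 positions, fewer than x = 7
```

In this example the two sites share 5 of their 7 positions, so the same target molecule would be bound by two different multi-probes.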
  • a multi-probe library comprises a plurality of different multi-probes, each different probe localized at a discrete location on a solid substrate.
  • localize refers to being limited or addressed at the location such that a hybridization event detected at the location can be traced to a probe of known sequence identity.
  • a localized probe may or may not be stably associated with the substrate.
  • the probe could be in solution in the well of a microtiter plate and thus localized or addressed to the well.
  • the probe could be stably associated with the substrate such that it remains at a defined location on the substrate after one or more washes of the substrate with a buffer.
  • the probe may be chemically associated with the substrate, either directly or through a linker molecule, which may be a nucleic acid sequence, a peptide or other type of molecule, which has an affinity for molecules on the substrate.
  • the target nucleic acid molecules may be localized on a substrate (e.g., as a cell or cell lysate or nucleic acids dotted onto the substrate).
  • multi-LNA probes are preferably chemically synthesized using commercially available methods and equipment as described in the art (Tetrahedron 54: 3607-30, 1998).
  • the solid phase phosphoramidite method can be used to produce short LNA probes (Caruthers, et al., Cold Spring Harbor Symp. Quant. Biol. 47:411-418, 1982; Adams, et al., J. Am. Chem. Soc. 105: 661, 1983).
  • the determination of the extent of hybridization of multi-probes from a multi-probe library to one or more target sequences may be carried out by any of the methods well known in the art. If there is no detectable hybridization, the extent of hybridization is thus 0.
  • labelled signal nucleic acids are used to detect hybridization.
  • Complementary nucleic acids or signal nucleic acids may be labelled by any one of several methods typically used to detect the presence of hybridized polynucleotides. The most common method of detection is the use of ligands, which bind to labelled antibodies, fluorophores or chemiluminescent agents. Other labels include antibodies, which can serve as specific binding pair members for a labelled ligand. The choice of label depends on sensitivity required, ease of conjugation with the probe, stability requirements, and available instrumentation.
  • LNA-containing-probes are typically labelled during synthesis.
  • the flexibility of the phosphoramidite synthesis approach furthermore facilitates the easy production of LNAs carrying all commercially available linkers, fluorophores and labelling-molecules available for this standard chemistry.
  • LNA may also be labelled by enzymatic reactions e.g. by kinasing.
  • Multi-probes can comprise single labels or a plurality of labels.
  • the plurality of labels comprises a pair of labels which interact with each other either to produce a signal or to produce a change in a signal when hybridization of the multi-probe to a target sequence occurs.
  • the multi-probe comprises a fluorophore moiety and a quencher moiety, positioned in such a way that the hybridized state of the probe can be distinguished from the unhybridized state of the probe by an increase in the fluorescent signal from the nucleotide.
  • the multi-probe comprises, in addition to the recognition element, first and second complementary sequences, which specifically hybridize to each other, when the probe is not hybridized to a recognition sequence in a target molecule, bringing the quencher molecule in sufficient proximity to said reporter molecule to quench fluorescence of the reporter molecule. Hybridization of the target molecule distances the quencher from the reporter molecule and results in a signal, which is proportional to the amount of hybridization.
  • polymerization of strands of nucleic acids can be detected using a polymerase with 5' nuclease activity.
  • Fluorophore and quencher molecules are incorporated into the probe in sufficient proximity such that the quencher quenches the signal of the fluorophore molecule when the probe is hybridized to its recognition sequence.
  • Cleavage of the probe by the polymerase with 5' nuclease activity results in separation of the quencher and fluorophore molecule, and the presence in increasing amounts of signal as nucleic acid sequences
  • reporter means a reporter group, which is detectable either by itself or as a part of a detection series.
  • functional parts of reporter groups are biotin, digoxigenin, fluorescent groups (groups which are able to absorb electromagnetic radiation, e.g.
  • illustrative examples are DANSYL (5-dimethylamino-1-naphthalenesulfonyl), DOXYL (N-oxyl-4,4-dimethyloxazolidine), PROXYL (N-oxyl-2,2,5,5-tetramethylpyrrolidine), TEMPO (N-oxyl-2,2,6,6-tetramethylpiperidine), dinitrophenyl, acridines, coumarins, Cy3 and Cy5 (trademarks for Biological Detection Systems, Inc.), erythrosine, coumaric acid, umbelliferone, Texas red, rhodamine, tetramethyl rhodamine, Rox, 7-nitrobenzo-2-oxa-1-diazole (NBD), pyrene, fluorescein, Europ
  • substituted organic nitroxides or other paramagnetic probes (e.g. Cu 2+ , Mg 2+ ) bound to a biological molecule being detectable by the use of electron spin resonance spectroscopy).
  • Suitable samples of target nucleic acid molecule may comprise a wide range of eukaryotic and prokaryotic cells, including protoplasts; or other biological materials, which may harbour target nucleic acids.
  • the methods are thus applicable to tissue culture animal cells, animal cells (e.g., blood, serum, plasma, reticulocytes, lymphocytes, urine, bone marrow tissue, cerebrospinal fluid or any product prepared from blood or lymph) or any type of tissue biopsy (e.g.
  • a muscle biopsy, a liver biopsy, a kidney biopsy, a bladder biopsy, a bone biopsy, a cartilage biopsy, a skin biopsy, a pancreas biopsy, a biopsy of the intestinal tract, a thymus biopsy, a mammae biopsy, a uterus biopsy, a testicular biopsy, an eye biopsy or a brain biopsy, e.g., homogenized in lysis buffer), archival tissue nucleic acids, plant cells or other cells sensitive to osmotic shock and cells of bacteria, yeasts, viruses, mycoplasmas, protozoa, rickettsia, fungi and other small microbial cells and the like.
  • Target nucleic acids which are recognized by a plurality of multi-probes can be assayed to detect sequences which are present in less than 10% in a population of target nucleic acid molecules, less than about 5%, less than about 1%, less than about 0.1%, and less than about 0.01% (e.g., such as specific gene sequences).
  • the type of assay used to detect such sequences is a non-limiting feature of the invention and may comprise PCR or some other suitable assay as is known in the art or developed to detect recognition sequences which are found in less than 10% of a population of target nucleic acid molecules.
  • the assay to detect the less abundant recognition sequences comprises hybridizing at least one primer capable of specifically hybridizing to the recognition sequence but substantially incapable of hybridizing to more than about 50, more than about 25, more than about 10, more than about 5, more than about 2 target nucleic acid molecules (e.g., the probe recognizes both copies of a homozygous gene sequence), or more than one target nucleic acid in a population (e.g., such as an allele of a single copy heterozygous gene sequence present in a sample).
  • a pair of such primers is provided, and the primers flank the recognition sequence identified by the multi-probe, i.e., are within an amplifiable distance of the recognition sequence such that amplicons of about 40-5000 bases can be produced, and preferably 50-500 or more preferably 60-100 base amplicons are produced.
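The amplicon size ranges given above can be turned into a simple check on candidate primer positions. The function name `amplicon_class`, its coordinate convention (0-based amplicon start, end-exclusive amplicon end on the target) and the returned labels are assumptions made for this illustration; only the three size ranges come from the text.

```python
def amplicon_class(forward_start: int, reverse_end: int,
                   site_start: int, site_length: int) -> str:
    """Classify an amplicon by length, given that the primers flank the
    recognition site. Coordinates are positions on the target sequence."""
    if not (forward_start <= site_start and site_start + site_length <= reverse_end):
        raise ValueError("primers do not flank the recognition sequence")
    length = reverse_end - forward_start
    if 60 <= length <= 100:
        return "more preferred"
    if 50 <= length <= 500:
        return "preferred"
    if 40 <= length <= 5000:
        return "acceptable"
    return "outside range"
```

For example, primers spanning positions 100 to 180 around a 9-nucleotide site at position 130 give an 80-base amplicon, which falls in the most preferred range.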
  • One or more of the primers may be labelled.
  • amplifying reactions are well known to one of ordinary skill in the art and include, but are not limited to PCR, RT-PCR, LCR, in vitro transcription, rolling circle PCR, OLA and the like. Multiple primers can also be used in multiplex PCR for detecting a set of specific target molecules.
  • the invention further provides a method for designing multi-probes sequences for use in methods and kits according to the invention.
  • a flow chart outlining the steps of the method is shown in Fig. 2.
  • a plurality of n-mers of n nucleotides is generated in silico, containing all possible n-mers.
  • a subset of n-mers is selected which have a T m ≥ 60°C.
  • a subset of these probes is selected which do not self-hybridize to provide a list or database of candidate n-mers.
  • the sequence of each n-mer is used to query a database comprising a plurality of target sequences.
  • the target sequence database comprises expressed sequences, such as human mRNA sequences.
  • n-mers are selected that identify a maximum number of target sequences (e.g., n-mers which comprise recognition segments which are complementary to subsequences of a maximal number of target sequences in the target database) to generate an n-mer/target sequence matrix. Sequences of n-mers, which bind to a maximum number of target sequences, are stored in a database of optimal probe sequences and these are subtracted from the candidate n-mer database. Target sequences that are identified by the first set of optimal probes are removed from the target sequence database.
  • the process is then repeated for the remaining candidate probes until a set of multi-probes is identified comprising n-mers which cover more than about 60%, more than about 80%, more than about 90% or more than about 95% of target sequences.
  • the optimal sequences identified at each step may be used to generate a database of virtual multi-probes sequences.
  • Multi-probes may then be synthesized which comprise sequences from the multi-probe database.
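The iterative selection described in the steps above resembles a greedy coverage procedure, which can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names `all_nmers` and `greedy_probe_set` are hypothetical, the T m and self-hybridization filters are omitted, and a candidate is counted as "identifying" a target when it occurs as a plain substring of that target.

```python
from itertools import product


def all_nmers(n: int) -> list[str]:
    """Generate every possible n-mer over the alphabet A, C, G, T."""
    return ["".join(p) for p in product("ACGT", repeat=n)]


def greedy_probe_set(candidates: list[str], targets: list[str],
                     coverage_goal: float = 0.95) -> tuple[list[str], float]:
    """Repeatedly pick the candidate that hits the most uncovered targets,
    remove it from the pool and remove the targets it covers, until the
    coverage goal is met or no candidate hits anything further."""
    if not targets:
        raise ValueError("target database is empty")
    total = len(targets)
    remaining = set(range(total))   # indices of targets not yet covered
    pool = list(candidates)
    chosen = []
    while pool and (total - len(remaining)) / total < coverage_goal:
        best, best_hits = None, set()
        for nmer in pool:
            hits = {i for i in remaining if nmer in targets[i]}
            if len(hits) > len(best_hits):
                best, best_hits = nmer, hits
        if best is None:            # nothing left covers a new target
            break
        chosen.append(best)
        pool.remove(best)
        remaining -= best_hits
    return chosen, 1 - len(remaining) / total
```

With `all_nmers(8)` or `all_nmers(9)` as the candidate pool and a transcript database as targets, the exhaustive inner loop becomes slow; an index from n-mer to target identifiers would be the usual optimisation, but it is left out to keep the sketch short.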
  • the method further comprises evaluating the general applicability of a given candidate probe recognition sequence for inclusion in the growing set of optimal probe candidates by both a query against the remaining target sequences as well as a query against the original set of target sequences.
  • only probe recognition sequences that are frequently found in both the remaining target sequences and in the original target sequences are added to the growing set of optimal probe recognition sequences. In a most preferred aspect this is accomplished by calculating the product of the scores from these queries and selecting the probe recognition sequence with the highest product that is still among the 20% of probe recognition sequences with the best score in the query against the current targets.
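The selection rule described above can be sketched as a greedy loop. This is a minimal sketch under assumptions: `hits` is a hypothetical mapping from each candidate recognition sequence to the set of original target identifiers that contain it, and ties are broken alphabetically, which the text does not specify.

```python
def select_probes(hits, max_probes=100, top_fraction=0.2):
    """Greedily pick probe recognition sequences.

    hits: dict mapping n-mer -> set of target ids in the ORIGINAL target set.
    At each round the score against the remaining targets is multiplied by
    the score against the original targets, and the best product is chosen
    among the top `top_fraction` of candidates ranked by current score.
    """
    remaining = set().union(*hits.values())  # targets not yet covered
    candidates = set(hits)
    chosen = []
    while candidates and remaining and len(chosen) < max_probes:
        # Score of each candidate against the targets still uncovered.
        current = {p: len(hits[p] & remaining) for p in candidates}
        current = {p: c for p, c in current.items() if c > 0}
        if not current:
            break
        # Restrict to the best-scoring fraction for the current targets.
        ranked = sorted(current, key=lambda p: (-current[p], p))
        shortlist = ranked[:max(1, int(len(ranked) * top_fraction))]
        # Highest product of current score and original score wins.
        best = min(shortlist, key=lambda p: (-current[p] * len(hits[p]), p))
        chosen.append(best)
        candidates.discard(best)
        remaining -= hits[best]
    return chosen
```

For example, `select_probes({"AAA": {1, 2, 3}, "CCC": {3, 4}, "GGG": {5}})` picks the three sequences in order of decreasing usefulness until every target is covered.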
  • the invention also provides computer program products for facilitating the method described above (see, e.g., Fig. 2).
  • the computer program product comprises program instructions, which can be executed by a computer or a user device connectable to a network in communication with a memory.
  • the invention further provides a system comprising a computer memory comprising a database of target sequences and an application system for executing instructions provided by the computer program product.
  • a preferred embodiment of the invention is a kit for the characterisation or detection or quantification of target nucleic acids comprising samples of a library of multi-probes.
  • the kit comprises in silico protocols for their use.
  • the kit comprises information relating to suggestions for obtaining inexpensive DNA primers.
  • the probes contained within these kits may have any or all of the characteristics described above.
  • a plurality of probes comprises at least one stabilizing nucleobase, such as an LNA nucleobase.
  • the plurality of probes comprises a nucleotide coupled or stably associated with at least one chemical moiety for increasing the stability of binding of the probe.
  • the kit comprises a number of different probes for covering at least 60% of a population of different target sequences such as a transcriptome.
  • the transcriptome is a human transcriptome.
  • the kit comprises at least one probe labelled with one or more labels.
  • one or more probes comprise labels capable of interacting with each other in a FRET-based assay, i.e., the probes may be designed to perform in 5' nuclease or Molecular Beacon -based assays.
  • kits according to the invention allow a user to quickly and efficiently develop assays for many different nucleic acid targets.
  • the kit may additionally comprise one or more reagents for performing an amplification reaction, such as PCR.
  • probe reference numbers designate the LNA-oligonucleotide sequences shown in the synthesis examples below.
  • The human transcriptome mRNA sequences were obtained from ENSEMBL.
  • ENSEMBL is a joint project between EMBL - EBI and the Sanger Institute to develop a software system which produces and maintains automatic annotation on eukaryotic genomes (see, e.g., Butler, Nature 406 (6794): 333, 2000).
  • ENSEMBL is primarily funded by the Wellcome Trust. It is noted that sequence data can be obtained from any type of database comprising expressed sequences, however, ENSEMBL is particularly attractive because it presents up-to-date sequence data and the best possible annotation for metazoan genomes.
  • the file "Homo_sapiens.cdna.fa" was downloaded from the ENSEMBL ftp site: ftp://ftp.ensembl.org/pub/current human/data/ on May 14, 2003.
  • the file contains all ENSEMBL transcript predictions (i.e., 37347 different sequences). From each sequence, the region starting at 50 nucleotides upstream from the 3' end to 1050 nucleotides upstream of the 3' end was extracted. The chosen set of probe sequences (see best mode below) was further evaluated against the human mRNA sequences in the Reference Sequence (RefSeq) collection from NCBI.
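The extraction of the 3'-proximal region can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the handling of transcripts shorter than 1050 nucleotides (keeping whatever part of the window exists) is an assumption not stated in the text.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def three_prime_window(seq, near=50, far=1050):
    """Return the region from `far` to `near` nucleotides upstream of the 3' end.

    ASSUMPTION: sequences shorter than `far` contribute the part of the
    window that exists; sequences shorter than `near` contribute nothing.
    """
    start = max(len(seq) - far, 0)
    end = max(len(seq) - near, 0)
    return seq[start:end]
```

Applied to a file object, `{name: three_prime_window(seq) for name, seq in read_fasta(handle)}` would give the per-transcript windows that are searched for n-mers.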
  • RefSeq standards serve as the basis for medical, functional, and diversity studies; they provide a stable reference for gene identification and characterization, mutation analysis, expression studies, polymorphism discovery, and comparative analyses.
  • the RefSeq collection aims to provide a comprehensive, integrated, non-redundant set of sequences, including genomic DNA, transcript (RNA), and protein products, for major research organisms. Similar coverage was found for both the 37347 sequences from ENSEMBL and the 19567 sequences in the RefSeq collection, demonstrating that the type of database is a non-limiting feature of the invention.
  • the optimal coverage of a transcriptome is found in two steps.
  • a sparse matrix of n_mers and genes is determined, so that the number of genes that contain a given n_mer can be found easily. This is done by running the getcover program with the -p option and a sequence file in FASTA format as input.
  • the second step is to determine the optimal cover with an algorithm, based on the matrix determined in the first step.
  • a program such as the getcover program is run with the matrix as input.
  • programs performing similar functions and for executing similar steps may be readily designed by those of skill in the art.
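The first step, building a sparse matrix of n-mers versus genes, can be sketched with an inverted index. This is a hypothetical Python illustration and not the getcover code; `genes` is assumed to map a gene identifier to its (already trimmed) sequence.

```python
from collections import defaultdict


def build_nmer_index(genes, n, candidates=None):
    """Map each n-mer to the set of gene ids whose sequence contains it.

    genes: dict gene_id -> sequence string.
    candidates: optional set of allowed n-mers; others are ignored.
    Only non-empty entries are stored, so the structure is sparse.
    """
    index = defaultdict(set)
    for gene_id, seq in genes.items():
        for i in range(len(seq) - n + 1):
            nmer = seq[i:i + n]
            if candidates is None or nmer in candidates:
                index[nmer].add(gene_id)
    return dict(index)
```

The number of genes containing a given n-mer is then `len(index[nmer])`, and the index can be passed directly to a cover-finding routine as its matrix input.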
  • 1. n-mers are generated and the expected melting temperature is calculated; n-mers with a melting temperature below 60 °C or with high self-hybridisation energy are removed from the set. This gives a list of n-mers that have acceptable physical properties.
  • 2. A list of gene sequences representing the human transcriptome is extracted from the ENSEMBL database.
  • 3. Start of the main loop: Given the n-mer and gene list, a sparse matrix of n-mers versus genes is generated by identifying all n-mers in a given gene and storing the result in a matrix.
  • the optimal n-mer is the one where the product of its current coverage and the total coverage is maximal.
  • the optimal n-mer is deleted from the n-mer list (step 1).
  • The genes covered by this n-mer are deleted from the gene list (step 2).
  • 10. The n-mer is added to the optimal n-mer list; the process is continued from step 3 until no more n-mers can be found.
  • the program code ("getcover" version 1.0 by Niels Tolstrup 2003) for calculation of a multi-probe dataset is listed in Fig. 17. It consists of three proprietary modules: getcover.c, dyp.c, dyp.h
  • the program also incorporates four modules covered by the GNU Lesser General Public Licence:
  • the software was compiled with aap.
  • the main.aap file used to make the program is likewise listed in Fig. 17.
  • Target sequences in this database are exemplary optimal targets for a multi-probe library. These optimal multi-probes are listed in TABLE 1 below and comprise 5' fluorescein fluorophores and 3' Eclipse or other quenchers (see below).
  • each probe target occurs in at least 6% of the sequences in the human transcriptome (i.e., more than 2200 target sequences each, more than 800 sequences targeted within 1000 nt proximal to the 3' end of the transcript).
  • the self score is at least 10 below the Tm estimate for the duplex formed with the target.
  • the duplex formed with the target sequence has a Tm at or above 60 °C.
  • More than 95.0% of the mRNA sequences are targeted within the 1000 nt near their 3' terminus (positions 50 to 1050 from the 3' end), and more than 95% of the mRNAs contain the target sequence for more than one probe in the library. More than 650,000 target sites for these 100 multi-probes were identified in the human transcriptome containing 37,347 nucleic acid sequences. The average number of multi-probes addressing each transcript in the transcriptome is 17.4, and the median value is target sites for 14 different probes.
  • sequences noted above are also an excellent choice of probes for other transcriptomes, though they were not selected to be optimized for the particular organisms. We have thus evaluated the coverage of the above listed library for the mouse and rat genome despite the fact that the above probes were designed to detect/characterize/quantify the transcripts in the human transcriptome only. E.g. see table 2.
  • Fig. 3 shows the expected coverage as percentage of the total number of mRNA sequences in the human transcriptome that are detectable within a 1000 nt long stretch near the 3' end of the respective sequences (i.e. the sequence from 50 nt to 1050 nt from the 3' end) by optimized probes of different lengths.
  • the probes are required to be sufficiently stable (Tm > 60 °C) and to have a low propensity for forming self-duplexes, which eliminates many 9-mers and even more 8-mer probe sequences.
  • the initial set of 100 probes for human mRNAs can be modified to generate similar library kits for transcriptomes from other organisms (mouse, rat, Drosophila, C. elegans, yeast, Arabidopsis, zebra fish, primates, domestic animals, etc.). Construction of these new probe libraries will require little effort, as most of the human mRNA probes may be re-used in the novel library kits (TABLE 2).
  • the limited number of probes in the proposed libraries target a large fraction (> 98%) of the human transcriptome, but there is also a large degree of redundancy in that most of the genes (almost 95%) may be detected by more than one probe.
  • More than 650,000 target sites have been identified in the human transcriptome (37347 genes) for the 100 probes in the best mode library shown above. This gives an average number of target sites per probe of 6782 (i.e. 18 % of the transcriptome) ranging from 2527 to 12066 sequences per probe.
  • the average number of probes capable of detecting a particular gene is 17.4, and the median value is 14.
  • Within the library of only 100 probes we thus have at least 14 probes for more than 50% of all human mRNA sequences.
  • The number of genes that are targeted by a given number of probes in the library is depicted in Fig. 4.
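Redundancy figures of the kind quoted above (mean and median number of probes per transcript, and the fraction of transcripts detected by more than one probe) can be derived from a probe-to-target mapping. The sketch below assumes a hypothetical `hits` dictionary mapping each probe to the set of transcripts it targets; it does not reproduce the patent's data.

```python
from statistics import mean, median


def coverage_stats(hits, all_targets):
    """Summarise how many probes address each target.

    hits: dict probe -> set of target ids it detects.
    all_targets: iterable of every target id in the transcriptome.
    """
    per_target = {t: 0 for t in all_targets}
    for targets in hits.values():
        for t in targets:
            if t in per_target:
                per_target[t] += 1
    counts = list(per_target.values())
    total = len(counts)
    return {
        "covered": sum(c >= 1 for c in counts) / total,    # at least one probe
        "redundant": sum(c >= 2 for c in counts) / total,  # more than one probe
        "mean_probes": mean(counts),
        "median_probes": median(counts),
    }
```

A median of m means that at least half of the targets are addressed by m or more probes, which is the reasoning behind the statement about 14 probes for more than 50% of the mRNA sequences.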
  • the SSA4 gene from yeast was selected for the expression assays because the gene transcription level can be induced by heat shock and mutants are available where expression is knocked out.
  • Three different 9mer sequences were selected amongst commonly occurring 9mer sequences within the human transcriptome (Table 3). The sequences were present near the 3' terminal end of 1.8 to 6.4% of all mRNA sequences within the human transcriptome. Further selection criteria were a moderate level of self-complementarity and a Tm of 60 °C or above. All three sequences were present within the terminal 1000 bases of the SSA4 ORF.
  • Three 5' nuclease assay probes were constructed by synthesizing the three sequences with a FITC fluorophore at the 5'-end and an Eclipse quencher (Epoch Biosciences) at the 3'-end.
  • the probes were named according to their position within the ORF YER103W (SSA4) where position 1201 was set to be position 1.
  • Three sets of primer pairs were designed to produce three non-overlapping amplicons, which each contained one of the three probe sequences. Amplicons were named according to the probe sequence they encompassed.
  • the sequence of the SSA4 469 beacon was CAAGGAGAAGTTG (SEQ ID NO: 7, 10-mer recognition site) which should enable this oligonucleotide to form the intramolecular beacon structure with a stem formed by the LNA-LNA interactions between the 5'-CAA and the TTG-3'.
  • the sequence of the SSA4 570 beacon was CAAGGAAAGttG (9-mer recognition site) where the intramolecular beacon structure may form between the 5'-CAA and the ttG-3'. Both the sequences were synthesized with a fluorescein fluorophore in the 5'-end and a Dabcyl quencher in the 3'end.
  • SYBR Green labelled probe was also designed to detect the SSA4 570 sequence and named SYBR-Probe-570.
  • the sequence of this probe was CAAGGAAaG.
  • This probe was synthesized with an amino-C6 linker on the 5'-end, to which the fluorophore SYBR Green 101 (Molecular Probes) was attached according to the manufacturer's instructions. Upon hybridization to the target sequence, the linker-attached fluorophore should intercalate in the generated LNA-DNA duplex region, causing increased fluorescence from the SYBR Green 101.
  • CPG solid supports were derivatized with either eclipse quencher (EQ13992-EQ13996) or dabcyl (EQ13997-EQ14148) and 5'- fluorescein phosphoramidite (GLEN Research, Sterling, Virginia, USA).
  • the synthesis cycle was modified for LNA phosphoramidites (250s coupling time) compared to DNA phosphoramidites.
  • 1H-tetrazole or 4,5-dicyanoimidazole (Proligo, Hamburg, Germany) was used as activator in the coupling step.
  • oligonucleotides were deprotected using 32% aqueous ammonia (1 h at room temperature, then 2 hours at 60 °C) and purified by HPLC (Shimadzu-SpectraChrom series; Xterra™ RP18 column, 10 µm, 7.8 x 150 mm (Waters); buffers: A: 0.05 M triethylammonium acetate pH 7.4; B: 50% acetonitrile in water; eluent: 0-25 min: 10-80% B; 25-30 min: 80% B). The composition and purity of the oligonucleotides were verified by MALDI-MS (PerSeptive Biosystem, Voyager DE-PRO) analysis, see Table 5.
  • Amplification of the partial yeast gene was done by standard PCR using yeast genomic DNA as template. Genomic DNA was prepared from a wild type standard laboratory strain of Saccharomyces cerevisiae using the Nucleon MiY DNA extraction kit (Amersham Biosciences) according to the supplier's instructions.
  • a forward primer containing a restriction enzyme site and a reverse primer containing a universal linker sequence were used.
  • 20 bp was added to the 3'-end of the amplicon, next to the stop codon.
  • the reverse primer was exchanged with a nested primer containing a poly-T 20 tail and a restriction enzyme site.
  • the SSA4 amplicon contains 729 bp of the SSA4 ORF plus a 20 bp universal linker sequence and a poly-A 20 tail.
  • PCR primers used were:
  • YER103W-For-SacI acgtgagctcattgaaactgcaggtggtattatga (SEQ ID NO: 23)
  • the PCR amplicon was cut with the restriction enzymes EcoRI + BamHI.
  • the DNA fragment was ligated into the pTRIamp18 vector (Ambion) using the Quick Ligation Kit (New England Biolabs) according to the supplier's instructions and transformed into E. coli DH-5 by standard methods.
  • plasmid DNA was sequenced using M13 forward and M13 reverse primers and analysed on an ABI 377.
  • SSA4 cRNA was obtained by performing in vitro transcription with the Megascript T7 kit (Ambion) according to the supplier's instructions.
  • Reverse transcription was performed with 1 µg of cRNA and 0.2 U of the reverse transcriptase Superscript II RT (Invitrogen) according to the supplier's instructions, except that 20 U Superase-In (RNAse inhibitor - Ambion) was added.
  • the produced cDNA was purified on a
  • composition of the PCR reactions shown in Table 6 together with PCR cycle protocols listed in Table 7 will be referred to as standard 5' nuclease assay or standard Beacon assay conditions.
  • each probe only produces a fluorescent signal together with the amplicon it was designed to detect (see also Figs. 10, 11 and 12).
  • the different probes had very similar cycle threshold Ct values (from 23.2 to 23.7), showing that the assays and probes have very similar efficiency.
  • the assays should detect similar expression levels when used in real expression assays. This is an important finding, because variability in performance of different probes is undesirable.
  • the ability to detect in real time, newly generated PCR amplicons was also demonstrated for the molecular beacon design concept.
  • the Molecular Beacon designed against the 469 amplicon with a 10-mer recognition sequence produced a clear signal when the SSA4 cDNA template and primers for generating the 469 amplicon were present in the PCR, Fig. 7A.
  • the observed Ct value was 24.0 and very similar to the ones obtained with the 5' nuclease assay probes, again indicating a very similar sensitivity of the different probes. No signal was produced when the SSA4 template was not added.
  • a similar result was produced by the Molecular Beacon designed against the 570 amplicon with a 9-mer recognition sequence, Fig. 7B.
  • the ability to detect newly generated PCR amplicons was also demonstrated for the SYBR-probe design concept.
  • the 9-mer SYBR-probe designed against the 570 amplicon of the SSA4 cDNA produced a clear signal when the SSA4 cDNA template and primers for generating the 570 amplicon were present in the PCR, Fig. 8. No signal was produced when the SSA4 template was not added.
  • the ability to detect different levels of gene transcripts is an essential requirement for a probe to perform in a true expression assay.
  • the fulfilment of the requirement was shown by the three 5' nuclease assay probes in an assay where different levels of the expression-vector-derived SSA4 cDNA were added to different PCR reactions together with one of the 5' nuclease assay probes (Fig. 9).
  • Composition and cycle conditions were according to standard 5' nuclease assay conditions.
  • the cDNA copy number in the PCR before start of cycling is reflected in the cycle threshold value Ct, i.e., the cycle number at which signal is first detected. Signal is here only defined as signal if fluorescence is five times above the standard deviation of the fluorescence detected in PCR cycles 3 to 10.
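The threshold rule above can be sketched in a few lines. This is one possible reading rather than the instrument's algorithm: the sketch assumes the fluorescence is compared after subtracting the mean of cycles 3 to 10, and that readings are supplied as a list with one value per cycle starting at cycle 1.

```python
from statistics import mean, stdev


def cycle_threshold(fluorescence, first=3, last=10, factor=5.0):
    """Return the first cycle (1-based) whose signal exceeds the threshold.

    fluorescence[i] is the reading at cycle i + 1.
    ASSUMPTION: signal = reading minus the mean of the baseline cycles,
    and the threshold is `factor` times the baseline standard deviation.
    Returns None if the threshold is never exceeded.
    """
    baseline = fluorescence[first - 1:last]
    if len(baseline) < 2:
        raise ValueError("need at least two baseline cycles")
    base_mean = mean(baseline)
    threshold = factor * stdev(baseline)
    for cycle in range(last + 1, len(fluorescence) + 1):
        if fluorescence[cycle - 1] - base_mean > threshold:
            return cycle
    return None
```

A template-free control that never rises above the baseline noise returns `None`, which corresponds to "no signal" in the experiments described here.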
  • the results show an overall good correlation between the logarithm of the initial cDNA copy number and the Ct value (Fig. 9).
  • the correlation appears as a straight line with slope between -3.456 and -3.499 depending on the probe and correlation coefficients between 0.9981 and 0.9999.
  • the slope of the curves reflects the efficiency of the PCRs, with a 100% efficiency corresponding to a slope of -3.322 assuming a doubling of amplicon in each PCR cycle.
  • the slopes of the present PCRs indicate PCR efficiencies between 94% and 100%.
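The relation between standard-curve slope and amplification efficiency can be made explicit. If the amplicon is multiplied by (1 + E) per cycle, the Ct value falls by 1/log10(1 + E) cycles for each tenfold increase in template, so E = 10^(-1/slope) - 1. The sketch below is a generic least-squares fit with hypothetical function names; it does not reproduce the data of Fig. 9.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pcr_efficiency(slope):
    """Efficiency E from the slope of Ct versus log10(initial copy number)."""
    return 10 ** (-1.0 / slope) - 1.0


def ideal_slope():
    """Slope expected for perfect doubling per cycle: -1 / log10(2)."""
    return -1.0 / math.log10(2)
```

Here `xs` would be the log10 of the initial cDNA copy numbers and `ys` the measured Ct values; `ideal_slope()` evaluates to about -3.322, the value quoted for 100% efficiency.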
  • Expression levels of the SSA4 transcript were detected in different yeast strains grown at different culture conditions (± heat shock).
  • a standard laboratory strain of Saccharomyces cerevisiae was used as wild type yeast in the experiments described here.
  • a SSA4 knockout mutant was obtained from EUROSCARF (accession number Y06101). This strain is here referred to as the SSA4 mutant.
  • Both yeast strains were grown in YPD medium at 30 °C to an OD600 of 0.8 A.
  • Yeast cultures that were to be heat shocked were transferred to 40 °C for 30 minutes, after which the cells were harvested by centrifugation and the pellet frozen at -80 °C. Non-heat shocked cells were in the meantime left growing at 30 °C for 30 minutes and then harvested as above.
  • Reverse transcription was performed with 5 µg of anchored oligo(dT) primer to prime the reaction on 1 µg of total RNA, and 0.2 U of the reverse transcriptase Superscript II RT (Invitrogen) according to the supplier's instructions, except that 20 U Superase-In (RNAse inhibitor - Ambion) was added. After two hours of incubation, enzyme inactivation was performed at 70 °C for 5 minutes. The cDNA reactions were diluted 5 times in 10 mM Tris buffer pH 8.5, and oligonucleotides and enzymes were removed by purification on a MicroSpin™ S-400 HR column (Amersham Pharmacia Biotech). Prior to performing the expression assay the cDNA was diluted 20 times.
  • the expression assay was performed with the Dual-labelled-570 probe using standard 5' nuclease assay conditions except that 2 µL of template was added.
  • the template was a 100 times dilution of the original reverse transcription reactions.
  • the four different cDNA templates used were derived from wild type or mutant with or without heat shock.
  • Total cDNA derived from non-heat shocked wild type yeast was used as template for the expression assay, which was performed using standard 5' nuclease assay conditions except that 2 µL of template was added. As shown in Fig. 11, all three probes could detect expression of the genes according to the assay design outlined in Table 8. Expression was not detected with any other combination of probe and primers than the ones outlined in Table 8. Expression data are available in the literature for SSA4, POL5, HSP82, and APG9 (Holstege et al., 1998).
  • the probe target length (n), and sequence (nmer) and occurrence in the total target (cover), as well as the number of new hits per probe target selection (Newhit), the product of Newhit and cover (newhit x cover) and the number of accumulated hits in the target population from all accumulated probes (sum) is exemplified in the table below.
  • Probe library is coupled to the use of a real-time PCR design software which can:
  • the ProbeFinder software designs optimal qPCR probes and primers fast and reliably for a given human gene.
  • the design comprises the following steps: 1) Determination of the intron positions
  • Introns are determined by a blast search against the human genome. Regions found on the DNA, but not in the transcript are considered to be introns.
  • Virtually all human transcripts are covered by at least one of the 90 probes; the high coverage is made possible by LNA modifications of the recognition sequence tags.
  • Primers are designed with 'Primer3' (Whitehead Institute for Biomedical Research, S. Rozen and H.J. Skaletsky). Finally, the probes are ranked according to selected rules ensuring the best possible qPCR. The rules favour intron-spanning amplicons to remove false signals from DNA contamination, amplicons that will not amplify off-target genomic sequence or other transcripts as found by an in silico PCR search, small amplicon size for reproducible and comparable assays, and a GC content optimized for PCR.
  • ENA-T monomers are prepared and used for the preparation of dual labelled probes of the invention.
  • the X denotes a 2'-O,4'-C-ethylene-5-methyluridine (ENA-T).
  • ENA-T 2'-O,4'-C-ethylene-5-methyluridine
  • the reaction conditions for incorporation of a 5'-O-Dimethoxytrityl-2'-O,4'-C-ethylene-5-methyluridine-3'-O-(2-cyanoethyl-N,N-diisopropyl)phosphoramidite correspond to the reaction conditions for the preparation of LNA oligomers as described in EXAMPLE 6.
  • LNA nucleotides are in capital letters; 6-Fitc: Fluorescein 6-isothiocyanate;
  • Example 21 which also shows preparation of a 2-cyanoethyl protected phosphoramidite version of this molecule for use in the general method in Example 6, i.e.
  • the 17302 Q4 dual label probe is prepared as generally described in Example 6.
  • Example 20 which also shows preparation of a 2-cyanoethyl protected phosphoramidite version of this molecule (1-(3-(2-cyanoethoxy(diisopropylamino)phosphinoxy)propylamino)-4-(3-(4,4'-dimethoxytrityloxy)propylamino)-anthraquinone) for use in the general method in Example 6; z: 2'-deoxy-5-nitroindole-ribofuranosyl; mC: 5-methylcytosine.
  • the 15305 Ql dual label probe is prepared as described in Example 6.
  • Leucoquinizarin (9.9 g; 0.04 mol) is mixed with 3-amino-1-propanol (10 mL) and ethanol (200 mL) and heated to reflux for 6 hours. The mixture is cooled to room temperature and stirred overnight under atmospheric conditions. The mixture is poured into water (500 mL) and the precipitate is filtered off, washed with water (200 mL) and dried. The solid is boiled in ethyl acetate (300 mL), cooled to room temperature and the solid is collected by filtration.
  • 1,4-Bis(3-hydroxypropylamino)-anthraquinone (7.08 g; 0.02 mol) is dissolved in a mixture of dry N,N-dimethylformamide (150 mL) and dry pyridine (50 mL). Dimethoxytritylchloride (3.4 g; 0.01 mol) is added and the mixture is stirred for 2 hours. Additional dimethoxytritylchloride (3.4 g; 0.01 mol) is added and the mixture is stirred for 3 hours. The mixture is concentrated under vacuum and the residue is re-dissolved in dichloromethane
  • 6-methyl-quinizarin (10, 2.5 g) is suspended in acetic acid (30 mL), Zn-dust (2 g) is added and the mixture is stirred at 90 °C for 1 h.
  • the mixture is then filtered through a pad of celite, cooled to room temperature and water (90ml) is added and the reduced anthraquinone derivative can then be collected by filtration.
  • the solid is then mixed with boric acid (1.9 g; 0.03 mol) and ethanol (100 mL) and refluxed for 1 hour.
  • the mixture is cooled to room temperature and 4-aminophenethyl alcohol (4.1 g; 0.03 mol) is added, whereafter the mixture is heated to reflux for 3 days.
  • SNPs Single Nucleotide polymorphisms
  • Detection of SNPs using dual labelled probes can be done by simultaneously using 2 differently labelled probes, which each hybridize specifically to one SNP allele. The result of the real time PCR will hence indicate the presence of one or the other or both alleles in the sample.
  • either genomic DNA or RNA can be used as the sample.
  • an advantage of LNA-containing oligos is increased specificity, allowing the SNP position to be placed at any position in the probe.
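The read-out of such a two-probe SNP assay can be illustrated with a simple calling rule. This is a hypothetical sketch: the single shared `threshold`, the end-point signal inputs and the returned labels are assumptions made for illustration, not part of the described method.

```python
def call_genotype(signal_a, signal_b, threshold):
    """Call a genotype from the signals of two differently labelled probes.

    signal_a / signal_b: fluorescence from the probes specific for
    allele A and allele B. A probe counts as positive when its signal
    exceeds `threshold` (an assumed, assay-specific cut-off).
    """
    a_present = signal_a > threshold
    b_present = signal_b > threshold
    if a_present and b_present:
        return "heterozygous"
    if a_present:
        return "homozygous allele A"
    if b_present:
        return "homozygous allele B"
    return "no call"
```

In practice the cut-off would be set per dye from no-template controls, since the two fluorophores generally differ in brightness.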
  • 6-Fitc Fluorescein 6-isothiocyanate
  • EQL Eclipse™ Dark Quencher (Epoch Biosciences)
  • mC 5-methylcytosine.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Molecular Biology (AREA)
  • Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Genetics & Genomics (AREA)
  • Zoology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Physics & Mathematics (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Materials By The Use Of Chemical Reactions (AREA)

Abstract

The invention relates to nucleic acid probes, nucleic acid probe libraries, and kits for detecting, classifying, or quantifying components in a complex mixture of nucleic acids, for example a transcriptome, and to methods of using them. The invention also relates to methods for identifying nucleic acid probes useful in the probe libraries, as well as methods for identifying a means of detecting a given nucleic acid.
EP05804988A 2004-12-22 2005-12-21 Sondes, bibliotheques et trousses pour l'analyse de melanges d'acides nucleiques et leurs procedes de construction Withdrawn EP1831394A2 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US63785704P 2004-12-22 2004-12-22
DKPA200401987 2004-12-22
DKPA200402012 2004-12-28
PCT/DK2005/000815 WO2006066592A2 (fr) 2004-12-22 2005-12-21 Sondes, bibliotheques et trousses pour l'analyse de melanges d'acides nucleiques et leurs procedes de construction

Publications (1)

Publication Number Publication Date
EP1831394A2 true EP1831394A2 (fr) 2007-09-12

Family

ID=36292652

Family Applications (1)

Application Number Title Priority Date Filing Date
EP05804988A Withdrawn EP1831394A2 (fr) 2004-12-22 2005-12-21 Sondes, bibliotheques et trousses pour l'analyse de melanges d'acides nucleiques et leurs procedes de construction

Country Status (4)

Country Link
EP (1) EP1831394A2 (fr)
JP (1) JP2008523828A (fr)
CA (1) CA2593916A1 (fr)
WO (1) WO2006066592A2 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2423324A4 (fr) * 2009-04-22 2013-02-13 Vertex Pharma Ensemble sonde pour l'identification d'une mutation nucléotidique et procédé d'identification d'une mutation nucléotidique

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004024314A2 (fr) * 2002-09-11 2004-03-25 Exiqon A/S Population d'acides nucleiques comprenant une sous-population d'oligomeres lna
AU2003275018B2 (en) * 2002-09-20 2009-10-01 Integrated Dna Technologies, Inc. Anthraquinone quencher dyes, their methods of preparation and use
JP4573833B2 (ja) * 2003-06-20 2010-11-04 エクシコン・アクティーゼルスカブ 核酸混合物を解析するためのプローブ、ライブラリーおよびキット、並びにその構築方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2006066592A2 *

Also Published As

Publication number Publication date
JP2008523828A (ja) 2008-07-10
WO2006066592A3 (fr) 2006-08-24
WO2006066592A2 (fr) 2006-06-29
CA2593916A1 (fr) 2006-06-29

Similar Documents

Publication Publication Date Title
US11111535B2 (en) Probes, libraries and kits for analysis of mixtures of nucleic acids and methods for constructing the same
US8192937B2 (en) Methods for quantification of microRNAs and small interfering RNAs
EP1735459B1 (fr) Nouvelles methodes permettant de quantifier des micro arn et petits arn interferants
US6344316B1 (en) Nucleic acid analysis techniques
CN102301011B (zh) 对小rna进行定量的方法
WO2008040355A2 (fr) Nouveaux procédés de quantification de micro-arn et de petits arn interférants
US20010053519A1 (en) Oligonucleotides
WO2006069584A2 (fr) Nouvelles compositions d'oligonucleotides et sequences de sondes utiles pour la detection et l'analyse de microarn et de leurs marn cibles
US8188255B2 (en) Human microRNAs associated with cancer
EP1994182A2 (fr) Analogues de nucléobase dégénérée
EP1594975A2 (fr) Amplification multiplex de polynucleotides
EP1546404A2 (fr) Population d'acides nucleiques comprenant une sous-population d'oligomeres lna
US20060166238A1 (en) Probes, libraries and kits for analysis of mixtures of nucleic acids and methods for constructing the same
US20060014183A1 (en) Extendable probes
EP1639130B1 (fr) Sondes, bibliotheques et trousses d'analyse pour melanges d'acides nucleiques et methodes de realisation
CN101090979A (zh) 用于分析核酸混合物的探针、文库和试剂盒及其构建方法
EP1831394A2 (fr) Sondes, bibliotheques et trousses pour l'analyse de melanges d'acides nucleiques et leurs procedes de construction
US20160289761A1 (en) Oligonucleotides comprising a secondary structure and uses thereof
WO2005121358A2 (fr) Sondes extensibles

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20070723

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

17Q First examination report despatched

Effective date: 20071005

DAX Request for extension of the european patent (deleted)
RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: EXIQON A/S

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20100326