WO2009143590A2 - Insertion sequence detection protocol - Google Patents
- Publication number
- WO2009143590A2 (application PCT/BE2009/000026)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- nucleic acid
- oligonucleotides
- marker
- sequence
- nucleotides
- Prior art date
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
- C12Q1/6834—Enzymatic or biochemical coupling of nucleic acids to a solid phase
- C12Q1/6837—Enzymatic or biochemical coupling of nucleic acids to a solid phase using probe arrays or probe chips
Definitions
- the present invention relates to a novel microarray-based methodology, referred to as 'Sliding Window Hybridization' (SWH), for reliable and convenient detection of short nucleic acid sequences in the genome of an organism.
- SWH 'Sliding Window Hybridization'
- Short artificial DNA sequences can be introduced into the genome of organisms to serve as convenient genetic markers (Pat. DE60204313T).
- These oligonucleotides can be between about 5 and about 50 nucleotides in length, all have a specific sequence, are spaced with regular intervals and are preferentially introduced into neutral DNA sequences, i.e. sequences without any coding or regulatory function.
- When enough of these artificial oligonucleotides are inserted into the genome of a set of strains, or preferentially a single strain, they cover the whole genome genetically.
- the artificially marked strain(s) can then be used for mapping of any single or multiple genetic elements that are responsible for a given trait in the organism.
- an organism with a trait of interest is crossed with the artificially marked strain and the segregants with the trait of interest are pooled, their DNA extracted and the markers scored.
- the absence of specific artificial markers will reveal the location of genetic elements that are required for the trait of interest.
- the presence of the markers can be determined using various protocols, for instance PCR amplification with the primers, one being the marker sequence and the other one a sequence downstream of the marker in the genome.
- the present invention provides a convenient microarray detection methodology to score all markers simultaneously and reliably without any prior PCR amplification of the DNA.
- Short insertions can also occur naturally in genomes, but in this case they are more likely to be spread randomly over the genome, and present in positions irrespective of the functionality of the DNA sequence surrounding their location.
- the present invention can also be used to detect such insertions conveniently and reliably. For the sake of simplicity we call all artificial or natural insertions of short DNA sequences in genomes 'markers'.
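- a first aspect of present invention is to provide a microarray for the detection of a nucleic acid marker within a target sequence comprising an oligonucleotide set comprising a plurality of oligonucleotides, each oligonucleotide comprising a first part which is complementary to the nucleic acid marker, and either or both a second part flanking the first part at the 5' end which is complementary to the upstream region flanking the nucleic acid marker, and a third part flanking the first part at the 3' end which is complementary to the downstream region flanking the nucleic acid marker.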
- the nucleic acid marker has a length of 1 to about 200 nucleotides.
- the sum of the lengths of the second part and the third part is identical in each of the oligonucleotides.
- the sum of the lengths of the second part and the third part varies between 10 and 100 nucleotides. In one embodiment the sum of the lengths of the second part and the third part varies between 30 and 60 nucleotides. In one embodiment the sum of the lengths of the second part and the third part varies between 20 and 40 nucleotides.
- the oligonucleotide sets comprise a plurality of unique oligonucleotides.
- the oligonucleotide set further comprises a control oligonucleotide for detecting the absence of the nucleic acid marker comprising a sequence which is complementary to the upstream region flanking the nucleic acid marker and which is directly linked to a sequence which is complementary to the downstream region flanking the nucleic acid marker.
- control oligonucleotide consists of a sequence which is complementary to the upstream region flanking the nucleic acid marker and which is directly linked to a sequence which is complementary to the downstream region flanking the nucleic acid marker.
- sequence which is complementary to the upstream region flanking the nucleic acid marker of the control oligonucleotide has a length of 5 to about 50 nucleotides and the sequence which is complementary to the downstream region flanking the nucleic acid marker has a length of 5 to about 50 nucleotides.
- the microarray comprises a plurality of oligonucleotide sets for the detection of a plurality of nucleic acid markers.
- the oligonucleotides are of similar or identical length.
- it is an advantage of the microarray of present invention that the microarray can be obtained in different ways known to those skilled in the art.
- the microarray can contain a number of nucleic acid markers that is only limited by the dimensions of the microarray. It is therefore also an advantage of the microarray of present invention that a high number of markers can be detected simultaneously.
- the sets of oligonucleotides provide a stronger hybridization signal because the target sequence is able to hybridize to several, or all, oligonucleotides present within the matching set.
- markers can be scored without the need to amplify the target nucleic acids in a sample. It is an advantage of the microarray of present invention that markers are detected with a very high reliability, in most cases even a 100% reliability.
- a second aspect of present invention is to provide a method for the detection of a nucleic acid marker in a target sequence, comprising the steps of labeling the target sequence with a first label and labeling a reference sequence with a second label; hybridizing the labeled target and reference sequence to a microarray comprising an oligonucleotide set comprising a plurality of oligonucleotides, each oligonucleotide comprising a first part which is complementary to the nucleic acid marker, and either or both a second part flanking the first part at the 5' end which is complementary to the upstream region flanking the nucleic acid marker, and a third part flanking the first part at the 3' end which is complementary to the downstream region flanking the nucleic acid marker, wherein the respective lengths of the second part and the third part differ in each of the oligonucleotides such that the sum of the lengths of the second part and the third part does not differ more than 10% between the oligonucleotides; and
- the method further comprises the step of determining the presence of the nucleic acid marker through regression analysis of the point cloud obtained by plotting for each oligonucleotide the intensity measured for the first label against the intensity of the second label.
- the method further comprises the step of plotting the hybridization intensities to obtain a logarithmic scatter plot.
- the method of present invention is used to detect a nucleic acid marker of about 5 to about 60 nucleotides.
- the oligonucleotides are of similar or identical length.
- the method of present invention teaches the surprising finding that the presence of nucleic acid markers can be derived upon regression analysis of the computed scatter plot of the hybridization signals obtained from the microarray reading.
- it was difficult to score a marker because of the often weak hybridization signal and because many experiments needed to be repeated.
- the present invention provides a method of detecting nucleic acid markers all at once, which is also more economically interesting than the detection methods provided in the art.
- FIGURE 1 A. Example of an inserted marker sequence with 28 nucleotides and upstream and downstream flanking genomic sequences. B. The same genomic sequence where the marker is absent.
- FIGURE 2 Example of the signals generated by detection of a single marker that is present in the genomic DNA of three yeast strains: 211a, 424a and 510a.
- the cloud of points in the lower left corner are universal control oligonucleotides.
- FIGURE 3 Example of the signals generated by detection of a single marker that is absent in the genomic DNA of three yeast strains: 211a, 424a and 510a.
- the cloud of points in the lower left corner are universal control oligonucleotides.
- FIGURE 4 Example of the signals generated by detection of a single marker that is absent in the genomic DNA of two yeast strains 424a and 510a, and present in the genomic DNA of one yeast strain: 211a.
- the cloud of points in the lower left corner are universal control oligonucleotides.
- FIGURE 5 Example of the signals generated by detection of a single marker that is present in the genomic DNA of two yeast strains 424a and 510a, and absent in the genomic DNA of one yeast strain: 211a.
- the cloud of points in the lower left corner are universal control oligonucleotides.
- FIGURE 6 Example of the detection of a set of markers that is present and a set of markers that is absent in the genomic DNA of three yeast strains: 211a, 424a and 510a.
- All dots that are in the cloud forming a line with a slope of about 1 and an intercept close to 0 correspond to oligonucleotides matching markers that are absent.
- All dots that are in the upper cloud forming a line with a slope close to 0 and an intercept around 1000 correspond to oligonucleotides matching markers that are present.
- FIGURE 7 Schematic explanation of the marker scoring principle.
- a regression line is drawn through the fluorescence hybridisation signals. When the intercept of the regression line is higher than 0 and the slope is 0, the marker is present. When the intercept of the regression line is 0 and the slope is 1, the marker is absent.
- FIGURE 8 Experimentally determined values of the intercept and the slope of the regression line for all markers on a microarray. Intercept around 0 and slope around 1 indicates absence of the marker. Intercept clearly higher than 0 and slope mostly around 0 but always clearly lower than 1, indicates presence of the marker.
- FIGURE 9 Plot of the experimentally determined values of the intercept and slope for all markers on a microarray. Intercept around 0 and slope around 1 indicates absence of the marker. Intercept clearly higher than 0 and slope mostly around 0 but always clearly lower than 1, indicates presence of the marker. The cloud of points indicating absent markers is clearly separated from the cloud of points indicating present markers.
- nucleic acid refers to a single stranded or double stranded nucleic acid sequence and may consist of deoxyribonucleotides, ribonucleotides, nucleotide analogues, modified nucleotides, or may have been adapted otherwise.
- nucleic acid marker refers to a position comprising one or more nucleotides in the nucleic acid sequence which differs relative to a reference nucleic acid sequence.
- the nucleic acid marker of present invention can be a naturally occurring marker such as, but not limited to, a polymorphism, a single nucleotide polymorphism ("SNP"), or stretches of repeating sequences that vary as to the length of the repeat from individual to individual.
- the nucleic acid marker can be an insertion, deletion, substitution, tandem repeat or similar.
- the nucleic acid marker can also be a nucleic acid variant, e.g. correlated to the presence of a disease such as an inherited disease.
- the nucleic acid marker can also be an artificial genomic marker, such as e.g. those described in WO2003002709.
- target sequence refers to a nucleic acid sequence such as DNA of any origin, such as but not limited to viral DNA, bacterial DNA, fungal DNA, mammalian DNA, plant DNA, or DNA fragments.
- the DNA can be a gene, or any other part of a genome.
- target sequence also refers to RNA of any origin or any form, such as but not limited to viral RNA, mRNA, rRNA, or tRNA.
- target sequence further refers to cDNA, oligonucleotides, or synthetic DNA, RNA, PNA, synthetic oligonucleotides, modified oligonucleotides or other nucleic acid analogues.
- the target sequence may be naturally-occurring or man-made.
- mRNA or mRNA transcripts include, but are not limited to pre-mRNA transcripts, transcript processing intermediates, mature mRNA(s) ready for translation and transcripts of the gene or genes, or nucleic acids derived from the mRNA transcript(s). Transcript processing may include splicing, editing, and degradation.
- a nucleic acid derived from an mRNA transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template.
- a cDNA reverse transcribed from an mRNA, a cRNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc. are all derived from the mRNA transcript and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample.
- mRNA derived samples include, but are not limited to, mRNA transcripts of the gene or genes, cDNA reverse transcribed from the mRNA, cRNA transcribed from the cDNA, DNA amplified from the genes, RNA transcribed from amplified DNA, and the like.
- oligonucleotide set refers to all oligonucleotides of present invention that are used in the detection of a single marker. Some arrays have over 600 oligonucleotide sets.
- oligonucleotide refers to a nucleic acid that is about 300 nucleotides or less in length.
- the oligonucleotide can be DNA or RNA, and single- or double-stranded. It also refers to peptide nucleic acids and other nucleic acid analogs and nucleic acid mimetics. Oligonucleotides can be naturally occurring or synthetic, but are typically prepared by synthetic means. In the present invention, the oligonucleotide can be allele-specific for detecting the presence of a nucleic acid marker, e.g. a mutation-bearing gene or a SNP.
- the oligonucleotide can also be specific for detecting the presence or absence of an artificial genomic marker, e.g. those described in WO2003002709.
- an oligonucleotide can be surface immobilized.
- control oligonucleotide refers to an oligonucleotide specific for the allele where the marker is not present, or for the allele that represents the normal, healthy situation e.g. is not mutation-bearing, or for an allele with which the target sequence needs to be compared.
- a control oligonucleotide can be surface immobilized.
- a control oligonucleotide is complementary to a reference sequence.
- polymorphism refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population.
- a polymorphic marker is the locus at which divergence occurs. Preferred markers have at least two alleles, each occurring at frequency of greater than 1%.
- a polymorphic locus may be as small as one base pair.
- Polymorphic markers suitable for use in the present invention include restriction fragment length polymorphisms, variable number of tandem repeats (VNTR's), hypervariable regions, minisatellites, dinucleotide repeats and tetranucleotide repeats. "Dinucleotide repeats" comprise segments of at least about 10 base pairs of DNA consisting of a variable number of CA tandem repeats.
- the dinucleotide repeats are a subclass of all short tandem repeat sequences.
- the dinucleotide repeats are generally spread throughout the chromosomal DNA of an individual.
- the number of CA dinucleotides in any particular tandem array varies greatly from individual to individual, and thus, dinucleotide repeats may serve to generate restriction fragment length polymorphisms, and may additionally serve as size-based amplification product differentiation markers.
- reference sequence refers to a defined sequence used as a basis for a sequence comparison.
- a reference sequence may be a subset of a larger sequence, for example, as a segment of a full-length cDNA or gene sequence, or may comprise a complete cDNA or gene sequence.
- a reference sequence may also be a whole genome.
- a reference sequence is at least 10 nucleotides in length, sometimes at least 50 nucleotides in length, and in some cases at least 100 and up to 300 nucleotides in length.
- a reference sequence is complementary to a control oligonucleotide.
- “upstream region” and “downstream region” mean the region immediately adjacent to or substantially near the nucleic acid marker of present invention in the 5' and 3' direction, respectively.
- Complementary refers to the hybridization or base pairing between nucleotides or nucleic acids, such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide and a target sequence.
- Complementary nucleotides are, generally, A and T (or A and U), or C and G.
- Two single stranded RNA or DNA molecules are said to be complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand, usually at least about 90% to 95%, and more preferably from about 98 to 100%.
- RNA or DNA strand will hybridize under selective hybridization conditions to its complement.
- selective hybridization will occur when there is at least about 65% complementary over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementary.
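The complementarity percentages above can be illustrated with a small calculation. The following is a minimal sketch, not part of the patent: it assumes DNA-only sequences of equal length, compares them without gaps (the definition above also allows insertions and deletions, which this sketch ignores), and uses the hypothetical helper names `reverse_complement`, `percent_complementary` and `is_selective`.

```python
# Watson-Crick pairs for DNA only (an assumption; RNA would pair A with U).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementary(probe, target):
    """Percentage of probe positions that pair with the antiparallel target.

    Ungapped comparison of two strands of equal length.
    """
    if len(probe) != len(target):
        raise ValueError("this sketch only compares strands of equal length")
    if not probe:
        raise ValueError("empty sequence")
    expected = reverse_complement(target)
    matches = sum(p == e for p, e in zip(probe.upper(), expected))
    return 100.0 * matches / len(probe)


def is_selective(probe, target, min_percent=65.0, min_len=14):
    """Apply the 'at least about 65% over at least 14 nucleotides' rule."""
    return len(probe) >= min_len and percent_complementary(probe, target) >= min_percent
```

For a 14-nucleotide probe with a single mismatch, `percent_complementary` returns 13/14, or roughly 92.9%, which is above the 90% level mentioned above.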
- “hybridization” or “hybridizing” refers to the process in which two single-stranded polynucleotides bind non-covalently to form a stable double-stranded polynucleotide.
- “hybridization” or “hybridizing” may also refer to triple-stranded hybridization.
- the resulting (usually) double-stranded polynucleotide is a “hybrid.”
- the proportion of the population of polynucleotides that forms stable hybrids is referred to herein as the “hybridization intensity” or “signal intensity”.
- Hybridization conditions will typically include salt concentrations of less than about 1 M, more usually less than about 500 mM and less than about 200 mM.
- Hybridization temperatures can be as low as 5° C, but are typically greater than 22° C, more typically greater than about 30° C, and preferably in excess of about 37° C.
- Hybridizations are usually performed under stringent conditions, i.e. conditions under which a probe will hybridize to its target subsequence. Stringent conditions are sequence-dependent and are different in different circumstances. Longer fragments may require higher hybridization temperatures for specific hybridization.
- stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
- Tm is the temperature (under defined ionic strength, pH and nucleic acid composition) at which 50% of the probes complementary to the target sequence hybridize to the target sequence at equilibrium.
- stringent conditions include salt concentration of at least 0.01 M to no more than 1 M Na ion concentration (or other salts) at a pH 7.0 to 8.3 and a temperature of at least 25° C.
- For example, conditions of 5×SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30° C are suitable for allele-specific probe hybridizations.
- Hybridization probes are nucleic acids (such as the oligonucleotides of present invention) capable of binding in a base-specific manner to a complementary strand of nucleic acid (such as the target sequence of present invention).
- Such probes include peptide nucleic acids and other nucleic acid analogs and nucleic acid mimetics.
- determining the presence refers to determining whether or not the relevant genetic, physiological and/or biochemical event, e.g. linked with the occurrence of a disease, is present. In practice, both the absence and the presence of a certain event or phenotype can function as markers. Accordingly, reference to determining the presence of a nucleic acid marker or a nucleic acid variant generally encompasses determining whether the marker is present, either based on the absence or the presence of the variant or marker in a sample. Moreover, this also includes the possible finding that the marker is not present in the sample, i.e. determining the absence (or presence) of a nucleic acid variant or a nucleic acid marker.
- determining the presence of the marker can also be done indirectly, e.g., where the presence of a nucleic acid variant is linked to disease, the occurrence of this marker can (besides the direct detection of the nucleic acid variant) also be determined by establishing the homozygous presence of the corresponding allele not comprising the nucleic acid variant.
- allele-specific oligonucleotides for detecting the presence of a SNP can be specific for the allele where the SNP is not present.
- allele is one of several alternative forms of a gene or DNA sequence at a specific chromosomal location (locus). At each autosomal locus an individual possesses two alleles, one inherited from the father and one from the mother.
- genotype means the genetic constitution of an individual, either overall or at a specific locus, and defines the combination of alleles the individual carries.
- homozygous refers to having two of the same alleles at a locus; the term “heterozygous” refers to having different alleles at a locus.
- allotype refers to any of the genetically determined variants in the constant region of a given subclass of an immunoglobulin that is detectable as an antigen by members of the same species having a different constant region.
- the microarray of the present invention provides for the detection of a nucleic acid marker within a target sequence.
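- the microarray comprises at least an oligonucleotide set comprising a plurality of oligonucleotides wherein each oligonucleotide comprises a first part which is complementary to the nucleic acid marker, and either or both a second part flanking the first part at the 5' end which is complementary to the upstream region flanking the nucleic acid marker, and a third part flanking the first part at the 3' end which is complementary to the downstream region flanking the nucleic acid marker.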
- the respective lengths of the second part and the third part differ in each of the oligonucleotides.
- the sum of the lengths of the second part and the third part does not differ more than 10% between the oligonucleotides. More preferably, the sum of the lengths of the second part and the third part is identical for all the oligonucleotides.
- the length of the nucleic acid marker can vary between 1 and 200 nucleotides, more preferably between 1 and 100, most preferably between 1 and 60, such as between 5 and 60.
- the sum of the lengths of the second part and the third part of each of the oligonucleotides is preferably between 10 and 100, more preferably between 20 and 80, most preferably between 30 and 60.
- the first part of each oligonucleotide has a homology of at least 95%, more preferably at least 97%, even more preferably at least 99% with the nucleic acid marker. Most preferably the first part is identical to the nucleic acid marker. It is preferred that the second part of each oligonucleotide has a homology of at least 95%, more preferably at least 97%, even more preferably at least 99% with the corresponding portion of the upstream flanking region. Most preferably the second part is identical to the corresponding portion of the upstream flanking region.
- it is preferred that the third part of each oligonucleotide has a homology of at least 95%, more preferably at least 97%, even more preferably at least 99% with the corresponding portion of the downstream flanking region.
- most preferably the third part is identical to the corresponding portion of the downstream flanking region.
- oligonucleotide set comprising a plurality of oligonucleotides, wherein the first, second and third part of the oligonucleotides are 100% complementary to the marker sequence, and the corresponding portion of the upstream and downstream flanking region, respectively.
- a set of oligonucleotides is used in which the marker sequence is centered in each oligonucleotide between two flanking sequences that are complementary to the flanking genomic DNA sequences of the region where the marker is inserted. In the set of oligonucleotides the flanking genomic sequences vary from 0 nucleotides on the upstream side and n remaining nucleotides on the downstream side to the reverse situation with n remaining nucleotides on the upstream side and 0 nucleotides on the downstream side, the flanking sequences gradually changing with preferably less than six nucleotides at a time, more preferably less than five nucleotides at a time, even more preferably less than four nucleotides at a time, e.g. one or two or three nucleotides at a time.
- the invention can also be performed using larger intervals, exceeding six nucleotides at a time, in particular in case the second and/or third parts of the oligonucleotides are lengthy, e.g. exceeding 100 nucleotides.
- the set of oligonucleotides thus forms a 'sliding window' that passes over the position of the marker in the genome and the methodology is therefore called 'Sliding Window Hybridization' (SWH).
- if, for instance, the oligonucleotide is 60 nucleotides long and the marker sequence is 28 nucleotides, the sum of the flanking genomic sequences is 32 nucleotides.
- the set of oligonucleotides for each marker will then contain 33 oligonucleotides, the first one being 32/marker/0, the second 31/marker/1, the third 30/marker/2, etc., the second last 1/marker/31, and the last 0/marker/32. If for instance 600 markers are to be detected, the total number of oligonucleotides will be 600 × 33 = 19,800.
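The construction of one such oligonucleotide set can be sketched in a few lines. This is an illustrative sketch only: the function name `sliding_window_set`, the one-nucleotide step and the choice to write every oligonucleotide in the same orientation as the genomic sequence are assumptions, and whether the printed probe should be this sequence or its reverse complement depends on which strand of the labeled sample is hybridized.

```python
def sliding_window_set(upstream, marker, downstream, oligo_len=60):
    """Build the sliding-window oligonucleotides for one marker.

    upstream / downstream are the genomic sequences flanking the marker,
    written 5'->3' on the same strand as the marker.
    Returns a list of (n_up, n_down, sequence) tuples, starting with all
    flanking nucleotides upstream and ending with all of them downstream.
    """
    flank_total = oligo_len - len(marker)
    if flank_total < 0:
        raise ValueError("marker is longer than the oligonucleotide")
    if len(upstream) < flank_total or len(downstream) < flank_total:
        raise ValueError("flanking sequences are too short")

    oligos = []
    for n_up in range(flank_total, -1, -1):
        n_down = flank_total - n_up
        # Take the n_up bases directly adjacent to the marker on the 5' side.
        up_part = upstream[len(upstream) - n_up:]
        # Take the n_down bases directly adjacent to the marker on the 3' side.
        down_part = downstream[:n_down]
        oligos.append((n_up, n_down, up_part + marker + down_part))
    return oligos
```

With a 28-nucleotide marker and 60-nucleotide oligonucleotides the function returns 33 entries, from 32/marker/0 down to 0/marker/32, so 600 markers would require 600 × 33 = 19,800 oligonucleotides.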
- the microarray according to the present invention comprises a plurality of the oligonucleotide sets for the detection of a plurality of nucleic acid markers.
- Each set of oligonucleotides may optionally further comprise a control oligonucleotide for detecting the absence of the nucleic acid marker, comprising a sequence which is complementary to the upstream region flanking the nucleic acid marker and which is directly linked to a sequence which is complementary to the downstream region flanking the nucleic acid marker.
- this control oligonucleotide consists of the flanking upstream and downstream complementary region sequences.
- the flanking upstream region sequence has a length of 5 to about 50 nucleotides and the flanking downstream region sequence has a length of 5 to about 50 nucleotides.
- the two flanking sequences of the control oligonucleotide have the same length.
- Such oligonucleotide having a complementary sequence to the flanking genomic sequences of a marker allows for obtaining a specific hybridization signal in case of absence of the marker in the target sequence sample. More preferably, the absence of the markers can be detected conveniently by placing another set of oligonucleotides on the microarray with each oligonucleotide being identical to the flanking genomic sequences of a marker. For instance, if 600 markers are used and the oligonucleotides have a length of 60 nucleotides, this involves 600 additional oligonucleotides, each being identical with 30 nucleotides genomic sequence upstream and 30 nucleotides genomic sequence downstream of the 600 markers, respectively. For these oligonucleotides, signals will be generated only when the markers are absent in the DNA hybridized on the microarray.
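A control oligonucleotide of this kind simply joins the two flanks without the marker. The sketch below assumes an even split between the upstream and downstream flank (30 + 30 for a 60-mer, as in the example above); the function name `control_oligo` is hypothetical, and the same strand-orientation caveat applies as for the marker oligonucleotides.

```python
def control_oligo(upstream, downstream, oligo_len=60):
    """Join the flanks directly, as in a genome where the marker is absent.

    upstream / downstream are the genomic sequences on either side of the
    insertion site, written 5'->3' on the same strand.
    """
    n_up = oligo_len // 2
    n_down = oligo_len - n_up
    if len(upstream) < n_up or len(downstream) < n_down:
        raise ValueError("flanking sequences are too short")
    # Bases immediately 5' of the insertion site, then bases immediately 3'.
    return upstream[len(upstream) - n_up:] + downstream[:n_down]
```

One such oligonucleotide per marker gives the 600 additional oligonucleotides mentioned above for a 600-marker array.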
- Oligonucleotides with a certain length are spotted or synthesised on the microarray according to one of several established procedures. Such microarrays can be custom ordered from several companies. The method used for the immobilization of DNA on the glass surface will depend on the company. A common one is to immobilize amino modified oligonucleotides on aldehyde silane coated glass slides (Schena et al, 1996, PNAS, 93, 10614-19). The company Isogen uses amino modified oligonucleotides spotted on epoxy modified slides (Nexterion ® slide). Printing on the microarrays can be done with contact printing (i.e. touching the surface) or non-contact printing technology, the latter technology being preferred.
- the company Agilent synthesizes the oligonucleotides on the slide by a method called SurePrint technology. They use a non-contact inkjet printer to synthesize 60-mer oligos, base by base, from digital sequence files. Standard phosphoramidite chemistry used in the reactions allows for very high coupling efficiencies to be maintained at each step in the synthesis of the full-length oligonucleotide (spotting phosphoramidites method).
- the method of the present invention provides for the detection of a nucleic acid marker in a target sequence using a microarray according to the present invention as described above.
- the method comprises the steps of: a) labeling the target sequence with a first label and labeling a reference sequence with a second label, wherein the reference sequence lacks the nucleic acid marker but comprises the upstream and downstream flanking regions of the marker as a continuous sequence; b) hybridizing the labeled target and reference sequence to a microarray comprising an oligonucleotide set comprising a plurality of oligonucleotides, each oligonucleotide comprising a first part which is complementary to the nucleic acid marker, and either or both a second part flanking the first part at the 5' end which is complementary to the upstream region flanking the nucleic acid marker, and a third part flanking the first part at the 3' end which is complementary to the downstream region flanking the nucleic acid marker, wherein the respective lengths of the second part
- the method further comprises determining the presence of the nucleic acid marker through regression analysis of the point cloud obtained by plotting for each oligonucleotide the intensity measured for the first label against the intensity of the second label.
- the scatter plot is a logarithmic scatter plot.
- Said regression analysis may comprise the rendering of said scatter plot, or it can be an in silico procedure which returns the absence or presence of the marker without visualizing the plots.
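An in silico version of this scoring step might look as follows. This is a sketch under stated assumptions rather than the procedure used by the inventors: it fits an ordinary least-squares line to the target-label intensities (y) against the reference-label intensities (x) of one oligonucleotide set, and calls the marker present when the slope falls below a cutoff. The cutoff of 0.5 and the names `fit_line` and `score_marker` are hypothetical; the text only states that a slope near 1 with an intercept near 0 indicates absence, and a slope near 0 with a clearly positive intercept indicates presence. Intensities can be log-transformed by the caller first if a logarithmic scatter plot is wanted.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired intensity values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("reference intensities are all identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def score_marker(target, reference, slope_cutoff=0.5):
    """Call one marker from the intensities of its oligonucleotide set.

    target:    first-label intensity per oligonucleotide (sample under test)
    reference: second-label intensity per oligonucleotide (marker-free DNA)
    """
    slope, intercept = fit_line(reference, target)
    call = "present" if slope < slope_cutoff else "absent"
    return call, slope, intercept
```

For example, when target and reference intensities rise together across the set, the fitted slope is close to 1 and the call is "absent"; when the target intensities stay high and flat while the reference intensities vary, the slope is close to 0 and the call is "present".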
- Different types of labels can be used in the method according to the present invention such as, but not limited to, a fluorescent, chromogenic or chemiluminescent dye, a radio-isotope, metal and/or magnetic nanoparticle, etc. Suitable labels for use in the different detection methods are numerous and extensively described in the art.
- Fluorescent labels include but are not limited to fluorescein isothiocyanates (FITC), carboxyfluoresceins, such as tetramethylrhodamine (TMR), carboxy tetramethylrhodamine (TAMRA), carboxy-X-rhodamine (ROX), sulforhodamine 101 (Texas Red™), Atto dyes (Sigma Aldrich), Fluorescent Red and Fluorescent Orange, phycoerythrin, phycocyanin, and Crypto-Fluor™ dyes.
- labels used in quantitative and qualitative assays include but are not limited to dendrimers, quantum dots, up-converting phosphors and nanoparticles. Accordingly, the detection steps performed in the methods of the invention will be determined by the label used and include, but are not limited to, fluorescence, colorimetry, absorption, reflection, polarization, refraction, electrochemistry, chemiluminescence, Rayleigh scattering and Raman scattering, SE(R)RS, resonance light scattering, grating-coupled surface plasmon resonance, scintillation counting, magnetic sensors, electrochemical detection (such as anode stripping voltammetry), etc.
- Methods of the present invention envisage the formation of DNA/DNA, DNA/RNA or RNA/RNA oligonucleotide/target hybrids.
- sequence within the oligonucleotide capable of specifically hybridizing with the target corresponds to a sequence that is shorter than the target nucleic acid.
- Factors such as the nature of the hybrid (DNA/DNA or DNA/RNA), the length of the oligonucleotide/target hybrid, and the degree of complementarity between probe and target influence the hybridization.
- An appropriate choice of the length and sequence of the oligonucleotide, the buffer composition and the temperature allows the specific binding between oligonucleotide and target to be manipulated. The selection of appropriate conditions is known to the skilled person and is explained, for example, in Sambrook et al. (2001), "Molecular Cloning: A Laboratory Manual", Cold Spring Harbor Press.
- the methods of the present invention may need to discriminate between two closely related target molecules.
- the specificity of the oligonucleotides for the target sequence is essential and non-specific or low-stringency binding with the target molecule should be excluded.
- hybridization conditions need to be strict and only perfect matches of the specific binding pairs can be allowed.
- the target-specific probes used in the methods of the present invention are typically chosen according to the target molecule to be detected.
- the first and second target-specific probes will typically be complementary or partly complementary nucleic acids, each capable of specifically hybridizing with a different part of the target molecule.
- a set of oligonucleotides is used in which the marker sequence is centered in each oligonucleotide between two flanking sequences that are identical to the flanking genomic DNA sequences of the region where the marker is inserted.
- flanking genomic sequences vary from 0 nucleotides on the upstream side and n remaining nucleotides on the downstream side to the reverse situation with n remaining nucleotides on the upstream side and 0 nucleotides on the downstream side, the flanking sequences gradually changing with preferably less than six nucleotides at a time, more preferably less than five nucleotides at a time, even more preferably less than four nucleotides at a time, e.g. one or two or three nucleotides at a time.
- the invention can also be performed using larger intervals, exceeding six nucleotides at a time, in particular in case the second and/or third parts of the oligonucleotides are lengthy, e.g. exceeding 100 nucleotides.
- the set of oligonucleotides thus forms a 'sliding window' that passes over the position of the marker in the genome and the methodology is therefore called 'Sliding Window Hybridization' (SWH).
- signals will be generated only when the markers are present in the DNA hybridized on the microarray. For instance, if the oligonucleotide is 60 nucleotides long and the marker sequence 28 nucleotides, the sum of the flanking genomic sequences is 32 nucleotides.
- the set of oligonucleotides for each marker will contain 33 oligonucleotides, the first one being 32/marker/0, the second 31/marker/1, the third 30/marker/2, etc., the second last 1/marker/31, and the last 0/marker/32. If, for instance, 600 markers are to be detected, the total number of oligonucleotides will be 19,800.
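The sliding-window design described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, its parameters and the example sequences are hypothetical, and it assumes that `upstream` ends directly at the insertion site and that `downstream` starts directly after it.

```python
def sliding_window_oligos(upstream, marker, downstream, oligo_length=60, step=1):
    """Return (upstream_len, downstream_len, sequence) for each oligonucleotide.

    The window starts with all flanking nucleotides on the upstream side and
    shifts `step` nucleotides at a time towards the downstream side.
    """
    flank_total = oligo_length - len(marker)
    if flank_total < 0:
        raise ValueError("marker is longer than the oligonucleotide")
    if len(upstream) < flank_total or len(downstream) < flank_total:
        raise ValueError("flanking genomic sequences are too short")
    oligos = []
    # With step > 1 the final 0/marker/n oligonucleotide is only produced
    # when flank_total is a multiple of step.
    for down_len in range(0, flank_total + 1, step):
        up_len = flank_total - down_len
        up = upstream[len(upstream) - up_len:]  # last up_len upstream bases
        down = downstream[:down_len]            # first down_len downstream bases
        oligos.append((up_len, down_len, up + marker + down))
    return oligos


# Hypothetical sequences: a 28-nucleotide marker in 60-nucleotide oligonucleotides.
example = sliding_window_oligos("ACGT" * 8, "TTAGGC" * 4 + "TTAG", "CATG" * 8)
total_for_600_markers = 600 * len(example)
```

With a 28-nucleotide marker and 60-nucleotide oligonucleotides, the list runs from 32/marker/0 to 0/marker/32, matching the 33 oligonucleotides per marker given in the text.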
- the target sequence is obtained by procedures known in the prior art.
- Extraction of nucleic acids such as DNA or RNA is a typical pre-treatment of the sample envisaged in the context of the present invention.
- Methods and kits suitable for extracting nucleic acids are available in the art and include methods and kits based on phenol-chloroform extraction, salting out DNA extraction, and guanidinium thiocyanate extraction.
- Exemplary nucleic acid isolation techniques include (1) organic extraction followed by ethanol precipitation, e.g. using a phenol/chloroform organic reagent (e.g.
- kits can be used to expedite such methods, for example, Genomic DNA Purification Kit and the Total RNA Isolation System (both available from Promega, Madison, Wis.). Further, such methods have been automated or semi-automated using, for example, the ABI PRISM™ 6700 Automated Nucleic Acid Workstation (Applied Biosystems, Foster City, Calif.) or the ABI PRISM™ 6100 Nucleic Acid PrepStation and associated protocols, e.g. NucPrep™ Chemistry: Isolation of Genomic DNA from Animal and Plant Tissue, Applied Biosystems Protocol 4333959 Rev. A (2002), Isolation of Total RNA from Cultured Cells, Applied Biosystems Protocol 4330254 Rev.
- the above pre-treatment methods can further comprise a fragmentation step, e.g. by enzyme digestion, shearing or sonication, and/or an enzymatic amplification step, e.g. by PCR including RT-PCR.
- a PCR amplification of the target DNA can be performed prior to the detection of the analyte.
- diagnosis and treatment of a variety of disorders may often be accomplished through identification and/or manipulation of the genetic material which encodes for specific disease associated traits.
- The use of polymorphisms as genetic linkage markers is thus of critical importance in locating, identifying and characterizing the genes which are responsible for specific traits. In particular, such mapping techniques allow for the identification of genes responsible for a variety of disease or disorder-related traits which may be used in the diagnosis and/or eventual treatment of those disorders. Given the size of the human genome, as well as those of other mammals, it would generally be desirable to provide methods of rapidly identifying and screening for polymorphic genetic markers.
- the present invention meets these and other needs.
- the estimated mean extent of heterozygosity is on the order of 0.2 percent.
- the frequency of any minor allele varies, with about 50% being more frequent, ranging from 0.2 to 0.45, while the remaining 50% are less frequent, ranging from 0.03 to 0.15. Invertebrates appear to exhibit a greater degree of heterozygosity.
- To distinguish DNA polymorphisms from mutant DNA sequences associated with a disease state it is necessary to run a control DNA sample from the same individual isolated from non-disease state tissue. Buccal cells from the cheek or blood cells are typically used as control samples. Alternatively, mutations can be mapped to known sites associated with the disease state.
- the present invention may have considerable utility for genetic research because it allows for the rapid and simple detection of DNA sequence polymorphisms anywhere in the genome.
- markers can be used for mapping hemizygous genomic DNA deletions found in tumor cells (loss of heterozygosity) and for linkage disequilibrium mapping in the vicinity of a known genetic marker of the disease locus, as well as for diagnostic tools in genetic counseling of inherited disorders, particularly where no other pre-established marker is available. They may also be useful for forensic or human DNA fingerprinting.
- Uses of Polymorphic Profiles. After determining a polymorphic profile of an individual or population of individuals, this information can be used in a number of methods, such as association studies and medical diagnosis, forensics, paternity testing, etc.
- association studies and diagnosis are based on the fact that the polymorphic profile of an individual may contribute to phenotype of the individual in different ways. Some polymorphisms occur within a protein coding sequence and contribute to phenotype by affecting protein structure. The effect may be neutral, beneficial or detrimental, or both beneficial and detrimental, depending on the circumstances. For example, a heterozygous sickle cell mutation confers resistance to malaria, but a homozygous sickle cell mutation is usually lethal. Other polymorphisms occur in noncoding regions but may exert phenotypic effects indirectly via influence on replication, transcription, and translation. A single polymorphism may affect more than one phenotypic trait. Likewise, a single phenotypic trait may be affected by polymorphisms in different genes.
- polymorphisms predispose an individual to a distinct mutation that is causally related to a certain phenotype. Correlation is performed for a population of individuals who have been tested for the presence or absence of one or more phenotypic traits of interest and for polymorphic profile. The alleles of each polymorphism in the profile are then reviewed to determine whether the presence or absence of a particular allele is associated with the trait of interest. Correlation can be performed by standard statistical methods such as a chi-squared (χ²) test, and statistically significant correlations between polymorphic form(s) and phenotypic characteristics are noted.
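To illustrate the kind of test mentioned here, the sketch below computes the Pearson chi-squared statistic for a 2×2 table of allele presence against trait presence. It is one standard form of the test (without continuity correction) and is an assumption about how such a correlation might be computed, not a method taken from the text; the function name and the argument layout are hypothetical.

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2x2 contingency table

                     trait present   trait absent
    allele present         a               b
    allele absent          c               d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
```

With one degree of freedom, a statistic above roughly 3.84 corresponds to significance at the 5% level.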
- the determination of which polymorphic forms occupy a set of polymorphic sites in an individual identifies a set of polymorphic forms that distinguishes the individual.
- the capacity to identify a distinguishing or unique set of forensic markers in an individual is useful for forensic analysis. For example, one can determine whether a blood sample from a suspect matches a blood or other tissue sample from a crime scene by determining whether the set of polymorphic forms occupying selected polymorphic sites is the same in the suspect and the sample.
- If the set of polymorphic markers does not match between a suspect and a sample, it can be concluded (barring experimental error) that the suspect was not the source of the sample. If the set of markers does match, one can conclude that the DNA from the suspect is consistent with that found at the crime scene. If frequencies of the polymorphic forms at the loci tested have been determined (e.g., by analysis of a suitable population of individuals), one can perform a statistical analysis to determine the probability that a match of suspect and crime scene sample would occur by chance.
- In paternity testing, the object is usually to determine whether a male is the father of a child. In most cases, the mother of the child is known and thus, the mother's contribution to the child's genotype can be traced. Paternity testing investigates whether the part of the child's genotype not attributable to the mother is consistent with that of the putative father. Paternity testing can be performed by analyzing sets of polymorphisms in the putative father and the child. If the set of polymorphisms in the child attributable to the father does not match the putative father, it can be concluded, barring experimental error, that the putative father is not the real father. If the set of polymorphisms in the child attributable to the father does match the set of polymorphisms of the putative father, a statistical calculation can be performed to determine the probability of coincidental match.
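One simple way to carry out such a calculation is the product rule, sketched below. It assumes Hardy-Weinberg proportions at each locus and statistical independence between loci; both are assumptions of this illustration rather than statements from the text, and the function name and input format are hypothetical.

```python
def random_match_probability(genotypes):
    """Probability that an unrelated individual shares the whole profile.

    genotypes: list of (p, q, heterozygous) tuples, one per locus, where p and q
    are the population frequencies of the two observed alleles. For a
    homozygous locus only p is used.
    """
    probability = 1.0
    for p, q, heterozygous in genotypes:
        if not (0 < p <= 1 and 0 < q <= 1):
            raise ValueError("allele frequencies must lie in (0, 1]")
        # Hardy-Weinberg genotype frequency: 2pq for heterozygotes, p^2 otherwise.
        probability *= 2 * p * q if heterozygous else p * p
    return probability
```

The more loci are typed, the smaller the product becomes, which is why a larger marker set gives stronger evidence when the profiles match.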
- Yeast strains with artificial markers are described in WO2003002709 (Thevelein J.M., P. Ma, P. Van Dijck, F. Dumortier, W. Broothaerts Novel technology for genetic mapping.)
- Yeast genomic DNA is isolated using the Lyticase method. 10 µg of genomic DNA is digested for 3 h with: Hind III + Bgl II + Xba I or Sac II + Mfe I + Dra I (1 unit of each enzyme/µg DNA). The digested genomic DNA is purified by precipitation with EtOH. Two µg of the purified DNA is labeled using, for instance, the protocol developed for microarray-based comparative genomic hybridization by the Stanford Medical Center. For this purpose H₂O is added to 2 µg of DNA to obtain a total volume of 20 µl.
- DNA from a strain without markers is labeled with Cy5-dCTP and DNA from a strain with markers with Cy3-dCTP.
- the labeled DNA is subsequently hybridized to the microarray with the marker oligonucleotides according to one of several available protocols known to people familiar with the state-of-the-art. For instance, 50 pmol of DNA per sample is hybridized to the microarray in an Agilent oven for 17 h at 65 °C with a rotor turning speed of 20 rpm according to the hybridization protocol used in the Agilent Array Based CGH for Genomic DNA Analysis.
- the genomic DNA will hybridize to the oligonucleotides on the microarray with varying efficiency, depending on the position of the marker in the oligonucleotide sequence.
- microarrays with the marker oligonucleotides can be obtained in different ways known to people familiar with the state-of-the-art. For instance, custom-made microarrays with a single chamber containing all oligonucleotides can be ordered from Agilent. Alternatively, custom-made microarrays with eight chambers, each containing 15,000 oligonucleotides, can be ordered from Agilent. In this case a selection is made of the oligonucleotides so that the total number of oligonucleotides required fits on the array. This is done by eliminating a number of intermediate oligonucleotides, spaced a few oligonucleotides from each other.
- the oligonucleotides nr. 3, 7, 11, 15, 19, 23, 27 and 31 are not used, which reduces the total number per marker to 25.
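The thinning step can be checked with a short sketch. The function name and its parameters are hypothetical; the default values reproduce the example in the text, where every fourth oligonucleotide starting from nr. 3 is left out of a set of 33.

```python
def thin_oligo_set(n_oligos=33, first_dropped=3, spacing=4):
    """Return (kept, dropped) lists of 1-based oligonucleotide numbers."""
    dropped = list(range(first_dropped, n_oligos + 1, spacing))
    dropped_set = set(dropped)
    kept = [i for i in range(1, n_oligos + 1) if i not in dropped_set]
    return kept, dropped


kept, dropped = thin_oligo_set()
spots_for_600_markers = 600 * len(kept)
```

With 25 oligonucleotides retained per marker, 600 markers need 15,000 spots, which is exactly the capacity of one chamber of the eight-chamber format.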
- the fluorescence signals on the microarrays are measured with one of the commercially available microarray scanners, e.g. the Agilent DNA microarray scanner G2565BA. After scanning of the microarray for the fluorescence signals, the data are analysed using the Feature Extraction Software from Agilent. This software plots the individual spot signals as dots in a logarithmic graph with Cy5 fluorescence on the X-axis and Cy3 fluorescence on the Y-axis.
- the present invention introduces the concept of the sliding window of oligonucleotide sequences and the corresponding principle of using the Cy3 and Cy5 signal intensities of all spots in the window to enhance the reliability of the marker detection so much that it reaches 100%.
- the signals from the microarray are treated with a computer script that draws a regression line through all spots obtained for a single marker, and the slope and the intercept of the regression line are both computed. In this way two values are assigned to each marker.
- When the slope has a value close to zero, preferably lower than 0.5, and the intercept is high, preferably higher than 5, the marker is present.
- When the slope has a value close to 1, preferably higher than 0.7, and the intercept is close to 0, preferably less than 2, the marker is absent.
- the combination of slope and intercept figures allows unequivocal assignment of marker presence or absence.
- the novelty of the present invention is that the marker detection is not based on the intensity of the hybridisation signal (with e.g.
- the computer script translates the results of the slope and intercept values into a table or a diagram indicating the presence or absence of each marker with near absolute reliability.
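The decision rule can be sketched as follows, using the preferred thresholds quoted in the text. This is an illustrative sketch, not the script used by the inventors: the function names are hypothetical, and the "undetermined" outcome for values falling between the two preferred ranges is an assumption added here so that every input receives an answer.

```python
def call_marker(slope, intercept):
    """Classify one marker from the slope and intercept of its regression line."""
    if slope < 0.5 and intercept > 5:
        return "present"
    if slope > 0.7 and intercept < 2:
        return "absent"
    return "undetermined"  # assumption: not covered by the preferred ranges


def marker_table(fits):
    """Map marker name -> call, given marker name -> (slope, intercept)."""
    return {name: call_marker(slope, intercept)
            for name, (slope, intercept) in fits.items()}


# Hypothetical regression results for three markers.
table = marker_table({"M001": (0.1, 8.2), "M002": (0.95, 0.3), "M003": (0.6, 3.0)})
```

Requiring both conditions at once reflects the point made above: each marker is called from the combination of the two values rather than from either one alone.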
- Example 2 Detection of a genomic mutation linked to an inherited disease
- Certain regions of the human genome contain sequentially repeated nucleic acid sequences, variously referred to as tandem repeats.
- a repeated sequence of three nucleotides is commonly referred to as a "triplet repeat" and although a certain multiple of a repeated sequence may be normal, genetic mutations can increase the multiple of the repeat causing an unstable structure where the multiple tends to increase generation by generation.
- the repeated region can be responsible for diseases having severe outward manifestations such as mental retardation, neuromuscular degenerative disorders, ataxia, involuntary movement, epilepsy, dementia and nerve palsy.
- mHTT mutant HTT
- The number of CAG repeats is related to how much this process is affected, and correlates with age at onset and the rate of progression of symptoms. For example, 36-39 repeats result in much later onset and slower progression of symptoms than the mean, such that individuals may die of other causes before they manifest symptoms; this is termed "reduced penetrance". With very large repeat counts, HD can occur under the age of 20, when it is then referred to as juvenile HD, akinetic-rigid, or Westphal variant HD. This accounts for about 7% of HD carriers.
- test sample containing a target sequence is obtained from an individual.
- the test sample can be derived from any biological source, such as a physiological fluid, including blood, saliva, ocular lens fluid, cerebral spinal fluid, sweat, urine, milk, ascites fluid, mucous, synovial fluid, peritoneal fluid, amniotic fluid and the like, or fermentation broths, cell cultures, chemical reaction mixtures and the like.
- the test sample can be used directly as obtained from the source or following a pre-treatment to modify the character of the sample.
- test sample can be pretreated prior to use by, for example, preparing plasma from blood, preparing liquids from solid materials, diluting viscous fluids, filtering liquids, distilling liquids, concentrating liquids, inactivating interfering components, adding reagents, and the like.
- test sample is pretreated to digest, restrict or render double stranded nucleic acid sequences single stranded, and is also purified to concentrate the target sequences that are contained therein.
- the target sequence within the test sample is labeled with Cy3-dCTP.
- the reference sequence is a DNA sequence of a healthy individual that has a number of repeats within the normal, healthy range, and is labeled with Cy5-dCTP.
- the labeled DNA is subsequently hybridized to the microarray as described in Example 1, and the data is similarly analysed with the Feature Extraction Software from Agilent.
- the nucleic acid marker is a triplet repeat expansion of a certain length.
- the microarray now features different oligonucleotide sets that cover several critical repeat lengths: set 1, where the marker has fewer than 27 repeats, as in an unaffected individual; set 2, where the marker has between 27 and 35 repeats, as in an unaffected individual; set 3, where the marker has between 36 and 39 repeats, as in a mildly affected individual; and set 4, where the marker has more than 39 repeats, as in a severely affected individual.
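The four repeat-length ranges can be written as a small lookup. The function name is hypothetical and the sketch only restates the boundaries given above; it is not a diagnostic tool.

```python
def repeat_set(n_repeats):
    """Return the oligonucleotide set number (1 to 4) for a triplet repeat count."""
    if n_repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if n_repeats < 27:
        return 1  # unaffected
    if n_repeats <= 35:
        return 2  # unaffected
    if n_repeats <= 39:
        return 3  # mildly affected
    return 4      # severely affected
```

Writing the boundaries as consecutive checks makes it easy to see that every non-negative repeat count falls into exactly one set.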
Abstract
The present invention relates to a novel microarray-based methodology, referred to as "Sliding Window Hybridization" (SWH), for reliable and convenient detection of short nucleic acid sequences in the genomes of an organism.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB0809444A GB0809444D0 (en) | 2008-05-23 | 2008-05-23 | Insertion sequence detection protocol |
GB0809856.8 | 2008-05-30 |
Publications (3)
Publication Number | Publication Date |
---|---|
WO2009143590A2 (fr) | 2009-12-03 |
WO2009143590A3 (fr) | 2010-06-24 |
WO2009143590A8 (fr) | 2011-12-29 |
Family
ID=39616024
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/BE2009/000026 WO2009143590A2 (fr) | 2008-05-23 | 2009-05-25 | Protocol de détection de séquence d'insertion |
Country Status (2)
Country | Link |
---|---|
GB (1) | GB0809444D0 (fr) |
WO (1) | WO2009143590A2 (fr) |
- 2008-05-23 GB GB0809444A patent/GB0809444D0/en not_active Ceased
- 2009-05-25 WO PCT/BE2009/000026 patent/WO2009143590A2/fr active Application Filing
Non-Patent Citations (1)
Title |
---|
No further relevant documents disclosed * |
Also Published As
Publication number | Publication date |
---|---|
GB0809444D0 (en) | 2008-07-02 |
WO2009143590A8 (fr) | 2011-12-29 |
WO2009143590A3 (fr) | 2010-06-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP3693352B2 (ja) | プローブアレイを使用して、遺伝子多型性を検出し、対立遺伝子発現をモニターする方法 | |
US20190024141A1 (en) | Direct Capture, Amplification and Sequencing of Target DNA Using Immobilized Primers | |
US7192700B2 (en) | Methods and compositions for conducting primer extension and polymorphism detection reactions | |
JP4481491B2 (ja) | 核酸の検出方法 | |
US6506568B2 (en) | Method of analyzing single nucleotide polymorphisms using melting curve and restriction endonuclease digestion | |
US20060199183A1 (en) | Probe biochips and methods for use thereof | |
US20120276533A1 (en) | Method for Simultaneously Detecting Polymorphisms of Acetaldehyde Dehydrogenase 2 and Alcohol Dehydrogenase 2 | |
WO2005085476A1 (fr) | Detection d'un strp, tel que le syndrome de l'x fragile | |
AU2003247715B2 (en) | Methods and compositions for analyzing compromised samples using single nucleotide polymorphism panels | |
JP2013090622A (ja) | 多型検出用プローブ、多型検出方法、薬効判定方法及び多型検出用キット | |
CN108026583A (zh) | Hla-b*15:02的单核苷酸多态性及其应用 | |
US20100297633A1 (en) | Method of amplifying nucleic acid | |
US20070231803A1 (en) | Multiplex pcr mixtures and kits containing the same | |
US20090023597A1 (en) | Single Nucleotide Polymorphism Detection from Unamplified Genomic DNA | |
AU2003247603A1 (en) | Methods and compositions for monitoring primer extension and polymorphism detection reactions | |
WO2009143590A2 (fr) | Protocol de détection de séquence d'insertion | |
WO2014150938A1 (fr) | Procédés de génération de fragments moléculaires d'acide nucléique ayant une distribution de dimension personnalisée | |
RU2600874C2 (ru) | Набор олигонуклеотидных праймеров и зондов для генотипирования полиморфных локусов днк, ассоциированных с риском развития спорадической формы болезни альцгеймера в российских популяциях | |
WO2003020950A2 (fr) | Procedes et compositions pour la detection de polymorphisme bidirectionnel | |
Park et al. | DNA Microarray‐Based Technologies to Genotype Single Nucleotide Polymorphisms | |
JP2005245272A (ja) | アルコール脱水素酵素遺伝子多型の簡易検出方法および検出用試薬 | |
Mikulecky et al. | The San Diego Conference: Nucleic Acid Technologies in Disease Detection | |
JP2005198525A (ja) | グルタチオンs−トランスフェラーゼ遺伝子多型の検出方法および検出用キット |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 20040101) | |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | EP: PCT application non-entry in European phase | Ref document number: 09753354; Country of ref document: EP; Kind code of ref document: A2 |