WO2007020444A1 - Array comparative genomic hybridisation methods using linear amplification - Google Patents

Publication number
WO2007020444A1
Authority
WO
WIPO (PCT)
Prior art keywords
array
nucleic acid
fragments
amplification
sequences
Application number
PCT/GB2006/003072
Other languages
French (fr)
Inventor
Andrew Allen
Original Assignee
Oxford Gene Technology Ip Limited
Application filed by Oxford Gene Technology Ip Limited

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6809: Methods for determination or identification of nucleic acids involving differential detection
    • C12Q1/6844: Nucleic acid amplification reactions
    • C12Q1/6853: Nucleic acid amplification reactions using modified primers or templates
    • C12Q1/6855: Ligating adaptors
    • C12Q1/6865: Promoter-based amplification, e.g. nucleic acid sequence amplification [NASBA], self-sustained sequence replication [3SR] or transcription-based amplification system [TAS]

Definitions

  • This invention is in the field of comparative genomic hybridisation (CGH) analysis.
  • BACKGROUND ART
  • CGH is a technique in which two labelled samples are compared by allowing them to hybridise and subsequently looking for regions of differential hybridisation. It has been used particularly in cytogenetic analysis, where it allows the comparison of a genome isolated from a clinical (test) sample (e.g. derived from a cancer patient) with a control (reference) sample in a single hybridisation. It was originally disclosed in reference 1.
  • SpectralChip™ from Spectral Genomics
  • GenoSensor™ from Vysis
  • QArray™ from Genetix
  • the "Human Genome CGH Microarray" from Agilent
  • Nimblegen's "Array CGH" product is a high-resolution 60-mer oligonucleotide-based microarray that contains over 40,000 probes sourced from the NCBI human genome. This array enables genome-wide profiling of genomic aberrations (including copy number changes) on a single chip, starting from only 25ng of total genomic DNA per sample [6].
  • NimbleGen's arrays contain ~385,000 probes on a single glass slide, with 16 μm x 16 μm probes tiled through genic and intergenic regions at a median probe spacing of 6kbp.
  • reference 7 established the efficacy of whole genome amplification (WGA) approaches for achieving this goal.
  • the protocol used in reference 7 amplifies DNA using either φ29 or Bst DNA polymerase.
  • Agilent's standard protocol utilises φ29 random amplification [8].
  • Reference 9 achieves WGA by using degenerate oligonucleotide-primed PCR and strand displacement amplification (SDA) with Bst polymerase.
  • Reference 10 uses PCR to amplify the fragments obtained after BglII digestion of the genome.
  • Spectral's process amplifies DNA using the Klenow fragment and random primers [11], although the degree of amplification is lower than in references 8 & 9.
  • Reference 4 also uses random primers and Klenow DNA polymerase. Amplification can be avoided only where a large amount of sample genomic DNA is available, which may be rare in clinical settings. In reference 6, for example, artificial tests used 20 μg standard genomic DNA for initial tests, whereas real tests used 10ng DNA, a 2000-fold difference. Because the array CGH procedure involves comparing long chromosomal DNA molecules to a large number of smaller targets, the sample DNA needs to be shortened before being applied. This is generally achieved by fragmenting the sample DNA.
  • Agilent's protocol uses restriction digestion after amplification, whereas Spectral's protocol uses sonication before random prime labelling and amplification.
  • the use of an exponential amplification step in existing array CGH protocols is not ideal. Exponential processes are very sensitive to initial conditions, and non-uniform amplification of DNA can cause biasing in the generation of certain loci, particularly over a typical series of about 20 amplification cycles, giving artificial changes in copy number and misleading results from CGH analysis.
  • Reference 9 acknowledges that "PCR-based WGA methods have been reported to cause 4 to 6-fold amplification bias", and that "DOP-PCR is not an ideal approach for WGA, as the amplification bias could produce false results”.
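  • The sensitivity of exponential amplification to small efficiency differences can be illustrated with a deliberately simplified model. The sketch below assumes two loci with hypothetical per-cycle efficiencies (0.95 and 0.90, values chosen only for illustration) and compares the relative bias after 20 cycles when products are re-copied (exponential) and when they are not (linear). It is not a model of any specific protocol cited here.

```python
def exponential_copies(efficiency, cycles):
    # Idealised exponential process: every cycle multiplies copies by (1 + efficiency).
    return (1.0 + efficiency) ** cycles


def linear_copies(efficiency, cycles):
    # Idealised linear process: every cycle adds `efficiency` copies per original
    # template, and the products are never used as templates themselves.
    return 1.0 + efficiency * cycles


cycles = 20
locus_a, locus_b = 0.95, 0.90  # hypothetical per-cycle efficiencies (assumption)

exp_bias = exponential_copies(locus_a, cycles) / exponential_copies(locus_b, cycles)
lin_bias = linear_copies(locus_a, cycles) / linear_copies(locus_b, cycles)

print(f"exponential bias between loci: {exp_bias:.2f}-fold")
print(f"linear bias between loci:      {lin_bias:.2f}-fold")
```

  • Under these assumptions the same efficiency gap that barely shifts the linear ratio compounds into a clearly visible apparent copy-number difference in the exponential case.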
  • Reference 12 discloses a method for producing amplified amounts of RNA from genomic DNA, and the RNA is used for CGH analysis, including by array.
  • Genomic DNA is annealed to an oligonucleotide that contains a primer domain linked to a RNA polymerase-promoter domain.
  • the oligonucleotide is then subjected to a primer extension reaction to produce dsDNA molecules including a RNA polymerase promoter at the 5' end. Transcription from these dsDNA molecules is then used to give labelled RNA copies of the genomic DNA, and these RNA molecules are used as probes in hybridisation reactions.
  • genomic DNA is amplified in a linear process prior to CGH array analysis.
  • in vitro transcription using T7 RNA polymerase has been found to generate at least 30-40 μg of RNA from nanogram quantities of template, covering the whole genome. This yield and coverage were not expected, and provide enough RNA for several hybridisation analyses.
  • using a linear amplification method avoids the bias associated with prior art methods.
  • the invention provides, in a process for amplifying a DNA sample for analysis by array CGH, the improvement consisting of the use of a linear amplification.
  • the invention provides a process for amplifying a DNA sample for analysis by array CGH, comprising the steps of: (a) fragmenting the DNA sample; and (b) amplifying the resulting fragments using a linear nucleic acid amplification method.
  • the preferred linear amplification method is in vitro transcription (IVT).
  • the invention provides a process for amplifying a DNA sample for analysis by array CGH, comprising the steps of: (a) fragmenting the DNA sample, to give fragments; (b) attaching linker sequences to the fragments, to give amplifiable sequences, wherein the linkers include sequences for initiation of linear amplification; and (c) subjecting the amplifiable sequences to a linear nucleic acid amplification method.
  • the invention also provides a process for preparing a nucleic acid sample for amplification, comprising steps (a) and (b) of this process. Amplification in step (c) may then be performed separately.
  • the invention also provides a process for analysing a nucleic acid sample using an array, comprising amplification steps (a) to (c) as described above, and further comprising the step of: (d) applying amplified nucleic acid from step (c) to a nucleic acid array.
  • the amplified nucleic acid is preferably RNA, produced in step (c) by an in vitro transcription reaction, and is preferably labelled.
  • this analysis process is performed for a test sample and a reference sample, under substantially identical conditions for each sample, and the results of the two analyses are compared to give a CGH result.
  • the invention also provides a process for analysing a nucleic acid sample using an array, comprising the step of applying amplified nucleic acid to a nucleic acid array, wherein the amplified nucleic acid has been prepared by a process comprising amplification steps (a) to (c) as described above.
  • the invention provides a process for preparing a DNA sample for analysis by array CGH, comprising the steps of: (a) performing a restriction digestion on the DNA sample, to give fragments; and (b) subjecting the fragments to a nucleic acid amplification process.
  • the amplification process preferably gives linear amplification and, as described above, will typically be preceded by addition of linker sequences to the fragments. It may amplify both strands of the fragments, or only one strand.
  • Processes of the invention preferably do not use any PCR amplification (or any other exponential amplification process) prior to CGH analysis.
  • the DNA sample is typically a genomic DNA sample. Analysis will generally be performed on total genomic DNA which, in a eukaryote, includes DNA from the nucleus and other organelles e.g. from the mitochondria.
  • the invention can be used for comparing all types of DNA by CGH, and is particularly suitable for analysing human cells, including cancer cells.
  • Fragmentation of genomic DNA in a sample can be achieved physically, chemically, or enzymatically. Physical and chemical fragmentation is essentially random, whereas enzymatic fragmentation using restriction enzymes is sequence-specific and repeatable. Thus restriction digestion is a preferred method for fragmenting gDNA in a sample.
  • The choice of restriction enzyme for digestion of gDNA can have a major impact on subsequent CGH analysis. For example, if a frequent cutter is chosen then the genome will be represented by a large number of short fragments; on the other hand, if a rare cutter is used then there will be a smaller number of longer fragments of the genome. Thus the choice of enzyme dictates the number and length of target sequences for hybridisation, which is of key importance in array CGH analysis.
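  • As a rough guide to the frequent-cutter versus rare-cutter trade-off, the expected spacing between sites can be estimated from the number of fully specified bases in the recognition sequence. The sketch below assumes a random sequence with equal base frequencies and treats only N as a degenerate position; real genomes deviate from this, and the example sites are illustrative only.

```python
def expected_spacing(recognition_site):
    """Expected distance in bp between sites, assuming random sequence with
    equal base frequencies. Only A/C/G/T positions are counted as specified;
    N positions match anything (other IUPAC codes are not handled here)."""
    specified = sum(1 for base in recognition_site.upper() if base in "ACGT")
    return 4 ** specified


for label, site in [
    ("4-base site", "AGCT"),
    ("6-base site", "GAATTC"),
    ("6-base site with two N positions", "CCNNGG"),
]:
    print(f"{label:34s} {site:8s} ~1 site per {expected_spacing(site)} bp")
```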
  • sequence information for the human genome allows in silico restriction digestions to be performed in order to identify enzymes that will give desired coverage of the genome with suitable fragment sizes.
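  • A minimal sketch of an in silico digestion is given below. It assumes a recognition pattern of CCNNGG cut after the first C on the top strand (an assumption of this sketch, chosen because it yields four-nucleotide CNNG overhangs), and it runs on a short toy sequence rather than a genome. The function name and the sequence are hypothetical.

```python
import re

# Assumed site: C^CNNGG. The lookahead lets overlapping sites be found.
SITE = re.compile(r"(?=(CC[ACGT]{2}GG))")


def digest(seq):
    """Return (fragments, overhangs) for top-strand cuts after the first C
    of every match of the assumed recognition pattern."""
    seq = seq.upper()
    cuts = [m.start() + 1 for m in SITE.finditer(seq)]
    # The 4 nt immediately 3' of each cut form the 5' overhang of the next fragment.
    overhangs = [seq[c:c + 4] for c in cuts]
    bounds = [0] + cuts + [len(seq)]
    fragments = [seq[a:b] for a, b in zip(bounds, bounds[1:])]
    return fragments, overhangs


toy = "AAACCGTGGTTTTCCAAGGAAA"  # toy sequence, not genomic data
fragments, overhangs = digest(toy)
print(fragments)
print(overhangs)
```

  • Applied to a full genome sequence, the same idea yields fragment-length distributions and overhang frequencies that can guide the choice of enzyme.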
  • More than one restriction enzyme can be used to digest gDNA.
  • the enzymes may or may not be isoschizomers.
  • the Agilent protocol [8] digests gDNA with both AluI (cuts at AG
  • Preferred restriction enzymes are type II or type IIs, including particularly type II enzymes with asymmetric recognition sequences.
  • the restriction enzyme(s) may produce cohesive (sticky) ends, or produce blunt ends. Enzymes that produce cohesive ends with a single nucleotide overhang are less preferred, as they are less suitable for specific ligation than longer (e.g. 2, 3, 4, 5, 6 nt) cohesive ends.
  • a restriction enzyme's recognition sequence includes a C residue
  • this residue may be subject to methylation
  • Restriction enzymes that recognise DNA within or adjacent to this residue may be inhibited from DNA cleavage at this locus, which may affect the copy number of this locus identified in subsequent analysis.
  • the methods may use an enzyme that is insensitive to cytosine methylation.
  • the restriction enzyme(s) may have a degenerate recognition sequence (e.g. including one or more N, R, Y, W, S, M, K, H, B, V, or D nucleotides), or may have a single specific recognition sequence (e.g. a specific palindrome). Where the enzyme has a specific recognition sequence, it may cut within that sequence or may cut outside the sequence. Where it cuts outside the sequence, the cleavage sequence may be degenerate or there may be a single specific sequence.
  • Preferred restriction enzymes produce degenerate cohesive ends e.g. BsaJI or FokI, BbvI, BbvII, BstF5I, TspDTI (see Table below). BsaJI produces a degenerate cohesive end from within its recognition sequence, whereas FokI produces a degenerate cohesive end outside its recognition sequence. A preferred degree of degeneracy is 16-fold.
  • Isoschizomers and neoschizomers of these enzymes can also be used e.g. BseGI, BtsCI, HinGUII, Hpy178VI, HpyF6I, HpyF67III, StsI, BseDI, BssEDI, BsoKI, BstZIOI, Hpy99IV, HpyFIOIII, HpyF61II, HpyF67IV, SecI, Uba1442I, BseXI, BstV1I, AlwXI, BchI, BsaUI, BseKI, Bsp423I, BsrVI, Vst12I, Bst71I, LfeI, Lsp1109I, BbsI, BpiI, BpuAI, BstV2I, Bbr7I, BbvII, Bbv16II, Bco102II, BsaVI, Bsc91I, BscKI, B
  • The use of restriction enzymes that leave blunt ends, or that cut only at single specific sequences, is not preferred, because the resulting fragments all have identical end sequences and so are more liable to circularisation and/or concatenation in downstream steps than fragments with degenerate ends.
  • When linker sequences are attached after an enzyme produces a degenerate cohesive end, it is possible to attach them to only a subset of the fragments, allowing the complexity of a sample to be reduced.
  • an enzyme such as BsaJI
  • the generated fragments can have sixteen different overhangs. Linkers can be attached to a chosen subset of these 16 overhangs, and then the linker-containing sequences can be selected, thereby reducing overall complexity.
  • step (b) may involve attaching linker sequences to only a subset of the restriction fragments, and this can be achieved by attaching linkers by sequence-specific criteria.
  • This procedure allows complexity of the sample to be reduced, and permits enrichment of a chosen subset of fragments. In silico digestion of a genome provides information about the relative frequency of the possible degenerate sequences (e.g.
  • a preferred restriction enzyme is BsaJI (or an isoschizomer thereof), which is predicted by in silico digestion to cut inside substantially every human gene, thereby providing a hybridisable fragment for every gene.
  • Enzymes and subsets can be chosen based on various criteria e.g. the choice may be influenced by characteristics of the genome to be analysed, including the GC content of the DNA to be analysed. For example, although an enzyme such as BsaJI has 16 possible cleavage recognition sequences, these are not necessarily represented equally throughout the genome. In the human genome, for instance, the GT overhang is seen in 2% of cleavages, whereas AA occurs more frequently (approximately 10%).
  • the invention allows the methylation status profiles of the two samples to be compared by array CGH.
  • gDNA is highly fragmented (e.g. in a poor quality or aged/degraded sample)
  • procedures such as PCR will fail when primers span a break in the target, as extension of one primer does not produce a new template for the opposite primer.
  • methods of the invention are able to provide better results than existing protocols in situations where the starting DNA is of poor quality.
  • problems of re-annealing are avoided.
  • signal intensity in hybridisation analysis is improved, because the array probes do not have to compete against the complementary DNA strand for hybridisation to the target.
  • the invention uses a linear procedure for amplifying the sample DNA. Unlike a process such as PCR, if the product of one step of amplification does not feed back as a substrate for the next step then amplification proceeds in a linear fashion rather than exponentially. Only one new strand is formed per cycle of amplification.
  • a preferred amplification procedure is IVT, which produces complementary RNA from a DNA target.
  • Any suitable DNA-dependent RNA polymerase can be used, such as SP6, T3 or T7, all of which are readily available.
  • Use of a T7 RNA polymerase is preferred, as protocols for amplifying gDNA using it are known in the art (e.g. from reference 13).
  • Other procedures for whole genome amplification (WGA) are disclosed in reference 14.
  • Amplification by T7 RNA polymerase requires the presence of suitable promoters in the target DNA (rather than, for instance, involving extension of an annealed primer). These promoters can be introduced at the 5' ends of fragmented gDNA using known techniques.
  • IVT can then be used to produce multiple RNA copies of the gDNA, for use in CGH analysis.
  • Introduction of promoters will typically be achieved by ligating linker sequences to the 5' ends of gDNA fragments, wherein the linkers include the IVT promoter sequence.
  • linkers may be introduced by subsequent hybridisation of a primer containing the promoter sequence. The use of linkers is described in more detail below.
  • a key feature of CGH is that hybridisation must be detectable. If an array is used where hybridisation can be detected without labelling of target DNA then the CGH methods of the invention do not need to include label incorporation. Typically, however, detection of hybridisation on an array requires the target to be labelled. Conveniently, label can be incorporated during the linear amplification step. Fluorescent labelling is preferred.
  • Labelling can be direct e.g. by incorporating labelled nucleotides during nucleic acid chain extension. Rather than incorporate fluorophores directly, however, it is also possible to incorporate a specific functional group to which fluorophores can later be coupled ('post-labelling'). Both labelling methods have been used successfully, but post-labelling is preferred, as bias may be caused when using direct labelling if a fluorophore interferes with the chain extension process.
  • a preferred indirect labelling protocol uses 5-(3-aminoallyl)-UTP, which can be included together with unlabelled UTP during transcription. The aminoallyl group can be labelled after amplification using suitable reactive labels e.g. the NHS-CyDye range from Amersham Biosciences.
  • Incorporation of a large number of fluorophores means that the product can readily be detected by any of the familiar means of fluorescence detection.
  • One feature of IVT is that, unlike PCR, it gives single-stranded products. To detect these products by hybridisation, probes on the array will be from the complementary strand. In general, only one strand per restriction fragment will be amplified, halving the complexity of the detectable sequences relative to a double-stranded PCR amplification product.
  • the methods of the invention have been able to generate sufficient quantities of RNA for CGH analysis even from low amounts of starting genomic DNA e.g. ≤20ng.
  • the standard protocol from Agilent required at least 1.5 μg of genomic DNA and even the "Low Input gDNA" protocol requires at least 500ng of starting material [15].
  • the Nimblegen protocol requires 1-3 μg of starting gDNA [16].
  • Linkers
  • As mentioned above, the invention typically involves the attachment of linkers to fragments of genomic DNA. These linkers can be used to introduce binding sequences for polymerases, such as 5' promoters for use in IVT. After a linker is added to the 5' end of a fragment then that fragment becomes amplifiable by an appropriate polymerase. Ligation of amplification linkers to the 5' end of restriction fragments is known from "complexity management" [17,18] and target enrichment [19] procedures, including selective amplification of a subset of the fragments based on their adaptors. It is also described in reference 10 as part of the protocol for reducing sample complexity by "representation".
  • FIG. 13 shows examples of promoters for the T7, SP6 and T3 RNA polymerases (SEQ ID NOS: 1 to 3).
  • the preferred polymerase for use with the invention is T7 RNA polymerase, and so preferred linkers include a T7 RNA polymerase promoter sequence.
  • T7 RNA polymerase promoter sequences are known, including natural sequences [20] and artificial ones (e.g. see refs. 21-26). Different T7 RNA polymerases can have different promoter sequence preferences, and mutant T7 RNA polymerases have been produced to match specific promoters (e.g. see refs.
  • T7 RNA polymerases and promoter sequences, and can easily match any particular T7 RNA polymerase to its preferred promoter sequence.
  • the consensus 23 base-pair T7 DNA promoter is classically divided into two domains, an upstream binding domain (-17 to -5, numbered relative to the start of transcription), and a downstream initiation domain (-4 to +6). This 23mer is SEQ ID NO: 1 herein:
  • a preferred linker of the invention includes SEQ ID NO: 8, and more preferably includes SEQ ID NO: 1. This will typically be paired to a complementary sequence in a double-stranded region of the linker.
  • the 23mer of SEQ ID NO: 1 will be on the upper strand of a double-stranded sequence, where the lower strand is the template for T7 RNA polymerase i.e. the upper strand is in the same sense as the T7-produced transcript.
  • a preferred linker is shown in Figure 1A, which includes a 30mer double-stranded region with a 4-mer overhang on the lower strand.
  • the linker may include additional sequence to the 5' and/or 3' end of the T7 RNA polymerase promoter e.g. to ensure correct spacing relative to the start of the target sequence, to space the promoter away from any attached molecules in order to avoid steric interference, to add a primer sequence for subsequent PCR amplification, to add a sequence tag, etc.
  • Preferred linkers do not include a primer binding site.
  • the T7 RNA polymerase binds to double-stranded DNA, and so the promoter sequence in the linker should be double-stranded. This can be achieved by adding a linker including the double-stranded sequence, or by adding a linker with a single-stranded sequence and then adding a complementary single-stranded sequence to give the final duplex. Using a linker that includes a double-stranded region is preferred.
  • the linkers are to be attached to the 5' end of double-stranded restriction fragments, and so the sequence of at least a portion of the linker will depend on the restriction enzyme(s) used during the fragmentation.
  • Figure 3 shows linker possibilities for three different types of overhang produced by restriction enzyme digestion.
  • linkers should have single-stranded overhangs that match the overhang produced by the restriction enzyme.
  • Linkers can be joined to gDNA fragments by standard ligation techniques, including both chemical and enzymatic ligation e.g. with a ligase.
  • Ligases require a free 3' hydroxy group and a free 5' phosphate group.
  • a suitable kinase can be used to phosphorylate free 5' ends, such as a T4 polynucleotide kinase.
  • the 5' end of the lower strand (5'-CACG) should be phosphorylated in order to be ligated to a restriction fragment.
  • the 5' end of the upper strand which is not to be ligated to a restriction fragment, should not be phosphorylated. This can be achieved e.g. by blocking its 5' end.
  • the reaction mixture can be purified e.g. to remove unligated linkers, nucleotides, label, etc.
  • Reagents that are commercially available for PCR cleanup can be used for this purpose, such as the PCR purification columns from QiagenTM (e.g. QIAQuickTM). These columns rely on DNA binding to silica gel membranes.
  • linkers can selectively be attached to a subset of fragments if a restriction enzyme produces a degenerate cohesive end.
  • fifteen overhangs remain unlinked.
  • Figure 2 shows only one end of these fragments, but the other end will also have a cohesive end produced by the restriction enzyme.
  • the unlinked fragments are prone to circularisation, leading to a loss of analysable and/or linkable sequences. Fragments are also prone to concatenation to other fragments with complementary overhangs, leading to the generation of nucleic acids in which non-contiguous genome regions are juxtaposed. It is thus preferred to minimise these unwanted side reactions.
  • One way to avoid the unwanted reactions is to include nucleic acids that can hybridise to undesired single-stranded sequences, thereby blocking the unwanted reactions.
  • This procedure is illustrated in Fig. 4, in which the desired linker matches the CGTG overhang and other overhangs are blocked by the use of 14 different blockers. Once the desired linker has ligated to its target CGTG overhang then circularisation can no longer take place.
  • linkers that can ligate to them are added to the ligation reaction.
  • the blockers can hybridise to the overhangs of undesired fragments, and be ligated, in order to add sequence to prevent concatenation. By lacking phosphorylation at the 5' end not involved in ligation, the blockers cannot take part in further ligations. Additional features at the 5' end can also prevent further ligation.
  • the blocking linkers should not include overhangs complementary to or identical to the T7 linker.
  • a blocker with a sequence complementary to the desired linker should not be used, or else the desired linker would itself be blocked.
  • a blocker that matches CACG is not included e.g. to select the 5'-CGTG-3' overhang the blockers are based on 5'-GDAC-3', 5'-GNCC-3', 5'-GVGC-3' and 5'-GNTC-3'.
  • These blockers will generally have a single-stranded overhang to match the targeted restriction site, a double-stranded central region, and optionally a long single-stranded overhang that does not match any target sequence (e.g. see Figure 1B).
  • the 3' nucleotide not involved in ligation to a restriction fragment may be modified to prevent chain extension.
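  • The count of 14 blockers can be checked by expanding the four degenerate blocker patterns (GDAC, GNCC, GVGC and GNTC) with the standard IUPAC codes and comparing them with all sixteen GNNC sequences. The sketch below treats the patterns purely as strings in the order they are written in the text and makes no assumption about strand orientation; the variable names are illustrative.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes needed for these patterns.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # any base
    "D": "AGT",   # not C
    "V": "ACG",   # not T
}


def expand(pattern):
    """List every concrete sequence matched by a degenerate pattern."""
    return ["".join(p) for p in product(*(IUPAC[c] for c in pattern))]


blocker_patterns = ["GDAC", "GNCC", "GVGC", "GNTC"]
blockers = sorted(s for pat in blocker_patterns for s in expand(pat))

all_overhangs = set(expand("GNNC"))
omitted = sorted(all_overhangs - set(blockers))

print(len(blockers), "blockers:", blockers)
print("omitted:", omitted)
```

  • The two sequences left out are GCAC, which is the linker's own overhang as written, and GTGC, its reverse, consistent with the rule that blockers must be neither identical nor complementary to the desired linker.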
  • the invention provides a nucleic acid having (i) a double-stranded region containing a promoter for a DNA-dependent RNA polymerase, and (ii) a single-stranded overhang that can hybridise to an overhang produced by a restriction enzyme.
  • the single-stranded overhang is suitable for subsequent ligation reactions.
  • RNA polymerases and restriction enzymes are described above.
  • Preferred restriction enzymes are those with degenerate recognition sequences.
  • a preferred combination is a T7 RNA polymerase promoter and a BsaJI overhang (e.g. Figure 1A).
  • the double-stranded region may include up to 100 base pairs e.g. ≤90, ≤80, ≤70, ≤60, ≤50, ≤40, ≤30, etc.
  • the single-stranded overhang may begin immediately next to the end of the promoter sequence, as shown in Figure 1A. Other features of the linkers are described elsewhere herein.
  • preferred linkers match a sequence that is not palindromic, as palindromic sequences can hybridise to each other and cause ligation of linkers.
  • Four of the BsaJI overhangs are palindromic, and so these four are not preferred (GATC, GCGC, GGCC, GTAC). Any of the remaining overhangs can be used (i.e.
  • enrichment can operate at the appropriate level. If only specific chromosomes are of interest then enrichment can be used to reduce or eliminate fragments derived from other chromosomes, provided that there are appropriately-distributed overhang sequences.
  • In silico digestion of a genome sequence can be used to find the best restriction enzymes for any desired enrichment.
  • BsaJI is used to digest the human genome, and a linker is used to select only fragments with a single-stranded 5'-CGTG-3' overhang. Thus the linker will have a single-stranded overhang of 3'-GCAC-5' on the lower strand.
  • Figure 1A shows the preferred linker.
  • This linker offers a 50-fold enrichment of the genome, while giving fragments that retain full coverage of the human genome (an average of 1 fragment every 10kb).
  • a different linker, which selects a 5'-CAAG-3' overhang at the 5' end of a restriction fragment (i.e. has a 3'-GTTC-5' overhang on the lower strand), gives a 10-fold enrichment i.e. more complete coverage of the genome (1 fragment every 2-3kb), but potentially a lower signal/noise ratio.
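  • The relationship between the selected overhang, the linker's lower-strand overhang and the resulting enrichment can be sketched as follows. The sketch assumes that the fraction of fragments selected equals the overhang frequencies quoted earlier (2% for the GT overhang and approximately 10% for AA), and it takes the second fragment overhang to be CAAG, the complement of the stated 3'-GTTC-5' linker overhang. Function names are illustrative.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def lower_strand_overhang(fragment_overhang):
    """Complement of a fragment overhang given 5'->3'.
    The result is written 3'->5', matching the notation used in the text."""
    return "".join(COMPLEMENT[b] for b in fragment_overhang.upper())


def fold_enrichment(fraction_selected):
    """Fold enrichment if only `fraction_selected` of the cleavages carry the
    chosen overhang (simplifying assumption: every such fragment is recovered)."""
    return 1.0 / fraction_selected


for overhang, fraction in [("CGTG", 0.02), ("CAAG", 0.10)]:
    print(
        f"5'-{overhang}-3' pairs with 3'-{lower_strand_overhang(overhang)}-5'; "
        f"approx. {fold_enrichment(fraction):.0f}-fold enrichment"
    )
```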
  • Linkers may include a binding group such that they can be selectively extracted from a sample.
  • a linker may have a covalently-attached biotin molecule.
  • these ligated linkers can be extracted using avidin or streptavidin, for example, thereby removing the unlinked genomic DNA from the sample. This procedure is not essential, but allows a reduction in volume of material for IVT, and removes unwanted background DNA.
  • Suitable binding pairs, either member of which may be attached to a linker with the other member being used for extraction, are known in the art, and include but are not limited to: biotin/streptavidin; biotin/avidin; antigen/antibody; etc. Attachment to the linker via the 5' end of the upper strand can advantageously prevent that 5' end from being involved in unwanted ligation reactions.
  • the member of the binding pair that is not attached to the linker can be attached to a suitable solid support e.g. to a column, to a surface, or to a magnetic or paramagnetic bead.
  • Streptavidin-coated paramagnetic beads are widely available, and a preferred separation method involves: ligating biotin-labelled linkers to restriction fragments; contacting the restriction fragments with streptavidin-coated paramagnetic beads; allowing the streptavidin and fragments to interact; and washing the beads to enrich the ligated material.
  • Separation of linked fragments from unlinked fragments can also be used to concentrate a sample. For instance, where an IVT reaction requires a 20 μl volume, but the DNA to be amplified is in a larger volume (e.g. in 50 μl after cleanup), that DNA can conveniently be removed by the use of magnetic separation and then can be added directly to an IVT reaction. This process is more convenient and more rapid than concentrating the DNA to reduce the volume.
  • the invention provides a process for selectively amplifying a subset of sequences in a population of nucleic acids, comprising the steps of: (a) digesting the nucleic acids with a restriction enzyme to give restriction fragments with cohesive ends that include a degenerate sequence;
  • linker sequence includes a binding sequence for a nucleic acid polymerase, upstream of (ii) a specific sequence for hybridising to a desired subset of said degenerate sequence;
  • Preferred features of restriction digestion in step (a) are disclosed above.
  • Preferred features of linker sequences for use in step (b) are disclosed above. It is possible to ligate more than one linker sequence e.g. with BsaJI digestion, there are 16 degenerate dinucleotide sequences, and from 1 to 15 linkers can be used to enrich a subset of the total restriction fragments. Using all 16 linkers will not give any enrichment.
  • the degeneracy in step (a) is n
  • the number of linkers that may be used in step (b) is between 1 and n-1. Preferably, however, it is between 1 and n/2, in order to avoid the presence of linkers with overhangs that are complementary to each other and that would thus ligate to each other.
  • the invention also provides a process for preparing a nucleic acid sample for selective amplification of a subset of sequences, comprising steps (a) and (b) of this process. Amplification in step (c) may then be performed separately.
  • this enrichment strategy is compatible with linear amplification of the genome.
  • 20ng of genomic DNA then ligating linkers to just 1/50th of the fragments resulting from a BsaJI digestion, leaves 0.4ng of template after the linked fragments are selected i.e. 50,000-fold lower than the 20 μg of DNA used as a standard in reference 6.
  • 20ng of starting material has been shown, however, to provide ⁇ 5 μg of RNA after an overnight IVT amplification, enough for an array hybridization.
  • the array must contain probes that match the fragmented genomic DNA. Every different fragmentation of a genome will give different hybridisable sequences, and so there will be a different optimum set of probes for each of the fragmentations. Thus the best array for analysing a fragmented genome will depend on the precise fragmentation method that was used. For a specific restriction enzyme, in silico digestion can show the fragments that will be produced. This information can be used to design a set of probes that are hybridisable to the restriction fragments and that cover the genome to the desired degree. Moreover, it can be used to design a set of probes that will offer an appropriate level of coverage after enrichment has been performed as described above.
  • Probe design may also involve standard techniques, such as ensuring that probes are essentially unique within a target genome (i.e. that they have essentially no cross-hybridisation potential). If specific regions are of interest then probes may be focused on these regions e.g. on telomeres, on centromeres, on specific chromosomes, on specific genes, etc. Probe design may also be restricted by the number of probes that can be included on the chosen array platform.
  • the invention provides a nucleic acid array comprising probes for hybridisation to restriction fragments obtained by digesting a genome as described above.
  • the invention also provides a process for designing a set of probe sequences for use on a nucleic acid array, comprising the steps of: (a) selecting a target genome for analysis; (b) performing an in silico digest of that genome with a restriction enzyme, to provide a set of fragment sequences; (c) designing a set of probes, wherein the set contains at least one probe each for at least 50% of the fragment sequences.
  • the invention also provides a nucleic acid array obtainable by this design process.
  • the invention also provides a nucleic acid array comprising probes for hybridisation to at least 20% (e.g. >30%, >40%, >50%, >60%, >70%, >80%, >90%, >95%, >98%, etc.) of the restriction fragments obtainable by digestion of a genomic DNA.
  • the invention also provides a nucleic acid array comprising nucleic acid probes, wherein at least 20% (e.g. >30%, >40%, >50%, >60%, >70%, >80%, >90%, >95%, >98%, etc.) of the probes can hybridise to restriction fragments obtainable by digestion of a genomic DNA.
  • the genomic DNA is preferably human genomic DNA
  • the restriction enzyme is preferably selected from: BsaJI, FokI, BbvI, BbvII, BstF5I and TspDTI (and isoschizomers and neoschizomers thereof, as mentioned above).
  • the array will instead comprise probes for hybridisation to at least 20% (e.g. >30%, >40%, >50%, >60%, >70%, >80%, >90%, >95%, >98%, etc.) of, or at least 20% (e.g.
  • the probes can hybridise to, the restriction fragments that (a) are obtainable by digestion with a restriction enzyme that produces degenerate cohesive ends, and (b) include a specific non-degenerate sequence in the cohesive end produced by the restriction enzyme.
  • the array will include probes for hybridising at least to restriction fragments containing one of the specific 16 dinucleotides (AA, AC, AG, AT, CA, CC, CG, CT, GA, GC, GG, GT, TA, TC, TG, TT).
  • the array may include probes for hybridising to fragments with the specific 5'-CGTG overhang out of the possible 5'-CNNG overhangs.
  • Probes for including on the array can be designed based on knowledge of the target sequences.
  • a probe will have a sequence selected such that it is specific for a single target sequence i.e. probes that can hybridise to more than one target sequence are undesirable. Specific hybridisation in this way ensures that copy number polymorphism for a particular target is directly related to the ratio obtained from the array. Given the sequences of all targets, design algorithms can select probe sequences with the required specificity. Where IVT is used, the probes will be designed to be complementary to the transcribed single-stranded RNA sequences.
  • the array will include probes for at least 5% (e.g.
  • Probes for 50% of the transcribed sequences can represent one ssRNA molecule for each target restriction fragment. There will generally not be more than one probe for every target sequence.
  • the hybridisation probes on an array of the invention can be pre-synthesised before being applied to the array, or may be prepared on the array in situ (e.g. by inkjet printing, by light-directed synthesis, etc.).
  • the hybridisation probes will generally be at least 30 nucleotides long (e.g. >40nt, >50nt, >60nt, >70nt, >80nt, etc.).
  • the probes may be oligonucleotides, although it is also possible to use longer probes e.g. BAC DNA, PCR amplification products, etc.
  • the probes may be attached to the array non-covalently or, preferably, covalently.
  • the probes can be attached by a 5' terminal residue, by a 3' terminal residue, or by an internal residue.
  • Bead-based arrays may be used [30] e.g. as in the Sentrix® array.
  • the invention provides a kit, comprising two or more of: (a) a DNA-dependent RNA polymerase; (b) a linker of the invention; and (c) a restriction enzyme.
  • the polymerase of (a) is preferably a T7 RNA polymerase.
  • the linker of (b) is described in more detail above.
  • the kit may contain more than one linker. It may also contain one or more blockers, as described above.
  • the enzyme of (c) preferably produces degenerate cohesive ends.
  • the kit preferably contains at least components (a) and (b).
  • the kit may additionally include an array of the invention.
  • the term “comprising” encompasses “including” as well as “consisting” e.g. a composition “comprising” X may consist exclusively of X or may include something additional e.g. X + Y.
  • the term “about” in relation to a numerical value x means, for example, x±10%. Where necessary, the term “about” can be omitted.
  • hybridisation typically refers to specific hybridisation, and excludes non-specific hybridisation. Specific hybridisation can occur under experimental conditions chosen, using techniques well known in the art, to ensure that the majority of stable interactions between probe and target are where the probe and target have at least 90% sequence identity. The hybridisation conditions can be used to aid the design of probes in arrays, such that probe sequences are not used if they have more than 90% identity to other areas of the genome being analysed, to minimise cross-hybridisation.
  • Stable duplexes are those that remain hybridised after washing such that they will contribute to the signal obtained for that probe when reading the array.
  • Different steps of processes can be performed in different geographical locations e.g. in different countries. Thus, for instance, a step of linker attachment could take place in a different location from a step of linear amplification. Similarly, a step of linear amplification could take place in a different location from a step of applying the product of such amplification to an array e.g. the amplified material could be transported in between steps.
  • Figure 1A shows a preferred linker, containing a T7 RNA polymerase promoter (in bold) and an overhang on the lower strand, for matching a specific BsaJI fragment.
  • the linker is formed from two single-stranded DNA sequences (SEQ ID NOS: 4 & 5).
  • Figure 1B shows a blocker formed from two single-stranded DNA sequences (SEQ ID NOS: 6 & 7).
  • the 5' end of the lower strand (SEQ ID NO: 5) is phosphorylated ('P') to permit ligation to a restriction fragment.
  • Figure 2 illustrates the use of a specific linker for selecting only one out of sixteen degenerate overhang sequences produced by BsaJI digestion.
  • Figure 3 shows overhangs produced by three hypothetical restriction enzymes, RE1 to RE3, and the linkers used to join to the restriction fragments.
  • Figure 4 illustrates how undesired restriction fragments can be blocked, to prevent concatenation.
  • Figure 5 shows frequencies of the 16 possible degenerate single-stranded overhang sequences after in silico digestion of X.
  • Figures 6 and 7 show scatter plots.
  • the X axes show the log of Cy3 signal intensity
  • the Y axes show the log of Cy5 signal intensity.
  • a female sample was Cy5-labelled and a male sample was Cy3-labelled. Red points come from probes directed to the X chromosome, blue points come from probes to chromosome 16, and green points come from probes directed to the Y chromosome.
  • Figures 8 and 9 show the signal ratio (male:female) for hybridisation to probes on an array, against fragment index.
  • the probes are specific for targets in chromosome 16.
  • the probes are for targets on chromosomes X and Y.
  • Figures 10 and 11 compare direct and indirect labelling of IVT products.
  • Figure 12 shows the results of experiments using different amounts of starting material.
  • Figure 13 shows promoter sequences (SEQ ID NOS: 1 to 3) for T7, SP6 and T3 RNA polymerases.
  • the transcription start site is shown as "+1".
  • the minimum sequences (19mers) required for efficient transcription are underlined.
  • Figure 14 shows data for chromosome 16 using protocols from Agilent (14A and 14D), Nimblegen (14B and 14E) or the present invention (14C and 14F).
  • Figures 14A to 14C show datapoints for the whole chromosome;
  • 14D to 14F show the 16ptel region (i.e. an expansion of the left-hand region of 14A to 14C) with a small moving window of ⁇ 25kb.
  • Figure 15 shows data for chromosome 2 using protocols from Agilent (15B) or the present invention (15A).
  • the y-axis is log of normalised ratio.
  • the 16 sequences are not equally represented, with the GC sequence being rare but TG being common. Post-digestion selection of fragments having a particular overhang will lead to simplification of the total DNA present e.g. selection of GT sequences will give an approximately 50-fold enrichment, whereas selection of AA or TT sequences will give an approximately 10-fold enrichment.
  • the beads were then placed directly into an IVT reaction using T7 Megascript™ or T7 Megashortscript™ kits from Ambion™.
  • the IVT reaction included CyDye-UTP at a 1:1 ratio with unlabelled UTP.
  • Female targets were labelled with Cy5 and male targets were labelled with Cy3.
  • a DNA microarray was designed to analyse the targets generated by this reaction.
  • the probes on the array were selected to hybridise to target sequences present after the enrichment, and to be essentially unique within those target sequences.
  • the array included probes for only chromosomes 16, X and Y.
  • Hybridisation data were normalised to probes designed to chromosome 16, which should not differ between the male and female samples.
  • Figures 6 and 7 are scatter plots from the hybridisation.
  • probes designed to detect the X chromosome had a higher signal ratio from target derived from the female template (1.5-1.7x above that of chromosome 16 probes) than when testing the male-derived target.
  • the Y-axes show the signal ratio (male:female) for hybridisation to each probe on the array
  • the X-axes show the location of the probe sequences along the chromosome ("fragment index").
  • Figure 8 shows that the chromosome 16 data are closely clustered around the 1:1 ratio, which indicates no difference between the male and female samples.
  • the sub-telomeric (Xp) region of the X-chromosome has some identity with the equivalent region of the Y chromosome
  • Figure 9 shows that probes designed to this region (to the left of the dotted line) co-hybridise with target derived from both X and Y chromosomes. In the Y-specific region, however, this level of co-hybridisation is not seen.
  • Samples were obtained from individuals with learning disabilities, with the aim of identifying underlying chromosomal abnormalities.
  • the samples were analysed using (i) the methods and arrays available from Agilent [31], (ii) the methods and arrays available from Nimblegen, and (iii) the methods and arrays of the invention. All three protocols involve the use of arrays with oligonucleotide probes.
  • the Nimblegen arrays that were used had a similar number of probes to the arrays of the present invention.
  • a comparison of results from the three methods is shown in Figure 14 from samples with monosomy 16ptel, trisomy 16qtel.
  • the protocol of the invention was used to analyse DNA from a colorectal cancer cell line. Analysis using a Bioanalyser™ showed that the genomic DNA in the cell line was of poor quality in terms of amount and integrity. The Agilent protocol was also used to analyse the same cell line.
  • Figure 15A shows results obtained with the protocol of the invention for the relevant region of chromosome 2
  • Figure 15B shows results for the same region using the Agilent protocol.
  • the data in Figure 15B are essentially flat (log of normalised ratio ≈ 0), indicating no deletion, whereas the data in Figure 15A (log of normalised ratio < 0) indicate a potential deletion.
  • the invention is particularly useful for analysing poor quality DNA e.g. in aged samples, degraded samples, preserved samples (such as formalin-fixed paraffin-embedded samples), etc.
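The normalisation to chromosome 16 probes described above for the male/female hybridisation can be sketched as follows. This is only an illustration under stated assumptions: the text says that data were normalised to chromosome 16 probes but does not say how, so the use of log2 ratios and of the median as the centring statistic, the function name `normalised_log_ratios` and the probe record layout are all hypothetical choices. The male sample is taken as Cy3 and the female sample as Cy5, as in the experiment described above.

```python
import math
from statistics import median


def normalised_log_ratios(probes, reference_chrom="16"):
    """Return (chromosome, normalised log2 male:female ratio) for each probe.

    probes: list of dicts with keys 'chrom', 'cy3' (male signal) and
    'cy5' (female signal). The record layout is an assumption made for
    this sketch.
    """
    # Raw log2 ratios for the reference probes, which should not differ
    # between the male and female samples.
    reference = [
        math.log2(p["cy3"] / p["cy5"])
        for p in probes
        if p["chrom"] == reference_chrom
    ]
    if not reference:
        raise ValueError("no probes found for the reference chromosome")

    # Assumption: centre on the median of the reference log ratios.
    centre = median(reference)

    return [
        (p["chrom"], math.log2(p["cy3"] / p["cy5"]) - centre)
        for p in probes
    ]


if __name__ == "__main__":
    example = [
        {"chrom": "16", "cy3": 200.0, "cy5": 100.0},
        {"chrom": "16", "cy3": 400.0, "cy5": 200.0},
        {"chrom": "X", "cy3": 100.0, "cy5": 100.0},
    ]
    for chrom, value in normalised_log_ratios(example):
        print(chrom, value)
```

After centring, chromosome 16 probes sit near zero, and a probe for a region present in one copy in the male sample and two copies in the female sample falls below zero.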

Abstract

In a CGH array method, amplification of genomic DNA proceeds linearly rather than exponentially. In vitro transcription using T7 RNA polymerase is particularly suitable. Using a linear amplification method avoids the bias associated with prior art methods. Amplification preferably follows restriction fragmentation. Preferred restriction enzymes produce degenerate cohesive ends, which allows complexity reduction of the products by the use of specific linkers, and in particular linkers that include a T7 RNA polymerase promoter.

Description

ARRAY COMPARATIVE GENOMIC HYBRIDISATION METHODS USING LINEAR AMPLIFICATION
All documents and on-line information cited herein are incorporated by reference in their entirety.
TECHNICAL FIELD
This invention is in the field of comparative genomic hybridisation (CGH) analysis.
BACKGROUND ART
CGH is a technique in which two labelled samples are compared by allowing them to hybridise and subsequently looking for regions of differential hybridisation. It has been used particularly in cytogenetic analysis, where it allows the comparison of a genome isolated from a clinical (test) sample (e.g. derived from a cancer patient) with a control (reference) sample in a single hybridisation. It was originally disclosed in reference 1.
Whereas early CGH methods relied on hybridisation to a reference chromosome sample, more recently array-based CGH methods have been developed [2-5]. In these methods, the reference chromosome is replaced by an array of immobilised nucleic acid probes, with the individual immobilised sequences having known chromosomal locations and covering the genome to a desired degree. By choosing appropriate probes, this method gives the potential to cover any genomic region of interest, and to any desired resolution.
These two distinct methods are referred to as 'chromosomal CGH' and 'array CGH'.
Commercial array CGH kits are now available, including SpectralChip™ from Spectral Genomics, GenoSensor™ from Vysis, QArray™ from Genetix, the "Human Genome CGH Microarray" from Agilent, and Nimblegen's "Array CGH" product. Agilent's "Human Genome CGH Microarray 44A" is a high-resolution 60-mer oligonucleotide-based microarray that contains over 40,000 probes sourced from the NCBI human genome. This array enables genome-wide profiling of genomic aberrations (including copy number changes) on a single chip, starting from only 25ng of total genomic DNA per sample [6]. NimbleGen's arrays contain ~385,000 probes on a single glass slide, with 16μm x 16μm probes tiled through genic and intergenic regions at a median probe spacing of 6kbp.
To provide enough material for hybridisation in array CGH, a key step in current protocols is exponential amplification of sample DNA, and reference 7 established the efficacy of whole genome amplification (WGA) approaches for achieving this goal. The protocol used in reference 7 amplifies DNA using either Φ29 or Bst DNA polymerase. Agilent's standard protocol utilises Φ29 random amplification [8]. Reference 9 achieves WGA by using degenerate oligonucleotide-primed PCR and strand displacement amplification (SDA) with Bst polymerase. Reference 10 uses PCR to amplify the fragments obtained after BglII digestion of the genome. Spectral's process amplifies DNA using the Klenow fragment and random primers [11], although the degree of amplification is lower than in references 8 & 9. Reference 4 also uses random primers and Klenow DNA polymerase. Amplification can be avoided only where a large amount of sample genomic DNA is available, which may be rare in clinical settings. In reference 6, for example, artificial tests used 20μg standard genomic DNA for initial tests, whereas real tests used 10ng DNA - a 2000-fold difference.
Because the array CGH procedure involves comparing long chromosomal DNA molecules to a large number of smaller targets, the sample DNA needs to be shortened before being applied. This is generally achieved by fragmenting the sample DNA. Agilent's protocol uses restriction digestion after amplification, whereas Spectral's protocol uses sonication before random prime labelling and amplification.
The use of an exponential amplification step in existing array CGH protocols is not ideal. Exponential processes are very sensitive to initial conditions, and non-uniform amplification of DNA can cause biasing in the generation of certain loci, particularly over a typical series of about 20 amplification cycles, giving artificial changes in copy number and misleading results from CGH analysis.
Reference 9 acknowledges that "PCR-based WGA methods have been reported to cause 4 to 6-fold amplification bias", and that "DOP-PCR is not an ideal approach for WGA, as the amplification bias could produce false results".
Reference 12 discloses a method for producing amplified amounts of RNA from genomic DNA, and the RNA is used for CGH analysis, including by array. Genomic DNA is annealed to an oligonucleotide that contains a primer domain linked to a RNA polymerase-promoter domain. The oligonucleotide is then subjected to a primer extension reaction to produce dsDNA molecules including a RNA polymerase promoter at the 5' end. Transcription from these dsDNA molecules is used to give labelled RNA copies of the genomic DNA, and these RNA molecules are used as probes in hybridisation reactions.
It is an object of the invention to provide further and improved methods for use in array CGH analysis, and in particular to provide methods that do not suffer from the problems associated with amplifications used in existing CGH array protocols.
DISCLOSURE OF THE INVENTION
According to the invention, genomic DNA (gDNA) is amplified in a linear process prior to CGH array analysis. Surprisingly, in vitro transcription using T7 RNA polymerase has been found to generate at least 30-40μg of RNA from nanogram quantities of template, covering the whole genome. This yield and coverage was not expected, and provides enough RNA for several hybridisation analyses. Moreover, using a linear amplification method avoids the bias associated with prior art methods.
Thus the invention provides, in a process for amplifying a DNA sample for analysis by array CGH, the improvement consisting of the use of a linear amplification. Similarly, the invention provides a process for amplifying a DNA sample for analysis by array CGH, comprising the steps of: (a) fragmenting the DNA sample; and (b) amplifying the resulting fragments using a linear nucleic acid amplification method. The preferred linear amplification method is in vitro transcription (IVT).
To achieve linear amplification by IVT, it is necessary for the target DNA to include the polymerase's promoter. These sequences will typically be added to the gDNA after it has been fragmented. Thus the invention provides a process for amplifying a DNA sample for analysis by array CGH, comprising the steps of: (a) fragmenting the DNA sample, to give fragments; (b) attaching linker sequences to the fragments, to give amplifiable sequences, wherein the linkers include sequences for initiation of linear amplification; and (c) subjecting the amplifiable sequences to a linear nucleic acid amplification method. The invention also provides a process for preparing a nucleic acid sample for amplification, comprising steps (a) and (b) of this process. Amplification in step (c) may then be performed separately.
The invention also provides a process for analysing a nucleic acid sample using an array, comprising amplification steps (a) to (c) as described above, and further comprising the step of: (d) applying amplified nucleic acid from step (c) to a nucleic acid array. The amplified nucleic acid is preferably RNA, produced in step (c) by an in vitro transcription reaction, and is preferably labelled. Preferably this analysis process is performed for a test sample and a reference sample, under substantially identical conditions for each sample, and the results of the two analyses are compared to give a CGH result.
The invention also provides a process for analysing a nucleic acid sample using an array, comprising the step of applying amplified nucleic acid to a nucleic acid array, wherein the amplified nucleic acid has been prepared by a process comprising amplification steps (a) to (c) as described above. Thus (i) steps (a) to (c) and (ii) step (d) need not be performed by the same person, at the same time, or in the same place.
As described above, fragmentation of gDNA prior to array CGH has been achieved in the prior art by restriction digestion of DNA after amplification [8] and by sonication of DNA before amplification [11]. With the invention, it is preferred instead to perform restriction digestion before amplification. This approach is incompatible with the amplification used in the Agilent protocol, because the Φ29 enzyme does not work on restriction-digested DNA. Thus the invention provides a process for preparing a DNA sample for analysis by array CGH, comprising the steps of: (a) performing a restriction digestion on the DNA sample, to give fragments; and (b) subjecting the fragments to a nucleic acid amplification process. The amplification process preferably gives linear amplification and, as described above, will typically be preceded by addition of linker sequences to the fragments.
It may amplify both strands of the fragments, or only one strand. Processes of the invention preferably do not use any PCR amplification (or any other exponential amplification process) prior to CGH analysis.
In CGH analysis, the DNA sample is typically a genomic DNA sample. Analysis will generally be performed on total genomic DNA which, in a eukaryote, includes DNA from the nucleus and other organelles e.g. from the mitochondria.
The invention can be used for comparing all types of DNA by CGH, and is particularly suitable for analysing human cells, including cancer cells.
Fragmenting genomic DNA
Fragmentation of genomic DNA in a sample can be achieved physically, chemically, or enzymatically. Physical and chemical fragmentation is essentially random, whereas enzymatic fragmentation using restriction enzymes is sequence-specific and repeatable. Thus restriction digestion is a preferred method for fragmenting gDNA in a sample.
The choice of restriction enzyme for digestion of gDNA can have a major impact on subsequent CGH analysis. For example, if a frequent cutter is chosen then the genome will be represented by a large number of short fragments; on the other hand, if a rare cutter is used then there will be a smaller number of longer fragments of the genome. Thus the choice of enzyme dictates the number and length of target sequences for hybridisation, which is of key importance in array CGH analysis. The availability of sequence information for the human genome allows in silico restriction digestions to be performed in order to identify enzymes that will give desired coverage of the genome with suitable fragment sizes.
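The effect of enzyme choice on fragment number and length can be explored with a very small in silico digest. The sketch below is illustrative only: the function names `digest` and `expected_spacing`, the toy sequence and the single-strand treatment are assumptions made here, and the expected spacing of 4^k bases for a k-base site holds only for a random sequence with equal base frequencies, which a real genome is not.

```python
def digest(seq, site, cut_offset):
    """Cut seq at every occurrence of a non-degenerate recognition site.

    cut_offset is the number of bases of the site that stay on the
    upstream fragment (top strand only; overhang structure is ignored).
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)  # step by one to catch overlapping sites
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def expected_spacing(site):
    """Mean distance between sites in a random, equal-frequency sequence."""
    return 4 ** len(site)


if __name__ == "__main__":
    toy = "TTAGCTAAGTACGG"
    # AluI cuts AG|CT and RsaI cuts GT|AC.
    print(digest(toy, "AGCT", 2))
    print(digest(toy, "GTAC", 2))
    # A 4-base site is expected far more often than a 6-base site.
    print(expected_spacing("AGCT"), expected_spacing("GAATTC"))
```

Running the same function over a real genome sequence with different candidate sites gives the fragment counts and length distributions on which the choice of enzyme can be based.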
More than one restriction enzyme can be used to digest gDNA. In this case, the enzymes may or may not be isoschizomers. For instance, the Agilent protocol [8] digests gDNA with both AluI (cuts at AG|CT) and RsaI (cuts at GT|AC).
Preferred restriction enzymes are type II or type IIs, including particularly type II enzymes with asymmetric recognition sequences.
The restriction enzyme(s) may produce cohesive (sticky) ends, or produce blunt ends. Enzymes that produce cohesive ends with a single nucleotide overhang are less preferred, as they are less suitable for specific ligation than longer (e.g. 2, 3, 4, 5, 6 nt) cohesive ends.
Where a restriction enzyme's recognition sequence includes a C residue, this residue may be subject to methylation. Restriction enzymes that recognise DNA within or adjacent to this residue may be inhibited from DNA cleavage at this locus, which may affect the copy number of this locus identified in subsequent analysis. The methods may use an enzyme that is insensitive to cytosine methylation.
The restriction enzyme(s) may have a degenerate recognition sequence (e.g. including one or more N, R, Y, W, S, M, K, H, B, V, or D nucleotides), or may have a single specific recognition sequence (e.g. a specific palindrome). Where the enzyme has a specific recognition sequence, it may cut within that sequence or may cut outside the sequence. Where it cuts outside the sequence, the cleavage sequence may be degenerate or there may be a single specific sequence. Preferred restriction enzymes produce degenerate cohesive ends e.g. BsaJI or FokI, BbvI, BbvII, BstF5I, TspDTI (see Table below). BsaJI produces a degenerate cohesive end from within its recognition sequence, whereas FokI produces a degenerate cohesive end outside its recognition sequence. A preferred degree of degeneracy is 16-fold.
[Table of preferred restriction enzymes producing degenerate cohesive ends: image imgf000007_0001, not reproduced]
Isoschizomers and neoschizomers of these enzymes can also be used e.g. BseGI, BtsCI, HinGUII, Hpy178VI, HpyF6I, HpyF67III, StsI, BseDI, BssEDI, BsoKI, BstZIOI, Hpy99IV, HpyFIOIII, HpyF61II, HpyF67IV, SecI, Uba1442I, BseXI, BstV1I, AlwXI, BchI, BsaUI, BseKI, Bsp423I, BsrVI, Vst12I, Bst71I, LfeI, Lsp1109I, BbsI, BpiI, BpuAI, BstV2I, Bbr7I, BbvII, Bbv16II, Bco102II, BsaVI, Bsc91I, BscKI, BspBS31I, BspIS4I, BspTS514I, BspVI, BstBS32I, BstTS5I, Rtr20I, etc.
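The degree of degeneracy of a recognition or cleavage sequence written with the ambiguity letters listed above can be computed by multiplying the number of bases each letter stands for. The sketch below uses the standard IUPAC nucleotide codes; the function name `degeneracy` is a hypothetical helper introduced for illustration.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "M": "AC", "K": "GT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT",
    "N": "ACGT",
}


def degeneracy(pattern):
    """Number of distinct concrete sequences matched by an IUPAC pattern."""
    total = 1
    for letter in pattern.upper():
        total *= len(IUPAC[letter])
    return total


if __name__ == "__main__":
    # BsaJI recognises CCNNGG, so its cohesive end is 16-fold degenerate.
    print(degeneracy("CCNNGG"))
```

A site with two N positions, such as the CCNNGG site of BsaJI, gives the 16-fold degeneracy mentioned above.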
The use of restriction enzymes that leave blunt ends, or that cut only at single specific sequences, is not preferred, because the resulting fragments all have identical end sequences and so are more liable to circularisation and/or concatenation in downstream steps than fragments with degenerate ends. Moreover, after an enzyme produces a degenerate cohesive end, it is possible to attach linker sequences to only a subset of the fragments, allowing the complexity of a sample to be reduced. As an illustration, if an enzyme such as BsaJI is used, the generated fragments can have sixteen different overhangs. Linkers can be attached to a chosen subset of these 16 overhangs, and then the linker-containing sequences can be selected, thereby reducing overall complexity. This procedure is illustrated in Figure 2, where a linker is used to select a single one of the 16 overhanging sequences. The use of linkers is described in more detail below.
Thus, where step (a) of the amplification process involves fragmenting the genomic DNA sample with a restriction enzyme that produces degenerate cohesive ends, step (b) may involve attaching linker sequences to only a subset of the restriction fragments, and this can be achieved by attaching linkers by sequence-specific criteria. This procedure allows complexity of the sample to be reduced, and permits enrichment of a chosen subset of fragments. In silico digestion of a genome provides information about the relative frequency of the possible degenerate sequences (e.g. the proportion of each of the 16 possible dinucleotides after BsaJI digestion), thereby allowing linkers to be chosen to target a specific subset of fragments produced by a given restriction enzyme.
A preferred restriction enzyme is BsaJI (or an isoschizomer thereof), which is predicted by in silico digestion to cut inside substantially every human gene, thereby providing a hybridisable fragment for every gene.
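The tally of degenerate overhangs from an in silico digest can be sketched as below. This is a simplified illustration, not the procedure actually used to produce the reported frequencies: it assumes the BsaJI site CCNNGG cut after the first C to leave a 5'-CNNG overhang, it reads only the top strand (the opposite fragment end carries the reverse complement of the central dinucleotide), and the names `bsaji_sites` and `overhang_frequencies` are hypothetical.

```python
import re
from collections import Counter

# Lookahead so that overlapping sites are all reported.
# Group 1 is the 4-base overhang CNNG, group 2 the central dinucleotide NN.
BSAJI = re.compile(r"(?=C(C([ACGT]{2})G)G)")


def bsaji_sites(seq):
    """Return (cut position, overhang, central dinucleotide) for each site."""
    seq = seq.upper()
    return [(m.start() + 1, m.group(1), m.group(2)) for m in BSAJI.finditer(seq)]


def overhang_frequencies(seq):
    """Fraction of sites carrying each central dinucleotide (top strand)."""
    counts = Counter(nn for _, _, nn in bsaji_sites(seq))
    total = sum(counts.values())
    return {nn: n / total for nn, n in counts.items()} if total else {}


if __name__ == "__main__":
    toy = "AAACCGTGGTTTTCCAAGGAAACCGTGGA"
    for site in bsaji_sites(toy):
        print(site)
    print(overhang_frequencies(toy))
```

Applied to a genome sequence, the resulting frequencies indicate which linker overhangs would select a small or a large subset of the fragments.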
Enzymes and subsets can be chosen based on various criteria e.g. the choice may be influenced by characteristics of the genome to be analysed, including the GC content of the DNA to be analysed. For example, although an enzyme such as BsaJI has 16 possible cleavage recognition sequences, these are not necessarily represented equally throughout the genome. In the human genome, for instance, the GT overhang is seen in 2% of cleavages, whereas AA occurs more frequently (approximately 10%).
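The relationship between the frequency of a chosen overhang and the resulting simplification of the sample can be written as a one-line calculation. The sketch below makes idealised assumptions that are not stated in the text: enrichment is measured as the reciprocal of the fraction of fragments selected, every fragment with a targeted overhang is captured, and no other fragment is carried over. The function name `fold_enrichment` is hypothetical.

```python
def fold_enrichment(selected_fractions):
    """Idealised fold enrichment when linkers target the given overhangs.

    selected_fractions: the fraction of all cleavages carrying each
    targeted overhang, e.g. [0.02] for an overhang seen in 2% of cleavages.
    """
    selected = sum(selected_fractions)
    if not 0 < selected <= 1:
        raise ValueError("selected fractions must sum to a value in (0, 1]")
    return 1.0 / selected


if __name__ == "__main__":
    print(fold_enrichment([0.02]))  # overhang seen in 2% of cleavages
    print(fold_enrichment([0.10]))  # overhang seen in about 10% of cleavages
    print(fold_enrichment([1.0]))   # targeting every overhang: no enrichment
```

Under these assumptions an overhang seen in 2% of cleavages corresponds to a 50-fold reduction in complexity, and one seen in about 10% to a 10-fold reduction.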
If one sample is digested with a methylation-insensitive enzyme and another sample is digested with a methylation-sensitive isoschizomer, the invention allows the methylation status profiles of the two samples to be compared by array CGH.
Linear nucleic acid amplification
As described in reference 13, to obtain quantities of DNA sufficient for microarray analysis from normal clinical samples, it must be amplified. Existing protocols use exponential amplification procedures, but these are highly susceptible to bias. Any sequence-dependent or length-dependent biases in the amplification process are themselves exponentially amplified. These disadvantages of exponential amplification techniques are well known, but they are particularly acute in methods such as CGH, where the basic aim is to compare two samples. If amplifications of a reference and test sample differ in efficiency by just 2% then after 20 cycles the ratio between the two products is (0.98)^20 ≈ 0.668, i.e. one sample yields only about 67% of the other. This large difference in copy number at the end of amplification can frustrate the basic aim of CGH.
Moreover, where gDNA is highly fragmented (e.g. in a poor quality or aged/degraded sample), procedures such as PCR will fail when primers span a break in the target, as extension of one primer does not produce a new template for the opposite primer. Advantageously, methods of the invention are able to provide better results than existing protocols in situations where the starting DNA is of poor quality. Furthermore, problems of re-annealing are avoided. As only one strand of the dsDNA template is copied, signal intensity in hybridisation analysis is improved, because the array probes do not have to compete against the complementary DNA strand for hybridisation to the target. Thus the invention uses a linear procedure for amplifying the sample DNA. Unlike a process such as PCR, if the product of one step of amplification does not feed back as a substrate for the next step then amplification proceeds in a linear fashion rather than exponentially. Only one new strand is formed per cycle of amplification.
A preferred amplification procedure is IVT, which produces complementary RNA from a DNA target. Any suitable DNA-dependent RNA polymerase can be used, such as SP6, T3 or T7, all of which are readily available. Use of a T7 RNA polymerase is preferred, as protocols for amplifying gDNA using it are known in the art (e.g. from reference 13). Other procedures for whole genome amplification (WGA) are disclosed in reference 14.
Amplification by T7 RNA polymerase requires the presence of suitable promoters in the target DNA (rather than, for instance, involving extension of an annealed primer). These promoters can be introduced at the 5' ends of fragmented gDNA using known techniques.
IVT can then be used to produce multiple RNA copies of the gDNA, for use in CGH analysis. Introduction of promoters will typically be achieved by ligating linker sequences to the 5' ends of gDNA fragments, wherein the linkers include the IVT promoter sequence. As an alternative, they may be introduced by subsequent hybridisation of a primer containing the promoter sequence. The use of linkers is described in more detail below.
A key feature of CGH is that hybridisation must be detectable. If an array is used where hybridisation can be detected without labelling of target DNA then the CGH methods of the invention do not need to include label incorporation. Typically, however, detection of hybridisation on an array requires the target to be labelled. Conveniently, label can be incorporated during the linear amplification step. Fluorescent labelling is preferred.
Labelling can be direct e.g. by incorporating labelled nucleotides during nucleic acid chain extension. Rather than incorporate fluorophores directly, however, it is also possible to incorporate a specific functional group to which fluorophores can later be coupled ('post-labelling'). Both labelling methods have been used successfully, but post-labelling is preferred, as bias may be caused when using direct labelling if a fluorophore interferes with the chain extension process. A preferred indirect labelling protocol uses 5-(3-aminoallyl)-UTP, which can be included together with unlabelled UTP during transcription. Fluorophores can be coupled to the aminoallyl group after amplification using suitable reactive labels e.g. the NHS-CyDye range from Amersham Biosciences.
Unlike a typical sequencing reaction, it is not necessary to use different fluorophores for different nucleotides in the same strand during IVT, because individual nucleotides do not need to be distinguished. Thus 1, 2, 3 or 4 of A, C, G and T/U may be labelled, and a mixture of labelled and unlabelled NTPs can be used. It is not necessary to replace every instance of a particular nucleotide with the labelled form. Thus, for example, a mixture of three unlabelled NTPs can be used, plus the fourth NTP in both labelled and unlabelled form, provided that the amount of incorporated label is adequate for detection. Use of labelled UTP is preferred, and a mixture can include labelled and unlabelled UTP at a 1:1 ratio. Incorporation of a large number of fluorophores (e.g. in at least 5% of incorporated nucleotides, such as >10%, >20%, >30%, >40%, >50%, >75%, or more) means that the product can readily be detected by any of the familiar means of fluorescence detection.
One feature of IVT is that, unlike PCR, it gives single-stranded products. To detect these products by hybridisation, probes on the array will be from the complementary strand. In general, only one strand per restriction fragment will be amplified, halving the complexity of the detectable sequences relative to a double-stranded PCR amplification product.
Advantageously, the methods of the invention have been able to generate sufficient quantities of RNA for CGH analysis even from low amounts of starting genomic DNA e.g. ~20ng. In contrast, the standard protocol from Agilent required at least 1.5μg of genomic DNA and even the "Low Input gDNA" protocol requires at least 500ng of starting material [15]. The Nimblegen protocol requires 1-3μg of starting gDNA [16].
Linkers
As mentioned above, the invention typically involves the attachment of linkers to fragments of genomic DNA. These linkers can be used to introduce binding sequences for polymerases, such as 5' promoters for use in IVT. After a linker is added to the 5' end of a fragment then that fragment becomes amplifiable by an appropriate polymerase. Ligation of amplification linkers to the 5' end of restriction fragments is known from "complexity management" [17,18] and target enrichment [19] procedures, including selective amplification of a subset of the fragments based on their adaptors. It is also described in reference 10 as part of the protocol for reducing sample complexity by "representation".
Sequence requirements for polymerase binding sites are well known in the art. Figure 13 shows examples of promoters for the T7, SP6 and T3 RNA polymerases (SEQ ID NOS: 1 to 3). The preferred polymerase for use with the invention is T7 RNA polymerase, and so preferred linkers include a T7 RNA polymerase promoter sequence. Various T7 RNA polymerase promoter sequences are known, including natural sequences [20] and artificial ones (e.g. see refs. 21-26). Different T7 RNA polymerases can have different promoter sequence preferences, and mutant T7 RNA polymerases have been produced to match specific promoters (e.g. see refs. 27 & 28), but the skilled person can routinely obtain both T7 RNA polymerases and promoter sequences, and can easily match any particular T7 RNA polymerase to its preferred promoter sequence. The consensus 23 base-pair T7 DNA promoter is classically divided into two domains, an upstream binding domain (-17 to -5, numbered relative to the start of transcription), and a downstream initiation domain (-4 to +6). This 23mer is SEQ ID NO: 1 herein:
5'-TAATACGACTCACTATAGGGAGA-3'
The minimum sequence required for efficient transcription (underlined in Figure 13) is the first 19mer of SEQ ID NO: 1 (i.e. SEQ ID NO: 8: TAATACGACTCACTATAGG). Thus a preferred linker of the invention includes SEQ ID NO: 8, and more preferably includes SEQ ID NO: 1. This will typically be paired to a complementary sequence in a double-stranded region of the linker. The 23mer of SEQ ID NO: 1 will be on the upper strand of a double-stranded sequence, where the lower strand is the template for T7 RNA polymerase i.e. the upper strand is in the same sense as the T7-produced transcript. A preferred linker is shown in Figure 1A, which includes a 30mer double-stranded region with a 4-mer overhang on the lower strand.
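The relationship between the two promoter sequences and the lower (template) strand can be checked with a short Python sketch. The sequences are those quoted in the text for SEQ ID NOS: 1 and 8; the variable and function names are hypothetical, and the lower strand shown covers only the 23mer promoter, not the full linker of Figure 1A.

```python
# Upper-strand promoter sequences as quoted in the text.
T7_PROMOTER_23MER = "TAATACGACTCACTATAGGGAGA"  # SEQ ID NO: 1
T7_MINIMAL_19MER = "TAATACGACTCACTATAGG"       # SEQ ID NO: 8

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

# The lower strand pairs with the upper strand and serves as the template.
lower_strand = reverse_complement(T7_PROMOTER_23MER)

print(len(T7_PROMOTER_23MER), len(T7_MINIMAL_19MER))
print(T7_PROMOTER_23MER.startswith(T7_MINIMAL_19MER))
print(lower_strand)
```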
The linker may include additional sequence to the 5' and/or 3' end of the T7 RNA polymerase promoter e.g. to ensure correct spacing relative to the start of the target sequence, to space the promoter away from any attached molecules in order to avoid steric interference, to add a primer sequence for subsequent PCR amplification, to add a sequence tag, etc. Preferred linkers do not include a primer binding site.
The T7 RNA polymerase binds to double-stranded DNA, and so the promoter sequence in the linker should be double-stranded. This can be achieved by adding a linker including the double-stranded sequence, or by adding a linker with a single-stranded sequence and then adding a complementary single-stranded sequence to give the final duplex. Using a linker that includes a double-stranded region is preferred.
For IVT, the linkers are to be attached to the 5' end of double-stranded restriction fragments, and so the sequence of at least a portion of the linker will depend on the restriction enzyme(s) used during the fragmentation. For example, Figure 3 shows linker possibilities for three different types of overhang produced by restriction enzyme digestion.
Where fragmentation used a restriction enzyme that generates cohesive ends then the linkers should have single-stranded overhangs that match the overhang produced by the restriction enzyme.
Linkers can be joined to gDNA fragments by standard ligation techniques, including both chemical and enzymatic ligation e.g. with a ligase. Ligases require a free 3' hydroxy group and a free 5' phosphate group. Thus the linkers and fragments should, if necessary, be treated appropriately, either chemically or enzymatically. For example, a suitable kinase can be used to phosphorylate free 5' ends, such as a T4 polynucleotide kinase. Looking at Figure 1A as an example, the 5' end of the lower strand (5'-CACG...) should be phosphorylated in order to be ligated to a restriction fragment. The 5' end of the upper strand, which is not to be ligated to a restriction fragment, should not be phosphorylated. This can be achieved e.g. by blocking its 5' end.
After ligation, the reaction mixture can be purified e.g. to remove unligated linkers, nucleotides, label, etc. Reagents that are commercially available for PCR cleanup can be used for this purpose, such as the PCR purification columns from Qiagen™ (e.g. QIAQuick™). These columns rely on DNA binding to silica gel membranes.
As mentioned above, and as illustrated in Figure 2, linkers can selectively be attached to a subset of fragments if a restriction enzyme produces a degenerate cohesive end. In the situation illustrated in Figure 2, fifteen overhangs remain unlinked. Figure 2 shows only one end of these fragments, but the other end will also have a cohesive end produced by the restriction enzyme. Where the two cohesive ends are complementary then the unlinked fragments are prone to circularisation, leading to a loss of analysable and/or linkable sequences. Fragments are also prone to concatenation to other fragments with complementary overhangs, leading to the generation of nucleic acids in which non-contiguous genome regions are juxtaposed. It is thus preferred to minimise these unwanted side reactions.
One way to avoid the unwanted reactions is to include nucleic acids that can hybridise to undesired single-stranded sequences, thereby blocking the unwanted reactions. This procedure is illustrated in Figure 4, in which the desired linker matches the CGTG overhang and other overhangs are blocked by the use of 14 different blockers. Once the desired linker has ligated to its target CGTG overhang then circularisation can no longer take place. To minimise concatenation and circularisation of the other overhangs, linkers that can ligate to them are added to the ligation reaction. The blockers can hybridise to the overhangs of undesired fragments, and be ligated, in order to add sequence to prevent concatenation. By lacking phosphorylation at the 5' end not involved in ligation, the blockers cannot take part in further ligations. Additional features at the 5' end can also prevent further ligation.
The blocking linkers should not include overhangs complementary to or identical to the T7 linker. A blocker with a sequence complementary to the desired linker should not be used, or else the desired linker would itself be blocked. When matching the CGTG overhang, therefore, a blocker that matches CACG is not included e.g. to select the 5'-CGTG-3' overhang the blockers are based on 5'-GDAC-3', 5'-GNCC-3', 5'-GVGC-3' and 5'-GNTC-3'.
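The four degenerate blocker patterns can be expanded into explicit sequences with a short Python sketch. It uses the standard IUPAC nucleotide codes (D = A, G or T; V = A, C or G; N = any base); the function and variable names are hypothetical, and the enumeration simply treats each pattern as a four-letter string in the orientation given in the text.

```python
from itertools import product

# Standard IUPAC codes needed for these patterns.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "D": "AGT", "V": "ACG", "N": "ACGT"}

def expand(pattern):
    """Return the set of explicit sequences matched by a degenerate pattern."""
    return {"".join(bases) for bases in product(*(IUPAC[c] for c in pattern))}

BLOCKER_PATTERNS = ["GDAC", "GNCC", "GVGC", "GNTC"]

blocked = set().union(*(expand(p) for p in BLOCKER_PATTERNS))
all_overhangs = expand("GNNC")
not_blocked = all_overhangs - blocked

print(len(blocked), sorted(not_blocked))
```

The patterns cover 14 of the 16 GNNC sequences, matching the 14 blockers mentioned for Figure 4; the two sequences left uncovered are GCAC and GTGC, which are complementary to each other.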
These blockers will generally have a single-stranded overhang to match the targeted restriction site, a double-stranded central region, and optionally a long single-stranded overhang that does not match any target sequence (e.g. see Figure 1B). The 3' nucleotide not involved in ligation to a restriction fragment (the 3' end of the lower strand in Figure 1B) may be modified to prevent chain extension.
The invention provides a nucleic acid having (i) a double-stranded region containing a promoter for a DNA-dependent RNA polymerase, and (ii) a single-stranded overhang that can hybridise to an overhang produced by a restriction enzyme. The single-stranded overhang is suitable for subsequent ligation reactions. Suitable RNA polymerases and restriction enzymes are described above. Preferred restriction enzymes are those with degenerate recognition sequences. A preferred combination is a T7 RNA polymerase promoter and a BsaJI overhang (e.g. Figure 1A). The double-stranded region may include up to 100 base pairs e.g. <90, <80, <70, <60, <50, <40, <30, etc. The single-stranded overhang may begin immediately next to the end of the promoter sequence, as shown in Figure 1A. Other features of the linkers are described elsewhere herein.
Where degenerate overhangs are utilised, preferred linkers match a sequence that is not palindromic, as palindromic sequences can hybridise to each other and cause ligation of linkers. Four of the 16 possible BsaJI overhangs are palindromic, and so these four are not preferred (GATC, GCGC, GGCC, GTAC). Any of the remaining overhangs can be used (i.e. 3'-GAAC-5', 3'-GACC-5', 3'-GAGC-5', 3'-GCAC-5', 3'-GCCC-5', 3'-GCTC-5', 3'-GGAC-5', 3'-GGGC-5', 3'-GGTC-5', 3'-GTCC-5', 3'-GTGC-5', 3'-GTTC-5'). The 3'-GCAC-5' and 3'-GTTC-5' overhangs are the most preferred.
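The split into four palindromic and twelve non-palindromic overhangs can be reproduced with a small Python sketch. Here a sequence is treated as palindromic when it equals its own reverse complement, a property that does not depend on whether the sequence is written 5' to 3' or 3' to 5'. The function and variable names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def is_palindromic(seq):
    """True if the sequence equals its own reverse complement."""
    return seq == seq.translate(COMPLEMENT)[::-1]

# All 16 overhangs of the form GNNC, in the notation used in the text.
overhangs = ["G" + a + b + "C" for a, b in product("ACGT", repeat=2)]

palindromic = [o for o in overhangs if is_palindromic(o)]
usable = [o for o in overhangs if not is_palindromic(o)]

print(palindromic)
print(len(usable))
```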
Enrichment
The approach illustrated in Figure 2, where specific linkers are used to select a subset of restriction fragments, leads to enrichment of that subset, and to a consequent reduction in the complexity of sequences in a sample. The process reduces the amount of gDNA being analysed, which gives better signal-to-noise ratios during hybridisation; on the other hand, it reduces the number of sequences available for hybridisation, thereby leading to less extensive coverage of the genome, which potentially means that mutations will not be seen. For any given situation, the skilled person can determine the necessary balance between coverage and signal. Provided that the enrichment leaves a sufficient number of fragments distributed throughout the genome then it will have little impact on CGH analysis e.g. to detect long deletions or duplications. This balance is inherent in the CGH technique, which sits between two extremes of simply counting chromosome numbers in two samples at one end and full sequencing of the genomes in two samples at the other end. Enrichment by selective linker ligation can simplify CGH analysis, by removing some gDNA, without destroying the validity of the process.
If full-genome analysis is not required then enrichment can operate at the appropriate level. If only specific chromosomes are of interest then enrichment can be used to reduce or eliminate fragments derived from other chromosomes, provided that there are appropriately-distributed overhang sequences. In silico digestion of a genome sequence can be used to find the best restriction enzymes for any desired enrichment. In a preferred process, BsaJI is used to digest the human genome, and a linker is used to select only fragments with a single-stranded 5'-CGTG-3' overhang. Thus the linker will have a single-stranded overhang of 3'-GCAC-5' on the lower strand. Figure 1A shows the preferred linker. This linker offers a 50-fold enrichment of the genome, while giving fragments that retain full coverage of the human genome (an average of 1 fragment every 10kb). A different linker, which selects a 5'-CAAG-3' overhang at the 5' end of a restriction fragment (i.e. has a 3'-GTTC-5' overhang on the lower strand), gives a 10-fold enrichment i.e. more complete coverage of the genome (1 fragment every 2-3kb), but potentially a lower signal/noise ratio.
Linkers may include a binding group such that they can be selectively extracted from a sample. For example, a linker may have a covalently-attached biotin molecule. After ligation to chosen fragments, these ligated linkers can be extracted using avidin or streptavidin, for example, thereby removing the unlinked genomic DNA from the sample. This procedure is not essential, but allows a reduction in volume of material for IVT, and removes unwanted background DNA.
Suitable binding pairs, either of which may be attached to a linker, with the other member of the pair being used for extraction, are known in the art, and include but are not limited to: biotin/streptavidin; biotin/avidin; antigen/antibody; etc. Attachment to the linker via the 5' end of the upper strand can advantageously prevent that 5' end from being involved in unwanted ligation reactions.
To facilitate separation, the member of the binding pair that is not attached to the linker can be attached to a suitable solid support e.g. to a column, to a surface, or to a magnetic or paramagnetic bead. For example, streptavidin-coated paramagnetic beads are widely available, and a preferred separation method involves: ligating biotin-labelled linkers to restriction fragments; contacting the restriction fragments with streptavidin-coated paramagnetic beads; allowing the streptavidin and fragments to interact; and washing the beads to enrich the ligated material.
Separation of linked fragments from unlinked fragments can also be used to concentrate a sample. For instance, where an IVT reaction requires a 20μl volume, but the DNA to be amplified is in a larger volume (e.g. in 50μl after cleanup), that DNA can conveniently be removed by the use of magnetic separation and then can be added directly to an IVT reaction. This process is more convenient and more rapid than concentrating the DNA to reduce the volume.
The enrichment processes described above are useful not only for CGH preparation, but in other processes where a reduction in sequence complexity is required (cf. refs. 19 & 29).
Thus the invention provides a process for selectively amplifying a subset of sequences in a population of nucleic acids, comprising the steps of: (a) digesting the nucleic acids with a restriction enzyme to give restriction fragments with cohesive ends that include a degenerate sequence;
(b) ligating a linker sequence to the 5' end of a restriction fragment, to give ligated molecules, wherein the linker sequence (i) includes a binding sequence for a nucleic acid polymerase, upstream of (ii) a specific sequence for hybridising to a desired subset of said degenerate sequence; and
(c) contacting the ligated molecules with the polymerase, to amplify a subset of the restriction fragments.
Preferred features of restriction digestion in step (a) are disclosed above. Preferred features of linker sequences for use in step (b) are disclosed above. It is possible to ligate more than one linker sequence e.g. with BsaJI digestion, there are 16 degenerate dinucleotide sequences, and from 1 to 15 linkers can be used to enrich a subset of the total restriction fragments. Using all 16 linkers will not give any enrichment. Thus, where the degeneracy in step (a) is n, the number of linkers that may be used in step (b) is between 1 and n-1. Preferably, however, it is between 1 and n/2, in order to avoid the presence of linkers with overhangs that are complementary to each other and that would thus ligate to each other.
The invention also provides a process for preparing a nucleic acid sample for selective amplification of a subset of sequences, comprising steps (a) and (b) of this process. Amplification in step (c) may then be performed separately.
Surprisingly, this enrichment strategy is compatible with linear amplification of the genome. Starting with 20ng of genomic DNA, then ligating linkers to just 1/50th of the fragments resulting from a BsaJI digestion, leaves 0.4ng of template after the linked fragments are selected i.e. 5,000-fold lower than the 20μg of DNA used as a standard in reference 6. 20ng of starting material has been shown, however, to provide ~5μg of RNA after an overnight IVT amplification, enough for an array hybridisation. The finding that such a low amount of target DNA can give detectable results in an array CGH method after using only a linear amplification, rather than PCR, is surprising.
Arrays for use with the invention
To give useful results in an array CGH method, the array must contain probes that match the fragmented genomic DNA. Every different fragmentation of a genome will give different hybridisable sequences, and so there will be a different optimum set of probes for each of the fragmentations. Thus the best array for analysing a fragmented genome will depend on the precise fragmentation method that was used. For a specific restriction enzyme, in silico digestion can show the fragments that will be produced. This information can be used to design a set of probes that are hybridisable to the restriction fragments and that cover the genome to the desired degree. Moreover, it can be used to design a set of probes that will offer an appropriate level of coverage after enrichment has been performed as described above. Probe design may also involve standard techniques, such as ensuring that probes are essentially unique within a target genome (i.e. that they have essentially no cross-hybridisation potential). If specific regions are of interest then probes may be focused on these regions e.g. on telomeres, on centromeres, on specific chromosomes, on specific genes, etc. Probe design may also be restricted by the number of probes that can be included on the chosen array platform.
The invention provides a nucleic acid array comprising probes for hybridisation to restriction fragments obtained by digesting a genome as described above.
The invention also provides a process for designing a set of probe sequences for use on a nucleic acid array, comprising the steps of: (a) selecting a target genome for analysis; (b) performing an in silico digest of that genome with a restriction enzyme, to provide a set of fragment sequences; (c) designing a set of probes, wherein the set contains at least one probe each for at least 50% of the fragment sequences.
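Steps (b) and (c) of this design process can be sketched in Python under heavy simplification. The sketch assumes a BsaJI-like site CCNNGG cut after the first C (the site is an assumption, not taken from this passage), takes the first bases of each sufficiently long fragment as its probe, and ignores uniqueness checks and realistic probe lengths. All names and the toy sequence are hypothetical.

```python
import re

# Assumption: recognition site CCNNGG, cut after the first C on the strand as written.
SITE = re.compile(r"(?=CC[ACGT]{2}GG)")

def digest(genome):
    """Step (b): split the sequence at every cut position."""
    cuts = [m.start() + 1 for m in SITE.finditer(genome)]
    bounds = [0] + cuts + [len(genome)]
    return [genome[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]

def design_probes(fragments, probe_len):
    """Step (c): one probe per fragment that is at least probe_len long."""
    return {i: frag[:probe_len]
            for i, frag in enumerate(fragments) if len(frag) >= probe_len}

def fraction_covered(fragments, probes):
    """Fraction of fragments that have at least one probe."""
    return len(probes) / len(fragments)

genome = "AAAAACCGTGGTTTTTTTTCCAAGGAA"  # toy sequence with two sites
fragments = digest(genome)
probes = design_probes(fragments, probe_len=7)
covered = fraction_covered(fragments, probes)
```

A real design would add a specificity filter, so that candidate probes matching more than one fragment are rejected, before testing whether the 50% threshold is met.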
The invention also provides a nucleic acid array obtainable by this design process.
The invention also provides a nucleic acid array comprising probes for hybridisation to at least 20% (e.g. >30%, >40%, >50%, >60%, >70%, >80%, >90%, >95%, >98%, etc.) of the restriction fragments obtainable by digestion of a genomic DNA. The invention also provides a nucleic acid array comprising nucleic acid probes, wherein at least 20% (e.g. >30%, >40%, >50%, >60%, >70%, >80%, >90%, >95%, >98%, etc.) of the probes can hybridise to restriction fragments obtainable by digestion of a genomic DNA.
The genomic DNA is preferably human genomic DNA, and the restriction enzyme is preferably selected from: BsaJI, FokI, BbvI, BbvII, BstF5I and TspDTI (and isoschizomers and neoschizomers thereof, as mentioned above).
Where enrichment has been used to reduce the number of targets, based on selection of a subset of degenerate sequences as described above, then the array will instead comprise probes for hybridisation to at least 20% (e.g. >30%, >40%, >50%, >60%, >70%, >80%, >90%, >95%, >98%, etc.) of, or at least 20% (e.g. >30%, >40%, >50%, >60%, >70%, >80%, >90%, >95%, >98%, etc.) of the probes can hybridise to, the restriction fragments that (a) are obtainable by digestion with a restriction enzyme that produces degenerate cohesive ends, and (b) include a specific non-degenerate sequence in the cohesive end produced by the restriction enzyme. Thus, where a restriction enzyme's overhang has a 16-fold degeneracy arising from a NN dinucleotide, the array will include probes for hybridising at least to restriction fragments containing one of the specific 16 dinucleotides (AA, AC, AG, AT, CA, CC, CG, CT, GA, GC, GG, GT, TA, TC, TG, TT). With BsaJI, for instance, the array may include probes for hybridising to fragments with the specific 5'-CGTG overhang out of the possible 5'-CNNG overhangs.
Probes for including on the array can be designed based on knowledge of the target sequences. In general, a probe will have a sequence selected such that it is specific for a single target sequence i.e. probes that can hybridise to more than one target sequence are undesirable. Specific hybridisation in this way ensures that copy number polymorphism for a particular target is directly related to the ratio obtained from the array. Given the sequences of all targets, design algorithms can select probe sequences with the required specificity. Where IVT is used, the probes will be designed to be complementary to the transcribed single-stranded RNA sequences. The array will include probes for at least 5% (e.g. >10%, >20%, >30%, >40%, >50%, >60%, >70%, >80%, >90%, >95% or more) of these single-stranded RNA molecules. Probes for 50% of the transcribed sequences can represent one ssRNA molecule for each target restriction fragment. There will generally not be more than one probe for every target sequence.
Where the enrichment methods of the invention have been used, it is preferred not to include probes that match restriction fragments having identical overhangs at both ends, because both of the strands (upper & lower) will have had linkers attached to them and will have been amplified. Being complementary to each other, the two amplified sequences will have been able to hybridise to each other during IVT, and will also compete with the probe for hybridisation.
The hybridisation probes on an array of the invention can be pre-synthesised before being applied to the array, or may be prepared on the array in situ (e.g. by inkjet printing, by light-directed synthesis, etc.). The hybridisation probes will generally be at least 30 nucleotides long (e.g. >40nt, >50nt, >60nt, >70nt, >80nt, etc.). Thus the probes may be oligonucleotides, although it is also possible to use longer probes e.g. BAC DNA, PCR amplification products, etc.
The probes may be attached to the array non-covalently or, preferably, covalently.
The probes can be attached by a 5' terminal residue, by a 3' terminal residue, or by an internal residue.
Various materials can be used as the solid support in arrays. Glass is preferred. Bead-based arrays may be used [30] e.g. as in the Sentrix® array.
Kits
The invention provides a kit, comprising two or more of: (a) a DNA-dependent RNA polymerase; (b) a linker of the invention; and (c) a restriction enzyme. As described above, the polymerase of (a) is preferably a T7 RNA polymerase.
The linker of (b) is described in more detail above. The kit may contain more than one linker. It may also contain one or more blockers, as described above.
As described above, the enzyme of (c) preferably produces degenerate cohesive ends. The kit preferably contains at least components (a) and (b). The kit may additionally include an array of the invention.
General
The term "comprising" encompasses "including" as well as "consisting" e.g. a composition "comprising" X may consist exclusively of X or may include something additional e.g. X + Y. The term "about" in relation to a numerical value x means, for example, x±10%. Where necessary, the term "about" can be omitted.
The word "substantially" does not exclude "completely" e.g. a composition which is "substantially free" from Y may be completely free from Y. Where necessary, the word "substantially" may be omitted from the definition of the invention.
References to "hybridisation" typically refer to specific hybridisation, and exclude non-specific hybridisation. Specific hybridisation can occur under experimental conditions chosen, using techniques well known in the art, to ensure that the majority of stable interactions between probe and target are where the probe and target have at least 90% sequence identity. The hybridisation conditions can be used to aid the design of probes in arrays, such that probe sequences are not used if they have more than 90% identity to other areas of the genome being analysed, to minimise cross-hybridisation. The stability of any particular probe/target duplex depends on the buffer/washing conditions used. Stable duplexes are those that remain hybridised after washing such that they will contribute to the signal obtained for that probe when reading the array.
Different steps of processes can be performed in different geographical locations e.g. in different countries. Thus, for instance, a step of linker attachment could take place in a different location from a step of linear amplification. Similarly, a step of linear amplification could take place in a different location from a step of applying the product of such amplification to an array e.g. the amplified material could be transported in between steps.
BRIEF DESCRIPTION OF THE DRAWINGS
In Figure 1, Figure 1A shows a preferred linker, containing a T7 RNA polymerase promoter (in bold) and an overhang on the lower strand, for matching a specific BsaJI fragment. The linker is formed from two single-stranded DNA sequences (SEQ ID NOS: 4 & 5). Figure 1B shows a blocker formed from two single-stranded DNA sequences (SEQ ID NOS: 6 & 7). In Figure 1A, the 5' end of the lower strand (SEQ ID NO: 5) is phosphorylated ('P') to permit ligation to a restriction fragment.
Figure 2 illustrates the use of a specific linker for selecting only one out of sixteen degenerate overhang sequences produced by BsaJI digestion. Figure 3 shows overhangs produced by three hypothetical restriction enzymes, RE1 to RE3, and the linkers used to join to the restriction fragments.
Figure 4 illustrates how undesired restriction fragments can be blocked, to prevent concatenation.
Figure 5 shows frequencies of the 16 possible degenerate single-stranded overhang sequences after in silico digestion of X.
Figures 6 and 7 show scatter plots. The X axes show the log of Cy3 signal intensity, and the Y axes show the log of Cy5 signal intensity. In Figure 6, a female sample was Cy5-labelled and a male sample was Cy3-labelled. Red points come from probes directed to the X chromosome, blue points come from probes to chromosome 16, and green points come from probes directed to the Y chromosome.
Figures 8 and 9 show the signal ratio (male:female) for hybridisation to probes on an array, against fragment index. In Figure 8 the probes are specific for targets in chromosome 16. In Figure 9, the probes are for targets on chromosomes X and Y.
Figures 10 and 11 compare direct and indirect labelling of IVT products. Figure 12 shows the results of experiments using different amounts of starting material.
Figure 13 shows promoter sequences (SEQ ID NOS: 1 to 3) for T7, SP6 and T3 RNA polymerases. The transcription start site is shown as "+1". The minimum sequences (19mers) required for efficient transcription are underlined.
Figure 14 shows data for chromosome 16 using protocols from Agilent (14A and 14D), Nimblegen (14B and 14E) or the present invention (14C and 14F). Figures 14A to 14C show datapoints for the whole chromosome; 14D to 14F show the 16ptel region (i.e. an expansion of the left-hand region of 14A to 14C) with a small moving window of ~25kb.
Figure 15 shows data for chromosome 2 using protocols from Agilent (15B) or the present invention (15A). The y-axis is log of normalised ratio.
MODES FOR CARRYING OUT THE INVENTION
In silico digestion of human genome sequence
An in silico digestion of the published sequence of human chromosome 1 using BsaJI was performed. The resulting frequencies of the 16 possible degenerate single-stranded overhang sequences were as follows, and are shown in Figure 5:
[Table of overhang sequence frequencies: original image (imgf000020_0001) not reproduced; see Figure 5.]
The 16 sequences are not equally represented, with the GC sequence being rare but TG being common. Post-digestion selection of fragments having a particular overhang will lead to simplification of the total DNA present e.g. selection of GT sequences will give a ~50-fold enrichment, whereas selection of AA or TT sequences will give a ~10-fold enrichment.
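The counting step of such an in silico digestion can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes a recognition site of the form CCNNGG cut after the first C on each strand (so that each site exposes one 5'-CNNG-3' overhang per strand), and the function names `overhang_counts` and `fold_enrichment` are hypothetical. The enrichment estimate is simply the total number of fragment ends divided by the number carrying the selected overhang, which is only a rough proxy for the simplification described above.

```python
import re
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def overhang_counts(sequence):
    """Count the degenerate NN dinucleotides exposed at CCNNGG sites.

    Assumption: the enzyme cuts C^CNNGG on both strands, so every site
    yields two fragment ends: one with overhang 5'-CNNG-3' (top strand)
    and one with 5'-C(revcomp NN)G-3' (bottom strand).
    """
    counts = Counter()
    # A lookahead is used so that overlapping sites are all found.
    for match in re.finditer(r"(?=CC([ACGT]{2})GG)", sequence.upper()):
        nn = match.group(1)
        counts[nn] += 1                      # top-strand overhang
        counts[reverse_complement(nn)] += 1  # bottom-strand overhang
    return counts


def fold_enrichment(counts, nn):
    """Rough enrichment from selecting one overhang: total ends / selected ends."""
    if counts[nn] == 0:
        raise ValueError(f"overhang {nn!r} was not observed")
    return sum(counts.values()) / counts[nn]


if __name__ == "__main__":
    toy = "AACCAAGGTTCCGCGGAA"  # toy sequence with two sites
    c = overhang_counts(toy)
    print(dict(c), fold_enrichment(c, "AA"))
```

On real data the input would be a full chromosome sequence read from a FASTA file; rare overhangs give larger enrichment values and common ones smaller values.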
Comparison of male and female genomic DNA
Human male and female genomic DNA were separately digested to completion with BsaJI. The DNA was then cleaned using the Qiagen Gel extraction procedure. 200-5000ng of digested DNA was ligated to the linker shown in Figure 1A, modified using the BioTEG™ system to have a biotin label covalently attached to the 5' end of the shorter strand. The ligation reaction was cleaned using a Qiagen PCR purification column in order to remove excess unligated linkers. The cleaned DNA was added to streptavidin paramagnetic particles from Promega™ such that the biotinylated DNA would bind to the particles. The particles were washed to remove any unbound DNA, thus enriching for those fragments ligated to the biotinylated linker. The beads were then placed directly into an IVT reaction using T7 Megascript™ or T7 Megashortscript™ kits from Ambion™. For labelling, the IVT reaction included CyeDye-UTP at a 1:1 ratio with unlabelled UTP. Female targets were labelled with Cy5 and male targets were labelled with Cy3.
A DNA microarray was designed to analyse the targets generated by this reaction. The probes on the array were selected to hybridise to target sequences present after the enrichment, and to be essentially unique within those target sequences. For initial testing, the array included probes for only chromosomes 16, X and Y. Hybridisation data were normalised to probes designed to chromosome 16, which should not differ between the male and female samples. Figures 6 and 7 are scatter plots from the hybridisation. As expected, probes designed to detect the X chromosome had a higher signal ratio from target derived from the female template (1.5-1.7x above that of chromosome 16 probes) than when testing the male-derived target.
For instance, in Figure 6 the data points representing probes in the array from chromosome 16 cluster around the 1:1 line, whereas the data points from the chromosome X probes are clustered on a parallel straight line with higher signal intensity. In a comparative PCR experiment, shown in Figure 7, the data points cluster around the same lines but are distributed more widely. Tight clustering is advantageous as it permits greater confidence when concluding that a signal change is caused by a chromosomal deletion or duplication, rather than being caused by underlying experimental noise.
In Figures 8 and 9, the Y-axes show the signal ratio (male:female) for hybridisation to each probe on the array, and the X-axes show the location of the probe sequences along the chromosome ("fragment index"). Figure 8 shows that the chromosome 16 data are closely clustered around the 1:1 ratio, which indicates no difference between the male and female samples. The sub-telomeric (Xp) region of the X-chromosome has some identity with the equivalent region of the Y chromosome, and Figure 9 shows that probes designed to this region (to the left of the dotted line) co-hybridise with target derived from both X and Y chromosomes. In the Y-specific region, however, this level of co-hybridisation is not seen.
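The normalisation to chromosome 16 probes described above might be sketched as follows. The text does not state which statistic was used, so scaling by the median chromosome 16 ratio is an assumption made here for illustration, and `normalise_to_reference` is a hypothetical name.

```python
from statistics import median


def normalise_to_reference(probes, reference_chrom="16"):
    """Scale Cy5/Cy3 ratios so that reference-chromosome probes centre on 1.

    probes: list of (chromosome, cy5_signal, cy3_signal) tuples.
    Assumption: the scale factor is the median ratio of the reference
    probes; the source only says data were normalised to chromosome 16.
    Returns a list of (chromosome, normalised_ratio) in input order.
    """
    ref_ratios = [cy5 / cy3 for chrom, cy5, cy3 in probes
                  if chrom == reference_chrom]
    if not ref_ratios:
        raise ValueError("no probes found for the reference chromosome")
    scale = median(ref_ratios)
    return [(chrom, (cy5 / cy3) / scale) for chrom, cy5, cy3 in probes]


if __name__ == "__main__":
    # Made-up intensities, for illustration only.
    example = [
        ("16", 200.0, 100.0),
        ("16", 400.0, 200.0),
        ("16", 600.0, 300.0),
        ("X", 300.0, 100.0),
    ]
    for chrom, ratio in normalise_to_reference(example):
        print(chrom, ratio)
```

After this scaling, probes for a region present in equal copy number in both samples should sit near 1, and a ratio well away from 1 becomes a candidate copy number difference.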
To check for dye bias, dye swap experiments were performed, where the same test and reference samples are used, but the dyes are switched. Thus female targets were labelled with Cy3 and male targets were labelled with Cy5. Figure 10 shows the results of a dye swap experiment, and there is little or no dye bias. Moreover, they also give a second data set for comparing the two samples.
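One common way to combine a dye-swap pair is shown below as a hedged sketch; the source does not say how the two data sets were combined. It assumes that, on a log scale, each measured Cy5/Cy3 ratio is the true sample ratio plus a dye-specific offset, so that halving the difference of the two log ratios cancels the offset and halving their sum isolates it. The function name `dye_swap_estimate` is hypothetical.

```python
import math


def dye_swap_estimate(forward_ratio, swapped_ratio):
    """Separate the sample effect from dye bias using a dye-swap pair.

    forward_ratio: Cy5/Cy3 ratio with the test sample labelled with Cy5.
    swapped_ratio: Cy5/Cy3 ratio with the test sample labelled with Cy3.
    Assumed model (log2 scale): forward = M + B and swapped = -M + B,
    where M is the test:reference log ratio and B is the dye bias.
    Returns (M, B).
    """
    fwd = math.log2(forward_ratio)
    swp = math.log2(swapped_ratio)
    log_ratio = (fwd - swp) / 2  # dye bias cancels
    dye_bias = (fwd + swp) / 2   # sample effect cancels
    return log_ratio, dye_bias


if __name__ == "__main__":
    # No dye bias: the swapped ratio is the reciprocal of the forward ratio.
    print(dye_swap_estimate(2.0, 0.5))
    # Same sample effect with a two-fold bias towards Cy5.
    print(dye_swap_estimate(4.0, 1.0))
```

A dye bias estimate close to zero for most probes corresponds to the "little or no dye bias" outcome reported for Figure 10.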
To compare direct and indirect labelling, aminoallyl-UTP was substituted for CyeDye-UTP in the dye swapped IVT reaction. Signals obtained with the two labels were compared. Figures 10 (CyeDye) and 11 (aminoallyl) show very little difference, indicating that the point of labelling in the procedure has little effect.
To see how the method copes with low amounts of starting DNA, it was performed with different amounts of the same target sample: 20ng; 100ng; 500ng; and 1000ng. Figure 12 shows the results of these four array CGH experiments, and a good correlation was seen between the 20ng and 1000ng samples. Thus the method of the invention has been shown to give useful array CGH data from just 20ng of starting material.
Comparison of different array CGH protocols
Samples were obtained from individuals with learning disabilities, with the aim of identifying underlying chromosomal abnormalities. The samples were analysed using (i) the methods and arrays available from Agilent [31], (ii) the methods and arrays available from Nimblegen, and (iii) the methods and arrays of the invention. All three protocols involve the use of arrays with oligonucleotide probes. The Nimblegen arrays that were used had a similar number of probes to the arrays of the present invention. A comparison of results from the three methods is shown in Figure 14 from samples with monosomy 16ptel, trisomy 16qtel.
In Figures 14A to 14C, where all datapoints for the chromosome are shown, the Agilent protocol (14A) clearly showed the chromosomal abnormality, but it was less clear in the Nimblegen data (14B). In contrast, the abnormality in Figure 14C is much more evident and less subject to variation.
In Figures 14D to 14F, where just the 16ptel region is shown, the abnormality is again difficult to detect visually in the Nimblegen data (14E). The data from the Agilent protocol (14D) and from the present invention (14F) are clear, but are tighter in 14F.
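A moving window of the kind mentioned for Figures 14D to 14F (~25kb) can be sketched as below. This is illustrative only: the averaging scheme, the function names and the loss threshold of -0.3 are assumptions, not values taken from any of the three protocols.

```python
def moving_window_mean(points, window_bp=25_000):
    """Smooth log ratios with a window centred on each probe.

    points: list of (position_bp, log_ratio) tuples.
    For each probe, average the log ratios of all probes lying within
    half a window either side of it. Quadratic in the number of probes,
    which is fine for a sketch but slow for a whole array.
    """
    half = window_bp / 2
    smoothed = []
    for pos, _ in points:
        values = [v for p, v in points if abs(p - pos) <= half]
        smoothed.append((pos, sum(values) / len(values)))
    return smoothed


def flag_losses(smoothed, threshold=-0.3):
    """Return positions whose smoothed log ratio is below a hypothetical threshold."""
    return [pos for pos, mean in smoothed if mean < threshold]


if __name__ == "__main__":
    # Made-up data: two probes at normal copy number, two with reduced signal.
    data = [(0, 0.0), (10_000, 0.0), (100_000, -1.0), (110_000, -1.0)]
    smoothed = moving_window_mean(data)
    print(smoothed)
    print(flag_losses(smoothed))
```

Tighter data, as described for Figure 14F, would let a smaller window or a less stringent threshold separate an abnormal region from noise.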
CGH analysis of low quality genomic DNA
The protocol of the invention was used to analyse DNA from a colorectal cancer cell line. Analysis using a Bioanalyser™ showed that the genomic DNA in the cell line was of poor quality in terms of amount and integrity. The Agilent protocol was also used to analyse the same cell line.
Using the protocol of the invention, it was possible to detect a region on chromosome 2 with a potential deletion mutation. Figure 15A shows results obtained with the protocol of the invention for the relevant region of chromosome 2, and Figure 15B shows results for the same region using the Agilent protocol. The data in Figure 15B are essentially flat (log of normalised ratio ~ 0), indicating no deletion, whereas the data in Figure 15A (log of normalised ratio < 0) indicate a potential deletion.
Thus the invention is particularly useful for analysing poor quality DNA e.g. in aged samples, degraded samples, preserved samples (such as formalin-fixed paraffin-embedded samples), etc.
It will be understood that the invention has been described by way of example only and modification of detail may be made without departing from the spirit and scope of the invention.
REFERENCES (the full contents of which are incorporated herein by reference)
[1] WO93/18186.
[2] WO96/17958.
[3] Oostlander et al. (2004) Clin Genet 66:488-95.
[4] van den IJssel et al. (2005) Nucleic Acids Research 33(22):e192.
[5] Pinkel et al. (1998) Nature Genet 20:207-11.
[6] Barrett et al. (2004) PNAS USA 101:17765-70.
[7] Lage et al. (2003) Genome Res 13:294-307.
[8] Array-Based CGH Procedures for Genomic DNA Analysis. Protocol, version 1.1 (March 2005), Agilent Technologies.
[9] Hughes et al. (2004) Cytogenet Genome Res 105:18-24.
[10] Lucito et al. (2003) Genome Research 13:2291-305.
[11] Protocol for SpectralChip™ 2600. P/N 36-0001-00 rev 4. Spectral Genomics.
[12] US-2005/0147975.
[13] Liu et al. (2003) BMC Genomics 4:19.
[14] Hughes et al. (2005) Prog Biophys Mol Biol 88:173-89.
[15] De Witte & Rizzo (2006) Agilent Application Note "An Optimized aCGH Protocol Allows Reduced Input of Genomic DNA".
[16] Nimblegen Datasheet (2005) Array Comparative Genomic Hybridization (CGH).
[17] WO03/010328.
[18] WO03/012118.
[19] US-2003/0082543 (US patent 6,632,611).
[20] Dunn & Studier (1983) J Mol Biol 166:477-535.
[21] Ikeda et al. (1992) Biochemistry 31:9073-80.
[22] Martin & Coleman (1987) Biochemistry 26:2690-6.
[23] Diaz et al. (1993) J Mol Biol 229:805-11.
[24] Maslak et al. (1993) Biochemistry 32:4270-4.
[25] Ujvari & Martin (1997) J Mol Biol 273:775-81.
[26] Imburgio et al. (2000) Biochemistry 39:10419-30.
[27] US patent 5,122,457.
[28] US patent 5,385,834.
[29] EP-A-1184466.
[30] Kuhn et al. (2004) Genome Res 14:2347-56.
[31] Agilent Oligonucleotide Array-Based CGH for Genomic DNA Analysis Protocol 4.0, June 2006. Agilent Part Number G4410-90010.

Claims

1. A process for amplifying a DNA sample for analysis by array CGH, comprising the steps of: (a) fragmenting the DNA sample, to give fragments; (b) attaching linker sequences to the fragments, to give amplifiable sequences, wherein the linkers include sequences for initiation of linear amplification; and (c) subjecting the amplifiable sequences to a linear nucleic acid amplification process.
2. A process for analysing a nucleic acid sample using an array, comprising steps (a) to (c) as described in claim 1, and further comprising the step of: (d) applying amplified nucleic acid from step (c) to a nucleic acid array.
3. The process of any preceding claim, wherein the linear nucleic acid amplification process produces RNA.
4. The process of claim 3, wherein the linear nucleic acid amplification process is an in vitro transcription, and the sequences for initiation of linear amplification are promoters.
5. The process of claim 4, wherein the process uses a T7 RNA polymerase.
6. The process of any preceding claim, wherein amplification in step (c) includes incorporation of a label.
7. The process of any preceding claim, wherein fragmentation in step (a) uses a restriction enzyme.
8. The process of claim 7, wherein the restriction enzyme produces degenerate cohesive ends.
9. The process of claim 8, wherein linkers are attached in step (b) to only a subset of the genomic fragments.
10. The process of claim 8 or claim 9, wherein the restriction enzyme is BsaJI.
11. The process of claim 10, wherein the linker has a single stranded overhang of 3'-GCAC-5'.
12. The process of any preceding claim, wherein the linkers include a covalently attached biotin molecule.
13. The process of any preceding claim, wherein attachment in step (b) uses a ligase.
14. The process of any preceding claim, wherein the DNA sample is a genomic DNA sample.
15. The process of any preceding claim, wherein the DNA sample is a human DNA sample.
16. The process of any preceding claim, wherein the array comprises oligonucleotide hybridisation probes.
17. A nucleic acid array comprising probes for hybridisation to nucleic acid obtained by the method of any preceding claim.
18. A nucleic acid array comprising probes for hybridisation to at least 20% of the restriction fragments obtainable by digestion of human genomic DNA with a restriction enzyme selected from the group consisting of: BsaJI, FokI.
19. A nucleic acid array comprising nucleic acid probes, wherein at least 20% of the probes can hybridise to restriction fragments obtainable by digestion of a genomic DNA.
20. The array of claim 17, claim 18 or claim 19, comprising oligonucleotide hybridisation probes.
21. A kit, comprising two or more of: (a) a DNA dependent RNA polymerase; (b) a linker of the invention; and (c) a restriction enzyme.
22. The kit of claim 21, wherein the polymerase is a T7 RNA polymerase.
23. A process for preparing a genomic DNA sample for analysis by array CGH, comprising the steps of: (a) performing a restriction digestion on the genomic DNA sample, to give genomic fragments; and (b) subjecting the genomic fragments to a nucleic acid amplification process.
24. A process for selectively amplifying a subset of sequences in a population of nucleic acids, comprising the steps of: (a) digesting the nucleic acids with a restriction enzyme to give restriction fragments with cohesive ends that include a degenerate sequence; (b) ligating a linker sequence to the 5' end of a restriction fragment, to give ligated molecules, wherein the linker sequence (i) includes a binding sequence for a nucleic acid polymerase, upstream of (ii) a specific sequence for hybridising to a desired subset of said degenerate sequence; and (c) contacting the ligated molecules with the polymerase, to produce copies of a subset of the restriction fragments.
25. In a process for amplifying a genomic DNA sample for analysis by array CGH, the improvement consisting of the use of a linear amplification.
PCT/GB2006/003072 2005-08-16 2006-08-16 Array comparative genomic hybridisation methods using linear amplification WO2007020444A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0516797.8 2005-08-16
GB0516797A GB0516797D0 (en) 2005-08-16 2005-08-16 CGH method

Publications (1)

Publication Number Publication Date
WO2007020444A1 (en) 2007-02-22

Family

ID=35098398

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2006/003072 WO2007020444A1 (en) 2005-08-16 2006-08-16 Array comparative genomic hybridisation methods using linear amplification

Country Status (2)

Country Link
GB (1) GB0516797D0 (en)
WO (1) WO2007020444A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999014373A1 (en) * 1997-09-17 1999-03-25 Yale University Method for selection of insertion mutations
WO2005026329A2 (en) * 2003-09-12 2005-03-24 Cornell Research Foundation, Inc. Methods for identifying target nucleic acid molecules

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
FAN JIAN-BING ET AL: "Highly parallel genomic assays", NATURE REVIEWS GENETICS, vol. 7, no. 8, August 2006 (2006-08-01), pages 632 - 644, XP002414335, ISSN: 1471-0056 *
GUILLAUD-BATAILLE MARINE ET AL: "Detecting single DNA copy number variations in complex genomes using one nanogram of starting DNA and BAC-array CGH.", NUCLEIC ACIDS RESEARCH. 2004, vol. 32, no. 13, 2004, pages e112, XP002414334, ISSN: 1362-4962 *
HUGHES SIMON ET AL: "The use of whole genome amplification in the study of human disease", PROGRESS IN BIOPHYSICS & MOLECULAR BIOLOGY, vol. 88, no. 1, May 2005 (2005-05-01), pages 173 - 189, XP004651876, ISSN: 0079-6107 *
LIU CHIH LONG ET AL: "Development and validation of a T7 based linear amplification for genomic DNA", BMC GENOMICS, BIOMED CENTRAL, LONDON, GB, vol. 4, no. 1, 9 May 2003 (2003-05-09), pages 19, XP021002045, ISSN: 1471-2164 *
WATSON SPENCER K ET AL: "Methods for high throughput validation of amplified fragment pools of BAC DNA for constructing high resolution CGH arrays", BMC GENOMICS, BIOMED CENTRAL, LONDON, GB, vol. 5, no. 1, 14 January 2004 (2004-01-14), pages 6, XP021002143, ISSN: 1471-2164 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2217921A2 (en) * 2007-11-08 2010-08-18 University of Washington Dna microarray based identification and mapping of balanced translocation breakpoints
JP2011505122A (en) * 2007-11-08 2011-02-24 ユニヴァーシティ オブ ワシントン DNA microarray-based identification and mapping of equilibrium translocation breakpoints
EP2217921A4 (en) * 2007-11-08 2011-07-06 Univ Washington Dna microarray based identification and mapping of balanced translocation breakpoints
WO2009115313A1 (en) * 2008-03-19 2009-09-24 Oryzon Genomics, S.A. Method and composition for methylation analysis

Also Published As

Publication number Publication date
GB0516797D0 (en) 2005-09-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06779144

Country of ref document: EP

Kind code of ref document: A1