WO2011071382A1 - Polymorphic whole genome profiling
- Publication number
- WO2011071382A1 (PCT/NL2010/050836)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- adapter
- sequence
- fragment
- genome
- fragments
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
Definitions
- The present invention relates to the field of molecular biology and biotechnology.
- The invention relates to the field of nucleic acid detection and identification. More particularly, the invention relates to the generation of a physical map of a genome, or part thereof, using high-throughput sequencing technology, combined with the identification of polymorphic markers that are mapped to the physical map.
- The invention further relates to the development of marker assays for the discovered polymorphic markers and the use of the markers in the generation of a genetic map. More particularly, the invention relates to the identification of sequence tags, from which a physical map is created, and polymorphic sequence tags, from which a genetic map is made, and which can be integrated into a genome-wide high-density physical map.
Background of the invention
- Integrated genetic and physical genome maps are extremely valuable for map-based gene isolation, comparative genome analysis and as sources of sequence-ready clones for genome sequencing projects.
- The availability of an integrated map of physical and genetic markers of a species has an enormous effect on genome research.
- Integrated maps allow for precise and rapid gene mapping and precise mapping of microsatellite loci and SNP markers.
- Various methods have been developed for assembling physical maps of genomes of varying complexity.
- One of the better characterized approaches uses restriction enzymes to generate large numbers of DNA fragments from genomic subclones (Brenner et al., Proc. Natl. Acad. Sci. (1989), 86, 8902-8906; Gregory et al., Genome Res.
- A physical map is generated from a combination of restriction enzyme digestion of clones in a library, pooling, restriction enzyme digestion, adapter ligation, (selective) amplification, high-throughput sequencing and deconvolution of the resulting sequences into BAC-clone-specific sets that can be used to assemble physical maps.
- The assembly of the clones into contigs is based on the co-presence of terminal nucleotide sequences of the sequenced fragments, which can be used as sequence-based anchor points for additional linkage of sequence data. In more detail, the technology disclosed in WO2008007951 is Whole Genome Profiling (WGP), KeyGene's recently developed proprietary approach for sequence-based physical mapping.
- A BAC library is constructed from a single homozygous individual and BAC clones are pooled in a multi-dimensional format.
- BAC pools are characterized by pool-specific tags to allow assignment of sequences to individual BAC clones based on the coordinates in the multi-dimensional pool screening.
- DNA is extracted from each BAC pool and digested with restriction enzymes, for instance EcoRI and MseI.
- The EcoRI ends of the restriction fragments are analyzed on a next-generation sequencer such as the Illumina Genome Analyzer, and in this way these relatively short (20-100 base pairs) sequenced fragments, called WGP tags, can be assigned to individual BACs.
- BACs can be assembled based on overlapping WGP tag patterns using a contiging software tool such as FPC.
- The WGP method is unique in providing sequence-based anchor points instead of fragment lengths for assembly of BAC contigs. Sequence-based anchors are more accurate and provide the basis for assembly of Whole Genome Shotgun data.
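The assignment of WGP tags to individual BACs from pool coordinates can be sketched as follows. This is a minimal illustration, not the deconvolution procedure of WO2008007951: it assumes a two-dimensional (row/column) pooling scheme, and the function name `deconvolve`, the pool labels and the tag sequences are all hypothetical. A tag is assigned to a clone when the tag was observed in every pool that contains that clone.

```python
def deconvolve(tag_pools: dict[str, set[str]],
               clone_coords: dict[str, tuple[str, ...]]) -> dict[str, list[str]]:
    """Assign each tag to the clones whose pool coordinates all contain the tag.

    tag_pools:    tag sequence -> set of pool labels in which the tag was read
    clone_coords: clone name   -> pool labels (one per dimension) holding the clone
    """
    assignment = {}
    for tag, pools in tag_pools.items():
        assignment[tag] = sorted(
            clone for clone, coords in clone_coords.items()
            if set(coords) <= pools
        )
    return assignment


# Hypothetical 2 x 2 pooling grid: rows R1-R2, columns C1-C2.
clone_coords = {
    "BAC1": ("R1", "C1"), "BAC2": ("R1", "C2"),
    "BAC3": ("R2", "C1"), "BAC4": ("R2", "C2"),
}
# Hypothetical tags with the pools in which each was sequenced.
tag_pools = {
    "GAATTCACGT": {"R1", "C2"},        # consistent with one clone only
    "GAATTCTTGA": {"R1", "R2", "C1"},  # shared by two overlapping clones
}
assignment = deconvolve(tag_pools, clone_coords)
print(assignment)
```

A tag observed in all four pools would match all four clones under this rule, which illustrates why pool design limits how unambiguously tags can be assigned.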
- Sequencing refers to determining the order of nucleotides (base sequences) in a nucleic acid sample, e.g. DNA or RNA.
- Many techniques are available, such as Sanger sequencing and high-throughput sequencing technologies (also known as next-generation sequencing technologies) such as the GS FLX platform offered by Roche Applied Science, which is based on pyrosequencing, and the Genome Analyzer from Illumina, which uses sequencing by synthesis with reversible terminators.
- Restriction endonuclease: a restriction endonuclease or restriction enzyme is an enzyme that recognizes a specific nucleotide sequence (target site) in a double-stranded DNA molecule, and will cleave both strands of the DNA molecule at or near every target site, leaving a blunt or a staggered end.
- Frequent cutters and rare cutters: restriction enzymes typically have recognition sequences that vary in number of nucleotides from 3 or 4 (such as MseI) to 6 (EcoRI) and even 8 (NotI).
- The restriction enzymes used can be frequent and rare cutters. The term 'frequent' in this respect is typically used in relation to the term 'rare'.
- Frequent cutting endonucleases are restriction endonucleases that have a relatively short recognition sequence. Frequent cutters typically have 3-5 nucleotides that they recognise and subsequently cut. Thus, a frequent cutter on average cuts a DNA sequence every 64-1024 nucleotides.
- Rare cutters are restriction endonucleases that have a relatively long recognition sequence. Rare cutters typically have 6 or more nucleotides that they recognise and subsequently cut. Thus, a rare 6-cutter on average cuts a DNA sequence every 4096 nucleotides, leading to longer fragments. It is observed again that the definitions of frequent and rare are relative to each other, meaning that when a 4 bp restriction enzyme, such as MseI, is used in combination with a 5-cutter such as AvaII, AvaII is seen as the rare cutter and MseI as the frequent cutter.
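The figures of 64, 1024 and 4096 nucleotides follow from the expectation that a specific recognition sequence of n bases occurs once every 4^n positions. The sketch below reproduces that arithmetic under the simplifying assumption of a random sequence with equal base frequencies; the function name `expected_cut_spacing` is illustrative only.

```python
def expected_cut_spacing(recognition_length: int) -> int:
    """Average spacing (nt) between sites for an n-base recognition sequence.

    Assumes a random sequence in which A, C, G and T are equally likely,
    which is an idealisation and not a property of real genomes.
    """
    if recognition_length < 1:
        raise ValueError("recognition length must be a positive integer")
    return 4 ** recognition_length


for name, n in [("MseI", 4), ("EcoRI", 6), ("NotI", 8)]:
    print(f"{name}: {n}-cutter, about one site per {expected_cut_spacing(n)} nt")
```

Real genomes deviate from this expectation because of base composition and repeats, so the values are rough guides rather than predictions.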
- Restriction fragments: the DNA molecules produced by digestion with a restriction endonuclease are referred to as restriction fragments. Any given genome (or nucleic acid, regardless of its origin) will be digested by a particular restriction endonuclease into a discrete set of restriction fragments.
- The DNA fragments that result from restriction endonuclease cleavage can be further used in a variety of techniques and can for instance be detected by gel electrophoresis.
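That a given sequence is digested into a discrete, reproducible set of fragments can be illustrated with a small in silico digestion. This is a simplified sketch: it treats only the top strand of a linear molecule, and the function name `digest` and the example sequence are hypothetical. The EcoRI site GAATTC is cut after the first G, and the MseI site TTAA after the first T.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[str]:
    """Cut a linear top-strand sequence at every occurrence of `site`.

    cut_offset is the number of bases of the site that stay on the
    left-hand fragment (1 for EcoRI G^AATTC, 1 for MseI T^TAA).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # remainder after the last cut
    return fragments


example = "AAGAATTCTTCCTTAAGG"  # hypothetical sequence with one site of each kind
print(digest(example, "GAATTC", 1))
print(digest(example, "TTAA", 1))
```

Running the same digestion twice on the same input gives the same fragments, which is the property the definition relies on.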
- Ligation: the enzymatic reaction catalyzed by a ligase enzyme in which two double-stranded DNA molecules are covalently joined together is referred to as ligation.
- In general, both DNA strands are covalently joined together, but it is also possible to prevent the ligation of one of the two strands through chemical or enzymatic modification of one of the ends of the strands. In that case the covalent joining will occur in only one of the two DNA strands.
- Synthetic oligonucleotide: single-stranded DNA molecules having preferably from about 10 to about 50 bases, which can be synthesized chemically, are referred to as synthetic oligonucleotides.
- In general, these synthetic DNA molecules are designed to have a unique or desired nucleotide sequence, although it is possible to synthesize families of molecules having related sequences and which have different nucleotide compositions at specific positions within the nucleotide sequence.
- The term synthetic oligonucleotide will be used to refer to DNA molecules having a designed or desired nucleotide sequence.
- Adapters: short double-stranded DNA molecules with a limited number of base pairs, e.g. about 10 to about 30 base pairs in length, which are designed such that they can be ligated to the ends of restriction fragments.
- Adapters are generally composed of two synthetic oligonucleotides which have nucleotide sequences which are partially complementary to each other. When the two synthetic oligonucleotides are mixed in solution under appropriate conditions, they will anneal to each other, forming a double-stranded structure.
- One end of the adapter molecule is designed such that it is compatible with the end of a restriction fragment and can be ligated thereto; the other end of the adapter can be designed so that it cannot be ligated, but this need not be the case (double ligated adapters).
- Adapter-ligated restriction fragments: restriction fragments that have been capped by adapters.
- Primers: in general, the term primers refers to DNA strands which can prime the synthesis of DNA.
- DNA polymerase cannot synthesize DNA de novo without primers: it can only extend an existing DNA strand in a reaction in which the complementary strand is used as a template to direct the order of nucleotides to be assembled.
- We will refer to the synthetic oligonucleotide molecules which are used in a polymerase chain reaction (PCR) as primers.
- DNA amplification: the term DNA amplification or amplification will typically be used to denote the in vitro synthesis of double-stranded DNA molecules using PCR. It is noted that other amplification methods exist and they may be used in the present invention without departing from the gist.
- Tagging refers to the addition of a sequence tag to a nucleic acid sample in order to be able to distinguish it from a second or further nucleic acid sample.
- Tagging can e.g. be performed by the addition of a sequence identifier during complexity reduction or by any other means known in the art, such as a separate ligation step.
- A sequence identifier can e.g. be a unique base sequence of varying but defined length uniquely used for identifying a specific nucleic acid sample. Typical examples are ZIP sequences, known in the art as commonly used tags for unique detection by hybridization (Iannone et al., Cytometry 39:131-140, 2000).
- Using nucleotide-based tags, the origin of a sample, a clone or an amplified product can be determined upon further processing.
- The different nucleic acid samples should be identified using different tags.
- Identifier: a short sequence that can be added to an adapter or a primer, or included in its sequence, or otherwise used as a label to provide a unique identifier (also known as a barcode or index).
- Identifiers can be sample specific, pool specific, clone specific, amplicon specific, etc.
- The different nucleic acid samples are generally identified using different identifiers.
- Identifiers preferably differ from each other by at least two base pairs and preferably do not contain two identical consecutive bases, to prevent misreads.
- The identifier function can sometimes be combined with other functionalities such as adapters or primers, and the identifier can be located at any convenient position.
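The two stated preferences for identifiers (at least two differing positions between any pair, and no two identical consecutive bases) can be checked mechanically. The sketch below is an illustration only; it assumes identifiers of equal length, and the function names and example sequences are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length identifiers differ."""
    if len(a) != len(b):
        raise ValueError("identifiers must have equal length")
    return sum(x != y for x, y in zip(a, b))


def has_repeated_base(identifier: str) -> bool:
    """True if any two consecutive bases are identical (e.g. 'AA')."""
    return any(x == y for x, y in zip(identifier, identifier[1:]))


def check_identifiers(identifiers: list[str], min_distance: int = 2) -> list[str]:
    """Return human-readable problems; an empty list means the set passes."""
    problems = [f"{i} contains two identical consecutive bases"
                for i in identifiers if has_repeated_base(i)]
    for a, b in combinations(identifiers, 2):
        if hamming(a, b) < min_distance:
            problems.append(f"{a} and {b} differ at fewer than {min_distance} positions")
    return problems


print(check_identifiers(["ACGT", "TGCA", "CATG"]))  # passes both checks
print(check_identifiers(["ACGT", "ACGA", "AACG"]))  # fails both checks
```

With a minimum distance of two, a single misread base cannot turn one valid identifier into another, so such an error can be detected rather than silently misassigned.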
- Tagged library refers to a library of tagged nucleic acids.
- Aligning and alignment: the terms "aligning" and "alignment" mean the comparison of two or more nucleotide sequences based on the presence of short or long stretches of identical or similar nucleotides. Several methods for alignment of nucleotide sequences are known in the art, as will be further explained below.
- The term contig is used in connection with DNA sequence analysis, and refers to assembled contiguous stretches of DNA derived from two or more DNA fragments having contiguous nucleotide sequences.
- A contig is a set of overlapping DNA fragments that provides a partial contiguous sequence of a genome.
- The term "contig" is also used to indicate a contiguous stretch of, for instance, BACs: a "BAC contig".
- A BAC contig can also be made based on marker analysis, i.e. a more indirect way of sequence analysis.
- A "scaffold" is defined as a series of contigs that are in the correct order but are not connected in one continuous sequence, i.e. contain gaps.
- Contig maps also represent the structure of contiguous regions of a genome by specifying overlap relationships among a set of clones.
- The term "contigs" encompasses a series of cloning vectors which are ordered in such a way as to have each sequence overlap that of its neighbours.
- The linked clones can then be grouped into contigs, either manually or, preferably, using appropriate computer programs such as FPC, PHRAP, CAP3, etc.
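Grouping linked clones into contigs can be pictured as finding connected groups of clones that share sequence tags. The sketch below is a deliberately simplified stand-in and does not reproduce what FPC, PHRAP or CAP3 do: it links two clones when they share at least `min_shared` tags (an assumed threshold) and merges linked clones with a union-find structure. Clone names and tag labels are hypothetical.

```python
from itertools import combinations


def build_contigs(clone_tags: dict[str, set[str]], min_shared: int = 2) -> list[list[str]]:
    """Group clones into contigs: clones sharing >= min_shared tags are linked."""
    parent = {clone: clone for clone in clone_tags}

    def find(clone: str) -> str:
        # Follow parent links to the group representative, compressing the path.
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]
            clone = parent[clone]
        return clone

    for a, b in combinations(clone_tags, 2):
        if len(clone_tags[a] & clone_tags[b]) >= min_shared:
            parent[find(a)] = find(b)

    groups: dict[str, list[str]] = {}
    for clone in clone_tags:
        groups.setdefault(find(clone), []).append(clone)
    return sorted(sorted(group) for group in groups.values())


clone_tags = {
    "BAC1": {"t1", "t2", "t3"},
    "BAC2": {"t2", "t3", "t4"},
    "BAC3": {"t3", "t4", "t5"},
    "BAC4": {"t8", "t9"},
}
contigs = build_contigs(clone_tags)
print(contigs)
```

BAC1 and BAC3 share only one tag here, yet they end up in the same contig because both overlap BAC2, which is the sense in which each clone overlaps its neighbours.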
- Complexity reduction is used to denote a method wherein the complexity of a nucleic acid sample, such as genomic DNA, is reduced by the generation or selection of a subset of the sample.
- This subset can be representative of the whole (i.e. complex) sample and is preferably a reproducible subset. Reproducible means in this context that when the same sample is reduced in complexity using the same method and experimental conditions, the same, or at least a comparable, subset is obtained.
- The method used for complexity reduction may be any method for complexity reduction known in the art. Examples of methods for complexity reduction include AFLP® (Keygene N.V., the Netherlands; see e.g. EP 0 534 858), the methods described by Dong (see e.g. WO 03/012118, WO 00/24939) and indexed linking (Unrau et al., vide infra).
- The complexity reduction methods used in the present invention have in common that they are reproducible: when the same sample is reduced in complexity in the same manner, the same subset of the sample is obtained. This is in contrast to more random complexity reduction such as microdissection, random shearing, or the use of mRNA (cDNA), which represents the portion of the genome transcribed in a selected tissue and whose reproducibility depends on the selection of tissue, time of isolation, etc.
- High-throughput screening is a method for scientific experimentation especially relevant to the fields of biology and chemistry. Through a combination of modern robotics and other specialised laboratory hardware, it allows a researcher to effectively screen large amounts of samples.
- Artificial clone library: a population of hosts (bacteria, yeast), each of which carries a DNA molecule that was inserted into a cloning vector, such that a representation of the genome of an organism (usually an entire genome) is present.
- BAC: Bacterial Artificial Chromosome, a DNA construct, usually based on a functional fertility plasmid (or F-plasmid), used for transforming and cloning in bacteria, usually E. coli.
- The usual insert size is 100-350 kb but can also be 700 kb.
- BACs are often used to sequence the genomes of organisms, whereby a short piece of the organism's DNA is amplified as an insert in BACs and then sequenced. Rearrangement of the sequenced parts in silico provides the genome sequence of the organism.
- Polymorphism refers to the presence of two or more variants of a nucleotide sequence in a population.
- A polymorphism may comprise one or more base changes, an insertion, a repeat, or a deletion.
- A polymorphism includes e.g. a simple sequence repeat (SSR) and a single nucleotide polymorphism (SNP), which is a variation occurring when a single nucleotide (adenine (A), thymine (T), cytosine (C) or guanine (G)) is altered.
- A variation must generally occur in at least 1% of the population to be considered a SNP.
- SNPs make up e.g. 90% of all human genetic variations, and occur every 100 to 300 bases along the human genome. Two of every three SNPs substitute cytosine (C) with thymine (T). Variations in the DNA sequences of e.g. humans or plants can affect how they handle diseases, bacteria, viruses, chemicals, drugs, etc.
- Heterozygous: an organism is heterozygous for a particular gene when different alleles occupy the gene's position on the homologous chromosomes.
- Homozygous: an organism is homozygous for a particular gene when identical alleles are present on both homologous chromosomes.
Summary of the invention
- The present inventors found that 'Whole Genome Profiling' or WGP has proven to work well in providing high-quality physical maps. However, from analysis and validation on several data sets, it was observed that 'gaps' (missing WGP tags) sometimes occur in assembled BACs: one overlapping BAC might not show the same set of WGP tags as another BAC covering the same region. These gaps could result from sequence errors or incomplete deconvolution due to inadequate sequencing depth; however, the inventors realised that they could also have a biological background. If there were a SNP or short indel within the WGP tag region, this would result in a polymorphic tag: either a present/absent tag or a tag with a SNP or indel. Exactly these latter SNP variants provide a unique opportunity for use as genetic markers.
- The present inventors have found that, by using artificial chromosome libraries (particularly BAC libraries) from a heterozygous genome sample or a combination of two or more homozygous genome samples, this observed effect can be used in an efficient way to create a physical map and, based on the same data set, screen the data for the presence of polymorphisms within the WGP tags. This will significantly reduce the effort involved in SNP discovery and in performing a large-scale genotyping experiment.
- The addition of a (rough-scale) genotyping and genetic mapping effort provides a link of BAC contigs into linkage groups. This genetic map can then be extended to a high-density, high-resolution map by adding all SNP tags from their positions as known from the BAC contigs, resulting in an integrated physical and genetic map.
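Screening a set of WGP tags for candidate SNP variants can be sketched as a search for pairs of tags that differ at exactly one position. This is a minimal illustration under stated assumptions, not the inventors' procedure: tags are assumed to have equal length and to be error-free, indels and present/absent tags are ignored, and the function name `snp_tag_pairs` and the example tags are hypothetical.

```python
from itertools import combinations


def snp_tag_pairs(tags: list[str]) -> list[tuple[str, str, int, str, str]]:
    """Find pairs of equal-length tags that differ at exactly one position.

    Returns (tag_a, tag_b, zero_based_position, base_in_a, base_in_b).
    """
    pairs = []
    for a, b in combinations(sorted(set(tags)), 2):
        if len(a) != len(b):
            continue
        diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
        if len(diffs) == 1:
            i = diffs[0]
            pairs.append((a, b, i, a[i], b[i]))
    return pairs


# Hypothetical tags, each starting at an EcoRI site (GAATTC).
tags = ["GAATTCACGTTGCA", "GAATTCACGATGCA", "GAATTCTTTTGGCC"]
candidates = snp_tag_pairs(tags)
print(candidates)
```

In practice, candidate pairs would still need to be distinguished from sequencing errors, for instance by requiring that both variants are supported by reads from several pools.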
- Figure 6: Various adapter-primer combinations containing identifiers to yield tagged fragments.
- A: adapter nucleotide
- F: fragment nucleotide
- P: primer nucleotide
- 1. adapter-ligated fragment; 2. adapter-ligated fragment containing identifier: a. amplification with primer directed against identifier, b. amplification with primer directed against adapter; 3. adapter-ligated fragment containing degenerate identifier section, amplification with primer directed against degenerate identifier and introducing identifier in amplified fragment; 4. adapter-ligated fragment containing no identifier, amplification with primer introducing identifier in amplified fragment.
- Figure 7: Integration of genetic and physical map.
- Figure 8: Alignment of BAC clones for contig 293. Indicated by the rectangular dotted box is the likely position of the polymorphic WGP tag SNP1.
- Figure 9: Alignment of BAC clones for contig 307. Indicated by the rectangular dotted box is the likely position of the polymorphic WGP tag SNP2.
- The invention relates to a method for the generation of a physical map of a sample genome and the identification of polymorphisms, comprising the steps of:
- an artificial chromosome e.g. BAC, YAC
- each artificial chromosome clone contains DNA from a sample genome, wherein the sample genome is selected from the group consisting of
- step (h) aligning the sequenced fragments based on the determined sequences in step (e);
- step (i) determining polymorphisms between the aligned sequences of step (h).
- an artificial clone bank is provided.
- the library can be a Bacterial Artificial Chromosome library (BAC) or based on yeast (YAC). Other libraries such as based on fosmids, cosmids, PAC, TAC or MAC are also possible.
- the BAC library is preferably of high quality and preferably is a large insert size genomic library. This means that an individual BAC contains a relatively large insert of the genomic DNA under investigation (typically > 100 kbp). The size of the preferred large insert is species-dependent.
- BACs as examples of artificial chromosomes.
- the present invention is not limited thereto and that other artificial chromosomes can be used without departing from the gist of the invention.
- the libraries contain at least five genome equivalents, more preferably at least 7, most preferably at least 8. Particularly preferred is at least 10. The higher the number of genome equivalents in the library, the more reliable the resulting contigs and physical map will be.
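The relation between library size and genome equivalents can be made concrete with a small calculation. The sketch below assumes the usual definition of coverage (number of clones times average insert size, divided by genome size); the function names and the 125 Mbp genome used in the usage note are illustrative assumptions, not values from the text.

```python
import math


def genome_equivalents(num_clones: int, avg_insert_bp: float, genome_size_bp: float) -> float:
    """Library coverage expressed in genome equivalents (assumed definition)."""
    return num_clones * avg_insert_bp / genome_size_bp


def clones_needed(target_equivalents: float, avg_insert_bp: float, genome_size_bp: float) -> int:
    """Smallest number of clones that reaches the target coverage."""
    return math.ceil(target_equivalents * genome_size_bp / avg_insert_bp)


if __name__ == "__main__":
    # Hypothetical genome of 125 Mbp, 100 kbp inserts, 10 genome equivalents.
    print(clones_needed(10, 100_000, 125_000_000))
```

Under these assumed numbers, a library of 12,500 clones corresponds to the particularly preferred level of 10 genome equivalents.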
- sample genome DNA is thus selected from the group consisting of a heterozygous sample genome; a combination of two or more homozygous sample genomes; and a combination of at least one heterozygous and at least one homozygous sample genome.
- the individual clones in the library are pooled to form pools containing a multitude of artificial chromosomes or clones.
- the pooling may be the simple combination of a number of individual clones into one sample (for example, 100 clones into 10 pools, each containing 10 clones), but also more elaborate pooling strategies may be used.
- the distribution of the clones over the pools is preferably such that each clone is present in at least two of the pools.
- the pools contain from 10 to 10000 clones per pool, preferably from 100 to 1000, more preferably from 250 to 750. It is observed that the number of clones per pool can vary widely, and this variation is related to, for instance, the size of the genome under investigation.
- the maximum size of a pool or a sub-pool is governed by the ability to uniquely identify a clone in a pooling set by a set of identifiers.
- a typical range for a genome equivalent in a pool set is in the order of 0.2 - 0.3, and this may again vary per genome.
- the pools are generated based on pooling strategies well known in the art. The skilled person is capable of selecting the optimal pooling strategy based on factors such as genome size.
- the resulting pooling strategy will depend on the circumstances, and examples thereof are plate pooling, N-dimensional pooling such as 2D-pooling, 3D- pooling, 6D-pooling or complex pooling.
- the pools may, in turn, be combined into super-pools (i.e. super-pools are pools of pools of clones) or divided into sub-pools.
- deconvolution i.e. the correct identification of the individual clone in a library by detection of the presence of a known associated indicator (i.e. label or identifier) of the clone in one or more pools or subpools
- the pooling strategy is preferably such that every clone in the library is distributed over the pools in such a way that a unique combination of pools is made for every clone. As a result, a certain combination of (sub)pools uniquely identifies a clone.
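A minimal sketch of 2D pooling may help to illustrate how a unique combination of pools arises for every clone. It assumes clones arranged on a grid, with one pool per row and one pool per column; the pool names (`R0`, `C0`, ...) and clone names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def build_2d_pools(
    clones: List[str], n_rows: int, n_cols: int
) -> Tuple[Dict[str, Set[str]], Dict[FrozenSet[str], str]]:
    """Place each clone in one row pool and one column pool.

    Returns the pool contents and an address table that maps each
    (row pool, column pool) combination to the single clone it identifies.
    """
    if len(clones) > n_rows * n_cols:
        raise ValueError("grid is too small for the number of clones")
    pools: Dict[str, Set[str]] = {}
    address: Dict[FrozenSet[str], str] = {}
    for index, clone in enumerate(clones):
        row, col = divmod(index, n_cols)
        combination = (f"R{row}", f"C{col}")
        for pool_name in combination:
            pools.setdefault(pool_name, set()).add(clone)
        address[frozenset(combination)] = clone
    return pools, address


if __name__ == "__main__":
    clones = [f"BAC{i:03d}" for i in range(100)]
    pools, address = build_2d_pools(clones, 10, 10)
    print(len(pools), "pools;", len(address), "unique pool combinations")
    print(address[frozenset({"R3", "C7"})])
```

With 100 clones on a 10 x 10 grid this gives 20 pools of 10 clones each, and every clone sits in exactly two pools. Higher-dimensional schemes (3D, 6D) follow the same idea with more coordinates per clone.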
- the pools are digested with restriction endonucleases to yield restriction fragments.
- Each pool is preferably separately subjected to an endonuclease digest.
- Each pool is treated with the same (combination of) endonuclease(s).
- Restriction endonucleases may be frequent cutters (4 or 5 cutters, such as MseI or PstI) or rare cutters (6 and more cutters such as EcoRI, HindIII).
- restriction endonucleases are selected such that restriction fragments are obtained that are, on average, present in an amount or have a certain length distribution that is adequate for the subsequent steps.
- two or more restriction endonucleases can be used and in certain embodiments, combinations of rare and frequent cutters can be used. For large genomes the use of, for instance, three or more restriction endonucleases can be used advantageously.
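The effect of combining a rare and a frequent cutter can be explored with an in silico digest. The sketch below is a simplification under stated assumptions: it treats the insert as a linear sequence, searches one strand only (sufficient for palindromic sites), and takes the recognition sequences and cut offsets as caller-supplied parameters. The EcoRI (G^AATTC) and MseI (T^TAA) sites in the usage example are the commonly documented ones; the function name is hypothetical.

```python
from typing import Dict, List


def digest(sequence: str, sites: Dict[str, int]) -> List[str]:
    """Cut a linear sequence at every occurrence of the given sites.

    `sites` maps a recognition sequence to the cut offset within that site,
    e.g. {"GAATTC": 1} cuts between G and AATTC.
    """
    sequence = sequence.upper()
    cuts = set()
    for site, offset in sites.items():
        start = sequence.find(site)
        while start != -1:
            cuts.add(start + offset)
            # Continue one base further so overlapping sites are also found.
            start = sequence.find(site, start + 1)
    bounds = [0] + sorted(cuts) + [len(sequence)]
    # Drop empty fragments that arise when a cut falls on a sequence end.
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    fragments = digest("AAGAATTCCCTTAAGG", {"GAATTC": 1, "TTAA": 1})
    print(fragments)
```

The length distribution of the returned fragments gives a rough indication of whether a given enzyme combination yields fragments suited to the subsequent ligation and sequencing steps.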
- adapters are ligated in step (d) to provide for adapter-ligated restriction fragments.
- adapters are synthetic oligonucleotides as defined herein elsewhere.
- the adapters used in the present invention preferably contain an identifier section, in essence as defined herein elsewhere, to provide for 'tagged adapters'.
- the adapter contains a pool-specific identifier, i.e. an adapter containing a unique identifier is used that unequivocally indicates the pool.
- the adapter contains a degenerate identifier section which is used in combination with a primer containing a pool-specific identifier.
- the adapter-ligated restriction fragments can be combined in larger groups, in particular when the adapters contain a pool-specific identifier. This combination in larger groups may aid in reducing the number of parallel amplifications of each set of adapter-ligated restriction fragments obtained from a pool.
- the adapters that are ligated do not contain an identifier or a degenerate identifier section.
- the adapter-ligated fragments are subsequently amplified using primers that contain identifiers (tags), for instance at their 5'end. The result is that amplified, tagged adapter-ligated fragments are obtained.
- the adapters can be the same for a plurality (or all) of pools and the amplification using tagged primers creates the distinction between the pools that can later be used in the deconvolution.
- the tagged adapter-ligated fragment can be amplified.
- the amplification may serve to reduce the complexity or to increase the amount of DNA available for analysis.
- the amplification can be performed using a set of primers that are at least partly complementary to the adapters and/or the tags/identifiers. This amplification may be independent of the amplification described herein above that introduces the tags into the adapters. In certain embodiments, the amplification may serve several purposes at a time, i.e. reduce complexity, increase DNA amount and introduce tags in the adapter-ligated fragments in the pools.
- the adapter-ligated fragments can be combined in larger groups, in particular when the adapters contain a pool-specific identifier. This combination in larger groups may aid in reducing the number of parallel amplifications of each set of adapter-ligated restriction fragments obtained from a pool.
- the adapter-ligated fragments can be amplified using a set of primers of which at least one primer amplifies the pool-specific identifier at the position of the pool-specific or degenerate identifier in the adapter.
- the primer may contain (part of) the identifier, but the primer may also be complementary to a section of the adapter that is located outside the tag, i.e. downstream in the adapter. Amplification then also amplifies the tag. See in this respect Fig 6 for various embodiments.
- step (e) part of the sequence of the tagged adapter-ligated fragment is determined.
- the tagged adapter-ligated fragments are subjected to sequencing, preferably high throughput sequencing as described herein elsewhere. During sequencing, at least part of the nucleotide sequence of the (amplified) tagged adapter-ligated fragment is determined.
- At least the sequence of the pool-specific identifier and part of the fragment (i.e. derived from the sample genome) of the (amplified) tagged adapter-ligated fragment is determined.
- a sequence of at least 10 nucleotides of the fragment is determined.
- at least 15, 20, 25, 30 or 35 nucleotides of the fragment (i.e. derived from the sample genome) are determined.
- the minimum number of nucleotides to be determined will, again, depend on the genome as well as on the sequencing platform. For instance, in plants more repetitive sequences are present, hence longer sequences (50-150 nucleotides) may need to be determined for a contig of comparable quality.
- the sequence library may be sequenced with an average redundancy level (aka oversampling rate) of at least 5.
- the sequence of at least 5 amplicons obtained from the amplification of one specific adapter-ligated fragment is determined.
- each fragment is (statistically) sequenced on average at least five times.
- Increased redundancy is preferred as it improves the fraction of fragments that are sampled in each pool and the accuracy of these sequences, so preferably the redundancy level is at least 7, more preferably at least 10.
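The link between average redundancy and the fraction of fragments sampled can be sketched numerically. The model below is an assumption that the text does not state: reads are taken to land on fragments at random, so the number of reads per fragment is approximately Poisson distributed with a mean equal to the redundancy level. The function names are hypothetical.

```python
import math


def prob_at_least(k: int, redundancy: float) -> float:
    """Probability that a fragment is read at least k times (Poisson model)."""
    below_k = sum(
        math.exp(-redundancy) * redundancy ** i / math.factorial(i)
        for i in range(k)
    )
    return 1.0 - below_k


def fraction_sampled(redundancy: float) -> float:
    """Expected fraction of fragments that are read at least once."""
    return prob_at_least(1, redundancy)


if __name__ == "__main__":
    for level in (5, 7, 10):
        print(level, round(fraction_sampled(level), 5), round(prob_at_least(2, level), 5))
```

Under this assumed model, a redundancy of 5 already samples more than 99% of the fragments at least once, while higher levels mainly increase the share of fragments that are read several times, which supports sequence accuracy.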
- Increased average sequencing redundancy levels are used to compensate for a phenomenon that is known as 'sampling variation', i.e.
- sequencing is performed using high-throughput sequencing methods, such as the methods disclosed in WO 03/004690, WO 03/054142, WO
- step (f) the (partly) sequenced (amplified) tagged adapter-ligated fragments are correlated to the corresponding clone, typically in silico by means of computerized methods.
- the (amplified) tagged adapter-ligated fragments are selected that contain identical sections of nucleotides in the restriction fragment-derived part.
- the different pool-specific identifiers (tags) are identified that are present in those (amplified) tagged adapter-ligated fragments.
- the combination of the different pool-specific identifiers and hence the sequence of the restriction fragment can be uniquely assigned to a specific clone (a process indicated as 'deconvolution').
- each pool in the library is uniquely addressed by a combination of 3 pool-specific identifiers with the same restriction fragment-derived section.
- a restriction fragment-derived section originating from a clone will be tagged with 3 different identifiers.
- Unique restriction fragment-derived sections when observed in combination with the 3 identifiers can be assigned to a single BAC clone. This can be repeated for each (amplified) tagged adapter-ligated fragment that contains other unique sections of nucleotides in the restriction fragment-derived part.
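The in silico correlation of sequenced fragments to clones can be sketched as a grouping step followed by a table lookup. The sketch assumes that each read has already been reduced to a pair of pool identifier and restriction fragment-derived sequence, and that a table exists that maps each combination of 3 pool identifiers to one clone; the identifiers, sequences and clone names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, Set, Tuple


def assign_tags_to_clones(
    reads: Iterable[Tuple[str, str]],
    clone_address: Dict[FrozenSet[str], str],
) -> Tuple[Dict[str, str], Dict[str, Set[str]]]:
    """Group reads by fragment sequence and look up the observed pool combination.

    Returns the sequences assigned to a clone and, separately, the sequences
    whose pool combination matches no single clone in the address table.
    """
    pools_by_tag: Dict[str, Set[str]] = defaultdict(set)
    for pool_id, tag_sequence in reads:
        pools_by_tag[tag_sequence].add(pool_id)

    assigned: Dict[str, str] = {}
    unresolved: Dict[str, Set[str]] = {}
    for tag_sequence, pools in pools_by_tag.items():
        clone = clone_address.get(frozenset(pools))
        if clone is None:
            unresolved[tag_sequence] = pools
        else:
            assigned[tag_sequence] = clone
    return assigned, unresolved


if __name__ == "__main__":
    address = {
        frozenset({"X1", "Y1", "Z1"}): "BAC_A",
        frozenset({"X2", "Y1", "Z2"}): "BAC_B",
    }
    reads = [
        ("X1", "TAGAAA"), ("Y1", "TAGAAA"), ("Z1", "TAGAAA"),
        ("X2", "TAGCCC"), ("Y1", "TAGCCC"), ("Z2", "TAGCCC"),
        ("X1", "TAGGGG"),
    ]
    print(assign_tags_to_clones(reads, address))
```

Sequences seen with fewer or more than the expected identifiers end up in the unresolved set; in practice these may stem from incomplete sampling or from fragments shared by overlapping clones, and would need separate handling.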
- the clones are combined and ordered into clone contigs in step (g) of the method.
- the grouping and ordering can be performed by fingerprint contiging software for this purpose such as FPC software (Soderlund et al (1997) FPC: a system for building contigs from restriction fingerprinted clones. Comput. Appl. Biosci., 13:523-535.) essentially as described herein elsewhere.
- the alignment of the clones into contigs and the corresponding order of WGP tags generates a physical map of the sample genome.
- the above steps can be performed independently for each genome sample. More in particular, in certain embodiments, for each heterozygous and/or homozygous sample genome steps (a) - (g) are performed independently. The steps relating to the screening for polymorphisms in the subsequent steps are taken from the separate steps (e).
- sequenced fragments are aligned based on the determined sequences in step (e) of the method.
- sequenced fragments comprise a sample identification tag, identifying the pool from which the BAC was derived (and hence the fragment).
- the sequenced fragments further comprise the remains of the restriction site and a sequence that is derived from the sample genome (sometimes indicated as the 'sequence tag').
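The three parts of a sequenced fragment can be separated with simple string slicing. The sketch assumes a read layout of a fixed-length identifier at the start, followed directly by the restriction site remnant and then the sequence tag; the identifier length, the remnant and the example read are hypothetical and would depend on the adapter design and the enzyme used.

```python
from typing import NamedTuple, Optional


class ParsedRead(NamedTuple):
    pool_identifier: str
    sequence_tag: str


def parse_read(read: str, identifier_length: int, site_remnant: str) -> Optional[ParsedRead]:
    """Split a read into pool identifier and genome-derived sequence tag.

    Returns None when the read is too short or the expected restriction
    site remnant does not follow the identifier.
    """
    read = read.upper()
    site_remnant = site_remnant.upper()
    identifier = read[:identifier_length]
    rest = read[identifier_length:]
    if len(identifier) < identifier_length or not rest.startswith(site_remnant):
        return None
    sequence_tag = rest[len(site_remnant):]
    if not sequence_tag:
        return None
    return ParsedRead(identifier, sequence_tag)


if __name__ == "__main__":
    # Hypothetical 4 nt identifier and an AATTC remnant.
    print(parse_read("ACGTAATTCGGCATCGA", 4, "AATTC"))
```

Reads that fail this check can be discarded before alignment, which keeps fragments with sequencing errors in the identifier or site region out of the later deconvolution.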
- step (i) the polymorphisms are determined between the aligned sequences of the WGP tags of step (h).
- a polymorphism in the WGP tag region will result in two variants, with a SNP (or indel) somewhere in the 65 nt non-restriction-site part of a WGP tag (at a 75 nt sequence read length). Such a tag is termed a 'SNP Tag', i.e. a SNP Tag is a WGP tag that contains a SNP or an indel between its two variants (see Fig 3). Given equal portions of BACs and pool samples, the two variants ('alleles') of each SNP Tag are expected to be found in a 50/50 distribution.
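Whether the two variants of a candidate SNP Tag are compatible with the expected 50/50 distribution can be checked with a simple exact test. The two-sided binomial test below is an illustrative choice that the text does not prescribe; the function name and the example counts are hypothetical.

```python
from math import comb


def allele_balance_p_value(count_a: int, count_b: int) -> float:
    """Two-sided exact binomial p-value for a 50/50 split of two variants."""
    total = count_a + count_b
    if total == 0:
        raise ValueError("at least one observation is required")
    smaller = min(count_a, count_b)
    # Probability of a split at least as uneven on one side, under p = 0.5.
    tail = sum(comb(total, i) for i in range(smaller + 1)) / 2 ** total
    # The distribution is symmetric for p = 0.5, so double the tail and cap at 1.
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    print(allele_balance_p_value(48, 52))  # balanced counts
    print(allele_balance_p_value(0, 10))   # one variant missing
```

A very small p-value suggests that the two sequences may not be alleles of one locus, for instance when one of them stems from a repeat or from a sequencing error.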
- in those cases where a polymorphism is present in the restriction site region, it will cause the WGP tag to be present in only 50% of the BACs from the corresponding region in the genome. These tags are less useful for discriminating between alleles, as it is not certain whether a tag is missing because the region was not covered by the BAC or because of a polymorphism.
- the low number of SNP tags will still allow the formation of contigs from overlapping BACs, as obtained in a standard WGP approach, albeit with less stringent alignment (FPC) settings.
- a binning step is executed first to combine allelic variants (and possibly sequence errors) into a single WGP tag, followed by an FPC step as for a homozygous line (i.e. with more stringent settings).
- an additional analysis will be done to identify "SNP tags" and their position on the contigs or physical map.
- the discovered polymorphisms and SNPs are converted into conventional SNP assays using conventional technology.
- owing to the available sequence information, SNP tags can be converted to PCR assays, Invader assays, Golden Gate assays and the like.
- the developed assay can be validated on the sample genome(s) used to generate the SNPs.
- the discovered SNP tags are both physical markers (as their position on the physical map is known) and genetic markers, as they differ between sample genomes. This allows genetic and physical maps to be coupled (see Fig 7).
- SNPs can be followed in subsequent crosses, for instance in the offspring of crosses between parents of which at least one of the parents was used to create the physical map.
- the SNPs can serve as genetic markers that are linked to the physical map.
- the BACs used to generate the physical map can be anchored to the genetic map, resulting in a high resolution map based on a scaffold of genetic markers (SNP tags) and supplemented with WGP tags. It is further possible to also link genetic markers obtained via other ways to the map to further complete the high resolution map.
- a further aspect of the invention relates to a method for the generation of a linked genetic map of genetic markers (SNP tags) and a physical map (of WGP tags) comprising
- step (i) providing two parents, wherein at least one of the parents has been used as a sample genome in the generation of the physical map that provided the SNP tags of step (i);
- the genetic linkage map is constructed using the markers discovered using the method of the invention.
- the map can be constructed by observing how frequently pairs of SNP Tag markers are inherited together after selfing an F1 population and analyzing the F2 individuals (about 100).
- the thus obtained genetic map shows the relative locations of these SNPs along the chromosome, thus providing the link with the position of the SNP tags on the BAC contigs of the physical map.
- This example is to provide an illustration for the polymorphic Whole Genome Profiling (pWGP) concept described herein.
- the invention aims to perform physical mapping and SNP discovery at the same time, by executing a WGP project on a BAC library derived from a heterozygous individual or a combination of two polymorphic individuals.
- data have been used from both Arabidopsis thaliana Columbia and Landsberg erecta ecotypes.
- WGP tags were identified which were specific to either the Col or the Ler data.
- if a single Col tag matches a single Ler tag with only one nucleotide difference, then such a tag is a putative SNP marker.
- Two of such candidates and corresponding BAC clones are identified. They are presented in Table 3.
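The search for putative SNP markers described in this example can be sketched as a pairwise comparison of ecotype-specific tags. The sketch assumes tags of equal length compared by Hamming distance, and requires the one-mismatch match to be unique in both directions; the tag sequences in the usage example are made up and the function names are hypothetical. The quadratic comparison is only meant for small illustrative sets.

```python
from typing import Iterable, List, Tuple


def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def putative_snp_tags(col_tags: Iterable[str], ler_tags: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair Col-specific and Ler-specific tags that differ at exactly one position."""
    col_only = set(col_tags) - set(ler_tags)
    ler_only = set(ler_tags) - set(col_tags)

    def one_off(tag: str, others: set) -> List[str]:
        return [o for o in others if len(o) == len(tag) and hamming(tag, o) == 1]

    col_matches = {c: one_off(c, ler_only) for c in col_only}
    ler_matches = {l: one_off(l, col_only) for l in ler_only}

    pairs = []
    for col_tag, matches in col_matches.items():
        # Keep only pairs where the match is unique in both directions.
        if len(matches) == 1 and len(ler_matches[matches[0]]) == 1:
            pairs.append((col_tag, matches[0]))
    return sorted(pairs)


if __name__ == "__main__":
    col = {"ACGTACGT", "TTTTCCCC", "GGGGAAAA"}
    ler = {"ACGTACGA", "TTTTCCCC", "CCCCTTTT"}
    print(putative_snp_tags(col, ler))
```

Tags shared by both ecotypes are excluded up front, so only tags specific to either data set are considered as candidates.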
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Microbiology (AREA)
- Immunology (AREA)
- Biotechnology (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Analytical Chemistry (AREA)
- Physics & Mathematics (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The present invention relates to a method for generating a physical map of a sample genome combined with the identification of polymorphisms. The method comprises, on the one hand, generating a physical map of a bacterial artificial chromosome (BAC) library of a heterozygous sample genome based on high throughput sequencing of tagged adapter-ligated restriction fragments from pooled BAC clones and, on the other hand, determining the polymorphisms between the tagged adapter-ligated restriction fragments.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US28533109P | 2009-12-10 | 2009-12-10 | |
NL2003932 | 2009-12-10 | ||
US61/285,331 | 2009-12-10 | ||
NL2003932 | 2009-12-10 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011071382A1 (fr) | 2011-06-16 |
Family
ID=42272720
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/NL2010/050836 WO2011071382A1 (fr) | 2009-12-10 | 2010-12-09 | Polymorphic whole genome profiling |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2011071382A1 (fr) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0534858A1 (fr) | 1991-09-24 | 1993-03-31 | Keygene N.V. | Selective restriction fragment amplification: a general method for DNA fingerprinting |
WO2000024939A1 (fr) | 1998-10-27 | 2000-05-04 | Affymetrix, Inc. | Complexity management and analysis of genomic DNA |
WO2003004690A2 (fr) | 2001-07-06 | 2003-01-16 | 454 Corporation | Method for isolation of independent, parallel chemical micro-reactions using a porous filter |
WO2003012118A1 (fr) | 2001-07-31 | 2003-02-13 | Affymetrix, Inc. | Complexity management of genomic DNA |
WO2003027311A2 (fr) * | 2001-09-24 | 2003-04-03 | Seqwright, Inc | Clone-array pooled shotgun strategy for nucleic acid sequencing |
WO2003054142A2 (fr) | 2001-10-30 | 2003-07-03 | 454 Corporation | Novel sulfurylase-luciferase fusion proteins and thermostable sulfurylase |
WO2004063323A2 (fr) * | 2003-01-10 | 2004-07-29 | Keygene N.V. | AFLP-based method for integrating physical and genetic maps |
WO2004070005A2 (fr) | 2003-01-29 | 2004-08-19 | 454 Corporation | Double ended sequencing |
WO2006137734A1 (fr) * | 2005-06-23 | 2006-12-28 | Keygene N.V. | Improved strategies for sequencing complex genomes using high throughput sequencing technologies |
WO2008007951A1 (fr) | 2006-07-12 | 2008-01-17 | Keygene N.V. | High throughput physical mapping using AFLP |
-
2010
- 2010-12-09 WO PCT/NL2010/050836 patent/WO2011071382A1/fr active Application Filing
Patent Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0969102A2 (fr) * | 1991-09-24 | 2000-01-05 | Keygene N.V. | Selective restriction fragment amplification: a general method for DNA fingerprinting |
EP0534858A1 (fr) | 1991-09-24 | 1993-03-31 | Keygene N.V. | Selective restriction fragment amplification: a general method for DNA fingerprinting |
WO2000024939A1 (fr) | 1998-10-27 | 2000-05-04 | Affymetrix, Inc. | Complexity management and analysis of genomic DNA |
WO2003004690A2 (fr) | 2001-07-06 | 2003-01-16 | 454 Corporation | Method for isolation of independent, parallel chemical micro-reactions using a porous filter |
WO2003012118A1 (fr) | 2001-07-31 | 2003-02-13 | Affymetrix, Inc. | Complexity management of genomic DNA |
US6975943B2 (en) | 2001-09-24 | 2005-12-13 | Seqwright, Inc. | Clone-array pooled shotgun strategy for nucleic acid sequencing |
WO2003027311A2 (fr) * | 2001-09-24 | 2003-04-03 | Seqwright, Inc | Clone-array pooled shotgun strategy for nucleic acid sequencing |
WO2003054142A2 (fr) | 2001-10-30 | 2003-07-03 | 454 Corporation | Novel sulfurylase-luciferase fusion proteins and thermostable sulfurylase |
WO2004063323A2 (fr) * | 2003-01-10 | 2004-07-29 | Keygene N.V. | AFLP-based method for integrating physical and genetic maps |
WO2004070005A2 (fr) | 2003-01-29 | 2004-08-19 | 454 Corporation | Double ended sequencing |
WO2004069849A2 (fr) | 2003-01-29 | 2004-08-19 | 454 Corporation | Bead emulsion nucleic acid amplification |
WO2004070007A2 (fr) | 2003-01-29 | 2004-08-19 | 454 Corporation | Method for preparing single-stranded DNA libraries |
WO2005003375A2 (fr) | 2003-01-29 | 2005-01-13 | 454 Corporation | Methods of amplifying and sequencing nucleic acids |
WO2006137734A1 (fr) * | 2005-06-23 | 2006-12-28 | Keygene N.V. | Improved strategies for sequencing complex genomes using high throughput sequencing technologies |
WO2008007951A1 (fr) | 2006-07-12 | 2008-01-17 | Keygene N.V. | High throughput physical mapping using AFLP |
Non-Patent Citations (9)
Title |
---|
BRENNER ET AL., PROC. NATL. ACAD. SCI., vol. 86, 1989, pages 8902 - 8906 |
CAI W-W ET AL: "A clone-array pooled shotgun strategy for sequencing large genomes", GENOME RESEARCH, COLD SPRING HARBOR LABORATORY PRESS, WOODBURY, NY, US, DOI:10.1101/GR.198101, vol. 11, 1 January 2001 (2001-01-01), pages 1619 - 1623, XP002967818, ISSN: 1088-9051 * |
GREGORY ET AL., GENOME RES., vol. 7, 1997, pages 1162 - 1168 |
KLEIN ET AL., GENOME RESEARCH, vol. 10, 2000, pages 798 - 807 |
KLEIN P E ET AL: "A high-throughput AFLP-based method for constructing integrated genetic and physical maps: Progress toward a sorghum genome map", GENOME RESEARCH, COLD SPRING HARBOR LABORATORY PRESS, WOODBURY, NY, US LNKD- DOI:10.1101/GR.10.6.789, vol. 10, no. 6, 1 June 2000 (2000-06-01), pages 789 - 807, XP002240094, ISSN: 1088-9051 * |
IANNONE ET AL., CYTOMETRY, vol. 39, 2000, pages 131 - 140 |
MARRA ET AL., GENOME RES., vol. 7, 1997, pages 1072 - 1084 |
SEO ET AL., PROC. NATL. ACAD. SCI. USA, vol. 101, 2004, pages 5488 - 93 |
SODERLUND ET AL.: "FPC: a system for building contigs from restriction fingerprinted clones", COMPUT. APPL. BIOSCI., vol. 13, 1997, pages 523 - 535 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10538806B2 (en) | High throughput screening of populations carrying naturally occurring mutations | |
JP5389638B2 (ja) | High throughput detection of molecular markers based on restriction fragments | |
EP2663655B1 (fr) | Paired end random sequence based genotyping | |
EP2427569B1 (fr) | Use of class IIB restriction endonucleases in 2nd generation sequencing applications | |
EP2379751B1 (fr) | Novel genome sequencing strategies | |
US8975028B2 (en) | Method for the identification of the clonal source of a restriction fragment | |
EP2513333A1 (fr) | Restriction enzyme based whole genome sequencing | |
US20200102612A1 (en) | Method for identifying the source of an amplicon | |
WO2011071382A1 (fr) | Polymorphic whole genome profiling | |
US20150329906A1 (en) | Novel genome sequencing strategies |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 10796186 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 10796186 Country of ref document: EP Kind code of ref document: A1 |