WO2016057947A1 - Random nucleotide mutation for nucleotide template counting and assembly - Google Patents
- Publication number
- WO2016057947A1 (PCT/US2015/054981)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- nams
- group
- template
- read
- mnams
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
- C12Q1/6827—Hybridisation assays for detection of mutation or polymorphism
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07H—SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
- C07H21/00—Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids
- C07H21/04—Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids with deoxyribosyl as saccharide radical
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
- C12N15/1031—Mutagenizing nucleic acids mutagenesis by gene assembly, e.g. assembly by oligonucleotide extension PCR
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1058—Directional evolution of libraries, e.g. evolution of libraries is achieved by mutagenesis and screening or selection of mixed population of organisms
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6844—Nucleic acid amplification reactions
- C12Q1/6851—Quantitative amplification
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2521/00—Reaction characterised by the enzymatic activity
- C12Q2521/50—Other enzymatic activities
- C12Q2521/539—Deaminase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2525/00—Reactions involving modified oligonucleotides, nucleic acids, or nucleotides
- C12Q2525/10—Modifications characterised by
- C12Q2525/161—Modifications characterised by incorporating target specific and non-target specific sites
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2531/00—Reactions of nucleic acids characterised by
- C12Q2531/10—Reactions of nucleic acids characterised by the purpose being amplify/increase the copy number of target nucleic acid
- C12Q2531/113—PCR
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2535/00—Reactions characterised by the assay type for determining the identity of a nucleotide base or a sequence of oligonucleotides
- C12Q2535/122—Massive parallel sequencing
Definitions
- the present invention provides a method for determining the number of nucleic acid molecules (NAMs) in a group of NAMs, comprising
- step (iii) counting the number of different sequences obtained in step (ii) to determine the number of unique mNAMs in the group of mNAMs
- the present invention also provides a method for determining the number of nucleic acid molecules (NAMs) in a group of NAMs, comprising
- mNAMs mutagenized NAMs
- step (iii) counting the number of different sequences obtained in step (iii) to determine the number of unique mNAMs in the group of mNAMs,
- the present invention also provides a method for determining the number of different sequences in a group of nucleic acid molecules (NAMs) that have been mutagenized and then amplified comprising
- step (ii) counting the number of different sequences obtained in step (ii) ,
- the present invention also provides a method for sequencing a nucleic acid molecule (NAM) that comprises two or more segments having substantially the same sequence, and that has a length of more than one sequencing read, comprising
- step (i) subjecting each copy of the NAM in step (i) to a mutagenesis that mutates only select nucleic acid bases in the NAMs at a rate of 10% to 90% to produce mutated copies of the NAM (mcNAM) ;
- the present invention also provides a method for determining genomic copy number information from genomic material, comprising,
- the present invention also provides a method for profiling RNA transcripts, comprising i) obtaining a group of RNA transcripts;
- RNA transcript profile determining the proportionate number of a plurality of RNA transcripts having the same sequence to a second different plurality of RNA transcripts that have the same sequence, thereby determining RNA transcript profile.
- the present invention also provides a method for determining allelic imbalance, comprising
- the present invention also provides a method for determining genome assembly, comprising
- step (iii) aligning the sequences obtained in step (ii) according to matching mutation patterns in overlaps of the sequences
- the present invention also provides a method for determining haplotype assembly, comprising
- step (iii) comparing the sequences obtained in step (ii) , thereby determining haplotype assembly.
- the present invention also provides a kit for determining the number of NAMs in a group of NAMs comprising:
- the mutagen is a bisulfite or a salt thereof, or a deamination agent.
- the present invention also provides a composition of matter comprising a plurality of mutagenized nucleic acid molecules (mNAMs), wherein selected mutable nucleic acid positions in the plurality of mutagenized NAMs (mNAMs) are mutated at a rate of 10% to 90%.
- mNAMs mutagenized nucleic acid molecules
- the present invention also provides a composition of matter derived from sequencing primers, which has the sequence:
- ACACTCTTTCCCTACACACGACGCTCTTCCGATC*T (Seq ID p5.mC), wherein the cytosines (C) are methylated, and wherein *T is a phosphorothioated thymine.
- the present invention also provides a composition of matter derived from sequencing primers, which has the sequence: 5Phos-
- the present invention also provides a composition of matter comprising two or more copies of a nucleic acid molecule (NAM) comprising two or more segments having substantially the same sequence, and that has a length of more than one sequencing read, wherein each copy of the NAM has a unique primer at its 5' end and another unique primer at its 3' end, and is subjected to a mutagenesis that mutates each mutable position in the NAMs at a rate of 10% to 90% to produce mutated copies of the NAM (mcNAM) , wherein the unique primers of each mcNAM lack a nucleotide that is mutable by the mutagenesis.
- NAM nucleic acid molecule
- the present invention also provides sequence information produced by a system including one or more processing units which counts the number of different sequences obtained by a sequencer that processed the group of amplified mNAMs in the method of the present invention, or the group of mcNAMs in the method of the present invention.
- the present invention also provides a system including one or more processing units which counts the number of different sequences obtained by a sequencer that processed the group of amplified mNAMs in the method of the present invention, or the group of mcNAMs in the method of the present invention.
- Cytidine deamination is a mutational method that converts cytidine (underlined) to a uridine (bold). Upon amplification, uridine becomes a thymidine (bold) (panel C). The sequenced nucleotide strings are aligned and aggregated by their mutational patterns (panel D), and the number of distinct patterns is counted.
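The counting idea described above can be illustrated with a minimal Python sketch. It assumes error-free amplification and sequencing, reads that cover the whole template, and an independent C-to-T conversion at every cytosine; the template string, the 50% rate, and the function names `deaminate` and `count_templates` are hypothetical choices made for illustration, not taken from the disclosure.

```python
import random

def deaminate(template, rate, rng):
    """Convert each C to T independently with probability `rate` (assumed model)."""
    return "".join(
        "T" if base == "C" and rng.random() < rate else base
        for base in template
    )

def count_templates(reads):
    """Estimate the number of original templates as the number of distinct patterns."""
    return len(set(reads))

rng = random.Random(0)
template = "ACGTCCATCGGCTACCGTCAGCCTACGCCATCG"  # hypothetical template sequence
n_templates = 8

# Each original molecule receives its own random mutation pattern.
mutated = [deaminate(template, 0.5, rng) for _ in range(n_templates)]

# Amplification copies each mutated molecule an uneven number of times.
reads = [m for m in mutated for _ in range(rng.randint(3, 10))]
rng.shuffle(reads)

estimated = count_templates(reads)
print(f"{len(reads)} reads, {estimated} distinct patterns, {n_templates} true templates")
```

The estimate falls below the true count only when two templates happen to receive the same pattern, which becomes less likely as the number of mutable positions grows.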
- each template has a unique pair of markers, or "end tags", denoted by the colored circle and square (panel A). Markers of the same color occur on the same template strands and are said to be "in phase". Gray marks on the templates show positions that may mutate.
- panel B a random mutation process
- each read is mapped to the reference template (panel C, top strand).
- the phase of the original templates was recovered (panel D).
- the ability to recover the template count is a function of the window size, template length, flip rate, number of templates, and depth of coverage. Simulation results of template count estimation are shown under a variety of conditions. Each panel has three plots, for window sizes of 10, 20, and 30 bits. The x axis shows the true count from 2 to 1,024 (log2 scale) and the y axis shows the average estimated count divided by the true count, or the proportion of templates recovered. Panel A simulates recovery when the template is one window long, over a range of flip rates under infinite coverage. Panel B shows the results for a one-window template under finite coverage (1x to 5x reads per template) for a fixed flip rate of 0.35. Panel C repeats the results of panel B for long templates comprised of 16 read lengths.
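A simulation of the kind described for panel A might look like the following sketch. It covers only the simplest case (templates one window long, infinite coverage, independent flips) and is not the code used to produce the figure; `simulate_recovery` and its parameters are names introduced here.

```python
import random

def simulate_recovery(n_templates, window_bits, flip_rate, trials, rng):
    """Average proportion of templates recovered, assuming infinite coverage.

    Every template is observed, so the estimated count in a trial is the
    number of distinct bit patterns among the templates.
    """
    total = 0.0
    for _ in range(trials):
        patterns = {
            tuple(rng.random() < flip_rate for _ in range(window_bits))
            for _ in range(n_templates)
        }
        total += len(patterns) / n_templates
    return total / trials

if __name__ == "__main__":
    rng = random.Random(0)
    for window_bits in (10, 20, 30):
        for n_templates in (2, 32, 1024):
            recovered = simulate_recovery(n_templates, window_bits, 0.35, 20, rng)
            print(f"window={window_bits:2d} templates={n_templates:4d} "
                  f"recovered={recovered:.3f}")
```

Finite coverage and multi-window templates, as in panels B and C, would need additional sampling and assembly steps that this sketch leaves out.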
- each read is a vertex. Some reads contain their end tag (colored circles) and some do not (white circles). Two reads were connected by an edge if they agree on their overlap. The weight of an edge reflects the strength of that overlap.
- Panel A depicts the template information assuming exhaustive coverage, drawing all distinct reads and the edges between them.
- reads were sampled from the templates at a finite depth of coverage of 4x and 8x per template, respectively. From this information the greedy algorithm was applied (panels C1 and C2) to select the best edges, shown in red.
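The read graph and greedy edge selection described above can be sketched as follows. The source does not spell out the algorithm, so this is one plausible reading under stated assumptions: a read is a hypothetical `(start, bits)` pair over mutable positions, the edge weight is the number of overlapping positions on which two reads agree everywhere, and an edge is accepted only if it links two separate chains and respects a one-predecessor, one-successor rule.

```python
from itertools import combinations

def overlap_weight(a, b):
    """Number of shared positions if the reads agree on all of them, else 0."""
    (start_a, bits_a), (start_b, bits_b) = a, b
    lo = max(start_a, start_b)
    hi = min(start_a + len(bits_a), start_b + len(bits_b))
    if hi <= lo:
        return 0  # no overlap
    if bits_a[lo - start_a:hi - start_a] != bits_b[lo - start_b:hi - start_b]:
        return 0  # reads disagree somewhere on the overlap
    return hi - lo

def greedy_chains(reads, min_weight=1):
    """Pick the heaviest compatible edges first and return (edges, chain count)."""
    parent = list(range(len(reads)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = []
    for i, j in combinations(range(len(reads)), 2):
        weight = overlap_weight(reads[i], reads[j])
        if weight >= min_weight:
            # Orient each edge from the leftmost read to the rightmost read.
            left, right = (i, j) if reads[i][0] <= reads[j][0] else (j, i)
            edges.append((weight, left, right))
    edges.sort(key=lambda edge: -edge[0])

    has_next, has_prev, chosen = set(), set(), []
    for weight, left, right in edges:
        if left in has_next or right in has_prev:
            continue  # each read gets at most one successor and one predecessor
        root_left, root_right = find(left), find(right)
        if root_left == root_right:
            continue  # already in the same chain
        parent[root_left] = root_right
        has_next.add(left)
        has_prev.add(right)
        chosen.append((left, right, weight))

    n_chains = len({find(i) for i in range(len(reads))})
    return chosen, n_chains

# Two hypothetical templates, each covered by three reads of six bits.
templates = ["101100111010", "011010010110"]
reads = [(s, t[s:s + 6]) for t in templates for s in (0, 3, 6)]
chosen, n_chains = greedy_chains(reads)
print(chosen, n_chains)
```

In this toy input every cross-template overlap disagrees, so the selected edges form one chain per template.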
- Panel A depicts the rate at which each base is observed over all the data for those positions with a coverage of at least 30 reads.
- Panel B depicts the cumulative distribution of the conversion rate per read.
- Panel C depicts the correlation in flips for all cytosines in the targeted region.
- Panel D depicts panel C as a histogram. For the best amplified position, partial conversion was determined with a 60% flip rate randomly distributed throughout each read.
- Figure 7 Conversion rate as a function of incubation time and temperature.
- the datasets A3, A6, A9 and A45 are the 3, 6, 9 and 45 minute conversions described herein.
- Figure 8 Subset of clustered reads showing mutational patterns and two heterozygous positions in the sample. The panel on the left shows all the positions in the fragment while the plot on the right shows only cytosine (bit) positions. The white lines separate reads derived from the same initial template. Each cluster contains between 30 and 50 reads. Black indicates a position where the read matches the reference genome. The frequent gray squares are cytosines that have converted to thymine. The white and light grey streaks are linked heterozygous alleles which split according to mutation pattern. Sparse background errors are typically from the sequencer while the bands of error are typically the result of PCR.
- Figure 10 Comparison of template count distributions for fragments from the autosome and X chromosome. Since the sample is male, a 2 to 1 ratio was observed in the mean template counts. The histogram is the empirical distribution and the curve shows a negative binomial fit.
- Figure 11 Heterozygous allele counts by template demonstrate a perfect fit to the binomial distribution.
- the plot on the left shows as a histogram the counts for one allele.
- the curve shows the theoretical expectation of the count distribution assuming a binomial distribution for the allele at each locus assuming the given locus coverage.
- the plot on the right shows the Q-Q plot over all 6000 heterozygous positions observed.
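The theoretical expectation mentioned for Figure 11 can be sketched in Python. The sketch assumes each template at a heterozygous locus carries a given allele with probability 0.5, so the count of that allele among n templates is binomial; `binomial_pmf`, `expected_histogram`, and the example coverages are illustrative and not taken from the source data.

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """Probability of observing k templates of one allele among n templates."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def expected_histogram(coverages, p=0.5):
    """Expected number of loci with k templates of one allele, for k = 0..max(n).

    `coverages` lists the template coverage n of each heterozygous locus.
    """
    expected = [0.0] * (max(coverages) + 1)
    for n in coverages:
        for k in range(n + 1):
            expected[k] += binomial_pmf(k, n, p)
    return expected

# Hypothetical template coverages for three loci.
print(expected_histogram([10, 12, 8]))
```

Comparing such an expected histogram with the observed one, or plotting observed against expected quantiles, corresponds to the two plots described above.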
- Figure 13 Properties of the consensus sequences derived by clustering reads with the same mutation pattern. For each consensus sequence base, the proportion of reads reporting the homozygous base was determined. These error rates are an order of magnitude below sequencer error.
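Consensus building of this kind can be sketched as follows, assuming reads are already aligned to a reference of equal length and that reads from one template share an identical conversion pattern at the reference cytosines. The function names and toy reads are hypothetical, and a real pipeline would need to tolerate sequencing errors at cytosine positions, which this exact-match clustering does not.

```python
from collections import Counter, defaultdict

def mutation_pattern(read, reference):
    """One boolean per reference cytosine: True if the read shows T there."""
    return tuple(base == "T" for base, ref in zip(read, reference) if ref == "C")

def cluster_consensus(reads, reference):
    """Group reads by mutation pattern and take a per-column majority vote.

    Returns {pattern: (consensus, support)}, where support[i] is the
    proportion of reads in the cluster that agree with the consensus base i.
    """
    clusters = defaultdict(list)
    for read in reads:
        clusters[mutation_pattern(read, reference)].append(read)

    result = {}
    for pattern, members in clusters.items():
        consensus, support = [], []
        for column in zip(*members):
            base, count = Counter(column).most_common(1)[0]
            consensus.append(base)
            support.append(count / len(members))
        result[pattern] = ("".join(consensus), support)
    return result

reference = "ACGTCA"
reads = ["ATGTCA", "ATGTCA", "ATGTCG", "ACGTTA", "ACGTTA"]  # one read has an error
for pattern, (consensus, support) in cluster_consensus(reads, reference).items():
    print(pattern, consensus, [round(s, 2) for s in support])
```

The isolated error in the third read is outvoted within its cluster, which is the mechanism by which consensus error rates can fall below the raw sequencer error rate.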
- the present invention provides a method for determining the number of nucleic acid molecules (NAMs) in a group of NAMs, comprising
- step (iii) counting the number of different sequences obtained in step (ii) to determine the number of unique mNAMs in the group of mNAMs
- the present invention provides a method for determining the number of nucleic acid molecules (NAMs) in a group of NAMs, comprising i) obtaining an amplified and mutagenized group of NAMs that was produced by
- step (iii) counting the number of different sequences obtained in step (ii) to determine the number of unique mNAMs in the group of mNAMs
- obtaining sequences comprises obtaining composite sequences produced by assembling sequence reads of the mNAMs by a) aligning the sequence reads according to matching mutation patterns in overlaps of the sequence reads, thereby obtaining composite sequences, and
- obtaining sequences comprises obtaining composite sequences produced by assembling sequence reads of the mNAMs by
- the present invention also provides a method for determining the number of nucleic acid molecules (NAMs) in a group of NAMs, comprising
- mNAMs mutagenized NAMs
- step (iii) counting the number of different sequences obtained in step (iii) to determine the number of unique mNAMs in the group of mNAMs,
- the sequencing comprises assembling sequence reads of the mNAMs into composite sequences by
- step (iii) wherein counting the number of different composite sequences obtained in step (iii) .
- a sub-group of NAMs in the group of NAMs is determined to have substantially the same nucleotide sequence.
- the sub-group of NAMs is determined to have nucleotide sequences that are at least 95, 96, 97, 98, 99, 99.1, 99.2, 99.3, 99.4, 99.5, 99.6, 99.7, 99.8, or 99.9% identical.
- the nucleotide sequences of a sub-group of NAMs comprise a stretch of consecutive nucleotides having a sequence which includes at least two mutable positions and is i) identical to the sequence of a stretch of consecutive nucleotides within another NAM within the sub-group of NAMs, or ii) determined to be at least 95, 96, 97, 98, 99, 99.1, 99.2, 99.3, 99.4, 99.5, 99.6, 99.7, 99.8, or 99.9% identical to the sequence of a stretch of consecutive nucleotides within another NAM within the sub-group of NAMs.
- the counting comprises counting the number of different sequences that are determined to have substantially the same sequence except for their mutable positions, thereby determining the number of NAMs in the group of NAMs that had substantially the same sequence. In one or more embodiments, the counting comprises counting the number of different sequences which lack substantially the same sequence in any stretch including at least two mutable positions, thereby determining the number of NAMs without substantially the same sequence in the group of NAMs.
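Counting sequences that match except at their mutable positions can be sketched as follows. The sketch makes two simplifying assumptions that are not in the source: sequences are equal-length and already aligned, and cytosine is the only mutable base, so C and T are collapsed into one symbol before grouping. Exact equality of the masked sequences stands in for "substantially the same"; `percent_identity` shows how an identity threshold could be computed instead.

```python
from collections import defaultdict

def percent_identity(a, b):
    """Percentage of matching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def mask_mutable(seq):
    """Collapse C and T into Y so that C-to-T conversions are ignored."""
    return seq.replace("C", "Y").replace("T", "Y")

def count_per_origin(sequences):
    """For each masked sequence, count the distinct mutation patterns observed."""
    groups = defaultdict(set)
    for seq in sequences:
        groups[mask_mutable(seq)].add(seq)
    return {masked: len(patterns) for masked, patterns in groups.items()}

# Hypothetical reads derived from two original sequences.
sequences = ["ATGTCC", "ACGTTC", "ATGTCC", "GGATAT", "GGACAT"]
print(count_per_origin(sequences))
print(percent_identity("ACGT", "ACGA"))
```

Masking T as well as C means a genuine C/T difference between two origins would be hidden, so a fuller implementation would compare against a reference rather than mask blindly.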
- the present invention also provides a method for determining the number of different sequences in a group of nucleic acid molecules (NAMs) that have been mutagenized and then amplified comprising
- step (ii) counting the number of different sequences obtained in step (ii) ,
- the present invention also provides a method for sequencing a nucleic acid molecule (NAM) that comprises two or more segments having substantially the same sequence, and that has a length of more than one sequencing read, comprising
- a copy of the NAM is a partial copy of the NAM.
- a copy of the NAM has at least 50 bp of identical or complementary sequence to the NAM.
- a copy of the NAM is a complete copy of the NAM.
- the present invention also provides a method for sequencing a nucleic acid molecule (NAM) that comprises two or more segments having substantially the same sequence, and that has a length of more than one sequencing read, comprising
- step (i) subjecting each copy of the NAM in step (i) to a mutagenesis that mutates only select nucleic acid bases in the NAMs at a rate of 10% to 90% to produce mutated copies of the NAM (mcNAM) ;
- the present invention also provides a method for sequencing a nucleic acid molecule (NAM) that comprises two or more segments having substantially the same sequence, and that has a length of more than one sequencing read, comprising
- step (i) subjecting each copy of the NAM in step (i) to a chemical mutagenesis that mutates only select nucleic acid bases in the NAMs at a rate of 10% to 90% to produce mutated copies of the NAM (mcNAM) ;
- each of the two or more copies of the NAM has a unique primer at its 5' end and another unique primer at its 3' end.
- the unique primers of each mcNAM lack a nucleotide that is mutable by the mutagenesis.
- the present invention also provides a method for determining genomic copy number information from genomic material, comprising,
- the present invention also provides a method for profiling RNA transcripts, comprising
- RNA transcript profile determining the proportionate number of a plurality of RNA transcripts having the same sequence to a second different plurality of RNA transcripts that have the same sequence, thereby determining RNA transcript profile.
- the present invention also provides a method for determining allelic imbalance, comprising
- the present invention also provides a method for determining genome assembly, comprising
- step (iii) aligning the sequences obtained in step (ii) according to matching mutation patterns in overlaps of the sequences
- the present invention also provides a method for determining haplotype assembly, comprising i) obtaining a group of alleles, wherein the alleles in the group of alleles are located in the same chromosome;
- the rate of mutagenizing each mutable position of the NAMs in the group of NAMs is 25% to 75%.
- the rate of mutagenizing each mutable position of the NAMs in the group of NAMs is 40% to 60%.
- the rate of mutagenizing each mutable position of the NAMs in the group of NAMs is 50%.
- the proportion of all bases mutated in each mNAM is about 3% to 30%.
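The rates listed above can be related by a short calculation, offered as an illustration under an independence assumption rather than as part of the disclosure. Suppose each of $w$ mutable positions in a read window is mutated independently with probability $p$, and a fraction $f$ of all bases is mutable; the symbols $w$, $p$, and $f$ are introduced here for the sketch.

```latex
\begin{align*}
\Pr[\text{two copies agree at one mutable position}] &= p^{2} + (1-p)^{2},\\
\Pr[\text{two copies share a pattern over } w \text{ positions}] &= \bigl(p^{2} + (1-p)^{2}\bigr)^{w},\\
\min_{p}\,\bigl(p^{2} + (1-p)^{2}\bigr)^{w} &= 2^{-w} \quad \text{at } p = \tfrac{1}{2},\\
\mathbb{E}[\text{proportion of all bases mutated}] &= f\,p .
\end{align*}
```

With the hypothetical values $f = 0.2$ and $p = 0.5$, the expected proportion of mutated bases is $0.1$, which lies inside the 3% to 30% range; with $w = 20$ and $p = 0.5$, two copies of a template share a pattern with probability $2^{-20}$, about $10^{-6}$.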
- the mutagenesis is by cytosine deamination.
- the mutagenesis is performed after binding template molecules to a bead or surface.
- biotinylated primers are attached to templates.
- templates linked to biotinylated moieties are attached to streptavidin beads.
- the mutagenesis further comprises beads and/or varietal tags.
- the cytosine deamination is induced by a bisulfite or a salt thereof. In some embodiments, the cytosine deamination is induced by enzymology.
- the cytosine deamination is induced by an activation-induced deaminase.
- the mutagenesis comprises contacting the group of NAMs with a depurination agent, transposase agent, or an alkylating agent.
- each mutable position of the NAMs comprises a cytosine (C).
- the cytosine (C) is mutated to a uracil (U) or a thymine (T).
- each NAM in the group of NAMs has a unique primer at its 5' end and another unique primer at its 3' end.
- the primer comprises one or more methylated cytosines.
- the primer comprises one or more phosphorothioated nucleotide bases.
- the primer further comprises a 5'-phosphorylated, deoxyuridine-containing anchor-primer.
- the primer has the sequence: ACACTCTTTCCCTACACACGACGCTCTTCCGATCT (Seq ID p5).
- the primer has the sequence: ACACTCTTTCCCTACACACGACGCTCTTCCGATC*T (Seq ID p5.mC), wherein the cytosines (C) are methylated, and wherein *T is a phosphorothioated thymine.
- the cytosines (C) are methylated.
- the primer has the sequence: GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG (Seq ID p7).
- the primer further comprises a 5'-phosphorylated, deoxyuridine-containing anchor-primer.
- the primer has the sequence: GATCGGAAGAGCGGTTCAGCAGGAATGCCGA*G (Seq ID p7.mC), wherein the cytosines (C) are methylated, wherein *G is a phosphorothioated guanine, and wherein 5Phos is a 5'-phosphorylated, deoxyuridine-containing anchor-primer.
- the cytosines (C) are methylated.
- the assembling further comprises aligning the sequences according to unique primers at the 5' and 3' ends.
- the sequence of each unique primer comprises a segment that is substantially the same across the unique primers, and an amplification primer that is complementary to this shared segment is used when amplifying the mNAMs or copies thereof.
- amplification is performed using a sequence-specific "wobble" primer.
- each unique primer comprises a unique tag sequence.
- the method further comprises the step of tagging each NAM or copy thereof.
- the tag lacks a nucleotide that is mutable by the mutagenesis.
- the NAM is within a mixture of DNA or RNA extracted from a cell. In some embodiments, the DNA or RNA extracted from the cell has been fragmented.
- the DNA or RNA extracted from the cell has been fragmented by mechanical shearing or one or more restriction enzymes.
- the one or more restriction enzymes is PstI.
- fragmentation occurs before amplifying. In one or more embodiments, fragmentation occurs after amplifying.
- fragmentation occurs after mutagenesis.
- the method of the claimed invention further comprises subjecting the fragmented DNA or RNA to end-repair.
- the method of the claimed invention further comprises subjecting the fragmented DNA or RNA to adenylation.
- the method of the claimed invention further comprises subjecting the fragmented DNA or RNA to ligation with methyl-cytosine adaptors, wherein the methyl-cytosine adaptors are bisulfite resistant sequencing adaptors.
- the NAM is a DNA molecule.
- the DNA molecule is a fragment of genomic DNA.
- the DNA molecule is a cDNA molecule.
- the NAM is an RNA molecule.
- the RNA molecule is an mRNA molecule. In one or more embodiments, the RNA molecule is a viral RNA molecule.
- the NAM is an RNA molecule derived from one or more cell lines.
- the method of the claimed invention further comprises reverse transcription of the NAM.
- the reverse transcription is with poly-T and methyl-cytosines, wherein the methyl-cytosines are resistant to bisulfite mutation.
- chemical mutagenesis occurs prior to reverse transcription.
- one or more NAMs in the group of NAMs has a length of one sequencing read length.
- one or more NAMs in the group of NAMs has a length of two or more sequencing read lengths.
- the length is 2, 3, 4, 5, 6, 7, 8, 9, 10, or 10-3000 sequencing read lengths.
- the number of NAMs in the group of NAMs is about 2, 3, 4, 5, 6, 7, 8, 9, 10, or 10-10000.
- if the number of NAMs in the group of NAMs is greater than 10000, the group of NAMs is diluted.
- the amplifying is by short-range or long-range polymerase chain reaction (PCR).
- the mutagen of the mutagenesis is diluted.
- the group of NAMs is incubated with a mutagen at an incubation temperature of about 70 to 78 degrees Celsius.
- the incubation temperature is about 73 degrees Celsius.
- the group of NAMs is incubated with a mutagen at an incubation time of about 3 to 45 minutes.
- the group of NAMs is incubated with a mutagen at an incubation time of about 5 to 20 minutes.
- the incubation time is about 3, 6, or 9 minutes.
- the incubation time is about 10 minutes.
- the present invention also provides a kit for determining the number of NAMs in a group of NAMs comprising:
- the mutagen is a bisulfite or a salt thereof, or a deamination agent.
- the bisulfite or salt thereof is NaHSO3.
- the mutagen induces cytosine deamination.
- the cytosine deamination is by enzymology.
- the mutagen is diluted.
- the kit of the present invention further comprises a plurality of unique primers including:
- substantially unique primers comprise substantially unique tags.
- the kit of the present invention further comprises a DNA polymerase having 3'-5' proofreading activity.
- the plurality of substantially unique primers comprises 10^n primers, wherein n is an integer from 2 to 9.
- the substantially unique tags are at least 6 nucleotides long.
- the substantially unique tags are at least 15 nucleotides long.
- the substantially unique primers comprise sets of substantially unique primers having shared sample tags.
- the sample tags are at least 2 or 4 nucleotides long.
- the sequence of the substantially unique tag is not altered by the mutagen.
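The tag requirements above (a large number of substantially unique tags whose sequence the mutagen does not alter) can be illustrated with a short Python sketch. It assumes cytosine is the only mutable base on the tagged strand, so tags are drawn from the three-letter alphabet A, G, T; the function names and this alphabet choice are assumptions made for illustration, not the kit's actual design.

```python
import random

def min_tag_length(n_tags, alphabet_size=3):
    """Smallest length L such that alphabet_size**L >= n_tags."""
    length = 1
    while alphabet_size**length < n_tags:
        length += 1
    return length

def random_tags(n_tags, length, alphabet="AGT", rng=None):
    """Draw n_tags distinct tags of the given length that contain no cytosine."""
    rng = rng or random.Random()
    if len(alphabet) ** length < n_tags:
        raise ValueError("tag length too short for the requested number of tags")
    tags = set()
    while len(tags) < n_tags:
        tags.add("".join(rng.choice(alphabet) for _ in range(length)))
    return sorted(tags)

# Shortest cytosine-free tag that can label 10^n primers, for n from 2 to 9.
for n in range(2, 10):
    print(n, min_tag_length(10**n))

print(random_tags(5, 6, rng=random.Random(1)))
```

Because only three bases are available, a cytosine-free tag must be somewhat longer than a four-base tag to label the same number of primers.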
- the kit of the present invention further comprises a primer wherein the cytosines (C) are methylated.
- the methylated primer having the sequence: ACACTCTTTCCCTACACACGACGCTCTTCCGATC*T (Seq ID p5.mC), wherein the cytosines (C) are methylated, and wherein *T is a phosphorothioated thymine.
- the methylated primer having the sequence: 5Phos-GATCGGAAGAGCGGTTCAGCAGGAATGCCGA*G (Seq ID p7.mC), wherein the cytosines (C) are methylated, wherein *G is a phosphorothioated guanine, and wherein 5Phos is a 5'-phosphorylated, deoxyuridine-containing anchor-primer.
- the present invention also provides a composition of matter comprising a plurality of mutagenized nucleic acid molecules (mNAMs), wherein selected mutable nucleic acid positions in the plurality of mutagenized NAMs (mNAMs) are mutated at a rate of 10% to 90%.
- mNAMs mutagenized nucleic acid molecules
- each mutable position is mutated at a rate of 25% to 75%.
- each mutable position is mutated at a rate of 40% to 60%.
- each mutable position is mutated at a rate of 50%.
- each mutable nucleic acid base is mutated at a rate of 25% to 75%.
- each mutable nucleic acid base is mutated at a rate of 40% to 60%.
- each mutable nucleic acid base is mutated at a rate of 50%.
- the proportion of all nucleic acids mutated in each mNAM is about 3% to 30%.
- the mutable nucleic acid position is a cytosine position of the mNAMs and the mutagenesis is deamination of the cytosine.
- the mutable nucleic acid base is a cytosine base of the mNAMs and the mutagenesis is deamination of the cytosine.
- the deamination of the cytosine is induced by a bisulfite or a salt thereof.
- the deamination of the cytosine is induced by enzymology.
- the deamination of the cytosine is induced by an activation-induced deaminase.
- the mutable nucleic acid position is mutagenized by contacting the group of NAMs with a depurination agent, transposase agent, or an alkylating agent.
- each mutable position of the NAMs comprises a cytosine (C).
- the cytosine (C) is mutated to a uracil (U) or a thymine (T).
- each NAM in the plurality of NAMs has a unique primer at its 5' end and another unique primer at its 3' end.
- the primer comprises one or more methylated cytosines.
- the primer comprises one or more phosphorothioated nucleotide bases.
- the primer further comprises a 5'-phosphorylated, deoxyuridine-containing anchor-primer.
- the primer has the sequence: ACACTCTTTCCCTACACACGACGCTCTTCCGATCT (Seq ID p5).
- the plurality of mNAMs bearing a primer wherein the sequence of the primer is:
- cytosines (C) are methylated.
- all the cytosines (C) are methylated.
- the primer has the sequence: GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG (Seq ID p7) .
- the primer further comprises a 5'- phosphorylated, deoxyuridine-containing anchor-primer.
- 5Phos-GATCGGAAGAGCGGTTCAGCAGGAATGCCGA*G (Seq ID p7.mC), wherein *G is a phosphorothioated guanine, and wherein 5Phos is a 5'-phosphorylated, deoxyuridine-containing anchor-primer.
- the cytosines (C) are methylated.
- the present invention also provides a composition of matter derived from sequencing primers having the sequence: ACACTCTTTCCCTACACACGACGCTCTTCCGATC*T (Seq ID p5.mC), wherein the cytosines (C) are methylated, and wherein *T is a phosphorothioated thymine.
- the present invention also provides a composition of matter derived from sequencing primers has the sequence: 5Phos-
- the present invention also provides a composition of matter comprising two or more copies of a nucleic acid molecule (NAM) comprising two or more segments having substantially the same sequence, and that has a length of more than one sequencing read, wherein each copy of the NAM has a unique primer at its 5' end and another unique primer at its 3' end, and is subjected to a mutagenesis that mutates each mutable position in the NAMs at a rate of 10% to 90% to produce mutated copies of the NAM (mcNAM) , wherein the unique primers of each mcNAM lack a nucleotide that is mutable by the mutagenesis.
- the present invention also provides sequence information produced by a system including one or more processing units which counts the number of different sequences obtained by a sequencer that processed the group of amplified mNAMs in the method of the claimed invention, or the group of mcNAMs in the method of the claimed invention.
- the present invention also provides a system including one or more processing units which counts the number of different sequences obtained by a sequencer that processed the group of amplified mNAMs in the method of the claimed invention, or the group of mcNAMs in the method of the claimed invention.
- the present invention also provides a method for sequencing a nucleic acid molecule (NAM) that comprises two or more segments having substantially the same sequence, and that has a length of more than one sequencing read, comprising
- step (i) subjecting each copy of the NAM in step (i) to a mutagenesis that mutates only select nucleic acid positions in the NAMs at a rate of 10% to 90% to produce mutated copies of the NAM (mcNAM) ;
- the present invention also provides a method for distinguishing between benign and malignantly transformed cells by detecting one or more single nucleotide polymorphisms (SNPs) in a sample from a subject and a reference sample from a control subject comprising a method of the claimed invention.
- the present invention also provides a method for distinguishing between benign and malignantly transformed cells by detecting one or more single nucleotide polymorphisms (SNPs) in a first and second sample from a subject comprising a method of the claimed invention.
- the present invention also provides a method for determining the presence of tumor cells in a sample by comparing a sample from a subject and a reference sample from a control subject comprising a method of the claimed invention.
- the present invention also provides a method for determining the presence of tumor cells in a sample by comparing a first and second sample from a subject comprising a method of the claimed invention.
- the present invention also provides a method for quantifying tumor cells in a sample by comparing a sample from a subject and a reference sample from a control subject comprising a method of the claimed invention.
- the present invention also provides a method for quantifying tumor cells in a sample by comparing a first and second sample from a subject comprising a method of the claimed invention.
- the present invention also provides a method for detecting one or more rare mutations by comparing a sample from a subject and a reference sample from a control subject comprising a method of the claimed invention.
- the present invention also provides a method for detecting one or more rare mutations by comparing a first and second sample from a subject comprising a method of the claimed invention.
- the sample is a blood sample, plasma sample, serum sample, tissue sample, or cell sample.
- the tissue sample is from a tumor mass, surgically removed tumor mass, or margins of a surgically removed tumor mass.
- the present invention also provides a method for detecting one or more rare mutations in a cell-free or substantially cell-free sample comprising a method of the claimed invention.
- the present invention also provides a method for determining whether a fetus has at least one or more rare mutations in a cell-free or substantially cell-free sample comprising a method of the claimed invention.
- the sample is a maternal sample.
- the maternal sample is obtained from a member selected from: maternal blood, maternal plasma and maternal serum.
- "nucleic acid molecule" and "sequence" are not used interchangeably herein.
- sequence refers to the sequence information of a “nucleic acid molecule”.
- mutable position refers to the position of a nucleic acid that is susceptible to a given type of chemical mutagenesis.
- determining the number refers to determining the lower bound number.
- X% with respect to mutation rate, refers to the probability percentage of mutagenesis per mutable position of the multiple mutable positions that are present in a plurality of nucleic acid molecules. Thus, 25% mutation rate means a 25% probability of mutagenesis.
- nucleic acid shall mean any nucleic acid, including, without limitation, DNA, RNA and hybrids thereof.
- the nucleic acid bases that form nucleic acid molecules can be the bases A, C, G, T and U, as well as derivatives thereof .
- contig and contiguous refer to a set of overlapping sequences or sequence reads.
- amplifying refers to the process of synthesizing nucleic acid molecules that are complementary to one or both strands of a template nucleic acid.
- Amplifying a nucleic acid molecule typically includes denaturing the template nucleic acid, annealing primers to the template nucleic acid at a temperature that is below the melting temperatures of the primers, and enzymatically elongating from the primers to generate an amplification product. The denaturing, annealing and elongating steps each can be performed once.
- the denaturing, annealing and elongating steps are performed multiple times (e.g., polymerase chain reaction (PCR) ) such that the amount of amplification product is increasing, often times exponentially, although exponential amplification is not required by the present methods.
- Amplification typically requires the presence of deoxyribonucleoside triphosphates, a DNA polymerase enzyme and an appropriate buffer and/or co-factors for optimal activity of the polymerase enzyme.
- the term "amplified nucleic acid molecule” refers to the nucleic acid sequences, which are produced from the amplifying process as defined herein.
- bisulfite mutagenesis refers to the mutagenesis of nucleic acid with a reagent used for the bisulfite conversion of cytosine to uracil.
- bisulfite conversion reagents include but are not limited to treatment with a bisulfite, a disulfite or a hydrogensulfite compound.
- mapping refers to identifying a location on a genome or cDNA library that has a sequence which is substantially identical to or substantially fully complementary.
- the nucleic acid molecule may be, but is not limited to the following: a segment of genomic material, a cDNA, a mRNA, or a segment of a cDNA.
- methylation refers to the covalent attachment of a methyl group at the C5-position of the nucleotide base cytosine.
- methylation state refers to the presence or absence of 5-methyl-cytosine ("5-Me") at one or a plurality of CpG dinucleotides within a DNA sequence.
- a methylation site is a sequence of contiguous linked nucleotides that is recognized and methylated by a sequence specific methylase .
- a methylase is an enzyme that methylates (i.e., covalently attaches a methyl group to) one or more nucleotides at a methylation site.
- the term "read” or “sequence read” refers to the nucleotide or base sequence information of a nucleic acid that has been generated by any sequencing method.
- a read therefore corresponds to the sequence information obtained from one strand of a nucleic acid fragment.
- a DNA fragment where sequence has been generated from one strand in a single reaction will result in a single read.
- multiple reads for the same DNA strand can be generated where multiple copies of that DNA fragment exist in a sequencing project or where the strand has been sequenced multiple times.
- a read therefore corresponds to the purine or pyrimidine base calls or sequence determinations of a particular sequencing reaction.
- sequencing refers to nucleotide sequence information that is sufficient to identify or characterize the nucleic acid molecule, and could be the full length or only partial sequence information for the nucleic acid molecule.
- reference genome refers to a genome of the same species as that being analyzed for which genome the sequence information is known.
- region of the genome refers to a continuous genomic sequence comprising multiple discrete locations.
- sample tag refers to a nucleic acid having a sequence no greater than 1000 nucleotides and no less than two that may be covalently attached to each member of a plurality of tagged nucleic acid molecules or tagged reagent molecules.
- a “sample tag” may comprise part of a "tag.”
- segments of genomic material refers to the nucleic acid molecules resulting from fragmentation of genomic DNA.
- substantially the same sequences have at least about 80% sequence identity or complementarity, respectively, to a nucleotide sequence. Substantially the same sequences may have at least about 95%, 96%, 97%, 98%, 99% or 100% sequence identity or complementarity, respectively.
- substantially unique primers refers to a plurality of primers, wherein each primer comprises a tag, and wherein at least 50% of the tags of the plurality of primers are unique.
- the tags are at least 60%, 70%, 80%, 90%, or 100% unique tags.
- substantially unique tags refers to tags in a plurality of tags, wherein at least 50% of the tags of the plurality are unique to the plurality of tags.
- substantially unique tags will be at least 60%, 70%, 80%, 90%, or 100% unique tags.
- tag refers to a nucleic acid having a sequence no greater than 1000 nucleotides and no less than two that may be covalently attached to a nucleic acid molecule or reagent molecule.
- a tag may comprise a part of an adaptor or a primer.
- a "tagged nucleic acid molecule” refers to a nucleic acid molecule which is covalently attached to a "tag.”
- the term "wobble base pairing" with regard to two complementary nucleic acid sequences refers to the base pairing of G to uracil U rather than C, when one or both of the nucleic acid strands contains the ribonucleobase U.
- the term "substantially fully complementary” with regard to a sequence refers to the reverse complement of the sequence allowing for both Watson-Crick base pairing and wobble base pairing, whereby G pairs with either C or U, and A pairs with either U or T.
- a sequence may be substantially complementary to the entire length of another sequence, or it may be substantially complementary to a specified portion or length of another sequence.
- U may be present in RNA, and T may be present in DNA. Therefore, a U within an RNA sequence may pair with A or G in either an RNA sequence or a DNA sequence, while an A within either an RNA or DNA sequence may pair with a U in an RNA sequence or a T in a DNA sequence.
- a "wobble" primer is a set of primers where the sequence at mutable positions is equally likely to match the original base or the mutated base .
- a library of Python programs was developed (Example 11) to simulate mutation, sequencing, counting and assembly of distinct templates under the assumptions of error-free sequencing, perfect mapping, and uniformity of mutation sites, mutation rate, sequence coverage, and DNA amplification.
- Our ability to recover template count and assembly depends on the depth of read sampling, typically called "coverage". Coverage usually means the average number of reads overlapping a position in the reference genome; herein, however, coverage means the average read depth over a position per template.
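- as a rough illustration of this per-template definition of coverage, the following sketch converts between a read count and coverage per template. It uses the same relation as the num_reads expression in the simulation code; the function names are hypothetical, and uniform read placement is assumed.

```python
def reads_for_coverage(coverage, template_count, template_length, read_length):
    """Number of reads needed for a given average depth per position per template."""
    return int(coverage * template_count * (float(template_length) / float(read_length)))


def coverage_per_template(num_reads, template_count, template_length, read_length):
    """Average read depth over a position, per template (not per reference position)."""
    return num_reads * read_length / float(template_count * template_length)
```

- for example, 32 templates that are each 10 read lengths long need 3,200 reads to reach 10x coverage per template.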
- the first class of applications focuses on the general problem of determining absolute template count. This is important for determining the copy number of genomic DNA, measuring mRNA expression levels, quantifying allele bias, and detecting somatic mutations.
- the protocol requires mutagenesis prior to amplification. Amplification could be either short- or long-range PCR but must occur before fragmentation, if fragmentation is needed for library preparation. The number of possible mutation patterns should exceed the template number to obtain the most accurate count. Hence, only cases where the absolute template count is below the low thousands were considered, and the other cases are saved for the Discussion.
- the number of possible patterns depends on the number of bits per read, and the probability of observing a given pattern depends also on the flip rate.
- the optimal flip rate for generating distinct mutational patterns is 0.5, wherein every pattern is equally likely. However for a window of at least 20 bits, corresponding to a read length of 80 base pairs, a rate of 0.25 is still virtually perfect for template counts in the thousands. Similar efficacy is obtained at a flip rate of 0.15 for a 30-bit window. Templates numbering in the thousands are adequate for genome copy number determination or single-cell transcript profiling.
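- the dependence of pattern distinctness on flip rate and window size can be checked with a short calculation. The sketch below is not taken from the example code; it assumes that every bit flips independently with the same flip rate, and the function names are hypothetical. Two templates share a bit with probability p^2 + (1-p)^2, so they share an n-bit pattern with that value raised to the power n.

```python
import math


def pair_collision_probability(flip_rate, bits):
    """Probability that two independently mutated templates show the same pattern."""
    per_bit_match = flip_rate ** 2 + (1.0 - flip_rate) ** 2
    return per_bit_match ** bits


def expected_distinct_patterns(flip_rate, bits, template_count):
    """Expected number of distinct patterns observed among template_count templates."""
    total = 0.0
    for flipped in range(bits + 1):
        # probability of one specific pattern with `flipped` converted bits
        pattern_prob = flip_rate ** flipped * (1.0 - flip_rate) ** (bits - flipped)
        if pattern_prob >= 1.0:
            seen = 1.0
        else:
            # 1 - (1 - p)^N, written to stay accurate when p is very small
            seen = -math.expm1(template_count * math.log1p(-pattern_prob))
        total += math.comb(bits, flipped) * seen
    return total
```

- under these assumptions, expected_distinct_patterns(0.5, 20, 1000) is within one pattern of 1,000, and lowering the flip rate to 0.25 reduces the expected count only modestly.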
- example code is provided to simulate performance under a variety of conditions. The recovery of template count was also demonstrated subject to varying depth of coverage for a fixed flip rate of 0.35 ( Figure 3, panel B) .
- a path in this graph represents a possible partial assembly of an initial template pattern. Consequently, determining the minimum number of templates needed to explain all of the reads is achieved by finding the minimum number of paths such that every vertex in the graph is included in at least one path. This is known as the minimum path cover and in general is a nondeterministic polynomial-time hard (NP-hard) problem.
- our graph is not only directed, but also acyclic.
- the minimum number of covering paths is equivalent to the maximum number of elements in an antichain (Dilworth, 1950) .
- the minimum number of templates needed to explain the reads is equal to the maximum number of reads that are pairwise incompatible.
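- for a directed acyclic graph, the minimum number of covering paths can be computed in polynomial time. The sketch below is one standard way to do so and is not necessarily the method used in the example code: it takes the transitive closure (so that paths may share reads), finds a maximum bipartite matching, and returns the number of vertices minus the matching size, which by Dilworth's theorem equals the size of the largest antichain. The function names and the edge-list input format are assumptions made for illustration.

```python
def transitive_closure(vertex_count, edges):
    """For each vertex, the set of vertices reachable along directed edges."""
    successors = [[] for _ in range(vertex_count)]
    for u, v in edges:
        successors[u].append(v)
    reach = [set() for _ in range(vertex_count)]
    for start in range(vertex_count):
        stack = list(successors[start])
        while stack:
            v = stack.pop()
            if v not in reach[start]:
                reach[start].add(v)
                stack.extend(successors[v])
    return reach


def minimum_covering_paths(vertex_count, edges):
    """Minimum number of (possibly overlapping) paths covering every vertex of a DAG."""
    reach = transitive_closure(vertex_count, edges)
    matched_to = [-1] * vertex_count  # matched_to[v] = u means u is paired with successor v

    def try_augment(u, seen):
        # augmenting-path step of the standard bipartite matching algorithm
        for v in reach[u]:
            if v not in seen:
                seen.add(v)
                if matched_to[v] == -1 or try_augment(matched_to[v], seen):
                    matched_to[v] = u
                    return True
        return False

    matching_size = sum(1 for u in range(vertex_count) if try_augment(u, set()))
    return vertex_count - matching_size
```

- for example, reads 0 and 1 that both lead into read 2, which in turn leads into reads 3 and 4, require two paths, matching the two pairwise incompatible reads at either end.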
- the simulations of this section provide guidelines for (i) genome wide copy number determination, (ii) transcript profiling, and (iii) determining allelic ratios.
- for copy number, the ratio of the count for a given locus to the median count over the remainder of the genome was measured.
- for transcript profiling, the proportionate counts of each gene transcript were measured.
- for allelic imbalance, the ratio of counts from templates distinguishable by at least one SNV was measured. In the context of RNA, this also enables observation of biased allele expression resulting from chromosome inactivation, imprinting and the like.
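- the three measurements above reduce to simple ratios of template counts. The following sketch shows the arithmetic only; the function names and the dictionary input format are hypothetical and are not part of the example code.

```python
from statistics import median


def copy_number_ratio(counts, locus):
    """Count at one locus divided by the median count over all other loci."""
    others = [count for name, count in counts.items() if name != locus]
    return counts[locus] / median(others)


def proportionate_counts(counts):
    """Each transcript's share of the total template count."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


def allelic_ratio(count_allele_a, count_allele_b):
    """Ratio of template counts for two alleles distinguished by at least one SNV."""
    return count_allele_a / count_allele_b
```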
- the second class of applications is to correctly assemble reads by their mutation patterns in order to recover the proper end-to-end sequence of nearly identical templates, desired when determining haplotype phasing or enumerating transcript isoforms.
- Long templates, each tagged uniquely at both ends, were considered to simulate the more general task of determining how many initial templates can be correctly assembled from end to end from the mutation pattern alone (Figure 2).
- reads with overlapping mutation signatures were connected to assemble a path from one tag to the other. Whereas in the previous application all compatible edges between reads were allowed, for this problem a subgraph was built with only the "best edges" between overlapping reads.
- a pair of tags is "exactly matched" if there is a path in the subgraph that connects them and neither tag is connected to another tag. Such a path is called an "exact path.” If two tags originate from the same template, they are a “true match.” A "true path” is an exact path for which every read originates from the same initial template .
- Determining performance for the general task provides a lower bound on performance for other applications, because if there is an exact path that is also true, then all sequence information for that template was correctly observed. This includes haplotype phasing in the case of genomic data and transcript structure in the case of RNA profiling. In fact, these two applications are less demanding than the general task because there will only be a few template varieties and each template variety provides additional sequence information for distinguishing them.
- Figure 4 panel B explores the effect of coverage (2x to 14x per template) on recovery of exact matches as a function of template length (2 to 1,024 read lengths), for 32 templates, a 30 bit read length, and a flip rate of 0.35.
- Partial bisulfite mutagenesis was obtained in single-stranded phiX174 genomic DNA using the MethylEasy™ Xceed Rapid DNA Bisulphite Modification Kit (Human Genetic Signatures).
- the full conversion protocol was modified by changing the incubation temperature to 73 degrees Celsius (from 80 degrees Celsius) and the incubation time to 10 minutes (instead of 45 minutes) . Four regions were amplified to measure the conversion rate.
- the sequencing primer (Seq ID p5.mC), wherein the cytosines (C) are methylated, and wherein *T is a phosphorothioated thymine to protect the ends from degradation by exonuclease .
- the sequencing primer (Seq ID p7.mC), wherein the cytosines (C) are methylated, wherein *G is a phosphorothioated guanine to protect the ends from degradation by exonuclease, and wherein 5Phos is a 5'- phosphorylated, deoxyuridine-containing anchor-primer.
- Figure 6 panel A depicts the rate at which each base is observed over all the data for those positions with a coverage of at least 30 reads. Nearly all the C positions are at 40% C and 60% T. For each of the four regions amplified, conversion patterns were compared between reads. Not all regions are equally well covered in the data, with 40-4500 reads.
- Figure 6 panel B depicts the cumulative distribution of the conversion rate per read.
- Figure 6 panel C depicts correlation in flips for all cytosines in the targeted region. It was determined that there was none.
- Figure 6 panel D depicts the data of Figure 6 panel C as a histogram.
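- the quantities summarized in Figure 6 (conversion rate per position, conversion rate per read, and correlation in flips between positions) can be computed from a reads-by-positions matrix of flip calls. The sketch below assumes such a matrix is already available, with 1 for a converted cytosine and 0 for an unconverted one; the function name and the input layout are hypothetical, and this is not the analysis code used to produce the figure.

```python
import numpy as np


def conversion_summary(flips):
    """Summarize a (reads x cytosine positions) matrix of 0/1 flip calls."""
    flips = np.asarray(flips, dtype=float)
    per_position = flips.mean(axis=0)   # fraction of reads converted at each position
    per_read = flips.mean(axis=1)       # fraction of positions converted in each read
    # Pearson correlation between positions; a constant column yields NaN entries
    correlation = np.corrcoef(flips, rowvar=False)
    return per_position, per_read, correlation
```

- with independent flips, the off-diagonal entries of the correlation matrix scatter around zero, which is the pattern reported for panel C.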
- Counting varietal tags can be used to mitigate the effects of amplification bias. While the original message is completely recoverable, the tag is confined to one end of the molecule such that identity and count can only be distinguished within one read length of the ends. Only reads that include the tag are useful in determining count and varietal tags provide no solution for assembly and assortment.
- Described herein are different approaches for counting and assembling templates using template mutagenesis.
- the non-limiting examples herein demonstrate by simulation that template mutagenesis can solve both the problems of counting and assembly.
- the order of operation is mutagenesis first, followed by short- or long-range PCR, then fragmentation, if needed, and preparation of sequence libraries.
- Two classes of applications were explored. The first is counting specific DNA or RNA molecules, for assessing genome copy number or profiling a transcriptome ( Figure 1) .
- the second is sequence assembly - for example establishing haplotypes or distinguishing transcript isoforms ( Figure 2) .
- each mutable position (or "bit") converts (or "flips") independently from a wild-type state to an altered state with a fixed probability (or "flip rate”) .
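- the bit-flip model just described is straightforward to simulate. The following minimal sketch is written for illustration and is not the example code of the disclosure: it draws independent flips at a fixed flip rate and counts how many distinct patterns result. The function names and the use of a seeded NumPy generator are assumptions.

```python
import numpy as np


def mutate_templates(template_count, bit_length, flip_rate, seed=0):
    """Boolean array (templates x bits); True marks a bit that flipped."""
    rng = np.random.default_rng(seed)
    return rng.random((template_count, bit_length)) < flip_rate


def count_distinct_patterns(patterns):
    """Number of different mutation patterns among the templates."""
    as_bytes = np.asarray(patterns, dtype=np.uint8)
    return len({row.tobytes() for row in as_bytes})
```

- with complete coverage of a single window, the number of distinct patterns is the template count that would be recovered by counting.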
- Performance was simulated under a variety of reasonable parameters for read length and mutation rate, and over a range of template lengths and counts. The results are first presented under an assumption of complete coverage to obtain a theoretical upper limit of performance, and the consequences of sampling to various levels of coverage are then considered. In the simulations, mutable positions are distributed uniformly throughout the template such that each read contains the same number of bits (or "bit length"). Sequence and mapping errors are not presently incorporated. Variations to these assumptions and procedures are addressed herein.
- RNA templates can be counted with near perfect accuracy for mRNA species of intermediate to scarce expression (<1,000 copies per cell).
- varietal tagging can achieve accurate counting of gene transcripts, even long and abundant ones, but it is limited to labeling the end of a molecule and so does not allow counting of isoforms or observing sequence variants, except near the ends of transcripts.
- the two methods, varietal tagging and mutagenesis can be seamlessly integrated, achieving the benefits of both methods.
- the ability to establish phase by this method depends on strong concordance between the haplotypes and the reference genome. For those regions where the reference genome is a poor match, due to repeat content, large-scale rearrangements or novel sequence, the mutation pattern assembly algorithm will fail to generate consistent end-to-end assemblages. Although this presents a problem for direct inference by reference-matched phasing, it provides an opportunity for de novo haplotype assembly.
- the SUTTA algorithm (Narzisi and Mishra, 2011) assembles haplotypes from short-read data by scoring proposed local assemblies based on orthogonal data sources, such as coverage, mate pairs, or physical maps. Template mutagenesis can help. Each local reference genome that SUTTA considers can also be assigned a score based on the number of successful end-to-end mutation pattern assemblies over the region. The result would be a de novo assembly over the human genome for those difficult regions.
- perfect mapping was assumed in the simulations. In practice, however, the ability to map reads might be somewhat degraded by template mutation.
- a standard practice is to map to a reduced alphabet where all cytosines ("C"s) are converted to thymines ("T"s) in both the read and the reference, with two distinct reference genomes, one for each DNA strand (Krueger et al., 2011; Otto et al., 2012).
- restricting to a smaller alphabet and doubling the reference genomes impacts the ability to unambiguously map reads; however, the effect is surprisingly mild (Krueger et al., 2012).
- the mapping algorithm can be augmented with a probabilistic model of the flip rate to prioritize the most likely alignments.
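- one simple form such a probabilistic model could take is a per-position log-likelihood for an ungapped alignment. The sketch below is an assumption made for illustration, not the model of the disclosure: it handles only reads from the C-to-T converted strand, treats each reference cytosine as converted with probability flip_rate, and spreads a uniform sequencing error rate over the remaining bases. The function name and the default error rate are hypothetical.

```python
import math


def alignment_log_likelihood(read, reference, flip_rate, error_rate=0.01):
    """Log-likelihood of an ungapped read given the aligned reference segment."""
    if len(read) != len(reference):
        raise ValueError("read and reference segment must have equal length")
    log_likelihood = 0.0
    for read_base, ref_base in zip(read.upper(), reference.upper()):
        if ref_base == "C":
            if read_base == "C":
                prob = (1.0 - flip_rate) * (1.0 - error_rate)   # unconverted cytosine
            elif read_base == "T":
                prob = flip_rate * (1.0 - error_rate)           # converted cytosine
            else:
                prob = error_rate / 2.0                         # sequencing error
        else:
            prob = (1.0 - error_rate) if read_base == ref_base else error_rate / 3.0
        log_likelihood += math.log(prob)
    return log_likelihood
```

- candidate alignments of the same read can then be ranked by this score, so that a location explaining a T as a converted C is preferred over one that requires a sequencing error.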
- sequence error may reduce the ability to recover mutation patterns in those cases where the error appears to flip a bit or reverse a flipped bit. Fortunately, sequence error is typically rare. Within a reasonable range for flip rate, window size, and template count, sparse mutational patterns are expected, well separated so that no two patterns are very much alike. Sequence error will produce a pattern "nearby" an established pattern, and less well covered, and this signature can be used to discount those reads.
- the simulations demonstrate that most applications work best for a low initial template count, less than a few thousand. This is not a problem for many genomic applications and is close to ideal for single-cell RNA analysis. If analysis of greater numbers of template molecules is desired, for example during analysis of bulk mRNA, then after mutagenesis of the first strand cDNA, multiple separate amplification reactions can be performed, each with low template count. The products of each reaction can be tagged with barcodes, pooled and sequenced.
- the description of the method is very similar to that established for the phiX samples discussed above.
- a study was performed on a PstI digest of a human genome.
- Genomic DNA is digested with a restriction enzyme (PstI) . Fragments are end-repaired, adenylated, and then ligated to bisulfite resistant sequencing adapters. These adapters match the standard Illumina adapters, save that the cytosines are replaced by methyl-cytosines .
- the sample is then treated with a standard kit for bisulfite treatment (MethylEasy Xceed Rapid DNA Bisulphite Modification Kit Mix; Human Genetic Signatures). Instead of the standard incubation at 80°C for 45 minutes, incubations of 3, 6, and 9 minutes at 73°C were tested. One library using the standard 80°C and 45 minutes was also generated. The samples were sub-sampled, amplified and sequenced.
- the resulting reads were mapped to the genome using an informatics pipeline designed for bisulfite sequence data.
- the converted read-pairs are then mapped twice, once to a genome where every C is converted to a T and once to a genome where every G is converted to A.
- the best mapping was assigned to the original read-pair and the mapped genome recorded.
- Reads that map to the AGT-genome are called "original top” or "OT” and are templates derived from the top strand of the initial restriction fragment.
- the reads that map to the ACT genome are called "original bottom" or "OB." The analysis focused on 135 thousand fragments with high quality alignments in the range of 150 to 400 base pairs.
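- the two-genome mapping scheme can be illustrated with a toy example. In the sketch below, exact substring search stands in for a real aligner, which is an assumption made purely for illustration, and the function names are hypothetical. The read and the genome are both reduced to the AGT alphabet (every C to T) and to the ACT alphabet (every G to A), and the strand label follows from which reduced genome the read matches.

```python
def convert_c_to_t(sequence):
    """Reduce a sequence to the A, G, T alphabet."""
    return sequence.upper().replace("C", "T")


def convert_g_to_a(sequence):
    """Reduce a sequence to the A, C, T alphabet."""
    return sequence.upper().replace("G", "A")


def map_to_converted_genomes(read, genome):
    """Positions of the read in each converted genome (-1 when not found).

    "OT" is the hit in the C-to-T genome and "OB" the hit in the G-to-A genome.
    """
    return {
        "OT": convert_c_to_t(genome).find(convert_c_to_t(read)),
        "OB": convert_g_to_a(genome).find(convert_g_to_a(read)),
    }
```

- a partially converted read such as ATGTCTGA, derived from the genomic segment ACGTCCGA, is found only in the C-to-T genome and is therefore labeled original top.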
- Each restriction fragment/strand provides an opportunity to observe multiple reads derived from the same initial template.
- with error-free data, one need only cluster reads that have precisely the same pattern.
- a robust method was developed for joining reads derived from the same initial template.
- Information was extracted from all convertible positions, and reads were then clustered using a multi-scale clustering algorithm that works on pair-wise Hamming distances [arXiv:1506.03072 (the clustering method devised is available at arxiv.org/abs/1506.03072)].
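- to make the role of pair-wise Hamming distances concrete, the sketch below groups reads whose conversion patterns differ at no more than a chosen number of positions. It is a plain single-linkage clustering with a fixed threshold, used here only as a simplified stand-in; it is not the multi-scale algorithm cited above, and the function names and the max_distance parameter are hypothetical.

```python
import numpy as np


def hamming_matrix(patterns):
    """Pair-wise Hamming distances between rows of a 0/1 pattern matrix."""
    p = np.asarray(patterns, dtype=bool)
    return (p[:, None, :] != p[None, :, :]).sum(axis=2)


def cluster_reads(patterns, max_distance):
    """Single-linkage cluster labels for reads within max_distance of each other."""
    distances = hamming_matrix(patterns)
    read_count = distances.shape[0]
    parent = list(range(read_count))

    def find(i):
        # union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(read_count):
        for j in range(i + 1, read_count):
            if distances[i, j] <= max_distance:
                parent[find(j)] = find(i)

    roots = [find(i) for i in range(read_count)]
    labels = {root: k for k, root in enumerate(dict.fromkeys(roots))}
    return [labels[root] for root in roots]
```

- with max_distance set to 0 this reduces to grouping reads with precisely the same pattern, which is the error-free case.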
- An example of clustered reads at a single restriction fragment for the original top strand is shown in Figure 8.
- the mutational protocol can be applied to cDNA as well. While these data are less well-studied, the preliminary results are very promising. Taking whole RNA derived from cell lines, the mRNA was reverse transcribed with poly-T and template switch oligo primers that are resistant to bisulfite mutation (methyl-cytosines substituted for cytosines). The resulting first strand cDNA was mutagenized with the muSeq protocol for 6 minutes at 73°C. The mutated strands were then sub-sampled, amplified, sonicated, repaired, ligated to sequencing adapters, amplified and sequenced.
- the resulting reads were then converted two ways (read 1 C→T, read 2 G→A; and read 1 G→A, read 2 C→T) and mapped to two versions of the human genome using the STAR mapper (Dobin et al., STAR: ultrafast universal RNA-seq aligner, Bioinformatics, 2012), much as described above. The best of the four mappings was selected and assigned to the original read. Plots showing stacks of reads in the IGV viewer are shown in Figures 13 and 14.
- RNA sequence can be directly mutagenized before reverse transcription.
- a sample is obtained from a subject afflicted with cancer.
- the sample is subjected to a chemical mutagenesis as described herein.
- the mutagenized sample is sequenced, aligned, mapped, and counted as described herein.
- the presence of tumor cells in the sample is determined. Also, quantification of tumor cells in the sample is determined. Also, one or more rare mutations in the sample are determined. Also, one or more single nucleotide polymorphisms in the sample are determined. Also, benign and malignantly transformed cells are distinguished.
- Example 7B Detecting a small load of cancer DNA in the presence of an excess of normal DNA
- a sample is obtained from a subject afflicted with cancer.
- the sample is subjected to a chemical mutagenesis as described herein.
- the sample is further subjected to beads and/or varietal tags.
- the mutagenized sample is sequenced, aligned, mapped, and counted as described herein.
- the presence of tumor cells in the sample is determined. Also, quantification of tumor cells in the sample is determined. Also, one or more rare mutations in the sample are determined. Also, one or more single nucleotide polymorphisms in the sample are determined. Also, benign and malignantly transformed cells are distinguished.
- a sample is obtained from a pregnant female.
- the sample is subjected to a chemical mutagenesis as described herein.
- the mutagenized sample is sequenced, aligned, mapped, and counted as described herein .
- one or more rare mutations in a fetus are determined.
- one or more single nucleotide polymorphisms in a fetus are determined.
- one or more chromosomal abnormalities in a fetus are determined.
- The error reduction described above in Example 6 is used in conjunction with the beads and/or varietal tags to obtain sequence counts for rare variants.
- a group of RNA transcripts in one cell or a population of cells is obtained.
- the group of RNA transcripts is subjected to a chemical mutagenesis and sequenced as described herein.
- An assembly algorithm is applied to the sequences of the group of RNA transcripts.
- the assembly algorithm may be SOAPdenovo-Trans, Velvet/Oases, Trans-ABySS, or Trinity transcriptome assemblers.
- a transcriptome is obtained without mapping to a reference genome.
- a group of NAMs is obtained from genomes from a large variety of organisms, wherein some of the organisms may be highly related.
- the group of NAMs is subjected to a chemical mutagenesis and fragmentation as described herein.
- the group of mutagenized fragments of NAMs is sequenced.
- the sequencing may be metagenome sequencing, shotgun sequencing, or high-throughput sequencing.
- an assembly algorithm is applied to the sequences of the group of mutagenized fragments of NAMs.
- the assembly algorithm may be Phrap, Celera, or Velvet/Oases assemblers. Independent genomes are assembled.
- seed hash ("This is not a random seed.")
- templates_binary = np.array([toBinary(x) for x in templates])
- mean_unique = np.mean(unique_counts[ind])
- info = [read_length, flip_rate, template_count, coverage, mean_unique]
- fixed_window_finite_coverage Performs simulations testing the recovery of template count over a fixed window for a range of flip_rates, read_lengths , and template_counts
- seed hash ("This is not a random seed.")
- templates_binary = np.array([toBinary(x) for x in templates])
- seed hash ("This is not a random seed.")
- template_length = 2*read_length + span_length ## length of template
- left_edge = read_length ## left edge of span (first read position to not contain left mark)
- templates = getRandomPatterns(template_count,
- Apos = read_length - 1
- word_space = getWordSpace(match)
- num_reads = int(coverage * template_count * (float(template_length) / float(read_length)))
- read_template = np.random.randint(0, template_count, num_reads)
- read_start = np.random.randint(0, max_start, num_reads)
- word space assigns ambiguous reads to their least template
- read_index [word_space [x] for x in zip (read_template, read_start) ]
- read_tracker defaultdict (set)
- edge_in, edge_out, inscore greedyAssembly (read_start, read_index, match, one_count, score_table, read_length,
- true_exact_match false_exact_match
- many_windows_finite_coverage_counting: Performs simulations testing the recovery of template count over a template of many read lengths for a fixed flip_rate and a range of read_lengths and template_counts.
- This provides a speed advantage for long templates.
- seed = hash("This is not a seed.")
- template_length = read_length*template_factor
- templates = getRandomPatterns(template_count,
- word_space = getWordSpace(match)
- read_template = np.random.randint(0,
- read_start = np.random.randint(0, max_start, num_reads)
- read_index = [word_space[x] for x in zip(read_template, read_start)]
- seed = hash("This is not a random seed.")
- template_length = read_length*template_factor
- templates = getRandomPatterns(template_count, template_length, flip_rate)
- overlap_count = np.array([np.sum(np.equal.outer(x, x), 0) for x in overlap.T]).
- def templateToWindows(template, window_size):
- def templatesToWindows(templates, window_size):
- '''convert a set of template patterns into their sequence of window_size subwords'''
- T1 and T2 match at every position in the window P:(P+W)
- mismatch_pos[i] = np.logical_xor(templates[i], templates)
- Each position converts from template index to word index. If each word is unique at a position, then the two indices are identical; otherwise, each word is assigned its lowest-index template.
- template_count = match.shape[0] ## number of templates
- read_length = match.shape[3] - 1 ## length of read
- max_start = match.shape[2] - read_length + 1 ## maximum read start position
- word_space[t, pos] = np.argmax(x)
- one_count[T, P, W] returns the number of flipped positions in the window P:(P+W) in template T.
- one_count[:, :toEnd, k+1] = one_count[:, :toEnd, k] + templates[:, k:]
- This function generates a lookup table that keeps the values for: length of overlap, number of ones for a fixed flip_rate.
- the primary DAG has vertices for each read
- the complete DAG has the same vertices
- overlap_match = np.equal.outer(overlap[:, position+1], overlap[:, position+1])
- source = (position, read[x, position]) ##
- Y is a simple node if
- overlap_match = np.equal.outer(overlap[:, position+1], overlap[:, position+1])
- source = (position, read[x, position]) ##
- node_list = sorted(out_edge.keys())[::-1]
- the primary DAG has vertices for each read
- max_start = read.shape[1] - 2 ## iterate backwards from the next-to-last position to the first
- for position in range(max_start, -1, -1):
- overlap_match = np.equal.outer(overlap[:, position+1], overlap[:, position+1])
- source = (position, read[x, position]) ##
- def overlapMatch(read_start, read_index, match, read_length, template_length):
- index_to_start = np.array([bisect.bisect_left(read_start, x) for x in np.arange(template_length)] + [num_nodes])
- index_to_start[a_pos:(a_pos + 2)]
- a_template = read_index[low:high]
- overlap_length = read_length - b_pos + a_pos
- def pathScore(edge_in, edge_out, read_start, read_index, left_edge, right_edge, read_tracker):
- start_marks = [set() for _ in range(num_nodes)]
- start_marks[target].update(start_marks[source])
- Some nodes may be referenced by multiple reads from different templates if the mutation patterns are degenerate for more than a read length.
- has_true_path = hasTruePath(tindex, pstart, pend)
- the score for an edge is the log likelihood of an accidental overlap
- index_to_start = np.array([bisect.bisect_left(read_start, x) for x in np.arange(template_length)] + [num_nodes])
- outedge_heaps = [[] for _ in range(num_nodes)]
- index_to_start[a_pos:(a_pos + 2)]
- a_template = read_index[low:high]
- overlap_length = read_length - b_pos + a_pos
- score = score_table[bits, flipped]
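Two of the lookup tables described in the fragments above can be illustrated with a short sketch: the table giving the number of flipped positions in each window of a template, and the mapping that assigns each read-length word to its lowest-index template. The code below is not the listing from the application. The function names `window_one_counts` and `lowest_matching_template` are hypothetical, the templates are assumed to be 0/1 arrays of equal length, and a prefix-sum construction is used in place of the incremental update shown in the fragments.

```python
import numpy as np


def window_one_counts(templates):
    """Return one_count[t, p, w]: number of flipped (1) positions in templates[t, p:p+w].

    Hypothetical helper; entries for windows that would run past the end of
    the template are left at 0.
    """
    templates = np.asarray(templates, dtype=np.int64)
    n_templates, length = templates.shape
    # prefix[t, i] holds the number of ones in templates[t, :i]
    prefix = np.zeros((n_templates, length + 1), dtype=np.int64)
    prefix[:, 1:] = np.cumsum(templates, axis=1)
    one_count = np.zeros((n_templates, length, length + 1), dtype=np.int64)
    for p in range(length):
        max_w = length - p  # widest window that still fits when starting at p
        one_count[:, p, : max_w + 1] = prefix[:, p : p + max_w + 1] - prefix[:, p : p + 1]
    return one_count


def lowest_matching_template(templates, read_length):
    """Return word_space[t, pos]: lowest template index whose word at pos equals template t's.

    Hypothetical helper; a "word" is the read_length window starting at pos.
    """
    templates = np.asarray(templates)
    n_templates, length = templates.shape
    n_starts = length - read_length + 1
    word_space = np.zeros((n_templates, n_starts), dtype=np.int64)
    for pos in range(n_starts):
        words = templates[:, pos : pos + read_length]
        # same[i, j] is True when templates i and j carry an identical word at pos
        same = (words[:, None, :] == words[None, :, :]).all(axis=2)
        # argmax returns the first True in each row, i.e. the lowest matching index
        word_space[:, pos] = np.argmax(same, axis=1)
    return word_space
```

With three toy templates `[[0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 1, 1]]` and a read length of 2, all three templates share the word at position 0, so every entry in that column of `word_space` is 0. At position 2 the words all differ and each template keeps its own index.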
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Genetics & Genomics (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- General Health & Medical Sciences (AREA)
- Microbiology (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Biomedical Technology (AREA)
- Immunology (AREA)
- Analytical Chemistry (AREA)
- Plant Pathology (AREA)
- Crystallography & Structural Chemistry (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Bioinformatics & Computational Biology (AREA)
- Ecology (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Priority Applications (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP15849374.2A EP3204521B1 (en) | 2014-10-10 | 2015-10-09 | Random nucleotide mutation for nucleotide template counting and assembly |
AU2015330685A AU2015330685B2 (en) | 2014-10-10 | 2015-10-09 | Random nucleotide mutation for nucleotide template counting and assembly |
US15/515,913 US11008606B2 (en) | 2014-10-10 | 2015-10-09 | Random nucleotide mutation for nucleotide template counting and assembly |
EP21176777.7A EP3957742A1 (en) | 2014-10-10 | 2015-10-09 | Random nucleotide mutation for nucleotide template counting and assembly |
CA2964169A CA2964169C (en) | 2014-10-10 | 2015-10-09 | Random nucleotide mutation for nucleotide template counting and assembly |
IL251509A IL251509B (en) | 2014-10-10 | 2017-04-02 | Random nucleotide mutation for nucleotide template counting and assembly |
US17/320,634 US20210340604A1 (en) | 2014-10-10 | 2021-05-14 | Random nucleotide mutation for nucleotide template counting and assembly |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201462062571P | 2014-10-10 | 2014-10-10 | |
US62/062,571 | 2014-10-10 |
Related Child Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/515,913 A-371-Of-International US11008606B2 (en) | 2014-10-10 | 2015-10-09 | Random nucleotide mutation for nucleotide template counting and assembly |
US17/320,634 Continuation US20210340604A1 (en) | 2014-10-10 | 2021-05-14 | Random nucleotide mutation for nucleotide template counting and assembly |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016057947A1 (en) | 2016-04-14 |
Family
ID=55653867
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2015/054981 WO2016057947A1 (en) | 2014-10-10 | 2015-10-09 | Random nucleotide mutation for nucleotide template counting and assembly |
Country Status (6)
Country | Link |
---|---|
US (2) | US11008606B2 (en) |
EP (2) | EP3204521B1 (en) |
AU (1) | AU2015330685B2 (en) |
CA (1) | CA2964169C (en) |
IL (1) | IL251509B (en) |
WO (1) | WO2016057947A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020035669A1 (en) * | 2018-08-13 | 2020-02-20 | Longas Technologies Pty Ltd | Sequencing algorithm |
US11421238B2 (en) | 2018-02-20 | 2022-08-23 | Longas Technologies Pty Ltd | Method for introducing mutations |
WO2023039509A1 (en) * | 2021-09-10 | 2023-03-16 | Cold Spring Harbor Laboratory | Method of measuring microsatellite length variations |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020087076A1 (en) * | 2018-10-26 | 2020-04-30 | The Board Of Trustees Of The Leland Stanford Junior University | Methods and uses of introducing mutations into genetic material for genome assembly |
GB202111184D0 (en) * | 2021-08-03 | 2021-09-15 | Hendriks Gerardus Johannes | Methods |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004042078A1 (en) * | 2002-11-05 | 2004-05-21 | The University Of Queensland | Nucleotide sequence analysis by quantification of mutagenesis |
US20050266453A1 (en) * | 2002-11-01 | 2005-12-01 | Gregory Coia | Mutagenesis methods using ribavirin and/or RNA replicases |
US20090047680A1 (en) * | 2007-08-15 | 2009-02-19 | Si Lok | Methods and compositions for high-throughput bisulphite dna-sequencing and utilities |
WO2013177086A1 (en) * | 2012-05-21 | 2013-11-28 | Sequenom, Inc. | Methods and processes for non-invasive assessment of genetic variations |
US20130338043A1 (en) * | 2012-06-12 | 2013-12-19 | The Johns Hopkins University | Efficient, Expansive, User-Defined DNA Mutagenesis |
US20140065609A1 (en) * | 2010-10-22 | 2014-03-06 | James Hicks | Varietal counting of nucleic acids for obtaining genomic copy number information |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5866344A (en) * | 1991-11-15 | 1999-02-02 | Board Of Regents, The University Of Texas System | Antibody selection methods using cell surface expressed libraries |
AU2002219818B2 (en) * | 2000-11-20 | 2007-08-16 | Cargill, Incorporated | 3-hydroxypropionic acid and other organic compounds |
US20070154892A1 (en) * | 2005-12-30 | 2007-07-05 | Simon Wain-Hobson | Differential amplification of mutant nucleic acids by PCR in a mixure of nucleic acids |
EP2351858B1 (en) | 2006-02-28 | 2014-12-31 | University of Louisville Research Foundation | Detecting fetal chromosomal abnormalities using tandem single nucleotide polymorphisms |
US20100184044A1 (en) | 2006-02-28 | 2010-07-22 | University Of Louisville Research Foundation | Detecting Genetic Abnormalities |
CN103620055A (en) | 2010-12-07 | 2014-03-05 | 利兰·斯坦福青年大学托管委员会 | Non-invasive determination of fetal inheritance of parental haplotypes at the genome-wide scale |
US9725765B2 (en) | 2011-09-09 | 2017-08-08 | The Board Of Trustees Of The Leland Stanford Junior University | Methods for obtaining a sequence |
US20140377762A1 (en) | 2011-12-19 | 2014-12-25 | 360 Genomics Ltd. | Method for enriching and detection of variant target nucleic acids |
US9977861B2 (en) * | 2012-07-18 | 2018-05-22 | Illumina Cambridge Limited | Methods and systems for determining haplotypes and phasing of haplotypes |
US20140066317A1 (en) | 2012-09-04 | 2014-03-06 | Guardant Health, Inc. | Systems and methods to detect rare mutations and copy number variation |
US10253325B2 (en) * | 2012-12-19 | 2019-04-09 | Boston Medical Center Corporation | Methods for elevating fat/oil content in plants |
ES2815684T3 (en) * | 2015-06-10 | 2021-03-30 | Biocartis N V | Improved detection of methylated DNA |
2015
- 2015-10-09: US application US15/515,913, published as US11008606B2 (status: Active)
- 2015-10-09: CA application CA2964169A, published as CA2964169C (status: Active)
- 2015-10-09: EP application EP15849374.2A, published as EP3204521B1 (status: Active)
- 2015-10-09: AU application AU2015330685A, published as AU2015330685B2 (status: Active)
- 2015-10-09: WO application PCT/US2015/054981, published as WO2016057947A1 (status: Application Filing)
- 2015-10-09: EP application EP21176777.7A, published as EP3957742A1 (status: Pending)

2017
- 2017-04-02: IL application IL251509A, published as IL251509B (status: IP Right Grant)

2021
- 2021-05-14: US application US17/320,634, published as US20210340604A1 (status: Pending)
Non-Patent Citations (1)
Title |
---|
LEVY ET AL.: "Facilitated sequence counting and assembly by template mutagenesis", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, vol. 111, no. 43, 13 October 2014 (2014-10-13), pages E4632 - E4637, XP055428445 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11421238B2 (en) | 2018-02-20 | 2022-08-23 | Longas Technologies Pty Ltd | Method for introducing mutations |
WO2020035669A1 (en) * | 2018-08-13 | 2020-02-20 | Longas Technologies Pty Ltd | Sequencing algorithm |
US20210174905A1 (en) * | 2018-08-13 | 2021-06-10 | Longas Technologies Pty Ltd. | Sequencing Algorithm |
CN113015813A (en) * | 2018-08-13 | 2021-06-22 | 朗斯科技有限公司 | Sequencing algorithm |
EP4293123A3 (en) * | 2018-08-13 | 2024-01-17 | Illumina Singapore PTE. Ltd. | Sequencing algorithm |
WO2023039509A1 (en) * | 2021-09-10 | 2023-03-16 | Cold Spring Harbor Laboratory | Method of measuring microsatellite length variations |
Also Published As
Publication number | Publication date |
---|---|
US11008606B2 (en) | 2021-05-18 |
CA2964169A1 (en) | 2016-04-14 |
US20210340604A1 (en) | 2021-11-04 |
AU2015330685A1 (en) | 2017-04-20 |
US20170306392A1 (en) | 2017-10-26 |
IL251509A0 (en) | 2017-05-29 |
CA2964169C (en) | 2023-09-19 |
EP3204521A1 (en) | 2017-08-16 |
EP3957742A1 (en) | 2022-02-23 |
EP3204521B1 (en) | 2021-06-02 |
AU2015330685B2 (en) | 2022-02-17 |
IL251509B (en) | 2021-04-29 |
EP3204521A4 (en) | 2018-03-21 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US11814678B2 (en) | Universal short adapters for indexing of polynucleotide samples | |
US11788139B2 (en) | Optimal index sequences for multiplex massively parallel sequencing | |
US20210340604A1 (en) | Random nucleotide mutation for nucleotide template counting and assembly | |
CN108431233B (en) | Efficient construction of DNA libraries | |
AU2018331434A1 (en) | Universal short adapters with variable length non-random unique molecular identifiers | |
JP7051677B2 (en) | High Molecular Weight DNA Sample Tracking Tag for Next Generation Sequencing | |
US11608518B2 (en) | Methods for analyzing nucleic acids | |
ES2965194T3 (en) | Sequencing algorithm | |
US20220364080A1 (en) | Methods for dna library generation to facilitate the detection and reporting of low frequency variants | |
JP2023531720A (en) | Methods and compositions for analyzing nucleic acids | |
Wang et al. | High coverage of single cell genomes by T7-assisted enzymatic methyl-sequencing | |
WO2023212223A1 (en) | Single cell multiomics | |
Wei | Single Cell Phylogenetic Fate Mapping: Combining Microsatellite and Methylation Sequencing for Retrospective Lineage Tracing | |
Bellos | Statistical methods for elucidating copy number variation in high-throughput sequencing studies |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 15849374; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 251509; Country of ref document: IL |
| | ENP | Entry into the national phase | Ref document number: 2964169; Country of ref document: CA |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2015330685; Country of ref document: AU; Date of ref document: 20151009; Kind code of ref document: A |
| | REEP | Request for entry into the european phase | Ref document number: 2015849374; Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: 2015849374; Country of ref document: EP |