WO2019084306A1 - Method for genome complexity reduction and polymorphism detection - Google Patents

Method for genome complexity reduction and polymorphism detection

Info

Publication number
WO2019084306A1
Authority
WO
WIPO (PCT)
Prior art keywords
endonuclease
degenerate
crispr
crrnas
pool
Prior art date
Application number
PCT/US2018/057568
Other languages
French (fr)
Inventor
Matias Kirst
Christopher DERVINIS
Marcio F. RESENDE
Original Assignee
University Of Florida Research Foundation, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University Of Florida Research Foundation, Inc.
Publication of WO2019084306A1

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12PFERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00Preparation of compounds containing saccharide radicals
    • C12P19/26Preparation of nitrogen-containing carbohydrates
    • C12P19/28N-glycosides
    • C12P19/30Nucleotides
    • C12P19/34Polynucleotides, e.g. nucleic acids, oligoribonucleotides
    • CCHEMISTRY; METALLURGY
    • C40COMBINATORIAL TECHNOLOGY
    • C40BCOMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B20/00Methods specially adapted for identifying library members
    • CCHEMISTRY; METALLURGY
    • C40COMBINATORIAL TECHNOLOGY
    • C40BCOMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00Libraries per se, e.g. arrays, mixtures
    • C40B40/04Libraries containing only organic compounds
    • C40B40/06Libraries containing nucleotides or polynucleotides, or derivatives thereof

Definitions

  • the present disclosure relates generally to the fields of molecular biology and genetics. More particularly, the present disclosure relates to DNA sequencing and genotyping.
  • Direct DNA sequencing of specific, targeted regions of genomes offers an alternative to genotyping by sequencing entire genomes.
  • the target regions that represent a reduced representation of the genome need to be first obtained, to then be sequenced so that DNA polymorphisms can be detected and genotyped based on the presence of variants in the sequencing reads.
  • the present disclosure provides methods to produce a reduced representation of a genome for sequencing, genotyping and DNA polymorphism detection.
  • crRNAs CRISPR RNAs
  • the degenerate crRNAs combine a specified sequence region with a degenerate sequence region to provide greater flexibility in RNA-guided DNA endonuclease digestion of double stranded DNA.
  • the degenerate crRNAs comprise single strand RNA oligonucleotides 15-30 nucleobases in length having a 3' specific sequence region 1-20 nucleobases in length and a 5' degenerate sequence region 1-29 nucleobases in length.
  • the degenerate crRNAs are provided in a pool or library of oligonucleotides wherein each individual oligonucleotide in the pool contains the same specific nucleotide sequence (specific sequence) in the specific sequence region.
  • Oligonucleotides in the pool will contain different nucleotide sequences (degenerate sequence) in the degenerate sequence region of the crRNA. Because the degenerate sequences encompass all, nearly all, or a substantial fraction of the possible nucleotide sequences, the frequency of digestion of the DNA in a sample is determined by the specific sequence present in each individual oligo in the pool of degenerate crRNAs. In some embodiments, the degenerate sequence contains one or more universal bases.
  • Digestion of double stranded DNA with an RNA-guided DNA endonuclease guided by degenerate crRNAs yields DNA fragments based on the frequency of occurrence of a target sequence in the double stranded DNA complementary to the degenerate crRNA specific sequence or the degenerate crRNA specific sequence in combination with a PAM sequence.
  • degenerate crRNAs are linked to tracrRNAs to form degenerate guide RNAs.
  • degenerate guide RNAs facilitate digestion with some RNA-guided DNA endonucleases.
  • adaptors are linked to the DNA fragments.
  • An adaptor is an oligomer that can be ligated to the DNA fragments produced from digestion of the double stranded DNA.
  • Adaptors can be used to facilitate genetic analyses, including but not limited to, amplification and/or sequencing of the DNA fragments using primers that hybridize to an adaptor nucleotide sequence.
  • the described degenerate crRNAs, degenerate guide RNAs and adaptors can be used to digest double stranded DNA, such as genomic DNA. Digesting DNA using RNA-guided DNA endonucleases can be used to reduce genome complexity (generate a reduced representation of a genome) and/or analyze genomic DNA. Ligating adaptors to DNA fragments produced by digestion with RNA-guided DNA endonucleases can be used to facilitate genetic analyses. Such analyses include, but are not limited to, amplification and sequencing.
  • a double stranded DNA sample is digested using one or more pools of degenerate crRNAs and one or more RNA-guided DNA endonucleases. In some embodiments, two or more pools of degenerate crRNAs with different specific sequences are used.
  • the DNA fragments are optionally ligated to adaptors. In some embodiments, the DNA fragments, with or without attached adaptors, are amplified and/or sequenced. The described methods are useful in genome and genetic analysis.
  • Described are methods for producing a reduced representation of a genome comprising the steps of digesting the genome with a class 2 CRISPR-Cas endonuclease guided by a pool of degenerate crRNAs to form a first nucleic acid product.
  • a first adaptor is ligated to a first end of the first nucleic acid product and a second adaptor is ligated to a second end of the first nucleic acid product to produce a second nucleic acid product.
  • the second nucleic acid product is then amplified. Any method in the art for DNA amplification can be used.
  • amplifying comprises PCR.
  • amplifying the second nucleic acid product by PCR comprises using a first PCR primer that hybridizes to the first adaptor and a second PCR primer that hybridizes to the second adaptor to produce a third nucleic acid product.
  • the third nucleic acid product represents a reduced representation of the genome.
  • the class 2 CRISPR-Cas endonuclease is a type V-A class 2 CRISPR-Cas endonuclease, for example Cas12a.
  • FIG. 1 Exemplary crRNA for a class 2 CRISPR-Cas endonuclease type V-A having a defined nucleotide sequence.
  • SEQ ID NO:1 amino acid sequence having a defined nucleotide sequence.
  • FIG. 2. A. Guide RNA.
  • B. Formula for a pool of degenerate guide RNAs, wherein N1-29 represents 1-29 degenerate nucleotides, X1-20 represents a specific sequence of 1-20 nucleotides, and linker represents a linker, and X' n - m represents a tracrRNA 70-89 nucleotides in length.
  • FIG. 3. A. Formula for a pool of 20mer degenerate crRNAs having degenerate sequences 15 nucleobases in length and specific sequences 5 nucleobases in length, wherein N represents a degenerate nucleotide and X represents a specific nucleotide.
  • N represents a degenerate nucleotide and NNNNNNNNNNNNNNN comprises a degenerate sequence.
  • FIG. 4. A. Generic formula for a pool of degenerate crRNAs having degenerate sequence N n and specific sequence X x , wherein N n represents degenerate sequence n nucleotides in length, X x represents a specific sequence x nucleotides in length, n is an integer from 1 to 20, x is an integer from 1 to 29, and the value of n+x is 15-30.
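The N/X notation used in these figure descriptions can be illustrated with a short sketch. The function name `degenerate_formula` and the default total length of 20 are assumptions made for illustration, not part of the disclosure; the sketch only checks the 15-30 total length stated for the pool and places the degenerate positions 5' of the specific sequence.

```python
def degenerate_formula(specific: str, total_length: int = 20) -> str:
    """Return a pool formula: degenerate N positions (5') followed by the specific sequence (3').

    Hypothetical helper for illustration only.
    """
    if not 15 <= total_length <= 30:
        raise ValueError("total length must be 15-30 nucleobases")
    if not specific or set(specific) - set("GACU"):
        raise ValueError("specific sequence must be non-empty and use only G, A, C, U")
    n_degenerate = total_length - len(specific)
    if n_degenerate < 1:
        raise ValueError("at least one degenerate position is required")
    return "N" * n_degenerate + specific


# Example 20mer formulas for specific sequences of 1 to 5 bases.
for s in ("A", "CU", "UGA", "AAAA", "GACUC"):
    print(degenerate_formula(s))
```

Each printed formula is 20 characters long, with the number of N positions shrinking as the specific sequence grows.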
  • C Formulas for pools of 20mer degenerate crRNAs having specific sequences A, CU, UGA, AAAA, and GACUC. N represents a degenerate nucleotide.
  • FIG. 5. General overview of the reaction.
  • A Two positions in the genome are recognized as complementary to guide RNAs, and digested by a class 2 CRISPR-Cas endonuclease.
  • the degenerate region of the guide RNA is represented by a dotted line, followed by the specific sequence (thick continuous line), and a thin continuous line (tracrRNA, in case of digestion by an endonuclease of Type II).
  • the specific sequence and the presence of the PAM sequence determine the position of the digestion.
  • B After digestion, fragments in the genome that contain the specific sequence at both ends are generated.
  • C Next, a ligase reaction adds adaptor sequences to each digested fragment.
  • D These fragments can then be amplified by PCR or other amplification method.
  • FIG. 6. A. Exemplary target double stranded DNA sequence strand showing the crRNA sequence (boxed) and PAM sequence (bold). N represents a degenerate nucleotide (each N is independent of the others), N* represents a base complementary to the corresponding N, X1, X2, X3, and X4 represent specific nucleotides, X1*, X2*, X3*, and X4* represent bases complementary to X1, X2, X3, and X4, respectively, PAM represents a PAM sequence, and P*A*M* represents bases complementary to PAM. B. Exemplary target double stranded DNA sequence (showing the DNA strand having the same sequence as the crRNA) having a TTGA specific sequence immediately upstream of an NGG PAM sequence.
  • FIG. 7 Exemplary DNA segment to be searched in the genome for prediction of digestion guided by a pool of crRNAs having a TTGA specific sequence and for which the RNA-guided DNA endonuclease requires an NGG PAM sequence.
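A search of this kind can be sketched with a regular expression. The sketch below is a minimal illustration, assuming a 20-base guide (16 degenerate positions plus TTGA) and scanning only the strand that is supplied; the function name `find_cut_candidates` and its parameters are hypothetical, and a real prediction would also scan the reverse complement.

```python
import re


def find_cut_candidates(dna: str, specific: str = "TTGA", guide_len: int = 20):
    """Return 0-based starts of guide-length windows that end in `specific`
    and are immediately followed by an NGG PAM (given strand only)."""
    n_degenerate = guide_len - len(specific)
    # Lookahead so that overlapping candidate windows are all reported.
    pattern = re.compile(
        rf"(?=[ACGT]{{{n_degenerate}}}{re.escape(specific)}[ACGT]GG)"
    )
    return [m.start() for m in pattern.finditer(dna.upper())]


example = "A" * 16 + "TTGA" + "CGG" + "AAAA"
print(find_cut_candidates(example))
```

The degenerate positions are matched by any base, so only the specific sequence and the PAM constrain where a candidate site is reported.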
  • FIG. 8 Histograms displaying the number of 3 (left panel), 4 (middle panel) and 5 (right panel) base combinations in the specific sequence (Y-axis) that will generate a range of number of fragments between 200-500 nucleotides after digestion (X-axis).
  • SEQ ID NO:1 - Is an example of a crRNA sequence for class 2 CRISPR-Cas endonuclease type V-A.
  • SEQ ID NO:2 - Is an example of a crRNA and tracrRNA sequence for class 2 CRISPR-Cas endonuclease type V-B.
  • the term “about” indicates insubstantial variation in a quantity of a component of a composition not having any significant effect on the activity or stability of the composition.
  • the term “about,” when referring to the length of an oligonucleotide, is meant to encompass lengths that are within 1 or 2 nucleotides of the stated length.
  • the term “about,” when referring to a value or to an amount of mass, weight, time, volume, concentration or percentage, is meant to encompass variations of in some embodiments ±20%, in some embodiments ±10%, in some embodiments ±5%, in some embodiments ±1%, in some embodiments ±0.5%, and in some embodiments ±0.1% from the specified amount, as such variations are appropriate to perform the disclosed methods.
  • Nucleic acid refers to a multimeric compound comprising nucleosides or nucleoside analogs which have nitrogenous heterocyclic bases or base analogs linked together to form a polynucleotide, including conventional RNA, DNA, mixed RNA-DNA, and polymers that are analogs thereof.
  • a nucleic acid “backbone” may be made up of a variety of linkages, including one or more of sugar-phosphodiester linkages, peptide-nucleic acid bonds, phosphorothioate linkages, methylphosphonate linkages, or combinations thereof.
  • Sugar moieties of a nucleic acid may be ribose, deoxyribose, or similar compounds with substitutions, e.g., 2' methoxy or 2' halide substitutions.
  • Nitrogenous bases may be conventional bases (A, G, C, T, U), analogs thereof, or derivatives of purines or pyrimidines.
  • Nucleic acids may include modified bases to alter the function or behavior of the nucleic acid, e.g., addition of a 3'-terminal dideoxynucleotide to block additional nucleotides from being added to the nucleic acid.
  • Embodiments of oligomers that may affect stability of a hybridization complex include PNA oligomers, oligomers that include 2'-methoxy or 2'-fluoro substituted RNA, or oligomers that affect the overall charge, charge density, or steric associations of a hybridization complex, including oligomers that contain charged linkages (e.g., phosphorothioates) or neutral groups (e.g., methylphosphonates).
  • DNA refers to a double-stranded DNA molecule of genomic or synthetic origin, i.e., a polymer of deoxyribonucleotide bases or a polynucleotide molecule, read from the 5' (upstream) end to the 3' (downstream) end.
  • DNA sequence refers to the nucleotide sequence of a DNA molecule.
  • isolated DNA molecule refers to a DNA molecule at least partially separated from other molecules normally associated with it in its native or natural state.
  • isolated refers to a DNA molecule that is at least partially separated from some of the nucleic acids which normally flank the DNA molecule in its native or natural state.
  • DNA molecules fused to regulatory or coding sequences with which they are not normally associated, for example as the result of recombinant techniques are considered isolated herein.
  • Such molecules are considered isolated when integrated into the chromosome of a host cell or present in a nucleic acid solution with other DNA molecules, in that they are not in their native state.
  • RNA refers to a molecule comprising at least one ribonucleotide residue.
  • By “ribonucleotide” is meant a nucleotide with a hydroxyl group at the 2' position of a β-D-ribofuranose moiety.
  • RNA encompasses double stranded RNA, single stranded RNA, RNAs with both double stranded and single stranded regions, isolated RNA such as partially purified RNA, essentially pure RNA, synthetic RNA, recombinantly produced RNA, as well as altered RNA, or analog RNA, that differs from naturally occurring RNA by the addition, deletion, substitution, and/or alteration of one or more nucleotides.
  • Such alterations can include addition of non-nucleotide material, such as to the end(s) of an RNA molecule or internally, for example at one or more nucleotides of the RNA.
  • Nucleotides in the RNA molecules of the presently disclosed subject matter can also comprise non-standard nucleotides, such as non-naturally occurring nucleotides or chemically synthesized nucleotides or deoxynucleotides. These altered RNAs can be referred to as analogs or analogs of a naturally occurring RNA.
  • double stranded RNA refers to an RNA molecule at least a part of which is in Watson-Crick base pairing forming a duplex.
  • the term is to be understood to encompass an RNA molecule that is either fully or only partially double stranded.
  • Exemplary double stranded RNAs include, but are not limited to molecules comprising at least two distinct RNA strands that are either partially or fully duplexed by intermolecular hybridization.
  • the term is intended to include a single RNA molecule that by intramolecular hybridization can form a double stranded region (for example, a hairpin).
  • “intermolecular hybridization” and “intramolecular hybridization” refer to double stranded molecules for which the nucleotides involved in the duplex formation are present on different molecules or the same molecule, respectively.
  • a “primer” refers to an oligomer that hybridizes to a template nucleic acid and has a 3' end that is extended by polymerization.
  • a primer may optionally be modified by attachment or inclusion of: a 5' region that is non-complementary to the target sequence, a tag, a promoter, a detectable label, or a sequence or molecule used or useful for manipulation, amplification, purification, or detection.
  • a “primer” is typically a highly purified, isolated polynucleotide that is designed for use in specific annealing or hybridization methods that involve thermal amplification.
  • a pair of primers may be used with template DNA, such as a sample of genomic DNA, in a thermal amplification, such as polymerase chain reaction (PCR), to produce an amplicon, where the amplicon produced from such reaction would have a DNA sequence corresponding to sequence of the template DNA located between the two sites where the primers hybridized to the template.
  • PCR polymerase chain reaction
  • an "amplicon" is a piece or fragment of DNA that has been synthesized using amplification techniques. In order for a nucleic acid molecule to serve as a primer it need only be sufficiently complementary in sequence to be able to form a stable double-stranded structure under the particular solvent and salt concentrations employed.
  • By “hybridization” or “hybridize” is meant the ability of two completely or partially complementary nucleic acid strands to come together under specified hybridization assay conditions in a parallel or antiparallel orientation to form a stable structure having a double-stranded region.
  • the two constituent strands of this double-stranded structure, sometimes called a hybrid, are held together by hydrogen bonds. These hydrogen bonds most commonly form between nucleotides containing the bases adenine and thymine or uracil (A and T or U) or cytosine and guanine (C and G) on single nucleic acid strands.
  • a crRNA can hybridize to its target nucleic acid to form a crRNA:target hybrid, but does not form stable crRNA:non-target hybrids.
  • a crRNA hybridizes to target nucleic acid to a sufficiently greater extent than to non-target nucleic acid to guide target-specific digestion of the DNA by the RNA-guided DNA endonuclease.
  • single nucleotide polymorphism refers to a single nucleotide position in a genomic sequence for which two or more alternative alleles are present at appreciable frequency in a population.
  • a target sequence is a nucleotide sequence in a double stranded DNA to which a crRNA or degenerate crRNA can hybridize. Because the DNA is double stranded, a target sequence will contain one strand which contains a sequence complementary to, or that hybridizes to, a (degenerate) crRNA and one strand that contains a sequence that is the same as the (degenerate) crRNA nucleotide sequence.
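The relationship between a crRNA and the two strands of its target can be sketched as follows. This is an illustrative helper only: the name `target_strands` is hypothetical, standard Watson-Crick pairing is assumed, and modified or universal bases are not handled.

```python
# Watson-Crick complements for DNA; N is kept as N for degenerate positions.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def target_strands(crrna: str):
    """Return (same_sequence_dna, complementary_dna), both written 5'->3'.

    The first strand has the same sequence as the crRNA (with T for U);
    the second is the strand the crRNA hybridizes to.
    """
    same = crrna.upper().replace("U", "T")
    complementary = "".join(_COMPLEMENT[base] for base in reversed(same))
    return same, complementary


print(target_strands("GACUC"))
```

For the specific sequence GACUC, the same-sequence DNA strand reads GACTC and the hybridizing strand reads GAGTC when both are written 5' to 3'.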
  • RNA-guided DNA endonucleases guided by pools or libraries of degenerate crRNA oligonucleotides to digest double stranded DNA.
  • the digested DNA can then be modified by adaptor ligation for sequencing, polymorphism detection, or other genetic analysis.
  • the methods of the present disclosure rely on the digestion of DNA using a class 2 CRISPR-Cas endonuclease, guided by CRISPR RNA (crRNA) designed to contain one or more degenerate bases so that any variation in the number of targeted sites can be achieved, and a wide variation in positions targeted can be obtained compared to traditional restriction enzyme digestion. Products of the digestion can then be prepared for sequencing using traditional methods of DNA library preparation for sequencing.
  • the present disclosure provides for the simplification of the process for sequencing a reduced representation of the genome for identification and analysis of genetic polymorphisms.
  • methods for reduced representation of a genome comprise: contacting a sample comprising genomic DNA with a described pool of degenerate crRNAs and an RNA-guided DNA endonuclease enzyme(s).
  • the degenerate crRNAs guide cleavage of the double stranded DNA by the RNA-guided DNA endonuclease enzyme.
  • Cleavage of the genomic DNA by the RNA-guided DNA endonuclease results in DNA fragments having the specific sequence at or near one or both ends.
  • the number and average length of the genomic DNA fragments is determined by the specific sequence of the degenerate crRNAs.
  • the digested double stranded DNA is modified by linkage of adaptor sequence. Any method known in the art for performing adaptor ligation can be used with the described compounds, compositions, kits, and methods.
  • the adaptor-modified DNA fragments are subjected to sequence analysis, polymorphism detection, and/or other genetic analysis. Any method known in the art for performing sequencing and DNA polymorphism detection can be used with the described compounds, compositions, kits, and methods.
  • the adaptor-modified DNA fragments are subjected to nucleic acid amplification. Amplification can be performed using primers that hybridize to the adaptor sequences. Any method known in the art for amplifying the adaptor-modified fragments can be used with the described compounds, compositions, kits, and methods. In some embodiments, the amplified sequence is then subjected to sequence analysis or other genetic analysis.
  • a pool or library of degenerate crRNA comprises single strand RNA oligonucleotides 15-30 nucleobases in length each containing a specific sequence region and a degenerate sequence region.
  • Each crRNA in the pool or library has the same specific sequence in the specific sequence region. However individual crRNAs in the pool or library can have different sequences in the degenerate sequence region.
  • the degenerate sequence region is located 5' of the specific sequence region.
  • a degenerate crRNA comprises, starting from the 3' end: (a) a specific sequence region having a specific nucleotide sequence wherein each position is independently a G, A, C, or U ribonucleotide or a modified G, A, C, or U, and (b) a degenerate sequence region wherein different sequences are present in individual oligonucleotides in the pool of degenerate crRNAs.
  • the specific sequence region can be from 1-20 nucleobases in length and has a defined or specific nucleotide sequence, i.e., in a pool of degenerate crRNAs, each individual oligonucleotide contains the same nucleotide sequence in the specific sequence region.
  • the degenerate sequence can be 1-29 nucleobases in length, wherein, within the pool of degenerate crRNAs, each nucleotide position can independently have, or be represented by, multiple possible alternatives. Because individual oligonucleotides in the pool of degenerate crRNAs contain different sequences, the degenerate crRNAs can bind to multiple targets in a double stranded DNA sample.
  • a pool of degenerate crRNAs comprises sequences representing a mixture of all the possible oligonucleotides of length n, wherein n is the length in nucleobases of the degenerate sequence region.
  • a pool of degenerate crRNAs comprises sequences representing a mixture of nearly all the possible oligonucleotides of length n, wherein n is the length in nucleobases of the degenerate sequence region. In some embodiments, a pool of degenerate crRNAs comprises sequences representing a mixture of a substantial fraction of the possible oligonucleotides of length n, wherein n is the length in nucleobases of the degenerate sequence region.
  • a pool of degenerate crRNAs comprises sequences representing ⁇ 1% to 100%, of the possible oligonucleotides of length n, wherein n is the length in nucleobases of the degenerate sequence region.
  • a pool of degenerate crRNAs comprises sequences representing about 1%, about 2%, about 3%, about 4%, about 5%, about 6%, about 7%, about 8%, about 9%, about 10%, about 15%, about 20%, about 25%, about 30%, about 40%, about 50%, about 60%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%, of the possible oligonucleotides of length n, wherein n is the length in nucleobases of the degenerate sequence region.
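To give a sense of scale, the number of possible oligonucleotides of length n over four bases is 4^n. The sketch below computes that number, enumerates a complete pool for a small n, and reports what fraction of the possible degenerate sequences a given pool represents. The function names are hypothetical, and universal bases or partially degenerate sites are not modeled.

```python
from itertools import product


def pool_size(n: int) -> int:
    """Number of distinct sequences of length n over G, A, C, U."""
    return 4 ** n


def enumerate_pool(specific: str, n: int):
    """Yield every crRNA with an n-base degenerate region 5' of `specific`.

    Only practical for small n; a 15-base region has over a billion members.
    """
    for combo in product("GACU", repeat=n):
        yield "".join(combo) + specific


def coverage(pool, n: int) -> float:
    """Fraction of the possible n-base degenerate sequences present in `pool`."""
    return len({seq[:n] for seq in pool}) / pool_size(n)


small_pool = list(enumerate_pool("GA", 2))
print(pool_size(15), len(small_pool), coverage(small_pool, 2))
```

A pool that omits some sequences, for example `small_pool[:8]`, would report a coverage of 0.5 under this definition.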
  • the degenerate crRNAs guide cleavage or nicking of the double stranded DNA by the RNA-guided DNA endonuclease at or near the target sequence.
  • the degenerate crRNAs are 15-30 nucleobases in length. In some embodiments, the degenerate crRNAs in a pool of degenerate crRNAs are each about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, or about 30 nucleobases in length or more. In some embodiments, the degenerate crRNAs in a pool of degenerate crRNAs are each about 19-21 nucleobases in length. In some embodiments, the degenerate crRNAs in a pool of degenerate crRNAs are each about 19 nucleobases in length.
  • the degenerate crRNAs in a pool of degenerate crRNAs are each about 20 nucleobases in length. In some embodiments, the degenerate crRNAs in a pool of degenerate crRNAs are each about 21 nucleobases in length.
  • each degenerate crRNA contains a specific sequence region of 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides.
  • the specific sequence is 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9 or about 10 nucleotides in length, or greater, for example about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, or about 20 nucleotides in length or more.
  • each degenerate crRNA in a pool of degenerate crRNAs comprises a specific sequence region 1 to 10 nucleobases or more in length and a degenerate sequence region 9-20 nucleobases or more in length (see FIG. 3).
  • In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a degenerate sequence region of about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, or about 20 nucleotides or more in length.
  • each degenerate crRNA in a pool of degenerate crRNAs contains a degenerate sequence region of 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides, or sufficient nucleotides to provide a degenerate crRNA of about 19-21 nucleotides in length. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 1 nucleotide and a degenerate sequence region of 18-20 nucleotides.
  • each degenerate crRNA in a pool of degenerate crRNAs, contains a specific sequence region of 2 nucleotides and a degenerate sequence region of 17, 18, or 19 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 3 nucleotides and a degenerate sequence region of 16, 17, or 18 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 4 nucleotides and a degenerate sequence region of 15, 16, or 17 nucleotides.
  • each degenerate crRNA in a pool of degenerate crRNAs, contains a specific sequence region of 5 nucleotides and a degenerate sequence region of 14, 15, or 16 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 6 nucleotides and a degenerate sequence region of 13, 14, or 15 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 7 nucleotides and a degenerate sequence region of 12, 13, or 14 nucleotides.
  • each degenerate crRNA in a pool of degenerate crRNAs, contains a specific sequence region of 8 nucleotides and a degenerate sequence region of 11, 12, or 13 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 9 nucleotides and a degenerate sequence region of 10, 11, or 12 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 10 nucleotides and a degenerate sequence region of 9, 10, or 11 nucleotides.
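The length pairings listed above all follow from a total crRNA length of 19-21 nucleotides: the degenerate region is whatever remains after the specific region. A minimal sketch (the function name is hypothetical) that reproduces the enumerated combinations:

```python
def degenerate_lengths(specific_len: int, total_range=(19, 21)):
    """Degenerate-region lengths giving a total crRNA length within `total_range`."""
    low, high = total_range
    return [
        total - specific_len
        for total in range(low, high + 1)
        if total - specific_len >= 1
    ]


# Specific regions of 1 to 10 nucleotides, as enumerated in the text.
for specific_len in range(1, 11):
    print(specific_len, degenerate_lengths(specific_len))
```

For a 1-nucleotide specific region this gives 18-20 degenerate nucleotides, and for a 10-nucleotide specific region it gives 9-11, matching the embodiments listed.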
  • As used herein, a specific sequence region comprises a nucleotide sequence wherein each position has a defined nucleobase selected from guanine (G), adenine (A), cytosine (C), and thymine (T) or uracil (U), and a modified G, A, C, or T/U.
  • G guanine
  • A adenine
  • C cytosine
  • T thymine
  • U uracil
  • a modified G, A, C, or T/U forms a base pair with, hybridizes to, or is complementary to, C, T, G, and A respectively.
  • each position in the degenerate sequence region can have multiple possible nucleobase alternatives.
  • some of the oligonucleotides can have a G, some can have an A, some can have a C, and some can have a T or U.
  • each position in a degenerate sequence in a pool of degenerate crRNAs can be G, A, C, or U with equal probability.
  • any given position in the degenerate sequence in a pool of degenerate crRNAs can be G, A, C, or U with equal probability.
  • each position in the degenerate sequence in a pool of degenerate crRNAs may independently contain one or more universal bases in lieu of or in addition to one or more of G, A, C, and/or U.
  • any given position in the degenerate sequence in a pool of degenerate crRNAs may independently contain one or more universal bases in lieu of or in addition to one or more of G, A, C, and/or U.
  • any given position in the degenerate sequence in a pool of degenerate crRNAs can independently be at least two of G, A, C, U, and universal base. The probability of each base at a degenerate position in a pool of degenerate crRNAs can independently be equal or statistical.
  • a statistical probability means the probability of each nucleotide being present at a given degenerate position in a pool of degenerate crRNAs is equal to a mole fraction.
  • the mole fraction is determined during synthesis of the pool of degenerate crRNAs.
  • a fully degenerate site is formed in a pool of oligonucleotides by using an equimolar mix of G, A, C, and U at a given site during synthesis of the oligonucleotide.
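The effect of the synthesis mix on the pool can be imitated in software by drawing each degenerate position from a set of mole fractions. The sketch below is a simulation for illustration only, not a model of oligonucleotide synthesis chemistry; the function name, parameters, and the equimolar default are assumptions.

```python
import random


def sample_crrna(specific, n_degenerate, mole_fractions=None, rng=None):
    """Draw one crRNA: n_degenerate random bases (5') followed by `specific` (3').

    `mole_fractions` maps each base to its relative amount in the mix;
    the default is an equimolar mix, i.e. a fully degenerate site.
    """
    rng = rng or random.Random()
    fractions = mole_fractions or {"G": 0.25, "A": 0.25, "C": 0.25, "U": 0.25}
    bases = list(fractions)
    weights = [fractions[base] for base in bases]
    degenerate = "".join(rng.choices(bases, weights=weights, k=n_degenerate))
    return degenerate + specific


# One draw from an equimolar pool, and one from a G + C site mix.
print(sample_crrna("GACUC", 15, rng=random.Random(0)))
print(sample_crrna("GACUC", 15, {"G": 0.5, "C": 0.5}, rng=random.Random(0)))
```

Passing a two- or three-base mix restricts every degenerate position to those bases, which corresponds to a partially degenerate site.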
  • a pyrimidine universal base can be used that hybridizes to A or G.
  • a purine universal base can be used that hybridizes to C or U/T.
  • a fully degenerate site can be a pyrimidine universal base or purine universal base with equal probability.
  • every position in the degenerate sequence in a pool of degenerate crRNAs is degenerate.
  • a degenerate sequence in a pool of degenerate crRNAs can have 0-5 specific sites wherein the specific sites independently contain a defined nucleotide selected from G, A, C, or T/U.
  • a degenerate sequence in a pool of degenerate crRNAs can have 0-10 sites wherein the site is degenerate for two or three nucleotides, such as, but not limited to, G + C, G + T, G + A, C + T, C + A, A + T, G + C + T, G + A + T, G + C + A and C + A + T.
  • the two or three nucleotides can be present with equal probability.
  • Degenerate crRNAs can form pools or libraries of crRNAs representing multiple sequences each containing a common specific sequence. Because the degenerate sequence is degenerate, the plurality of crRNAs in a pool of degenerate crRNAs can hybridize to a plurality of double stranded DNA target sequences.
  • a universal base is able to form a base pair with (hybridize to) two or more natural bases (i.e., G, A, C, and U or T). In some embodiments, a universal base is able to form a base pair with three or more of G, A, C, and U or T. In some embodiments, a universal base is able to form a base pair with each of G, A, C and U or T. An ideal universal base is able to hybridize non-selectively (or non-discriminately) to each of the native bases.
  • a universal base can replace any of the four natural bases without significantly affecting either melting behavior of duplexes or the normal activities of the modified oligonucleotide.
  • many universal bases have at least some preference for hybridizing to one or more of the natural bases. Universal bases include, but are not limited to:
  • Sequences in the double stranded DNA sequence that are complementary to a crRNA sequence are potential substrates for cleavage by the crRNA-RNA-guided DNA endonuclease complexes. Selection of the degenerate crRNA specific sequence affects the sizes and numbers of resultant DNA fragments. Because a pool of degenerate crRNAs contains oligonucleotides having multiple sequences in the degenerate sequence regions, the level of genome reduction is determined by the number of bases and the nucleotide sequence of the specific sequence. The remaining sequences of the pool of degenerate crRNAs are composed of degenerate (N) bases, up to the total length of the degenerate crRNA.
  • the specific sequence defines the position(s) in the genome, from where digestion by the RNA-guided DNA endonuclease will occur.
  • the pool of degenerate crRNAs can hybridize to multiple double stranded DNA target sequences containing a sequence complementary to the specific sequence. Generally, use of longer specific sequences will yield fewer fragments, and therefore greater reduction in genome complexity, than shorter specific sequences. For example, selection of a specific sequence with 1 nucleotide typically yields a larger number of fragments than a specific sequence with 10 nt, which is likely to anneal to the genome less frequently. Different pools of degenerate crRNAs having different specific sequences will hybridize to different target sequences in a DNA sample.
  • Selection of the specific sequence depends on the number of fragments, frequency of digestion, fragment size, distribution, and/or location that one desires to sample in the genome.
  • the number of bases and the nucleotide sequence to be used in the specific sequence is determined by the bioinformatics analysis of the genome of the species being analyzed by identifying the number of instances that a specific sequence occurs in the reference genome.
  • RNA-guided DNA endonucleases require a protospacer adjacent motif (PAM) sequence that is adjacent to the crRNA sequence.
  • the PAM sequence is present in the double stranded DNA target sequence but not in the guide RNA sequence.
  • the PAM sequence varies by the species of RNA-guided DNA endonuclease.
  • Class 2 CRISPR-Cas type II endonuclease derived from S. pyogenes utilizes an NGG PAM sequence located on the immediate 3' end of the guide RNA recognition sequence.
  • PAM sequences include, but are not limited to, NGG (Streptococcus pyogenes), NNNNGATT (Neisseria meningitidis), NNAGAA (Streptococcus thermophilus), and NAAAAC (Treponema denticola).
  • the genomic sequence to be searched for frequency of digestion is the degenerate crRNA specific sequence.
  • the PAM sequence is also taken into consideration when estimating the number of positions in the genome that can be digested.
  • a genomic sequence to be searched for frequency and location of the degenerate crRNA specific sequence plus the PAM sequence.
  • the target genomic sequence will be as described in FIGS. 6 and 7 (5' crRNA+PAM 3').
  • Genomic nucleotide sequences are not completely random. Some sequences are likely to result in the digestion of DNA in positions that are more consistently distributed in the genome than others. Therefore, if a reference genome is available for the organism of interest, this distribution can be estimated to guide the selection of the most appropriate specific sequence.
  • two or more pools of degenerate crRNAs that differ in their specific sequence may be used in combination in a single digestion reaction. Using two or more pools of degenerate crRNAs that differ in their specific sequence in a single digestion reaction provides more flexibility in the regions of the genome that will be digested/amplified.
  • a degenerate crRNA further comprises a trans-acting CRISPR RNA (tracrRNA) operably linked to the degenerate crRNA by a linker to form a degenerate guide RNA.
  • pools of degenerate guide RNAs wherein the degenerate crRNAs are linked to tracrRNAs.
  • the tracrRNA is linked to the 3' end of the degenerate crRNA.
  • a degenerate guide RNA comprises both degenerate crRNA and tracrRNA sequences.
  • a degenerate guide RNA is about 50-260 nucleotides long. In some embodiments, the degenerate guide RNA is about 90-120 nucleotides long. In some embodiments, a degenerate guide RNA is at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 120, at least 150, at least 180, at least 200, at least 220, at least 240, or at least 260 nucleotides long.
  • the linker is a short nucleotide sequence.
  • the short nucleotide sequence is 1-10 nucleotides in length.
  • the linker region is about 4 nucleotides in length, although in other embodiments shorter or longer linker regions can be used, including linker regions of one to about 3 nucleotides in length, or linker regions of about 5, about 6, about 7, about 8, about 9, or about 10 or more nucleotides in length.
  • the linker is a non-nucleotide linker.
  • degenerate guide RNAs are used in combination with digestion by a class 2 CRISPR-Cas endonuclease of Type V-A.
  • a class 2 CRISPR-Cas endonuclease of Type V-A can be, but is not limited to, Cas12a (also known as Cpf1).
  • degenerate crRNAs connected to trans-acting CRISPR RNAs (tracrRNAs) by linkers are used in combination with digestion by a class 2 CRISPR-Cas endonuclease of Type II.
  • a class 2 CRISPR-Cas endonuclease of Type II can be, but is not limited to, Cas9.
  • degenerate crRNAs connected to trans-acting CRISPR RNAs (tracrRNAs) by linkers are used in combination with digestion by a class 2 CRISPR-Cas endonuclease of Type V-B.
  • a class 2 CRISPR-Cas endonuclease of Type V-B can be, but is not limited to, Cas12b (also known as C2c1).
  • An exemplary crRNA is shown in FIG. 1.
  • An exemplary crRNA connected to a trans-acting CRISPR RNA (tracrRNA) by a nucleotide linker is shown in FIG. 2.
  • Degenerate crRNAs or degenerate guide RNAs are typically ribonucleic acids. However, degenerate crRNAs or degenerate guide RNAs may contain one or more modified nucleotides. The modifications may independently be base modifications or backbone modifications. Any of the described degenerate crRNAs and/or degenerate guide RNAs can contain at least one modified nucleotide. In some embodiments, a degenerate crRNA and/or degenerate guide RNA comprises two or more modified nucleotides. The two or more modified nucleotides may have the same or different modifications.
  • incorporation of one or more modified nucleotides is used to increase the stability of the duplex by raising the Tm relative to the corresponding degenerate crRNAs and/or degenerate guide RNAs without modified nucleotides.
  • a degenerate crRNA and/or degenerate guide RNA contains one or more modified nucleotides.
  • the modified nucleotides may be the same or different.
  • a "modified nucleotide" is a nucleotide other than a naturally occurring G, A, C, or U ribonucleotide.
  • a modified nucleotide may have a base modification, backbone modification and/or modified internucleoside linkage.
  • a modified nucleotide can be, but is not limited to, a 2'-ribose modified nucleotide, 5'-methyl cytosine, or a universal base.
  • the degenerate crRNAs and/or degenerate guide RNAs may be made by any method available in the art of oligonucleotide synthesis.
  • adaptors are ligated to the ends of the digested DNA fragments.
  • Adaptors may be used to facilitate amplification and/or sequencing of the digested DNA fragments.
  • the adaptor-modified DNA fragments can then be amplified and/or sequenced for identification of DNA polymorphisms.
  • an adaptor comprises an oligonucleotide that is ligatable to one or both strands of a double-stranded DNA molecule.
  • An adaptor may be single stranded, double stranded, or a single strand that can form a hairpin.
  • a double stranded adaptor can have two blunt ends, two sticky ends, or a sticky end and a blunt end.
  • An adaptor may contain one or more of: restriction enzyme recognition sequence, detectable label, molecular barcode, and primer binding sequence.
  • an adaptor is 8-120 nucleotides in length.
  • an adaptor is 20-60 nucleotides in length.
  • an adaptor is about 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 55, or about 60 nucleotides in length.
  • a first adaptor is ligated to a first end of a first nucleic acid product or DNA fragment and a second adaptor is ligated to a second end of a first nucleic acid product or DNA fragment to form a second nucleic acid product.
  • the first adaptor and the second adaptor may be the same or different (i.e., they may have the same or different nucleotide sequences and/or they may contain the same or different modifications).
  • amplification and/or sequencing primers that hybridize to the adaptors. Hybridization to the adaptors can be used to facilitate polymerization initiation.
  • the primers may be the same length as, shorter than, or longer than the adaptor sequences.
  • a first PCR primer and a second PCR primer used for amplification of the DNA fragment may be the same or different.
  • the second nucleic acid product is amplified to form a third (amplified) nucleic acid product.
  • the second nucleic acid product is amplified by PCR for between about 15 and about 25 cycles.
  • the amplified nucleic acid product is sequenced.
  • the first nucleic acid product, second nucleic acid product or third nucleic acid product represents a reduced representation of the genome. In some embodiments, reduced representations of a plurality of genomes are produced.
  • RNA-guided DNA endonuclease forms a complex with the crRNA and binds to DNA and cleaves the DNA in a crRNA sequence-dependent manner.
  • the location of cleavage relative to the crRNA sequence (or crRNA hybridizing sequence) is dependent on the particular endonuclease.
  • RNA-guided DNA endonuclease can be, but is not limited to, a CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease Type II endonuclease, Cas9 endonuclease, class 2 CRISPR-Cas endonuclease Type V-B endonuclease, Cas12b endonuclease, C2c1 endonuclease, class 2 CRISPR-Cas endonuclease Type V-A endonuclease, Cas12a endonuclease, or Cpf1 endonuclease or a functional derivative of a CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease Type II endon
  • the double stranded DNA sample can comprise, consist essentially of, or consist of genomic DNA, total genomic DNA, and/or cDNA.
  • the double stranded DNA may be unamplified or amplified, undigested or digested, unfragmented or fragmented, unpurified or purified.
  • the double stranded DNA may be obtained or isolated by any method known in the art.
  • the double stranded DNA sample may be isolated from any organism, cell, tissue or remnant of an organism, cell, or tissue.
  • the organism can be living, dead, or archaeological.
  • the organism may be a prokaryote, archaea, or eukaryote.
  • the organism is a bacterium, archaeon, plant, animal or fungus.
  • the animal can be, but is not limited to, a mammal, insect, fish, bird, amphibian, or reptile.
  • the mammal can be, but is not limited to, rodent, mouse, rat, dog, cat, bovine, porcine, ovine, non-human primate, or human.
  • the mammal is a human.
  • the plant can be, but is not limited to, soybean, maize, wheat, or rice.
  • the second nucleic acid product is amplified by PCR.
  • a first PCR primer binds to the first adaptor and a second PCR primer binds to the second adaptor.
  • the third nucleic acid product represents a reduced representation of the genome.
  • the third nucleic acid product is sequenced to obtain the genotype of the individual.
  • the method further comprises comparing the genotype of the individual to a reference genotype.
  • the second nucleic acid product is amplified by PCR.
  • a first PCR primer binds to the first adaptor and a second PCR primer binds to the second adaptor.
  • the third nucleic acid product represents a reduced representation of the genome.
  • the third nucleic acid product is sequenced to identify the polymorphism of the individual.
  • the method further comprises comparing the polymorphism of the individual to a reference genotype.
  • the method further comprises identifying a plurality of polymorphisms in the genome of the individual.
  • degenerate crRNAs are linked to tracrRNAs to form degenerate guide RNAs.
  • multiple pools of degenerate crRNAs with different specific sequences can be used to digest double stranded DNA, reduce genome complexity, generate a reduced representation of a genome, or analyze genomic DNA using the described pools of degenerate crRNAs and one or more RNA-guided DNA endonucleases.
  • the methods comprise:
  • RNA-guided DNA endonuclease enzyme(s)
  • incubating the double stranded DNA with the one or more pools of degenerate crRNAs and/or one or more pools of degenerate guide RNAs and the RNA-guided DNA endonuclease(s) for a sufficient time to digest the double stranded DNA thereby yielding DNA fragments (first nucleic acid product).
  • the methods comprise:
  • the double stranded DNA can be, but is not limited to, genomic DNA and cDNA.
  • any of the described pools of degenerate crRNAs or pools of degenerate guide RNAs may be used in the above method.
  • the one or more pools of degenerate crRNAs and/or one or more pools of degenerate guide RNAs contain different specific sequences.
  • the RNA-guided DNA endonuclease can be, but is not limited to, a CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease Type II endonuclease, Cas9 endonuclease, class 2 CRISPR-Cas endonuclease Type V-B endonuclease, Cas12b endonuclease, C2c1 endonuclease, class 2 CRISPR-Cas endonuclease Type V-A endonuclease, Cas12a endonuclease, or Cpf1 endonuclease or a functional derivative of a CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease
  • adaptors are ligated to the DNA fragments to provide a reduced representation of the genome that can be amplified by PCR, sequenced, genotyped for one or more polymorphisms, and/or otherwise genetically analyzed (attachment of adaptors to a DNA fragment is shown in example FIG. 5C).
  • the methods further comprise, after step d):
  • the methods further comprise:
  • amplifying refers to generating one or more copies of a target nucleic acid, using the target nucleic acid as a template.
  • the methods further comprise:
  • a sequencing primer can hybridize to the same or similar sequence as an amplification primer.
  • the described methods are useful in genetic and/or genome analyses. Genetic and genomic analyses include the identification, measurement, and/or comparison of genomic features such as DNA sequence, structural variation, gene expression, or regulatory and functional element annotation at a genomic scale.
  • the fragmented or amplified DNA may be analyzed by any analysis method including, but not limited to, DNA sequencing, high-throughput sequencing, next generation sequencing, thermocycling or isothermal nucleic acid amplification assay, real-time thermocycling or isothermal nucleic acid amplification assay, hybridization assay, microarray assay, bead array assay, primer extension assay, enzyme mismatch cleavage assay, branched hybridization assay, molecular beacon assay, invasive cleavage structure assay, or sandwich hybridization assay.
  • the fragmented or amplified DNA may be analyzed for the presence of one or more single nucleotide polymorphisms (SNPs) or other differences relative to a reference sequence.
  • one or more DNA fragments may be cloned into a plasmid, thereby producing a recombinant plasmid.
  • compositions and kits useful for digesting double stranded DNA (including but not limited to genomic DNA or cDNA), reducing representation of a genome, sequencing genomic DNA, and/or polymorphism detection.
  • reaction mixtures for digesting double stranded DNA, reducing representation of a genome, sequencing genomic DNA, and/or polymorphism detection are also provided by the disclosure.
  • kits comprising one or more reagents or components as described herein.
  • a composition, kit, and/or reaction mixture in accordance with the present disclosure comprises: one or more pools of degenerate crRNAs and/or one or more pools of degenerate guide RNAs as described herein and an RNA-guided DNA endonuclease(s).
  • DNA endonuclease comprises a class 2 CRISPR-Cas endonuclease.
  • a composition, kit and/or reaction mixture may further comprise one or more of the following: one or more adaptors, one or more amplification probes, and one or more sequencing probes. In some embodiments, the adaptors are the same.
  • the adaptors are distinct.
  • a kit comprises 3, 4, 5, 6 or more distinct adaptors.
  • a kit comprises first and second amplification primers.
  • the amplification primers are distinct.
  • a kit comprises 3, 4, 5, 6, or more distinct amplification primers.
  • a composition, kit and/or reaction mixture may further include a number of optional components such as, for example, buffer, salt solutions, appropriate nucleotide triphosphates (e.g., dATP, dCTP, dGTP, and dTTP), and/or enzymes (including, but not limited to: DNA polymerase and thermostable DNA polymerase), and control samples.
  • the container(s) can be of any suitable material, e.g., glass, plastic, metal, etc., and of any suitable size, shape, or configuration.
  • a pool of degenerate crRNAs and/or pool of degenerate guide RNAs is provided as a solid, such as powder form.
  • a pool of degenerate crRNAs and/or pool of degenerate guide RNAs is provided as a liquid or solution.
  • the degenerate crRNAs or degenerate guide RNAs may be provided as isolated RNA or they may be provided as part of a complex with RNA-guided DNA endonucleases.
  • a degenerate crRNA-RNA-guided DNA endonuclease complex or degenerate guide RNA-RNA-guided DNA endonuclease complex refers to a complex comprising an RNA guided DNA endonuclease and a degenerate crRNA or degenerate guide RNA.
  • the degenerate crRNAs hybridize to target sequences in the double stranded DNA and provide sequence specificity for digestion of the DNA by the endonuclease.
  • a kit may also include additional reagents, e.g., PCR components, such as salts including MgCl2, a thermostable polymerase enzyme, and deoxyribonucleotides, and the like, and other reagents, including for example media such as water, saline, or the like, as described herein.
  • a kit may also include one or more other adjuncts or adjuvants.
  • a kit further includes a set of instructions or packaging materials that describe methods of using the components for practicing methods in accordance with the present disclosure.
  • the instructions may be associated with a package insert and/or the packaging of the kit or the components thereof.
  • the kit comprises instructions or packaging materials that describe how to use the degenerate crRNAs or degenerate guide RNAs in a method to produce a reduced representation of a genome.
  • the kit comprises instructions or packaging materials that describe how to use the degenerate crRNAs or degenerate guide RNAs in a method to obtain a genotype of an individual.
  • the kit comprises instructions or packaging materials that describe how to use the degenerate crRNAs or degenerate guide RNAs in a method to identify a polymorphism in a genome of an individual.
  • Any method disclosed herein is also to be understood as a disclosure of corresponding uses of materials involved in the method directed to the purpose of the method.
  • Any of the degenerate crRNAs, degenerate guide RNAs, adaptors, and any combinations (e.g., kits and compositions) comprising such degenerate crRNAs, degenerate guide RNAs, and/or adaptors are to be understood as also disclosed for use in digesting double stranded DNA, reducing representation of a genome, sequencing genomic DNA, and/or polymorphism detection.
  • the method is used to digest genomic DNA.
  • the same or similar methods can be used to digest double stranded DNA from another source.
  • the reaction begins with the endonuclease digestion of DNA by the class 2 CRISPR-Cas endonuclease guided by a degenerate crRNA. After digestion, adaptors are ligated to prepare a reduced representation of the genome by amplification by PCR. The resultant reduced representation of the genome can be sequenced, and polymorphisms genotyped (FIG. 5).
  • 3 µL of a class 2 CRISPR-Cas endonuclease reaction buffer (such as the Cas9 nuclease 10x reaction buffer) is combined with 3 µL of 300 nM of an oligonucleotide composed of the crRNA and tracrRNA sequence, 1 µL of 1 µM of the class 2 CRISPR-Cas endonuclease of Type II Cas9, and 20 µL of water.
  • 10 to 1,000 ng of DNA to be digested is added.
  • the reaction is then maintained at 25°C for 1 to 24 hours, before incubation at 65°C for 10 minutes to stop the digestion reaction.
  • a standard ligation reaction is carried out. If the class 2 CRISPR-Cas endonuclease used in the reaction produces a blunt-end cut of the DNA (e.g. Cas9), then the reaction can be preceded by an A-tailing reaction, which adds an adenine to the ends of the DNA molecule following digestion. Finally, the ligation reaction is cleaned up and the digested DNA fragments that contain adaptors in both ends are amplified by standard polymerase chain reaction (PCR).
  • the crRNA properties to be defined are (a) the number and (b) type (i.e., A, C, G or T) of bases to be used in the specific sequence.
  • the remaining sequence of the crRNA will be composed of degenerate (N) bases, up to the total length of the crRNA.
  • Combinations with specific sequences with all possible 3 base combinations are predicted to generate on average 24,294 fragments between 200-500 nucleotides, with a minimum of 2484 (specific sequence CCC) and a maximum of 85356 (specific sequence TTT). For 4 and 5 base combinations, the average number of fragments between 200-500 nucleotides is 2796 and 255, respectively.
  • the distribution of fragments generated for all combinations of bases in the specific sequence is shown in FIG. 8. As can be seen from the broad range in the number of digested fragments that can be generated, the specific sequence can be tailored depending on the number of regions of the genome that one wishes to sample.
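
The bioinformatic estimate described above (counting occurrences of the specific sequence plus PAM in a reference genome and tallying fragments of 200-500 nucleotides) can be sketched in Python. This is a minimal illustration and not the procedure used to produce the reported fragment counts. The function names, the restriction of motifs to A, C, G, T and N, and the assumed cut position of 3 bp upstream of the PAM (as commonly described for Cas9) are all assumptions made for this example. Only fragments bounded by a cut at both ends are counted.

```python
import re

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of a DNA motif (N stays N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def _motif_regex(motif):
    # This sketch supports only A, C, G, T and the wildcard N.
    return "".join("[ACGT]" if base == "N" else base for base in motif)


def cut_positions(genome, specific, pam="NGG", cut_offset=3):
    """Predicted cut coordinates on the forward strand for sites on both strands.

    cut_offset is an assumed distance (in bp) between the cut and the PAM.
    """
    genome = genome.upper()
    site = specific.upper().replace("U", "T") + pam
    cuts = set()
    # Sites on the forward strand: specific sequence immediately 5' of the PAM.
    for m in re.finditer("(?=" + _motif_regex(site) + ")", genome):
        pam_start = m.start() + len(specific)
        cuts.add(pam_start - cut_offset)
    # Sites on the reverse strand appear as the reverse complement.
    for m in re.finditer("(?=" + _motif_regex(revcomp(site)) + ")", genome):
        protospacer_start = m.start() + len(pam)
        cuts.add(protospacer_start + cut_offset)
    return sorted(c for c in cuts if 0 < c < len(genome))


def fragment_lengths(genome, specific, **kwargs):
    """Lengths of fragments that are flanked by a predicted cut on both sides."""
    cuts = cut_positions(genome, specific, **kwargs)
    return [right - left for left, right in zip(cuts, cuts[1:])]


def count_in_window(lengths, lo=200, hi=500):
    """Number of fragments whose length falls within [lo, hi]."""
    return sum(lo <= n <= hi for n in lengths)
```

Running `count_in_window(fragment_lengths(reference, "TTGA"))` for each candidate specific sequence gives a rough basis for comparing candidates before synthesis, where `reference` is a hypothetical string holding one chromosome or contig.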

Abstract

Degenerate CRISPR RNAs and degenerate guide RNAs, and methods of using the degenerate CRISPR RNA and degenerate guide RNAs for digesting double stranded DNA are described. The RNAs and methods can be used for genome analysis.

Description

METHOD FOR GENOME COMPLEXITY REDUCTION AND POLYMORPHISM DETECTION
CROSS-REFERENCE TO RELATED APPLICATIONS
[001] This application claims the benefit of U.S. provisional application Serial No. 62/577,572, filed October 26, 2017, which is incorporated by reference herein.
SEQUENCE LISTING
[002] The sequence listing that is contained in the file named "2018-10-24_519500PCT_SequenceListing_ST25" which is 1 kilobyte as measured in Microsoft Windows operating system and was created on October 24, 2018, is filed electronically herewith and incorporated herein by reference.
FIELD
[003] The present disclosure relates generally to the fields of molecular biology and genetics. More particularly, the present disclosure relates to DNA sequencing and genotyping.
BACKGROUND
[004] Extensive effort has been dedicated to genotyping human, plant, and animal populations to uncover genetic relationships and to identify genes that regulate clinical and agricultural traits, among many other uses. Current methods are costly and rely on large numbers of individuals. Technologies are needed to produce a reduced representation of the genome for sequencing and DNA polymorphism detection.
[005] Direct DNA sequencing of specific, targeted regions of genomes offers an alternative to genotyping by sequencing entire genomes. In this case, the target regions that represent a reduced representation of the genome need to be first obtained, to then be sequenced so that DNA polymorphisms can be detected and genotyped based on the presence of variants in the sequencing reads.
[006] Here we propose an alternative to existing approaches of genome complexity reduction using restriction enzyme (RE) digestion. The approach relies on the digestion of DNA using a class 2 CRISPR-Cas endonuclease, guided by CRISPR RNAs (crRNAs) designed to contain one or more degenerate bases so that any variation in the number of targeted sites can be achieved, and a wide variation in positions targeted can be obtained compared to traditional RE digestion. Products of the digestion can then be prepared for sequencing using traditional methods of DNA library preparation for sequencing. The subject invention provides for the simplification of the process for sequencing a reduced representation of the genome for identification and analysis of genetic polymorphisms.
SUMMARY
[007] The present disclosure provides methods to produce a reduced representation of a genome for sequencing, genotyping and DNA polymorphism detection.
[008] Described are pools or libraries of degenerate CRISPR RNAs (crRNAs) useful in digesting double stranded DNA by RNA-guided DNA endonucleases. Digesting double stranded DNA, such as genomic DNA, in predictable ways reduces representation of a genome and facilitates sequencing, genetic analysis, and DNA polymorphism detection.
[009] The degenerate crRNAs combine a specified sequence region with a degenerate sequence region to provide greater flexibility in RNA-guided DNA endonuclease digestion of double stranded DNA. The degenerate crRNAs comprise single strand RNA oligonucleotides 15-30 nucleobases in length having a 3' specific sequence region 1-20 nucleobases in length and a 5' degenerate sequence region 1-29 nucleobases in length. The degenerate crRNAs are provided in a pool or library of oligonucleotides wherein each individual oligonucleotide in the pool contains the same specific nucleotide sequence (specific sequence) in the specific sequence region. Oligonucleotides in the pool will contain different nucleotide sequences (degenerate sequence) in the degenerate sequence region of the crRNA. Because the degenerate sequence encompasses all, nearly all, or a substantial fraction of the possible nucleotide sequences, the frequency of digestion of the DNA in a sample is determined by the specific sequence present in each individual oligo in the pool of degenerate crRNAs. In some embodiments, the degenerate sequence contains one or more universal bases. Digestion of double stranded DNA with an RNA-guided DNA endonuclease guided by degenerate crRNAs yields DNA fragments based on the frequency of occurrence of a target sequence in the double stranded DNA complementary to the degenerate crRNA specific sequence or the degenerate crRNA specific sequence in combination with a PAM sequence.
[0010] In some embodiments, degenerate crRNAs are linked to tracrRNAs to form degenerate guide RNAs. Using degenerate guide RNAs facilitates digestion with some RNA-guided DNA endonucleases.
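
The relationship between pool design and digestion frequency can be illustrated with a short Python sketch. The length limits are taken from the paragraph above. The function names are hypothetical, and the site-count estimate assumes a uniformly random genome with equal base frequencies, which real genomes do not satisfy, so it is only a first-order approximation.

```python
def check_pool_design(n_degenerate, n_specific):
    """Raise ValueError if the lengths fall outside the ranges stated above."""
    if not 1 <= n_degenerate <= 29:
        raise ValueError("degenerate region must be 1-29 nucleobases")
    if not 1 <= n_specific <= 20:
        raise ValueError("specific region must be 1-20 nucleobases")
    if not 15 <= n_degenerate + n_specific <= 30:
        raise ValueError("total crRNA length must be 15-30 nucleobases")


def pool_size(n_degenerate):
    """Number of distinct sequences if every degenerate site is fully degenerate."""
    return 4 ** n_degenerate


def expected_sites(genome_length, n_specific, pam="NGG"):
    """Expected number of specific-sequence-plus-PAM sites on both strands.

    Assumes independent, equiprobable bases (an idealization).
    """
    fixed_bases = n_specific + sum(base != "N" for base in pam)
    return 2 * genome_length / 4 ** fixed_bases
```

Under this idealized model each additional specific base reduces the expected number of sites fourfold, which is consistent with the statement that the specific sequence, not the degenerate region, sets the frequency of digestion.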
[0011] In some embodiments, adaptors are linked to the DNA fragments. An adaptor is an oligomer that can be ligated to the DNA fragments produced from digestion of the double stranded DNA. Adaptors can be used to facilitate genetic analyses, including but not limited to, amplification and/or sequencing of the DNA fragments using primers that hybridize to an adaptor nucleotide sequence.
[0012] The described degenerate crRNAs, degenerate guide RNAs and adaptors can be used to digest double stranded DNA, such as genomic DNA. Digesting DNA using RNA-guided DNA endonucleases can be used to reduce genome complexity (generate a reduced representation of a genome) and/or analyze genomic DNA. Ligating adaptors to DNA fragments produced by digestion with RNA-guided DNA endonucleases can be used to facilitate genetic analyses. Such analyses include, but are not limited to, amplification and sequencing.
[0013] In some embodiments, a double stranded DNA sample is digested using one or more pools of degenerate crRNAs and one or more RNA-guided DNA endonucleases. In some embodiments, two or more pools of degenerate crRNAs with different specific sequences are used. The DNA fragments are optionally ligated to adaptors. In some embodiments, the DNA fragments, with or without attached adaptors, are amplified and/or sequenced. The described methods are useful in genome and genetic analysis.
[0014] Described are methods for producing a reduced representation of a genome, comprising the steps of digesting the genome with a class 2 CRISPR-Cas endonuclease guided by a pool of degenerate crRNAs to form a first nucleic acid product. In some embodiments, following digestion, a first adaptor is ligated to a first end of the first nucleic acid product and a second adaptor is ligated to a second end of the first nucleic acid product to produce a second nucleic acid product. The second nucleic acid product is then amplified. Any method in the art for DNA amplification can be used. In some embodiments, amplifying comprises PCR. In some embodiments, amplifying the second nucleic acid product by PCR comprises using a first PCR primer that hybridizes to the first adaptor and a second PCR primer that hybridizes to the second adaptor to produce a third nucleic acid product. The third nucleic acid product represents a reduced representation of the genome. In certain embodiments the class 2 CRISPR-Cas endonuclease is a type V-A class 2 CRISPR-Cas endonuclease, for example Cas12a.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] FIG. 1. Exemplary crRNA for a class 2 CRISPR-Cas endonuclease type V-A having a defined nucleotide sequence. SEQ ID NO:1.
[0016] FIG. 2. A. Guide RNA. B. Formula for a pool of degenerate guide RNAs, wherein N1-29 represents 1-29 degenerate nucleotides, X1-20 represents a specific sequence of 1-20 nucleotides, linker represents a linker, and X'n-m represents a tracrRNA 70-89 nucleotides in length. C. Exemplary guide RNA for a class 2 CRISPR-Cas endonuclease type V-B having a defined sequence; crRNA (underlined) and tracrRNA of Cas9 from S. pyogenes. SEQ ID NO:2.
[0017] FIG. 3. A. Formula for a pool of 20mer degenerate crRNAs having degenerate sequences 15 nucleobases in length and specific sequences 5 nucleobases in length, wherein N represents a degenerate nucleotide and X represents a specific nucleotide. B. Examples of four different pools of 20mer degenerate crRNAs having degenerate sequences 15 nucleobases in length and specific sequence AGCCA, AAACU, AUUGA, and CGAAA. N represents a degenerate nucleotide and NNNNNNNNNNNNNNN comprises a degenerate sequence.
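
A pool formula of the kind shown in FIG. 3 can be expanded or queried programmatically. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes that every N is a fully degenerate site drawn from A, C, G and U. Because a 15-nucleotide degenerate region corresponds to 4^15 sequences, the generator is lazy and a membership test is provided for checking individual sequences.

```python
from itertools import product


def pool_members(formula):
    """Lazily yield every RNA sequence described by a formula such as 'NNNAGCCA'."""
    choices = ["ACGU" if base == "N" else base for base in formula.upper()]
    for combination in product(*choices):
        yield "".join(combination)


def in_pool(sequence, formula):
    """True if the sequence matches the formula at every specific position."""
    sequence, formula = sequence.upper(), formula.upper()
    if len(sequence) != len(formula):
        return False
    return all(f == "N" or f == s for f, s in zip(formula, sequence))
```

For example, `in_pool(candidate, "N" * 15 + "AGCCA")` tests whether a hypothetical 20mer `candidate` belongs to the first pool listed in panel B, without enumerating the pool.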
[0018] FIG. 4. A. Generic formula for a pool of degenerate crRNAs having degenerate sequence Nn and specific sequence Xx, wherein Nn represents a degenerate sequence n nucleotides in length, Xx represents a specific sequence x nucleotides in length, n is an integer from 1 to 20, x is an integer from 1 to 29, and the value of n+x is 15-30. B. Formulas for pools of degenerate crRNAs 20 nucleobases in length having specific sequences 1-10 nucleobases in length wherein X1, X2, X3, X4, X5, X6, X7, X8, X9, and X10 are each independently specific nucleobases, and N represents a degenerate nucleotide. C. Formulas for pools of 20mer degenerate crRNAs having specific sequences A, CU, UGA, AAAA, and GACUC. N represents a degenerate nucleotide.
[0019] FIG. 5. General overview of the reaction. (A) Two positions in the genome are recognized as complementary to guide RNAs, and digested by a class 2 CRISPR-Cas endonuclease. The degenerate region of the guide RNA is represented by a dotted line, followed by the specific sequence (thick continuous line), and a thin continuous line (tracrRNA, in case of digestion by an endonuclease of Type II). The specific sequence and the presence of the PAM sequence determine the position of the digestion. (B) After digestion, fragments in the genome that contain the specific sequence at both ends are generated. (C) Next, a ligase reaction adds adaptor sequences to each digested fragment. (D) These fragments can then be amplified by PCR or another amplification method.
[0020] FIG. 6. A. Exemplary target double stranded DNA sequence strand showing the crRNA sequence (boxed) and PAM sequence (Bold). N represents a degenerate nucleotide (each N is independent of the others), N* represents a base complementary to the corresponding N, X1, X2, X3, and X4 represent specific nucleotides, X1*, X2*, X3*, and X4* represent bases complementary to X1, X2, X3, and X4, respectively, PAM represents a PAM sequence, and P*A*M* represents bases complementary to PAM. B. Exemplary target double stranded DNA sequence (showing the DNA strand having the same sequence as the crRNA) having a TTGA specific sequence immediately upstream of an NGG PAM sequence.
[0021] FIG. 7. Exemplary DNA segment to be searched in the genome for prediction of digestion guided by a pool of crRNAs having a TTGA specific sequence and for which the RNA-guided DNA endonuclease requires an NGG PAM sequence.
[0022] FIG. 8. Histograms displaying the number of 3 (left panel), 4 (middle panel) and 5 (right panel) base combinations in the specific sequence (Y-axis) that will generate a range of number of fragments between 200-500 nucleotides after digestion (X-axis).
Description of the Sequences
[0023] SEQ ID NO:1 - Is an example of a crRNA sequence for class 2 CRISPR-Cas endonuclease type V-A.
[0024] SEQ ID NO:2 - Is an example of a crRNA and tracrRNA sequence for class 2 CRISPR-Cas endonuclease type V-B.
DEFINITIONS
[0025] Before describing the present teachings in detail, it is to be understood that the disclosure is not limited to specific compositions or process steps, as such may vary. It should be noted that, as used in this specification and the appended claims, the singular forms "a," "an," and "the" include plural references unless the context clearly dictates otherwise. Thus, for example, reference to "an oligomer" includes a plurality of oligomers and the like. The conjunction "or" is to be interpreted in the inclusive sense, i.e., as equivalent to "and/or," unless the inclusive sense would be unreasonable in the context.
[0026] As used herein in the specification, "a" or "an" may mean one or more. As used herein in the claim(s), when used in conjunction with the word "comprising," the words "a" or "an" may mean one or more than one.
[0027] The use of the term "or" in the claims is used to mean "and/or" unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and "and/or." As used herein "another" may mean at least a second or more.
[0028] In general, the term "about" indicates insubstantial variation in a quantity of a component of a composition not having any significant effect on the activity or stability of the composition. As used herein, the term "about," when referring to the length of an oligonucleotide, is meant to encompass lengths that are within 1 or 2 nucleotides of the stated length. As used herein, the term "about," when referring to a value or to an amount of mass, weight, time, volume, concentration or percentage, is meant to encompass variations of in some embodiments ±20%, in some embodiments ±10%, in some embodiments ±5%, in some embodiments ±1%, in some embodiments ±0.5%, and in some embodiments ±0.1% from the specified amount, as such variations are appropriate to perform the disclosed methods.
[0029] All ranges are to be interpreted as encompassing the endpoints in the absence of express exclusions such as "not including the endpoints"; thus, for example, "within 10-15" includes the values 10 and 15. Also, the use of "comprise," "comprises," "comprising," "contain," "contains," "containing," "include," "includes," "including," "has," and "having" are not intended to be limiting. It is to be understood that both the foregoing general description and detailed description are exemplary and explanatory only and are not restrictive of the teachings. To the extent that any material incorporated by reference is inconsistent with the express content of this disclosure, the express content controls. Unless specifically noted, embodiments in the specification that recite "comprising" various components are also contemplated as "consisting of" or "consisting essentially of" the recited components.
[0030] "Nucleic acid," "polynucleotide," and "oligonucleotide" refer to a multimeric compound comprising nucleosides or nucleoside analogs which have nitrogenous heterocyclic bases or base analogs linked together to form a polynucleotide, including conventional RNA, DNA, mixed RNA-DNA, and polymers that are analogs thereof. A nucleic acid "backbone" may be made up of a variety of linkages, including one or more of sugar-phosphodiester linkages, peptide-nucleic acid bonds, phosphorothioate linkages, methylphosphonate linkages, or combinations thereof. Sugar moieties of a nucleic acid may be ribose, deoxyribose, or similar compounds with substitutions, e.g., 2' methoxy or 2' halide substitutions. Nitrogenous bases may be conventional bases (A, G, C, T, U), analogs thereof, or derivatives of purines or pyrimidines. Nucleic acids may include modified bases to alter the function or behavior of the nucleic acid, e.g., addition of a 3'-terminal dideoxynucleotide to block additional nucleotides from being added to the nucleic acid. Embodiments of oligomers that may affect stability of a hybridization complex include PNA oligomers, oligomers that include 2'-methoxy or 2'-fluoro substituted RNA, or oligomers that affect the overall charge, charge density, or steric associations of a hybridization complex, including oligomers that contain charged linkages (e.g., phosphorothioates) or neutral groups (e.g., methylphosphonates).
[0031] As used herein, the term "DNA" or "DNA molecule" refers to a double-stranded DNA molecule of genomic or synthetic origin, i.e., a polymer of deoxyribonucleotide bases or a polynucleotide molecule, read from the 5' (upstream) end to the 3' (downstream) end. As used herein, the term "DNA sequence" refers to the nucleotide sequence of a DNA molecule.
[0032] As used herein, the term "isolated DNA molecule" refers to a DNA molecule at least partially separated from other molecules normally associated with it in its native or natural state. In one embodiment, the term "isolated" refers to a DNA molecule that is at least partially separated from some of the nucleic acids which normally flank the DNA molecule in its native or natural state. Thus, DNA molecules fused to regulatory or coding sequences with which they are not normally associated, for example as the result of recombinant techniques, are considered isolated herein. Such molecules are considered isolated when integrated into the chromosome of a host cell or present in a nucleic acid solution with other DNA molecules, in that they are not in their native state.
[0033] As used herein, the term "RNA" refers to a molecule comprising at least one ribonucleotide residue. By "ribonucleotide" is meant a nucleotide with a hydroxyl group at the 2' position of a β-D-ribofuranose moiety. The terms encompass double stranded RNA, single stranded RNA, RNAs with both double stranded and single stranded regions, isolated RNA such as partially purified RNA, essentially pure RNA, synthetic RNA, recombinantly produced RNA, as well as altered RNA, or analog RNA, that differs from naturally occurring RNA by the addition, deletion, substitution, and/or alteration of one or more nucleotides. Such alterations can include addition of non-nucleotide material, such as to the end(s) of an RNA molecule or internally, for example at one or more nucleotides of the RNA. Nucleotides in the RNA molecules of the presently disclosed subject matter can also comprise non-standard nucleotides, such as non-naturally occurring nucleotides or chemically synthesized nucleotides or deoxynucleotides. These altered RNAs can be referred to as analogs or analogs of a naturally occurring RNA.
[0034] As used herein, the phrase "double stranded RNA" refers to an RNA molecule at least a part of which is in Watson-Crick base pairing forming a duplex. As such, the term is to be understood to encompass an RNA molecule that is either fully or only partially double stranded. Exemplary double stranded RNAs include, but are not limited to, molecules comprising at least two distinct RNA strands that are either partially or fully duplexed by intermolecular hybridization. Additionally, the term is intended to include a single RNA molecule that by intramolecular hybridization can form a double stranded region (for example, a hairpin). Thus, as used herein the phrases "intermolecular hybridization" and "intramolecular hybridization" refer to double stranded molecules for which the nucleotides involved in the duplex formation are present on different molecules or the same molecule, respectively.
[0035] A "primer" refers to an oligomer that hybridizes to a template nucleic acid and has a 3' end that is extended by polymerization. A primer may optionally be modified by attachment or inclusion of: a 5' region that is non-complementary to the target sequence, a tag, a promoter, a detectable label, or a sequence or molecule used or useful for manipulation, amplification, purification, or detection. A "primer" is typically a highly purified, isolated polynucleotide that is designed for use in specific annealing or hybridization methods that involve thermal amplification. A pair of primers may be used with template DNA, such as a sample of genomic DNA, in a thermal amplification, such as polymerase chain reaction (PCR), to produce an amplicon, where the amplicon produced from such reaction would have a DNA sequence corresponding to the sequence of the template DNA located between the two sites where the primers hybridized to the template. As used herein, an "amplicon" is a piece or fragment of DNA that has been synthesized using amplification techniques. In order for a nucleic acid molecule to serve as a primer it need only be sufficiently complementary in sequence to be able to form a stable double-stranded structure under the particular solvent and salt concentrations employed.
[0036] By "hybridization" or "hybridize" is meant the ability of two completely or partially complementary nucleic acid strands to come together under specified hybridization assay conditions in a parallel or antiparallel orientation to form a stable structure having a double-stranded region. The two constituent strands of this double-stranded structure, sometimes called a hybrid, are held together by hydrogen bonds. These hydrogen bonds most commonly form between nucleotides containing the bases adenine and thymine or uracil (A and T or U) or cytosine and guanine (C and G) on single nucleic acid strands. A crRNA can hybridize to its target nucleic acid to form a crRNA:target hybrid, but does not form stable crRNA:non-target hybrids. A crRNA hybridizes to target nucleic acid to a sufficiently greater extent than to non-target nucleic acid to guide target-specific digestion of the DNA by the RNA-guided DNA endonuclease.
[0037] As used herein, the term "single nucleotide polymorphism," or "SNP" for short, refers to a single nucleotide position in a genomic sequence for which two or more alternative alleles are present at appreciable frequency in a population.
[0038] As used herein, a target sequence is a nucleotide sequence in a double stranded DNA to which a crRNA or degenerate crRNA can hybridize. Because the DNA is double stranded, a target sequence will contain one strand that contains a sequence complementary to, or that hybridizes to, a (degenerate) crRNA and one strand that contains a sequence that is the same as the (degenerate) crRNA nucleotide sequence.
DETAILED DESCRIPTION
1. Overview
[0039] Described are methods and materials to produce a reduced representation of a genome for sequencing and DNA polymorphism detection. The methods use RNA-guided DNA endonucleases guided by pools or libraries of degenerate crRNA oligonucleotides to digest double stranded DNA. The digested DNA can then be modified by adaptor ligation for sequencing, polymorphism detection, or other genetic analysis.
[0040] The methods of the present disclosure rely on the digestion of DNA using a class 2 CRISPR-Cas endonuclease, guided by CRISPR RNA (crRNA) designed to contain one or more degenerate bases so that any variation in the number of targeted sites can be achieved, and a wide variation in positions targeted can be obtained compared to traditional restriction enzyme digestion. Products of the digestion can then be prepared for sequencing using traditional methods of DNA library preparation for sequencing. The present disclosure provides for the simplification of the process for sequencing a reduced representation of the genome for identification and analysis of genetic polymorphisms.
[0041] In some embodiments, methods for reduced representation of a genome comprise: contacting a sample comprising genomic DNA with a described pool of degenerate crRNAs and an RNA-guided DNA endonuclease enzyme(s). The degenerate crRNAs guide cleavage of the double stranded DNA by the RNA-guided DNA endonuclease enzyme. Cleavage of the genomic DNA by the RNA-guided DNA endonuclease results in DNA fragments having the specific sequence at or near one or both ends. The number and average length of the genomic DNA fragments is determined by the specific sequence of the degenerate crRNAs.
[0042] In some embodiments, the digested double stranded DNA is modified by ligation of adaptor sequences. Any method known in the art for performing adaptor ligation can be used with the described compounds, compositions, kits, and methods.
[0043] In some embodiments, the adaptor-modified DNA fragments are subjected to sequence analysis, polymorphism detection, and/or other genetic analysis. Any method known in the art for performing sequencing and DNA polymorphism detection can be used with the described compounds, compositions, kits, and methods.
[0044] In some embodiments, the adaptor-modified DNA fragments are subjected to nucleic acid amplification. Amplification can be performed using primers that hybridize to the adaptor sequences. Any method known in the art for amplifying the adaptor-modified fragments can be used with the described compounds, compositions, kits, and methods. In some embodiments, the amplified sequence is then subjected to sequence analysis or other genetic analysis.
2. Degenerate crRNAs
[0045] A pool or library of degenerate crRNAs comprises single strand RNA oligonucleotides 15-30 nucleobases in length, each containing a specific sequence region and a degenerate sequence region. Each crRNA in the pool or library has the same specific sequence in the specific sequence region. However, individual crRNAs in the pool or library can have different sequences in the degenerate sequence region. The degenerate sequence region is located 5' of the specific sequence region. A degenerate crRNA comprises, starting from the 3' end: (a) a specific sequence region having a specific nucleotide sequence wherein each position is independently a G, A, C, or U ribonucleotide or a modified G, A, C, or U, and (b) a degenerate sequence region wherein different sequences are present in individual oligonucleotides in the pool of degenerate crRNAs. The specific sequence region can be from 1-20 nucleobases in length and has a defined or specific nucleotide sequence, i.e., in a pool of degenerate crRNAs, each individual oligonucleotide contains the same nucleotide sequence in the specific sequence region. The degenerate sequence can be 1-29 nucleobases in length, wherein, within the pool of degenerate crRNAs, each nucleotide position can independently have, or be represented by, multiple possible alternatives. Because individual oligonucleotides in the pool of degenerate crRNAs contain different sequences, the degenerate crRNAs can bind to multiple targets in a double stranded DNA sample. In some embodiments, a pool of degenerate crRNAs comprises sequences representing a mixture of all the possible oligonucleotides of length n, wherein n is the length in nucleobases of the degenerate sequence region. In some embodiments, a pool of degenerate crRNAs comprises sequences representing a mixture of nearly all the possible oligonucleotides of length n, wherein n is the length in nucleobases of the degenerate sequence region. 
In some embodiments, a pool of degenerate crRNAs comprises sequences representing a mixture of a substantial fraction of the possible oligonucleotides of length n, wherein n is the length in nucleobases of the degenerate sequence region. In some embodiments, a pool of degenerate crRNAs comprises sequences representing <1% to 100% of the possible oligonucleotides of length n, wherein n is the length in nucleobases of the degenerate sequence region. In some embodiments, a pool of degenerate crRNAs comprises sequences representing about 1%, about 2%, about 3%, about 4%, about 5%, about 6%, about 7%, about 8%, about 9%, about 10%, about 15%, about 20%, about 25%, about 30%, about 40%, about 50%, about 60%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100% of the possible oligonucleotides of length n, wherein n is the length in nucleobases of the degenerate sequence region. The degenerate crRNAs guide cleavage or nicking of the double stranded DNA by the RNA-guided DNA endonuclease at or near the target sequence.
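The size of a fully degenerate pool follows directly from the length n of the degenerate sequence region: with four ribonucleotides per position there are 4^n possible sequences. The following Python sketch is illustrative only and is not part of the disclosed methods; the function names `pool_size` and `enumerate_pool` are hypothetical, and it assumes every degenerate position draws from G, A, C, and U with no universal bases.

```python
from itertools import product

RNA_BASES = "GACU"  # assumed alphabet for a fully degenerate position


def pool_size(n_degenerate: int) -> int:
    """Number of distinct sequences in a fully degenerate region of length n."""
    return len(RNA_BASES) ** n_degenerate


def enumerate_pool(n_degenerate: int, specific: str):
    """Yield each crRNA written 5'->3': degenerate region, then specific sequence."""
    for combo in product(RNA_BASES, repeat=n_degenerate):
        yield "".join(combo) + specific


# A 20mer pool with a 5-nt specific sequence has a 15-nt degenerate region.
print(pool_size(15))  # 4**15 = 1073741824

# Enumeration is only practical for short degenerate regions.
small_pool = list(enumerate_pool(2, "AGCCA"))
print(len(small_pool), small_pool[:3])
```

Because the count grows as 4^n, a pool representing only a fraction of the possible sequences can still contain a very large number of distinct oligonucleotides.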
[0046] In some embodiments, the degenerate crRNAs are 15-30 nucleobases in length. In some embodiments, the degenerate crRNAs in a pool of degenerate crRNAs are each about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, or about 30 nucleobases in length or more. In some embodiments, the degenerate crRNAs in a pool of degenerate crRNAs are each about 19-21 nucleobases in length. In some embodiments, the degenerate crRNAs in a pool of degenerate crRNAs are each about 19 nucleobases in length. In some embodiments, the degenerate crRNAs in a pool of degenerate crRNAs are each about 20 nucleobases in length. In some embodiments, the degenerate crRNAs in a pool of degenerate crRNAs are each about 21 nucleobases in length.
[0047] In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides. In some embodiments the specific sequence is 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9 or about 10 nucleotides in length, or greater, for example about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, or about 20 nucleotides in length or more.
[0048] In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA comprises a specific sequence region 1 to 10 nucleobases or more in length and a degenerate sequence region 9-20 nucleobases or more in length (see FIG. 3).
[0049] In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a degenerate sequence region of about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, or about 20 nucleotides or more in length. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a degenerate sequence region of 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides, or sufficient nucleotides to provide a degenerate crRNA of about 19-21 nucleotides in length. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 1 nucleotide and a degenerate sequence region of 18-20 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 2 nucleotides and a degenerate sequence region of 17, 18, or 19 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 3 nucleotides and a degenerate sequence region of 16, 17, or 18 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 4 nucleotides and a degenerate sequence region of 15, 16, or 17 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 5 nucleotides and a degenerate sequence region of 14, 15, or 16 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 6 nucleotides and a degenerate sequence region of 13, 14, or 15 nucleotides. 
In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 7 nucleotides and a degenerate sequence region of 12, 13, or 14 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 8 nucleotides and a degenerate sequence region of 11, 12, or 13 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 9 nucleotides and a degenerate sequence region of 10, 11, or 12 nucleotides. In some embodiments, in a pool of degenerate crRNAs, each degenerate crRNA contains a specific sequence region of 10 nucleotides and a degenerate sequence region of 9, 10, or 11 nucleotides.
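The length pairings listed above all follow one rule: the specific and degenerate regions together give a crRNA of about 19-21 nucleotides. A minimal Python sketch of that arithmetic is shown here for illustration; the function name `degenerate_lengths` and the default total range are assumptions chosen to match the 19-21 nucleotide embodiments.

```python
def degenerate_lengths(specific_len: int, total_range=(19, 21)):
    """Degenerate-region lengths that give a crRNA within the total length range."""
    lo, hi = total_range
    return [total - specific_len for total in range(lo, hi + 1)]


# Reproduce the pairings for specific sequence regions of 1 to 10 nucleotides.
for k in range(1, 11):
    print(f"specific={k:2d} nt -> degenerate={degenerate_lengths(k)} nt")
```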
[0050] As used herein, a specific sequence region comprises a nucleotide sequence wherein each position has a defined nucleobase selected from guanine (G), adenine (A), cytosine (C), and thymine (T) or uracil (U), and a modified G, A, C, or T/U. Within the specific sequence, a modified G, A, C, T, or U forms a base pair with, hybridizes to, or is complementary to, C, T, G, and A respectively.
[0051] Within a pool of degenerate crRNAs, each position in the degenerate sequence region can have multiple possible nucleobase alternatives. In other words, for a given position, some of the oligonucleotides can have a G, some can have an A, some can have a C, and some can have a T or U. In some embodiments, each position in a degenerate sequence in a pool of degenerate crRNAs can be G, A, C, or U with equal probability. In some embodiments, any given position in the degenerate sequence in a pool of degenerate crRNAs can be G, A, C, or U with equal probability. In some embodiments, each position in the degenerate sequence in a pool of degenerate crRNAs may independently contain one or more universal bases in lieu of or in addition to one or more of G, A, C, and/or U. In some embodiments, any given position in the degenerate sequence in a pool of degenerate crRNAs may independently contain one or more universal bases in lieu of or in addition to one or more of G, A, C, and/or U. In some embodiments, any given position in the degenerate sequence in a pool of degenerate crRNAs can independently be at least two of G, A, C, U, and universal base. The probability of each base at a degenerate position in a pool of degenerate crRNAs can independently be equal or statistical. A statistical probability means the probability of each nucleotide being present at a given degenerate position in a pool of degenerate crRNAs is equal to a mole fraction. The mole fraction is determined during synthesis of the pool of degenerate crRNAs. In some embodiments, a fully degenerate site is formed in a pool of oligonucleotides by using an equimolar mix of G, A, C, and U at a given site during synthesis of the oligonucleotide. In some embodiments, a pyrimidine universal base can be used that hybridizes to A or G. In some embodiments, a purine universal base can be used that hybridizes to C or U/T. 
In some embodiments a fully degenerate site can be a pyrimidine universal base or purine universal base with equal probability. In some embodiments, every position in the degenerate sequence in a pool of degenerate crRNAs is degenerate. In some embodiments, a degenerate sequence in a pool of degenerate crRNAs can have 0-5 specific sites wherein the specific sites independently contain a defined nucleotide selected from G, A, C, or T/U. In some embodiments, a degenerate sequence in a pool of degenerate crRNAs can have 0-10 sites wherein the site is degenerate for two or three nucleotides, such as, but not limited to, G + C, G + T, G + A, C + T, C + A, A + T, G + C + T, G + A + T, G + C + A and C + A + T. For sites that are degenerate for two or three nucleotides, the two or three nucleotides can be present with equal probability. Degenerate crRNAs can form pools or libraries of crRNAs representing multiple sequences each containing a common specific sequence. Because the degenerate sequence is degenerate, the plurality of crRNAs in a pool of degenerate crRNAs can hybridize to a plurality of double stranded DNA target sequences.
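Sites that are degenerate for two, three, or four nucleotides can be written compactly with the standard IUPAC ambiguity codes (for example S for G + C, or N for any base). The Python sketch below is an illustration under stated assumptions, not the disclosed synthesis procedure: it writes U in place of T because the oligonucleotides are RNA, it treats every alternative at a site as equally probable (the equal-probability case described above, not the mole-fraction case), and the names `IUPAC_RNA`, `expand`, and `base_probabilities` are hypothetical.

```python
from itertools import product

# Standard IUPAC ambiguity codes, written with U for RNA.
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "S": "GC", "K": "GU", "R": "GA", "Y": "CU", "M": "CA", "W": "AU",
    "B": "GCU", "D": "GAU", "V": "GCA", "H": "CAU",
    "N": "GACU",
}


def expand(pattern: str):
    """List every concrete sequence encoded by a pattern of IUPAC codes."""
    return ["".join(p) for p in product(*(IUPAC_RNA[c] for c in pattern))]


def base_probabilities(code: str):
    """Per-base probability at one site, assuming an equimolar mix."""
    choices = IUPAC_RNA[code]
    return {base: 1 / len(choices) for base in choices}


# Two degenerate sites (N, then S = G + C) followed by a 4-nt specific sequence.
members = expand("NS" + "UUGA")
print(len(members))             # 4 * 2 = 8 sequences
print(base_probabilities("S"))  # G and C at 0.5 each
```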
[0052] As used herein, a universal base is able to form a base pair with (hybridize to) two or more natural bases (i.e., G, A, C, and U or T). In some embodiments, a universal base is able to form a base pair with three or more of G, A, C, and U or T. In some embodiments, a universal base is able to form a base pair with each of G, A, C and U or T. An ideal universal base is able to hybridize non-selectively (or non-discriminately) to each of the native bases. In some embodiments, a universal base can replace any of the four natural bases without significantly affecting either melting behavior of duplexes or the normal activities of the modified oligonucleotide. However, many universal bases have at least some preference for hybridizing to one or more of the natural bases. Universal bases include, but are not limited to:
3-nitropyrrole, 5-nitroindole, 4-nitroindole and 6-nitroindole, nitroimidazole, 4-nitropyrazole, 4-nitrobenzimidazole, 4-aminobenzimidazole, C-ribonucleoside, 2'-deoxyribosyl nucleoside, 5-nitroindazole, hypoxanthine, imidazole 4,5-dicarboxamide, benzimidazole, 5-fluoroindole, indole, isocarbostyril nucleoside derivatives, and pyrrolopyrimidine nucleoside.
[0053] Sequences in the double stranded DNA sequence that are complementary to a crRNA sequence are potential substrates for cleavage by the crRNA-RNA-guided DNA endonuclease complexes. Selection of the degenerate crRNA specific sequence affects the sizes and numbers of resultant DNA fragments. Because a pool of degenerate crRNAs contains oligonucleotides having multiple sequences in the degenerate sequence regions, the level of genome reduction is determined by the number of bases and the nucleotide sequence of the specific sequence. The remaining sequences of the pool of degenerate crRNAs are composed of degenerate (N) bases, up to the total length of the degenerate crRNA. Thus, the specific sequence defines the position(s) in the genome from where digestion by the RNA-guided DNA endonuclease will occur. The pool of degenerate crRNAs can hybridize to multiple double stranded DNA target sequences containing a sequence complementary to the specific sequence. Generally, use of longer specific sequences will yield fewer fragments, and therefore greater reduction in genome complexity, than shorter specific sequences. For example, selection of a specific sequence with 1 nucleotide typically yields a larger number of fragments than a specific sequence with 10 nt, which is likely to anneal to the genome less frequently. Different pools of degenerate crRNAs having different specific sequences will hybridize to different target sequences in a DNA sample.
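The inverse relationship between specific sequence length and number of sites can be illustrated with a simple back-of-the-envelope model. The Python sketch below assumes an idealized genome in which all four bases are equally likely and independent at every position, which real genomes are not; the function name `expected_sites` and the genome size used are hypothetical, and actual counts should come from a reference genome where one is available.

```python
def expected_sites(genome_length: int, fixed_bases: int,
                   both_strands: bool = True) -> float:
    """Expected occurrences of a fixed motif in a uniformly random genome.

    fixed_bases counts every defined position, i.e. the specific sequence
    plus any defined PAM positions (2 for NGG, since N matches any base).
    """
    windows = max(genome_length - fixed_bases + 1, 0)
    per_strand = windows / 4 ** fixed_bases
    return 2 * per_strand if both_strands else per_strand


genome_length = 500_000_000  # hypothetical genome size in base pairs
for k in (1, 3, 5, 10):
    print(f"{k:2d}-nt specific sequence: ~{expected_sites(genome_length, k):,.0f} sites")
```

Each additional defined base reduces the expected number of sites about fourfold under this model, which is why longer specific sequences give a greater reduction in genome complexity.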
[0054] Selection of the specific sequence depends on the number of fragments, frequency of digestion, fragment size, distribution, and/or location that one desires to sample in the genome. In some embodiments, the number of bases and the nucleotide sequence to be used in the specific sequence is determined by the bioinformatics analysis of the genome of the species being analyzed by identifying the number of instances that a specific sequence occurs in the reference genome.
[0055] Some RNA-guided DNA endonucleases require a protospacer adjacent motif (PAM) sequence that is adjacent to the crRNA sequence. The PAM sequence is present in the double stranded DNA target sequence but not in the guide RNA sequence. The PAM sequence varies by the species of RNA-guided DNA endonuclease. For example, Class 2 CRISPR-Cas type II endonuclease derived from S. pyogenes utilizes an NGG PAM sequence located on the immediate 3' end of the guide RNA recognition sequence. Other PAM sequences include, but are not limited to, NGG (Streptococcus pyogenes), NNNNGATT (Neisseria meningitidis), NNAGAA (Streptococcus thermophilus), and NAAAAC (Treponema denticola).
[0056] For RNA-guided DNA endonucleases not requiring PAM sequences, the genomic sequence to be searched for frequency of digestion is the degenerate crRNA specific sequence. For RNA-guided DNA endonucleases requiring PAM sequences, the PAM sequence is also taken into consideration when estimating the number of positions in the genome that can be digested. The genomic sequence to be searched for frequency and location is the degenerate crRNA specific sequence plus the PAM sequence. For example, for digestion by a class 2 CRISPR-Cas endonuclease of Type II (e.g. Cas9), with a crRNA containing the specific sequence 5' TTGA 3', the target genomic sequence will be as described in FIG. 6 and 7 (5' crRNA+PAM 3').
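The search described above, for the specific sequence followed immediately by the PAM, can be sketched with a regular expression. This Python example is illustrative only: the function name `find_sites` and the toy sequence are hypothetical, the genome is assumed to be an uppercase or lowercase A/C/G/T string without ambiguity codes, and the reported position is the start of the motif rather than the exact cleavage position.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(genome: str, specific: str = "TTGA", pam: str = "NGG"):
    """Return sorted (position, strand) pairs for specific+PAM motifs.

    position is the 0-based leftmost forward-strand coordinate of the motif.
    A lookahead is used so overlapping motifs are all reported.
    """
    genome = genome.upper()
    pam_regex = pam.replace("N", "[ACGT]")
    pattern = re.compile(f"(?={specific}{pam_regex})")
    motif_len = len(specific) + len(pam)
    hits = [(m.start(), "+") for m in pattern.finditer(genome)]
    for m in pattern.finditer(reverse_complement(genome)):
        # Map the reverse-strand match back to forward-strand coordinates.
        hits.append((len(genome) - m.start() - motif_len, "-"))
    return sorted(hits)


toy = "AAATTGACGGTTT"  # made-up sequence containing TTGA followed by CGG
print(find_sites(toy))
print(find_sites(reverse_complement(toy)))
```

For an endonuclease that does not require a PAM, passing an empty `pam` string reduces the search to the specific sequence alone.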
[0057] Genomic nucleotide sequences are not completely random. Some sequences are likely to result in the digestion of DNA in positions that are more consistently distributed in the genome than others. Therefore, if a reference genome is available for the organism of interest, this distribution can be estimated to guide the selection of the most appropriate specific sequence.
[0058] In some embodiments, two or more pools of degenerate crRNAs that differ in their specific sequence may be used in combination in a single digestion reaction. Using two or more pools of degenerate crRNAs that differ in their specific sequence in a single digestion reaction provides more flexibility in the regions of the genome that will be digested/amplified.
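Once predicted digestion positions are available for one or more pools, the resulting fragment sizes can be estimated by taking differences between neighboring positions. The Python sketch below is a simplified illustration: the cut positions are made-up numbers, the function names are hypothetical, complete digestion at every predicted site is assumed, and only fragments bounded by a cut at both ends are counted.

```python
def fragment_lengths(*pools):
    """Lengths of fragments bounded by cuts at both ends, pooling all cut sites."""
    cuts = sorted(set().union(*pools))
    return [right - left for left, right in zip(cuts, cuts[1:])]


def count_in_window(lengths, lo=200, hi=500):
    """Number of fragments whose length falls within [lo, hi] nucleotides."""
    return sum(lo <= n <= hi for n in lengths)


# Hypothetical predicted cut positions for two pools with different specific sequences.
pool_a = [100, 400, 1500]
pool_b = [650, 900]

alone = fragment_lengths(pool_a)
combined = fragment_lengths(pool_a, pool_b)
print(alone, count_in_window(alone))
print(combined, count_in_window(combined))
```

In this toy case the second pool splits a long fragment into shorter ones, which increases the number of fragments in the 200-500 nucleotide window.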
3. Degenerate guide RNA
[0059] In additional embodiments, a degenerate crRNA further comprises a trans-acting CRISPR RNA (tracrRNA) operably linked to the degenerate crRNA by a linker to form a degenerate guide RNA. Also described are pools of degenerate guide RNAs wherein the degenerate crRNAs are linked to tracrRNAs. The tracrRNA is linked to the 3' end of the degenerate crRNA. A degenerate guide RNA comprises both degenerate crRNA and tracrRNA sequences.
[0060] In some embodiments, a degenerate guide RNA is about 50-260 nucleotides long. In some embodiments, the degenerate guide RNA is about 90-120 nucleotides long. In some embodiments, a degenerate guide RNA is at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 120, at least 150, at least 180, at least 200, at least 220, at least 240, or at least 260 nucleotides long.
[0061] In some embodiments, the linker is a short nucleotide sequence. In some embodiments, the short nucleotide sequence is 1-10 nucleotides in length. In some embodiments the linker region is about 4 nucleotides in length, although in other embodiments shorter or longer linker regions can be used, including linker regions of one to about 3 nucleotides in length, or linker regions of about 5, about 6, about 7, about 8, about 9, or about 10 or more nucleotides in length. In some embodiments, the linker is a non-nucleotide linker.
[0062] In some embodiments, degenerate guide RNAs are used in combination with digestion by a class 2 CRISPR-Cas endonuclease of Type V-A. A class 2 CRISPR-Cas endonuclease of Type V-A can be, but is not limited to, Cas12a (also known as Cpf1).
[0063] In some embodiments, degenerate crRNAs connected to trans-acting CRISPR RNAs (tracrRNAs) by linkers are used in combination with digestion by a class 2 CRISPR-Cas endonuclease of Type II. A class 2 CRISPR-Cas endonuclease of Type II can be, but is not limited to, Cas9.
[0064] In some embodiments, degenerate crRNAs connected to trans-acting CRISPR RNAs (tracrRNAs) by linkers are used in combination with digestion by a class 2 CRISPR-Cas endonuclease of Type V-B. A class 2 CRISPR-Cas endonuclease of Type V-B can be, but is not limited to, Cas12b (also known as C2c1).
[0065] An exemplary crRNA is shown in FIG. 1. An exemplary crRNA connected to a trans-acting CRISPR RNA (tracrRNA) by a nucleotide linker is shown in FIG. 2.
[0066] Degenerate crRNAs or degenerate guide RNAs are typically ribonucleic acids. However, degenerate crRNAs or degenerate guide RNAs may contain one or more modified nucleotides. The modifications may independently be base modifications or backbone modifications. Any of the described degenerate crRNAs and/or degenerate guide RNAs can contain at least one modified nucleotide. In some embodiments, a degenerate crRNA and/or degenerate guide RNA comprises two or more modified nucleotides. The two or more modified nucleotides may have the same or different modifications. In some degenerate crRNAs and/or degenerate guide RNAs, incorporation of one or more modified nucleotides is used to increase the stability of the duplex by raising the Tm relative to the corresponding degenerate crRNAs and/or degenerate guide RNAs without modified nucleotides.
[0067] In some embodiments, a degenerate crRNA and/or degenerate guide RNA contains one or more modified nucleotides. For degenerate crRNAs and/or degenerate guide RNAs containing two or more modified nucleotides, the modified nucleotides may be the same or different. As used herein, a "modified nucleotide" is a nucleotide other than a naturally occurring G, A, C, or U ribonucleotide. A modified nucleotide may have a base modification, backbone modification and/or modified internucleoside linkage. A modified nucleotide can be, but is not limited to, a 2'-ribose modified nucleotide, 5'-methyl cytosine, or a universal base.
[0068] The degenerate crRNAs and/or degenerate guide RNAs may be made by any method available in the art of oligonucleotide synthesis.
4. Adaptor
[0069] In some embodiments, following endonuclease digestion, adaptors are ligated to the ends of the digested DNA fragments. Adaptors may be used to facilitate amplification and/or sequencing of the digested DNA fragments. The adaptor-modified DNA fragments can then be amplified and/or sequenced for identification of DNA polymorphisms.
[0070] As used herein, an adaptor comprises an oligonucleotide that is ligatable to one or both strands of a double-stranded DNA molecule. An adaptor may be single stranded, double stranded, or a single strand that can form a hairpin. A double stranded adaptor can have two blunt ends, two sticky ends, or a sticky end and a blunt end. An adaptor may contain one or more of: restriction enzyme recognition sequence, detectable label, molecular barcode, and primer binding sequence. In some embodiments, an adaptor is 8-120 nucleotides in length. In some embodiments, an adaptor is 20-60 nucleotides in length. In some embodiments, an adaptor is about 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 55, or about 60 nucleotides in length.
[0071] In some embodiments, a first adaptor is ligated to a first end of a first nucleic acid product or DNA fragment and a second adaptor is ligated to a second end of a first nucleic acid product or DNA fragment to form a second nucleic acid product. The first adaptor and the second adaptor may be the same or different (i.e., they may have the same or different nucleotide sequences and/or they may contain the same or different modifications).
[0072] In some embodiments are provided amplification and/or sequencing primers that hybridize to the adaptors. Hybridization to the adaptors can be used to facilitate polymerization initiation. The primers may be the same length as, shorter than, or longer than the adaptor sequences. A first PCR primer and a second PCR primer used for amplification of the DNA fragment may be the same or different. In some embodiments the second nucleic acid product is amplified to form a third (amplified) nucleic acid product. In some embodiments the second nucleic acid product is amplified by PCR for between about 15 and about 25 cycles. In some embodiments the amplified nucleic acid product is sequenced. In some embodiments, the first nucleic acid product, second nucleic acid product or third nucleic acid product represents a reduced representation of the genome. In some embodiments, reduced representations of a plurality of genomes are produced.
5. RNA-guided DNA endonuclease
[0073] An RNA-guided DNA endonuclease forms a complex with the crRNA, binds to DNA, and cleaves the DNA in a crRNA sequence-dependent manner. The location of cleavage relative to the crRNA sequence (or crRNA hybridizing sequence) is dependent on the particular endonuclease. An RNA-guided DNA endonuclease can be, but is not limited to, a CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease Type II endonuclease, Cas9 endonuclease, class 2 CRISPR-Cas endonuclease Type V-B endonuclease, Cas12b endonuclease, C2c1 endonuclease, class 2 CRISPR-Cas endonuclease Type V-A endonuclease, Cas12a endonuclease, or Cpf1 endonuclease, or a functional derivative of a CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease Type II endonuclease, Cas9 endonuclease, class 2 CRISPR-Cas endonuclease Type V-B endonuclease, Cas12b endonuclease, C2c1 endonuclease, class 2 CRISPR-Cas endonuclease Type V-A endonuclease, Cas12a endonuclease, or Cpf1 endonuclease.
6. Double stranded DNA
[0074] The double stranded DNA sample can comprise, consist essentially of, or consist of genomic DNA, total genomic DNA, and/or cDNA. The double stranded DNA may be unamplified or amplified, undigested or digested, unfragmented or fragmented, unpurified or purified. The double stranded DNA may be obtained or isolated by any method known in the art.
[0075] The double stranded DNA sample may be isolated from any organism, cell, tissue or remnant of an organism, cell, or tissue. The organism can be living, dead, or archaeological. The organism may be a prokaryote, archaea, or eukaryote. In some embodiments, the organism is a bacterium, archaeon, plant, animal or fungus. The animal can be, but is not limited to, a mammal, insect, fish, bird, amphibian, or reptile. The mammal can be, but is not limited to, rodent, mouse, rat, dog, cat, bovine, porcine, ovine, non-human primate, or human. In some embodiments, the mammal is a human. The plant can be, but is not limited to, soybean, maize, wheat, or rice.
7. Methods of use
[0076] Disclosed are methods of obtaining a genotype of an individual, comprising digesting a genome of the individual with a class 2 CRISPR-Cas endonuclease guided by a pool of degenerate crRNA or degenerate guide RNAs to produce a nucleic acid product having a first end and a second end, ligating a first adaptor to the first end of the nucleic acid product and a second adaptor to the second end of the nucleic acid product to produce a second nucleic acid product, and amplifying the second nucleic acid product to produce a third nucleic acid product. In some embodiments, the second nucleic acid product is amplified by PCR. In some embodiments, a first PCR primer binds to the first adaptor and a second PCR primer binds to the second adaptor. The third nucleic acid product represents a reduced representation of the genome. In some embodiments, the third nucleic acid product is sequenced to obtain the genotype of the individual. In certain embodiments the method further comprises comparing the genotype of the individual to a reference genotype.
[0077] Disclosed are methods of identifying a polymorphism in a genome of an individual, comprising digesting a genome of the individual with a class 2 CRISPR-Cas endonuclease guided by a pool of degenerate crRNA or degenerate guide RNAs to produce a nucleic acid product having a first end and a second end, ligating a first adaptor to the first end of the nucleic acid product and a second adaptor to the second end of the nucleic acid product to produce a second nucleic acid product, and amplifying the second nucleic acid product to produce a third nucleic acid product. In some embodiments, the second nucleic acid product is amplified by PCR. In some embodiments, a first PCR primer binds to the first adaptor and a second PCR primer binds to the second adaptor. The third nucleic acid product represents a reduced representation of the genome. In some embodiments, the third nucleic acid product is sequenced to identify the polymorphism of the individual. In certain embodiments the method further comprises comparing the polymorphism of the individual to a reference genotype. In some embodiments the method further comprises identifying a plurality of polymorphisms in the genome of the individual.
[0078] In some embodiments are described methods of digesting double stranded DNA using one or more described pools of degenerate crRNAs and one or more RNA-guided DNA endonucleases. In some embodiments, the degenerate crRNAs are linked to tracrRNAs to form degenerate guide RNAs.
[0079] In some embodiments are described methods of reducing genome complexity or generating a reduced representation of a genome using one or more described pools of degenerate crRNAs and one or more RNA-guided DNA endonucleases. In some embodiments, the degenerate crRNAs are linked to tracrRNAs to form degenerate guide RNAs.
[0080] In some embodiments are described methods of analyzing double stranded DNA using one or more described pools of degenerate crRNAs and one or more RNA-guided DNA endonucleases. In some embodiments, the degenerate crRNAs are linked to tracrRNAs to form degenerate guide RNAs.
[0081] In some embodiments, multiple pools of degenerate crRNAs with different specific sequences can be used to digest double stranded DNA, reduce genome complexity, generate a reduced representation of a genome, or analyze genomic DNA using the described pools of degenerate crRNAs and one or more RNA-guided DNA endonucleases.
[0082] In some embodiments, the methods comprise:
a) providing a sample containing double stranded DNA,
b) adding one or more pools of degenerate crRNAs and/or one or more pools of degenerate guide RNAs,
c) adding an RNA-guided DNA endonuclease enzyme(s), and
d) incubating the double stranded DNA with the one or more pools of degenerate crRNAs and/or one or more pools of degenerate guide RNAs and the RNA-guided DNA endonuclease(s) for a sufficient time to digest the double stranded DNA thereby yielding DNA fragments (first nucleic acid product).
[0083] In other embodiments, the methods comprise:
a) providing a sample containing double stranded DNA,
b) adding one or more pools of degenerate crRNA/RNA-guided DNA endonuclease enzyme complexes and/or one or more pools of degenerate guide RNA/RNA-guided DNA endonuclease enzyme complexes, and
c) incubating the double stranded DNA with the degenerate crRNA/RNA-guided DNA endonuclease enzyme complexes and/or degenerate guide RNA/RNA-guided DNA endonuclease enzyme complexes for a sufficient time to digest the double stranded DNA thereby yielding DNA fragments (first nucleic acid product).
[0084] The double stranded DNA can be, but is not limited to, genomic DNA and cDNA.
[0085] Any of the described pools of degenerate crRNAs or pools of degenerate guide RNAs may be used in the above method. In some embodiments, the one or more pools of degenerate crRNAs and/or one or more pools of degenerate guide RNAs contain different specific sequences.
[0086] The RNA-guided DNA endonuclease can be, but is not limited to, a CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease Type II endonuclease, Cas9 endonuclease, class 2 CRISPR-Cas endonuclease Type V-B endonuclease, Cas12b endonuclease, C2c1 endonuclease, class 2 CRISPR-Cas endonuclease Type V-A endonuclease, Cas12a endonuclease, or Cpf1 endonuclease, or a functional derivative of a CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease Type II endonuclease, Cas9 endonuclease, class 2 CRISPR-Cas endonuclease Type V-B endonuclease, Cas12b endonuclease, C2c1 endonuclease, class 2 CRISPR-Cas endonuclease Type V-A endonuclease, Cas12a endonuclease, or Cpf1 endonuclease.
[0087] In some embodiments, adaptors are ligated to the DNA fragments to provide a reduced representation of the genome that can be amplified by PCR, sequenced, genotyped for one or more polymorphisms, and/or otherwise genetically analyzed (attachment of adaptors to a DNA fragment is shown in example FIG. 5C).
[0088] In some embodiments, the methods further comprise, after step d):
e) adding one or more adaptors to the DNA fragments, and
f) ligating the adaptors to the DNA fragments to form adaptor-modified DNA fragments.
[0089] In some embodiments, the methods further comprise:
e) adding one or more amplification primers to the adaptor-modified DNA fragments, and
f) amplifying the adaptor-modified DNA fragments.
[0090] The term "amplifying" as used herein refers to generating one or more copies of a target nucleic acid, using the target nucleic acid as a template.
[0091] In some embodiments, the methods further comprise:
e) adding one or more sequencing primers to the adaptor-modified DNA fragments or amplified adaptor-modified DNA fragments, and
f) sequencing the adaptor-modified DNA fragments or amplified adaptor-modified DNA fragments.
[0092] In some embodiments, a sequencing primer can hybridize to the same or similar sequence as an amplification primer.
[0093] The described methods are useful in genetic and/or genome analyses. Genetic and genomic analyses include the identification, measurement, and/or comparison of genomic features such as DNA sequence, structural variation, gene expression, or regulatory and functional element annotation at a genomic scale. The fragmented or amplified DNA may be analyzed by any analysis method including, but not limited to, DNA sequencing, high-throughput sequencing, next generation sequencing, thermocycling or isothermal nucleic acid amplification assay, real-time thermocycling or isothermal nucleic acid amplification assay, hybridization assay, microarray assay, bead array assay, primer extension assay, enzyme mismatch cleavage assay, branched hybridization assay, molecular beacon assay, invasive cleavage structure assay, or sandwich hybridization assay. The fragmented or amplified DNA may be analyzed for the presence of one or more single nucleotide polymorphisms (SNPs) or other differences relative to a reference sequence.
[0094] In some embodiments, one or more DNA fragments, either generated from endonuclease digestion of double stranded DNA, either before or after adaptor linkage, or generated by amplification of a digested double stranded DNA fragment, may be cloned into a plasmid, thereby producing a recombinant plasmid.
8. Compositions and kits
[0095] The present disclosure provides compositions and kits useful for digesting double stranded DNA (including but not limited to genomic DNA or cDNA), reducing representation of a genome, sequencing genomic DNA, and/or polymorphism detection. Also provided by the disclosure are reaction mixtures for digesting double stranded DNA, reducing representation of a genome, sequencing genomic DNA, and/or polymorphism detection.
[0096] The present disclosure also provides kits comprising one or more reagents or components as described herein. In some embodiments, a composition, kit, and/or reaction mixture in accordance with the present disclosure comprises: one or more pools of degenerate crRNAs and/or one or more pools of degenerate guide RNAs as described herein and an RNA-guided DNA endonuclease(s). In some embodiments, the RNA-guided DNA endonuclease comprises a class 2 CRISPR-Cas endonuclease. A composition, kit and/or reaction mixture may further comprise one or more of the following: one or more adaptors, one or more amplification probes, and one or more sequencing probes. In some embodiments, the adaptors are the same. In some embodiments, the adaptors are distinct. In some embodiments, a kit comprises 3, 4, 5, 6 or more distinct adaptors. In some embodiments, a kit comprises first and second amplification primers. In some embodiments, the amplification primers are distinct. In some embodiments, a kit comprises 3, 4, 5, 6, or more distinct amplification primers. A composition, kit and/or reaction mixture may further include a number of optional components such as, for example, buffer, salt solutions, appropriate nucleotide triphosphates (e.g., dATP, dCTP, dGTP, and dTTP), and/or enzymes (including, but not limited to: DNA polymerase and thermostable DNA polymerase), and control samples. Any of the above indicated components may be provided in one or more containers. The container(s) can be of any suitable material, e.g., glass, plastic, metal, etc., and of any suitable size, shape, or configuration. In some embodiments, a pool of degenerate crRNAs and/or pool of degenerate guide RNAs is provided as a solid, such as powder form. In some embodiments, a pool of degenerate crRNAs and/or pool of degenerate guide RNAs is provided as a liquid or solution. The degenerate crRNAs or degenerate guide RNAs may be provided as isolated RNA or they may be provided as part of a complex with RNA-guided DNA endonucleases. A degenerate crRNA-RNA-guided DNA endonuclease complex or degenerate guide RNA-RNA-guided DNA endonuclease complex refers to a complex comprising an RNA-guided DNA endonuclease and a degenerate crRNA or degenerate guide RNA. The degenerate crRNAs hybridize to target sequences in the double stranded DNA and provide sequence specificity for digestion of the DNA by the endonuclease.
[0097] In some embodiments, a kit may also include additional reagents, e.g., PCR components, such as salts including MgCl2, a thermostable polymerase enzyme, and deoxyribonucleotides, and the like, and other reagents, including for example media such as water, saline, or the like, as described herein. Such reagents or components are well known in the art. Where appropriate, reagents included with such a kit may be provided either in the same container or media, or may alternatively be placed in a second or additional distinct container into which the additional composition or reagents may be placed and suitably aliquoted. Alternatively, reagents may be provided in a single container means.
[0098] In some embodiments, a kit may also include one or more other adjuncts or adjuvants. In some embodiments, a kit further includes a set of instructions or packaging materials that describe methods of using the components for practicing methods in accordance with the present disclosure. The instructions may be associated with a package insert and/or the packaging of the kit or the components thereof. In some embodiments, the kit comprises instructions or packaging materials that describe how to use the degenerate crRNAs or degenerate guide RNAs in a method to produce a reduced representation of a genome. In some embodiments the kit comprises instructions or packaging materials that describe how to use the degenerate crRNAs or degenerate guide RNAs in a method to obtain a genotype of an individual. In some embodiments, the kit comprises instructions or packaging materials that describe how to use the degenerate crRNAs or degenerate guide RNAs in a method to identify a polymorphism in a genome of an individual.
[0099] Any method disclosed herein is also to be understood as a disclosure of corresponding uses of materials involved in the method directed to the purpose of the method. Any of the degenerate crRNAs, degenerate guide RNAs, adaptors, and any combinations (e.g., kits and compositions) comprising such degenerate crRNAs, degenerate guide RNAs, and/or adaptors, are to be understood as also disclosed for use in digesting double stranded DNA, reducing representation of a genome, sequencing genomic DNA, and/or polymorphism detection.
[00100] In the examples below, the method is used to digest genomic DNA. However, the same or similar methods can be used to digest double stranded DNA from another source.
[00101] All patents, patent applications, provisional applications, and publications referred to or cited herein are incorporated by reference in their entirety, including all figures and tables, to the extent they are not inconsistent with the explicit teachings of this specification.
[00102] Following are examples that illustrate procedures for practicing the present disclosure. These examples should not be construed as limiting. All percentages are by weight and all solvent mixture proportions are by volume unless otherwise noted.
[00103] It should be understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and the scope of the appended claims. In addition, any elements or limitations of any embodiment disclosed herein can be combined with any and/or all other elements or limitations (individually or in any combination) of any other embodiment disclosed herein, and all such combinations are contemplated within the scope of the present disclosure without limitation thereto.
EXAMPLES
Reaction overview
[00104] The reaction begins with the endonuclease digestion of DNA by the class 2 CRISPR-Cas endonuclease guided by a degenerate crRNA. After digestion, adaptors are ligated to prepare a reduced representation of the genome by amplification by PCR. The resultant reduced representation of the genome can be sequenced, and polymorphisms genotyped (FIG. 5).
[00105] In one example, 3 μL of a class 2 CRISPR-Cas endonuclease reaction buffer (such as the Cas9 nuclease 10x reaction buffer) is combined with 3 μL of 300 nM of an oligonucleotide composed of the crRNA and tracrRNA sequence, 1 μL of 1 μM of the class 2 CRISPR-Cas endonuclease of Type II Cas9, and 20 μL of water. After pre-incubation for 10 minutes at 25°C, 10 to 1,000 ng of DNA to be digested is added. The reaction is then maintained at 25°C for 1 to 24 hours, before incubation at 65°C for 10 minutes to stop the digestion reaction.
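The final concentrations implied by the volumes above can be checked with simple dilution arithmetic. The sketch below is illustrative only. The volume in which the DNA is added is not stated in this example; the value of 3 μL used here is an assumption, chosen because it brings the 10x buffer to 1x in a 30 μL reaction. All variable names are hypothetical.

```python
def final_concentration(stock, volume_added_ul, total_volume_ul):
    """Dilution arithmetic: C_final = C_stock * V_added / V_total."""
    return stock * volume_added_ul / total_volume_ul

# Volumes stated in the example (microliters).
buffer_ul, guide_ul, cas9_ul, water_ul = 3.0, 3.0, 1.0, 20.0
premix_ul = buffer_ul + guide_ul + cas9_ul + water_ul

# ASSUMPTION: the DNA is added in 3 uL; the source does not give this volume.
dna_ul = 3.0
total_ul = premix_ul + dna_ul

buffer_x = final_concentration(10.0, buffer_ul, total_ul)   # fold concentration
guide_nM = final_concentration(300.0, guide_ul, total_ul)   # stock is 300 nM
cas9_nM = final_concentration(1000.0, cas9_ul, total_ul)    # stock is 1 uM = 1000 nM

print(f"premix volume: {premix_ul:.0f} uL, total volume: {total_ul:.0f} uL")
print(f"buffer: {buffer_x:.2f}x, guide RNA: {guide_nM:.1f} nM, Cas9: {cas9_nM:.1f} nM")
```

Under this assumption the guide RNA and the endonuclease are present at roughly equimolar concentrations (about 30 nM and 33 nM, respectively).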
[00106] After the DNA from the reaction is cleaned up to remove enzyme and other reaction reagents, a standard ligation reaction is carried out. If the class 2 CRISPR-Cas endonuclease used in the reaction produces a blunt-end cut of the DNA (e.g. Cas9), then the ligation reaction can be preceded by an A-tailing reaction, which adds an adenine to the ends of the DNA molecule following digestion. Finally, the ligation reaction is cleaned up and the digested DNA fragments that contain adaptors at both ends are amplified by standard polymerase chain reaction (PCR).
CRISPR RNA (crRNA) design
[00107] The crRNA properties to be defined are (a) the number and (b) type (i.e., A, C, G or T) of bases to be used in the specific sequence. The remaining sequence of the crRNA will be composed of degenerate (N) bases, up to the total length of the crRNA.
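The design rule above implies that a pool with n degenerate positions contains up to 4^n distinct crRNA sequences sharing one specific sequence. The following sketch enumerates a small pool to make this concrete. It is an illustration under stated assumptions: the degenerate positions are placed 5' of the specific sequence, the RNA alphabet (U in place of T) is used, and the function names are hypothetical.

```python
from itertools import product

def pool_size(n):
    """Number of distinct sequences in a pool with n fully degenerate positions."""
    return 4 ** n

def enumerate_pool(specific, n, alphabet="ACGU"):
    """Yield every crRNA sequence: n degenerate bases followed by the specific sequence."""
    for combo in product(alphabet, repeat=n):
        yield "".join(combo) + specific

# Small illustrative pool: 2 degenerate positions plus the specific sequence UUGA.
pool = list(enumerate_pool("UUGA", 2))
print(len(pool), pool[:4])

# A 20-nucleotide crRNA with a 4-base specific sequence has 16 degenerate positions.
print(pool_size(16))
```

Full enumeration is only practical for small n; for realistic lengths, only the pool size is computed.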
[00108] We evaluated the impact of using different numbers of nucleotides and different base compositions in the specific sequence of the crRNA, when performing an endonuclease digestion of the genome of Arabidopsis thaliana with a class 2 CRISPR-Cas endonuclease of Type II (e.g. Cas9). Combinations of specific sequences with all possible 3, 4 and 5 base combinations were evaluated, adjacent to the PAM sequence. Furthermore, we required that adjacent specific sequences be 200-500 nucleotides from each other; fragments of this size are more efficiently amplified by PCR and sequenced after adaptor ligation. It is worth noting that this interval may differ depending on the PCR conditions and sequencing platform.
[00109] Combinations of specific sequences with all possible 3 base combinations are predicted to generate on average 24,294 fragments between 200-500 nucleotides, with a minimum of 2,484 (specific sequence CCC) and a maximum of 85,356 (specific sequence TTT). For 4 and 5 base combinations, the average number of fragments between 200-500 nucleotides is 2,796 and 255, respectively. The distribution of fragments generated for all combinations of bases in the specific sequence is shown in FIG. 8. As can be seen from the broad range of possible numbers of digested fragments, the specific sequence can be tailored depending on the number of regions of the genome that one wishes to sample.
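The fragment counts reported above depend on a size filter applied to the distances between adjacent predicted cut positions. A minimal sketch of such a filter is given below. It is not the pipeline used to produce the reported numbers; the cut positions are invented toy values and the function name is hypothetical.

```python
def fragments_in_range(cut_positions, min_len=200, max_len=500):
    """Return (start, end) pairs for adjacent cuts whose spacing is within the size window."""
    ordered = sorted(set(cut_positions))
    selected = []
    for start, end in zip(ordered, ordered[1:]):
        length = end - start
        if min_len <= length <= max_len:
            selected.append((start, end))
    return selected

# Hypothetical cut positions along one chromosome.
cuts = [0, 150, 450, 1200, 1500, 1501]
selected = fragments_in_range(cuts)
print(len(selected), selected)
```

The min_len and max_len defaults follow the 200-500 nucleotide interval used in this example and could be changed for other PCR conditions or sequencing platforms.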

Claims

We Claim:
1. A pool of degenerate CRISPR RNAs (crRNAs) comprising a plurality of oligonucleotides having a formula comprising:
Figure imgf000031_0001
wherein,
Nn comprises a degenerate sequence n nucleobases in length;
Xx comprises a specific sequence x nucleobases in length;
n is an integer from 1 to 29;
x is an integer from 1 to 20;
the value of n+x is an integer from 15 to 30;
each individual crRNA in the pool contains the same specific sequence; and
the pool of degenerate crRNAs comprises a mixture of possible sequences within the degenerate sequence.
2. The pool of degenerate crRNAs of claim 1, wherein one or more nucleobases in the degenerate sequence and/or the specific sequence comprises a modified nucleotide.
3. The pool of degenerate crRNAs of claim 1 or 2, wherein x is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10.
4. The pool of degenerate crRNAs of any of claims 1-3, wherein n is 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20.
5. The pool of degenerate crRNAs of any of claims 1-3, wherein x is 1 and n is 18, 19, or 20.
6. The pool of degenerate crRNAs of any of claims 1-3, wherein x is 2 and n is 17, 18, or 19.
7. The pool of degenerate crRNAs of any of claims 1-3, wherein x is 3 and n is 16, 17, or 18.
8. The pool of degenerate crRNAs of any of claims 1-3, wherein x is 4 and n is 15, 16, or 17.
9. The pool of degenerate crRNAs of any of claims 1-3, wherein x is 5 and n is 14, 15, or 16.
10. The pool of degenerate crRNAs of any of claims 1-3, wherein x is 6 and n is 13, 14, or 15.
11. The pool of degenerate crRNAs of any of claims 1-3, wherein x is 7 and n is 12, 13, or 14.
12. The pool of degenerate crRNAs of any of claims 1-3, wherein x is 8 and n is 11, 12, or 13.
13. The pool of degenerate crRNAs of any of claims 1-3, wherein x is 9 and n is 10, 11, or 12.
14. The pool of degenerate crRNAs of any of claims 1-3, wherein x is 10 and n is 9, 10, or 11.
15. The pool of degenerate crRNAs of any of claims 1-14, wherein each position of the degenerate sequence can be G, A, C, or U with equal probability.
16. The pool of degenerate crRNAs of any of claims 1-14, wherein one or more of the nucleobases of the degenerate sequence comprises a universal base.
17. The pool of degenerate crRNAs of any of claims 1-16, wherein each crRNA is linked to a trans-acting CRISPR RNA (tracrRNA).
18. The pool of degenerate crRNAs of claim 17, wherein the tracrRNA is linked to the 3' end of the crRNA.
19. The pool of degenerate crRNAs of claim 18, wherein the tracrRNA is 70-89 nucleobases in length.
20. The pool of degenerate crRNAs of claim 19, wherein the tracrRNA is linked to the degenerate crRNA via a linker.
21. The pool of degenerate crRNAs of claim 20, wherein the linker is an oligonucleotide 1-10 nucleobases in length.
22. The pool of degenerate crRNAs of claim 20, wherein the linker is a non-nucleotide linker.
23. A composition comprising the pool of degenerate crRNAs of any of claims 1-21 and one or more RNA-guided DNA endonucleases.
24. The composition of claim 23, wherein the RNA-guided DNA endonuclease comprises a class 2 CRISPR-Cas endonuclease.
25. The composition of claim 24, wherein the class 2 CRISPR-Cas endonuclease is a class 2 CRISPR-Cas endonuclease Type V-A.
26. The composition of claim 25, wherein the class 2 CRISPR-Cas endonuclease Type V-A is a Cas12a.
27. The composition of claim 24, wherein the class 2 CRISPR-Cas endonuclease is a class 2 CRISPR-Cas endonuclease Type II.
28. The composition of claim 27, wherein the class 2 CRISPR-Cas endonuclease Type II is a Cas9.
29. The composition of claim 24, wherein the class 2 CRISPR-Cas endonuclease is a class 2 CRISPR-Cas endonuclease Type V-B.
30. The composition of claim 29, wherein the class 2 CRISPR-Cas endonuclease Type V-B is Cas12b.
31. The composition of claim 23, wherein the RNA-guided DNA endonuclease is selected from the group consisting of: CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease Type II endonuclease, Cas9 endonuclease, class 2 CRISPR-Cas endonuclease Type V-B endonuclease, Cas12b endonuclease, C2c1 endonuclease, class 2 CRISPR-Cas endonuclease Type V-A endonuclease, Cas12a endonuclease, or Cpf1 endonuclease, or a functional derivative of a CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease Type II endonuclease, Cas9 endonuclease, class 2 CRISPR-Cas endonuclease Type V-B endonuclease, Cas12b endonuclease, C2c1 endonuclease, class 2 CRISPR-Cas endonuclease Type V-A endonuclease, Cas12a endonuclease, and Cpf1 endonuclease.
32. The composition of any of claims 23-31 further comprising one or more adaptors.
33. The composition of claim 33, wherein the adaptors comprise a first adaptor and a second adaptor.
34. The composition of claim 34, wherein the adaptors are the same.
35. The composition of claim 35, wherein the adaptors are distinct.
36. The composition of any of claims 23-31 further comprising one or more amplification primers.
37. The composition of claim 33, wherein the amplification primers comprise a first amplification primer and a second amplification primer.
38. The composition of claim 34, wherein the amplification primers are the same.
39. The composition of claim 35, wherein the amplification primers are distinct.
40. A method for producing a reduced representation of a genome, comprising:
a) providing a sample containing genomic DNA,
b) adding a pool of degenerate crRNAs and/or a pool of degenerate guide RNAs,
c) adding an RNA-guided DNA endonuclease enzyme, and
d) incubating the genomic DNA with the degenerate crRNAs and/or degenerate guide RNAs and the RNA-guided DNA endonuclease for a sufficient time to digest the genomic DNA, thereby producing a first nucleic acid product.
41. The method of claim 40, further comprising: ligating a first adaptor to a first end of the first nucleic acid product and a second adaptor to a second end of the first nucleic acid product to produce a second nucleic acid product.
42. The method of claim 41, further comprising: amplifying the second nucleic acid product to produce a third nucleic acid product.
43. The method of claim 42, further comprising sequencing the third nucleic acid product to obtain a genetic sequence.
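As an illustration only, the digestion step recited in claim 40 can be modeled in silico. The following is a minimal sketch, not the claimed method: it assumes a Cas9-like enzyme that requires an NGG PAM and makes a blunt cut 3 bp upstream of it, searches the forward strand only, and writes the degenerate pool as a single IUPAC pattern over the target DNA. The names `IUPAC`, `degenerate_to_regex`, `cut_sites`, and `digest`, the 8-nt toy protospacer, and the toy genome are all hypothetical and do not come from the claims.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def degenerate_to_regex(pattern):
    """Translate a degenerate (IUPAC) DNA pattern into a regex string."""
    return "".join(IUPAC[base] for base in pattern.upper())


def cut_sites(genome, protospacer, pam="NGG", cut_offset=3):
    """Return sorted forward-strand cut positions for a degenerate protospacer.

    Assumption: a Cas9-like blunt cut `cut_offset` bp upstream of the PAM.
    """
    # A lookahead lets overlapping target sites be reported as well.
    site = re.compile("(?=(" + degenerate_to_regex(protospacer + pam) + "))")
    cut_in_site = len(protospacer) - cut_offset
    return sorted({m.start() + cut_in_site for m in site.finditer(genome)})


def digest(genome, protospacer, pam="NGG"):
    """Split the genome string at every predicted cut position."""
    bounds = [0] + cut_sites(genome, protospacer, pam) + [len(genome)]
    return [genome[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


# Toy example: one degenerate position (N) lets a single pattern match two sites.
genome = "TTACGTACGTTGGCCCCACGTTCGTAGGAA"
fragments = digest(genome, "ACGTNCGT")
```

In this toy model, adding degenerate positions to the pattern increases the number of matching sites and therefore yields more, shorter fragments; a real design would also need to consider the reverse strand, mismatch tolerance, and the PAM and cut geometry of the specific endonuclease.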
44. The method of any of claims 40 or 43, wherein the RNA-guided DNA endonuclease is selected from the list consisting of: CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease Type II endonuclease, Cas9 endonuclease, class 2 CRISPR-Cas endonuclease Type V-B endonuclease, Cas12b endonuclease, C2c1 endonuclease, class 2 CRISPR-Cas endonuclease Type V-A endonuclease, Cas12a endonuclease, or Cpf1 endonuclease or a functional derivative of a CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease Type II endonuclease, Cas9 endonuclease, class 2 CRISPR-Cas endonuclease Type V-B endonuclease, Cas12b endonuclease, C2c1 endonuclease, class 2 CRISPR-Cas endonuclease Type V-A endonuclease, Cas12a endonuclease, and Cpf1 endonuclease.
45. The method of any of claims 40-44, wherein the genomic DNA is unamplified, amplified, undigested, digested, unfragmented, fragmented, unpurified, or purified.
46. The method of any of claims 40-45, wherein two or more different pools of degenerate crRNAs and/or pools of degenerate guide RNAs are added to the genomic DNA.
47. The method of claim 41 or 42, further comprising subjecting the second or third nucleic acid product to genetic analysis, wherein the genetic analysis is selected from the group consisting of: real-time thermocycling or isothermal nucleic acid amplification assay, hybridization assay, microarray assay, bead array assay, primer extension assay, enzyme mismatch cleavage assay, branched hybridization assay, molecular beacon assay, invasive cleavage structure assay, sandwich hybridization assay, and single nucleotide polymorphism (SNP) assay.
48. A method of obtaining a genotype of an individual, comprising:
a) digesting a genome of the individual with a class 2 CRISPR-Cas endonuclease guided by a pool of degenerate crRNAs and/or a pool of degenerate guide RNAs, to produce a first nucleic acid product having a first end and a second end;
b) ligating a first adaptor to the first end of the nucleic acid product and a second adaptor to the second end of the nucleic acid product to produce a second nucleic acid product;
c) amplifying by PCR the second nucleic acid product using a first PCR primer that binds to the first adaptor and a second PCR primer that binds to the second adaptor to produce a third nucleic acid product, wherein the third nucleic acid product represents a reduced representation of the genome; and
d) sequencing the third nucleic acid product to obtain the genotype of the individual.
49. The method of claim 48, further comprising comparing the genotype of the individual to a reference genotype.
50. A method of identifying a polymorphism in a genome of an individual, comprising:
a) digesting the genome of the individual with a class 2 CRISPR-Cas endonuclease guided by a pool of degenerate crRNAs and/or a pool of degenerate guide RNAs, to produce a nucleic acid product having a first end and a second end;
b) ligating a first adaptor to the first end of the nucleic acid product and a second adaptor to the second end of the nucleic acid product to produce a second nucleic acid product;
c) amplifying by PCR the second nucleic acid product using a first PCR primer that binds to the first adaptor and a second PCR primer that binds to the second adaptor to produce a third nucleic acid product, wherein the third nucleic acid product represents a reduced representation of the genome; and
d) sequencing the third nucleic acid product to identify the polymorphism in the genome of the individual.
51. The method of claim 50, further comprising identifying a plurality of polymorphisms in the genome of the individual.
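For illustration, the final comparison implied by claims 49 to 51 (a sequenced product checked against a reference to find one or more polymorphisms) can be sketched as a position-by-position comparison. This is a deliberately simplified sketch: it assumes the sample read has already been aligned to the reference at equal length and reports only single-base substitutions. The function name `find_polymorphisms` and the example sequences are hypothetical and are not taken from the claims.

```python
def find_polymorphisms(reference, sample):
    """List (position, ref_base, sample_base) where two aligned sequences differ.

    Assumption: both sequences are pre-aligned and of equal length, so only
    substitutions (not insertions or deletions) are detected.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (pos, ref_base, sample_base)
        for pos, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Toy example with two substitutions relative to the reference.
variants = find_polymorphisms("ACGTACGT", "ACGTTCGA")
```

A practical pipeline would instead align many reads per locus and call variants from the pileup; the sketch only shows the comparison step in its simplest form.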
52. A method for digesting a double stranded DNA, comprising:
a) providing a sample containing double stranded DNA,
b) adding a pool of degenerate crRNA/RNA-guided DNA endonuclease complexes and/or a pool of degenerate guide RNA/RNA-guided DNA endonuclease complexes, and
c) incubating the double stranded DNA with the degenerate crRNA/RNA-guided DNA endonuclease enzyme complexes and/or degenerate guide RNA/RNA-guided DNA endonuclease enzyme complexes for a sufficient time to digest the double stranded DNA, thereby yielding DNA fragments.
53. A kit comprising the pool of degenerate crRNAs of any of claims 1-22.
54. The kit of claim 53, further comprising an RNA-guided DNA endonuclease.
55. The kit of claim 54, wherein the pool of degenerate crRNAs and the RNA-guided DNA endonuclease are provided as a complex.
56. The kit of claim 54 or 55, wherein the RNA-guided DNA endonuclease is selected from the group consisting of: CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease Type II endonuclease, Cas9 endonuclease, class 2 CRISPR-Cas endonuclease Type V-B endonuclease, Cas12b endonuclease, C2c1 endonuclease, class 2 CRISPR-Cas endonuclease Type V-A endonuclease, Cas12a endonuclease, or Cpf1 endonuclease or a functional derivative of a CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease, class 2 CRISPR-Cas endonuclease Type II endonuclease, Cas9 endonuclease, class 2 CRISPR-Cas endonuclease Type V-B endonuclease, Cas12b endonuclease, C2c1 endonuclease, class 2 CRISPR-Cas endonuclease Type V-A endonuclease, Cas12a endonuclease, and Cpf1 endonuclease.
57. The kit of any of claims 53-56, further comprising one or more adaptors.
58. The kit of claim 57, wherein the one or more adaptors are the same.
59. The kit of claim 57, wherein the one or more adaptors are distinct.
60. The kit of any of claims 53-59, further comprising one or more amplification primers.
61. The kit of claim 60, wherein the one or more adaptors are the same.
62. The kit of claim 60, wherein the one or more adaptors are distinct.
63. The kit of any of claims 53-62, further comprising instructions or packaging materials that describe how to use the pool of degenerate crRNAs in a method to produce a reduced representation of a genome.
64. The kit of any of claims 53-52, further comprising instructions or packaging materials that describe how to use the pool of degenerate crRNAs in a method to obtain a genotype of an individual or to identify a polymorphism in a genome of an individual.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201762577572P 2017-10-26 2017-10-26
US62/577,572 2017-10-26

Publications (1)

Publication Number Publication Date
WO2019084306A1 (en) 2019-05-02

Family

ID=66246701

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/057568 WO2019084306A1 (en) 2017-10-26 2018-10-25 Method for genome complexity reduction and polymorphism detection

Country Status (1)

Country Link
WO (1) WO2019084306A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090099040A1 (en) * 2007-10-15 2009-04-16 Sigma Aldrich Company Degenerate oligonucleotides and their uses
US20120094847A1 (en) * 2009-05-05 2012-04-19 Max-Planck-Gesellschaft Zur Forderung Der Wissenschaften E.V. The use of class iib restriction endonucleases in 2nd generation sequencing applications
CN103981211A (en) * 2014-05-16 2014-08-13 安徽省农业科学院水稻研究所 Breeding method for preparing closed glume pollination rice material
US20160208272A1 (en) * 2013-08-22 2016-07-21 E. I. Du Pont De Nemours And Company Plant genome modification using guide rna/cas endonuclease systems and methods of use

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FINERAN ET AL.: "Degenerate Target Sites Mediate Rapid Primed CRISPR Adaptation", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES, vol. 111, no. 16, 22 April 2014 (2014-04-22), pages E1629 - E1638, XP055595430, Retrieved from the Internet <URL:www.pnas.org/cgi/doi/10.1073/pnas.1400071111> *

Similar Documents

Publication Publication Date Title
US20210388430A1 (en) Compositions of toehold primer duplexes and methods of use
EP3450569A1 (en) Dna amplification method
EP3699291A1 (en) Single molecule nucleic acid sequence analysis processes and compositions
US9850481B2 (en) Method for genome complexity reduction and polymorphism detection
JPH08505535A (en) Method for producing single-stranded DNA molecule
AU2008230813A1 (en) Restriction endonuclease enhanced polymorphic sequence detection
EP3152324B1 (en) Strand-invasion based dna amplification method
JP2006512094A5 (en)
EP3346006B1 (en) Method for amplifying dna
US20100092972A1 (en) Assay for gene expression
WO2013192292A1 (en) Massively-parallel multiplex locus-specific nucleic acid sequence analysis
EP1756302A1 (en) A method for selectively detecting subsets of nucleic acid molecules
CN109517888B (en) Nucleic acid amplification method using allele-specific reactive primers
US20220145284A1 (en) Method of detecting multiple targets based on single detection probe using tag sequence snp
EP3805408B1 (en) Method of detecting target nucleic acid using rolling circle amplification and composition for detecting target nucleic acid
Ballantyne et al. Increased amplification success from forensic samples with locked nucleic acids
WO2019084306A1 (en) Method for genome complexity reduction and polymorphism detection
WO2008143367A1 (en) Haplotyping method by multiplex amplification
WO2003066827A2 (en) Methods and compositions for detecting differences between nucleic acids
KR100874378B1 (en) Method for identifying hanwoo meat by using single nucleotide polymorphisms
WO2002034937A9 (en) Methods for detection of differences in nucleic acids
JP2024510840A (en) Method for amplifying target nucleic acid using guide probe and clamping probe, and composition for amplifying target nucleic acid containing the same
WO2024050553A1 (en) Methods for measuring telomere length

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18871075

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18871075

Country of ref document: EP

Kind code of ref document: A1