WO2021036995A1 - Direct microrna sequencing using enzyme assisted sequencing - Google Patents

Info

Publication number
WO2021036995A1
Authority
WO
WIPO (PCT)
Prior art keywords
dna
chimeric
short rna
strand
blocking oligomer
Prior art date
Application number
PCT/CN2020/110836
Other languages
French (fr)
Inventor
Shuo Huang
Jinyue Zhang
Shuanghong YAN
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Publication of WO2021036995A1 publication Critical patent/WO2021036995A1/en

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00: Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48: Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/483: Physical analysis of biological material
    • G01N33/487: Physical analysis of biological material of liquid biological material
    • G01N33/48707: Physical analysis of biological material of liquid biological material by electrical means
    • G01N33/48721: Investigating individual macromolecules, e.g. by translocation through nanopores

Definitions

  • This invention relates to a method for identifying an analyte using a protein nanopore.
  • MicroRNAs (miRNAs) are a group of short, single-stranded, non-coding RNA molecules that act as posttranscriptional gene regulators for a wide variety of physiological processes, including proliferation, differentiation, apoptosis, and immune reactions [1, 2].
  • Aberrant miRNA expression levels have been shown to be closely related to diverse diseases, such as cancer [3-6], auto-immune disorders [7] and inflammatory diseases [8].
  • miRNAs can be characterized by northern blot, quantitative reverse transcription real-time polymerase chain reaction (qRT-PCR) assays or microarrays [9] .
  • Other emerging platforms for miRNA sensing include colorimetry [10] , bioluminescence [11] , enzymatic activity [12] and electrochemistry [13] .
  • these methods provide limited analytical information because miRNA sequences are not directly reported and prior knowledge of the target miRNA sequence is required.
  • MiRNAs function by binding to the 3’ untranslated region (3’UTR) of target messenger RNAs [14]. Minor sequence variations, which include trimming, addition or substitution of miRNA nucleotides, alter their binding affinities to target messenger RNAs [14]. As reported, miRNA isoforms (isomiRs) [9], which are generated by the addition or deletion of one or multiple nucleotides as terminal modifications, have been shown to participate in proliferative diseases [15] and cancer [16]. In addition, N6-methyladenosine (m6A), which is an epigenetic modification of RNA, plays important roles in miRNA biogenesis [17, 18]. These functional roles of miRNAs are associated with their specific sequences.
  • MiRNAs can be sequenced indirectly by reverse transcription into complementary DNA (cDNA) followed by deep sequencing of the cDNA [19].
  • this strategy suffers from unpredictable amplification biases and the loss of epigenetic information [9, 20] .
  • Advances in corresponding sequencing technologies consequently facilitate early stage diagnosis of cancers and inspire miRNA-targeted therapeutics [21, 22] .
  • Nanopore Induced Phase-Shift Sequencing (NIPSS) , which is a variant of nanopore sequencing, was recently developed as a universal strategy to sequence analytes other than long stretches of DNA [23] or RNA [25] .
  • Although NIPSS is limited by its short read-length of 15 nucleotides, the concept has been verified by sequencing different 2’-deoxy-2’-fluoroarabinonucleic acid (FANA) strands [26].
  • a method of determining the sequence of a short RNA includes:
  • each of the two compartments comprises a liquid medium and the interface has a channel so dimensioned as to allow passage of only one single-strand polynucleotide at a time;
  • a DNA-short RNA chimeric substrate comprising a DNA-short RNA chimeric single-strand and at least one DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand, wherein the chimeric single-strand comprises the short RNA conjugated with a DNA;
  • the short RNA is derived from miRNA, siRNA or piRNA.
  • the DNA-short RNA chimeric single-strand is prepared by the following steps: the 5’ phosphorylated DNA single-strand is first 5’ adenylated with the Mth RNA ligase; the adenylated DNA is then purified by ethanol precipitation and ligated to the 3’-end of the short RNA by an RNA ligase.
  • the DNA segment and the short RNA are separated by a spacer.
  • the spacer is an abasic spacer.
  • the enzyme is DNA polymerase;
  • the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand, a primer and a blocking oligomer;
  • the primer is hybridized with the 3' end of the DNA segment of the chimeric single-strand;
  • the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand adjacent to the 3' end of the primer;
  • the short RNA is adjacent to the 5' end of the DNA segment in the DNA-short RNA chimeric single-strand; and the method includes:
  • the DNA polymerase is phi29 DNA polymerase or variants thereof; more preferably, the DNA polymerase is wild-type phi29 DNA polymerase, or the D12A/D66A mutant of wild-type phi29 DNA polymerase.
  • the 5' end of the DNA segment is conjugated with the 5' end or the 3' end of the short RNA.
  • the 3' end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
  • the 3' end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
  • a cholesterol molecule is linked to the 3' end of the blocking oligomer.
  • the method includes:
  • the DNA segment of the chimeric single-strand contains sequence repeats to generate a unique signal pattern during primer extension.
  • sequence repeats include sequence repeats between “AAGA” and “TTTC” from 3' to 5'.
  • the enzyme is a DNA helicase;
  • the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand and a blocking oligomer;
  • the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand; and the method includes:
  • the DNA helicase is Hel 308 DNA helicase (which is also called DNA Helicase HEL308) or variants thereof.
  • At least one end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
  • the unpaired end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
  • a cholesterol molecule is linked to at least one end of the blocking oligomer.
  • the method includes:
  • the channel is a protein nanopore.
  • the protein nanopore is MspA, CsgG, ClyA or FraC or variants thereof.
  • the protein nanopore is a mutant MspA comprising the mutations of D90N/D91N/D93N/D118R/D134R/E139K compared to the wild-type MspA.
  • the DNA-short RNA chimeric substrate is a plurality of DNA-short RNA chimeric substrates, wherein different DNA-short RNA chimeric substrates comprise different short RNAs.
  • different DNA-short RNA chimeric substrates comprise the same DNA segment.
  • a sequencing library is provided.
  • the sequencing library is constructed by thermal annealing from a plurality of DNA-short RNA chimeric single-strands and a DNA fragment; wherein different DNA-short RNA chimeric single-strands comprise the same DNA segment and different short RNAs, and the DNA fragment is complementary to at least part of the DNA segment of the chimeric single-strand.
  • the RNA is derived from miRNA, siRNA or piRNA.
  • the DNA segment and the short RNA are separated by a spacer.
  • the spacer is an abasic spacer.
  • the sequencing library is constructed by thermal annealing from a plurality of DNA-short RNA chimeric single-strands, a primer and a blocking oligomer; wherein the primer is hybridized with the 3' end of the DNA segment of the chimeric single-strand; the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand adjacent to the 3' end of the primer; and the short RNA is adjacent to the 5' end of the DNA segment in the DNA-short RNA chimeric single-strand.
  • the 5' end of the DNA segment is conjugated with the 5' end or the 3' end of the short RNA.
  • the 3' end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
  • the 3' end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
  • a cholesterol molecule is linked to the 3' end of the blocking oligomer.
  • the DNA segment of the chimeric single-strand contains sequence repeats to generate a unique signal pattern during primer extension.
  • sequence repeats include sequence repeats between “AAGA” and “TTTC” from 3' to 5'.
  • the sequencing library is constructed by thermal annealing from a plurality of DNA-short RNA chimeric single-strands and a blocking oligomer; wherein the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand.
  • At least one end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
  • the unpaired end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
  • a cholesterol molecule is linked to at least one end of the blocking oligomer.
  • a system for determining the sequence of a short RNA comprises:
  • each of the two compartments comprises a liquid medium and the interface has a channel so dimensioned as to allow passage of only one single-strand polynucleotide at a time;
  • the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand and at least one DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand, the chimeric single-strand comprises the short RNA conjugated with a DNA, and the enzyme can process DNA nucleotide by nucleotide.
  • the short RNA is derived from miRNA, siRNA or piRNA.
  • the DNA segment and the short RNA are separated by a spacer.
  • the spacer is an abasic spacer.
  • the system comprises a DNA polymerase, divalent cation, dNTPs and a DNA-short RNA chimeric substrate in one of the two compartments; wherein the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand, a primer and a blocking oligomer; the primer is hybridized with the 3' end of the DNA segment of the chimeric single-strand; the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand adjacent to the 3' end of the primer; the short RNA is adjacent to the 5' end of the DNA segment in the DNA-short RNA chimeric single-strand.
  • the 5' end of the DNA segment is conjugated with the 5' end or the 3' end of the short RNA.
  • the 3' end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
  • the 3' end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
  • a cholesterol molecule is linked to the 3' end of the blocking oligomer.
  • the DNA segment contains sequence repeats to generate a unique signal pattern during primer extension.
  • sequence repeats include sequence repeats between “AAGA” and “TTTC” from 3' to 5'.
  • the system comprises a DNA helicase, ATP, and a DNA-short RNA chimeric substrate in one of the two compartments; wherein the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand and a blocking oligomer; the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand.
  • At least one end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
  • the unpaired end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
  • a cholesterol molecule is linked to at least one end of the blocking oligomer.
  • the channel is a protein nanopore.
  • the protein nanopore is MspA, CsgG, ClyA or FraC or variants thereof.
  • the protein nanopore is a mutant MspA comprising the mutations of D90N/D91N/D93N/D118R/D134R/E139K compared to the wild-type MspA.
  • the DNA-short RNA chimeric substrate is a plurality of DNA-short RNA chimeric substrates, wherein different DNA-short RNA chimeric substrates comprise different short RNAs.
  • different DNA-short RNA chimeric substrates comprise the same DNA segment.
  • Fig. 1 shows direct miRNA sequencing using NIPSS.
  • (a) A schematic diagram of the preparation of a sequencing library.
  • the miRNA sequencing library is thermally annealed (Methods 1) from three separate nucleic acid strands, which include a chimeric template, a primer (green) and a blocker (light blue) (Table 1).
  • the chimeric template is composed of a miRNA segment (red) , an abasic residue (blue dot) and a DNA segment (black) .
  • (b) The NIPSS strategy for direct miRNA sequencing.
  • NIPSS is carried out with an MspA nanopore (purple) and a wildtype (WT) phi29 DNAP (green) by following the reported enzymatic ratcheting strategy (Fig.
  • the DNA and miRNA segments of the trace are marked with black and red lines, respectively.
  • the trace segment that corresponds to reading the abasic site (X) is marked with a light blue stripe.
  • the DNA segment of the trace appears as two triangular shaped current characteristics due to the sequence design (Fig. 3) .
  • the DNA segment of the trace is immediately followed by an abrupt increase of the signal due to the introduction of the abasic site after the DNA sequence. All step transitions after the abasic signal lead to miRNA sequencing signals.
  • the current levels of “AAGA” and “TTTC”, which represent the highest and the lowest sequencing signals when reading the DNA part, are marked by black arrows.
  • the demonstrated results were acquired by performing NIPSS with an aqueous buffer of 0.3 M KCl, 10 mM HEPES/KOH, 10 mM MgCl2, 10 mM (NH4)2SO4 and 4 mM DTT at pH 7.5.
  • Fig. 2 shows detailed schematic diagrams of NIPSS.
  • the configuration of direct miRNA sequencing by NIPSS is as reported previously 2 .
  • the chimeric template is composed of a miRNA segment (red, lower part) , an abasic residue (blue dot) and a DNA segment (black, upper part) (Methods 1) .
  • With a +180 mV applied potential, the sequencing library, which is bound with a phi29 DNAP (green), was driven electrophoretically into the MspA nanopore (purple).
  • (a) Initiation of NIPSS. The cyan DNA blocker strand, which was thermally annealed with the sequencing library, was first unzipped from the chimeric template by the applied voltage.
  • This unzipping consequently triggered the replication-driven ratcheting by the phi29 DNA polymerase on top of the nanopore.
  • the transition between voltage-driven unzipping and replication-driven ratcheting represents the initiation of nanopore sequencing.
  • the motion directions of the chimeric template during unzipping and ratcheting are marked with cyan and black arrows, respectively.
  • MiRNA sequencing by NIPSS. By utilizing the phase shift, the miRNA segment passes through the pore constriction in single-nucleotide steps while the DNA drive-strand is being replicated by the phi29 DNA polymerase.
  • Fig. 3 shows design of the DNA segment within the chimeric template.
  • (a) A representative current trace acquired by sequencing DNA-miR-21 using NIPSS. The trace segments that correspond to reading DNA, the abasic site and miRNA are marked appropriately. The purpose of including the DNA segment, which is designed on the 5’ end of the chimeric template strand (Table 1), is twofold. First, it acts as a drive strand that can be enzymatically ratcheted by the phi29 DNAP against the electrophoretic force, which guarantees that the miRNA segment can be sequenced by the nanopore.
  • Fig. 4 shows identification of miR-21 and let-7a by NIPSS. MiRNA identities can be directly recognized by analyzing the miRNA part of the NIPSS signal.
  • (a) A representative current trace from DNA-miR-21 sequencing.
  • (b) A representative current trace from DNA-let-7a sequencing. Solid lines over the traces in (a-b) represent the extracted current height. “*” indicates occasional polymerase back-stepping [24] .
  • the initiation of miRNA sequencing is indicated by blue dashed lines, where an abasic spacer (X) is read by the nanopore.
  • the signal patterns of the DNA segment show high similarities since the sequence of the DNA drive-strand is identical, whereas the miRNA segments of the signals show remarkable differences between the demonstrated NIPSS events.
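  • The "extracted current height" lines drawn over the traces can be pictured as the mean current of each plateau between step transitions. The following is a minimal sketch of such a step extraction, assuming a simple jump-threshold rule; the function name, the threshold value and the toy trace are illustrative assumptions and are not the analysis procedure used in this document.

```python
from statistics import mean


def extract_steps(trace, threshold):
    """Split a current trace into plateaus and return the mean of each plateau.

    A new plateau starts whenever a sample deviates from the running mean of
    the current plateau by more than `threshold` (an assumed, simple rule).
    """
    if not trace:
        return []
    levels = []
    plateau = [trace[0]]
    for sample in trace[1:]:
        if abs(sample - mean(plateau)) > threshold:
            levels.append(mean(plateau))  # close the finished plateau
            plateau = [sample]            # start a new one
        else:
            plateau.append(sample)
    levels.append(mean(plateau))
    return levels


# Synthetic toy trace in pA (made-up numbers, not measured data).
toy_trace = [50.0, 50.4, 49.8, 62.1, 61.9, 62.0, 41.2, 40.8, 71.0, 70.6]
toy_levels = extract_steps(toy_trace, threshold=5.0)
print([round(level, 1) for level in toy_levels])
```

Because dwell times are stochastic, reducing each event to an ordered list of plateau means is what makes events of different durations comparable step by step.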
  • Fig. 5 shows identification of miRNAs using NIPSS.
  • Chimeric template strands containing miRNA-21 and Let-7a sequences were custom synthesized (Table S1) and directly sequenced by NIPSS.
  • (a) Overlay of multiple time-normalized events (N = 24) from the DNA-miR-21 results acquired by NIPSS.
  • (b) Overlay of multiple time-normalized events (N = 12) from the DNA-let-7a results acquired by NIPSS.
  • (a-b) The corresponding sequences (DNA-miR-21 or DNA-let-7a, 3’-5’ convention) are aligned above the plots.
  • the DNA segments are marked in black (left of X) and the miRNA segments are marked in red (right of X) .
  • the abasic site (X, blue) , which separates the DNA and miRNA segments, acts as a signal marker to identify the sequence transition from reading DNA to miRNA during NIPSS.
  • (c) Consensus sequencing results comparison between DNA-miR-21 and DNA-let-7a. The mean and standard deviation values are derived from time-normalized events, as demonstrated in a and b. The DNA part of the NIPSS results shows great alignment in all steps between both templates. However, the miRNA segment of the signals shows significant variations, starting from the step marked with the light blue stripe. Light blue stripes in (a-c) mark the sequencing step of TCAX, which is the first quadromer sequence containing the abasic residue when acquired by NIPSS.
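  • The quadromer bookkeeping described for Fig. 5 can be sketched in a few lines: the template is written in the 3’-5’ convention as DNA, abasic site (X), then miRNA, and each sequencing step is taken to report one overlapping 4-nucleotide window. In this sketch the mature miR-21 sequence is the commonly listed one, while `DNA_TAIL_3TO5` is a hypothetical placeholder chosen only so that it ends in TCA; it is not the drive-strand sequence of Table 1.

```python
# Mature miR-21, written 5'->3'.
MIR21_5TO3 = "UAGCUUAUCAGACUGAUGUUGA"

# Hypothetical end of the DNA drive strand, written 3'->5' (NOT from Table 1).
DNA_TAIL_3TO5 = "AAGATTTCTCA"


def reading_order(dna_3to5, mirna_5to3, spacer="X"):
    """Return the chimeric template in 3'->5' order: DNA, abasic site, miRNA."""
    return dna_3to5 + spacer + mirna_5to3[::-1]


def quadromers(sequence, k=4):
    """Return all overlapping k-nucleotide windows, one per sequencing step."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


template = reading_order(DNA_TAIL_3TO5, MIR21_5TO3)
steps = quadromers(template)

# Index of the first step whose window contains the abasic site.
first_abasic_step = next(i for i, q in enumerate(steps) if "X" in q)
print(first_abasic_step, steps[first_abasic_step])
```

With this placeholder tail, the first window containing X is TCAX, matching the marker step highlighted by the light blue stripes; every later window contains at least one miRNA nucleotide.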
  • Fig. 6 shows discrimination of miRNA isoforms (isomiRs) using NIPSS.
  • miR-21 and its isoforms. The upper image demonstrates the structure of a precursor microRNA (pre-miRNA) for human miR-21.
  • the lower image demonstrates the sequence (5’-3’) of mature miR-21 and its miR-21+U isoform, respectively.
  • An additional uracil (red box) exists at the 3’-end of miR-21+U.
  • the DNA, the abasic site and the miRNA part of the signal are marked separately.
  • the light blue strip marks the sequencing step of TCAX, which is the first sequence quadromer containing an abasic spacer encountered by NIPSS.
  • the demonstrated statistical results show great alignment in all parts except that marked with a dashed-line box.
  • a zoomed-in view of the dashed-line box in b illustrates a shift effect of current levels caused by a nucleotide insertion of DNA-miR-21+U in reference to DNA-miR-21.
  • the aligned sequence context above the results demonstrates that the addition of a uracil (red box) in DNA-miR-21+U systematically generates 1 nucleotide (marked as ⁇ ) shift.
  • This single-nucleotide shift of the results is also demonstrated by the schematic diagram in the image inset, which takes the results of step 25 as an example.
  • Fig. 7 shows representative current traces for isomiRs discrimination. Each panel shows individual trace segments, which correspond to the miRNA part of the nanopore sequencing signal of DNA-miR-21 (left) and DNA-miR-21+U (right) . Current steps with significant deviations between analytes are indicated by solid lines (blue for miR-21 and yellow for miR-21+U) . The “*” indicates polymerase back-stepping [24] that was occasionally observed. The duration time for each nanopore sequencing step is stochastic. Scale bar: 50 ms.
  • Fig. 8 shows statistical comparison between DNA-miR-21 and DNA-miR-21+U.
  • (a) Overlay of extracted current steps from multiple DNA-miR-21 events (N = 24).
  • (b) Overlay of extracted current steps from multiple DNA-miR-21+U events (N = 25).
  • the corresponding sequences are shown above the plots in (a-b), where an abasic site (X) is located between DNA and miRNA.
  • a uridine monophosphate insertion has generated an additional current step when reading DNA-miR-21+U in reference to that of DNA-miR-21. Consequently, the signal pattern from DNA-miR-21+U is shifted by 1 nucleotide.
  • the light blue strip in (a-c) represents the quadromer reading of TCAX, which is the first quadromer containing the abasic site read by NIPSS.
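  • The one-nucleotide shift caused by the extra 3’ uracil can be illustrated by comparing the quadromer series of the two templates. This is a minimal sketch under the same assumptions as before: windows of four nucleotides, 3’-5’ reading order, and a hypothetical DNA tail (`DNA_TAIL_3TO5`) that is not taken from Table 1.

```python
MIR21_5TO3 = "UAGCUUAUCAGACUGAUGUUGA"   # mature miR-21, 5'->3'
MIR21_PLUS_U_5TO3 = MIR21_5TO3 + "U"    # isoform with one extra 3' uracil
DNA_TAIL_3TO5 = "AAGATTTCTCA"           # hypothetical placeholder, 3'->5'


def reading_quadromers(dna_3to5, mirna_5to3):
    """Quadromers of the chimeric template read 3'->5' (DNA, X, miRNA)."""
    template = dna_3to5 + "X" + mirna_5to3[::-1]
    return [template[i:i + 4] for i in range(len(template) - 3)]


ref = reading_quadromers(DNA_TAIL_3TO5, MIR21_5TO3)
iso = reading_quadromers(DNA_TAIL_3TO5, MIR21_PLUS_U_5TO3)

# First step at which the two series disagree.
first_difference = next(i for i, (a, b) in enumerate(zip(ref, iso)) if a != b)

# Steps of the reference that no longer contain X reappear one step later.
x_position = len(DNA_TAIL_3TO5)
shifted_by_one = all(
    ref[i] == iso[i + 1] for i in range(x_position + 1, len(ref))
)
print(first_difference, shifted_by_one, len(iso) - len(ref))
```

Both series agree up to and including the TCAX step, diverge immediately afterwards, and every pure-miRNA quadromer of miR-21 reappears exactly one step later in miR-21+U, which has one additional step in total.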
  • Fig. 9 shows direct N6-methyladenosine (m6A) mapping using NIPSS.
  • (a) Schematic diagram of chimeric template strands containing canonical adenosine (A) or N6-methyladenosine (m6A) . A single “A” or “m6A” is embedded in different chimeric strands (top: DNA-miR-21, bottom: DNA-miR-21 (m6A) , Table 1) for NIPSS sequencing, where its chemical structure and location is annotated. Black and light red segments represent the DNA and the miRNA part of the strand.
  • (b) Schematic diagram of NIPSS sequencing of miRNA containing “A” or “m6A” nucleotides.
  • A*within the sequence context represents the “A” or “m6A” nucleotide within the strand.
  • (c) A representative current trace of DNA-miR-21 sequenced by NIPSS. Dark blue lines represent extracted mean current values from each step. The corresponding sequence context is aligned below in which a canonical adenosine is marked in gray highlight.
  • (d) A representative current trace when DNA-miR-21 (m6A) is sequenced by NIPSS. Light red lines represent extracted mean current values from each step. The corresponding sequence context is aligned below, in which an N6-methyladenosine is marked in gray highlight.
  • the standard deviation values of DNA-miR-21 are demonstrated with gray columns and the standard deviation of DNA-miR-21 (m6A) is demonstrated with black error bars.
  • the m6A modification results in a signal variation when m6A-containing sequence quadromers were read by the nanopore constriction. Due to the limited spatial resolution of MspA, a single m6A modification results in detectable signal fluctuations within three current steps, as marked by the dashed box.
  • Fig. 10 shows representative current traces for m6A modification detection.
  • the sequence contexts between DNA-miR-21 and DNA-miR-21 (m6A) differ with only one m6A modification (Table 1) .
  • Each panel shows individual trace segments, which correspond to the miRNA part of the sequencing signal of DNA-miR-21 (left) and DNA-miR-21 (m6A) (right) .
  • Current steps with significant deviations between analytes are indicated by solid lines. These level differences could also be recognized from the dashed reference lines.
  • the duration time for each nanopore sequencing step is stochastic. Scale bar: 50 ms.
  • Fig. 11 shows statistical signal variation between DNA-miR-21 and DNA-miR-21 (m6A).
  • (a) Overlay of current steps extracted from multiple DNA-miR-21 events (N = 24).
  • (b) Overlay of current steps extracted from multiple DNA-miR-21 (m6A) events (N = 24).
  • Each current step in (a-b) represents the mean current values of each quadromer reading.
  • the correlated sequences are shown above the plots.
  • the “X” within the sequence stands for the abasic site located between the DNA and the miRNA segment of the chimeric template.
  • the average current values and error bars are created from current level traces exhibited in a and b.
  • the light blue strips in (a-c) represent the quadromer reading of TCAX, which is the first quadromer containing the abasic site when read by NIPSS.
  • (d) Statistical current differences between DNA-miR-21 and DNA-miR-21 (m6A) .
  • Mean current values in (c) were used to construct the current difference map, where the value of I(DNA-miR-21 (m6A)) - I(DNA-miR-21) is displayed by a black solid line.
  • the standard deviation of signals from DNA-miR-21 is shown with gray columns.
  • the standard deviation of signals from DNA-miR-21 (m6A) is demonstrated with black bars.
  • Signal variations caused by the m6A modification are marked by a dashed box.
  • the sequence of the strand is aligned below the figure, where A* (gray highlighted) represents either A or m6A in the sequence.
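  • The current difference map of panel (d) amounts to a per-step subtraction of consensus means, shown alongside the per-step standard deviations of each data set. The sketch below assumes events have already been reduced to equal-length lists of step levels; the numbers are synthetic, and the flagging rule (difference larger than the sum of both standard deviations) is an assumption for illustration, not a criterion stated in this document.

```python
from statistics import mean, stdev


def consensus(events):
    """Per-step mean and standard deviation over equal-length events."""
    columns = list(zip(*events))
    return [mean(c) for c in columns], [stdev(c) for c in columns]


def difference_map(events_modified, events_reference):
    """Return (I_modified - I_reference, std_reference, std_modified) per step."""
    mean_mod, std_mod = consensus(events_modified)
    mean_ref, std_ref = consensus(events_reference)
    diff = [m - r for m, r in zip(mean_mod, mean_ref)]
    return diff, std_ref, std_mod


def flag_steps(diff, std_ref, std_mod):
    """Steps whose difference exceeds both spreads combined (assumed rule)."""
    return [i for i, d in enumerate(diff) if abs(d) > std_ref[i] + std_mod[i]]


# Synthetic step levels in pA (made-up numbers, not measured data).
reference_events = [
    [40.0, 55.0, 48.0, 60.0, 52.0],
    [41.0, 54.0, 49.0, 61.0, 51.0],
    [39.0, 56.0, 47.0, 59.0, 53.0],
]
modified_events = [
    [40.5, 55.5, 53.0, 64.0, 52.5],
    [39.5, 54.5, 52.0, 65.0, 51.5],
    [40.0, 55.0, 54.0, 66.0, 52.0],
]

diff, std_ref, std_mod = difference_map(modified_events, reference_events)
flagged = flag_steps(diff, std_ref, std_mod)
print([round(d, 2) for d in diff], flagged)
```

In this toy data only two adjacent steps are flagged, mirroring how a single modified nucleotide perturbs a short run of consecutive quadromer readings rather than one isolated step.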
  • Fig. 12 shows enzymatic ligation between DNA and RNA.
  • (a) A schematic diagram of enzymatic ligation. The reaction starts with a 58 nt DNA linker with a 5’ phosphate group (5PO4 DNA). After treatment with the Mth RNA ligase (New England Biolabs), the 5’ end of the DNA linker is adenylated (5AppDNA). To minimize non-specific ligation, T4 RNA ligase 2 truncated K227Q mutant (T4 Rnl2tr, New England Biolabs) was chosen, which specifically ligates the 5’ end of the pre-adenylated DNA linker (5AppDNA) with the 3’ end of target miRNA.
  • Lane m stands for the Low Range ssRNA Ladder (New England Biolabs) . 5’ adenylation results in slight upshift of the band during gel electrophoresis.
  • the ligation reaction was performed by mixing the following components: 5.0 μL 50% (w/v) PEG 8000, 2 μL 10x RNA ligase buffer, 20 pmol 5AppDNA, 10 pmol miRNA, 1 μL T4 Rnl2tr, nuclease-free water to a final volume of 20 μL.
  • the reaction was incubated at 4 °C for 24 h and heat inactivated by incubation at 65 °C for 20 min.
  • Lane A indicates the microRNA Marker (New England Biolabs) .
  • Lane B stands for the miR-21+U strand.
  • Lane + stands for the ligation product.
  • Lane m stands for the Low Range ssRNA Ladder (New England Biolabs) .
  • An extra, high molecular weight band was detected in lane m, which is the ligated DNA-miRNA chimeric strand.
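  • The ligation mix described for Fig. 12 can be checked with simple arithmetic. This sketch computes the water volume and final concentrations for the stated 20 μL reaction; the stock concentrations of 5AppDNA and miRNA (`dna_stock_um`, `mirna_stock_um`, both 10 μM here) are assumptions introduced only to make the calculation concrete, since the document specifies amounts in pmol rather than volumes.

```python
def ligation_mix(
    total_ul=20.0,
    peg_ul=5.0,           # 50% (w/v) PEG 8000
    buffer_ul=2.0,        # 10x RNA ligase buffer
    ligase_ul=1.0,        # T4 Rnl2tr
    dna_pmol=20.0,        # 5AppDNA
    mirna_pmol=10.0,      # miRNA
    dna_stock_um=10.0,    # ASSUMED stock concentration (1 uM = 1 pmol/uL)
    mirna_stock_um=10.0,  # ASSUMED stock concentration
):
    """Return volumes (uL) and final concentrations for the ligation reaction."""
    dna_ul = dna_pmol / dna_stock_um
    mirna_ul = mirna_pmol / mirna_stock_um
    water_ul = total_ul - (peg_ul + buffer_ul + ligase_ul + dna_ul + mirna_ul)
    if water_ul < 0:
        raise ValueError("components exceed the final reaction volume")
    return {
        "dna_ul": dna_ul,
        "mirna_ul": mirna_ul,
        "water_ul": water_ul,
        "peg_final_percent": 50.0 * peg_ul / total_ul,
        "buffer_final_x": 10.0 * buffer_ul / total_ul,
        "dna_to_mirna_ratio": dna_pmol / mirna_pmol,
    }


mix = ligation_mix()
for name, value in mix.items():
    print(f"{name}: {value}")
```

Independently of the assumed stocks, the stated recipe gives a final PEG 8000 concentration of 12.5% (w/v), 1x ligase buffer, and a 2:1 molar excess of adenylated DNA linker over miRNA.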
  • Fig. 13 shows a proposed strategy to directly sequence miRNA from natural resources.
  • (a-c) Schematic diagram of NIPSS sequencing library preparation with isolated miRNA from natural resources. Isolated miRNAs (a) could be ligated to form DNA-miRNA chimeric strands (b) and subsequently form sequencing libraries (c) by thermal annealing.
  • Direct miRNA sequencing is carried out as described in this invention.
  • Current step transitions during NIPSS reading of miRNA could be decoded into RNA sequences for downstream clinical diagnosis or bioinformatics investigations.
  • Fig. 14 shows purification and characterization of Phi29 DNAP.
  • Prokaryotic expressed Phi29 DNAP (Methods) , which contains a hexa-histidine tag on its C-terminus, could be purified by nickel affinity chromatography and characterized by gel electrophoresis.
  • the Hairpin DNA (Table 1) , which forms a hairpin structure with a 5’ overhang upon thermal annealing, was used to test the replication ability of the two home-made Phi29 DNAP.
  • 1.8 μM Hairpin DNA was dissolved in the electrolyte buffer (0.3 M KCl, 10 mM HEPES/KOH, 10 mM MgCl2, 10 mM (NH4)2SO4, 4 mM DTT, pH 7.5) followed by thermal annealing.
  • Enzymatic driven chain elongation was performed with the addition of 5% (v/v) Phi29 DNAP and dNTP with a final concentration of 0.25 mM at room temperature for 1 h.
  • the chain elongation results were characterized by 14% polyacrylamide gel electrophoresis.
  • Lane 1 the elongated DNA product by D12A/D66A mutant (home-made) ;
  • Lane 2 the elongated product by phi29 DNAP-wt (home-made) ;
  • Lane 3 the elongated DNA product by the phi29 DNAP (NEB);
  • Lane 4 DNA without any DNA polymerase;
  • Lane M 20 bp DNA Ladder (Dye Plus, BioRad).
  • the appearance of the extension product has verified the replication ability of Phi29 DNAP-wt and Phi29 DNAP D12A/D66A.
  • MicroRNAs are a class of short non-coding RNAs that function in RNA silencing and post-transcriptional gene regulation. Besides their participation in regulating normal physiological activities, specific miRNA types could act as oncogenes, tumor suppressors or metastasis regulators, which are critical biomarkers for cancer.
  • Enzyme-assisted sequencing, especially Nanopore Induced Phase Shift Sequencing (NIPSS), which is a variant form of nanopore sequencing, could be used to directly sequence short RNA including miRNA.
  • NIPSS clearly discriminates between different identities, isoforms and epigenetic variants of model miRNA sequences.
  • This invention demonstrates the first report of direct miRNA sequencing, which serves as a complement to existing miRNA sensing routines by the introduction of single molecule resolution. Future engineering of this technique may assist miRNA based early stage diagnosis or inspire novel cancer therapeutics.
  • Mature miRNAs, measuring ~22 nucleotides (nt) in length, are ideal analytes for NIPSS. Though the 15 nt read-length fails to cover the miRNA length completely, it demonstrates the first single molecule sequencing attempt for miRNAs, which is superior in principle to existing miRNA sensing methods because no amplification or prior knowledge of target sequences is required, while all epigenetic information within the sequence is retained and can be resolved.
  • the method of the invention enables sequencing of miRNAs by DNA-RNA complexes using nanopore sequencing.
  • the nanopore is the only access which permits the flow of the ionic current between two compartments containing electrolytic solutions. With an applied potential, charged analytes are electrically driven through the pore, and the analytical information could be readily recognized and distinguished from the current blockades. It is very difficult to sequence miRNAs in the prior art.
  • the traditional nanopore sequencing method is designed for long sequences and is not suitable for miRNAs. miRNAs may not translocate through the nanopore or may not stay in the channel long enough.
  • the invention provides a method of determining the sequence of a short RNA, the method comprising:
  • each of the two compartments comprises a liquid medium and the interface has a channel so dimensioned as to allow passage of only one single-strand polynucleotide at a time;
  • a DNA-short RNA chimeric substrate having a double-stranded portion and a single-stranded portion, wherein at least part of the DNA segment of the chimeric substrate is double-stranded and the single-stranded portion comprises the short RNA and optionally part of the DNA segment; and the short RNA is adjacent to the 5' end of the DNA segment in the chimeric template;
  • short RNA strand may be derived from miRNA, siRNA or piRNA, and may be a single strand. In some embodiments, the short RNA is miRNA. The length of the short RNA strand may be 5-50 nt, such as 10-30 nt, preferably 15-20 nt.
  • miRNA is used according to its ordinary and plain meaning and refers to a microRNA molecule found in eukaryotes that is involved in RNA-based gene regulation.
  • the target short RNA is conjugated with a DNA.
  • processing of the DNA by the enzyme slows the translocation of the short RNA enough to enable sequencing of the short RNA.
  • the chimeric substrate comprises DNA double-stranded portion so that the enzyme can process the DNA segment of the chimeric substrate.
  • the chimeric substrate may be formed by conjugating the target short RNA with a single-stranded DNA to form a chimeric single-strand and then hybridizing the chimeric single-strand with a DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand.
  • the chimeric substrate may comprise a DNA-short RNA chimeric single-strand and a DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand, wherein the chimeric single-strand comprises the short RNA conjugated with a DNA.
  • the DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand may be present in the same strand as the chimeric single-strand and form a hairpin structure with the chimeric single-strand.
  • the hairpin structure is optional and the DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand may be an individual fragment.
  • the chimeric substrate and the enzyme may be introduced into one of the two compartments by the following steps: first, the chimeric single-strand is synthesized; then the chimeric single-strand is hybridized with the complementary DNA fragment to form the chimeric substrate; then the chimeric substrate is incubated with the enzyme to form a complex; and finally the complex is added into one of the two compartments.
  • the chimeric single-strand, the complementary DNA fragment and the enzyme may be added into one of the two compartments together to form a complex in the compartment.
  • the chimeric single-strand may be hybridized with the complementary DNA fragment to form the chimeric substrate, and the chimeric substrate and the enzyme may then be added into one of the two compartments together to form a complex in the compartment.
  • the chimeric single-strand may be prepared by the following steps: the 5’ phosphorylated single-stranded DNA (5PO4DNA) is first 5’ adenylated with the aid of the Mth (Methanobacterium) RNA ligase; then the adenylated DNA (5AppDNA) is purified by ethanol precipitation and further ligated to the 3’-end of the target short RNA strand by an RNA ligase, preferably T4 RNA ligase 2 truncated K227Q mutant (T4 Rnl2tr).
  • the spacer may cause unique blockage current when passing through the channel and may serve as a signal marker to identify the beginning or the ending of sequencing signals from miRNA.
  • the spacer may be an abasic spacer, which can produce an abnormally large blockage current.
  • the spacer may be any other suitable spacer which can cause a unique blockage current when passing through the nanopore to distinguish it from adjacent nucleotides.
  • the spacer is optional.
  • the RNA may be conjugated directly with the DNA.
  • the spacer (if present) may be synthesized in one end of the single-stranded DNA, wherein said end is the end that will conjugate with the short RNA.
  • the single-stranded DNA with a spacer then can be used to prepare the chimeric single-strand following the above method.
  • the length of the DNA segment in the DNA-short RNA chimeric single-strand may be greater than or equal to 10nt, greater than or equal to 20nt, equal to 30nt, equal to 40nt, equal to 50nt, or equal to 60nt.
  • the enzyme may be any enzyme that can process DNA in a nucleotide by nucleotide way, including, but not limited to, DNA polymerase, DNA helicase, exonuclease etc.
  • the enzyme may be DNA polymerase.
  • DNA polymerase enables sequencing of short RNA by means of NIPSS, wherein RNA undergoes forward and reverse ratcheting translocation.
  • polymerase refers to an enzyme that performs template-directed synthesis of polynucleotides.
  • DNA polymerases are well-known to those skilled in the art. Examples of the DNA polymerase include, but are not limited to, phi29 DNA polymerase, T7 DNA polymerase, His 1 DNA polymerase, His 2 DNA polymerase, Bacillus phage M2 DNA polymerase, Streptococcus phage CP1 DNA polymerase, enterobacter phage PRD1 DNA polymerase, and variants thereof. Phi29 DNA polymerase is particularly preferable.
  • Phi29 DNA polymerase is derived from Bacillus subtilis bacteriophage phi29 (Blanco, L.; Salas, M. J. Biol. Chem. 1996, 277, 8509-8512; Salas, M.; Blanco, L.; Lazaro, J.M.; de Vega, M. IUBMB Life 2008, 60, 82-85) and works at room temperature with high processivity.
  • the monomeric phi29 DNAP (~66.5 kDa) belongs to the B family of DNA polymerases and has both 5’-3’ primer strand extension and 3’-5’ exonuclease functions in the presence of Mg2+, as well as 5’-3’ strand displacement activity.
  • Phi29 DNAP is able to synthesize very long stretches of DNA (>70 kb) along a single template strand, and it can synthesize under loads of up to ~37 pN, enough to counteract the electrophoretic force needed to drive DNA through the nanopore.
  • the DNA polymerase may be a mutant of wild-type phi29 DNAP, such as a mutant having K555T/D570N (Sakatani Y et al., Protein Eng Des Sel. 2019, 32 (11): 481-487) or D12A/D66A compared with wild-type phi29 DNAP.
  • the mutant of wild-type phi29 DNAP has only K555T/D570N or D12A/D66A compared with wild-type phi29 DNAP.
  • K555T/D570N refers to K555T and D570N.
  • D12A/D66A refers to D12A and D66A.
  • the DNA-short RNA chimeric substrate may comprise a DNA-short RNA chimeric single-strand, a primer and a blocking oligomer.
  • the short RNA is adjacent to the 5' end of the DNA segment.
  • the primer is annealed to the 3' end of the DNA segment of the chimeric single-strand.
  • the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand adjacent (preferably immediately adjacent) to the 3' end of the primer to prevent extension of the primer in the course of the forward ratcheting translocation; that is, the 3' end of the primer is protected by the blocking oligomer.
  • the double-stranded portion consists of the primer, the blocking oligomer and the hybridized DNA segment.
  • the polymerase is bound to the DNA-short RNA chimeric substrate.
  • the single-stranded portion of the polynucleotide complex is drawn into the channel, causing the blocking oligomer to unzip from the 3' end to the 5' end.
  • the blocking oligomer is completely unzipped, the extendable 3' end of the primer is exposed, allowing polymerase-driven primer extension to start.
  • as the primer is extended, the DNA-short RNA chimeric substrate translocates in the reverse direction through the channel; the short RNA also translocates through the channel, giving rise to an ion current blockade which can be measured.
  • during primer extension driven by the polymerase reaction, DNA and RNA are read sequentially.
  • the sequencing signal halts.
  • There is a fixed distance between the constriction site of the channel and the reaction site of the polymerase, and the fixed distance determines the length of the sequence of the RNA that can be detected. For example, for MspA, the fixed distance is equivalent in length to 14-15 nucleotides.
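  • The effect of the fixed distance on the readable RNA length can be illustrated with a toy model. This is a minimal sketch, not the method of the invention: the function names, the default offset of 15 positions, the single-position spacer and the assumption that primer extension stops at the end of the DNA segment are all illustrative choices.

```python
def simulate_nipss_reads(dna_len, rna_len, spacer_len=1, offset=15):
    """Return the region at the pore constriction for each primer-extension step.

    Toy model (assumption): positions are counted along the chimeric strand,
    starting at the first templated DNA base (position 0) and increasing
    toward the RNA. At extension step i the polymerase reaction site is at
    position i and the constriction is `offset` positions further along.
    """
    total = dna_len + spacer_len + rna_len
    regions = []
    for step in range(dna_len):  # assumption: extension stops at the end of the DNA
        pos = step + offset
        if pos >= total:
            break  # the end of the strand has already passed the constriction
        if pos < dna_len:
            regions.append("DNA")
        elif pos < dna_len + spacer_len:
            regions.append("spacer")
        else:
            regions.append("RNA")
    return regions


def readable_rna_length(dna_len, rna_len, spacer_len=1, offset=15):
    """Number of RNA positions that reach the constriction in the toy model."""
    return simulate_nipss_reads(dna_len, rna_len, spacer_len, offset).count("RNA")


if __name__ == "__main__":
    # Hypothetical 40 nt DNA segment, one abasic spacer, 22 nt miRNA.
    print(readable_rna_length(dna_len=40, rna_len=22))
```

  • In this model the readable RNA length is capped by the offset minus the spacer length, which is consistent with a read-length shorter than a full-length miRNA.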
  • the RNA is adjacent to the 5' end of the DNA segment in the DNA-RNA chimeric template. Since the RNA is not involved in any reaction, the orientation of the RNA itself is not limited.
  • the RNA may be ligated to the DNA segment at the RNA's 5' end or at the RNA's 3' end.
  • the DNA segment may contain unique sequence repeats to generate a unique signal pattern during primer extension, which marks the initiation of the primer extension.
  • the sequence repeats include sequence repeats between "AAGA" and "TTTC" (3'-5').
  • the primer may have a hairpin on its 5' end to prevent the polymerase from acting on the double-stranded end of the DNA segment.
  • the hairpin in the primer is optional.
  • the length of the primer sequence that can form a duplex with the chimeric single-strand is not limited, and may be 15-30nt, such as 18-27nt, e.g. 20-25nt.
  • the blocking oligomer may be any sequence that can hybridize with the DNA segment adjacent to the primer.
  • the blocking oligomer may be single-stranded DNA.
  • the length of the blocking oligomer is not limited and may be 15-30nt, such as 18-27nt, e.g. 20-25nt.
  • the 3' end of the blocking molecule is not paired with the DNA segment of the chimeric single-strand, so as to have a free end for unzipping.
  • the 3' end of the blocking oligomer may have several (e.g. 1-10) nucleotides mismatched with the chimeric single-strand or have several (e.g. 1-10) abasic residues.
  • a cholesterol molecule may be linked to the 3' end of the blocking oligomer to facilitate unzipping.
  • the cholesterol molecule may be anchored to the membrane between the two compartments, facilitating the entry of the single-stranded portion of the chimeric substrate into the channel.
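  • The geometry of the primer and the blocking oligomer described above can be sketched in code. This is an illustrative sketch only: the function names are hypothetical, the DNA segment is assumed to be written 5' to 3' with the primer site at its 3' end and the blocker site immediately upstream of it, and the unpaired 3' tail is supplied by the caller, who must make sure it does not pair with the template.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def design_primer_and_blocker(dna_segment, primer_len=22, blocker_len=15, free_tail=""):
    """Return (primer, blocker), both written 5' to 3'.

    The primer pairs with the last `primer_len` bases of the DNA segment.
    The blocker pairs with the `blocker_len` bases immediately upstream, so
    its 5' end abuts the 3' end of the primer; `free_tail` is appended to
    its 3' end as the unpaired end for unzipping (hypothetical parameter).
    """
    if primer_len <= 0 or blocker_len <= 0:
        raise ValueError("primer_len and blocker_len must be positive")
    if primer_len + blocker_len > len(dna_segment):
        raise ValueError("DNA segment is shorter than primer + blocker footprint")
    primer_site = dna_segment[-primer_len:]
    blocker_site = dna_segment[-(primer_len + blocker_len):-primer_len]
    primer = reverse_complement(primer_site)
    blocker = reverse_complement(blocker_site) + free_tail
    return primer, blocker


if __name__ == "__main__":
    # Toy DNA segment, not a sequence from the source.
    print(design_primer_and_blocker("AAAACCCCGGGGTTTT", 4, 4, free_tail="TT"))
```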
  • the chimeric single-strand, the primer and the blocking oligomer are mixed and thermally annealed to form the DNA-short RNA chimeric substrate, and then the chimeric substrate is added into one of the two compartments.
  • the chimeric single-strand, the primer and the blocking oligomer are added to one of the two compartments together and form a polynucleotide complex in the liquid medium of the compartment.
  • the polymerase may be first incubated with the chimeric substrate to form a complex and then the complex is added to one of the two compartments.
  • the polymerase may be added into one of the two compartments together with the pre-formed chimeric substrate or with the chimeric single-strand, the primer and the blocking oligomer to form a complex in the compartment.
  • the enzyme may be a DNA helicase.
  • DNA helicase enables sequencing of short RNA during the course of unzipping of the DNA double-strand.
  • the term "DNA helicase" refers to an enzyme that unwinds duplex DNA.
  • DNA helicases are well-known to those skilled in the art. Examples of the DNA helicase include, but are not limited to, Hel 308 helicase, RecD helicase, e.g., TraI helicase, TrwC helicase, XPD helicase, Dda helicase and variants thereof.
  • Hel 308 DNA helicase (which is also called DNA Helicase HEL308) is particularly preferable.
  • Hel 308 helicase (which is also called DNA Helicase HEL308) is an ATP-dependent Ski2-like superfamily II (SF2) helicase/translocase that unwinds duplex DNA in the 3′ to 5′ direction.
  • the DNA-short RNA chimeric substrate may comprise a DNA-short RNA chimeric single-strand and a blocking oligomer.
  • the short RNA may be adjacent to the 5' end or the 3' end of the DNA segment.
  • the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand.
  • the double-stranded portion consists of the blocking oligomer and the hybridized DNA segment.
  • the helicase is bound to the DNA-short RNA chimeric substrate. When the electric potential difference between the two compartments is applied, the single-stranded portion of the polynucleotide complex is drawn into the channel. Meanwhile, the blocking oligomer is unzipped by the helicase, driving the short RNA to translocate through the channel in a nucleotide by nucleotide way, giving rise to an ion current blockade which can be measured.
  • the blocking oligomer may be any sequence that can hybridize with the DNA segment adjacent to the primer.
  • the blocking oligomer may be single-stranded DNA.
  • the length of the blocking oligomer is not limited and may be 15-30nt, such as 18-27nt, e.g. 20-25nt.
  • At least one end of the blocking molecule is not paired with the DNA segment of the chimeric single-strand to have a free end for unzipping.
  • the free end of the blocking oligomer may have several (e.g. 1-10) nucleotides mismatched with the chimeric single-strand or have several (e.g. 1-10) abasic residues.
  • a cholesterol molecule may be linked to one end (such as the free end) of the blocking oligomer to facilitate unzipping.
  • the cholesterol molecule may be anchored to the membrane between the two compartments, facilitating the entry of the single-stranded portion of the chimeric substrate into the channel.
  • the chimeric single-strand and the blocking oligomer are mixed and thermally annealed to form the DNA-short RNA chimeric substrate, and then the chimeric substrate is added into one of the two compartments.
  • the chimeric single-strand and the blocking oligomer are added to one of the two compartments together and form a polynucleotide complex in the liquid medium of the compartment.
  • the helicase may be first incubated with the chimeric substrate to form a complex and then the complex is added to one of the two compartments. Alternatively, the helicase may be added into one of the two compartments together with the pre-formed chimeric substrate or with the chimeric single-strand and the blocking oligomer to form a complex in the compartment.
  • the interface separating the two compartments may be a membrane, and may be any material capable of supporting the channel.
  • the membrane may be natural membrane, synthetic membrane or artificial membrane.
  • the membrane may be a solid membrane and may comprise or consist of a solid substrate, such as SiNx, glass, silicon dioxide, molybdenum disulfide, graphene, aluminium oxide, or CNT (carbon nano tube) , etc.
  • the solid membrane or nanopore may further comprise a linker group compound that is attached by covalent bond.
  • the polymerase may be attached to a solid membrane or solid nanopore using a suitable linker group.
  • the membrane may be a lipid bilayer, such as formed from amphiphilic molecules, e.g. phospholipids, which have both hydrophilic and lipophilic properties.
  • the membrane may be a polymer layer, such as formed from a block-copolymer.
  • the lipid bilayer may be artificial, for example non-natural. Methods for forming lipid bilayers are known in the art.
  • the diameter of the channel in the interface may be about 0.25nm to about 4 nm.
  • the channel may be a nanopore.
  • the channel is narrow enough that the blockage of the channel by the nucleotide can be detected by a change in a particular signal, for example, fluorescence signal.
  • the narrowest site of the channel may be called a constriction site.
  • the channel may have enough height to perform "Nanopore Induced Phase-Shift Sequencing (NIPSS) " .
  • the channel may have enough height to accommodate the short RNA, such as a full-length miRNA or at least partial sequence of the short RNA, such as at least partial sequence of a miRNA.
  • the nanopore may be a solid nanopore, a protein nanopore or a DNA nanopore, whether the membrane is a solid membrane or a lipid bilayer.
  • the nanopore may be natural, for example derived from a biological organism, or the nanopore may be synthetic.
  • the nanopore may be recombinantly produced.
  • the nanopore may be formed by a biological molecule, such as a protein (also called a protein nanopore or a nanopore-forming protein).
  • a protein nanopore comprises different regions, including the entrance, the vestibule, and the narrowest region, called the constriction. When performing nanopore sequencing, the ionic current is blocked at the constriction, thereby causing a detectable change of the electric current, which can reflect the sequence information.
  • the protein nanopore used in this invention preferably has no spontaneous gating activities and/or preferably keeps open when the analyte is absent.
  • the protein nanopore used in this invention may be any suitable protein nanopore.
  • Examples of the protein nanopore or nanopore-forming protein include CsgG, α-HL, ClyA, Phi29 connector protein, aerolysin, MspA, OmpF, OmpG, FraC, HlyA, SheA, sp1 or variants thereof.
  • Other examples of biological molecule nanopores include nanopores formed by DNA self-assembly.
  • the nanopore that can be used in the invention may also be ion channel, such as potassium channel or sodium channel and the like.
  • the protein nanopore or nanopore-forming protein is MspA or mutant MspA.
  • MspA refers to Mycobacterium smegmatis porin A and is octameric. MspA monomers associate with each other and form a conically shaped pore with a finite height (~10 nm).
  • the term "mutant MspA" refers to a mutant of wild type MspA. In some embodiments, the mutant MspA may comprise the mutations of D90N/D91N/D93N/D118R/D134R/E139K (including all of listed mutations) compared to the wild-type MspA.
  • the mutant MspA may only have the mutations of D90N/D91N/D93N/D118R/D134R/E139K compared to the wild-type MspA.
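  • Mutation strings such as D90N/D91N/D93N/D118R/D134R/E139K or K555T/D570N can be applied to a sequence programmatically. The following is a minimal sketch with hypothetical function names; it assumes 1-based numbering on exactly the sequence that is supplied, which may differ from the numbering convention of a given database entry, and it uses a toy sequence rather than SEQ ID NO: 1.

```python
import re


def parse_mutation(code):
    """Split a code such as 'D90N' into (wild-type residue, position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a point-mutation code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutations(sequence, spec):
    """Apply a slash-separated list of point mutations to a protein sequence.

    Each wild-type residue is checked before substitution, so a numbering
    mismatch raises an error instead of silently producing a wrong mutant.
    """
    residues = list(sequence)
    for code in spec.split("/"):
        wild_type, position, new = parse_mutation(code.strip())
        index = position - 1  # assumption: 1-based numbering on this sequence
        if not 0 <= index < len(residues):
            raise ValueError(f"{code}: position outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(f"{code}: found {residues[index]} at position {position}")
        residues[index] = new
    return "".join(residues)


if __name__ == "__main__":
    # Toy five-residue sequence for illustration only.
    print(apply_mutations("MKDAD", "D3N/D5A"))
```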
  • Sequences of wild type MspA monomers are known by the person skilled in the art. For example, sequences of wild type MspA monomers can be found in GenBank at https://www.ncbi.nlm.nih.gov/.
  • the wild-type MspA porin monomer may have the following amino acid sequence:
  • the wild-type MspA porin monomer may consist of SEQ ID NO: 1.
  • a protein nanopore is added into the compartment into which the DNA-short RNA chimeric substrate will be added; the protein nanopore can then spontaneously insert into the lipid bilayer between the two compartments.
  • the polymerase is present on the side farther from the constriction site of the channel.
  • “Farther” means that, of the two sides of the channel, the polymerase is present on the side whose distance from the constriction site is greater.
  • a protein nanopore is added into the compartment which accommodates the chimeric substrate and the enzyme, and the smaller end of the protein nanopore can spontaneously insert into the lipid bilayer between the two compartments.
  • the liquid medium comprised in each of the two compartments may contain ions.
  • the liquid medium may be conductive, such as having an electrolyte buffer, e.g. an aqueous electrolyte buffer.
  • the liquid medium may comprise one or more of a salt, a detergent, or a buffering agent.
  • the liquid medium may comprise HEPES or Tris-HCl buffer.
  • the liquid medium may have a pH of from 4.0 to 12.0, from 4.5 to 10.0, from 5.0 to 9.0, from 5.5 to 8.8, from 6.0 to 8.7 or from 7.0 to 8.8 or 7.5 to 8.5.
  • the pH used is preferably about 7.5.
  • the liquid medium may comprise potassium chloride (KCl), sodium chloride (NaCl) or caesium chloride (CsCl).
  • KCl is preferred.
  • the salt concentration may be up to saturation.
  • the salt concentration may be 3M or lower and is typically from 0.1 to 2.5 M, from 0.3 to 1.9 M, from 0.5 to 1.8 M, from 0.7 to 1.7 M, from 0.9 to 1.6 M or from 1M to 1.4 M.
  • the salt concentration is preferably from 150 mM to 1M.
  • High salt concentrations provide a high signal to noise ratio and allow for currents indicative of the presence of a nucleotide to be identified against the background of normal current fluctuations.
  • the concentration of the salts should allow the enzyme used in the present invention to work.
  • the liquid medium may comprise KCl and HEPES/KOH pH 7.5, for example, the liquid medium may comprise 0.3 M KCl, 10 mM HEPES/KOH, 10 mM (NH4)2SO4 and 4 mM DTT, pH 7.5.
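  • As a worked example of the arithmetic behind such a recipe, the sketch below converts the example concentrations into masses for a chosen volume. The dictionary and function names are hypothetical; the molar masses are standard values for the anhydrous compounds (HEPES as the free acid), and the pH adjustment with KOH is not modelled.

```python
# Molar masses in g/mol (standard values; HEPES as free acid).
MOLAR_MASS_G_PER_MOL = {
    "KCl": 74.55,
    "HEPES": 238.30,
    "(NH4)2SO4": 132.14,
    "DTT": 154.25,
}

# Example concentrations from the text, in mol/L.
RECIPE_MOLAR = {
    "KCl": 0.3,
    "HEPES": 0.010,
    "(NH4)2SO4": 0.010,
    "DTT": 0.004,
}


def masses_for_volume(volume_l, recipe=RECIPE_MOLAR):
    """Mass in grams of each component needed for `volume_l` litres of medium."""
    return {
        name: concentration * volume_l * MOLAR_MASS_G_PER_MOL[name]
        for name, concentration in recipe.items()
    }


if __name__ == "__main__":
    for name, grams in masses_for_volume(0.1).items():
        print(f"{name}: {grams:.4f} g per 100 mL")
```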
  • the methods of the present invention may be carried out at from 0 °C to 100 °C, from 15 °C to 95 °C, from 16 °C to 90 °C, from 17 °C to 85 °C, from 18 °C to 80 °C, from 19 °C to 70 °C, or from 20 °C to 60 °C.
  • the methods are typically carried out at room temperature.
  • the methods are optionally carried out at a temperature that supports enzyme function, such as about 37 °C.
  • the liquid medium of the compartment which accommodates the DNA polymerase may also comprise divalent cations such as Mg2+, Ca2+ or Co2+, and deoxyribonucleotide triphosphates (dNTPs).
  • the divalent cations may be, for example, MgCl2.
  • the dNTPs may comprise all four standard dNTPs.
  • the liquid medium comprises 10 mM MgCl2 and 250 μM dNTPs.
  • the divalent cation may be added into the liquid medium at any suitable time, such as being added when formulating the liquid medium.
  • the dNTPs may be added into the liquid medium at any suitable time, such as being added when formulating the liquid medium or being added together with the DNA-short RNA substrate.
  • the liquid medium of the compartment which accommodates the DNA helicase may also comprise ATP.
  • the method of the present invention is typically carried out with a voltage applied across the interface and the channel.
  • the voltage used is typically from +2 V to -2 V, typically -400 mV to +400 mV.
  • the voltage used is preferably in a range having a lower limit selected from -400 mV, -300 mV, -200 mV, -150 mV, -100 mV, -50 mV, -20 mV and 0 mV and an upper limit independently selected from +10 mV, +20 mV, +50 mV, +100 mV, +150 mV, +200 mV, +300 mV and +400 mV.
  • the voltage used is more preferably in the range 100 mV to 240 mV and most preferably in the range of 120 mV to 220 mV.
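  • The nested voltage ranges can be summarized as a simple check. This is a sketch with a hypothetical function name and labels; it treats the ranges exactly as signed values in millivolts as written, and it does not model the independently selectable lower and upper limits.

```python
def classify_voltage(millivolts):
    """Name the narrowest stated range that contains the applied voltage (mV)."""
    if 120 <= millivolts <= 220:
        return "most preferred"
    if 100 <= millivolts <= 240:
        return "more preferred"
    if -400 <= millivolts <= 400:
        return "typical"
    if -2000 <= millivolts <= 2000:
        return "within the broad range"
    return "outside the stated ranges"


if __name__ == "__main__":
    for value in (180, 100, -150, 1500, 2500):
        print(value, "mV:", classify_voltage(value))
```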
  • Measurement of the change of the ionic current through the channel may be performed by way of an optical signal or an electric current signal. It is well known to those skilled in the art how to measure the change of the ionic current through the channel in this way and how to determine the sequence according to the measurement. Methods of measuring electric current are well known in the art. For example, one or more measurement electrodes could be used to measure the electric current through the channel. The measurement device can be, for example, a patch-clamp amplifier or a data acquisition device. For example, an Axopatch-IB patch-clamp amplifier (Axon 200B, Molecular Devices) could be used to measure the electric current flowing through the channel.
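  • Once a current trace has been recorded, it is usually reduced to a series of discrete current steps before any sequence is inferred. The sketch below is a deliberately naive illustration of that idea, not the analysis used in the source: the function name and the fixed threshold are assumptions, and practical level-finding is typically more robust to noise.

```python
def detect_steps(trace, threshold):
    """Split a current trace (list of pA values) into steps.

    A new step starts whenever a sample deviates from the running mean of
    the current step by more than `threshold`. Returns a list of
    (start_index, mean_current, n_samples) tuples.
    """
    steps = []
    if not trace:
        return steps
    start, total, count = 0, trace[0], 1
    for i in range(1, len(trace)):
        mean = total / count
        if abs(trace[i] - mean) > threshold:
            steps.append((start, mean, count))      # close the current step
            start, total, count = i, trace[i], 1    # open a new one
        else:
            total += trace[i]
            count += 1
    steps.append((start, total / count, count))
    return steps


if __name__ == "__main__":
    # Synthetic trace for illustration only.
    synthetic = [50.0, 50.5, 49.5, 60.0, 60.2, 59.8, 45.0, 45.1]
    for step in detect_steps(synthetic, threshold=3.0):
        print(step)
```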
  • the method of the invention may be used to discriminate the identities of miRNA, discriminate miRNA isoforms, or detect m6A modification within miRNA.
  • the method of the invention may be used to determine the sequence of a plurality of short RNA.
  • the DNA-short RNA chimeric substrate may be a plurality of DNA-short RNA chimeric substrates. Different DNA-short RNA chimeric substrates may comprise the same DNA segment and different short RNA so that different short RNA may be sequenced simultaneously using the same DNA segment, the same enzyme, the same primer and/or the same blocking oligomer. If a DNA polymerase is used, different DNA-short RNA chimeric substrates may bind to one kind of primer, one kind of blocking oligomer and one kind of polymerase to form a plurality of complexes. If a DNA helicase is used, different DNA-short RNA chimeric substrates may bind to one kind of blocking oligomer and one kind of helicase to form a plurality of complexes.
  • a sequencing library constructed by thermal annealing from a plurality of DNA-short RNA chimeric single-strands and a DNA fragment; wherein different DNA-short RNA chimeric single-strands comprise the same DNA segment and different short RNA, and the DNA fragment is complementary to at least part of the DNA segment of the chimeric single-strand.
  • the DNA-short RNA chimeric single-strand, the DNA fragment, the primer and/or the blocking oligomer is defined as above.
  • the primers may be the same, the blocking oligomers may be the same, and/or different DNA-RNA chimeric single-strands may comprise the same DNA segment and different short RNA. Sequencing of the library may be conducted in a nanopore array.
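  • The idea of a library in which many short RNAs share one DNA segment can be sketched as follows. The function name, the "X" spacer symbol and the toy sequences are illustrative assumptions; the strand is written 5' to 3' with the short RNA adjacent to the 5' end of the DNA segment, and the 5-50 nt length check follows the range given for the short RNA.

```python
def build_chimeric_strands(short_rnas, dna_segment, spacer="X"):
    """Return {name: chimeric strand} with every RNA joined to the same DNA segment.

    Each strand is written 5' to 3' as RNA + spacer + DNA. Because the DNA
    segment is shared, a single primer and a single blocking oligomer would
    serve every member of the library.
    """
    strands = {}
    for name, rna in short_rnas.items():
        rna = rna.upper()
        if not 5 <= len(rna) <= 50:
            raise ValueError(f"{name}: short RNA length must be 5-50 nt")
        if set(rna) - set("ACGU"):
            raise ValueError(f"{name}: unexpected character in RNA sequence")
        strands[name] = rna + spacer + dna_segment
    return strands


if __name__ == "__main__":
    # Toy sequences for illustration only.
    library = build_chimeric_strands(
        {"rna_a": "ACGUACGU", "rna_b": "UUGGCCAA"}, dna_segment="GATTACA"
    )
    for name, strand in library.items():
        print(name, strand)
```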
  • a system for determining the sequence of a short RNA is provided.
  • the system comprising:
  • each of the two compartments comprises a liquid medium and the interface has a channel so dimensioned as to allow passage of only one single-strand polynucleotide at a time;
  • the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand and at least one DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand, the chimeric single-strand comprises the short RNA conjugated with a DNA, and the enzyme can process DNA in a nucleotide by nucleotide way.
  • the compartment, the interface, the liquid medium, the channel, the enzyme, the DNA-short RNA chimeric substrate, the DNA fragment, the primer and/or the blocking oligomer is defined as above.
  • the target miRNA strand must be conjugated with a section of DNA to form a DNA-miRNA chimeric template.
  • this chimeric template which is composed of a segment of DNA on the 5’-end and a segment of miRNA on the 3’-end with the two segments separated by an abasic spacer, was custom synthesized (Fig. 1a, Table 1) .
  • the sequencing library was constructed by thermal annealing from three separate strands: the chimeric template, the primer and the blocker (Fig. 1a, Table 1, Methods 1) [26, 27] .
  • Primer/Template duplex regions in the chimeric template strand are indicated by bold letters (GCATTCTCATGCAGGTCGTAGC) .
  • Blocker/Template duplex regions in the chimeric template strand are indicated by italic letters (TCAGATCTCACTATC) .
  • the “X” letter represents the abasic site.
  • PO4 stands for the 5’ phosphate group.
  • NIPSS was carried out with a mutant Mycobacterium smegmatis porin A (MspA) nanopore (Methods 2) , atop of which a wildtype (WT) phi29 DNA polymerase (DNAP) served as the ratcheting enzyme during sequencing [26] .
  • NIPSS Method 3
  • the sequencing library complex which is bound with the phi29 DNAP, was first electrophoretically dragged into the nanopore to unzip the blocker strand mechanically.
  • a phi29 DNAP-driven primer extension was initiated so that the chimeric template starts moving against the electrophoretic force in steps equivalent to a single nucleotide (Fig. 2) .
  • as shown in Fig. 1b, the DNA segment, the abasic site and the miRNA segment sequentially pass through the nanopore constriction during NIPSS.
  • the DNA segment thus acts as the “DNA drive strand” .
  • the phi29 DNAP enzymatically drives the DNA segment to move against the electrophoretic force and simultaneously the tethered miRNA segment is sequenced by the nanopore constriction.
  • the DNA segment is designed to contain sequence repeats between “AAGA” and “TTTC” (3’-5’) to generate a unique signal pattern during NIPSS (Fig. 2, 3) , which marks the initiation of a NIPSS event.
  • the abasic spacer “X” following the DNA segment is expected to produce a high current signature that marks the initiation of miRNA sequencing within a NIPSS event (Fig. 2, 3, Table 1) .
  • miR-21 which is an intensively studied oncogenic miRNA and a cancer biomarker [5, 28] , was included in the miRNA segment of a chimeric strand (DNA-miR-21, Table 1) for sequencing.
  • a representative raw current trace of DNA-miR-21 is shown in Fig. 1c. This trace can be segmented according to the characteristic sequencing patterns of the DNA drive strand and the abasic spacer respectively: the DNA segment reports two repeats of triangular shaped signals and the abasic spacer reports an abnormally high step immediately after the signal from the DNA (Fig. 3). The signals immediately following the abasic spacer are the sequencing signals for the miRNA. Specifically, the representative trace in Fig. 1c reports sequencing signals from miR-21.
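  • The segmentation described above can be sketched as a small function operating on the mean current of each step. This is an illustrative simplification: the function name is hypothetical, the step values are synthetic, and the abasic marker is assumed to be simply the highest step in the event.

```python
def split_at_abasic_marker(step_currents):
    """Split step currents into (DNA steps, marker step, miRNA steps).

    Assumption: the abasic spacer produces the highest step in the event,
    so everything before it is DNA signal and everything after it is
    miRNA signal.
    """
    if not step_currents:
        raise ValueError("no steps to segment")
    marker = max(range(len(step_currents)), key=step_currents.__getitem__)
    return step_currents[:marker], step_currents[marker], step_currents[marker + 1:]


if __name__ == "__main__":
    # Synthetic step means in pA, for illustration only.
    dna, marker, mirna = split_at_abasic_marker([40, 55, 40, 55, 90, 48, 52, 44])
    print("DNA steps:", dna)
    print("abasic marker:", marker)
    print("miRNA steps:", mirna)
```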
  • Let-7a which is a member of let-7 family miRNA [29]
  • Let-7 miRNA was the first human miRNA to have been discovered [29] .
  • miR-21 which is a cancer biomarker
  • let-7 miRNA is known to target many oncogenes and thus it behaves as a cancer suppressor [30] .
  • Simultaneous discrimination of miR-21 and Let-7a, which are an oncogenic miRNA and a cancer suppressor miRNA respectively, thus shows significant bioanalytical value for cancer diagnosis and serves as an excellent example of miRNA identity recognition by NIPSS.
  • IsomiRs have attracted significant attention due to their functional roles in diverse biological processes, which include modulation of miRNA stabilization [15] , regulation of mRNA-targeting efficacy [16] and their correlations to different disease states [9, 15, 16, 32] .
  • Most isomiRs differ by only a single nucleotide at either the 3’ or the 5’-end, and thus discrimination by conventional miRNA sensing routines is challenging.
  • isomiRs differing at their 3’-ends as a result of non-template enzymatic additions [30] are more frequently observed and can be immediately distinguished with the current NIPSS scheme.
  • the 3’ uridylation product of miR-21 which is an isomiR of miR-21, is named miR-21+U in this invention (Fig. 6a) .
  • miR-21+U isomiR is abundant in human urine samples and is a promising biomarker for prostate cancer [33] .
  • chimeric template DNA-miR-21 and DNA-miR-21+U (Table 1) were designed and sequenced using the same NIPSS configuration (Fig. 1, 5) .
  • the means and standard deviations of extracted sequencing steps from 20 independent NIPSS events of DNA-miR-21 and DNA-miR-21+U are shown in Fig. 6b.
  • the DNA segment of the NIPSS sequencing signal shows perfect alignment, since the DNA drive strands from both chimeric templates are identical in sequence. Distinct variation of the signal begins at step 23 as a consequence of the additional uridine at the 3’-end of miR-21+U when compared to its isoform miR-21. Starting from step 23, all follow-up sequencing steps from miR-21+U appear back shifted by 1 nucleotide (Fig. 6b, c) , which further confirms that the 3’ uridylation isomiR has been recognized from direct NIPSS sequencing.
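  • The comparison described above, locating the first diverging step and checking for a one-step shift, can be sketched as follows. The function names, the tolerance and the step values are illustrative assumptions; indices are 0-based here, and the direction of the shift (the longer isomiR lagging by one step) is assumed for the example.

```python
def first_divergence(steps_a, steps_b, tolerance):
    """Index of the first step at which two step series differ by more than `tolerance`."""
    for index, (a, b) in enumerate(zip(steps_a, steps_b)):
        if abs(a - b) > tolerance:
            return index
    return None


def is_shifted_by_one(steps_a, steps_b, start, tolerance):
    """True if steps_b from start + 1 onward matches steps_a from start onward."""
    a_tail = steps_a[start:]
    b_tail = steps_b[start + 1:]
    n = min(len(a_tail), len(b_tail))
    return n > 0 and all(abs(a_tail[k] - b_tail[k]) <= tolerance for k in range(n))


if __name__ == "__main__":
    # Synthetic step means in pA; series_b carries one extra step at index 2.
    series_a = [10, 20, 30, 40, 50]
    series_b = [10, 20, 35, 30, 40, 50]
    start = first_divergence(series_a, series_b, tolerance=2)
    print("first divergence:", start)
    print("shifted by one:", is_shifted_by_one(series_a, series_b, start, tolerance=2))
```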
  • Table 2. Current values of all 4-nucleotide sequence contexts for isomiRs.
  • RNA metabolism [35] , cellular functions [34, 36, 37] and miRNA biogenesis [17, 18] .
  • m6A N6-methyladenosine
  • MeRIP-seq still requires reverse transcription followed by amplification and fails to achieve single nucleotide resolution.
  • Recent developments of direct RNA sequencing using nanopores have successfully demonstrated m6A mapping directly from mRNA samples [25]. Emerging investigations reveal that m6A modifications also naturally exist in miRNAs [17, 18].
  • direct RNA sequencing by nanopores is only applicable to sequencing long stretches of RNA instead of miRNAs, whereas MeRIP-seq fails to achieve single molecule and single nucleotide resolutions.
  • NIPSS is particularly useful in distinguishing minor sequence variations at the 3’-end of the target miRNAs.
  • the first nucleotide at the 3’-end of the miRNA segment of DNA-miR-21 is tentatively changed from a canonical adenosine to m6A (Fig. 9a) .
  • This new strand is named as DNA-miR-21 (m6A) and is custom synthesized for downstream NIPSS characterization.
  • the m6A reaches the pore constriction first (Fig. 9b) , which makes this nucleotide highly resolvable by the current NIPSS configuration and thus becomes an optimum option for a proof of concept demonstration.
  • ~3 nucleotides show significant signal variations between the two analytes due to the introduced m6A modification, whereas other sequencing steps show a clear alignment as the remainder of the sequence is identical.
  • the differences in sequencing step heights between DNA-miR-21 (m6A) and DNA-miR-21 are shown in Fig. 9f, from which the maximum signal variation between the two analytes reaches ~4 pA at the corresponding positions of the signals, due to the introduced chemical modification (Fig. 11).
  • the genes coding for wild-type Phi29 DNA polymerase (PDB ID: 1XHX) and its D12A/D66A mutant were custom synthesized by Genscript (New Jersey, USA) and cloned into a pET-30a(+) plasmid.
  • An extra histidine tag was included on the C-terminus of each protein for post expression purifications by nickel affinity chromatography.
  • the two types of phi29 DNAP follow the same purification process.
  • wild-type Phi29 DNA polymerase is abbreviated as Phi29 DNAP-wt, if not otherwise stated. After heat shock transformation, the cells were incubated in LB medium at 37 °C for 7 hours.
  • IPTG isopropyl β-D-thiogalactoside
  • the cells were harvested by centrifugation at 4 °C with 4000 rpm for 20 min and the collected pellets were stored at -80 °C.
  • the lysate was centrifuged at 4 °C with 20000 rcf for 60 min and then the supernatant was membrane filtered before being applied to a nickel affinity column (HisTrapTM HP, GE Healthcare) .
  • HisTrapTM HP nickel affinity column
  • Purification and characterization of Phi29 DNAP are shown in Fig. 14.
  • the demonstrated miRNA sequencing strategy using NIPSS is currently not without limitations. With the concept of sequencing miRNAs from natural resources by direct miRNA sequencing using NIPSS, the target miRNA strands must be conjugated chemically or biochemically with the DNA drive-strand ahead of the library preparation. An enzymatic ligation strategy has been reported [40] and could be adapted for this purpose.
  • the 5’ phosphorylated DNA drive strand (5PO 4 DNA) is first treated with a 5’ DNA adenylation kit (New England Biolabs) , assisted with the Mth RNA ligase (Fig. 12) .
  • the adenylated DNA drive strand (5AppDNA) was characterized and purified by ethanol precipitation and further ligated to the 3’-end of the target miRNA strands by the T4 RNA ligase 2 truncated K227Q mutant (T4 Rnl2tr) .
  • the DNA-miRNA chimeric template can be characterized by electrophoresis on a 15% polyacrylamide gel, and the ligated chimeric template appears as the extra band of higher molecular weight (Fig. 12).
  • this biochemical conjugation strategy is compatible with sequencing the first 14-15 nucleotides to the 3’-end of any miRNA types.
  • an engineered form of MspA with redundant structures on top of its vestibule could be constructed to extend the phase-shift distance of NIPSS.
  • existing nanopores of larger dimensions such as ClyA [41] and FraC [42] may be adapted to sequence full length miRNA or even its precursor or primary form.
  • Chemical ligations [43, 44] between the 5’-end of target miRNAs and the 5’-end of the DNA drive strand may be carried out to form a reverse chimeric strand with a “head to head” configuration for 5’-end miRNA sequencing by NIPSS.
  • the first direct miRNA sequencing has been carried out by NIPSS. Similar strategies could also be adapted to sequence other short non-coding strands, such as siRNA [45] or piRNA [46] , avoiding the laborious motor protein engineering for nanopore sequencing [25] .
  • the nanopore sequencing results in this invention show clear signal discriminations between different sequences, isoforms and epigenetic modifications among synthetic miRNA sequences.
  • MiRNAs from natural resources such as miRNA extracts from clinical samples, can be conjugated with pre-designed DNA linker strands by performing routine enzymatic ligation to build NIPSS sequencing libraries (Fig. 13) .
  • direct miRNA sequencing by NIPSS could be directly implemented in clinical diagnosis or could be utilized as a complement to existing miRNA sensing platforms when single base resolution is critical.
  • MiRNA sequencing by NIPSS shares the same advantages as other nanopore sequencing technologies, including low cost, single base resolution and portability.
  • NIPSS, when properly engineered, could also be adapted to commercial nanopore sequencers, such as MinION [25].
  • high-throughput direct miRNA sequencing by NIPSS can be carried out in optical nanopore chips [25, 47] for low cost, high-throughput and multiplexed miRNA characterizations in a disposable device form.
  • the sequencing library for NIPSS is thermally annealed from three separate nucleic acid strands: the chimeric template, the primer and the blocker (Fig. 1a, Table 1). These three strands were mixed at a 1:1:2 molar ratio in an aqueous buffer (0.3 M KCl, 10 mM HEPES/KOH, 10 mM MgCl2, 10 mM (NH4)2SO4). Thermal annealing was carried out by incubating the mixture at 95 °C for 2 min, followed by programmed cooling to 25 °C at a rate of -5 °C/min. For optimum sequencing data production, the thermally annealed sequencing library should be used immediately in subsequent NIPSS measurements.
  • MspA mutant D90N/D91N/D93N/D118R/D134R/E139K nanopore [48] was expressed with E. coli BL21 (DE3) and purified with nickel affinity chromatography as described previously [26, 49] .
  • This MspA mutant which is the sole MspA nanopore discussed in this invention, is named MspA, if not otherwise stated.
  • NIPSS experiments were carried out as described previously [26]. Briefly, the electrolyte buffer (0.3 M KCl, 10 mM HEPES/KOH, 10 mM MgCl2, 10 mM (NH4)2SO4 and 4 mM DTT at pH 7.5) was separated by a 1,2-diphytanoyl-sn-glycero-3-phosphocholine (DphPC) lipid membrane (Avanti Polar Lipids) into cis and trans compartments. Both compartments were in contact with separate Ag/AgCl electrodes and connected to an Axopatch 200B patch clamp amplifier (Molecular Devices) to form a circuit, with the cis compartment electrically grounded.
  • DphPC 1,2-diphytanoyl-sn-glycero-3-phosphocholine
  • Nanopore sequencing was initiated by holding an applied voltage at +180 mV. All electrophysiology recordings were acquired with a Digidata 1550B digitizer (Molecular Devices) at a 25 kHz sampling rate and low-pass filtered at 5 kHz. All NIPSS experiments were performed at room temperature (22 ± 1 °C).
  • Potassium chloride (KCl), sodium chloride (NaCl), sodium hydrogen phosphate (Na2HPO4) and sodium dihydrogen phosphate (NaH2PO4) were obtained from Aladdin (China).
  • Magnesium chloride (MgCl2) was from Macklin.
  • Ammonium sulfate ((NH4)2SO4) was from Xilong Scientific.
  • 4-(2-hydroxyethyl)-1-piperazine ethanesulfonic acid (HEPES) was from Shanghai Yuanye Bio-Technology (China).
  • Ammonium persulfate, kanamycin sulfate, DL-dithiothreitol (DTT), dioxane-free isopropyl-β-D-thiogalactopyranoside (IPTG), N,N,N’,N’-tetramethylethylenediamine (TEMED) and imidazole were from Solarbio (China).
  • Ethylenediaminetetraacetic acid (EDTA), pentane, hexadecane, and Genapol X-80 were from Sigma-Aldrich.
  • HPLC-purified DNA oligonucleotides including DNA-miRNA template strands, primer, blocker, DNA linker and miR-21+U (Table 1) were custom synthesized by Genscript (New Jersey, USA) .
  • the MspA mutant (D90N/D91N/D93N/D118R/D134R/E139K) nanopore was expressed with E. coli BL21 (DE3) and purified with nickel affinity chromatography as described previously [49] .
  • This MspA mutant was abbreviated as MspA in the invention.
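As a worked illustration of the thermal annealing program described in the list above (95 °C for 2 min, then cooling to 25 °C at 5 °C/min), the following Python sketch computes the temperature setpoints and the total run time. The function name and the one-minute step size are assumptions made for illustration and are not part of the disclosed method.

```python
def annealing_schedule(start_c=95.0, end_c=25.0, hold_min=2.0,
                       rate_c_per_min=5.0, step_min=1.0):
    """Return (time_min, temperature_c) setpoints: an initial hold, then a linear ramp down."""
    if rate_c_per_min <= 0 or step_min <= 0:
        raise ValueError("rate and step must be positive")
    # Hold at the starting temperature for hold_min minutes.
    points = [(0.0, start_c), (hold_min, start_c)]
    t, temp = hold_min, start_c
    # Ramp down in fixed time steps, never going below the end temperature.
    while temp > end_c:
        temp = max(end_c, temp - rate_c_per_min * step_min)
        t += step_min
        points.append((t, temp))
    return points


schedule = annealing_schedule()
total_minutes = schedule[-1][0]  # time at which the final setpoint is reached
print(f"final setpoint {schedule[-1][1]} °C reached after {total_minutes} min")
```

With the stated parameters the ramp spans 70 °C at 5 °C/min, so the program takes 14 min of cooling plus the 2 min hold.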

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Provided are a method of determining the sequence of a short RNA, and the sequencing library and system thereof. The method is performed by enzyme assisted sequencing, especially "Nanopore Induced Phase-Shift Sequencing (NIPSS)", using a DNA-RNA chimeric substrate, and can be used to sequence short RNA such as miRNA, siRNA or piRNA.

Description

Direct microRNA sequencing using enzyme assisted sequencing
FIELD
This invention relates to a method for identifying an analyte using a protein nanopore.
BACKGROUND
MicroRNAs (miRNAs) are a group of short, single-stranded, non-coding RNA molecules that act as posttranscriptional gene regulators for a wide variety of physiological processes, including proliferation, differentiation, apoptosis, and immune reactions [1, 2]. On the other hand, aberrant miRNA expression levels have been shown to be closely related to diverse diseases, such as cancer [3-6], auto-immune disorders [7] and inflammatory diseases [8].
Conventionally, miRNAs can be characterized by northern blot, quantitative reverse transcription real-time polymerase chain reaction (qRT-PCR) assays or microarrays [9] . Other emerging platforms for miRNA sensing include colorimetry [10] , bioluminescence [11] , enzymatic activity [12] and electrochemistry [13] . Unfortunately, these methods provide limited analytical information because miRNA sequences are not directly reported and prior knowledge of the target miRNA sequence is required.
MiRNAs function by binding to the 3’ untranslated region (3’UTR) of target messenger RNAs [14]. Minor sequence variations, which include trimming, addition or substitution of miRNA sequences, alter their binding affinities to target messenger RNA [14]. As reported, miRNA isoforms (isomiRs) [9], which are generated by the addition or deletion of one or multiple nucleotides as terminal modifications, have been shown to participate in proliferative diseases [15] and cancer [16]. On the other hand, N6-methyl-adenosine (m6A), which is an epigenetic modification among RNAs, plays important roles in miRNA biogenesis [17, 18]. These functional roles of miRNAs are associated with their specific sequences. Therefore, directly decoding a miRNA sequence along with its chemical modifications becomes critical to correlate its physiological or pathological relevance. MiRNAs can be sequenced indirectly by sequencing their complementary DNAs (cDNAs), obtained by reverse transcription, with deep sequencing [19]. However, this strategy suffers from unpredictable amplification biases and the loss of epigenetic information [9, 20]. Advances in corresponding sequencing technologies consequently facilitate early stage diagnosis of cancers and inspire miRNA-targeted therapeutics [21, 22].
Recent developments in nanopore sequencing have suggested a new concept of nucleic acid sensing by direct sequencing. It has been reported that long stretches of DNA [23, 24] or RNA [25] can be directly sequenced by nanopores with single molecule resolution. Unfortunately, miRNAs and other short nucleic acid strands are not compatible with existing nanopore sequencing configurations mainly due to their short length.
Nanopore Induced Phase-Shift Sequencing (NIPSS) , which is a variant of nanopore sequencing, was recently developed as a universal strategy to sequence analytes other than long stretches of DNA [23] or RNA [25] . Though NIPSS is limited by its short read-length of 15 nucleotides, the concept has been verified by sequencing different 2’-deoxy-2’-fluoroarabinonucleic acid (FANA) strands [26] .
SUMMARY
According to one aspect of the present invention, a method of determining the sequence of a short RNA is provided, the method includes:
providing two compartments separated by an interface; wherein each of the two compartments comprises a liquid medium and the interface has a channel so dimensioned as to allow passage of only one single-strand polynucleotide at a time;
providing a DNA-short RNA chimeric substrate comprising a DNA-short RNA chimeric single-strand and at least one DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand, wherein the chimeric single-strand comprises the short RNA conjugated with a DNA;
providing an enzyme that can process DNA in a nucleotide by nucleotide way;
introducing the chimeric substrate and the enzyme into one of the two compartments and allowing the enzyme to bind to chimeric substrate;
applying an electric potential difference between the two compartments, thereby causing the single-stranded portion of the polynucleotide complex to translocate through the channel to the other side;
allowing the enzyme to process the DNA segment of the chimeric substrate, thereby driving the single-stranded portion of the polynucleotide complex to translocate through the channel in a nucleotide by nucleotide way; and
measuring the change of the ionic current through the channel during the translocation, thereby determining the sequence of the short RNA.
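The final measuring step depends on turning a raw ionic current trace into a series of discrete current levels, one per nucleotide step. The Python sketch below shows one simple way this could be done. It is an illustrative assumption, not the analysis procedure of the invention: the function name, the running-mean threshold rule and the synthetic trace values are all hypothetical.

```python
from statistics import mean


def segment_levels(trace, threshold, min_len=3):
    """Split a current trace (pA) into plateaus and return the mean of each plateau.

    A new plateau starts when a sample deviates from the running mean of the
    current plateau by more than `threshold`, provided the current plateau
    already holds at least `min_len` samples (a crude guard against noise spikes).
    """
    if not trace:
        return []
    levels = []
    current = [trace[0]]
    for sample in trace[1:]:
        if abs(sample - mean(current)) > threshold and len(current) >= min_len:
            levels.append(mean(current))
            current = [sample]
        else:
            current.append(sample)
    levels.append(mean(current))
    return levels


# Synthetic trace with three plateaus (values are made up for illustration).
trace = [50.0, 50.5, 49.5, 50.0, 60.0, 60.2, 59.8, 40.0, 40.1, 39.9]
levels = segment_levels(trace, threshold=3.0)
print([round(level, 1) for level in levels])
```

In practice the step durations are stochastic and back-stepping occurs, so a real analysis would need a more robust change-point method than this threshold rule.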
In some embodiments, the short RNA is derived from miRNA, siRNA or piRNA.
In some embodiments, the DNA-short RNA chimeric single-strand is prepared by following steps: the 5’ phosphorylated DNA single-strand is first 5’ adenylated, assisted with the Mth RNA ligase; then the adenylated DNA is purified by ethanol precipitation and further ligated to the 3’-end of the short RNA by RNA ligase.
In some embodiments, in the DNA-short RNA chimeric single-strand, the DNA segment and the short RNA are separated by a spacer.
In some embodiments, the spacer is an abasic spacer.
In some embodiments, the enzyme is DNA polymerase; the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand, a primer and a blocking oligomer; the primer is hybridized with the 3' end of the DNA segment of the chimeric single-strand; the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand adjacent to the 3' end of the primer; the short RNA is adjacent to the 5' end of the DNA segment in the DNA-short RNA chimeric single-strand; and the method includes:
introducing the chimeric substrate, the polymerase, divalent cation and dNTPs into one of the two compartments and allowing the polymerase to bind to chimeric substrate;
applying an electric potential difference between the two compartments, thereby causing translocation of the single-stranded portion of the chimeric substrate through the channel and voltage-driven unzipping of the blocking oligomer, wherein the complete unzipping of the blocking oligomer initiates the polymerase-driven primer extension so that the chimeric substrate moves against the electric field force; and
measuring the change of the ionic current through the channel during the translocation, thereby determining the sequence of the short RNA;
preferably, the DNA polymerase is phi29 DNA polymerase or variants thereof; more preferably, the DNA polymerase is wild-type phi29 DNA polymerase, or the D12A/D66A mutant of wild-type phi29 DNA polymerase.
In some embodiments, the 5' end of the DNA segment is conjugated with the 5'end or the 3'end of the short RNA.
In some embodiments, the 3' end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
In some embodiments, the 3' end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
In some embodiments, a cholesterol molecule is linked to the 3' end of the blocking oligomer.
In some embodiments, the method includes:
mixing the chimeric single-strand with the primer and the blocking oligomer to form the chimeric substrate, incubating the chimeric substrate with the polymerase to form a complex, and introducing the complex and dNTPs into one of the two compartments.
In some embodiments, the DNA segment of the chimeric single-strand contains sequence repeats to generate a unique signal pattern during primer extension.
In some embodiments, the sequence repeats include sequence repeats between "AAGA" and "TTTC" from 3' to 5'.
In some embodiments, the enzyme is a DNA helicase; the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand and a blocking oligomer; the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand; and the method includes:
introducing the chimeric substrate and the helicase into one of the two compartments and allowing the helicase to bind to chimeric substrate;
applying an electric potential difference between the two compartments, thereby causing translocation of the single-stranded portion of the chimeric substrate through the channel, wherein unzipping the blocking oligomer by the helicase drives the short RNA to translocate through the channel in a nucleotide by nucleotide way;
measuring the change of the ionic current through the channel during the translocation, thereby determining the sequence of the short RNA;
preferably, the DNA helicase is Hel 308 DNA helicase (which is also called DNA Helicase HEL308) or variants thereof.
In some embodiments, at least one end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
In some embodiments, the unpaired end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
In some embodiments, a cholesterol molecule is linked to at least one end of the blocking oligomer.
In some embodiments, the method includes:
mixing the chimeric single-strand with the blocking oligomer to form the chimeric substrate, incubating the chimeric substrate with the helicase to form a complex, and introducing the complex into one of the two compartments.
In some embodiments, the channel is a protein nanopore.
In some embodiments, the protein nanopore is MspA, CsgG, ClyA or FraC or variants thereof.
In some embodiments, the protein nanopore is a mutant MspA comprising the mutations of D90N/D91N/D93N/D118R/D134R/E139K compared to the wild-type MspA.
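The mutation notation used above lists each substitution as wild-type residue, position, and replacement residue, separated by slashes. As a small illustration of how such a string can be handled programmatically, the Python sketch below parses it into tuples. The function name is hypothetical and the parser only covers simple single-letter substitutions.

```python
import re

# One substitution: wild-type residue, 1-based position, replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(notation):
    """Parse a string such as 'D90N/D91N' into (wild_type, position, mutant) tuples."""
    parsed = []
    for token in notation.split("/"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        wild_type, position, mutant = match.groups()
        parsed.append((wild_type, int(position), mutant))
    return parsed


mspa_mutations = parse_mutations("D90N/D91N/D93N/D118R/D134R/E139K")
for wild_type, position, mutant in mspa_mutations:
    print(f"position {position}: {wild_type} -> {mutant}")
```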
In some embodiments, the DNA-short RNA chimeric substrate is a plurality of DNA-short RNA chimeric substrates, wherein different DNA-short RNA chimeric substrates comprise different short RNAs.
In some embodiments, the different DNA-short RNA chimeric substrates comprise the same DNA segment.
According to another aspect of the present invention, a sequencing library is provided. The sequencing library is constructed by thermal annealing from a plurality of DNA-short RNA chimeric single-strands and a DNA fragment; wherein different DNA-short RNA chimeric single-strands comprise the same DNA segment and different short RNAs, and the DNA fragment is complementary to at least part of the DNA segment of the chimeric single-strand.
In some embodiments, the RNA is derived from miRNA, siRNA or piRNA.
In some embodiments, in the DNA-short RNA chimeric single-strand, the DNA segment and the short RNA are separated by a spacer.
In some embodiments, the spacer is an abasic spacer.
In some embodiments, the sequencing library is constructed by thermal annealing from a plurality of DNA-short RNA chimeric single-strands, a primer and a blocking oligomer; wherein the primer is hybridized with the 3' end of the DNA segment of the chimeric single-strand; the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand adjacent to the 3' end of the primer; and the short RNA is adjacent to the 5' end of the DNA segment in the DNA-short RNA chimeric single-strand.
In some embodiments, the 5' end of the DNA segment is conjugated with the 5'end or the 3'end of the short RNA.
In some embodiments, the 3' end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
In some embodiments, the 3' end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
In some embodiments, a cholesterol molecule is linked to the 3' end of the blocking oligomer.
In some embodiments, the DNA segment of the chimeric single-strand contains sequence repeats to generate a unique signal pattern during primer extension.
In some embodiments, the sequence repeats include sequence repeats between "AAGA" and "TTTC" from 3' to 5'.
In some embodiments, the sequencing library is constructed by thermal annealing from a plurality of DNA-short RNA chimeric single-strands and a blocking oligomer; wherein the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand.
In some embodiments, at least one end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
In some embodiments, the unpaired end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
In some embodiments, a cholesterol molecule is linked to at least one end of the blocking oligomer.
According to another aspect of the present invention, a system for determining the sequence of a short RNA is provided, the system comprises:
two compartments separated by an interface; wherein each of the two compartments comprises a liquid medium and the interface has a channel so dimensioned as to allow passage of only one single-strand polynucleotide at a time;
an enzyme and a DNA-short RNA chimeric substrate in one of the two compartments; wherein the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand and at least one DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand, the chimeric single-strand comprises the short RNA conjugated with a DNA, and the enzyme can process DNA in a nucleotide by nucleotide way.
In some embodiments, the short RNA is derived from miRNA, siRNA or piRNA.
In some embodiments, in the DNA-short RNA chimeric single-strand, the DNA segment and the short RNA are separated by a spacer.
In some embodiments, the spacer is an abasic spacer.
In some embodiments, the system comprises a DNA polymerase, divalent cation, dNTPs and a DNA-short RNA chimeric substrate in one of the two compartments; wherein the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand, a primer and a blocking oligomer; the primer is hybridized with the 3' end of the DNA segment of the chimeric single-strand; the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand adjacent to the 3' end of the primer; the short RNA is adjacent to the 5' end of the DNA segment in the DNA-short RNA chimeric single-strand.
In some embodiments, the 5' end of the DNA segment is conjugated with the 5'end or the 3'end of the short RNA.
In some embodiments, the 3' end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
In some embodiments, the 3' end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
In some embodiments, a cholesterol molecule is linked to the 3' end of the blocking oligomer.
In some embodiments, the DNA segment contains sequence repeats to generate a unique signal pattern during primer extension.
In some embodiments, the sequence repeats include sequence repeats between "AAGA" and "TTTC" from 3' to 5'.
In some embodiments, the system comprises a DNA helicase, ATP, and a DNA-short RNA chimeric substrate in one of the two compartments; wherein the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand and a blocking oligomer; the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand.
In some embodiments, at least one end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
In some embodiments, the unpaired end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
In some embodiments, a cholesterol molecule is linked to at least one end of the blocking oligomer.
In some embodiments, the channel is a protein nanopore.
In some embodiments, the protein nanopore is MspA, CsgG, ClyA or FraC or variants thereof.
In some embodiments, the protein nanopore is a mutant MspA comprising the mutations of D90N/D91N/D93N/D118R/D134R/E139K compared to the wild-type MspA.
In some embodiments, the DNA-short RNA chimeric substrate is a plurality of DNA-short RNA chimeric substrates, wherein different DNA-short RNA chimeric substrates comprise different short RNAs.
In some embodiments, the different DNA-short RNA chimeric substrates comprise the same DNA segment.
DESCRIPTION OF THE DRAWINGS
Fig. 1 shows direct miRNA sequencing using NIPSS. (a) A schematic diagram of the preparation of a sequencing library. The miRNA sequencing library is thermally annealed (Methods 1) from three separate nucleic acid strands, which include a chimeric template, a primer (green) and a blocker (light blue) (Table 1). The chimeric template is composed of a miRNA segment (red), an abasic residue (blue dot) and a DNA segment (black). (b) The NIPSS strategy for direct miRNA sequencing. NIPSS is carried out with an MspA nanopore (purple) and a wild-type (WT) phi29 DNAP (green) by following the reported enzymatic ratcheting strategy (Fig. 2) [24]. A fixed phase-shift distance between the polymerase synthesis site and the pore constriction is utilized to directly sequence miRNA. During NIPSS, the DNA segment, the abasic residue and the miRNA segment sequentially move through the nanopore constriction in single nucleotide steps until the abasic site (blue dot) reaches the binding pocket of phi29 DNAP. The inset image shows a zoomed-in view of the pore constriction. Due to the limited spatial resolution of the pore constriction, nanopore sequencing signals from MspA result from simultaneous reading of different combinations of sequence quadromers spanning the pore constriction during NIPSS. (c) A typical current trace acquired by sequencing DNA-miR-21 (Table 1) using NIPSS. The DNA and miRNA segments of the trace are marked with black and red lines, respectively. The trace segment that corresponds to reading the abasic site (X) is marked with a light blue stripe. Briefly, the DNA segment of the trace appears as two triangular shaped current characteristics due to the sequence design (Fig. 3). The DNA segment of the trace is immediately followed by an abrupt increase of the signal due to the introduction of the abasic site after the DNA sequence. All step transitions after the abasic signal lead to miRNA sequencing signals.
(d) Statistics of current steps from sequencing results of DNA-miR-21. The mean and standard deviation values were derived from 24 independent events. The corresponding sequence (3’-5’ convention, if not otherwise stated) is aligned above the statistics. The current levels of “AAGA” and “TTTC”, which represent the highest and the lowest sequencing signals from reading the DNA part, are marked by black arrows. The demonstrated results were acquired by performing NIPSS with an aqueous buffer of 0.3 M KCl, 10 mM HEPES/KOH, 10 mM MgCl2, 10 mM (NH4)2SO4 and 4 mM DTT at pH 7.5.
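The phase-shift idea in (b) can be illustrated with a deliberately simplified toy model in Python. It assumes that the enzyme stops once the abasic site reaches it and that, at that moment, the pore constriction sits a fixed number of nucleotides further along the strand, so only that many miRNA nucleotides from the 3’-end are read. The function name, the default of 15 nucleotides (taken from the read length stated in the background) and the placeholder sequence are assumptions for illustration only.

```python
def readable_portion(mirna_3to5, phase_shift_nt=15):
    """Toy model: return the part of the miRNA (in 3'->5' reading order) that
    passes the pore constriction before the enzyme stalls at the abasic site."""
    if phase_shift_nt < 0:
        raise ValueError("phase shift must be non-negative")
    # The miRNA follows the abasic site, so its 3'-end enters the constriction first.
    return mirna_3to5[:phase_shift_nt]


# Hypothetical 22-nt miRNA written in 3'->5' order (placeholder, not a real sequence).
placeholder_mirna = "ACGUACGUACGUACGUACGUAC"
read = readable_portion(placeholder_mirna)
print(len(read), "of", len(placeholder_mirna), "nucleotides read")
```

The model ignores that each current step reports a quadromer rather than a single base, so the number of resolvable miRNA bases in practice may differ slightly from the phase-shift distance.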
Fig. 2 shows detailed schematic diagrams of NIPSS. The configuration of direct miRNA sequencing by NIPSS is as reported previously 2. The chimeric template is composed of a miRNA segment (red, lower part), an abasic residue (blue dot) and a DNA segment (black, upper part) (Methods 1). With a +180 mV applied potential, the sequencing library, which is bound with a phi29 DNAP (green), was driven electrophoretically into the MspA nanopore (purple). (a) Initiation of NIPSS. The cyan DNA blocker strand, which was thermally annealed with the sequencing library, was first unzipped from the chimeric template by the applied voltage. This unzipping consequently triggered the replication-driven ratcheting by the phi29 DNA polymerase on top of the nanopore. The transition between voltage-driven unzipping and replication-driven ratcheting represents the initiation of nanopore sequencing. The motion directions of the chimeric template during unzipping and ratcheting are marked with cyan and black arrows, respectively. (b) The initiation of miRNA sequencing. Passage of the abasic spacer through the pore constriction marks the initiation of subsequent miRNA sequencing. (c) MiRNA sequencing by NIPSS. By utilizing the phase-shift, the miRNA segment passes through the pore constriction in single nucleotide steps while the DNA drive-strand is being replicated by the phi29 DNA polymerase. All NIPSS experiments in this invention follow this configuration. All sequencing experiments were performed at 23 °C in the sequencing buffer of 0.3 M KCl, 10 mM HEPES/KOH, 10 mM MgCl2, 10 mM (NH4)2SO4 and 4 mM DTT at pH 7.5.
Fig. 3 shows the design of the DNA segment within the chimeric template. (a) A representative current trace acquired by sequencing DNA-miR-21 using NIPSS. The trace segments that correspond to reading DNA, the abasic site and miRNA are marked appropriately. The purpose of including the DNA segment, which is designed on the 5’ end of the chimeric template strand (Table 1), is twofold. First, it acts as a drive strand, which can be enzymatically ratcheted by the phi29 DNAP against the electrophoretic force and thus guarantees that the miRNA segment can be sequenced by nanopores. Second, by including two sequence repeats of “AGAACTTT” (5’-3’) in the DNA segment (Table 1), a unique trace pattern of two triangular waves (region marked with light gray shade) will appear ahead of the miRNA reading, which helps to recognize the initiation of an NIPSS event. The assigned numbers above each trace plateau represent different quadromer readings during NIPSS. Nanopore reading of AAGA and TTTC (3’-5’) generates the highest and the lowest residual current, respectively, among all canonical combinations of DNA quadromers. Thus, quadromer readings from AAGA could be recognized as steps 5 and 13, whereas quadromer readings from TTTC could be recognized as steps 1, 9 and 17. Back-stepping motion [24] of the nucleic acid strand was occasionally observed and is marked with “*”. (b) The characteristic current signature generated by the DNA segment. The demonstrated trace was extracted from steps 1-9 in (a). Scale bar: 5 pA/25 ms. (c) The extracted current steps from (b) with aligned quadromer sequences.
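The step numbering given in the Fig. 3 description can be reproduced by enumerating the quadromers of the repeat region in reading order. The Python sketch below is a minimal illustration: the helper name is hypothetical, and the trailing “TTTC” appended after the two repeats is an assumption made so that a third TTTC reading exists, as the caption reports at step 17. The full drive-strand sequence is given in Table 1 and is not reproduced here.

```python
def quadromers(read_order_seq, k=4):
    """Return the successive k-mers seen while stepping one nucleotide at a time."""
    return [read_order_seq[i:i + k] for i in range(len(read_order_seq) - k + 1)]


repeat_5to3 = "AGAACTTT"            # repeat unit as written in the description (5'-3')
repeat_3to5 = repeat_5to3[::-1]     # the same unit in 3'-5' reading order: "TTTCAAGA"

# Two repeats followed by an assumed trailing "TTTC" (illustrative only).
dna_read_order = repeat_3to5 * 2 + "TTTC"

steps = quadromers(dna_read_order)
high_steps = [i + 1 for i, q in enumerate(steps) if q == "AAGA"]  # highest current
low_steps = [i + 1 for i, q in enumerate(steps) if q == "TTTC"]   # lowest current

print("AAGA at steps", high_steps)
print("TTTC at steps", low_steps)
```

The alternation of the highest and lowest quadromer readings every four steps is what gives the two triangular waves their period of eight steps.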
Fig. 4 shows identification of miR-21 and let-7a by NIPSS. MiRNA identities can be directly recognized by analyzing the miRNA part of the NIPSS signal. (a) A representative current trace from DNA-miR-21 sequencing. (b) A representative current trace from DNA-let-7a sequencing. Solid lines over the traces in (a-b) represent the extracted current height. “*” indicates occasional polymerase back-stepping [24]. The initiation of miRNA sequencing is indicated by blue dashed lines, where an abasic spacer (X) is read by the nanopore. The signal patterns of the DNA segment show high similarity since the sequence of the DNA drive-strand is identical, whereas the miRNA segments of the signals show remarkable differences between the demonstrated NIPSS events.
Fig. 5 shows identification of miRNAs using NIPSS. Chimeric template strands containing miRNA-21 and Let-7a sequences were custom synthesized (Table S1) and directly sequenced by NIPSS. (a) Overlay of multiple time-normalized events (N=24) from the DNA-miR-21 results acquired by NIPSS. (b) Overlay of multiple time-normalized events (N=12) from the DNA-let-7a results acquired by NIPSS. (a-b) The corresponding sequences (DNA-miR-21 or DNA-Let-7a, 3’-5’ convention) are aligned above the plots. The DNA segments are marked in black (left of X) and the miRNA segments are marked in red (right of X) . The abasic site (X, blue) , which separates the DNA and miRNA segments, acts as a signal marker to identify the sequence transition from reading DNA to miRNA during NIPSS. (c) Consensus sequencing results comparison between DNA-miR-21 and DNA-let-7a. The mean and standard deviation values are derived from time-normalized events, as demonstrated in a and b. The DNA part of the NIPSS results shows great alignment in all steps between both templates. However, the miRNA segment of the signals shows significant variations, starting from the step marked with the light blue stripe. Light blue stripes in (a-c) mark the sequencing step of TCAX, which is the first quadromer sequence containing the abasic residue when acquired by NIPSS.
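The consensus results in (c) are per-step means and standard deviations taken across events that have already been reduced to one current value per sequencing step. A minimal Python sketch of that statistic follows. The function name and the numeric values are made up for illustration and are not measured data.

```python
from statistics import mean, stdev


def consensus_steps(events):
    """Per-step mean and sample standard deviation across events.

    `events` is a list of equally long lists, each holding one current value
    (pA) per sequencing step of a single event.
    """
    if len(events) < 2:
        raise ValueError("need at least two events to estimate a standard deviation")
    if len({len(event) for event in events}) != 1:
        raise ValueError("all events must have the same number of steps")
    # zip(*events) groups the values belonging to the same step.
    return [(mean(values), stdev(values)) for values in zip(*events)]


# Three hypothetical events with two steps each (illustrative numbers only).
events = [[50.0, 60.0], [52.0, 58.0], [51.0, 62.0]]
consensus = consensus_steps(events)
for step, (avg, sd) in enumerate(consensus, start=1):
    print(f"step {step}: {avg:.1f} ± {sd:.1f} pA")
```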
Fig. 6 shows discrimination of miRNA isoforms (isomiRs) using NIPSS. (a) miR-21 and its isoforms. The upper image demonstrates the structure of a precursor microRNA (pre-miRNA) for human miR-21. The lower image demonstrates the sequence (5’-3’) of mature miR-21 and its miR-21+U isoform, respectively. An additional uracil (red box) exists at the 3’-end of miR-21+U. (b) Comparison of consensus sequencing results between DNA-miR-21 and DNA-miR-21+U. Consensus sequencing results from NIPSS reading of DNA-miR-21 (dark brown) and DNA-miR-21+U (light blue) derived from 25 time-normalized independent events (Table 2) . The DNA, the abasic site and the miRNA part of the signal are marked separately. The light blue strip marks the sequencing step of TCAX, which is the first sequence quadromer containing an abasic spacer encountered by NIPSS. The demonstrated statistical results show great alignment in all parts except that marked with a dashed-line box. (c) Sequencing result shift between DNA-miR-21 and DNA-miR-21+U. A zoomed-in view of the dashed-line box in b illustrates a shift effect of current levels caused by a nucleotide insertion of DNA-miR-21+U in reference to DNA-miR-21. The aligned sequence context above the results demonstrates that the addition of a uracil (red box) in DNA-miR-21+U systematically generates 1 nucleotide (marked as δ) shift. This single nucleotide result shift is also demonstrated by the schematic diagram in the image inset which takes the results of step 25 as an example.
Fig. 7 shows representative current traces for isomiRs discrimination. Each panel shows individual trace segments, which correspond to the miRNA part of the nanopore sequencing signal of DNA-miR-21 (left) and DNA-miR-21+U (right) . Current steps with significant deviations between analytes are indicated by solid lines (blue for miR-21 and yellow for miR-21+U) . The “*” indicates polymerase back-stepping [24] that was occasionally observed. The duration time for each nanopore sequencing step is stochastic. Scale bar: 50 ms.
Fig. 8 shows statistical comparison between DNA-miR-21 and DNA-miR-21+U. (a) Overlay of extracted current steps from multiple DNA-miR-21 events (N=24) . (b) Overlay of extracted current steps from multiple DNA-miR-21+U events (N=25) . The correlated sequences for each sequence are demonstrated on top of the plots in (a-b) where an abasic site (X) is located between DNA and miRNA. (c) Consensus comparison of sequencing signals between DNA-miR-21 and DNA-miR-21+U. Mean and standard deviations extracted from a and b. Here, a uridine monophosphate insertion has generated an additional current step when reading DNA-miR-21+U in reference to that of DNA-miR-21. Consequently, the signal pattern from DNA-miR-21+U is shifted by 1 nucleotide. The light blue strip in (a-c) represents the quadromer reading of TCAX, which is the first quadromer containing the abasic site read by NIPSS.
Fig. 9 shows direct N6-methyladenosine (m6A) mapping using NIPSS. (a) Schematic diagram of chimeric template strands containing canonical adenosine (A) or N6-methyladenosine (m6A). A single “A” or “m6A” is embedded in different chimeric strands (top: DNA-miR-21, bottom: DNA-miR-21 (m6A), Table 1) for NIPSS sequencing, where its chemical structure and location are annotated. Black and light red segments represent the DNA and the miRNA part of the strand. (b) Schematic diagram of NIPSS sequencing of miRNA containing “A” or “m6A” nucleotides. A* within the sequence context represents the “A” or “m6A” nucleotide within the strand. (c) A representative current trace of DNA-miR-21 sequenced by NIPSS. Dark blue lines represent extracted mean current values from each step. The corresponding sequence context is aligned below, in which the canonical adenosine is marked with a gray highlight. (d) A representative current trace when DNA-miR-21 (m6A) is sequenced by NIPSS. Light red lines represent extracted mean current values from each step. The corresponding sequence context is aligned below, in which the N6-methyladenosine is marked with a gray highlight. Scale bars in c and d represent 10 pA (current, vertical) and 50 ms (time, horizontal), respectively. (e) Consensus sequencing results comparison between DNA-miR-21 and DNA-miR-21 (m6A). The means and standard deviations were derived from 20 independent events. A* within the sequence context below the results represents either A or m6A. (f) Current differences between NIPSS results of DNA-miR-21 and DNA-miR-21 (m6A). These differences were derived by calculating ΔI = I(DNA-miR-21 (m6A)) - I(DNA-miR-21) from the mean values of 24 events from each strand, with the associated sequence aligned below. The standard deviation values of DNA-miR-21 are shown with gray columns and the standard deviation of DNA-miR-21 (m6A) is shown with black error bars.
The m6A modification results in a signal variation when m6A containing sequence quadromers were read by the nanopore constriction. Due to the limited spatial resolution of MspA, a single m6A modification results in detectable signal fluctuations within three current steps as marked by dashed box.
Fig. 10 shows representative current traces for m6A modification detection. The sequence contexts between DNA-miR-21 and DNA-miR-21 (m6A) differ with only one m6A modification (Table 1) . Each panel shows individual trace segments, which correspond to the miRNA part of the sequencing signal of DNA-miR-21 (left) and DNA-miR-21 (m6A) (right) . Current steps with significant deviations between analytes are indicated by solid lines. These level differences could also be recognized from the dashed reference lines. The duration time for each nanopore sequencing step is stochastic. Scale bar: 50 ms.
Fig. 11 shows statistical signal variation between DNA-miR-21 and DNA-miR-21 (m6A). (a) Overlay of current steps extracted from multiple DNA-miR-21 events (N=24). (b) Overlay of current steps extracted from multiple DNA-miR-21 (m6A) events (N=24). Each current step in (a-b) represents the mean current value of each quadromer reading. The correlated sequences are shown above the plots. The “X” within the sequence stands for the abasic site located between the DNA and the miRNA segment of the chimeric template. (c) Demonstration of different current patterns generated from two DNA-miRNA strands. The average current values and error bars are derived from the current level traces exhibited in a and b. The light blue strips in (a-c) represent the quadromer reading of TCAX, which is the first quadromer containing the abasic site when read by NIPSS. (d) Statistical current differences between DNA-miR-21 and DNA-miR-21 (m6A). Mean current values in (c) were used to construct the current difference map, where the value of I(DNA-miR-21 (m6A)) - I(DNA-miR-21) is displayed by a black solid line. The standard deviation of signals from DNA-miR-21 is shown with gray columns. The standard deviation of signals from DNA-miR-21 (m6A) is shown with black bars. Signal variations caused by the m6A modification are marked by a dashed box. The sequence of the strand is aligned below the figure, where A* (gray highlighted) represents either A or m6A in the sequence.
Fig. 12 shows enzymatic ligation between DNA and RNA. (a) A schematic diagram of enzymatic ligation. The reaction starts with a 58 nt DNA linker with a 5’ phosphate group (5PO4DNA). After treatment with the Mth RNA ligase (New England Biolabs), the 5’ end of the DNA linker is adenylated (5AppDNA). To minimize non-specific ligation, T4 RNA ligase 2 truncated K227Q mutant (T4 Rnl2tr, New England Biolabs) was chosen, which specifically ligates the 5’ end of the pre-adenylated DNA linker (5AppDNA) with the 3’ end of the target miRNA. (b) Characterization of DNA adenylation by 15% polyacrylamide (PAGE)-urea gel. The adenylation reaction was performed by mixing the following components: 200 pmol 5PO4DNA, 4 μL 10x 5’ DNA adenylation buffer, 4 μL 1 mM ATP, 4 μL Mth RNA ligase and nuclease-free H2O to a final volume of 40 μL. The reaction was incubated at 65 ℃ for 1 h and heat inactivated by incubation at 85 ℃ for 5 min. The reaction product 5AppDNA was purified by ethanol precipitation. Lane + and lane - stand for samples that were incubated with or without Mth RNA ligase, respectively. Lane m stands for the Low Range ssRNA Ladder (New England Biolabs). 5’ adenylation results in a slight upshift of the band during gel electrophoresis. (c) Characterization of DNA-miRNA ligation. Enzymatic ligation results were characterized by 15% PAGE-urea gel electrophoresis. The ligation reaction was performed by mixing the following components: 5.0 μL 50% (w/v) PEG 8000, 2 μL 10x RNA ligase buffer, 20 pmol 5AppDNA, 10 pmol miRNA, 1 μL T4 Rnl2tr and nuclease-free water to a final volume of 20 μL. The reaction was incubated at 4 ℃ for 24 h and heat inactivated by incubation at 65 ℃ for 20 min. Lane A indicates the microRNA Marker (New England Biolabs). Lane B stands for the miR-21+U strand. Lane - stands for the 5AppDNA. Lane + stands for the ligation product. Lane m stands for the Low Range ssRNA Ladder (New England Biolabs).
An extra, high molecular weight band was detected in lane m, which is the ligated DNA-miRNA chimeric strand.
Fig. 13 shows a proposed strategy to directly sequence miRNA from natural resources. (a-c) Schematic diagram of NIPSS sequencing library preparation with isolated miRNA from natural resources. Isolated miRNAs (a) could be ligated to form DNA-miRNA chimeric strands  (b) and subsequently form sequencing libraries (c) by thermal annealing. (d) Direct miRNA sequencing is carried out as described in this invention. (e) Current step transitions during NIPSS reading of miRNA could be decoded into RNA sequences for downstream clinical diagnosis or bioinformatics investigations.
Fig. 14 shows purification and characterization of Phi29 DNAP. Prokaryotic expressed Phi29 DNAP (Methods) , which contains a hexa-histidine tag on its C-terminus, could be purified by nickel affinity chromatography and characterized by gel electrophoresis. (a) The UV absorbance curve for wild-type phi29 DNAP (phi29 DNAP-wt) during gradient elution (0-250 mM imidazole) of the cell lysate sample using nickel affinity chromatography. Two major peaks were observed and their identities were confirmed by corresponding gel electrophoresis. The marked numbers indicate the fractions to be analyzed by gel electrophoresis. (b) The UV absorbance curve for phi29 DNAP D12A/D66A during gradient elution (0-250 mM imidazole) of the cell lysate sample using nickel affinity chromatography. Two major peaks were observed and their identities were confirmed by corresponding gel electrophoresis. The marked numbers indicate the fractions to be analyzed by gel electrophoresis. (c) Characterization of corresponding elution fractions for phi29 DNAP-wt with a 4-15%SDS-polyacrylamide gel. Lane 1: cell lysate sample; Lane 2: cell lysate sample after column loading. Lane 3-7: corresponding elution fractions marked in a) . Lane M: precision plus protein standards (BIO-RAD) ; It is clearly noticed that Phi29 DNAP-wt was in fraction 6. (d) Characterization of corresponding elution fractions for phi29 DNAP D12A/D66A with a 4-15%SDS-polyacrylamide gel. Lane 1: cell lysate sample; Lane 2: cell lysate sample after column loading. Lane 3-7: corresponding elution fractions marked in b) . Lane M: precision plus protein standards (BIO-RAD) ; It is clearly noticed that Phi29 DNAP D12A/D66A was in fraction 6 and 7. (e) Replication activity test of Phi29 DNAP-wt and Phi29 DNAP D12A/D66A. The Hairpin DNA (Table 1) , which forms a hairpin structure with a 5’ overhang upon thermal annealing, was used to test the replication ability of the two home-made Phi29 DNAP. 
Initially, 1.8 μM Hairpin DNA was dissolved in the electrolyte buffer (0.3 M KCl, 10 mM HEPES/KOH, 10 mM MgCl2, 10 mM (NH4)2SO4, 4 mM DTT, pH 7.5) followed by thermal annealing. Enzyme-driven chain elongation was performed with the addition of 5% (v/v) Phi29 DNAP and dNTP at a final concentration of 0.25 mM at room temperature for 1 h. The chain elongation results were characterized by 14% polyacrylamide gel electrophoresis. Lane 1: the DNA product elongated by the D12A/D66A mutant (home-made); Lane 2: the DNA product elongated by phi29 DNAP-wt (home-made); Lane 3: the DNA product elongated by the phi29 DNAP (NEB); Lane 4: DNA without any DNA polymerase; Lane M: 20 bp DNA Ladder (Dye Plus, BioRad). The appearance of the extension product verifies the replication ability of Phi29 DNAP-wt and Phi29 DNAP D12A/D66A.
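By way of illustration only, the consensus comparisons and current difference maps described for Figs. 5, 6, 8, 9 and 11 can be sketched in a few lines of Python. The sketch assumes that each event has already been reduced to one mean current value per sequencing step and time-normalized so that all events contain the same number of steps. The function names and the flagging criterion are hypothetical and are not taken from the described experiments.

```python
from statistics import mean, stdev


def consensus(events):
    """Per-step mean and sample standard deviation across events.

    events: list of equal-length lists, one current value (pA) per step.
    """
    n_steps = len(events[0])
    if any(len(e) != n_steps for e in events):
        raise ValueError("all events must contain the same number of steps")
    means = [mean([e[i] for e in events]) for i in range(n_steps)]
    sds = [stdev([e[i] for e in events]) for i in range(n_steps)]
    return means, sds


def current_difference(mean_modified, mean_reference):
    """Step-wise difference, modified strand minus reference strand."""
    return [m - r for m, r in zip(mean_modified, mean_reference)]


def flagged_steps(delta, sd_reference, sd_modified):
    """Steps where |delta| exceeds the larger of the two standard deviations.

    This threshold is an arbitrary illustrative criterion (assumption).
    """
    return [
        i for i, d in enumerate(delta)
        if abs(d) > max(sd_reference[i], sd_modified[i])
    ]
```

In use, `consensus` would be called once per strand, `current_difference` on the two lists of means, and `flagged_steps` to list the steps that differ by more than the event-to-event scatter.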
DETAILED DESCRIPTION
MicroRNAs (miRNAs) are a class of short non-coding RNAs that function in RNA silencing and post-transcriptional gene regulation. Besides their participation in regulating normal physiological activities, specific miRNA types could act as oncogenes, tumor suppressors or metastasis regulators, which are critical biomarkers for cancer. However, direct characterization of miRNA is challenging due to its unique properties such as its low abundance, sequence similarities and short length. The inventors find that enzyme assisted sequencing, especially Nanopore Induced Phase Shift Sequencing (NIPSS), which is a variant form of nanopore sequencing, could be used to directly sequence short RNA including miRNA. In practice, NIPSS clearly discriminates between different identities, isoforms and epigenetic variants of model miRNA sequences. This invention provides the first demonstration of direct miRNA sequencing, which complements existing miRNA sensing routines by introducing single molecule resolution. Future engineering of this technique may assist miRNA based early stage diagnosis or inspire novel cancer therapeutics.
Mature miRNAs, measuring ~22 nucleotides (nt) in length, are ideal analytes for NIPSS. Though the 15 nt read-length fails to cover the miRNA length completely, it demonstrates the first single molecule sequencing attempt for miRNAs, which is superior in principle to existing miRNA sensing methods because no amplification or prior knowledge of target sequences is required, while all epigenetic information within the sequence is retained and can be resolved.
The method of the invention enables sequencing of miRNAs by DNA-RNA complexes using nanopore sequencing. In a typical nanopore measurement, the nanopore is the only access which permits the flow of the ionic current between two compartments containing electrolytic solutions. With an applied potential, charged analytes are electrically driven through the pore, and the analytical information could be readily recognized and distinguished from the current blockades. It is very difficult to sequence miRNAs in the prior art. The traditional nanopore sequencing method is designed for long sequences and is not suitable for miRNA: miRNAs may not translocate through the nanopore or may not stay in the channel long enough.
The invention provides a method of determining the sequence of a short RNA, the method comprising:
providing two compartments separated by an interface; wherein each of the two compartments comprises a liquid medium and the interface has a channel so dimensioned as to allow passage of only one single-strand polynucleotide at a time;
providing a DNA-short RNA chimeric substrate having a double-stranded portion and a single-stranded portion, wherein at least part of the DNA segment of the chimeric substrate is double-stranded and the single-stranded portion comprises the short RNA and optionally part of the DNA segment; and the short RNA is adjacent to the 5' end of the DNA segment in the chimeric template;
providing an enzyme that can process DNA in a nucleotide by nucleotide way;
introducing the chimeric substrate and the enzyme into one of the two compartments and allowing the enzyme to bind to the chimeric substrate;
applying an electric potential difference between the two compartments, thereby causing the single-stranded portion of the polynucleotide complex to translocate through the channel to the other side;
allowing the enzyme to process the DNA segment of the chimeric substrate, thereby driving the single-stranded portion of the polynucleotide complex to translocate through the channel in a nucleotide by nucleotide way; and
measuring the change of the ionic current through the channel during the primer extension, thereby determining the sequence of the short RNA.
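The measuring step above relies on reducing a raw current trace to a series of discrete current levels, one per nucleotide step. As a hedged illustration only, a very simple level extraction can be sketched as follows. The function name, the running-mean criterion and the threshold parameter are assumptions of this sketch and do not describe the analysis actually used; in particular, noise filtering and polymerase back-stepping are not handled.

```python
def extract_steps(trace, threshold):
    """Split a current trace into consecutive levels.

    trace: list of current samples (pA), in time order.
    threshold: a new level starts when a sample deviates from the running
        mean of the current level by more than this value (pA).
    Returns a list of (mean_current, n_samples) tuples, one per level.
    """
    if not trace:
        return []
    steps = []
    total, count = trace[0], 1
    for sample in trace[1:]:
        if abs(sample - total / count) > threshold:
            # Close the current level and start a new one at this sample.
            steps.append((total / count, count))
            total, count = sample, 1
        else:
            total += sample
            count += 1
    steps.append((total / count, count))
    return steps
```

The number of samples per level reflects the dwell time of each step, which is stochastic, so only the sequence of mean values would be carried forward for comparison between events.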
The "short RNA strand" may be derived from miRNA, siRNA or piRNA, and may be a single strand. In some embodiments, the short RNA is miRNA. The length of the short RNA strand may be 5-50 nt, such as 10-30 nt, preferably 15-20 nt. The term "miRNA" is used according to its ordinary and plain meaning and refers to a microRNA molecule found in eukaryotes that is involved in RNA-based gene regulation.
In the method of the present invention, the target short RNA is conjugated with a DNA. The processing of the DNA by the enzyme slows the translocation of the short RNA enough to enable sequencing of the short RNA. The chimeric substrate comprises a DNA double-stranded portion so that the enzyme can process the DNA segment of the chimeric substrate. The chimeric substrate may be formed by conjugating the target short RNA with a single-stranded DNA to form a chimeric single-strand and then hybridizing the chimeric single-strand with a DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand. Therefore, the chimeric substrate may comprise a DNA-short RNA chimeric single-strand and a DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand, wherein the chimeric single-strand comprises the short RNA conjugated with a DNA. In some embodiments, the DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand may be present in the same strand as the chimeric single-strand and form a hairpin structure with the chimeric single-strand. The hairpin structure is optional and the DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand may be an individual fragment.
In some embodiments, the chimeric substrate and the enzyme may be introduced into one of the two compartments by the following steps: the chimeric single-strand is first synthesized, the chimeric single-strand is then hybridized with the complementary DNA fragment to form the chimeric substrate, the chimeric substrate is then incubated with the enzyme to form a complex, and finally the complex is added into one of the two compartments. In some embodiments, the chimeric single-strand, the complementary DNA fragment and the enzyme may be added into one of the two compartments together to form a complex in the compartment. In some embodiments, the chimeric single-strand may be hybridized with the complementary DNA fragment to form the chimeric substrate, and the chimeric substrate and the enzyme may then be added into one of the two compartments together to form a complex in the compartment.
The chimeric single-strand may be prepared by the following steps: the 5’ phosphorylated single-stranded DNA (5PO4DNA) is first 5’ adenylated with the assistance of the Mth (Methanobacterium) RNA ligase; then the adenylated DNA (5AppDNA) is purified by ethanol precipitation and further ligated to the 3’-end of the target short RNA strand by an RNA ligase, preferably T4 RNA ligase 2 truncated K227Q mutant (T4 Rnl2tr).
Between the short RNA and the DNA there may be a spacer. The spacer may cause unique blockage current when passing through the channel and may serve as a signal marker to identify the beginning or the ending of sequencing signals from miRNA. The spacer may be an abasic spacer, which can produce an abnormally large blockage current. The spacer may be any other suitable spacer which can cause a unique blockage current when passing through the nanopore to distinguish it from adjacent nucleotides. The spacer is optional. The RNA may be conjugated directly with the DNA.
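As an illustrative sketch only, the use of the spacer as a signal marker can be expressed as locating one outlying current step and splitting the list of steps around it. The sketch assumes that the spacer shows up as the single step with the largest current value and that the DNA segment is read before the short RNA; both are simplifying assumptions, and the function name is hypothetical. Because the pore reads several nucleotides at once, more than one step around the marker may in practice be affected by the spacer.

```python
def split_at_marker(step_currents):
    """Split step currents into (dna_steps, marker_index, rna_steps).

    step_currents: list of mean current values (pA), one per step,
        in the order in which they were recorded.
    The marker is taken as the step with the largest current (assumption).
    """
    if not step_currents:
        raise ValueError("no steps to split")
    marker = max(range(len(step_currents)), key=step_currents.__getitem__)
    return step_currents[:marker], marker, step_currents[marker + 1:]
```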
The spacer (if present) may be synthesized in one end of the single-stranded DNA, wherein said end is the end that will conjugate with the short RNA. The single-stranded DNA  with a spacer then can be used to prepare the chimeric single-strand following the above method.
The length of the DNA segment in the DNA-short RNA chimeric single-strand may be greater than or equal to 10nt, greater than or equal to 20nt, equal to 30nt, equal to 40nt, equal to 50nt, or equal to 60nt.
The enzyme may be any enzyme that can process DNA in a nucleotide by nucleotide way, including, but not limited to, DNA polymerase, DNA helicase, exonuclease etc.
In some embodiments, the enzyme may be DNA polymerase. DNA polymerase enables sequencing of short RNA by means of NIPSS, wherein RNA undergoes forward and reverse ratcheting translocation.
The term "polymerase" refers to an enzyme that performs template-directed synthesis of polynucleotides. DNA polymerases are well-known to those skilled in the art. Examples of the DNA polymerase include, but are not limited to, phi29 DNA polymerase, T7 DNA polymerase, His 1 DNA polymerase, His 2 DNA polymerase, Bacillus phage M2 DNA polymerase, Streptococcus phage CPl DNA polymerase, enterobacter phage PRD1 DNA polymerase, and variants thereof. Phi29 DNA polymerase is particularly preferable.
Phi29 DNA polymerase (phi29 DNAP) is derived from Bacillus subtilis bacteriophage phi29 (Blanco, L.; Salas, M. J. Biol. Chem. 1996, 277, 8509-8512; Salas, M.; Blanco, L.; Lazaro, J.M.; de Vega, M. IUBMB Life 2008, 60, 82-85) and works at room temperature with a high processivity. The monomeric phi29 DNAP (~66.5 kDa) belongs to the B family of DNA polymerases and has both 5’-3’ primer strand extension and 3’-5’ exonuclease functions in the presence of Mg2+ as well as 5’-3’ strand displacement activity. Phi29 DNAP is able to synthesize very long stretches of DNA (>70 kb) along a single template strand, and it can synthesize under loads of up to ~37 pN, enough to counteract the electrophoretic force needed to drive DNA through the nanopore.
In some embodiments, the DNA polymerase may be a mutant of wild-type phi29 DNAP, such as a mutant having K555T/D570N (Sakatani Y et al., Protein Eng Des Sel. 2019, 32 (11): 481-487) or D12A/D66A compared with wild-type phi29 DNAP. In some embodiments, the mutant of wild-type phi29 DNAP only has K555T/D570N or D12A/D66A compared with wild-type phi29 DNAP. K555T/D570N refers to K555T and D570N. D12A/D66A refers to D12A and D66A.
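For illustration, the slash-separated mutation notation used above can be read mechanically as a list of single substitutions. The following sketch parses such a label and applies it to a one-letter amino acid sequence. The function names are hypothetical, and the sketch assumes 1-based residue numbering that matches the supplied sequence exactly; numbering conventions (for example, whether a signal peptide or the initiator methionine is counted) must be checked for any real sequence.

```python
import re


def parse_mutations(label):
    """Parse a label such as 'D12A/D66A' into (wild_type, position, mutant) tuples."""
    mutations = []
    for token in label.split("/"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token.strip())
        if match is None:
            raise ValueError(f"unrecognized mutation: {token!r}")
        mutations.append((match.group(1), int(match.group(2)), match.group(3)))
    return mutations


def apply_mutations(sequence, mutations):
    """Return the sequence with each substitution applied (1-based positions)."""
    residues = list(sequence)
    for wild_type, position, mutant in mutations:
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = mutant
    return "".join(residues)
```

The check against the expected wild-type residue guards against applying a mutation label to a sequence with a different numbering.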
In the method using a DNA polymerase, the DNA-short RNA chimeric substrate may comprise a DNA-short RNA chimeric single-strand, a primer and a blocking oligomer. The short RNA is adjacent to the 5' end of the DNA segment. The primer is annealed to the 3' end of the DNA segment of the chimeric single-strand. The blocking oligomer is hybridized with the DNA segment of the chimeric single-strand adjacent (preferably immediately adjacent) to the 3' end of the primer to prevent extension of the primer in the course of the forward ratcheting translocation, that is, the 3' end of the primer is protected by the blocking oligomer. The double-stranded portion consists of the primer, the blocking oligomer and the hybridized DNA segment. The polymerase is bound to the DNA-short RNA chimeric substrate. When the electric potential difference between the two compartments is applied, the single-stranded portion of the polynucleotide complex is drawn into the channel, causing the blocking oligomer to unzip from the 3' end to the 5' end. When the blocking oligomer is completely unzipped, the extendable 3' end of the primer is exposed, allowing polymerase-driven primer extension to start. As the primer is extended, the DNA-short RNA chimeric single-strand translocates through the channel in the reverse direction, and the short RNA translocates through the channel with it, giving rise to an ion current blockade which can be measured. During primer extension driven by the polymerase reaction, DNA and RNA are read sequentially. When the first non-deoxyribonucleotide residue reaches the reaction site of the polymerase, the sequencing signal halts. There is a fixed distance between the constriction site of the channel and the reaction site of the polymerase, and this fixed distance determines the length of the RNA sequence that can be detected. For example, for MspA, the fixed distance is equivalent in length to 14-15 nucleotides.
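The relation between the fixed distance and the readable part of the short RNA can be illustrated with a simple count. The sketch below assumes that the RNA is joined to the DNA segment through its 3' end (as in the ligation route described above), that reading stops when the first non-DNA residue reaches the polymerase, and that any spacer residues count toward the fixed distance. These are simplifying assumptions, the function name and default values are hypothetical, and the multi-nucleotide reading window of the pore is ignored. The example sequence is the commonly reported mature human miR-21 sequence.

```python
def readable_rna(rna_5to3, offset_nt=15, spacer_len=1):
    """Return the part of the RNA expected to pass the constriction.

    rna_5to3: RNA sequence written 5'->3'; its 3' end joins the DNA segment.
    offset_nt: constriction-to-polymerase distance in nucleotides (assumed).
    spacer_len: number of spacer residues between DNA and RNA (assumed to
        count toward the offset).
    """
    n = max(0, min(len(rna_5to3), offset_nt - spacer_len))
    # The nucleotides nearest the DNA junction (the RNA 3' end) are read.
    return rna_5to3[len(rna_5to3) - n:]


mir21 = "UAGCUUAUCAGACUGAUGUUGA"  # 22 nt
window = readable_rna(mir21)
print(len(window), window)
```

Under these assumptions only the junction-proximal part of a ~22 nt miRNA is covered, which is consistent with the partial read-length noted earlier in this description.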
Although the RNA is adjacent to the 5' end of the DNA segment in the DNA-RNA chimeric template, since the RNA is not involved in any reaction, the orientation of the RNA itself is not limited. The RNA may be ligated to the DNA segment at the RNA's 5' end or at the RNA's 3' end.
The DNA segment may contain unique sequence repeats to generate a unique signal pattern during primer extension, which marks the initiation of the primer extension. In some embodiments, the sequence repeats include sequence repeats between "AAGA" and "TTTC" (3'-5').
The primer may have a hairpin on its 5' end to prevent the polymerase from acting on the double-stranded end of the DNA segment. The hairpin in the primer is optional. The length of the primer sequence that can form a duplex with the chimeric single-strand is not limited, and may be 15-30 nt, such as 18-27 nt, e.g. 20-25 nt.
The blocking oligomer may be any sequence that can hybridize with the DNA segment adjacent to the primer. The blocking oligomer may be single-stranded DNA. The length of the blocking oligomer is not limited and may be 15-30 nt, such as 18-27 nt, e.g. 20-25 nt. The 3' end of the blocking oligomer is not paired with the DNA segment of the chimeric single-strand, so as to provide a free end for unzipping. For example, the 3' end of the blocking oligomer may have several (e.g. 1-10) nucleotides mismatched with the chimeric single-strand or have several (e.g. 1-10) abasic residues. In some embodiments, a cholesterol molecule may be linked to the 3' end of the blocking oligomer to facilitate unzipping. The cholesterol molecule may be anchored to the membrane between the two compartments, facilitating the entry of the single-stranded portion of the chimeric substrate into the channel.
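The geometric relationship between the primer, the blocking oligomer and the DNA segment can be sketched as a small sequence-design helper. This is a hypothetical illustration, not the design procedure of the invention: the function names, the default lengths and the way the mismatched 3' tail is generated (copying the facing template bases, since a base does not pair with an identical base) are all assumptions of the sketch. Optional features such as the 5' hairpin on the primer or a cholesterol tag are not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primer_and_blocker(dna_5to3, primer_len=20, blocker_len=20, tail_len=5):
    """Return (primer, blocker), both written 5'->3'.

    dna_5to3: DNA segment of the chimeric single-strand, 5'->3'.
    The primer pairs with the 3'-terminal primer_len bases of the segment.
    The blocker pairs with the next blocker_len bases toward the 5' end and
    carries tail_len mismatched bases at its own 3' end.
    """
    if min(primer_len, blocker_len) < 1 or tail_len < 0:
        raise ValueError("lengths must be positive (tail_len may be 0)")
    paired = primer_len + blocker_len
    if len(dna_5to3) < paired + tail_len:
        raise ValueError("DNA segment too short for the requested design")
    end = len(dna_5to3)
    primer_site = dna_5to3[end - primer_len:]
    blocker_site = dna_5to3[end - paired:end - primer_len]
    tail_site = dna_5to3[end - paired - tail_len:end - paired]
    primer = revcomp(primer_site)
    # Tail bases equal the template bases they face, so they cannot pair.
    blocker = revcomp(blocker_site) + tail_site[::-1]
    return primer, blocker
```

With this layout the 5' end of the blocker abuts the 3' end of the primer, and the unpaired 3' tail of the blocker points toward the short RNA, which is the end from which unzipping is described as starting.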
In some embodiments, the chimeric single-strand, the primer and the blocking oligomer are first mixed and thermally annealed to form the DNA-short RNA chimeric substrate, and then the chimeric substrate is added into one of the two compartments. In some embodiments, the chimeric single-strand, the primer and the blocking oligomer are added to one of the two compartments together and form a polynucleotide complex in the liquid medium of the compartment. The polymerase may be first incubated with the chimeric substrate to form a complex and then the complex is added to one of the two compartments. Alternatively, the polymerase may be added into one of the two compartments together with the pre-formed chimeric substrate or with the chimeric single-strand, the primer and the blocking oligomer to form a complex in the compartment.
In some embodiments, the enzyme may be a DNA helicase. DNA helicase enables sequencing of short RNA during the course of unzipping of the DNA double-strand. The term "DNA helicase" refers to an enzyme that unwinds duplex DNA. DNA helicases are well-known to those skilled in the art. Examples of the DNA helicase include, but are not limited to, Hel 308 helicase, RecD helicase (e.g., TraI helicase or TrwC helicase), XPD helicase, Dda helicase and variants thereof. Hel 308 helicase (which is also called DNA Helicase HEL308) is particularly preferable. Hel 308 helicase is an ATP-dependent Ski2-like superfamily II (SF2) helicase/translocase that unwinds duplex DNA in the 3′ to 5′ direction.
In the method using the DNA helicase, the DNA-short RNA chimeric substrate may comprise a DNA-short RNA chimeric single-strand and a blocking oligomer. The short RNA may be adjacent to the 5' end or the 3' end of the DNA segment. The blocking oligomer is hybridized with the DNA segment of the chimeric single-strand. The double-stranded portion consists of the blocking oligomer and the hybridized DNA segment. The helicase is bound to the DNA-short RNA chimeric substrate. When the electric potential difference between the two compartments is applied, the single-stranded portion of the polynucleotide complex is drawn into the channel. Meanwhile, the blocking oligomer is unzipped by the helicase, driving the short RNA to translocate through the channel in a nucleotide by nucleotide way, giving rise to an ion current blockade which can be measured.
The blocking oligomer may be any sequence that can hybridize with the DNA segment adjacent to the primer. The blocking oligomer may be single-stranded DNA. The length of the blocking oligomer is not limited and may be 15-30 nt, such as 18-27 nt, e.g. 20-25 nt. At least one end of the blocking oligomer is not paired with the DNA segment of the chimeric single-strand, so as to provide a free end for unzipping. For example, the free end of the blocking oligomer may have several (e.g. 1-10) nucleotides mismatched with the chimeric single-strand or have several (e.g. 1-10) abasic residues. In some embodiments, a cholesterol molecule may be linked to one end (such as the free end) of the blocking oligomer to facilitate unzipping. The cholesterol molecule may be anchored to the membrane between the two compartments, facilitating the entry of the single-stranded portion of the chimeric substrate into the channel.
In some embodiments, the chimeric single-strand and the blocking oligomer are first mixed and thermally annealed to form the DNA-short RNA chimeric substrate, and then the chimeric substrate is added into one of the two compartments. In some embodiments, the chimeric single-strand and the blocking oligomer are added to one of the two compartments together and form a polynucleotide complex in the liquid medium of the compartment. The helicase may be first incubated with the chimeric substrate to form a complex and then the complex is added to one of the two compartments. Alternatively, the helicase may be added into one of the two compartments together with the pre-formed chimeric substrate or with the chimeric single-strand and the blocking oligomer to form a complex in the compartment.
The interface separating the two compartments may be a membrane, and may be any material capable of supporting the channel. The membrane may be natural membrane, synthetic membrane or artificial membrane. The membrane may be a solid membrane and may comprise or consist of a solid substrate, such as SiNx, glass, silicon dioxide, molybdenum disulfide, graphene, aluminium oxide, or CNT (carbon nano tube) , etc. The solid membrane or nanopore may further comprise a linker group compound that is attached by covalent bond. The polymerase may be attached to a solid membrane or solid nanopore using a suitable linker group.
The membrane may be a lipid bilayer, such as formed from amphiphilic molecules, e.g. phospholipids, which have both hydrophilic and lipophilic properties. The membrane may be a polymer layer, such as formed from a block-copolymer. The lipid bilayer may be artificial, for example non-natural. Methods for forming lipid bilayers are known in the art.
The diameter of the channel in the interface may be about 0.25nm to about 4 nm. The channel may be a nanopore. The channel is narrow enough that the blockage of the channel by the nucleotide can be detected by a change in a particular signal, for example, fluorescence signal. The narrowest site of the channel may be called a constriction site. Preferrably, the channel may have enough height to perform "Nanopore Induced Phase-Shift Sequencing (NIPSS) " . In some embodiments, the channel may have enough height to accommodate the short RNA, such as a full-length miRNA or at least partial sequence of the short RNA, such as at least partial sequence of a miRNA.
The nanopore may be a solid nanopore, a protein nanopore or a DNA nanopore, whether the membrane is a solid membrane or a lipid bilayer. The nanopore may be natural, for example derived from a biological organism, or the nanopore may be synthetic. The nanopore may be recombinantly produced. The nanopore may be formed by a biological molecule, such as a protein (which can also be called a protein nanopore or a nanopore-forming protein). A protein nanopore comprises different regions, including the entrance, the vestibule, and the narrowest region, called the constriction. When performing nanopore sequencing, the ionic current is blocked at the constriction, thereby causing a detectable change of the electric current, which can reflect the sequence information.
The protein nanopore used in this invention preferably has no spontaneous gating activities and/or preferably keeps open when the analyte is absent. The protein nanopore used in this invention may be any suitable protein nanopore. Examples of protein nanopores or nanopore-forming proteins include CsgG, α-HL, ClyA, Phi29 connector protein, aerolysin, MspA, OmpF, OmpG, FraC, HlyA, SheA, sp1 or variants thereof. Other examples of biological molecule nanopores include nanopores formed by DNA self-assembly. The nanopore that can be used in the invention may also be an ion channel, such as a potassium channel or a sodium channel and the like. In some preferred embodiments, the protein nanopore or nanopore-forming protein is MspA or mutant MspA.
MspA refers to Mycobacterium smegmatis porin A and is octameric. MspA monomers associate with each other and form a conically shaped pore with a finite height (~10 nm). The term "mutant MspA" refers to a mutant of wild-type MspA. In some embodiments, the mutant MspA may comprise the mutations D90N/D91N/D93N/D118R/D134R/E139K (including all of the listed mutations) compared to the wild-type MspA. In some embodiments, the mutant MspA may only have the mutations D90N/D91N/D93N/D118R/D134R/E139K compared to the wild-type MspA. Sequences of wild-type MspA monomers are known by the person skilled in the art. For example, sequences of wild-type MspA monomers can be found in GenBank at https://www.ncbi.nlm.nih.gov/. In some embodiments, the wild-type MspA porin monomer may have the following amino acid sequence:
Figure PCTCN2020110836-appb-000001
In some embodiments, the wild-type MspA porin monomer may consist of SEQ ID NO: 1.
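The substitution notation used above (for example D90N, meaning aspartate at position 90 replaced by asparagine) can be applied to a sequence mechanically. The following is a minimal sketch only: the function name is hypothetical, positions are assumed to be 1-based on whatever sequence is supplied, and the short toy sequence stands in for SEQ ID NO: 1, which is not reproduced here.

```python
import re

def apply_mutations(sequence, spec):
    """Apply substitutions such as 'D90N/D91N' to a one-letter amino acid sequence.

    Positions are treated as 1-based (an assumption about the numbering
    convention). Each token is checked against the residue actually present,
    so a wrong numbering offset is reported rather than silently applied.
    """
    residues = list(sequence)
    for token in spec.split("/"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token.strip())
        if match is None:
            raise ValueError(f"unrecognised mutation: {token!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if pos < 1 or pos > len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(f"expected {old} at {pos}, found {residues[pos - 1]}")
        residues[pos - 1] = new
    return "".join(residues)

# Toy sequence for illustration only; it is not an MspA sequence.
toy_mutant = apply_mutations("MDDAE", "D2N/E5K")
```

With a real wild-type monomer sequence, the same call with "D90N/D91N/D93N/D118R/D134R/E139K" would either return the mutant sequence or raise an error if the numbering of the supplied sequence does not match the notation.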
Methods of preparing a membrane with an inserted nanopore are well known to those skilled in the art. For example, a protein nanopore is added into the compartment into which the DNA-short RNA chimeric substrate will be added, then the protein nanopore can spontaneously insert into the lipid bilayer between the two compartments.
To perform NIPSS, there should be a distance between the constriction site of the channel and the reaction site of the polymerase. Therefore, in some preferred embodiments, the polymerase is present on the side farther from the constriction site of the channel. "Farther" means that the polymerase is present on the side of the channel where its distance from the constriction site is greater than it would be on the other side. In some embodiments, a protein nanopore is added into the compartment which accommodates the chimeric substrate and the enzyme, and the smaller end of the protein nanopore can spontaneously insert into the lipid bilayer between the two compartments.
The liquid medium in each of the two compartments may contain ions. The liquid medium may be conductive, such as having an electrolyte buffer, e.g. an aqueous electrolyte buffer. The liquid medium may comprise one or more of a salt, a detergent, or a buffering agent. The liquid medium may comprise HEPES or Tris-HCl buffer. The liquid medium may have a pH of from 4.0 to 12.0, from 4.5 to 10.0, from 5.0 to 9.0, from 5.5 to 8.8, from 6.0 to 8.7, from 7.0 to 8.8 or from 7.5 to 8.5. The pH used is preferably about 7.5. The liquid medium may comprise potassium chloride (KCl), sodium chloride (NaCl) or caesium chloride (CsCl). KCl is preferred. The salt concentration may be up to saturation. The salt concentration may be 3 M or lower and is typically from 0.1 to 2.5 M, from 0.3 to 1.9 M, from 0.5 to 1.8 M, from 0.7 to 1.7 M, from 0.9 to 1.6 M or from 1 M to 1.4 M. The salt concentration is preferably from 150 mM to 1 M. High salt concentrations provide a high signal to noise ratio and allow for currents indicative of the presence of a nucleotide to be identified against the background of normal current fluctuations. The concentration of the salts should allow the enzyme used in the present invention to work. DTT may also be added into the compartment to produce a reducing environment needed for some enzymes. In some embodiments, the liquid medium may comprise KCl and HEPES/KOH pH 7.5, for example, the liquid medium may comprise 0.3 M KCl, 10 mM HEPES/KOH, 10 mM (NH4)2SO4 and 4 mM DTT, pH 7.5.
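As a rough illustration of the example medium (0.3 M KCl, 10 mM HEPES, 10 mM ammonium sulfate, 4 mM DTT), the following sketch computes the mass of each solute for a chosen volume. The molar masses are standard reference values supplied here rather than taken from this specification, the function name is hypothetical, and the pH adjustment with KOH is not modelled.

```python
# Molar masses in g/mol (standard reference values; an addition for illustration).
MOLAR_MASS = {
    "KCl": 74.55,
    "HEPES": 238.30,
    "(NH4)2SO4": 132.14,
    "DTT": 154.25,
}

# Target concentrations in mol/L, following the example medium in the text.
RECIPE = {
    "KCl": 0.3,
    "HEPES": 0.010,
    "(NH4)2SO4": 0.010,
    "DTT": 0.004,
}

def solute_masses_g(volume_l, recipe=RECIPE):
    """Return grams of each solute needed for volume_l litres of medium."""
    return {name: conc * volume_l * MOLAR_MASS[name] for name, conc in recipe.items()}

masses = solute_masses_g(0.1)  # 100 mL of medium
for name, grams in masses.items():
    print(f"{name}: {grams:.3f} g")
```

The pH would still need to be brought to about 7.5 with KOH after dissolving the solutes.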
The methods of the present invention may be carried out at from 0 ℃ to 100 ℃, from 15 ℃ to 95 ℃, from 16 ℃ to 90 ℃, from 17 ℃ to 85 ℃, from 18 ℃ to 80 ℃, from 19 ℃ to 70 ℃, or from 20 ℃ to 60 ℃. The methods are typically carried out at room temperature. The methods are optionally carried out at a temperature that supports enzyme function, such as about 37 ℃.
To enable primer extension, the liquid medium of the compartment which accommodates the DNA polymerase may also comprise divalent cations such as Mg2+, Ca2+ or Co2+, and deoxyribonucleotide triphosphates (dNTPs). The divalent cations may be provided as, for example, MgCl2. The dNTPs may comprise all four standard dNTPs. In some embodiments the liquid medium comprises 10 mM MgCl2 and 250 μM dNTPs. The divalent cation may be added into the liquid medium at any suitable time, such as being added when formulating the liquid medium. The dNTPs may be added into the liquid medium at any suitable time, such as being added when formulating the liquid medium or being added together with the DNA-short RNA substrate.
To enable function of the helicase, the liquid medium of the compartment which accommodates the DNA helicase may also comprise ATP.
The method of the present invention is typically carried out with a voltage applied across the interface and the channel. The voltage used is typically from +2 V to -2 V, typically -400 mV to +400 mV. The voltage used is preferably in a range having a lower limit selected from -400 mV, -300 mV, -200 mV, -150 mV, -100 mV, -50 mV, -20 mV and 0 mV and an upper limit independently selected from +10 mV, +20 mV, +50 mV, +100 mV, +150 mV, +200 mV, +300 mV and +400 mV. The voltage used is more preferably in the range of 100 mV to 240 mV and most preferably in the range of 120 mV to 220 mV.
Measurement of the change of the ionic current through the channel may be performed by way of an optical signal or an electric current signal. Measuring the change of the ionic current in either way, and determining the sequence from the measurement, is well known to those skilled in the art. Methods of measuring electric current are well known in the art. For example, one or more measurement electrodes could be used to measure the electric current through the channel. These can be connected to, for example, a patch-clamp amplifier or a data acquisition device. For example, an Axopatch-IB patch-clamp amplifier (Axon 200B, Molecular Devices) could be used to measure the electric current flowing through the channel.
The method of the invention may be used to discriminate the identities of miRNA, discriminate miRNA isoforms, or detect m6A modification within miRNA.
The method of the invention may be used to determine the sequences of a plurality of short RNAs. The DNA-short RNA chimeric substrate may be a plurality of DNA-short RNA chimeric substrates. Different DNA-short RNA chimeric substrates may comprise the same DNA segment and different short RNAs so that different short RNAs may be sequenced simultaneously using the same DNA segment, the same enzyme, the same primer and/or the same blocking oligomer. If a DNA polymerase is used, different DNA-short RNA chimeric substrates may bind to one kind of primer, one kind of blocking oligomer and one kind of polymerase to form a plurality of complexes. If a DNA helicase is used, different DNA-short RNA chimeric substrates may bind to one kind of blocking oligomer and one kind of helicase to form a plurality of complexes.
In another aspect of the invention, a sequencing library is provided. The sequencing library is constructed by thermal annealing from a plurality of DNA-short RNA chimeric single-strands and a DNA fragment; wherein each DNA-short RNA chimeric single-strand comprises the same DNA segment and a different short RNA, and the DNA fragment is complementary to at least part of the DNA segment of the chimeric single-strand.
The DNA-short RNA chimeric single-strand, the DNA fragment, the primer and/or the blocking oligomer are defined as above. In the sequencing library, the primers may be the same, the blocking oligomers may be the same, and/or different DNA-RNA chimeric single-strands may comprise the same DNA segment and different short RNAs. Sequencing of the library may be conducted in a nanopore array.
In another aspect of the invention, a system for determining the sequence of a short RNA is provided.
The system comprises:
two compartments separated by an interface; wherein each of the two compartments comprises a liquid medium and the interface has a channel so dimensioned as to allow passage of only one single-strand polynucleotide at a time;
an enzyme and a DNA-short RNA chimeric substrate in one of the two compartments; wherein the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand and at least one DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand, the chimeric single-strand comprises the short RNA conjugated with a DNA, and the enzyme can process DNA in a nucleotide-by-nucleotide manner.
The compartment, the interface, the liquid medium, the channel, the enzyme, the DNA-short RNA chimeric substrate, the DNA fragment, the primer and/or the blocking oligomer are defined as above.
The embodiments described herein can be understood more readily by reference to the following detailed description, examples, and claims, and their previous and following description. It is to be understood that the embodiments described herein are not limited to the specific uses, methods, and/or products. It is also to be understood that the terminology used herein is for the purpose of describing particular aspects only and is not intended to be limiting.
Further, the following description is provided as an enabling teaching of the various embodiments in their best, currently known aspect. Those skilled in the relevant art will recognize that many changes can be made to the aspects described, while still obtaining the beneficial results of this disclosure. It will also be apparent that some of the desired benefits of the present invention can be obtained by selecting some of the features of the various embodiments without utilizing other features. Accordingly, those who work in the art will recognize that many modifications and adaptations to the various embodiments described herein are possible and can even be desirable in certain circumstances and are a part of the present disclosure. Thus, the following description is provided as illustrative of the principles of the embodiments described herein and not in limitation thereof.
Throughout this application, the term “about” is used to indicate that a value includes the standard deviation of error for the system or method being employed to determine the value. In any embodiment discussed in the context of a numerical value used in conjunction with the term “about, ” it is specifically contemplated that the term "about" can be omitted.
It should be understood throughout the present specification that expression of a singular form includes the concept of their plurality unless otherwise mentioned. Accordingly, for example, it should be understood that a singular article (for example, “a” , “an” , “the” in English) comprises the concepts of plural form unless otherwise mentioned.
It should also be understood that the terms as used herein have the definitions typically used in the art unless otherwise mentioned. Thus, unless otherwise defined, all scientific and technical terms have the same meanings as those generally used by those skilled in the art to which the present invention pertains. In case of contradiction, the present specification (including the definitions) takes precedence.
All patents and publications, including all sequences disclosed within such patents and publications, referred to herein are expressly incorporated by reference.
EXAMPLES
Example 1
Direct miRNA sequencing using NIPSS
To perform direct miRNA sequencing using NIPSS, the target miRNA strand must be conjugated with a section of DNA to form a DNA-miRNA chimeric template. To prove the feasibility of the method, this chimeric template, which is composed of a segment of DNA on the 5’-end and a segment of miRNA on the 3’-end with the two segments separated by an abasic spacer, was custom synthesized (Fig. 1a, Table 1) . The sequencing library was constructed by thermal annealing from three separate strands: the chimeric template, the primer and the blocker (Fig. 1a, Table 1, Methods 1) [26, 27] .
Table 1: Nucleic acid sequences used in this study.
Figure PCTCN2020110836-appb-000002
Notes:
1. All miRNA segments of the sequence contexts are underlined.
2. Primer/Template duplex regions in the chimeric template strand are indicated by bold letters (GCATTCTCATGCAGGTCGTAGC) .
3. Blocker/Template duplex regions in the chimeric template strand are indicated by italic letters (TCAGATCTCACTATC) .
4. The “X” letter represents the abasic site.
5. PO 4 stands for the 5’ phosphate group.
6. dideoxyC stands for a 3’ dideoxycytosine.
As reported, NIPSS was carried out with a mutant Mycobacterium smegmatis porin A (MspA) nanopore (Methods 2) , atop of which a wildtype (WT) phi29 DNA polymerase (DNAP) served as the ratcheting enzyme during sequencing [26] . During NIPSS (Methods 3) ,  the sequencing library complex, which is bound with the phi29 DNAP, was first electrophoretically dragged into the nanopore to unzip the blocker strand mechanically. Subsequently, a phi29 DNAP-driven primer extension was initiated so that the chimeric template starts moving against the electrophoretic force in steps equivalent to a single nucleotide (Fig. 2) . As presented in Fig. 1b, the DNA segment, the abasic site and the miRNA segment sequentially pass through the nanopore constriction during NIPSS.
The DNA segment thus acts as the “DNA drive strand”. Utilizing the phase-shift between the polymerase synthesis site and the nanopore constriction, the phi29 DNAP enzymatically drives the DNA segment to move against the electrophoretic force and simultaneously the tethered miRNA segment is sequenced by the nanopore constriction. The DNA segment is designed to contain sequence repeats between “AAGA” and “TTTC” (3’-5’) to generate a unique signal pattern during NIPSS (Fig. 2, 3), which marks the initiation of a NIPSS event. The abasic spacer “X” following the DNA segment is expected to produce a high current signature that marks the initiation of miRNA sequencing within a NIPSS event (Fig. 2, 3, Table 1). From this point onward, miRNA sequencing signals with a read-length of ~15 nucleotides are expected until the abasic spacer reaches the polymerase synthesis site [26]. Unless otherwise stated, all NIPSS assays described in this invention were carried out by following this configuration. The miRNA segments of different chimeric templates vary in sequence, whereas the DNA segment and the abasic spacer are kept unchanged (Table 1).
As a proof of concept, miR-21, which is an intensively studied oncogenic miRNA and a cancer biomarker [5, 28], was included as the miRNA segment of a chimeric strand (DNA-miR-21, Table 1) for sequencing. A representative raw current trace of DNA-miR-21 is shown in Fig. 1c. This trace can be segmented according to the characteristic sequencing patterns of the DNA drive strand and the abasic spacer respectively: the DNA segment reports two repeats of triangular shaped signals and the abasic spacer reports an abnormally high step immediately after the signal from the DNA (Fig. 3). The steps immediately following the abasic spacer signal correspond to the miRNA. Specifically, the representative trace in Fig. 1c reports sequencing signals from miR-21.
To analyze the sequencing data (Methods 4) , signal steps were extracted from raw sequencing traces by a custom LabView program as reported [26] . The means and standard deviations of all steps were summarized from 24 independent NIPSS events (Fig. 1d) . To avoid statistical bias, only NIPSS events with more than 14 nucleotides coverage for the miRNA segment were included (Methods 4) . From the statistics, the characteristic high and low current steps which correspond to nanopore readings of AAGA and TTTC were clearly recognized. After the signal from the abasic spacer, 14 subsequent steps that correspond to the additional sequence of miR-21 were demonstrated. Thus, direct miRNA sequencing by NIPSS has been conceptually demonstrated by sequencing a synthetic miR-21 moiety. NIPSS events from synthetic chimeric template containing identical sequences show highly consistent nanopore sequencing patterns, which can be aligned with designed sequences. Direct miRNA sequencing by NIPSS is in principle, universally applicable to other miRNA identities.
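The per-step statistics described above can be sketched as follows. This is an illustrative stand-in rather than the custom LabView program: the function name is hypothetical, events are assumed to be already aligned by step index, and the current values are made up.

```python
from statistics import mean, stdev

def summarize_steps(events, min_steps):
    """Per-step mean and sample standard deviation across NIPSS events.

    events: list of lists of step current levels (pA), aligned by step index.
    Events with fewer than min_steps steps are dropped (mirroring the
    coverage filter in the text); longer events are truncated to min_steps.
    """
    kept = [event[:min_steps] for event in events if len(event) >= min_steps]
    if len(kept) < 2:
        raise ValueError("need at least two events with enough steps")
    return [(mean(column), stdev(column)) for column in zip(*kept)]

# Made-up step levels for three events; the third is too short and is dropped.
demo_events = [
    [10.0, 20.0, 30.0],
    [12.0, 22.0, 32.0],
    [11.0, 21.0],
]
demo_profile = summarize_steps(demo_events, min_steps=3)
```

Each entry of the returned list is a (mean, standard deviation) pair for one sequencing step, which is the form plotted as points with error bars.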
Example 2
Discrimination of miRNA identities by NIPSS
The sequence of Let-7a, which is a member of the let-7 family of miRNAs [29], was included to form another chimeric template named DNA-let-7a for sequencing. Let-7 miRNA was the first human miRNA to have been discovered [29]. In contrast to miR-21, which is a cancer biomarker, let-7 miRNA is known to target many oncogenes and thus it behaves as a cancer suppressor [30]. Simultaneous discrimination of miR-21 and Let-7a, which are an oncogenic miRNA and a cancer suppressor miRNA respectively, thus shows significant bioanalytical value for cancer diagnosis and serves as an excellent example of miRNA identity recognition by NIPSS.
The NIPSS sequencing of DNA-Let-7a was carried out in a manner similar to that demonstrated in Fig. 1. Representative raw sequencing data for DNA-miR-21 and DNA-let-7a are demonstrated in Fig. 4. Statistically, overlays of steps from 24 independent DNA-miR-21 sequencing events (Fig. 5a) and 12 independent DNA-let-7a sequencing events (Fig. 5b) are summarized. The means and standard deviations from the extracted events in Figs. 5a and 5b are shown together in Fig. 5c. The statistics of sequencing steps from both chimeric templates show great alignment in the DNA part of the events. This is expected because the sequence of the DNA segment is identical in DNA-miR-21 and DNA-let-7a (Table 1). Starting from the steps of the abasic spacer, however, the sequencing steps deviate from one another due to significant sequence variations in the miRNA segment. Though preliminary, the observed variations of sequencing signals between Let-7a and miR-21 support the hypothesis that different miRNA identities can be deduced from NIPSS.
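One simple way to locate where two step profiles deviate is to compare per-step means relative to their combined spread. The sketch below is an assumption-laden illustration, not the analysis used in this study: the function name, the root-sum-of-squares spread, and the threshold of 2 are all choices made here, and the numbers are made up.

```python
from math import sqrt

def divergent_steps(profile_a, profile_b, threshold=2.0):
    """Indices of steps whose means differ by more than threshold times the
    combined standard deviation (root sum of squares of the two SDs).

    Each profile is a list of (mean, sd) tuples, one per sequencing step.
    """
    hits = []
    for index, ((mean_a, sd_a), (mean_b, sd_b)) in enumerate(zip(profile_a, profile_b)):
        spread = sqrt(sd_a * sd_a + sd_b * sd_b)
        if spread == 0:
            if mean_a != mean_b:
                hits.append(index)
        elif abs(mean_a - mean_b) / spread > threshold:
            hits.append(index)
    return hits

# Made-up profiles: the first two steps agree, the third differs.
profile_one = [(50.0, 1.0), (60.0, 1.0), (70.0, 1.0)]
profile_two = [(50.5, 1.0), (60.0, 1.0), (80.0, 1.0)]
differing = divergent_steps(profile_one, profile_two)
```

For two templates sharing the same DNA drive strand, the DNA steps would be expected to fall below the threshold, and flagged indices would be expected to begin around the miRNA segment.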
Example 3
Discriminating miRNA isoforms by NIPSS
According to the scheme of direct miRNA sequencing by NIPSS, minor sequence variations such as addition, insertion or deletion of nucleotides near the 3’-end of the target miRNA could in principle be resolved with single-nucleotide resolution. Natural miRNA length isoforms (isomiRs), which result mainly from terminal additions or deletions of an indefinite number of nucleotides at the post-transcriptional level, differ at their 3’ or 5’-end in many mature miRNAs [31]. IsomiRs have attracted significant attention due to their functional roles in diverse biological processes, which include modulation of miRNA stabilization [15], regulation of mRNA-targeting efficacy [16] and their correlations to different disease states [9, 15, 16, 32]. Most isomiRs differ by only a single nucleotide at either the 3’ or the 5’-end, and thus discrimination by conventional miRNA sensing routines is challenging. However, isomiRs differing at their 3’-ends as a result of non-template enzymatic additions [30] are more frequently observed and can be immediately distinguished with the current NIPSS scheme.
The 3’ uridylation product of miR-21, which is an isomiR of miR-21, is named miR-21+U in this invention (Fig. 6a). As reported, the natural miR-21+U isomiR is abundant in human urine samples and is a promising biomarker for prostate cancer [33]. As a proof of concept, the chimeric templates DNA-miR-21 and DNA-miR-21+U (Table 1) were designed and sequenced using the same NIPSS configuration (Fig. 1, 5). The means and standard deviations of extracted sequencing steps from 20 independent NIPSS events of DNA-miR-21 and DNA-miR-21+U are shown in Fig. 6b. As expected, the DNA segment of the NIPSS sequencing signal shows perfect alignment, since the DNA drive strands from both chimeric templates are identical in sequence. Distinct variation of the signal begins at step 23 as a consequence of the additional uridine at the 3’-end of miR-21+U when compared to its isoform miR-21. Starting from step 23, all follow-up sequencing steps from miR-21+U appear back shifted by 1 nucleotide (Fig. 6b, c), which further confirms that the 3’ uridylation isomiR has been recognized from direct NIPSS sequencing.
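The one-step back shift described above can be checked numerically by testing which offset best re-aligns the two step profiles after the point of divergence. The following is a minimal sketch under stated assumptions: the function name and the mean-absolute-difference cost are choices made here, and the step levels are made up rather than taken from Fig. 6.

```python
def best_shift(reference, query, start, max_shift=3):
    """Return (shift, cost) minimising the mean absolute difference between
    reference[i] and query[i + shift] for i >= start.

    reference, query: lists of step means (pA). A shift of 1 means the query
    carries one extra step at the divergence point.
    """
    best = None
    for shift in range(max_shift + 1):
        pairs = [
            (reference[i], query[i + shift])
            for i in range(start, len(reference))
            if i + shift < len(query)
        ]
        if not pairs:
            continue
        cost = sum(abs(r - q) for r, q in pairs) / len(pairs)
        if best is None or cost < best[1]:
            best = (shift, cost)
    return best

# Made-up levels: the query has one extra step (45.0) inserted at index 3.
ref_levels = [50.0, 60.0, 55.0, 70.0, 40.0, 65.0]
query_levels = [50.0, 60.0, 55.0, 45.0, 70.0, 40.0, 65.0]
shift_result = best_shift(ref_levels, query_levels, start=3)
```

Larger shifts are scored on fewer overlapping steps, so with real data a minimum overlap would probably be worth enforcing before trusting the result.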
The demonstrated sequencing results between DNA-miR-21 and DNA-miR-21+U have verified that IsomiRs with minor sequence variations at the 3’-end can be clearly resolved using NIPSS. As demonstrated, addition or deletion of one or multiple nucleotides can be immediately detected from the characteristic signal variations and the subsequent pattern shift (Fig. 6c) . More representative data (Fig. 7) and detailed statistics (Fig. 8) are shown in the Supporting Information (SI) . Minor sequence variations within the first 14 nucleotides to the 3’-end of the target miRNAs can be discriminated by following the same strategy. This makes direct miRNA sequencing by NIPSS immediately applicable to discrimination between members with high sequence similarities from the same miRNA family, such as Let-7 [29] .
Table 2: Current values of all 4-nucleotides sequence contexts for isomiRs.
Figure PCTCN2020110836-appb-000003
Figure PCTCN2020110836-appb-000004
Example 4
Detecting m6A modification within miRNA
Emerging evidence has shown that epigenetic modifications in RNA have a profound influence on gene regulation [34]. N6-methyladenosine (m6A), which is the most abundant modification in RNA, plays a critical role in mRNA metabolism [35], cellular functions [34, 36, 37] and miRNA biogenesis [17, 18]. However, due to the chemical and biochemical similarities between adenosine and m6A, precise localization of m6A modifications in natural miRNAs is technically challenging for any next generation sequencing platform. The emergence of MeRIP-seq (methylated RNA immunoprecipitation followed by sequencing) [38, 39] is a major advance for this technical need. However, MeRIP-seq still requires reverse transcription followed by amplification and fails to achieve single nucleotide resolution. Recent developments of direct RNA sequencing using nanopores have successfully demonstrated m6A mapping directly from mRNA samples [25]. Emerging investigations reveal that m6A modifications also naturally exist in miRNAs [17, 18]. However, direct RNA sequencing by nanopores is only applicable to sequencing long stretches of RNA instead of miRNAs, whereas MeRIP-seq fails to achieve single molecule and single nucleotide resolutions.
NIPSS is particularly useful in distinguishing minor sequence variations at the 3’-end of the target miRNAs. To verify whether single m6A modification within a short stretch of miRNA could be identified by NIPSS, a conceptual experiment was designed. The first nucleotide at the 3’-end of the miRNA segment of DNA-miR-21 is tentatively changed from a canonical adenosine to m6A (Fig. 9a) . This new strand is named as DNA-miR-21 (m6A) and is custom synthesized for downstream NIPSS characterization. According to the NIPSS convention, the m6A reaches the pore constriction first (Fig. 9b) , which makes this nucleotide highly resolvable by the current NIPSS configuration and thus becomes an optimum option for a proof of concept demonstration.
Representative NIPSS sequencing traces from DNA-miR-21 (Fig. 9c) and DNA-miR-21 (m6A) (Fig. 9d) show noticeable differences in the current height of the sequencing steps which correspond to the first nucleotide at the 3’-end of the miRNA segment. For a zoomed-in demonstration, only the fraction of the sequencing traces that corresponds to the region of interest is shown (Fig. 9c, d) and the additional raw sequencing events are illustrated in Fig. 10. The means and standard deviations of the extracted current steps of 20 independent NIPSS events from DNA-miR-21 (dark blue) and DNA-miR-21 (m6A) (light red) are summarized and superimposed in Fig. 9e. Due to the limited spatial resolution of the MspA constriction, ~3 nucleotides show significant signal variations between the two analytes due to the introduced m6A modification, whereas other sequencing steps show a clear alignment as the remainder of the sequence is identical. The difference of sequencing step heights between DNA-miR-21 (m6A) and DNA-miR-21 is shown in Fig. 9f, from which the maximum signal variation between the two analytes reaches ~4 pA in the corresponding positions of the signals, due to the introduced chemical modification (Fig. 11).
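A difference trace of the kind described for Fig. 9f can be computed step by step from two aligned profiles of mean step heights. The sketch below is illustrative only: the function name is hypothetical and the values are made up, chosen so that a few neighbouring steps differ while the rest align.

```python
def step_differences(modified, unmodified):
    """Return (differences, peak_index) for two aligned lists of step means (pA).

    differences[i] = modified[i] - unmodified[i]; peak_index is the step with
    the largest absolute difference.
    """
    differences = [m - u for m, u in zip(modified, unmodified)]
    peak_index = max(range(len(differences)), key=lambda i: abs(differences[i]))
    return differences, peak_index

# Made-up step means: three neighbouring steps differ, the remainder align.
unmodified_steps = [50.0, 62.0, 48.0, 55.0, 70.0, 44.0]
modified_steps = [50.0, 63.0, 52.0, 53.0, 70.0, 44.0]
differences, peak_index = step_differences(modified_steps, unmodified_steps)
```

Because several nucleotides occupy the constriction at once, a single modified base would be expected to perturb a short run of adjacent steps rather than one step alone.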
Example 5
Phi29 DNA polymerase (Phi29 DNAP) preparation
The genes coding for wild-type Phi29 DNA polymerase (PDB ID: 1XHX) and the D12A/D66A mutant were custom synthesized by GenScript (New Jersey, USA) and cloned into a pET-30a(+) plasmid. An extra histidine tag was included on the C-terminus of each protein for post-expression purification by nickel affinity chromatography. The two types of phi29 DNAP follow the same purification process. For simplicity, wild-type Phi29 DNA polymerase is abbreviated as Phi29 DNAP-wt, if not otherwise stated. After heat shock transformation, the cells were incubated in LB medium at 37 ℃ for 7 hours. Expression was then induced by the addition of isopropyl β-D-thiogalactoside (IPTG) to a final concentration of 1 mM, and the culture was shaken at 18 ℃ overnight. The cells were harvested by centrifugation at 4 ℃ and 4000 rpm for 20 min and the collected pellets were stored at -80 ℃. The pellets were then re-suspended in the lysis buffer (50 mM NaH2PO4/Na2HPO4, 300 mM NaCl, 10 mM 2-mercaptoethanol, 0.1% Triton X-100, 1 mM EDTA, pH 8.0) and sonicated on ice for 15 min. The lysate was centrifuged at 4 ℃ and 20000 rcf for 60 min and the supernatant was then membrane filtered before being applied to a nickel affinity column (HisTrap™ HP, GE Healthcare). To eliminate protein impurities, the loaded column was first washed with buffer A (50 mM NaH2PO4/Na2HPO4, 300 mM NaCl, 10 mM 2-mercaptoethanol, 0.1% Triton X-100, pH 8.0). Subsequently, the expected protein was eluted with an imidazole gradient formed by mixing buffer A and buffer B (50 mM NaH2PO4/Na2HPO4, 300 mM NaCl, 10 mM 2-mercaptoethanol, 0.1% Triton X-100, 250 mM imidazole, pH 8.0). The fractions containing the target protein were collected and the buffer was exchanged to the storage buffer (100 mM KCl, 10 mM Tris-HCl, 0.1 mM EDTA, 1 mM DTT, 50% (v/v) glycerol, pH 7.5) using an ultrafiltration tube.
An extension assay was performed to verify the replication activity of the purified Phi29 DNAP-wt and phi29 DNAP D12A/D66A using the hairpin primer (sequence: 5’-CGTAAGAGTACGTCCAGCATCGGCGCATGCGGATGCCTTTTGGCATCCGCATGCG-3’). The result was analyzed by 14% polyacrylamide gel electrophoresis.
Purification and characterization of Phi29 DNAP is shown in Fig. 14.
Prospects
The demonstrated miRNA sequencing strategy using NIPSS is currently not without limitations. With the concept of sequencing miRNAs from natural resources by direct miRNA sequencing using NIPSS, the target miRNA strands must be conjugated chemically or biochemically with the DNA drive-strand ahead of the library preparation. An enzymatic ligation strategy has been reported [40] and could be adapted for this purpose.
Briefly, the 5’ phosphorylated DNA drive strand (5PO4 DNA) is first treated with a 5’ DNA adenylation kit (New England Biolabs), assisted with the Mth RNA ligase (Fig. 12). The adenylated DNA drive strand (5AppDNA) was characterized and purified by ethanol precipitation and further ligated to the 3’-end of the target miRNA strands by the T4 RNA ligase 2 truncated K227Q mutant (T4 Rnl2tr). After this ligation, the DNA-miRNA chimeric template can be characterized by electrophoresis on a 15% polyacrylamide gel, where the ligated chimeric template appears as an extra band of higher molecular weight (Fig. 12). In principle, this biochemical conjugation strategy is compatible with sequencing the first 14-15 nucleotides to the 3’-end of any miRNA type. To further extend the read-length of NIPSS to the full length of miRNA, an engineered form of MspA with redundant structures on top of its vestibule could be constructed to extend the phase-shift distance of NIPSS. Alternatively, existing nanopores of larger dimensions such as ClyA [41] and FraC [42] may be adapted to sequence full length miRNA or even its precursor or primary form. Chemical ligations [43, 44] between the 5’-end of target miRNAs and the 5’-end of the DNA drive strand may be carried out to form a reverse chimeric strand with a “head to head” configuration for 5’-end miRNA sequencing by NIPSS.
Conclusion
In summary, the first direct miRNA sequencing has been carried out by NIPSS. Similar strategies could also be adapted to sequence other short non-coding strands, such as siRNA [45] or piRNA [46], avoiding the laborious motor protein engineering for nanopore sequencing [25]. Though demonstrated as a prototype, the nanopore sequencing results in this invention show clear signal discriminations between different sequences, isoforms and epigenetic modifications among synthetic miRNA sequences. MiRNAs from natural sources, such as miRNA extracts from clinical samples, can be conjugated with pre-designed DNA linker strands by performing routine enzymatic ligation to build NIPSS sequencing libraries (Fig. 13). Consequently, direct miRNA sequencing by NIPSS could be directly implemented in clinical diagnosis or could be utilized as a complement to existing miRNA sensing platforms when single base resolution is critical. MiRNA sequencing by NIPSS shares the advantages of other nanopore sequencing technologies, including low cost, single base resolution and portability. In principle, NIPSS, when properly engineered, could also be adapted to commercial nanopore sequencers, such as MinION [25]. Ultimately, high-throughput direct miRNA sequencing by NIPSS can be carried out in optical nanopore chips [25, 47] for low cost, high-throughput and multiplexed miRNA characterizations in a disposable device form.
Methods
1. The construction of miRNA sequencing library.
The sequencing library for NIPSS is thermally annealed from three separate nucleic acid strands: the chimeric template, the primer and the blocker (Fig. 1a, Table 1). These three strands were mixed at a 1:1:2 molar ratio in an aqueous buffer (0.3 M KCl, 10 mM HEPES/KOH, 10 mM MgCl2, 10 mM (NH4)2SO4). Thermal annealing was carried out by incubating the mixture at 95 ℃ for 2 min and then cooling it to 25 ℃ under program control at a rate of -5 ℃/min. For optimum sequencing data production, the thermally annealed sequencing library should be used immediately in subsequent NIPSS measurements.
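The annealing program above (2 min at 95 ℃, then cooling to 25 ℃ at 5 ℃ per minute) can be written out as a list of time and temperature setpoints. This is a sketch with a hypothetical function name; it assumes the ramp length is a whole number of minutes, as it is for the stated values.

```python
def cooling_schedule(start_c=95.0, end_c=25.0, rate_c_per_min=5.0, hold_min=2.0):
    """Return (total_minutes, setpoints) for a hold followed by a linear ramp.

    setpoints is a list of (elapsed_minutes, temperature_c) pairs, one per
    minute of the ramp. Assumes (start_c - end_c) / rate_c_per_min is a
    whole number of minutes.
    """
    ramp_min = (start_c - end_c) / rate_c_per_min
    setpoints = [
        (hold_min + minute, start_c - rate_c_per_min * minute)
        for minute in range(int(ramp_min) + 1)
    ]
    return hold_min + ramp_min, setpoints

total_minutes, setpoints = cooling_schedule()
```

With the default values the ramp takes 14 min, so the whole program lasts 16 min.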
2. The preparation of a biological nanopore.
The MspA mutant (D90N/D91N/D93N/D118R/D134R/E139K) nanopore [48] was expressed in E. coli BL21 (DE3) and purified by nickel affinity chromatography as described previously [26, 49]. This MspA mutant, which is the sole MspA nanopore discussed in this invention, is referred to as MspA unless otherwise stated.
3. NIPSS experiments.
NIPSS experiments were carried out as described previously [26]. Briefly, the electrolyte buffer (0.3 M KCl, 10 mM HEPES/KOH, 10 mM MgCl2, 10 mM (NH4)2SO4 and 4 mM DTT at pH 7.5) was separated by a 1,2-diphytanoyl-sn-glycero-3-phosphocholine (DPhPC) lipid membrane (Avanti Polar Lipids) into cis and trans compartments. Both compartments were in contact with separate Ag/AgCl electrodes connected to an Axopatch 200B patch clamp amplifier (Molecular Devices) to form a circuit, with the cis compartment electrically grounded. Purified MspA nanopores were added to cis and spontaneously inserted into the membrane. With a single pore inserted (Fig. 1b), the sequencing library, dNTPs and phi29 DNAP were added to cis and stirred magnetically to reach final concentrations of 2 nM, 250 μM and 1 nM, respectively. Nanopore sequencing was initiated by holding the applied voltage at +180 mV. All electrophysiology recordings were acquired with a Digidata 1550B digitizer (Molecular Devices) at a 25 kHz sampling rate and low-pass filtered at 5 kHz. All NIPSS experiments were performed at room temperature (22 ± 1 ℃).
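The final concentrations quoted above (2 nM library, 250 μM dNTPs, 1 nM phi29 DNAP) follow from an ordinary dilution calculation. The Python sketch below is illustrative only: the chamber volume (500 μL) and all stock concentrations are assumed values, not taken from this document, and each reagent is treated independently, which ignores the small extra dilution caused by the other additions.

```python
def stock_volume_uL(final_conc, stock_conc, chamber_volume_uL):
    """Volume of stock to add so that the chamber reaches final_conc.

    Solves C_stock * V_add = C_final * (V_chamber + V_add) for V_add,
    so the dilution caused by the added volume itself is accounted for.
    Both concentrations must be given in the same unit.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * chamber_volume_uL / (stock_conc - final_conc)


CHAMBER_UL = 500.0  # assumed cis compartment volume

# name: (target concentration from the text, assumed stock concentration)
reagents = {
    "sequencing library (nM)": (2.0, 1000.0),
    "dNTPs (uM)": (250.0, 10000.0),
    "phi29 DNAP (nM)": (1.0, 500.0),
}

additions = {
    name: stock_volume_uL(target, stock, CHAMBER_UL)
    for name, (target, stock) in reagents.items()
}

for name, vol in additions.items():
    print(f"{name}: add {vol:.2f} uL of stock")
```

In practice the stock concentrations would be chosen so that the added volumes are small relative to the chamber, which keeps the electrolyte composition close to that of the buffer described above.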
4. Data analysis.
All data analysis was performed identically to that in previously reported work using NIPSS [26]. Briefly, trace segments containing NIPSS events were extracted from raw electrophysiology traces. Sequencing steps, which appear as signal plateau transitions within the trace, were extracted by a custom LabVIEW program. The DNA-miRNA chimeric design facilitates further data analysis. The characteristic signal pattern acquired by sequencing the DNA drive strand and the abasic spacer clearly marks the initiation of the subsequent miRNA sequencing signals. To avoid statistical biases, all statistics of sequencing results were taken from NIPSS events with more than 14 nucleotides of coverage for the miRNA segments.
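The two analysis operations described above, extracting plateau transitions and keeping only events with more than 14 nucleotides of miRNA coverage, can be sketched in Python. This is not the custom LabVIEW program used in this work; it is a simple threshold-based level detector substituted for illustration. The function names, the event field `mirna_nt_covered`, the threshold and the synthetic trace are all hypothetical.

```python
from statistics import mean


def find_plateaus(trace, threshold, min_samples=3):
    """Split a current trace into plateaus (candidate sequencing steps).

    A new plateau starts once `min_samples` consecutive points differ from
    the mean of the current plateau by more than `threshold`. Shorter
    excursions are treated as spikes and dropped.
    Returns the mean current of each plateau, in order.
    """
    plateaus = []
    current = [trace[0]]
    pending = []  # points that may belong to the next plateau
    for x in trace[1:]:
        if abs(x - mean(current)) > threshold:
            pending.append(x)
            if len(pending) >= min_samples:
                plateaus.append(mean(current))
                current, pending = pending, []
        else:
            pending = []  # excursion was too short: discard it
            current.append(x)
    plateaus.append(mean(current))
    return plateaus


def events_with_enough_coverage(events, min_nt=14):
    """Keep events whose miRNA coverage exceeds min_nt nucleotides."""
    return [e for e in events if e["mirna_nt_covered"] > min_nt]


# Synthetic, noise-free trace in arbitrary current units (hypothetical data)
trace = [50.0] * 5 + [30.0] * 5 + [42.0] * 5
levels = find_plateaus(trace, threshold=5.0)

# Hypothetical event records
events = [
    {"id": 1, "mirna_nt_covered": 14},
    {"id": 2, "mirna_nt_covered": 18},
]
kept = events_with_enough_coverage(events)
```

On real recordings the threshold and minimum plateau length would have to be tuned to the noise level and to the 25 kHz sampling rate, and a statistically grounded change-point method may be preferable to this fixed threshold.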
Materials
Potassium chloride (KCl), sodium chloride (NaCl), sodium hydrogen phosphate (Na2HPO4) and sodium dihydrogen phosphate (NaH2PO4) were obtained from Aladdin (China). Magnesium chloride (MgCl2) was from Macklin. Ammonium sulfate ((NH4)2SO4) was from Xilong Scientific. 4-(2-Hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) was from Shanghai Yuanye Bio-Technology (China). Ammonium persulfate, kanamycin sulfate, dl-dithiothreitol (DTT), dioxane-free isopropyl-β-D-thiogalactopyranoside (IPTG), N,N,N’,N’-tetramethylethylenediamine (TEMED) and imidazole were from Solarbio (China). Ethylenediaminetetraacetic acid (EDTA), pentane, hexadecane and Genapol X-80 were from Sigma-Aldrich. 1,2-Diphytanoyl-sn-glycero-3-phosphocholine (DPhPC) was from Avanti Polar Lipids. Urea was from BIOSHARP. Acrylamide was from Sangon Biotech. The Low Range ssRNA Ladder (#N0364S), microRNA Marker (#N2102S), RNA loading dye (#B0363S), Nuclease-free Water (#B1500S), T4 RNA ligase 2 truncated K227Q mutant (T4 Rnl2tr; #M0242S), phi29 DNA Polymerase (#M0269S), deoxynucleotide (dNTP) solution mix (#N0447S) and the 5’ DNA Adenylation kit (#E2610S) were from New England Biolabs. E. coli strain BL21 (DE3) was from Biomed (China). Luria-Bertani (LB) agar and LB broth were from Hopebio (China).
All HPLC-purified DNA oligonucleotides, including DNA-miRNA template strands, primer, blocker, DNA linker and miR-21+U (Table 1), were custom synthesized by Genscript (New Jersey, USA).
The MspA mutant (D90N/D91N/D93N/D118R/D134R/E139K) nanopore was expressed in E. coli BL21 (DE3) and purified by nickel affinity chromatography as described previously [49]. This MspA mutant is abbreviated as MspA in this invention.
References
1. Kim, J., Yao, F., Xiao, Z., Sun, Y. & Ma, L. MicroRNAs and metastasis: small RNAs play big roles. Cancer Metastasis Rev 37, 5-15 (2018).
2. Mehta, A. & Baltimore, D. MicroRNAs as regulatory elements in immune system logic. Nature Reviews Immunology 16, 279-294 (2016).
3. Bracken, C.P., Scott, H.S. & Goodall, G.J. A network-biology perspective of microRNA function and dysfunction in cancer. Nature Reviews Genetics 17, 719-732 (2016).
4. Nassar, F.J., Nasr, R. & Talhouk, R. MicroRNAs as biomarkers for early breast cancer diagnosis, prognosis and therapy prediction. Pharmacol Ther 172, 34-49 (2017).
5. He, Y. et al. Current State of Circulating MicroRNAs as Cancer Biomarkers. Clin Chem 61, 1138-1155 (2015).
6. Teng, Y. et al. MVP-mediated exosomal sorting of miR-193a promotes colon cancer progression. Nature Communications 8, 14448 (2017).
7. Pauley, K.M., Cha, S. & Chan, E.K.L. MicroRNA in autoimmunity and autoimmune diseases. Journal of Autoimmunity 32, 189-194 (2009).
8. Hatziapostolou, M. et al. An HNF4α-miRNA Inflammatory Feedback Circuit Regulates Hepatocellular Oncogenesis. Cell 147, 1233-1247 (2011).
9. Pritchard, C.C., Cheng, H.H. & Tewari, M. MicroRNA profiling: approaches and considerations. Nature Reviews Genetics 13, 358-369 (2012).
10. Wu, H. et al. Label-free and enzyme-free colorimetric detection of microRNA by catalyzed hairpin assembly coupled with hybridization chain reaction. Biosens Bioelectron 81, 303-308 (2016).
11. Xu, Q., Ma, F., Huang, S.-q., Tang, B. & Zhang, C.-y. Nucleic Acid Amplification-Free Bioluminescent Detection of MicroRNAs with High Sensitivity and Accuracy Based on Controlled Target Degradation. Analytical Chemistry 89, 7077-7083 (2017).
12. Li, B. et al. Two-stage cyclic enzymatic amplification method for ultrasensitive electrochemical assay of microRNA-21 in the blood serum of gastric cancer patients. Biosens Bioelectron 79, 307-312 (2016).
13. Kilic, T., Erdem, A., Ozsoz, M. & Carrara, S. microRNA biosensors: Opportunities and challenges among conventional and commercially available techniques. Biosens Bioelectron 99, 525-546 (2018).
14. Dong, H. et al. MicroRNA: function, detection, and bioanalysis. Chem. Rev 113, 6207-6233 (2013).
15. Boele, J. et al. PAPD5-mediated 3' adenylation and subsequent degradation of miR-21 is disrupted in proliferative disease. Proc Natl Acad Sci U S A 111, 11467-11472 (2014).
16. Wu, X. et al. Comprehensive expression analysis of miRNA in breast cancer at the miRNA and isomiR levels. Gene 557, 195-200 (2015).
17. Alarcon, C.R., Lee, H., Goodarzi, H., Halberg, N. & Tavazoie, S.F. N6-methyladenosine marks primary microRNAs for processing. Nature 519, 482-485 (2015).
18. Berulava, T., Rahmann, S., Rademacher, K., Klein-Hitpass, L. & Horsthemke, B. N6-adenosine methylation in MiRNAs. PLoS One 10, e0118438 (2015).
19. Creighton, C.J., Reid, J.G. & Gunaratne, P.H. Expression profiling of microRNAs by deep sequencing. Brief Bioinform 10, 490-497 (2009).
20. Ozsolak, F. & Milos, P.M. RNA sequencing: advances, challenges and opportunities. Nature Reviews Genetics 12, 87-98 (2011).
21. Rupaimoole, R. & Slack, F.J. MicroRNA therapeutics: towards a new era for the management of cancer and other diseases. Nature Reviews Drug Discovery 16, 203-222 (2017).
22. Li, Z. & Rana, T.M. Therapeutic targeting of microRNAs: current status and future challenges. Nature Reviews Drug Discovery 13, 622-638 (2014).
23. Laszlo, A.H. et al. Decoding long nanopore sequencing reads of natural DNA. Nature Biotechnology 32, 829-833 (2014).
24. Manrao, E.A. et al. Reading DNA at single-nucleotide resolution with a mutant MspA nanopore and phi29 DNA polymerase. Nature Biotechnology 30, 349-353 (2012).
25. Garalde, D.R. et al. Highly parallel direct RNA sequencing on an array of nanopores. Nature Methods 15, 201-206 (2018).
26. Yan, S. et al. Direct sequencing of 2′-deoxy-2′-fluoroarabinonucleic acid (FANA) using nanopore-induced phase-shift sequencing (NIPSS). Chemical Science 10, 3110-3117 (2019).
27. Cherf, G.M. et al. Automated forward and reverse ratcheting of DNA in a nanopore at 5-Å precision. Nature Biotechnology 30, 344-348 (2012).
28. Costa, P.M. et al. MiRNA-21 silencing mediated by tumor-targeted nanoparticles combined with sunitinib: A new multimodal gene therapy approach for glioblastoma. J Control Release 207, 31-39 (2015).
29. Koppers-Lalic, D. et al. Nontemplated nucleotide additions distinguish the small RNA composition in cells from exosomes. Cell Rep 8, 1649-1658 (2014).
30. Ibrahim, F. et al. Uridylation of mature miRNAs and siRNAs by the MUT68 nucleotidyltransferase promotes their degradation in Chlamydomonas. Proc Natl Acad Sci U S A 107, 3906-3911 (2010).
31. Morin, R.D. et al. Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells. Genome Research 18, 610-621 (2008).
32. Kaushik, A., Saraf, S., Mukherjee, S.K. & Gupta, D. miRMOD: a tool for identification and analysis of 5' and 3' miRNA modifications in Next Generation Sequencing small RNA data. PeerJ 3, e1332 (2015).
33. Koppers-Lalic, D. et al. Non-invasive prostate cancer detection by measuring miRNA variants (isomiRs) in urine extracellular vesicles. Oncotarget 7, 22566 (2016).
34. Zhao, B.S., Roundtree, I.A. & He, C. Post-transcriptional gene regulation by mRNA modifications. Nature Reviews Molecular Cell Biology 18, 31-42 (2017).
35. Dai, D., Wang, H., Zhu, L., Jin, H. & Wang, X. N6-methyladenosine links RNA metabolism to cancer progression. Cell Death & Disease 9, 124 (2018).
36. Roundtree, I.A., Evans, M.E., Pan, T. & He, C. Dynamic RNA Modifications in Gene Expression Regulation. Cell 169, 1187-1200 (2017).
37. Lence, T. et al. m(6)A modulates neuronal functions and sex determination in Drosophila. Nature 540, 242-247 (2016).
38. Dominissini, D. et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature 485, 201-206 (2012).
39. Meyer, K.D. et al. Comprehensive analysis of mRNA methylation reveals enrichment in 3' UTRs and near stop codons. Cell 149, 1635-1646 (2012).
40. Lee, J.E. & Yi, R. Highly efficient ligation of small RNA molecules for microRNA quantitation by high-throughput sequencing. J. Vis. Exp 93, 1-7 (2014).
41. Soskine, M. et al. An engineered ClyA nanopore detects folded target proteins by selective external association and pore entry. Nano Lett 12, 4895-4900 (2012).
42. Huang, G., Willems, K., Soskine, M., Wloka, C. & Maglia, G. Electro-osmotic capture and ionic discrimination of peptide and protein biomarkers with FraC nanopores. Nature Communications 8, 935 (2017).
43. Paredes, E., Evans, M. & Das, S.R. RNA labeling, conjugation and ligation. Methods 54, 251-259 (2011).
44. Vogel, H. & Richert, C. Labeling small RNAs through chemical ligation at the 5' terminus: enzyme-free or combined with enzymatic 3'-labeling. Chembiochem 13, 1474-1482 (2012).
45. Jones, M.R. et al. Zcchc11-dependent uridylation of microRNA directs cytokine expression. Nature Cell Biology 11, 1157-1163 (2009).
46. Siomi, M.C., Sato, K., Pezic, D. & Aravin, A.A. PIWI-interacting small RNAs: the vanguard of genome defence. Nature Reviews Molecular Cell Biology 12, 246 (2011).
47. Huang, S., Romero-Ruiz, M., Castell, O.K., Bayley, H. & Wallace, M.I. High-throughput optical sensing of nucleic acids in a nanopore array. Nature Nanotechnology 10, 986-991 (2015).
48. Butler, T.Z., Pavlenok, M., Derrington, I.M., Niederweis, M. & Gundlach, J.H. Single-molecule DNA detection with an engineered MspA protein nanopore. Proc. Natl. Acad. Sci. USA 105, 20647-20652 (2008).
49. Wang, Y. et al. Osmosis-Driven Motion-Type Modulation of Biological Nanopores for Parallel Optical Nucleic Acid Sensing. ACS Applied Materials & Interfaces 10, 7788-7797 (2018).

Claims (58)

  1. Method of determining the sequence of a short RNA, the method including:
    providing two compartments separated by an interface; wherein each of the two compartments comprises a liquid medium and the interface has a channel so dimensioned as to allow passage of only one single-strand polynucleotide at a time;
    providing a DNA-short RNA chimeric substrate comprising a DNA-short RNA chimeric single-strand and at least one DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand, wherein the chimeric single-strand comprises the short RNA conjugated with a DNA;
    providing an enzyme that can process DNA in a nucleotide by nucleotide way;
    introducing the chimeric substrate and the enzyme into one of the two compartments and allowing the enzyme to bind to chimeric substrate;
    applying an electric potential difference between the two compartments, thereby causing the single-stranded portion of the polynucleotide complex to translocate through the channel to the other side;
    allowing the enzyme to process the DNA segment of the chimeric substrate, thereby driving the single-stranded portion of the polynucleotide complex to translocate through the channel in a nucleotide by nucleotide way; and
    measuring the change of the ionic current through the channel during the translocation, thereby determining the sequence of the short RNA.
  2. The method according to claim 1, wherein the short RNA is derived from miRNA, siRNA or piRNA.
  3. The method according to claim 1 or 2, wherein the DNA-short RNA chimeric single-strand is prepared by following steps: the 5’ phosphorylated DNA single-strand is first 5’ adenylated, assisted with the Mth RNA ligase; then the adenylated DNA is purified by ethanol precipitation and further ligated to the 3’ -end of the short RNA by RNA ligase.
  4. The method according to any one of claims 1-3, wherein in the DNA-short RNA chimeric single-strand, the DNA segment and the short RNA are separated by a spacer.
  5. The method according to claim 4, wherein the spacer is an abasic spacer.
  6. The method according to any one of claims 1-5, wherein the enzyme is DNA polymerase; the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand, a primer and a blocking oligomer; the primer is hybridized with the 3'end of the DNA segment of the chimeric single-strand; the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand adjacent to the 3'end of the primer; the short RNA is adjacent to the 5'end of the DNA segment in the DNA-short RNA chimeric single-strand; and the method includes:
    introducing the chimeric substrate, the polymerase, divalent cation and dNTPs into one of the two compartments and allowing the polymerase to bind to chimeric substrate;
    applying an electric potential difference between the two compartments, thereby causing translocation of the single-stranded portion of the chimeric substrate through the channel and voltage-driven unzipping of the blocking oligomer, wherein the complete unzipping of the blocking oligomer initiates the polymerase-driven primer extension so that the chimeric substrate moves against the electric field force; and
    measuring the change of the ionic current through the channel during the translocation, thereby determining the sequence of the short RNA;
    preferably, the DNA polymerase is phi29 DNA polymerase or variants thereof; more preferably, the DNA polymerase is wild-type phi29 DNA polymerase, or D12A/D66A mutant of wild-type phi29 DNA polymerase.
  7. The method according to claim 6, wherein the 5'end of the DNA segment is conjugated with the 5'end or the 3'end of the short RNA.
  8. The method according to claim 6 or 7, wherein the 3'end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
  9. The method according to claim 8, wherein the 3'end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
  10. The method according to any one of claims 6-9, wherein a cholesterol molecule is linked to the 3'end of the blocking oligomer.
  11. The method according to any one of claims 6-10, wherein the method includes:
    mixing the chimeric single-strand with the primer and the blocking oligomer to form the chimeric substrate, incubating the chimeric substrate with the polymerase to form a complex, and introducing the complex and dNTPs into one of the two compartments.
  12. The method according to any one of claims 6-11, wherein the DNA segment of the chimeric single-strand contains sequence repeats to generate a unique signal pattern during primer extension.
  13. The method according to claim 12, wherein the sequence repeats include sequence repeats between "AAGA" and "TTTC" from 3'to 5'.
  14. The method according to any one of claims 1-5, wherein the enzyme is a DNA helicase; the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand and a blocking oligomer; the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand; and the method includes:
    introducing the chimeric substrate and the helicase into one of the two compartments and allowing the helicase to bind to chimeric substrate;
    applying an electric potential difference between the two compartments, thereby causing translocation of the single-stranded portion of the chimeric substrate through the channel, wherein unzipping the blocking oligomer by the helicase drives the short RNA to translocate through the channel in a nucleotide by nucleotide way;
    measuring the change of the ionic current through the channel during the translocation, thereby determining the sequence of the short RNA;
    preferably, the DNA helicase is DNA Helicase HEL308 or variants thereof.
  15. The method according to claim 14, wherein at least one end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
  16. The method according to claim 15, wherein the unpaired end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
  17. The method according to any one of claims 14-16, wherein a cholesterol molecule is linked to at least one end of the blocking oligomer.
  18. The method according to any one of claims 14-17, wherein the method includes:
    mixing the chimeric single-strand with the blocking oligomer to form the chimeric substrate, incubating the chimeric substrate with the helicase to form a complex, and introducing the complex into one of the two compartments.
  19. The method according to any one of claims 1-18, wherein the channel is a protein nanopore.
  20. The method according to claim 19, wherein the protein nanopore is MspA, CsgG, ClyA or FraC, or variants thereof.
  21. The method according to claim 20, wherein the protein nanopore is a mutant MspA comprising the mutations of D90N/D91N/D93N/D118R/D134R/E139K compared to the wild-type MspA.
  22. The method according to any one of claims 1-21, wherein the DNA-short RNA chimeric substrate is a plurality of DNA-short RNA chimeric substrates, wherein different DNA-short RNA chimeric substrate comprises different short RNA.
  23. The method according to claim 22, wherein different DNA-short RNA chimeric substrate comprises the same DNA segment.
  24. A sequencing library constructed by thermal annealing from a plurality of DNA-short RNA chimeric single-strands and a DNA fragment; wherein different DNA-short RNA chimeric single-strand comprises the same DNA segment and different short RNA, and the DNA fragment is complementary to at least part of the DNA segment of the chimeric single-strand.
  25. The sequencing library according to claim 24, wherein the RNA is derived from miRNA, siRNA or piRNA.
  26. The sequencing library according to claim 24 or 25, wherein in the DNA-short RNA chimeric single-strand, the DNA segment and the short RNA are separated by a spacer.
  27. The sequencing library according to claim 26, wherein the spacer is an abasic spacer.
  28. The sequencing library according to any one of claims 24-27, wherein the sequencing library is constructed by thermal annealing from a plurality of DNA-short RNA chimeric single-strands, a primer and a blocking oligomer; wherein the primer is hybridized with the 3'end of the DNA segment of the chimeric single-strand; the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand adjacent to the 3'end of the primer; and the short RNA is adjacent to the 5'end of the DNA segment in the DNA-short RNA chimeric single-strand.
  29. The sequencing library according to claim 28, wherein the 5'end of the DNA segment is conjugated with the 5'end or the 3'end of the short RNA.
  30. The sequencing library according to claim 28 or 29, wherein the 3'end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
  31. The sequencing library according to claim 30, wherein the 3'end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
  32. The sequencing library according to any one of claims 28-31, wherein a cholesterol molecule is linked to the 3'end of the blocking oligomer.
  33. The sequencing library according to any one of claims 28-32, wherein the DNA segment of the chimeric single-strand contains sequence repeats to generate a unique signal pattern during primer extension.
  34. The sequencing library according to claim 33, wherein the sequence repeats include sequence repeats between "AAGA" and "TTTC" from 3'to 5'.
  35. The sequencing library according to any one of claims 24-27, wherein the sequencing library is constructed by thermal annealing from a plurality of DNA-short RNA chimeric single-strands and a blocking oligomer; wherein the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand.
  36. The sequencing library according to claim 35, wherein at least one end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
  37. The sequencing library according to claim 36, wherein the unpaired end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
  38. The sequencing library according to any one of claims 35-37, wherein a cholesterol molecule is linked to at least one end of the blocking oligomer.
  39. System for determining the sequence of a short RNA, the system comprising:
    two compartments separated by an interface; wherein each of the two compartments comprises a liquid medium and the interface has a channel so dimensioned as to allow passage of only one single-strand polynucleotide at a time;
    an enzyme and a DNA-short RNA chimeric substrate in one of the two compartments; wherein the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand and at least one DNA fragment complementary to at least part of the DNA segment of the chimeric single-strand, the chimeric single-strand comprises the short RNA conjugated with a DNA, and the enzyme can process DNA in a nucleotide by nucleotide way.
  40. The system according to claim 39, wherein the short RNA is derived from miRNA, siRNA or piRNA.
  41. The system according to claim 39 or 40, wherein in the DNA-short RNA chimeric single-strand, the DNA segment and the short RNA are separated by a spacer.
  42. The system according to claim 41, wherein the spacer is an abasic spacer.
  43. The system according to any one of claims 39-42, wherein the system comprises a DNA polymerase, divalent cation, dNTPs and a DNA-short RNA chimeric substrate in one of the two compartments; wherein the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand, a primer and a blocking oligomer; the primer is hybridized with the 3'end of the DNA segment of the chimeric single-strand; the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand adjacent to the 3'end of the primer; the short RNA is adjacent to the 5'end of the DNA segment in the DNA-short RNA chimeric single-strand.
  44. The system according to claim 43, wherein the 5'end of the DNA segment is conjugated with the 5'end or the 3'end of the short RNA.
  45. The system according to claim 43 or 44, wherein the 3'end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
  46. The system according to claim 45, wherein the 3'end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
  47. The system according to any one of claims 43-46, wherein a cholesterol molecule is linked to the 3'end of the blocking oligomer.
  48. The system according to any one of claims 43-47, wherein the DNA segment contains sequence repeats to generate a unique signal pattern during primer extension.
  49. The system according to claim 48, wherein the sequence repeats include sequence repeats between "AAGA" and "TTTC" from 3'to 5'.
  50. The system according to any one of claims 39-42, wherein the system comprises a DNA helicase, ATP, and a DNA-short RNA chimeric substrate in one of the two compartments; wherein the DNA-short RNA chimeric substrate comprises a DNA-short RNA chimeric single-strand and a blocking oligomer; the blocking oligomer is hybridized with the DNA segment of the chimeric single-strand.
  51. The system according to claim 50, wherein at least one end of the blocking oligomer is not paired with the DNA segment of the chimeric template.
  52. The system according to claim 51, wherein the unpaired end of the blocking oligomer has 1-10 nucleotides mismatched with the chimeric template or has 1-10 abasic residues.
  53. The system according to any one of claims 50-52, wherein a cholesterol molecule is linked to at least one end of the blocking oligomer.
  54. The system according to any one of claims 39-53, wherein the channel is a protein nanopore.
  55. The system according to claim 54, wherein the protein nanopore is MspA, CsgG, ClyA or FraC, or variants thereof.
  56. The system according to claim 55, wherein the protein nanopore is a mutant MspA comprising the mutations of D90N/D91N/D93N/D118R/D134R/E139K compared to the wild-type MspA.
  57. The system according to any one of claims 39-56, wherein the DNA-short RNA chimeric substrate is a plurality of DNA-short RNA chimeric substrates, wherein different DNA-short RNA chimeric substrate comprises different short RNA.
  58. The system according to claim 57, wherein different DNA-short RNA chimeric substrate comprises the same DNA segment.
PCT/CN2020/110836 2019-08-23 2020-08-24 Direct microrna sequencing using enzyme assisted sequencing WO2021036995A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CNPCT/CN2019/102162 2019-08-23
CN2019102162 2019-08-23

Publications (1)

Publication Number Publication Date
WO2021036995A1 (en) 2021-03-04

Family

ID=74683776

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/110836 WO2021036995A1 (en) 2019-08-23 2020-08-24 Direct microrna sequencing using enzyme assisted sequencing

Country Status (1)

Country Link
WO (1) WO2021036995A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023116575A1 (en) * 2021-12-21 2023-06-29 成都齐碳科技有限公司 Adapter for characterizing target polynucleotide, method, and use thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012164270A1 (en) * 2011-05-27 2012-12-06 Oxford Nanopore Technologies Limited Coupling method
WO2013016486A1 (en) * 2011-07-27 2013-01-31 The Board Of Trustees Of The University Of Illinois Nanopore sensors for biomolecular characterization
WO2015042708A1 (en) * 2013-09-25 2015-04-02 Bio-Id Diagnostic Inc. Methods for detecting nucleic acid fragments
CN106661631A (en) * 2014-06-06 2017-05-10 康奈尔大学 Method for identification and enumeration of nucleic acid sequence, expression, copy, or dna methylation changes, using combined nuclease, ligase, polymerase, and sequencing reactions
CN107109489A (en) * 2014-10-17 2017-08-29 牛津纳米孔技术公司 Nano-pore RNA characterizing methods
WO2019149692A1 (en) * 2018-01-30 2019-08-08 Immunovia Ab Methods, conjugates and systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YAN,S.H. ET AL.: "Direct sequencing of 2'-deoxy-2'-fluoroarabinonucleic acid (FANA) using nanopore induced phase-shift sequencing (NIPSS)", CHEMICAL SCIENCE, vol. 10, 23 January 2019 (2019-01-23), pages 3110 - 3117, XP055601173, DOI: 20201112082702Y *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20857159

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20857159

Country of ref document: EP

Kind code of ref document: A1