CN118241319A - Preparation method of sequencing library

Publication number
CN118241319A
Authority
CN
China
Prior art keywords
polynucleotide
sequencing
analyte
binding protein
polynucleotide binding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211661600.3A
Other languages
Chinese (zh)
Inventor
刘先宇
肖宓
常馨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qitan Technology Ltd Beijing
Original Assignee
Qitan Technology Ltd Beijing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qitan Technology Ltd Beijing
Priority to CN202211661600.3A
Priority to PCT/CN2023/136713 (WO2024131530A1)
Publication of CN118241319A

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biochemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biophysics (AREA)
  • Biotechnology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention relates to a method of preparing a sequencing library, the method comprising simultaneously mixing a sequencing adapter, a polynucleotide binding protein and an analyte, whereby the sequencing adapter is ligated to the analyte to form the sequencing library. The method does not require an adapter complex to be prepared in advance and is simple to perform.

Description

Preparation method of sequencing library
Technical field
The application belongs to the technical field of biological analysis and detection, and particularly relates to a preparation method of a sequencing library.
Background
There is a need for rapid and inexpensive techniques for sequencing and identifying analytes such as polynucleotides (e.g., DNA or RNA) across a wide range of applications. Existing techniques are slow and expensive, mainly because they rely on amplification to produce large amounts of polynucleotide and require large amounts of specialized fluorescent chemicals for signal detection.
Transmembrane pores (nanopores) have great potential as direct, electrical biosensors for polymers and a variety of small molecules. In particular, nanopores have received much attention as a potential DNA sequencing technology.
When an electrical potential is applied across a nanopore, the current changes when an analyte such as a nucleotide resides transiently in the barrel of the pore. Nanopore detection of analytes such as nucleotides produces a current change of known signature and duration. In the strand sequencing method, an analyte such as a single polynucleotide strand passes through the pore, enabling its nucleotides to be identified. Strand sequencing may include the use of a polynucleotide binding protein to control the movement of the analyte, such as a polynucleotide, through the pore.
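The detection principle described above, a transient change in current while an analyte resides in the pore, can be illustrated with a minimal threshold-based sketch. This is not the signal processing used in the application; the function name, the 30% drop threshold, the minimum event length and the toy trace are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def find_blockade_events(
    current_pa: Sequence[float],
    open_pore_pa: float,
    drop_fraction: float = 0.3,   # assumed: a 30% drop counts as a blockade
    min_samples: int = 3,         # assumed: ignore dips shorter than 3 samples
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of runs where the current stays
    below open_pore_pa * (1 - drop_fraction); end is exclusive."""
    threshold = open_pore_pa * (1.0 - drop_fraction)
    events: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(current_pa):
        if value < threshold:
            if start is None:
                start = i                      # a candidate event begins
        else:
            if start is not None and i - start >= min_samples:
                events.append((start, i))      # keep events that last long enough
            start = None
    # Close an event that runs to the end of the trace.
    if start is not None and len(current_pa) - start >= min_samples:
        events.append((start, len(current_pa)))
    return events


# Toy trace in pA: open pore at 100 pA, one 4-sample blockade, one 2-sample dip.
trace = [100.0] * 5 + [60.0] * 4 + [100.0] * 3 + [55.0] * 2 + [100.0] * 4
events = find_blockade_events(trace, open_pore_pa=100.0)
```

With the toy trace, only the four-sample blockade is reported; the two-sample dip is rejected as too short. Real traces are noisy and would need filtering and level segmentation beyond this sketch.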
In existing nanopore sequencing technology, a polynucleotide binding protein and a sequencing adapter are first assembled into a sequencing adapter complex, and the complex is then ligated to the sequence to be tested using a ligase, thereby forming a sequencing library for subsequent sequencing.
The prior art therefore requires a sequencing adapter complex to be prepared. The complex is generally a protein-nucleic acid complex and may have relatively low stability during storage and transport; for example, the enzyme readily dissociates from the adapter complex during long-term storage. A certain proportion of the raw materials is also lost during preparation of the complex. Furthermore, prior art methods of preparing sequencing libraries are complex to perform.
Disclosure of Invention
The inventors of the present invention have found for the first time that, when preparing a sequencing library, a library ready for sequencing can be obtained by directly mixing a polynucleotide binding protein, a sequencing adapter, the sequence to be tested, and so on, without preparing a corresponding adapter complex. This library construction method does not require the adapter complex to be prepared in advance and is simple to perform.
Accordingly, a first aspect of the invention relates to a method of preparing a sequencing library, the method comprising:
(a) providing a sequencing adapter, a polynucleotide binding protein, and an analyte; and
(b) mixing the sequencing adapter, the polynucleotide binding protein, and the analyte, thereby ligating to form a sequencing library in which the sequencing adapter is ligated to the analyte and the polynucleotide binding protein remains bound to the sequencing adapter.
Preferably, prior to step (b), the polynucleotide binding protein is not bound to the sequencing adapter.
Preferably, at the end of step (b), the analyte is ligated to the sequencing adapter and the polynucleotide binding protein is bound to, and stalled on, the sequencing adapter.
Preferably, the mixing is performed in the same system and/or the mixing is simultaneous.
In some embodiments, the sequencing adapter comprises a binding region for binding to a polynucleotide binding protein. Preferably, the binding region comprises a single stranded polynucleotide, wherein the polynucleotide binding protein has the ability to control the translocation of a polynucleotide and has a reduced ability to unwind a double stranded polynucleotide.
In some embodiments, the sequencing adapter comprises a binding region for binding a polynucleotide binding protein and a spacer for stalling the polynucleotide binding protein. Preferably, the binding region comprises a single stranded polynucleotide and the spacer comprises any molecule or combination of molecules that stalls one or more polynucleotide binding proteins.
In some embodiments, the sequencing adapter comprises a blocking band that both binds and stalls the polynucleotide binding protein. Preferably, the blocking band comprises a single-stranded nucleotide sequence formed by linking a plurality of nucleotides, the nucleobases of which are benzo structures.
Preferably, the ligation of the sequencing adapter to the analyte is performed in a ligation system, which is an enzymatic ligation system or a click chemistry ligation system.
In some embodiments, the ligation employs an enzymatic ligation system comprising a ligase.
Preferably, the ligase is an enzyme that uses an NTP, preferably ATP, as a substrate. More preferably, the ligase is selected from T4 DNA ligase, E. coli DNA ligase, Taq DNA ligase, Tma DNA ligase, 9°N DNA ligase, or any combination thereof; T4 DNA ligase is most preferred.
Preferably, if intramolecular covalent cross-linking is performed in the form of disulfide bonds, the ligation system further comprises an intramolecular cross-linking agent, preferably selected from TMAD, oxygen, GSSG (oxidized glutathione) or any combination thereof. In the ligation system, the concentration of the intramolecular cross-linking agent is 0 to 1 M, preferably 1 μM to 500 mM, more preferably 10 μM to 1 mM. Preferably, the concentration of the intramolecular cross-linking agent is 10, 20, 50, 100, 125, 150 or 200 μM.
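The nested concentration ranges stated above (0 to 1 M, preferably 1 μM to 500 mM, more preferably 10 μM to 1 mM) mix three units, which makes them easy to misread. The following sketch converts a value to molar and reports the narrowest stated range it falls into. The function names, the unit spellings ("uM" for μM) and the tier labels are assumptions for illustration only.

```python
# Conversion factors to molar; "uM" stands in for μM (assumed spelling).
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6}


def to_molar(value: float, unit: str) -> float:
    """Convert a concentration to mol/L."""
    return value * UNIT_TO_MOLAR[unit]


# Ranges from the text, narrowest first, as (label, low, high) in mol/L.
CROSSLINKER_RANGES = [
    ("more preferred", to_molar(10, "uM"), to_molar(1, "mM")),
    ("preferred", to_molar(1, "uM"), to_molar(500, "mM")),
    ("stated range", 0.0, 1.0),
]


def classify_crosslinker(value: float, unit: str) -> str:
    """Return the label of the narrowest stated range containing the value."""
    concentration = to_molar(value, unit)
    for label, low, high in CROSSLINKER_RANGES:
        if low <= concentration <= high:
            return label
    return "outside stated range"


tier_100um = classify_crosslinker(100, "uM")   # one of the listed example values
tier_5um = classify_crosslinker(5, "uM")
tier_800mm = classify_crosslinker(800, "mM")
```

All of the listed example concentrations (10 to 200 μM) lie at or inside the narrowest range; values exactly on a range boundary may be affected by floating-point rounding in this sketch.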
Preferably, the enzymatic ligation system further comprises NTP, preferably ATP.
Preferably, the enzymatic ligation system further comprises a metal cation that promotes enzymatic activity, preferably Mg2+.
In some embodiments, the ligation employs a click chemistry ligation system, and the sequencing adapter is ligated to the analyte by a click chemistry reaction. Click chemistry rapidly accomplishes the chemical synthesis of diverse molecules by joining small units together. In a preferred embodiment, a first linking group is attached to the sequencing adapter and a second linking group is attached to the analyte, and a click chemistry reaction between the first linking group and the second linking group rapidly attaches the analyte to the sequencing adapter. The first linking group and the second linking group may be joined by a carbon-carbon multiple bond addition reaction, a nucleophilic ring-opening reaction, a cycloaddition reaction, or the like. For example, one of the first and second linking groups may be any one of trans-cyclooctene (TCO), dibenzocyclooctyne (DBCO), difluorinated cyclooctyne (DIFO), bicyclononyne (BCN), or dibenzocyclooctyne (DICO); the other may be azido (N3), tetrazinyl (TZ), etc. Those skilled in the art will appreciate that the present embodiment does not limit which specific click chemistry reactive groups serve as the first linking group and the second linking group.
Preferably, the polynucleotide binding protein is selected from any one or a combination of two or more of a polymerase, a helicase, an exonuclease, and a topoisomerase.
Preferably, the polynucleotide binding protein is a helicase, more preferably, the helicase is (a) a Hel308 helicase, a RecD helicase, an XPD helicase, or a Dda helicase; (b) is derived from any one of the helicases of (a); or any combination of the helicases described in (a) and/or (b).
Preferably, the sequencing adapter is an adapter formed by linking a plurality of nucleotides, and serves to bind the polynucleotide binding protein and to be ligated to the analyte. The sequencing adapter may be a Y-adapter; the Y-adapter comprises a leader sequence that preferentially threads into the pore. The sequencing adapter may also be a bridging moiety; the bridging moiety is a hairpin.
Preferably, the sequencing adapter comprises an anchor capable of coupling to a membrane.
Preferably, the analyte is selected from a polynucleotide, polypeptide, lipid or polysaccharide, preferably a polynucleotide, which is a fully double stranded polynucleotide, a partially double stranded polynucleotide or a single stranded polynucleotide.
When the analyte is a polynucleotide, the sequencing adapter is directly linked to the polynucleotide during preparation of the polynucleotide sequencing library, thereby forming the polynucleotide sequencing library.
When the analyte is a polypeptide, in the preparation of the polypeptide sequencing library, the polypeptide is first linked to a nucleic acid to obtain a nucleic acid-polypeptide conjugate, and a sequencing adapter is then ligated to the nucleic acid-polypeptide conjugate, thereby forming the polypeptide sequencing library.
A second aspect of the invention relates to a construct prepared by a method according to the first aspect of the invention, the construct comprising: the sequencing adapter, the polynucleotide binding protein, and the analyte, the sequencing adapter being linked to the analyte, and the polynucleotide binding protein remaining bound to the sequencing adapter.
A third aspect of the invention relates to a method of characterizing an analyte, the method comprising:
(i) performing the method of preparing a sequencing library according to the first aspect of the invention, thereby forming a sequencing library or construct;
(ii) contacting the sequencing library or construct formed in step (i) with a transmembrane pore such that the polynucleotide binding protein controls movement of the analyte relative to the transmembrane pore; and
(iii) obtaining one or more measurements as the analyte moves relative to the transmembrane pore, wherein the measurements represent one or more characteristics of the analyte, thereby characterizing the analyte.
Preferably, the transmembrane pore is a protein pore or a solid state pore; more preferably, the protein pore is derived from Msp, α-hemolysin (α-HL), cytolysin, CsgG, ClyA, Sp1 or FraC.
Preferably, the membrane is an amphiphilic layer or a solid state layer; more preferably, the amphiphilic layer is a lipid bilayer.
A fourth aspect of the invention relates to a kit for preparing a sequencing library, comprising an individually packaged sequencing adapter, an individually packaged polynucleotide binding protein and, optionally, an individually packaged ligation system.
A fifth aspect of the invention relates to a kit for characterizing an analyte, comprising an individually packaged sequencing adapter, an individually packaged polynucleotide binding protein and, optionally, an individually packaged ligation system. More preferably, the kit further comprises a transmembrane pore and a membrane.
Preferably, the polynucleotide binding protein is selected from any one or a combination of two or more of a polymerase, a helicase, an exonuclease and a topoisomerase; more preferably, the polynucleotide binding protein is a helicase which is (a) a Hel308 helicase, a RecD helicase, an XPD helicase, or a Dda helicase; (b) is derived from any one of the helicases of (a); or any combination of the helicases described in (a) and/or (b). The ligation system further comprises NTP, preferably ATP, and a metal cation that promotes enzymatic activity, preferably Mg2+. The ligase is selected from T4 DNA ligase, E. coli DNA ligase, Taq DNA ligase, Tma DNA ligase, 9°N DNA ligase or any combination thereof; more preferably T4 DNA ligase. The intramolecular cross-linking agent is selected from TMAD, oxygen, GSSG (oxidized glutathione) or any combination thereof. The transmembrane pore is a protein pore or a solid state pore; more preferably, the protein pore is derived from Msp, α-hemolysin (α-HL), cytolysin, CsgG, ClyA, Sp1 or FraC. The membrane is an amphiphilic layer or a solid state layer; more preferably, the amphiphilic layer is a lipid bilayer.
Preferably, in the kit according to the fourth and fifth aspects of the invention, the sequencing adapter is for binding to the polynucleotide binding protein and for linking to an analyte; the polynucleotide binding protein is used to control movement of the analyte relative to the transmembrane pore; the ligation system is used to mix a sequencing adapter, a polynucleotide binding protein, and an analyte therein and ligate the sequencing adapter to the analyte and to retain the polynucleotide binding protein bound to the sequencing adapter.
A sixth aspect of the invention relates to the use of a method as provided in the first and third aspects or a kit as provided in the fourth and fifth aspects for the preparation of a product for characterizing an analyte or for characterizing an analyte.
The technical scheme of the invention has the following technical effects:
The method of the present invention does not require sequencing adapter complexes to be prepared in advance, and the storage stability of the individual components (e.g., polynucleotide binding protein, nucleic acid adapter) is thereby improved. The library construction method is simple to perform and does not require purification of an adapter complex, which saves time.
Furthermore, the ligation system used for ligase-based library construction generally contains the ligase substrate NTP (such as ATP) and magnesium ions, and polynucleotide binding proteins such as helicases consume NTP (such as ATP) and move along the sequencing adapter. Thus, in a ligation system in which a polynucleotide binding protein such as a helicase is present together with a ligase, the polynucleotide binding protein might be expected to consume the substrate of the ligase and thereby impair ligase function; in addition, the presence of NTP such as ATP in the ligation system might cause the polynucleotide binding protein to move prematurely along the sequencing adapter, which could cause construction of the sequencing library to fail. However, the inventors have surprisingly found that, when the polynucleotide binding protein (e.g., a helicase), the sequencing adapter and the analyte are placed together in a ligation system containing a ligase (and also ATP and magnesium ions) during preparation of the sequencing library, the polynucleotide binding protein still binds well to the sequencing adapter. The helicase binds to the sequencing adapter, and binding of the helicase and ligation by T4 ligase proceed simultaneously without interfering with each other, so the quality of the resulting sequencing library is not affected.
In click chemistry ligation systems, binding of the helicase and the reactions of the linking groups of the click chemistry system, as well as the various other chemical reactions, likewise do not interfere with each other.
Drawings
FIG. 1 shows methods of constructing a nanopore sequencing library, wherein A is the conventional nanopore library construction method, in which a sequencing adapter complex consisting of an enzyme and a nucleic acid adapter is prepared first and then ligated to the sequence to be tested to produce the sequencing library; B is the novel nanopore library construction method, in which the sequencing library is produced simply by placing the enzyme, the nucleic acid adapter and the sequence to be tested together in a ligation system.
FIG. 2 shows the Q-sep assay of the sequencing libraries prepared in Example 1 using the novel nanopore library construction method and the conventional library construction method.
FIG. 3 shows the current signal traces of the sequencing libraries prepared in Example 1 using the novel nanopore library construction method and the conventional library construction method as they pass through the nanopore, where A is the novel nanopore library construction method and B is the conventional nanopore library construction method.
FIG. 4 shows the Q-sep assay of the sequencing libraries prepared in Example 2 using the novel nanopore library construction method and the conventional library construction method.
FIG. 5 shows the current signal traces of the sequencing libraries prepared in Example 2 using the novel nanopore library construction method and the conventional library construction method as they pass through the nanopore, where A is the novel nanopore library construction method and B is the conventional nanopore library construction method.
FIG. 6 shows a gel electrophoresis diagram of the rapid ligation library preparation system of example 3.
FIG. 7 shows a plot of the current signal of the sequencing library of example 4 as it passes through a nanopore, with the number of sample points on the abscissa and signal current (pA) on the ordinate.
Detailed Description
It will be appreciated that the different applications of the disclosed products and methods may be adapted to the specific needs in the art. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments of the invention only and is not intended to be limiting.
All publications, patents, and patent applications cited herein, whether supra or infra, are hereby incorporated by reference in their entirety.
Definitions
In order to more clearly explain the embodiments of the invention, certain scientific terms and terminology are used herein. Unless defined otherwise herein, all such terms and terminology should be interpreted as having the meaning commonly understood by one of ordinary skill in the art. For clarity, the following definitions are made for certain terms used herein.
Analyte
The analyte is selected from one or more of polynucleotides, polypeptides, polysaccharides and lipids. The analyte is preferably a polynucleotide such as a nucleic acid, including deoxyribonucleic acid (DNA) and/or ribonucleic acid (RNA). The polynucleotide may be single-stranded or double-stranded. The polynucleotide may be circular. The polynucleotide may be an aptamer, a probe that hybridizes to the microRNA, or the microRNA itself. The polynucleotide may be of any length. For example, a polynucleotide may be at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 400, or at least 500 nucleotide pairs in length. The polynucleotide may be 1000 or more nucleotide pairs, 5000 or more nucleotide pairs in length or 100000 or more nucleotide pairs in length.
The analyte may be present in any suitable sample. The invention is generally practiced on samples known or suspected to contain an analyte. The invention may be practiced on samples containing one or more analytes of unknown species. Or the invention may be practiced on a sample to confirm the identity of one or more analytes known or expected to be present in the sample. It will be appreciated by those skilled in the art that "providing an analyte" in accordance with the present invention refers to providing a sample comprising an analyte, and that "sequencing adapter linked to an analyte" in accordance with the present invention refers to linking a sequencing adapter to an analyte present in the sample.
Polynucleotide
The polynucleotide may be any polynucleotide. Polynucleotides, such as nucleic acids, are macromolecules containing two or more nucleotides. The polynucleotide or nucleic acid may comprise any combination of any nucleotides. The nucleotides may be naturally occurring or synthetic. One or more nucleotides in the polynucleotide may be oxidized or methylated. One or more nucleotides in the polynucleotide may be damaged. For example, the polynucleotide may comprise a pyrimidine dimer. Such dimers are typically associated with damage by ultraviolet light and are the primary cause of skin melanoma. One or more nucleotides in the polynucleotide may be modified, for example with a label or tag. Suitable labels are described below.
Nucleotides generally contain a nucleobase, a sugar and at least one phosphate group. The nucleobase and sugar form a nucleoside. The nucleotide may be a natural nucleotide or a non-natural nucleotide.
Nucleobases are typically heterocyclic. Nucleobases include, but are not limited to, purines and pyrimidines, more specifically adenine (A), guanine (G), thymine (T), uracil (U) and cytosine (C).
The sugar is typically pentose. Nucleotide sugars include, but are not limited to, ribose and deoxyribose. The sugar is preferably deoxyribose.
The nucleotides in the polynucleotide are typically ribonucleotides or deoxyribonucleotides. The polynucleotide may comprise the following nucleosides: adenosine, uridine, guanosine and cytidine. The nucleotide is preferably a deoxyribonucleotide. The polynucleotide preferably comprises the following nucleosides: deoxyadenosine (dA), deoxyuridine (dU) and/or thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC).
The nucleotides typically contain a monophosphate, diphosphate or triphosphate. The phosphate may be attached to the 5' or 3' side of the nucleotide.
Suitable nucleotides include, but are not limited to, adenosine monophosphate (AMP), guanosine monophosphate (GMP), thymidine monophosphate (TMP), uridine monophosphate (UMP), cytidine monophosphate (CMP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyguanosine monophosphate (dGMP), deoxythymidine monophosphate (dTMP), deoxyuridine monophosphate (dUMP), and deoxycytidine monophosphate (dCMP). The nucleotide is preferably selected from AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP, dCMP and dUMP. Most preferably, the nucleotide is selected from dAMP, dTMP, dGMP, dCMP and dUMP. The polynucleotide preferably comprises the following nucleotides: dAMP, dUMP and/or dTMP, dGMP and dCMP.
The nucleotides in the polynucleotide may be linked to each other in any manner. As in nucleic acids, nucleotides are typically linked by their sugar and phosphate groups. As in pyrimidine dimers, the nucleotides may be linked by their nucleobases.
The polynucleotide may be single-stranded or double-stranded. At least a portion of the polynucleotide is preferably double-stranded.
The polynucleotide may be a nucleic acid. The polynucleotide may be any synthetic nucleic acid known in the art, such as peptide nucleic acid (PNA), glycerol nucleic acid (GNA), threose nucleic acid (TNA), locked nucleic acid (LNA), bridged nucleic acid (BNA), or another synthetic polymer having nucleotide side chains. The PNA backbone is composed of repeating N-(2-aminoethyl)glycine units linked by peptide bonds. The GNA backbone is composed of repeating glycerol units linked by phosphodiester bonds. The TNA backbone is composed of repeating threose units linked together by phosphodiester bonds. LNA is formed from ribonucleotides as described above, with an additional bridge in the ribose moiety connecting the 2' oxygen and the 4' carbon. Bridged nucleic acids (BNAs) are modified RNA nucleotides. They may also be referred to as constrained or inaccessible RNA. BNA monomers may contain a five-, six- or even seven-membered bridged structure with a "fixed" C3'-endo sugar puckering. The bridge is synthetically incorporated at the 2',4'-position of the ribose to produce a 2',4'-BNA monomer.
Most preferably, the polynucleotide is ribonucleic acid (RNA) or deoxyribonucleic acid (DNA).
The polynucleotide may be of any length. For example, the polynucleotide may be at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 400, or at least 500 nucleotides in length. The polynucleotide may be 1000 or more nucleotides, 5000 or more nucleotides, or 100000 or more nucleotides in length.
In the methods of the invention the helicase may move along the entire polynucleotide or only a portion of the polynucleotide. The methods of the invention can be used to characterize the entire polynucleotide or only a portion of the polynucleotide.
The polynucleotide may be single stranded. At least a portion of the polynucleotide is preferably double-stranded. Helicases typically bind to single stranded polynucleotides. If at least a portion of the polynucleotide is double stranded, the polynucleotide preferably comprises a single stranded region or a non-hybridized region. One or more helicases are capable of binding to one strand of the single stranded region or the non-hybridized region. The polynucleotide preferably comprises one or more single stranded regions or one or more non-hybridized regions.
If the one or more helicases used in the method move in the 5' to 3' direction, the polynucleotide preferably comprises a single stranded region or a non-hybridized region at its 5' end. If the one or more helicases used in the method move in the 3' to 5' direction, the polynucleotide preferably comprises a single stranded region or a non-hybridized region at its 3' end. If the one or more helicases are used in an inactive mode (i.e., as a brake), the location of the single stranded region or the non-hybridized region is not important.
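The rule in the preceding paragraph, which end of the polynucleotide should carry the single stranded region for a given helicase, can be written as a small lookup. This is an illustrative sketch only; the function name and the string encodings of direction and end are assumptions, not part of the application.

```python
def preferred_single_stranded_end(direction: str, active: bool = True) -> str:
    """Return the end of the polynucleotide at which a single stranded (or
    non-hybridized) region is preferred for a helicase of the given direction.

    direction: "5'->3'" or "3'->5'" (assumed encoding).
    active:    False when the helicase is used in inactive mode, as a brake.
    """
    if direction not in ("5'->3'", "3'->5'"):
        raise ValueError(f"unknown helicase direction: {direction!r}")
    if not active:
        return "any"          # location is not important in inactive mode
    # A helicase loads on single-stranded DNA behind it and moves forward,
    # so the loading region sits at the end it moves away from.
    return "5'" if direction == "5'->3'" else "3'"


end_for_5_to_3 = preferred_single_stranded_end("5'->3'")
end_for_3_to_5 = preferred_single_stranded_end("3'->5'")
end_for_brake = preferred_single_stranded_end("3'->5'", active=False)
```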
The single stranded region preferably comprises a leader sequence that preferentially threads into the pore.
If at least a portion of the polynucleotide is double stranded, the two strands of the double stranded portion are preferably joined using a bridging moiety, such as a hairpin or hairpin loop. This facilitates the characterization method of the present invention.
The polynucleotide is present in any suitable sample. The invention is generally practiced on samples known or suspected to contain the polynucleotide. The invention may be practiced on a sample to confirm the identity of one or more polynucleotides that are known or expected to be present in the sample.
Polynucleotide binding proteins
The polynucleotide binding protein may be any protein capable of binding to a polynucleotide and controlling its movement relative to the pore, e.g. through the pore. It is straightforward in the art to determine whether a protein binds to a polynucleotide. The protein typically interacts with and modifies at least one property of the polynucleotide. The protein may modify the polynucleotide by cleaving it to form individual nucleotides or shorter chains of nucleotides, such as dinucleotides or trinucleotides. The protein may modify the polynucleotide by orienting it or moving it to a specific position, for example by controlling its movement.
Any number of polynucleotide binding proteins may be bound to a polynucleotide. For example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more proteins may be bound.
The one or more polynucleotide binding proteins may be one or more single stranded binding proteins (SSBs). The one or more SSBs may be (i) an SSB comprising a carboxy-terminal (C-terminal) region that does not have a net negative charge, or (ii) a modified SSB comprising one or more modifications in its C-terminal region that reduce the net negative charge of the C-terminal region.
The one or more polynucleotide binding proteins are preferably derived from a polynucleotide handling enzyme. A polynucleotide handling enzyme is a polypeptide capable of interacting with a polynucleotide and modifying at least one property of the polynucleotide. The enzyme may modify the polynucleotide by cleaving it to form individual nucleotides or shorter chains of nucleotides, such as dinucleotides or trinucleotides. The enzyme may modify the polynucleotide by orienting it or moving it to a specific position. The polynucleotide handling enzyme need not exhibit enzymatic activity so long as it is capable of binding to the polynucleotide and controlling its movement relative to the pore, e.g., through the pore. For example, the enzyme may be modified to remove its enzymatic activity or may be used under conditions that prevent it from acting as an enzyme.
The one or more polynucleotide binding proteins are preferably derived from a nucleolytic enzyme. More preferably, the enzyme is derived from any member of Enzyme Classification (EC) groups 3.1.11, 3.1.13, 3.1.14, 3.1.15, 3.1.16, 3.1.21, 3.1.22, 3.1.25, 3.1.26, 3.1.27, 3.1.30 and 3.1.31.
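As an illustration only, the list of EC groups above can be expressed as a simple membership test on the first three levels of an EC number. The function name and the example EC numbers used to exercise it are assumptions introduced for this sketch; they are not part of the specification.

```python
# EC sub-subclasses named in the text as preferred sources of the binding protein.
PREFERRED_EC_GROUPS = {
    "3.1.11", "3.1.13", "3.1.14", "3.1.15", "3.1.16", "3.1.21",
    "3.1.22", "3.1.25", "3.1.26", "3.1.27", "3.1.30", "3.1.31",
}


def in_preferred_ec_group(ec_number: str) -> bool:
    """Return True if an EC number (e.g. "3.1.11.2") falls in a listed group.

    Hypothetical helper: only the first three levels of the number are compared.
    """
    parts = ec_number.strip().split(".")
    if len(parts) < 3:
        return False  # too coarse to assign to a three-level group
    return ".".join(parts[:3]) in PREFERRED_EC_GROUPS


if __name__ == "__main__":
    for ec in ("3.1.11.2", "3.6.4.12"):
        print(ec, in_preferred_ec_group(ec))
```

An enzyme outside these groups is not excluded by the text; the groups are only stated as a preference for nucleolytic enzymes.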
Preferred enzymes are polymerases, exonucleases, helicases, topoisomerases (such as gyrases) and reverse transcriptases.
The one or more polynucleotide binding proteins are preferably derived from a helicase. The helicase may control movement of the polynucleotide in at least two active modes of operation (when the helicase is provided with all the components necessary to facilitate movement, such as ATP and magnesium ions) and one inactive mode of operation (when the helicase is not provided with the components necessary to facilitate movement, or when the helicase is modified to prevent or hinder movement). When provided with all the necessary components, the helicase moves along the polynucleotide in the 5' to 3' or 3' to 5' direction (depending on the helicase), but the orientation of the polynucleotide in the pore (which depends on which end of the polynucleotide is captured by the pore) means that the helicase can be used either to move the polynucleotide out of the pore against the applied field or to move it into the pore with the applied field. When the end of the polynucleotide towards which the helicase moves is captured by the pore, the helicase works against the direction of the field created by the applied potential and pulls the threaded polynucleotide out of the pore and into the cis chamber. When, however, the end away from which the helicase moves is captured by the pore, the helicase works with the direction of the field created by the applied potential and pushes the threaded polynucleotide into the pore and into the trans chamber.
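The relationship between helicase directionality and the captured end can be summarised as a small lookup. The sketch below is a minimal illustration; the function name, the string encodings of direction and end, and the returned labels are all assumptions made for the example.

```python
def translocation_mode(helicase_direction: str, captured_end: str) -> str:
    """Classify how an active helicase moves the strand relative to the field.

    helicase_direction: "5to3" or "3to5" (direction of travel along the strand).
    captured_end: "5" or "3" (which end of the strand the pore has captured).
    Returns "against_field" (strand pulled out of the pore, towards cis) or
    "with_field" (strand fed into the pore, towards trans).
    """
    if helicase_direction not in ("5to3", "3to5"):
        raise ValueError("helicase_direction must be '5to3' or '3to5'")
    if captured_end not in ("5", "3"):
        raise ValueError("captured_end must be '5' or '3'")
    # The end the helicase travels towards is the last character of the direction.
    moves_towards = helicase_direction[-1]
    if captured_end == moves_towards:
        return "against_field"
    return "with_field"


if __name__ == "__main__":
    print(translocation_mode("5to3", "3"))  # helicase heads for the captured end
    print(translocation_mode("5to3", "5"))  # helicase heads away from it
```

The inactive mode described above is not modelled here; the sketch covers only the two active modes.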
Any helicase may be used in the present invention. The helicase may be derived from a Hel308 helicase, a RecD helicase, such as TraI helicase or TrwC helicase, an XPD helicase or Dda helicase.
Sequencing adaptor
During sequencing library preparation, one or more polynucleotide binding proteins are provided that bind to (or are linked to) one or more sequencing adaptors. In a preferred embodiment, the method further comprises binding (or linking) the one or more polynucleotide binding proteins to the one or more sequencing adaptors.
Each sequencing adaptor may be any moiety capable of being attached to an analyte. Each sequencing adaptor may be of any length as long as the polynucleotide binding protein can bind to it and it can be attached to the analyte.
The one or more sequencing adaptors may be attached to the analyte in any manner. The one or more sequencing adaptors are preferably covalently linked to the analyte.
The one or more sequencing adaptors are preferably synthetic or artificial. The one or more sequencing adaptors are preferably non-native.
Suitable sequencing adaptors include, but are not limited to, polymeric linkers, chemical linkers, polynucleotides or polypeptides. The one or more sequencing adaptors preferably comprise a polynucleotide or a loaded polynucleotide. In such embodiments, the one or more polynucleotide binding proteins preferably bind to (or are linked to) the polynucleotide. Any of the polynucleotides described above may be used. Preferably, the one or more sequencing adaptors comprise DNA, RNA, modified DNA (e.g., abasic DNA), PNA, LNA, BNA, or PEG. More preferably, the one or more sequencing adaptors comprise single- or double-stranded DNA or RNA.
The one or more sequencing adaptors preferably comprise a single stranded polynucleotide for binding (or ligating) to the one or more polynucleotide binding proteins, referred to as a binding region.
At least one of the one or more sequencing adaptors is preferably a Y adaptor.
At least one of the one or more sequencing adaptors is preferably a bridging moiety. Most preferably, the bridging moiety is a hairpin loop or a hairpin loop adaptor. Suitable hairpin loops may be designed using methods known in the art. The hairpin loop may be of any length.
If the polynucleotide is double-stranded, the one or more sequencing adaptors preferably comprise a Y adaptor and optionally a bridging moiety, such as a hairpin loop adaptor. If at least one of the one or more sequencing adaptors is a Y adaptor, it can be used in combination with a bridging adaptor that has no polynucleotide binding protein bound or linked to it.
If the one or more polynucleotide binding proteins are derived from a helicase, they may be stalled at one or more spacers on the one or more sequencing adaptors.
If the one or more polynucleotide binding proteins are translocases that can control the translocation of a polynucleotide but have a reduced ability to unwind a double-stranded polynucleotide, they may bind and stall at one or more binding regions on the one or more sequencing adaptors. In this case no spacer needs to be provided on the adaptor: because the unwinding activity of the translocase is weak or absent, it is arrested directly at the binding region without moving further. The translocase used may be a native translocase that can control translocation of a polynucleotide and has a reduced ability to unwind a double-stranded polynucleotide, or an enzyme mutated to have these properties.
If the one or more sequencing adaptors comprise a blocking band that both binds and arrests the polynucleotide binding protein, the one or more polynucleotide binding proteins bind to, and are arrested at, the blocking band on the one or more sequencing adaptors. Preferably, the blocking band comprises a single-stranded nucleotide sequence formed by linking a plurality of nucleotides whose nucleobases have a benzo structure. The blocking band is disclosed in patent CN114457145A, which is incorporated herein by reference in its entirety.
The term "blocking band" refers to a region that functions as both a binding region and a spacer region. A "binding region" is a region to which the polynucleotide binding protein binds when the polynucleotide is contacted with the polynucleotide binding protein. A "spacer region" is a region that normally arrests the polynucleotide binding protein, i.e. prevents the polynucleotide binding protein from moving further along the polynucleotide past the spacer region. Typically, the polynucleotide binding protein first binds to the binding region and then moves along the polynucleotide until it reaches the spacer region, where it stalls and moves no further until the polynucleotide to which it is bound is brought into contact with the transmembrane pore and a potential is applied. In the presence of a blocking band, by contrast, the polynucleotide binding protein binds to the blocking band and is arrested directly on it, before contact with the transmembrane pore and before application of the potential, so that no separate "binding region" needs to be provided. The blocking band thus serves as both the "binding region" and the "spacer region".
The blocking band comprises a single-stranded nucleotide sequence formed by linked nucleotides, preferably nucleotides whose nucleobases have a benzo structure. The blocking band is a single-stranded nucleotide sequence formed by a plurality of linked nucleotides, preferably 7, 8, 9, 10, 11 or 12, and can preferably be used to bind and arrest one polynucleotide binding protein.
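The difference between a separate binding region plus spacer region and a single blocking band can be pictured with a toy model of an adaptor as an ordered list of regions. The sketch below is purely illustrative: the region labels, the function name and the assumption that the protein loads at the first available site and travels in list order are all introduced for this example and do not come from the specification.

```python
from typing import List, Optional


def stall_position(regions: List[str]) -> Optional[int]:
    """Return the index of the region where the protein is arrested before
    pore contact, or None if it cannot load or nothing arrests it.

    regions is ordered along the protein's direction of travel. Labels used in
    this toy model: "binding", "spacer", "blocking_band", anything else = inert.
    """
    for i, kind in enumerate(regions):
        if kind == "blocking_band":
            return i  # binds and is arrested at the same place
        if kind == "binding":
            # Loaded here; it then travels until something arrests it.
            for j in range(i + 1, len(regions)):
                if regions[j] in ("spacer", "blocking_band"):
                    return j
            return None
    return None  # no site at which the protein can load


if __name__ == "__main__":
    print(stall_position(["binding", "duplex", "spacer", "duplex"]))
    print(stall_position(["blocking_band", "duplex"]))
```

In the first layout two functional regions are needed, whereas in the second a single blocking band provides both functions, which is the point made in the definition above.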
Any number of sequencing adaptors can be used. The method may comprise ligating two or more sequencing adaptors. For example, a sequencing adaptor may be attached to each end of the polynucleotide. In such embodiments, one sequencing adaptor is preferably a Y adaptor, and the other sequencing adaptor may be a bridging moiety, such as a hairpin loop adaptor.
The one or more sequencing adaptors are most preferably ligated to the polynucleotide. The one or more sequencing adaptors may be ligated to either end of the polynucleotide, i.e., the 5' or 3' end. Sequencing adaptors may be ligated to both ends of the polynucleotide. The one or more sequencing adaptors may be ligated to the polynucleotide using any method known in the art.
The one or more sequencing adaptors may be ligated to the analyte using a ligase that uses an NTP (e.g., ATP) as a substrate, such as T4 DNA ligase, E. coli DNA ligase, Taq DNA ligase, Tma DNA ligase, and 9°N DNA ligase.
Spacers
If the one or more polynucleotide binding proteins are helicases and the one or more sequencing adaptors comprise polynucleotides, the one or more helicases may be stalled at one or more spacers. The spacers are also referred to as spacer regions.
When a portion of the polynucleotide enters the pore and moves relative to the pore, e.g. through the pore, along the field created by the applied potential, the one or more helicases are moved past the spacer by the pore. This is because the polynucleotide (including the one or more spacers) moves relative to the pore, e.g. through the pore, while the one or more helicases remain on top of the pore.
The one or more spacers are preferably part of the sequencing adaptor polynucleotide, e.g. they interrupt the polynucleotide sequence. The one or more spacers are preferably not part of one or more blocking molecules, such as speed bumps, that are hybridized to the polynucleotide.
There may be any number of spacers in the sequencing adaptor polynucleotide, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more spacers. Preferably there are 2, 4 or 6 spacers in the polynucleotide. Spacers may be present in different regions of the sequencing adaptor polynucleotide, such as in the leader sequence and in the bridging moiety or hairpin loop.
The one or more spacers each provide an energy barrier that the one or more helicases cannot overcome even in the active mode. The one or more spacers may stall the one or more helicases by reducing their traction (e.g., by removing the bases from nucleotides in the polynucleotide) or by physically blocking their movement (e.g., using bulky chemical groups).
The one or more spacers may comprise any molecule or combination of molecules that stalls the one or more helicases, i.e. that prevents the one or more helicases from moving along the polynucleotide. It is straightforward to determine whether the one or more helicases are stalled at one or more spacers in the absence of a transmembrane pore and an applied potential. For example, as shown in the examples, the ability of a helicase to move past a spacer and displace a complementary strand of DNA can be determined by PAGE.
The one or more spacers typically comprise a linear molecule, such as a polymer. The one or more spacers typically have a different structure from the polynucleotide. For example, if the polynucleotide is DNA, the one or more spacers are typically not DNA. In particular, if the polynucleotide is deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), the one or more spacers preferably comprise peptide nucleic acid (PNA), glycerol nucleic acid (GNA), threose nucleic acid (TNA), locked nucleic acid (LNA), bridged nucleic acid (BNA) or a synthetic polymer having nucleotide side chains. The one or more spacers may comprise one or more nucleotides in the opposite direction to the polynucleotide. For example, when the polynucleotide is in the 5' to 3' direction, the one or more spacers may comprise one or more nucleotides in the 3' to 5' direction. The nucleotides may be any of the nucleotides discussed above.
The one or more spacers preferably comprise one or more nitroindoles, e.g. one or more 5-nitroindoles, one or more inosines, one or more acridines, one or more 2-aminopurines, one or more 2-6-diaminopurines, one or more 5-bromo-deoxyuridines, one or more inverted thymidines (inverted dTs), one or more inverted dideoxythymidines (ddTs), one or more dideoxycytidines (ddCs), one or more 5-methylcytidines, one or more 5-hydroxymethylcytidines, one or more 2'-O-methyl RNA bases, one or more isodeoxycytidines (Iso-dCs), one or more isodeoxyguanosines (Iso-dGs), one or more iSpC3 groups (i.e., nucleotides lacking sugar and base), one or more photocleavable (PC) groups, one or more hexanediol groups, one or more spacer 9 (iSp9) groups, one or more spacer 18 (iSp18) groups, or one or more thiol linkages. The one or more spacers may comprise any combination of these groups. Many of these groups are commercially available from Integrated DNA Technologies (IDT).
The one or more spacers may contain any number of these groups. For example, for 2-aminopurines, 2-6-diaminopurines, 5-bromo-deoxyuridines, inverted dTs, ddTs, ddCs, 5-methylcytidines, 5-hydroxymethylcytidines, 2'-O-methyl RNA bases, Iso-dCs, Iso-dGs, iSpC3 groups, PC groups, hexanediol groups and thiol linkages, the one or more spacers preferably comprise 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more. The one or more spacers preferably comprise 2, 3, 4, 5, 6, 7, 8 or more iSp9 groups. The one or more spacers preferably comprise 2, 3, 4, 5 or 6 or more iSp18 groups. The most preferred spacer is four iSp18 groups.
Simultaneous mixing
The term "simultaneously mix" in the present invention refers to mixing the separate sequencing adaptor, the separate polynucleotide binding protein and the separate polynucleotide with one another at the same time or within a short time (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 minutes, or within 3, 4, 5, 6, 7, 8, 9 or 10 seconds). The polynucleotide binding protein does not need to be bound to the sequencing adaptor before mixing; that is, a complex of the sequencing adaptor and the polynucleotide binding protein does not need to be formed in advance.
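The definition above amounts to a time-window test on when each component is added. A minimal sketch follows, assuming that addition times are recorded in seconds and taking 10 minutes (the longest interval named in the definition) as the default window; the function and parameter names are hypothetical.

```python
from typing import Dict


def is_simultaneous_mix(addition_times_s: Dict[str, float],
                        window_s: float = 600.0) -> bool:
    """Return True if every component was added within `window_s` seconds.

    addition_times_s maps a component name to the time (in seconds) at which it
    was added to the reaction. The 600 s default is an assumption taken from the
    longest interval mentioned in the definition.
    """
    if not addition_times_s:
        raise ValueError("no components given")
    times = list(addition_times_s.values())
    return max(times) - min(times) <= window_s


if __name__ == "__main__":
    additions = {"adaptor": 0.0, "binding_protein": 5.0, "polynucleotide": 8.0}
    print(is_simultaneous_mix(additions))
    print(is_simultaneous_mix(additions, window_s=3.0))
```

A stricter window can be passed when the seconds-scale definition is intended.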
Membrane
Any membrane may be used according to the present invention. Suitable membranes are well known in the art. The membrane is preferably an amphiphilic layer. An amphiphilic layer is a layer formed of amphiphilic molecules (e.g., phospholipids) that have both hydrophilic and lipophilic properties. The amphiphilic molecules may be synthetic or naturally occurring. Non-naturally occurring amphiphiles and amphiphiles that form monolayers are known in the art and include, for example, block copolymers (Gonzalez-Perez et al., Langmuir, 2009, 25, 10447-10450). Block copolymers are polymeric materials in which two or more monomer subunits are polymerized together to produce a single polymer chain. Block copolymers typically have the properties contributed by each monomer subunit. However, block copolymers may have unique properties that polymers formed from the individual subunits do not possess. A block copolymer can be designed such that one monomer subunit is hydrophobic (i.e., lipophilic) while the other subunit or subunits are hydrophilic in an aqueous medium. In this case, the block copolymer may have amphiphilic properties and may form a structure that mimics a biological membrane. The block copolymer may be a diblock (composed of two monomer subunits), but may also be composed of more than two monomer subunits to form a more complex arrangement that behaves as an amphiphile. The copolymer may be a triblock, tetrablock or pentablock copolymer.
Archaebacterial bipolar tetraether lipids are naturally occurring lipids that are constructed such that the lipid forms a monolayer membrane. These lipids are generally found in extremophiles that survive in harsh biological environments: thermophiles, halophiles and acidophiles. Their stability is believed to derive from the fused nature of the final bilayer. Block copolymer materials that mimic these biological entities are readily constructed by creating a triblock polymer with the general motif hydrophilic-hydrophobic-hydrophilic. Such a material may form monolayer membranes that behave like lipid bilayers and exhibit a range of phase behaviours from vesicles to laminar membranes. Membranes formed from these triblock copolymers have several advantages over biological lipid membranes. Because the triblock copolymers are synthetic, their exact construction can be carefully controlled to provide the correct chain lengths and properties required to form membranes and to interact with pores and other proteins.
Block copolymers can also be constructed from subunits that are not classified as lipid sub-materials; for example, the hydrophobic polymer may be made from siloxane or other non-hydrocarbon-based monomers. The hydrophilic sub-section of the block copolymer may also have low protein binding properties, which allows the creation of a membrane that is highly resistant when exposed to raw biological samples. The head group unit may also be derived from non-classical lipid head groups.
Triblock copolymer membranes also have greater mechanical and environmental stability than biological lipid membranes, for example a much higher operating temperature or pH range. The synthetic nature of the block copolymers provides a platform for tailoring polymer-based membranes for a wide variety of applications.
The amphiphilic molecules may be chemically modified or functionalized to facilitate coupling of analytes.
The amphiphilic layer may be a single layer or a double layer. The amphiphilic layer is generally planar. The amphiphilic layer may be non-planar, e.g. curved.
The amphiphilic layer is typically a lipid bilayer. Lipid bilayers are models of cell membranes and serve as excellent platforms for a range of experimental studies. For example, lipid bilayers can be used for in vitro investigation of membrane proteins by single-channel recording. Alternatively, lipid bilayers can be used as biosensors to detect the presence of a range of substances. The lipid bilayer may be any lipid bilayer. Suitable lipid bilayers include, but are not limited to, planar lipid bilayers, supported bilayers, or liposomes. The lipid bilayer is preferably a planar lipid bilayer.
In another preferred embodiment, the membrane is a solid-state layer. The solid-state layer is not of biological origin. In other words, the solid-state layer is not derived or isolated from a biological environment, such as an organism or cell, nor is it a synthetically manufactured version of a biologically available structure. The solid-state layer may be formed of organic or inorganic materials including, but not limited to, microelectronic materials, insulating materials such as Si3N4, Al2O3 and SiO2, organic and inorganic polymers such as polyamides, plastics, elastomers such as two-component addition-cure silicone rubber, and glass. The solid-state layer may be formed of graphene.
Transmembrane pore
A transmembrane pore is a structure that allows hydrated ions driven by an applied potential to flow from one side of the membrane to the other side of the membrane.
The transmembrane pore is preferably a transmembrane protein pore. A transmembrane protein pore is a polypeptide or collection of polypeptides that allow hydrated ions (e.g., analytes) to flow from one side of a membrane to the other side of the membrane. In the present invention, transmembrane protein pores are capable of forming pores that allow hydrated ions driven by an applied potential to flow from one side of the membrane to the other. The transmembrane protein pore preferably allows the analyte (e.g., nucleotide) to flow from one side of the membrane (e.g., lipid bilayer) to the other. Transmembrane protein pores allow polynucleotides (e.g., DNA or RNA) to move through the pores.
The transmembrane protein pore may be monomeric or oligomeric. The pore is preferably composed of several repeating subunits (e.g., 6, 7 or 8 subunits). The pores are more preferably heptameric or octameric pores.
Transmembrane protein pores typically contain a barrel or channel through which ions can flow. The subunits of the pore generally surround a central axis and contribute strands to a transmembrane β-barrel or channel, or to a transmembrane α-helical bundle or channel.
The barrels or channels of transmembrane protein pores typically contain amino acids that facilitate interactions with analytes (e.g., nucleotides, polynucleotides, or nucleic acids). These amino acids are preferably located near the constriction of the barrel or channel. Transmembrane protein pores typically contain one or more positively charged amino acids (e.g., arginine, lysine, or histidine) or aromatic amino acids (e.g., tyrosine or tryptophan). These amino acids generally promote interactions between the pore and a nucleotide or polynucleotide or nucleic acid.
Transmembrane protein pores useful in the present invention may be derived from β-barrel pores or α-helical bundle pores. β-barrel pores comprise a barrel or channel formed from β-strands. Suitable β-barrel pores include, but are not limited to, β-toxins, such as α-hemolysin, anthrax toxin and leukocidins, and bacterial outer membrane proteins/porins, such as Mycobacterium smegmatis porin (Msp) (e.g., MspA), outer membrane porin F (OmpF), outer membrane porin G (OmpG), outer membrane phospholipase A, and Neisseria autotransporter lipoprotein (NalP). α-helical bundle pores comprise a barrel or channel formed from α-helices. Suitable α-helical bundle pores include, but are not limited to, inner membrane proteins and α outer membrane proteins, such as Wza and ClyA toxin. The transmembrane pore may be derived from Msp or α-hemolysin (α-HL).
The transmembrane protein pore is preferably derived from Msp, preferably from MspA. Such pores are oligomeric and typically comprise 7, 8, 9 or 10 monomers derived from Msp. The pore may be a homo-oligomeric pore derived from Msp comprising identical monomers. Alternatively, the pore may be a hetero-oligomeric pore derived from Msp comprising at least one monomer that differs from the others. The pore may also comprise one or more constructs comprising two or more covalently linked Msp-derived monomers.
The transmembrane protein pore is also preferably derived from α-hemolysin (α-HL). The wild-type α-HL pore is formed from 7 identical monomers or subunits (i.e., it is a heptamer).
In some embodiments, the transmembrane protein pore is chemically modified. The pores may be chemically modified at any site in any manner. The transmembrane protein pore is preferably chemically modified by binding of the molecule to one or more cysteines (cysteine attachment), binding of the molecule to one or more lysines, binding of the molecule to one or more unnatural amino acids, enzymatic modification of an epitope, or modification of a terminus. Suitable methods for making such modifications are well known in the art. The transmembrane protein pore may be chemically modified by binding any molecule. For example, the pore may be chemically modified by binding a dye or fluorophore.
Any number of monomers in the pore may be chemically modified. Preferably, one or more, e.g. 2, 3, 4, 5, 6, 7, 8, 9 or 10, of the monomers are chemically modified as described above.
Method for characterizing an analyte
The present invention provides methods of characterizing analytes.
The methods of the invention involve measuring one or more characteristics of the analyte. The method includes controlling movement of an analyte through a transmembrane pore, and obtaining one or more measurements as the analyte moves relative to the pore, wherein the measurements are representative of one or more characteristics of the analyte.
Any number of polynucleotides may be characterized. For example, the methods of the invention may involve characterizing 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 50, 100, or more polynucleotides. The polynucleotide may be naturally occurring or artificial. For example, the method can be used to verify the sequence of a manufactured oligonucleotide. The method is generally performed in vitro.
The method may involve measuring one, two, three, four or five or more characteristics of the polynucleotide. The one or more characteristics are preferably selected from (i) the length of the polynucleotide, (ii) the identity of the polynucleotide, (iii) the sequence of the polynucleotide, (iv) the secondary structure of the polynucleotide, and (v) whether the polynucleotide is modified. Any combination of (i) to (v) can be measured according to the present invention.
The method may be performed using any device suitable for studying a membrane/pore system in which a pore is present in a membrane, and may be performed using any device suitable for transmembrane pore sensing. For example, the device comprises a chamber comprising an aqueous solution and a barrier that divides the chamber into two parts. The barrier typically has an opening in which the membrane comprising the pore is formed. Alternatively, the barrier itself forms the membrane in which the pore is present.
The method may include measuring the current through the pore as the analyte moves relative to the pore. The device may therefore also include circuitry capable of applying a potential across the membrane and the pore and measuring an electrical signal. The method may be performed using a patch clamp or a voltage clamp. The method preferably comprises the use of a voltage clamp.
The method of the invention comprises measuring the current through the pore as the analyte moves relative to the pore. Suitable conditions for measuring ionic current through a transmembrane protein pore are known in the art and are disclosed in the examples. The method is typically carried out with a voltage applied across the membrane and the pore. The voltage used is generally +5 V to -5 V, for example +4 V to -4 V, +3 V to -3 V or +2 V to -2 V. The voltage used is typically -600 mV to +600 mV or -400 mV to +400 mV. The voltage used is preferably in a range having a lower limit selected from -400 mV, -300 mV, -200 mV, -150 mV, -100 mV, -50 mV, -20 mV and 0 mV and an upper limit independently selected from +10 mV, +20 mV, +50 mV, +100 mV, +150 mV, +200 mV, +300 mV and +400 mV. The voltage used is more preferably in the range of 100 mV to 240 mV and most preferably in the range of 120 mV to 220 mV. Discrimination between different nucleotides can be increased by using an increased applied potential.
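The nested voltage ranges above can be checked programmatically. The following sketch classifies an applied voltage into the narrowest stated range that contains it; the tier labels and the function name are assumptions introduced for illustration, and only the numeric limits come from the text.

```python
# (label, lower limit in mV, upper limit in mV), ordered from narrowest to widest.
VOLTAGE_TIERS_MV = [
    ("most preferred", 120, 220),
    ("more preferred", 100, 240),
    ("typical", -600, 600),
    ("general", -5000, 5000),
]


def classify_voltage(applied_mv: float) -> str:
    """Return the label of the narrowest stated range containing applied_mv."""
    for label, lower, upper in VOLTAGE_TIERS_MV:
        if lower <= applied_mv <= upper:
            return label
    return "outside stated ranges"


if __name__ == "__main__":
    for mv in (180, 230, -300, 2000, 6000):
        print(mv, "mV:", classify_voltage(mv))
```

The intermediate range built from independently selected lower and upper limits is not modelled here, since it depends on which pair of limits is chosen.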
The method is generally carried out in the presence of any charge carrier, such as a metal salt, e.g. an alkali metal salt, or a halide salt, e.g. a chloride salt, such as an alkali metal chloride salt. The charge carriers may include ionic liquids or organic salts, such as tetramethyl ammonium chloride, trimethyl phenyl ammonium chloride, phenyl trimethyl ammonium chloride, or 1-ethyl-3-methylimidazolium chloride. In the exemplary device discussed above, the salt is present in the aqueous solution in the chamber. Potassium chloride (KCl), sodium chloride (NaCl), cesium chloride (CsCl) or a mixture of potassium ferrocyanide and potassium ferricyanide is typically used. KCl, NaCl, and a mixture of potassium ferrocyanide and potassium ferricyanide are preferred. The charge carriers may be asymmetric across the membrane. For example, the type and/or concentration of the charge carriers may be different on each side of the membrane.
The salt concentration may be at saturation. The salt concentration may be 3 M or lower and is typically 0.1 M to 2.5 M, 0.3 M to 1.9 M, 0.5 M to 1.8 M, 0.7 M to 1.7 M, 0.9 M to 1.6 M or 1 M to 1.4 M. Preferably the salt concentration is 150 mM to 1 M. The method is preferably carried out using a salt concentration of at least 0.3 M, for example at least 0.4 M, at least 0.5 M, at least 0.6 M, at least 0.8 M, at least 1.0 M, at least 1.5 M, at least 2.0 M, at least 2.5 M, or at least 3.0 M. A high salt concentration provides a high signal-to-noise ratio and allows currents indicative of the presence of a nucleotide to be identified against the background of normal current fluctuations.
The method is generally carried out in the presence of a buffer. In the exemplary device discussed above, the buffer is present in the aqueous solution in the chamber. Any buffer may be used in the methods of the invention. Typically, the buffer is a phosphate buffer. Other suitable buffers are HEPES and Tris-HCl (tris(hydroxymethyl)aminomethane hydrochloride) buffers. The method is generally carried out at a pH of 4.0 to 12.0, 4.5 to 10.0, 5.0 to 9.0, 5.5 to 8.8, 6.0 to 8.7, 7.0 to 8.8, or 7.5 to 8.5. The pH used is preferably about 7.5.
The method may be carried out at a temperature of 0 °C to 100 °C, 15 °C to 95 °C, 16 °C to 90 °C, 17 °C to 85 °C, 18 °C to 80 °C, 19 °C to 70 °C, or 20 °C to 60 °C. The method is usually carried out at room temperature. The method is optionally carried out at a temperature that supports enzyme function, e.g., about 37 °C.
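The salt, pH and temperature ranges above can be gathered into a single sanity check on a set of run conditions. The sketch below is illustrative only: it uses the broadest stated pH and temperature ranges and, as an assumption, the "at least 0.3 M" figure as the salt criterion; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunConditions:
    salt_molar: float      # charge-carrier salt concentration, mol/L
    ph: float              # buffer pH
    temperature_c: float   # temperature, degrees Celsius


def out_of_range(conditions: RunConditions) -> List[str]:
    """Return a list of criteria that the conditions fail (empty if none)."""
    problems = []
    if conditions.salt_molar < 0.3:  # assumed criterion: "at least 0.3 M"
        problems.append("salt below 0.3 M")
    if not 4.0 <= conditions.ph <= 12.0:
        problems.append("pH outside 4.0-12.0")
    if not 0.0 <= conditions.temperature_c <= 100.0:
        problems.append("temperature outside 0-100 C")
    return problems


if __name__ == "__main__":
    print(out_of_range(RunConditions(salt_molar=1.0, ph=7.5, temperature_c=37.0)))
    print(out_of_range(RunConditions(salt_molar=0.1, ph=3.0, temperature_c=120.0)))
```

Narrower preferred ranges can be substituted for the limits in the function if a stricter check is wanted.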
Kit
The invention also provides a kit for preparing a sequencing library and a kit for sequencing, comprising: a sequencing adaptor, a polynucleotide binding protein, and optionally a ligation system comprising an intramolecular cross-linking agent and a ligase. Any of the embodiments described above are applicable to the kit.
The kit is preferably for use with double-stranded polynucleotides that pass through a transmembrane pore, and the kit preferably comprises a Y adaptor having one or more helicases attached, or the kit preferably comprises a bridging moiety (or hairpin loop) adaptor having one or more helicases attached, a Y adaptor, and a bridging moiety (or hairpin loop) adaptor having one or more molecular brakes attached. The Y adaptor preferably comprises one or more first anchors for coupling the polynucleotide to the membrane, and the bridging moiety (or hairpin loop) adaptor preferably comprises one or more second anchors for coupling the polynucleotide to the membrane.
The kit preferably further comprises a transmembrane pore. Any of the membranes and pores described above may be included in the kit.
Any of the embodiments described above for the methods of the invention are equally applicable to the kit. The kit may also include components of a membrane, such as an amphiphilic layer or a triblock copolymer membrane.
Kits of the invention may additionally include one or more other reagents or instruments that enable the practice of any of the embodiments described above. Such reagents or instruments include one or more of the following: suitable buffers (aqueous solutions), means for obtaining a sample from a subject (e.g., a vessel or instrument containing a needle), means for amplifying and/or expressing a polynucleotide, or a membrane as defined above, or a voltage or patch clamp device. Reagents may be present in the kit in a dry state such that the fluid sample can re-suspend the reagents. The kit may optionally also include instructions that enable the kit to be used in the method of the invention, or details regarding which organisms the method of the invention may be used in.
Examples
Details of experimental operations not specifically noted in the examples below can be found in the references cited herein, and the experimental reagents and instrumentation employed are conventional commercially available reagents or instrumentation.
Example 1: Preparation of an M1 protein sequencing library
1. Preparation of sequencing adaptor Y-4C18:
Y1-4C18 (as shown in SEQ ID NO.: 1):
Y2 (shown as SEQ ID NO: 2): 5'-CGACT TCTAC CGTTT GACTC CGC-3'
Y3 (as shown in SEQ ID NO.: 3):
The sequencing adaptor consists of three DNA strands, Y1-4C18, Y2 and Y3. The three strands were mixed at a molar ratio of 1:1:1 in a solution of 10 mM HEPES (pH 7.0), 160 mM NaCl and 0.1 mM EDTA, heated to 95 °C, held for 5 minutes, and then slowly annealed to room temperature (cooling rate not exceeding 0.1 °C per 6 s) to obtain the sequencing adaptor Y-4C18. An M1 protein sequencing library was prepared according to the library preparation system of Table 1 below.
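The cooling-rate limit stated above fixes a minimum duration for the annealing ramp. The sketch below works this out, assuming room temperature is 25 °C (the text does not give a value); the function name and defaults are illustrative.

```python
def min_anneal_time_s(start_c: float = 95.0,
                      end_c: float = 25.0,   # assumed room temperature
                      step_c: float = 0.1,   # largest temperature drop per step
                      step_s: float = 6.0) -> float:
    """Minimum ramp duration in seconds when cooling by at most step_c per step_s."""
    if end_c >= start_c:
        raise ValueError("end temperature must be below start temperature")
    return (start_c - end_c) * step_s / step_c


if __name__ == "__main__":
    seconds = min_anneal_time_s()
    print(f"minimum ramp time: {seconds:.0f} s (about {seconds / 60:.0f} min)")
```

Under these assumptions the ramp takes about 70 minutes, in addition to the 5 minute hold at 95 °C.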
Table 1: M1 protein sequencing library preparation system
Component | Amount
10 Kb test sequence | 300 fmol
Sequencing linker Y-4C18 | 1.5 pmol
M1 helicase (as shown in SEQ ID NO: 4) | 15 pmol
T4 ligase | 10 μl
4× T4 ligase buffer B | 25 μl
Water | to 100 μl
The end-repaired 10 Kb test sequence, the sequencing linker Y-4C18, the M1 helicase and the T4 ligase listed in Table 1 above were added simultaneously to T4 ligase buffer B (4× T4 ligase buffer B consists of 264 mM Tris-HCl, 40 mM MgCl2, 200 μM TMAD and 32% PEG 8000), and water was then added to a final volume of 100 μl. The prepared system was incubated at room temperature for 15 minutes, and the polynucleotide to which the sequencing linker and the T4 Dda helicase were attached was then recovered by magnetic-bead DNA purification to obtain a purified nanopore sequencing library. Library construction was performed as shown in FIG. 1, panel B, to obtain new sequencing library 1.
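The amounts in Table 1 can be converted into final concentrations and molar ratios for the 100 μl reaction, as in the following sketch. The amounts and the 4× buffer composition are taken from the example; the variable and function names are illustrative assumptions, and the T4 ligase is left out because it is specified by volume only.

```python
REACTION_VOLUME_UL = 100
BUFFER_VOLUME_UL = 25  # volume of 4x T4 ligase buffer B added

# Amounts from Table 1, expressed in fmol.
AMOUNTS_FMOL = {
    "test sequence": 300,
    "linker Y-4C18": 1_500,
    "M1 helicase": 15_000,
}

# Stated composition of 4x T4 ligase buffer B as (value, unit).
BUFFER_4X = {
    "Tris-HCl": (264, "mM"),
    "MgCl2": (40, "mM"),
    "TMAD": (200, "uM"),
    "PEG 8000": (32, "%"),
}


def final_nanomolar(amount_fmol, volume_ul):
    # 1 fmol per microlitre is the same as 1 nmol per litre (1 nM).
    return amount_fmol / volume_ul


def diluted(stock_value, stock_volume_ul, total_volume_ul):
    # Simple dilution: the concentration scales with the volume fraction.
    return stock_value * stock_volume_ul / total_volume_ul


concentrations_nm = {
    name: final_nanomolar(amount, REACTION_VOLUME_UL)
    for name, amount in AMOUNTS_FMOL.items()
}
buffer_final = {
    name: (diluted(value, BUFFER_VOLUME_UL, REACTION_VOLUME_UL), unit)
    for name, (value, unit) in BUFFER_4X.items()
}
linker_per_analyte = AMOUNTS_FMOL["linker Y-4C18"] / AMOUNTS_FMOL["test sequence"]
helicase_per_linker = AMOUNTS_FMOL["M1 helicase"] / AMOUNTS_FMOL["linker Y-4C18"]

print(concentrations_nm)
print(buffer_final)
print(linker_per_analyte, helicase_per_linker)
```

Under these figures the reaction contains 3 nM test sequence, 15 nM linker and 150 nM helicase, that is, a 5-fold molar excess of linker over test sequence and a 10-fold excess of helicase over linker, in 1× buffer (66 mM Tris-HCl, 10 mM MgCl2, 50 μM TMAD, 8% PEG 8000).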
The 10 Kb test sequence of Table 1 above was also subjected to end repair and library construction using the Oxford Nanopore Ligation Sequencing Kit SQK-LSK110, in the manner shown in FIG. 1, panel A, to obtain conventional sequencing library 2.
New sequencing library 1 and conventional sequencing library 2 were each subjected to the following Q-Sep analysis; the results are shown in FIG. 2.
The sequencing library was analyzed on a BIOptic Q-Sep100 fully automated nucleic acid analysis system fitted with an S3 cartridge. The sequencing library was first diluted to 1.5-2 ng/μL with dilution buffer, and 20 μL of the diluted library was added to a sample well. The gDNA mode was selected on the instrument, with a sample injection time of 10-15 s and a separation time of 540 s; clicking "Run" started the sample analysis.
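The dilution to the 1.5-2 ng/μL window follows the usual C1·V1 = C2·V2 relation. The sketch below shows one way to compute the volumes; the stock concentration of 35 ng/μL is a hypothetical value chosen for illustration, and the function name is likewise an assumption.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (library_ul, buffer_ul) for a C1*V1 = C2*V2 dilution."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    library_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return library_ul, final_volume_ul - library_ul


# Hypothetical 35 ng/uL stock; 1.75 ng/uL is the midpoint of the stated
# 1.5-2 ng/uL window, and 20 uL is the volume loaded into the sample well.
library_ul, buffer_ul = dilution_volumes(35.0, 1.75, 20.0)
print(f"{library_ul:.2f} uL library + {buffer_ul:.2f} uL dilution buffer")
```

In practice a slightly larger final volume than 20 μL may be prepared to allow for pipetting losses.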
As can be seen from FIG. 2, comparing the library construction efficiency of sequencing library 1, obtained by the new nanopore library construction method, with that of sequencing library 2, obtained by the conventional library construction method, shows that the efficiency of the new nanopore library construction method is substantially the same as that of the conventional nanopore library construction method.
New sequencing library 1 and conventional sequencing library 2 were each subjected to nanopore sequencing on a QNome-3841 nanopore gene sequencer from Qitan Technology, and the actual sequencing signals are shown in FIG. 3. As can be seen from FIG. 3, there is no significant difference between the two signal patterns, i.e., there is no significant difference in the quality of the libraries obtained by the two library construction methods.
Example 2: Sequencing library preparation - M2 protein
Sequencing linker Y-4C18 was prepared as described for the sequencing linker in Example 1. An M2 protein sequencing library was prepared according to the library preparation system of Table 2 below.
Table 2: M2 protein sequencing library preparation system
The end-repaired 10 Kb test sequence, the sequencing linker Y-4C18, the M2 helicase and the T4 ligase listed in Table 2 above were added simultaneously to T4 ligase buffer B (4× T4 ligase buffer B consists of 264 mM Tris-HCl, 40 mM MgCl2, 200 μM TMAD and 32% PEG 8000), and water was then added to a final volume of 100 μl. The prepared system was incubated at room temperature for 15 minutes, and the polynucleotide to which the sequencing linker and the T4 Dda helicase were attached was then recovered by magnetic-bead DNA purification to obtain new sequencing library 3.
The 10 Kb test sequence of Table 2 above was also subjected to end repair and library construction by the Oxford Nanopore Ligation Sequencing Kit method, using library construction kit SQK-LSK110 in which the enzyme in the AMX component was replaced with M2, to obtain conventional sequencing library 4.
New sequencing library 3 and conventional sequencing library 4 were each subjected to the following Q-Sep analysis; the results are shown in FIG. 4.
The sequencing library was analyzed on a BIOptic Q-Sep100 fully automated nucleic acid analysis system fitted with an S3 cartridge. The sequencing library was first diluted to 1.5-2 ng/μL with dilution buffer, and 20 μL of the diluted library was added to a sample well. The gDNA mode was selected on the instrument, with a sample injection time of 10-15 s and a separation time of 540 s; clicking "Run" started the sample analysis.
As can be seen from FIG. 4, comparing the library construction efficiency of sequencing library 3, obtained by the new nanopore library construction method, with that of sequencing library 4, obtained by the conventional library construction method, shows that the efficiency of the new nanopore library construction method is substantially the same as that of the conventional nanopore library construction method.
New sequencing library 3 and conventional sequencing library 4 were each subjected to nanopore sequencing on a QNome-3841 nanopore gene sequencer from Qitan Technology, and the actual sequencing signals are shown in FIG. 5. As can be seen from FIG. 5, there is no significant difference between the two signal patterns, i.e., there is no significant difference in the quality of the libraries obtained by the two library construction methods.
Example 3: New sequencing library preparation - rapid ligation of library adapters
Preparation of sequencing linker Y-TCO:
Y1-TCO:
Y2 (shown as SEQ ID NO: 2): 5'-CGACT TCTAC CGTTT GACTC CGC-3'
Y3 (as shown in SEQ ID NO.: 3):
Y1-TCO is the sequence shown in SEQ ID NO: 1 with TCO linked at its 3' end. Sequencing linker Y-TCO was prepared as described for the sequencing linker in Example 1. Sequencing libraries were prepared according to the library preparation system of Table 3 below.
Table 3: Rapid-ligation sequencing library preparation system
Component | Amount
10 Kb test sequence-Tz (test sequence linked to a Tz group) | 300 fmol
Sequencing linker Y-TCO | 1.5 pmol
M1 helicase (as shown in SEQ ID NO: 4) | 15 pmol
Water | to 100 μl
The test sequence-Tz, the sequencing linker Y-TCO and the M1 helicase were mixed simultaneously in the amounts indicated in Table 3 above, and water was then added to a final volume of 100 μl. The prepared system was incubated at room temperature for 5 minutes and then analyzed on a TBE gel run at 160 V for 40 minutes. The gel was stained with SYBR Gold dye and scanned in UV mode; the final results are shown in FIG. 6.
Example 4: Sequencing library preparation - T17 protein
1. Preparation of T17 protein
A recombinant plasmid containing the sequence of the Dda helicase variant T17 (a variant of the amino acid sequence of SEQ ID NO: 6) was transformed into BL21 (DE3) competent cells by heat shock. After recovery, the culture was spread on a solid LB plate containing ampicillin and incubated overnight at 37 °C. A single colony was picked and inoculated into 100 ml of liquid LB medium containing ampicillin and cultured at 37 °C. The culture was then transferred at a 1% inoculum into LB liquid medium containing ampicillin for expansion culture at 37 °C and 200 rpm, and the OD600 was monitored continuously. When the OD600 reached 0.6-0.8, the culture was cooled to 18 °C and isopropyl β-D-thiogalactoside (IPTG) was added to a final concentration of 1 mM to induce expression. After 12-16 h, the cells were harvested at 18 °C. The cells were disrupted under high pressure and the protein was purified by FPLC; fractions were collected to obtain purified protein T17 (SEQ ID NO: 6 with the mutations M1G/E94C/C109A/C136A/A360C/P89A/F98A/Q146K).
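The mutation list for T17 uses the common notation of wild-type residue, 1-based position and replacement residue, with substitutions separated by slashes. A minimal sketch of how such a list could be parsed and applied to a parent sequence is given below. Because the sequence of SEQ ID NO: 6 is not reproduced in this section, the sketch applies mutations only to a short toy sequence, which is purely hypothetical; the function names are also assumptions.

```python
import re

MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")

T17_MUTATIONS = "M1G/E94C/C109A/C136A/A360C/P89A/F98A/Q146K"


def parse_mutations(mutations):
    """Split 'M1G/E94C' into [('M', 1, 'G'), ('E', 94, 'C')]."""
    parsed = []
    for token in mutations.split("/"):
        match = MUTATION_RE.match(token)
        if match is None:
            raise ValueError(f"unrecognised mutation: {token!r}")
        wild_type, position, replacement = match.groups()
        parsed.append((wild_type, int(position), replacement))
    return parsed


def apply_mutations(sequence, mutations):
    """Apply substitutions, checking each wild-type residue first."""
    residues = list(sequence)
    for wild_type, position, replacement in parse_mutations(mutations):
        index = position - 1  # the notation is 1-based
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if sequence[index] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {sequence[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


print(parse_mutations(T17_MUTATIONS))
# Toy parent sequence (hypothetical, not SEQ ID NO: 6).
print(apply_mutations("MKTAY", "M1G/T3C"))  # prints "GKCAY"
```

Checking the wild-type residue before substituting guards against numbering offsets between the mutation list and the parent sequence.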
2. Preparation of sequencing linker Y:
Y1 (as shown in SEQ ID NO.: 7): 5'-(iSpC3)30-GCGGA GTCAA ACGGT AGAAG TCG TTTTT TTTTT ACTGC TCATT CGGTC CTGCT GACT-3'
Y2 (shown as SEQ ID NO: 2): 5'-CGACT TCTAC CGTTT GACTC CGC-3'
Y3 (as shown in SEQ ID NO.: 3):
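The relationship between strands Y1 and Y2 can be checked directly from the sequences given above: Y2 is the reverse complement of the first 23 nucleotides of the DNA portion of Y1, so the two strands can hybridize over that region. The following sketch performs the check; the helper names are illustrative assumptions, and the spacer at the 5' end of Y1 is omitted because it is not a nucleotide sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def pairing_start(long_strand, short_strand):
    """Index in `long_strand` where `short_strand` can base-pair, or -1."""
    return long_strand.find(reverse_complement(short_strand))


# DNA portion of Y1 (5' spacer omitted) and Y2, as written in the example.
Y1_DNA = (
    "GCGGA GTCAA ACGGT AGAAG TCG TTTTT TTTTT "
    "ACTGC TCATT CGGTC CTGCT GACT"
).replace(" ", "")
Y2 = "CGACT TCTAC CGTTT GACTC CGC".replace(" ", "")

print(pairing_start(Y1_DNA, Y2))  # prints 0
```

A result of 0 indicates that Y2 pairs with Y1 starting at the first nucleotide after the spacer.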
Sequencing linker Y was prepared according to the procedure for preparation of sequencing linker in example 1. A T17 protein sequencing library was prepared according to the library preparation system of Table 4 below.
Table 4: T17 protein sequencing library preparation system
The end-repaired 10 Kb test sequence, the sequencing linker Y, the T17 protein and the T4 ligase listed in Table 4 above were added simultaneously to T4 ligase buffer B (4× T4 ligase buffer B consists of 264 mM Tris-HCl, 40 mM MgCl2, 200 μM TMAD and 32% PEG 8000), and water was then added to a final volume of 100 μl. The prepared system was incubated at room temperature for 15 minutes, and the polynucleotide to which the sequencing linker and the T17 protein were attached was then recovered by magnetic-bead DNA purification to obtain new sequencing library 5.
New sequencing library 5 was subjected to nanopore sequencing on a QNome-3841 nanopore gene sequencer from Qitan Technology. The sequencing results showed that, even without the 4C18 spacer in the sequencing linker, the translocase T17 was able to complete library construction, and sequencing signals could be collected, as shown in FIG. 7.

Claims (15)

1. A method of preparing a sequencing library, the method comprising:
(a) Providing a sequencing adapter, a polynucleotide binding protein, and an analyte; and
(b) Mixing the sequencing adapter, the polynucleotide binding protein, and the analyte so that ligation occurs, thereby forming a sequencing library in which the sequencing adapter is ligated to the analyte and the polynucleotide binding protein remains bound to the sequencing adapter.
2. The method of claim 1, wherein prior to step (a), the polynucleotide binding protein is not bound to the sequencing adapter; and
at the end of step (b), the polynucleotide binding protein is bound to and stalled at the sequencing adapter.
3. The method according to claim 1 or 2, wherein the mixing is performed in the same system and/or wherein the mixing is simultaneous mixing.
4. The method of claim 1 or 2, wherein the polynucleotide binding protein is selected from any one or a combination of two or more of a polymerase, a helicase, an exonuclease, and a topoisomerase.
5. The method of claim 1 or 2, wherein the polynucleotide binding protein is a helicase, preferably the helicase is (a) a Hel308 helicase, a RecD helicase, an XPD helicase, or a Dda helicase; (b) is derived from any one of the helicases of (a); or any combination of the helicases described in (a) and/or (b).
6. The method of claim 1 or 2, wherein the sequencing linker is a linker formed by joining a plurality of nucleotides, and is used for binding to the polynucleotide binding protein and for ligation to the analyte.
7. The method of claim 1 or 2, wherein the sequencing adapter comprises an anchor capable of coupling to a membrane.
8. The method according to claim 1 or 2, wherein the analyte is selected from a polynucleotide, a polypeptide, a lipid or a polysaccharide, preferably a polynucleotide, which is a fully double stranded polynucleotide, a partially double stranded polynucleotide or a single stranded polynucleotide.
9. A construct prepared by the method of any one of claims 1-8, the construct comprising: the sequencing adapter, the polynucleotide binding protein, and the analyte, the sequencing adapter being linked to the analyte, and the polynucleotide binding protein remaining bound to the sequencing adapter.
10. A method of characterizing an analyte, the method comprising:
(i) Performing the method of any one of claims 1 to 8, thereby forming a sequencing library or providing the construct of claim 9;
(ii) Contacting the sequencing library or construct formed in step (i) with a transmembrane pore such that a polynucleotide binding protein controls movement of the analyte relative to the transmembrane pore; and
(iii) Obtaining one or more measurements as the analyte moves relative to the transmembrane pore, wherein the measurements represent one or more characteristics of the analyte, thereby characterizing the analyte.
11. The method of claim 10, wherein the transmembrane pore is a protein pore or a solid state pore; preferably, the protein pore is derived from Msp, α-hemolysin (α-HL), cytolysin, CsgG, ClyA, Sp1, or FraC; and/or the membrane is an amphiphilic layer or a solid state layer; preferably, the amphiphilic layer is a lipid bilayer.
12. A kit for preparing a sequencing library comprising an individually packaged sequencing linker, an individually packaged polynucleotide binding protein, and, optionally, an individually packaged ligation system.
13. A kit for characterizing an analyte comprising an individually packaged sequencing linker, an individually packaged polynucleotide binding protein, and, optionally, an individually packaged ligation system.
14. The kit of claim 12 or 13, wherein the sequencing adapter is for binding to the polynucleotide binding protein and for linking to an analyte; the polynucleotide binding protein is used to control movement of the analyte relative to the transmembrane pore; the ligation system is used to mix a sequencing adapter, a polynucleotide binding protein, and an analyte therein and ligate the sequencing adapter to the analyte and to retain the polynucleotide binding protein bound to the sequencing adapter.
15. Use of a method according to any one of claims 1 to 8 and 10-11 or a kit according to any one of claims 12-14 or a construct according to claim 9 for the preparation of a product or for the characterization of an analyte.
CN202211661600.3A 2022-12-22 2022-12-22 Preparation method of sequencing library Pending CN118241319A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211661600.3A CN118241319A (en) 2022-12-22 2022-12-22 Preparation method of sequencing library
PCT/CN2023/136713 WO2024131530A1 (en) 2022-12-22 2023-12-06 Preparation method for sequencing library

Publications (1)

Publication Number Publication Date
CN118241319A true CN118241319A (en) 2024-06-25

Family

ID=91556174

Country Status (2)

Country Link
CN (1) CN118241319A (en)
WO (1) WO2024131530A1 (en)

Also Published As

Publication number Publication date
WO2024131530A1 (en) 2024-06-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination