CN115960182A - Mutant of porin monomer, protein pore and application thereof - Google Patents

Mutant of porin monomer, protein pore and application thereof

Info

Publication number
CN115960182A
Authority
CN
China
Prior art keywords
mutant
protein
mutation
seq
pore
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111186286.3A
Other languages
Chinese (zh)
Inventor
刘少伟
谢馥励
李倩雯
何京雄
赵帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Qitan Technology Ltd
Original Assignee
Chengdu Qitan Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Qitan Technology Ltd
Priority to CN202111186286.3A
Publication of CN115960182A
Legal status: Pending

Abstract

The invention belongs to the technical field of characterizing target analytes, and particularly provides a mutant of a porin monomer, a protein pore comprising the mutant, and use of the mutant in detecting a target analyte, wherein the amino acid sequence of the mutant of the porin monomer comprises the sequence shown in SEQ ID NO: 1 or a sequence at least 99%, 98%, 97%, 96%, 95%, 90%, 80%, 70%, 60% or 50% identical thereto, and comprises a mutation at one or more positions corresponding to K67, D71, S72 and Y74 of SEQ ID NO: 1.

Description

Mutant of porin monomer, protein pore and application thereof
Technical Field
The invention belongs to the technical field of characterizing target analytes, and particularly relates to a mutant of a porin monomer, a protein pore containing the mutant, and use of the mutant in detecting a target analyte.
Background
As research into the structure and sequence of nucleic acids has advanced, nucleic acid sequencing technology has developed continuously, has become a core field of life science research, and has greatly promoted technical development in biology, chemistry, electricity, life science, medicine and other fields. Using nanopores to develop a novel nucleic acid sequencing technology that is rapid, accurate, low-cost and high-throughput is one of the research hot spots following the Human Genome Project.
Nanopore sequencing technology, also known as fourth-generation sequencing technology, is a gene sequencing technology that takes a single-stranded nucleic acid molecule as the sequencing unit. It uses a nanopore that provides an ion current channel: the single-stranded nucleic acid molecule is driven through the nanopore by electrophoresis, the current through the nanopore decreases while the nucleic acid passes, and sequence information is read in real time from the different signals generated.
The main characteristics of nanopore sequencing are long read length and high accuracy, with most errors occurring in homopolymer regions. Nanopore sequencing can sequence native DNA and RNA and can also directly obtain base modification information of DNA and RNA; for example, methylated cytosine can be read directly, without the prior bisulfite treatment of the genome required by second-generation sequencing methods, which greatly facilitates the direct study of epigenetic phenomena at the genome level. As a novel platform, nanopore detection technology has the advantages of low cost, high throughput and label-free detection.
Nanopore analysis technology originated from the invention of the Coulter counter and the technique for recording single-channel currents. In 1976, Neher and Sakmann, winners of the Nobel Prize in Physiology or Medicine, used the patch clamp technique to measure membrane potential and to study membrane proteins and ion channels, which advanced the practical application of nanopore sequencing technology. In 1996, Kasianowicz et al. proposed the new idea of sequencing DNA with α-hemolysin, a landmark for single-molecule sequencing with biological nanopores. Subsequently, reports on biological nanopores such as the MspA porin and the bacteriophage Phi29 connector enriched research on nanopore analysis technology. In 2001, Li et al. opened a new era of solid-state nanopore research. Progress in solid-state nanopore sequencing has been slow, limited by advances in the semiconductor and materials industries.
One of the key points of nanopore sequencing technology is the design of a dedicated biological nanopore. A reading head structure formed by the constriction zone of the nanopore causes a blockage of the channel current when a single-stranded nucleic acid (such as ssDNA) molecule passes through the nanopore, transiently changing the current flowing through the nanopore (each base changes the current by a different amplitude); highly sensitive electronics detect these changes to identify the bases that pass. Currently, protein pores are used as nanopores for sequencing, and the porins are mainly derived from Escherichia coli.
At present, few types of nanopore protein are available, and alternative nanopore proteins need to be developed for nanopore sequencing. The porin is closely related to sequencing precision, and changes to the porin also alter the mode of its interaction with the rate-controlling protein; further optimizing the stability of the interaction interface between the porin and the rate-controlling protein has a positive influence on the consistency and stability of sequencing data. The accuracy of nanopore sequencing also needs improvement, so there is a need to develop improved nanopore proteins to further improve the resolution of nanopore sequencing.
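The blockage of channel current described above can be illustrated with a minimal sketch that flags stretches of a current trace falling below a fraction of the open-pore current. The function name `find_blockades`, the 0.7 threshold fraction and the sample values are assumptions chosen for illustration only; they are not taken from the patent, and real base calling involves far more than thresholding.

```python
from typing import List, Tuple


def find_blockades(current_pa: List[float], open_pore_pa: float,
                   threshold_fraction: float = 0.7) -> List[Tuple[int, int]]:
    """Return (start, end) sample-index pairs where the current stays below
    threshold_fraction * open_pore_pa. The end index is exclusive.

    threshold_fraction is an illustrative assumption, not a value from the text.
    """
    cutoff = threshold_fraction * open_pore_pa
    events: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(current_pa):
        if value < cutoff and start is None:
            start = i                      # blockade begins
        elif value >= cutoff and start is not None:
            events.append((start, i))      # blockade ends
            start = None
    if start is not None:                  # trace ended during a blockade
        events.append((start, len(current_pa)))
    return events


# Hypothetical trace in pA: open pore near 200 pA with two blockades.
trace = [200.0, 201.0, 95.0, 90.0, 110.0, 199.0, 200.0, 60.0, 198.0]
print(find_blockades(trace, open_pore_pa=200.0))
```

Within each flagged event, the different current levels would then be the raw material for identifying the bases that pass.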
Disclosure of Invention
In order to solve the above problems, it is an object of embodiments of the present invention to provide an alternative mutant of a porin monomer, a protein pore comprising the same, and uses thereof.
In a first aspect, embodiments of the invention provide a mutant of a porin monomer, wherein the amino acid sequence of the mutant of the porin monomer comprises or consists of the sequence set forth in SEQ ID NO: 1 or a sequence at least 99%, 98%, 97%, 96%, 95%, 90%, 80%, 70%, 60% or 50% identical thereto, and comprises a mutation at one or more of the positions corresponding to K67, D71, S72, and Y74 of SEQ ID NO: 1;
one or more of K67, D71, S72, and Y74 are specifically: (1) K67; (2) D71; (3) S72; (4) Y74; (5) K67 and D71; (6) K67 and S72; (7) K67 and Y74; (8) D71 and S72; (9) D71 and Y74; (10) S72 and Y74; (11) K67, D71 and S72; (12) K67, D71 and Y74; (13) D71, S72 and Y74; (14) K67, D71, S72 and Y74.
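The phrase "one or more of K67, D71, S72, and Y74" can be expanded mechanically. The sketch below enumerates every non-empty combination of the four positions; the helper name `position_combinations` is hypothetical. Under this reading there are 15 combinations, one more than the 14 itemized above (the triple K67, S72 and Y74 is not listed there).

```python
from itertools import combinations

POSITIONS = ("K67", "D71", "S72", "Y74")


def position_combinations(positions):
    """Return every non-empty combination of the given positions,
    ordered by size and then by the order of `positions`."""
    combos = []
    for size in range(1, len(positions) + 1):
        combos.extend(combinations(positions, size))
    return combos


all_combos = position_combinations(POSITIONS)
for number, combo in enumerate(all_combos, start=1):
    print(f"({number}) " + ", ".join(combo))
```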
Preferably, the amino acids of the mutant of porin monomers comprise mutations at one or more positions corresponding to 62-209, 62-74, 62-75, 65-79, 67-209, 67-75, or 67-74 of SEQ ID NO. 1.
Preferably, the amino acids of the mutant of porin monomers comprise:
(1) insertions, deletions and/or substitutions of amino acids at one or more positions corresponding to Q62, K67, D71, S72, and Y74 of SEQ ID NO: 1;
(2) insertions, deletions and/or substitutions of amino acids at one or more positions corresponding to Q62, K67, D71, S72, Y74, E110, E119, E126, and K209 of SEQ ID NO: 1;
(3) insertions, deletions and/or substitutions of amino acids at one or more positions corresponding to K67, D71, S72, Y74 and S75 of SEQ ID NO: 1; or
(4) insertions, deletions and/or substitutions of amino acids at one or more positions corresponding to K67, T69, A70, D71, S72, S73, and Y74 of SEQ ID NO: 1.
In one embodiment, the amino acid mutation of the mutant of porin monomers is selected from the group consisting of:
(a) Q62 corresponding to SEQ ID NO: 1 is mutated into 0 to 5 of G, A, V, L, I; K67 is mutated into 0 to 3 of R, H, K; D71 is mutated into 0 to 4 of N, E, D, Q; S72 is mutated into 0 to 1 of P; Y74 is mutated into 0 to 5 of S, C, U, T, M;
(b) Q62 corresponding to SEQ ID NO: 1 is mutated into 0 to 5 of G, A, V, L, I; K67 is mutated into 0 to 3 of R, H, K; D71 is mutated into 0 to 4 of N, E, D, Q; S72 is mutated into 0 to 1 of P; Y74 is mutated into 0 to 3 of F, Y, W; E110 is mutated into 0 to 4 of N, D, E, Q; E119 is mutated into 0 to 4 of N, D, E, Q; E126 is mutated into 0 to 4 of N, D, E, Q; K209 is mutated into 0 to 1 of P;
(c) K67 corresponding to SEQ ID NO: 1 is mutated into 0 to 3 of R, H, K; D71 is mutated into 0 to 5 of G, A, V, L, I; S72 is mutated into 0 to 1 of P; Y74 is mutated into 0 to 3 of F, Y, W; S75 is mutated into 0 to 5 of C, U, S, T, M; and
(d) K67 corresponding to SEQ ID NO: 1 is mutated into 0 to 3 of R, H, K; T69 is mutated into 0 to 5 of S, C, T, U, M; A70 is mutated into 0 to 1 of P; D71 is mutated into 0 to 5 of G, A, V, L, I; S72 is mutated into 0 to 3 of F, Y, W; S73 is mutated into 0 to 5 of G, A, V, L, I; Y74 is mutated into 0 to 3 of F, Y, W.
In one embodiment, the amino acid mutation of the mutant of porin monomers is selected from the group consisting of:
(a) Q62 corresponding to SEQ ID NO: 1 is mutated to G, A, V, L, or I; K67 is mutated to R or H; D71 is mutated to N, E, or Q; S72 is mutated to P; Y74 is mutated to S, C, U, T, or M;
(b) Q62 corresponding to SEQ ID NO: 1 is mutated to G, A, V, L, or I; K67 is mutated to R or H; D71 is mutated to N, E, or Q; S72 is mutated to P; Y74 is deleted; E110 is mutated to N, D, or Q; E119 is mutated to N, D, or Q; E126 is mutated to N, D, or Q; K209 is mutated to P;
(c) K67 corresponding to SEQ ID NO: 1 is mutated to R or H; D71 is mutated to G, A, V, L, or I; S72 is mutated to P; Y74 is deleted; S75 is deleted; and
(d) K67 corresponding to SEQ ID NO: 1 is mutated to R or H; T69 is mutated to S, C, U, or M; A70 is mutated to P; D71 is mutated to G, A, V, L, or I; S72 is mutated to F, Y, or W; S73 is mutated to G, A, V, L, or I; Y74 is deleted.
In one embodiment, the amino acid mutation of the mutant of porin monomers is selected from the group consisting of:
(a) Q62L, K67R, D71N, S72P and Y74T corresponding to SEQ ID NO: 1;
(b) Q62L, K67R, D71N, S72P, Y74 deletion, E110N, E119N, E126N, and K209P corresponding to SEQ ID NO: 1;
(c) K67R, D71A, S72P, Y74 deletion, and S75 deletion corresponding to SEQ ID NO: 1; and
(d) K67R, T69S, A70P, D71A, S72Y, S73A, and Y74 deletion corresponding to SEQ ID NO: 1.
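Mutation sets of this kind (for example Q62L, K67R, D71N, S72P and Y74T, or a set containing deletions at Y74 and S75) can be applied to a reference sequence programmatically. The following is a minimal sketch: the function `apply_mutations`, the `Y74del` shorthand for a deletion, and the toy reference built by `toy_reference` are all hypothetical. In particular, the toy reference is NOT SEQ ID NO: 1; it is a glycine filler with the named wild-type residues placed at the named positions so the example can run.

```python
import re
from typing import List

# "K67R" = substitution; "Y74del" = deletion (shorthand assumed for this sketch).
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z]|del)$")


def apply_mutations(sequence: str, mutations: List[str]) -> str:
    """Apply substitutions/deletions given in 1-based reference numbering."""
    residues = list(sequence)
    for mutation in mutations:
        match = MUTATION_PATTERN.match(mutation)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mutation}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        index = position - 1
        if residues[index] != wild_type:
            raise ValueError(f"{mutation}: reference has {residues[index]!r} at {position}")
        # Deleted residues become empty strings so later positions keep
        # their reference numbering until the final join.
        residues[index] = "" if new == "del" else new
    return "".join(residues)


def toy_reference(length: int = 80) -> str:
    """Placeholder sequence (not SEQ ID NO: 1) with the wild-type residues
    named in the text placed at their positions."""
    residues = ["G"] * length
    for position, residue in {62: "Q", 67: "K", 71: "D", 72: "S", 74: "Y", 75: "S"}.items():
        residues[position - 1] = residue
    return "".join(residues)


reference = toy_reference()
mutant_a = apply_mutations(reference, ["Q62L", "K67R", "D71N", "S72P", "Y74T"])
mutant_c = apply_mutations(reference, ["K67R", "D71A", "S72P", "Y74del", "S75del"])
print(reference[61:75])
print(mutant_a[61:75])
print(len(reference), len(mutant_c))
```

Checking the wild-type residue before each change guards against numbering that does not correspond to the reference sequence.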
In a second aspect, embodiments of the present invention provide a mutant of a porin monomer, wherein the amino acid sequence of the mutant of the porin monomer comprises the sequence shown in SEQ ID NO: 1 or a sequence at least 99%, 98%, 97%, 96%, 95%, 90%, 80%, 70%, 60% or 50% identical thereto, and the mutant of the porin monomer comprises:
(1) a mutation at one or more positions corresponding to Q62, K67, T69, A70, D71, S72, S73, Y74, S75, E110, E119, E126, and K209 of SEQ ID NO: 1;
(2) mutations at one or more positions corresponding to Q62L, K67R, T69S, A70P, D71N/D71A, S72P/S72Y, S73A, Y74T/Y74 deletion, S75 deletion, E110N, E119N, E126N, and K209P of SEQ ID NO: 1;
(3) a mutation at K67, D71, S72, and/or Y74 corresponding to SEQ ID NO: 1, and additionally a mutation at at least one position of Q62, T69, A70, S73, S75, E110, E119, E126, and K209;
(4) a mutation corresponding to K67R, D71N/D71A, S72P/S72Y, and/or Y74T/Y74 deletion of SEQ ID NO: 1; or
(5) mutations corresponding to K67R, D71N/D71A, S72P/S72Y, and/or Y74T/Y74 deletion of SEQ ID NO: 1, and additionally a mutation at at least one of Q62L, T69S, A70P, S73A, S75 deletion, E110N, E119N, E126N, and K209P.
In one embodiment, in the mutation in (1) of the mutant of a porin monomer of the second aspect: Q62 is mutated into 0 to 5 of G, A, V, L, I; K67 is mutated into 0 to 3 of R, H, K; T69 is mutated into 0 to 5 of S, C, T, U, M; A70 is mutated into 0 to 1 of P; D71 is mutated into 0 to 4 of N, E, D, Q, or into 0 to 5 of G, A, V, L, I; S72 is mutated into 0 to 1 of P, or into 0 to 3 of F, Y, W; S73 is mutated into 0 to 5 of G, A, V, L, I; Y74 is mutated into 0 to 5 of S, C, U, T, M, or into 0 to 3 of F, Y, W; S75 is mutated into 0 to 5 of C, U, S, T, M; E110 is mutated into 0 to 4 of N, D, E, Q; E119 is mutated into 0 to 4 of N, D, E, Q; E126 is mutated into 0 to 4 of N, D, E, Q; K209 is mutated into 0 to 1 of P.
"0 to N" includes 0, 1, 2, 3, 4, ..., N. For example, "Q62 is mutated into 0 to 5 of G, A, V, L, I" means that Q62 is mutated into 0, 1, 2, 3, 4 or 5 amino acids selected from G, A, V, L, I.
In one example, when the mutation is 1 amino acid, the amino acids before and after the mutation are not the same. For example, for the mutation of T69 to 0 to 5 of S, C, T, U, M, when this is 1, T69 is not mutated to T, but only to any of S, C, U, M; when this is 2, T69 can be mutated to any two of S, C, T, U, M, and so on.
When the mutation is 0 amino acids, the amino acid at the position is deleted. For example, when Y74 is mutated to 0 of F, Y, W, it means Y74 is deleted.
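The "0 to N" convention can be made concrete with a small sketch. It follows the three rules stated above: a count of 0 means the residue is deleted, a count of 1 excludes the wild-type residue, and a larger count allows any selection of that many residues from the list. The function name `allowed_outcomes` is hypothetical, and representing a multi-residue selection as an unordered combination is an assumption, since the text does not say how the selected residues are arranged.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def allowed_outcomes(wild_type: str, candidates: Sequence[str],
                     count: int) -> List[Tuple[str, ...]]:
    """List the outcomes permitted at one position for a given count.

    An empty tuple stands for deletion of the residue.
    """
    if not 0 <= count <= len(candidates):
        raise ValueError("count must be between 0 and the number of candidates")
    if count == 0:
        return [()]                                    # residue deleted
    if count == 1:
        # A single replacement must differ from the wild-type residue.
        return [(aa,) for aa in candidates if aa != wild_type]
    # Assumption: selections of two or more residues are treated as unordered.
    return list(combinations(candidates, count))


print(allowed_outcomes("T", "SCTUM", 1))   # T69 with one residue
print(allowed_outcomes("Y", "FYW", 0))     # Y74 deleted
print(len(allowed_outcomes("T", "SCTUM", 2)))
```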
In a third aspect, embodiments of the invention provide a protein pore comprising at least one mutant of a porin monomer.
In a fourth aspect, embodiments of the present invention provide a complex for characterizing a target analyte, comprising the protein pore and a rate-controlling protein associated with the protein pore.
In a fifth aspect, embodiments of the invention provide nucleic acids encoding the mutant of a porin monomer, the protein pore, or the complex.
In a sixth aspect, embodiments of the invention provide vectors or genetically engineered host cells comprising the nucleic acids.
In a seventh aspect, embodiments of the present invention provide the use of the mutant of a porin monomer, the protein pore, the complex, the nucleic acid, the vector or the host cell in detecting the presence, absence or one or more characteristics of a target analyte, or in the manufacture of a product for detecting the presence, absence or one or more characteristics of a target analyte.
In an eighth aspect, the embodiments of the present invention provide a method for producing a protein pore or a polypeptide thereof, comprising transforming the host cell with the vector, and inducing the host cell to express the protein pore or the polypeptide thereof.
In a ninth aspect, embodiments of the present invention provide a method for determining the presence, absence or one or more characteristics of a target analyte, comprising:
a. contacting a target analyte with a protein pore, complex, or protein pore in a complex such that the target analyte moves relative to the protein pore; and
b. obtaining one or more measurements while the target analyte is moving relative to the protein pore, thereby determining the presence, absence or one or more characteristics of the target analyte.
In one embodiment, the method comprises: the target analyte interacts with the protein pores present in the membrane such that the target analyte moves relative to the protein pores.
In one embodiment, the target analyte is a nucleic acid molecule.
In one embodiment, a method for determining the presence, absence or one or more characteristics of a target analyte comprises coupling the target analyte to a membrane; and the target analyte interacts with the protein pores present in the membrane such that the target analyte moves relative to the protein pores.
In a tenth aspect, embodiments of the present invention provide a kit for determining the presence, absence or one or more characteristics of a target analyte, comprising the mutant of a porin monomer, the protein pore, the complex, the nucleic acid, or the vector or host cell, and components of the membrane.
In an eleventh aspect, embodiments of the present invention provide a device for determining the presence, absence or one or more characteristics of a target analyte, comprising said protein pore or said complex, and said membrane.
In one embodiment, the target analyte comprises a polysaccharide, a metal ion, an inorganic salt, a polymer, an amino acid, a peptide, a protein, a nucleotide, an oligonucleotide, a polynucleotide, a dye, a drug, a diagnostic agent, an explosive, or an environmental contaminant;
preferably, the target analyte comprises a polynucleotide,
more preferably, the polynucleotide comprises DNA or RNA; and/or, the one or more characteristics are selected from (i) the length of the polynucleotide; (ii) the identity of the polynucleotide; (iii) the sequence of the polynucleotide; (iv) the secondary structure of the polynucleotide; and (v) whether the polynucleotide is modified; and/or, the rate-controlling protein in the complex comprises a polynucleotide binding protein.
Drawings
The drawings described are only schematic and are non-limiting.
Fig. 1 illustrates the basic working principle of a nanopore according to one embodiment.
Figure 2 shows a schematic diagram of DNA sequencing according to one embodiment.
FIG. 3 shows the corresponding pore blocking signal when a nucleotide passes through a protein pore according to one embodiment.
Fig. 4A, 4B and 4C show surface structure and ribbon diagram models of a wild-type protein pore channel according to an embodiment. Fig. 4A is a side view of the surface structure model, fig. 4B is a top view of the surface structure model, and fig. 4C is a ribbon structure model.
Fig. 5 shows a wild-type channel constriction zone amino acid residue distribution and constriction zone diameter, according to one embodiment.
Fig. 6A shows a monomer surface potential diagram of a wild-type channel according to an embodiment, and fig. 6B shows a ribbon model of the monomer with a stick model of the amino acid residue distribution in its constriction zone.
Fig. 7 shows the constriction zone amino acid residue distribution characteristics and constriction zone diameters of mutant pore 1 according to one embodiment.
FIG. 8 shows a cartoon representation of mutant pore 1 based on homology modeling, according to one embodiment.
FIG. 9 shows a negative-staining electron micrograph of mutant pore 1 according to one embodiment; the arrows indicate target protein particles.
FIG. 10 shows the two-dimensional classification of mutant pore 1 from negative-staining electron microscopy according to one embodiment; the class indicated by arrows shows that the oligomeric state of mutant pore 1 is a 9-mer.
FIG. 11 shows the structure of the DNA construct BS7-4C3-PLT according to one embodiment.
Fig. 12A shows the open-pore current of mutant pore 1 at a voltage of ±180 mV and its gating characteristics, according to an embodiment.
FIG. 12B shows nucleic acid translocation through mutant pore 1 at a voltage of +180 mV, according to one embodiment.
FIGS. 13A and 13B show example current traces when helicase Mph-MP1-E105C/A362C controls translocation of DNA construct BS7-4C3-PLT through mutant pore 1, according to one embodiment.
FIG. 14 is an enlarged view of a single signal in the embodiment of FIG. 13A.
Figure 15 shows chip test current traces (for both traces, y-axis = current (pA), x-axis = sample point (s)) when helicase Mph-MP1-E105C/A362C controls translocation of DNA construct BS7-4C3-PLT through mutant pore 1, according to one embodiment.
FIG. 16A shows the open-pore current of mutant pore 2 at a voltage of ±180 mV and its gating characteristics, according to an embodiment.
FIG. 16B shows nucleic acid translocation through mutant pore 2 at a voltage of +180 mV, according to one embodiment.
FIGS. 17A and 17B show example current traces when helicase Mph-MP1-E105C/A362C controls translocation of DNA construct BS7-4C3-PLT through mutant pore 2, according to one embodiment.
FIG. 18 is an enlarged view of a single signal in the embodiment of FIG. 17A.
Figure 19 shows chip test current traces (y-axis = current (pA), x-axis = sample point (s)) when helicase Mph-MP1-E105C/A362C controls translocation of DNA construct BS7-4C3-PLT through mutant pore 2, according to one embodiment.
FIG. 20A shows the open-pore current of mutant pore 3 at a voltage of ±180 mV and its gating characteristics, according to one embodiment.
FIG. 20B shows nucleic acid translocation through mutant pore 3 at a voltage of +180 mV, according to one embodiment.
FIGS. 21A and 21B show example current traces when helicase Mph-MP1-E105C/A362C controls translocation of DNA construct BS7-4C3-PLT through mutant pore 3, according to one embodiment.
FIG. 22 is an enlarged view of a single signal in the embodiment of FIG. 21A.
FIG. 23A shows the open-pore current of mutant pore 4 at a voltage of ±180 mV and its gating characteristics, according to one embodiment.
FIG. 23B shows nucleic acid translocation through mutant pore 4 at a voltage of +180 mV, according to one embodiment.
FIGS. 24A and 24B show example current traces when helicase Mph-MP1-E105C/A362C controls translocation of DNA construct BS7-4C3-PLT through mutant pore 4, according to one embodiment.
FIG. 25 is an enlarged view of a single signal region of the embodiment of FIG. 24B.
FIG. 26 shows the results of protein purification of mutant 1 according to one embodiment, and lanes 1-6 show SDS-PAGE electrophoretic detection of the different fractions separated.
Figure 27 shows the results of molecular sieve purification of the protein of mutant 1 according to one embodiment, with the arrow indicating the target protein peak.
Detailed Description
It is understood that different applications of the disclosed products and methods may be adapted to the specific needs of the art. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments of the invention only, and is not intended to be limiting.
Also, as used in this specification and the claims, the singular forms "a", "an", and "the" include plural referents unless the context clearly dictates otherwise. For example, reference to "a nucleotide" includes two or more nucleotides and reference to "a helicase" includes two or more helicases.
As used herein, the term "comprising" means that any of the listed elements must be included, and that other elements may also optionally be included. "consisting of" means excluding all unrecited elements. Embodiments defined by each of these terms are within the scope of the invention.
As used herein, a "nucleotide sequence", "DNA sequence" or "nucleic acid molecule" refers to a polymeric form of nucleotides of any length (ribonucleotides or deoxyribonucleotides). The term refers only to the primary structure of the molecule. Thus, the term includes double- and single-stranded DNA and RNA.
The term "nucleic acid" as used herein refers to a single- or double-stranded covalently linked sequence of nucleotides in which the 3' and 5' ends of each nucleotide are linked by a phosphodiester linkage. A nucleotide may consist of a deoxyribonucleotide base or a ribonucleotide base. Nucleic acids may include DNA and RNA, and may be synthetically prepared in vitro or isolated from natural sources. Nucleic acids may further include modified DNA or RNA, e.g., methylated DNA or RNA, or RNA that has been post-transcriptionally modified, e.g., by 5'-capping with 7-methylguanosine, by 3'-end processing such as cleavage and polyadenylation, and by splicing. Nucleic acids may also include synthetic nucleic acids (XNA), such as hexitol nucleic acids (HNA), cyclohexene nucleic acids (CeNA), threose nucleic acids (TNA), glycerol nucleic acids (GNA), locked nucleic acids (LNA) and peptide nucleic acids (PNA). The size of a nucleic acid (or polynucleotide) is typically expressed as the number of base pairs (bp) for a double-stranded polynucleotide, or as the number of nucleotides (nt) for a single-stranded polynucleotide. One thousand bp or nt equal one kilobase (kb). Polynucleotides less than about 40 nucleotides in length are commonly referred to as "oligonucleotides" and may comprise primers for use in DNA manipulation, e.g., by polymerase chain reaction (PCR).
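The size conventions above (bp for double-stranded molecules, nt for single-stranded ones, one thousand of either to a kilobase, and "oligonucleotide" for lengths below about 40 nt) can be sketched as follows. The helper names `describe_size` and `is_oligonucleotide` are hypothetical, and treating "about 40" as a hard cutoff of exactly 40 is a simplifying assumption.

```python
OLIGO_MAX_NT = 40  # "less than about 40 nucleotides"; exact cutoff is an assumption


def describe_size(length: int, double_stranded: bool) -> str:
    """Express a polynucleotide length in bp, nt or kb."""
    if length >= 1000:
        # One thousand bp or nt equal one kilobase (kb).
        return f"{length / 1000:g} kb"
    unit = "bp" if double_stranded else "nt"
    return f"{length} {unit}"


def is_oligonucleotide(length_nt: int) -> bool:
    """True when the length falls under the assumed oligonucleotide cutoff."""
    return length_nt < OLIGO_MAX_NT


print(describe_size(250, double_stranded=False))
print(describe_size(1500, double_stranded=True))
print(is_oligonucleotide(25))
```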
Polynucleotides, such as nucleic acids, are macromolecules comprising two or more nucleotides. The polynucleotide or nucleic acid may comprise any combination of any nucleotides. The nucleotides may be naturally occurring or synthetic. One or more nucleotides in the polynucleotide may be oxidized or methylated. One or more nucleotides in the polynucleotide may be damaged. For example, the polynucleotide may comprise a pyrimidine dimer. This dimer is often associated with damage caused by ultraviolet light and is the major cause of cutaneous melanoma. One or more nucleotides in the polynucleotide may be modified, for example with a conventional label or tag. The polynucleotide may comprise one or more nucleotides that are abasic (i.e., lack nucleobases), or lack nucleobases and sugars (i.e., are C3).
The nucleotides in the polynucleotide may be linked to each other in any manner. The nucleotides are typically linked by their sugar and phosphate groups, as in nucleic acids. The nucleotides may be linked by their nucleobases, as in the guanine dimers.
The polynucleotide may be single-stranded or double-stranded. At least a portion of the polynucleotide is preferably double-stranded. The polynucleotide may be a nucleic acid, such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). The polynucleotide may comprise an RNA strand hybridized to a DNA strand. The polynucleotide may be any synthetic nucleic acid known in the art, such as peptide nucleic acid (PNA), glycerol nucleic acid (GNA), threose nucleic acid (TNA), locked nucleic acid (LNA) or other synthetic polymers having nucleotide side chains. The PNA backbone is composed of repeating N-(2-aminoethyl)-glycine units linked by peptide bonds. The GNA backbone is composed of repeating glycol units linked by phosphodiester bonds. The TNA backbone is composed of repeating threose sugars linked by phosphodiester bonds. LNA is formed from ribonucleotides as described above, with an additional bridge connecting the 2' oxygen and the 4' carbon of the ribose moiety. Bridged nucleic acids (BNAs) are modified RNA nucleotides; they may also be referred to as constrained or inaccessible RNA. BNA monomers may contain a 5-, 6- or even 7-membered bridged structure and have a "fixed" C3'-endo sugar puckering. The bridge is synthetically introduced at the 2',4'-position of the ribose to produce a 2',4'-BNA monomer.
The polynucleotide is most preferably ribonucleic acid (RNA) or deoxyribonucleic acid (DNA). The polynucleotide may be of any length. For example, the polynucleotide may be at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 400, or at least 500 nucleotides or nucleotide pairs in length. The polynucleotide may be 1000 or more nucleotides or nucleotide pairs, 5000 or more nucleotides or nucleotide pairs or 100000 or more nucleotides or nucleotide pairs in length.
Any number of polynucleotides may be studied. For example, the methods of the embodiments can involve characterizing 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 50, 100, or more polynucleotides. If two or more polynucleotides are characterized, they may be different polynucleotides or multiple instances of the same polynucleotide.
Polynucleotides may be naturally occurring or artificially synthesized. For example, the method can be used to verify the sequence of the prepared oligonucleotides. The method is typically performed in vitro.
In the context of the present disclosure, the term "amino acid" is used in its broadest sense and is meant to include organic compounds containing amine (NH2) and carboxyl (COOH) functional groups, along with a side chain (e.g., an R group) specific to each amino acid. In some embodiments, amino acids refer to naturally occurring L α-amino acids or residues. Commonly used single- and three-letter abbreviations for naturally occurring amino acids are used herein: A = Ala; C = Cys; D = Asp; E = Glu; F = Phe; G = Gly; H = His; I = Ile; K = Lys; L = Leu; M = Met; N = Asn; P = Pro; Q = Gln; R = Arg; S = Ser; T = Thr; V = Val; W = Trp; and Y = Tyr (Lehninger, A.L. (1975) Biochemistry, 2nd edition, pages 71-92, Worth Publishers, New York). The generic term "amino acid" also includes D-amino acids, retro-inverso amino acids, chemically modified amino acids (such as amino acid analogs), naturally occurring amino acids that are not normally incorporated into proteins (such as norleucine), and chemically synthesized compounds (such as β-amino acids) that have properties known in the art to be characteristic of amino acids. For example, included within the definition of amino acid are analogs or mimetics of phenylalanine or proline that allow the same conformational restriction of a peptide compound as native Phe or Pro. Such analogs and mimetics are referred to herein as "functional equivalents" of the corresponding amino acids. Roberts and Vellaccio, The Peptides: Analysis, Synthesis, Biology, Gross and Meienhofer, eds., Vol. 5, page 341, Academic Press, Inc., N.Y. 1983, which is incorporated herein by reference, lists additional examples of amino acids.
The terms "protein," "polypeptide," and "peptide" are further used interchangeably herein to refer to polymers of amino acid residues as well as variants and synthetic analogs of amino acid residues. Thus, these terms apply to amino acid polymers in which one or more amino acid residues are synthetic non-naturally occurring amino acids, such as chemical analogs of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The polypeptide may also undergo maturation or post-translational modification processes, which may include, but are not limited to: glycosylation, proteolytic cleavage, lipidation, signal peptide cleavage, propeptide cleavage, phosphorylation, etc.
"Homologues" of a protein encompass peptides, oligopeptides, polypeptides, proteins and enzymes having amino acid substitutions, deletions and/or insertions relative to the unmodified or wild-type protein in question and having similar biological and functional activity as the unmodified protein from which they are derived. As used herein, the term "amino acid identity" refers to the degree to which sequences are identical on an amino acid-by-amino acid basis over a comparison window. Thus, the "percent sequence identity" is calculated as follows: the two optimally aligned sequences are compared over a comparison window; the number of positions at which the same amino acid residue (e.g., Ala, Pro, Ser, Thr, Gly, Val, Leu, Ile, Phe, Tyr, Trp, Lys, Arg, His, Asp, Glu, Asn, Gln, Cys, and Met) occurs in both sequences is determined to yield the number of matched positions; the number of matched positions is divided by the total number of positions in the comparison window (i.e., the window size); and the result is multiplied by 100 to yield the percentage of sequence identity.
Sequence identity may also be a fragment or portion of a full-length polynucleotide or polypeptide. Thus, a sequence may have only 50% overall sequence identity to a full-length reference sequence, but the sequence of a particular region, domain or subunit may have 80%, 90% or up to 99% sequence identity to a reference sequence.
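The percent identity calculation described above, including its application to a sub-region of the alignment, can be sketched as follows. The function name `percent_identity`, the use of "-" as a gap character and the two short aligned sequences are assumptions for illustration; the sketch also presumes the sequences have already been optimally aligned by some other tool.

```python
from typing import Optional


def percent_identity(aligned_a: str, aligned_b: str,
                     start: int = 0, end: Optional[int] = None) -> float:
    """Percent identity of two pre-aligned sequences over the comparison
    window [start, end). Gap characters ('-') never count as matches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    window_a = aligned_a[start:end]
    window_b = aligned_b[start:end]
    if not window_a:
        raise ValueError("comparison window is empty")
    matches = sum(1 for x, y in zip(window_a, window_b) if x == y and x != "-")
    # matched positions / window size * 100
    return 100.0 * matches / len(window_a)


# Hypothetical aligned sequences, ten positions long.
seq_a = "MKTAYIAKQR"
seq_b = "MKSAYLAKHR"
print(percent_identity(seq_a, seq_b))            # whole window
print(percent_identity(seq_a, seq_b, 0, 5))      # first five positions only
```

Restricting the window to a region shows how a particular domain can have higher identity to the reference than the full-length sequence does.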
The term "wild-type" refers to a gene or gene product isolated from a naturally occurring source. The wild-type gene is the gene most commonly observed in a population and is therefore arbitrarily designated as the "normal" or "wild-type" form of the gene. Conversely, the term "modified," "mutation," or "variant" refers to a gene or gene product that exhibits sequence modification (e.g., substitution, truncation, or insertion), post-translational modification, and/or a functional property (e.g., altered characteristics) as compared to the wild-type gene or gene product. Note that naturally occurring mutants may be isolated; these mutants are identified by the fact that they have altered characteristics compared to the wild-type gene or gene product. Methods for introducing or substituting naturally occurring amino acids are well known in the art. For example, methionine (M) can be replaced with arginine (R) by replacing the codon for methionine (ATG) with the codon for arginine (CGT) at the relevant position in the polynucleotide encoding the mutated monomer. Methods of introducing or substituting non-naturally occurring amino acids are also well known in the art. For example, non-naturally occurring amino acids can be introduced by including synthetic aminoacyl-tRNAs in the IVTT system used to express the mutant monomers. Alternatively, non-naturally occurring amino acids may be introduced by expressing the mutated monomers in Gulbenkiania indica that is auxotrophic for particular amino acids, in the presence of synthetic (i.e., non-naturally occurring) analogs of those particular amino acids. If the mutated monomers are generated using partial peptide synthesis, they may also be generated by native ligation. Conservative substitutions replace amino acids with other amino acids having similar chemical structures, similar chemical properties, or similar side chain volumes.
The amino acids introduced may have a polarity, hydrophilicity, hydrophobicity, basicity, acidity, neutrality or charge similar to the amino acids they replace. Alternatively, a conservative substitution may introduce another aromatic or aliphatic amino acid in place of a pre-existing aromatic or aliphatic amino acid. Conservative amino acid changes are well known in the art and may be selected based on the properties of the 20 major amino acids defined in Table 1 below. In the case of amino acids with similar polarity, this can also be determined with reference to the hydrophilicity scale for the amino acid side chains in Table 2.
TABLE 1 Chemical properties of amino acids
[Table 1 appears only as images in the source: Figure BDA0003299348360000151 and Figure BDA0003299348360000161]
TABLE 2 hydrophilicity Scale
Side chains Hydrophilicity
Ile,I 4.5
Val,V 4.2
Leu,L 3.8
Phe,F 2.8
Cys,C 2.5
Met,M 1.9
Ala,A 1.8
Gly,G -0.4
Thr,T -0.7
Ser,S -0.8
Trp,W -0.9
Tyr,Y -1.3
Pro,P -1.6
His,H -3.2
Glu,E -3.5
Gln,Q -3.5
Asp,D -3.5
Asn,N -3.5
Lys,K -3.9
Arg,R -4.5
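As an illustration of how Table 2 might be consulted when choosing a substitution, the sketch below compares the scale values of the original and replacement residues. The values are copied from Table 2; the function names and the cutoff of 1.0 are hypothetical choices made only for this example and are not part of the disclosure.

```python
# Side-chain scale values exactly as listed in Table 2.
TABLE_2_SCALE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def scale_difference(original: str, replacement: str) -> float:
    """Absolute difference between the Table 2 values of two residues."""
    return abs(TABLE_2_SCALE[original] - TABLE_2_SCALE[replacement])


def is_similar(original: str, replacement: str, cutoff: float = 1.0) -> bool:
    """True if the two residues lie within `cutoff` of each other on the scale.

    The cutoff is an assumption for illustration, not a value from the text.
    """
    return scale_difference(original, replacement) <= cutoff


lys_to_arg = is_similar("K", "R")  # both strongly negative on the scale
asp_to_ala = is_similar("D", "A")  # opposite ends of the scale
```

A comparison of this kind addresses polarity only; charge, aromaticity and side chain volume, which the text also lists as criteria, would need to be checked separately.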
It is well known that conservative substitutions of amino acids with similar properties to each other, as shown in Table 3, do not generally affect the activity of the peptide sequence.
TABLE 3 Conservative amino acid substitutions
[Table 3 appears only as an image in the source: Figure BDA0003299348360000171]
The mutated or modified protein, monomer or peptide may also be chemically modified at any site in any manner. The mutated or modified monomer or peptide is preferably chemically modified by attachment of the molecule to one or more cysteines (cysteine linkage), attachment of the molecule to one or more lysines, attachment of the molecule to one or more unnatural amino acids, enzymatic modification of epitopes or modification of termini. Suitable methods for making such modifications are well known in the art. Mutants of modified proteins, monomers or peptides may be chemically modified by attachment of any molecule. For example, mutants of modified proteins, monomers or peptides can be chemically modified by attachment of dyes or fluorophores. In some embodiments, the mutant or modified monomer or peptide is chemically modified with a molecular adaptor that facilitates interaction between the pore comprising the monomer or peptide and the target nucleotide or target polynucleotide sequence. The molecular adaptor is preferably a cyclic molecule, a cyclodextrin, a substance capable of hybridizing, a DNA binding agent or intercalator, a peptide or peptide analogue, a synthetic polymer, an aromatic planar molecule, a positively charged small molecule or a small molecule capable of hydrogen bonding.
The presence of the adaptor improves the host-guest chemistry of the pore and the nucleotide or polynucleotide sequence, thereby improving the sequencing capability of the pore formed by the mutated monomer. The principles of host-guest chemistry are well known in the art. The adaptor has an effect on the physical or chemical properties of the pore, which improves the interaction of the pore with the nucleotide or polynucleotide sequence. The adaptor may alter the charge of the barrel or channel of the pore, or specifically interact or bind with a nucleotide or polynucleotide sequence, thereby facilitating its interaction with the pore.
A "protein pore" is a transmembrane protein structure that defines a channel or pore allowing molecules and ions to translocate from one side of the membrane to the other. The translocation of ionic species through the pore may be driven by a potential difference applied to either side of the pore. A "nanopore" is a protein pore in which the smallest diameter of the channel through which a molecule or ion passes is on the order of nanometers (10^-9 meters). In some embodiments, the protein pore may be a transmembrane protein pore. The transmembrane protein structure of a protein pore may be monomeric or oligomeric in nature. Typically, the pore comprises a plurality of polypeptide subunits arranged around a central axis, thereby forming a protein-lined channel extending substantially perpendicular to the membrane in which the nanopore resides. The number of polypeptide subunits is not limited. Typically, the number of subunits is from 5 to 30, suitably from 6 to 10. Alternatively, the number of subunits is not defined, as in the case of perfringolysin or related large membrane pores. The portion of the protein subunit within the nanopore that forms the protein-lined channel typically comprises a secondary structural motif that may include one or more transmembrane β-barrel and/or α-helical portions.
In one embodiment, the protein pore comprises one or more porin monomers. Each porin monomer may be from Gulbenkiania indica. In one embodiment, the protein pore comprises a mutant of one or more porin monomers (i.e., one or more mutated porin monomers).
In one embodiment, the porin is derived from a wild-type protein found in nature, a wild-type homologue, or a mutant thereof. The mutant may be a modified porin or a porin mutant. Modifications in the mutants include, but are not limited to, any one or more of the modifications disclosed herein or combinations of such modifications. In one embodiment, the naturally occurring wild-type protein is a protein from Gulbenkiania indica. In one embodiment, the naturally occurring wild-type protein is a protein from Gulbenkiania indica (Gene: Ga0061063_ 1194).
In one embodiment, a porin homologue refers to a polypeptide having at least 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, or 50% complete sequence identity to the protein set forth in SEQ ID NO: 1.
In one embodiment, a porin homologue refers to a polynucleotide having at least 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, or 50% complete sequence identity to the polynucleotide set forth in SEQ ID NO: 2, which encodes the protein. The polynucleotide sequence may comprise a sequence that differs from SEQ ID NO: 2 based on the degeneracy of the genetic code.
Polynucleotide sequences can be derived and replicated using methods standard in the art. Chromosomal DNA encoding the wild-type porin can be extracted from the pore-producing organism, e.g., Gulbenkiania indica. The gene encoding the pore subunit can be amplified using PCR with specific primers. The amplified sequence may then be subjected to site-directed mutagenesis. Suitable methods for site-directed mutagenesis are known in the art and include, for example, the combined chain reaction. Polynucleotides encoding the constructs of the embodiments can be prepared using techniques well known in the art, such as those described in Sambrook, J. and Russell, D. (2001). Molecular Cloning: A Laboratory Manual, 3rd edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
The resulting polynucleotide sequence may then be integrated into a recombinant replicable vector, such as a cloning vector. The vectors can be used to replicate the polynucleotides in compatible host cells. Thus a polynucleotide sequence may be prepared by introducing the polynucleotide into a replicable vector, introducing the vector into a compatible host cell, and growing the host cell under conditions which cause the vector to replicate. The vector may be recovered from the host cell.
Fundamental working principle of nanopore or protein pore
In one embodiment, within the electrolyte-filled chamber 100, an insulating film 102 with a nanoscale pore divides the chamber into two compartments, as shown in FIG. 1. When a voltage is applied across the electrolyte chamber, ions or other small-molecule species pass through the pore under the force of the electric field, creating a stable, detectable ionic current. Different types of biomolecules can be detected by controlling the size and surface characteristics of the nanopore, the applied voltage and the solution conditions.
Because the four bases that make up DNA, adenine (A), guanine (G), cytosine (C) and thymine (T), differ in molecular structure and size, when single-stranded DNA (ssDNA) passes through a nanoscale pore or protein pore under the drive of a rate-controlling enzyme and an electric field, the chemical differences between the bases produce current changes of different amplitude, from which the sequence information of the nucleic acid being detected, such as DNA, is obtained.
FIG. 2 shows a schematic 200 of DNA sequencing. As shown in FIG. 2, in a typical nanopore/protein pore sequencing experiment, the nanopore is the only channel through which ions can pass between the two sides of the phospholipid membrane. Rate-controlling proteins, such as polynucleotide binding proteins, act as motor proteins for nucleic acid molecules such as DNA, pulling the DNA strand through the nanopore/protein pore in single-nucleotide steps. Whenever a nucleotide passes through the nanopore/protein pore, the corresponding pore blocking signal is recorded (FIG. 3). By analyzing the current signals associated with these sequences with a corresponding algorithm, the sequence information of nucleic acid molecules such as DNA can be inferred.
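To make the idea of a recorded pore blocking signal concrete, the following is a deliberately simplified sketch that finds runs of samples in which the current falls below a fraction of the open-pore current and reports the duration and mean current of each run. It is not the analysis algorithm of the embodiments; the function name, the threshold fraction, the sampling interval and the trace values are all hypothetical.

```python
def find_blockades(current_pa, open_pore_pa, threshold_fraction=0.8, dt_ms=1.0):
    """Return (start_index, dwell_ms, mean_current_pa) for each blockade.

    A blockade is a run of consecutive samples below
    open_pore_pa * threshold_fraction (an assumed, illustrative criterion).
    """
    threshold = open_pore_pa * threshold_fraction
    events = []
    start = None
    # A trailing open-pore sample acts as a sentinel that closes a final event.
    for i, value in enumerate(list(current_pa) + [open_pore_pa]):
        if value < threshold and start is None:
            start = i  # blockade begins
        elif value >= threshold and start is not None:
            segment = current_pa[start:i]
            events.append((start, len(segment) * dt_ms, sum(segment) / len(segment)))
            start = None  # blockade ends
    return events


# Hypothetical trace in pA: open pore near 100 pA with two blockades.
trace = [100, 100, 60, 60, 60, 100, 40, 40, 100]
events = find_blockades(trace, open_pore_pa=100)
```

Each event carries a dwell time and a mean current, the two quantities the description later relies on for distinguishing nucleotides. Practical signal analysis would additionally need noise filtering and a model relating current levels to sequence.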
In the embodiments, porins are screened from different species in nature (mainly bacteria and archaea) using bioinformatics and an evolutionary perspective. In one embodiment, the porin is from any organism, preferably from Gulbenkiania indica. Sequence analysis shows that the porin has a complete functional domain. A 3D structural model of the porin is predicted and analyzed using structural biology methods, and a channel protein with a suitable reading-head architecture is selected. The candidate channel protein (or porin) is then modified, tested and optimized by genetic engineering, directed protein evolution, computer-aided protein design and the like; after several iterations, a plurality of homologous protein mutants are obtained, preferably two homologous protein mutants (with different homologous protein backbones) having different signal characteristics and signal distribution patterns.
The porins of the embodiments are applicable to fourth-generation sequencing technologies. In one embodiment, the porin is a nanopore. In one embodiment, the porin may be applied to solid-state pores for sequencing.
In one embodiment, a new protein backbone is employed, forming a new constriction zone (read head zone) structure, thereby providing a completely new mode of action during sequencing. The porins of the examples have good skip-edge distribution and efficiency of recombination with phospholipid membranes.
In one embodiment, a wild-type porin monomer is modified by genetic mutation to form a mutant of the porin monomer. In one embodiment, the amino acid sequence of the mutant of the porin monomer comprises the sequence set forth in SEQ ID NO: 1 or a sequence having at least 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, or 50% identity thereto, and the amino acid sequence of the mutant of the porin monomer has a mutation at one or more positions corresponding to K67, D71, S72, and Y74 of SEQ ID NO: 1.
In one embodiment, the mutation comprises an insertion, deletion and/or substitution of an amino acid. In one embodiment, the mutation at one or more positions K67, D71, S72, and Y74 of SEQ ID NO. 1 is an amino acid insertion, deletion, and/or substitution at one or more positions K67, D71, S72, and Y74 of SEQ ID NO. 1.
In one embodiment, the amino acid sequence of the mutant of the porin monomer comprises amino acids corresponding to one or more of the following position ranges of SEQ ID NO: 1: (1) 62-209, (2) 62-74, (3) 62-75, (4) 67-209, (5) 67-75, or (6) 67-74.
In one embodiment, the amino acids of the mutant of the porin monomer corresponding to one or more of the position ranges (1) 62-209, (2) 62-74, (3) 62-75, (4) 67-209, (5) 67-75, or (6) 67-74 of SEQ ID NO: 1 have amino acid insertions, deletions, and/or substitutions.
In one embodiment, the mutant of the porin monomer has mutations only at the positions corresponding to Q62, K67, D71, S72, and Y74 of SEQ ID NO: 1, or has amino acid insertions, deletions, and/or substitutions at one or more of these positions.
In one embodiment, the mutant of the porin monomer has mutations only at the positions corresponding to Q62, K67, D71, S72, Y74, E110, E119, E126, and K209 of SEQ ID NO: 1, or has amino acid insertions, deletions, and/or substitutions at one or more of these positions.
In one embodiment, the mutant of the porin monomer has mutations only at the positions corresponding to K67, D71, S72, Y74, and S75 of SEQ ID NO: 1, or has amino acid insertions, deletions, and/or substitutions at one or more of these positions.
In one embodiment, the mutant of the porin monomer has mutations only at the positions corresponding to K67, T69, A70, D71, S72, S73, and Y74 of SEQ ID NO: 1, or has amino acid insertions, deletions, and/or substitutions at one or more of these positions.
"At one or more positions" refers to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 … …, or up to all positions. For example, for 5 amino acid positions, "one or more positions" means 1, 2, 3, 4 or 5 of those positions.
In one embodiment, a position "corresponding to" SEQ ID NO: 1 is identified using the numbering of SEQ ID NO: 1, regardless of whether amino acid insertions or deletions, or the use of a sequence having the stated identity, change the actual numbering of the sequence. For example, Q62 of SEQ ID NO: 1 may be mutated to give Q62L; even if the numbering of SEQ ID NO: 1 is changed, or a sequence having identity with SEQ ID NO: 1 as defined herein is used, mutating the amino acid Q that corresponds to position 62 of SEQ ID NO: 1 to L (even if that amino acid is not at position 62 of the other sequence) still falls within the scope of the present invention.
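One way to picture this correspondence is through a pairwise alignment: the reference numbering is walked along the alignment, and the residue of the other sequence found in the same column is the corresponding one. The sketch below assumes a pre-computed alignment with "-" as the gap character; the function name and the toy sequences are hypothetical and stand in for SEQ ID NO: 1 and a homologous sequence.

```python
def map_reference_position(aligned_ref, aligned_other, ref_pos):
    """Map a 1-based reference position to the 1-based position in the other sequence.

    Returns None if the reference residue is aligned to a gap (i.e., deleted).
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have the same length")
    ref_count = 0
    other_count = 0
    for ref_res, other_res in zip(aligned_ref, aligned_other):
        if ref_res != "-":
            ref_count += 1
        if other_res != "-":
            other_count += 1
        if ref_res != "-" and ref_count == ref_pos:
            return other_count if other_res != "-" else None
    raise ValueError("ref_pos lies beyond the end of the reference sequence")


# Toy case: the other sequence has two extra N-terminal residues, so the Q at
# reference position 3 sits at position 5 of the other sequence.
shifted = map_reference_position("--ACQDE", "MGACQDE", 3)
# Toy case: the residue at reference position 3 is deleted in the other sequence.
deleted = map_reference_position("ACQDE", "AC-DE", 3)
```

In the first toy case the residue would still be described using the reference numbering (position 3), even though it occupies position 5 of the longer sequence.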
In one embodiment, the amino acid sequence of the mutant of the porin monomer consists of the sequence set forth in SEQ ID NO: 1, or a sequence having at least 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, or 50% identity thereto, and the amino acid sequence of the mutant of the porin monomer has mutations at one or more positions corresponding to K67, D71, S72, and Y74 of SEQ ID NO: 1.
In one embodiment, the SEQ ID NO: 1 sequence of the porin monomer is from Gulbenkiania indica. The nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 1 is SEQ ID NO: 2.
In one embodiment, the Q62 mutation corresponding to SEQ ID NO: 1 is 0 to 5 of G, A, V, L, I; the K67 mutation is 0 to 3 of R, H, K; the D71 mutation is 0 to 4 of N, E, D, Q; the S72 mutation is 0 to 1 of P; the Y74 mutation is 0 to 5 of S, C, U, T, M.
In one embodiment, the Q62 mutation corresponding to SEQ ID NO: 1 is 0 to 5 of G, A, V, L, I; the K67 mutation is 0 to 3 of R, H, K; the D71 mutation is 0 to 4 of N, E, D, Q; the S72 mutation is 0 to 1 of P; the Y74 mutation is 0 to 3 of F, Y, W; the E110 mutation is 0 to 4 of N, D, E, Q; the E119 mutation is 0 to 4 of N, D, E, Q; the E126 mutation is 0 to 4 of N, D, E, Q; the K209 mutation is 0 to 1 of P.
In one embodiment, the K67 mutation corresponding to SEQ ID NO: 1 is 0 to 3 of R, H, K; the D71 mutation is 0 to 5 of G, A, V, L, I; the S72 mutation is 0 to 1 of P; the Y74 mutation is 0 to 3 of F, Y, W; the S75 mutation is 0 to 5 of C, U, S, T, M.
In one embodiment, the K67 mutation corresponding to SEQ ID NO: 1 is 0 to 3 of R, H, K; the T69 mutation is 0 to 5 of S, C, T, U, M; the A70 mutation is 0 to 1 of P; the D71 mutation is 0 to 5 of G, A, V, L, I; the S72 mutation is 0 to 3 of F, Y, W; the S73 mutation is 0 to 5 of G, A, V, L, I; the Y74 mutation is 0 to 3 of F, Y, W.
In one embodiment, the amino acid mutations of the mutant of the porin monomer are selected from the group consisting of:
(a) Q62L, K67R, D71N, S72P and Y74T corresponding to SEQ ID NO: 1;
(b) Q62L, K67R, D71N, S72P, Y74 deletion, E110N, E119N, E126N, and K209P corresponding to SEQ ID NO: 1;
(c) K67R, D71A, S72P, Y74 deletion, and S75 deletion corresponding to SEQ ID NO: 1; and
(d) K67R, T69S, A70P, D71A, S72Y, S73A, and Y74 deletion corresponding to SEQ ID NO: 1.
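Substitution labels of the form "wild-type residue, position, new residue" can be applied to a reference sequence mechanically, as the sketch below shows. It is an illustration only: the reference is a short hypothetical peptide rather than SEQ ID NO: 1 (which is not reproduced in this section), and the "del" suffix is a shorthand introduced here for a deletion, not notation used in the text.

```python
import re
from typing import List

# Label format assumed for this sketch: e.g. "K2R" (substitution) or "Y6del" (deletion).
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z]|del)$")


def apply_mutations(reference: str, mutations: List[str]) -> str:
    """Apply substitutions and deletions given in reference numbering."""
    residues = list(reference)
    for label in mutations:
        match = _MUTATION.match(label)
        if match is None:
            raise ValueError(f"unrecognized mutation label: {label}")
        wild_type, position, change = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range in {label}")
        # Check that the label names the residue actually present in the reference.
        if reference[position - 1] != wild_type:
            raise ValueError(f"{label}: reference has {reference[position - 1]} at {position}")
        # Deletions leave an empty slot so later labels keep reference numbering.
        residues[position - 1] = "" if change == "del" else change
    return "".join(residues)


# Hypothetical six-residue reference, not taken from the sequence listing.
toy_reference = "MKDSAY"
toy_mutant = apply_mutations(toy_reference, ["K2R", "D3N", "Y6del"])
```

Keeping deleted positions as empty slots until the end means every label is interpreted in the numbering of the reference, which matches the convention that positions are always stated relative to SEQ ID NO: 1.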
In one embodiment, the amino acid sequence of the mutant of the porin monomer comprises or consists of SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, or SEQ ID NO 19.
In one embodiment, the protein pore comprises a mutant of at least one porin monomer (or porin mutated monomer). In one embodiment, the protein pore comprises mutants of at least two, three, four, five, six, seven, eight, nine or ten or more porin monomers. In one embodiment, the protein pore comprises at least two mutants of porin monomers, which may be the same or different. In one embodiment, the protein pore comprises mutants of two or more porin monomers, preferably the mutants of the two or more monomers are identical. In one embodiment, the protein pore comprises mutants of nine porin monomers. In one embodiment, the constriction zone pore diameter of the protein pore is 0.7 nm to 2.2 nm, 0.9 nm to 1.6 nm, 1.4 nm to 1.6 nm, or a further value that appears only as an image in the source (Figure BDA0003299348360000241).
Embodiments provide the use of a mutant of a porin monomer, or of a protein pore comprising the same, for detecting the presence, absence or one or more characteristics of a target analyte. In one embodiment, mutants of porin monomers or protein pores are used to detect the sequence of a nucleic acid molecule, or to characterize a polynucleotide sequence, e.g., to sequence a polynucleotide, as they can distinguish different nucleotides with high sensitivity. Mutants of porin monomers or protein pores comprising them can distinguish between the four nucleotides in DNA and RNA, and even between methylated and unmethylated nucleotides, with unexpectedly high resolution. Mutants of porin monomers or protein pores show almost complete separation of all four DNA/RNA nucleotides. Deoxycytidine monophosphate (dCMP) and methyl-dCMP are further distinguished based on the residence time in the protein pore and the current flowing through the protein pore.
Mutants of porin monomers or protein pores can also distinguish between different nucleotides under a range of conditions. In particular, mutants of the porin monomers or protein pores distinguish nucleotides under conditions that are favorable for nucleic acid characterization such as sequencing. By varying the applied potential, salt concentration, buffer, temperature and the presence of additives such as urea, betaine and DTT, the extent to which mutants of porin monomers or protein pores distinguish between different nucleotides can be controlled. This allows the function of the mutants of porin monomers or protein pores to be finely tuned, especially in sequencing. Mutants of porin monomers or protein pores may also be used to identify polynucleotide polymers by interaction with one or more monomers rather than on a nucleotide-by-nucleotide basis.
A mutant of a porin monomer or a protein pore may be isolated, substantially isolated, purified, or substantially purified. The mutant of the porin monomer or protein pore of the examples is isolated or purified if it is completely free of any other components, such as liposomes or other protein pores/porins. A mutant of a porin monomer or protein pore is substantially isolated if it is mixed with a carrier or diluent that does not interfere with its intended use. For example, a mutant of a porin monomer or a protein pore is substantially isolated or substantially purified if it is present in a form comprising less than 10%, less than 5%, less than 2%, or less than 1% of other components such as triblock copolymers, liposomes, or other protein pores/porins. Alternatively, a mutant of a porin monomer or a protein pore may be present in the membrane.
For example, the membrane is preferably an amphiphilic layer. The amphiphilic layer is a layer formed of amphiphilic molecules, for example phospholipids, which have both hydrophilic and lipophilic properties. The amphiphilic molecules may be synthetic or naturally occurring. The amphiphilic layer may be a monolayer or a bilayer. The amphiphilic layer is generally planar. The amphiphilic layer may be curved. The amphiphilic layer may be supported. The membrane may be a lipid bilayer. The lipid bilayer is formed by two opposing layers of lipid. The two layers of lipids are arranged such that their hydrophobic tail groups face each other to form a hydrophobic interior. The hydrophilic head groups of the lipids face outward toward the aqueous environment on each side of the bilayer. The membrane may include a solid-state layer. The solid-state layer may be formed of organic and inorganic materials. If the membrane comprises a solid-state layer, the pores are typically present in the amphiphilic membrane or in a layer comprised within the solid-state layer, e.g. in holes, wells, gaps, channels, trenches or slits within the solid-state layer.
Characterization of analytes
Embodiments provide a method of determining the presence, absence or one or more characteristics of a target analyte. The method involves contacting the target analyte with a mutant of a porin monomer or a protein pore such that the target analyte moves relative to, e.g., passes through, the mutant of the porin monomer or the protein pore, and taking one or more measurements while the target analyte moves relative to the mutant of the porin monomer or the protein pore, thereby determining the presence, absence, or one or more characteristics of the target analyte. The target analyte may also be referred to as a template analyte or analyte of interest.
The target analyte is preferably a polysaccharide, metal ion, inorganic salt, polymer, amino acid, peptide, polypeptide, protein, nucleotide, oligonucleotide, polynucleotide, dye, drug, diagnostic agent, explosive or environmental contaminant. The methods can involve determining the presence, absence, or one or more characteristics of two or more target analytes of the same class, e.g., two or more proteins, two or more nucleotides, or two or more drugs. Alternatively, the method may involve determining the presence, absence or one or more characteristics of two or more different classes of target analytes, e.g., one or more proteins, one or more nucleotides and one or more drugs.
The method comprises contacting the target analyte with a mutant of a porin monomer or a protein pore such that the target analyte moves through the mutant of the porin monomer or the protein pore. The protein pore typically comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 mutated porin monomers, e.g., 7, 8, 9 or 10 monomers. The protein pore comprises identical monomers or different porin monomers, preferably 8 or 9 identical monomers. One or more of the monomers, for example 2, 3, 4, 5, 6, 7, 8, 9 or 10, are preferably chemically modified as discussed above. In one embodiment, the amino acid sequence of each monomer comprises SEQ ID NO: 1 or one of the above-described mutants thereof. In one embodiment, the amino acid sequence of each monomer consists of SEQ ID NO: 1 or one of the above-described mutants thereof.
The methods of the embodiments can measure two, three, four, or five or more characteristics of the polynucleotide. The one or more characteristics are preferably selected from (i) the length of the polynucleotide, (ii) the identity of the polynucleotide, (iii) the sequence of the polynucleotide, (iv) the secondary structure of the polynucleotide, and (v) whether or not the polynucleotide is modified. In one embodiment, any combination of (i) to (v) may be measured.
For (i), the length of the polynucleotide can be measured, for example, by determining the number of interactions or the duration of interactions between the polynucleotide and the mutant of the porin monomer or the protein pore.
For (ii), the identity of the polynucleotide may be measured in a number of ways, which may be measured in conjunction with or without measurement of the polynucleotide sequence. The former is simpler; the polynucleotide is sequenced and then identified. The latter can be done in several different ways. For example, the presence of a particular motif in a polynucleotide can be measured (without measuring the remaining sequence of the polynucleotide). Alternatively, measurement of a particular electrical and/or optical signal in the method may identify the polynucleotide as being from a particular source.
For (iii), the sequence of the polynucleotide may be determined as previously described. Suitable sequencing methods, particularly those using electrical measurement methods, are described in Stoddart D et al., Proc Natl Acad Sci, 12;106(19):7702-7, Lieberman KR et al., J Am Chem Soc. 2010;132(50):17961-72, and in international application WO 2000/28312.
For (iv), secondary structure can be measured using a variety of methods. For example, if the method involves an electrical measurement method, the secondary structure may be measured using changes in dwell time or changes in the current flowing through the pore. This allows regions of single-stranded and double-stranded polynucleotide to be distinguished.
For (v), the presence or absence of any modification can be measured. The method preferably comprises determining whether the polynucleotide has been modified by methylation, by oxidation, by damage, by one or more proteins, by one or more labels or tags, or by the absence of nucleobases and sugars. Specific modifications will result in specific interactions with the pore, which can be measured using the methods described below. For example, methylcytosine can be distinguished from cytosine based on the current flowing through the pore during its interaction with each nucleotide.
The target polynucleotide is contacted with a mutant of a porin monomer or a protein pore, for example a mutant of a porin monomer or a protein pore of the embodiments. The mutant of the porin monomer or the protein pore is usually present in a membrane. Suitable membranes are as described hereinbefore. The method may be carried out using any device suitable for studying membrane/pore systems in which a mutant of a porin monomer or a protein pore is present in a membrane. The method may be performed using any device suitable for transmembrane pore sensing. For example, the device comprises a chamber containing an aqueous solution and a barrier dividing the chamber into two parts. The barrier typically has an aperture in which a membrane containing the pore is formed. Alternatively, the barrier forms the membrane in which the mutant of the porin monomer or the protein pore is present. The method may be carried out using the apparatus described in International application No. PCT/GB08/000562 (WO 2008/102120).
Various different types of measurements may be made. These include, but are not limited to, electrical and optical measurements. Electrical measurements include voltage measurements, capacitance measurements, current measurements, impedance measurements, tunneling measurements (Ivanov AP et al., Nano Lett. 2011 Jan 12;11(1):279-85) and FET measurements (international application WO 2005/124888). Optical measurements can be combined with electrical measurements (Soni GV et al., Rev Sci Instrum. 2010 Jan;81(1):014301). The measurement may be a transmembrane current measurement, for example a measurement of the ionic current flowing through the pore. In one embodiment, conventional electrical or optical measurement methods may be employed.
Electrical measurements can be made using standard single-channel recording equipment as described in Stoddart D et al., Proc Natl Acad Sci, 12;106(19):7702-7, Lieberman KR et al., J Am Chem Soc. 2010;132(50):17961-72, and international application WO 2000/28312. Alternatively, the electrical measurements can be made using a multi-channel system, for example as described in international application WO 2009/077734 and international application WO 2011/067559.
The method is preferably carried out using an electrical potential applied across the membrane. The applied potential may be a voltage potential. Alternatively, the applied potential may be a chemical potential. An example of this is the use of a salt gradient across the membrane, for example across an amphiphilic molecule layer. Salt gradients are disclosed in Holden et al., J Am Chem Soc. 2007 Jul 11;129(27):8650-5. In some cases, the current flowing through the mutant of the porin monomer or the protein pore as the polynucleotide moves relative to it is used to estimate or determine the sequence of the polynucleotide. This is strand sequencing.
The method may comprise measuring the current flowing through the pore as the polynucleotide moves relative to the pore. The apparatus used in the method may therefore also comprise a circuit capable of applying an electrical potential and measuring an electrical signal across the membrane and the pores. The method may be performed using patch-clamping or voltage-clamping.
The method may comprise measuring the current flowing through the pore as the polynucleotide moves relative to the pore. Suitable conditions for measuring ion flow through a transmembrane protein pore are known in the art and are disclosed in the examples. The method is typically performed by applying a voltage across the membrane and the pore. The voltage used is typically from +5 V to -5 V, for example from +4 V to -4 V, from +3 V to -3 V or from +2 V to -2 V. The voltage used is generally from -600 mV to +600 mV or from -400 mV to +400 mV. The voltage used is preferably in a range having a lower limit selected from -400 mV, -300 mV, -200 mV, -150 mV, -100 mV, -50 mV, -20 mV and 0 mV and an upper limit independently selected from +10 mV, +20 mV, +50 mV, +100 mV, +150 mV, +200 mV, +300 mV and +400 mV. The voltage used is more preferably in the range of 100 mV to 240 mV and most preferably in the range of 120 mV to 220 mV. By using an increased applied potential, the discrimination between different nucleotides by the pore can be increased.
The process is typically carried out in the presence of any charge carrier, for example a metal salt such as an alkali metal salt, or a halide salt such as a chloride salt, for example an alkali metal chloride salt. The charge carrier may comprise an ionic liquid or an organic salt, such as tetramethylammonium chloride, trimethylphenylammonium chloride, phenyltrimethylammonium chloride or 1-ethyl-3-methylimidazolium chloride. In the above exemplary apparatus, the salt is present in an aqueous solution in the chamber. Potassium chloride (KCl), sodium chloride (NaCl), cesium chloride (CsCl) or a mixture of potassium ferrocyanide and potassium ferricyanide is generally used. KCl, NaCl and mixtures of potassium ferrocyanide and potassium ferricyanide are preferred. The charge carriers may be asymmetric across the membrane. For example, the type and/or concentration of charge carriers may be different on each side of the membrane.
The salt concentration may be at saturation. The salt concentration may be 3 M or less, and is usually 0.1 to 2.5 M, 0.3 to 1.9 M, 0.5 to 1.8 M, 0.7 to 1.7 M, 0.9 to 1.6 M, or 1 M to 1.4 M. The salt concentration is preferably 150 mM to 1 M. The method is preferably carried out using a salt concentration of at least 0.3 M, for example at least 0.4 M, at least 0.5 M, at least 0.6 M, at least 0.8 M, at least 1.0 M, at least 1.5 M, at least 2.0 M, at least 2.5 M or at least 3.0 M. High salt concentrations provide a high signal-to-noise ratio and allow the current indicating the presence of a nucleotide to be identified against the background of normal current fluctuations.
The method is typically carried out in the presence of a buffer. In the above exemplary device, the buffer is present in an aqueous solution in the chamber. Any buffer may be used in the methods of the invention. Typically, the buffer is a phosphate buffer. Other suitable buffers are HEPES or Tris-HCl buffers. The process is typically carried out at a pH of 4.0 to 12.0, 4.5 to 10.0, 5.0 to 9.0, 5.5 to 8.8, 6.0 to 8.7, 7.0 to 8.8, or 7.5 to 8.5. The pH used is preferably about 7.5.
The process may be carried out at a temperature of from 0 ℃ to 100 ℃, from 15 ℃ to 95 ℃, from 16 ℃ to 90 ℃, from 17 ℃ to 85 ℃, from 18 ℃ to 80 ℃, from 19 ℃ to 70 ℃ or from 20 ℃ to 60 ℃. The process is typically carried out at room temperature. The process is optionally carried out at a temperature that supports enzyme function, for example about 37 ℃.
In one embodiment, a method for determining the presence, absence or one or more characteristics of a target analyte (e.g., a polynucleotide) comprises coupling the target analyte to a membrane; and the target analyte interacts with (e.g., contacts) the protein pore present in the membrane such that the target analyte moves relative to (e.g., through) the protein pore. In one embodiment, the current through the protein pore is measured as the target analyte moves relative to the protein pore, thereby determining the presence, absence or one or more characteristics (e.g., sequence of polynucleotides) of the target analyte.
Rate-controlling protein
Rate-controlling proteins are proteins that can control (e.g., slow down) the speed of movement of a target analyte (e.g., a polynucleotide) relative to a protein pore, such that the speed permits detection of the presence, absence, or one or more characteristics of the target analyte (e.g., sequencing of the polynucleotide). Protein pores are used in conjunction with rate-controlling proteins to characterize target analytes. In one embodiment, the rate-controlling protein slows the rate at which the polynucleotide passes through the protein pore so that sequencing can be performed. The rate-controlling proteins include the polynucleotide binding proteins described below.
Polynucleotide binding proteins
The characterization methods of the embodiments preferably include contacting the polynucleotide with a polynucleotide binding protein such that the protein controls movement of the polynucleotide relative to, e.g., through, the mutant of the porin monomer/protein pore.
More preferably, the method comprises (a) contacting the polynucleotide with the mutant of the porin monomer/protein pore and the polynucleotide binding protein such that the protein controls movement of the polynucleotide relative to, e.g., through, the mutant of the porin monomer/protein pore, and (b) taking one or more measurements as the polynucleotide moves relative to the mutant of the porin monomer/protein pore, wherein the measurements are indicative of one or more characteristics of the polynucleotide, thereby characterizing the polynucleotide.
More preferably, the method comprises (a) contacting the polynucleotide with the mutant of the porin monomer/protein pore and the polynucleotide binding protein such that the protein controls movement of the polynucleotide relative to, e.g., through, the mutant of the porin monomer/protein pore, and (b) measuring the current through the mutant of the porin monomer/protein pore as the polynucleotide moves relative to it, wherein the current is indicative of one or more characteristics of the polynucleotide, thereby characterizing the polynucleotide.
The polynucleotide binding protein may be any protein that is capable of binding to a polynucleotide and controlling its movement through a pore. Polynucleotide binding proteins typically interact with and modify at least one property of a polynucleotide. The protein may modify the polynucleotide by cleaving it to form individual nucleotides or short strands of nucleotides (e.g., dinucleotides or trinucleotides). The protein may modify the polynucleotide by orienting it or moving it to a specific location, i.e., controlling its movement.
The polynucleotide binding protein is preferably derived from a polynucleotide processing enzyme. A polynucleotide processing enzyme is a polypeptide that is capable of interacting with a polynucleotide and modifying at least one property of the polynucleotide. The enzyme may modify the polynucleotide by cleaving it to form individual nucleotides or short strands of nucleotides (e.g., dinucleotides or trinucleotides). The enzyme may modify the polynucleotide by orienting it or moving it to a specific location. The polynucleotide processing enzyme need not exhibit enzymatic activity as long as it is capable of binding a polynucleotide and controlling its movement through the pore. For example, the enzyme may be modified to remove its enzymatic activity, or may be used under conditions that prevent it from acting as an enzyme.
The polynucleotide processing enzyme is preferably a polymerase, exonuclease, helicase or topoisomerase, e.g., gyrase. In one embodiment, the enzyme is preferably a helicase, such as Hel308Mbu, Hel308Csy, Hel308Tga, Hel308Mhu, TraI Eco, XPD Mbu, Dda, or a variant thereof. Any helicase may be used in the embodiments.
In one embodiment, any number of helicases may be used. For example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more helicases may be used. In some embodiments, a different number of helicases may be used.
The methods of the embodiments preferably comprise contacting the polynucleotide with two or more helicases. The two or more helicases are typically the same helicase. The two or more helicases may be different helicases.
The two or more helicases may be any combination of the helicases described above. The two or more helicases may be two or more Dda helicases. The two or more helicases may be one or more Dda helicases and one or more TrwC helicases. The two or more helicases may be different variants of the same helicase.
The two or more helicases are preferably linked to each other. The two or more helicases are more preferably covalently linked to each other. The helicases may be linked in any order and using any method.
Kit
The invention also provides a kit for characterizing a target analyte (e.g., a target polynucleotide). The kit contains the components of the pores and membranes of the embodiments. The membrane is preferably formed from the components. The pore is preferably present in the membrane. The kit may comprise the components of any of the membranes disclosed above (e.g., an amphiphilic layer or a triblock copolymer membrane). The kit may further comprise a polynucleotide binding protein. Any of the polynucleotide binding proteins discussed above may be used.
In one embodiment, the membrane is an amphiphilic layer, a solid state layer, or a lipid bilayer.
The kit may further comprise one or more anchors for coupling the polynucleotide to the membrane.
The kit is preferably for characterizing double-stranded polynucleotides and preferably comprises a Y adaptor and a hairpin loop adaptor.
The Y adaptor preferably has one or more helicases attached, and the hairpin loop adaptor preferably has one or more molecular brakes attached. The Y adaptor preferably comprises one or more first anchors for coupling the polynucleotide to the membrane, the hairpin loop adaptor preferably comprises one or more second anchors for coupling the polynucleotide to the membrane, and the strength of coupling of the hairpin loop adaptor to the membrane is preferably greater than the strength of coupling of the Y adaptor to the membrane.
The kit may additionally comprise one or more further reagents or instruments enabling the performance of any of the embodiments mentioned above. Such reagents or instruments include one or more of the following: suitable buffers (aqueous solutions), means for obtaining a sample from an individual (e.g., a container or instrument containing a needle), means for amplifying and/or expressing a polynucleotide, or a voltage-clamp or patch-clamp device. The reagents may be present in the kit in a dry form such that a fluid sample resuspends the reagents. The kit may also optionally contain instructions enabling the kit to be used in the methods of the invention, or details regarding which organisms the method may be used for.
Equipment (or device)
The invention also provides a device for characterizing a target analyte (e.g., a target polynucleotide). The device comprises one or more of the mutants of the porin monomer/protein pores, and one or more membranes. The mutant of the porin monomer/protein pore is preferably present in the membrane. The number of pores and membranes is preferably equal. Preferably, there is a single pore in each membrane.
The device preferably further comprises instructions for carrying out the method of the embodiments. The device may be any conventional device for analyte analysis, for example an array or chip. Any of the embodiments discussed in connection with the method of the embodiments is equally applicable to the device. The device may also include any of the features present in the kits described herein. The device used in the embodiments may specifically be the QNome-9604 gene sequencer from Qitan Technology.
The above mentioned prior art is incorporated herein by reference in its entirety.
The following examples are intended to illustrate the invention without limiting it.
Example 1
In this example, the wild-type porin is from Gulbenkiania indica; its amino acid sequence is shown in SEQ ID NO: 1, and the nucleotide sequence encoding it is shown in SEQ ID NO: 2. Mutant 1 of the porin monomer is the wild-type porin with mutations at multiple positions corresponding to SEQ ID NO: 1, specifically Q62L, K67R, D71N, S72P and Y74T. The protein pore comprising mutant 1 of the porin monomer is mutant pore 1. The amino acid sequence of mutant 1 of the porin monomer is shown in SEQ ID NO: 16.
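As a consistency check, the substitutions for mutant 1 (Q62L, K67R, D71N, S72P and Y74T, as read off the SEQ ID NO: 1 and SEQ ID NO: 16 listings) can be applied to wild-type residues 49-80 and compared with the same window of the mutant sequence. The following is a minimal Python sketch of that check. The helper name `apply_substitutions` and the choice of the 49-80 window are illustrative assumptions and are not part of the disclosed method.

```python
import re

# Residues 49-80 in one-letter code, transcribed from the sequence listing.
WT_49_80 = "RGPIVAAVYNFRDQTGQFKPTADSSYSTAVTQ"    # SEQ ID NO: 1 (wild type)
MUT1_49_80 = "RGPIVAAVYNFRDLTGQFRPTANPSTSTAVTQ"  # SEQ ID NO: 16 (mutant 1)
WINDOW_START = 49  # 1-based position of the first residue in each window


def apply_substitutions(window, start, substitutions):
    """Apply substitutions such as 'K67R' to a sequence window.

    `start` is the 1-based position of window[0]. Each substitution is
    checked against the residue actually present before it is applied.
    (Hypothetical helper, for illustration only.)
    """
    residues = list(window)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        idx = pos - start
        if not 0 <= idx < len(residues):
            raise ValueError(f"{sub}: position outside the window")
        if residues[idx] != old:
            raise ValueError(f"{sub}: found {residues[idx]} at position {pos}")
        residues[idx] = new
    return "".join(residues)


MUTANT1_SUBSTITUTIONS = ["Q62L", "K67R", "D71N", "S72P", "Y74T"]
result = apply_substitutions(WT_49_80, WINDOW_START, MUTANT1_SUBSTITUTIONS)
print(result == MUT1_49_80)
```

Checking the original residue before each substitution catches numbering slips, which is useful because deletions (as in mutants 2-4) shift the positions of all downstream residues.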
Example 2
In this example, the wild-type porin is from Gulbenkiania indica; its amino acid sequence is shown in SEQ ID NO: 1, and the nucleotide sequence encoding it is shown in SEQ ID NO: 2. Mutant 2 of the porin monomer is the wild-type porin with mutations at multiple positions corresponding to SEQ ID NO: 1, specifically Q62L, K67R, D N, S P, Y deletion, E110N, E N, E N, and K209P. The protein pore comprising mutant 2 of the porin monomer is mutant pore 2. The amino acid sequence of mutant 2 of the porin monomer is shown in SEQ ID NO: 17.
Example 3
In this example, the wild-type porin is from Gulbenkiania indica; its amino acid sequence is shown in SEQ ID NO: 1, and the nucleotide sequence encoding it is shown in SEQ ID NO: 2. Mutant 3 of the porin monomer is the wild-type porin with mutations at multiple positions corresponding to SEQ ID NO: 1, specifically K67R, D3271A, S P, Y deletion, and S75 deletion. The protein pore comprising mutant 3 of the porin monomer is mutant pore 3. The amino acid sequence of mutant 3 of the porin monomer is shown in SEQ ID NO: 18.
Example 4
In this example, the wild-type porin is from Gulbenkiania indica; its amino acid sequence is shown in SEQ ID NO: 1, and the nucleotide sequence encoding it is shown in SEQ ID NO: 2. Mutant 4 of the porin monomer is the wild-type porin with mutations at multiple positions corresponding to SEQ ID NO: 1, specifically K67R, T S, A P, D A, S Y, S A, and Y74 deletion. The protein pore comprising mutant 4 of the porin monomer is mutant pore 4. The amino acid sequence of mutant 4 of the porin monomer is shown in SEQ ID NO: 19.
Example 5
Homology modeling of the wild-type porin was performed using SWISS-MODEL; the amino acid sequence of the wild-type porin monomer is shown in SEQ ID NO: 1. FIG. 4A is a side view 400 of the predicted protein structure model, in which the darker part shows a single protein monomer 402. FIG. 4B is a top view 404 of the surface structure model, in which the darker part shows a protein monomer 406. FIG. 4C is a ribbon representation 408, in which the darker part shows a protein monomer 410.
FIG. 5 shows the distribution of amino acid residues in the constriction zone of the wild-type channel and the constriction zone diameters. Between the two porin monomers 502 and 504, the maximum diameter of the channel in the constriction zone is given in image BDA0003299348360000361, the next largest diameter in image BDA0003299348360000362, and the minimum diameter in image BDA0003299348360000363 (the numerical values appear only in these images). The amino acids making up the constriction zone structure, i.e. T69, S73 and Y74, are shown in the middle.
FIG. 6A shows a surface potential map of a wild-type channel monomer, in which the depth of color represents the strength of the charge. FIG. 6B shows a ribbon model of the monomer together with a stick model of the distribution of amino acid residues in the constriction zone, and shows under magnification the amino acid composition and numbering of the constriction zone loop, in which portion 602 marks the amino acid residues pointing toward the central region of the protein pore.
Homology modeling of mutant pore 1 was performed using SWISS-MODEL. FIG. 7 shows the amino acid residue distribution and the diameters of the constriction zone of mutant pore 1. The stick model shows the distribution of key amino acid residues in the constriction zone of the mutant channel. The mutated structure reduces the thickness of the constriction zone, and the amino acid residues pointing to the center of the channel are threonine at position 70, serine at position 74 and threonine at position 75. The hydrogen bonding interactions formed by the amino acid residues at positions 65-79 may be closely related to the correct assembly of the channel complex. Between the two porin monomers 702 and 704, the approximate diameter of the narrowest region of the channel in the constriction zone is given in image BDA0003299348360000364, that of the widest region in image BDA0003299348360000365, and the intermediate diameter in image BDA0003299348360000366 (the numerical values appear only in these images).
FIG. 8 shows a cartoon representation of mutant pore 1 based on homology modeling, with region 1 corresponding to the crown-forming region, region 2 corresponding to the constriction and loops region, and region 3 corresponding to the transmembrane β-barrel region.
FIG. 9 shows a negative-stain electron micrograph of mutant pore 1; the arrows indicate the target protein particles. FIG. 10 shows the two-dimensional classification result of negative-stain electron microscopy of mutant pore 1; the arrows indicate that the oligomeric state of mutant pore 1 is a 9-mer.
Example 6 preparation of DNA constructs
The DNA construct BS7-4C3-PLT was prepared. The structure of BS7-4C3-PLT is shown in FIG. 11, and the sequence information is as follows:
a:30*C3
b:5'-TTTTT TTTTT-3' (i.e., SEQ ID NO: 3)
c:rate-controlling protein
d:4*C18
e:5'-AATGT ACTTC GTTCA GTTAC GTATT GCT-3' (i.e., SEQ ID NO: 4)
f:5'P-GC AATAC GTAAC TGAAC GAAGTTCACTATCGCATTCTCATGA-3' (i.e., SEQ ID NO: 5)
g:cholesterol label
h:5'-TCATG AGAAT GCGAT AGTGA-3' (i.e., SEQ ID NO: 6)
i:5'-AAAAAAAAAAAAAAAAAAAAAAAAAAAA (i.e., SEQ ID NO: 7)/dSpacer/AAAAAAAAAA (i.e., SEQ ID NO: 8)/dSpacer/AAAAAAAAAAAAAATCTCTGAATCTCTGAATCTCTGAATCTCTAAAAAAAAAAAAGAAAAAAAAAAAACAAAAAAAAAAAATAAAAAAAAAAAAAGCAATACGTAACTGAACGAAGTACATTAAAAAAAAAA (i.e., SEQ ID NO: 9)-3'
j:5'-ATCCTTTTTTTTTTAATGTACTTCGTTCAGTTACGTATTGCT-3' (i.e. SEQ ID NO: 10)
k:5'P-TTTTTTTTTTTTATTTTTTTTTTTTGTTTTTTTTTTTTCTTTTTTTTTTTTAGAGATTCAGAGATTCAGAGATTCAGAGATTTTTTTTTTTTTT (i.e., SEQ ID NO: 11)/dSpacer/TTTTTTTTTTTT (i.e., SEQ ID NO: 12)/iSPC3/TTTTTTTTTTTTTTTTTTTTTTTTTTTT (i.e., SEQ ID NO: 13)-3'
C3, C18, dSpacer and iSPC3 are markers introduced into the sequence to indicate the resolution characteristics of pore sequencing.
In this example, the rate-controlling protein c in FIG. 11 is the helicase Mph-MP1-E105C/A362C (carrying the mutations E105C/A362C); its amino acid sequence is SEQ ID NO: 14 and its nucleic acid sequence is SEQ ID NO: 15.
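The listed strand sequences can be checked for complementarity, which indicates which segments of BS7-4C3-PLT are able to base-pair. The following minimal Python sketch compares strands e, f and h (SEQ ID NOs: 4, 5 and 6). In the sequences as listed, strand h is the exact reverse complement of the last 20 nucleotides of strand f, and the first 22 nucleotides of strand f are complementary to part of strand e. That these segments actually hybridize in the construct of FIG. 11 is an inference from the sequences and is not stated in the text. The helper names are illustrative assumptions.

```python
# Strands e, f and h of BS7-4C3-PLT, written 5'->3' without spaces.
STRAND_E = "AATGTACTTCGTTCAGTTACGTATTGCT"                # SEQ ID NO: 4
STRAND_F = "GCAATACGTAACTGAACGAAGTTCACTATCGCATTCTCATGA"  # SEQ ID NO: 5
STRAND_H = "TCATGAGAATGCGATAGTGA"                        # SEQ ID NO: 6

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_paired_prefix(strand, partner):
    """Length of the longest 5' prefix of `strand` that can pair with `partner`.

    Hypothetical helper: a prefix can pair if it occurs in the reverse
    complement of the partner strand.
    """
    target = reverse_complement(partner)
    for length in range(len(strand), 0, -1):
        if strand[:length] in target:
            return length
    return 0


h_pairs_with_f_3prime_end = STRAND_F.endswith(reverse_complement(STRAND_H))
f_prefix_paired_with_e = longest_paired_prefix(STRAND_F, STRAND_E)
print(h_pairs_with_f_3prime_end, f_prefix_paired_with_e)
```

The same two functions can be applied to the longer strands i, j and k once their spacer-separated segments are written out as plain sequences.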
Example 7
Mutant pore 1 was used as the protein pore and tested by single-pore sequencing. After a single porin pore having the amino acid sequence of mutant 1 was inserted into the phospholipid bilayer, buffer (625 mM KCl, 10 mM HEPES pH 8.0, 50 mM MgCl2) was flowed through the system to remove any excess mutant 1 nanopores. The DNA construct BS7-4C3-PLT (1-2 nM final concentration) was added to the mutant 1 nanopore system and, after thorough mixing, buffer (625 mM KCl, 10 mM HEPES pH 8.0, 50 mM MgCl2) was flowed through the system to remove any excess DNA construct BS7-4C3-PLT. A premix of helicase (Mph-MP1-E105C/A362C, 15 nM final concentration) and fuel (ATP, 3 mM final concentration) was then added to the single mutant 1 nanopore system, and sequencing by the mutant 1 porin was monitored at +180 mV.
The open-pore behavior of mutant pore 1 was tested at a voltage of ±180 mV. FIG. 12A shows the open-pore current of mutant pore 1 at 180 mV and its gating characteristics. FIG. 12B shows the translocation of single-stranded nucleic acid through mutant pore 1 at +180 mV. The nucleic acid can pass through the pore. After single-stranded nucleic acid is added, the downward lines show the nucleic acid translocation signals.
The DNA construct BS7-4C3-PLT was sequenced through mutant pore 1 by single-pore sequencing; nucleic acid sequencing signals appeared after pore insertion was completed and the sequencing system was added. FIGS. 13A and 13B show example current traces when the helicase Mph-MP1-E105C/A362C controls translocation of the DNA construct BS7-4C3-PLT through mutant pore 1. These signal characteristics indicate that mutant pore 1 has the potential for high resolution in nucleic acid sequencing.
FIG. 14 is an enlarged view of part of the current trace shown in FIG. 13A. The graph with the dashed box and arrows (middle graph) is the result after filtering of the original signal (for both traces, y-axis = current (pA), x-axis = time (s)). The dashed arrow points to the enlarged portion of the current trace. FIG. 15 shows the current traces from a chip test when the helicase Mph-MP1-E105C/A362C controls translocation of the DNA construct BS7-4C3-PLT through mutant pore 1. These results further indicate that mutant pore 1 has high resolution for nucleic acid sequencing.
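For orientation, the buffer composition stated above (625 mM KCl, 10 mM HEPES, 50 mM MgCl2) can be converted into masses per volume using mass = concentration x volume x molar mass. The sketch below assumes standard molar masses and, as a labeled assumption, anhydrous MgCl2; the source does not say which hydrate was used or how the pH was adjusted to 8.0. The function names are illustrative.

```python
# Molar masses in g/mol (standard values). Assumption: anhydrous MgCl2;
# the source does not specify the hydrate that was used.
MOLAR_MASS_G_PER_MOL = {"KCl": 74.55, "HEPES": 238.30, "MgCl2": 95.21}

# Concentrations in mM, as stated for the buffer in this example.
BUFFER_MM = {"KCl": 625.0, "HEPES": 10.0, "MgCl2": 50.0}


def grams_needed(conc_mm, volume_l, molar_mass):
    """Mass (g) of solute for a given concentration (mM) and volume (L)."""
    return conc_mm / 1000.0 * volume_l * molar_mass


def buffer_recipe(volume_l):
    """Return {component: grams} for the stated buffer at the given volume."""
    return {
        name: grams_needed(conc, volume_l, MOLAR_MASS_G_PER_MOL[name])
        for name, conc in BUFFER_MM.items()
    }


for name, grams in buffer_recipe(1.0).items():
    print(f"{name}: {grams:.2f} g per litre")
```

If a hydrated magnesium chloride salt is used instead, only the `MgCl2` entry of `MOLAR_MASS_G_PER_MOL` needs to change.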
Example 8
Similar to Example 7, Example 8 uses mutant pore 2 for the open-pore test and the translocation test.
FIG. 16A shows the open-pore current of mutant pore 2 at a voltage of ±180 mV and its gating characteristics. FIG. 16B shows the translocation of single-stranded nucleic acid through mutant pore 2 at +180 mV. The nucleic acid can pass through the pore. After single-stranded nucleic acid is added, the downward lines show the nucleic acid translocation signals.
The DNA construct BS7-4C3-PLT was sequenced through mutant pore 2 by single-pore sequencing; nucleic acid sequencing signals appeared after pore insertion was completed and the sequencing system was added. FIGS. 17A and 17B show example current traces when the helicase Mph-MP1-E105C/A362C controls translocation of the DNA construct BS7-4C3-PLT through mutant pore 2. These signal characteristics indicate that mutant pore 2 can be used for nucleic acid sequencing.
FIG. 18 shows an enlarged view of part of the current trace. The graph with the dashed box and arrows is the result after filtering of the original signal (for both traces, y-axis = current (pA), x-axis = time (s)). The dashed arrow points to the enlarged portion of the current trace. FIG. 19 shows the current traces from a chip test when the helicase Mph-MP1-E105C/A362C controls translocation of the DNA construct BS7-4C3-PLT through mutant pore 2. These results indicate that mutant pore 2 can be used for nucleic acid sequencing.
Example 9
Similar to Example 7, Example 9 uses mutant pore 3 for the open-pore test and the translocation test.
FIG. 20A shows the open-pore current of mutant pore 3 at a voltage of ±180 mV and its gating characteristics. FIG. 20B shows the translocation of single-stranded nucleic acid through mutant pore 3 at +180 mV. The nucleic acid can pass through the pore. After single-stranded nucleic acid is added, the downward lines show the nucleic acid translocation signals.
The DNA construct BS7-4C3-PLT was sequenced through mutant pore 3 by single-pore sequencing; nucleic acid sequencing signals appeared after pore insertion was completed and the sequencing system was added. FIGS. 21A and 21B show example current traces when the helicase Mph-MP1-E105C/A362C controls translocation of the DNA construct BS7-4C3-PLT through mutant pore 3. These signal characteristics indicate that mutant pore 3 has the potential for high resolution in nucleic acid sequencing.
FIG. 22 shows an enlarged view of part of the current trace. The graph with the dashed box and arrows is the result after filtering of the original signal (for both traces, y-axis = current (pA), x-axis = time (s)). The dashed arrow points to the enlarged portion of the current trace. The enlarged region of this single signal further indicates that mutant pore 3 provides high resolution for nucleic acid sequencing.
Example 10
Similar to Example 7, Example 10 uses mutant pore 4 for the open-pore test and the translocation test.
FIG. 23A shows the open-pore current of mutant pore 4 at a voltage of ±180 mV and its gating characteristics. FIG. 23B shows the translocation of single-stranded nucleic acid through mutant pore 4 at +180 mV. The nucleic acid can pass through the pore. After single-stranded nucleic acid is added, the downward lines show the nucleic acid translocation signals.
The DNA construct BS7-4C3-PLT was sequenced through mutant pore 4 by single-pore sequencing; nucleic acid sequencing signals appeared after pore insertion was completed and the sequencing system was added. FIGS. 24A and 24B show example current traces when the helicase Mph-MP1-E105C/A362C controls translocation of the DNA construct BS7-4C3-PLT through mutant pore 4. These signal characteristics indicate that mutant pore 4 can be used for nucleic acid sequencing.
FIG. 25 shows an enlarged view of part of the current trace. The graph with the dashed box and arrows is the result after filtering of the original signal (for both traces, y-axis = current (pA), x-axis = time (s)). The dashed arrow points to the enlarged portion of the current trace. The enlarged region of this single signal further demonstrates that mutant pore 4 can be used for nucleic acid sequencing.
Example 11
The recombinant plasmid containing the nucleic acid sequence of mutant 1 of the porin monomer (the corresponding amino acid sequence is shown in SEQ ID NO: 16) was transformed into BL21 (DE3) competent cells by heat shock. 0.5 ml of LB medium was added and the cells were cultured at 30 ℃ for 1 h; an appropriate amount of the culture was then spread on a solid LB plate containing ampicillin and incubated overnight at 37 ℃. The next day, a single colony was picked, inoculated into 50 ml of liquid LB medium containing ampicillin and cultured overnight at 37 ℃. The overnight culture was transferred at a 1% inoculum into TB liquid medium containing ampicillin for scale-up culture at 37 ℃ and 220 rpm, and the OD600 was measured continuously. When the OD600 reached 2.0-2.2, the culture in TB medium was cooled to 16 ℃ and isopropyl thiogalactoside (IPTG) was added to a final concentration of 0.015 mM to induce expression. After 20-24 h of induced expression, the cells were collected by centrifugation. The cells were resuspended in lysis buffer, disrupted under high pressure and purified by Ni-NTA affinity chromatography, and the target elution fractions were collected. Mutants 2-4 of the porin monomer were purified in the same way.
Illustratively, the protein purification results for mutant 1 are shown in FIG. 26; lanes 1-6 show the SDS-PAGE analysis of the different separated fractions. FIG. 27 shows the results of molecular sieve purification of the mutant 1 protein; the arrow indicates the target protein peak.
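The induction step specifies only a final IPTG concentration (0.015 mM) and a 1% inoculum, so the volumes to pipette depend on the culture volume and the stock concentration. The following minimal sketch applies the dilution relation C1*V1 = C2*V2. The 1 L culture volume and the 1 M IPTG stock are hypothetical values chosen for illustration; neither is given in the source.

```python
def stock_volume_ml(final_mm, culture_ml, stock_mm):
    """Volume of stock (mL) needed to reach `final_mm` in `culture_ml`.

    Uses C1*V1 = C2*V2 and neglects the small volume added by the stock.
    """
    if stock_mm <= 0 or final_mm < 0 or culture_ml < 0:
        raise ValueError("concentrations and volumes must be non-negative, stock > 0")
    return final_mm * culture_ml / stock_mm


def inoculum_ml(culture_ml, percent):
    """Volume of overnight culture (mL) for a given inoculum percentage (v/v)."""
    return culture_ml * percent / 100.0


CULTURE_ML = 1000.0     # hypothetical TB culture volume (not from the source)
IPTG_STOCK_MM = 1000.0  # hypothetical 1 M IPTG stock (not from the source)

iptg_ul = stock_volume_ml(0.015, CULTURE_ML, IPTG_STOCK_MM) * 1000.0
print(f"IPTG stock: {iptg_ul:.1f} uL; inoculum: {inoculum_ml(CULTURE_ML, 1.0):.1f} mL")
```

With a more dilute stock the IPTG volume scales up proportionally, which can make such a low final concentration easier to pipette accurately.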
SEQUENCE LISTING
<110> Chengdu Qitan Technology Ltd
<120> Mutant of porin monomer, protein pore and application thereof
<130> SPI214305-63
<160> 19
<170> PatentIn version 3.5
<210> 1
<211> 276
<212> PRT
<213> Gulbenkiania indica
<400> 1
Met Met Leu Leu Ala Thr Gly Leu Val Ser Gly Cys Ala Thr Leu Asp
1 5 10 15
Pro Asn Lys Gly Lys Pro Ala Ala Val Gly Glu Asp Ala Pro Val Leu
20 25 30
Thr Pro Met Ser Ser Thr His Lys Asp Leu Leu Asn Leu Pro Pro Ala
35 40 45
Arg Gly Pro Ile Val Ala Ala Val Tyr Asn Phe Arg Asp Gln Thr Gly
50 55 60
Gln Phe Lys Pro Thr Ala Asp Ser Ser Tyr Ser Thr Ala Val Thr Gln
65 70 75 80
Gly Ala Thr Ser Met Leu Ile Lys Ala Met Leu Asp Ser Gly Trp Phe
85 90 95
Val Pro Val Glu Arg Glu Gly Leu Gln Asn Leu Leu Thr Glu Arg Lys
100 105 110
Ile Ile Arg Ser Thr Glu Glu Lys Gly Val Ala Pro Val Glu Leu Pro
115 120 125
Asn Leu Met Ala Ala Gly Ile Leu Leu Glu Gly Gly Ile Ile Gly Tyr
130 135 140
Glu Thr Asn Val Lys Thr Gly Gly Gly Gly Ala Arg Tyr Leu Gly Ile
145 150 155 160
Gly Leu Ser Asp Met Tyr Arg Thr Asp Gln Val Thr Ile Asn Leu Arg
165 170 175
Ala Val Asp Ile Arg Ser Gly Arg Ile Leu Ser Ser Ile Ser Thr Thr
180 185 190
Lys Ala Ile Leu Ser Tyr Lys Leu Ser Gly Asp Val Tyr Lys Phe Ile
195 200 205
Lys Phe Lys Ser Leu Leu Glu Leu Glu Ala Gly Tyr Thr Arg Asn Glu
210 215 220
Pro Val Gln Leu Cys Val Gln Asp Ala Ile Glu Ala Gly Leu Ile Tyr
225 230 235 240
Leu Ile Thr Lys Gly Ile Lys Asp Asn His Trp Thr Leu Arg Asn Asn
245 250 255
Val Asp Leu Gln Ser Pro Val Leu Gln Arg Tyr Leu Gln Glu Leu Val
260 265 270
Ala Pro Ala Ala
275
<210> 2
<211> 831
<212> DNA
<213> Gulbenkiania indica
<400> 2
atgatgctgc ttgctaccgg cctcgtgtcc ggctgcgcca cgctggaccc caacaagggc 60
aagcccgctg cagtgggtga ggacgctccg gtgctgaccc cgatgtcctc cacgcacaag 120
gatctgctta acctgccgcc ggcccgcggc cccatcgtgg cggcggttta caacttccgt 180
gaccagacag gacagttcaa gcccaccgcg gacagttctt actccaccgc tgtgacgcag 240
ggcgccacct cgatgctgat caaggccatg cttgactcgg gctggttcgt gccggtggag 300
cgcgaagggc tgcagaatct gctgaccgag cgcaagatca tccgctctac cgaagaaaaa 360
ggggtggccc cggtggaact gcccaacctg atggcagcag gcatcctgct cgaaggcggg 420
atcatcggct acgagaccaa cgtgaagaca ggcggcggtg gggcgcgcta tctgggaatc 480
gggctgtccg acatgtaccg gacggaccag gtcacgatca acctgcgagc ggtggacatt 540
cgctcgggcc gtatcctcag cagcatctcc actaccaagg ctatcctctc gtacaagctc 600
agcggcgacg tctacaagtt catcaagttc aagagcctgc tggaactgga ggcaggctat 660
acccgtaacg agccagtgca gctgtgcgtg caggacgcga tcgaggccgg gctgatctac 720
ctcatcacca aggggatcaa ggacaaccat tggaccttgc gcaacaatgt cgacctgcag 780
tcgccagtgc tgcaacgcta cctgcaggag ttggtagccc ccgcagcctg a 831
<210> 3
<211> 10
<212> DNA
<213> Artificial Sequence
<220>
<223> part of BS7-4C3-PLT
<400> 3
tttttttttt 10
<210> 4
<211> 28
<212> DNA
<213> Artificial Sequence
<220>
<223> part of BS7-4C3-PLT
<400> 4
aatgtacttc gttcagttac gtattgct 28
<210> 5
<211> 42
<212> DNA
<213> Artificial Sequence
<220>
<223> part of BS7-4C3-PLT
<400> 5
gcaatacgta actgaacgaa gttcactatc gcattctcat ga 42
<210> 6
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> part of BS7-4C3-PLT
<400> 6
tcatgagaat gcgatagtga 20
<210> 7
<211> 28
<212> DNA
<213> Artificial Sequence
<220>
<223> part of BS7-4C3-PLT
<400> 7
aaaaaaaaaa aaaaaaaaaa aaaaaaaa 28
<210> 8
<211> 12
<212> DNA
<213> Artificial Sequence
<220>
<223> part of BS7-4C3-PLT
<400> 8
aaaaaaaaaa aa 12
<210> 9
<211> 132
<212> DNA
<213> Artificial Sequence
<220>
<223> part of BS7-4C3-PLT
<400> 9
aaaaaaaaaa aaaatctctg aatctctgaa tctctgaatc tctaaaaaaa aaaaagaaaa 60
aaaaaaaaca aaaaaaaaaa ataaaaaaaa aaaaagcaat acgtaactga acgaagtaca 120
ttaaaaaaaa aa 132
<210> 10
<211> 42
<212> DNA
<213> Artificial Sequence
<220>
<223> part of BS7-4C3-PLT
<400> 10
atcctttttt ttttaatgta cttcgttcag ttacgtattg ct 42
<210> 11
<211> 94
<212> DNA
<213> Artificial Sequence
<220>
<223> part of BS7-4C3-PLT
<400> 11
tttttttttt ttattttttt tttttgtttt ttttttttct tttttttttt tagagattca 60
gagattcaga gattcagaga tttttttttt tttt 94
<210> 12
<211> 12
<212> DNA
<213> Artificial Sequence
<220>
<223> part of BS7-4C3-PLT
<400> 12
tttttttttt tt 12
<210> 13
<211> 28
<212> DNA
<213> Artificial Sequence
<220>
<223> part of BS7-4C3-PLT
<400> 13
tttttttttt tttttttttt tttttttt 28
<210> 14
<211> 441
<212> PRT
<213> Artificial Sequence
<220>
<223> helicase
<400> 14
Met Ile Thr Ile Asp Gln Leu Thr Glu Gly Gln Phe Asp Ser Leu Gln
1 5 10 15
Arg Ala Lys Val Leu Ile Gln Glu Ala Thr Lys Asn Asp Gly Asn Trp
20 25 30
Asn His Arg Thr Lys His Leu Thr Ile Asn Gly Pro Ala Gly Thr Gly
35 40 45
Lys Thr Thr Met Met Lys Phe Leu Val Ser Trp Leu Arg Asp Glu Gly
50 55 60
Ile Thr Gly Val Ala Leu Ala Ala Pro Thr His Ala Ala Lys Lys Val
65 70 75 80
Leu Ala Asn Ala Val Gly Glu Glu Val Ser Thr Ile His Ser Ile Leu
85 90 95
Lys Ile Asn Pro Thr Thr Tyr Glu Cys Lys Gln Phe Phe Glu Gln Ser
100 105 110
Ala Pro Pro Asp Leu Ser Lys Ile Arg Ile Leu Ile Cys Glu Glu Cys
115 120 125
Ser Phe Tyr Asp Ile Lys Leu Phe Glu Ile Leu Met Asn Ser Ile Gln
130 135 140
Pro Trp Thr Ile Ile Ile Gly Ile Gly Asp Arg Ala Gln Leu Arg Pro
145 150 155 160
Ala Asp Asp Lys Gly Ile Ser Arg Phe Phe Thr Asp Gln Arg Phe Glu
165 170 175
Gln Thr Tyr Leu Thr Glu Ile Lys Arg Ser Asn Met Pro Ile Ile Glu
180 185 190
Val Ala Thr Glu Ile Arg Asn Gly Gly Trp Ile Arg Glu Asn Ile Ile
195 200 205
Asp Asp Leu Gly Val Lys Gln Asp Lys Ser Val Ser Glu Phe Met Thr
210 215 220
Asn Tyr Phe Lys Val Val Lys Ser Ile Asp Asp Leu Tyr Glu Thr Arg
225 230 235 240
Met Tyr Ala Tyr Thr Asn Asn Ser Val Asp Thr Leu Asn Lys Ile Ile
245 250 255
Arg Lys Lys Leu Tyr Glu Thr Glu Gln Asp Phe Ile Val Gly Glu Pro
260 265 270
Ile Val Met Gln Glu Pro Leu Ile Arg Asp Ile Asn Tyr Glu Gly Lys
275 280 285
Arg Phe Gln Glu Ile Val Phe Asn Asn Gly Glu Tyr Leu Glu Val Ser
290 295 300
Glu Ile Lys Pro Met Glu Ser Val Leu Lys Cys Arg Asn Ile Asp Tyr
305 310 315 320
Gln Leu Val Leu His Tyr Tyr Gln Leu Lys Val Lys Ser Ile Asp Thr
325 330 335
Gly Glu Ser Gly Leu Ile Asn Thr Ile Ser Asp Lys Asn Glu Leu Asn
340 345 350
Lys Phe Tyr Met Phe Leu Gly Lys Val Cys Gln Asp Tyr Lys Ser Gly
355 360 365
Thr Ile Lys Ala Phe Trp Asp Asp Phe Trp Lys Ile Lys Asn Asn Tyr
370 375 380
His Arg Val Lys Pro Leu Pro Val Ser Thr Ile His Lys Gly Gln Gly
385 390 395 400
Ser Thr Val Asp Asn Ser Phe Leu Tyr Thr Pro Cys Ile Thr Lys Tyr
405 410 415
Ala Glu Pro Asp Leu Ala Ser Gln Leu Leu Tyr Val Gly Val Thr Arg
420 425 430
Ala Arg His Asn Val Asn Phe Val Gly
435 440
<210> 15
<211> 1326
<212> DNA
<213> Artificial Sequence
<220>
<223> helicase
<400> 15
atgatcacca tcgaccagct gaccgaaggt cagttcgact ctctgcagcg tgctaaagtt 60
ctgatccagg aagctaccaa aaacgacggt aactggaacc accgtaccaa acacctgacc 120
atcaacggtc cggctggtac cggtaaaacc accatgatga aattcctggt ttcttggctg 180
cgtgacgaag gtatcaccgg tgttgctctg gctgctccga cccacgctgc taaaaaagtt 240
ctggctaacg ctgttggtga agaagtttct accatccact ctatcctgaa aatcaacccg 300
accacctacg aatgcaaaca gttcttcgaa cagtctgctc cgccggacct gtctaaaatc 360
cgtatcctga tctgcgaaga atgctctttc tacgacatca aactgttcga aatcctgatg 420
aactctatcc agccgtggac catcatcatc ggtatcggtg accgtgctca gctgcgtccg 480
gctgacgaca aaggtatctc tcgtttcttc accgaccagc gtttcgaaca gacctacctg 540
accgaaatca aacgttctaa catgccgatc atcgaagttg ctaccgaaat ccgtaacggt 600
ggttggattc gtgaaaacat catcgacgac ctgggtgtta aacaggacaa atctgtttct 660
gaatttatga ccaactactt caaagttgtt aaatctatcg acgacctgta cgaaacccgt 720
atgtacgctt acaccaacaa ctctgttgac accctgaaca aaatcatccg taaaaaactg 780
tacgaaaccg aacaggactt catcgttggt gaaccgatcg ttatgcagga accgctgatc 840
cgtgacatca actacgaagg taaacgtttc caggaaatcg ttttcaacaa cggtgaatac 900
ctggaagttt ctgaaatcaa accgatggaa tctgttctga aatgccgtaa catcgactac 960
cagctggttc tgcactacta ccagctgaaa gttaaatcta tcgacaccgg tgaatctggt 1020
ctgatcaaca ccatctctga caaaaacgaa ctgaacaaat tctacatgtt cctgggtaaa 1080
gtttgccagg actacaaatc tggtaccatc aaagcgttct gggacgactt ctggaaaatc 1140
aaaaacaact accaccgtgt taaaccgctg ccggtttcta ccatccacaa aggtcagggt 1200
tctaccgttg acaactcttt cctgtacacc ccgtgcatca ccaaatacgc tgaaccggac 1260
ctggcttctc agctgctgta cgttggtgtt acccgtgctc gtcacaacgt taacttcgtt 1320
ggttaa 1326
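As a consistency check on the listing above, the following minimal sketch translates the final 66 nucleotides of SEQ ID NO 15 (positions 1261-1326) with the standard genetic code and compares the result with the last residues of the amino-acid listing that precedes it. The helper name `translate` and the choice of the standard codon table are assumptions made for illustration; the patent text shown here does not specify a translation table.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon; whitespace is ignored (hypothetical helper)."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Nucleotides 1261-1326 of SEQ ID NO 15, copied from the listing above.
TAIL = ("ctggcttctc agctgctgta cgttggtgtt acccgtgctc gtcacaacgt taacttcgtt "
        "ggttaa")

print(translate(TAIL))  # LASQLLYVGVTRARHNVNFVG*
```

The printed residues correspond to "Leu Ala Ser Gln Leu Leu Tyr Val Gly Val Thr Arg Ala Arg His Asn Val Asn Phe Val Gly" followed by a stop codon, which matches the end of the protein listing above. The declared length of 1326 nucleotides is likewise consistent with 441 codons plus one stop codon.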
<210> 16
<211> 276
<212> PRT
<213> Artificial Sequence
<220>
<223> mutant 1
<400> 16
Met Met Leu Leu Ala Thr Gly Leu Val Ser Gly Cys Ala Thr Leu Asp
1 5 10 15
Pro Asn Lys Gly Lys Pro Ala Ala Val Gly Glu Asp Ala Pro Val Leu
20 25 30
Thr Pro Met Ser Ser Thr His Lys Asp Leu Leu Asn Leu Pro Pro Ala
35 40 45
Arg Gly Pro Ile Val Ala Ala Val Tyr Asn Phe Arg Asp Leu Thr Gly
50 55 60
Gln Phe Arg Pro Thr Ala Asn Pro Ser Thr Ser Thr Ala Val Thr Gln
65 70 75 80
Gly Ala Thr Ser Met Leu Ile Lys Ala Met Leu Asp Ser Gly Trp Phe
85 90 95
Val Pro Val Glu Arg Glu Gly Leu Gln Asn Leu Leu Thr Glu Arg Lys
100 105 110
Ile Ile Arg Ser Thr Glu Glu Lys Gly Val Ala Pro Val Glu Leu Pro
115 120 125
Asn Leu Met Ala Ala Gly Ile Leu Leu Glu Gly Gly Ile Ile Gly Tyr
130 135 140
Glu Thr Asn Val Lys Thr Gly Gly Gly Gly Ala Arg Tyr Leu Gly Ile
145 150 155 160
Gly Leu Ser Asp Met Tyr Arg Thr Asp Gln Val Thr Ile Asn Leu Arg
165 170 175
Ala Val Asp Ile Arg Ser Gly Arg Ile Leu Ser Ser Ile Ser Thr Thr
180 185 190
Lys Ala Ile Leu Ser Tyr Lys Leu Ser Gly Asp Val Tyr Lys Phe Ile
195 200 205
Lys Phe Lys Ser Leu Leu Glu Leu Glu Ala Gly Tyr Thr Arg Asn Glu
210 215 220
Pro Val Gln Leu Cys Val Gln Asp Ala Ile Glu Ala Gly Leu Ile Tyr
225 230 235 240
Leu Ile Thr Lys Gly Ile Lys Asp Asn His Trp Thr Leu Arg Asn Asn
245 250 255
Val Asp Leu Gln Ser Pro Val Leu Gln Arg Tyr Leu Gln Glu Leu Val
260 265 270
Ala Pro Ala Ala
275
<210> 17
<211> 275
<212> PRT
<213> Artificial Sequence
<220>
<223> mutant 2
<400> 17
Met Met Leu Leu Ala Thr Gly Leu Val Ser Gly Cys Ala Thr Leu Asp
1 5 10 15
Pro Asn Lys Gly Lys Pro Ala Ala Val Gly Glu Asp Ala Pro Val Leu
20 25 30
Thr Pro Met Ser Ser Thr His Lys Asp Leu Leu Asn Leu Pro Pro Ala
35 40 45
Arg Gly Pro Ile Val Ala Ala Val Tyr Asn Phe Arg Asp Leu Thr Gly
50 55 60
Gln Phe Arg Pro Thr Ala Asn Pro Ser Ser Thr Ala Val Thr Gln Gly
65 70 75 80
Ala Thr Ser Met Leu Ile Lys Ala Met Leu Asp Ser Gly Trp Phe Val
85 90 95
Pro Val Glu Arg Glu Gly Leu Gln Asn Leu Leu Thr Asn Arg Lys Ile
100 105 110
Ile Arg Ser Thr Glu Asn Lys Gly Val Ala Pro Val Asn Leu Pro Asn
115 120 125
Leu Met Ala Ala Gly Ile Leu Leu Glu Gly Gly Ile Ile Gly Tyr Glu
130 135 140
Thr Asn Val Lys Thr Gly Gly Gly Gly Ala Arg Tyr Leu Gly Ile Gly
145 150 155 160
Leu Ser Asp Met Tyr Arg Thr Asp Gln Val Thr Ile Asn Leu Arg Ala
165 170 175
Val Asp Ile Arg Ser Gly Arg Ile Leu Ser Ser Ile Ser Thr Thr Lys
180 185 190
Ala Ile Leu Ser Tyr Lys Leu Ser Gly Asp Val Tyr Lys Phe Ile Pro
195 200 205
Phe Lys Ser Leu Leu Glu Leu Glu Ala Gly Tyr Thr Arg Asn Glu Pro
210 215 220
Val Gln Leu Cys Val Gln Asp Ala Ile Glu Ala Gly Leu Ile Tyr Leu
225 230 235 240
Ile Thr Lys Gly Ile Lys Asp Asn His Trp Thr Leu Arg Asn Asn Val
245 250 255
Asp Leu Gln Ser Pro Val Leu Gln Arg Tyr Leu Gln Glu Leu Val Ala
260 265 270
Pro Ala Ala
275
<210> 18
<211> 274
<212> PRT
<213> Artificial Sequence
<220>
<223> mutant 3
<400> 18
Met Met Leu Leu Ala Thr Gly Leu Val Ser Gly Cys Ala Thr Leu Asp
1 5 10 15
Pro Asn Lys Gly Lys Pro Ala Ala Val Gly Glu Asp Ala Pro Val Leu
20 25 30
Thr Pro Met Ser Ser Thr His Lys Asp Leu Leu Asn Leu Pro Pro Ala
35 40 45
Arg Gly Pro Ile Val Ala Ala Val Tyr Asn Phe Arg Asp Gln Thr Gly
50 55 60
Gln Phe Arg Pro Thr Ala Ala Pro Ser Thr Ala Val Thr Gln Gly Ala
65 70 75 80
Thr Ser Met Leu Ile Lys Ala Met Leu Asp Ser Gly Trp Phe Val Pro
85 90 95
Val Glu Arg Glu Gly Leu Gln Asn Leu Leu Thr Glu Arg Lys Ile Ile
100 105 110
Arg Ser Thr Glu Glu Lys Gly Val Ala Pro Val Glu Leu Pro Asn Leu
115 120 125
Met Ala Ala Gly Ile Leu Leu Glu Gly Gly Ile Ile Gly Tyr Glu Thr
130 135 140
Asn Val Lys Thr Gly Gly Gly Gly Ala Arg Tyr Leu Gly Ile Gly Leu
145 150 155 160
Ser Asp Met Tyr Arg Thr Asp Gln Val Thr Ile Asn Leu Arg Ala Val
165 170 175
Asp Ile Arg Ser Gly Arg Ile Leu Ser Ser Ile Ser Thr Thr Lys Ala
180 185 190
Ile Leu Ser Tyr Lys Leu Ser Gly Asp Val Tyr Lys Phe Ile Lys Phe
195 200 205
Lys Ser Leu Leu Glu Leu Glu Ala Gly Tyr Thr Arg Asn Glu Pro Val
210 215 220
Gln Leu Cys Val Gln Asp Ala Ile Glu Ala Gly Leu Ile Tyr Leu Ile
225 230 235 240
Thr Lys Gly Ile Lys Asp Asn His Trp Thr Leu Arg Asn Asn Val Asp
245 250 255
Leu Gln Ser Pro Val Leu Gln Arg Tyr Leu Gln Glu Leu Val Ala Pro
260 265 270
Ala Ala
<210> 19
<211> 275
<212> PRT
<213> Artificial Sequence
<220>
<223> mutant 4
<400> 19
Met Met Leu Leu Ala Thr Gly Leu Val Ser Gly Cys Ala Thr Leu Asp
1 5 10 15
Pro Asn Lys Gly Lys Pro Ala Ala Val Gly Glu Asp Ala Pro Val Leu
20 25 30
Thr Pro Met Ser Ser Thr His Lys Asp Leu Leu Asn Leu Pro Pro Ala
35 40 45
Arg Gly Pro Ile Val Ala Ala Val Tyr Asn Phe Arg Asp Gln Thr Gly
50 55 60
Gln Phe Arg Pro Ser Pro Ala Tyr Ala Ser Thr Ala Val Thr Gln Gly
65 70 75 80
Ala Thr Ser Met Leu Ile Lys Ala Met Leu Asp Ser Gly Trp Phe Val
85 90 95
Pro Val Glu Arg Glu Gly Leu Gln Asn Leu Leu Thr Glu Arg Lys Ile
100 105 110
Ile Arg Ser Thr Glu Glu Lys Gly Val Ala Pro Val Glu Leu Pro Asn
115 120 125
Leu Met Ala Ala Gly Ile Leu Leu Glu Gly Gly Ile Ile Gly Tyr Glu
130 135 140
Thr Asn Val Lys Thr Gly Gly Gly Gly Ala Arg Tyr Leu Gly Ile Gly
145 150 155 160
Leu Ser Asp Met Tyr Arg Thr Asp Gln Val Thr Ile Asn Leu Arg Ala
165 170 175
Val Asp Ile Arg Ser Gly Arg Ile Leu Ser Ser Ile Ser Thr Thr Lys
180 185 190
Ala Ile Leu Ser Tyr Lys Leu Ser Gly Asp Val Tyr Lys Phe Ile Lys
195 200 205
Phe Lys Ser Leu Leu Glu Leu Glu Ala Gly Tyr Thr Arg Asn Glu Pro
210 215 220
Val Gln Leu Cys Val Gln Asp Ala Ile Glu Ala Gly Leu Ile Tyr Leu
225 230 235 240
Ile Thr Lys Gly Ile Lys Asp Asn His Trp Thr Leu Arg Asn Asn Val
245 250 255
Asp Leu Gln Ser Pro Val Leu Gln Arg Tyr Leu Gln Glu Leu Val Ala
260 265 270
Pro Ala Ala
275
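The listings above interleave rows of three-letter residue codes with rows of position numbers. A minimal sketch of how such rows might be flattened into a one-letter string is given below; the function name `listing_to_one_letter` is hypothetical, and the rule that any row consisting only of digits is a position ruler is an assumption about this particular layout.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def listing_to_one_letter(lines):
    """Join residue rows into a one-letter string, skipping numeric ruler rows."""
    residues = []
    for line in lines:
        tokens = line.split()
        # Assumption: rows made only of numbers (or empty rows) are position rulers.
        if all(tok.isdigit() for tok in tokens):
            continue
        residues.extend(THREE_TO_ONE[tok] for tok in tokens)
    return "".join(residues)


# Last four rows of SEQ ID NO 19, copied from the listing above.
rows = [
    "Asp Leu Gln Ser Pro Val Leu Gln Arg Tyr Leu Gln Glu Leu Val Ala",
    "260 265 270",
    "Pro Ala Ala",
    "275",
]
print(listing_to_one_letter(rows))  # DLQSPVLQRYLQELVAPAA
```

Applied to a complete entry, the length of the returned string can be compared with the value declared in the `<211>` field (276, 275, 274 and 275 for SEQ ID NOs 16 to 19).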

Claims (22)

1. A mutant of a porin monomer, wherein the amino acid of the mutant of a porin monomer comprises the sequence set forth in SEQ ID No. 1 or a sequence at least 99%, 98%, 97%, 96%, 95%, 90%, 80%, 70%, 60% or 50% identical thereto, and the amino acid of the mutant of a porin monomer comprises a mutation at one or more positions corresponding to K67, D71, S72, and Y74 of SEQ ID No. 1.
2. A mutant of a porin monomer as claimed in claim 1, said mutant of a porin monomer having an amino acid comprising a mutation at one or more of the positions corresponding to 62-209, 62-74, 62-75, 65-79, 67-209, 67-75, or 67-74 of SEQ ID No. 1.
3. A mutant of a porin monomer as claimed in claim 1 or 2, the amino acids of said mutant of a porin monomer comprising:
(1) an insertion, deletion and/or substitution of an amino acid at one or more positions corresponding to Q62, K67, D71, S72 and Y74 of SEQ ID NO: 1;
(2) an insertion, deletion and/or substitution of an amino acid at one or more positions corresponding to Q62, K67, D71, S72, Y74, E110, E119, E126 and K209 of SEQ ID NO: 1;
(3) an insertion, deletion and/or substitution of an amino acid at one or more positions corresponding to K67, D71, S72, Y74 and S75 of SEQ ID NO: 1; or
(4) an insertion, deletion and/or substitution of an amino acid at one or more positions corresponding to K67, T69, A70, D71, S72, S73 and Y74 of SEQ ID NO: 1.
4. A mutant of a porin monomer as claimed in any one of the preceding claims, wherein said sequence set forth in SEQ ID NO: 1 is derived from Gulbenkiania indica.
5. A mutant of a porin monomer as claimed in any one of the preceding claims, wherein said mutant of a porin monomer has an amino acid mutation selected from the group consisting of:
(a) Q62 corresponding to SEQ ID NO: 1 is mutated into 0 to 5 of G, A, V, L, I; the K67 mutation is 0 to 3 of R, H, K; the D71 mutation is 0 to 4 of N, E, D, Q; the S72 mutation is 0 to 1 of P; the Y74 mutation is 0 to 5 of S, C, U, T, M;
(b) Q62 corresponding to SEQ ID NO: 1 is mutated into 0 to 5 of G, A, V, L, I; the K67 mutation is 0 to 3 of R, H, K; the D71 mutation is 0 to 4 of N, E, D, Q; the S72 mutation is 0 to 1 of P; the Y74 mutation is 0 to 3 of F, Y, W; the E110 mutation is 0 to 4 of N, D, E, Q; the E119 mutation is 0 to 4 of N, D, E, Q; the E126 mutation is 0 to 4 of N, D, E, Q; the K209 mutation is 0 to 1 of P;
(c) K67 corresponding to SEQ ID NO: 1 is mutated into 0 to 3 of R, H, K; the D71 mutation is 0 to 5 of G, A, V, L, I; the S72 mutation is 0 to 1 of P; the Y74 mutation is 0 to 3 of F, Y, W; the S75 mutation is 0 to 5 of C, U, S, T, M; and
(d) K67 corresponding to SEQ ID NO: 1 is mutated into 0 to 3 of R, H, K; the T69 mutation is 0 to 5 of S, C, T, U, M; the A70 mutation is 0 to 1 of P; the D71 mutation is 0 to 5 of G, A, V, L, I; the S72 mutation is 0 to 3 of F, Y, W; the S73 mutation is 0 to 5 of G, A, V, L, I; the Y74 mutation is 0 to 3 of F, Y, W.
6. A mutant of a porin monomer as claimed in any one of the preceding claims, wherein said mutant of a porin monomer has an amino acid mutation selected from the group consisting of:
(a) Q62L, K67R, D71N, S72P and Y74T corresponding to SEQ ID NO: 1;
(b) Q62L, K67R, D71N, S72P, Y74 deletion, E110N, E119N, E126N, and K209P corresponding to SEQ ID NO: 1;
(c) K67R, D71A, S72P, Y74 deletion, and S75 deletion corresponding to SEQ ID NO: 1; and
(d) K67R, T69S, A70P, D71A, S72Y, S73A, and Y74 deletion corresponding to SEQ ID NO: 1.
7. A mutant of a porin monomer as claimed in any one of the preceding claims, wherein the amino acid sequence of said mutant of a porin monomer comprises or consists of SEQ ID NO 16, SEQ ID NO 17, SEQ ID NO 18, or SEQ ID NO 19.
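The relationship between the substitution and deletion sets of claim 6 and SEQ ID NOs 16 to 19 can be illustrated with a short sketch. The wild-type window for positions 60-76 used below is an assumption: SEQ ID NO: 1 is not reproduced in this part of the document, so the window was inferred by reverting the mutated positions in the mutant listings to the residues named in the claims. The notation `Y74del`, the function name `apply_mutations` and the constants are likewise hypothetical.

```python
import re

# Assumption: residues 60-76 of SEQ ID NO: 1, inferred from the claims and mutant listings.
WT_START = 60
WT_WINDOW = "RDQTGQFKPTADSSYST"

# Mutation sets as read from the mutant listings, restricted to positions 60-76.
MUTANTS = {
    "SEQ ID NO 16": ["Q62L", "K67R", "D71N", "S72P", "Y74T"],
    "SEQ ID NO 17": ["Q62L", "K67R", "D71N", "S72P", "Y74del"],
    "SEQ ID NO 18": ["K67R", "D71A", "S72P", "Y74del", "S75del"],
    "SEQ ID NO 19": ["K67R", "T69S", "A70P", "D71A", "S72Y", "S73A", "Y74del"],
}


def apply_mutations(window, start, mutations):
    """Apply substitutions (K67R) and deletions (Y74del) to a numbered window."""
    residues = {start + i: aa for i, aa in enumerate(window)}
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z]|del)", mutation)
        if match is None:
            raise ValueError(f"unrecognised mutation: {mutation}")
        wild_type, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if residues.get(pos) != wild_type:
            raise ValueError(f"{mutation}: position {pos} is not {wild_type}")
        # A deletion is stored as an empty string so numbering stays wild-type based.
        residues[pos] = "" if new == "del" else new
    return "".join(residues[pos] for pos in sorted(residues))


for name, mutations in MUTANTS.items():
    print(name, apply_mutations(WT_WINDOW, WT_START, mutations))
```

Under these assumptions the four outputs are RDLTGQFRPTANPSTST, RDLTGQFRPTANPSST, RDQTGQFRPTAAPST and RDQTGQFRPSPAYAST, which agree with the corresponding stretches of the four mutant listings. The additional E110N, E119N, E126N and K209P substitutions of SEQ ID NO 17 lie outside this window and are not modelled here.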
8. A mutant of a porin monomer, wherein the amino acid of the mutant of a porin monomer comprises the sequence set forth in SEQ ID No. 1 or a sequence at least 99%, 98%, 97%, 96%, 95%, 90%, 80%, 70%, 60%, or 50% identical thereto, and the mutant of a porin monomer comprises:
(1) a mutation at one or more positions corresponding to Q62, K67, T69, A70, D71, S72, S73, Y74, S75, E110, E119, E126, and K209 of SEQ ID NO: 1;
(2) a mutation at one or more positions corresponding to Q62L, K67R, T69S, A70P, D71N/D71A, S72P/S72Y, S73A, Y74T/Y74 deletion, S75 deletion, E110N, E119N, E126N, and K209P of SEQ ID NO: 1;
(3) a mutation at K67, D71, S72, and/or Y74 corresponding to SEQ ID NO: 1, and additionally a mutation at at least one position of Q62, T69, A70, S73, S75, E110, E119, E126, and K209;
(4) a mutation corresponding to K67R, D71N/D71A, S72P/S72Y, and/or Y74T/Y74 deletion of SEQ ID NO: 1; or
(5) a mutation at at least one position corresponding to K67R, D71N/D71A, S72P/S72Y, and/or Y74T/Y74 deletion of SEQ ID NO: 1, and additionally Q62L, T69S, A70P, S73A, S75 deletion, E110N, E119N, E126N, and K209P.
9. A protein pore comprising at least one mutant of a porin monomer of any one of the preceding claims.
10. A protein pore according to claim 9, wherein the protein pore comprises at least two mutants of the porin monomer.
11. The protein pore of any one of claims 9-10, wherein the constriction zone pore diameter of the protein pore is from 0.7 nm to 2.2 nm, from 0.9 nm to 1.6 nm, from 1.4 nm to 1.6 nm, or a value given only as an image in the source (Figure FDA0003299348350000041).
12. A complex for characterizing a target analyte, comprising a protein pore according to any one of claims 9 to 11 and a rate controlling protein for use therewith.
13. A nucleic acid encoding a mutant of a porin monomer of any one of claims 1-8, a protein pore of any one of claims 9-11, or a complex of claim 12.
14. The nucleic acid of claim 13, wherein the nucleotide sequence of the porin monomer is the sequence set forth in SEQ ID NO 2.
15. A vector or a genetically engineered host cell comprising the nucleic acid of any one of claims 13-14.
16. Use of a mutant of a porin monomer of any one of claims 1 to 8, a protein pore of any one of claims 9 to 11, a complex of claim 12, a nucleic acid of any one of claims 13 to 14, or a vector or host cell of claim 15, for detecting the presence, absence or one or more characteristics of a target analyte or for the manufacture of a product for detecting the presence, absence or one or more characteristics of a target analyte.
17. A method of producing a protein pore or polypeptide thereof, comprising transforming a host cell with the vector of claim 15, and inducing the host cell to express the protein pore of any one of claims 9-11 or a polypeptide thereof.
18. A method for determining the presence, absence or one or more characteristics of a target analyte, comprising:
a. contacting a target analyte with the protein pore of any one of claims 9-11, the complex of claim 12, or the protein pore of the complex of claim 12 such that the target analyte moves relative to the protein pore; and
b. obtaining one or more measurements while the target analyte is moving relative to the protein pore, thereby determining the presence, absence or one or more characteristics of the target analyte.
19. The method of claim 18, wherein the method comprises:
the target analyte interacts with the protein pores present in the membrane such that the target analyte moves relative to the protein pores.
20. A kit for determining the presence, absence or one or more characteristics of a target analyte, comprising a mutant of a porin monomer of any one of claims 1 to 8, a protein pore of any one of claims 9 to 11, a complex of claim 12, a nucleic acid of any one of claims 13 to 14, or a vector or host cell of claim 15, and a component of a membrane as defined in claim 19.
21. A device for determining the presence, absence or one or more characteristics of a target analyte, comprising a protein pore according to any one of claims 9 to 11 or a complex according to claim 12, and a membrane as defined in claim 19.
22. The use, method, kit or device of any one of claims 16-21, wherein the target analyte comprises a polysaccharide, a metal ion, an inorganic salt, a polymer, an amino acid, a peptide, a protein, a nucleotide, an oligonucleotide, a polynucleotide, a dye, a drug, a diagnostic agent, an explosive, or an environmental contaminant;
preferably, the target analyte comprises a polynucleotide,
more preferably, the polynucleotide comprises DNA or RNA; and/or, the one or more characteristics are selected from (i) the length of the polynucleotide; (ii) the identity of the polynucleotide; (iii) the sequence of the polynucleotide; (iv) the secondary structure of the polynucleotide; and (v) whether the polynucleotide is modified; and/or, the rate controlling protein in the complex comprises a polynucleotide binding protein.
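Claims 1 and 8 define the mutants partly by percent identity to SEQ ID NO: 1. The claims as reproduced here do not state how identity is computed, so the following is only a sketch under stated assumptions: a simple global alignment (Needleman-Wunsch with match +1, mismatch -1, gap -2, all chosen arbitrarily for illustration) and identity defined as identical columns divided by alignment length. The function names are hypothetical, and the example window for positions 60-76 of SEQ ID NO: 1 is inferred from the claims rather than quoted.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned columns as a percentage of alignment length (one convention of several)."""
    x, y = global_align(a, b)
    matches = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * matches / len(x)


# Assumed wild-type window (positions 60-76) versus the same region of SEQ ID NO 18.
print(round(percent_identity("RDQTGQFKPTADSSYST", "RDQTGQFRPTAAPST"), 1))  # 70.6
```

Over a full-length sequence the handful of changed positions would give a much higher figure than this short window suggests. Other identity conventions, for example dividing by the length of the shorter sequence, would give different numbers.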

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111186286.3A CN115960182A (en) 2021-10-12 2021-10-12 Mutant of porin monomer, protein pore and application thereof

Publications (1)

Publication Number Publication Date
CN115960182A true CN115960182A (en) 2023-04-14

Family

ID=87358367

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111186286.3A Pending CN115960182A (en) 2021-10-12 2021-10-12 Mutant of porin monomer, protein pore and application thereof

Country Status (1)

Country Link
CN (1) CN115960182A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40084554; Country of ref document: HK)