WO2021049940A1 - Single-molecule FRET for protein characterization - Google Patents

Single-molecule FRET for protein characterization

Publication number
WO2021049940A1
Authority
WO
WIPO (PCT)
Prior art keywords
protein
barcode
tag
donor
chromophore
Prior art date
Application number
PCT/NL2020/050566
Other languages
French (fr)
Inventor
Chirlmin JOO
Mike FILIUS
Carlos DE LANNOY
Dick DE RIDDER
Original Assignee
Technische Universiteit Delft
Wageningen Universiteit
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Technische Universiteit Delft, Wageningen Universiteit
Publication of WO2021049940A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/536 Immunoassay; Biospecific binding assay; Materials therefor with immune complex formed in liquid phase
    • G01N33/542 Immunoassay; Biospecific binding assay; Materials therefor with immune complex formed in liquid phase with steric inhibition or signal modification, e.g. fluorescent quenching
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6804 Nucleic acid analysis using immunogens
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/58 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving labelled substances
    • G01N33/582 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving labelled substances with fluorescent label

Definitions

  • the invention relates to a method for the characterization of a protein.
  • the invention further relates to a system for characterization of a protein.
  • the invention further relates to a data carrier having stored thereon program instructions for the characterization of a protein.
  • WO0125794A2 describes that protein molecules of interest are isolated and modified into linearized protein molecules. Individual protein molecules are isolated for observation with hydrodynamic focusing apparatus, atomic force microscope, a separation plate, or the combination of isoelectric focusing gel, and electrophoresis gel. In the linearized protein molecule a first type of amino acid residue (K) is labelled with a first tag, and a second type of amino acid residue (C) is labelled with a second tag. Tags impart a detectable set of distinguishing characteristic ancillary properties to the linearized protein molecule that define the fingerprint thereof. Protein fingerprints are recovered as images by camera or fluorescence microscope, or as spectral analyses by optical detector. Compilations of fingerprints for known protein molecules comprise fingerprint libraries to which the fingerprint of a protein molecule of interest is compared.
  • US2016169903A1 describes a method for obtaining partial sequence information from a target protein, comprising (i) denaturing and elongating a protein, (ii) attaching docking strands to particular amino acids in the protein, (iii) capturing the protein on a substrate, (iv) repeatedly contacting the captured protein with fluorescently-labeled imager strands that transiently bind to the docking strand, and (v) imaging the substrate.
  • Proteins may be considered the basis of life since they are the workhorses in all living cells. The many thousands of different proteins may sustain the vast majority of functions of a cell, from copying DNA and catalyzing basic metabolism to producing cellular motion and more. For the understanding of biological processes and their regulation, including diseases, it may be critical to monitor the protein composition of cells, especially by identifying the proteins, such as by sequencing (i.e. determination of the amino acid sequence of proteins), and to determine protein structures. However, protein identification and structure determination may remain enormous challenges, especially when only small biological samples are available.
  • mass spectrometry (MS)-based identification techniques may determine the precise mass of protein fragments following the fragmentation of a protein by electron bombardment.
  • Current MS methods may generally suffer from several limitations. First, MS methods may only be capable of analyzing fragments of proteins, wherein information on those fragments is then used to reconstruct the full-length sequence, which may fail due to the combinatorial complexity, and which may imply a loss of information due to an uncertainty in which fragments correspond to the same protein, which may be particularly relevant with regards to proteoforms. Second, MS methods may often fail to recognize minor species among highly abundant species, since sequence prediction may be made through analysis of complex spectral peaks.
  • proteins may be post-translationally modified, providing additional combinatorial complexity for protein identification, especially given that it may typically not be fully known which post-translational modifications a given protein can undergo.
  • Protein structure elucidation may typically rely on a combination of computational methods and experimental measurements. Traditionally, protein structure elucidation may have relied on protein crystallization and X-ray analysis, which may be limited to proteins that can be successfully crystallized and may be relatively time-intensive and expensive. For decades now, the field of computational structure prediction may have garnered continuous attention due to the attractive outlook of structural prediction based on reference proteins without elaborate and expensive laboratory protocols. Recently, the introduction of deep learning implementations in structure prediction algorithms may have helped to break a short impasse in progress of ab initio modeling and may simultaneously have reduced the need for human expert input. The current top-performing implementations may construct residue distance matrices as an intermediate step. Experimental data regarding distances may benefit such computational structure predictions methods. Similarly, such data may also benefit protein structure models obtained based on X-ray crystallography.
  • FRET: Förster Resonance Energy Transfer
  • in FRET structural biology measurements, one may be limited in data acquisition by the number of fluorophores that can be used (simultaneously), which may be no more than 4 colors in one experiment based on the current state of the art.
  • the present invention may have as object to overcome or ameliorate at least one of the disadvantages of the prior art, or to provide a useful alternative.
  • the invention may provide an analysis method (also: “method”) for characterization of a (tagged) protein, especially a (tagged) protein complex, using FRET donor-acceptor pair chromophores.
  • the FRET donor-acceptor pair chromophores may comprise a first chromophore and a second chromophore.
  • the FRET donor-acceptor pair chromophores may have a (FRET) donor excitation radiation range, a (FRET) acceptor excitation radiation range, a (FRET) donor emission radiation range and a (FRET) acceptor emission radiation range.
  • one of the FRET donor-acceptor pair chromophores may be excitable by donor excitation radiation in the donor excitation radiation range, especially wherein the other of the FRET donor-acceptor pair chromophores (also: “acceptor chromophore”) may be configured to provide acceptor emission radiation in the FRET acceptor emission radiation range upon excitation with donor excitation radiation in the donor excitation radiation range of the one of the FRET donor-acceptor pair chromophores when the first chromophore and the second chromophore are configured within a predetermined distance (range).
  • the tagged protein may comprise a first amino acid tagged with a first tag and a second amino acid tagged with a second tag, especially wherein the first amino acid is different from the second amino acid.
  • the first tag may comprise the first chromophore or may be associated to the first chromophore.
  • the second tag may comprise an oligonucleotide (also: “nucleotide chain”), especially an oligonucleotide comprising a number of nucleotides selected from the range of 3 to 20, especially 3 to 15, such as 5 to 12.
  • the method may comprise a barcode exposure stage. In further embodiments, the method may comprise a pattern generation stage.
  • the method may comprise a distance estimation stage.
  • the barcode exposure stage may comprise exposing the tagged protein to a (second) barcode, wherein the (second) barcode is configured to hybridize with the second tag, and wherein the (second) barcode comprises the second chromophore.
  • the barcode exposure stage may further comprise providing radiation having a wavelength selected from the donor excitation radiation range to the protein.
  • the barcode exposure stage may further comprise measuring emission in (the donor emission radiation range and/or the acceptor emission radiation range, especially in) the donor emission radiation range and the acceptor emission radiation range, especially to provide an emission signal.
  • the pattern generation stage may comprise generating a FRET efficiency pattern based on the emission signal.
  • the distance estimation stage may comprise estimating a distance between the first amino acid and the second amino acid based on the FRET efficiency pattern and/or the emission signal, especially based on the FRET efficiency pattern, or especially based on the emission signal, especially by determining FRET efficiency.
  • the method may enable quickly and precisely determining the distances between the first amino acid and a plurality of second amino acids, between a plurality of first amino acids and a second amino acid, and/or between a plurality of first amino acids and a plurality of second amino acids (see further below).
  • (3D) spatial information and/or protein characterization may be provided.
  • the invention may thus provide an advantageous protein characterization method based on single-molecule FRET analysis.
  • the invention may provide an advantageous distance measurement method based on single-molecule FRET analysis.
  • the method may in particular be suitable for one or more of: analysis of a nanometer-sized object such as a protein; single-molecule protein sequencing; analyzing proteoforms such as alternatively spliced proteins; single-molecule post-translational modification analysis; and single-molecule protein structure analysis.
  • the analysis method may enable directly analyzing, especially identifying, the sequence of full-length proteins, which may (greatly) improve the accuracy of protein identification.
  • the method may be suitable for single-molecule analysis, which may provide the ultimate sensitivity (one molecule) and allow the sequencing of proteins present in an amount 3-5 orders of magnitude smaller than what may presently be required for mass spectrometry.
  • the method may be suitable for single-cell proteomics, and may furthermore be suitable for real-time screening for on-site medical diagnostics.
  • the method may provide a FRET efficiency pattern based on one or more emission signals, especially based on a plurality of emission signals.
  • the FRET efficiency pattern may be characteristic of the protein.
  • the FRET efficiency pattern may comprise a protein fingerprint.
  • the analysis method may herein also be referred to as single-molecule superresolution FRET (ssFRET).
  • ssFRET: single-molecule superresolution FRET
  • the invention may provide an analysis method for characterization of a protein.
  • the analysis method may involve analyzing a protein to determine a protein characteristic.
  • characterization of a protein may herein refer to one or more of identifying the protein, especially via sequencing, or determining (at least part of) the 3D protein structure.
  • the protein may especially be a tagged protein, i.e., one or more tags may be provided to (or: “attached to”) the protein.
  • the tagged protein may comprise a first amino acid tagged with a first tag, and especially a second amino acid tagged with a second tag.
  • the first amino acid and the second amino acid may be independently selected from the group comprising cysteine, lysine, methionine, tyrosine, an amino acid comprising a C-terminal carboxyl group, and an amino acid comprising an N-terminal amine group.
  • the first amino acid and the second amino acid may be the same type of amino acid, especially wherein the first amino acid comprises a terminal amino acid, and/or especially wherein the second amino acid comprises a terminal amino acid.
  • the first amino acid and the second amino acid may be different, especially the first amino acid and the second amino acid may be different types of amino acids.
  • the first amino acid may, for example, comprise an amino acid selected from the group comprising cysteine, lysine, methionine, and tyrosine
  • the second amino acid may comprise an amino acid selected from the group comprising cysteine, lysine, methionine, and tyrosine differing from the first amino acid.
  • amino acid comprising a C-terminal carboxyl group (also: “C-terminal amino acid”)
  • amino acid comprising an N-terminal amine group (also: “N-terminal amino acid”)
  • tag and similar terms may herein refer to a molecule that attaches to an amino acid, especially wherein the tag (covalently) binds the amino acid.
  • the tag may be selected to be specific for a target (type of) amino acid. It will be clear to the person skilled in the art how a (type of) amino acid can be specifically tagged using known chemical approaches, such as using maleimide chemistry or reductive amination-aldehyde chemistry.
  • a tag for cysteine may comprise one or more of a maleimide group, a haloacetyl group, and a pyridyl disulfide group.
  • a tag for lysine may comprise one or more of an NHS ester, and a tag provided by reductive amination-aldehyde chemistry.
  • a tag for methionine may comprise one or more of azide and alkyne groups, especially provided via oxidation with oxaziridine, such as described by Lin, Shixian, et al.
  • a tag for tyrosine may comprise one or more of a diazodicarboxylate group and a diazodicarboxamide group.
  • a tag for an amino acid comprising a C-terminal carboxyl group may comprise a group provided by a decarboxylative alkylation reaction.
  • a tag for an amino acid comprising an N-terminal amine group may comprise a group provided by one or more of NHS chemistry, 2PCA chemistry and an alkyne-ketene reaction.
  • the analysis method may especially relate to the use of FRET donor-acceptor pair chromophores.
  • FRET: Förster Resonance Energy Transfer
  • the term “FRET” may herein refer to the transfer of the energy of a donor chromophore to an acceptor chromophore, which may occur when the donor-acceptor pair chromophores are within a predetermined distance range, such as within several nanometers.
  • the FRET donor-acceptor pair chromophores may comprise a first chromophore and a second chromophore, wherein the FRET donor-acceptor pair chromophores have a donor excitation radiation range, an acceptor excitation radiation range, a donor emission radiation range and an acceptor emission radiation range, wherein one of the FRET donor-acceptor pair chromophores is excitable by donor excitation radiation in the donor excitation radiation range, wherein the other of the FRET donor-acceptor pair chromophores is configured to provide acceptor emission in the FRET acceptor emission radiation range upon excitation with donor excitation radiation in the donor excitation radiation range of the one of the FRET donor-acceptor pair chromophores when the first chromophore and the second chromophore are configured within a predetermined distance range.
  • when the donor chromophore and the acceptor chromophore are arranged within a predetermined distance range, which may vary for different FRET donor-acceptor pairs, the donor chromophore may upon excitation with donor excitation radiation transfer energy to the acceptor chromophore, whereupon the acceptor chromophore may emit acceptor emission radiation.
  • This energy transfer may occur with a specific FRET efficiency depending on the (exact) distance between the donor chromophore and the acceptor chromophore.
  • by determining the FRET (transfer) efficiency, information regarding the distance between the donor chromophore and the acceptor chromophore may be obtained.
  • the FRET transfer efficiency may be sensitive to subnanometer distance changes, which may make FRET an outstanding spectroscopic ruler for probing, for example, biological systems.
  • the FRET excitation and emission ranges may, for example, comprise wavelengths in the UV range, the visible light range, and/or the (N)IR range.
  • the FRET excitation and emission ranges may, for example, comprise a (sub)range selected from within the range of 200 - 1500 nm, especially from within the range of 400 - 800 nm.
  • the donor excitation radiation range may comprise a (sub)range selected from the range of 200 - 1500 nm, especially from the range of 400 - 800 nm.
  • the donor emission radiation range may comprise a (sub)range selected from the range of 200 - 1500 nm, especially from the range of 400 - 800 nm.
  • the acceptor excitation radiation range may comprise a (sub)range selected from the range of 200 - 1500 nm, especially from the range of 400 - 800 nm.
  • the acceptor emission radiation range may comprise a (sub)range selected from the range of 200 - 1500 nm, especially from the range of 400 - 800 nm.
  • the FRET excitation and emission ranges will in general depend on the used FRET pairs.
  • the FRET donor-acceptor chromophore pair may comprise Atto488 and Cy3, wherein Atto488 (the donor chromophore) and Cy3 (the acceptor chromophore) may be excited maximally at about 488 nm and 552 nm respectively, and wherein Atto488 and Cy3 may provide emission radiation at about 521 nm and 568 nm respectively.
  • the donor-acceptor chromophore pair may comprise Atto488 and Cy5, which may respectively be maximally excited at 488 nm and 650 nm, and may provide emission radiation at about 521 nm and 666 nm.
  • the donor-acceptor chromophore pair may comprise Cy3 and Cy5, which may respectively be maximally excited at 552 nm and 650 nm, and may provide emission radiation at about 568 nm and 666 nm.
  • the donor-acceptor chromophore pair may comprise Cy3 and Cy7, which may respectively be maximally excited at 552 nm and 750 nm, and may provide emission radiation at about 568 nm and 788 nm.
  • the donor-acceptor chromophore pair may comprise Cy5 and Cy7, which may respectively be maximally excited at 650 nm and 750 nm, and may provide emission radiation at about 666 nm and 788 nm.
  • the donor-acceptor chromophore pair may comprise a chromophore pair selected from the group comprising Atto488/Cy3, Atto488/Cy3b, Atto488/Cy5, Atto488/Atto647n, Cy3/Cy5, Cy3b/Cy5, Cy3/Cy7, Cy3b/Cy7, and Cy5/Cy7.
  • predetermined distance range and similar terms may herein especially refer to a distance range wherein FRET energy transfer can occur for the FRET donor-acceptor pair chromophores, which may vary for different sets of FRET donor-acceptor pair chromophores.
  • chromophore may herein especially refer to a fluorescent chemical molecule that upon excitation with light (e.g. radiation from a laser), emits light of a different wavelength.
  • the FRET donor-acceptor pair chromophores (also: “donor-acceptor pair chromophores”) may especially comprise fluorescent molecules and/or phosphorescent molecules.
  • the term “FRET donor-acceptor pair chromophores” may herein especially refer to two chromophores capable of FRET energy transfer, i.e., energy transfer in a non-radiative distance-dependent fashion, especially through dipole-dipole coupling of the donor chromophore and the acceptor chromophore.
  • the FRET donor-acceptor pair chromophores may comprise one or more pairs selected from the group comprising the Cyanine family, the Alexa family, the Atto family, the Dy family, and the Rhodamine family.
  • Different chromophore pairs may be sensitive at different distances, i.e., may provide a high effect regarding FRET efficiency for subnanometer distance changes (a high resolution).
  • the Cyanine family pair Cy3:Cy5 may be most sensitive at around a distance of 5 nm, such as at distances selected from the range of 3-7 nm.
  • the Cyanine family pair Cy3:Cy7 may be most sensitive at around a distance of 3 nm.
  • the Cyanine family pair Cy2:Cy3 may be most sensitive at around a distance of 7 nm.
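The distance at which a given pair is most sensitive can be illustrated numerically. The sketch below assumes the standard Förster relation E = 1 / (1 + (r/R0)^6) and a hypothetical Förster radius of 5.0 nm (the value and all function names are illustrative assumptions, not taken from this document). It locates the distance at which the efficiency changes fastest with distance, which lies slightly below the Förster radius.

```python
def fret_efficiency(r_nm, r0_nm):
    """Förster relation: E = 1 / (1 + (r / R0)^6)."""
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def sensitivity(r_nm, r0_nm, dr=1e-4):
    """Magnitude of dE/dr, estimated by a central finite difference."""
    upper = fret_efficiency(r_nm + dr, r0_nm)
    lower = fret_efficiency(r_nm - dr, r0_nm)
    return abs(upper - lower) / (2.0 * dr)


def most_sensitive_distance(r0_nm, step=0.001):
    """Scan distances between 0 and 2*R0 and return the one with the steepest slope."""
    grid = [i * step for i in range(1, int(2.0 * r0_nm / step))]
    return max(grid, key=lambda r: sensitivity(r, r0_nm))


# Hypothetical Förster radius, chosen for illustration only.
R0_ASSUMED_NM = 5.0

if __name__ == "__main__":
    best = most_sensitive_distance(R0_ASSUMED_NM)
    print(f"steepest slope near r = {best:.2f} nm for R0 = {R0_ASSUMED_NM} nm")
```

Under these assumptions, a pair with a larger Förster radius would be most sensitive at proportionally larger distances, which is consistent with different pairs being suited to different distance ranges.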
  • the first chromophore may comprise the donor chromophore and the second chromophore may comprise the acceptor chromophore.
  • the second chromophore may comprise the donor chromophore and the first chromophore may comprise the acceptor chromophore.
  • the first tag may comprise the first chromophore or may be associated to the first chromophore. In further embodiments, the first tag may comprise the first chromophore. In further embodiments, the first tag may be associated to the first chromophore.
  • the term “associated” and similar terms may herein refer to two molecules being non-permanently connected, especially non-covalently connected, such as via hydrogen bond interactions. In particular, the first tag may be associated to the first chromophore via nucleotide hybridization (see further below).
  • the first tag and the first chromophore may dissociate during (part of) the method, i.e., the first tag and the first chromophore may be associated during at least part of the method, especially during at least part of the barcode exposure stage.
  • the second tag may comprise an oligonucleotide, especially an oligonucleotide comprising a number of nucleotides selected from the range of 3 to 20, especially 3 to 15, such as 5 to 12.
  • the second tag may comprise a moiety suitable for tagging a (specific) (type of) amino acid, and the second tag may comprise an oligonucleotide.
  • the oligonucleotide may especially comprise a subunit of deoxyribonucleic acid (DNA), ribonucleic acid (RNA) or peptide nucleic acid (PNA).
  • the oligonucleotide may comprise a plurality of (covalently bound) subunits of DNA, RNA, and/or PNA.
  • RNA may be relatively less stable than DNA and PNA, especially in biological environments, especially due to a relatively fast degradation.
  • the oligonucleotide may especially comprise a subunit of DNA or PNA.
  • the oligonucleotide chain may comprise a plurality of subunits of DNA and/or PNA.
  • the barcode exposure stage may comprise exposing the tagged protein to a second barcode (also: “barcode”), wherein the barcode is configured to hybridize with the second tag.
  • the second barcode may (also) comprise an oligonucleotide.
  • the second tag and the second barcode may comprise the same type of oligonucleotide, i.e., the second tag and the second barcode may both comprise a DNA subunit, or may both comprise an RNA subunit, or may both comprise a PNA subunit.
  • the second tag and the second barcode may hybridize, especially due to complementary nucleotide pairing. The duration of the hybridization may depend on the number of complementary nucleotide pairs (also: “base pairs”) between the second tag and the second barcode, wherein a larger number of complementary nucleotide pairs may result in a longer hybridization duration.
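Complementary pairing between a tag and a barcode can be sketched for the DNA case as follows. The sequences and function names are hypothetical, and the pairing model is a deliberate simplification: Watson-Crick pairs only, antiparallel end-to-end alignment, no mismatches or thermodynamics. The count of complementary pairs is the quantity that, according to the text above, may influence the hybridization duration.

```python
# Watson-Crick pairing for DNA subunits (simplified; RNA and PNA are not modelled).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the strand that pairs antiparallel with the full length of `seq`."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def complementary_pairs(tag, barcode):
    """Count complementary pairs when the barcode is aligned antiparallel
    to the tag, starting at the first nucleotide of the tag."""
    aligned = zip(tag.upper(), reversed(barcode.upper()))
    return sum(1 for t, b in aligned if COMPLEMENT[t] == b)


if __name__ == "__main__":
    tag = "ATTCGGAC"                               # hypothetical 8-nt second tag
    full_barcode = reverse_complement(tag)         # pairs with all 8 nucleotides
    short_barcode = reverse_complement(tag[:6])    # pairs with the first 6 only
    print(full_barcode, complementary_pairs(tag, full_barcode))
    print(short_barcode, complementary_pairs(tag, short_barcode))
```

In this simplified picture, shortening the barcode reduces the number of complementary pairs, which would correspond to a shorter hybridization duration.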
  • the (second) barcode may comprise the second chromophore.
  • the second chromophore may be located in proximity of the second amino acid.
  • the distance between the second amino acid and the second chromophore may depend on the length of the second tag and the second barcode, which may be tailored via the selection of the number of nucleotides in the second tag and/or the second barcode, i.e., via the length of the corresponding nucleotide chains.
  • the second chromophore may be brought into proximity with the first chromophore, especially at a distance within the predetermined distance range, such that FRET energy transfer may take place if donor excitation radiation in the donor excitation radiation range is provided to the tagged protein.
  • the barcode exposure stage may further comprise providing radiation having a wavelength selected from the donor excitation radiation range to the protein.
  • the barcode exposure stage may comprise measuring emission in the donor emission radiation range and the acceptor emission radiation range to provide an emission signal.
  • the emission signal may comprise no emission (if the donor chromophore is not excited), or may comprise emission from the first and/or second chromophore, i.e., the emission signal may comprise donor emission radiation and/or acceptor emission radiation. If the donor chromophore and the acceptor chromophore are too far apart or too close for FRET energy transfer, the emission signal may essentially only comprise donor emission radiation.
  • the emission signal may comprise acceptor emission radiation, donor emission radiation, or a mixture of donor emission radiation and acceptor emission radiation, wherein the composition of the emission signal may depend on the FRET (transfer) efficiency, which may depend on the distance between the first chromophore and the second chromophore.
  • the position of the chromophore attached to the first amino acid and/or the second amino acid may be adjusted, for example by using a different first barcode (or second barcode) hybridizing at a different location at the first tag (or second tag).
  • the method may comprise sequentially providing a plurality of first barcodes (or second barcodes) for a (specific) first tag (or second tag), wherein different first barcodes (or second barcodes) of the plurality of first barcodes (or second barcodes) hybridize with the first tag (or second tag) at different locations.
  • the analysis method may comprise a pattern generation stage.
  • the pattern generation stage may comprise generating a FRET efficiency pattern based on the emission signal, especially based on a plurality of emission signals, especially based on a plurality of emission signals related to a plurality of first barcodes and/or a plurality of second barcodes (see below).
  • the term “FRET efficiency pattern” may herein refer to data related to FRET efficiency measurements of the protein.
  • the FRET efficiency pattern may comprise emission-related data, such as the emission signal, especially related to FRET efficiency.
  • the FRET efficiency pattern may comprise the emission signal.
  • the FRET efficiency pattern may (also) comprise data based on the emission signal, such as binned data and/or processed data.
  • the FRET efficiency pattern may be the emission signal.
  • emission-related data and similar terms may herein refer to the emission signal, FRET efficiency and/or an estimated distance. Instead of the term “FRET efficiency pattern”, also the term “emission-related data” may be applied.
  • the pattern generation stage may comprise binning data based on the emission signal to provide binned data, wherein the FRET efficiency pattern comprises the binned data.
  • the pattern generation stage may comprise processing the emission signal to provide processed data, wherein the FRET efficiency pattern comprises the processed data.
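The binning step can be sketched as a normalized histogram of per-event FRET efficiencies. The bin count, the normalization, and the function name are illustrative assumptions; the text does not specify how the binned data are constructed.

```python
def bin_efficiencies(efficiencies, n_bins=10):
    """Bin FRET efficiency values in [0, 1] into `n_bins` equal-width bins and
    return the fraction of events per bin (a simple candidate fingerprint)."""
    counts = [0] * n_bins
    for e in efficiencies:
        if 0.0 <= e <= 1.0:
            # E == 1.0 is assigned to the last bin rather than overflowing.
            index = min(int(e * n_bins), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [c / total for c in counts]


if __name__ == "__main__":
    # Hypothetical efficiencies from repeated barcode binding events.
    events = [0.05, 0.15, 0.15, 1.0]
    print(bin_efficiencies(events))
```

Normalizing to fractions makes patterns from measurements with different numbers of binding events comparable, which is one possible design choice when comparing a pattern against reference fingerprints.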
  • the FRET efficiency pattern may comprise a protein fingerprint (of the protein).
  • the method may further comprise a distance estimation stage comprising estimating a distance between the first amino acid and the second amino acid based on the FRET efficiency pattern, especially based on the emission signal, especially by determining the FRET efficiency.
  • the FRET efficiency (E) may be defined as:
  • the distance between the donor chromophore and the acceptor chromophore may then be estimated by comparing the measured value of E (equation above) to an estimated value of the FRET efficiency $E_e$ as a function of the distance $r$: $E_e(r) = \frac{1}{1 + (r/R)^6}$, wherein $R$ is the Förster radius, which may be specific for the donor-acceptor pair.
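The relations involved can be written out as follows. This is a sketch under stated assumptions: the expression for the measured efficiency is the uncorrected intensity ratio commonly used in single-molecule FRET, with $I_D$ and $I_A$ denoting the donor and acceptor emission intensities (symbols assumed here, not taken from this text), and the distance dependence is the standard Förster relation with Förster radius $R$.

```latex
E = \frac{I_A}{I_A + I_D},
\qquad
E_e(r) = \frac{1}{1 + (r/R)^6}
\quad\Longrightarrow\quad
r = R \left( \frac{1}{E} - 1 \right)^{1/6}
```

Setting $E_e(r)$ equal to the measured $E$ and solving for $r$ gives the expression on the right; for example, $E = 0.5$ corresponds to $r = R$.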
  • the distance between the first amino acid and the second amino acid may then be estimated based on the estimated distance between the donor chromophore and the acceptor chromophore, as well as, for example, based on the (length of) the tags and the barcodes.
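A minimal sketch of the chromophore distance estimate, assuming the uncorrected intensity ratio as the measured FRET efficiency and the standard Förster relation for the distance dependence. The function names, the intensity values, and the Förster radius of 5.0 nm are hypothetical. The contribution of tag and barcode lengths to the amino acid distance is not modelled here.

```python
def measured_efficiency(donor_intensity, acceptor_intensity):
    """Uncorrected FRET efficiency from donor and acceptor emission intensities
    (assumed definition: E = I_A / (I_A + I_D))."""
    total = donor_intensity + acceptor_intensity
    if total <= 0:
        raise ValueError("total emission intensity must be positive")
    return acceptor_intensity / total


def chromophore_distance(efficiency, r0_nm):
    """Invert E = 1 / (1 + (r / R0)^6) to obtain r = R0 * (1/E - 1)^(1/6)."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    R0_ASSUMED_NM = 5.0  # hypothetical Förster radius for illustration
    e = measured_efficiency(donor_intensity=300.0, acceptor_intensity=100.0)
    r = chromophore_distance(e, R0_ASSUMED_NM)
    print(f"E = {e:.2f}, estimated donor-acceptor distance = {r:.2f} nm")
```

The strict bounds on the efficiency reflect that the inversion is undefined at E = 0 and E = 1, where the chromophores are effectively outside the range in which the relation can resolve a distance.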
  • the invention may provide an analysis method for characterization of a tagged protein using FRET donor-acceptor pair chromophores, wherein the FRET donor-acceptor pair chromophores comprise a first chromophore and a second chromophore, wherein the FRET donor-acceptor pair chromophores have a donor excitation radiation range, a donor emission radiation range and an acceptor emission radiation range, wherein one of the FRET donor-acceptor pair chromophores is excitable by donor excitation radiation in the donor excitation radiation range, wherein the other of the FRET donor-acceptor pair chromophores is configured to provide acceptor emission radiation in the acceptor emission radiation range upon excitation with donor excitation radiation in the donor excitation radiation range of the one of the FRET donor-acceptor pair chromophores when the first chromophore and the second chromophore are configured within a predetermined distance, wherein the tagged protein comprises a first amino acid tagged with a
  • the analysis method may further comprise a distance estimation stage comprising estimating a distance between the first amino acid and the second amino acid based on the FRET efficiency pattern, especially based on the emission signal.
  • oligonucleotide may herein refer to a chain of nucleotides with a relatively small number of subunits.
  • an oligonucleotide may comprise DNA subunits, RNA subunits, or PNA subunits; however, the oligonucleotide may also comprise combinations thereof.
  • An oligonucleotide may comprise, for example, up to 200 nucleotides, such as up to 200 DNA subunits and/or PNA subunits.
  • the first amino acid may comprise a first post-translational modification, especially wherein the first tag is attached to the first post-translational modification; and/or the second amino acid may comprise a second post-translational modification, especially wherein the second tag is attached to the second post-translational modification.
  • the method may comprise providing a first tag (and/or second tag) that tags a (specific) amino acid independent of whether or not the amino acid comprises a post-translational modification (“PTM”).
  • the method may comprise providing a first tag (and/or second tag) that tags a (specific) amino acid if the amino acid is post-translationally modified, especially with a specific post-translational modification.
  • the method may comprise tagging a post-translationally modified amino acid with a first tag (and/or second tag).
  • the presence of a PTM in a protein may be a key signature of several diseases in neurology, oncology and immunology. It has, however, been challenging to detect PTMs accurately.
  • it may generally be difficult to determine (1) whether a PTM is present given a small amount of sample; (2) if it is present, to what degree the PTM has occurred within a tissue of interest; and (3) at which amino acid residue of a protein the PTM is located.
  • Above-mentioned embodiment may facilitate specifically locating the position of a PTM in a protein on a single-molecule basis.
  • the first amino acid may have been post-translationally modified via one or more of phosphorylation, O-linked glycosylation, acetylation, methylation, nitration, farnesylation, palmitoylation, myristoylation, and S-nitrosylation.
  • the PTM may be selected from the group comprising a phosphoryl group, an O-glycan, an acetyl group, a methyl group, a nitro group, a farnesyl group, a palmitoyl group, a myristoyl group, and an S-nitrothiol group.
  • the first post-translational modification may be selected from the group comprising a phosphoryl group, an O-glycan, an acetyl group, a methyl group, a nitro group, a farnesyl group, a palmitoyl group, a myristoyl group, and an S-nitrothiol group.
  • the second post-translational modification may be selected from the group comprising a phosphoryl group, an O-glycan, an acetyl group, a methyl group, a nitro group, a farnesyl group, a palmitoyl group, a myristoyl group, and an S-nitrothiol group.
  • PTM may also refer to a plurality of (different) PTMs.
  • the PTM may comprise a phosphoryl group, wherein the first tag (or second tag) may be provided to the first amino acid (or second amino acid) via one or more of Beta-elimination/Michael addition (also “BEMA”) and/or Phosphoramidate chemistry.
  • BEMA may comprise the elimination of the phosphoryl group using a saturated Ba(OH)2 solution, resulting in the formation of an α,β-unsaturated carbonyl compound. This α,β-unsaturated carbonyl can be used as a Michael acceptor, to which a nucleophilic compound containing an enrichment tag may be attached.
  • a phosphorylated serine or a phosphorylated threonine may be tagged with a tag comprising a nucleophile (e.g. a thiol or amine) and an oligonucleotide.
  • Phosphoramidate chemistry may comprise attaching a cysteamine to the phosphate group of an amino acid, wherein the addition of the cysteamine will result in a free sulfhydryl.
  • the tag may then be attached to the free sulfhydryl groups using maleimides, haloacetyls or pyridyl disulfides, i.e., the tag may comprise a tagging group selected from the group comprising maleimides, haloacetyls, and pyridyl disulfides.
  • the duration of hybridization between a barcode and a tag may depend (in part) on the number of complementary residues between the barcode and the tag. For example, barcodes with 5 to 10 nucleotides complementarity to a DNA tag may provide an off-rate in the range of 0.1-10 s⁻¹, whereas barcodes with 10 to 20 nucleotides complementarity to a DNA tag may provide an off-rate in the range of 0.001-0.1 s⁻¹.
  • the binding affinity between PNA-DNA, PNA-PNA, RNA-DNA and RNA-RNA may be higher than the binding affinity of DNA-DNA.
  • the binding affinities of PNA-DNA and PNA-PNA may be substantially higher.
  • the same off-rate may be achieved with a shorter (complementary section of a) barcode for, for example, PNA-DNA binding than for DNA-DNA binding.
  • the person skilled in the art will be capable of selecting the type and length of the barcodes in view of the desired off-rate.
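The relation between off-rate and hybridization duration can be illustrated with a short Python sketch. It assumes simple first-order dissociation, so that the mean bound time is the reciprocal of the off-rate; this kinetic model and the function names are assumptions made for illustration and are not stated in the text.

```python
import math


def mean_bound_time_s(off_rate_per_s: float) -> float:
    """Mean hybridization duration for first-order dissociation (assumed model)."""
    if off_rate_per_s <= 0:
        raise ValueError("off-rate must be positive")
    return 1.0 / off_rate_per_s


def fraction_still_bound(off_rate_per_s: float, elapsed_s: float) -> float:
    """Fraction of barcodes still hybridized after elapsed_s seconds."""
    return math.exp(-off_rate_per_s * elapsed_s)


# Off-rate ranges quoted above for DNA tags.
short_complementarity = [mean_bound_time_s(k) for k in (0.1, 10.0)]   # 5 to 10 nt
long_complementarity = [mean_bound_time_s(k) for k in (0.001, 0.1)]   # 10 to 20 nt
```

Under this assumption the quoted off-rates correspond to mean bound times of roughly 0.1 to 10 s for the shorter complementary sections and 10 to 1000 s for the longer ones.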
  • the (second) barcode may comprise m nucleotides complementary with the second tag.
  • the person skilled in the art may select the number of complementary nucleotides to provide a desired off-rate (also: “dissociation rate”).
  • the protein may comprise a plurality of second amino acids, especially a plurality of second amino acids of the same type, tagged with (two or more) different second tags, i.e., each of the plurality of second amino acids is tagged with a different (or: unique) second tag.
  • the plurality of second amino acids may comprise at least 2 amino acids, such as at least 3 amino acids, especially at least 4 amino acids, such as at least 5 amino acids.
  • the plurality of second amino acids may comprise at most 70 amino acids, such as at most 50 amino acids, especially at most 40 amino acids, such as at most 25 amino acids.
  • the different second tags may comprise different nucleotides, especially different oligonucleotides, i.e., each of the second tags may comprise a different oligonucleotide.
  • the barcode exposure stage may comprise sequentially exposing the (tagged) protein to a plurality of different barcodes, especially wherein for each second tag the barcode exposure stage comprises exposing the protein to a corresponding barcode (of the plurality of different barcodes) configured to hybridize with the respective second tag.
  • the barcode exposure stage may comprise exposing the protein to one or more corresponding barcodes for each of the second tags, i.e., for each second tag one or more barcodes hybridizing with that tag are (sequentially) provided.
  • the plurality of different barcodes may especially be configured to uniquely hybridize to a specific second tag of the plurality of second tags.
  • the barcode exposure stage may comprise, during the exposure of the protein to each of the barcodes, providing radiation having a wavelength selected from the donor excitation radiation range to the tagged protein, and measuring emission in the donor emission radiation range and the acceptor emission radiation range to provide an emission signal.
  • the barcode exposure stage may comprise repeatedly or (essentially) continuously, especially repeatedly, or especially continuously, providing radiation having a wavelength selected from the donor excitation radiation range to the tagged protein, and measuring emission in the donor emission radiation range and the acceptor emission radiation range to provide an emission signal. Thereby, emission signals may be obtained for each barcode exposure.
  • the barcode exposure stage may comprise sequentially providing different second barcodes to the tagged protein, wherein each barcode uniquely hybridizes to a second tag, thereby (potentially) modifying the emission signal of the protein depending on the distance between the second chromophore (comprised by the second tag) and the first chromophore (comprised by or associated to the first tag), thereby providing a plurality of emission signals.
  • the distance between the first amino acid and each of the second amino acids may be estimated based on the FRET efficiency pattern, especially based on a plurality of emission signals.
  • the method may provide, for example, estimated distances between a first amino acid, for example the N-terminal amino acid, and (essentially) a plurality of second amino acids, for example all amino acids of a specific type, such as cysteine, in the (tagged) protein.
  • the donor chromophore may thus in general interact with at most one acceptor chromophore at a given timepoint.
  • the protein may have five second amino acids tagged with different tags, each with a different corresponding barcode. Then, the tagged protein may be exposed sequentially to the different barcodes, especially the protein may be flushed with complementary barcode sequences that contain the second chromophore one by one for each tag. Although beforehand it may not be known which second amino acid is probed at a given timepoint, each second amino acid may be probed one by one and high precision may be obtained.
  • the first tag may (also) comprise an oligonucleotide, especially an oligonucleotide comprising o₁ nucleotides, wherein o₁ is selected from the range of 3 to 25, such as 3 to 20, especially 5 to 15.
  • the barcode exposure stage may comprise exposing the (tagged) protein to a first barcode, especially wherein the first barcode is configured to hybridize with the first tag.
  • the first barcode may comprise the first chromophore.
  • the first barcode may comprise nucleotides complementary with the first tag.
  • the person skilled in the art may select the number of complementary nucleotides to provide a desired off-rate (also: “dissociation rate”).
  • m may be at most 30, such as at most 25, especially at most 20, such as at most 15, especially at most 10, such as at most 7, especially at most 5.
  • m may be selected from the range of 10-20.
  • the off-rate between the first tag and first barcode may differ from the off-rate between the second tag and the second barcode.
  • the different off-rates may be selected such that during the hybridization of the first tag with the first barcode (or: second tag with the second barcode), a plurality of second tags (or: first tags) may hybridize with corresponding second barcodes (or: first barcodes). This may be particularly beneficial for sequentially providing both different first barcodes and second barcodes.
  • the protein may comprise a plurality of first amino acids, especially of the same type, tagged with (two or more) different first tags, especially wherein the different first tags comprise different nucleotide sequences.
  • each of the plurality of first amino acids may be tagged with a different first tag.
  • the plurality of first amino acids may comprise at least 2 amino acids, such as at least 3 amino acids, especially at least 4 amino acids, such as at least 5 amino acids.
  • the plurality of first amino acids may comprise at most 70 amino acids, such as at most 50 amino acids, especially at most 40 amino acids, such as at most 25 amino acids.
  • the barcode exposure stage may comprise sequentially exposing the protein to a plurality of different first barcodes, especially wherein for each first tag the barcode exposure stage comprises exposing the protein to a corresponding first barcode (of the different first barcodes) configured to hybridize with the respective first tag.
  • the barcode exposure stage may comprise exposing the protein to one or more corresponding first barcodes for each of the first tags, i.e., for each first tag one or more first barcodes hybridizing with that first tag are (sequentially) provided.
  • the plurality of different first barcodes may especially be configured to uniquely hybridize to a specific first tag of the plurality of first tags. Thereby, estimated distances between a plurality of first amino acids and a second amino acid may be obtained.
  • the tagged protein may comprise both a plurality of (tagged) first amino acids and a plurality of (tagged) second amino acids.
  • the barcode exposure stage may especially comprise exposing the protein to a plurality of first barcodes and to a plurality of second barcodes, especially wherein during the exposure to each of the first barcodes, the protein is (sequentially) exposed to each of the second barcodes.
  • the method may provide estimated distances between (each of) the plurality of first amino acids and (each of) the plurality of second amino acids.
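The nested exposure order described above, in which the protein is exposed to each second barcode in turn while a given first barcode is present, can be sketched in Python as follows. The barcode labels are hypothetical placeholders.

```python
from itertools import product

# Hypothetical barcode labels.
first_barcodes = ["F1", "F2"]
second_barcodes = ["S1", "S2", "S3"]

# product() varies its last argument fastest, so every second barcode
# is visited for one first barcode before moving on to the next one.
exposure_schedule = list(product(first_barcodes, second_barcodes))

for first, second in exposure_schedule:
    # Placeholder for: expose the protein, excite the donor, record emission.
    pass
```

Each entry of the schedule corresponds to one first/second amino acid pair for which a distance may be estimated.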
  • the analysis method may also comprise simultaneously providing one or more of a plurality of first barcodes and a plurality of second barcodes.
  • the tagged protein may thus be simultaneously associated with a plurality of first (or second) chromophores, which may result in a composite signal related to a plurality of FRET pairs.
  • the signals corresponding to different FRET pairs may need to be distinguished.
  • the off-rates (or dissociation rates) between the barcodes and the tags may be selected such that during at least part of the barcode exposure stage only a single first chromophore and only a single second chromophore are associated with the tagged protein, i.e., the off-rates may be selected to be relatively large in order to temporally separate the signals corresponding to different FRET pairs.
  • the first barcode (or the second barcode) may have 5 to 10 nucleotides complementary to the first tag (or the second tag), which may provide an off-rate in the range of 0.1-10 s⁻¹.
  • m may be selected to provide an off-rate larger than 0.01 s⁻¹, such as larger than 0.1 s⁻¹, especially larger than 1 s⁻¹.
  • signals corresponding to different FRET pairs may be temporally separated via (stochastic) activation and inactivation of chromophores.
  • the off-rates may be selected to be relatively small, i.e., the barcodes are associated to the tags for a relatively long time, but the chromophores may switch between active and inactive states during the barcode exposure stage (see below).
  • the first barcode (or the second barcode) may have 10 to 20 nucleotides complementary to the first tag (or the second tag), which may provide an off-rate in the range of 0.001-0.1 s⁻¹.
  • m may especially be selected from the range of 10-25, such as from the range of 10-20, especially from the range of 10-15.
  • m may especially be selected from the range of 10-25, such as from the range of 10-20, especially from the range of 10-15, or especially from the range of 15-20.
  • m may be selected to provide an off-rate smaller than 0.1 s⁻¹, such as smaller than 0.01 s⁻¹, especially smaller than 0.005 s⁻¹.
  • Such embodiments may be beneficial as a lower concentration of second barcode can be used due to longer hybridization times.
  • one or more of the first chromophore and the second chromophore may (be configured to) (stochastically) switch between an active and an inactive state, wherein the chromophore does not perform FRET in the inactive state.
  • one or more of the first chromophore and the second chromophore may be configured to switch between an active state and an inactive state, especially the first chromophore may be configured to switch between an active state and an inactive state, or especially the second chromophore may be configured to switch between an active and an inactive state.
  • a chromophore may switch between an active state and an inactive state as a function of radiation and/or redox agents.
  • Cy5 may switch from an inactive state to an active state when exposed to green light, and may switch from the active state to an inactive state when exposed to red light and a reducing agent.
  • ATTO655 may switch from an inactive state to an active state when exposed to an oxidizing agent, and may switch from the active state to the inactive state when exposed to red light and a reducing agent.
  • the analysis method may comprise providing conditions in which one or more of the first chromophore and the second chromophore switches between an active state and an inactive state, especially wherein the first chromophore switches between an active state and an inactive state, or especially wherein the second chromophore switches between an active state and an inactive state.
  • the analysis method may comprise providing switching radiation to the tagged protein, wherein the switching radiation is selected in order to switch the first chromophore (or the second chromophore) between an active state and an inactive state.
  • the analysis method may comprise providing a redox agent to the tagged protein, wherein the redox agent is suitable to switch the first chromophore (or the second chromophore) between an active state and an inactive state.
  • the off-rate may further be influenced by, for example, a liquid the tagged protein is exposed to.
  • the analysis method, especially the barcode exposure stage, may comprise exposing the tagged protein to a dissociation buffer configured to increase the off-rate, especially the off-rate of the first tag and the first barcode, or especially the off-rate of the second tag and the second barcode.
  • the barcode exposure stage may comprise exposing the tagged protein to the dissociation buffer in between the exposing of the tagged protein to different barcodes.
  • the analysis method may, for example, comprise providing a barcode of the different barcodes to the tagged protein, wherein the barcode hybridizes with the second tag (or the first tag), subsequently providing the dissociation buffer to dissociate the barcode and the second tag, and subsequently providing another barcode of the different barcodes to the tagged protein, wherein the other barcode hybridizes with the second tag (or the first tag).
  • the dissociation buffer may especially comprise one or more of deionized water, urea, formamide, and a strong base, especially deionized water.
  • the off-rate may also be influenced by temperature, i.e., the off-rate may increase at higher temperatures.
  • the barcode exposure stage may comprise exposing the tagged protein to an elevated temperature in between the exposing of the tagged protein to different barcodes.
  • the elevated temperature may be selected from the range of 40 - 90°C, such as from the range of 40 - 80°C, especially from the range of 50 - 70°C, such as from the range of 55 - 70°C. It will be clear to the person skilled in the art that the selection of the elevated temperature may depend on the (complementary) sequence. For example, with respect to DNA-DNA hybridization, a higher elevated temperature may be selected for a sequence with a high G/C content than for a sequence with a high A/T content as the G-C pairing may be more stable than the A-T pairing.
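The dependence on G/C content can be illustrated with the Wallace rule of thumb for short DNA oligonucleotides, which assigns about 2 °C per A/T pair and 4 °C per G/C pair. This rule is not taken from the text and is only a coarse approximation; the sequences below are hypothetical, and only the ordering of the two results is meant to be informative.

```python
def wallace_tm_celsius(sequence: str) -> int:
    """Rough melting temperature via the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = sequence.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    if at_count + gc_count != len(seq):
        raise ValueError("sequence may only contain A, C, G and T")
    return 2 * at_count + 4 * gc_count


# Two hypothetical 8-nucleotide complementary sections of equal length.
gc_rich_tm = wallace_tm_celsius("GCGCCGGC")
at_rich_tm = wallace_tm_celsius("ATATTAAT")
```

The G/C-rich section yields the higher estimate, consistent with selecting a higher elevated temperature for such sequences.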
  • the tagged protein may comprise a plurality of second amino acids tagged with second tags, wherein the barcode exposure stage comprises exposing the tagged protein to a barcode configured to hybridize with two or more of the plurality of second tags, wherein the barcode comprises m nucleotides complementary with the second tag.
  • m may be selected from the range of 5-10, especially from the range of 5-9, such as from the range of 5-8.
  • the off-rate may be sufficiently high that the association of the barcode to different second tags may be temporally separated, and thus that the signal from different FRET pairs may be temporally separated.
  • m may be selected from the range of 10-20, especially from the range of 11-15.
  • the off-rate may be relatively low, which may cause multiple second tags to simultaneously be associated to the barcode.
  • one or more of the first chromophore and the second chromophore may be configured to switch between an active state and an inactive state. Thereby, the signal from different FRET pairs may be temporally separated.
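Why a high off-rate helps to separate FRET pairs in time can be sketched with a simple occupancy estimate in Python. The sketch assumes that each tag is an independent two-state binding site with an effective (pseudo-first-order) on-rate; this model, the on-rate value, the number of tags and the function names are all hypothetical.

```python
def occupancy(on_rate_per_s: float, off_rate_per_s: float) -> float:
    """Steady-state probability that a single tag carries a barcode."""
    return on_rate_per_s / (on_rate_per_s + off_rate_per_s)


def prob_exactly_one_bound(n_tags: int, p_bound: float) -> float:
    """Binomial probability that exactly one of n independent tags is occupied."""
    return n_tags * p_bound * (1.0 - p_bound) ** (n_tags - 1)


n_tags = 5
effective_on_rate = 0.1  # per second, hypothetical

# High off-rate (short complementary section) versus low off-rate (long section).
fast_release = prob_exactly_one_bound(n_tags, occupancy(effective_on_rate, 1.0))
slow_release = prob_exactly_one_bound(n_tags, occupancy(effective_on_rate, 0.01))
```

With these illustrative numbers a single bound barcode is far more likely at the high off-rate, whereas at the low off-rate several tags tend to be occupied at once, which is the situation in which switching of the chromophores may be used instead.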
  • the tagged protein may comprise a plurality of first amino acids tagged with first tags, wherein the barcode exposure stage comprises exposing the tagged protein to a first barcode configured to hybridize with two or more of the plurality of first tags, wherein the first barcode comprises nucleotides complementary with the first tag.
  • the plurality of first amino acids may especially comprise (essentially) all amino acids in the protein of a first type, wherein the first type is especially selected from the group cysteine, lysine, methionine, and tyrosine.
  • the plurality of first amino acids may especially comprise (essentially) all amino acids of the first type in the protein.
  • the plurality of first amino acids may comprise all cysteines in the protein.
  • the plurality of second amino acids may especially comprise (essentially) all amino acids in the protein of a second type, wherein the second type is especially selected from the group cysteine, lysine, methionine, and tyrosine.
  • the plurality of second amino acids may especially comprise (essentially) all amino acids of the second type in the protein.
  • the plurality of second amino acids may comprise all tyrosines in the protein.
  • the first type may be different from the second type.
  • the first tag may comprise the same number of nucleotides as the first barcode. In further embodiments, the first tag may comprise fewer nucleotides than the first barcode. In further embodiments, the first tag may comprise more nucleotides than the first barcode.
  • the second tag may comprise the same number of nucleotides as the second barcode. In further embodiments, the second tag may comprise fewer nucleotides than the second barcode. In further embodiments, the second tag may comprise more nucleotides than the second barcode.
  • the first tag (and/or second tag) may have more nucleotides than the first barcode (and/or second barcode), as this may facilitate providing a plurality of barcodes corresponding to the same tag.
  • the plurality of barcodes may hybridize at different locations along the tag, thereby providing for a tuneable distance between the first amino acid (or second amino acid) and the first chromophore (or second chromophore).
  • the combination of tag and corresponding barcode may thus be considered a “linker” connecting the first amino acid (or second amino acid) to the first chromophore (or second chromophore).
  • the combination of tag and corresponding barcodes may be considered a variable linker.
  • the analysis method may further comprise a tagging stage.
  • the tagging stage may especially be arranged prior to the barcode exposure stage.
  • the tagging stage may comprise tagging the first amino acid with the first tag and/or tagging the second amino acid with the second tag in a protein to provide the tagged protein.
  • the analysis method may provide a (processed) emission signal, especially a FRET efficiency or distance.
  • the (processed) emission signal, such as the FRET efficiency, may be used for one or more of protein identification, protein structure determination, protein conformational change analysis, protein substrate binding analysis, protein DNA binding analysis, and in situ experimentation (such as with fixed cells).
  • the analysis method may comprise one or more of protein identification, protein structure determination, protein conformational change analysis, protein substrate binding analysis, protein DNA binding analysis.
  • the analysis method may comprise further characterizing the protein using second FRET donor-acceptor pair chromophores, wherein the second FRET donor-acceptor pair chromophores differ from the FRET donor-acceptor pair chromophores.
  • the analysis method may comprise a fingerprint provision stage comprising providing a protein fingerprint based on the FRET efficiency pattern and/or the estimated distance, especially based on the FRET efficiency pattern, or especially based on the estimated distance.
  • protein fingerprint may herein refer to a protein-specific (unique) signal, especially wherein the protein fingerprint is suitable for identification of the protein.
  • the protein fingerprint may especially refer to one or more of an array of FRET efficiency values; an array of estimated distances; and/or raw data, especially one or more emission signals, obtained according to the method of the invention.
  • the FRET efficiency pattern may (essentially) comprise a protein fingerprint.
  • the FRET efficiency pattern may be processed (in the protein fingerprint provision stage) to provide the protein fingerprint.
  • the protein fingerprint may vary in dependence on, for example, the used FRET donor-acceptor pair chromophores, tags, and/or barcodes, i.e., there may be a plurality of (possible) protein fingerprints (unique) for the protein in dependence on the selected FRET donor-acceptor pair chromophores, (length of the) tags, (length of the) barcodes, and/or (length of) the complementary sequence between the tags and the barcodes.
  • the analysis method may comprise providing a plurality of protein fingerprints of the protein by varying one or more of FRET donor-acceptor pair chromophores, first tags, second tags, first barcodes and/or second barcodes, especially by varying the FRET donor-acceptor pair chromophores.
  • the protein fingerprint may comprise data related to emission signals (independently) obtained using a plurality of (different) FRET donor-acceptor pair chromophores.
  • the analysis method may comprise a protein identification stage comprising identifying the protein by comparing the protein fingerprint to protein-related information in reference data.
  • the protein-related information may comprise predetermined protein fingerprints, and the protein may be identified based on a comparison between the protein fingerprint and the predetermined protein fingerprints in the reference data.
  • the protein-related information may comprise predicted predetermined protein fingerprints, which may each be predicted based on a corresponding (known) protein structure of a protein.
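The comparison of a protein fingerprint with predetermined fingerprints in reference data can be sketched in Python. The sketch assumes that a fingerprint is an array of FRET efficiency values and that the closest reference in the Euclidean sense is taken as the match; the metric, the protein names and all values are hypothetical.

```python
import math


def fingerprint_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two equally long fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(measured: list[float], reference: dict[str, list[float]]) -> str:
    """Return the name of the reference fingerprint closest to the measurement."""
    return min(reference, key=lambda name: fingerprint_distance(measured, reference[name]))


# Hypothetical reference data: arrays of FRET efficiency values.
reference_fingerprints = {
    "protein_A": [0.82, 0.41, 0.15],
    "protein_B": [0.30, 0.65, 0.55],
}
measured_fingerprint = [0.79, 0.44, 0.18]
best_match = identify(measured_fingerprint, reference_fingerprints)
```

A practical comparison would likely also need a rejection threshold so that a protein absent from the reference data is not forced onto its nearest neighbour.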
  • the method according to the invention may be particularly suitable to identify different proteoforms of a protein, especially proteoforms arising from alternative splicing, as the method may not rely on protein fragmentation, may be versatile in the amino acids that can be tagged, and may be sensitive towards identifying “missing” amino acids due to single nucleotide polymorphisms (SNPs) and alternative splicing.
  • the term “missing amino acid” may especially refer to an amino acid that is typically present in a certain location in the protein, but has been replaced due to a mutation or is absent due to alternative splicing.
  • proteoforms may be important as several diseases, such as cystic fibrosis, cancer and Parkinson's disease, have been associated with mutations in their splicing elements that lead to alternative splicing and abnormal protein production.
  • the post-translational modification of proteins may be a key signature of several diseases in neurology, oncology and immunology.
  • proteoform may herein refer to all different forms of a protein that may be expressed from a single protein-encoding gene, wherein the difference may be due to alternative splicing and/or differences in post-translational modifications, as well as to different forms of a protein resulting from gene variations such as SNPs, i.e., the term “proteoform” may herein also refer to two proteins expressed from two different alleles of the same gene in two different individuals.
  • the protein identification stage may comprise identifying a proteoform of the protein, especially wherein the reference data comprises protein-related information pertaining to the proteoforms.
  • the protein identification stage may comprise identifying an alternatively spliced form of the protein, especially wherein the reference data comprises protein-related information pertaining to alternative splicing.
  • the reference data may comprise an (online) database.
  • the method may comprise retrieving protein-related information from the reference data, especially from the (online) database.
  • the fingerprint provision stage may comprise predicting a (partial) amino acid sequence based on the FRET efficiency pattern, especially based on the estimated distance, and may provide a protein fingerprint comprising the predicted (partial) amino acid sequence.
  • the protein-related information may comprise amino acid sequences.
  • the amino acid sequences may especially comprise translated nucleotide sequences, and/or amino acid sequences resulting from alternative splicing.
  • Such embodiments may be particularly beneficial with regards to the identification of linearized proteins, as the distance between two amino acid residues can be relatively directly translated to a relative position in the amino acid chain.
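For a linearized protein, the relative positions derived from the estimated distances could be compared with positions computed from candidate amino acid sequences. A minimal Python sketch of the sequence side of that comparison is given below; it takes the N-terminal residue as the first amino acid and cysteines as the second amino acids, and the candidate sequences and function name are hypothetical.

```python
def residue_offsets(sequence: str, residue: str, anchor_index: int = 0) -> list[int]:
    """Sequence separations between an anchor residue and every residue of a given type."""
    return [
        index - anchor_index
        for index, amino_acid in enumerate(sequence)
        if amino_acid == residue and index != anchor_index
    ]


# Hypothetical candidate sequences (one-letter amino acid codes).
candidates = {
    "candidate_1": "MKCAAGCTTLC",
    "candidate_2": "MCCKAAGTTLA",
}
cysteine_offsets = {
    name: residue_offsets(sequence, "C") for name, sequence in candidates.items()
}
```

The resulting offsets (in residues) form the predicted pattern against which a measured pattern of relative positions may be matched.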
  • the (tagged) protein may comprise a denatured protein.
  • “denatured” and similar terms herein especially refer to the protein having lost its secondary and tertiary structures, including any disulfide bonds between cysteine residues.
  • the analysis method may comprise a denaturation stage comprising denaturation of the protein.
  • Methods for the denaturation of proteins will be known to the person skilled in the art and may, for example, comprise exposing the protein to sodium dodecyl sulfate (SDS).
  • Denaturation of the protein may further facilitate tagging the first and second amino acids.
  • the tagging stage may be arranged after the denaturation stage.
  • the tagging stage may comprise tagging the first amino acid with the first tag and/or tagging the second amino acid with the second tag in the denatured protein.
  • the denatured tagged protein may be subjected to the barcode exposure stage, especially in embodiments for the identification of the protein, more especially in embodiments for the identification of the protein via the determination of a (partial) amino acid sequence.
  • the analysis method may further comprise a refolding stage.
  • the refolding stage may especially be arranged following the denaturation stage and the tagging stage.
  • the refolding stage may comprise refolding of the denatured protein. Specifically, some amino acids in a protein may be difficult to tag while the protein is in its folded state. Hence, the protein may be denatured, tagged, and refolded such that tags may be provided to amino acids that would otherwise be relatively difficult to tag.
  • Refolding of the tagged protein may be particularly relevant for embodiments regarding the determination of the structure of a protein.
  • the tags may, however, affect the 3D structure of the protein depending on properties of the tag such as size and hydrophobicity. It may thus be beneficial for structure prediction to employ relatively small tags to minimize the impact on protein structure.
  • the first tag and/or the second tag may comprise a tagging group selected from the group comprising a formylglycine-generating enzyme (FGE) tag (which allows aldehyde-hydrazide chemistry), a haloalkane dehalogenase tag (HALO tag), a His tag (6-8 histidines; tris-NTA compounds allow modification of His tags), a lipoic acid ligase tag (for azide chemistry), and a tubulin tyrosine ligase tag (enzymatic ligation to a modified tyrosine).
  • the analysis method may comprise a structure prediction stage.
  • the structure prediction stage may comprise predicting a protein structure of the (tagged) protein.
  • the structure prediction stage may comprise predicting the protein structure of the (untagged) protein based on the emission signal obtained from the tagged protein.
  • the structure prediction stage may especially comprise predicting the protein structure based on the emission signal, or especially based on the FRET efficiency, or especially based on the estimated distance, more especially based on a plurality of estimated distances.
  • the structure prediction stage may especially comprise (using) a computational process, such as a computational algorithm.
  • the tagged protein may be folded during the barcode exposure stage, i.e., the tagged protein does not comprise a denatured protein.
  • the tagged protein may comprise a folded protein.
  • the analysis method may both contribute to the refinement of existing model protein structures by providing a distance measurement between two amino acid residues, and may contribute to the generation of new model protein structures, especially by providing a plurality of distance measurements between a plurality of amino acid residues.
  • the analysis method may comprise a computational process comprising a refinement of a model protein structure based on the estimated distance.
  • the model protein structure may be based on X-ray crystallography and/or nuclear magnetic resonance (NMR)-spectroscopy.
  • the model protein structure may also be a predicted model protein structure based on amino acid identity or similarity with a protein having a known structure, such as a model protein structure based on homology modelling.
  • the measurements provided by the analysis method according to the invention may be used to detect differences between the homolog and the target protein beforehand and correct the initial model. This may be particularly valuable in regions of low sequence identity or potential hinge regions. Moreover, using the analysis method in ab initio modeling may result in a sufficiently high resolution model to serve as a starting point in the refinement process.
  • the estimated distance may comprise a plurality of estimated distances.
  • the computational process may comprise a de novo protein structure prediction, especially based on a distance matrix-based structure predictor, such as described by Adhikari, Badri, and Jianlin Cheng. "CONFOLD2: improved contact-driven ab initio protein structure modeling." BMC Bioinformatics 19.1 (2018): 22, which is hereby incorporated herein by reference.
  • the method according to the invention may make up for several shortcomings in computational structure predictors.
  • distance matrix-based methods may partially rely on a sequence homolog, especially a plurality of sequence homologs, with a known structure.
  • the homolog may be used to estimate inter-residue distances in the target protein, assuming that the target protein folds like the homolog. It only makes sense that the quality of predicted structures may decrease as the number and quality of available homologs decreases.
  • the distance information provided by the method according to the invention may aid in the construction and validation of distance matrices for such structures. Furthermore, the reliability of information provided by current methods for distance matrix construction may decrease with inter-residue distance.
  • the analysis method according to the invention may provide existing (computational) methods with a wealth of previously inaccessible information.
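  • By way of illustration only, the use of estimated distances for validating or ranking model protein structures may be sketched as follows. The function names, the toy coordinates, and the choice of a root-mean-square deviation as score are assumptions made for this sketch and are not taken from the description; in practice the chromophore-to-residue offset introduced by the tags would also have to be accounted for.

```python
import math

def restraint_rmsd(coords, restraints):
    """Root-mean-square deviation between model distances and estimated distances.

    coords:     dict mapping residue index -> (x, y, z) position in nm (hypothetical model)
    restraints: list of (residue_i, residue_j, estimated_distance_nm) tuples
    """
    squared_errors = []
    for i, j, estimated in restraints:
        model_distance = math.dist(coords[i], coords[j])
        squared_errors.append((model_distance - estimated) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def pick_best_model(models, restraints):
    """Return the name of the model that agrees best with the estimated distances."""
    return min(models, key=lambda name: restraint_rmsd(models[name], restraints))

# Two toy candidate models for the same protein (made-up coordinates).
models = {
    "compact":  {10: (0.0, 0.0, 0.0), 50: (3.0, 0.0, 0.0), 90: (3.0, 4.0, 0.0)},
    "extended": {10: (0.0, 0.0, 0.0), 50: (6.0, 0.0, 0.0), 90: (12.0, 0.0, 0.0)},
}
# Distances as they might be estimated from FRET efficiencies (made-up values).
restraints = [(10, 50, 3.0), (10, 90, 5.0)]

for name, coords in models.items():
    print(name, round(restraint_rmsd(coords, restraints), 3))
print("best:", pick_best_model(models, restraints))
```

  • The same score could serve as a penalty term during refinement of a homology model, with more weight given to restraints in regions of low sequence identity.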
  • the method may comprise analyzing (dynamic) structure changes of the protein, especially by exposing the protein to an agent, such as a denaturing agent, affecting the protein structure during the barcode exposure stage.
  • the analysis method may especially comprise a non-medical method. In embodiments, the analysis method may especially comprise a non-diagnostic method.
  • the invention may further provide a system for the characterization of a (tagged) protein, especially using a FRET donor-acceptor pair.
  • the system may comprise one or more of an analytical surface, a barcode supply, a radiation source, a single-molecule fluorescence microscope, and a control system.
  • the analytical surface may be configured to host the (tagged) protein.
  • the barcode supply may be configured to provide (nucleotide) barcodes to the analytical surface, especially to the hosted protein.
  • the radiation source may be configured to provide donor excitation radiation to the analytical surface, especially to the hosted protein.
  • the single-molecule fluorescence microscope may be configured to measure emission in a donor emission radiation range and in an acceptor emission radiation range at the analytical surface.
  • the single-molecule fluorescence microscope may further be configured to provide an emission signal to the control system.
  • the control system may be configured to generate a FRET efficiency pattern based on the emission signal.
  • the control system may be configured to estimate a distance, especially between a first amino acid and a second amino acid in the (tagged) protein, based on the FRET efficiency pattern, especially based on the emission signal, especially by determining FRET efficiency.
  • the invention may provide a system for characterization of a (tagged) protein using a FRET donor-acceptor pair.
  • the system may especially be configured to execute the method according to the invention.
  • the system may comprise an analytical surface.
  • the analytical surface may be configured to host the (tagged) protein.
  • the analytical surface may especially comprise a glass surface configured for single-molecule imaging, especially a quartz surface for single-molecule imaging.
  • the system may be configured to immobilize the protein at the analytical surface, especially via biotin-Streptavidin binding or with a covalent chemical approach (e.g. amine-NHS, 2PCA or thiol chemistry).
  • Biotin or biotinylated DNA linker may be attached to the protein molecule for surface immobilization.
  • the immobilization may especially comprise attaching a terminal end of the protein to the analytical surface, especially the N-terminal end, or especially the C-terminal end.
  • NHS chemistry, 2PCA chemistry and/or an alkyne-ketene reaction may be used for N-terminal end immobilization.
  • a decarboxylative alkylation reaction for the addition of a DNA linker to the C-terminal end of the protein may be used for C-terminal end immobilization.
  • the analytical surface may comprise a surface coating such as PEGylation before Streptavidin is introduced.
  • the analytical surface may be subjected to a surface passivation method before Streptavidin is introduced.
  • the system may comprise a barcode supply.
  • the barcode supply may especially be configured to provide (nucleotide) barcodes to the analytical surface, especially to the hosted protein.
  • the barcode supply may be configured to sequentially supply a plurality of barcodes to the analytical surface, especially to the hosted protein.
  • the system may comprise a radiation source.
  • the radiation source may be configured to provide donor excitation radiation to the analytical surface, especially to the hosted protein.
  • the hosted protein may comprise or be associated with a donor chromophore and/or an acceptor chromophore, wherein the donor chromophore may be excited by the donor excitation radiation provided by the radiation source.
  • the donor chromophore may then emit donor emission radiation and/or may provide energy to the acceptor chromophore via FRET energy transfer if the donor chromophore and the acceptor chromophore are within a predetermined distance, which may cause the acceptor chromophore to subsequently emit acceptor emission radiation.
  • the system may comprise a single-molecule fluorescence microscope.
  • the single-molecule fluorescence microscope may be configured to measure emission in a donor emission radiation range and in an acceptor emission radiation range, especially at the analytical surface, i.e., the single-molecule fluorescence microscope may measure emission emitted from the (protein at the) analytical surface.
  • the single-molecule fluorescence microscope may further be configured to provide an emission signal to the control system, especially wherein the emission signal comprises donor emission radiation and/or acceptor emission radiation.
  • the system may comprise a control system.
  • the control system may be configured to control one or more of the analytical surface, the barcode supply, the radiation source, and the single-molecule fluorescence microscope.
  • the control system may further be configured to estimate a distance based on the emission signal, especially by determining FRET efficiency.
  • the control system may comprise a processor.
  • the control system may further be configured to retrieve data and/or (program) instructions from an (online) resource, such as an (online) database.
  • the control system may be configured to estimate a protein fingerprint, especially based on the FRET efficiency pattern or on the estimated distance, more especially based on the FRET efficiency.
  • the control system may be configured to identify the protein by comparing the protein fingerprint to protein-related information in reference data.
  • the system may comprise a denaturation unit configured to denature the protein.
  • the control system may be configured to control the denaturation unit.
  • the system may comprise a tagging unit configured to provide a first tag and/or a second tag to the (untagged) protein.
  • the tagging unit may especially be configured to provide a first tag and/or a second tag to a denatured protein.
  • the control system may be configured to control the tagging unit.
  • the system may comprise a refolding unit configured to refold a denatured protein.
  • the denaturation unit and the refolding unit may essentially be the same unit.
  • the control system may be configured to control the refolding unit.
  • the control system may be configured to predict a protein structure of the protein (using a computational process), especially based on the estimated distance, or especially based on the FRET efficiency.
  • the computational process may especially comprise a computational algorithm.
  • the computational process may comprise a refinement of a model protein structure based on the estimated distance.
  • the computational process may comprise a de novo protein structure prediction.
  • the estimated distance may especially comprise a plurality of estimated distances.
  • the control system may be configured to execute in a controlling mode the analysis method according to the invention.
  • the control system may especially receive program instructions from a data carrier such that the control system executes the method according to the invention.
  • the control system may be configured to select tags and/or barcodes to acquire (specific) information, especially pertaining to the protein.
  • the control system may during the execution of the protein identification stage and/or the protein structure prediction stage determine which information regarding the protein may benefit the protein identification and/or the protein structure prediction.
  • the control system may subsequently acquire this information by providing (additional) tags to the protein, especially to specific amino acids.
  • the invention may provide a data carrier having stored thereon program instructions, which when executed by the system according to the invention, especially by the control system, causes the system to execute the method according to the invention.
  • an embodiment describing the method with respect to the barcodes and/or tags may, for example, also apply to the system, particularly to the barcode supply of the system.
  • an embodiment of the system describing the radiation such as the donor excitation radiation or the acceptor emission radiation, may, for example, further apply to the method.
  • stage and similar terms used herein may refer to a (time) period (also “phase”) of the analysis method.
  • the different stages may (partially) overlap (in time).
  • the barcode exposure stage may, in general, be initiated prior to the distance estimation stage, but may partially overlap in time therewith.
  • the tagging stage may typically be completed prior to the barcode exposure stage.
  • the stages may be beneficially arranged in time.
  • the protein identification stage may occur simultaneously with the barcode exposure stage such that if the protein has been successfully identified, the barcode exposure stage may be terminated.
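  • By way of illustration only, running the protein identification stage alongside the barcode exposure stage, and terminating the exposure once the protein has been identified, may be sketched as follows. The function `identify_with_early_stop`, the `measure` callback, the reference table, and the tolerance value are hypothetical and serve only to show the control flow.

```python
def identify_with_early_stop(barcodes, measure, reference, tolerance=0.05):
    """Expose barcodes one at a time and stop as soon as at most one candidate remains.

    barcodes:  ordered list of barcode names to supply sequentially
    measure:   callable returning the observed FRET efficiency for a barcode (hypothetical)
    reference: dict protein name -> dict barcode name -> expected FRET efficiency
    tolerance: maximum allowed difference between observed and expected efficiency
    """
    candidates = set(reference)
    used = []
    for barcode in barcodes:
        observed = measure(barcode)
        used.append(barcode)
        # Keep only candidates whose expected efficiency matches the observation.
        candidates = {
            name for name in candidates
            if abs(reference[name][barcode] - observed) <= tolerance
        }
        if len(candidates) <= 1:
            break  # identified (or no match left): terminate the barcode exposure
    return candidates, used

# Made-up reference data for two candidate proteins and three barcodes.
reference = {
    "protein_X": {"a": 0.9, "b": 0.5, "c": 0.3},
    "protein_Y": {"a": 0.9, "b": 0.7, "c": 0.3},
}
# Stand-in for the microscope: the sample on the surface is protein_X.
measure = lambda barcode: reference["protein_X"][barcode]

candidates, used = identify_with_early_stop(["a", "b", "c"], measure, reference)
print(candidates, used)
```

  • In this toy example barcode "a" does not discriminate between the candidates, barcode "b" does, and barcode "c" is never supplied.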
  • the method and/or system may be applied in or may be part of analysis methods/systems of biological samples, such as protein samples, particularly in relation to protein sequencing, protein structure elucidation, and/or protein interactomics.
  • Fig. 1A-C schematically depict embodiments of the analysis method according to the invention
  • Fig. 2 schematically depicts an embodiment of the system according to the invention
  • Fig. 3A-B depict experimental measurements obtained using embodiments of the analysis method according to the invention
  • Fig. 4A-B depict experimental measurements obtained using embodiments of the analysis method according to the invention
  • Fig. 5A-C depict experimental measurements obtained using embodiments of the analysis method according to the invention.
  • the schematic drawings are not necessarily to scale.
  • Fig. 1A-C schematically depict embodiments of the analysis method 100 for characterization of a tagged protein 10 using FRET donor-acceptor pair chromophores 20.
  • the FRET donor-acceptor pair chromophores 20 comprise a first chromophore 21 and a second chromophore 22.
  • the FRET donor-acceptor pair chromophores 20 have a donor excitation radiation range, a donor emission radiation range and an acceptor emission radiation range, wherein one of the FRET donor-acceptor pair chromophores 23 (also: “donor chromophore 23”) is excitable by donor excitation radiation 51 in the donor excitation radiation range, wherein the other of the FRET donor-acceptor pair chromophores 24 (also: “acceptor chromophore 24”) is configured to provide acceptor emission radiation 54 in the acceptor emission radiation range upon excitation with donor excitation radiation 51 in the donor excitation radiation range of the one of the FRET donor-acceptor pair chromophores 23 when the first chromophore 21 and the second chromophore 22 are configured within a predetermined distance.
  • the tagged protein 10 may comprise a first amino acid 11 tagged with a first tag 31 and a second amino acid 12 tagged with a second tag 32.
  • the first tag 31 may comprise the first chromophore 21 or may be associated to the first chromophore 21.
  • the second tag 32 comprises an oligonucleotide.
  • the analysis method may comprise a barcode exposure stage 110 and a distance estimation stage.
  • the barcode exposure stage 110 may comprise exposing the tagged protein 10 to a (second) barcode 42, wherein the barcode 42 is configured to hybridize with the second tag 32, and wherein the barcode 42 comprises the second chromophore 22.
  • the barcode exposure stage 110 may further comprise providing radiation having a wavelength selected from the donor excitation radiation range to the tagged protein 10, especially providing donor excitation radiation 51 to the tagged protein 10.
  • the barcode exposure stage may further comprise measuring emission in the donor emission radiation range and the acceptor emission radiation range to provide an emission signal.
  • the barcode exposure stage may comprise measuring donor emission radiation 53 and acceptor emission radiation 54.
  • the distance estimation stage may comprise estimating a distance between the first amino acid 11 and the second amino acid 12 based on the FRET efficiency pattern, especially based on the emission signal, more especially based on a plurality of emission signals.
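  • By way of illustration only, the conversion from a FRET efficiency to an estimated distance may be sketched with the standard Förster relation E = 1 / (1 + (r/R0)^6). The description does not prescribe this formula; the function names and the default Förster radius of 5.4 nm (an assumed, illustrative value for a Cy3-Cy5 pair) are assumptions of this sketch. The result is a chromophore-to-chromophore distance, so the tags and barcodes between chromophore and amino acid would still need to be accounted for.

```python
def fret_from_distance(distance_nm, forster_radius_nm=5.4):
    """Förster relation: efficiency expected at a given donor-acceptor distance."""
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)

def distance_from_fret(efficiency, forster_radius_nm=5.4):
    """Invert the Förster relation to estimate a donor-acceptor distance in nm.

    forster_radius_nm is an assumed value; it depends on the dye pair and conditions.
    """
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)

# At E = 0.5 the estimated distance equals the Förster radius by definition.
print(distance_from_fret(0.5))
# Higher efficiency corresponds to a shorter distance.
print(distance_from_fret(0.9) < distance_from_fret(0.5))
```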
  • the first chromophore 21 may be the donor chromophore 23 or the acceptor chromophore 24.
  • the second chromophore 22 may be the donor chromophore 23 or the acceptor chromophore 24.
  • (i) the first chromophore 21 is the donor chromophore 23 and the second chromophore 22 is the acceptor chromophore 24, or (ii) the first chromophore 21 is the acceptor chromophore 24 and the second chromophore 22 is the donor chromophore 23.
  • Fig. 1A schematically depicts an embodiment of the method wherein the first tag 31 comprises the first chromophore 21.
  • the first chromophore 21 comprises the acceptor chromophore 24, whereas the second chromophore 22 comprises the donor chromophore 23.
  • the tagged protein 10 comprises a plurality of second amino acids 12 tagged with (two or more) different second tags 32, 32a, 32b, 32c, wherein the different second tags 32, 32a, 32b, 32c comprise different nucleotide sequences.
  • the barcode exposure stage 110 may comprise sequentially exposing the tagged protein 10 to a plurality of different barcodes 42, 42a1, 42a2, 42b, 42c.
  • the barcodes may be provided sequentially (and separately).
  • the barcode exposure stage 110 comprises exposing the tagged protein 10 to a corresponding barcode 42, 42a1, 42a2, 42b, 42c configured to hybridize with the respective second tag 32, 32a, 32b, 32c.
  • the barcode exposure stage 110 comprises exposing the tagged protein 10 to two corresponding barcodes 42, 42a1, 42a2; these two barcodes place the (respective) second chromophore at different distances from the (respective) second amino acid 12, so that the same amino acid can be probed with chromophores arranged at multiple distances. This may be beneficial given that FRET donor-acceptor pairs may provide the highest sensitivity with regard to FRET efficiency within a certain predetermined distance range.
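  • By way of illustration only, the distance range of highest sensitivity may be made explicit under the assumption that the standard Förster relation applies with a fixed Förster radius R_0 (the description itself does not state this relation). Writing x = r/R_0, the slope of the efficiency with respect to distance is largest in magnitude slightly below R_0:

```latex
\begin{align}
E(r) &= \frac{1}{1 + (r/R_0)^6}, \qquad x = \frac{r}{R_0} \\
\frac{dE}{dr} &= -\frac{6}{R_0}\,\frac{x^5}{(1 + x^6)^2} \\
\frac{d}{dx}\left[\frac{x^5}{(1 + x^6)^2}\right]
  &= \frac{x^4\,(5 - 7x^6)}{(1 + x^6)^3} = 0
  \;\Longrightarrow\; x = \left(\tfrac{5}{7}\right)^{1/6} \approx 0.945 \\
E\big|_{x^6 = 5/7} &= \frac{1}{1 + 5/7} = \frac{7}{12} \approx 0.58
\end{align}
```

  • Under these assumptions, a barcode that places the chromophore so that the efficiency falls near 0.5-0.6 would resolve small distance changes best, whereas efficiencies close to 0 or 1 change little with distance.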
  • Fig. 1B schematically depicts an embodiment of the analysis method 100 wherein the first tag 31 is associated to the first chromophore 21.
  • the first chromophore 21 comprises the donor chromophore 23
  • the second chromophore 22 comprises the acceptor chromophore 24.
  • the first tag 31 comprises an immobilization tag 33 configured to immobilize the tagged protein 10 to an analytical surface 210.
  • the tagged protein 10 may be immobilized on an analytical surface 210.
  • the analysis method 100 may comprise tagging the protein with an immobilization tag 33 to immobilize the tagged protein 10 to an analytical surface 210.
  • the first tag 31 (also) comprises an oligonucleotide.
  • the barcode exposure stage 110 comprises exposing the tagged protein 10 to a first barcode 41, wherein the first barcode 41 is configured to hybridize with the first tag 31, and wherein the first barcode 41 comprises the first chromophore 21.
  • the first tag 31 and the first barcode 41 are depicted to have more complementary nucleotides than the second tag 32 and the second barcode 42.
  • Such embodiment may enable the first barcode 41 to remain hybridized with the first tag 31 while a plurality of second barcodes 42 are sequentially hybridized with corresponding second tags 32.
  • the (second) barcode 42 may comprise m nucleotides complementary with the second tag 32, wherein m is selected from the range of 5-10.
  • the first barcode 41 may comprise m nucleotides complementary with the first tag 31, wherein m is selected from the range of 10-20.
  • Fig. 1C schematically depicts an embodiment of the analysis method 100, wherein the tagged protein 10 comprises a plurality of first amino acids 11 tagged with different first tags 31.
  • the different first tags 31 comprise different nucleotide sequences
  • the barcode exposure stage 110 may comprise sequentially exposing the tagged protein 10 to a plurality of different first barcodes 41, especially wherein for each first tag 31 the barcode exposure stage 110 comprises exposing the tagged protein 10 to a corresponding first barcode 41 configured to hybridize with the respective first tag 31.
  • the tagged protein 10 comprises a denatured protein 15.
  • a denatured protein 15 may be particularly suitable for determining the relative position of amino acids in the amino acid chain (of the tagged protein 10).
  • a single first barcode 41 is hybridized with one of the plurality of first tags 31.
  • a single second barcode 42 is hybridized with one of the plurality of second tags 32.
  • one of the barcodes may be replaced with another barcode, especially wherein the other barcode remains.
  • the first chromophore 21 comprises the donor chromophore 23 and is excited by donor excitation radiation 51 in the donor excitation radiation range. Hence, the first chromophore 21 may provide donor emission radiation 53. Further, the first chromophore 21 may transfer energy to the second chromophore 22 via FRET energy transfer 60. The second chromophore 22 comprises the acceptor chromophore 24 and may, upon receiving energy from the first chromophore, emit acceptor emission radiation 54.
  • the FRET donor-acceptor pair 20 is depicted to provide both donor emission radiation 53 and acceptor emission radiation 54. However, depending on the FRET efficiency, which may depend on the distance between the chromophores, the FRET donor-acceptor pair 20 may also provide (essentially) only donor emission radiation 53 or (essentially) only acceptor emission radiation 54.
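  • By way of illustration only, the relation between the two measured emission channels and the FRET efficiency may be sketched with the commonly used ratiometric estimate E = I_A / (I_D + I_A). This formula and the function name are assumptions of this sketch; the description does not specify how the efficiency is computed, and corrections such as donor leakage, direct acceptor excitation, and detection efficiency differences are deliberately left out.

```python
def apparent_fret_efficiency(donor_intensity, acceptor_intensity):
    """Apparent (uncorrected) FRET efficiency from donor and acceptor emission intensities."""
    total = donor_intensity + acceptor_intensity
    if total <= 0:
        raise ValueError("total emission must be positive")
    return acceptor_intensity / total

# The three regimes described above, with made-up photon counts:
print(apparent_fret_efficiency(1000.0, 0.0))   # (essentially) only donor emission
print(apparent_fret_efficiency(500.0, 500.0))  # both donor and acceptor emission
print(apparent_fret_efficiency(0.0, 1000.0))   # (essentially) only acceptor emission
```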
  • Fig. 2 schematically depicts an embodiment of the system 200 according to the invention.
  • the system 200 comprises an analytical surface 210, a barcode supply 220, a radiation source 230, a single-molecule fluorescence microscope 240, and a control system 300
  • the analytical surface 210 is configured to host the tagged protein 10 (not depicted)
  • the barcode supply 220 is configured to provide barcodes to the analytical surface 210
  • the radiation source 230 is configured to provide donor excitation radiation 51 to the analytical surface 210
  • the single-molecule fluorescence microscope 240 is configured to measure emission in a donor emission radiation range and in an acceptor emission radiation range at the analytical surface 210 and to provide an emission signal to the control system 300, wherein the control system 300 is configured to generate a FRET efficiency pattern.
  • the system further comprises a barcode outlet 225 configured for the removal of barcodes from the analytical surface 210.
  • the analytical surface 210 comprises a quartz slide (on top) of a glass coverslip (on the bottom), which are separated by a layer of water. Further, a prism, especially a Pellin-Broca prism, is arranged on top of the quartz slide, with a layer of immersion oil in between. Yet further, an additional layer of water is arranged between the glass coverslip and an objective lens of the single-molecule fluorescence microscope 240. In addition, polyethylene glycol (PEG) is attached to the quartz slide on one side and to streptavidin on the other. Hence, in embodiments, the analytical surface 210 may comprise streptavidin.
  • the tagged protein 10 may be (non-covalently) bound to the streptavidin using biotin. It will be clear to the person skilled in the art, that many variations of the analytical surface may be possible without deviating from the scope of the invention as described herein.
  • the radiation source 230 comprises a plurality of radiation sources 230, especially configured to provide different wavelengths of radiation.
  • the radiation source 230 may be suitable to provide radiation in the donor excitation radiation range corresponding to different FRET donor-acceptor chromophore pairs.
  • the single-molecule fluorescence microscope 240 comprises or is functionally coupled to a plurality of optical elements configured to separate the radiation emitted from the analytical surface 210 (by the donor chromophore 23 and/or the acceptor chromophore 24) into the donor emission radiation 53 and acceptor emission radiation 54.
  • the single-molecule fluorescence microscope 240 may comprise an EMCCD camera 241 to measure the donor emission radiation 53 and the acceptor emission radiation 54. It will be clear to the person skilled in the art, that many variations of the single-molecule fluorescence microscope 240 and/or the optical elements may be possible without deviating from the scope of the invention as described herein.
  • Fig. 2 may depict a system 200 configured for TIR excitation and FRET pair emission detection using prism-type TIRF.
  • Immobilized molecules may be excited by TIR using a green or red laser. Tubing is connected to the slide to allow for exchange of DNA barcodes (either manually or by using a pumping system) and buffers for multiple rounds of imaging and probing of different barcodes on protein substrates.
  • Fluorescence may be collected by an objective lens and the slit may create images of half the size of the EM-CCD camera.
  • the fluorescence signal may be split into donor and acceptor signal by a dichroic mirror and may be imaged side by side on the EM-CCD.
  • the control system 300 may be configured to estimate a protein fingerprint based on the FRET efficiency pattern. In further embodiments, the control system 300 may be configured to identify the tagged protein 10 by comparing the protein fingerprint to protein-related information in reference data.
  • the system may further comprise a denaturation unit 250 configured to denature the protein to provide a denatured protein 15.
  • the control system 300 may be configured to execute in a controlling mode the analysis method 100 according to the invention.
  • Fig. 2 schematically further depicts a data carrier 400 having stored thereon program instructions, which when executed by the system 200 according to the invention, especially by the control system 300, causes the system 200 to execute the analysis method 100 according to the invention.
  • the control system 300 may comprise the data carrier 400.
  • Most of the not uniquely identifiable proteins may either be classifiable to a smaller subset of potential candidate proteins or may be unclassifiable due to the lack of cysteines and lysines.
  • the discernibility may increase with the number of tagged residues, implying that an even wider range of proteins may be discernible if more residue types such as methionine and tyrosine are tagged.
  • the starting position may be generated using a random walk in an appropriately sized lattice
  • (DNA-)tagged residues may be accounted for by assuming that a long negatively charged tag requires a straight unobstructed path to the model surface at all times, and (3) temperatures were automatically tuned in the parallel tempering procedure to optimize mixing between chains.
  • An E_FRET fingerprint may be defined herein as an array of E_FRET values calculated from all donor-acceptor pairs in the model, sorted from small to large.
  • E_FRET values lower than 0.10 were considered too low to discern from background noise, and are thus omitted.
  • D_k,k' was compared to the E_FRET resolution, and F was considered unique if D_k,k' is larger than the resolution in all comparisons against other fingerprints with the same number of observed FRET values.
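  • By way of illustration only, the construction of such a fingerprint and the uniqueness test may be sketched as follows. The definition of the distance between two fingerprints is not recoverable from the text above; the largest element-wise difference between the sorted arrays is used here purely as an assumed stand-in, and all function names and example values are hypothetical.

```python
def efret_fingerprint(efficiencies, min_efficiency=0.10):
    """Sorted array of E_FRET values, omitting values below the noise threshold."""
    return sorted(e for e in efficiencies if e >= min_efficiency)

def fingerprint_distance(fp_a, fp_b):
    """Assumed metric: largest element-wise difference between two sorted fingerprints."""
    return max((abs(a - b) for a, b in zip(fp_a, fp_b)), default=0.0)

def is_unique(name, fingerprints, resolution):
    """True if the fingerprint differs by more than the resolution from every
    other fingerprint with the same number of observed FRET values."""
    target = fingerprints[name]
    for other, fingerprint in fingerprints.items():
        if other == name or len(fingerprint) != len(target):
            continue
        if fingerprint_distance(target, fingerprint) <= resolution:
            return False
    return True

# Made-up E_FRET values for four hypothetical proteins.
raw = {
    "P1": [0.85, 0.05, 0.40],
    "P2": [0.42, 0.86],
    "P3": [0.60, 0.20],
    "P4": [0.90],
}
fingerprints = {name: efret_fingerprint(values) for name, values in raw.items()}

for name in fingerprints:
    print(name, fingerprints[name], is_unique(name, fingerprints, resolution=0.05))
```

  • In this toy example P1 and P2 cannot be told apart at a resolution of 0.05, whereas P3 and P4 are unique.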
  • Fig. 3A-B schematically depict a proof-of-concept experiment of the method of the invention.
  • an acceptor Cy5 labelled single stranded DNA molecule was immobilized on a surface.
  • This ssDNA molecule contains a barcode target sequence that is separated from the acceptor by either 5nt (target #1, Fig. 3A) or 30nt (target #2, Fig. 3B).
  • a complementary donor (Cy3) labelled 8nt imaging barcode was used to probe either of the aforementioned strands (in separate experiments), including the collection of FRET efficiency E measurements and dwell-time T_d (in seconds) measurements. From these data, the dwell-time T_d versus FRET efficiency E of each individual FRET event was plotted in the left panels of Fig. 3A and Fig. 3B. Next, the FRET efficiencies for both target strands were plotted in the histograms depicted in the right panels of Fig. 3A and Fig. 3B, indicating counts N versus FRET efficiency E. The observed FRET efficiencies were subjected to Gaussian fitting, which indicated that the observed FRET efficiencies E were 0.99 and 0.69 for target #1 and target #2, respectively.
  • the FRET efficiency pattern may comprise the FRET efficiency E measurements.
  • the distribution of the center of the peaks from either barcode measurement may be a protein fingerprint.
  • the FRET efficiency pattern may comprise a protein fingerprint.
  • the FRET efficiency pattern may comprise raw data, such as depicted in either of the left panels of Fig. 3A-B, which may especially comprise a protein fingerprint.
  • the FRET efficiency pattern may comprise binned data, such as depicted in either of the right panels of Fig. 3A-B, which may especially comprise a protein fingerprint.
  • the protein fingerprints may be the signatures of the DNA strands.
  • the FRET efficiency pattern may comprise processed data, which may especially comprise a protein fingerprint.
  • Fig. 4A-B depict a proof-of-concept experiment for probing different targets on one DNA tag with different barcodes.
  • the location of the target sequence for each barcode is probed relative to a 5’ end labelled with acceptor fluorophore Cy5 with a Cy3 labelled donor barcode.
  • the complementary barcode for target A (see below) is flushed, then the flow cell is washed with washing buffer and the barcode for target B is flushed and the emission is measured.
  • the left histogram corresponds to target B
  • the right (dashed) histogram corresponds to target A.
  • Mean FRET efficiencies are derived from Gaussian fits of individual histograms, and errors are the standard error of the mean.
  • the barcode for target A is: 3’end- Cy3-TATGTAGA
  • the barcode for target B is: 3’end- Cy3-AGAAGTAAT
  • Fig. 4A depicts the data corresponding to DNA construct: Cy5-TTTTTATACATCTATTTTTTTTTTTTTTTTTTTTTTTTCTTCATTACTTTTTTTTTTTTTTT-Biotin, corresponding to Cy5 bound to SEQ ID NO:1 bound to Biotin, wherein the bold part indicates target (site) A, and wherein the underlined part indicates target (site) B.
  • the mean FRET efficiency is 0.90 ± 2.1×10⁻³ and 0.55 ± 1.6×10⁻³, for target A and target B respectively.
  • Fig. 4B depicts the data corresponding to DNA construct: Cy5-TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTATACATCTATTCTTCATTACTTTTTTTTTTTTTTTTT-Biotin, corresponding to Cy5 bound to SEQ ID NO:2 bound to Biotin, wherein the bold part indicates target (site) A, and wherein the underlined part indicates target (site) B.
  • the mean FRET efficiency is 0.67 ± 2.2×10⁻³ and 0.59 ± 2.2×10⁻³, for target A and target B respectively.
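  • By way of illustration only, the location of each barcode's target site relative to the Cy5-labelled 5' end may be derived from the sequences given above. The sketch assumes that the barcodes are written 3' to 5' (as indicated by the "3'end" label), so that their base-by-base complement reads 5' to 3' along the construct; the function names are hypothetical, and the construct string is the Fig. 4A sequence as printed, with the line-break space removed.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def target_site(barcode_3_to_5):
    """5'->3' sequence on the tag that pairs with a barcode written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in barcode_3_to_5)

def spacer_to_acceptor(construct_5_to_3, barcode_3_to_5):
    """Number of nucleotides between the 5' (Cy5) end and the start of the target site."""
    index = construct_5_to_3.find(target_site(barcode_3_to_5))
    if index < 0:
        raise ValueError("barcode has no complementary site on this construct")
    return index

# Sequences copied from the description of Fig. 4A.
CONSTRUCT_4A = "TTTTTATACATCTATTTTTTTTTTTTTTTTTTTTTTTTCTTCATTACTTTTTTTTTTTTTTT"
BARCODE_A = "TATGTAGA"   # written 3'-Cy3-...-5'
BARCODE_B = "AGAAGTAAT"  # written 3'-Cy3-...-5'

for label, barcode in (("A", BARCODE_A), ("B", BARCODE_B)):
    print(label, target_site(barcode), spacer_to_acceptor(CONSTRUCT_4A, barcode))
```

  • On this construct target A lies closer to the Cy5 end than target B, which is consistent with the higher mean FRET efficiency reported for target A in Fig. 4A.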
  • FIG. 5A-C depict results of a proof of concept experiment wherein three proteins corresponding to SEQ ID NO:3, SEQ ID NO:4 and SEQ ID NO:5, are analyzed using the analysis method of the invention.
  • the three proteins each correspond to alpha synuclein with a C-terminal FGE binding sequence, a small linker, a thrombin cleaving sequence, and a histidine tag.
  • the alpha synuclein sequence is based on the sequence of accession number P37840 at Uniprot.org.
  • In SEQ ID NO:3, the serine at location 87 in the alpha synuclein sequence is replaced with a cysteine, i.e., SEQ ID NO:3 corresponds to a S87C version of alpha synuclein.
  • In SEQ ID NO:4, the serine at location 129 in the alpha synuclein sequence is replaced with a cysteine, i.e., SEQ ID NO:4 corresponds to a S129C version of alpha synuclein.
  • In SEQ ID NO:5, the serine residues at locations 87 and 129 in the alpha synuclein sequence are replaced with cysteine residues, i.e., SEQ ID NO:5 corresponds to a S87C+S129C version of alpha synuclein.
  • Proteins corresponding to SEQ ID NO: 3-5 were produced in Escherichia coli BL21 (DE3).
  • E. coli BL21 (DE3) also (naturally) produces an FGE enzyme that converts the cysteine in the FGE binding site at location 142 to a formyl glycine, which can be coupled to a probe via hydrazide chemistry.
  • the formyl glycine was linked to a first tag 31 comprising a biotin probe via hydrazide chemistry.
  • the proteins were then analyzed using the analysis method of the invention.
  • a first tag 31 was associated to the formyl glycine site (at location 142) via hydrazide chemistry.
  • the first tag 31 further comprised a biotin probe (at the other end of the tag than the end associated with the formyl glycine site).
  • the biotin probe was used to immobilize the first tag 31 on a streptavidin surface.
  • the first tag 31 comprised 25 nucleotides.
  • a second tag 32 was associated to one or more cysteine residues in the protein.
  • the second tag 32 comprised 10 nucleotides.
  • Each of the tagged proteins 10 was then exposed to a first barcode 41 configured to hybridize with the first tag 31 and to a second barcode 42 configured to hybridize with the second tag 32.
  • the first barcode 41 comprised a first chromophore 21, and comprised 10 nucleotides complementary to the first tag 31.
  • the second barcode 42 comprised a second chromophore 22, and comprised 8 nucleotides complementary to the second tag 32.
  • the first chromophore 21 was Cy5, whereas the second chromophore 22 was Cy3. Radiation in the donor excitation range was provided to the tagged proteins 10, and emission in the donor emission radiation range and the acceptor emission radiation range was measured.
  • Fig. 5A-C depict counts N versus FRET efficiency E. In particular, for each observed binding event with a duration of at least 0.3 s the average FRET efficiency was determined, and Fig. 5A-C summarize all determined average FRET efficiencies.
  • Fig. 5A depicts the observations with regards to SEQ ID NO:3, Fig. 5B the observations with regards to SEQ ID NO: 4, and Fig. 5C the observations with regards to SEQ ID NO:5.
  • the second barcode 42 comprising the second chromophore 22 may associate to two different second tags 32 associated to different cysteine residues.
  • the analysis method of the invention may facilitate characterizing a tagged protein comprising a plurality of second tags 32, wherein the second barcode 42 can associate to each of the plurality of second tags 32.
  • the analysis method of the invention may facilitate characterizing a tagged protein comprising a plurality of first tags 31, wherein the first barcode 41 can associate to each of the plurality of first tags 31. Thereby, different sites may be queried in parallel, which may facilitate a quicker characterization.
  • the terms “substantially” or “essentially” herein, and similar terms, will be understood by the person skilled in the art.
  • the terms “substantially” or “essentially” may also include embodiments with “entirely”, “completely”, “all”, etc. Hence, in embodiments the adjective substantially or essentially may also be removed.
  • the term “substantially” or the term “essentially” may also relate to 90% or higher, such as 95% or higher, especially 99% or higher, even more especially 99.5% or higher, including 100%.
  • the terms “about” and “approximately” may also relate to 90% or higher, such as 95% or higher, especially 99% or higher, even more especially 99.5% or higher, including 100%.
  • a phrase “item 1 and/or item 2” and similar phrases may relate to one or more of item 1 and item 2.
  • the term “comprising” may in an embodiment refer to “consisting of” but may in another embodiment also refer to “containing at least the defined species and optionally one or more other species”.
  • the invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer.
  • In a device claim, or an apparatus claim, or a system claim enumerating several means, several of these means may be embodied by one and the same item of hardware.
  • the mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
  • the invention also provides a control system that may control the device, apparatus, or system, or that may execute the herein described method or process. Yet further, the invention also provides a computer program product which, when running on a computer which is functionally coupled to or comprised by the device, apparatus, or system, controls one or more controllable elements of such device, apparatus, or system.
  • the invention further applies to a device, apparatus, or system comprising one or more of the characterizing features described in the description and/or shown in the attached drawings.
  • the invention further pertains to a method or process comprising one or more of the characterizing features described in the description and/or shown in the attached drawings.
  • When a method or an embodiment of the method is described as being executed in a device, apparatus, or system, it will be understood that the device, apparatus, or system is suitable for or configured for (executing) the method or the embodiment of the method respectively.
  • the various aspects discussed in this patent can be combined in order to provide additional advantages. Further, the person skilled in the art will understand that embodiments can be combined, and that also more than two embodiments can be combined. Furthermore, some of the features can form the basis for one or more divisional applications.

Abstract

The invention provides an analysis method (100) for characterization of a tagged protein (10) using FRET donor-acceptor pair chromophores (20), wherein the FRET donor-acceptor pair chromophores (20) comprise a first chromophore (21) and a second chromophore (22), wherein the FRET donor-acceptor pair chromophores (20) have a donor excitation radiation range, a donor emission radiation range and an acceptor emission radiation range, wherein one of the FRET donor-acceptor pair chromophores (23) is excitable by donor excitation radiation (51) in the donor excitation radiation range, wherein the other of the FRET donor-acceptor pair chromophores (24) is configured to provide acceptor emission radiation (54) in the acceptor emission radiation range upon excitation with donor excitation radiation (51) in the donor excitation radiation range of the one of the FRET donor-acceptor pair chromophores (23) when the first chromophore (21) and the second chromophore (22) are configured within a predetermined distance, wherein the tagged protein (10) comprises a first amino acid (11) tagged with a first tag (31) and a second amino acid (12) tagged with a second tag (32), wherein the first tag (31) comprises the first chromophore (21) or is associated to the first chromophore (21), wherein the second tag (32) comprises an oligonucleotide, and wherein the analysis method (100) comprises: a barcode exposure stage (110) comprising: (i) exposing the tagged protein (10) to a barcode (42), wherein the barcode (42) is configured to hybridize with the second tag (32), and wherein the barcode (42) comprises the second chromophore (22), (ii) providing radiation having a wavelength selected from the donor excitation radiation range to the tagged protein (10), and (iii) measuring emission in the donor emission radiation range and the acceptor emission radiation range to provide an emission signal.

Description

Single-molecule FRET for protein characterization
FIELD OF THE INVENTION
The invention relates to a method for the characterization of a protein. The invention further relates to a system for characterization of a protein. The invention further relates to a data carrier having stored thereon program instructions for the characterization of a protein.
BACKGROUND OF THE INVENTION
Analysis methods to determine a distance using FRET are known in the art. For example, Roy et al., “A practical guide to single-molecule FRET”, Nature Methods, vol. 5 no. 6, June 2008, describes single-molecule fluorescence resonance energy transfer (smFRET) as one of the most general and adaptable single-molecule techniques. It provides a practical guide to using smFRET, focusing on the study of immobilized molecules that allow measurements of single-molecule reaction trajectories from 1 ms to several minutes. It further describes that the extent of non-radiative energy transfer between two fluorescent dye molecules, termed donor and acceptor, reports the intervening distance, which can be estimated from the ratio of acceptor intensity to total emission intensity.
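The intensity ratio described by Roy et al. can be illustrated with a minimal sketch. The function name and the example intensities are hypothetical, and the sketch assumes background-subtracted intensities without any correction for spectral leakage or detection efficiency, which a practical measurement would typically require.

```python
def apparent_fret_efficiency(donor_intensity: float, acceptor_intensity: float) -> float:
    """Apparent FRET efficiency as acceptor intensity over total emission intensity.

    Assumption: both intensities are background-subtracted and uncorrected
    for leakage or detector response (illustrative only).
    """
    total = donor_intensity + acceptor_intensity
    if total <= 0:
        raise ValueError("total emission intensity must be positive")
    return acceptor_intensity / total


# Hypothetical example: 100 donor counts and 300 acceptor counts.
efficiency = apparent_fret_efficiency(100.0, 300.0)
```

A high ratio indicates that most of the donor excitation energy was transferred to the acceptor, i.e., that the two dyes were close together.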
WO0125794A2 describes that protein molecules of interest are isolated and modified into linearized protein molecules. Individual protein molecules are isolated for observation with hydrodynamic focusing apparatus, atomic force microscope, a separation plate, or the combination of isoelectric focusing gel, and electrophoresis gel. In the linearized protein molecule a first type of amino acid residue (K) is labelled with a first tag, and a second type of amino acid residue (C) is labelled with a second tag. Tags impart a detectable set of distinguishing characteristic ancillary properties to the linearized protein molecule that define the fingerprint thereof. Protein fingerprints are recovered as images by camera or fluorescence microscope, or as spectral analyses by optical detector. Compilations of fingerprints for known protein molecules comprise fingerprint libraries to which the fingerprint of a protein molecule of interest is compared.
US2016169903A1 describes a method for obtaining partial sequence information from a target protein, comprising (i) denaturing and elongating a protein, (ii) attaching docking strands to particular amino acids in the protein, (iii) capturing the protein on a substrate, (iv) repeatedly contacting the captured protein with fluorescently-labeled imager strands that transiently bind to the docking strand, and (v) imaging the substrate. Auer et al., “Fast, Background-Free DNA-PAINT Imaging Using FRET-Based Probes”, Nano Letters, 2017, describes that DNA-PAINT suffers from the emission of fluorescence from imager strands when they are not bound to their docking strands, as the blinking rate of probes is limited by an upper bound on imager strand concentrations. Auer et al. further describes that FRET-based imaging probes alleviate the concentration limit of imager strands and speed up image acquisition by several orders of magnitude.
Dingfelder et al., “Mapping an Equilibrium Folding Intermediate of the Cytolytic Pore Toxin ClyA with Single-Molecule FRET”, Journal of Physical Chemistry B, 2018, describes mapping of the structure of a ClyA monomer during denaturant-induced unfolding with single-molecule Förster resonance energy transfer (FRET) spectroscopy.
Yoo et al., “Three-Color Single-Molecule FRET and Fluorescence Lifetime Analysis of Fast Protein Folding”, Journal of Physical Chemistry B, 2018, describes the theory, experiment, and analysis of three-color Förster resonance energy transfer (FRET) spectroscopy for probing conformational dynamics of α3D.
SUMMARY OF THE INVENTION
Proteins may be considered the basis of life since they are the workhorses in all living cells. The many thousands of different proteins may sustain the vast majority of functions of a cell, from copying DNA and catalyzing basic metabolism to producing cellular motion and more. For the understanding of biological processes and their regulation, including diseases, it may be critical to monitor the protein composition of cells, especially by identifying the proteins, such as by sequencing (i.e. determination of the amino acid sequence of proteins), and to determine protein structures. However, protein identification and structure determination may remain enormous challenges, especially when only small biological samples are available.
Modern protein identification may typically involve mass spectrometry-based (MS) identification techniques; determining the precise mass of protein fragments following the fragmentation of a protein by electron bombardment. Current MS methods may generally suffer from several limitations. First, MS methods may only be capable of analyzing fragments of proteins, wherein information on those fragments is then used to reconstruct the full-length sequence, which may fail due to the combinatorial complexity, and which may imply a loss of information due to an uncertainty in which fragments correspond to the same protein, which may be particularly relevant with regards to proteoforms. Second, MS methods may often fail to recognize minor species among highly abundant species, since sequence prediction may be made through analysis of complex spectral peaks. As many important cellular proteins such as signaling proteins may exist in very low abundance, it may be difficult to obtain comprehensive proteomic information. Third, proteins may be post-translationally modified, providing additional combinatorial complexity for protein identification, especially given that it may typically not be fully known which post-translational modifications a given protein can undergo.
Early detection of diseases may also rely on the detection of low concentrations of protein biomarkers, which creates a strong demand for protein identification techniques capable of working at the single-molecule scale. Alternative methods such as immunoassays may, on the other hand, only be capable of analyzing relatively few proteins in a sample. Further, immunoassays may be time-intensive to adapt and may be limited by the availability of antibodies, which may also often present specificity issues. Yet further, immunoassays may be limited in their ability to distinguish between proteoforms of a protein.
Protein structure elucidation may typically rely on a combination of computational methods and experimental measurements. Traditionally, protein structure elucidation may have relied on protein crystallization and X-ray analysis, which may be limited to proteins that can be successfully crystallized and may be relatively time-intensive and expensive. For decades now, the field of computational structure prediction may have garnered continuous attention due to the attractive outlook of structural prediction based on reference proteins without elaborate and expensive laboratory protocols. Recently, the introduction of deep learning implementations in structure prediction algorithms may have helped to break a short impasse in progress of ab initio modeling and may simultaneously have reduced the need for human expert input. The current top-performing implementations may construct residue distance matrices as an intermediate step. Experimental data regarding distances may benefit such computational structure predictions methods. Similarly, such data may also benefit protein structure models obtained based on X-ray crystallography.
For example, Förster Resonance Energy Transfer (FRET) may have been used to provide distance measurements to refine computational structure predictions. However, with FRET structural biology measurements one may be limited in data acquisition due to the number of fluorophores that can be used (simultaneously), which may be at most 4 colors in one experiment based on the current state of the art.
Hence, it is an aspect of the invention to provide an alternative analysis method, which preferably further at least partly obviates one or more of above-described drawbacks. The present invention may have as object to overcome or ameliorate at least one of the disadvantages of the prior art, or to provide a useful alternative.
Hence, in a first aspect, the invention may provide an analysis method (also: “method”) for characterization of a (tagged) protein, especially a (tagged) protein complex, using FRET donor-acceptor pair chromophores. The FRET donor-acceptor pair chromophores may comprise a first chromophore and a second chromophore. The FRET donor-acceptor pair chromophores may have a (FRET) donor excitation radiation range, a (FRET) acceptor excitation radiation range, a (FRET) donor emission radiation range and a (FRET) acceptor emission radiation range. Especially, one of the FRET donor-acceptor pair chromophores (also: “donor chromophore”) may be excitable by donor excitation radiation in the donor excitation radiation range, especially wherein the other of the FRET donor-acceptor pair chromophores (also: “acceptor chromophore”) may be configured to provide acceptor emission radiation in the FRET acceptor emission radiation range upon excitation with donor excitation radiation in the donor excitation radiation range of the one of the FRET donor-acceptor pair chromophores when the first chromophore and the second chromophore are configured within a predetermined distance (range). In embodiments, the tagged protein may comprise a first amino acid tagged with a first tag and a second amino acid tagged with a second tag, especially wherein the first amino acid is different from the second amino acid. In further embodiments, the first tag may comprise the first chromophore or may be associated to the first chromophore. In further embodiments, the second tag may comprise an oligonucleotide (also: “nucleotide chain”), especially an oligonucleotide comprising n2 nucleotides, wherein n2 may be selected from the range of 3 to 20, especially 3 to 15, such as 5 to 12. In embodiments, the method may comprise a barcode exposure stage. In further embodiments, the method may comprise a pattern generation stage. In further embodiments, the method may comprise a distance estimation stage.
The barcode exposure stage may comprise exposing the tagged protein to a (second) barcode, wherein the (second) barcode is configured to hybridize with the second tag, and wherein the (second) barcode comprises the second chromophore. The barcode exposure stage may further comprise providing radiation having a wavelength selected from the donor excitation radiation range to the protein. The barcode exposure stage may further comprise measuring emission in (the donor emission radiation range and/or the acceptor emission radiation range, especially in) the donor emission radiation range and the acceptor emission radiation range, especially to provide an emission signal. The pattern generation stage may comprise generating a FRET efficiency pattern based on the emission signal. The distance estimation stage may comprise estimating a distance between the first amino acid and the second amino acid based on the FRET efficiency pattern and/or the emission signal, especially based on the FRET efficiency pattern, or especially based on the emission signal, especially by determining FRET efficiency.
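As an illustration of the distance estimation stage, the following minimal sketch converts a measured FRET efficiency into a chromophore separation using the standard Förster relation E = 1 / (1 + (r / R0)^6). The relation is textbook FRET theory rather than something stated in this description, and the Förster radius R0 passed in the example is a hypothetical value that would have to be established for the actual dye pair and conditions.

```python
def estimate_distance_nm(fret_efficiency: float, forster_radius_nm: float) -> float:
    """Invert E = 1 / (1 + (r / R0)**6) to obtain the separation r.

    Assumption: a single fixed donor-acceptor distance and a known
    Forster radius R0 (hypothetical input, dye-pair dependent).
    """
    if not 0.0 < fret_efficiency < 1.0:
        raise ValueError("FRET efficiency must lie strictly between 0 and 1")
    if forster_radius_nm <= 0:
        raise ValueError("Forster radius must be positive")
    return forster_radius_nm * (1.0 / fret_efficiency - 1.0) ** (1.0 / 6.0)


# Hypothetical example: at E = 0.5 the estimated distance equals R0.
distance_at_half_transfer = estimate_distance_nm(0.5, forster_radius_nm=5.0)
```

The estimate refers to the distance between the two chromophores, so the lengths of the tags and barcodes would still need to be taken into account when relating it to the distance between the tagged amino acids.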
In particular, the method may enable quickly and precisely determining the distances between the first amino acid and a plurality of second amino acids, between a plurality of first amino acids and a second amino acid, and/or between a plurality of first amino acids and a plurality of second amino acids (see further below). In this way, (3D) spatial information and/or protein characterization may be provided. The invention may thus provide an advantageous protein characterization method based on single-molecule FRET analysis. In particular, the invention may provide an advantageous distance measurement method based on single-molecule FRET analysis. The method may in particular be suitable for one or more of: analysis of a nanometer-sized object such as a protein; single-molecule protein sequencing; analyzing proteoforms such as alternatively spliced proteins; single-molecule post-translational modification analysis; and single-molecule protein structure analysis. In particular, the analysis method may enable directly analyzing, especially identifying, the sequence of full-length proteins, which may (greatly) improve the accuracy of protein identification. The method may be suitable for single-molecule analysis, which may provide the ultimate sensitivity (one molecule) and allow the sequencing of proteins present in an amount 3-5 orders of magnitude smaller than what may presently be required for mass spectrometry. Hence, the method may be suitable for single-cell proteomics, and may furthermore be suitable for real-time screening for on-site medical diagnostics.
In particular, the method may provide a FRET efficiency pattern based on one or more emission signals, especially based on a plurality of emission signals. The FRET efficiency pattern may be characteristic of the protein. Especially, the FRET efficiency pattern may comprise a protein fingerprint.
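One possible way to represent such a FRET efficiency pattern is a normalized histogram of the mean efficiencies of individual barcode binding events. The sketch below is an assumption about the representation, not the method defined in this description; the function name, the bin count, and the example events are hypothetical, and the 0.3 s minimum duration mirrors the threshold mentioned for the proof-of-concept experiment.

```python
def fret_efficiency_pattern(events, n_bins=20, min_duration_s=0.3):
    """Build a normalized histogram of per-event mean FRET efficiencies.

    events: iterable of (duration_s, mean_efficiency) tuples (hypothetical format).
    Events shorter than min_duration_s or with efficiency outside [0, 1] are skipped.
    """
    counts = [0] * n_bins
    for duration_s, efficiency in events:
        if duration_s < min_duration_s or not 0.0 <= efficiency <= 1.0:
            continue
        # Clamp so that an efficiency of exactly 1.0 falls in the last bin.
        index = min(int(efficiency * n_bins), n_bins - 1)
        counts[index] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [count / total for count in counts]


# Hypothetical binding events: (duration in seconds, mean FRET efficiency).
example_events = [(0.5, 0.92), (0.4, 0.91), (0.1, 0.20), (1.0, 0.55)]
pattern = fret_efficiency_pattern(example_events, n_bins=10)
```

Peaks in the resulting pattern would correspond to distinct tag positions, so patterns from different proteins could in principle be compared bin by bin.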
The analysis method may herein also be referred to as single-molecule superresolution FRET (ssFRET).
Hence, the invention may provide an analysis method for characterization of a protein. The analysis method may involve analyzing a protein to determine a protein characteristic. In particular, the term “characterization of a protein” may herein refer to one or more of identifying the protein, especially via sequencing, or determining (at least part of) the 3D protein structure.
In embodiments, the protein may especially be a tagged protein, i.e., one or more tags may be provided to (or: “attached to”) the protein. In particular, the tagged protein may comprise a first amino acid tagged with a first tag, and especially a second amino acid tagged with a second tag. In further embodiments, the first amino acid and the second amino acid may be independently selected from the group comprising cysteine, lysine, methionine, tyrosine, an amino acid comprising a C-terminal carboxyl group, and an amino acid comprising an N-terminal amine group. In further embodiments, the first amino acid and the second amino acid may be the same type of amino acid, especially wherein the first amino acid comprises a terminal amino acid, and/or especially wherein the second amino acid comprises a terminal amino acid. In further embodiments, the first amino acid and the second amino acid may be different, especially the first amino acid and the second amino acid may be different types of amino acids. In such embodiments, the first amino acid may, for example, comprise an amino acid selected from the group comprising cysteine, lysine, methionine, and tyrosine, and the second amino acid may comprise an amino acid selected from the group comprising cysteine, lysine, methionine, and tyrosine differing from the first amino acid.
The terms “amino acid comprising a C-terminal carboxyl group” (also: “C-terminal amino acid”) and “amino acid comprising an N-terminal amine group” (also: “N-terminal amino acid”) may especially refer to the C-terminal carboxyl group and the N-terminal amino group of the protein. The term “tag” and similar terms may herein refer to a molecule that attaches to an amino acid, especially wherein the tag (covalently) binds the amino acid. The tag may be selected to be specific for a target (type of) amino acid. It will be clear to the person skilled in the art how a (type of) amino acid can be specifically tagged using known chemical approaches, such as using maleimide chemistry or reductive amination-aldehyde chemistry.
Hence, in embodiments, a tag for cysteine may comprise one or more of a maleimide group, a haloacetyl group, and a pyridyl disulfide group. In further embodiments, a tag for lysine may comprise one or more of an NHS ester, and a tag provided by reductive amination-aldehyde chemistry. In further embodiments, a tag for methionine may comprise one or more of azide and alkyne groups, especially provided via oxidation with oxaziridine, such as described by Lin, Shixian, et al. "Redox-based reagents for chemoselective methionine bioconjugation." Science 355.6325 (2017): 597-602, which is hereby incorporated by reference. In further embodiments, a tag for tyrosine may comprise one or more of a diazodicarboxylate group and a diazodicarboxamide group. In further embodiments, a tag for an amino acid comprising a C-terminal carboxyl group may comprise a group provided by a decarboxylative alkylation reaction. In further embodiments, a tag for an amino acid comprising an N-terminal amine group may comprise a group provided by one or more of NHS chemistry, 2PCA chemistry and an alkyne-ketene reaction.
The analysis method may especially relate to the use of FRET donor-acceptor pair chromophores. The term “FRET” (Förster Resonance Energy Transfer) may herein refer to the transfer of the energy of a donor chromophore to an acceptor chromophore, which may occur when the donor-acceptor pair chromophores are within a predetermined distance range, such as within several nanometers. Hence, the FRET donor-acceptor pair chromophores may comprise a first chromophore and a second chromophore, wherein the FRET donor-acceptor pair chromophores have a donor excitation radiation range, an acceptor excitation radiation range, a donor emission radiation range and an acceptor emission radiation range, wherein one of the FRET donor-acceptor pair chromophores is excitable by donor excitation radiation in the donor excitation radiation range, wherein the other of the FRET donor-acceptor pair chromophores is configured to provide acceptor emission in the FRET acceptor emission radiation range upon excitation with donor excitation radiation in the donor excitation radiation range of the one of the FRET donor-acceptor pair chromophores when the first chromophore and the second chromophore are configured within a predetermined distance range. Hence, if the donor chromophore and the acceptor chromophore are arranged within a predetermined distance range, which may vary for different FRET donor-acceptor pairs, the donor chromophore may upon excitation with donor excitation radiation transfer energy to the acceptor chromophore, whereupon the acceptor chromophore may emit acceptor emission radiation. This energy transfer may occur with a specific FRET efficiency depending on the (exact) distance between the donor chromophore and the acceptor chromophore. Hence, by measuring the FRET (transfer) efficiency, information regarding the distance between the donor chromophore and the acceptor chromophore is obtained.
In particular, the FRET transfer efficiency may be sensitive to subnanometer distance changes, which may make FRET an outstanding spectroscopic ruler for probing, for example, biological systems.
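The distance sensitivity can be made concrete with a short sketch that evaluates the standard Förster relation E(r) = 1 / (1 + (r / R0)^6) and the change in efficiency caused by a small displacement. The relation is standard FRET theory and is not stated in this description; the Förster radius of 5 nm and the 0.5 nm step used in the example are hypothetical values chosen for illustration.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Forward Forster relation: efficiency as a function of separation."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be non-negative and R0 positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


def efficiency_change(distance_nm: float, step_nm: float, forster_radius_nm: float) -> float:
    """Change in efficiency when the separation grows by step_nm."""
    return (fret_efficiency(distance_nm + step_nm, forster_radius_nm)
            - fret_efficiency(distance_nm, forster_radius_nm))


# Hypothetical comparison with R0 = 5 nm: a 0.5 nm step near R0 versus far from it.
change_near_r0 = efficiency_change(5.0, 0.5, forster_radius_nm=5.0)
change_far_from_r0 = efficiency_change(10.0, 0.5, forster_radius_nm=5.0)
```

Under these assumptions the same displacement produces a much larger efficiency change near R0 than at twice R0, which is why the choice of dye pair determines the distance window that can be resolved.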
The FRET excitation and emission ranges may, for example, comprise wavelengths in the UV range, the visible light range, and/or the (N)IR range. Hence, the FRET excitation and emission ranges may, for example, comprise a (sub)range selected from within the range of 200 - 1500 nm, especially from within the range of 400 - 800 nm. In embodiments, the donor excitation radiation range may comprise a (sub)range selected from the range of 200 - 1500 nm, especially from the range of 400 - 800 nm. In embodiments, the donor emission radiation range may comprise a (sub)range selected from the range of 200 - 1500 nm, especially from the range of 400 - 800 nm. In embodiments, the acceptor excitation radiation range may comprise a (sub)range selected from the range of 200 - 1500 nm, especially from the range of 400 - 800 nm. In embodiments, the acceptor emission radiation range may comprise a (sub)range selected from the range of 200 - 1500 nm, especially from the range of 400 - 800 nm. The FRET excitation and emission ranges will in general depend on the used FRET pairs.
For example, in embodiments, the FRET donor-acceptor chromophore pair may comprise Atto488 and Cy3, wherein Atto488 (the donor chromophore) and Cy3 (the acceptor chromophore) may be excited maximally at about 488 nm and 552 nm respectively, and wherein Atto488 and Cy3 may provide emission radiation at about 521 nm and 568 nm respectively. In further embodiments, the donor-acceptor chromophore pair may comprise Atto488 and Cy5, which may respectively be maximally excited at 488 nm and 650 nm, and may provide emission radiation at about 521 nm and 666 nm. In further embodiments, the donor-acceptor chromophore pair may comprise Cy3 and Cy5, which may respectively be maximally excited at 552 nm and 650 nm, and may provide emission radiation at about 568 nm and 666 nm. In further embodiments, the donor-acceptor chromophore pair may comprise Cy3 and Cy7, which may respectively be maximally excited at 552 nm and 750 nm, and may provide emission radiation at about 568 nm and 788 nm. In further embodiments, the donor-acceptor chromophore pair may comprise Cy5 and Cy7, which may respectively be maximally excited at 650 nm and 750 nm, and may provide emission radiation at about 666 nm and 788 nm.
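The approximate maxima listed above can be collected in a small lookup table from which the excitation wavelength and the two detection channels for a chosen pair follow. This is a minimal sketch: the table and function names are hypothetical, the values are the approximate maxima quoted in this description (with Cy3 taken as about 552 nm excitation), and real filter selection would depend on the full spectra.

```python
# Approximate (excitation maximum, emission maximum) in nm, as quoted above.
DYE_MAXIMA_NM = {
    "Atto488": (488, 521),
    "Cy3": (552, 568),
    "Cy5": (650, 666),
    "Cy7": (750, 788),
}


def channels_for_pair(donor: str, acceptor: str) -> dict:
    """Return the donor excitation wavelength and both emission channels."""
    donor_excitation, donor_emission = DYE_MAXIMA_NM[donor]
    _, acceptor_emission = DYE_MAXIMA_NM[acceptor]
    if acceptor_emission <= donor_emission:
        # In the pairs listed above, the acceptor emits at a longer wavelength than the donor.
        raise ValueError("acceptor should emit at a longer wavelength than the donor")
    return {
        "excite_nm": donor_excitation,
        "donor_emission_nm": donor_emission,
        "acceptor_emission_nm": acceptor_emission,
    }


# Example: the Cy3/Cy5 pair.
cy3_cy5_channels = channels_for_pair("Cy3", "Cy5")
```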
In embodiments, the donor-acceptor chromophore pair may comprise a chromophore pair selected from the group comprising Atto488/Cy3, Atto488/Cy3b, Atto488/Cy5, Atto488/Atto647n, Cy3/Cy5, Cy3b/Cy5, Cy3/Cy7, Cy3b/Cy7, and Cy5/Cy7.
The term “predetermined distance range” and similar terms may herein especially refer to a distance range wherein FRET energy transfer can occur for the FRET donor-acceptor pair chromophores, which may vary for different sets of FRET donor-acceptor pair chromophores.
The term “chromophore” may herein especially refer to a fluorescent chemical molecule that upon excitation with light (e.g. radiation from a laser), emits light of a different wavelength. The FRET donor-acceptor pair chromophores (also: “donor-acceptor pair chromophores”) may especially comprise fluorescent molecules and/or phosphorescent molecules. The term “FRET donor-acceptor pair chromophores” may herein especially refer to two chromophores capable of FRET energy transfer, i.e., energy transfer in a non-radiative distance-dependent fashion, especially through dipole-dipole coupling of the donor chromophore and the acceptor chromophore.
In embodiments, the FRET donor-acceptor pair chromophores may comprise one or more pairs selected from the group comprising the Cyanine family, the Alexa family, the Atto family, the Dy family, and the Rhodamine family. Different chromophore pairs may be sensitive at different distances, i.e., may provide a high effect regarding FRET efficiency for subnanometer distance changes (a high resolution). For example, the Cyanine family pair Cy3:Cy5 may be most sensitive at around a distance of 5 nm, such as at distances selected from the range of 3-7 nm. Similarly, the Cyanine family pair Cy3:Cy7 may be most sensitive at around a distance of 3 nm. The Cyanine family pair Cy2:Cy3 may be most sensitive at around a distance of 7 nm.
In embodiments, the first chromophore may comprise the donor chromophore and the second chromophore may comprise the acceptor chromophore. In further embodiments, the second chromophore may comprise the donor chromophore and the first chromophore may comprise the acceptor chromophore.
In embodiments, the first tag may comprise the first chromophore or may be associated to the first chromophore. In further embodiments, the first tag may comprise the first chromophore. In further embodiments, the first tag may be associated to the first chromophore. The term “associated” and similar terms may herein refer to two molecules being non-permanently connected, especially non-covalently connected, such as via hydrogen bond interactions. In particular, the first tag may be associated to the first chromophore via nucleotide hybridization (see further below). In such embodiments of the analysis method, the first tag and the first chromophore may dissociate during (part of) the method, i.e., the first tag and the first chromophore may be associated during at least part of the method, especially during at least part of the barcode exposure stage.
In embodiments, the second tag may comprise an oligonucleotide, especially an oligonucleotide comprising n2 nucleotides, wherein n2 is selected from the range of 3 to 20, especially 3 to 15, such as 5 to 12. Hence, the second tag may comprise a moiety suitable for tagging a (specific) (type of) amino acid, and the second tag may comprise an oligonucleotide. The oligonucleotide may especially comprise a subunit of deoxyribonucleic acid (DNA), ribonucleic acid (RNA) or peptide nucleic acid (PNA). Especially, the oligonucleotide may comprise a plurality of (covalently bound) subunits of DNA, RNA, and/or PNA. RNA may be relatively less stable than DNA and PNA, especially in biological environments, especially due to a relatively fast degradation. Hence, in further embodiments, the oligonucleotide may especially comprise a subunit of DNA or PNA. In general, the oligonucleotide chain may comprise a plurality of subunits of DNA and/or PNA.
In embodiments, the barcode exposure stage may comprise exposing the tagged protein to a second barcode (also “barcode”), wherein the barcode is configured to hybridize with the second tag. Hence, the second barcode may (also) comprise an oligonucleotide. Typically, the second tag and the second barcode may comprise the same type of oligonucleotide, i.e., the second tag and the second barcode may both comprise a DNA subunit, or may both comprise an RNA subunit, or may both comprise a PNA subunit. In particular, the second tag and the second barcode may hybridize, especially due to complementary nucleotide pairing. The duration of the hybridization may depend on the number of complementary nucleotide pairs (also: “base pairs”) between the second tag and the second barcode, wherein a larger number of complementary nucleotide pairs may result in a longer hybridization duration.
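The dependence of hybridization on complementary nucleotide pairing may be illustrated with a minimal Python sketch that counts the longest gap-free stretch of base pairs between a tag and a barcode. The function names and the example sequences in the usage note are hypothetical and are not taken from the application; the sketch assumes DNA-DNA Watson-Crick pairing, antiparallel alignment, and sequences written 5' to 3'.

```python
# Hypothetical helpers for illustration; not part of the described method.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def max_complementary_pairs(tag: str, barcode: str) -> int:
    """Length of the longest contiguous, gap-free stretch of the tag that
    can base-pair with the barcode in antiparallel orientation."""
    tag = tag.upper()
    # The stretch of the tag that pairs with the barcode reads, 5'->3',
    # as the reverse complement of the barcode.
    target = reverse_complement(barcode)
    best = 0
    for i in range(len(tag)):
        for j in range(len(target)):
            k = 0
            while (i + k < len(tag) and j + k < len(target)
                   and tag[i + k] == target[j + k]):
                k += 1
            best = max(best, k)
    return best
```

For instance, with the made-up tag `ATGCGTAC` and the made-up barcode `TACGC`, the function reports five complementary pairs; a longer complementary stretch would be expected to give a longer hybridization duration.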
In further embodiments, the (second) barcode may comprise the second chromophore. Hence, during the hybridization between the second tag and the second barcode, the second chromophore may be located in proximity of the second amino acid. In particular, the distance between the second amino acid and the second chromophore may depend on the length of the second tag and the second barcode, which may be tailored via the selection of the number of nucleotides in the second tag and/or the second barcode, i.e., via the length of the corresponding nucleotide chains. During the hybridization between the second tag and the second barcode, the second chromophore may be brought into proximity with the first chromophore, especially at a distance within the predetermined distance range, such that FRET energy transfer may take place if donor excitation radiation in the donor excitation radiation range is provided to the tagged protein.
Hence, in embodiments, the barcode exposure stage may further comprise providing radiation having a wavelength selected from the donor excitation radiation range to the protein.
In further embodiments, the barcode exposure stage may comprise measuring emission in the donor emission radiation range and the acceptor emission radiation range to provide an emission signal. The emission signal may comprise no emission (if the donor chromophore is not excited), or may comprise emission from the first and/or second chromophore, i.e., the emission signal may comprise donor emission radiation and/or acceptor emission radiation. If the donor chromophore and the acceptor chromophore are too far apart or too close for FRET energy transfer, the emission signal may essentially only comprise donor emission radiation. If the donor chromophore and the acceptor chromophore are arranged at a distance within the predetermined distance range, the emission signal may comprise acceptor emission radiation, donor emission radiation, or a mixture of donor emission radiation and acceptor emission radiation, wherein the composition of the emission signal may depend on the FRET (transfer) efficiency, which may depend on the distance between the first chromophore and the second chromophore.
If the donor chromophore and the acceptor chromophore are too close for FRET energy transfer, the position of the chromophore attached to the first amino acid and/or the second amino acid may be adjusted, for example by using a different first barcode (or second barcode) that hybridizes at a different location on the first tag (or second tag). Hence, in embodiments, the method may comprise sequentially providing a plurality of first barcodes (or second barcodes) for a (specific) first tag (or second tag), wherein different first barcodes (or second barcodes) of the plurality of first barcodes (or second barcodes) hybridize with the first tag (or second tag) at different locations.
In embodiments, the analysis method may comprise a pattern generation stage. The pattern generation stage may comprise generating a FRET efficiency pattern based on the emission signal, especially based on a plurality of emission signals, especially based on a plurality of emission signals related to a plurality of first barcodes and/or a plurality of second barcodes (see below). The term “FRET efficiency pattern” may herein refer to data related to FRET efficiency measurements of the protein. In particular, the FRET efficiency pattern may comprise emission-related data, such as the emission signal, especially related to FRET efficiency. Especially, the FRET efficiency pattern may comprise the emission signal. However, the FRET efficiency pattern may (also) comprise data based on the emission signal, such as binned data and/or processed data. In further embodiments, the FRET efficiency pattern may be the emission signal. The term “emission-related data” and similar terms may herein refer to the emission signal, FRET efficiency and/or an estimated distance. Instead of the term “FRET efficiency pattern”, also the term “emission-related data” may be applied.
Hence, in embodiments, the pattern generation stage may comprise binning data based on the emission signal to provide binned data, wherein the FRET efficiency pattern comprises the binned data.
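As a minimal sketch of such binning, the following Python function sorts per-measurement FRET efficiency values into equal-width bins over the interval 0 to 1. The function name, the default of ten bins, and the clipping of out-of-range values are illustrative assumptions and are not specified in the application.

```python
def bin_fret_efficiencies(efficiencies, n_bins=10):
    """Histogram of FRET efficiency values over [0, 1].

    Values outside [0, 1] (which may arise from noise) are clipped to the
    nearest edge; this clipping is an assumption of this sketch.
    """
    counts = [0] * n_bins
    for e in efficiencies:
        clipped = min(max(e, 0.0), 1.0)
        # A value of exactly 1.0 is assigned to the last bin.
        index = min(int(clipped * n_bins), n_bins - 1)
        counts[index] += 1
    return counts
```

The returned list of counts could then serve as (part of) the binned data comprised by the FRET efficiency pattern.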
In further embodiments, the pattern generation stage may comprise processing the emission signal to provide processed data, wherein the FRET efficiency pattern comprises the processed data.
In embodiments, the FRET efficiency pattern may comprise a protein fingerprint (of the protein).
Hence, in embodiments, the method may further comprise a distance estimation stage comprising estimating a distance between the first amino acid and the second amino acid based on the FRET efficiency pattern, especially based on the emission signal, especially by determining the FRET efficiency. The FRET efficiency (E) may be defined as:
$$E_{\mathrm{FRET}} = \frac{I_a}{I_a + I_d}$$
wherein $I_a$ is the intensity of the acceptor emission, and wherein $I_d$ is the intensity of the donor emission. The distance between the donor chromophore and the acceptor chromophore may then be estimated by comparing the measured value of $E_{\mathrm{FRET}}$ (equation above) to an estimated value of the FRET efficiency $E_e$ as a function of the distance $r$:
$$E_e = \frac{1}{1 + \left(r/R\right)^{6}}$$
wherein $R$ is the Förster radius, which may be specific for the donor-acceptor pair.
The distance between the first amino acid and the second amino acid may then be estimated based on the estimated distance between the donor chromophore and the acceptor chromophore, as well as, for example, based on the (length of) the tags and the barcodes.
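The distance estimation may be sketched in Python as follows, assuming the intensity-ratio definition of the FRET efficiency, E = Ia / (Ia + Id), and the standard Förster relation E = 1 / (1 + (r/R)^6), inverted to r = R (1/E - 1)^(1/6). The function names and the Förster radius of 5.0 nm in the usage note are illustrative assumptions. The result is the chromophore-to-chromophore distance; the contribution of the tags and barcodes is not modelled here.

```python
def fret_efficiency(i_acceptor: float, i_donor: float) -> float:
    """FRET efficiency from acceptor and donor emission intensities."""
    total = i_acceptor + i_donor
    if total <= 0:
        raise ValueError("no emission detected; efficiency is undefined")
    return i_acceptor / total


def distance_from_efficiency(efficiency: float, forster_radius: float) -> float:
    """Invert E = 1 / (1 + (r/R)^6) to obtain r, in the units of R."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)
```

With equal donor and acceptor intensities the efficiency is 0.5, and the estimated distance equals the Förster radius, e.g. 5.0 nm for an assumed R of 5.0 nm.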
Hence, in specific embodiments, the invention may provide an analysis method for characterization of a tagged protein using FRET donor-acceptor pair chromophores, wherein the FRET donor-acceptor pair chromophores comprise a first chromophore and a second chromophore, wherein the FRET donor-acceptor pair chromophores have a donor excitation radiation range, a donor emission radiation range and an acceptor emission radiation range, wherein one of the FRET donor-acceptor pair chromophores is excitable by donor excitation radiation in the donor excitation radiation range, wherein the other of the FRET donor-acceptor pair chromophores is configured to provide acceptor emission radiation in the acceptor emission radiation range upon excitation with donor excitation radiation in the donor excitation radiation range of the one of the FRET donor-acceptor pair chromophores when the first chromophore and the second chromophore are configured within a predetermined distance, wherein the tagged protein comprises a first amino acid tagged with a first tag and a second amino acid tagged with a second tag, wherein the first tag comprises the first chromophore or is associated to the first chromophore, wherein the second tag comprises an oligonucleotide, and wherein the method comprises: - a barcode exposure stage comprising: (i) exposing the tagged protein to a (second) barcode, wherein the barcode is configured to hybridize with the second tag, and wherein the barcode comprises the second chromophore, (ii) providing radiation having a wavelength selected from the donor excitation radiation range to the protein, and (iii) measuring emission in the donor emission radiation range and the acceptor emission radiation range to provide an emission signal; and - a pattern generation stage comprising generating a FRET efficiency pattern based on the emission signal.
In further embodiments, the analysis method may further comprise a distance estimation stage comprising estimating a distance between the first amino acid and the second amino acid based on the FRET efficiency pattern, especially based on the emission signal.
The term “oligonucleotide” may herein refer to a chain of nucleotides with a relatively small number of subunits. In general, an oligonucleotide may comprise DNA subunits, RNA subunits, or PNA subunits, however, the oligonucleotide may also comprise combinations. An oligonucleotide may comprise, for example, up to 200 nucleotides, such as up to 200 DNA subunits and/or PNA subunits.
In embodiments, the first amino acid may comprise a first post-translational modification, especially wherein the first tag is attached to the first post-translational modification; and/or the second amino acid may comprise a second post-translational modification, especially wherein the second tag is attached to the second post-translational modification.
Hence, in embodiments, the method may comprise providing a first tag (and/or second tag) that tags a (specific) amino acid independent of whether or not the amino acid comprises a post-translational modification (“PTM”). In further embodiments, the method may comprise providing a first tag (and/or second tag) that tags a (specific) amino acid if the amino acid is post-translationally modified, especially with a specific post-translational modification. Hence, in such embodiments, the method may comprise tagging a post-translationally modified amino acid with a first tag (and/or second tag). The presence of a PTM in a protein may be a key signature of several diseases in neurology, oncology and immunology. It has, however, been challenging to detect PTMs accurately. For example, it may generally be difficult to determine (1) whether a PTM is present given a small amount of sample; (2) if it is present, to what degree the PTM has occurred within a tissue of interest; and (3) at which amino acid residue of a protein the PTM is located. The above-mentioned embodiments may facilitate specifically locating the position of a PTM in a protein on a single-molecule basis.
In embodiments, the first amino acid (or the second amino acid) may have been post-translationally modified via one or more of phosphorylation, O-linked glycosylation, acetylation, methylation, nitration, farnesylation, palmitoylation, myristoylation, and S-nitrosylation. That is, the PTM may be selected from the group comprising a phosphoryl group, an O-glycan, an acetyl group, a methyl group, a nitro group, a farnesyl group, a palmitoyl group, a myristoyl group, and an S-nitrothiol group. In further embodiments, the first post-translational modification may be selected from the group comprising a phosphoryl group, an O-glycan, an acetyl group, a methyl group, a nitro group, a farnesyl group, a palmitoyl group, a myristoyl group, and an S-nitrothiol group. In further embodiments, the second post-translational modification may be selected from the group comprising a phosphoryl group, an O-glycan, an acetyl group, a methyl group, a nitro group, a farnesyl group, a palmitoyl group, a myristoyl group, and an S-nitrothiol group. The term “PTM” may also refer to a plurality of (different) PTMs.
In specific embodiments, the PTM, especially the first post-translational modification, or especially the second post-translational modification, may comprise a phosphoryl group, wherein the first tag (or second tag) may be provided to the first amino acid (or second amino acid) via one or more of Beta-elimination/Michael addition (also “BEMA”) and/or Phosphoramidate chemistry. BEMA may comprise the elimination of the phosphoryl group using a saturated Ba(OH)2 solution, resulting in the formation of an α,β-unsaturated carbonyl compound. This α,β-unsaturated carbonyl can be used as a Michael acceptor, where a nucleophilic compound containing an enrichment tag may be attached. Especially, a phosphorylated serine or a phosphorylated threonine may be tagged with a tag comprising a nucleophile (e.g. a thiol or amine) and an oligonucleotide.
Phosphoramidate chemistry may comprise attaching a cysteamine to the phosphate group of an amino acid, wherein the addition of the cysteamine will result in a free sulfhydryl. The tag may then be attached to the free sulfhydryl groups using maleimides, haloacetyls or pyridyl disulfides, i.e., the tag may comprise a tagging group selected from the group comprising maleimides, haloacetyls, and pyridyl disulfides.
The duration of hybridization between a barcode and a tag may depend (in part) on the number of complementary residues between the barcode and the tag. For example, barcodes with 5 to 10 nucleotides complementarity to a DNA tag may provide an off-rate in the range of 0.1-10 s⁻¹, whereas barcodes with 10 to 20 nucleotides complementarity to a DNA tag may provide an off-rate in the range of 0.001-0.1 s⁻¹. In particular, the binding affinity between PNA-DNA, PNA-PNA, RNA-DNA and RNA-RNA may be higher than the binding affinity of DNA-DNA. Especially, the binding affinities of PNA-DNA and PNA-PNA may be substantially higher. Hence, the same off-rate may be achieved with a shorter (complementary section of a) barcode for, for example, PNA-DNA binding than for DNA-DNA binding. The person skilled in the art will be capable of selecting the type and length of the barcodes in view of the desired off-rate.
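To relate an off-rate to a hybridization duration, the following Python sketch assumes simple first-order (single-exponential) dissociation kinetics, in which the mean bound time is the reciprocal of the off-rate. This kinetic model and the function names are assumptions made for illustration and are not stated in the application.

```python
import math


def mean_bound_time(k_off: float) -> float:
    """Mean duration (s) of a hybridization event for an off-rate in 1/s,
    assuming first-order dissociation."""
    if k_off <= 0:
        raise ValueError("off-rate must be positive")
    return 1.0 / k_off


def fraction_still_bound(k_off: float, elapsed_s: float) -> float:
    """Fraction of initially hybridized barcodes still bound after elapsed_s
    seconds, assuming first-order dissociation and no rebinding."""
    return math.exp(-k_off * elapsed_s)
```

Under these assumptions, off-rates of 0.1-10 s⁻¹ would correspond to mean bound times of 10 s down to 0.1 s, and off-rates of 0.001-0.1 s⁻¹ to mean bound times of 1000 s down to 10 s.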
Hence, in embodiments, the (second) barcode may comprise m nucleotides complementary with the second tag. The person skilled in the art may select m to provide a desired off-rate (also: “dissociation rate”). In further embodiments, m may be at least 3, such as at least 5, especially at least 7, such as at least 10, especially at least 15. In further embodiments, m may be at most 30, such as at most 25, especially at most 20, such as at most 15, especially at most 10, such as at most 7, especially at most 5. In further embodiments, m may be selected from the range of 5-10.
In further embodiments, the protein may comprise a plurality of second amino acids, especially a plurality of second amino acids of the same type, tagged with (two or more) different second tags, i.e., each of the plurality of second amino acids is tagged with a different (or: unique) second tag. In further embodiments, the plurality of second amino acids may comprise at least 2 amino acids, such as at least 3 amino acids, especially at least 4 amino acids, such as at least 5 amino acids. In further embodiments, the plurality of second amino acids may comprise at most 70 amino acids, such as at most 50 amino acids, especially at most 40 amino acids, such as at most 25 amino acids. Especially, the different second tags may comprise different nucleotides, especially different oligonucleotides, i.e., each of the second tags may comprise a different oligonucleotide. In such embodiments, the barcode exposure stage may comprise sequentially exposing the (tagged) protein to a plurality of different barcodes, especially wherein for each second tag the barcode exposure stage comprises exposing the protein to a corresponding barcode (of the plurality of different barcodes) configured to hybridize with the respective second tag.
In further embodiments, the barcode exposure stage may comprise exposing the protein to one or more corresponding barcodes for each of the second tags, i.e., for each second tag one or more barcodes hybridizing with that tag are (sequentially) provided. The plurality of different barcodes may especially be configured to uniquely hybridize to a specific second tag of the plurality of second tags.
In further embodiments, the barcode exposure stage may comprise, during the exposure of the protein to each of the barcodes, providing radiation having a wavelength selected from the donor excitation radiation range to the tagged protein, and measuring emission in the donor emission radiation range and the acceptor emission radiation range to provide an emission signal. In general, the barcode exposure stage may comprise repeatedly or (essentially) continuously, especially repeatedly, or especially continuously, providing radiation having a wavelength selected from the donor excitation radiation range to the tagged protein, and measuring emission in the donor emission radiation range and the acceptor emission radiation range to provide an emission signal. Thereby, emission signals may be obtained for each barcode exposure.
Hence, the barcode exposure stage may comprise sequentially providing different second barcodes to the tagged protein, wherein each barcode uniquely hybridizes to a second tag, thereby (potentially) modifying the emission signal of the protein depending on the distance between the second chromophore (comprised by the second barcode) and the first chromophore (comprised by or associated to the first tag), thereby providing a plurality of emission signals. During the distance estimation stage, the distance between the first amino acid and each of the second amino acids may be estimated based on the FRET efficiency pattern, especially based on a plurality of emission signals. Hence, thereby, the method may provide, for example, estimated distances between a first amino acid, for example the N-terminal amino acid, and a plurality of second amino acids, for example (essentially) all amino acids of a specific type, such as cysteine, in the (tagged) protein.
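A minimal Python sketch of how a plurality of emission signals, one per barcode exposure, might be converted into a set of estimated chromophore distances is given below. The data layout (a dictionary mapping a barcode name to a list of acceptor/donor intensity pairs), the function name, and the use of summed intensities are hypothetical choices; the sketch assumes the standard Förster relation E = 1 / (1 + (r/R)^6) and does not correct for the length of the tags and barcodes.

```python
def distances_per_barcode(signals, forster_radius):
    """Estimate a donor-acceptor distance for each barcode exposure.

    signals: dict mapping a (hypothetical) barcode name to a list of
             (acceptor_intensity, donor_intensity) tuples.
    Returns a dict mapping each barcode name to an estimated distance in
    the units of forster_radius, or None where no estimate is possible.
    """
    estimates = {}
    for barcode, frames in signals.items():
        total_acceptor = sum(acceptor for acceptor, _ in frames)
        total_donor = sum(donor for _, donor in frames)
        total = total_acceptor + total_donor
        if total <= 0:
            estimates[barcode] = None  # no emission recorded
            continue
        efficiency = total_acceptor / total
        if not 0.0 < efficiency < 1.0:
            estimates[barcode] = None  # outside the invertible range
            continue
        estimates[barcode] = forster_radius * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)
    return estimates
```

An exposure that yields only donor emission (efficiency 0) is reported as `None`, reflecting that the chromophores may be outside the predetermined distance range.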
During the barcode exposure stage, the donor chromophore may thus in general interact with at most one acceptor chromophore at a given timepoint. For example, the protein may have five second amino acids tagged with different tags, each with a different corresponding barcode. Then, the tagged protein may be exposed sequentially to the different barcodes, especially the protein may be flushed with complementary barcode sequences that contain the second chromophore one by one for each tag. Although beforehand it may not be known which second amino acid is probed at a given timepoint, each second amino acid may be probed one by one and high precision may be obtained.
In embodiments, the first tag may (also) comprise an oligonucleotide, especially an oligonucleotide comprising o1 nucleotides, wherein o1 is selected from the range of 3 to 25, such as 3 to 20, especially 5 to 15. In such embodiments, the barcode exposure stage may comprise exposing the (tagged) protein to a first barcode, especially wherein the first barcode is configured to hybridize with the first tag. In further embodiments, the first barcode may comprise the first chromophore.
In further embodiments, the first barcode may comprise a number of nucleotides complementary with the first tag. The person skilled in the art may select this number to provide a desired off-rate (also: “dissociation rate”). In further embodiments, this number may be at least 3, such as at least 5, especially at least 7, such as at least 10, especially at least 15. In further embodiments, this number may be at most 30, such as at most 25, especially at most 20, such as at most 15, especially at most 10, such as at most 7, especially at most 5. In further embodiments, this number may be selected from the range of 10-20.
In further embodiments, the number of nucleotides of the first barcode complementary with the first tag may especially be selected to be larger than m, or may especially be selected to be smaller than m, wherein m is the number of nucleotides of the (second) barcode complementary with the second tag.
By providing a different number of complementary nucleotides between the combination of first tag and first barcode and the combination of second tag and second barcode, the off-rate between the first tag and first barcode may differ from the off-rate between the second tag and the second barcode. In particular, the different off-rates may be selected such that during the hybridization of the first tag with the first barcode (or: second tag with the second barcode), a plurality of second tags (or: first tags) may hybridize with corresponding second barcodes (or: first barcodes). This may be particularly beneficial for sequentially providing both different first barcodes and second barcodes.
Hence, in further embodiments, the protein may comprise a plurality of first amino acids, especially of the same type, tagged with (two or more) different first tags, especially wherein the different first tags comprise different nucleotide sequences. Especially, each of the plurality of first amino acids may be tagged with a different first tag. In further embodiments, the plurality of first amino acids may comprise at least 2 amino acids, such as at least 3 amino acids, especially at least 4 amino acids, such as at least 5 amino acids. In further embodiments, the plurality of first amino acids may comprise at most 70 amino acids, such as at most 50 amino acids, especially at most 40 amino acids, such as at most 25 amino acids. In further embodiments, the barcode exposure stage may comprise sequentially exposing the protein to a plurality of different first barcodes, especially wherein for each first tag the barcode exposure stage comprises exposing the protein to a corresponding first barcode (of the different first barcodes) configured to hybridize with the respective first tag. In further embodiments, the barcode exposure stage may comprise exposing the protein to one or more corresponding first barcodes for each of the first tags, i.e., for each first tag one or more first barcodes hybridizing with that first tag are (sequentially) provided. The plurality of different first barcodes may especially be configured to uniquely hybridize to a specific first tag of the plurality of first tags. Thereby, estimated distances between a plurality of first amino acids and a second amino acid may be obtained.
In embodiments, the tagged protein may comprise both a plurality of (tagged) first amino acids and a plurality of (tagged) second amino acids. In such embodiments, the barcode exposure stage may especially comprise exposing the protein to a plurality of first barcodes and to a plurality of second barcodes, especially wherein during the exposure to each of the first barcodes, the protein is (sequentially) exposed to each of the second barcodes. Thereby, the method may provide estimated distances between (each of) the plurality of first amino acids and (each of) the plurality of second amino acids.
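The nested exposure order described above may be sketched in Python as follows, where the barcode names are hypothetical placeholders and the function merely lists the order in which pairs of first and second barcodes would be probed.

```python
from itertools import product


def exposure_schedule(first_barcodes, second_barcodes):
    """List (first_barcode, second_barcode) pairs in probing order: for each
    first barcode, every second barcode is provided in turn."""
    # itertools.product varies its last argument fastest, which matches
    # cycling through all second barcodes during each first-barcode exposure.
    return list(product(first_barcodes, second_barcodes))
```

For two first barcodes and three second barcodes this yields six pairs, each of which could give one estimated distance between a first amino acid and a second amino acid.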
Rather than sequentially providing one or more of a plurality of first barcodes and a plurality of second barcodes, the analysis method may also comprise simultaneously providing one or more of a plurality of first barcodes and a plurality of second barcodes. In such embodiments, the tagged protein may thus be simultaneously associated with a plurality of first (or second) chromophores, which may result in a composite signal related to a plurality of FRET pairs. Hence, in such embodiments, the signals corresponding to different FRET pairs may need to be distinguished.
Hence, in further embodiments, the off-rates (or dissociation rates) between the barcodes and the tags may be selected such that during at least part of the barcode exposure stage only a single first chromophore and only a single second chromophore are associated with the tagged protein, i.e., the off-rates may be selected to be relatively large in order to temporally separate the signals corresponding to different FRET pairs. For example, in such embodiments, the first barcode (or the second barcode) may have 5 to 10 nucleotides complementary to the first tag (or the second tag), which may provide an off-rate in the range of 0.1-10 s⁻¹. In further embodiments, the number of nucleotides of the first barcode complementary with the first tag may be selected to provide an off-rate larger than 0.01 s⁻¹, such as larger than 0.1 s⁻¹, especially larger than 1 s⁻¹. Similarly, in further embodiments, m may be selected to provide an off-rate larger than 0.01 s⁻¹, such as larger than 0.1 s⁻¹, especially larger than 1 s⁻¹.
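Why a relatively large off-rate may help to keep at most one chromophore associated at a time can be illustrated with a simple two-state binding model in Python. The model, the on-rate, the barcode concentration, and the assumption that tags bind independently are all illustrative assumptions and are not taken from the application.

```python
def bound_fraction(k_on: float, concentration: float, k_off: float) -> float:
    """Equilibrium probability that a single tag carries a barcode.

    k_on in 1/(M*s), concentration in M, k_off in 1/s (two-state model).
    """
    on_rate = k_on * concentration
    return on_rate / (on_rate + k_off)


def prob_at_most_one_bound(p_bound: float, n_tags: int) -> float:
    """Probability that zero or one of n_tags independent tags is occupied."""
    none_bound = (1.0 - p_bound) ** n_tags
    one_bound = n_tags * p_bound * (1.0 - p_bound) ** (n_tags - 1)
    return none_bound + one_bound
```

In this model, raising the off-rate at a fixed on-rate and concentration lowers the bound fraction per tag, which in turn raises the probability that no more than one tag is occupied at any given time.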
However, in further embodiments, signals corresponding to different FRET pairs may be temporally separated via (stochastic) activation and inactivation of chromophores. In particular, the off-rates may be selected to be relatively small, i.e., the barcodes are associated to the tags for a relatively long time, but the chromophores may switch between active and inactive states during the barcode exposure stage (see below). For example, in such embodiments, the first barcode (or the second barcode) may have 10 to 20 nucleotides complementary to the first tag (or the second tag), which may provide an off-rate in the range of 0.001-0.1 s⁻¹. In further embodiments, the number of nucleotides of the first barcode complementary with the first tag may especially be selected from the range of 10-25, such as from the range of 10-20, especially from the range of 10-15. Similarly, in further embodiments, m may especially be selected from the range of 10-25, such as from the range of 10-20, especially from the range of 10-15, or especially from the range of 15-20. In particular, the number of nucleotides of the first barcode complementary with the first tag may be selected to provide an off-rate smaller than 0.1 s⁻¹, such as smaller than 0.01 s⁻¹, especially smaller than 0.005 s⁻¹. Similarly, in further embodiments, m may be selected to provide an off-rate smaller than 0.1 s⁻¹, such as smaller than 0.01 s⁻¹, especially smaller than 0.005 s⁻¹.
Such embodiments, with small off-rates, may be beneficial as a lower concentration of second barcode can be used due to longer hybridization times.
Further, in such embodiments, one or more of the first chromophore and the second chromophore may (be configured to) (stochastically) switch between an active and an inactive state, wherein the chromophore does not perform FRET in the inactive state.
Hence, in embodiments, one or more of the first chromophore and the second chromophore may be configured to switch between an active state and an inactive state, especially the first chromophore may be configured to switch between an active state and an inactive state, or especially the second chromophore may be configured to switch between an active and an inactive state.
For example, a chromophore may switch between an active state and an inactive state as a function of radiation and/or redox agents. For instance, Cy5 may switch from an inactive state to an active state when exposed to green light, and may switch from the active state to an inactive state when exposed to red light and a reducing agent. Similarly, ATTO655 may switch from an inactive state to an active state when exposed to an oxidizing agent, and may switch from the active state to the inactive state when exposed to red light and a reducing agent. It will be clear to the person skilled in the art that the conditions at which a chromophore switches between an active state and an inactive state may be chromophore-dependent, and that these conditions may thus be selected specifically for the chromophore the skilled person decides to use.
Hence, in further embodiments, the analysis method may comprise providing conditions in which one or more of the first chromophore and the second chromophore switches between an active state and an inactive state, especially wherein the first chromophore switches between an active state and an inactive state, or especially wherein the second chromophore switches between an active state and an inactive state. In particular, in embodiments, the analysis method may comprise providing switching radiation to the tagged protein, wherein the switching radiation is selected in order to switch the first chromophore (or the second chromophore) between an active state and an inactive state. In further embodiments, the analysis method may comprise providing a redox agent to the tagged protein, wherein the redox agent is suitable to switch the first chromophore (or the second chromophore) between an active state and an inactive state.
Besides the number of complementary nucleotides, the off-rate may further be influenced by, for example, a liquid the tagged protein is exposed to. Hence, in further embodiments, the analysis method, especially the barcode exposure stage, may comprise exposing the tagged protein to a dissociation buffer configured to increase the off-rate, especially the off-rate of the first tag and the first barcode, or especially the off-rate of the second tag and the second barcode. In particular, the barcode exposure stage may comprise exposing the tagged protein to the dissociation buffer in between the exposing of the tagged protein to different barcodes.
Hence, the analysis method may, for example, comprise providing a barcode of the different barcodes to the tagged protein, wherein the barcode hybridizes with the second tag (or the first tag), subsequently providing the dissociation buffer to dissociate the barcode and the second tag (or the first tag), and subsequently providing another barcode of the different barcodes to the tagged protein, wherein the other barcode hybridizes with the second tag (or the first tag).
In embodiments, the dissociation buffer (or “melting buffer”) may especially comprise one or more of deionized water, urea, formamide, and a strong base, especially deionized water.
The off-rate may also be influenced by temperature, i.e., the off-rate may increase at higher temperatures. Hence, in further embodiments, the barcode exposure stage may comprise exposing the tagged protein to an elevated temperature in between the exposing of the tagged protein to different barcodes. In further embodiments, the elevated temperature may be selected from the range of 40 - 90°C, such as from the range of 40 - 80°C, especially from the range of 50 - 70°C, such as from the range of 55 - 70°C. It will be clear to the person skilled in the art that the selection of the elevated temperature may depend on the (complementary) sequence. For example, with respect to DNA-DNA hybridization, a higher elevated temperature may be selected for a sequence with a high G/C content than for a sequence with a high A/T content as the G-C pairing may be more stable than the A-T pairing.
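The dependence on G/C content may be illustrated with the Wallace rule, a common rule of thumb for short DNA oligonucleotides that estimates the melting temperature as 2 °C per A or T plus 4 °C per G or C. The use of this rule here is an assumption made for illustration; it is not mentioned in the application, it is only a rough approximation, and it is not a substitute for selecting the elevated temperature experimentally. The function names and example sequences are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C nucleotides in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm_celsius(seq: str) -> int:
    """Rough melting temperature (deg C) of a short DNA duplex using the
    Wallace rule: 2 * (A + T) + 4 * (G + C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc
```

For two made-up 8-nucleotide sequences, `GCGCGCGC` gives a higher estimate than `ATATATAT`, in line with the greater stability of G-C pairing.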
Hence, in embodiments, the tagged protein may comprise a plurality of second amino acids tagged with second tags, wherein the barcode exposure stage comprises exposing the tagged protein to a barcode configured to hybridize with two or more of the plurality of second tags, wherein the barcode comprises m nucleotides complementary with the second tag.
In further embodiments, m may be selected from the range of 5-10, especially from the range of 5-9, such as from the range of 5-8. In such embodiments, the off-rate may be sufficiently high that the association of the barcode to different second tags may be temporally separated, and thus that the signal from different FRET pairs may be temporally separated.
In further embodiments, m may be selected from the range of 10-20, especially from the range of 11-15. In such embodiments, the off-rate may be relatively low, which may cause multiple second tags to simultaneously be associated to the barcode. Hence, in such embodiments, one or more of the first chromophore and the second chromophore may be configured to switch between an active state and an inactive state. Thereby, the signal from different FRET pairs may be temporally separated.
Similarly, in further embodiments, the tagged protein may comprise a plurality of first amino acids tagged with first tags, wherein the barcode exposure stage comprises exposing the tagged protein to a first barcode configured to hybridize with two or more of the plurality of first tags, wherein the first barcode comprises a number of nucleotides complementary with the first tag. In further embodiments, this number may be selected from the range of 5-10, especially from the range of 5-9, such as from the range of 5-8, and one or more of the first chromophore and the second chromophore may especially be configured to switch between an active state and an inactive state.
In embodiments, the plurality of first amino acids may especially comprise (essentially) all amino acids in the protein of a first type, wherein the first type is especially selected from the group cysteine, lysine, methionine, and tyrosine. Hence, the plurality of first amino acids may especially comprise (essentially) all amino acids of the first type in the protein. For example, the plurality of first amino acids may comprise all cysteines in the protein. In further embodiments, the plurality of second amino acids may especially comprise (essentially) all amino acids in the protein of a second type, wherein the second type is especially selected from the group cysteine, lysine, methionine, and tyrosine. Hence, the plurality of second amino acids may especially comprise (essentially) all amino acids of the second type in the protein. For example, the plurality of second amino acids may comprise all tyrosines in the protein. In particular, the first type may be different from the second type.
In embodiments, the first tag may comprise the same number of nucleotides as the first barcode. In further embodiments, the first tag may comprise fewer nucleotides than the first barcode. In further embodiments, the first tag may comprise more nucleotides than the first barcode.
In embodiments, the second tag may comprise the same number of nucleotides as the second barcode. In further embodiments, the second tag may comprise fewer nucleotides than the second barcode. In further embodiments, the second tag may comprise more nucleotides than the second barcode.
In particular, it may be beneficial for the first tag (and/or second tag) to have more nucleotides than the first barcode (and/or second barcode), as this may facilitate providing a plurality of barcodes corresponding to the same tag. The plurality of barcodes may hybridize at different locations along the tag, thereby providing for a tuneable distance between the first amino acid (or second amino acid) and the first chromophore (or second chromophore). The combination of tag and corresponding barcode may thus be considered a “linker” connecting the first amino acid (or second amino acid) to the first chromophore (or second chromophore). Specifically, with a plurality of barcodes, the combination of tag and corresponding barcodes may be considered a variable linker.
In embodiments, the analysis method may further comprise a tagging stage. The tagging stage may especially be arranged prior to the barcode exposure stage. The tagging stage may comprise tagging the first amino acid with the first tag and/or tagging the second amino acid with the second tag in a protein to provide the tagged protein.
In embodiments, the analysis method may provide a (processed) emission signal, especially a FRET efficiency or distance. In further embodiments, the (processed) emission signal, such as the FRET efficiency, may be used for one or more of protein identification, protein structure determination, protein conformational change analysis, protein substrate binding analysis, protein DNA binding analysis, and in situ experimentation (such as with fixed cells). Hence, in embodiments, the analysis method may comprise one or more of protein identification, protein structure determination, protein conformational change analysis, protein substrate binding analysis, and protein DNA binding analysis.
In further embodiments, the analysis method may comprise further characterizing the protein using second FRET donor-acceptor pair chromophores, wherein the second FRET donor-acceptor pair chromophores differ from the FRET donor-acceptor pair chromophores. Such embodiments may be beneficial as different FRET donor-acceptor pair chromophores may vary in their sensitivities in FRET efficiency with respect to distances due to different Förster radii.
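The relation between FRET efficiency, distance and Förster radius may be sketched with the standard Förster expression E = 1 / (1 + (r/R0)^6). The function names and the numerical values of R0 below are hypothetical and serve only to illustrate why donor-acceptor pairs with different Förster radii may be sensitive in different distance ranges.

```python
def fret_efficiency(distance_nm: float, r0_nm: float) -> float:
    """Standard Foerster relation: E = 1 / (1 + (r / R0)**6)."""
    if distance_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


def distance_from_efficiency(efficiency: float, r0_nm: float) -> float:
    """Invert the Foerster relation: r = R0 * (1/E - 1)**(1/6)."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


# Hypothetical Foerster radii for two different donor-acceptor pairs.
for r0 in (5.0, 6.0):
    # The same 5 nm distance yields a different efficiency per pair.
    print(r0, round(fret_efficiency(5.0, r0), 3))
```

At a distance equal to R0 the efficiency is 0.5, which is where the efficiency changes most steeply with distance; a pair with a larger R0 therefore shifts the most sensitive range to larger distances.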
In embodiments, the analysis method may comprise a fingerprint provision stage comprising providing a protein fingerprint based on the FRET efficiency pattern and/or the estimated distance, especially based on the FRET efficiency pattern, or especially based on the estimated distance.
The term “protein fingerprint” may herein refer to a protein-specific (unique) signal, especially wherein the protein fingerprint is suitable for identification of the protein. Herein, the protein fingerprint may especially refer to one or more of an array of FRET efficiency values; an array of estimated distances; and/or raw data, especially one or more emission signals, obtained according to the method of the invention.
Hence, in embodiments, the FRET efficiency pattern may (essentially) comprise a protein fingerprint. In further embodiments, the FRET efficiency pattern may be processed (in the protein fingerprint provision stage) to provide the protein fingerprint. It will be clear to the person skilled in the art that the protein fingerprint may vary in dependence on, for example, the used FRET donor-acceptor pair chromophores, tags, and/or barcodes, i.e., there may be a plurality of (possible) protein fingerprints (unique) for the protein in dependence on the selected FRET donor-acceptor pair chromophores, (length of the) tags, (length of the) barcodes, and/or (length of) the complementary sequence between the tags and the barcodes. Hence, in embodiments, the analysis method may comprise providing a plurality of protein fingerprints of the protein by varying one or more of FRET donor-acceptor pair chromophores, first tags, second tags, first barcodes and/or second barcodes, especially by varying the FRET donor-acceptor pair chromophores. In further embodiments, the protein fingerprint may comprise data related to emission signals (independently) obtained using a plurality of (different) FRET donor-acceptor pair chromophores.
In further embodiments, the analysis method may comprise a protein identification stage comprising identifying the protein by comparing the protein fingerprint to protein-related information in reference data. Especially, the protein-related information may comprise predetermined protein fingerprints, and the protein may be identified based on a comparison between the protein fingerprint and the predetermined protein fingerprints in the reference data. In further embodiments, the protein-related information may comprise predicted predetermined protein fingerprints, which may each be predicted based on a corresponding (known) protein structure of a protein.
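The comparison of a protein fingerprint with predetermined protein fingerprints may be sketched as a nearest-match search. The sketch assumes that a fingerprint is an array of FRET efficiency values of fixed length and that similarity is measured by Euclidean distance; both are illustrative assumptions, and the function name, protein names and values are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def identify_protein(
    fingerprint: List[float], reference: Dict[str, List[float]]
) -> Tuple[str, float]:
    """Return (name, distance) of the reference fingerprint closest to the
    measured fingerprint, using Euclidean distance (an assumed metric)."""
    best_name, best_dist = None, math.inf
    for name, ref in reference.items():
        if len(ref) != len(fingerprint):
            continue  # skip references that cannot be compared
        dist = math.dist(fingerprint, ref)
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_name is None:
        raise ValueError("no comparable reference fingerprint found")
    return best_name, best_dist


# Hypothetical reference data: arrays of FRET efficiency values.
reference_data = {
    "protein_A": [0.2, 0.8, 0.5],
    "protein_B": [0.7, 0.3, 0.4],
}
measured = [0.25, 0.75, 0.5]
name, dist = identify_protein(measured, reference_data)
print(name)  # protein_A
```

In practice a threshold on the returned distance may be desirable, so that a poor best match is not reported as an identification.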
The method according to the invention may be particularly suitable to identify different proteoforms of a protein, especially proteoforms arising from alternative splicing, as the method may not rely on protein fragmentation, may be versatile in the amino acids that can be tagged, and may be sensitive towards identifying “missing” amino acids due to single nucleotide polymorphisms (SNPs) and alternative splicing. The term “missing amino acid” may especially refer to an amino acid that is typically present in a certain location in the protein, but has been replaced due to a mutation or is absent due to alternative splicing. The identification of proteoforms may be important as several diseases, such as cystic fibrosis, cancer and Parkinson's disease, have been associated with mutations in splicing elements that lead to alternative splicing and abnormal protein production. Similarly, the post-translational modification of proteins may be a key signature of several diseases in neurology, oncology and immunology.
The term “proteoform” may herein refer to all different forms of a protein that may be transcribed from a single protein encoding gene, wherein the difference may be due to alternative splicing and/or differences in post-translational modifications, as well as to different forms of a protein resulting from gene variations such as SNPs, i.e., the term “proteoform” may herein also refer to two proteins transcribed from two different alleles of the same gene in two different individuals.
Hence, in embodiments, the protein identification stage may comprise identifying a proteoform of the protein, especially wherein the reference data comprises protein-related information pertaining to the proteoforms. In further embodiments, the protein identification stage may comprise identifying an alternatively spliced form of the protein, especially wherein the reference data comprises protein-related information pertaining to alternative splicing.
In embodiments, the reference data may comprise an (online) database. Hence, the method may comprise retrieving protein-related information from the reference data, especially from the (online) database.
In further embodiments, the fingerprint provision stage may comprise predicting a (partial) amino acid sequence based on the FRET efficiency pattern, especially based on the estimated distance, and may provide a protein fingerprint comprising the predicted (partial) amino acid sequence. In such embodiments, the protein-related information may comprise amino acid sequences. The amino acid sequences may especially comprise translated nucleotide sequences, and/or amino acid sequences resulting from alternative splicing.
Such embodiments may be particularly beneficial with regards to the identification of linearized proteins, as the distance between two amino acid residues can be relatively directly translated to a relative position in the amino acid chain.
Hence, in embodiments, the (tagged) protein may comprise a denatured protein. The term “denatured” and similar terms herein especially refers to the protein having lost its secondary and tertiary structures, including any disulfide bonds between cysteine residues.
Hence, in further embodiments, the analysis method may comprise a denaturation stage comprising denaturation of the protein. Methods for the denaturation of proteins will be known to the person skilled in the art and may, for example, comprise exposing the protein to sodium dodecyl sulfate (SDS).
Denaturation of the protein may further facilitate tagging the first and second amino acids. Hence, in embodiments, the tagging stage may be arranged after the denaturation stage. Especially, the tagging stage may comprise tagging the first amino acid with the first tag and/or tagging the second amino acid with the second tag in the denatured protein.
In embodiments, the denatured tagged protein may be subjected to the barcode exposure stage, especially in embodiments for the identification of the protein, more especially in embodiments for the identification of the protein via the determination of a (partial) amino acid sequence. In further embodiments, the analysis method may further comprise a refolding stage. The refolding stage may especially be arranged following the denaturation stage and the tagging stage. The refolding stage may comprise refolding of the denatured protein. Specifically, some amino acids in a protein may be difficult to tag while the protein is in its folded state. Hence, the protein may be denatured, tagged, and refolded such that tags may be provided to amino acids that would otherwise be relatively difficult to tag.
Refolding of the tagged protein may be particularly relevant for embodiments regarding the determination of the structure of a protein. The tags may, however, affect the 3D-structure of the protein depending on properties of the tag such as size and hydrophobicity. It may thus be beneficial for structure prediction to employ relatively small tags to minimize the impact on protein structure. Hence, in such embodiments, the first tag and/or the second tag may comprise a tagging group selected from the group comprising a formylglycine-generating enzyme (FGE) tag (this will allow aldehyde-hydrazide chemistry), a haloalkane dehalogenase tag (HALO tag), a His tag (6-8 histidines, tris-NTA compounds allow modification of His tags), a lipoic acid ligase tag (for azide chemistry), and a tubulin tyrosine ligase tag (enzymatic ligation to a modified tyrosine).
In embodiments, the analysis method may comprise a structure prediction stage. The structure prediction stage may comprise predicting a protein structure of the (tagged) protein. Especially, the structure prediction stage may comprise predicting the protein structure of the (untagged) protein based on the emission signal obtained from the tagged protein. The structure prediction stage may especially comprise predicting the protein structure based on the emission signal, or especially based on the FRET efficiency, or especially based on the estimated distance, more especially based on a plurality of estimated distances. The structure prediction stage may especially comprise (using) a computational process, such as a computational algorithm.
In general, in embodiments wherein the analysis method comprises the structure prediction stage, the tagged protein may be folded during the barcode exposure stage, i.e., the tagged protein does not comprise a denatured protein. Hence, in embodiments, the tagged protein may comprise a folded protein.
The analysis method may both contribute to the refinement of existing model protein structures by providing a distance measurement between two amino acid residues, and may contribute to the generation of new model protein structures, especially by providing a plurality of distance measurements between a plurality of amino acid residues.
Hence, in further embodiments, the analysis method may comprise a computational process comprising a refinement of a model protein structure based on the estimated distance. The model protein structure may be based on X-ray crystallography and/or nuclear magnetic resonance (NMR)-spectroscopy. The model protein structure may also be a predicted model protein structure based on amino acid identity or similarity with a protein having a known structure, such as a model protein structure based on homology modelling.
Especially, if a homologous structure is available, the measurements provided by the analysis method according to the invention may be used to detect differences between the homolog and the target protein beforehand and correct the initial model. This may be particularly valuable in regions of low sequence identity or potential hinge regions. Moreover, using the analysis method in ab initio modeling may result in a sufficiently high resolution model to serve as a starting point in the refinement process.
In further embodiments, the estimated distance may comprise a plurality of estimated distances, and the computational process may comprise a de novo protein structure prediction, especially based on a distance matrix-based structure predictor, such as described by Adhikari, Badri, and Jianlin Cheng. "CONFOLD2: improved contact-driven ab initio protein structure modeling." BMC Bioinformatics 19.1 (2018): 22, which is hereby incorporated herein by reference.
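The collection of a plurality of estimated distances into a distance matrix may be sketched as follows. The sketch only assembles a symmetric matrix with unknown entries left empty; it does not reproduce the input format or the algorithm of any particular structure predictor, and the function name, residue indices and distances are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def build_distance_matrix(
    n_residues: int, measurements: Dict[Tuple[int, int], float]
) -> List[List[Optional[float]]]:
    """Assemble a symmetric inter-residue distance matrix.

    measurements maps zero-based residue index pairs (i, j) to an estimated
    distance. Pairs without a measurement stay None.
    """
    matrix: List[List[Optional[float]]] = [
        [None] * n_residues for _ in range(n_residues)
    ]
    for i in range(n_residues):
        matrix[i][i] = 0.0  # distance of a residue to itself
    for (i, j), distance in measurements.items():
        if not (0 <= i < n_residues and 0 <= j < n_residues):
            raise IndexError(f"residue pair {(i, j)} out of range")
        matrix[i][j] = distance
        matrix[j][i] = distance  # keep the matrix symmetric
    return matrix


# Hypothetical estimated distances (nm) between tagged residues.
estimated = {(0, 2): 3.1, (1, 3): 4.5}
for row in build_distance_matrix(4, estimated):
    print(row)
```

The empty entries make explicit which residue pairs have not yet been probed, which may guide the choice of further tags and barcodes.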
The method according to the invention may make up for several shortcomings in computational structure predictors. As structural prediction methods typically do, distance matrix-based methods may partially rely on a sequence homolog, especially a plurality of sequence homologs, with a known structure. Specifically, the homolog may be used to estimate inter-residue distances in the target protein, assuming that the target protein folds like the homolog. It follows that the quality of predicted structures may decrease as the number and quality of available homologs decreases. The distance information provided by the method according to the invention may aid in the construction and validation of distance matrices for such structures. Furthermore, the reliability of information provided by current methods for distance matrix construction may decrease with inter-residue distance. The analysis method according to the invention may provide existing (computational) methods with a wealth of previously inaccessible information.
In further embodiments, the method may comprise analyzing (dynamic) structure changes of the protein, especially by exposing the protein to an agent, such as a denaturing agent, affecting the protein structure during the barcode exposure stage.
In embodiments, the analysis method may especially comprise a non-medical method. In embodiments, the analysis method may especially comprise a non-diagnostic method.
In a second aspect, the invention may further provide a system for the characterization of a (tagged) protein, especially using a FRET donor-acceptor pair. The system may comprise one or more of an analytical surface, a barcode supply, a radiation source, a single-molecule fluorescence microscope, and a control system. The analytical surface may be configured to host the (tagged) protein. The barcode supply may be configured to provide (nucleotide) barcodes to the analytical surface, especially to the hosted protein. The radiation source may be configured to provide donor excitation radiation to the analytical surface, especially to the hosted protein. The single-molecule fluorescence microscope may be configured to measure emission in a donor emission radiation range and in an acceptor emission radiation range at the analytical surface. The single-molecule fluorescence microscope may further be configured to provide an emission signal to the control system. The control system may be configured to generate a FRET efficiency pattern based on the emission signal.
In embodiments, the control system may be configured to estimate a distance, especially between a first amino acid and a second amino acid in the (tagged) protein, based on the FRET efficiency pattern, especially based on the emission signal, especially by determining FRET efficiency.
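The determination of FRET efficiency from the emission signal may be sketched with the commonly used apparent (proximity ratio) expression E = I_A / (I_D + I_A), evaluated per frame. The sketch assumes background-subtracted intensities and applies no corrections for donor leakage, direct acceptor excitation or detection efficiency; the function name and the example intensities are hypothetical.

```python
from typing import List, Optional


def apparent_fret_efficiency(
    donor: List[float], acceptor: List[float]
) -> List[Optional[float]]:
    """Per-frame apparent FRET efficiency E = I_A / (I_D + I_A).

    Frames with a non-positive total intensity yield None, since the
    ratio is undefined there.
    """
    if len(donor) != len(acceptor):
        raise ValueError("donor and acceptor traces must have equal length")
    efficiencies: List[Optional[float]] = []
    for i_d, i_a in zip(donor, acceptor):
        total = i_d + i_a
        efficiencies.append(i_a / total if total > 0 else None)
    return efficiencies


# Hypothetical background-subtracted intensities (arbitrary units).
donor_trace = [300.0, 100.0, 0.0]
acceptor_trace = [100.0, 300.0, 0.0]
print(apparent_fret_efficiency(donor_trace, acceptor_trace))  # [0.25, 0.75, None]
```

The resulting values per binding event may be collected into a FRET efficiency pattern from which a distance may subsequently be estimated.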
Hence, in an aspect, the invention may provide a system for characterization of a (tagged) protein using a FRET donor-acceptor pair. The system may especially be configured to execute the method according to the invention.
In embodiments, the system may comprise an analytical surface. The analytical surface may be configured to host the (tagged) protein. The analytical surface may especially comprise a glass surface configured for single-molecule imaging, especially a quartz surface for single-molecule imaging. In further embodiments, the system may be configured to immobilize the protein at the analytical surface, especially via biotin-Streptavidin binding or with a covalent chemical approach (e.g. amine-NHS, 2PCA or thiol chemistry). Biotin or a biotinylated DNA linker may be attached to the protein molecule for surface immobilization. The immobilization may especially comprise attaching a terminal end of the protein to the analytical surface, especially the N-terminal end, or especially the C-terminal end. In further embodiments, NHS chemistry, 2PCA chemistry and/or an alkyne-ketene reaction may be used for N-terminal end immobilization. In further embodiments, a decarboxylative alkylation reaction for the addition of a DNA linker to the C-terminal end of the protein may be used for C-terminal end immobilization.
In further embodiments, the analytical surface may comprise a surface coating such as PEGylation before Streptavidin is introduced. In further embodiments, the analytical surface may be subjected to a surface passivation method before Streptavidin is introduced.
In embodiments, the system may comprise a barcode supply. The barcode supply may especially be configured to provide (nucleotide) barcodes to the analytical surface, especially to the hosted protein. In further embodiments, the barcode supply may be configured to sequentially supply a plurality of barcodes to the analytical surface, especially to the hosted protein.
In further embodiments, the system may comprise a radiation source. The radiation source may be configured to provide donor excitation radiation to the analytical surface, especially to the hosted protein. In particular, the hosted protein may comprise or be associated with a donor chromophore and/or an acceptor chromophore, wherein the donor chromophore may be excited by the donor excitation radiation provided by the radiation source. After excitation, the donor chromophore may then emit donor emission radiation and/or may provide energy to the acceptor chromophore via FRET energy transfer if the donor chromophore and the acceptor chromophore are within a predetermined distance, which may cause the acceptor chromophore to subsequently emit acceptor emission radiation.
In further embodiments, the system may comprise a single-molecule fluorescence microscope. The single-molecule fluorescence microscope may be configured to measure emission in a donor emission radiation range and in an acceptor emission radiation range, especially at the analytical surface, i.e., the single-molecule fluorescence microscope may measure emission emitted from the (protein at the) analytical surface. The single-molecule fluorescence microscope may further be configured to provide an emission signal to the control system, especially wherein the emission signal comprises donor emission radiation and/or acceptor emission radiation.
In further embodiments, the system may comprise a control system. The control system may be configured to control one or more of the analytical surface, the barcode supply, the radiation source, and the single-molecule fluorescence microscope. The control system may further be configured to estimate a distance based on the emission signal, especially by determining FRET efficiency. The control system may comprise a processor. The control system may further be configured to retrieve data and/or (program) instructions from an (online) resource, such as an (online) database.
In embodiments, the control system may be configured to estimate a protein fingerprint, especially based on the FRET efficiency pattern or an estimated distance, especially based on the FRET efficiency pattern, or especially based on the estimated distance, more especially based on the FRET efficiency. In further embodiments, the control system may be configured to identify the protein by comparing the protein fingerprint to protein-related information in reference data.
In embodiments, the system may comprise a denaturation unit configured to denature the protein. In such embodiments, the control system may be configured to control the denaturation unit. In embodiments, the system may comprise a tagging unit configured to provide a first tag and/or a second tag to the (untagged) protein. The tagging unit may especially be configured to provide a first tag and/or a second tag to a denatured protein. In such embodiments, the control system may be configured to control the tagging unit.
In embodiments, the system may comprise a refolding unit configured to refold a denatured protein. In further embodiments, the denaturation unit and the refolding unit may essentially be the same unit. In further embodiments, the control system may be configured to control the refolding unit.
In embodiments, the control system may be configured to predict a protein structure of the protein (using a computational process), especially based on the estimated distance, or especially based on the FRET efficiency. The computational process may especially comprise a computational algorithm.
In further embodiments, the computational process may comprise a refinement of a model protein structure based on the estimated distance. In further embodiments, the computational process may comprise a de novo protein structure prediction. In such embodiments, the estimated distance may especially comprise a plurality of estimated distances.
In embodiments, the control system may be configured to execute in a controlling mode the analysis method according to the invention. The control system may especially receive program instructions from a data carrier such that the control system executes the method according to the invention.
In specific embodiments, the control system may be configured to select tags and/or barcodes to acquire (specific) information, especially pertaining to the protein. In particular, the control system may during the execution of the protein identification stage and/or the protein structure prediction stage determine which information regarding the protein may benefit the protein identification and/or the protein structure prediction. The control system may subsequently acquire this information by providing (additional) tags to the protein, especially to specific amino acids.
Hence, in a third aspect, the invention may provide a data carrier having stored thereon program instructions, which when executed by the system according to the invention, especially by the control system, causes the system to execute the method according to the invention.
The embodiments described herein are not limited to a single aspect of the invention. For example, an embodiment describing the method with respect to the barcodes and/or tags may, for example, also apply to the system, particularly to the barcode supply of the system. Similarly, an embodiment of the system describing the radiation, such as the donor excitation radiation or the acceptor emission radiation, may, for example, further apply to the method.
The term “stage” and similar terms used herein may refer to a (time) period (also “phase”) of the analysis method. The different stages may (partially) overlap (in time). For example, the barcode exposure stage may, in general, be initiated prior to the distance estimation stage, but may partially overlap in time therewith. However, for example, the tagging stage may typically be completed prior to the barcode exposure stage. It will be clear to the person skilled in the art how the stages may be beneficially arranged in time. For example, the protein identification stage may occur simultaneously with the barcode exposure stage such that if the protein has been successfully identified, the barcode exposure stage may be terminated.
The method and/or system may be applied in or may be part of analysis methods/systems of biological samples, such as protein samples, particularly in relation to protein sequencing, protein structure elucidation, and/or protein interactomics.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying schematic drawings in which corresponding reference symbols indicate corresponding parts, and in which: Fig. 1A-C schematically depict embodiments of the analysis method according to the invention; Fig. 2 schematically depicts an embodiment of the system according to the invention; Fig. 3A-B depict experimental measurements obtained using embodiments of the analysis method according to the invention; Fig. 4A-B depict experimental measurements obtained using embodiments of the analysis method according to the invention; and Fig. 5A-C depict experimental measurements obtained using embodiments of the analysis method according to the invention. The schematic drawings are not necessarily to scale.
DETAILED DESCRIPTION OF THE EMBODIMENTS
Fig. 1A-C schematically depict embodiments of the analysis method 100 for characterization of a tagged protein 10 using FRET donor-acceptor pair chromophores 20. In the depicted embodiments, the FRET donor-acceptor pair chromophores 20 comprise a first chromophore 21 and a second chromophore 22. The FRET donor-acceptor pair chromophores 20 have a donor excitation radiation range, a donor emission radiation range and an acceptor emission radiation range, wherein one of the FRET donor-acceptor pair chromophores 23 (also: “donor chromophore 23”) is excitable by donor excitation radiation 51 in the donor excitation radiation range, wherein the other of the FRET donor-acceptor pair chromophores 24 (also: “acceptor chromophore 24”) is configured to provide acceptor emission radiation 54 in the acceptor emission radiation range upon excitation with donor excitation radiation 51 in the donor excitation radiation range of the one of the FRET donor-acceptor pair chromophores 23 when the first chromophore 21 and the second chromophore 22 are configured within a predetermined distance. The phrase “wherein one of the FRET donor-acceptor pair chromophores 23 …” and similar phrases may also be read as “wherein one chromophore 23 of the FRET donor-acceptor pair chromophores 20 …”. Similarly, the phrase “wherein the other of the FRET donor-acceptor pair chromophores 24 …” may also be read as “wherein the other chromophore 24 of the FRET donor-acceptor pair chromophores 20 …”.
The tagged protein 10 may comprise a first amino acid 11 tagged with a first tag 31 and a second amino acid 12 tagged with a second tag 32. In embodiments, the first tag 31 may comprise the first chromophore 21 or may be associated to the first chromophore 21. In the depicted embodiments, the second tag 32 comprises an oligonucleotide. The analysis method may comprise a barcode exposure stage 110 and a distance estimation stage. The barcode exposure stage 110 may comprise exposing the tagged protein 10 to a (second) barcode 42, wherein the barcode 42 is configured to hybridize with the second tag 32, and wherein the barcode 42 comprises the second chromophore 22. The barcode exposure stage 110 may further comprise providing radiation having a wavelength selected from the donor excitation radiation range to the tagged protein 10, especially providing donor excitation radiation 51 to the tagged protein 10. The barcode exposure stage may further comprise measuring emission in the donor emission radiation range and the acceptor emission radiation range to provide an emission signal. Especially, the barcode exposure stage may comprise measuring donor emission radiation 53 and acceptor emission radiation 54. The distance estimation stage may comprise estimating a distance between the first amino acid 11 and the second amino acid 12 based on the FRET efficiency pattern, especially based on the emission signal, more especially based on a plurality of emission signals.
In embodiments, the first chromophore 21 may be the donor chromophore 23 or the acceptor chromophore 24. In further embodiments, the second chromophore 22 may be the donor chromophore 23 or the acceptor chromophore 24. In general, either (i) the first chromophore 21 is the donor chromophore 23 and the second chromophore 22 is the acceptor chromophore 24, or (ii) the first chromophore 21 is the acceptor chromophore 24 and the second chromophore 22 is the donor chromophore.
Fig. 1 A schematically depicts an embodiment of the method wherein the first tag 31 comprises the first chromophore 21. In the depicted embodiment the first chromophore 21 comprises the acceptor chromophore 24, whereas the second chromophore 22 comprises the donor chromophore 23.
In the depicted embodiment, the tagged protein 10 comprises a plurality of second amino acids 12 tagged with (two or more) different second tags 32,32a,32b,32c, wherein the different second tags 32,32a,32b,32c comprise different nucleotide sequences. Hence, the barcode exposure stage 110 may comprise sequentially exposing the tagged protein 10 to a plurality of different barcodes 42,42ai,42a2,42b,42c. Herein, for visualization purposes only, several different barcodes are depicted simultaneously. However, in embodiments of the analysis method, the barcodes may be provided sequentially (and separately). For each depicted second tag 32,32a,32b,32c, the barcode exposure stage 110 comprises exposing the tagged protein 10 to a corresponding barcode 42,42ai,42a2,42b,42c configured to hybridize with the respective second tag 32,32a,32b,32c. In particular, for tag 32a, the barcode exposure stage 110 comprises exposing the tagged protein 10 to two corresponding barcodes 42,42ai,42a2; these two barcodes will result in different distances of the (respective) second chromophore to the (respective) second amino acid 12, thereby enabling to probe the same amino acid with chromophores arranged at multiple distances, which may be beneficial given that FRET donor-acceptor pairs may provide the highest sensitivity with regards to FRET efficiency within a certain predetermined distance range.
Fig. 1B schematically depicts an embodiment of the analysis method 100 wherein the first tag 31 is associated to the first chromophore 21. In the depicted embodiment the first chromophore 21 comprises the donor chromophore 23, whereas the second chromophore 22 comprises the acceptor chromophore 24.
In particular, the first tag 31 comprises an immobilization tag 33 configured to immobilize the tagged protein 10 to an analytical surface 210. Hence, in embodiments, the tagged protein 10 may be immobilized on an analytical surface 210.
In further embodiments, the analysis method 100, especially the tagging stage, may comprise tagging the protein with an immobilization tag 33 to immobilize the tagged protein 10 to an analytical surface 210.
In the depicted embodiment, the first tag 31 (also) comprises an oligonucleotide. Further, the barcode exposure stage 110 comprises exposing the tagged protein 10 to a first barcode 41, wherein the first barcode 41 is configured to hybridize with the first tag 31, and wherein the first barcode 41 comprises the first chromophore 21.
In the depicted embodiment, the first tag 31 and the first barcode 41 are depicted to have more complementary nucleotides than the second tag 32 and the second barcode 42. Such an embodiment may enable the first barcode 41 to remain hybridized with the first tag 31 while a plurality of second barcodes 42 are sequentially hybridized with corresponding second tags 32. Hence, in the depicted embodiment, the (second) barcode 42 may comprise n2 nucleotides complementary with the second tag 32, wherein n2 is selected from the range of 5-10, whereas the first barcode 41 may comprise n1 nucleotides complementary with the first tag 31, wherein n1 is selected from the range of 10-20.
Fig. 1C schematically depicts an embodiment of the analysis method 100, wherein the tagged protein 10 comprises a plurality of first amino acids 11 tagged with different first tags 31. Especially, the different first tags 31 comprise different nucleotide sequences, and the barcode exposure stage 110 may comprise sequentially exposing the tagged protein 10 to a plurality of different first barcodes 41, especially wherein for each first tag 31 the barcode exposure stage 110 comprises exposing the tagged protein 10 to a corresponding first barcode 41 configured to hybridize with the respective first tag 31.
In the depicted embodiment, the tagged protein 10 comprises a denatured protein 15. A denatured protein 15 may be particularly suitable for determining the relative position of amino acids in the amino acid chain (of the tagged protein 10).
In the depicted embodiment, a single first barcode 41 is hybridized with one of the plurality of first tags 31. Similarly, a single second barcode 42 is hybridized with one of the plurality of second tags 32. Hereafter, one of the barcodes may be replaced with another barcode, especially wherein the other barcode remains.
In the depicted embodiment, the first chromophore 21 comprises the donor chromophore 23 and is excited by donor excitation radiation 51 in the donor excitation radiation range. Hence, the first chromophore 21 may provide donor emission radiation 53. Further, the first chromophore 21 may transfer energy to the second chromophore 22 via FRET energy transfer 60. The second chromophore 22 comprises the acceptor chromophore 24 and may, upon receiving energy from the first chromophore, emit acceptor emission radiation 54. In the depicted embodiment, the FRET donor-acceptor pair 20 is depicted to provide both donor emission radiation 53 and acceptor emission radiation 54. However, depending on the FRET efficiency, which may depend on the distance between the chromophores, the FRET donor-acceptor pair 20 may also provide (essentially) only donor emission radiation 53 or (essentially) only acceptor emission radiation 54.
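The distance dependence of the FRET efficiency mentioned above may be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)^6). The following is a minimal sketch only: the function name is hypothetical, and the default Förster radius of 6 nm is the value this description quotes for a Cy3-Cy5 pair, used here as an assumption.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float = 6.0) -> float:
    """Ideal FRET efficiency for a given donor-acceptor distance.

    Uses the standard Forster relation E = 1 / (1 + (r / R0)**6).
    The default R0 of 6 nm is an assumed value for a Cy3-Cy5 pair.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Short distances give (essentially) only acceptor emission,
# long distances give (essentially) only donor emission.
for r in (3.0, 6.0, 9.0):
    print(f"r = {r:.1f} nm -> E = {fret_efficiency(r):.3f}")
```

In this idealized picture the efficiency is exactly 0.5 when the chromophores are separated by the Förster radius, and it changes most steeply around that distance.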
Fig. 2 schematically depicts an embodiment of the system 200 according to the invention. In particular, it depicts a system 200 for characterization of a tagged protein 10 using a FRET donor-acceptor pair 20, wherein the system 200 comprises an analytical surface 210, a barcode supply 220, a radiation source 230, a single-molecule fluorescence microscope 240, and a control system 300, wherein the analytical surface 210 is configured to host the tagged protein 10 (not depicted), wherein the barcode supply 220 is configured to provide barcodes to the analytical surface 210, wherein the radiation source 230 is configured to provide donor excitation radiation 51 to the analytical surface 210, wherein the single-molecule fluorescence microscope 240 is configured to measure emission in a donor emission radiation range and in an acceptor emission radiation range at the analytical surface 210 and to provide an emission signal to the control system 300, wherein the control system 300 is configured to generate a FRET efficiency pattern.
In the depicted embodiment, the system further comprises a barcode outlet 225 configured for the removal of barcodes from the analytical surface 210.
In the depicted embodiment, the analytical surface 210 comprises a quartz slide (on top) of a glass coverslip (on the bottom), which are separated by a layer of water. Further, a prism, especially a Pellin-Broca prism, is arranged on top of the quartz slide, with a layer of immersion oil in between. Yet further, an additional layer of water is arranged between the glass coverslip and an objective lens of the single-molecule fluorescence microscope 240. In addition, polyethylene glycol (PEG) is attached to the quartz slide on one side and to streptavidin on the other. Hence, in embodiments, the analytical surface 210 may comprise streptavidin. In further embodiments, during operation, the tagged protein 10 may be (non-covalently) bound to the streptavidin using biotin. It will be clear to the person skilled in the art that many variations of the analytical surface may be possible without deviating from the scope of the invention as described herein.
In the depicted embodiment, the radiation source 230 comprises a plurality of radiation sources 230, especially configured to provide different wavelengths of radiation. In particular, the radiation source 230 may be suitable to provide radiation in the donor excitation radiation range corresponding to different FRET donor-acceptor chromophore pairs.
In the depicted embodiment, the single-molecule fluorescence microscope 240 comprises or is functionally coupled to a plurality of optical elements configured to separate the radiation emitted from the analytical surface 210 (by the donor chromophore 23 and/or the acceptor chromophore 24) into the donor emission radiation 53 and acceptor emission radiation 54. The single-molecule fluorescence microscope 240 may comprise an EMCCD camera 241 to measure the donor emission radiation 53 and the acceptor emission radiation 54. It will be clear to the person skilled in the art, that many variations of the single-molecule fluorescence microscope 240 and/or the optical elements may be possible without deviating from the scope of the invention as described herein.
In particular, Fig. 2 may depict a system 200 configured for TIR excitation and FRET pair emission detection using prism-type TIRF. Immobilized molecules may be excited by TIR using a green or red laser. Tubing is connected to the slide to allow for exchange of DNA-barcodes (either manually or by using a pumping system) and buffers for multiple rounds of imaging and probing of different barcodes on protein substrates. Fluorescence may be collected by an objective lens and the slit may create images of half the size of the EM-CCD camera. The fluorescence signal may be split into donor and acceptor signal by a dichroic mirror and may be imaged side by side on the EM-CCD.
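Once the donor and acceptor signals are imaged side by side, an apparent FRET efficiency may be estimated per molecule from the two intensities. The sketch below uses the common ratiometric estimate E = I_A / (I_D + I_A). It is an assumption for illustration and not necessarily the processing used by the control system 300; it ignores background, donor leakage into the acceptor channel and differences in detection efficiency, and the function name is hypothetical.

```python
def apparent_fret_efficiency(donor_counts: float, acceptor_counts: float) -> float:
    """Ratiometric (uncorrected) FRET efficiency from two channel intensities.

    donor_counts and acceptor_counts are background-subtracted intensities
    of the same molecule in the donor and acceptor halves of the image.
    """
    total = donor_counts + acceptor_counts
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return acceptor_counts / total


# Example: a molecule with 100 donor counts and 300 acceptor counts.
print(apparent_fret_efficiency(100.0, 300.0))
```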
In embodiments, the control system 300 may be configured to estimate a protein fingerprint based on the FRET efficiency pattern. In further embodiments, the control system 300 may be configured to identify the tagged protein 10 by comparing the protein fingerprint to protein-related information in reference data.
In embodiments, the system may further comprise a denaturation unit 250 configured to denature the protein to provide a denatured protein 15. In further embodiments, the control system 300 may be configured to execute in a controlling mode the analysis method 100 according to the invention. Fig. 2 schematically further depicts a data carrier 400 having stored thereon program instructions, which when executed by the system 200 according to the invention, especially by the control system 300, causes the system 200 to execute the analysis method 100 according to the invention. In further embodiments, the control system 300 may comprise the data carrier 400.
Experiments
Experiment 1 - bioinformatics analysis
1.1 In silico simulation - To evaluate the prediction power of the analysis method, we assessed in silico whether the estimated FRET efficiencies in a large set of Swiss-Prot entries of human origin would be discernible given the expected resolution of the analysis method (a resolution of one percentage point as an estimate) and assuming perfect labeling efficiency of cysteine and lysine residues. As denaturing proteins and attaching the negatively charged DNA tags may disturb the conformation of proteins to the point of disorder, the effect of this treatment was simulated using a coarse-grained lattice folding model. Under the assumptions underlying the lattice model, 92% of proteins were found to be uniquely identifiable. Most of the not uniquely identifiable proteins may either be classifiable to a smaller subset of potential candidate proteins or may be unclassifiable due to the lack of cysteines and lysines. The discernibility may increase with the number of tagged residues, implying that an even wider range of proteins may be discernible if more residue types such as methionine and tyrosine are tagged.
1.1.1 Methods - The folding of denatured and (DNA-)tagged proteins was simulated using a coarse-grained lattice model. In this model amino acids may be considered atomic units, which can only occupy the vertices in a three-dimensional cubic lattice. The sequence may be initiated in a random configuration and may then be randomly altered until either no decrease in free energy is obtained or a maximum number of mutations has been reached. The model also stores and mutates the direction of side chains, which may be of particular interest when considering the effect of long (DNA-)tags on the protein structure. In the model: (1) the starting position may be generated using a random walk in an appropriately sized lattice, (2) (DNA-)tagged residues may be accounted for by assuming that a long negatively charged tag requires a straight unobstructed path to the model surface at all times, and (3) temperatures were automatically tuned in the parallel tempering procedure to optimize mixing between chains.
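Step (1) of the model, generating a starting position by a random walk on a cubic lattice, may be sketched as follows. This is a simplified illustration under stated assumptions: the lattice is unbounded rather than "appropriately sized", side-chain directions, tags and the energy function are omitted, and the function name and restart strategy are hypothetical.

```python
import random

# The six unit steps between neighbouring vertices of a cubic lattice.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def random_start_configuration(n_residues, rng, max_restarts=1000):
    """Place n_residues on distinct lattice vertices via a self-avoiding walk.

    Each residue occupies one vertex; consecutive residues are neighbours.
    If the walk traps itself, it is restarted from scratch.
    """
    if n_residues < 1:
        raise ValueError("n_residues must be at least 1")
    for _ in range(max_restarts):
        chain = [(0, 0, 0)]
        occupied = {(0, 0, 0)}
        while len(chain) < n_residues:
            x, y, z = chain[-1]
            free = [(x + dx, y + dy, z + dz) for dx, dy, dz in STEPS
                    if (x + dx, y + dy, z + dz) not in occupied]
            if not free:
                break  # dead end: abandon this walk and restart
            nxt = rng.choice(free)
            chain.append(nxt)
            occupied.add(nxt)
        if len(chain) == n_residues:
            return chain
    raise RuntimeError("could not build a self-avoiding walk")


chain = random_start_configuration(50, random.Random(0))
print(len(chain), chain[:3])
```

Such a configuration would then serve only as the initial state; the random alterations and the parallel tempering procedure described above are not part of this sketch.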
It may be expected to observe the disordered protein visiting multiple configurations in rapid succession rather than a static configuration. Hence, twenty models were generated for each sequence and the five lowest energy structures were joined at the C-terminus. From this ensemble FRET efficiencies were extracted to generate a protein fingerprint, especially an Efret fingerprint. An Efret fingerprint may be defined herein as an array of Efret values calculated from all donor-acceptor pairs in the model, in small to large order. Efret values of a single protein between which the difference is less than the Efret resolution, set here at 1 percentage point, were treated as a single value. Furthermore, Efret values lower than 0.10 were considered too low to discern from background noise, and were thus omitted.
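The construction of an Efret fingerprint from a set of Efret values may be sketched as follows. The function name is hypothetical, and the way values within the resolution are merged (keeping the lowest value of each group) is an assumption, as the description only states that such values are treated as a single value.

```python
def efret_fingerprint(efret_values, resolution=0.01, detection_limit=0.10):
    """Build a sorted fingerprint from raw Efret values.

    Values below the detection limit are omitted, and values closer
    together than the resolution are collapsed into one entry.
    """
    kept = sorted(v for v in efret_values if v >= detection_limit)
    fingerprint = []
    for value in kept:
        if fingerprint and value - fingerprint[-1] < resolution:
            continue  # indistinguishable from the previous entry
        fingerprint.append(value)
    return fingerprint


# 0.05 is below the detection limit; 0.505 merges with 0.50.
print(efret_fingerprint([0.05, 0.50, 0.505, 0.80, 0.30]))
```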
Fingerprints containing different numbers of Efret values were automatically considered discernible. To determine whether a fingerprint F consisting of N Efret values is discernible from another fingerprint F' consisting of N Efret values, we consider the maximum difference between the paired Efret values, $\Delta_{F,F'}$:
$$\Delta_{F,F'} = \max_{n} \left| F_n - F'_n \right|, \quad \forall n \in \{1, \dots, N\}$$
$\Delta_{F,F'}$ was compared to the Efret resolution and F was considered unique if $\Delta_{F,F'}$ is larger in all comparisons against other fingerprints with the same number of observed FRET values.
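The uniqueness test described above may be sketched as follows: fingerprints of different length are discernible by definition, and fingerprints of equal length are discernible when the maximum difference between paired Efret values exceeds the Efret resolution. Function names and the example data are hypothetical.

```python
def is_discernible(f, f_prime, resolution=0.01):
    """Return True if two sorted Efret fingerprints can be told apart."""
    if len(f) != len(f_prime):
        return True  # different numbers of Efret values
    max_difference = max((abs(a - b) for a, b in zip(f, f_prime)), default=0.0)
    return max_difference > resolution


def unique_fingerprints(fingerprints, resolution=0.01):
    """Names of fingerprints that are discernible from all others."""
    unique = []
    for name, f in fingerprints.items():
        others = (g for other, g in fingerprints.items() if other != name)
        if all(is_discernible(f, g, resolution) for g in others):
            unique.append(name)
    return sorted(unique)


example = {
    "protein_a": [0.30, 0.50],
    "protein_b": [0.305, 0.50],  # within resolution of protein_a
    "protein_c": [0.30, 0.90],
}
print(unique_fingerprints(example))
```

The pairwise comparison scales quadratically with the number of fingerprints, which may be acceptable for a few thousand sequences.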
1.1.2 Results - Lattice model structures were generated for 1807 human protein sequences extracted from the Swiss-Prot database. Assuming a resolution in ssFRET of 1 percentage point, 1648 (92%) structures were characterized by unique fingerprints. Groups of structures with fewer tags may contain a higher fraction of structures with non-unique fingerprints, indicating a positive relationship between the number of tags and uniqueness of the fingerprint. Interestingly, non-unique fingerprints may still be clustered together in groups that are wholly or partially separable from each other, suggesting that non-unique fingerprints may still provide sufficient information to narrow the number of possible identifications down to a limited group. In the aforementioned scenario some tags were considered indiscernible or undiscoverable as the transformation from distance to Efret ultimately results in values that may respectively be too close to each other or too low at the set Efret resolution and detection limit. Here this transformation may be parameterized on a Cy3-Cy5 FRET pair, which may have its resolution sweet spot of ~1 Å at a donor-acceptor distance equal to its Förster radius (6 nm). However, by attaching dyes more distally using linkers or using FRET dye pairs with a different Förster radius, the resolution sweet spot distance of the FRET pair may be fine-tuned such that previously indiscernible or undiscoverable tags may become visible. Therefore a second scenario was explored, in which any C-terminal tag to residue tag distance difference of 1 Å, regardless of the translation to Efret values, is considered sufficient to discriminate two fingerprints. Under these assumptions 1736 (99%) of fingerprints are unique. Most of the remaining unidentifiable proteins contain no cysteines or lysines and are thus undetectable under any circumstance using this labeling scheme.
However, the proteins may be identifiable based on the tagging of other first amino acids and/or second amino acids, such as methionine and threonine.
1.1.3 Conclusion - The in silico assessment suggests that protein fingerprints obtained via the analysis method according to the invention may allow for the unique classification of a wide range of proteins. Furthermore, the results suggest that an increase in the number of observed tags in a protein fingerprint may increase the probability of the protein being uniquely identifiable.
Experiment 2 - FRET efficiency measurement proof of concept
Fig. 3A-B schematically depict a proof-of-concept experiment of the method of the invention. In the experiment, an acceptor (Cy5) labelled single stranded DNA molecule was immobilized on a surface. This ssDNA molecule contains a barcode target sequence that is either 5 nt (target #1, Fig. 3A) or 30 nt (target #2, Fig. 3B) separated from the acceptor.
A complementary donor (Cy3) labelled 8 nt imaging barcode was used to probe either of the aforementioned strands (in separate experiments), including the collection of FRET efficiency E measurements and dwell-time Td (in seconds) measurements. From these data the dwell-time Td vs FRET efficiency E of each individual FRET event was plotted in the left panels of Fig. 3A and Fig. 3B. Next, the FRET efficiencies for both target strands were plotted in the histograms depicted in the right panels of Fig. 3A and Fig. 3B indicating counts N versus FRET efficiency E. The observed FRET efficiencies were subjected to Gaussian fitting, which indicated that the observed FRET efficiencies E were 0.99 and 0.69 for target #1 and target #2 respectively.
In embodiments, the FRET efficiency pattern may comprise the FRET efficiency E measurements. In further embodiments, the distribution of the center of the peaks from either barcode measurement may be a protein fingerprint. Hence, the FRET efficiency pattern may comprise a protein fingerprint. In particular, in embodiments, the FRET efficiency pattern may comprise raw data, such as depicted in either of the left panels of Fig. 3A-B, which may especially comprise a protein fingerprint. In further embodiments, the FRET efficiency pattern may comprise binned data, such as depicted in either of the right panels of Fig. 3A-B, which may especially comprise a protein fingerprint. In the depicted embodiment, the protein fingerprints may be the signatures of the DNA strands. In further embodiments, the FRET efficiency pattern may comprise processed data, which may especially comprise a protein fingerprint.
Experiment 3 - single tag with multiple barcodes proof of concept
Fig. 4A-B depict a proof-of-concept experiment for probing different targets on one DNA tag with different barcodes. The location of the target sequence for each barcode is probed relative to a 5’ end labelled with acceptor fluorophore Cy5 with a Cy3 labelled donor barcode. In a first round of detection the complementary barcode for target A (see below) is flushed, then the flow cell is washed with washing buffer and the barcode for target B is flushed and the emission is measured. In both Fig. 4A and 4B, the left histogram corresponds to target B, and the right (dashed) histogram corresponds to target A. Mean FRET efficiencies are derived from Gaussian fits of individual histograms, and errors are the standard error of the mean.
The barcode for target A is: 3’end- Cy3-TATGTAGA
The barcode for target B is: 3’end- Cy3-AGAAGTAAT
Fig. 4A depicts the data corresponding to DNA construct: Cy5-TTTTTATACATCTATTTTTTTTTTTTTTTTTTTTTTCTTCATTACTTTTTTTTTTTTTTT-Biotin, corresponding to Cy5 bound to SEQ ID NO:1 bound to Biotin, wherein the bold part indicates target (site) A, and wherein the underlined part indicates target (site) B. The mean FRET efficiency is 0.90 ± 2.1 3 and 0.55 ± 1.63, for target A and target B respectively.
Fig. 4B depicts the data corresponding to DNA construct: Cy5-TTTTTTTTTTTTTTTTTTTTTTTTTATACATCTATTCTTCATTACTTTTTTTTTTTTTTT-Biotin, corresponding to Cy5 bound to SEQ ID NO:2 bound to Biotin, wherein the bold part indicates target (site) A, and wherein the underlined part indicates target (site) B. The mean FRET efficiency is 0.67 ± 2.23 and 0.59 ± 2.23, for target A and target B respectively.
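The positions of target sites A and B within the two constructs may be recovered from the barcode sequences. The sketch below assumes that the barcodes are written 3' to 5' (as the "3’end" label suggests) and the constructs 5' to 3', so that each target site is the base-wise complement of its barcode. Function names are hypothetical, and whitespace inside the construct strings is treated as a line-break artifact and removed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

BARCODE_A = "TATGTAGA"   # written 3' to 5' (assumption)
BARCODE_B = "AGAAGTAAT"  # written 3' to 5' (assumption)

# Constructs written 5' (Cy5 end) to 3' (biotin end), as listed above.
CONSTRUCT_4A = "TTTTTATACATCTATTTTTTTTTTTTTTTTTTTTTTCTTCATTACTT TTTTTTTTTTTTT"
CONSTRUCT_4B = "TTTTTTTTTTTTTTTTTTTTTTTTTATACATCTATTCTTCATTACTT TTTTTTTTTTTTT"


def target_site(barcode_3_to_5: str) -> str:
    """Target sequence (5' to 3') that pairs with a barcode written 3' to 5'."""
    return barcode_3_to_5.translate(COMPLEMENT)


def site_offset(construct_5_to_3: str, barcode_3_to_5: str) -> int:
    """Number of nucleotides between the 5' end and the start of the site."""
    sequence = "".join(construct_5_to_3.split())  # drop wrap whitespace
    offset = sequence.find(target_site(barcode_3_to_5))
    if offset < 0:
        raise ValueError("target site not found in construct")
    return offset


for name, construct in (("Fig. 4A", CONSTRUCT_4A), ("Fig. 4B", CONSTRUCT_4B)):
    print(name,
          "site A offset:", site_offset(construct, BARCODE_A),
          "site B offset:", site_offset(construct, BARCODE_B))
```

In both constructs site A lies closer to the Cy5-labelled 5’ end than site B, which is consistent with the higher mean FRET efficiency reported for target A.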
These results demonstrate the resolution of the analysis method according to the invention in distinguishing different small distances between the donor chromophore and the acceptor chromophore. Further, these results demonstrate that a single tag may be probed with different barcodes at different sites, which may be beneficial to probe the protein with chromophores arranged at different distances from the tagged (first or second) amino acid.
Experiment 4 - single protein with multiple FRET pairs proof of concept
Fig. 5A-C depict results of a proof-of-concept experiment wherein three proteins corresponding to SEQ ID NO:3, SEQ ID NO:4 and SEQ ID NO:5 are analyzed using the analysis method of the invention. The three proteins each correspond to alpha synuclein with a C-terminal FGE binding sequence, a small linker, a thrombin cleaving sequence, and a histidine tag.
The alpha synuclein sequence is based on the sequence of accession number P37840 at Uniprot.org. In SEQ ID NO:3 the serine at location 87 in the alpha synuclein sequence is replaced with a cysteine, i.e., SEQ ID NO:3 corresponds to a S87C version of alpha synuclein. In SEQ ID NO:4 the serine at location 129 in the alpha synuclein sequence is replaced with a cysteine, i.e., SEQ ID NO:4 corresponds to a S129C version of alpha synuclein. In SEQ ID NO:5 the serine residues at locations 87 and 129 in the alpha synuclein sequence are replaced with cysteine residues, i.e., SEQ ID NO:5 corresponds to a S87C+S129C version of alpha synuclein.
Proteins corresponding to SEQ ID NO: 3-5 were produced in Escherichia coli BL21 (DE3). E. coli BL21 (DE3) also (naturally) produces an FGE enzyme that converts the cysteine in the FGE binding site at location 142 to a formyl glycine, which can be coupled to a probe via hydrazide chemistry. In the performed experiments, the formyl glycine was linked to a first tag 31 comprising a biotin probe via hydrazide chemistry.
The proteins were then analyzed using the analysis method of the invention. In particular, for each of the three proteins a first tag 31 was associated to the formyl glycine site (at location 142) via hydrazide chemistry. The first tag 31 further comprised a biotin probe (at the other end of the tag than the end associated with the formyl glycine site). The biotin probe was used to immobilize the first tag 31 on a streptavidin surface. The first tag 31 comprised 25 nucleotides. Similarly, a second tag 32 was associated to one or more cysteine residues in the protein. In particular, for SEQ ID NO:3 to the cysteine residue at location 87, for SEQ ID NO:4 to the cysteine residue at location 129, and for SEQ ID NO:5 to the cysteine residues at locations 87 and 129. The second tag 32 comprised 10 nucleotides. Each of the tagged proteins 10 was then exposed to a first barcode 41 configured to hybridize with the first tag 31 and to a second barcode 42 configured to hybridize with the second tag 32. In particular, the first barcode 41 comprised a first chromophore 21, and comprised 10 nucleotides complementary to the first tag 31, whereas the second barcode 42 comprised a second chromophore 22, and comprised 8 nucleotides complementary to the second tag 32. The first chromophore 21 was Cy5, whereas the second chromophore 22 was Cy3. Radiation in the donor excitation range was provided to the tagged proteins 10, and emission in the donor emission radiation range and the acceptor emission radiation range was measured. Fig. 5A-C depict counts N versus FRET efficiency E. In particular, for each observed binding event with a duration of at least 0.3 s the average FRET efficiency was determined, and Fig. 5A-C summarize all determined average FRET efficiencies.
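The per-event averaging described above may be sketched as follows. Each binding event is represented as a list of per-frame FRET efficiencies; the frame time, the data layout and the function name are assumptions for illustration, while the 0.3 s minimum duration is taken from the description.

```python
def event_mean_efficiencies(events, frame_time_s, min_duration_s=0.3):
    """Average FRET efficiency of each sufficiently long binding event.

    events: list of binding events, each a list of per-frame E values.
    frame_time_s: exposure time per frame in seconds (assumed known).
    """
    means = []
    for trace in events:
        duration_s = len(trace) * frame_time_s
        if trace and duration_s >= min_duration_s:
            means.append(sum(trace) / len(trace))
    return means


# Hypothetical traces recorded at 0.1 s per frame.
events = [
    [0.8, 0.9, 1.0, 0.9],  # 0.4 s, kept
    [0.5, 0.6],            # 0.2 s, discarded as too short
    [0.2, 0.4, 0.3, 0.3],  # 0.4 s, kept
]
print(event_mean_efficiencies(events, frame_time_s=0.1))
```

The resulting list of averages is what would be binned into a counts N versus FRET efficiency E histogram.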
Fig. 5A depicts the observations with regards to SEQ ID NO:3, Fig. 5B the observations with regards to SEQ ID NO: 4, and Fig. 5C the observations with regards to SEQ ID NO:5.
Comparing Fig. 5A and Fig. 5B, the FRET efficiency in Fig. 5A is smaller than in Fig. 5B, which is in agreement with the first chromophore 21 and the second chromophore 22 being further apart in SEQ ID NO:3 than in SEQ ID NO:4.
In Fig. 5C, the second barcode 42 comprising the second chromophore 22 may associate to two different second tags 32 associated to different cysteine residues. As the 8 complementary nucleotides provide a relatively high off-rate, the binding events of the second barcode to the different second tags 32 may be temporally resolved, as can be observed in Fig. 5C, which shows two peaks in the range of E=0.8 - E=1.0, each corresponding to the second barcode 42 hybridizing to a different second tag 32.
Hence, the analysis method of the invention may facilitate characterizing a tagged protein comprising a plurality of second tags 32, wherein the second barcode 42 can associate to each of the plurality of second tags 32. Similarly, the analysis method of the invention may facilitate characterizing a tagged protein comprising a plurality of first tags 31, wherein the first barcode 41 can associate to each of the plurality of first tags 31. Thereby, different sites may be queried in parallel, which may facilitate a quicker characterization.
The term “plurality” refers to two or more. Furthermore, the terms “a plurality of” and “a number of” may be used interchangeably.
The terms “substantially” or “essentially” herein, and similar terms, will be understood by the person skilled in the art. The terms “substantially” or “essentially” may also include embodiments with “entirely”, “completely”, “all”, etc. Hence, in embodiments the adjective substantially or essentially may also be removed. Where applicable, the term “substantially” or the term “essentially” may also relate to 90% or higher, such as 95% or higher, especially 99% or higher, even more especially 99.5% or higher, including 100%. Moreover, the terms “about” and “approximately” may also relate to 90% or higher, such as 95% or higher, especially 99% or higher, even more especially 99.5% or higher, including 100%. For numerical values it is to be understood that the terms “substantially”, “essentially”, “about”, and “approximately” may also relate to the range of 90% - 110%, such as 95%-105%, especially 99%-101% of the value(s) it refers to. The term “comprise” includes also embodiments wherein the term “comprises” means “consists of”.
The term “and/or” especially relates to one or more of the items mentioned before and after “and/or”. For instance, a phrase “item 1 and/or item 2” and similar phrases may relate to one or more of item 1 and item 2. The term "comprising" may in an embodiment refer to "consisting of" but may in another embodiment also refer to "containing at least the defined species and optionally one or more other species".
Furthermore, the terms first, second, third and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.
The devices, apparatus, or systems may herein amongst others be described during operation. As will be clear to the person skilled in the art, the invention is not limited to methods of operation, or devices, apparatus, or systems in operation.
The term “further embodiment” may refer to an embodiment comprising the features of the previously discussed embodiment, but may also refer to an alternative embodiment.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims.
In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim.
Use of the verb "to comprise" and its conjugations does not exclude the presence of elements or steps other than those stated in a claim. Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise”, “comprising”, “include”, “including”, “contain”, “containing” and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to”.
The article "a" or "an" preceding an element does not exclude the presence of a plurality of such elements.
The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim, or an apparatus claim, or a system claim, enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
The invention also provides a control system that may control the device, apparatus, or system, or that may execute the herein described method or process. Yet further, the invention also provides a computer program product, when running on a computer which is functionally coupled to or comprised by the device, apparatus, or system, controls one or more controllable elements of such device, apparatus, or system.
The invention further applies to a device, apparatus, or system comprising one or more of the characterizing features described in the description and/or shown in the attached drawings. The invention further pertains to a method or process comprising one or more of the characterizing features described in the description and/or shown in the attached drawings. Moreover, if a method or an embodiment of the method is described being executed in a device, apparatus, or system, it will be understood that the device, apparatus, or system is suitable for or configured for (executing) the method or the embodiment of the method respectively. The various aspects discussed in this patent can be combined in order to provide additional advantages. Further, the person skilled in the art will understand that embodiments can be combined, and that also more than two embodiments can be combined. Furthermore, some of the features can form the basis for one or more divisional applications.

Claims

CLAIMS:
1. An analysis method (100) for characterization of a tagged protein (10) using FRET donor-acceptor pair chromophores (20), wherein the FRET donor-acceptor pair chromophores (20) comprise a first chromophore (21) and a second chromophore (22), wherein the FRET donor-acceptor pair chromophores (20) have a donor excitation radiation range, a donor emission radiation range and an acceptor emission radiation range, wherein one of the FRET donor-acceptor pair chromophores (23) is excitable by donor excitation radiation (51) in the donor excitation radiation range, wherein the other of the FRET donor-acceptor pair chromophores (24) is configured to provide acceptor emission radiation (54) in the acceptor emission radiation range upon excitation with donor excitation radiation (51) in the donor excitation radiation range of the one of the FRET donor-acceptor pair chromophores (23) when the first chromophore (21) and the second chromophore (22) are configured within a predetermined distance, wherein the tagged protein (10) comprises a first amino acid (11) tagged with a first tag (31) and a second amino acid (12) tagged with a second tag (32), wherein the first tag (31) comprises the first chromophore (21) or is associated to the first chromophore (21), wherein the second tag (32) comprises an oligonucleotide, and wherein the analysis method (100) comprises:
- a barcode exposure stage (110) comprising: (i) exposing the tagged protein (10) to a barcode (42), wherein the barcode (42) is configured to hybridize with the second tag (32), and wherein the barcode (42) comprises the second chromophore (22), (ii) providing radiation having a wavelength selected from the donor excitation radiation range to the tagged protein (10), and (iii) measuring emission in the donor emission radiation range and the acceptor emission radiation range to provide an emission signal.
2. The analysis method (100) according to claim 1, wherein:
- the first amino acid (11) comprises a first post-translational modification, wherein the first tag (31) is attached to the first post-translational modification; and/or
- the second amino acid (12) comprises a second post-translational modification, wherein the second tag (32) is attached to the second post-translational modification.
3. The analysis method (100) according to any one of the preceding claims, wherein the barcode (42) comprises n2 nucleotides complementary with the second tag (32), wherein n2 is selected from the range of 5-10.
4. The analysis method (100) according to any one of the preceding claims, wherein the tagged protein (10) comprises a plurality of second amino acids (12) tagged with different second tags (32), wherein the different second tags (32) comprise different nucleotide sequences, and wherein the barcode exposure stage (110) comprises sequentially exposing the tagged protein (10) to a plurality of different barcodes (42), wherein for each second tag (32) the barcode exposure stage (110) comprises exposing the tagged protein (10) to a corresponding barcode (42) configured to hybridize with the respective second tag (32).
5. The analysis method (100) according to any one of the preceding claims 1-2, wherein the tagged protein (10) comprises a plurality of second amino acids (12) tagged with second tags (32), and wherein the barcode exposure stage (110) comprises exposing the tagged protein (10) to a barcode (42) configured to hybridize with two or more of the plurality of second tags (32), wherein the barcode (42) comprises m nucleotides complementary with the second tag (32), wherein m is selected from the range of 5-10; or m is selected from the range of 10-20, and one or more of the first chromophore (21) and the second chromophore (22) are configured to switch between an active state and an inactive state.
6. The analysis method (100) according to any one of the preceding claims, wherein the first tag (31) comprises an oligonucleotide, and wherein the barcode exposure stage (110) comprises (ib) exposing the tagged protein (10) to a first barcode (41), wherein the first barcode (41) is configured to hybridize with the first tag (31), and wherein the first barcode (41) comprises the first chromophore (21).
7. The analysis method (100) according to claim 6, wherein the first barcode (41) comprises a number of nucleotides complementary with the first tag (31), wherein the number is selected from the range of 10-20.
8. The analysis method (100) according to any one of the preceding claims 6-7, wherein the tagged protein (10) comprises a plurality of first amino acids (11) tagged with different first tags (31), wherein the different first tags (31) comprise different nucleotide sequences, and wherein the barcode exposure stage (110) comprises sequentially exposing the tagged protein (10) to a plurality of different first barcodes (41), wherein for each first tag (31) the barcode exposure stage (110) comprises exposing the tagged protein (10) to a corresponding first barcode (41) configured to hybridize with the respective first tag (31).
9. The analysis method (100) according to any one of the preceding claims, wherein the analysis method (100) further comprises:
- a fingerprint provision stage comprising providing a protein fingerprint based on the emission signal;
- a protein identification stage comprising identifying the tagged protein (10) by comparing the protein fingerprint to protein-related information in reference data.
10. The analysis method (100) according to claim 9, wherein the protein fingerprint comprises a deduced amino acid sequence, and wherein the protein-related information comprises amino acid sequences.
11. The analysis method (100) according to any one of the preceding claims, wherein the tagged protein (10) comprises a denatured protein (15).
12. The analysis method (100) according to any one of the preceding claims, wherein the analysis method (100) comprises:
- a distance estimation stage comprising estimating a distance between the first amino acid (11) and the second amino acid (12) based on the emission signal.
13. The analysis method (100) according to claim 12, wherein the analysis method (100) comprises:
- a structure prediction stage comprising predicting a protein structure of the tagged protein (10) based on the estimated distance.
14. A system (200) for characterization of a tagged protein (10) using a FRET donor-acceptor pair (20), wherein the system (200) comprises an analytical surface (210), a barcode supply (220), a radiation source (230), a single-molecule fluorescence microscope (240), and a control system (300), wherein the analytical surface (210) is configured to host the tagged protein (10), wherein the barcode supply (220) is configured to provide barcodes to the analytical surface (210), wherein the radiation source (230) is configured to provide donor excitation radiation (51) to the analytical surface (210), wherein the single-molecule fluorescence microscope (240) is configured to measure emission in a donor emission radiation range and in an acceptor emission radiation range at the analytical surface (210) and to provide an emission signal to the control system (300), wherein the control system (300) is configured to execute in a controlling mode the analysis method (100) according to any one of the preceding claims 1-13.
15. The system (200) according to claim 14, wherein the control system (300) is configured to estimate a protein fingerprint based on the emission signal, and wherein the control system (300) is configured to identify the tagged protein (10) by comparing the protein fingerprint to protein-related information in reference data.
16. The system (200) according to any one of the preceding claims 14-15, wherein the system (200) further comprises a denaturation unit (250) configured to denature the protein to provide a denatured protein (15).
17. The system (200) according to any one of the preceding claims 14-16, wherein the control system (300) is configured to estimate a distance based on the emission signal.
18. The system (200) according to claim 17, wherein the control system (300) is configured to predict a protein structure of the tagged protein (10) based on the estimated distance.
19. A data carrier (400) having stored thereon program instructions which, when executed by the system (200) according to any one of preceding claims 14-18, cause the system (200) to execute the analysis method (100) according to any one of preceding claims 1-13.
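As an illustration of how the emission signal of claim 1 might feed the distance estimation stage (claim 12) and the protein identification stage (claim 9), the following is a minimal sketch and not the method defined by the claims. It assumes the commonly used apparent FRET efficiency E = I_A / (I_D + I_A) without leakage or detection corrections, the standard Förster relation E = 1 / (1 + (r/R0)^6), and a nearest-match comparison against reference data. The function names, the Förster radius value, and the reference entries are hypothetical and chosen only for the example.

```python
import math


def fret_efficiency(donor_intensity: float, acceptor_intensity: float) -> float:
    """Apparent FRET efficiency E = I_A / (I_D + I_A).

    Assumption: uncorrected proximity ratio (no leakage or gamma correction).
    """
    total = donor_intensity + acceptor_intensity
    if total <= 0:
        raise ValueError("total emission must be positive")
    return acceptor_intensity / total


def estimate_distance(efficiency: float, forster_radius_nm: float) -> float:
    """Invert E = 1 / (1 + (r / R0)^6) to obtain the distance r in nm."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


def identify(fingerprint: list[float], reference: dict[str, list[float]]) -> str:
    """Return the reference entry closest to the measured fingerprint.

    Assumption: a fingerprint is a list of FRET efficiencies, one per
    barcode exposure, compared by sum of squared differences.
    """
    def mismatch(name: str) -> float:
        expected = reference[name]
        if len(expected) != len(fingerprint):
            return math.inf  # incompatible entries can never be the best match
        return sum((a - b) ** 2 for a, b in zip(fingerprint, expected))

    return min(reference, key=mismatch)


# Hypothetical measurements: (donor, acceptor) intensities per barcode exposure.
measurements = [(50.0, 50.0), (300.0, 100.0)]
fingerprint = [fret_efficiency(d, a) for d, a in measurements]

# Hypothetical Förster radius; the real value depends on the chromophore pair.
R0_NM = 5.4
distances = [estimate_distance(e, R0_NM) for e in fingerprint]

# Hypothetical reference data mapping protein names to expected efficiencies.
reference_data = {
    "protein_A": [0.9, 0.6],
    "protein_B": [0.5, 0.2],
}
best_match = identify(fingerprint, reference_data)
```

In this sketch a larger acceptor share of the total emission gives a higher efficiency and therefore a shorter estimated distance between the two tagged amino acids. A practical implementation would likely need corrected intensities and a fingerprint comparison that accounts for measurement noise.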
PCT/NL2020/050566 2019-09-12 2020-09-11 Single-molecule FRET for protein characterization WO2021049940A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
NL2023825A NL2023825B1 (en) 2019-09-12 2019-09-12 Single-molecule FRET for protein characterization
NL2023825 2019-09-12

Publications (1)

Publication Number Publication Date
WO2021049940A1 (en) 2021-03-18

Family

ID=68654852

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/NL2020/050566 WO2021049940A1 (en) 2019-09-12 2020-09-11 Single-molecule FRET for protein characterization

Country Status (2)

Country Link
NL (1) NL2023825B1 (en)
WO (1) WO2021049940A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001025794A2 (en) 1999-10-05 2001-04-12 The Molecular Sciences Institute, Inc. Protein fingerprinting by multisite labelling
US20160169903A1 (en) 2014-12-15 2016-06-16 President And Fellows Of Harvard College Methods and compositions relating to super-resolution imaging and modification

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
A AUER ET AL: "Fast, Background-Free DNA-PAINT Imaging Using FRET-Based Probes", NANO LETTERS, vol. 17, no. 10, 21 September 2017 (2017-09-21), pages 6428 - 6434, XP055641459, DOI: 10.1021/acs.nanolett.7b03425 *
AUER ET AL.: "Fast, Background-Free DNA-PAINT Imaging Using FRET-Based Probes", NANO LETTERS, 2017
DINGFELDER ET AL.: "Mapping an Equilibrium Folding Intermediate of the Cytolytic Pore Toxin ClyA with Single-Molecule FRET", JOURNAL OF PHYSICAL CHEMISTRY B, 2018
F DINGFELDER ET AL: "Mapping an Equilibrium Folding Intermediate of the Cytolytic Pore Toxin ClyA with Single-Molecule FRET", J PHYS CHEM B, vol. 122, no. 49, 29 August 2018 (2018-08-29), pages 11251 - 11261, XP055685946, DOI: 10.1021/acs.jpcb.8b07026 *
J YOO ET AL: "Three-Color Single-Molecule FRET and Fluorescence Lifetime Analysis of Fast Protein Folding", J PHYS CHEM B, vol. 122, no. 49, 19 September 2018 (2018-09-19), pages 11702 - 11720, XP055685952, DOI: 10.1021/acs.jpcb.8b07768 *
LIN, SHIXIAN ET AL.: "Redox-based reagents for chemoselective methionine bioconjugation", SCIENCE, vol. 355, no. 6325, 2017, pages 597 - 602, XP055612170, DOI: 10.1126/science.aal3316
ROY ET AL.: "A practical guide to single-molecule FRET", NATURE METHODS, vol. 5, June 2008 (2008-06-01), XP055078936, DOI: 10.1038/nmeth.1208
YOO ET AL.: "Three-Color Single-Molecule FRET and Fluorescence Lifetime Analysis of Fast Protein Folding", JOURNAL OF PHYSICAL CHEMISTRY B, 2018

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023165933A1 (en) * 2022-03-03 2023-09-07 Norelle Wildburger Method for detecting multimeric, preferably dimeric peptides using single-domain antibodies
WO2024049290A1 (en) 2022-08-31 2024-03-07 Technische Universiteit Delft Single-molecule aptamer fret for protein identification and structural analysis
NL2032916B1 (en) * 2022-08-31 2024-03-15 Univ Delft Tech Single-molecule aptamer FRET for protein identification and structural analysis

Also Published As

Publication number Publication date
NL2023825B1 (en) 2021-05-17

Similar Documents

Publication Publication Date Title
Iacobucci et al. A cross-linking/mass spectrometry workflow based on MS-cleavable cross-linkers and the MeroX software for studying protein structures and protein–protein interactions
US20220163536A1 (en) Identifying peptides at the single molecule level
JP7295092B2 (en) How to choose a binding reagent
Dey et al. DNA–protein interactions: methods for detection and analysis
EP2872898B1 (en) Single molecule protein sequencing
DK2209893T3 (en) The use of aptamers in proteomics
US11499979B2 (en) Single-molecule protein and peptide sequencing
CN110730826A (en) Analyte detection
US20210174903A1 (en) Enhanced protein structure prediction using protein homolog discovery and constrained distograms
WO1994012665A1 (en) Quantitative detection of macromolecules with fluorescent oligonucleotides
WO2010065531A1 (en) Single molecule protein screening
WO2013112745A1 (en) Peptide identification and sequencing by single-molecule detection of peptides undergoing degradation
WO2021049940A1 (en) Single-molecule FRET for protein characterization
De Lannoy et al. Evaluation of FRET X for single-molecule protein fingerprinting
WO2024049290A1 (en) Single-molecule aptamer fret for protein identification and structural analysis
WO2022163770A1 (en) Genome-editing-tool evaluation method
WO2024076928A1 (en) Fluorophore-polymer conjugates and uses thereof
CN105548129A (en) Method for molecule/ion detection based on single fluorescent molecule bleaching and imaging
Beveridge et al. A synthetic peptide library for benchmarking crosslinking mass spectrometry search engines
WO2023056414A1 (en) Structural profiling of native proteins using fluorosequencing, a single molecule protein sequencing technology
JP2004361252A (en) Method of determining phosphorylation of peptide or protein
WO2023130098A2 (en) High efficiency labels for biomolecular analysis
Zürbig et al. Capillary electrophoresis coupled to mass spectrometry for urinary proteome analysis

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20775082

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20775082

Country of ref document: EP

Kind code of ref document: A1