CN117881796A - Detection of analytes using targeted epigenetic assays, proximity-induced tagmentation, strand invasion, restriction or ligation


Publication number
CN117881796A
Authority
CN
China
Prior art keywords
oligonucleotide
analyte
recognition
probe
coupled
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280059144.7A
Other languages
Chinese (zh)
Inventor
A·肯尼迪
S·舒尔扎贝格
K·巴斯比
C·布朗
A·普赖斯
E·维尔马斯
R·潘托哈
M·菲利
J·邹
李勇
S·阿尔马赛
A·杜塔
M·阿尔瓦雷斯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inmair Ltd
Original Assignee
Inmair Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inmair Ltd
Priority claimed from PCT/US2022/039853 (WO2023018730A1)

Abstract

Provided herein are methods for detecting analytes using proximity-induced tagmentation, strand invasion, restriction, or ligation. In some examples, detecting the analyte includes coupling a donor recognition probe to a first portion of the analyte. The donor recognition probe comprises a first recognition element specific for the first portion of the analyte, a first oligonucleotide corresponding to the first portion, and a transposase coupled to the first recognition element and the first oligonucleotide. An acceptor recognition probe is coupled to a second portion of the analyte. The acceptor recognition probe comprises a second recognition element specific for the second portion of the analyte and a second oligonucleotide coupled to the second recognition element and corresponding to the second portion. The transposase is used to generate a reporter polynucleotide comprising the first oligonucleotide and the second oligonucleotide. The analyte is detected based on the reporter polynucleotide comprising the first oligonucleotide and the second oligonucleotide.

Description

Detection of analytes using targeted epigenetic assays, proximity-induced tagmentation, strand invasion, restriction or ligation
Cross Reference to Related Applications
The present application claims the benefit of the following applications, the entire contents of each of which are incorporated herein by reference:
U.S. provisional patent application No. 63/231,970, entitled "Targeted Epigenetic Assays", filed on August 11, 2021; and
U.S. provisional patent application No. 63/250,574, entitled "Detection of Analytes Using Proximity-Induced Tagmentation", filed on September 30, 2021.
Background
Detection of specific nucleic acid sequences present in biological samples has been used as a method for, for example, identifying and classifying microorganisms, diagnosing infectious diseases, detecting and characterizing genetic abnormalities, identifying genetic changes associated with cancer, studying genetic susceptibility to diseases, and measuring responses to various types of treatments. A common technique for detecting specific nucleic acid sequences in biological samples is nucleic acid sequencing.
Nucleic acid sequencing methods have evolved from the chemical degradation methods used by Maxam and Gilbert and the chain extension methods used by Sanger. Several sequencing methods now in use allow thousands of nucleic acids to be processed in parallel on a single chip. Some platforms use bead-based or microarray formats in which silica beads are functionalized with probes chosen according to the application, such as sequencing, genotyping, or gene expression profiling.
Some sequencing systems use fluorescence-based detection, whether for "sequencing-by-synthesis" or genotyping, in which a given nucleotide is labeled with a fluorescent tag, and the nucleotide is identified based on detecting fluorescence from the tag.
There remains an unmet need for methods that can sensitively characterize epigenetic changes at targeted DNA loci. Chromatin accessibility (assayed via ATAC-seq) and proteins associated with DNA loci (assayed via ChIP-seq) are examples of epigenetic elements that are difficult to target using existing hybridization capture techniques. Typically, such assays enrich for DNA sequences associated with an epigenetic feature. However, because these sequences are not known a priori, it is challenging to design hybridization capture oligonucleotides that efficiently enrich the output of an epigenetic assay for a particular genomic region of interest (e.g., a genomic locus).
Early methods have been proposed that use deactivated Cas9 (dCas9) for targeted, locus-specific protein isolation to identify histone gene regulators; see, e.g., Tsui et al., "dCas9-targeted locus-specific protein isolation method identifies histone gene regulators", PNAS, volume 115, issue 2: pages E2734-E2741, 2018, the entire contents of which are incorporated herein by reference. Such methods indicate that dCas9-based locus enrichment can isolate chromatin, which can then be assayed by mass spectrometry. However, this approach allows only a single chromatin locus to be assayed in each experiment. Furthermore, this early work yields two separate results, namely the sequence of the DNA locus and the mass spectrometry identification of DNA-associated proteins. There is therefore a need for improved methods for epigenetic analysis of targeted loci.
Disclosure of Invention
Provided herein are systems and methods for detecting analytes using targeted epigenetic assays, proximity-induced tagmentation, strand invasion, restriction, or ligation.
Some examples herein provide a method for detecting an analyte. The method may include coupling a donor recognition probe to a first portion of the analyte. The donor recognition probe can comprise a first recognition element specific for the first portion of the analyte, a first oligonucleotide corresponding to the first portion of the analyte, and a transposase coupled to the first recognition element and the first oligonucleotide. The method may comprise coupling an acceptor recognition probe to a second portion of the analyte. The acceptor recognition probe may comprise a second recognition element specific for the second portion of the analyte and a second oligonucleotide coupled to the second recognition element and corresponding to the second portion of the analyte. The method may comprise generating, using the transposase, a reporter polynucleotide comprising the first oligonucleotide and the second oligonucleotide. The method may comprise detecting the analyte based on the reporter polynucleotide comprising the first oligonucleotide and the second oligonucleotide.
In some examples, the analyte comprises a first molecule. In some examples, the first portion of the analyte comprises a first portion of the first molecule and the second portion of the analyte comprises a second portion of the first molecule.
In some examples, the first molecule comprises a protein or peptide. The first recognition element can include a first antibody or first aptamer specific for a first portion of the protein or peptide. The second recognition element can include a second antibody or second aptamer specific for a second portion of the protein or peptide.
In some examples, the first molecule comprises a target polynucleotide. The first recognition element can include a first CRISPR-associated (Cas) protein specific for a first subsequence of the target polynucleotide. The second recognition element can include a second Cas protein specific for a second subsequence of the target polynucleotide. In some examples, the target polynucleotide comprises RNA, and the first Cas protein and the second Cas protein are independently selected from the group consisting of rCas9 and dCas13.
In some examples, the first molecule comprises a carbohydrate. The first recognition element may include a first lectin specific to a first portion of the carbohydrate. The second recognition element can include a second lectin specific to a second portion of the carbohydrate.
In some examples, the first molecule comprises a biomolecule. The biomolecule may be specific to the first recognition element and the second recognition element.
In some examples, the analyte further comprises a second molecule that interacts with the first molecule. In some examples, the first portion of the analyte comprises the first molecule and the second portion of the analyte comprises the second molecule.
In some examples, the first molecule may comprise a first protein or a first peptide; and the first recognition element may include a first antibody or first aptamer specific for the first protein or first peptide. Alternatively, for example, the first molecule may comprise a first target polynucleotide; and the first recognition element can include a first CRISPR associated (Cas) protein specific for the first target polynucleotide. Alternatively, for example, the first molecule may comprise a first carbohydrate; and the first recognition element may include a first lectin specific to the first carbohydrate. Alternatively, for example, the first molecule may comprise a first biomolecule specific to the first recognition element.
It should be appreciated that any suitable second molecule is compatible with any of the first molecules described above. For example, the second molecule may comprise a second protein or a second peptide; and the second recognition element can include a second antibody or second aptamer specific for the second protein or second peptide. Alternatively, the second molecule may comprise a second target polynucleotide; and the second recognition element comprises a second Cas protein specific for the second target polynucleotide. Alternatively, the second molecule may comprise a second carbohydrate; and the second recognition element may include a second lectin specific to the second carbohydrate. Alternatively, the second molecule may comprise a second biomolecule specific to the second recognition element.
In some examples, a portion of the second oligonucleotide comprises a double-stranded polynucleotide, and the transposase tags the first oligonucleotide to the double-stranded polynucleotide to produce the reporter polynucleotide.
In some examples, the first oligonucleotide comprises a first barcode corresponding to the first portion of the analyte and the second oligonucleotide comprises a second barcode corresponding to the second portion of the analyte.
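As a minimal sketch of how a reporter polynucleotide carrying both barcodes might be decoded after sequencing, the following Python example looks up the first and second barcodes in a read. The barcode sequences, the fixed barcode length, the labels, and the assumption that the barcodes sit at the two ends of the read are all hypothetical and are not taken from this disclosure.

```python
# Hypothetical barcode tables; sequences and labels are illustrative only.
FIRST_BARCODES = {"ACGT": "analyte_1_portion_1", "TTAG": "analyte_2_portion_1"}
SECOND_BARCODES = {"GGCA": "analyte_1_portion_2", "CATC": "analyte_2_portion_2"}
BARCODE_LEN = 4  # assumed fixed barcode length


def decode_reporter(read):
    """Return (first_portion, second_portion) labels for a reporter read.

    Assumes (for illustration) that the first barcode occupies the first
    BARCODE_LEN bases of the read and the second barcode the last
    BARCODE_LEN bases. Returns None if either barcode is not recognized.
    """
    if len(read) < 2 * BARCODE_LEN:
        return None
    first = FIRST_BARCODES.get(read[:BARCODE_LEN])
    second = SECOND_BARCODES.get(read[-BARCODE_LEN:])
    if first is None or second is None:
        return None
    return first, second
```

A read is attributed to an analyte only when both barcodes are recognized, which mirrors the requirement that the reporter comprise both the first and the second oligonucleotide.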
In some examples, the first oligonucleotide comprises a mosaic end (ME) transposon end coupled to the transposase.
In some examples, the first oligonucleotide has a different sequence than the second oligonucleotide.
In some examples, the first oligonucleotide comprises a forward primer and the second oligonucleotide comprises a reverse primer.
In some examples, the method further comprises inhibiting the activity of the transposase while specifically coupling the donor recognition probe to the first portion of the analyte and while specifically coupling the acceptor recognition probe to the second portion of the analyte. In some examples, a first condition of a fluid is used to inhibit the activity of the transposase. In some examples, the first condition of the fluid includes at least one of: (i) the presence of a sufficient amount of EDTA to inhibit the activity of the transposase, and (ii) the absence of a sufficient amount of magnesium ions for the activity of the transposase. In some examples, a dsDNA quencher is used to inhibit the activity of the transposase. In some examples, the activity of the transposase is inhibited by associating a blocking agent with the transposase. In some examples, the activity of the transposase is inhibited by the second oligonucleotide being single-stranded. In some examples, the method further comprises promoting the activity of the transposase prior to producing the reporter polynucleotide using the transposase. In some examples, a second condition of the fluid is used to promote the activity of the transposase. In some examples, the second condition of the fluid includes the presence of a sufficient amount of magnesium ions for the activity of the transposase. In some examples, the activity of the transposase is promoted by degrading the blocking agent. In some examples, the activity of the transposase is promoted by annealing a third oligonucleotide to the second oligonucleotide to form a double-stranded polynucleotide.
In some examples, detecting the analyte includes sequencing the reporter polynucleotide. In some examples, the sequencing includes performing sequencing-by-synthesis on the reporter polynucleotide.
In some examples, the transposase is coupled to the first recognition element via the first oligonucleotide.
In some examples, the donor recognition probe comprises two transposases, two first recognition elements, and two first oligonucleotides, wherein the two transposases form a dimer, each of the transposases being coupled to a corresponding one of the first recognition elements via a corresponding one of the first oligonucleotides.
In some examples, the donor recognition probe comprises two transposases, one first recognition element, and two first oligonucleotides. The two transposases may form a dimer, each of the transposases being coupled to the first recognition element via a respective one of the first oligonucleotides.
In some examples, the donor recognition probe comprises two transposases, one first recognition element, and two first oligonucleotides. The two transposases can form a dimer, at least one of the transposases being coupled to the first recognition element via a covalent bond.
In some examples, the first oligonucleotide and the second oligonucleotide comprise DNA.
In some examples, the transposase comprises Tn5.
In some examples, the acceptor recognition probe is coupled to a bead prior to coupling the acceptor recognition probe to the second portion of the analyte. The method may further comprise washing the bead after coupling the acceptor recognition probe to the second portion of the analyte and before coupling the donor recognition probe to the first portion of the analyte.
In some examples, the first recognition element and the first oligonucleotide are coupled to a first portion of an analyte prior to coupling the transposase to the first oligonucleotide and the first recognition element.
Some examples herein provide a method for detecting different analytes in a mixture. The method may comprise coupling the different analytes in the mixture to respective donor recognition probes. Each of the donor recognition probes can comprise a first recognition element specific for a first portion of a respective analyte, a first oligonucleotide corresponding to the first portion of that analyte, and a transposase coupled to the first recognition element and the first oligonucleotide. The method may comprise coupling the different analytes in the mixture to respective acceptor recognition probes. Each of the acceptor recognition probes may comprise a second recognition element specific for a second portion of the respective analyte, and a second oligonucleotide corresponding to the second portion of that analyte and coupled to the second recognition element. The method may comprise, for each analyte that is coupled to a respective donor recognition probe and to a respective acceptor recognition probe, generating, using the transposase of the donor recognition probe, a reporter polynucleotide comprising the first oligonucleotide and the second oligonucleotide corresponding to that analyte. The method may include detecting the analytes in the mixture based on the reporter polynucleotides comprising the first oligonucleotides and the second oligonucleotides corresponding to those analytes.
In some examples, the method further comprises determining the amount of analytes detected in the mixture based on the amounts of the reporter polynucleotides corresponding to those analytes.
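The quantification described above can be sketched as a simple tally of reporter polynucleotides per analyte. In this hedged Python example, the function name `count_analytes`, the barcode pairs, and the pair-to-analyte table are hypothetical; the sketch also assumes that each reporter has already been reduced to a (first barcode, second barcode) pair.

```python
from collections import Counter


def count_analytes(reporter_pairs, pair_to_analyte):
    """Tally reporters per analyte.

    reporter_pairs: iterable of (first_barcode, second_barcode) tuples,
        one per sequenced reporter polynucleotide.
    pair_to_analyte: dict mapping an expected barcode pair to an analyte
        label (hypothetical lookup table).
    Pairs that do not match an expected combination are ignored.
    """
    counts = Counter()
    for pair in reporter_pairs:
        analyte = pair_to_analyte.get(pair)
        if analyte is not None:
            counts[analyte] += 1
    return counts
```

Under this assumption, relative reporter counts serve as a proxy for relative analyte amounts; any calibration to absolute amounts is outside the scope of the sketch.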
In some examples, for a first analyte of the analytes, a first donor recognition probe of the donor recognition probes is specific for a first form of the first portion of that analyte. In some examples, for the first analyte, a second donor recognition probe of the donor recognition probes is specific for a second form of the first portion of that analyte. In some examples, the first and second donor recognition probes are mixed with the analyte simultaneously.
In some examples, for a first analyte of the analytes, the second donor recognition probe is specific for both the first and second forms of the first portion of that analyte. In some examples, the second donor recognition probe is mixed with the analyte after the first donor recognition probe is mixed with the analyte. In some examples, the first form is post-translationally modified (PTM), and the second form is not post-translationally modified. In some examples, the first form is phosphorylated, acetylated, methylated, nitrosylated, or glycosylated relative to the second form.
In some examples, the method further comprises determining the amounts of the first and second forms of the first analyte based on the amounts of the reporter polynucleotides corresponding to the first and second donor recognition probes.
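For the case in which the first and second donor recognition probes are each specific for a different form of the analyte, the relative amounts of the two forms can be estimated from the reporter counts. The following Python sketch is illustrative only: the function name is hypothetical, and it assumes that both probes bind and tagment with equal efficiency, which this disclosure does not state.

```python
def modified_fraction(first_form_reads, second_form_reads):
    """Estimate the fraction of an analyte present in its first form.

    first_form_reads: reporter count attributed to the probe specific
        for the first form (e.g., a post-translationally modified form).
    second_form_reads: reporter count attributed to the probe specific
        for the second form.
    Assumes equal probe efficiency (an assumption of this sketch).
    Returns None when no reporters were observed.
    """
    total = first_form_reads + second_form_reads
    if total == 0:
        return None
    return first_form_reads / total
```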
Some examples herein provide a composition. The composition may comprise an analyte having a first portion and a second portion. The composition may comprise a donor recognition probe coupled to the first portion of the analyte. The donor recognition probe can comprise a first recognition element specific for the first portion of the analyte, a first oligonucleotide corresponding to the first portion of the analyte, and a transposase coupled to the first recognition element and the first oligonucleotide. The composition may comprise an acceptor recognition probe coupled to the second portion of the analyte, the acceptor recognition probe comprising a second recognition element specific for the second portion of the analyte and a second oligonucleotide coupled to the second recognition element and corresponding to the second portion of the analyte.
Some examples herein provide a kit. The kit may comprise a plurality of donor recognition probes, each donor recognition probe comprising a first recognition element specific for a first portion of a respective analyte, a first oligonucleotide corresponding to the first portion of the respective analyte, and a transposase coupled to the first recognition element and the first oligonucleotide. The kit may further comprise a plurality of acceptor recognition probes, each acceptor recognition probe comprising a second recognition element specific for a second portion of the respective analyte and a second oligonucleotide coupled to the second recognition element and corresponding to the second portion of the respective analyte.
Some examples herein provide a method for detecting an analyte. The method may include coupling a first recognition probe to a first portion of the analyte. The first recognition probe may comprise a first recognition element specific for a first portion of the analyte and a first oligonucleotide corresponding to the first portion of the analyte. The method may include coupling a second recognition probe to a second portion of the analyte. The second recognition probe may comprise a second recognition element specific for a second portion of the analyte and a second oligonucleotide corresponding to the second portion of the analyte. The method may include coupling the first oligonucleotide to the second oligonucleotide using a splint oligonucleotide that has complementarity to both a portion of the first oligonucleotide and a portion of the second oligonucleotide to form a reporter oligonucleotide coupled to the first recognition probe and the second recognition probe. The method may comprise performing a sequence analysis on the reporter oligonucleotide. The method may include detecting the analyte based on sequence analysis of the reporter oligonucleotide.
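The splint-mediated coupling described above can be illustrated with a short Python sketch. It models each oligonucleotide as a 5'-to-3' string and treats the splint as bridging the junction when it is the exact reverse complement of the 3' end of the first oligonucleotide joined to the 5' end of the second. The function names, the arm length, and the requirement of perfect complementarity are assumptions of this sketch rather than details of this disclosure.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def splint_ligate(first_oligo, second_oligo, splint, arm=4):
    """Return the joined reporter if the splint bridges the junction.

    The splint is expected to be complementary to the last `arm` bases of
    the first oligonucleotide and the first `arm` bases of the second
    oligonucleotide. Returns None if the splint does not match.
    """
    if len(first_oligo) < arm or len(second_oligo) < arm:
        return None
    junction = first_oligo[-arm:] + second_oligo[:arm]
    if splint == reverse_complement(junction):
        return first_oligo + second_oligo
    return None
```

Because the splint only matches oligonucleotides from a cognate probe pair, a reporter forms only when both probes are bound in proximity to the same analyte, which is the basis of the specificity in this approach.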
In some examples, the method further comprises generating a double-stranded oligonucleotide comprising a reporter oligonucleotide coupled to the first recognition probe and the second recognition probe and a complementary oligonucleotide hybridized to the reporter oligonucleotide. In some examples, the method further comprises cleaving a portion of the double-stranded oligonucleotide, wherein sequence analysis is performed on the cleaved portion of the double-stranded oligonucleotide.
In some examples, the sequence analysis performed includes any one or more of isothermal bead-based amplification, targeted genomic amplification, and whole genome amplification.
In some examples, the first recognition probe or the second recognition probe comprises an antibody, lectin, or aptamer. In some examples, the first recognition probe includes a first antibody, a first lectin, or a first aptamer. In some examples, the second recognition probe includes a second antibody, a second lectin, or a second aptamer.
In some examples, the first oligonucleotide comprises a partial barcode and the second oligonucleotide comprises a partial barcode, wherein coupling the first oligonucleotide to the second oligonucleotide produces a complete barcode corresponding to the target analyte.
In some examples, performing sequence analysis includes performing a Polymerase Chain Reaction (PCR) on the reporter oligonucleotide. In some examples, the reporter oligonucleotide comprises a Unique Molecular Identifier (UMI) that is amplified during PCR.
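Because PCR copies each reporter oligonucleotide many times, a UMI allows amplified copies to be collapsed back to the original molecules. The following Python sketch shows one common way to do this, counting distinct UMIs per barcode; the function name and the (barcode, UMI) input format are hypothetical, and sequencing errors in UMIs are not handled.

```python
from collections import defaultdict


def count_unique_molecules(reads):
    """Count distinct UMIs observed for each barcode.

    reads: iterable of (barcode, umi) tuples, one per sequenced read.
    Reads sharing both barcode and UMI are treated as PCR duplicates of a
    single original reporter oligonucleotide.
    """
    umis_by_barcode = defaultdict(set)
    for barcode, umi in reads:
        umis_by_barcode[barcode].add(umi)
    return {barcode: len(umis) for barcode, umis in umis_by_barcode.items()}
```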
Some examples herein provide a method for detecting multiple analytes in a sample. The method may comprise incubating the sample with a plurality of pairs of recognition probes. Each pair of recognition probes may include a first recognition probe and a second recognition probe. Each pair of recognition probes may be specific for a respective analyte of the analytes. Each first recognition probe and each second recognition probe may be coupled to a corresponding oligonucleotide. The method may comprise incubating the sample with a plurality of splint oligonucleotides. Each splint oligonucleotide may be complementary to portions of the oligonucleotides that are respectively coupled to the first recognition probe and the second recognition probe of a pair of recognition probes specific for a respective analyte of the analytes. Complementary binding of each splint oligonucleotide to the oligonucleotides coupled to the first recognition probe and the second recognition probe may result in the formation of a reporter oligonucleotide. The method may comprise washing the sample to remove any unbound recognition probes and any unbound splint oligonucleotides. The method may comprise performing a sequence analysis on the reporter oligonucleotides. The method may include detecting the plurality of analytes based on the sequence analysis.
In some examples, incubating the sample further comprises incubating with a ligase.
In some examples, performing sequence analysis includes using any one or more of a microarray, bead array, library preparation, or PCR.
Some examples herein provide a composition. The composition may comprise a plurality of analytes. The composition may comprise a plurality of pairs of recognition probes. Each pair of recognition probes may include a first recognition probe and a second recognition probe. Each pair of recognition probes may be specific for a respective analyte of the analytes. Each first recognition probe and each second recognition probe may be coupled to a corresponding oligonucleotide. The composition may comprise a plurality of splint oligonucleotides. Each splint oligonucleotide may be complementary to portions of the oligonucleotides that are respectively coupled to the first recognition probe and the second recognition probe of a pair of recognition probes specific for a respective analyte of the analytes.
Some examples herein provide a kit. The kit may comprise a plurality of pairs of recognition probes. Each pair of recognition probes may include a first recognition probe and a second recognition probe. Each pair of recognition probes may be specific for a respective analyte of a plurality of analytes. Each first recognition probe and each second recognition probe may be coupled to a corresponding oligonucleotide. The kit may comprise a plurality of splint oligonucleotides. Each splint oligonucleotide may be complementary to portions of the oligonucleotides that are respectively coupled to the first recognition probe and the second recognition probe of a pair of recognition probes specific for a respective analyte of the analytes.
Some examples herein provide a method for detecting an analyte. The method may include coupling a first recognition probe to a first portion of the analyte. The first recognition probe may comprise a first recognition element specific for the first portion of the analyte and a double-stranded oligonucleotide comprising a first barcode corresponding to the first portion of the analyte. The method may include coupling a second recognition probe to a second portion of the analyte. The second recognition probe may comprise a second recognition element specific for the second portion of the analyte and a single-stranded oligonucleotide comprising a second barcode corresponding to the second portion of the analyte. The method can include hybridizing the single-stranded oligonucleotide to a single oligonucleotide strand of the double-stranded oligonucleotide to form a reporter oligonucleotide comprising the first barcode and the second barcode. The method may comprise performing a sequence analysis on the reporter oligonucleotide. The method may include detecting the analyte based on the sequence analysis of the reporter oligonucleotide.
In some examples, the hybridizing step includes strand invasion of the double-stranded oligonucleotide by the single-stranded oligonucleotide.
In some examples, the sequence analysis performed includes any one or more of isothermal bead-based amplification, targeted genomic amplification, and whole genome amplification.
In some examples, detecting the analyte includes performing a quantitative detection of the reporter oligonucleotide.
Some examples herein provide a method for detecting an analyte. The method may include coupling a first recognition probe to a first portion of the analyte, the first recognition probe comprising a first recognition element specific for the first portion of the analyte and a first oligonucleotide corresponding to the first portion of the analyte. The first oligonucleotide may comprise a first restriction endonuclease site. The method may include coupling a second recognition probe to a second portion of the analyte, the second recognition probe comprising a second recognition element specific for the second portion of the analyte and a second oligonucleotide corresponding to the second portion of the analyte. The second oligonucleotide may comprise a second restriction endonuclease site. The method may comprise coupling the first oligonucleotide to the second oligonucleotide. The method can include cleaving the first oligonucleotide and the second oligonucleotide at the first restriction endonuclease site and the second restriction endonuclease site to form a reporter oligonucleotide. The method may comprise performing a sequence analysis on the reporter oligonucleotide. The method may include detecting the analyte based on sequence analysis of the reporter oligonucleotide.
In some examples, the cleavage step includes the use of one or more restriction endonucleases.
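The cleavage step can be sketched in Python as locating the two restriction sites in the coupled oligonucleotides and keeping the segment between the cuts as the reporter. The example uses the EcoRI site (GAATTC) and the BamHI site (GGATCC), each of which is cut after its first base on the top strand; these enzymes are chosen only for illustration, since this disclosure does not name particular enzymes. The sketch models the top strand only, and the function name is hypothetical.

```python
def excise_reporter(joined, site1="GAATTC", site2="GGATCC", cut_offset=1):
    """Return the top-strand segment between two restriction cuts.

    joined: sequence of the coupled first and second oligonucleotides.
    site1, site2: recognition sequences (defaults: EcoRI and BamHI sites,
        used here only as an illustration).
    cut_offset: number of bases into each site at which the top strand
        is cut (1 for both default enzymes).
    Returns None if either site is missing or the sites are out of order.
    """
    start = joined.find(site1)
    if start == -1:
        return None
    end = joined.find(site2, start + len(site1))
    if end == -1:
        return None
    return joined[start + cut_offset:end + cut_offset]
```

For example, `excise_reporter("TTGAATTCACGTACGTGGATCCAA")` keeps the bases from just after the first cut to just before the second, so the released reporter carries the sequence that lay between the two sites.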
In some examples, the sequence analysis performed includes any one or more of isothermal bead-based amplification, targeted genomic amplification, and whole genome amplification.
In some examples, detecting the analyte includes performing a quantitative detection of the reporter oligonucleotide.
Some examples herein provide a method of performing a targeted epigenetic assay. The method may comprise contacting the polynucleotide with a mixture of first complexes specific for different types of proteins coupled to respective loci of the polynucleotide. Each first complex of the first complexes may comprise a first antibody specific for a corresponding type of protein, and a first transposome coupled to the first antibody and comprising a first oligonucleotide corresponding to that type of protein. The method may comprise coupling the first complexes to proteins specific for the first antibodies, respectively. The method can include generating a fragment of a polynucleotide, including activating a first transposome to create a first nick in the polynucleotide and coupling a first oligonucleotide to the first nick. The method may include removing the protein and the first complex from the fragment. The method may include subsequently sequencing the fragment and the first oligonucleotide coupled to the fragment. The method may comprise identifying proteins that have been coupled to the fragments using the sequences of the first oligonucleotides coupled to those fragments.
In some examples, each of the first complexes comprises a plurality of first transposomes. For example, each first complex of the first complexes may comprise two first transposomes.
Additionally, or alternatively, in some examples, the first transposome may be inactivated using a first condition of a fluid. In some examples, the first condition of the fluid may include at least one of: (i) the presence of EDTA in an amount sufficient to inhibit the activity of the first transposome, and (ii) the absence of magnesium ions in an amount sufficient for the activity of the first transposome. Additionally, or alternatively, in some examples, the first transposome is activated using a second condition of the fluid. In some examples, the second condition of the fluid includes the presence of a sufficient amount of magnesium ions for the activity of the first transposome.
Additionally, or alternatively, in some examples, the sequencing includes performing sequencing-by-synthesis on the fragments and the oligonucleotides coupled to the fragments.
Additionally, or alternatively, in some examples, the method includes identifying a respective locus of the protein using a respective position in the fragment of the first oligonucleotide.
Additionally, or alternatively, in some examples, the first oligonucleotide comprises a primer.
Additionally, or alternatively, in some examples, the first oligonucleotide comprises a Unique Molecular Identifier (UMI).
Additionally, or alternatively, in some examples, the first oligonucleotide comprises a barcode corresponding to a protein.
Additionally, or alternatively, in some examples, the first oligonucleotide comprises a mosaic end (ME) transposon end.
Additionally, or alternatively, in some examples, the first transposome is coupled to the first antibody via a covalent bond.
Additionally, or alternatively, in some examples, the first transposome is coupled to the first antibody via a non-covalent bond. For example, the first transposome may be coupled to Protein A, and the active site of the first antibody may be coupled to Protein A.
Additionally, or alternatively, in some examples, the first transposome includes Tn5.
Additionally, or alternatively, in some examples, each first complex of the first complexes comprises a fusion protein comprising a first antibody and a first transposome.
Additionally, or alternatively, in some examples, the first antibody is coupled to the first oligonucleotide, and the first transposome is coupled to the first antibody via the first oligonucleotide.
Additionally, or alternatively, in some examples, the method further comprises contacting the polynucleotide with a mixture of second complexes specific for the first complexes. Each second complex of the second complexes may include a second antibody specific for the first antibody, and a second transposome coupled to the second antibody and comprising a second oligonucleotide. The method may include separately coupling the second complex with the first complex. Generating a fragment of the polynucleotide further can include activating a second transposome to create a second nick in the polynucleotide and coupling the second oligonucleotide to the second nick. The second oligonucleotide may be used to amplify the fragment prior to sequencing.
Additionally, or alternatively, in some examples, the polynucleotide comprises double-stranded DNA.
Some examples herein provide a composition. The composition may comprise polynucleotides having different types of proteins coupled to their respective loci. The composition may comprise a mixture of first complexes specific for different types of proteins. Each first complex of the first complexes may comprise a first antibody selective for a type of protein, and a first transposome coupled to the first antibody and comprising a first oligonucleotide corresponding to that type of protein.
In some examples, each of the first complexes comprises a plurality of first transposomes. For example, each first complex of the first complexes may comprise two first transposomes.
Additionally, or alternatively, in some examples, the first transposome is inactivated using a condition of the fluid. For example, the condition of the fluid may include at least one of: (i) the presence of EDTA in an amount sufficient to inhibit the activity of the first transposome, and (ii) the absence of magnesium ions in an amount sufficient for the activity of the first transposome.
Additionally, or alternatively, in some examples, the first transposome is activatable to cleave the polynucleotide and add the first oligonucleotide to the nick. In some examples, the first transposome may be activated using a condition of the fluid. In some examples, the condition of the fluid includes the presence of a sufficient amount of magnesium ions for transposase activity.
Additionally, or alternatively, in some examples, the first oligonucleotide comprises a primer.
Additionally, or alternatively, in some examples, the first oligonucleotide comprises a Unique Molecular Identifier (UMI).
Additionally, or alternatively, in some examples, the first oligonucleotide comprises a barcode corresponding to a protein.
Additionally, or alternatively, in some examples, the first oligonucleotide comprises a mosaic end (ME) transposon end.
Additionally, or alternatively, in some examples, the first transposome is coupled to the antibody via a covalent bond.
Additionally, or alternatively, in some examples, the first transposome is coupled to the antibody via a non-covalent bond.
Additionally, or alternatively, in some examples, the first transposome is coupled to protein A and the active site of the first antibody is coupled to protein A.
Additionally, or alternatively, in some examples, the first transposome includes Tn5.
Additionally, or alternatively, in some examples, each first complex of the first complexes comprises a fusion protein comprising a first antibody and a first transposome.
Additionally, or alternatively, in some examples, the first antibody is coupled to the first oligonucleotide, and the first transposome is coupled to the first antibody via the first oligonucleotide.
Additionally, or alternatively, in some examples, the composition further comprises a mixture of second complexes that are specific to the first complexes. Each of the second complexes may comprise a second antibody coupled to one of the first antibodies and a second transposome comprising a second oligonucleotide.
Additionally, or alternatively, in some examples, the polynucleotide comprises double-stranded DNA.
It should be understood that any respective feature/example of each of the aspects of the disclosure as described herein may be implemented together in any suitable combination, and any feature/example from any one or more of these aspects may be implemented together with any suitable combination of features of other aspect(s) as described herein to achieve the benefits as described herein.
Drawings
FIG. 1 schematically illustrates exemplary operations and compositions in a process flow for detecting analytes using proximity-induced labeling.
FIG. 2 schematically illustrates an exemplary donor recognition probe for detecting analytes using proximity-induced labeling.
FIG. 3 schematically illustrates an exemplary acceptor recognition probe for detecting analytes using proximity-induced labeling.
Fig. 4A-4G schematically illustrate further details of operations and compositions in the process flow of FIG. 1 according to some examples.
Fig. 5 schematically illustrates exemplary operations and compositions in a process flow for detecting post-translational modifications (PTMs) using donor recognition probes.
Fig. 6 schematically illustrates exemplary operations and compositions in a process flow for detecting post-translational modifications (PTMs) using PTM-specific and non-PTM-specific donor recognition probes.
Fig. 7A-7C schematically illustrate exemplary operations and compositions in a process flow for detecting molecular interactions using proximity-induced labeling.
Fig. 8A-8C schematically illustrate an exemplary process flow for preparing a donor recognition probe.
Fig. 9A-9E schematically illustrate exemplary compositions and operations for reducing background labeling during proximity-induced labeling.
Fig. 10A-10D schematically illustrate additional exemplary compositions and operations for reducing background labeling during proximity-induced labeling.
Fig. 11A-11C schematically illustrate additional exemplary compositions and operations for reducing background labeling during proximity-induced labeling.
Fig. 12 schematically illustrates exemplary compositions and operations for reducing contaminants during proximity-induced labeling.
FIG. 13 illustrates an exemplary operational flow in a method for detecting an analyte using proximity-induced labeling.
Fig. 14 schematically illustrates exemplary operations and compositions in a process flow for detecting molecular interactions using proximity-induced labeling.
Fig. 15A-15C schematically illustrate exemplary operations and compositions in a process flow. FIG. 15A shows the detection of RNA modification on a specific RNA target. FIGS. 15B and 15C illustrate detection of molecular interactions using proximity-induced labeling.
FIG. 16 schematically illustrates exemplary operations and compositions in a process flow for detecting nucleic acid modifications using donor recognition probes specific for the nucleic acid modifications.
FIG. 17 schematically illustrates exemplary operations and compositions in a process flow for detecting nucleic acid modifications using a donor recognition probe that can specifically detect modifications and a donor recognition probe that is specific for a target but not specific for the modification.
Fig. 18 schematically illustrates exemplary operations and compositions in a process flow for detecting background labeling during proximity-induced labeling.
FIG. 19 schematically shows an exemplary process flow for adding an aptamer to a reporter polynucleotide.
FIGS. 20A and 20B schematically illustrate exemplary operations and compositions for detecting analytes using bead arrays.
Fig. 21A-21B schematically illustrate additional exemplary operations and compositions for detecting analytes using bead arrays.
Fig. 22 schematically illustrates additional exemplary operations and compositions for detecting analytes using bead arrays.
Fig. 23A and 23B schematically illustrate an exemplary process flow for adding unique molecular identifiers to a donor recognition probe and an acceptor recognition probe.
Fig. 24A-24D schematically illustrate an exemplary procedure for proximity induced ligation assays using splint oligonucleotides.
FIGS. 25A-25C schematically illustrate examples of ways to distinguish between ligated and non-ligated oligonucleotides.
Fig. 26A-26C schematically illustrate another exemplary procedure for proximity induced ligation assays using splint oligonucleotides.
Fig. 27A-27B illustrate a flow of operations in an exemplary method for detecting an analyte using a splint oligonucleotide according to some examples herein.
Fig. 28A-28D schematically illustrate an exemplary process for a proximity-induced strand invasion assay.
FIG. 29 illustrates an operational flow in an exemplary method for detecting an analyte using proximity-induced strand invasion according to some examples herein.
Fig. 30A-30D schematically illustrate an exemplary process for a proximity-induced restriction assay.
FIG. 31 illustrates an operational flow in an exemplary method for detecting an analyte using proximity-induced restriction according to some examples herein.
Fig. 32A-32C schematically illustrate exemplary procedures and compositions for use in whole genome amplification using random-primed isothermal Multiple Displacement Amplification (MDA).
Fig. 33A to 33C schematically show exemplary synthetic oligonucleotide sequences.
Fig. 33D is a table with a corresponding number of targets synthesized for each probe class.
FIG. 34 schematically illustrates an exemplary synthetic model system for assessing detection of synthetic oligonucleotides.
Fig. 35A-35C schematically illustrate an exemplary synthetic model system for assessing detection of synthetic oligonucleotides.
Fig. 36 shows fluorescence measured during use of the exemplary synthetic model system of FIGS. 34 and 35A-35C.
Fig. 37 shows the results of additional measurements made during use of the exemplary synthetic model system of FIGS. 34 and 35A-35C.
FIGS. 38A-38E schematically illustrate exemplary compositions and operations in a process flow for targeted epigenetic assays.
Fig. 39A schematically shows exemplary oligonucleotides that may be used in the process flows of FIGS. 38A-38E.
FIG. 39B schematically shows fragments coupled to the exemplary oligonucleotides of FIG. 39A.
Fig. 40A-40C schematically show further details of complexes such as may be used in the process flows of FIGS. 38A-38E.
FIG. 41 schematically shows an exemplary procedure for generating complexes each comprising a transposome conjugated to an antibody.
FIG. 42 schematically shows an exemplary procedure for generating a complex comprising a plurality of transposomes coupled to antibodies, respectively.
FIG. 43 shows an operation in which an antibody of one of the complexes of FIG. 5 selectively binds to a protein at a locus of a polynucleotide.
FIG. 44 schematically shows an exemplary flow of operations for amplifying a fragment of a polynucleotide after tagging by a transposome of a complex.
FIG. 45 schematically illustrates another exemplary procedure for generating complexes each comprising a transposome conjugated to a plurality of antibodies.
Fig. 46A to 46B schematically show exemplary procedures for producing complexes each comprising a transposome conjugated to an antibody.
Fig. 47A and 47C schematically show an exemplary flow of operations in which proteins at respective loci of polynucleotides are sequentially bound by antibodies of primary and secondary complexes.
FIG. 47B shows an exemplary fragment of the polynucleotides of FIG. 47A or FIG. 47C after tagging.
FIG. 48 illustrates an exemplary flow of operations in a method for targeted epigenetic assays.
Detailed Description
Provided herein are targeted epigenetic assays, proximity-induced labeling, strand invasion, restriction and ligation, and their use for detecting analytes.
For example, the present examples can be used to detect analytes, such as biomolecules, by using analyte recognition elements (e.g., antibodies, aptamers, or lectins) specific for the respective analytes to generate reporter polynucleotides having sequences corresponding to those analytes. The reporter polynucleotides may then be sequenced, and the corresponding analytes may be detected from those sequences. In some examples provided herein, a reporter polynucleotide is generated using a proximity-induced labeling reaction between two analyte-binding recognition elements that are respectively coupled to: 1) a donor recognition probe comprising a transposome with an active barcode, and 2) an acceptor DNA handle with a second barcode. In other examples provided herein, a reporter polynucleotide is generated using proximity-induced strand invasion between analyte-binding recognition elements that are respectively coupled to: 1) a double-stranded oligonucleotide and 2) a single-stranded oligonucleotide that invades the double-stranded oligonucleotide. In still other examples provided herein, a reporter polynucleotide is generated using a proximity-induced ligation reaction between analyte-binding recognition elements that are respectively coupled to single-stranded oligonucleotides; the two single-stranded oligonucleotides become coupled to each other when they are in proximity to each other and to a splint oligonucleotide to which both of them hybridize. In yet other examples provided herein, a reporter polynucleotide is generated using proximity-induced restriction, in which analyte-binding recognition elements are respectively coupled to single-stranded oligonucleotides that hybridize to each other when in proximity to each other to form a double-stranded oligonucleotide that comprises one or more targets of a restriction enzyme, and the double-stranded oligonucleotide is cleaved using the restriction enzyme.
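As a purely illustrative sketch of how sequenced reporter polynucleotides might be mapped back to analytes, the following Python example counts reads in which the donor barcode and the acceptor barcode correspond to the same analyte. The barcode sequences, the analyte names, the barcode length, and the assumption that the donor barcode sits at the start of the read and the acceptor barcode at the end are all hypothetical and are not taken from this disclosure.

```python
from collections import Counter

# Hypothetical barcode tables; sequences and analyte names are invented for illustration.
DONOR_BARCODES = {"ACGT": "analyte_A", "TTAG": "analyte_B"}
ACCEPTOR_BARCODES = {"GGCA": "analyte_A", "CATC": "analyte_B"}
BARCODE_LEN = 4  # assumed fixed barcode length


def call_analytes(reads):
    """Count reads whose donor and acceptor barcodes name the same analyte.

    Assumes, for illustration only, that each read begins with the donor
    barcode and ends with the acceptor barcode.
    """
    counts = Counter()
    for read in reads:
        donor = DONOR_BARCODES.get(read[:BARCODE_LEN])
        acceptor = ACCEPTOR_BARCODES.get(read[-BARCODE_LEN:])
        # Reads with unknown or mismatched barcodes are skipped in this sketch.
        if donor is not None and donor == acceptor:
            counts[donor] += 1
    return counts


reads = ["ACGTNNNNGGCA", "ACGTNNNNCATC", "TTAGNNCATC", "ACGTAAGGCA"]
counts = call_analytes(reads)
```

In this sketch, reads pairing barcodes of different analytes are simply discarded; an assay directed to molecular interactions might instead tabulate such mixed pairs.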
As will be apparent from the present description, the methods of the present invention provide highly scalable, multiplexed detection, quantification and/or characterization of analytes.
Some examples herein may use antibody-transposome complexes that selectively couple oligonucleotides to a polynucleotide near loci at which proteins are coupled to that polynucleotide. These oligonucleotides can then be sequenced to identify the proteins and their corresponding loci along that polynucleotide. Each of the complexes may comprise an antibody that selectively couples to a corresponding protein along the polynucleotide, an oligonucleotide, and one or more transposomes that respectively (i) cleave the polynucleotide at a position adjacent to that protein (e.g., within about 1-20 base pairs of that protein) and (ii) couple the oligonucleotide to the cleaved end of the polynucleotide. Each of the oligonucleotides may include a barcode corresponding to the protein for which the respective complex is selective, and may further include a Unique Molecular Identifier (UMI) corresponding to the particular polynucleotide molecule being cleaved. The position at which the oligonucleotide is coupled to the polynucleotide corresponds to the position of the protein. Thus, the sequence of the oligonucleotide, together with the position of the oligonucleotide, can be used to identify a particular protein coupled to a particular locus of a particular polynucleotide molecule. If there is a large amount of overlap in the sequences, the UMIs can be used for accurate quantification; for example, if the same locus is cut at substantially the same position in 50 separate copies of the polynucleotide (each of these copies having its own UMI), it can be determined that there were 50 original polynucleotide fragments. Such operations may be performed along any desired portion of the polynucleotide, and indeed may be performed on an entire chromosome or even on a whole genome (WG) sample, thus producing a collection of fragment molecules, each labeled with an oligonucleotide indicative of a protein coupled to that particular fragment molecule.
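The UMI-based quantification described above can be sketched as follows. This is a minimal illustration only: the record layout (protein barcode, cut position, UMI) and the function name are hypothetical, and the sketch assumes error-free UMIs, whereas a practical pipeline would typically also tolerate sequencing errors in the UMI.

```python
from collections import defaultdict


def count_original_molecules(reads):
    """Count distinct UMIs observed for each (protein barcode, cut position).

    `reads` is an iterable of (protein_barcode, cut_position, umi) tuples;
    this record layout is an assumption made for illustration.
    """
    umis_by_site = defaultdict(set)
    for barcode, position, umi in reads:
        umis_by_site[(barcode, position)].add(umi)
    # Amplification duplicates share a UMI, so they collapse to one molecule.
    return {site: len(umis) for site, umis in umis_by_site.items()}


# 50 original molecules cut at the same position, each amplified into 3 reads.
reads = [("PROT1", 1200, f"UMI{i:02d}") for i in range(50) for _ in range(3)]
molecule_counts = count_original_molecules(reads)
```

Here 150 reads at the same locus collapse to 50 distinct UMIs, matching the 50 original fragments in the example above.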
These fragments (with oligonucleotides coupled thereto) can be readily sequenced in multiplex fashion, for example using existing commercially available sequencing-by-synthesis systems. The sequences thus obtained may be associated with proteins coupled to those fragments. Thus, the present examples provide a powerful and highly multiplexed platform for determining which proteins are coupled to which specific loci of any desired polynucleotide or set of polynucleotides.
Thus, it should be understood that some examples herein relate to enriching DNA regions (small or large) that retain epigenetic characteristics (e.g., proteins) that are subsequently processed in an epigenetic-NGS assay. This approach enables ultra-deep epigenetic assays, thereby increasing resolution of fine epigenetic changes (e.g., as compared to chromatin immunoprecipitation sequencing (ChIP-seq)) and complex networks (e.g., locus-related proteomics), which may be advantageous for better understanding epigenetic mechanisms, such as may be important for research or clinical development.
First, some terms used herein will be briefly explained. Then, some exemplary compositions and exemplary methods for targeted epigenetic assays or for using proximity-induced labeling, strand invasion, restriction or ligation will be described.
Terminology
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. The use of the term "including," as well as other forms such as "include," "includes," and "included," is not limiting. The use of the term "having," as well as other forms such as "have," "has," and "had," is not limiting. As used in this specification, the terms "comprise" and "comprising" are to be interpreted as having an open-ended meaning, both in the transitional phrase and in the body of the claim. That is, the above terms should be interpreted synonymously with the phrase "having at least" or "including at least". For example, when used in the context of a process, the term "comprising" means that the process includes at least the recited steps, but may also include additional steps. The term "comprising" when used in the context of a compound, composition or device means that the compound, composition or device comprises at least the recited features or components, but may also comprise additional features or components.
As used herein, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise.
The terms "substantially," "about," and "approximately" are used throughout this specification to describe and illustrate minor fluctuations as a result of variations in processing. For example, they may refer to less than or equal to ±10%, such as less than or equal to ±5%, such as less than or equal to ±2%, such as less than or equal to ±1%, such as less than or equal to ±0.5%, such as less than or equal to ±0.2%, such as less than or equal to ±0.1%, such as less than or equal to ±0.05%.
As used herein, terms such as "hybridizing" and "hybridization" are intended to mean that polynucleotides associate non-covalently with each other along the length of those polynucleotides to form a double-stranded "duplex," a three-stranded "triplex," or a higher-order structure. For example, two DNA polynucleotide strands may associate to form a duplex by complementary base pairing. The major interactions between polynucleotide strands are typically nucleotide base specific, e.g., A:T, A:U and G:C, through Watson-Crick and Hoogsteen type hydrogen bonds. Base stacking and hydrophobic interactions can also contribute to duplex stability. Hybridization conditions can include a salt concentration of less than about 1 M, more typically less than about 500 mM or less than about 200 mM. Hybridization buffers may include buffered saline, such as 5× SSPE, or another suitable buffer known in the art. Hybridization temperatures can be as low as 5 °C, but are typically greater than 22 °C, more typically greater than about 30 °C, and often greater than 37 °C. The strength of association between the first and second polynucleotides increases with complementarity between nucleotide sequences within those polynucleotides. The hybridization strength between polynucleotides can be characterized by the melting temperature (Tm), at which 50% of the duplexes have polynucleotide strands dissociated from each other.
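The notion of complementarity between two strands can be illustrated with a minimal sketch that scores Watson-Crick pairing (A:T, A:U and G:C) between two equal-length strands aligned antiparallel to each other. The function name and the simple position-by-position scoring are assumptions made for illustration; the sketch ignores Hoogsteen pairing, mismatch position, salt concentration and temperature, and it does not estimate Tm.

```python
# Watson-Crick base pairs considered in this sketch (DNA and RNA).
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def complementarity(strand_a, strand_b):
    """Fraction of positions that pair when strand_b is aligned antiparallel
    to strand_a. Both strands are written 5'->3' and must be equally long."""
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have the same length in this sketch")
    if not strand_a:
        return 0.0
    paired = sum(
        (a, b) in PAIRS
        for a, b in zip(strand_a.upper(), reversed(strand_b.upper()))
    )
    return paired / len(strand_a)


full = complementarity("ATGC", "GCAT")     # fully complementary duplex
partial = complementarity("ATGC", "GCAA")  # one mismatched position
```

A higher fraction corresponds, qualitatively, to the stronger association described above.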
As used herein, the term "nucleotide" is intended to mean a molecule comprising a sugar and at least one phosphate group, and in some examples also a nucleobase. Nucleotides lacking nucleobases may be referred to as "abasic". Nucleotides include deoxyribonucleotides, modified deoxyribonucleotides, ribonucleotides, modified ribonucleotides, peptide nucleotides, modified phosphate-sugar backbone nucleotides, and mixtures thereof. Examples of nucleotides include adenosine monophosphate (AMP), adenosine diphosphate (ADP), adenosine triphosphate (ATP), thymidine monophosphate (TMP), thymidine diphosphate (TDP), thymidine triphosphate (TTP), cytidine monophosphate (CMP), cytidine diphosphate (CDP), cytidine triphosphate (CTP), guanosine monophosphate (GMP), guanosine diphosphate (GDP), guanosine triphosphate (GTP), uridine monophosphate (UMP), uridine diphosphate (UDP), uridine triphosphate (UTP), deoxyadenosine monophosphate (dAMP), deoxyadenosine diphosphate (dADP), deoxyadenosine triphosphate (dATP), deoxythymidine monophosphate (dTMP), deoxythymidine diphosphate (dTDP), deoxythymidine triphosphate (dTTP), deoxycytidine diphosphate (dCDP), deoxycytidine triphosphate (dCTP), deoxyguanosine monophosphate (dGMP), deoxyguanosine diphosphate (dGDP), deoxyguanosine triphosphate (dGTP), deoxyuridine monophosphate (dUMP), deoxyuridine diphosphate (dUDP), and deoxyuridine triphosphate (dUTP).
As used herein, the term "nucleotide" is also intended to encompass any nucleotide analog, which is a type of nucleotide that comprises a modified nucleobase, sugar, backbone and/or phosphate moiety as compared to naturally occurring nucleotides. Nucleotide analogs can also be referred to as "modified nucleic acids". Exemplary modified nucleobases include inosine, xanthine, hypoxanthine, isocytosine, isoguanine, 2-aminopurine, 5-methylcytosine, 5-hydroxymethylcytosine, 2-aminoadenine, 6-methyladenine, 6-methylguanine, 2-propylguanine, 2-propyladenine, 2-thiouracil, 2-thiothymine, 2-thiocytosine, 5-halouracil, 5-halocytosine, 5-propynyluracil, 5-propynylcytosine, 6-azouracil, 6-azocytosine, 6-azothymine, 5-uracil, 4-thiouracil, 8-haloadenine or guanine, 8-aminoadenine or guanine, 8-thiol adenine or guanine, 8-thioalkyl adenine or guanine, 8-hydroxy adenine or guanine, 5-halo substituted uracil or cytosine, 7-methylguanine, 7-methyladenine, 8-azaguanine, 8-azaadenine, 7-deazaguanine, 7-deazaadenine, 3-deazaguanine, 3-deazaadenine, and the like. As is known in the art, certain nucleotide analogs cannot be incorporated into polynucleotides, for example nucleotide analogs such as adenosine 5'-phosphosulfate. The nucleotides may comprise any suitable number of phosphates, for example three, four, five, six, or more than six phosphates. Nucleotide analogs also include locked nucleic acids (LNA), peptide nucleic acids (PNA), and 5-hydroxybutynyl-2'-deoxyuridine ("super T").
As used herein, the term "polynucleotide" refers to a molecule that comprises a sequence of nucleotides that are bonded to one another. Polynucleotides are one non-limiting example of polymers. Examples of polynucleotides include deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and analogs thereof, such as locked nucleic acids (LNA) and peptide nucleic acids (PNA). Polynucleotides may be single-stranded sequences of nucleotides, such as RNA or single-stranded DNA; a double-stranded sequence of nucleotides, such as double-stranded DNA; or may comprise a mixture of single- and double-stranded sequences of nucleotides. Double-stranded DNA (dsDNA) includes genomic DNA, and PCR and amplification products. Single-stranded DNA (ssDNA) may be converted to dsDNA and vice versa. Polynucleotides may include non-naturally occurring DNA, such as enantiomeric DNA, LNA, or PNA. The exact sequence of the nucleotides in the polynucleotide may be known or unknown. The following are examples of polynucleotides: a gene or gene fragment (e.g., probe, primer, expressed sequence tag (EST) or serial analysis of gene expression (SAGE) tag), genomic DNA fragment, exon, intron, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozyme, cDNA, recombinant polynucleotide, synthetic polynucleotide, branched polynucleotide, plasmid, vector, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probe, primer, or amplified copy of any of the foregoing.
As used herein, "polymerase" is intended to mean an enzyme having an active site that assembles a polynucleotide by polymerizing nucleotides into a polynucleotide. The polymerase may bind to a primed single-stranded target polynucleotide and may sequentially add nucleotides to the growing primer to form a "complementary copy" polynucleotide having a sequence complementary to the sequence of the target polynucleotide. Next, another polymerase or the same polymerase can form a copy of the target polynucleotide by forming a complementary copy of that complementary copy polynucleotide. Any of such copies may be referred to herein as "amplicons". A DNA polymerase can bind to the target polynucleotide and then move down the target polynucleotide, sequentially adding nucleotides to the free hydroxyl group at the 3' end of the growing polynucleotide strand (growing amplicon). DNA polymerases can synthesize complementary DNA molecules from DNA templates, and RNA polymerases can synthesize RNA molecules from DNA templates (transcription). The polymerase may use a short RNA or DNA strand (primer) to initiate strand growth. Some polymerases can displace the strand upstream of the site at which they are adding bases to a strand. Such polymerases may be referred to as strand displacing, meaning that they have an activity that removes the complementary strand from the template strand being read by the polymerase.
Exemplary polymerases include Bst DNA polymerase, 9°Nm DNA polymerase, Phi29 DNA polymerase, DNA polymerase I (E. coli), DNA polymerase I, Large (Klenow) Fragment, Klenow fragment (3'→5' exo-), T4 DNA polymerase, T7 DNA polymerase, Deep VentR™ (exo-) DNA polymerase, Deep VentR™ DNA polymerase, DyNAzyme™ EXT DNA, DyNAzyme™ II Hot Start DNA polymerase, Phusion™ High-Fidelity DNA polymerase, Therminator™ DNA polymerase, Therminator™ II DNA polymerase, VentR® DNA polymerase, VentR® (exo-) DNA polymerase, RepliPHI™ Phi29 DNA polymerase, rBst DNA polymerase, rBst DNA polymerase, Large Fragment (IsoTherm™ DNA polymerase), MasterAmp™ AmpliTherm™ DNA polymerase, Taq DNA polymerase, Tth DNA polymerase, Tfl DNA polymerase, Tgo DNA polymerase, SP6 DNA polymerase, Tbr DNA polymerase, DNA polymerase beta, and ThermoPhi DNA polymerase. In specific non-limiting examples, the polymerase is selected from Bst, Bsu, and Phi29. When the polymerase extends the hybridized strand, it may be beneficial to include a single-stranded binding protein (SSB). SSB can stabilize the displaced (non-template) strand. Exemplary polymerases with strand displacement activity include, but are not limited to, the large fragment of Bacillus stearothermophilus (Bst) polymerase, exo-Klenow polymerase, or sequencing grade T7 exo-polymerase. Some polymerases degrade the strand in front of them, effectively replacing it with the growing strand behind them (5' exonuclease activity). Some polymerases have an activity that degrades the strand behind them (3' exonuclease activity). Some useful polymerases have been mutated or otherwise modified to reduce or eliminate 3' and/or 5' exonuclease activity.
As used herein, the term "primer" is defined as a polynucleotide to which nucleotides can be added via a free 3' OH group. The primer may include a 3' blocking group that inhibits polymerization until the blocking group is removed. The primer may include a modification at the 5' end to allow a coupling reaction or to allow the primer to be coupled to another moiety. The primer may include one or more moieties, such as 8-oxo-G, that are cleavable under suitable conditions (such as UV light, chemistry, enzymes, etc.). The primer may be any suitable number of bases in length and may comprise any suitable combination of natural and non-natural nucleotides. The target polynucleotide may comprise an "amplification adaptor," or more simply an "adaptor," that hybridizes to (has a sequence complementary to) the primer, and may be amplified to produce a complementary copy polynucleotide (amplicon) by adding nucleotides to the free 3' OH group of the primer.
As used herein, the term "plurality" is intended to mean a population of two or more distinct members. Pluralities may range in size from small, medium, and large to very large. The size of a small plurality may range, for example, from a few members to tens of members. Medium-sized pluralities may range, for example, from tens of members to about 100 members or hundreds of members. Large pluralities may range, for example, from about hundreds of members to about 1000 members, to thousands of members, and up to tens of thousands of members. Very large pluralities may range, for example, from tens of thousands of members to about hundreds of thousands, one million, millions, tens of millions, and up to or exceeding hundreds of millions of members. Thus, a plurality may range in size from two to well over one hundred million members, as well as all sizes, as measured by the number of members, in between and beyond the above exemplary ranges. Exemplary pluralities of polynucleotides include, for example, populations of about 1×10⁵ or more, 5×10⁵ or more, or 1×10⁶ or more distinct polynucleotides. Thus, the definition of the term is intended to include all integer values greater than two. The upper limit of a plurality may be set, for example, by the theoretical diversity of polynucleotide sequences in the sample.
As used herein, the term "double-stranded" when used with reference to a polynucleotide is intended to mean that all or substantially all of the nucleotides in the polynucleotide hydrogen bond with corresponding nucleotides in a complementary polynucleotide. Double-stranded polynucleotides may also be referred to as "duplex". As used herein, the term "single stranded" when used with reference to a polynucleotide means that substantially none of the nucleotides in the polynucleotide hydrogen bond with the corresponding nucleotides in the complementary polynucleotide.
As used herein, the term "target polynucleotide" is intended to mean a polynucleotide that is the object of an analysis or action, and may also be referred to using terms such as "library polynucleotide", "template polynucleotide" or "library template". The analysis or action includes subjecting the polynucleotide to capture, amplification, sequencing and/or other procedures. The target polynucleotide may comprise nucleotide sequences other than the target sequence to be analyzed. For example, the target polynucleotide may comprise one or more adaptors, including amplification adaptors that serve as primer binding sites flanking the target polynucleotide sequence to be analyzed. A target polynucleotide hybridized to a capture primer may include nucleotides that extend beyond the 5' or 3' end of the capture oligonucleotide in such a way that not all of the target polynucleotide is amenable to extension. In particular examples, the target polynucleotides may have sequences that are different from each other, but may have first and second adaptors that are identical to each other. Two adaptors that may flank a particular target polynucleotide sequence may have sequences that are identical to each other, or complementary to each other, or the two adaptors may have different sequences. Thus, a species in a plurality of target polynucleotides may include a region of known sequence flanking a region of unknown sequence to be assessed by, for example, sequencing (e.g., SBS). In some examples, the target polynucleotide carries an amplification adaptor at a single end, and such an adaptor may be located at the 3' end or the 5' end of the target polynucleotide. The target polynucleotide may be used without any adaptors, in which case the primer binding sequences may be derived directly from sequences found in the target polynucleotide.
The terms "polynucleotide" and "oligonucleotide" are used interchangeably herein. Unless specifically indicated otherwise, the different terms are not intended to represent any particular difference in size, sequence, or other characteristic. For clarity of description, the terms may be used to distinguish one polynucleotide species from another polynucleotide species when describing a particular method or composition that includes several polynucleotide species.
In some cases, the terms "sequence" and "subsequence" are used interchangeably herein. For example, a sequence may include one or more subsequences therein. Each of such subsequences may also be referred to as a sequence.
As used herein, the term "amplicon" when used in reference to a polynucleotide is intended to mean a product of replicating the polynucleotide, wherein the product has a nucleotide sequence that is substantially identical to or substantially complementary to at least a portion of the nucleotide sequence of the polynucleotide. "amplification" refers to the process of making an amplicon of a polynucleotide. The first amplicon of the target polynucleotide may be a complementary copy. The additional amplicon is a copy produced from the target polynucleotide or from the first amplicon after the first amplicon is produced. The subsequent amplicon may have a sequence that is substantially complementary to or substantially identical to the target polynucleotide. It will be appreciated that when an amplicon of a polynucleotide is produced, a small amount of mutation of the polynucleotide may occur (e.g., due to amplification artifacts).
As used herein, the term "complex" is intended to mean an element comprising two or more elements having different functional properties from each other.
As used herein, the terms "fusion protein" and "chimeric protein" are intended to mean an element comprising two or more polypeptide domains that have different functional properties (such as different enzymatic activities) from each other. These domains may be coupled to each other covalently or non-covalently. The fusion protein may optionally include a third polypeptide domain, a fourth polypeptide domain, a fifth polypeptide domain, or other polypeptide domains, operably linked to one or more of the other polypeptide domains. The fusion protein may comprise multiple copies of the same polypeptide domain. The fusion protein may also or alternatively comprise one or more mutations in one or more of the polypeptides. Fusion proteins may include one or more non-protein elements, such as polynucleotides and/or adaptors that couple domains to each other. Fusion proteins can be formed by combining gene sequences from different proteins into a single gene encoding those proteins. In one non-limiting, purely illustrative example, Tn5 is a fusion protein with protein A when both domains are expressed together from a single gene.
As used herein, terms such as "CRISPR-Cas system", "Cas-gRNA ribonucleoprotein" and "Cas-gRNA RNP" refer to an enzyme system that includes a Cas protein and a guide RNA (gRNA) sequence that includes an oligonucleotide sequence that is complementary or substantially complementary to a sequence within a target polynucleotide. CRISPR-Cas systems can generally be classified into three main types, which are further subdivided into ten subtypes, based on core element content and sequence; see, e.g., Makarova et al., "Evolution and classification of the CRISPR-Cas systems", Nat Rev Microbiol., volume 9, issue 6: pages 467-477, 2011. Cas proteins may have a variety of activities, such as nuclease activity. Thus, CRISPR-Cas systems provide a mechanism for targeting specific sequences (e.g., via the gRNA) as well as certain enzymatic activities on those sequences (e.g., via the Cas protein).
The type I CRISPR-Cas system can include a Cas3 protein with separate helicase and DNase activities. For example, in a type I-E system, crRNA is incorporated into a multi-subunit effector complex called Cascade (CRISPR-associated complex for antiviral defense) that binds to the target DNA and triggers degradation by the Cas3 protein; see, e.g., Brouns et al., "Small CRISPR RNAs guide antiviral defense in prokaryotes," Science 321 (5891): 960-964 (2008); Sinkunas et al., "Cas3 is a single-stranded DNA nuclease and ATP-dependent helicase in the CRISPR-Cas immune system," EMBO J 30:1335-1342 (2011); and Beloglazova et al., "Structure and activity of the Cas3 HD nuclease MJ0384, an effector enzyme of the CRISPR interference," EMBO J 30:4616-4627 (2011). The type II CRISPR-Cas system comprises a characteristic Cas9 protein, a single protein (about 160 kDa) capable of producing crRNA and cleaving target DNA. Cas9 proteins typically include two nuclease domains, a RuvC-like nuclease domain near the amino terminus and an HNH (or McrA-like) nuclease domain near the middle of the protein. Each nuclease domain of the Cas9 protein is dedicated to cleaving one strand of the duplex; see, e.g., Jinek et al., "A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity," Science 337 (6096): 816-821 (2012). Type III CRISPR-Cas systems include a polymerase and a RAMP module. Type III systems can be further divided into subtypes III-A and III-B. The type III-A CRISPR-Cas system has been shown to target plasmids, and the polymerase-like protein of the type III-A system is involved in cleavage of target DNA; see, e.g., Marraffini et al., "CRISPR interference limits horizontal gene transfer in Staphylococci by targeting DNA," Science, volume 322, issue 5909: pages 1843-1845, 2008. Type III-B CRISPR-Cas systems have also been shown to target RNA; see, e.g., Hale et al., "RNA-guided RNA cleavage by a CRISPR RNA-Cas protein complex," Cell, volume 139, issue 5: pages 945-956, 2009.
The CRISPR-Cas system includes engineered and/or programmed nuclease systems derived from naturally occurring CRISPR-Cas systems. The CRISPR-Cas system may comprise engineered and/or mutated Cas proteins. The CRISPR-Cas system may include engineered and/or programmed guide RNAs.
In some specific examples, a Cas protein in one of the Cas-gRNA RNPs of the invention can include Cas9 or another suitable Cas that can cleave a target polynucleotide at a sequence complementary to the gRNA, in a manner such as described in the following references, the entire contents of each of which are incorporated herein by reference: Nachmanson et al., "Targeted genome fragmentation with CRISPR/Cas9 enables fast and efficient enrichment of small genomic regions and ultra-accurate sequencing with low DNA input (CRISPR-DS)", Genome Res., volume 28, issue 10: pages 1589-1599, 2018; Vakulskas et al., "A high-fidelity Cas9 mutant delivered as a ribonucleoprotein complex enables efficient gene editing in human hematopoietic stem and progenitor cells", Nature Medicine, volume 24: pages 1216-1224, 2018; Chatterjee et al., "Minimal PAM specificity of a highly similar SpCas9 ortholog", Science Advances, volume 4, issue 10: eaau0766, pages 1-10, 2018; Lee et al., "CRISPR-Cap: multiplexed double-stranded DNA enrichment based on the CRISPR system", Nucleic Acids Research, volume 47, issue 1: pages 1-13, 2019. Cas9-crRNA complexes isolated from the Streptococcus thermophilus (S. thermophilus) CRISPR-Cas system, as well as complexes assembled in vitro from separate components, have been shown to bind both synthetic oligodeoxynucleotides and plasmid DNA carrying a nucleotide sequence complementary to the crRNA. Cas9 has been shown to have two nuclease domains, the RuvC- and HNH-active sites/nuclease domains, and these two nuclease domains are responsible for cleaving opposite DNA strands. In some examples, the Cas9 protein is derived from the Cas9 protein of the Streptococcus thermophilus CRISPR-Cas system. In some examples, the Cas9 protein is a multidomain protein having about 1,409 amino acid residues.
Some Cas9 proteins can target single-stranded DNA, for example in a manner such as described in Ma et al., "Single-stranded DNA Cleavage by Divergent CRISPR-Cas9 Enzymes," Molecular Cell, volume 60, issue 3: pages 398-407, 2016, the entire contents of which are incorporated herein by reference.
In other examples, the Cas may be engineered so as not to cleave the target polynucleotide at the sequence complementary to the gRNA, for example, in a manner such as described in the following references, the entire contents of each of which are incorporated herein by reference: Guilinger et al., "Fusion of catalytically inactive Cas9 to FokI nuclease improves the specificity of genome modification", Nature Biotechnology, volume 32: pages 577-582, 2014; Bhatt et al., "Targeted DNA transposition using a dCas9-transposase fusion protein", https://doi.org/10.1101/571653, pages 1-89, 2019; Xu et al., "CRISPR-assisted targeted enrichment-sequencing (CATE-seq)", available from URL www.biorxiv.org/content/10.1101/672816v1, pages 1-30, 2019; Tijan et al., "dCas9-targeted locus-specific protein isolation method identifies histone gene regulators", PNAS, volume 115, issue 12: E2734-E2741, 2018. A Cas lacking nuclease activity may be referred to as an inactive Cas (dCas). In some examples, the dCas can include a nuclease-free variant of the Cas9 protein in which both the RuvC- and HNH-active site/nuclease domains are mutated. The nuclease-free variant of the Cas9 protein (dCas9) binds double-stranded DNA, but does not cleave the DNA. Another variant of the Cas9 protein has two inactivated nuclease domains, with a first mutation in the domain that cleaves the strand complementary to the crRNA and a second mutation in the domain that cleaves the strand not complementary to the crRNA. In some examples, the Cas9 protein has a first mutation D10A and a second mutation H840A. In examples where the target polynucleotide is RNA, dCas13 or rCas9 lacking nuclease activity can be used to bind the target polynucleotide at a sequence complementary to the gRNA. For further details regarding dCas13, see Yang et al., "Dynamic imaging of RNA in living cells by CRISPR-Cas13 systems," Molecular Cell, volume 76, issue 6: pages 981-997.e7, 2019, the entire contents of which are incorporated herein by reference.
For further details regarding rCas9, see Nelles et al., "Programmable RNA tracking in live cells with CRISPR/Cas9," Cell, volume 165: pages 488-496, 2016, the entire contents of which are incorporated herein by reference.
In other examples, the Cas protein comprises a Cascade protein. The Cascade complex in Escherichia coli (E. coli) recognizes double-stranded DNA (dsDNA) targets in a sequence-specific manner. The E. coli Cascade complex is a 405-kDa complex comprising five functionally essential CRISPR-associated (Cas) proteins (CasA1B2C6D1E1, also known as Cascade proteins) and a 61-nucleotide crRNA. The crRNA directs the Cascade complex to the dsDNA target sequence by forming base pairs with the complementary DNA strand while displacing the non-complementary strand to form an R-loop. Cascade recognizes the target DNA without consuming ATP, indicating that continuous surveillance for invader DNA occurs without energy input; see, for example, Matthijs et al., "Structural basis for CRISPR RNA-guided DNA recognition by Cascade", Nature Structural & Molecular Biology, volume 18, issue 5: pages 529-536, 2011. In other examples, the Cas protein comprises a Cas3 protein. Illustratively, E. coli Cas3 can catalyze the ATP-independent annealing of RNA with DNA to form R-loops, that is, hybrids in which RNA is base-paired into double-stranded DNA. Cas3 proteins may use longer gRNAs than Cas9; see, e.g., Howard et al., "Helicase dissociation and annealing of RNA-DNA hybrids by Escherichia coli Cas3 protein", Biochem J., volume 439, issue 1: pages 85-95, 2011. Such longer gRNAs may allow other elements to more easily access the target DNA, e.g., access by a primer to be extended by a polymerase. Another feature provided by Cas3 proteins is that, unlike Cas9, Cas3 proteins do not require a PAM sequence, thus providing greater flexibility for targeting desired sequences. R-loop formation by Cas3 can utilize magnesium as a cofactor; see, e.g., Howard et al., "Helicase dissociation and annealing of RNA-DNA hybrids by Escherichia coli Cas3 protein", Biochem J., volume 439, issue 1: pages 85-95, 2011.
Cas9 variants have also been developed that reduce or avoid the need for PAM sequences; see, e.g., Walton et al., "Unconstrained genome targeting with near-PAMless engineered CRISPR-Cas9 variants", Science, volume 368, issue 6488: pages 290-296, 2020, the entire contents of which are incorporated herein by reference. It is to be understood that any suitable cofactor, such as cations, may be used with Cas proteins used in the compositions and methods of the invention.
It should also be understood that any CRISPR-Cas system capable of disrupting double-stranded polynucleotides and generating loop structures may be used. For example, Cas proteins may include, but are not limited to, Cas proteins such as described in the following references, the entire contents of each of which are incorporated herein by reference: Haft et al., "A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes," PLoS Comput Biol., volume 1, issue 6: e60, pages 1-10, 2005; Zhang et al., "Expanding the catalog of cas genes with metagenomes," Nucleic Acids Res., volume 42, issue 4: pages 2448-2459, 2013; Strecker et al., "RNA-guided DNA insertion with CRISPR-associated transposases", Science, volume 365, issue 6448: pages 48-53, 2019, wherein the Cas protein may include Cas12k. Some of these CRISPR-Cas systems can utilize specific sequences to recognize and bind target sequences. For example, Cas9 may utilize the presence of a 5'-NGG protospacer adjacent motif (PAM).
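In one non-limiting, purely illustrative example, candidate Cas9 target sites may be located computationally by scanning a sequence for positions that are followed by an NGG motif. The sketch below is offered only as an illustration: the function name `find_ngg_sites`, the default protospacer length of 20 nucleotides, and the restriction to the given (forward) strand are assumptions made for the example and are not taken from this disclosure.

```python
import re


def find_ngg_sites(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for each site followed by an NGG PAM.

    Only the strand supplied in `seq` is scanned; the reverse strand would
    need to be scanned separately. `protospacer_len` is an assumed parameter.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    for start, protospacer, pam in find_ngg_sites("ACGTAGGCGG", protospacer_len=4):
        print(start, protospacer, pam)
```

A short protospacer length is used in the usage lines only to keep the demonstration sequence small; other suitable lengths may readily be substituted.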
The CRISPR-Cas system may also include engineered and/or programmed guide RNAs (gRNAs). As used herein, the terms "guide RNA" and "gRNA" (sometimes referred to in the art as single guide RNA, or sgRNA) are intended to mean an RNA that includes a sequence that is complementary or substantially complementary to a region of a target DNA sequence and that directs a Cas protein to that region. The guide RNA may include nucleotide sequences other than those complementary or substantially complementary to the region of the target DNA sequence. Methods for designing gRNAs are well known in the art, and non-limiting examples are provided in the following references, the entire contents of each of which are incorporated herein by reference: Stevens et al., "A novel CRISPR/Cas9 associated technology for sequence-specific nucleic acid enrichment," PLoS ONE 14 (4): e0215441, pages 1-7 (2019); Fu et al., "Improving CRISPR-Cas nuclease specificity using truncated guide RNAs," Nature Biotechnology 32 (3): 279-284 (2014); Kocak et al., "Increasing the specificity of CRISPR systems with engineered RNA secondary structures," Nature Biotechnology 37:657-666 (2019); Lee et al., "CRISPR-Cap: multiplexed double-stranded DNA enrichment based on the CRISPR system," Nucleic Acids Research 47 (1): e1, 1-13 (2019); Quan et al., "FLASH: a next-generation CRISPR diagnostic for multiplexed detection of antimicrobial resistance sequences," Nucleic Acids Research 47 (14): e83, 1-9 (2019); and Xu et al., "CRISPR-assisted targeted enrichment-sequencing (CATE-seq)," https://doi.org/10.1101/672816, 1-30 (2019).
In some examples, the gRNA includes a chimera, such as a CRISPR RNA (crRNA) fused to a trans-activating CRISPR RNA (tracrRNA). Such chimeric single guide RNAs (sgRNAs) are described in Jinek et al., "A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity," Science, volume 337, issue 6096: pages 816-821, 2012. A Cas protein can be directed by a chimeric sgRNA to any locus that is followed by a 5'-NGG protospacer adjacent motif (PAM). In one non-limiting example, the crRNA and tracrRNA can be synthesized by in vitro transcription using a synthetic double-stranded DNA template comprising a T7 promoter. The tracrRNA may have a fixed sequence, while the target sequence may determine a portion of the crRNA sequence. The crRNA and tracrRNA may be mixed at equimolar concentrations and heated at 55 °C for 30 seconds. Cas9 can be added at 37 °C at the same molar concentration and incubated with the RNA mixture for 10 minutes. The resulting Cas9-gRNA RNP can then be added to the target DNA in a 10- to 20-fold molar excess. The binding reaction may occur within 15 minutes. Other suitable reaction conditions may be readily employed.
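In one non-limiting, purely illustrative example, the molar amounts implied by the assembly conditions above may be worked out as follows. The sketch assumes amounts expressed in picomoles; the names `RnpPlan` and `plan_rnp_assembly` are hypothetical and introduced only for illustration. The temperature of the binding step is not stated above and is therefore left unspecified.

```python
from dataclasses import dataclass, field


@dataclass
class RnpPlan:
    crrna_pmol: float
    tracrrna_pmol: float
    cas9_pmol: float
    # Each step is (description, temperature in degrees C or None, duration in seconds).
    steps: list = field(default_factory=list)


def plan_rnp_assembly(target_dna_pmol, fold_excess=10.0):
    """Compute equimolar crRNA, tracrRNA and Cas9 amounts for a given RNP excess."""
    if target_dna_pmol <= 0 or fold_excess <= 0:
        raise ValueError("amounts must be positive")
    # The RNP is added in molar excess over the target DNA (10- to 20-fold above).
    rnp_pmol = target_dna_pmol * fold_excess
    steps = [
        ("anneal crRNA + tracrRNA", 55, 30),
        ("add Cas9 and form RNP", 37, 10 * 60),
        ("bind RNP to target DNA", None, 15 * 60),  # temperature not stated
    ]
    # crRNA, tracrRNA and Cas9 are each used at the same molar amount.
    return RnpPlan(rnp_pmol, rnp_pmol, rnp_pmol, steps)


if __name__ == "__main__":
    plan = plan_rnp_assembly(target_dna_pmol=0.5, fold_excess=20)
    print(plan.cas9_pmol, sum(duration for _, _, duration in plan.steps))
```

Because the three components are equimolar, a single quantity (the desired amount of RNP) determines all three inputs; other suitable reaction conditions may be substituted by changing the step list.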
As used herein, the term "transposase" is intended to mean an enzyme capable of coupling an oligonucleotide to a double-stranded polynucleotide under certain conditions. The oligonucleotide comprises at least one mosaic end (ME) sequence, which may also be referred to as a transposon end (TE). "Transposome" or "transposition system" is intended to mean a transposase coupled to a corresponding oligonucleotide comprising at least an ME sequence. For example, a combination of a transposase and a transposon end may be referred to as a "transposome". The transposome may be activated under certain conditions to cleave a double-stranded polynucleotide and couple the oligonucleotide to the cleaved end. For example, a transposome and a double-stranded polynucleotide may form a "transposition complex," wherein the transposome inserts the oligonucleotide into the double-stranded polynucleotide. In some examples, the transposome may perform a process that may be referred to as "tagmentation" (or "tagging") or "transposition" that results in fragmentation of the target polynucleotide and ligation of adaptors to the 5' ends, or to both the 5' ends and 3' ends, of both strands of the double-stranded DNA fragments, for example in a manner such as described in US 2010/0120098 or WO 2010/048605, the entire contents of each of which are incorporated herein by reference.
One non-limiting example of a transposase is Tn5. Another non-limiting example of a transposase is Tn3. Another non-limiting example of a transposase is Mu. In further examples, the transposase may include an integrase from a retrotransposon or a retrovirus. Other examples of known transposition complexes (or components thereof) that can be used in the methods of the invention include, but are not limited to: Staphylococcus aureus Tn552, Ty1, transposon Tn7, Tn10 and IS10, Mariner transposase, Tc1, P element, Tn3, bacterial insertion sequences, retroviruses, and retrotransposons of yeast (see, e.g., Colegio et al., 2001, J. Bacteriol., 183:2384-2388; Kirby et al., 2002, Mol. Microbiol., 43:173-186; Devine and Boeke, 1994, Nucleic Acids Res., 22:3765-3772; International patent application WO 95/23875; Craig, 1996, Science, 271:1512; Craig, 1996, Curr Top Microbiol Immunol, 204:27-48; Kleckner et al., 1996, Curr Top Microbiol Immunol, 204:49-82; Lampe et al., 1996, EMBO J, 15:5470-5479; Plasterk, 1996, Curr Top Microbiol Immunol, 204:125-143; Gloor, 2004, Methods Mol. Biol., 260:97-114; Ichikawa and Ohtsubo, 1990, J Biol Chem., 265:18829-18832; Ohtsubo et al., 1996, Curr. Top. Microbiol. Immunol., 204:1-26; Brown et al., 1989, Proc Natl Acad Sci USA, 86:2525-2529; and Boeke et al., 1989, Annu Rev Microbiol., 43:403-434). Still other exemplary transposition systems include, but are not limited to: those formed by a hyperactive Tn5 transposase and Tn5-type transposon ends, or by a MuA transposase and a Mu transposon end comprising an R1 end sequence and an R2 end sequence. See, for example, the following references, the entire contents of each of which are incorporated herein by reference: Goryshin et al., "Tn5 in vitro transposition", J. Biol. Chem. 273:7367-7394 (1998); Mizuuchi, "In vitro transposition of bacteriophage Mu: a biochemical approach to a novel replication reaction," Cell, volume 35 (3 pt 2): pages 785-794, 1983; and Savilahti et al., "The phage Mu transposomes core: DNA requirements for assembly and function," EMBO J., volume 14, issue 19: pages 4893-4903, 1995. The transposase may be mutated and/or the ME sequence may be altered to modulate transposome activity, for example in a manner such as described in Reznikoff, "Tn5 as a model for understanding DNA transposition," Mol. Microbiol., volume 47, issue 5: pages 1199-1206, 2003, the entire contents of which are incorporated herein by reference.
Additional examples of transposases and other suitable transposition systems include Staphylococcus aureus Tn552 (see, e.g., Colegio et al., "In vitro transposition system for efficient generation of random mutants of Campylobacter jejuni," J Bacteriol., volume 183: pages 2384-2388, 2001 and Kirby et al., "Cryptic plasmids of Mycobacterium avium: Tn552 to the rescue," Mol Microbiol., volume 43, issue 1: pages 173-186, 2002); Ty1 (Devine et al., "Efficient integration of artificial transposons into plasmid targets in vitro: a useful tool for DNA mapping, sequencing and genetic analysis", Nucleic Acids Res., volume 22, issue 18: pages 3765-3772, 1994 and International patent application WO 95/23875); transposon Tn7 (Craig, "V(D)J recombination and transposition: closer than expected", Science, volume 271, issue 5255: page 1512, 1996 and Craig, review in Curr Top Microbiol Immunol, volume 204: pages 27-48, 1996); Tn10 and IS10 (Kleckner et al., Curr Top Microbiol Immunol, volume 204: pages 49-82, 1996); Mariner transposase (Lampe et al., "A purified mariner transposase is sufficient to mediate transposition in vitro", EMBO J, volume 15, issue 19: pages 5470-5479, 1996); Tc1 (Plasterk, Curr Top Microbiol Immunol, volume 204: pages 125-143, 1996); P element (Gloor, "Gene targeting in Drosophila", Methods Mol Biol, volume 260: pages 97-114, 2004); Tn3 (Ichikawa et al., "In vitro transposition of transposon Tn3", J Biol Chem., volume 265, issue 31: pages 18829-18832, 1990); bacterial insertion sequences (Ohtsubo et al., "Bacterial insertion sequences", Curr. Top. Microbiol. Immunol., volume 204: pages 1-26, 1996); retroviruses (Brown et al., "Retroviral integration: structure of the initial covalent product and its precursor, and a role for the viral IN protein", Proc Natl Acad Sci USA, volume 86: pages 2525-2529, 1989); and retrotransposons of yeast (Boeke et al., "Transcription and reverse transcription of retrotransposons", Annu Rev Microbiol., volume 43: pages 403-434, 1989).
Transposases, transposomes, ME sequences, transposons, transposition systems, and transposition complexes are generally known to those skilled in the art, as exemplified by the disclosure of US 2010/0120098, the entire contents of which are incorporated herein by reference.
Some transposomes may include transposase monomers. For example, a single-unit (monomer) Tn3 transposase can bind two target sequences simultaneously and change conformation to form a transposome, e.g., in a manner such as described in Nicolas et al., "Unlocking Tn3-family transposase activity in vitro unveils an asymetric pathway for transposome assembly," PNAS, volume 114, issue 5: pages E669-E678, 2017, the entire contents of which are incorporated herein by reference. Some transposomes may include transposase dimers. For example, Tn5 transposase may dimerize, e.g., in a manner such as described in Naumann et al., "Trans catalysis in Tn5 transposition," PNAS, volume 97, issue 16: pages 8944-8949, 2000, the entire contents of which are incorporated herein by reference. Some transposomes may include transposase tetramers. For example, Mu transposase may form tetramers in a manner such as described in Harshey, "Transposable phage Mu," Microbiol Spectr. 2 (5): MDNA3-0007-2014 (22 pages), 2014, and in Lamberg et al., "Efficient insertion mutagenesis strategy for bacterial genomes involving electroporation of in vitro-assembled DNA transposition complexes of bacteriophage Mu," Appl Environ Microbiol., volume 68, issue 2: pages 705-712, 2002, the entire contents of each of which are incorporated herein by reference.
In the context of polypeptides, the terms "variant" and "derivative" as used herein refer to a polypeptide comprising an amino acid sequence of a polypeptide, or of a polypeptide fragment, that has been altered by the introduction of amino acid residue substitutions, deletions or additions. A variant or derivative of a polypeptide may be a fusion protein comprising a portion of the amino acid sequence of the polypeptide. As used herein, the term "variant" or "derivative" also refers to a polypeptide or polypeptide fragment that has been chemically modified, for example, by covalently attaching any type of molecule to the polypeptide. For example, but not by way of limitation, a polypeptide or polypeptide fragment may be chemically modified, such as by glycosylation, acetylation, pegylation, phosphorylation, methylation, nitrosylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, attachment to a cellular ligand or other protein, and the like. Variants or derivatives are modified in the type or position of the attached molecule in a manner different from the naturally occurring or starting peptide or polypeptide. Variants or derivatives further include the deletion of one or more chemical groups naturally occurring on the peptide or polypeptide. Variants or derivatives of the polypeptide or polypeptide fragment may be chemically modified using techniques known to those skilled in the art, including, but not limited to, specific chemical cleavage, acetylation, formylation, metabolic synthesis of tunicamycin, and the like. In addition, variants or derivatives of the polypeptide or polypeptide fragment may comprise one or more atypical amino acids. The polypeptide variants or derivatives may have similar or identical functions to the polypeptides or polypeptide fragments described herein.
The polypeptide variants or derivatives may have additional or different functions compared to the polypeptides or polypeptide fragments described herein.
As used herein, the term "sequencing" is intended to mean determining the sequence of a polynucleotide. Sequencing may include one or more of sequencing-by-synthesis, bridge PCR, chain termination sequencing, sequencing-by-hybridization, nanopore sequencing, and sequencing-by-ligation.
As used herein, "selective" for an element is intended to mean coupled to that element and not to a different element. For example, an antibody selective for a protein may be coupled to that protein rather than to a different protein.
As used herein, the terms "unique molecular identifier" and "UMI" are intended to mean an oligonucleotide that can be coupled to a polynucleotide and through which that polynucleotide can be identified. For example, a set of different UMIs may be coupled to a plurality of different polynucleotides, and each of these polynucleotides may be identified using the particular UMI coupled to that polynucleotide. One example of a UMI is a "barcode".
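In one non-limiting, purely illustrative example, reads may be grouped by the UMI coupled to each polynucleotide so that reads derived from the same original molecule can be identified together. The sketch below assumes, solely for illustration, that the UMI occupies the first `umi_len` bases of each read; that layout, the default length of 8, and the function name `group_reads_by_umi` are hypothetical and are not taken from this disclosure.

```python
from collections import defaultdict


def group_reads_by_umi(reads, umi_len=8):
    """Map each UMI to the list of insert sequences that carry it.

    Assumes (for illustration only) that each read is UMI + insert.
    """
    groups = defaultdict(list)
    for read in reads:
        if len(read) < umi_len:
            raise ValueError("read is shorter than the UMI length")
        umi, insert = read[:umi_len], read[umi_len:]
        groups[umi].append(insert)
    return dict(groups)


if __name__ == "__main__":
    reads = ["AAAACCGT", "AAAATTGA", "GGGGCCGT"]
    for umi, inserts in group_reads_by_umi(reads, umi_len=4).items():
        print(umi, len(inserts), inserts)
```

Counting the distinct UMIs, rather than the reads, gives an estimate of the number of original molecules that is not inflated by amplification duplicates.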
As used herein, the term "whole genome" or "WG" of a species is intended to mean a set of one or more polynucleotides that together provide the majority of the polynucleotides used in the cellular processes of the species. The whole genome of a species may comprise any suitable combination of chromosomal DNA and/or mitochondrial DNA of the species, and in the case of plant species may comprise DNA contained in chloroplasts. Together, the set of one or more polynucleotides may provide at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90%, or at least about 95%, or at least about 98%, or at least about 99% of the polynucleotides used by the cellular processes of the species.
As used herein, the term "fragment" is intended to mean a portion of a polynucleotide. For example, a polynucleotide may be a total number of bases in length, and a fragment of the polynucleotide may be less than the total number of bases in length.
As used herein, the term "sample" is intended to mean a volume of fluid comprising one or more polynucleotides. The polynucleotides in the sample may comprise the whole genome, or may comprise only a portion of the whole genome. The sample may comprise polynucleotides from a single species or from multiple species.
As used herein, the term "antibody" encompasses monoclonal antibodies (including full length monoclonal antibodies), polyclonal antibodies, multispecific antibodies (e.g., bispecific antibodies), and antibody fragments so long as they exhibit the desired biological activity of binding to the target antigen site and its target isotype. For example, an antibody may selectively bind to a target protein, such as a protein at a locus of a polynucleotide, and may not bind to any other target protein. As another example, the first antibody may selectively bind to a portion of the second antibody. A different set of antibodies may also include that portion, and thus, the first antibody may selectively bind that portion of each of these antibodies, and may not bind any other portion of those antibodies or any other protein. The term "antibody fragment" includes a portion of a full length antibody, typically the antigen binding or variable region of a full length antibody. As used herein, the term "antibody" encompasses any antibody derived from any species and resource, including but not limited to human antibodies, rat antibodies, mouse antibodies, rabbit antibodies, and the like, and may be synthetically prepared or naturally occurring.
As used herein, the term "monoclonal antibody" refers to an antibody obtained from a substantially homogeneous population of antibodies. That is, the individual antibodies comprising the population are identical, except for possible naturally occurring mutations that may be present in minor amounts. Monoclonal antibodies are highly specific, being directed against a single antigenic site. Furthermore, in contrast to conventional (polyclonal) antibody preparations, which typically include different antibodies directed against different determinants (epitopes), each monoclonal antibody is directed against a single determinant on the antigen. "Monoclonal antibodies" can also be isolated from phage antibody libraries using techniques known in the art. The term monoclonal antibody as used herein may include "chimeric" antibodies (immunoglobulins) in which a portion of the heavy and/or light chain is identical or homologous to corresponding sequences in antibodies derived from a particular species or belonging to a particular antibody class or subclass, while the remainder of the chain is identical or homologous to corresponding sequences in antibodies derived from another species or belonging to another antibody class or subclass, as well as fragments of such antibodies, so long as they exhibit the desired biological activity.
As used herein, terms such as "target-specific" and "selective" when used in reference to a polynucleotide are intended to mean a polynucleotide that includes sequences that are specific for (substantially complementary to and hybridizable to) sequences within another polynucleotide. As used herein, terms such as "target-specific" and "selective" when used in reference to an antibody are intended to mean an antibody that includes features specific for (coupled to) a particular type of target protein and not coupled to any other type of protein.
As used herein, the terms "complementary" and "substantially complementary," when used in reference to a polynucleotide, are intended to mean that the polynucleotide includes sequences that are capable of selectively hybridizing under certain conditions to sequences in another polynucleotide.
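In one non-limiting, purely illustrative example, complementarity between two sequences of equal length may be assessed by comparing one sequence with the reverse complement of the other. The sketch below covers DNA bases only, and the numerical cutoff `threshold=0.9` used for "substantially complementary" is a hypothetical value chosen for the example; this disclosure does not specify such a cutoff, and actual hybridization depends on the conditions used.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def fraction_complementary(a, b):
    """Fraction of positions at which `a` pairs with `b` in antiparallel alignment."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    partner = reverse_complement(b).upper()
    matches = sum(x == y for x, y in zip(a.upper(), partner))
    return matches / len(a)


def is_substantially_complementary(a, b, threshold=0.9):
    """Apply an assumed, illustrative cutoff to the paired fraction."""
    return fraction_complementary(a, b) >= threshold


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(fraction_complementary("ATGC", "GCAT"))
    print(is_substantially_complementary("ATGC", "GCAA"))
```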
As used herein, terms such as "amplification" refer to the use of any suitable amplification method to produce an amplicon of a polynucleotide. Polymerase chain reaction (PCR) is one non-limiting amplification method. Other suitable amplification methods known in the art include, but are not limited to, rolling circle amplification; riboprimer amplification (e.g., as described in U.S. Pat. No. 7,413,857); ICAN; UCAN; Ribo-SPIA; terminal tagging (e.g., as described in U.S. 2005/0153333); and Eberwine-type aRNA amplification or strand displacement amplification. Further, non-limiting examples of amplification methods are described in the following patents: WO 02/16639; WO 00/56877; AU 00/29742; U.S.5,523,204; U.S.5,536,649; U.S.5,624,825; U.S.5,631,147; U.S.5,648,211; U.S.5,733,752; U.S.5,744,311; U.S.5,756,702; U.S.5,916,779; U.S.6,238,868; U.S.6,309,833; U.S.6,326,173; U.S.5,849,547; U.S.5,874,260; U.S.6,218,151; U.S.5,786,183; U.S.6,087,133; U.S.6,214,587; U.S.6,063,604; U.S.6,251,639; U.S.6,410,278; WO 00/28082; U.S.5,591,609; U.S.5,614,389; U.S.5,773,733; U.S.5,834,202; U.S.6,448,017; U.S.6,124,120; and U.S.6,280,949.
As used herein, the terms "polymerase chain reaction" and "PCR" refer to procedures in which small amounts of polynucleotides (e.g., RNA and/or DNA) are amplified. Typically, amplification primers are coupled to the polynucleotide for use during PCR. See, for example, the following references, the entire contents of which are incorporated herein by reference: U.S. Pat. No. 4,683,195 to Mullis; Mullis et al., Cold Spring Harbor Symp. Quant. Biol., vol. 51, p. 263 (1987); and Erlich, ed., PCR Technology (Stockton Press, NY, 1989). As known to those skilled in the art, a variety of enzymes and kits can be used to perform PCR. For example, in some examples, PCR amplification is performed using the FAILSAFE™ PCR System or the MASTERAMP™ Extra-Long PCR System from EPICENTRE Biotechnologies (Madison, Wis.), as described by the manufacturer.
As used herein, the term "chromatin" means a structure in which DNA is condensed into a chromosome along with one or more proteins (such as histones). More tightly-condensed chromatin may be referred to as heterochromatin, while more loosely-condensed chromatin may be referred to as euchromatin.
As used herein, the term "protein" is intended to refer to a polypeptide chain that is folded into a tertiary structure. Proteins coupled to DNA may be referred to as "epigenetic" modifications to DNA, and thus "epigenetic" assays may refer herein to assays that identify which proteins are bound to corresponding DNA loci. It may be desirable to determine which proteins (such as euchromatin proteins) are coupled to DNA, and the corresponding loci of such proteins, as such proteins may be transcriptionally active and therefore of interest.
As used herein, the term "locus" refers to the location along a polynucleotide at which a corresponding element (such as a protein) is present.
As used herein, the term "substrate" refers to a material that serves as a support for the compositions described herein. Exemplary substrate materials may include glass, silica, plastic, quartz, metal, metal oxide, organosilicates (e.g., polyhedral organic silsesquioxanes (POSS)), polyacrylates, tantalum oxide, complementary metal oxide semiconductor (CMOS), or combinations thereof. An example of a POSS may be that described by Kehagias et al., Microelectronic Engineering 86 (2009), pp. 776-778, which is incorporated by reference in its entirety. In some examples, the substrate used in the present application comprises a silica-based substrate, such as glass, fused silica, or other silica-containing materials. In some examples, the silica-based substrate may include silicon, silicon dioxide, silicon nitride, or silane. In some examples, the substrates used in the present application comprise plastic materials or components such as polyethylene, polystyrene, poly(vinyl chloride), polypropylene, nylon, polyester, polycarbonate, and poly(methyl methacrylate). Exemplary plastic materials include poly(methyl methacrylate), polystyrene, and cyclic olefin polymer substrates. In some examples, the substrate is or comprises a silica-based material or a plastic material or a combination thereof. In certain examples, the substrate has at least one surface comprising glass or a silicon-based polymer. In some examples, the substrate may include a metal. In some such examples, the metal is gold. In some examples, the substrate has at least one surface comprising a metal oxide. In one example, the surface comprises tantalum oxide or tin oxide. Acrylamides, enones, or acrylates may also be used as a substrate material or component. Other substrate materials may include, but are not limited to, gallium arsenide, indium phosphide, aluminum, ceramics, polyimides, quartz, resins, polymers, and copolymers. 
In some examples, the substrate and/or substrate surface may be or include quartz. In some other examples, the substrate and/or substrate surface may be or include a semiconductor, such as GaAs or ITO. The foregoing list is intended to be illustrative of, but not limiting of, the present application. The substrate may comprise a single material or a plurality of different materials. The substrate may be a composite or laminate. In some examples, the substrate includes an organosilicate material.
The substrate may be flat, circular, spherical, rod-like, or any other suitable shape. The substrate may be rigid or flexible. In some examples, the substrate is a bead or flow cell.
The substrate may be unpatterned, textured or patterned on one or more surfaces of the substrate. In some examples, the substrate is patterned. Such patterns may include pillars, pads, holes, ridges, channels, or other three-dimensional concave or convex structures. The pattern may be regular or irregular across the substrate surface. For example, the pattern may be formed by nanoimprint lithography or by using, for example, metal pads that form features on a non-metallic surface.
In some examples, the substrate described herein forms at least a portion of, is located in, or is coupled to a flow cell. The flow cell may comprise a flow chamber divided into a plurality of lanes or a plurality of partitions. Exemplary flow cells and substrates for use in the methods and compositions set forth herein include, but are not limited to, those commercially available from Illumina, Inc. (San Diego, CA).
As used herein, the term "post-translational modification" (PTM) refers to modification of a protein after its biosynthesis. Non-limiting examples of PTMs include phosphorylation, methylation, nitrosylation, acetylation, and glycosylation. For a given protein, one of the forms of the given protein may not be post-translationally modified, while one or more other of the forms of the given protein may be post-translationally modified, such as by an enzyme.
As used herein, "analyte" is intended to mean a chemical or biological element that is desired to be detected. The analyte may be referred to as a "target". Analytes may include nucleotide analytes and non-nucleotide analytes. A nucleotide analyte may comprise one or more nucleotides. Non-nucleotide analytes may include chemical entities that are not nucleotides. An exemplary nucleotide analyte is a DNA analyte that includes deoxyribonucleotides or modified deoxyribonucleotides. The DNA analyte may include any DNA sequence or feature that may be of interest for detection, such as a single nucleotide polymorphism or DNA methylation. Another exemplary nucleotide analyte is an RNA analyte that includes ribonucleotides or modified ribonucleotides. RNA analytes may include any RNA sequence or feature that may be of interest for detection, such as the presence or amount of mRNA or cDNA. An exemplary non-nucleotide analyte is a protein analyte. Proteins include polypeptide sequences that fold into a structure. Another exemplary non-nucleotide analyte is a metabolite analyte. A metabolite analyte is a chemical entity that is formed or used during metabolism. Additional exemplary analytes include, but are not limited to, carbohydrates, fatty acids, sugars (such as glucose), amino acids, nucleosides, neurotransmitters, phospholipids, and heavy metals. In the present disclosure, analytes may be detected in the context of any suitable application, such as analysis of disease states, analysis of metabolic health, analysis of the microbiome, analysis of drug interactions, analysis of drug reactions, analysis of toxicity, or analysis of infectious diseases. Illustratively, metabolites may include chemical entities that are up- or down-regulated in response to a disease. Non-limiting examples of analytes include lipids, kinases, serine hydrolases, metalloproteases, disease-specific biomarkers such as antigens for a particular disease, and glucose.
As used herein, "aptamer" is intended to mean an oligonucleotide having a tertiary structure that renders the oligonucleotide selective for a target, such as an analyte. "Selective" for a target is intended to mean coupled to the target and not to a different target. The aptamer may comprise any suitable type of oligonucleotide, e.g., DNA, RNA, and/or nucleic acid analogs, such as exemplified elsewhere herein. The aptamer can be coupled to the target by any suitable combination of interactions (e.g., by any suitable combination of electrostatic interactions, hydrophobic interactions, and formation of tertiary structures).
As used herein, "lectin" is intended to mean a protein that selectively binds one or more specific sugars, and thus does not bind any other sugar. A "monovalent" lectin may bind a single sugar at a given time, while a "divalent" lectin may bind two sugars simultaneously, and a "multivalent" lectin may bind two or more sugars simultaneously. Lectins may be naturally occurring or non-naturally occurring. Naturally occurring lectins can include plant lectins and animal lectins.
As used herein, "sugar" is intended to mean a water-soluble carbohydrate. The sugar may include monosaccharides, disaccharides and polysaccharides.
As used herein, "splint oligonucleotide" is intended to mean any oligonucleotide capable of linking two other oligonucleotides together by complementary binding of the "splint oligonucleotide" to the corresponding portion of each of the two other oligonucleotides. In some examples, the two other oligonucleotides that the "splint oligonucleotide" brings together are then ligated to one another.
As used herein, "probe" is intended to mean any biological or synthetic molecule capable of interacting with a target of interest and capable of detecting the target of interest. Detection of a target of interest may be performed by directly detecting the interaction of a probe with the target or by indirectly detecting the amino acid or nucleotide sequence attached to the probe. In some examples, detection of the target of interest occurs after separation of the amino acid or nucleotide sequence from the probe.
As used herein, "reporter oligonucleotide" is intended to mean any oligonucleotide that can be analyzed to determine the identity of a target of interest or an analyte of interest. In some examples, a "reporter oligonucleotide" is linked to a "probe". In some examples, the "reporter oligonucleotide" is separated from the "probe".
Compositions and methods for detecting analytes using proximity-induced tagging
Some examples herein provide for detection of analytes using proximity-induced tagging.
For example, proteomics provides a significant opportunity for discovery in biological systems. Enzyme-linked immunosorbent assays (ELISA) are standard methods for detecting and quantifying specific proteins in complex mixtures. This method relies on specific immobilization of the target of interest, typically via an antibody or other target recognition element, followed by detection and quantification with a secondary antibody conjugated to a reporter molecule. This approach is well accepted, but it is difficult to assess multiple targets simultaneously due to the limited variety of available reporter molecules. It is expected that a robust and simplified method for converting multiplexed protein detection to a polynucleotide readout would help advance the field of proteomics and increase the utility of next generation sequencing (NGS) technologies.
As provided herein, proximity-induced tagging addresses the problem of detecting analytes (such as proteins or other biomolecules) in a multiplexed manner by producing reporter polynucleotides that can be sequenced, with the analytes being detected from the resulting sequences. In a manner as described herein, proximity-induced tagging may be performed using donor and acceptor recognition probes. The donor recognition probe comprises a first analyte-specific recognition element and a transposome comprising a barcode (sequence) corresponding to the target analyte. The acceptor recognition probe includes a second analyte-specific recognition element and an oligonucleotide. In response to the recognition elements of the respective donor and acceptor recognition probes selectively binding to the same analyte as each other, the barcoded transposome is sufficiently close to the oligonucleotide to tag that oligonucleotide with the barcode, hence the term "proximity-induced tagging". The polynucleotides resulting from such tagging comprise both the barcode from the donor recognition probe and the oligonucleotide from the acceptor recognition probe. Thus, the sequence of such a "reporter" polynucleotide reflects that it was formed in response to the proximity of two probes specific for the same analyte. Accordingly, it will be appreciated that the present assay is highly specific and can be readily read out by sequencing the reporter polynucleotides.
FIG. 1 schematically illustrates exemplary operations and compositions in a process flow for detecting analytes using proximity-induced tagging. The composition 100 shown in FIG. 1 includes a plurality of analytes 111, 111', each having a first portion and a second portion; a plurality of donor recognition probes 120, 120'; and a plurality of acceptor recognition probes 130, 130'. Each of the donor recognition probes 120, 120' may comprise a first recognition element 121, 121' specific for the first portion of the respective analyte 111, 111', a first oligonucleotide 122, 122' corresponding to that first portion of the respective analyte, and a transposase 123, 123' coupled to the first recognition element and the first oligonucleotide. Each of the acceptor recognition probes 130, 130' may comprise a second recognition element 131, 131' specific for the second portion of the respective analyte 111, 111', and a second oligonucleotide 132, 132' coupled to the second recognition element and corresponding to the second portion of the respective analyte. In some examples, the first oligonucleotides 122, 122' and the second oligonucleotides 132, 132' may include DNA. It is noted that the donor recognition probes 120, 120' and the acceptor recognition probes 130, 130' may be provided in a kit comprising, for each analyte for which an assay may be desired (the number of which may be, for example, tens, hundreds, thousands, or millions), a plurality of donor recognition probes and a plurality of acceptor recognition probes comprising recognition elements specific for that analyte. In the simplified example shown in FIG. 1, the kit may include a plurality of donor recognition probes 120 and a plurality of acceptor recognition probes 130 specific for the analyte 111, and a plurality of donor recognition probes 120' and a plurality of acceptor recognition probes 130' specific for the analyte 111'.
At a particular time as shown in FIG. 1, the first recognition element 121 of the first donor recognition probe 120 is specifically coupled to a first portion of the analyte 111, and the second recognition element 131 of the first acceptor recognition probe 130 is specifically coupled to a second portion of the analyte 111. In response to such coupling of the recognition elements 121, 131 to the corresponding portions of the analyte 111, the transposase 123 tags the second oligonucleotide 132, thereby causing the first oligonucleotide 122 to become covalently coupled to the second oligonucleotide 132. The first oligonucleotide 122 may comprise a sequence corresponding to the first portion of the analyte 111, such as the barcode "ID-X1", and the second oligonucleotide 132 may comprise a sequence corresponding to the second portion of the analyte 111, such as the barcode "ID-X2". Thus, it will be appreciated that the transposase 123 generates a "reporter" polynucleotide comprising both sequences ID-X1 and ID-X2, from which it can be determined that the analyte 111 was present and was coupled to both the first recognition element 121 and the second recognition element 131, resulting in proximity-induced tagging of the second oligonucleotide 132 by the transposase 123. Because the sequences ID-X1 and ID-X2 correspond to the same analyte as each other, it can be determined that both the first recognition element 121 and the second recognition element 131 were specifically coupled to that analyte.
Similarly, the first recognition element 121' of the second donor recognition probe 120' is specifically coupled to a first portion of the analyte 111', and the second recognition element 131' of the second acceptor recognition probe 130' is specifically coupled to a second portion of the analyte 111'. In response to such coupling of the recognition elements 121', 131' to the corresponding portions of the analyte 111', the transposase 123' tags the second oligonucleotide 132', thereby causing the first oligonucleotide 122' to become covalently coupled to the second oligonucleotide 132'. The first oligonucleotide 122' may comprise a sequence corresponding to the first portion of the analyte 111', such as the barcode "ID-Y1", and the second oligonucleotide 132' may comprise a sequence corresponding to the second portion of the analyte 111', such as the barcode "ID-Y2". Thus, it will be appreciated that the transposase 123' generates a "reporter" polynucleotide comprising both sequences ID-Y1 and ID-Y2, from which it can be determined that the analyte 111' was present and was coupled to both the first recognition element 121' and the second recognition element 131', resulting in proximity-induced tagging of the second oligonucleotide 132' by the transposase 123'. Because the sequences ID-Y1 and ID-Y2 correspond to the same analyte as each other, it can be determined that both the first recognition element 121' and the second recognition element 131' were specifically coupled to that analyte.
In contrast, any tagging resulting from non-specific binding of the recognition elements to contaminants or other elements in the sample can be expected to produce a reporter polynucleotide comprising mismatched barcodes. In the illustrative example of non-specific binding, the first recognition element 121' of the second donor recognition probe 120' is non-specifically coupled to a first portion of the analyte 141, and the second recognition element 131 of the first acceptor recognition probe 130 is non-specifically coupled to a second portion of the analyte 141. In response to such coupling of the recognition elements 121', 131 to the corresponding portions of the analyte 141, the transposase 123' tags the oligonucleotide 132, thereby causing the oligonucleotide 122' to become covalently coupled to the oligonucleotide 132. As described above, the oligonucleotide 122' may comprise a sequence corresponding to the first portion of the analyte 111', such as the barcode "ID-Y1", and the oligonucleotide 132 may comprise a sequence corresponding to the second portion of the analyte 111, such as the barcode "ID-X2". Thus, it will be appreciated that the transposase 123' generates a "reporter" polynucleotide comprising both sequences ID-Y1 and ID-X2, from which it can be determined that the analyte 141 was present and was coupled to both the first recognition element 121' and the second recognition element 131, resulting in proximity-induced tagging of the oligonucleotide 132 by the transposase 123'. Because the sequences ID-Y1 and ID-X2 do not correspond to the same analyte as each other, it can be determined that one or both of the first recognition element 121' and the second recognition element 131 were non-specifically coupled to the analyte 141.
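The matched-versus-mismatched barcode logic described above can be illustrated with a minimal sketch. The lookup tables, the analyte labels, and the function names below are hypothetical and are used only for illustration; an actual analysis would derive the barcode assignments from the probe design of the particular kit.

```python
from collections import Counter

# Hypothetical lookup tables: barcode label -> analyte the probe was designed for.
DONOR_BARCODES = {"ID-X1": "111", "ID-Y1": "111'"}
ACCEPTOR_BARCODES = {"ID-X2": "111", "ID-Y2": "111'"}


def classify_reporter(donor_barcode, acceptor_barcode):
    """Classify one reporter polynucleotide by its pair of barcodes.

    Returns ("specific", analyte) when both barcodes map to the same analyte,
    ("nonspecific", None) when they map to different analytes, and
    ("unknown", None) when either barcode is not in the lookup tables.
    """
    donor_target = DONOR_BARCODES.get(donor_barcode)
    acceptor_target = ACCEPTOR_BARCODES.get(acceptor_barcode)
    if donor_target is None or acceptor_target is None:
        return ("unknown", None)
    if donor_target == acceptor_target:
        return ("specific", donor_target)
    return ("nonspecific", None)


def tally_reporters(barcode_pairs):
    """Count reporters per classification across many (donor, acceptor) pairs."""
    return Counter(classify_reporter(d, a) for d, a in barcode_pairs)
```

In this sketch, a reporter carrying ID-X1 and ID-X2 is attributed to analyte 111, whereas a reporter carrying ID-Y1 and ID-X2 is flagged as arising from non-specific binding and can be excluded from quantification.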
It should be appreciated that proximity-induced tagging may be used to assay any suitable analyte, and that any suitable recognition element may be used to specifically bind such analyte. In some examples, the analyte may comprise a first molecule. For example, a first portion of the analyte (to which a first recognition element may specifically bind) may comprise a first portion of the first molecule, and a second portion of the analyte (to which a second recognition element may specifically bind) may comprise a second portion of the first molecule. Illustratively, the first molecule may comprise a protein or peptide, the first recognition element 121, 121' may comprise a first antibody or first aptamer specific for a first portion of the protein or peptide, and the second recognition element 131, 131' may comprise a second antibody or second aptamer specific for a second portion of the protein or peptide. Alternatively, for example, the first molecule may comprise a target polynucleotide, the first recognition element 121, 121' may comprise a first CRISPR-associated (Cas) protein specific for a first subsequence of the target polynucleotide, and the second recognition element 131, 131' may comprise a second Cas protein specific for a second subsequence of the target polynucleotide. In some examples, the target polynucleotide can include RNA, and the first Cas protein and the second Cas protein are independently selected from the group consisting of rCas9 and dCas13. Alternatively, for example, the first molecule may comprise a carbohydrate, the first recognition element 121, 121' may comprise a first lectin specific for a first portion of the carbohydrate, and the second recognition element 131, 131' may comprise a second lectin specific for a second portion of the carbohydrate. 
Alternatively, for example, the first molecule may comprise a biological molecule, and the biological molecule may be specific for the first recognition element 121 or 121' and the second recognition element 131 or 131'. However, it should be appreciated that the recognition elements 121, 121', 131, 131' may have any suitable configuration that specifically recognizes and couples to an analyte of interest, or that specifically recognizes and couples to a specific binding protein, for example.
The oligonucleotides 122, 122' of the donor recognition probes 120, 120' may comprise any suitable sequences for coupling to the transposases 123, 123', for being used to tag the oligonucleotides 132, 132', and for subsequently being amplified and sequenced. FIG. 2 schematically illustrates an exemplary donor recognition probe for detecting analytes using proximity-induced tagging. The first oligonucleotide 122 may be synthetic and may comprise an annealed mosaic end (ME, ME') transposon end sequence, a sequencing primer (e.g., A14), a unique barcode identifying the recognition element 121 (e.g., ID-X1), and a primer binding site (e.g., primer C). Similarly, the first oligonucleotide 122' may be synthetic and may comprise an annealed mosaic end (ME, ME') transposon end sequence, a sequencing primer (e.g., A14), a unique barcode identifying the recognition element 121' (e.g., ID-Y1), and a forward primer binding site (e.g., primer C). Transposase 123, such as Tn5, can be coupled to the annealed mosaic end transposon end sequences (ME, ME') at the 3' end of the oligonucleotide 122 to form an active transposome. Transposase 123', such as Tn5, can be coupled to the annealed mosaic end transposon end sequences (ME, ME') at the 3' end of the oligonucleotide 122' to form an active transposome. The 5' end of the oligonucleotide 122 may be coupled to the recognition element 121 via linker 124, and the 5' end of the oligonucleotide 122' may be coupled to the recognition element 121' via linker 124'. Aptamers, antibodies, proteins, and the like conjugated to custom-designed oligonucleotides are commercially available, and methods of making such conjugates are known in the art. Further options for preparing donor recognition probes are provided below with reference to FIGS. 8A-8C. In this regard, while the depiction of transposases in the donor recognition probes 120, 120' may be simplified in FIGS. 1, 2, 4A, 7A-7C, 9A-9E, 10A-10D, 11A-11C, and 12 by illustrating only a single oligonucleotide coupled to those transposases, it should be understood that the donor recognition probes of the present invention may include pairs of oligonucleotides, reflecting that the transposases may dimerize in a manner such as described below with reference to FIGS. 8A-8C.
The oligonucleotides 132, 132' of the acceptor recognition probes 130, 130' may comprise any suitable sequences for being tagged by the transposases 123, 123' so as to become coupled to the oligonucleotides 122, 122', and for subsequently being amplified and sequenced. FIG. 3 schematically illustrates an exemplary acceptor recognition probe for detecting analytes using proximity-induced tagging. The second oligonucleotide 132 may be synthetic and may comprise a reverse mosaic end transposon end sequence (ME'), a reverse sequencing primer (e.g., B15'), a unique barcode identifying the recognition element 131 (e.g., ID-X2), and a double-stranded tagging acceptor site (e.g., Tn5 acceptor site) 134. Similarly, the second oligonucleotide 132' may be synthetic and may comprise a reverse annealed mosaic end transposon end sequence (ME'), a reverse sequencing primer (e.g., B15'), a unique barcode identifying the recognition element 131' (e.g., ID-Y2), and a double-stranded tagging acceptor site (e.g., Tn5 acceptor site) 134'. In addition, the acceptor recognition probe may comprise two 3' overhangs, each of which may comprise a unique barcode identifying the recognition element. The 5' end of the oligonucleotide 132 may be coupled to the recognition element 131 via linker 135, and the 5' end of the oligonucleotide 132' may be coupled to the recognition element 131' via linker 135'. Aptamers, antibodies, proteins, and the like conjugated to custom-designed oligonucleotides are commercially available, and methods of making such conjugates are known in the art.
FIGS. 4A-4G schematically illustrate further details of operations and compositions in the process flow of FIG. 1 according to some examples. For example, FIG. 4A illustrates an assay in which a donor recognition probe 120 (described with reference to FIGS. 1 and 2) and an acceptor recognition probe 130 (described with reference to FIGS. 1 and 3) perform proximity-induced tagging in response to the recognition elements of those probes specifically binding to analyte 111. More specifically, transposase 123 tags the double-stranded tagging acceptor site 134 in response to this specific binding to analyte 111. The tagging reaction may be initiated by the addition of any suitable cofactor for transposome cleavage and insertion activity, such as magnesium ions (Mg++). FIG. 4B shows further details of the tagging reaction, in which transposase 123 inserts the first oligonucleotide 122 into the double-stranded tagging acceptor site 134 of the second oligonucleotide 132. In this regard, as described elsewhere herein, the donor recognition probes 120 of the present invention can include pairs of oligonucleotides, reflecting that a transposase can dimerize, for example, in the manner described below with reference to FIGS. 8A-8C, and thus the tagging reaction can produce top and bottom strands having sequences schematically shown, for example, in FIGS. 4B and 4C. Furthermore, as described elsewhere herein, the acceptor recognition probe may comprise two 3' overhangs, each of which may comprise a unique barcode identifying the recognition element, as shown in FIGS. 4B and 4C, which may provide increased redundancy by generating two template strands per tagging event. As shown in FIG. 4D, sample indices (i7 and i5) can be added to the template strand using primers, which are extended to form a duplex as shown in FIG. 4E. As shown in FIG. 4F, a primer (e.g., primer C') can anneal to the complementary strand and be extended to form an elongated reporter polynucleotide, which is then PCR amplified and contains both the sample indices and the barcodes corresponding to the recognition elements 121, 131. Sequencing is then used to determine the identities of the donor and acceptor recognition probes and the sample indices, as shown in FIG. 4G. For example, a first read ("read 1") may be performed by annealing the appropriate primer to the B15' primer on the top strand to read the sequence ID-X2' corresponding to the recognition element 131. Additionally, a second read ("read 2") may be performed by annealing the appropriate primers to the ME and A14 primers on the top strand to read the sequence ID-X1 corresponding to the recognition element 121 and the i5 sample index. Additionally, a third read ("read 3") may be performed by annealing the appropriate primers to the ME and B15 primers on the bottom strand to read the i7 sample index. However, it should be understood that any suitable sequencing method may be used to read the two barcode sequences within the reporter polynucleotide, and the use of sample indices is optional.
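As a purely illustrative sketch of how the three reads described above might be combined into a single record per reporter polynucleotide, the following assumes fixed-length fields at hypothetical positions within each read. The field lengths, the optional spacer between the donor barcode and the i5 index, and the choice to reverse-complement the read 1 barcode (because read 1 covers ID-X2') are all assumptions made for illustration and are not taken from FIG. 4G.

```python
from typing import NamedTuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


class ReporterCall(NamedTuple):
    acceptor_barcode: str  # e.g., ID-X2, recovered from read 1
    donor_barcode: str     # e.g., ID-X1, recovered from read 2
    i5_index: str          # sample index recovered from read 2
    i7_index: str          # sample index recovered from read 3


def decode_reads(read1, read2, read3, barcode_len=8, index_len=8, spacer_len=0):
    """Extract barcodes and sample indices from the three reads.

    Assumed (hypothetical) layout: read 1 starts with the complement of the
    acceptor barcode; read 2 starts with the donor barcode, followed by
    `spacer_len` bases and then the i5 index; read 3 starts with the i7 index.
    """
    acceptor = reverse_complement(read1[:barcode_len])
    donor = read2[:barcode_len]
    i5_start = barcode_len + spacer_len
    i5 = read2[i5_start:i5_start + index_len]
    i7 = read3[:index_len]
    return ReporterCall(acceptor, donor, i5, i7)
```

The resulting donor and acceptor barcodes can then be compared against the probe design to attribute each reporter to an analyte and a sample.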
PCR duplicates generated during amplification may affect the accuracy of quantifying tagged products. To distinguish such duplicates from distinct detection events, a unique molecular identifier (UMI) may be added to the donor recognition probe (as shown in FIG. 23A), to the acceptor recognition probe (as shown in FIG. 23B), or to both. The UMI sequence may be a random sequence of nucleotides. Alternatively, the UMI may be a sequence randomly selected from a set of known sequences that enable error correction and avoid undesirable secondary structures, such as dsDNA that would be a target for the Tn5 transposase. Note that the two UMIs in the acceptor recognition probe shown in FIG. 23B may be the same as each other or may be different from each other.
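The use of UMIs to separate PCR duplicates from distinct detection events can be sketched as follows. This is a minimal illustration that collapses reads sharing the same donor barcode, acceptor barcode, and exact UMI sequence; it assumes the barcodes and UMI have already been extracted from each read, and it does not perform the error correction mentioned above. The function name and input layout are hypothetical.

```python
from collections import defaultdict


def count_unique_events(reads):
    """Count distinct detection events per (donor, acceptor) barcode pair.

    `reads` is an iterable of (donor_barcode, acceptor_barcode, umi) tuples.
    Reads sharing the same barcode pair and the same UMI are treated as PCR
    duplicates of one tagging event and are counted once.
    """
    umis_by_pair = defaultdict(set)
    for donor, acceptor, umi in reads:
        umis_by_pair[(donor, acceptor)].add(umi)
    return {pair: len(umis) for pair, umis in umis_by_pair.items()}
```

For example, three reads for the pair ID-X1/ID-X2 of which two share a UMI would be counted as two detection events rather than three.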
It should also be understood that the example of an analyte provided with reference to FIG. 1 is purely illustrative. Another non-limiting example of an analyte that can be assayed using the proximity-induced tagging of the present invention is a post-translational modification (PTM) of a protein. For example, proteins often exhibit PTMs due to phosphorylation, acetylation, methylation, nitrosylation, glycosylation, and many other mechanisms. To distinguish these different target forms, and to determine the fraction of total target modified with the PTM of interest, a system comprising three recognition probes may be used: a donor (PTM) recognition probe having a recognition element that binds to the target in a PTM-specific manner; a donor (no PTM) recognition probe having a recognition element that either (1) is specific for the form of the target opposite to that bound by the donor (PTM) probe or (2) can bind to either form of the target; and an acceptor recognition probe having a recognition element that binds to either form of the target. Depending on the specificity of the donor recognition probes, different incubation strategies may be used.
For example, if the donor recognition probes are exclusive and specific for each PTM form, they can be incubated in the same reaction and distinguished bioinformatically by their unique combinations of acceptor and donor barcodes. For example, FIG. 5 schematically illustrates exemplary operations and compositions in a process flow for detecting PTMs. In some examples, this strategy uses two donor recognition probes that bind to similar sites on the protein, the sites differing in the presence or absence of the PTM.
In FIG. 5, the first form 511 of the protein is post-translationally modified, and the second form 511' is not post-translationally modified or has a different PTM. Illustratively, the first form 511 may be phosphorylated, acetylated, methylated, nitrosylated, or glycosylated relative to the second form 511', but the first form may comprise any other suitable modification, and optionally the second form may be modified differently than the first form. The donor recognition probes (e.g., in a kit) include a first donor recognition probe 520 specific for the first form (e.g., comprising a recognition element specific for the first form) and comprising an oligonucleotide having a barcode (e.g., ID-X1p) corresponding to the first form, and a second donor recognition probe 520' specific for the second form (e.g., comprising a recognition element specific for the second form) and comprising an oligonucleotide having a barcode (e.g., ID-X1) corresponding to the second form. The acceptor recognition probe 530 (e.g., in a kit) can be specific for the protein, but need not be specific for either the first form or the second form, and comprises an oligonucleotide having a barcode (e.g., ID-X2) corresponding to either form. It will be appreciated that the use of acceptor recognition probes each specific for a particular form may provide further specificity.
As shown in fig. 5, the first donor recognition probe 520 and the acceptor recognition probe 530 specifically bind to the first form 511, and in response thereto proximity-induced tagging occurs, resulting in the production of a reporter polynucleotide comprising ID-X1p and ID-X2. The second donor recognition probe 520' and the acceptor recognition probe 530 specifically bind to the second form 511', and in response thereto proximity-induced tagging occurs, resulting in the production of a reporter polynucleotide comprising ID-X1 and ID-X2. Because the first and second donor recognition probes 520, 520' are specific for their respective forms 511, 511', they can be co-incubated. Thus, from the sequence of the reporter polynucleotide it can be determined that the protein has the first form 511 in the first case and the second form 511' in the second case. Optionally, the amounts of the first and second forms of the analyte may be determined based on the amounts of the reporter polynucleotides corresponding to the first and second donor recognition probes. For example, the amounts of the corresponding reporter polynucleotides shown in fig. 5 correlate with the amounts of the first form 511 and the second form 511'.
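By way of illustration only, the quantification described with reference to fig. 5 can be sketched as a count of sequenced reporter polynucleotides per donor barcode. The sketch below assumes that each sequenced reporter has already been reduced to a (donor barcode, acceptor barcode) pair; the function name `form_fractions`, the read list, and the unrelated barcode "ID-Z1" are hypothetical and are not part of the described assay.

```python
from collections import Counter

# Barcode labels follow the Fig. 5 description; the mapping itself is an
# illustrative assumption about how reads would be annotated.
FORM_BY_DONOR_BARCODE = {"ID-X1p": "modified", "ID-X1": "unmodified"}
ACCEPTOR_BARCODE = "ID-X2"


def form_fractions(reporters):
    """Return the fraction of reporters assigned to each form of the analyte.

    `reporters` is an iterable of (donor_barcode, acceptor_barcode) pairs,
    one pair per sequenced reporter polynucleotide.
    """
    counts = Counter()
    for donor, acceptor in reporters:
        # Skip reporters that do not pair a known donor barcode with ID-X2.
        if acceptor == ACCEPTOR_BARCODE and donor in FORM_BY_DONOR_BARCODE:
            counts[FORM_BY_DONOR_BARCODE[donor]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {form: n / total for form, n in counts.items()}


# Hypothetical reads: three modified, one unmodified, one unrelated barcode.
reads = [("ID-X1p", "ID-X2")] * 3 + [("ID-X1", "ID-X2")] + [("ID-Z1", "ID-X2")]
fractions = form_fractions(reads)
print(fractions)
```

In this sketch the fraction of total target carrying the PTM is simply the share of ID-X1p/ID-X2 reporters among all reporters carrying a recognized donor barcode; any normalization for probe efficiency is outside its scope.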
Alternatively, if one of the donor recognition probes is specific for the analyte but not for the PTM, sequential reactions may be used to distinguish between the two forms of the analyte. For example, fig. 6 schematically illustrates exemplary operations and compositions in a process flow for detecting post-translational modifications (PTMs) using PTM-specific and non-PTM-specific donor recognition probes. In fig. 6, the first form 611 of the protein is post-translationally modified, and the second form 611' is not modified or has a different PTM. Illustratively, the first form 611 may be phosphorylated, acetylated, methylated, nitrosylated, or glycosylated relative to the second form 611', but the first form may comprise any other suitable modification, and optionally the second form may be modified differently than the first form. The donor recognition probes (e.g., in a kit) include a first donor recognition probe 620 that is specific for the first form (e.g., comprises a recognition element specific for the first form) and comprises an oligonucleotide having a barcode (e.g., ID-X1p) corresponding to the first form, and a second donor recognition probe 620' that is specific for the protein but not specific for either the first form or the second form (e.g., comprises a recognition element specific for the protein) and comprises an oligonucleotide having a barcode (e.g., ID-X1) corresponding to the protein. The acceptor recognition probe 630 (e.g., in a kit) can be specific for the protein, but need not be specific for either the first form or the second form, and comprises an oligonucleotide having a barcode (e.g., ID-X2) corresponding to either form.
Because the second donor recognition probe 620' can bind, without form specificity, to either the first form 611 or the second form 611', if probe 620' were incubated together with probe 620, probe 620' could bind to the first form 611, thereby inhibiting probe 620 from binding to the first form 611 and making it appear (as read out via sequencing) that the first form is not present. To provide enhanced differentiation between the first form 611 and the second form 611', sequential reactions such as shown in fig. 6 may be used, in which the first donor recognition probe 620 specifically binds to the first form 611 and not to the second form 611', and in which the acceptor recognition probe 630 binds to both the first form 611 and the second form 611'. At the first form 611, the transposase of donor recognition probe 620 performs proximity-induced tagging using the oligonucleotide of acceptor recognition probe 630, thereby generating a reporter polynucleotide comprising ID-X1p and ID-X2. At the second form 611', acceptor recognition probe 630 is bound, but no donor recognition probe is present with which to perform proximity-induced labeling. Donor recognition probe 620' is then added and incubated, as shown in fig. 6. During such incubation, donor recognition probe 620' may attempt to bind to the first form 611, but is inhibited from participating in proximity-induced labeling because acceptor recognition probe 630 has already reacted with probe 620 and/or because the recognition element of probe 620 at least partially occupies the binding site of probe 620'. However, donor recognition probe 620' can readily bind to the second form 611', in response to which the transposase of probe 620' performs proximity-induced tagging using the oligonucleotide of acceptor recognition probe 630, thereby generating a reporter polynucleotide comprising ID-X1 and ID-X2.
Thus, from the sequence of the reporter polynucleotide it can be determined that the protein has the first form 611 in the first case and the second form 611' in the second case. Optionally, the amounts of the first and second forms of the analyte may be determined based on the amounts of the reporter polynucleotides corresponding to the first and second donor recognition probes. For example, the amounts of the corresponding reporter polynucleotides shown in fig. 6 correlate with the amounts of the first form 611 and the second form 611'.
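The benefit of adding the form-agnostic donor probe second, rather than together with the PTM-specific donor probe, can be illustrated with a deliberately idealized toy model. The sketch below is an assumption-laden illustration only: it treats every bound target as yielding exactly one reporter, and the parameter `agnostic_win_fraction` (the share of modified targets captured by the form-agnostic donor during co-incubation) is hypothetical.

```python
def sequential_counts(n_modified, n_unmodified):
    """Idealized sequential reaction: the PTM-specific donor is added first."""
    # Step 1: the PTM-specific donor tags every modified target (ID-X1p).
    id_x1p = n_modified
    # Step 2: the form-agnostic donor can only tag targets whose acceptor
    # probe is still unreacted, i.e. the unmodified targets (ID-X1).
    id_x1 = n_unmodified
    return {"ID-X1p": id_x1p, "ID-X1": id_x1}


def co_incubation_counts(n_modified, n_unmodified, agnostic_win_fraction):
    """Idealized co-incubation: the form-agnostic donor outcompetes the
    specific donor on an assumed fraction of the modified targets."""
    if not 0.0 <= agnostic_win_fraction <= 1.0:
        raise ValueError("agnostic_win_fraction must be between 0 and 1")
    lost = n_modified * agnostic_win_fraction
    return {"ID-X1p": n_modified - lost, "ID-X1": n_unmodified + lost}


def modified_fraction(counts):
    """Apparent fraction of target in the modified form, from reporter counts."""
    total = counts["ID-X1p"] + counts["ID-X1"]
    return counts["ID-X1p"] / total if total else 0.0


# Hypothetical sample: 400 modified and 600 unmodified target molecules.
seq = sequential_counts(400, 600)
co = co_incubation_counts(400, 600, agnostic_win_fraction=0.5)
print(modified_fraction(seq), modified_fraction(co))
```

Under these assumptions the sequential protocol recovers the true modified fraction, whereas co-incubation under-reports it in proportion to how often the form-agnostic donor occupies modified targets.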
It is noted that in examples where different probes may compete with each other for binding to an analyte, for example, such as described with reference to fig. 6, the concentration of each probe may be calibrated to achieve enhanced specificity. For example, a higher concentration of more specific donor recognition probes may be used to drive rapid, accurate binding prior to non-specific binding of other recognition probes.
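As a rough illustration of why probe concentration matters when probes compete, the standard equilibrium expression for two ligands competing for a single site can be evaluated numerically. This is a simplification: the passage above describes a kinetic effect (faster binding by the more concentrated probe), whereas the sketch assumes equilibrium and probe in excess over target. All concentrations and dissociation constants are hypothetical values in arbitrary consistent units.

```python
def occupancy(conc, kd, competitor_conc=0.0, competitor_kd=1.0):
    """Equilibrium fraction of target sites bound by a probe.

    Uses the single-site competitive binding expression
        theta = (c/Kd) / (1 + c/Kd + c_comp/Kd_comp),
    which assumes free probe concentrations equal total concentrations.
    """
    if kd <= 0 or competitor_kd <= 0:
        raise ValueError("dissociation constants must be positive")
    a = conc / kd
    b = competitor_conc / competitor_kd
    return a / (1.0 + a + b)


# Hypothetical case: specific and form-agnostic probes with equal affinity.
equal = occupancy(10.0, 1.0, competitor_conc=10.0, competitor_kd=1.0)
# Raising the specific probe tenfold shifts occupancy in its favor.
boosted = occupancy(100.0, 1.0, competitor_conc=10.0, competitor_kd=1.0)
print(round(equal, 3), round(boosted, 3))
```

In practice the appropriate concentrations would be determined empirically for each probe pair; the expression only indicates the direction of the effect.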
Similar to the assay used to detect PTMs, proximity-induced tagging can be used to detect nucleic acid modifications, such as N6-methyladenosine RNA modification, 5-methylcytosine DNA modification, and the like. For example, as shown in the top panel of fig. 16, donor recognition probe 1620 and acceptor recognition probe 1630 specifically bind to modified oligonucleotide target 1611 and proximity-induced tagging occurs, resulting in the production of a reporter polynucleotide comprising ID-X1p and ID-X2.
Proximity-induced tagging can also be used to distinguish between different target forms, such as a modified form of an oligonucleotide and the non-modified form of the same oligonucleotide, and to determine the fraction of total target that is modified. Three recognition elements may be used: a first donor recognition probe having a recognition element that binds to the target in a modification-specific manner; a second donor recognition probe having a recognition element that either (1) is specific for the form of the target opposite to that recognized by the first donor recognition probe or (2) can bind either form of the target; and an acceptor recognition probe having a recognition element that binds to either form of the target. Depending on the specificity of the donor recognition probes, different incubation strategies may be used.
For example, if the donor recognition probes are mutually exclusive and each is specific for one form of the target, they can be incubated in the same reaction and distinguished bioinformatically by the unique combination of acceptor and donor barcodes. Fig. 16 schematically illustrates exemplary operations and compositions in a process flow for detecting modifications using donor recognition probes specific for nucleotide modifications. In fig. 16, a first form 1611 of the oligonucleotide is modified, and a second form 1611' is not modified or has a different modification. Illustratively, the first form 1611 may comprise methylated adenosine relative to the second form 1611', but the first form may also comprise any other suitable modification, and optionally the second form may be modified differently than the first form. The donor recognition probes (e.g., in a kit) include a first donor recognition probe 1620 that is specific for the first form (e.g., comprises a recognition element specific for the first form) and comprises an oligonucleotide having a barcode (e.g., ID-X1p) corresponding to the first form, and a second donor recognition probe 1620' that is specific for the second form (e.g., comprises a recognition element specific for the second form) and comprises an oligonucleotide having a barcode (e.g., ID-X1) corresponding to the second form. The acceptor recognition probe 1630 (e.g., in a kit) can be specific for the oligonucleotide, but need not be specific for either the first form or the second form, and comprises an oligonucleotide having a barcode (e.g., ID-X2) corresponding to either form. It will be appreciated that the use of acceptor recognition probes each specific for a particular form may provide further specificity.
As shown in fig. 16, first donor recognition probe 1620 and acceptor recognition probe 1630 specifically bind to first form 1611, in response to which proximity-induced tagging occurs, resulting in the production of a reporter polynucleotide comprising ID-X1p and ID-X2. Second donor recognition probe 1620' and acceptor recognition probe 1630 specifically bind to second form 1611', in response to which proximity-induced tagging occurs, resulting in the production of a reporter polynucleotide comprising ID-X1 and ID-X2. Because the first donor recognition probe 1620 and the second donor recognition probe 1620' are specific for their respective forms 1611, 1611', they can be co-incubated. Thus, from the sequence of the reporter polynucleotide, it can be determined that the oligonucleotide is in the first form 1611 in the first case and in the second form 1611' in the second case. Optionally, the amounts of the first and second forms of the analyte may be determined based on the amounts of the reporter polynucleotides corresponding to the first and second donor recognition probes. For example, the amounts of the corresponding reporter polynucleotides shown in fig. 16 correlate with the amounts of the first and second forms 1611, 1611'.
Alternatively, if one of the donor recognition probes is not specific for the modification, sequential reactions can be used to distinguish between the two forms. For example, fig. 17 schematically illustrates exemplary operations and compositions in a process flow for detecting nucleic acid modifications using a donor recognition probe that specifically detects the modification and a donor recognition probe that is specific for the target but not specific for the modification. In fig. 17, a first form 1711 of the target oligonucleotide comprises a nucleotide modification, and a second form 1711' is unmodified or has a different modification. The donor recognition probes (e.g., in a kit) include a first donor recognition probe 1720 that is specific for the first form (e.g., comprises a recognition element specific for the first form) and comprises an oligonucleotide having a barcode (e.g., ID-X1p) corresponding to the first form, and a second donor recognition probe 1720' that is specific for the target oligonucleotide but not specific for either the first form or the second form (e.g., comprises a recognition element specific for the oligonucleotide) and comprises an oligonucleotide having a barcode (e.g., ID-X1) corresponding to the target oligonucleotide. Acceptor recognition probe 1730 (e.g., in a kit) can be specific for the target oligonucleotide, but need not be specific for either the first form or the second form, and comprises an oligonucleotide having a barcode (e.g., ID-X2) corresponding to either form.
Because the second donor recognition probe 1720' can bind, without form specificity, to either the first form 1711 or the second form 1711', if probe 1720' were incubated at the same time as probe 1720, probe 1720' could bind to the first form 1711, thereby inhibiting probe 1720 from binding to the first form 1711 and making it appear (as read out via sequencing) that the first form is not present. To provide enhanced differentiation between the first form 1711 and the second form 1711', sequential reactions such as shown in fig. 17 may be used, in which the first donor recognition probe 1720 specifically binds to the first form 1711 and not to the second form 1711', and in which the acceptor recognition probe 1730 binds to both the first form 1711 and the second form 1711'. At the first form 1711, the transposase of donor recognition probe 1720 performs proximity-induced tagging using the oligonucleotide of acceptor recognition probe 1730, thereby generating a reporter polynucleotide comprising ID-X1p and ID-X2. At the second form 1711', acceptor recognition probe 1730 is bound, but no donor recognition probe is present with which to perform proximity-induced labeling. Donor recognition probe 1720' is then added and incubated. During such incubation, donor recognition probe 1720' may attempt to bind to first form 1711, but is inhibited from participating in proximity-induced labeling because acceptor recognition probe 1730 has already reacted with probe 1720 and/or because the recognition element of probe 1720 at least partially occupies the binding site of probe 1720'. However, donor recognition probe 1720' can readily bind to second form 1711', in response to which the transposase of probe 1720' performs proximity-induced tagging using the oligonucleotide of acceptor recognition probe 1730, thereby generating a reporter polynucleotide comprising ID-X1 and ID-X2.
Thus, from the sequence of the reporter polynucleotide, it can be determined that the target oligonucleotide has a first form 1711 in the first case and a second form 1711' in the second case. Optionally, the amount of the first and second forms of the analyte may be determined based on the amount of the reporter polynucleotide corresponding to the first and second donor recognition probes. For example, the amount of the corresponding reporter polynucleotide shown in fig. 17 is correlated with the amounts of the first and second forms 1711, 1711' that are determined.
In examples where different probes may compete with each other for binding to an analyte, for example, such as described with reference to fig. 17, the concentration of each probe may be calibrated to achieve enhanced specificity. For example, a higher concentration of more specific donor recognition probes may be used to drive rapid, accurate binding prior to non-specific binding of other recognition probes.
As shown in fig. 18, when modified oligonucleotide targets are assayed, the amount of background activity can be quantified to determine how much of the signal observed in the assay is due to true proximity-induced labeling. For example, the sample may be incubated with a mixture of a mock donor recognition probe 1825, which does not specifically bind to modified oligonucleotide 1811 and comprises a distinguishable barcode "IDN-1", and an acceptor recognition probe 1830, which comprises a distinguishable barcode "ID-X2". Acceptor recognition probe 1830 may specifically bind to molecule 1811. As a result of non-specific binding, the mock donor recognition probe 1825 can be brought sufficiently close to the acceptor recognition probe 1830 to perform background tagging, thereby generating a background reporter polynucleotide comprising barcodes IDN-1 and ID-X2. The sample (or another sample) may also be incubated with a mixture of a mock acceptor recognition probe 1835, which does not specifically bind to molecule 1811 and comprises a distinguishable barcode "IDN-2", and a donor recognition probe 1820, which comprises a barcode "ID-X1p". Donor recognition probe 1820 may specifically bind to molecule 1811. As a result of non-specific binding, the mock acceptor recognition probe 1835 may be brought sufficiently close to the donor recognition probe 1820 to perform background tagging, thereby generating a background reporter polynucleotide comprising barcodes ID-X1p and IDN-2. All of the reporter polynucleotides in the sample can be sequenced and quantified, and the amounts of the two background reporter polynucleotides, which represent background tagging events, can be compared to the amount of the reporter polynucleotide comprising barcodes ID-X1p and ID-X2, which represents actual proximity-induced tagging events.
In some examples, proximity-induced labeling may be used to detect molecular interactions, wherein the analyte comprises at least two molecules that interact with each other. For example, biomolecular interactions, such as protein-protein interactions and RNA-protein interactions, play an important role in cell biology and are increasingly the target of drug development; see, for example, Lu et al., "Recent advances in the development of protein-protein interactions modulators: mechanisms and clinical trials," Signal Transduction and Targeted Therapy, vol. 5, no. 1: article no. 213, 2020, the entire contents of which are incorporated herein by reference. However, existing methods for detecting biomolecular interactions are complex and often require affinity purification of the biomolecule of interest followed by characterization of the bound material by techniques such as mass spectrometry (for proteins) or sequencing (for RNA). The proximity-induced labeling assay of the present invention can be used to detect such interactions without the need for affinity purification, and instead uses a simple sequencing readout similar to that described with reference to fig. 1. Assays for detecting molecular interactions may include: a donor recognition probe having a recognition element that binds to a target molecule (e.g., a biomolecule) X; an acceptor recognition probe having a recognition element that binds to a target molecule (e.g., a biomolecule) Y, although optionally the target molecule can be coupled to oligonucleotide 132, thereby eliminating the need for a recognition element; and a mock donor probe and a mock acceptor probe, both of which comprise a non-specific recognition element or lack a recognition element/target. These mock probes provide a measure of non-specific labeling events that can be used as a control in the assay.
For example, figs. 7A-7C schematically illustrate exemplary operations and compositions in a process flow for detecting molecular interactions using proximity-induced labeling. As shown in fig. 7A, molecules 711 (X) and 711' (Y) in the sample interact with each other, e.g., are coupled covalently or non-covalently to each other. The donor recognition probe 720 specifically binds to the molecule 711 and the acceptor recognition probe 730 specifically binds to the molecule 711', such that the transposase of the probe 720 is in sufficient proximity to the probe 730 to perform proximity-induced labeling in a manner as described elsewhere herein. The generated reporter polynucleotide comprises a barcode (e.g., IDX-1) corresponding to molecule 711 and a barcode (e.g., IDY-2) corresponding to molecule 711'. Thus, the sequence of the reporter polynucleotide indicates that molecules 711 and 711' interact with each other in sufficient proximity for the bound donor recognition probe 720 and acceptor recognition probe 730 to react.
As shown in fig. 7B, the sample may be incubated with a mixture of a mock donor recognition probe 725, which does not specifically bind to molecule 711 or 711' and has a distinguishable barcode "IDN-1", and the acceptor recognition probe 730, which specifically binds to the molecule 711'. As a result of non-specific binding, the mock donor recognition probe 725 may be brought sufficiently close to the acceptor recognition probe 730 to perform proximity-induced labeling, thereby producing a reporter polynucleotide comprising barcodes IDN-1 and IDY-2. The sample (or another sample) may also be incubated with a mixture of the donor recognition probe 720 and a mock acceptor recognition probe 735, which does not specifically bind to molecule 711 or 711' and comprises a distinguishable barcode "IDN-2". As can be seen in fig. 7B, donor recognition probe 720 can specifically bind to molecule 711. As a result of non-specific binding, the mock acceptor recognition probe 735 can be brought sufficiently close to the donor recognition probe 720 to perform proximity-induced labeling, thereby producing a reporter polynucleotide comprising barcodes IDX-1 and IDN-2. The two reporter polynucleotides can be sequenced, whereby the amounts of background tagging near molecules 711' and 711, respectively, can be obtained. Such amounts can be compared, as a control, to the amount of specific labeling detected for the 711-711' pair as described with reference to fig. 7A, for example to quantify the amount of biomolecular interaction between molecules 711 and 711' within the sample. For example, an IDN signal indicates that background tagging is occurring. More specifically, if the amount of IDN1-IDY2 and/or IDX1-IDN2 is high relative to the "true" signal of IDX1-IDY2, this indicates that the apparent interaction between X and Y is not genuine.
This can be measured as the fold difference (IDX1-IDY2 signal)/(IDN signal), where a higher value gives greater confidence that an interaction is present.
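The fold difference described above can be sketched as a simple ratio of reporter counts. The description does not specify how the two mock-probe signals (IDN1-IDY2 and IDX1-IDN2) are combined into a single "IDN signal"; the sketch below assumes, conservatively, that the larger of the two is used. The function name and the example counts are hypothetical.

```python
import math


def fold_difference(true_count, mock_donor_count, mock_acceptor_count):
    """Return (IDX1-IDY2 signal) / (IDN signal).

    `true_count` is the number of IDX1-IDY2 reporters, `mock_donor_count`
    the number of IDN1-IDY2 reporters, and `mock_acceptor_count` the number
    of IDX1-IDN2 reporters. Taking the larger background count as the IDN
    signal is an assumption made for this illustration.
    """
    if min(true_count, mock_donor_count, mock_acceptor_count) < 0:
        raise ValueError("counts must be non-negative")
    background = max(mock_donor_count, mock_acceptor_count)
    if background == 0:
        # No background reporters were observed at all.
        return math.inf if true_count > 0 else math.nan
    return true_count / background


# Hypothetical counts from one sequencing run.
ratio = fold_difference(true_count=900, mock_donor_count=30, mock_acceptor_count=45)
print(ratio)
```

A threshold on this ratio for calling an interaction would need to be set empirically, for example from samples in which X and Y are known not to interact.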
Using assays such as those described with reference to figs. 7A-7B, various molecular (e.g., biomolecular) interactions can be detected and quantified. For example, if the recognition elements of both the donor and acceptor recognition probes target a protein, then protein-protein interactions will be detected, whereas if one of the recognition elements targets RNA and the other targets a protein, then RNA-protein interactions will be detected. Fig. 7C schematically illustrates the detection of protein-protein interactions, RNA-protein interactions, and protein-small molecule interactions using proximity-induced tagging. In the case of the protein-small molecule interaction shown in the lower panel of fig. 7C, it is noted that the small molecule is coupled to oligonucleotide 132 and acted upon by the protein, so that the recognition element of the acceptor recognition probe may be omitted. Non-limiting examples of biomolecules and corresponding recognition elements that can be used in the present assay are listed in Table 1:
Fig. 14 shows additional examples in which proximity-induced labeling is used to detect molecular interactions in the absence of one or both recognition elements. Here, one or both of the recognition elements may be omitted because the transposomes may be tethered directly to the target. The donor recognition probe can be directly attached to the first molecule (target X), as shown in the top panel. The acceptor recognition probe can be directly attached to the second molecule (target Y), as shown in the middle panel. Both the donor and acceptor recognition probes can be directly attached to their respective molecules, as shown in the lower panel.
Other examples of biological molecules and interactions that can be evaluated when the recognition probe is directly attached to the molecule of interest are shown in fig. 15A-15C. Fig. 15A-15C schematically illustrate exemplary operations and compositions in a process flow. FIG. 15A shows the detection of RNA modification on a specific RNA target. FIGS. 15B and 15C illustrate detection of molecular interactions using proximity-induced labeling. More specifically, fig. 15A shows a receptor recognition probe directly attached to an RNA modification for assessing the presence of an RNA molecule with the modification. Figure 15B shows donor recognition molecules attached directly to RNA molecules of interest to assess their interactions with proteins of interest. FIG. 15C shows a receptor recognition molecule attached directly to a protein of interest to assess the interaction of the protein with another protein of interest. In these examples, RNA modifications, RNA of interest, and protein of interest act as recognition elements in proximity-induced tagging assays. Further examples of biomolecules that can be used as recognition elements are provided in table 2, along with the corresponding interactions that can be evaluated:
Any mechanism for attaching a molecule of interest to a recognition probe may be used. For example, the protein of interest may be directly attached to the recognition probe by using a covalent attachment method (e.g., SNAP-tag). Additional attachment mechanisms may couple a donor probe or an acceptor probe to a nucleic acid via certain nucleotides (as described in Klocker et al., "Covalent labeling of nucleic acids," Chem Soc Rev., vol. 49, no. 23: 8749-8773, 2020) or via certain nucleotide modifications (as described in Wang et al., "Antibody-free enzyme-assisted chemical approach for detection of N6-methyladenosine," Nat Chem Biol., vol. 16, no. 8: 896-903, 2020; and Zhang et al., "Tet-mediated covalent labelling of 5-methylcytosine for its genome-wide detection and sequencing," Nat Commun., vol. 4: 1517, 2013).
It should be appreciated that any suitable combination of recognition elements may be used to detect any suitable number of analytes, which optionally may interact with each other. Illustratively, the first molecule may comprise a first protein or a first peptide; and the first recognition element may include a first antibody or first aptamer specific for the first protein or first peptide. Alternatively, for example, the first molecule may comprise a first target polynucleotide; and the first recognition element can include a first CRISPR-associated (Cas) protein specific for the first target polynucleotide. Alternatively, for example, the first molecule may comprise a first carbohydrate; and the first recognition element may include a first lectin specific to the first carbohydrate. Alternatively, for example, the first molecule may comprise a first biomolecule specific to the first recognition element. In examples of detecting interactions between the first molecule and the second molecule, the second molecule may comprise a second protein or a second peptide; and the second recognition element can include a second antibody or second aptamer specific for the second protein or second peptide. Alternatively, for example, the second molecule may comprise a second target polynucleotide; and the second recognition element comprises a second Cas protein specific for the second target polynucleotide. Alternatively, for example, the second molecule may comprise a second carbohydrate; and the second recognition element may include a second lectin specific to the second carbohydrate. Alternatively, for example, the second molecule may comprise a second biological molecule specific for the second recognition element.
As described elsewhere herein, the donor recognition probes of the invention may comprise a recognition element coupled to a transposase and a first oligonucleotide (the transposase and first oligonucleotide together may be referred to as a barcoded transposome), and may actually comprise an active transposome dimer, although the probe is sometimes illustrated in a simpler form. For example, an active transposome may carry two ME duplexes (which may be referred to elsewhere herein as annealed mosaic end transposon end sequences (ME, ME')), one ME duplex per monomer of transposase (e.g., Tn5). Any suitable method may be used to prepare the donor recognition probes of the invention. Figs. 8A-8C schematically illustrate exemplary process flows for preparing the donor recognition probe 120. In the example shown in fig. 8A, each recognition element 121 carries a copy of an oligonucleotide 122 onto which transposase 123 is loaded; the transposases of two such complexes then dimerize to form an active transposome. Thus, the donor recognition probe 120 shown in fig. 8A can comprise two transposases, two first recognition elements, and two first oligonucleotides, wherein the two transposases form a dimer, each of the transposases being coupled to a corresponding one of the first recognition elements via a corresponding one of the first oligonucleotides.
In another option, such as shown in fig. 8B, two or more oligonucleotides 122 are coupled to recognition element 121. Transposases 123 are loaded onto the corresponding oligonucleotides 122 and then dimerize to form an active transposome. Thus, the donor recognition probe 120 shown in fig. 8B can comprise two transposases, one first recognition element, and two first oligonucleotides, wherein the two transposases form a dimer, each of the transposases being coupled to the one first recognition element via a corresponding one of the two first oligonucleotides.
In another option, such as shown in fig. 8C, the active transposome is formed prior to conjugation with the recognition element. For example, oligonucleotide 122 may be prepared and loaded onto transposase 123, and the transposase dimerized to form an active transposome. One or more recognition elements may then be coupled to the active transposome. For example, the recognition element 121 can be conjugated to a first moiety 126, and the oligonucleotide 122 or transposase 123 can be conjugated to a second moiety 127 that reacts with the first moiety to form a bond. Illustratively, the first moiety 126 may include a click chemistry moiety, such as dibenzocyclooctyne (DBCO), and the second moiety 127 may include a complementary click chemistry moiety, such as an azide, that reacts with the first moiety to couple the recognition element 121 to the oligonucleotide 122 or transposase 123. In some examples, the recognition element 121 may be conjugated to DBCO or another suitable first moiety using NHS-PEG-DBCO, in a manner such as described in Gong et al., "Simple method to prepare oligonucleotide-conjugated antibodies and its application in multiplex protein detection in single cells," Bioconjugate Chemistry, vol. 27, no. 1: 217-225, 2016, the entire contents of which are incorporated herein by reference. In some examples, oligonucleotide 122 may be conjugated to an azide or other suitable second moiety using techniques known in the art. Active transposomes can be assembled by incubating synthetic oligonucleotides with a transposase (e.g., Tn5).
Such transposases may be introduced as monomers, or a proprietary dimeric form of the enzyme, in which a peptide linker joins the two monomer subunits, may be used in a manner such as described in Blundell-Hunter et al., "Transposase subunit architecture and its relationship to genome size and the rate of transposition in prokaryotes and eukaryotes," Nucleic Acids Research, vol. 46, no. 18: 9637-9646, 2018, the entire contents of which are incorporated herein by reference. The assembled transposome comprising the dimeric transposase and the synthetic oligonucleotides may then be incubated with the recognition element, resulting in reaction of the first moiety 126 with the second moiety 127 and thus covalently coupling the transposome to the recognition element (e.g., an antibody), thereby forming the donor recognition probe 120 as shown in fig. 8C. Thus, the donor recognition probe 120 shown in fig. 8C can comprise two transposases, one first recognition element, and two first oligonucleotides, wherein the two transposases form a dimer, at least one of the transposases being coupled to the one first recognition element via a covalent bond.
Regardless of the particular manner in which the donor and acceptor recognition probes of the present invention are prepared and the particular analyte to be detected, it may be useful to promote the specificity of the recognition elements by reducing background interactions. For example, a long incubation time may be used to drive binding between the recognition element and the analyte. During this incubation, there may be some non-specific binding of the transposomes of the donor recognition probes to the acceptor sites 134 of the acceptor recognition probes, and resulting labeling, in the absence of target binding. These non-specific interactions are expected to occur randomly, rather than preferentially between pairs of acceptor and donor recognition probes specific for the same analyte. Thus, a reporter polynucleotide having a sequence comprising non-corresponding barcodes may be filtered out using bioinformatics, for example, in the manner described with reference to fig. 1. The level of this type of background signal can also be monitored as a measure of assay performance.
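The bioinformatic filtering of reporter polynucleotides with non-corresponding barcodes can be sketched as follows. This is a minimal illustration that assumes each sequenced reporter has already been parsed into a (donor barcode, acceptor barcode) pair; the panel contents, barcode names, and function name are hypothetical and are not taken from the description.

```python
from collections import Counter

# Hypothetical panel: analyte name -> (donor barcode, acceptor barcode).
PANEL = {
    "analyte_A": ("ID-A1", "ID-A2"),
    "analyte_B": ("ID-B1", "ID-B2"),
}


def split_reporters(reporters, panel):
    """Separate reporters with corresponding barcodes from background.

    Returns (per_analyte_counts, background_count), where background
    reporters are those whose donor and acceptor barcodes do not belong
    to the same analyte in `panel`.
    """
    pair_to_analyte = {pair: name for name, pair in panel.items()}
    per_analyte = Counter()
    background = 0
    for donor, acceptor in reporters:
        analyte = pair_to_analyte.get((donor, acceptor))
        if analyte is None:
            background += 1  # non-corresponding barcodes: filter out
        else:
            per_analyte[analyte] += 1
    return per_analyte, background


# Hypothetical reads: 5 for analyte A, 3 for analyte B, 2 mismatched pairs.
reads = (
    [("ID-A1", "ID-A2")] * 5
    + [("ID-B1", "ID-B2")] * 3
    + [("ID-A1", "ID-B2")] * 2
)
counts, background = split_reporters(reads, PANEL)
background_rate = background / len(reads)
print(dict(counts), background, background_rate)
```

The `background_rate` value corresponds to the background level that the passage above suggests monitoring as a measure of assay performance.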
However, having too many of these background products may interfere with sensitivity and/or may need to be addressed by increasing sequencing depth. Any one of several parameters of the assay may be adjusted in order to further reduce background product formation. These parameters may include the concentration of the donor recognition probes 120, the concentration of the acceptor recognition probes 130, the incubation time, the incubation temperature, and/or buffer conditions (e.g., adding or removing Mg++). Additionally, or alternatively, the acceptor site 134 of the acceptor recognition probe may be shortened or modified (e.g., by methylation) in order to reduce the nonspecific affinity between the donor probe 120 and the acceptor probe 130. Additionally, or alternatively, non-hyperactive variants of transposases (e.g., non-hyperactive variants of Tn5) may be used to reduce the DNA binding strength of the transposomes, for example in a manner such as described in Wiegand et al., "Characterization of two hypertransposing Tn5 mutants," J. Bacteriol., volume 174, issue 4: pages 1229-1239, 1992, the entire contents of which are incorporated herein by reference.
Such mitigation, such as removal of magnesium, may reduce or inhibit premature enzymatic cleavage by the transposomes, but may not completely prevent non-specific DNA binding; see, for example, Amini et al., "Haplotype-resolved whole-genome sequencing by contiguity-preserving transposition and combinatorial indexing," Nat. Genet., volume 46: pages 1343-1349, 2014, the entire contents of which are incorporated herein by reference. To further reduce background product formation, additional components or changes to the workflow may be used. For example, fig. 9A-9E schematically illustrate exemplary compositions and operations for reducing background labeling during proximity-induced labeling. Fig. 9A shows an option in which dsDNA quencher molecules without priming sites for amplification are used to compete for non-specific interactions. Specific interactions may be less affected because the probes are brought into proximity by the presence of the analyte, and thus the concentration of quencher may be set at a level that reduces background product formation while having little or no effect on specific interactions. Figure 9B shows an option in which the transposomes are pre-bound to a blocking agent that can be degraded after washing away any unbound donor recognition probes. Options for degradable transposome blocking include a DNA blocking agent comprising uracil (USER degradable), a blocking agent with an RNA base (RNase degradable), or additional DNA at the 3' end of the ME sequence (cleaved by the transposome in the presence of Mg++). Further details regarding the blocking agents are provided below with reference to fig. 11A-11C. Fig. 9C shows an option in which the transposase receptor site 134 is initially single-stranded and, prior to the labeling, a complementary oligonucleotide is introduced that produces a dsDNA target for transposase (e.g., Tn5) binding. Figure 9D shows an option in which the transposomes are assembled in situ.
For example, the donor recognition probe may not include a transposase when bound to the analyte, and the transposase is added after the donor recognition probe binds to the analyte. Transposases assemble onto the blunt, annealed ME end (with or without magnesium) and then bind to oligonucleotides of adjacent acceptor probes. It is noted that this option may be used with an excess of donor recognition probes (over acceptor recognition probes), as it may be useful to increase the number of acceptor-analyte complexes that are able to form complexes with the correct donor recognition probes.
In the example shown in fig. 9E, a chemical blocking agent can be incorporated into the transposon sequence of the donor recognition probe to reduce or inhibit background tagging. For example, Tn5 requires a 3' hydroxyl group on the transposon for labeling, so providing a blocking group at the 3' hydroxyl group may reduce or inhibit the labeling activity in a manner such as shown in operation 990 of fig. 9E. One or more suitable reagents may then be used to remove the chemical blocking agent, such that the transposase may label the oligonucleotides of the receptor recognition probe in a manner such as that shown at operation 991 of fig. 9E. The transposase may then label the receptor probe using the deblocked 3' hydroxyl group in a manner such as that shown at operation 992 of fig. 9E. In the non-limiting example shown in fig. 9E, the 3' blocking group is an azidomethyl group that is cleaved under mild conditions using tris(2-carboxyethyl)phosphine (TCEP) to yield a 3' hydroxyl group that allows Tn5 to label the acceptor probe in a manner such as described elsewhere herein. However, many different chemically cleavable blocking agents (and related reagents) may be used, such as, for example, those described in Chen et al., "The history and advances of reversible terminators used in new generations of sequencing technology," Genomics, Proteomics & Bioinformatics, volume 11, issue 1: pages 34-40, 2013, the entire contents of which are incorporated herein by reference.
In examples such as those described with reference to fig. 9A-9E, blocking (e.g., a quencher, a blocking agent, a lack of double-stranded DNA to be tagged, and/or a lack of transposase) can be used to provide sufficient time to specifically form a correct complex between a recognition element and a corresponding analyte prior to tagging. After a sufficient time has elapsed, the transposase may be activated, and the preformed complex may be expected to react faster than the nonspecific interactions.
In some examples, a substrate such as a bead may be used to further reduce background product formation. For example, fig. 10A-10D schematically illustrate additional exemplary compositions and operations for reducing background labeling during proximity-induced labeling. The examples shown in fig. 10A-10D are similar to the examples described with reference to fig. 9A-9D, but include additional bead washes to remove unbound probes. More specifically, an acceptor recognition probe may be coupled to a substrate that pulls down the analyte coupled to that probe and the corresponding donor recognition probe. Any unbound donor recognition probes may be washed away prior to removing the block or otherwise activating the transposase. Alternatively, the donor recognition probe may be coupled to a substrate that pulls down the analyte coupled to the probe and the corresponding acceptor recognition probe. Any unbound receptor recognition probes may be washed away prior to removing the block or otherwise activating the transposase. Fig. 10A shows an option in which dsDNA quencher molecules without priming sites for amplification are used to compete for non-specific interactions. Figure 10B shows an option in which the transposomes are pre-bound to a blocking agent that can be degraded after washing away any unbound donor recognition probes. FIG. 10C shows an option in which the transposase receptor site 134 is initially single stranded and, prior to the labeling, a complementary oligonucleotide is introduced that produces a dsDNA target for transposase (e.g., Tn5) binding. Fig. 10D shows an option in which the transposomes are assembled in situ. It should be understood that the operations described with reference to fig. 9E may similarly be adapted for use with beads. In the examples shown in fig. 10A-10D, rather than relying on preformed complexes that react faster than nonspecific interactions, a different buffer (e.g., with Tween or other mild detergent) may be used to remove nonspecifically bound donor recognition probes prior to transposome deblocking or activation.
Fig. 11A-11C schematically illustrate additional exemplary compositions and operations for reducing background labeling during proximity-induced labeling (e.g., blocking-related examples as described with reference to fig. 9B and 10B). Fig. 11A shows an example of using a magnesium-activated blocking agent. During in vivo assembly of the active transposomes, the ME sequence is part of a longer DNA fragment, so additional bases may be present after the ME sequence. This also applies to in vitro reactions; for example, the transposomes may be assembled with additional DNA beyond the ME in a manner such as described in Gradman et al., "A bifunctional DNA binding region in Tn5 transposase," Molecular Microbiology, volume 67, issue 3: pages 528-540, 2008, the entire contents of which are incorporated herein by reference. As provided herein, the additional bases may include pure DNA or may include a nick prior to the ME region that is expected to improve transposome formation. The additional bases may occupy the non-specific DNA binding pocket and thus may need to be cleaved off before the transposome can bind to the target DNA (e.g., 134). Because magnesium (Mg++) is required for this cleavage, a "magnesium-activated" transposome is one that is assembled with this additional DNA at the end of oligonucleotide 122. Once magnesium is added, the transposase (e.g., Tn5) can cleave off the additional bases and can bind and tag the oligonucleotide 132.
Figure 11B shows a degradable blocking agent, such as a short blocking agent that can occupy a non-specific DNA binding pocket of a transposome and comprises degradable residues (e.g., uracil or RNA). Prior to the introduction of magnesium, the blocking agent is degraded (e.g., with USER or RNase), allowing the transposomes to bind to the target dsDNA. Magnesium may then be added to the reaction to allow proximity-induced labelling.
FIG. 11C shows a thermosensitive blocking agent that can be used similarly to a degradable blocking agent, but that contains short DNA fragments with several nicks. These nicks allow the molecule to more easily melt and separate into single-stranded DNA at relatively low temperatures (e.g., 30 ℃ to 50 ℃). The blocking agent may have a melting temperature that is lower than the melting temperature of the transposase receptor site 134. After analyte binding incubation (< 30 ℃), the reaction may be warmed to about the melting temperature of the blocking agent and below the melting temperature of the transposase receptor site 134. This allows the transposomes to bind to the transposase receptor site 134. Magnesium may then be added to the reaction to allow labelling.
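By way of non-limiting illustration, the temperature constraint on the warming step may be expressed as a simple check. This is a minimal sketch, assuming that the melting temperatures of the blocking agent and of site 134 are known from measurement or prediction. The function name, the `tolerance` parameter (an arbitrary way of modeling "about" the melting temperature), and the numerical values are hypothetical.

```python
def warming_temperature_ok(t_warm, tm_blocker, tm_site, tolerance=2.0):
    """Check a candidate warming temperature (all values in degrees Celsius).

    The temperature should be about the melting temperature of the blocking
    agent (within `tolerance` below it, or higher) so that the blocking agent
    melts, yet below the melting temperature of the receptor site so that
    the site remains double-stranded.
    """
    releases_blocker = t_warm >= tm_blocker - tolerance
    keeps_site_duplex = t_warm < tm_site
    return releases_blocker and keeps_site_duplex

# Hypothetical values: blocking agent melts near 40 C, receptor site near 60 C.
ok_at_42 = warming_temperature_ok(42.0, tm_blocker=40.0, tm_site=60.0)
ok_at_65 = warming_temperature_ok(65.0, tm_blocker=40.0, tm_site=60.0)
ok_at_25 = warming_temperature_ok(25.0, tm_blocker=40.0, tm_site=60.0)
```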
Other types of cleanup may be used after binding to accommodate complex sample types. For example, some sample types may have relatively high levels of contaminants that would affect the assay. To assay those types of samples, washing steps similar to those described with reference to fig. 10A to 10D may be used. More specifically, fig. 12 schematically illustrates exemplary compositions and operations for reducing contaminants during proximity-induced labeling. The receptor recognition probe may be coupled to a substrate that pulls down the analyte coupled to the probe; any unbound acceptor recognition probes, and any contaminants, may be washed away prior to addition of the donor recognition probes. Alternatively, the donor recognition probe may be coupled to a substrate that pulls down the analyte to which the probe is bound; any unbound donor recognition probes, and any contaminants, may be washed away prior to addition of the acceptor recognition probes. Proximity-induced tagging optionally may be further controlled in a manner such as described elsewhere herein, for example with reference to fig. 10A-10D. Additionally, in examples such as those described with reference to fig. 12, the volume of the reaction may be reduced. For example, after a first incubation and wash, the bound analyte may be resuspended in a smaller volume for a second incubation. Concentrating the reaction can accelerate probe-analyte binding and improve sensitivity for low-abundance analytes.
In examples such as those described with reference to fig. 10A-10D and 12, the acceptor recognition probe or the donor recognition probe may be coupled to the substrate in any suitable manner. Illustratively, the acceptor recognition probe or the donor recognition probe may comprise a biotin handle that binds to streptavidin beads.
Thus, some examples herein provide for inhibiting the activity of the transposase while specifically coupling the donor recognition probe to a first portion of the analyte and while specifically coupling the acceptor recognition probe to a second portion of the analyte, e.g., as described with reference to fig. 9A-9E and 10A-10D. A first condition of the fluid may be used to inhibit transposase activity. For example, the first condition of the fluid may include at least one of: (i) the presence of a sufficient amount of EDTA to inhibit the activity of the transposase, and (ii) the absence of a sufficient amount of magnesium ions for the activity of the transposase. Additionally, or alternatively, dsDNA quenchers may be used to inhibit transposase activity, e.g., as described with reference to fig. 9A and 10A. Additionally, or alternatively, the activity of the transposase may be inhibited by associating a blocking agent with the transposase, e.g., as described with reference to fig. 9B, 10B, and 11A-11C. Additionally, or alternatively, the activity of the transposase may be inhibited by a single stranded second oligonucleotide, e.g., as described with reference to fig. 9C and 10C. Additionally, or alternatively, the activity of the transposase may be promoted using a second condition of the fluid, for example, prior to using the transposase to produce the reporter polynucleotide. Illustratively, the second condition of the fluid may include the presence of a sufficient amount of magnesium ions for the activity of the transposase. Additionally, or alternatively, the activity of the transposase may be promoted by degrading the blocking agent, e.g., as described with reference to fig. 9B, 10B, and 11A-11C. Additionally, or alternatively, transposase activity may be facilitated by annealing a third oligonucleotide to a second oligonucleotide to form a double stranded polynucleotide, e.g., as described with reference to fig. 9C and 10C.
FIG. 13 illustrates an exemplary operational flow in a method for detecting an analyte using proximity-induced labeling. The method 1300 shown in fig. 13 may include coupling a donor recognition probe to a first portion of an analyte (operation 1301). The donor recognition probe can comprise a first recognition element specific for the first portion of the analyte, a first oligonucleotide corresponding to the first portion of the analyte, and a transposase coupled to the first recognition element and the first oligonucleotide. For example, the donor recognition probe 120 can be configured in a manner such as described with reference to fig. 1, 2, 8A, 8B, or 8C. The method 1300 may also include coupling the receptor recognition probe to a second portion of the analyte (operation 1302). The receptor recognition probe may comprise a second recognition element specific for the second portion of the analyte and a second oligonucleotide coupled to the second recognition element and corresponding to the second portion of the analyte. For example, the receptor recognition probe 130 may be configured as described with reference to fig. 1 or 3. Method 1300 may include generating a reporter polynucleotide comprising a first oligonucleotide and a second oligonucleotide using a transposase (operation 1303). For example, the transposase may perform proximity-induced tagging in a manner such as described with reference to fig. 1, 4A-4C, 5, 6, 7A, or 7C. Proximity-induced tagging may optionally be modulated in a manner such as described with reference to fig. 9A-9E, 10A-10D, 11A-11C, or 12. Method 1300 may include detecting an analyte based on a reporter polynucleotide comprising a first oligonucleotide and a second oligonucleotide (operation 1304). For example, the reporter polynucleotide may be sequenced, for example, using sequencing by synthesis.
The barcodes within the first and second oligonucleotides may be used to detect analytes to which the donor and acceptor recognition probes have bound, for example, to detect molecules, post-translational modifications, or molecules that interact with each other.
As an alternative to PCR-based amplification and sequencing techniques, other techniques may be used to detect analytes. For example, as shown in fig. 19, after proximity-induced tagging, a sample index primer may be added by ligation and polymerase extension to produce an elongated reporter polynucleotide comprising both a sample index and a barcode corresponding to the recognition element, and the reporter polynucleotide may be sequenced to identify the analyte.
Another option for detecting the presence of an analyte is to use an array of beads, as shown in fig. 20A-20B. FIG. 20A depicts proximity-induced labeling on a target protein, wherein the resulting reporter polynucleotide 2014 comprises barcodes ID-X1 and ID-X2. The bead 2010 may contain one or more capture probes 2011 designed to specifically hybridize to one of the barcodes ID-X1 and ID-X2. The sample may contain a detection probe 2012 that is labeled with a fluorophore 2013 and is designed to specifically hybridize to the other of barcodes ID-X1 and ID-X2. After incubating the sample to facilitate hybridization, the sample may be washed to remove reporter polynucleotides and detection probes that are not bound to the beads. The presence of the reporter polynucleotide can then be assessed, for example, by detecting and quantifying fluorescence from the fluorophore using a suitable imaging camera and detection circuitry in a manner similar to that described in international publication No. WO 2021/074087, the entire contents of which are incorporated herein by reference.
This method can be used to assess more than one analyte, as shown in FIG. 20B. For example, multiple species of target analytes, such as analytes with post-translational modifications, nucleotide modifications, and the like, may be assessed. Illustratively, the sample may comprise a reporter polynucleotide 2014 produced by labeling on the analyte, and a second reporter polynucleotide 2016 produced by labeling on a modified form of the analyte. Due to the presence of a common barcode (ID-X2 in this example), the bead 2010 may capture both of the reporting polynucleotides 2014, 2016. However, the second detection probe 2020 is designed to specifically hybridize to the free barcode of the second reporter polynucleotide 2016. The second detection probe 2020 may be labeled with a fluorophore 2018 that provides a different signal than the fluorophore 2013. Thus, when both analytes are present in a sample, they can be detected and quantified relative to each other by observing the total signal of fluorophores 2013 and 2018 and the ratio between these two signals in a manner similar to that described in international publication No. WO 2021/074087, the entire contents of which are incorporated herein by reference.
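By way of non-limiting illustration, the relative quantification from the two fluorophore signals may be sketched as follows. This is a minimal sketch, assuming that the two signals have already been background-subtracted and normalized so that equal amounts of fluorophores 2013 and 2018 would give equal values. The function name and the example intensities are hypothetical.

```python
def relative_quantification(signal_2013, signal_2018):
    """Return (total signal, fraction from fluorophore 2018, ratio 2018/2013).

    signal_2013: intensity attributed to the unmodified analyte's reporter.
    signal_2018: intensity attributed to the modified analyte's reporter.
    """
    total = signal_2013 + signal_2018
    if total <= 0:
        raise ValueError("no signal detected")
    modified_fraction = signal_2018 / total
    # The ratio is undefined when only the modified form is observed.
    ratio = signal_2018 / signal_2013 if signal_2013 > 0 else float("inf")
    return total, modified_fraction, ratio

# Hypothetical normalized intensities for one bead type.
total, modified_fraction, ratio = relative_quantification(300.0, 100.0)
```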
It will be appreciated that the sample may comprise any suitable number of different beads, each bead being specific for a different reporter polynucleotide. Thus, any number of analytes may be assessed in the sample, for example greater than 100, greater than 1,000, greater than 10,000, greater than 100,000, or greater than 1,000,000 analytes.
The beads 2010 may be coupled to a surface, for example, immobilized to a surface within a flow cell. In some examples, such coupling of the bead 2010 to the surface may be performed prior to coupling of the reporter polynucleotide 2014 to the bead. For example, a solution comprising reporter polynucleotides 2014 may flow over the surface-coupled beads, and the beads may capture from the solution the reporter polynucleotides specific for those beads. In other examples, such coupling of the bead 2010 to the surface may be performed after the reporter polynucleotide 2014 is coupled to the bead. For example, a solution comprising the reporter polynucleotide 2014 may be mixed with a solution comprising the beads 2010, resulting in a corresponding coupling between the beads 2010 and the reporter polynucleotides 2014 for which those beads are specific, and the beads may then be coupled to a surface, for example using bio-orthogonal conjugation chemistry (such as a Cu(I)-catalyzed click reaction between azide and alkyne, strain-promoted azide-alkyne cycloaddition between azide and DBCO (dibenzocyclooctyne), hybridization of oligonucleotides to complementary oligonucleotides, biotin-streptavidin, NTA-His tag, or SpyTag-SpyCatcher), charge-based immobilization (such as aminosilane or polylysine), or non-specific immobilization (such as a surface coated with a polymer).
The fluorophore may also be coupled to the corresponding reporter polynucleotide at any suitable time during the assay. For example, fluorophore 2013 may be coupled to reporter polynucleotide 2014 after the analyte is captured by reporter polynucleotide 2014, before reporter polynucleotide 2014 is coupled to bead 2010, or after reporter polynucleotide 2014 is coupled to bead 2010.
In additional examples, the detection probes may be removed, such as by dehybridization, and further analyzed by sequencing by synthesis or other suitable methods.
Fig. 21A-21B illustrate examples in which both reporter polynucleotide barcodes are used to hybridize to a bead array. For example, bead 2010 contains capture probe 2110 that contains two hybridization sites that specifically bind to the ID-X1 and ID-X2 barcodes on reporter polynucleotide 2014. The two hybridization sites are separated by a spacer 2111 to reduce steric constraints. By providing two hybridization sites, the capture probes 2110 have increased specificity such that an undesired reporter polynucleotide (e.g., in this case one with barcodes ID-X1 and ID-Y2) is only partially complementary to the capture probes and can be washed away with stringent washing (e.g., heating). As shown in FIG. 21A, for detection, a generic primer binding site 2114, e.g., primer C, on the reporter polynucleotide 2014 can be used to bind to the fluorescent detection probe 2112. FIG. 21B illustrates the mechanism of using an amplification template 2116 to increase the fluorescent signal. In this example, the amplification template 2116 hybridizes to the generic primer binding site 2114 and the 3' end of the primer binding site 2114 is extended. The sample contains fluorescently labeled nucleotides 2118 that are incorporated into the growing strand to produce an increased detection signal. In some examples, each nucleotide may be labeled with a different fluorophore. For example, guanine nucleotides can be labeled with a first fluorophore, thymine nucleotides can be labeled with a second fluorophore, and so on. The particular sequence of the extended strand 2120, and thus the number, sequence, spacing, and type of fluorophores in the extended strand 2120, may be defined by the sequence of the amplification template 2116. Different levels and colors of fluorescence can be provided by tuning the length and sequence of the amplification template 2116 to affect the number, density, and color of fluorescently labeled nucleotides coupled to the amplification template.
Additionally, as shown in fig. 21B, each incorporated nucleotide 2118 can be coupled to a second primer binding site 2122, which can each be extended by incorporation of a nucleotide, to further effect additional signal amplification cycles in a manner similar to that described in international publication No. WO 2021/074087, the entire disclosure of which is incorporated herein by reference.
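By way of non-limiting illustration, the way in which the amplification template sequence defines the number and type of incorporated fluorophores may be sketched as follows. This is a minimal sketch, assuming that every base of the copied template region directs incorporation of one labeled complementary nucleotide and that each nucleotide type carries its own fluorophore. The function name and the example sequence are hypothetical, and the additional amplification cycles via second primer binding sites are not modeled.

```python
from collections import Counter

# Watson-Crick complement for each template base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def incorporated_label_counts(template_region):
    """Count labeled nucleotides incorporated while copying `template_region`.

    Each template base directs incorporation of its complement, so the
    template composition sets how many fluorophores of each type are added.
    """
    return Counter(COMPLEMENT[base] for base in template_region)

# Hypothetical template region: three A and three C bases.
labels = incorporated_label_counts("AACCCA")
```

Under these assumptions, lengthening the template or changing its base composition changes the brightness and color balance of the extended strand.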
Another mechanism for increasing signal is rolling circle amplification. As shown in fig. 22, capture probes 2011 on beads 2010 hybridize to the first barcodes of the reporter polynucleotides 2014, 2016 and detection probes 2012, 2020 bind to the other barcodes of the reporter polynucleotides 2014, 2016. More than one detection probe 2012, 2020 may be used to bind different barcodes corresponding to different analytes detected by proximity-induced labeling. In this example, each detection probe 2012, 2020 comprises a 3' sequence 2210, 2212 (e.g., RCA1 and RCA2) complementary to the circular DNA template 2202, 2204. The circular DNA template comprises fluorophore-binding sequences 2206, 2208. A processive polymerase (e.g., phi29) can bind the 3' RCA sequences 2210, 2212 and produce copies of the circular DNA templates 2202, 2204. When the circular DNA is amplified, fluorescently labeled nucleotides can be incorporated into the growing copy at the replicated fluorophore-binding sequences 2206, 2208 in a manner similar to that described in international publication No. WO 2021/074087, the entire contents of which are incorporated herein by reference. When two forms of analyte are assessed, the fluorophore-binding sequence 2206 can recruit a different fluorophore than the fluorophore-binding sequence 2208. When the amplification process is stopped, the signals of both the fluorophore specific for the binding sequence 2206 and the fluorophore specific for the binding sequence 2208 can be quantified and the ratio between the two signals can be determined.
The use of bead arrays for detection and quantification of analytes is further described in WO 2021/074087, the entire contents of which are incorporated herein by reference.
From the foregoing, it will be appreciated that proximity-induced labelling using recognition elements coupled to the transposomes with active barcodes can generate reporter polynucleotides in an irreversible (covalent) process, thereby reducing the likelihood of non-specific background noise and providing specific detection and quantification of analytes of interest. Additionally, proximity-induced tagging covalently links barcodes from a pair of corresponding recognition elements in a reporter polynucleotide. Ligating barcodes from the respective donor and acceptor recognition probes allows identification and filtering of any nonspecific or off-target labelling from the dataset, thereby further improving the specificity of the assay. Precise control of transposome activity is provided, for example, by using a double-stranded DNA handle to inhibit hybridization of a consensus region. This provides control over the onset of labelling and may improve the specificity and signal to noise ratio of the assay. In some examples, covalent attachment of the barcode via labeling may provide for simultaneous measurement of PTM and total protein amounts in a single assay, for example by introducing a third protein recognition element specific for PTM, as well as an additional unique barcode. It will be further appreciated that the method of the invention may be used to measure interactions between molecules, including highly multiplexed protein-protein, protein-RNA or protein-small molecule interactions, thereby allowing additional information to be obtained about molecular interactions in a sample.
Compositions and methods for detecting analytes using proximity-induced strand invasion, restriction or ligation
Some examples herein provide for detection of an analyte using proximity-induced strand invasion, restriction or ligation.
As provided herein, proximity-induced strand invasion, restriction or ligation is an alternative mechanism to address the problem of detecting analytes (such as proteins or other biomolecules). Described herein are high throughput methods for detecting a protein, sugar, or biological substance of interest in a biological sample. A biological or synthetic molecule (e.g., antibody, toxin, ligand, lectin, etc.) linked to a nucleotide sequence can bind to a target or analyte of interest. The nucleotide sequence may be analyzed to determine the identity of the target or analyte of interest. High throughput sequencing methods can be used to analyze sequences, allowing detection and quantification of millions of targets or analytes of interest. For example, array technology can be used as part of a large-scale parallel detection scheme to identify and quantify targets or analytes of interest.
Whole Genome Amplification (WGA) can be used to identify and quantify targets or analytes of interest. There are different WGA methods. These methods include WGA methods requiring a Polymerase Chain Reaction (PCR) step and WGA methods relying on isothermal reaction steps rather than PCR. In some examples, the identification and quantification of the target or analyte of interest is determined using WGA comprising an isothermal reaction. In some examples, WGA comprises isothermal Multiple Displacement Amplification (MDA), a WGA method that relies on a strand-displacing DNA polymerase to amplify genomic DNA.
An additional technique that may be used to identify and quantify targets or analytes is Targeted Genomic Amplification (TGA). TGA focuses on targets or analytes that are, or are derived from, a specific subset of genes within the genome. Alternative mechanisms for identifying and quantifying targets rely on capturing nucleotide sequences corresponding to targets or analytes on the surface of beads (bead capture), and amplifying these nucleotide sequences. Non-limiting methods for amplifying the nucleotide sequences coupled to the beads include bridge amplification, kinetic exclusion amplification (ExAmp), and the like.
Fig. 24A-24D schematically illustrate an exemplary procedure for proximity-induced ligation assays using splint oligonucleotides. In the non-limiting example shown in fig. 24A, first antibody 3000 and second antibody 3010 interact with analyte 3020 in a manner such as described elsewhere herein. The first and second antibodies are non-limiting examples of recognition elements capable of interacting with the analyte, and any other recognition element may be used, such as described elsewhere herein. The first oligonucleotide 3030 is attached to the first antibody (or other recognition element) and the second oligonucleotide 3040 is attached to the second antibody (or other recognition element). As shown in fig. 24B, splint oligonucleotide 3050 binds to the ends of both first oligonucleotide 3030 and second oligonucleotide 3040, thereby allowing the first oligonucleotide to be ligated to the second oligonucleotide to form reporter oligonucleotide 3035. Ligation may be performed, for example, using any suitable ligase. The sequence of splint oligonucleotide 3050 may be selected such that the splint oligonucleotide may facilitate such ligation between substantially only the first oligonucleotide 3030 and the second oligonucleotide 3040, rather than between any other pair of oligonucleotides. For example, splint oligonucleotide 3050 may comprise a first portion that is complementary to a sufficient number of bases at the 3' end of first oligonucleotide 3030 to hybridize to the first oligonucleotide, and may comprise a second portion that is complementary to a sufficient number of bases at the 5' end of second oligonucleotide 3040 to hybridize to the second oligonucleotide. Thus, splint oligonucleotide 3050 may be used to couple first oligonucleotide 3030 with second oligonucleotide 3040, thereby producing reporter oligonucleotide 3035.
Additionally, if any oligonucleotides other than the first oligonucleotide 3030 and the second oligonucleotide 3040 are brought into proximity with each other, e.g., due to non-specific binding to the analyte 3020 or random interactions in solution, the splint oligonucleotide 3050 will not hybridize sufficiently to both oligonucleotides to facilitate ligation of those oligonucleotides to each other.
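By way of non-limiting illustration, the relationship between a splint oligonucleotide and the two oligonucleotides it bridges may be sketched as follows. This is a minimal sketch, assuming all sequences are written 5' to 3', that the splint spans an equal number of bases (`arm`) on each side of the junction, and that hybridization requires perfect complementarity. The function names, the arm length, and the sequences are hypothetical; real splint design would also consider melting temperature and secondary structure.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def design_splint(first_oligo, second_oligo, arm=4):
    """Design a splint bridging the 3' end of `first_oligo` and the
    5' end of `second_oligo` (all sequences written 5'->3')."""
    junction = first_oligo[-arm:] + second_oligo[:arm]
    return reverse_complement(junction)

def splint_bridges(splint, first_oligo, second_oligo, arm=4):
    """True if `splint` is perfectly complementary to the junction of the pair."""
    return splint == design_splint(first_oligo, second_oligo, arm)

# Hypothetical oligonucleotides standing in for 3030, 3040, and an unrelated oligo.
first, second, unrelated = "ACGTCCAA", "GGCTTTTT", "TTTTGGGG"
splint = design_splint(first, second)
```

In this sketch the splint matches only the junction of the corresponding pair, which mirrors the selectivity described above for non-corresponding oligonucleotides.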
The ligated reporter oligonucleotide 3035 may be amplified and/or sequenced in any suitable manner such as provided herein or in a manner such as known in the art, and the sequence of the ligated oligonucleotide may be used to identify an analyte. In some examples, one or more of primers 3060, 3070, and 3080 may be used to amplify reporter oligonucleotide 3035 (fig. 24C). For example, primers 3060, 3070, and 3080 can have sequences selected to bind different portions of the first oligonucleotide 3030 and/or the second oligonucleotide 3040 within the reporter oligonucleotide 3035. Suitable polymerases can be used to extend the primers using the sequence of the first oligonucleotide and/or the second oligonucleotide, thereby forming double-stranded amplicons of the reporter oligonucleotide 3035. The amplicon may comprise a sequence complementary to any suitable portion of the sequence of the first oligonucleotide 3030 and/or the second oligonucleotide 3040. The amplified fragments can then be analyzed using WGA (shown in fig. 24D). For example, as shown in fig. 24D, a plurality of amplicons may be complementary to both a portion of the first oligonucleotide 3030 and a portion of the second oligonucleotide 3040, such that sequencing the amplicons provides the sequence of a portion of the first oligonucleotide and a portion of the second oligonucleotide. From the presence of these two sequences, the identity of the analyte can be determined. For example, in a manner similar to that described above with respect to proximity-induced tagging, the first oligonucleotide 3030 may include a first sequence corresponding to the analyte 3020 and the second oligonucleotide 3040 may include a second sequence corresponding to the analyte 3020. The presence of analyte 3020 in a sample may be determined based on the presence of the first sequence and the second sequence in the reporter oligonucleotide (or an amplicon thereof).
Additionally, the amount of reporter oligonucleotide (or amplicon thereof) may be used to determine the amount of analyte 3020 in a manner similar to that described above with respect to proximity-induced tagging. In some examples, the oligonucleotides (e.g., oligonucleotide 3030, oligonucleotide 3040, and/or reporter oligonucleotide 3035) attached to the probe comprise a barcode. In some examples, the oligonucleotides (e.g., oligonucleotide 3030 and/or oligonucleotide 3040) attached to the probe comprise a partial barcode. In such examples, coupling oligonucleotide 3030 with oligonucleotide 3040 in a manner such as described with reference to fig. 24B can produce a complete barcode composed of the partial barcodes.
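As a non-limiting illustration of how partial barcodes may be used, the following minimal sketch counts sequencing reads that contain a complete barcode, i.e., both partial barcodes joined as they would be in a ligated reporter oligonucleotide. The analyte names, barcode sequences, and reads are hypothetical placeholders, and exact substring matching is an assumption made for simplicity; a practical pipeline would typically tolerate sequencing errors.

```python
from collections import Counter

# Hypothetical partial barcodes (first oligonucleotide, second oligonucleotide).
PARTIAL_BARCODES = {
    "analyte_A": ("ACGTAC", "TTGCAA"),
    "analyte_B": ("GGATCC", "CATGAT"),
}


def complete_barcodes(partials):
    """Map each complete barcode (first half + second half) to its analyte."""
    return {first + second: name for name, (first, second) in partials.items()}


def count_analytes(reads, partials):
    """Count reads that contain a complete barcode for each analyte."""
    lookup = complete_barcodes(partials)
    counts = Counter()
    for read in reads:
        for barcode, name in lookup.items():
            if barcode in read:
                counts[name] += 1
    return counts


reads = [
    "NNACGTACTTGCAANN",  # both halves joined: ligated reporter for analyte_A
    "NNACGTACNNNNNNNN",  # first half only: unligated oligonucleotide
    "GGATCCCATGATNNNN",  # ligated reporter for analyte_B
    "NNACGTACTTGCAANN",  # second ligated reporter for analyte_A
]
counts = count_analytes(reads, PARTIAL_BARCODES)
```

A read carrying only one partial barcode is not counted, which mirrors the requirement that both the first sequence and the second sequence be present in the reporter oligonucleotide.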
In other examples, TGA or bead capture (methods described herein) can be used to analyze amplicons. Non-limiting examples of using bead capture to analyze amplicons are further described above and further described below with reference to fig. 25A-25C.
In examples such as shown in fig. 24A-24D, the first antibody 3000 and the second antibody 3010 form probes that can be used to determine the identity of an analyte. In some examples, biomolecules other than antibodies, or synthetic molecules, may be used as probes to bind and detect analytes. In some examples, the probe is a biological or synthetic molecule comprising an amino acid sequence. In some examples, the probe is a biological or synthetic molecule comprising a nucleic acid sequence. In some examples, the probe is a biological or synthetic molecule comprising a combination of amino acid and nucleic acid sequences. In some examples, the probe includes a lectin. In some examples, the probe includes an aptamer. In some examples, the probe includes a lectin and an aptamer. In some examples, the probe includes a lectin and an antibody. In some examples, the probe includes an aptamer and an antibody. Still other options are contemplated based on the teachings herein.
In some examples, the probe incorporates a label that can be detected. In some examples, the label comprises a fluorescent label. In some examples, the label comprises a fluorophore. In some examples, the label includes an enzyme. In some examples, the label comprises biotin. In some examples, the label comprises a hapten.
FIGS. 25A-25C schematically illustrate examples of ways to distinguish between ligated and unligated oligonucleotides. Probes may be designed such that they are optimized to detect the ligated product. For example, the ligated oligonucleotide 4000 comprising a single nucleotide polymorphism (SNP) 4005 can be used to form a stable duplex incorporating a hapten-labeled modified base 4010 (fig. 25A). The duplex is resistant to stringent washing procedures. In contrast, the unligated oligonucleotide 4020 comprising SNP 4025 may not form a thermostable duplex with the oligonucleotide 4030 comprising hapten-labeled modified base 4040, because the nucleotide overlap between these oligonucleotides is minimal (fig. 25B). Stringent washing steps result in removal of such unbound oligonucleotides. In some cases, unligated oligonucleotides may be capable of forming stable duplexes with oligonucleotides comprising hapten-labeled modified bases (fig. 25C). However, such unligated oligonucleotides do not contain the SNP.
It will be appreciated that any suitable splint oligonucleotide may be used to generate the reporter polynucleotide using the first oligonucleotide 3030 and the second oligonucleotide 3040. For example, fig. 26A-26C schematically illustrate another exemplary procedure for proximity-induced ligation assays using splint oligonucleotides. As shown in fig. 26A, a first recognition element (e.g., antibody) 4070 and a second recognition element (e.g., antibody) 4080 interact with the analyte 4090 in a manner similar to that described with reference to fig. 24A-24D. The first splint oligonucleotide 5000 binds both a first portion of a first oligonucleotide 5010 attached to the first recognition element (e.g., an antibody) and a first portion of a second oligonucleotide 5020 attached to the second recognition element (e.g., an antibody). Additionally, second splint oligonucleotide 5001 binds to a second portion of first oligonucleotide 5010 and a second portion of second oligonucleotide 5020. As shown in fig. 26A, the binding of the first splint oligonucleotide 5000 to the first and second oligonucleotides 5010, 5020 and the binding of the second splint oligonucleotide 5001 to the first and second oligonucleotides 5010, 5020 results in ligation of the first splint oligonucleotide to the second splint oligonucleotide to form a circular reporter oligonucleotide 5002. Ligation may be performed, for example, using any suitable ligase.
The respective sequences of splint oligonucleotides 5000 and 5001 may be selected to facilitate such ligation substantially only between first splint oligonucleotide 5000 and second splint oligonucleotide 5001, but not between any other two pairs of oligonucleotides. For example, the first splint oligonucleotide 5000 may comprise a first portion that is complementary to a sufficient number of bases along the first oligonucleotide 5010 to hybridize with the first oligonucleotide, and may comprise a second portion that is complementary to a sufficient number of bases along the second oligonucleotide 5020 to hybridize with the second oligonucleotide. Similarly, second splint oligonucleotide 5001 may include a first portion that is complementary to a sufficient number of bases along first oligonucleotide 5010 to hybridize with the first oligonucleotide and may include a second portion that is complementary to a sufficient number of bases along second oligonucleotide 5020 to hybridize with the second oligonucleotide. Thus, splint oligonucleotides 5000, 5001 may be used to couple first oligonucleotide 5010 with second oligonucleotide 5020, thereby producing reporter oligonucleotide 5002. Additionally, if any of the oligonucleotides other than the first oligonucleotide 5010 and/or the second oligonucleotide 5020 are brought into proximity with each other, for example due to non-specific binding to the analyte 4090 or random interactions in solution, the splint oligonucleotides 5000, 5001 will not hybridize sufficiently to both oligonucleotides to facilitate ligation of both splint oligonucleotides to each other.
Exonucleases can be used to degrade the first and second oligonucleotides 5010 and 5020, as well as any splint oligonucleotides that have not formed a circular reporter oligonucleotide, resulting in isolation of circular reporter oligonucleotide 5002 shown in FIG. 26B, which is resistant to such degradation. In a manner similar to that described with reference to fig. 24C, multiple primers (illustratively, 5030, 5040, and 5050) can be used to amplify the circular reporter oligonucleotide (fig. 26B). Whole genome amplification (WGA) can then be used to amplify and analyze the fragment 5090 (shown in fig. 26C) to determine the identity of the analyte in a manner similar to that described with reference to fig. 24D. The isolated circular reporter oligonucleotides may be analyzed using other techniques (such as TGA and bead capture), as described herein.
Fig. 27A-27B illustrate a flow of operations in an exemplary method for detecting an analyte using a splint oligonucleotide according to some examples herein. Referring first to fig. 27A, a method 2700 includes coupling a first recognition probe to a first portion of an analyte, the first recognition probe including a first recognition element specific for the first portion of the analyte and a first oligonucleotide corresponding to the first portion of the analyte (operation 2701). For example, in the manner described with reference to fig. 24A, first recognition element 3000 (illustratively, a first antibody) is coupled to a first portion of analyte 3020. Alternatively, for example, a first recognition element 4070 (illustratively, a first antibody) is coupled to a first portion of the analyte 4090 in a manner as described with reference to fig. 26A. Non-limiting examples of recognition elements and analytes are described elsewhere herein. For example, the first recognition probe or the second recognition probe may comprise an antibody, lectin, or aptamer. Illustratively, the first recognition probe may comprise a first antibody, a first lectin, or a first aptamer, and the second recognition probe may comprise a second antibody, a second lectin, or a second aptamer. In one non-limiting example, the analyte comprises molecules that interact with each other in a manner as described elsewhere herein.
Still referring to fig. 27A, method 2700 can include coupling a second recognition probe to a second portion of the analyte, the second recognition probe including a second recognition element specific for the second portion of the analyte and a second oligonucleotide corresponding to the second portion of the analyte (operation 2702). For example, a second recognition element 3010 (illustratively, a second antibody) is coupled to a second portion of analyte 3020 in a manner such as described with reference to fig. 24A. Alternatively, for example, a second recognition element 4080 (illustratively, a second antibody) is coupled to a second portion of the analyte 4090 in a manner such as described with reference to fig. 26A. Non-limiting examples of recognition elements and analytes are described elsewhere herein.
The method 2700 shown in fig. 27A may further include coupling the first oligonucleotide to the second oligonucleotide using a splint oligonucleotide having complementarity to both a portion of the first oligonucleotide and a portion of the second oligonucleotide to form a reporter oligonucleotide coupled to the first recognition probe and the second recognition probe (operation 2703). For example, in a manner such as described with reference to fig. 24B, linear splint oligonucleotide 3050 may comprise a first sequence that is complementary to a portion of first oligonucleotide 3030, and a second sequence that is complementary to a portion of second oligonucleotide 3040. In some examples, linear splint oligonucleotide 3050 and a ligase may be used to ligate first oligonucleotide 3030 to second oligonucleotide 3040 to form reporter oligonucleotide 3035. In another example, in a manner such as described with reference to fig. 26A, the first splint oligonucleotide 5000 and the second splint oligonucleotide 5001 may each comprise a first sequence complementary to a respective portion of the first oligonucleotide 5010 and a second sequence complementary to a respective portion of the second oligonucleotide 5020. In some examples, a ligase is used to ligate the first splint oligonucleotide 5000 and the second splint oligonucleotide 5001 to each other, thereby forming a reporter oligonucleotide 5002 that couples the first oligonucleotide 5010 to the second oligonucleotide 5020. In some examples, the first oligonucleotide comprises a partial barcode and the second oligonucleotide comprises a partial barcode, and coupling the first oligonucleotide to the second oligonucleotide produces a complete barcode corresponding to the target analyte.
The method 2700 shown in fig. 27A may further include performing sequence analysis of the reporter oligonucleotide (operation 2704). In some examples, sequence analysis includes amplification of the reporter oligonucleotide, e.g., using WGA, TGA, or bead-based amplification, such as described elsewhere herein. Non-limiting examples of performing WGA to amplify a reporter oligonucleotide are described with reference to fig. 24C-24D and fig. 26B-26C. Optionally, a portion of the double-stranded oligonucleotide formed prior to or during such amplification may be excised, and sequence analysis may be performed on the excised portion of the double-stranded oligonucleotide. Such excision may be performed, for example, using CRISPR-associated (Cas) proteins, restriction enzymes, and the like.
The method 2700 may also include detecting an analyte based on sequence analysis of the reporter oligonucleotide (operation 2705). In some examples, performing sequence analysis includes performing a Polymerase Chain Reaction (PCR) on the reporter oligonucleotide. In some examples, the reporter oligonucleotide comprises a Unique Molecular Identifier (UMI) that is amplified during PCR.
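As a non-limiting illustration of how UMIs may be used when the reporter oligonucleotide is amplified by PCR, the following minimal sketch collapses reads sharing the same barcode and UMI so that each original reporter molecule is counted once regardless of how many PCR copies were sequenced. The barcode labels, UMI sequences, and the assumption that reads have already been parsed into (barcode, UMI) pairs are hypothetical; UMI error correction is omitted.

```python
from collections import defaultdict


def count_unique_molecules(parsed_reads):
    """Count distinct UMIs per analyte barcode.

    `parsed_reads` is an iterable of (analyte_barcode, umi) tuples.
    PCR duplicates share both values and are therefore counted once.
    """
    umis_by_barcode = defaultdict(set)
    for barcode, umi in parsed_reads:
        umis_by_barcode[barcode].add(umi)
    return {barcode: len(umis) for barcode, umis in umis_by_barcode.items()}


# Hypothetical parsed reads: two PCR copies of one molecule, plus two others.
parsed_reads = [
    ("BC_A", "AAAC"),
    ("BC_A", "AAAC"),  # PCR duplicate of the read above
    ("BC_A", "GGTT"),
    ("BC_B", "CCAT"),
]
molecule_counts = count_unique_molecules(parsed_reads)
```

Counting distinct UMIs rather than raw reads is one way the amount of reporter oligonucleotide, and hence the amount of analyte, may be estimated without bias from uneven amplification.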
While fig. 24A-24D, 25A-25C, 26A-26C, and 27A may focus on interactions between the first and second recognition probes and analytes for which those recognition probes are selective, it should be appreciated that such interactions may be multiplexed. For example, the sample may contain a plurality of different analytes that may be detected, for example, by contacting the analytes with a plurality of different recognition probes that each correspond to an analyte that may be present in the sample, and with a plurality of different splint oligonucleotides that correspond to these recognition probes.
For example, fig. 27B illustrates an exemplary operational flow in a method 2750 for detecting multiple analytes in a sample. The method 2750 can include incubating the sample with a plurality of pairs of recognition probes and a plurality of splint oligonucleotides (operation 2751). Each pair of recognition probes includes a first recognition probe and a second recognition probe, and each pair of recognition probes is specific for a respective analyte of the analytes. Additionally, each first recognition probe and each second recognition probe is coupled to a respective oligonucleotide. Exemplary configurations of recognition probes and exemplary oligonucleotides are described elsewhere herein, for example, with reference to fig. 24A-24D and 26A-26C. Each splint oligonucleotide may be complementary to portions of the oligonucleotide that are coupled to a first recognition probe and a second recognition probe of a pair of recognition probes specific for the respective analyte, respectively, and complementary binding of each splint oligonucleotide to the oligonucleotides coupled to the first recognition probe and the second recognition probe results in formation of a reporter oligonucleotide. Exemplary configurations of splint oligonucleotides and their use for forming reporter oligonucleotides are described with reference to fig. 24A-24D and fig. 26A-26C. Illustratively, incubating the sample in operation 2751 further may include incubating with a ligase.
The method 2750 may further include washing the sample to remove any unbound recognition probes and any unbound splint oligonucleotides (operation 2752). The method 2750 may also include performing sequence analysis of the reporter oligonucleotide, e.g., after the washing operation 2752 (operation 2753). Non-limiting examples of targets are provided elsewhere herein. For example, performing sequence analysis may include using any one or more of microarrays, bead arrays, library preparation, or PCR. Method 2750 may also include detecting multiple analytes based on the sequence analysis. Exemplary methods for detecting analytes based on sequence analysis are described elsewhere herein. It should be appreciated that while a variety of analytes, recognition probes, and splint oligonucleotides may be incubated with one another for a given sample during operation 2751, the recognition probe pair is specific for a given analyte, and the splint oligonucleotide is specific for the recognition probe pair, thereby providing a relatively high degree of specificity in the detection of the analyte. Additionally, sequence analysis of various reporter oligonucleotides can be performed in a multiplexed manner, thereby providing rapid analysis of different analytes in a sample without the need to perform different assays on the different analytes separately.
Some examples herein provide a kit comprising a plurality of pairs of recognition probes and a plurality of splint oligonucleotides. In a manner similar to that discussed with reference to operation 2751 of fig. 27B and elsewhere herein, each pair of recognition probes includes a first recognition probe and a second recognition probe, each pair of recognition probes being specific for a respective analyte of the analytes, and each first recognition probe and each second recognition probe being coupled to a respective oligonucleotide. Additionally, in a manner similar to that discussed with reference to operation 2751 of fig. 27B and elsewhere herein, each splint oligonucleotide is complementary to portions of the oligonucleotides that are coupled, respectively, to the first and second recognition probes of a pair of recognition probes specific for a corresponding analyte of the analytes. Illustratively, the kit may be used in a manner such as described with reference to fig. 27B. During use of such a kit and/or during implementation of method 2750, operations such as described with reference to fig. 27A may be performed.
Still other procedures and compositions may be used to generate reporter oligonucleotides on which sequence analysis may be performed, and that sequence analysis may be used to identify analytes. For example, fig. 28A-28D schematically illustrate an exemplary process for a proximity-induced strand invasion assay. As shown in fig. 28A, a first recognition element (e.g., antibody) 5060 and a second recognition element (e.g., antibody) 5070 interact with an analyte 5080. The first recognition element (e.g., an antibody) is attached to double-stranded oligonucleotide 5090, and the second recognition element (e.g., an antibody) is attached to single-stranded oligonucleotide 6000. The 5' end of single-stranded oligonucleotide 6000 invades double-stranded oligonucleotide 6010 (fig. 28B). For example, the 3'-terminated strand of double-stranded oligonucleotide 6010 in fig. 28A may hybridize to the 5'-terminated strand of double-stranded oligonucleotide 6010 less strongly than the 5' end of single-stranded oligonucleotide 6000 does. Thus, single-stranded oligonucleotide 6000 may partially displace the 3'-terminated strand of double-stranded oligonucleotide 6010 in fig. 28A, thereby forming a double-stranded oligonucleotide as shown at 6010 in fig. 28B. Strand invasion brings the barcodes 6020 on each strand into proximity with each other (fig. 28C). Primer 6030 may be used to amplify the barcodes (fig. 28D). Quantitative detection (such as array or sequencing techniques) may be used to analyze the amplified barcodes to determine the identity of the analyte in a manner such as described elsewhere herein.
Fig. 29 illustrates an operational flow in an exemplary method 2900 for detecting an analyte using proximity-induced strand invasion, according to some examples herein. The method 2900 shown in fig. 29 may include coupling a first recognition probe to a first portion of an analyte (operation 2901). The first recognition probe may comprise a first recognition element specific for the first portion of the analyte and a double-stranded oligonucleotide comprising a first barcode corresponding to the first portion of the analyte, e.g., in a manner such as described with reference to fig. 28A-28D. Method 2900 may also include coupling a second recognition probe to a second portion of the analyte (operation 2902). The second recognition probe may comprise a second recognition element specific for the second portion of the analyte and a single-stranded oligonucleotide comprising a second barcode corresponding to the second portion of the analyte, e.g., in a manner such as described with reference to fig. 28A-28D. Non-limiting examples of recognition elements and analytes are provided elsewhere herein. Method 2900 may also include hybridizing the single-stranded oligonucleotide to a single oligonucleotide strand of the double-stranded oligonucleotide to form a reporter oligonucleotide comprising the first barcode and the second barcode (operation 2903). In some examples, the hybridization operation includes strand invasion of the double-stranded oligonucleotide by the single-stranded oligonucleotide. Such strand invasion may be performed in a manner such as described with reference to fig. 28B. Method 2900 may also include performing sequence analysis of the reporter oligonucleotide (operation 2904). Non-limiting examples of sequence analysis are provided elsewhere herein. Illustratively, the sequence analysis performed may include any one or more of isothermal bead-based amplification, targeted genomic amplification, and whole genome amplification. Fig. 28D illustrates one potential way of performing sequence analysis. Method 2900 may also include detecting the analyte based on the sequence analysis of the reporter oligonucleotide (operation 2905). Exemplary procedures for detecting analytes based on sequence analysis of reporter oligonucleotides are provided elsewhere herein. Optionally, detecting the analyte comprises performing a quantitative detection of the reporter oligonucleotide.
In yet other examples, proximity-induced restriction is used to detect analytes. For example, fig. 30A-30D schematically illustrate an exemplary process for a proximity-induced restriction assay. As shown in fig. 30A, a first recognition element (e.g., an antibody) 6040 and a second recognition element (e.g., an antibody) 6050 interact with an analyte 6060 in a manner such as described elsewhere herein. The first recognition element (e.g., an antibody) is linked to a first single-stranded oligonucleotide 6070 and the second recognition element (e.g., an antibody) is linked to a second single-stranded oligonucleotide 6080. Each of the first single-stranded oligonucleotide and the second single-stranded oligonucleotide comprises a restriction endonuclease site 6090. Complementary portions of the first single-stranded oligonucleotide and the second single-stranded oligonucleotide hybridize to each other, for example, at the position labeled 7000 in fig. 30B. For example, a portion of the first oligonucleotide 6070 may be complementary to a portion of the second oligonucleotide 6080 such that the oligonucleotides hybridize to each other. The hybridized oligonucleotides may be cleaved at restriction site 6090 (fig. 30C), for example using a restriction endonuclease such as EcoRI. In some examples, the cleaved DNA can be amplified with primer 7010 (fig. 30D). Quantitative detection such as array or sequencing techniques can be used to analyze the cleaved DNA. In some examples, these single-stranded oligonucleotides comprise any restriction endonuclease site known in the art.
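As a non-limiting illustration of cleavage at restriction endonuclease sites, the following minimal sketch splits a sequence at each EcoRI recognition site (GAATTC, cut after the first G on the strand shown). The example sequence and function name are hypothetical, and only one strand is modeled; the staggered cut on the complementary strand and the requirement that the site be double-stranded (i.e., that the two oligonucleotides be hybridized) are not represented.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1   # EcoRI cuts between G and AATTC on the strand shown


def digest(seq, site=ECORI_SITE, offset=ECORI_CUT_OFFSET):
    """Split `seq` (written 5'->3') at every occurrence of `site`.

    Returns the list of fragments in order; a sequence with no site
    is returned as a single fragment.
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + offset])
        start = pos + offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


# Hypothetical hybridized product with a restriction site flanking each side
# of an internal region; the middle fragment plays the role of the reporter.
product = "AAAGAATTCCCCGAATTCTTT"
fragments = digest(product)
reporter = fragments[1]
```

With two sites present, the internal fragment released by cutting at both sites corresponds in this sketch to the cleaved DNA that may subsequently be amplified and analyzed.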
FIG. 31 illustrates an operational flow in an exemplary method for detecting an analyte using proximity-induced restriction according to some examples herein. The method 3100 illustrated in fig. 31 includes coupling a first recognition probe to a first portion of the analyte (operation 3101). The first recognition probe can comprise a first recognition element specific for the first portion of the analyte and a first oligonucleotide corresponding to the first portion of the analyte, wherein the first oligonucleotide comprises a first restriction endonuclease site. The method 3100 may further include coupling a second recognition probe to a second portion of the analyte (operation 3102). The second recognition probe can comprise a second recognition element specific for the second portion of the analyte and a second oligonucleotide corresponding to the second portion of the analyte, wherein the second oligonucleotide comprises a second restriction endonuclease site. Operations 3101 and 3102 may be performed, for example, in the manner described with reference to fig. 30A. Non-limiting examples of recognition elements and analytes are provided elsewhere herein. Method 3100 can further include coupling the first oligonucleotide to the second oligonucleotide (operation 3103). For example, a portion of the first oligonucleotide 6070 may be hybridized to the second oligonucleotide 6080 in a manner such as described with reference to fig. 30A-30B. The method 3100 can further include cleaving the first oligonucleotide and the second oligonucleotide at the first restriction endonuclease site and the second restriction endonuclease site to form a reporter oligonucleotide (operation 3104). The cleavage optionally may include the use of one or more restriction endonucleases. Alternatively, instead of including restriction endonuclease sites in the first and second oligonucleotides, sequences that can be targeted by and excised by CRISPR-Cas ribonucleoproteins can be included.
Method 3100 can include performing sequence analysis of the reporter oligonucleotide (operation 3105), e.g., in a manner such as described elsewhere herein. For example, the sequence analysis performed may include any one or more of isothermal bead-based amplification, targeted genomic amplification, and whole genome amplification. Method 3100 can also include detecting an analyte based on sequence analysis of the reporter oligonucleotide (operation 3106), e.g., in a manner such as described elsewhere herein. Optionally, detecting the analyte comprises performing a quantitative detection of the reporter oligonucleotide.
Compositions and methods for targeted epigenetic assays
Some examples herein provide for enrichment of polynucleotides (such as DNA) to generate fragments of epigenetic interest, and for assaying proteins at loci along those fragments. Several non-limiting examples of assays with specific workflow operations and ordering are given, but other examples can be readily envisioned. In the present examples, oligonucleotides that are subsequently sequenced may be used to label loci, and the sequences of these oligonucleotides may be used to characterize the proteins respectively coupled to such loci. For example, the sequence of the oligonucleotide may provide information about the presence of the protein at the locus of a given fragment, may provide information about the location of the protein at the locus of a given fragment, may provide information about the amount of the protein at the locus of a given fragment, or any suitable combination of such information. Fragments may be enriched; e.g., fragments that are bound to a protein may be specifically selected from a given polynucleotide, amplified, and sequenced to obtain information therefrom, while other portions of that polynucleotide and portions of other polynucleotides may not be amplified or sequenced, and thus may be discarded. Such locus-related proteomic analysis illustratively can be used to provide genome-wide proteomic profiles that complement whole genome sequencing to provide enhanced characterization of relationships between genotype and phenotype, or to better characterize epigenetic features associated with a particular locus and understand epigenetic mechanisms important for research or clinical applications and therapies. For example, while previously known techniques may allow for detection of the binding location of a single protein at a time, the present epigenetic assay provides for targeted, multiplexed detection of multiple proteins across an entire chromosome, or even across an entire genome.
As provided herein, complexes comprising a transposome conjugated to an antibody can be used to generate fragments of polynucleotides, and optionally fragments of polynucleotides within a whole genome (WG) sample. The transposomes of the complexes may label each of the fragments with oligonucleotides corresponding to the specific proteins to which those fragments are coupled. For example, as will now be described, the loci of a polynucleotide may be labeled using a mixture of complexes, each comprising an antibody specific for a different protein coupled to one of those loci. Each of the complexes may further comprise one or more transposomes, each of the one or more transposomes optionally may comprise a dimer of a transposase, and each of these transposases may be coupled to an oligonucleotide for labeling that locus, thereby characterizing the protein coupled to that locus. For example, a transposome conjugated to an antibody can cleave a polynucleotide and add the oligonucleotide to the cleaved end in a process that can be referred to as "tagmentation". For example, for whole polynucleotides or even for WG samples, the resulting fragments and the corresponding sequences of oligonucleotides added by the transposomes can be used to identify, in a multiplexed manner, the proteins that were coupled to those fragments.
For example, composition 3800 shown in fig. 38A comprises polynucleotide P in contact with a mixture of complexes specific for different types of proteins, such as first complex 3841, second complex 3842, and third complex 3843. Illustratively, the polynucleotide P may be contacted with the first complex 3841, the second complex 3842, and the third complex 3843 using a fluid 3860 in which such complexes are provided. The polynucleotide P may comprise different types of proteins coupled to respective loci thereof, e.g., may comprise proteins 3801 and 3802 at the respective loci, as well as chromatin 3803 (e.g., nucleosomes comprising DNA wrapped around histones). The polynucleotide P may correspond to a representative polynucleotide within a purified, isolated whole genome sample from a cell or tissue. Alternatively, polynucleotide P may be enriched, for example, using Cas9-based methods such as described in international patent application No. PCT/US2022/019252 entitled "Genomic Library Preparation and Targeted Epigenetic Assays Using Cas-gRNA Ribonucleoproteins," filed March 8, 2022, the entire contents of which are incorporated herein by reference. As provided herein, proteins 3801 and 3802 may be assayed substantially without disrupting the interaction between polynucleotide P and the proteins.
Each of the complexes 3841, 3842, 3843 may include an antibody corresponding to (selective for) a type of protein, an oligonucleotide corresponding to that type of protein, and a transposome that may be activated under certain conditions. The transposome may comprise an oligonucleotide comprising a mosaic end (ME) sequence and a sequence identifying the protein to which the antibody corresponds. For example, first complex 3841 comprises first antibody 3811 coupled to first transposome 3821 comprising first oligonucleotide 3831. Second complex 3842 comprises second antibody 3812 coupled to second transposome 3822 comprising second oligonucleotide 3832. Third complex 3843 comprises third antibody 3813 coupled to third transposome 3823 comprising third oligonucleotide 3833. In a non-limiting example such as shown in fig. 38A, each antibody can be coupled to more than one transposome. For example, first complex 3841 may comprise first antibody 3811 coupled to two transposomes 3821, second complex 3842 may comprise second antibody 3812 coupled to two transposomes 3822, and third complex 3843 may comprise third antibody 3813 coupled to two transposomes 3823. However, each complex may comprise a single transposome coupled to each antibody, or more than two transposomes coupled to each antibody, or two antibodies coupled to each transposome, or more than two antibodies coupled to each transposome.
Each of the transposomes may comprise any suitable number of oligonucleotides, such as one or more oligonucleotides. For example, each transposome 3821 may comprise two first oligonucleotides 3831 (one coupled to each transposase), each of the transposomes 3822 may comprise two second oligonucleotides 3832 (one coupled to each transposase), and each of the transposomes 3823 may comprise two third oligonucleotides 3833 (one coupled to each transposase). The transposomes 3821, 3822, 3823 may be otherwise substantially identical to each other, but they are differently shaded from each other in fig. 38A, and similarly shaded with antibodies to which they are respectively coupled, to facilitate visual differentiation. The oligonucleotides 3831, 3832, 3833 may have one or more subsequences that are common to each other, as well as different one or more subsequences. Further details regarding the first oligonucleotide 3831, the second oligonucleotide 3832, and the third oligonucleotide 3833 are provided below with reference to fig. 39A-39B and fig. 44. Further details regarding the preparation of the complexes 3841, 3842, 3843 are provided below with reference to fig. 40A-40C, fig. 41, fig. 42, fig. 45, and fig. 46A-46B.
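As a non-limiting illustration of oligonucleotides that share a common subsequence and differ in a protein-identifying subsequence, the following minimal sketch builds one oligonucleotide per protein and identifies the protein from the end of a tagged fragment. The barcode sequences, the protein labels, and the placement of the barcode immediately 5' of the common sequence are hypothetical. The 19-base Tn5 mosaic end is used only as an illustrative common sequence and is an assumption here; the actual oligonucleotide layouts are those described with reference to fig. 39A-39B and fig. 44.

```python
# Illustrative common subsequence (assumption: the 19-base Tn5 mosaic end).
ME_SEQUENCE = "AGATGTGTATAAGAGACAG"

# Hypothetical protein-identifying subsequences, one per antibody specificity.
PROTEIN_BARCODES = {
    "protein_3801": "ACGTAC",
    "protein_3802": "TGCATG",
    "third_protein": "GATCGA",
}


def build_oligo(protein):
    """Return barcode + common ME sequence for the given protein label."""
    return PROTEIN_BARCODES[protein] + ME_SEQUENCE


def identify_protein(tagged_fragment):
    """Return the protein whose oligonucleotide begins the tagged fragment,
    or None if no known barcode followed by the ME sequence is found."""
    for protein, barcode in PROTEIN_BARCODES.items():
        if tagged_fragment.startswith(barcode + ME_SEQUENCE):
            return protein
    return None


# A fragment as it might read after tagging: oligonucleotide, then genomic DNA.
genomic_part = "GGGCCCATAT"  # hypothetical locus sequence
tagged = build_oligo("protein_3802") + genomic_part
called = identify_protein(tagged)
```

Because the barcode identifies the antibody specificity and the remainder of the read identifies the locus, a single read in this sketch links a protein to the fragment to which it was coupled.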
Each of the antibodies 3811, 3812, 3813 is specific for a different protein, which may or may not be coupled to a locus of the polynucleotide P. It will be appreciated that the polynucleotide P may be contacted with any suitable number and type of different complexes respectively comprising antibodies specific for different proteins potentially coupled to loci along the polynucleotide P (and indeed along each polynucleotide of the WG sample). Additionally, it should be appreciated that the polynucleotide P (and indeed each polynucleotide of the WG sample) may comprise any suitable number and type of different proteins at loci along that polynucleotide. Any antibodies in the mixture that are specific for proteins coupled to corresponding loci of polynucleotide P, together with their corresponding transposomes and oligonucleotides, can become coupled to those proteins. In the non-limiting example shown in fig. 38B, a first antibody 3811 is specific for and coupled to a first protein 3801, while a second antibody 3812 is specific for and coupled to a second protein 3802. Note that in this example, multiple copies of second protein 3802 are coupled to one of the loci, and multiple second antibodies 3812 in the mixture become coupled to the proteins at that locus (for ease of distinction, the second of these antibodies is labeled 3812' and its transposomes are labeled 3822'). In this example, the portion of the polynucleotide P shown in fig. 38A-38B does not contain the protein for which the third antibody 3813 is specific, and thus that antibody (and its corresponding transposomes and oligonucleotides) does not become coupled to that portion of the polynucleotide. Proteins 3801 and 3802 may be transcriptionally active and thus of interest to assay, for example, to determine which specific proteins (such as transcription factors, repressors, etc.) bind to which specific loci of polynucleotide P.
At the particular time shown in fig. 38A and 38B, fluid 3860 may optionally have conditions that permit the activity of the antibodies 3811, 3812, 3813 while inhibiting the activity of the transposomes 3821, 3822, 3823. For example, it is well known that different enzymes may require certain ions in order to function. Illustratively, the transposomes 3821, 3822, 3823 may use magnesium ions (Mg2+) to function, for example, to couple a corresponding oligonucleotide to the target polynucleotide P, while the presence or absence of magnesium ions may not affect the activity of the antibodies 3811, 3812, 3813. Additionally, or alternatively, the presence of ethylenediaminetetraacetic acid (EDTA) in the fluid 3860 may inhibit the activity of the transposomes 3821, 3822, 3823, while the presence or absence of EDTA may not affect the activity of the antibodies 3811, 3812, 3813. Thus, polynucleotide P may be contacted with fluid 3860 under conditions in which the transposomes 3821, 3822, 3823 are inhibited while the antibodies 3811, 3812, 3813 can function normally, such conditions including: (i) the presence of a sufficient amount of EDTA to inhibit the activity of the transposomes 3821, 3822, 3823, (ii) the absence of a sufficient amount of magnesium ions for the activity of the transposomes 3821, 3822, 3823, or (iii) a combination of a sufficient amount of EDTA and the absence of a sufficient amount of magnesium ions. Additionally, or alternatively, binding of the transposomes may be inhibited in any suitable manner, such as reversibly blocking a binding site on the transposome, using a temperature for antibody binding that differs from the temperature used for transposome binding, and/or delaying binding of the transposase adapter to the transposase until after the antibody has bound, thereby delaying the binding capacity of the transposome, and so forth. Additionally, or alternatively, a sufficiently low concentration of complex may be used such that any off-target tagging yields a product that may not be amplifiable and thus may not be detected by sequencing.
After any antibodies in fluid 3860 become coupled to the corresponding proteins on polynucleotide P, the transposomes to which those antibodies are coupled may be activated, thereby adding the corresponding oligonucleotides to the polynucleotide in the manner shown in fig. 38C. For example, the conditions of fluid 3860 may be altered in a manner that promotes transposome activity. Illustratively, a sufficient amount of magnesium ions may be added to the fluid 3860 for activity of the transposomes 3821, 3822, 3822'. In response to such a change in the conditions of the fluid, the first transposome 3821 may add the first oligonucleotide 3831 to a corresponding location in the polynucleotide P, and the second transposomes 3822, 3822' may add the second oligonucleotides 3832, 3832' to corresponding locations in the polynucleotide P, while also cleaving the polynucleotide into multiple fragments. These fragments can then be released from the first and second complexes 3841 and 3842, from the proteins 3801 and 3802, and from other chromatin 3803 to provide the composition 3800' shown in fig. 38D. Such release may be performed using proteinase K, sodium dodecyl sulfate (SDS), or both proteinase K and SDS. In addition to or instead of using fluid conditions, the transposomes 3821, 3822, 3823 may be selected so as to have a relatively low activity, e.g., so as to tag the polynucleotide P substantially only when held in sufficient proximity to it by the corresponding antibody. For example, transposases may be mutated to modulate their activity, and/or ME sequences may be altered to modulate transposome activity, in a manner such as described in Reznikoff, "Tn5 as a model for understanding DNA transposition," Mol. Microbiol., volume 47, issue 5: pages 1199-1206 (2003), the entire contents of which are incorporated herein by reference.
The ends of the fragments 3851, 3852 that are coupled to a protein for which an antibody is selective comprise oligonucleotides corresponding to that protein. One end of each fragment 3853 that is not coupled to a protein for which an antibody is selective comprises an oligonucleotide corresponding to the protein coupled to the adjacent fragment on that side, and the other end of that fragment comprises an oligonucleotide corresponding to the protein coupled to the adjacent fragment on the other side. Further details and examples of tagging and exemplary fragments resulting therefrom are provided with reference to fig. 41, 42, 43, 44, 45, 46A-46B and 47A-47C.
It is noted that the length of a fragment may be related to the size and/or amount of the protein at the locus of that fragment. For example, as shown in fig. 38C, a transposome may be able to extend from a corresponding antibody a distance defined by the nature of the coupling between the transposome and the antibody. Thus, when the antibody 3811 is coupled to a corresponding protein 3801 in the polynucleotide P and the transposome 3821 is activated (e.g., using fluid conditions), the transposome may become coupled to a region of the polynucleotide relatively close to the antibody and thus relatively close to the protein at any position where coupling is allowable, illustratively between 1-20 bases, or between 2-15 bases, or between 5-10 bases, respectively. Additionally, the binding of the transposomes may be inhibited by any protein (e.g., chromatin 3803) that occupies the location the transposomes would otherwise bind. Such inhibition may affect the size of fragments generated using the transposomes.
For antibodies 3812, 3812' coupled to protein 3802, the situation is more complicated because more than one protein is coupled to that locus. As shown in fig. 38C, one of the transposomes 3822 coupled to antibody 3812 may add the second oligonucleotide to the polynucleotide P on one side of the protein 3802, while one of the transposomes 3822' coupled to antibody 3812' may add the second oligonucleotide to the polynucleotide on the other side of the protein 3802. During the preparation of the transposome-antibody complex, the distance between the transposome and the antibody, and thus the distance between the protein and the oligonucleotide added to the polynucleotide P, can be controlled. Illustratively, the transposome 3822 may add the second oligonucleotide to the polynucleotide P within about 10 bases on one side of the second protein 3802, and the transposome 3822' may add the second oligonucleotide to the polynucleotide P within about 10 bases on the other side of the second protein 3802. Note that because multiple copies of the second protein are present at that locus, the distance between the second oligonucleotide 3832 added by the transposome 3822 and the second oligonucleotide 3832' added by the transposome 3822' can differ significantly from the distance between the first oligonucleotides 3831 added by the transposomes 3821. For example, the distance between the first oligonucleotides 3831 added by the transposomes 3821 may approximately correspond to the lateral distances that the transposomes extend on either side from the antibody 3811. In contrast, the distance between the second oligonucleotides 3832, 3832' added by the transposomes 3822, 3822' may approximately correspond to the lateral distance that the transposome 3822 may extend from the antibody 3812, plus the distance occupied by the proteins 3802, plus the lateral distance that the transposome 3822' may extend from the antibody 3812'.
The number of proteins at each locus may be determined based on the respective lengths of the fragments. Thus, it is understood that fragment 3851 has a length corresponding to the presence of one copy of protein 3801, while fragment 3852 has a length corresponding to the presence of two copies of protein 3802.
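The relationship between fragment length and protein copy number can be sketched as follows. This is only an illustrative model, not a method stated in the source: it assumes that a fragment's length is roughly the lateral reach of a transposome on each side plus a fixed footprint per protein copy. The function name, the default reach of 10 bases (taken from the "within about 10 bases" example above), and the 30-base footprint are all hypothetical.

```python
def estimate_protein_copies(fragment_len: int, reach: int = 10, footprint: int = 30) -> int:
    """Estimate protein copies n from fragment_len ~ 2 * reach + n * footprint.

    reach:     assumed lateral distance (bases) a transposome extends from its antibody.
    footprint: assumed number of bases occupied by one protein copy (hypothetical value).
    """
    if footprint <= 0:
        raise ValueError("footprint must be positive")
    occupied = fragment_len - 2 * reach  # bases attributed to bound protein
    return max(0, round(occupied / footprint))
```

Under these assumed parameters, a 50-base fragment would be read as one copy and an 80-base fragment as two copies; real footprints and reaches would need to be calibrated for each protein and complex.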
Fragments 3851, 3852, 3853 may be amplified and sequenced. As shown in FIG. 38E, amplification can result in extended fragments 3851', 3852', 3853', including pairs of full oligonucleotides at the ends of these fragments. Fragments produced by the corresponding transposomes, including the corresponding oligonucleotides (or the absence thereof), may be sequenced in parallel with one another using any suitable method, such as by performing SBS on the fragments to which the corresponding oligonucleotides have been added. Thus, the sequence of each fragment can be determined together with the sequence of the oligonucleotide corresponding to the protein to which that fragment was coupled. Because these fragments can be generated simultaneously with each other, and each fragment can be individually tagged with oligonucleotides that identify the protein present, the proteins bound along an entire polynucleotide, or even across the entire WG sample, can be determined in a multiplexed manner so as to identify the particular protein at each particular locus of that polynucleotide. For example, a second aliquot of the same polynucleotide may be sequenced, e.g., using SBS, without using the present epigenetic assay. The sequences of the different fragments produced by the present epigenetic assay can be compared to the sequence of that polynucleotide, and based on such comparison, the position of each of these fragments within the entire polynucleotide can be determined. Based on the oligonucleotides located at the ends of these fragments (which are not present in the polynucleotide sequenced without the present epigenetic assay), the proteins coupled to those fragments can be identified.
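The comparison described above can be sketched in Python. This is a simplified illustration under stated assumptions: exact substring matching stands in for a real sequence aligner, the end oligonucleotides are assumed to have already been trimmed and decoded into a barcode, and the function name and the barcode-to-protein table are hypothetical.

```python
def locate_fragments(reference, tagged_fragments, barcode_to_protein):
    """Place each fragment on the reference and report the protein its barcode names.

    reference:          sequence of the polynucleotide obtained without the assay.
    tagged_fragments:   iterable of (barcode, fragment_sequence) pairs.
    barcode_to_protein: dict mapping barcode -> protein label (assumed lookup table).
    Returns a sorted list of (start, end, protein) tuples; unplaced fragments are skipped.
    """
    hits = []
    for barcode, fragment in tagged_fragments:
        start = reference.find(fragment)  # exact match; a real pipeline would align
        if start < 0:
            continue
        protein = barcode_to_protein.get(barcode, "unknown")
        hits.append((start, start + len(fragment), protein))
    return sorted(hits)


reference = "ACGTTGCAAGGCTTACCGGATATCGGCTA"
fragments = [("ID2", "GATATCGG"), ("ID1", "AAGGCTTACC")]
proteins = {"ID1": "protein 3801", "ID2": "protein 3802"}
loci = locate_fragments(reference, fragments, proteins)
```

Each returned tuple pairs a position on the reference with the protein identified by the fragment's barcode, which is the locus-to-protein association that the assay is intended to provide.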
It will be appreciated that any suitable oligonucleotide sequences may be used. Fig. 39A schematically shows exemplary oligonucleotides that may be used in the process flows of fig. 38A-38E. In the non-limiting example shown in fig. 39A, oligonucleotides 3831, 3832, 3833 each comprise: a primer 3910 (e.g., an A14 forward primer) for amplifying the corresponding fragment; a respective barcode 3921, 3922, 3923 corresponding to the protein for which the respective antibody is specific; a respective UMI 3931, 3932, 3933 that can be used to identify the particular fragment coupled to the protein; and a mosaic end (ME) transposon end 3940, the ME transposon end being coupled to a corresponding transposase. The oligonucleotides may have the primer 3910 and the ME transposon end 3940 in common with each other, while their barcodes and UMIs differ. While individual exemplary oligonucleotides are shown in fig. 39A, each corresponding to a different protein, it should be understood that fluid 3860 may comprise a plurality of complexes corresponding to the same protein as each other, e.g., a plurality of complexes 3841, a plurality of complexes 3842, and a plurality of complexes 3843, each coupled to a corresponding oligonucleotide. The UMIs of the oligonucleotides can be used to distinguish fragment molecules from each other even when such fragments are coupled to proteins of the same type as each other. For example, fig. 39A shows an oligonucleotide 3831' that corresponds to the same protein as the oligonucleotide 3831 and thus contains the same barcode 3921 as the oligonucleotide 3831, as well as the same primer 3910 and ME transposon end 3940 as the other oligonucleotides. However, oligonucleotide 3831' comprises a UMI 3931' that is different from UMI 3931 of oligonucleotide 3831.
Similarly, any other oligonucleotides corresponding to the same protein as the oligonucleotides 3831, 3831' may have the same primer 3910, barcode 3921 and ME transposon end 3940 as each other, but each may have a UMI different from that of the other such oligonucleotides. Thus, each fragment generated using such an oligonucleotide may become coupled to an oligonucleotide comprising a different UMI, and such UMI may be used to identify which protein has been coupled to that particular fragment molecule.
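The oligonucleotide layout described above (a common primer, a protein-specific barcode, a per-molecule UMI, and a common ME end) can be sketched in Python. This is a minimal illustration only: the sequences are placeholders, and the fixed barcode and UMI lengths and the function names build_oligo and parse_oligo are assumptions rather than details given in the source.

```python
PRIMER = "TCGTCGGCAGCGTC"    # placeholder standing in for primer 3910
ME_END = "AGATGTGTATAAGAGACAG"  # placeholder standing in for ME transposon end 3940
BARCODE_LEN = 6              # assumed barcode length
UMI_LEN = 8                  # assumed UMI length


def build_oligo(barcode: str, umi: str) -> str:
    """Concatenate primer, barcode, UMI and ME end in 5'-to-3' order."""
    if len(barcode) != BARCODE_LEN or len(umi) != UMI_LEN:
        raise ValueError("unexpected barcode or UMI length")
    return PRIMER + barcode + umi + ME_END


def parse_oligo(oligo: str) -> tuple[str, str]:
    """Recover (barcode, umi) from an oligonucleotide built by build_oligo."""
    if not (oligo.startswith(PRIMER) and oligo.endswith(ME_END)):
        raise ValueError("primer or ME end not found")
    middle = oligo[len(PRIMER):len(oligo) - len(ME_END)]
    if len(middle) != BARCODE_LEN + UMI_LEN:
        raise ValueError("unexpected barcode/UMI region length")
    return middle[:BARCODE_LEN], middle[BARCODE_LEN:]


oligo_3831 = build_oligo("AACCGG", "ACGTACGT")        # same barcode,
oligo_3831_prime = build_oligo("AACCGG", "TTGCAATG")  # different UMI
```

Two oligonucleotides for the same protein, such as the two built at the end, share the primer, barcode and ME end and differ only in the UMI, so parsing them yields the same barcode and different UMIs.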
For example, FIG. 39B schematically shows fragments coupled to the exemplary oligonucleotides of FIG. 39A, more specifically, a fragment 3851' coupled to the oligonucleotide 3831 at each of its ends, and a fragment 3851" coupled to the oligonucleotide 3831' at each of its ends. Fragments 3851', 3851" may be produced using operations such as those described with reference to fig. 38A-38E, wherein different molecules of complex 3841 are selectively coupled to different molecules of protein 3801 and thus produce different fragment molecules. From the barcode 3921 within the sequences of the oligonucleotides 3831, 3831', it can be understood that these fragments were coupled to proteins of the same type as each other, and from the UMIs within the sequences of the oligonucleotides 3831, 3831', it can be understood that these fragments were generated using different molecules of the complex 3841. It is noted that during fragment amplification such as described with reference to fig. 38E, both the barcode and the UMI are amplified, and thus each resulting amplicon can be associated with the correct protein molecule that was initially coupled to the corresponding molecule of complex 3841. It will be appreciated that other fragments coupled to other proteins may have other oligonucleotides at their ends. Additionally, these fragments may be significantly longer than these oligonucleotides. Additional non-limiting examples of oligonucleotides and fragments are provided below with reference to FIG. 44.
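Because both the barcode and the UMI are copied during amplification, amplicons can be collapsed back to the original tagged molecules. The following Python sketch illustrates one way to do so, assuming each sequenced amplicon has already been reduced to a (barcode, UMI) pair; the function name and the barcode-to-protein table are hypothetical and sequencing errors in UMIs are ignored.

```python
from collections import defaultdict


def count_original_molecules(amplicons, barcode_to_protein):
    """Count distinct UMIs per protein, collapsing amplification duplicates.

    amplicons:          iterable of (barcode, umi) pairs, one per sequenced amplicon.
    barcode_to_protein: dict mapping barcode -> protein label (assumed lookup table).
    """
    umis_by_protein = defaultdict(set)
    for barcode, umi in amplicons:
        protein = barcode_to_protein.get(barcode, "unknown")
        umis_by_protein[protein].add(umi)  # duplicates of a UMI count once
    return {protein: len(umis) for protein, umis in umis_by_protein.items()}


amplicons = [
    ("AACCGG", "ACGTACGT"),  # e.g., copies of fragment 3851'
    ("AACCGG", "ACGTACGT"),
    ("AACCGG", "TTGCAATG"),  # e.g., a copy of fragment 3851"
]
counts = count_original_molecules(amplicons, {"AACCGG": "protein 3801"})
```

Here three amplicons collapse to two original fragment molecules for the same protein type, mirroring the distinction between barcode (protein type) and UMI (individual molecule) drawn above.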
The complex may be prepared by coupling the transposome to the corresponding antibody in any suitable manner prior to contact with polynucleotide P. Illustratively, each antibody may be coupled to the corresponding transposome via a covalent bond or via a non-covalent bond. Illustratively, covalent bonds may be formed via copper(I)-catalyzed click reactions or strain-promoted azide-alkyne cycloadditions. The non-covalent bond may be formed in any suitable manner. For example, fig. 40A-40C schematically show further details of complexes such as may be used in the process flows of fig. 38A-38E. It should be appreciated that the complex 3841 shown in fig. 40A-40C may comprise any suitable number of transposomes 3821 coupled to the antibody 3811, although only one such transposome is shown for simplicity of illustration. It is also understood that the particular coupling between the antibody 3811 and the transposome 3821 may define a distance that the transposome may extend from the antibody, e.g., about 1 to 20 bases, or about 2 to 15 bases, or about 5 to 10 bases, in the manner described with reference to fig. 38A-38E.
In some examples, in a manner such as that shown in fig. 40A, complex 3841 may comprise a transposome 3821 coupled to an antibody 3811 via a reaction between any suitable elements (such as click chemistry reactants) or antigen-antibody coupling. For example, antibody 3811 may comprise or may be coupled (covalently or non-covalently) to element 4062, and transposome 3821 may be coupled (covalently or non-covalently) to element 4061 which may be suitably reacted with element 4062 to couple antibody 3811 with transposome 3821. In some examples, antibody 3811 may comprise multiple active sites. One or more of these active sites may be used to couple a corresponding transposome to antibody 3811 in a manner such as that shown in fig. 40A, and another one or more of these active sites may be used to selectively couple an antibody to a protein on a polynucleotide. In one specific example, the transposome 3821 is coupled to protein a (optionally, the transposome 3821 and protein a form a fusion protein), and protein a may be coupled to antibody 3811 in a manner such as described in more detail with reference to fig. 41. In some examples, the transposomes may be modified so as to target a desired antibody, e.g., so as to fuse with a common region of an antibody, but it should be understood that any suitable number of transposomes may be coupled to any suitable portion of an antibody using any suitable technique. Alternatively, in a manner such as that shown in fig. 40B, complex 3841 may comprise a transposome 3821 coupled to antibody 3811 via an alternative coupling between element 4061 'and element 4062'. It is to be understood that complex 3841 may comprise one or more additional transposomes 3821 coupled to antibody 3811, although only one such transposome is shown in fig. 40B for simplicity of illustration. 
Elements 4061, 4061' and 4062, 4062' may, for example, include pairs of reactants such as: SNAP protein with O-benzylguanine; CLIP protein with O-benzylcytosine; SpyTag with SpyCatcher; biotin with streptavidin; NTA with His tag; anti-FLAG antibody with FLAG tag; and the like.
As yet another example, in a manner such as that shown in fig. 40C, a portion of complex 3841' may comprise an antibody 3811 non-covalently coupled to a first subunit (transposase) 3821' of a transposome via an oligonucleotide 4063 that the antibody 3811 has been modified to comprise. Oligonucleotide 4063 may comprise a sequence corresponding to the type of protein for which antibody 3811 is selective, and an ME sequence. A complementary oligonucleotide may anneal only to that ME sequence so as to make the ME sequence double-stranded. Antibody 3811 and the transposase (single subunit) can be incubated together so that the transposase binds that double-stranded ME. Similar operations may be performed for the other subunit of the transposome, for example in a manner similar to that described with reference to fig. 45. The two subunits may then dimerize to form complex 3841. The resulting transposome 3821 may comprise two separate ME sequences, each of which couples the antibody to a respective subunit. Custom oligonucleotide-conjugated antibodies are commercially available or can be prepared using known techniques, for example, such as described in the following references, the entire contents of each of which are incorporated herein by reference: Gong et al., "Simple method to prepare oligonucleotide-conjugated antibodies and its application to multiplex protein detection in single cells," Bioconjugate Chem., volume 27: pages 217-225 (2016); and Stoeckius et al., "Simultaneous epitope and transcriptome measurement in single cells," Nature Methods, volume 14: pages 865-868 (2017).
Additional non-limiting examples of transposome-antibody complexes of the invention, methods of labeling using such complexes, oligonucleotides that may be added during labeling, and amplification of such oligonucleotides will now be described with reference to fig. 41, 42, 43, 44, 45, 46A-46B, and 47A-47C.
Referring now to fig. 41, another exemplary procedure for generating complexes each comprising a transposome coupled to an antibody is schematically illustrated. A plurality of fusion proteins, each comprising a transposome 4121 coupled to protein A 4162, may be produced in a manner similar to that described in Kaya-Okur et al., "CUT&Tag for efficient epigenomic profiling of small samples and single cells," Nature Communications, volume 10: article 1930 (2019), the entire contents of which are incorporated herein by reference. Different volumes of the fusion protein may be contacted with different oligonucleotides corresponding to different proteins. For example, starting from the 5' end, the first oligonucleotide 4131 may comprise a forward primer (e.g., primer C), a first barcode sequence designated as corresponding to the first protein (a unique sequence referred to as "ID 1"), a sequencing primer (e.g., A14), and a duplex for insertion into a corresponding transposase, the duplex comprising a forward ME sequence hybridized to a complementary ME' sequence. Similarly, starting from the 5' end, the second oligonucleotide 4132 may comprise a forward primer (e.g., primer C), a second barcode sequence designated as corresponding to a second protein (a unique sequence referred to as "ID 2"), a sequencing primer (e.g., A14), and a duplex for insertion into a transposase, the duplex comprising a forward ME sequence hybridized to a complementary ME' sequence. Similarly, starting from the 5' end, the third oligonucleotide 4133 may comprise a forward primer (e.g., primer C), a third barcode sequence designated as corresponding to a third protein (a unique sequence referred to as "ID 3"), a sequencing primer (e.g., A14), and a duplex for insertion into a transposase, the duplex comprising a forward ME sequence hybridized to a complementary ME' sequence.
The different volumes of fusion proteins with oligonucleotides coupled thereto may be kept separate from each other and respectively coupled to corresponding antibodies selective for the proteins to which the barcode sequences correspond. For example, in a manner similar to that described in Kaya-Okur et al., "CUT&Tag for efficient epigenomic profiling of small samples and single cells," Nature Communications, volume 10: article 1930 (2019), the entire contents of which are incorporated herein by reference, protein A 4162 of the fusion protein coupled to the first oligonucleotide 4131 may be coupled to the first antibody 4111; protein A 4162 of the fusion protein coupled to the second oligonucleotide 4132 may be coupled to the second antibody 4112; and protein A 4162 of the fusion protein coupled to the third oligonucleotide 4133 may be coupled to the third antibody 4113. The resulting transposome-antibody complexes are thus coupled to oligonucleotides corresponding to the proteins for which the corresponding antibodies are selective.
It will be appreciated that any suitable number of transposomes may be coupled to an antibody to provide a complex of the invention, and that such transposomes do not necessarily need to comprise oligonucleotides identical to each other. For example, FIG. 42 schematically shows an exemplary procedure for generating complexes each comprising a plurality of transposomes coupled to an antibody. Multiple fusion proteins, each comprising a transposome 4221 coupled to protein A 4262, can be produced in a manner similar to that described with reference to fig. 41. Different volumes of the fusion protein may be contacted with different oligonucleotides corresponding to different proteins. For example, the first oligonucleotide 4231 may comprise a forward primer, a first barcode sequence designated as corresponding to a first protein (a unique sequence referred to as "ID 1"), a sequencing primer (e.g., A14), and a duplex for insertion into a corresponding transposase, the duplex comprising a forward ME sequence hybridized to a complementary ME' sequence. Similarly, the second oligonucleotide 4232 may comprise a forward primer, a second barcode sequence designated as corresponding to a second protein (a unique sequence referred to as "ID 2"), a sequencing primer (e.g., A14), and a duplex for insertion into a transposase, the duplex comprising a forward ME sequence hybridized to a complementary ME' sequence. Additionally, the third oligonucleotide 4233 may comprise a reverse primer (e.g., B15) and a duplex for insertion into a corresponding transposase 4222, the duplex comprising a forward ME sequence hybridized to a complementary ME' sequence.
The different volumes of fusion proteins with oligonucleotides coupled thereto may be kept separate from each other and respectively coupled to corresponding antibodies selective for the proteins to which the barcode sequences correspond. For example, protein A 4262 of the fusion protein coupled to the first oligonucleotide 4231 and protein A 4262 of the fusion protein coupled to the third oligonucleotide 4233 may be coupled to the first antibody 4211 in a manner similar to that described with reference to fig. 41; and protein A 4262 of the fusion protein coupled to the second oligonucleotide 4232 and protein A 4262 of the fusion protein coupled to the third oligonucleotide 4233 may be coupled to the second antibody 4212. The resulting transposome-antibody complexes are thus coupled to oligonucleotides corresponding to the proteins for which the corresponding antibodies are selective.
Complexes prepared in a manner such as described with reference to fig. 41 and 42 may be used in a manner similar to that described with reference to fig. 38A to 38E. For example, fig. 43 schematically illustrates an operation in which the antibody of one of the complexes of fig. 42 selectively binds to protein 4201 at a locus of polynucleotide P1. As shown in fig. 43, selective binding of antibody 4211 to protein 4201 while transposomes 4221, 4222 are inactive brings transposomes 4221, 4222 sufficiently close to polynucleotide P1 that, when the transposomes are activated, they can tag the polynucleotide with oligonucleotide 4231 on one end and oligonucleotide 4233 on the other end, respectively. It will be appreciated that the polynucleotide P1 may be contacted with a library of different complexes which are selective for different proteins which may or may not be at different loci of the polynucleotide P1. Fragments generated in a manner such as that shown in fig. 43 may be amplified and sequenced to determine the identity of the protein 4201 coupled to that fragment. Note that due to variations in their preparation, some of the complexes used to generate fragments may not necessarily contain both a transposome 4221 and a transposome 4222; instead, some complexes may comprise two transposomes 4221 without a transposome 4222, or may comprise two transposomes 4222 without a transposome 4221. In a manner similar to that described with reference to fig. 47B, fragments produced by any such complexes may not include all of the amplification adaptors required to amplify such fragments, for example, using operations such as will now be described with reference to fig. 44.
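The adaptor requirement noted above can be expressed as a simple check. The Python sketch below is an illustration under the assumption that a fragment is amplifiable only if one end was tagged by a transposome 4221 (carrying oligonucleotide 4231 with the forward primer) and the other end by a transposome 4222 (carrying oligonucleotide 4233 with the reverse primer); the labels and function names are hypothetical.

```python
# Which amplification adaptor each transposome type contributes (assumed mapping).
ADAPTOR_BY_TRANSPOSOME = {
    "4221": "forward",  # oligonucleotide 4231: forward primer, barcode, A14, ME
    "4222": "reverse",  # oligonucleotide 4233: reverse primer (e.g., B15), ME
}


def fragment_is_amplifiable(left_end: str, right_end: str) -> bool:
    """True if the two tagged ends together carry both amplification adaptors."""
    adaptors = {ADAPTOR_BY_TRANSPOSOME[left_end], ADAPTOR_BY_TRANSPOSOME[right_end]}
    return adaptors == {"forward", "reverse"}


def complex_can_yield_amplifiable_fragment(transposomes) -> bool:
    """True if a complex carries at least one transposome of each type."""
    kinds = {ADAPTOR_BY_TRANSPOSOME[t] for t in transposomes}
    return kinds == {"forward", "reverse"}
```

Under this model, a complex carrying two transposomes 4221 and no transposome 4222 (or the reverse) cannot by itself yield a fragment with both adaptors, which is consistent with such fragments dropping out during amplification.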
FIG. 44 schematically shows an exemplary flow of operations for amplifying a fragment of a polynucleotide after tagging by the transposomes of a complex. After tagging, and after purification to remove proteins and transposome-antibody complexes, fragment 4431 may comprise two strands hybridized to each other. The first strand, from the 5' end to the 3' end, may comprise a primer (e.g., primer C), a first barcode sequence designated as corresponding to the first protein (a unique sequence referred to as "ID 1"), a sequencing primer (e.g., A14), a forward ME sequence, and a fragment region F1 cleaved from the polynucleotide P1 by transposomes 4221 and 4222. The second strand 4431", from the 5' end to the 3' end, may comprise a reverse primer (e.g., B15), an ME sequence, and a complementary fragment region F1' cleaved from polynucleotide P1 by transposomes 4221 and 4222. As shown in FIG. 44, the single-stranded portions of fragment 4431 may be extended to form a full duplex comprising strand 4431' and complementary strand 4431". Strand 4431' may comprise, from the 5' end to the 3' end, a primer (e.g., primer C), a first barcode sequence designated as corresponding to the first protein (a unique sequence referred to as "ID 1"), a sequencing primer (e.g., A14), a forward ME sequence, a fragment region F1 cleaved from polynucleotide P1 by transposomes 4221 and 4222, a complementary ME' sequence, and a complementary reverse primer (e.g., B15'). Strand 4431" may comprise, from the 5' end to the 3' end, a reverse primer (e.g., B15), an ME sequence, a complementary fragment region F1' cleaved from polynucleotide P1 by transposomes 4221 and 4222, a complementary sequencing primer (e.g., A14'), a complementary first barcode sequence (ID 1', the complement of ID 1), and a complementary primer (e.g., primer C').
As shown in fig. 44, primers carrying sample indexes may anneal to strands 4431', 4431″ for subsequent use in amplifying the fragments. For example, the primer 4450 that anneals to the complementary strand 4431" may comprise (a) a primer (e.g., primer C) that may anneal to a complementary forward primer (e.g., primer C') of the strand 4431", (b) a sample index (a unique identifier corresponding to the sample), and (c) an amplification primer (e.g., P5 primer). Primer 4451 may comprise (a) a primer (e.g., primer B15) that may anneal to a complementary reverse primer (e.g., primer B15') of strand 4431', (b) a sample index (a unique identifier corresponding to the sample), and (c) an amplification primer (e.g., P7 primer). As shown in fig. 44, the primers 4451, 4450 may be extended to form a full duplex 4441 between the primer-extended strand 4441' and the complementary primer-extended strand 4441″. Strand 4441' may be similar to strand 4431', but includes a sample index and amplification primer (e.g., P7) at its 3' end, and may include a sample index and amplification primer (e.g., P5) at its 5' end. Strand 4441" may be the complement of strand 4441'.
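The element order of the primer-extended strand can be sketched in Python. This is an illustration under stated assumptions, not the sequences of the source: all sequences are short placeholders, primer 4450 is taken to read P5, sample index, primer C from 5' to 3', primer 4451 is taken to read P7, sample index, B15 from 5' to 3', and the 3' end of strand 4441' is modeled as the reverse complement of primer 4451. The names PARTS, revcomp and assemble_strand_4441_prime are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Placeholder sequences (assumptions for illustration only).
PARTS = {
    "P5": "AATGATAC",
    "P7": "CAAGCAGA",
    "C": "GGTTCCAA",    # primer C
    "ID1": "AACCGG",    # barcode for the first protein
    "A14": "TCGTCGGC",  # sequencing primer
    "ME": "AGATGTGT",   # forward ME sequence
    "B15": "GTCTCGTG",  # reverse primer
}


def assemble_strand_4441_prime(fragment_f1, index_i5, index_i7, parts=PARTS):
    """Build strand 4441' in 5'-to-3' order:
    P5, index, C, ID1, A14, ME, F1, ME', B15', index', P7'."""
    primer_4450 = parts["P5"] + index_i5 + parts["C"]
    primer_4451 = parts["P7"] + index_i7 + parts["B15"]
    tagged_insert = (
        parts["ID1"] + parts["A14"] + parts["ME"] + fragment_f1 + revcomp(parts["ME"])
    )
    return primer_4450 + tagged_insert + revcomp(primer_4451)


strand = assemble_strand_4441_prime("ACGTTGCA", "AAAA", "CCCC")
complement_strand = revcomp(strand)  # corresponds to strand 4441"
```

Taking the reverse complement of the assembled strand gives a strand that begins with primer 4451, which matches the statement that strand 4441" is the complement of strand 4441'.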
While fig. 41 and 42 illustrate exemplary preparations of the complexes of the present invention, it should be understood that other preparations may be suitably employed. For example, FIG. 45 schematically illustrates another exemplary procedure for generating complexes each comprising a transposome coupled to a plurality of antibodies. A plurality of antibodies, each coupled to a different oligonucleotide, may be prepared in a manner such as described in Weiner et al., "Preparation of single- and double-oligonucleotide antibody conjugates and their application for protein analytics," Scientific Reports, volume 10: article 1457 (2020). In the example shown in fig. 45, the first antibody 4511 may be selective for the first protein and may be coupled to the 5' end of the first oligonucleotide 4531; the second antibody 4512 may be selective for the second protein and may be coupled to the 5' end of the second oligonucleotide 4532; and the third antibody 4513 may be selective for the third protein and may be coupled to the 5' end of the third oligonucleotide 4533. Starting from the 5' end, the first oligonucleotide 4531 may comprise a forward primer (e.g., primer C), a first barcode sequence designated as corresponding to the first protein (a unique sequence referred to as "ID 1"), a sequencing primer (e.g., A14), and a duplex for insertion into a corresponding transposase, the duplex comprising a forward ME sequence. Similarly, starting from the 5' end, the second oligonucleotide 4532 may comprise a forward primer (e.g., primer C), a second barcode sequence designated as corresponding to the second protein (a unique sequence referred to as "ID 2"), a sequencing primer (e.g., A14), and a duplex for insertion into a transposase, the duplex comprising a forward ME sequence.
Similarly, starting from the 5' end, the third oligonucleotide 4533 may comprise a forward primer (e.g., primer C), a third barcode sequence designated as corresponding to the third protein (a unique sequence referred to as "ID 3"), a sequencing primer (e.g., A14), and a duplex for insertion into a corresponding transposase, the duplex comprising a forward ME sequence. In the non-limiting example shown in FIG. 45, the different antibodies coupled to the corresponding oligonucleotides are contacted with transposases 4521, which become coupled to the corresponding oligonucleotides. The transposases can then optionally dimerize as shown in fig. 45 to form transposomes, each of which is coupled to two antibodies. The resulting transposome-antibody complexes are thus coupled to oligonucleotides corresponding to the proteins for which the corresponding antibodies are selective. The complexes may then be combined. It should be understood that a transposome such as described herein may comprise any suitable number of transposases, e.g., may comprise transposase monomers, dimers, or tetramers.
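The common layout of oligonucleotides 4531, 4532, and 4533 (forward primer, protein-specific barcode, sequencing primer, ME sequence, read 5' to 3') can be sketched as follows. The sequences, the fixed barcode length, and the function names are assumptions made for illustration; they are not the actual primer C, A14, or ME sequences.

```python
# Placeholder sequences, invented for illustration only.
PRIMER_C = "ACGTACGT"  # stands in for the forward primer (primer C)
A14 = "TTGGCCAA"       # stands in for the sequencing primer (A14)
ME = "GATTACAG"        # stands in for the forward ME sequence

# Hypothetical barcodes ("ID 1", "ID 2", "ID 3"), one per target protein.
BARCODES = {"ID 1": "AAAC", "ID 2": "CCCA", "ID 3": "GGGT"}
BARCODE_LEN = 4


def build_antibody_oligo(barcode: str) -> str:
    """Return the oligonucleotide 5' -> 3': primer, barcode, A14, ME."""
    return PRIMER_C + barcode + A14 + ME


def split_antibody_oligo(oligo: str) -> dict:
    """Split an oligonucleotide built above back into its four domains."""
    start = len(PRIMER_C)
    stop = start + BARCODE_LEN
    return {
        "forward_primer": oligo[:start],
        "barcode": oligo[start:stop],
        "sequencing_primer": oligo[stop:stop + len(A14)],
        "me": oligo[stop + len(A14):],
    }


oligo_4532 = build_antibody_oligo(BARCODES["ID 2"])
print(split_antibody_oligo(oligo_4532))
```

Only the barcode domain differs between the three oligonucleotides, so the same primers can amplify and sequence all of them while the barcode reports which antibody delivered the transposome.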
It is further understood that any suitable number of transposomes may be coupled to an antibody to provide a complex of the invention, and that such transposomes need not be coupled to the same oligonucleotides as each other. For example, fig. 46A to 46B schematically show exemplary procedures for producing complexes each comprising a transposome conjugated to an antibody. For example, as shown in fig. 46A, antibody 4611 may be selective for a protein and may be coupled to the 5' end of each of two oligonucleotides 4631 comprising an ME duplex. The oligonucleotide 4631 may have a sequence similar to that of the first oligonucleotide 4531. In the non-limiting example shown in fig. 46A, the antibody 4611 coupled to the oligonucleotides 4631 may be contacted with transposases 4621, which become coupled to the ME duplexes of the oligonucleotides 4631. The transposases 4621 can then dimerize as shown in fig. 46A to form a transposome coupled to antibody 4611. The resulting transposome-antibody complex is thus coupled to oligonucleotides corresponding to the protein for which the antibody is selective.
In the example shown in fig. 46B, antibody 4611 may be selective for a protein and may be coupled to the 5' end of first oligonucleotide 4631 and the 5' end of third oligonucleotide 4633. Oligonucleotide 4631 may have a sequence similar to that of oligonucleotide 4531, including an ME duplex. The oligonucleotide 4633 may include a reverse primer (e.g., B15) and a duplex for insertion into a corresponding transposase, the duplex including a forward ME sequence. In the non-limiting example shown in fig. 46B, the antibody 4611 coupled to the oligonucleotides 4631, 4633 may be contacted with transposases 4621, which become coupled to the ME duplexes of the oligonucleotides 4631, 4633. The transposases 4621 can then dimerize as shown in fig. 46B to form a transposome coupled to antibody 4611. The resulting transposome-antibody complex is thus coupled to oligonucleotides corresponding to the protein for which the antibody is selective. Note that dimerization of transposases 4621, such as described with reference to fig. 46A to 46B, may be performed at a concentration low enough that transposases 4621 coupled to the same antibody are more likely to dimerize with each other than with transposases coupled to other antibodies.
Complexes prepared in a manner such as described with reference to fig. 45 and 46A-46B may be used in a manner similar to that described with reference to fig. 38A-38E or fig. 43 and may be used to generate fragments which may then be amplified in a manner such as described with reference to fig. 44.
Still other complexes and methods can be used to label polynucleotides. For example, fig. 47A schematically shows an exemplary flow of operations in which proteins at respective loci of a polynucleotide are sequentially bound by antibodies of primary and secondary complexes. For example, as shown in fig. 47A, polynucleotide P2 (comprising proteins 4701 and 4702 at respective loci) is contacted with complexes such as described with reference to fig. 45 or fig. 46A, e.g., a complex comprising a first antibody 4511 coupled to a first oligonucleotide 4531, and a transposome comprising a transposase 4521 coupled to the first oligonucleotide 4531; and a complex comprising a second antibody 4512 coupled to a second oligonucleotide 4532, and a transposome comprising a transposase 4521 coupled to the second oligonucleotide 4532. As shown in FIG. 47A, selective binding of antibody 4511 to protein 4701 brings transposase 4521 sufficiently close to polynucleotide P2 to tag the polynucleotide with oligonucleotide 4531 at one end. Similarly, selective binding of antibody 4512 to protein 4702 brings transposase 4521 sufficiently close to polynucleotide P2 to tag the polynucleotide with oligonucleotide 4532 at one end. It will be appreciated that the polynucleotide P2 may be contacted with a library of different complexes that are selective for different proteins, which may or may not be at different loci of the polynucleotide P2.
As also shown in fig. 47A, polynucleotide P2 having the complexes selectively coupled thereto may then be contacted with a mixture of second complexes specific for the first complexes. For example, in a manner similar to that described with reference to fig. 45, each of the second complexes may comprise an antibody 4711, an oligonucleotide 4731 coupled to the antibody, and a transposome comprising a transposase 4721 coupled to the oligonucleotide. Antibody 4711 may recognize an antibody common region and thus be compatible with antibodies 4511 and 4512 and with other antibodies that may be in contact with polynucleotide P2. As shown in fig. 47A, binding of antibody 4711 to antibody 4511 brings transposase 4721 sufficiently close to polynucleotide P2 to tag the polynucleotide with oligonucleotide 4731 at the end opposite oligonucleotide 4531. Similarly, binding of antibody 4711 to antibody 4512 brings transposase 4721 sufficiently close to polynucleotide P2 to tag the polynucleotide with oligonucleotide 4731 at the end opposite oligonucleotide 4532.
FIG. 47B shows exemplary fragments of the polynucleotide of FIG. 47A after tagging. One 5' end of fragment 4741 comprises oligonucleotide 4531, to which a forward primer can anneal in a manner similar to that described with reference to FIG. 44, and the other 5' end of fragment 4741 comprises oligonucleotide 4731, to which a reverse primer can anneal, so that the fragment can then be amplified in a manner similar to that described with reference to FIG. 44. One 5' end of fragment 4742 comprises oligonucleotide 4532, to which a forward primer can anneal in a manner similar to that described with reference to FIG. 44, and the other 5' end of fragment 4742 comprises oligonucleotide 4731, to which a reverse primer can anneal, so that the fragment can then be amplified in a manner similar to that described with reference to FIG. 44. In some cases, although antibody 4711 is specific for antibodies 4511, 4512, it may also bind elsewhere, and the transposomes coupled thereto may thus generate fragment 4743 comprising oligonucleotide 4731 on both ends. Because only the reverse primer (e.g., B15) can anneal to such fragments, these fragments are not amplifiable.
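The amplifiability rule described for the fragments of FIG. 47B reduces to a check that the two ends of a fragment carry different primer sites. The sketch below is a simplified model under that assumption; the labels and the function name are hypothetical.

```python
FORWARD = "forward"  # end carries a forward primer site (e.g., primer C)
REVERSE = "reverse"  # end carries a reverse primer site (e.g., B15)


def is_amplifiable(end_a: str, end_b: str) -> bool:
    """A fragment is amplifiable only if it has one forward primer site
    and one reverse primer site, in either order."""
    return {end_a, end_b} == {FORWARD, REVERSE}


# Fragment tagged by a first complex on one end and a second complex on the other.
print(is_amplifiable(FORWARD, REVERSE))
# Off-target fragment tagged by second complexes on both ends.
print(is_amplifiable(REVERSE, REVERSE))
```

Under this model, off-target fragments with the same site at both ends drop out at the amplification step and do not reach sequencing.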
It is noted that the second antibody need not necessarily be used to provide a reverse primer (e.g., B15) suitable for amplifying a fragment that has been tagged to include oligonucleotide 4731 in a manner such as described with reference to fig. 47A. For example, as shown in fig. 47C, a complex comprising antibody 4711 and transposase 4721 may be used to tag a polynucleotide in a manner similar to that described with reference to fig. 47A. Standard transposition of the entire genome can then be performed using transposomes 4721' loaded with, for example, the B15-ME sequence and not conjugated to any antibodies. The transposomes 4721' may tag the entire genome, but only the region with both the primer (tagged via use of oligonucleotide 4731) and the reverse primer (e.g., B15) is tagged in a similar manner as the fragment 4741 described with reference to fig. 47B.
FIG. 48 illustrates an exemplary flow of operations in a method for a targeted epigenetic assay. The method 4800 shown in fig. 48 can be used to characterize proteins coupled to respective loci of a polynucleotide, and can include contacting the polynucleotide with a mixture of complexes specific for different types of proteins that may or may not be coupled to respective loci of the polynucleotide (operation 4801). Each of the complexes may comprise an antibody specific for the corresponding type of protein, and a transposome coupled to the antibody and comprising an oligonucleotide corresponding to that type of protein. Exemplary complexes are described with reference to fig. 40A-40C, 41, 42, 45, 46A-46B, and 47A-47C. Exemplary oligonucleotides are described with reference to fig. 39A-39B and fig. 44.
The method 4800 shown in FIG. 48 may further include coupling the complexes, respectively, to the proteins for which their antibodies are specific (operation 4802). Optionally, operation 4802 can include inactivating the transposomes. Exemplary conditions for inactivating a transposome while coupling an antibody to a protein are described with reference to fig. 38A to 38B. Additionally, or alternatively, a sufficiently low concentration of complexes may be used such that any off-target tagging yields a product that may not be amplifiable and thus may not be detected using sequencing.
The method 4800 shown in FIG. 48 further can include generating a fragment of the polynucleotide, including activating a transposome to make a nick in the polynucleotide and coupling an oligonucleotide to the fragment (operation 4803). Exemplary conditions for activating a transposome are described with reference to fig. 38C. Exemplary fragments that can be used to couple oligonucleotides using transposomes are described with reference to fig. 38D-38E, fig. 43, fig. 44, and fig. 47A-47C.
The method 4800 shown in FIG. 48 further can include removing proteins and complexes from the fragments (operation 4804). Exemplary fluid conditions for removing proteins and complexes are described with reference to fig. 38D. It is noted that the proteins and complexes may be removed at any suitable step prior to sequencing.
The method 4800 shown in FIG. 48 may further include subsequent sequencing of the fragments and oligonucleotides coupled to the fragments (operation 4805). For example, SBS may be performed on these fragments and oligonucleotides coupled to these fragments.
The method 4800 shown in FIG. 48 may further include identifying proteins that have been coupled to the corresponding fragments using the sequences of oligonucleotides coupled to those fragments (operation 4806). For example, in a manner such as described with reference to fig. 38E, a second amount of the same polynucleotide may be sequenced, e.g., using SBS, but without the epigenetic assay of the present invention. The sequences of the different fragments produced by the epigenetic assays of the invention can be compared to the sequences of the polynucleotides, and based on such comparison, the corresponding position of each of these fragments within the entire polynucleotide can be determined. Based on the oligonucleotides located at the ends of these fragments (which are not present in the polynucleotide without the use of the epigenetic assay of the invention), proteins coupled to those fragments, respectively, can be identified.
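Operation 4806 can be sketched as two lookups per sequenced fragment: the barcode at the end of the read identifies the protein, and the remainder of the read is located within the separately sequenced polynucleotide. The sketch below is a deliberately simplified model. It assumes that the read begins directly with a fixed-length barcode and uses exact substring matching in place of a real read aligner; all sequences and names are invented for illustration.

```python
from typing import Optional, Tuple

# Hypothetical barcode table: barcode sequence -> protein it reports.
BARCODE_TO_PROTEIN = {"AAAC": "protein_1", "CCCA": "protein_2"}
BARCODE_LEN = 4


def characterize_fragment(read: str, reference: str) -> Tuple[Optional[str], Optional[int]]:
    """Return (protein, position) for one tagged fragment read.

    protein  is None if the barcode is not in the table.
    position is the 0-based offset of the fragment within the reference,
             or None if no exact match is found.
    """
    barcode, insert = read[:BARCODE_LEN], read[BARCODE_LEN:]
    protein = BARCODE_TO_PROTEIN.get(barcode)
    offset = reference.find(insert)
    return protein, (offset if offset >= 0 else None)


reference = "TTGACCGTAGGCTAACGTTAGC"  # stands in for the untagged polynucleotide
read = "AAAC" + "GGCTAACG"            # barcode followed by fragment sequence
print(characterize_fragment(read, reference))
```

Because the barcode is absent from the untagged polynucleotide, it can be stripped before the positional lookup without losing any information about the locus.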
Working examples
The following examples are intended to be illustrative only and are not limiting of the invention.
The Infinium platform of Illumina, Inc. is designed to detect millions of single nucleotide polymorphisms (SNPs) per genomic sample. For example, fig. 32A-32C schematically illustrate exemplary operations and compositions for use in whole genome amplification (WGA) using randomly primed isothermal multiple displacement amplification (MDA) as implemented on the Infinium platform. More specifically, fig. 32A shows WGA using randomly primed isothermal MDA. MDA uses random primers and a strand-displacing DNA polymerase to exponentially amplify genomic DNA with minimal representation bias. For example, independent of DNA input, a 3 hour incubation results in a DNA yield of about 100 μg (illustratively, the input is >10 ng; the standard workflow uses 100 ng of input gDNA, representing approximately 28K target molecules). Because of its isothermal requirements, this may alleviate PCR bottlenecks. FIG. 32B shows modeling of denatured DNA hybridizable to random primers using synthetic oligonucleotides having about 101 nucleotides. The left segment represents the probe complement and the right segment represents the overhang. A range of input amounts from 20M to 280M molecules was tested. FIG. 32C shows the capability of the Infinium workflow to detect millions of SNPs simultaneously. For further details on WGA and the Infinium platform and their use, see the following references, the entire contents of each of which are incorporated herein by reference: Gunderson et al., "Decoding randomly ordered DNA arrays," Genome Research, volume 14, issue 5: pages 870-877, 2004; Gunderson et al., "Whole genome genotyping technologies on the BeadArray™ platform," Biotechnol. J., volume 2: pages 41-49, 2007; and Peiffer et al., "High-resolution genomic profiling of chromosomal aberrations using Infinium whole-genome genotyping," Genome Research, volume 16, issue 9: pages 1136-1148, 2006.
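The relationship between a gDNA input mass and the number of target molecules it represents can be checked with a short calculation. The sketch below is an estimate only: it assumes a haploid human genome of about 3.1 billion base pairs and an average mass of about 650 g/mol per base pair, neither of which is stated in the text, and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23     # molecules per mole
BP_MOLAR_MASS = 650.0        # g/mol per base pair (common approximation)
HAPLOID_GENOME_BP = 3.1e9    # assumed haploid human genome size


def haploid_genome_copies(mass_ng: float, genome_bp: float = HAPLOID_GENOME_BP) -> float:
    """Estimate how many haploid genome copies are in a given mass of gDNA."""
    genome_mass_g = genome_bp * BP_MOLAR_MASS / AVOGADRO
    return (mass_ng * 1e-9) / genome_mass_g


copies = haploid_genome_copies(100.0)
print(f"100 ng gDNA is roughly {copies:,.0f} haploid genome copies")
```

With these assumptions the estimate is on the order of 30K copies of each locus, the same order as the approximately 28K target molecules given for the standard workflow; the exact figure depends on the genome size and per-base-pair mass assumed.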
In the present examples, the beads were conjugated to a single 95 nucleotide (nt) long synthetic oligonucleotide (oligo). The oligonucleotide sequence comprises two domains: a 45-nt decoding segment and a 50-nt probe. The beads were loaded onto microfabricated bead chips. Sequencing by hybridization was used to generate a spatial decoding map based on the decoding sequences. The decoding map enables the probes to be classified correspondingly for binding to sample target strands. Bead chip construction was completed with a hyb-seal that separates regions into wells for individual sample loading.
The fragmented WGA material is then loaded onto a bead chip well and incubated in the presence of a buffer at a temperature suitable for hybridizing SNP probes to the corresponding DNA targets. After washing, the sample wells are subjected to a polymerase extension reaction to incorporate the correct next nucleotide as a hapten-labeled, non-extendable dideoxynucleotide. After extension, the sample wells are treated with stringent washes to remove hybridized targets. The hapten labels are then exposed to three rounds of immunostaining for robust target detection.
The DNA samples (targets) used in the foregoing assays were prepared for genotyping by amplifying genomic DNA using the WGA method. WGA uses a proprietary multiple displacement amplification method that is isothermal, rapid, efficient, and cost effective (fig. 32A). Genomic DNA (gDNA) was chemically denatured and random sequence primers were hybridized. The gDNA hybridized to the random primers was then mixed with an isothermal extension preparation containing a strand-displacing polymerase, catalytic metal, and dNTPs. Substitution of a portion of the dTTP with dUTP allows the product to be fragmented into shorter fragments (less than about 500 base pairs on average) by using uracil-DNA glycosylase (UDG) to cleave off the bases, followed by heating to cleave the remaining phosphate bonds. The fragments are designed to independently sample SNPs of interest.
It was demonstrated that the Infinium workflow can be extended to detect synthetic oligonucleotides with a sensitivity similar to that for WGA DNA (a range of about 1M to 10M molecules). Oligonucleotides having 101 nt were synthesized using an established phosphoramidite oligomerization process. Fig. 33A to 33C schematically show exemplary synthetic oligonucleotide sequences that were used to demonstrate proof of concept. More specifically, FIG. 33A shows a synthetic oligonucleotide sequence representing a synthetic human genome segment (101 nt). Representative segments were selected from the - or + strand. The synthetic oligonucleotide segment comprises a 50 nt segment that is complementary to a probe sequence on a Global Screening Array (GSA) pharmacogenomics (PGx) bead chip (commercially available from Illumina, Inc., San Diego, CA). FIG. 33B shows the full complement of the sequence in FIG. 33A, which was synthesized and used to mimic double-stranded DNA (dsDNA). dsDNA targets are used in the potential enzyme detection schemes described herein. Fig. 33C shows synthetic oligonucleotide targets with overlapping regions on the human genome, which were tested to demonstrate robustness of target-probe activity. Fig. 33D is a table with the corresponding number of targets synthesized for each probe class.
Additionally, two scenarios were modeled to demonstrate utility: i) the full complement of the 101 nt oligonucleotide was synthesized to represent dsDNA (fig. 33B), and ii) 101 nt segments were chosen to overlap when mapped onto a genomic segment. dsDNA substrates are used to enable subsequent enzymatic activity in certain implementations. Performance with overlapping genome segments demonstrates robustness to cross-reactivity.
A synthetic model consisting of 101 nt oligonucleotide sequence targets was selected to perform the assay with a commercially available GSA PGx bead chip. Synthetic targets were designed to report alternative alleles different from those expected using human genomic DNA input (NA11922). For example, if a WGA sample derived from NA11922 produces an AA allele call, successful binding of the synthetic target produces an AB allele call when the synthetic target and WGA target are stoichiometrically balanced. Increasing the concentration of synthetic target shifts allele detection exclusively to BB. For example, FIG. 34 schematically illustrates an exemplary synthetic model system for assessing detection of synthetic oligonucleotides. In fig. 34, condition 1 corresponds to a control condition in which WGA NA11992 DNA was tested with three probe categories (accurate, inaccurate, uncertain). The accurate class of probes yields AA allele results in the absence of the synthetic target. Condition 2 corresponds to probes tested with WGA and low-input synthetic targets. Increasing the amount of input synthetic target to about 3 pM resulted in heterozygous AB allele signals with the accurate probe class. Condition n corresponds to probes tested with increasing amounts of synthetic targets, resulting in a dominant signal for the opposite allele (BB) with the accurate probe class. Allele reads were obtained from the GenomeStudio software.
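The shift from AA through AB to BB as synthetic target is added can be illustrated with a toy two-channel model. The sketch below is not the genotype-calling method of the GenomeStudio software: it simply assumes that the A signal is proportional to the WGA target amount, that the B signal is proportional to the synthetic target amount, and that calls are made from the B fraction using arbitrary thresholds of 0.2 and 0.8.

```python
def call_genotype(a_signal: float, b_signal: float,
                  low: float = 0.2, high: float = 0.8) -> str:
    """Call AA, AB, or BB from the fraction of total signal in the B channel.

    The thresholds are illustrative assumptions, not values from the assay.
    """
    total = a_signal + b_signal
    if total <= 0:
        return "no call"
    b_fraction = b_signal / total
    if b_fraction < low:
        return "AA"
    if b_fraction > high:
        return "BB"
    return "AB"


wga_target = 1.0  # arbitrary units of A-allele target from the WGA sample
for synthetic_target in (0.0, 1.0, 100.0):  # B-allele synthetic target added
    print(synthetic_target, call_genotype(wga_target, synthetic_target))
```

In this model a balanced input gives a B fraction of 0.5 and an AB call, while a large excess of synthetic target drives the B fraction toward 1 and the call toward BB, mirroring conditions 1, 2, and n.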
The GDA PGx bead chip contains a subset of probes for rare alleles that detect either the AA allele or the BB allele when only NA11922 DNA input is used (condition 1 and condition n in FIG. 34). For example, if an AA allele is detected, the AB and BB alleles are not measurable. Oligonucleotide sequences (fig. 33A-33C) were designed to hybridize to these probes and enable detection of AB genotypes. Furthermore, the following three probe types were chosen to demonstrate how the synthetic oligonucleotides performed: accurate, inaccurate, and uncertain. Accurate probes are those whose calls are concordant with the 1000 Genomes next-generation sequencing (NGS) standard. Inaccurate probes are those whose calls do not match the 1000 Genomes NGS standard. Uncertain probes are those for which standard WGA material does not provide enough data to generate the signals required to assign the probes to the accurate or the inaccurate probe class. Synthetic oligonucleotides were designed for these three probe types (fig. 33A-33C).
Fig. 35A-35C schematically illustrate an exemplary synthetic model system for assessing detection of synthetic oligonucleotides. More specifically, FIG. 35A shows the GDA-PGx bead chip and probe QC real data for the three probe categories. The fluorescence intensity response curves for the three probe classes are shown in fig. 35B. In fig. 35B, probes classified as "accurate" have samples for all allele types (AA, AB, BB) that enable genelin validation, and have consistent genotype NGS data. Inaccurate probe signals do not correspond to NGS data. Uncertain probes are not classified because of very low minor allele frequencies (MAFs). These probes generate signals that are either AA alleles or BB alleles. The performance of uncertain-class probes with synthetic targets is measured and compared to artificial intelligence (AI) model predictions. Alleles have the following corresponding fluorescent signals: AA = red, BB = green, AB = red/green. Fig. 35C shows synthetic oligonucleotides (101 nt) designed to model synthetic targets that bind to probes conjugated to beads immobilized on a bead chip. Fig. 36 shows fluorescence measured during use of the exemplary synthetic model system of fig. 34 and 35A-35C. Fig. 37 shows the results of additional measurements made during use of the exemplary synthetic model system of fig. 34 and 35A-35C.
The synthetic oligonucleotides were added to the WGA reaction either before or after incubation. Addition before incubation provides the opportunity for the synthetic oligonucleotides to be amplified with the random primers during the WGA step. Synthetic oligonucleotides added after WGA incubation did not undergo further amplification or fragmentation. Titration series were performed with both the pre-incubation and post-incubation formats. The final oligonucleotide concentrations were: 0 pM, 0.003 pM, 0.03 pM, 0.3 pM, 3 pM, 30 pM, and 300 pM. In fig. 35B and 36, the Xraw and Yraw values correspond to the red and green signals from their respective channels. Under all conditions, the signal increases with increasing concentration of synthetic target. The signal also increases with increasing probe concentration.
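The titration concentrations can be converted into molecule counts once a volume is fixed. The sketch below uses the listed concentrations, but the 1 μL volume is an assumption chosen only for illustration (the text does not state the volume that each concentration refers to), and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

# Final oligonucleotide concentrations from the titration series, in pM.
TITRATION_PM = [0, 0.003, 0.03, 0.3, 3, 30, 300]


def molecules_in_volume(concentration_pm: float, volume_ul: float) -> float:
    """Number of molecules in volume_ul microliters at concentration_pm picomolar."""
    moles = (concentration_pm * 1e-12) * (volume_ul * 1e-6)
    return moles * AVOGADRO


ASSUMED_VOLUME_UL = 1.0  # assumption, not a value from the text
for conc in TITRATION_PM:
    count = molecules_in_volume(conc, ASSUMED_VOLUME_UL)
    print(f"{conc:>7} pM -> {count:.3g} molecules")
```

Each step of the series is a ten-fold change in concentration, so the molecule count also changes ten-fold per step regardless of the volume chosen.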
The probes were designed to demonstrate that a homozygous allele signal (AA or BB) can be converted to a heterozygous allele (AB) signal by a balanced input of synthetic DNA (fig. 34). WGA-amplified material provided the background homozygous allele (AA or BB), and the synthetic targets introduced the opposite allele to produce a heterozygous AB call. Heterozygous AB alleles were detected with about 0.3 pM synthetic oligonucleotide input (fig. 35B and 36). The 3 pM input corresponds to about 1M to 10M molecules per sample well, which is consistent with the input generated from genomic DNA after the WGA step at each SNP. In addition, each sample well contains about 12 beads per probe type, and each bead carries about 60K oligonucleotides. This indicates that the amount of synthetic SNP target required for detection is about a 10-fold excess over the amount of probe present. Increasing the amount of synthetic target above about 3 pM resulted in a signal plateau and a shift of the allele call to homozygous (AA or BB), because the synthetic targets outcompeted the WGA SNP input. The concentration of synthetic oligonucleotide required for detection (about 0.3 pM) was about 10,000,000 times lower than the concentration at which the synthetic oligonucleotides were synthesized, and about 1,000,000 times lower than that required for standard PCR reactions (about 0.1 μM to 0.5 μM).
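The order-of-magnitude comparisons in this paragraph can be reproduced from the stated figures. The sketch below uses only numbers given in the text (about 12 beads per probe type, about 60K oligonucleotides per bead, 1M to 10M target molecules per well, 0.3 pM for detection, and 0.1 μM to 0.5 μM for PCR); the variable and function names are hypothetical.

```python
BEADS_PER_PROBE_TYPE = 12
OLIGOS_PER_BEAD = 60_000

# Total probe oligonucleotides available per probe type in one sample well.
probe_oligos_per_well = BEADS_PER_PROBE_TYPE * OLIGOS_PER_BEAD


def target_excess(target_molecules: float) -> float:
    """Ratio of target molecules to probe oligonucleotides in a well."""
    return target_molecules / probe_oligos_per_well


def fold_lower(detection_molar: float, comparison_molar: float) -> float:
    """How many times lower the detection concentration is than a comparison."""
    return comparison_molar / detection_molar


print(probe_oligos_per_well)
print(target_excess(1e6), target_excess(1e7))
print(fold_lower(0.3e-12, 0.1e-6), fold_lower(0.3e-12, 0.5e-6))
```

The target-to-probe ratio spans roughly 1-fold to 14-fold across the 1M to 10M range, which brackets the stated 10-fold excess, and the PCR comparison falls between about 3 x 10^5 and 2 x 10^6, consistent with the stated figure of about 1,000,000.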
An application extending beyond protein detection is quality control (QC), using microarrays, of the probe mixtures required for PCR or targeted enrichment applications; high-complexity PCR applications can extend to >10K probes in a single formulation. Typical assay QC involves repeating the assay with multiple oligonucleotide library batches to demonstrate that a failure mode is due to intrinsic target tissue and to exclude missing oligonucleotides. The use of microarrays can alleviate the need for repeated multiplex PCR assays, which can be expensive and time consuming.
Additional notes
Practice of the present disclosure may employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature, such as Molecular Cloning: A Laboratory Manual, 2nd edition (Sambrook et al., 1989); Oligonucleotide Synthesis (M.J. Gait, ed., 1984); Animal Cell Culture (R.I. Freshney, ed., 1987); Methods in Enzymology (Academic Press, Inc.); Current Protocols in Molecular Biology (F.M. Ausubel et al., eds., 1987, and periodic updates); PCR: The Polymerase Chain Reaction (Mullis et al., 1994); Remington, The Science and Practice of Pharmacy, 20th edition (Lippincott, Williams & Wilkins, 2003); and Remington, The Science and Practice of Pharmacy, 22nd edition (Pharmaceutical Press and Philadelphia College of Pharmacy at University of the Sciences, 2012).
All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
While various illustrative examples have been described above, it will be apparent to those skilled in the art that various changes and modifications may be made therein without departing from the invention. It is intended that the appended claims cover all such changes and modifications as fall within the true spirit and scope of the invention.
It should be understood that any respective feature/example of each of the aspects of the disclosure as described herein may be implemented together in any suitable combination, and any feature/example from any one or more of these aspects may be implemented together with any suitable combination of features of other aspect(s) as described herein to achieve the benefits as described herein.

Claims (129)

1. A method for detecting an analyte, the method comprising:
coupling a donor recognition probe to a first portion of the analyte, the donor recognition probe comprising a first recognition element specific for the first portion of the analyte, a first oligonucleotide corresponding to the first portion of the analyte, and a transposase coupled to the first recognition element and the first oligonucleotide;
coupling an acceptor recognition probe to a second portion of the analyte, the acceptor recognition probe comprising a second recognition element specific for the second portion of the analyte and a second oligonucleotide coupled to the second recognition element and corresponding to the second portion of the analyte;
generating a reporter polynucleotide comprising the first oligonucleotide and the second oligonucleotide using the transposase; and
detecting the analyte based on the reporter polynucleotide comprising the first oligonucleotide and the second oligonucleotide.
2. The method of claim 1, wherein the analyte comprises a first molecule.
3. The method of claim 2, wherein the first portion of the analyte comprises a first portion of the first molecule, and wherein the second portion of the analyte comprises a second portion of the first molecule.
4. The method according to claim 2, wherein:
the first molecule comprises a protein or peptide;
the first recognition element comprises a first antibody or first aptamer specific for a first portion of the protein or peptide; and
the second recognition element comprises a second antibody or second aptamer specific for a second portion of the protein or peptide.
5. The method according to claim 2, wherein:
the first molecule comprises a target polynucleotide;
the first recognition element comprises a first CRISPR-associated (Cas) protein specific for a first subsequence of the target polynucleotide; and
the second recognition element comprises a second Cas protein specific for a second subsequence of the target polynucleotide.
6. The method of claim 5, wherein the target polynucleotide comprises RNA, and wherein the first Cas protein and the second Cas protein are independently selected from the group consisting of rCas9 and dCas13.
7. The method according to claim 2, wherein:
the first molecule comprises a carbohydrate;
the first recognition element comprises a first lectin specific for a first portion of the carbohydrate; and
the second recognition element comprises a second lectin specific for a second portion of the carbohydrate.
8. The method according to claim 2, wherein:
the first molecule comprises a biomolecule; and
the biomolecule is specific to the first recognition element and the second recognition element.
9. The method of claim 2, wherein the analyte further comprises a second molecule that interacts with the first molecule.
10. The method of claim 9, wherein the first portion of the analyte comprises the first molecule, and wherein the second portion of the analyte comprises the second molecule.
11. The method according to claim 10, wherein:
the first molecule comprises a first protein or a first peptide; and
the first recognition element comprises a first antibody or first aptamer specific for the first protein or first peptide.
12. The method according to claim 10, wherein:
the first molecule comprises a first target polynucleotide; and
the first recognition element comprises a first CRISPR-associated (Cas) protein specific for the first target polynucleotide.
13. The method according to claim 10, wherein:
the first molecule comprises a first carbohydrate; and
the first recognition element comprises a first lectin specific for the first carbohydrate.
14. The method according to claim 10, wherein:
the first molecule comprises a first biomolecule specific to the first recognition element.
15. The method of any one of claims 11 to 14, wherein:
the second molecule comprises a second protein or a second peptide; and
the second recognition element comprises a second antibody or second aptamer specific for the second protein or second peptide.
16. The method of any one of claims 11 to 14, wherein:
the second molecule comprises a second target polynucleotide; and
the second recognition element comprises a second Cas protein specific for the second target polynucleotide.
17. The method of any one of claims 11 to 14, wherein:
the second molecule comprises a second carbohydrate; and
the second recognition element comprises a second lectin specific for the second carbohydrate.
18. The method of any one of claims 9 to 14, wherein:
the second molecule comprises a second biomolecule capable of interacting with the second recognition element.
19. The method of claim 18, wherein the second biomolecule is specific to the second recognition element.
20. The method of any one of claims 1-19, wherein a portion of the second oligonucleotide comprises a double-stranded polynucleotide, and wherein the transposase tags the double-stranded polynucleotide with the first oligonucleotide to produce the reporter polynucleotide.
21. The method of any one of claims 1-20, wherein the first oligonucleotide comprises a first barcode corresponding to the first portion of the analyte, and wherein the second oligonucleotide comprises a second barcode corresponding to the second portion of the analyte.
22. The method of any one of claims 1 to 21, wherein the first oligonucleotide comprises a mosaic end (ME) transposon end coupled to the transposase.
23. The method of any one of claims 1 to 22, wherein the first oligonucleotide has a different sequence than the second oligonucleotide.
24. The method of any one of claims 1 to 23, wherein the first oligonucleotide comprises a forward primer binding site, and wherein the second oligonucleotide comprises a reverse primer binding site.
25. The method of any one of claims 1 to 24, further comprising inhibiting the activity of the transposase while specifically coupling the donor recognition probe to the first portion of the analyte and while specifically coupling the acceptor recognition probe to the second portion of the analyte.
26. The method of claim 25, wherein the activity of the transposase is inhibited using a first condition of a fluid.
27. The method of claim 26, wherein the first condition of the fluid comprises at least one of: (i) the presence of a sufficient amount of EDTA to inhibit the activity of the transposase, and (ii) the absence of a sufficient amount of magnesium ions for the activity of the transposase.
28. The method of claim 25, wherein the activity of the transposase is inhibited using a dsDNA quencher.
29. The method of claim 25, wherein the activity of the transposase is inhibited by associating a blocking agent with the transposase.
30. The method of claim 25, wherein the activity of the transposase is inhibited by the second oligonucleotide being single stranded.
31. The method of any one of claims 25 to 30, further comprising promoting activity of the transposase prior to producing the reporter polynucleotide using the transposase.
32. The method of claim 31, wherein the activity of the transposase is promoted using a second condition of the fluid.
33. The method of claim 32, wherein the second condition of the fluid comprises the presence of a sufficient amount of magnesium ions for the activity of the transposase.
34. The method of claim 29, wherein the activity of the transposase is promoted by degrading the blocking agent.
35. The method of claim 31, wherein the activity of the transposase is promoted by annealing a third oligonucleotide to the second oligonucleotide to form a double stranded polynucleotide.
36. The method of claim 25, wherein the activity of the transposase is inhibited using a blocking group coupled to the first oligonucleotide.
37. The method of claim 36, further comprising removing the blocking group using a reagent.
38. The method of any one of claims 1 to 37, wherein detecting the analyte comprises sequencing the reporter polynucleotide.
39. The method of claim 38, wherein the sequencing comprises performing sequencing-by-synthesis on the reporter polynucleotide.
40. The method of any one of claims 1 to 39, wherein detecting the analyte comprises:
attaching the reporter polynucleotide to a bead,
hybridizing a detection probe to said reporter polynucleotide, said detection probe comprising a fluorophore, and
detecting a signal emitted by the fluorophore.
41. The method of claim 40, wherein the beads comprise capture probes, and
wherein the capture probe hybridizes to the reporter polynucleotide.
42. The method of any one of claims 1 to 41, wherein the transposase is coupled to the first recognition element via the first oligonucleotide.
43. The method of any one of claims 1 to 42, wherein the donor recognition probe comprises two transposases, two first recognition elements, and two first oligonucleotides, wherein the two transposases form a dimer, each of the transposases being coupled to a corresponding one of the first recognition elements via a corresponding first oligonucleotide of the first oligonucleotides.
44. The method of any one of claims 1 to 42, wherein the donor recognition probe comprises two transposases, one first recognition element, and two first oligonucleotides, wherein the two transposases form a dimer, each of the transposases being coupled to the one first recognition element via a corresponding first oligonucleotide of the first oligonucleotides.
45. The method of any one of claims 1 to 42, wherein the donor recognition probe comprises two transposases, one first recognition element, and two first oligonucleotides, wherein the two transposases form a dimer, at least one of the transposases being coupled to the one first recognition element via a covalent bond.
46. The method of any one of claims 1-45, wherein the first oligonucleotide and the second oligonucleotide comprise DNA.
47. The method of any one of claims 1 to 46, wherein the first oligonucleotide and the second oligonucleotide each comprise a unique molecular identifier.
48. The method of any one of claims 1 to 47, wherein the transposase comprises Tn5.
49. The method of any one of claims 1 to 48, wherein the acceptor recognition probe is coupled to a bead prior to coupling the acceptor recognition probe to the second portion of the analyte, the method further comprising washing the bead after coupling the acceptor recognition probe to the second portion of the analyte and prior to coupling the donor recognition probe to the first portion of the analyte.
50. The method of any one of claims 1-49, wherein the first recognition element and the first oligonucleotide are coupled to the first portion of the analyte prior to coupling the transposase to the first oligonucleotide and the first recognition element.
51. A method for detecting different analytes in a mixture, the method comprising:
coupling different analytes in the mixture with respective donor recognition probes, each donor recognition probe of the donor recognition probes comprising a first recognition element specific for a first portion of the respective analyte, a first oligonucleotide corresponding to the first portion of that analyte, and a transposase coupled to the first recognition element and the first oligonucleotide;
coupling different analytes in the mixture with respective receptor recognition probes, each receptor recognition probe of the receptor recognition probes comprising a second recognition element specific for a second portion of the respective analyte, and a second oligonucleotide corresponding to the second portion of that analyte and coupled to the second recognition element;
for each of the analytes coupled to the respective donor recognition probe and to the respective acceptor recognition probe, generating a reporter polynucleotide comprising the first oligonucleotide and the second oligonucleotide corresponding to that analyte using the transposase of that donor recognition probe; and
detecting those analytes in the mixture based on the reporter polynucleotides comprising the first oligonucleotides and the second oligonucleotides corresponding to those analytes.
52. The method of claim 51, further comprising determining the amount of the analytes detected in the mixture based on the amount of the reporter polynucleotides corresponding to those analytes.
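As a hedged sketch of the quantification recited in claim 52, reporter polynucleotides can be counted per analyte after decoding. The example assumes each decoded read has already been reduced to an (analyte label, UMI) pair, and it counts distinct unique molecular identifiers (see claim 47) so that amplification duplicates are not counted twice. The function name and the input format are hypothetical.

```python
from collections import defaultdict


def count_reporters(decoded_reads):
    """Count distinct UMIs per analyte.

    decoded_reads: iterable of (analyte_label, umi) tuples (assumed format).
    Returns a dict mapping analyte_label -> number of distinct UMIs observed.
    """
    umis_by_analyte = defaultdict(set)
    for analyte, umi in decoded_reads:
        umis_by_analyte[analyte].add(umi)
    return {analyte: len(umis) for analyte, umis in umis_by_analyte.items()}
```

Counting distinct UMIs rather than raw reads is one common design choice; a raw read count would also track the amount of reporter, but would be inflated by any amplification bias.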
53. The method of claim 51 or claim 52, wherein for a first analyte of the analytes, a first donor recognition probe of the donor recognition probes is specific for a first form of the first portion of that analyte.
54. The method of claim 53, wherein for the first analyte of the analytes, a second donor recognition probe of the donor recognition probes is specific for a second form of the first portion of that analyte.
55. The method of claim 54, wherein the first and second donor recognition probes of the donor recognition probes are mixed with the analyte at the same time.
56. The method of claim 53, wherein for the first analyte of the analytes, a second donor recognition probe of the donor recognition probes is specific for both the first and second forms of the first portion of that analyte.
57. The method of claim 56, wherein the second donor recognition probe of the donor recognition probes is mixed with the analyte after mixing the first donor recognition probe of the donor recognition probes with the analyte.
58. The method of any one of claims 54 to 57, wherein the analyte is a protein, wherein the first form comprises a post-translational modification (PTM), and wherein the second form does not comprise the PTM.
59. The method of claim 58, wherein the first form is phosphorylated, acetylated, methylated, nitrosylated, or glycosylated relative to the second form.
60. The method of any one of claims 51 to 57, wherein the analyte is a nucleic acid, wherein the first form comprises modified nucleotides, and wherein the second form does not comprise modified nucleotides.
61. The method of any one of claims 51 to 60, further comprising determining the amounts of the first and second forms of the first analyte of the analytes based on the amounts of the reporter polynucleotides corresponding to the first and second donor recognition probes of the donor recognition probes.
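A minimal sketch of the comparison in claim 61, assuming the scheme of claim 54 in which one donor recognition probe is specific for the first form and another for the second form, and assuming reporter counts for each probe have already been obtained. The function name is hypothetical, and the calculation ignores differences in probe binding efficiency, which a real assay would need to calibrate.

```python
def form_fractions(first_form_count, second_form_count):
    """Return (fraction_first_form, fraction_second_form), or None if no reporters were seen.

    Inputs are reporter counts attributed to the first-form-specific and
    second-form-specific donor recognition probes (assumed already deduplicated).
    """
    total = first_form_count + second_form_count
    if total == 0:
        return None
    return first_form_count / total, second_form_count / total
```

For example, 25 reporters from a probe specific for a phosphorylated form and 75 from a probe specific for the unmodified form would suggest, under these assumptions, that about a quarter of the detected analyte is phosphorylated.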
62. A composition, the composition comprising:
an analyte having a first portion and a second portion;
a donor recognition probe coupled to the first portion of the analyte, the donor recognition probe comprising a first recognition element specific for the first portion of the analyte, a first oligonucleotide corresponding to the first portion of the analyte, and a transposase coupled to the first recognition element and the first oligonucleotide; and
a receptor recognition probe coupled to the second portion of the analyte, the receptor recognition probe comprising a second recognition element specific for the second portion of the analyte and a second oligonucleotide coupled to the second recognition element and corresponding to the second portion of the analyte.
63. A kit, the kit comprising:
a plurality of donor recognition probes, each donor recognition probe comprising a first recognition element specific for a first portion of a respective analyte, a first oligonucleotide corresponding to the first portion of that respective analyte, and a transposase coupled to the first recognition element and the first oligonucleotide; and
a plurality of receptor recognition probes, each receptor recognition probe comprising a second recognition element specific for a second portion of a respective analyte and a second oligonucleotide coupled to the second recognition element and corresponding to the second portion of that respective analyte.
64. A method for detecting an analyte, the method comprising:
coupling a donor recognition probe to a first portion of the analyte, the donor recognition probe comprising a first oligonucleotide corresponding to the first portion of the analyte and a transposase coupled to the first oligonucleotide;
coupling a receptor recognition probe to a second portion of the analyte, the receptor recognition probe comprising a second oligonucleotide corresponding to the second portion of the analyte;
generating a reporter polynucleotide comprising the first oligonucleotide and the second oligonucleotide using the transposase; and
detecting the analyte based on the reporter polynucleotide comprising the first oligonucleotide and the second oligonucleotide.
65. The method of claim 64, wherein the donor recognition probe is coupled to the first portion of the analyte via a covalent bond, and wherein the acceptor recognition probe is coupled to the second portion of the analyte via a covalent bond.
66. A method for detecting an analyte, the method comprising:
coupling a first recognition probe to a first portion of the analyte, the first recognition probe comprising a first recognition element specific for the first portion of the analyte and a first oligonucleotide corresponding to the first portion of the analyte;
coupling a second recognition probe to a second portion of the analyte, the second recognition probe comprising a second recognition element specific for the second portion of the analyte and a second oligonucleotide corresponding to the second portion of the analyte;
coupling the first oligonucleotide to the second oligonucleotide using a splint oligonucleotide having complementarity to both a portion of the first oligonucleotide and a portion of the second oligonucleotide to form a reporter oligonucleotide coupled to the first recognition probe and the second recognition probe;
performing sequence analysis on the reporter oligonucleotide; and
detecting the analyte based on the sequence analysis of the reporter oligonucleotide.
67. The method of claim 66, the method further comprising:
generating a double-stranded oligonucleotide comprising the reporter oligonucleotide coupled to the first recognition probe and the second recognition probe and a complementary oligonucleotide hybridized to the reporter oligonucleotide.
68. The method of claim 67, further comprising cleaving a portion of the double-stranded oligonucleotide, wherein the sequence analysis is performed on the cleaved portion of the double-stranded oligonucleotide.
69. The method of claim 68, wherein the sequence analysis performed comprises any one or more of isothermal bead-based amplification, targeted genomic amplification, and whole genome amplification.
70. The method of claim 66, wherein the first recognition probe or the second recognition probe comprises an antibody, lectin, or aptamer.
71. The method of claim 66, wherein the first recognition probe comprises a first antibody, a first lectin, or a first aptamer.
72. The method of claim 66, wherein the second recognition probe comprises a second antibody, a second lectin, or a second aptamer.
73. The method of claim 66, wherein the first oligonucleotide comprises a partial barcode and the second oligonucleotide comprises a partial barcode, wherein coupling the first oligonucleotide to the second oligonucleotide produces a complete barcode corresponding to the target analyte.
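The partial-barcode arrangement of claim 73 can be illustrated with a small lookup, offered as a sketch under stated assumptions: each oligonucleotide contributes a fixed-length partial barcode, the complete barcode is their concatenation in first-then-second order, and the table of complete barcodes is invented for the example.

```python
# Hypothetical table of complete barcodes (first partial + second partial).
COMPLETE_BARCODES = {
    "ACGTGGAA": "analyte_A",
    "TGCACCTT": "analyte_B",
}


def identify_analyte(first_partial, second_partial):
    """Return the analyte whose complete barcode equals the joined partial barcodes, else None."""
    return COMPLETE_BARCODES.get(first_partial + second_partial)
```

One consequence of this design is that a reporter formed from mismatched probes (for example, the first partial barcode of analyte A joined to the second partial barcode of analyte B) does not match any complete barcode and can be discarded.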
74. The method of claim 66, wherein performing the sequence analysis comprises performing a Polymerase Chain Reaction (PCR) on the reporter oligonucleotide.
75. The method of claim 66, wherein the reporter oligonucleotide comprises a Unique Molecular Identifier (UMI) that is amplified during the PCR.
76. A method for detecting a plurality of analytes in a sample, the method comprising:
incubating the sample with:
a plurality of pairs of recognition probes,
wherein each pair of recognition probes comprises a first recognition probe and a second recognition probe,
wherein each pair of recognition probes is specific for a respective analyte of the analytes, and wherein each first recognition probe and each second recognition probe is coupled to a respective oligonucleotide; and
a plurality of splint oligonucleotides,
wherein each splint oligonucleotide is complementary to portions of the oligonucleotides that are respectively coupled to the first and second recognition probes of a pair of recognition probes specific for a respective analyte of the analytes, and
wherein complementary binding of each splint oligonucleotide to an oligonucleotide coupled to the first recognition probe and the second recognition probe results in the formation of a reporter oligonucleotide;
washing the sample to remove any unbound recognition probes and any unbound splint oligonucleotides;
performing sequence analysis on the reporter oligonucleotide; and
detecting the plurality of analytes based on the sequence analysis.
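The splint complementarity recited in claim 76 can be sketched as follows. The example assumes all sequences are written 5' to 3', that the splint spans the junction formed by the 3' end of the first oligonucleotide and the 5' end of the second oligonucleotide, and that each arm of the splint is a fixed length. The function names, the arm length, and the sequences in the usage note are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_splint(first_oligo, second_oligo, arm=4):
    """Return a splint (5'->3') complementary to the junction of the two oligonucleotides.

    The junction is the last `arm` bases of first_oligo followed by the
    first `arm` bases of second_oligo.
    """
    junction = first_oligo[-arm:] + second_oligo[:arm]
    return reverse_complement(junction)


def splint_bridges(splint, first_oligo, second_oligo, arm=4):
    """True if the splint is exactly complementary to the junction of the two oligonucleotides."""
    return splint == design_splint(first_oligo, second_oligo, arm)
```

With `first_oligo = "TTTTACGT"` and `second_oligo = "GGAACCCC"`, the junction is `ACGTGGAA` and the designed splint is `TTCCACGT`. In a multiplexed assay, giving each probe pair a distinct junction sequence is what would let each splint join only its own pair.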
77. The method of claim 76, wherein incubating the sample further comprises incubating with a ligase.
78. The method of claim 76, wherein performing the sequence analysis comprises using any one or more of a microarray, bead array, library preparation, or PCR.
79. A composition, the composition comprising:
a plurality of analytes;
a plurality of pairs of recognition probes,
wherein each pair of recognition probes comprises a first recognition probe and a second recognition probe,
wherein each pair of recognition probes is specific for a respective analyte of the analytes, and
wherein each first recognition probe and each second recognition probe is coupled to a respective oligonucleotide; and
a plurality of splint oligonucleotides,
wherein each splint oligonucleotide is complementary to portions of the oligonucleotides that are respectively coupled to the first and second recognition probes of a pair of recognition probes specific for a respective analyte of the analytes.
80. A kit, the kit comprising:
a plurality of pairs of recognition probes,
wherein each pair of recognition probes comprises a first recognition probe and a second recognition probe,
wherein each pair of recognition probes is specific for a respective analyte of the analytes, and
wherein each first recognition probe and each second recognition probe is coupled to a respective oligonucleotide;
and
a plurality of splint oligonucleotides,
wherein each splint oligonucleotide is complementary to portions of the oligonucleotides that are respectively coupled to the first and second recognition probes of a pair of recognition probes specific for a respective analyte of the analytes.
81. A method for detecting an analyte, the method comprising:
coupling a first recognition probe to a first portion of the analyte, the first recognition probe comprising a first recognition element specific for the first portion of the analyte and a double-stranded oligonucleotide comprising a first barcode corresponding to the first portion of the analyte;
coupling a second recognition probe to a second portion of the analyte, the second recognition probe comprising a second recognition element specific for the second portion of the analyte and a single-stranded oligonucleotide comprising a second barcode corresponding to the second portion of the analyte;
hybridizing the single-stranded oligonucleotide to a single oligonucleotide strand of the double-stranded oligonucleotide to form a reporter oligonucleotide comprising the first barcode and the second barcode;
performing sequence analysis on the reporter oligonucleotide; and
detecting the analyte based on the sequence analysis of the reporter oligonucleotide.
82. The method of claim 81, wherein the hybridizing step comprises strand invasion of the double stranded oligonucleotide by the single stranded oligonucleotide.
83. The method of claim 81, wherein the sequence analysis performed comprises any one or more of isothermal bead-based amplification, targeted genomic amplification, and whole genome amplification.
84. The method of claim 81, wherein detecting the analyte comprises performing a quantitative detection of the reporter oligonucleotide.
85. A method for detecting an analyte, the method comprising:
coupling a first recognition probe to a first portion of the analyte, the first recognition probe comprising a first recognition element specific for the first portion of the analyte and a first oligonucleotide corresponding to the first portion of the analyte, wherein the first oligonucleotide comprises a first restriction endonuclease site;
coupling a second recognition probe to a second portion of the analyte, the second recognition probe comprising a second recognition element specific for the second portion of the analyte and a second oligonucleotide corresponding to the second portion of the analyte, wherein the second oligonucleotide comprises a second restriction endonuclease site;
coupling the first oligonucleotide to the second oligonucleotide;
cleaving the first oligonucleotide and the second oligonucleotide at the first restriction endonuclease site and the second restriction endonuclease site to form a reporter oligonucleotide;
performing sequence analysis on the reporter oligonucleotide; and
detecting the analyte based on the sequence analysis of the reporter oligonucleotide.
86. The method of claim 85, wherein the cleaving step comprises using one or more restriction endonucleases.
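As a simplified sketch of the cleaving step of claims 85 and 86, the coupled oligonucleotides can be modeled as a single top-strand string from which the reporter is excised between two restriction sites. The single-strand model, the function name, and the default cut offsets are assumptions of this sketch. The usage note uses the EcoRI site GAATTC and the BamHI site GGATCC, each of which is cut after the first G on the top strand; the claims do not name particular enzymes.

```python
def excise_reporter(joined, first_site, second_site, first_cut=1, second_cut=1):
    """Return the top-strand fragment between the two cut positions, or None.

    joined: sequence of the coupled first and second oligonucleotides (5'->3').
    first_cut / second_cut: offset of the cut within each recognition site
    (1 means the strand is cut after the first base of the site).
    """
    i = joined.find(first_site)
    if i < 0:
        return None
    # Look for the second site only downstream of the first site.
    j = joined.find(second_site, i + len(first_site))
    if j < 0:
        return None
    return joined[i + first_cut:j + second_cut]
```

For `joined = "AAGAATTCACGTGGAAGGATCCTT"`, the first site starts at index 2 and the second at index 16, so the excised fragment is `AATTCACGTGGAAG`. If either site is missing, for example because the two oligonucleotides were never coupled, no reporter is returned.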
87. The method of claim 85, wherein the sequence analysis performed comprises any one or more of isothermal bead-based amplification, targeted genomic amplification, and whole genome amplification.
88. The method of claim 85, wherein detecting the analyte comprises performing a quantitative detection of the reporter oligonucleotide.
89. A method of performing a targeted epigenetic assay, the method comprising:
contacting a polynucleotide with a mixture of first complexes specific for different types of proteins coupled to corresponding loci of the polynucleotide,
each first complex of the first complexes comprising a first antibody specific for a corresponding type of protein, and a first transposome coupled to the first antibody and comprising a first oligonucleotide corresponding to that type of protein;
coupling the first complexes to proteins specific for the first antibodies, respectively;
generating a fragment of the polynucleotide comprising activating the first transposome to create a first nick in the polynucleotide and coupling the first oligonucleotide to the first nick;
removing the protein and first complex from the fragment;
sequencing the fragment and the first oligonucleotide coupled to the fragment; and
identifying the protein to which the fragments have been coupled using the sequences of the first oligonucleotides coupled to those fragments.
90. The method of claim 89, wherein each first complex of the first complexes comprises a plurality of first transposomes.
91. The method of claim 90, wherein each first complex of the first complexes comprises two first transposomes.
92. The method of any one of claims 89-91, wherein the first transposomes are inactivated using a first condition of a fluid.
93. The method of claim 92, wherein the first condition of the fluid comprises at least one of: (i) the presence of a sufficient amount of EDTA to inhibit the activity of the first transposome, and (ii) the absence of a sufficient amount of magnesium ions for the activity of the first transposome.
94. The method of claim 92 or claim 93, wherein the first transposomes are activated using a second condition of the fluid.
95. The method of claim 94, wherein the second condition of the fluid comprises the presence of a sufficient amount of magnesium ions for activity of the first transposome.
96. The method of any one of claims 89 to 95, wherein the sequencing comprises performing sequencing-by-synthesis on the fragment and the oligonucleotide coupled to the fragment.
97. The method of any one of claims 89-96, comprising identifying the respective loci of the proteins using respective positions in the fragments of the first oligonucleotides.
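The identification steps of claims 89 and 97 can be sketched as a grouping of sequenced fragments by locus. The example assumes each fragment read begins with a fixed-length barcode from the first oligonucleotide, and that the genomic portion of the read has already been aligned to give a chromosome and start coordinate by some separate tool. The barcode table, record format, and function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from first-oligonucleotide barcode to protein type.
PROTEIN_BARCODES = {"ACGT": "protein_1", "TGCA": "protein_2"}


def proteins_by_locus(fragments, bc_len=4):
    """Group identified proteins by the locus of the fragment that carries their barcode.

    fragments: iterable of (read_sequence, chrom, start) tuples (assumed format),
    where read_sequence starts with the barcode and (chrom, start) comes from alignment.
    Returns a dict mapping (chrom, start) -> set of protein labels.
    """
    loci = defaultdict(set)
    for read, chrom, start in fragments:
        protein = PROTEIN_BARCODES.get(read[:bc_len])
        if protein is not None:  # skip reads with unrecognized barcodes
            loci[(chrom, start)].add(protein)
    return dict(loci)
```

Because the barcode travels with the fragment it was inserted into, a single sequencing read supplies both pieces of information: which type of protein was bound and where on the polynucleotide it was bound.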
98. The method of any one of claims 89-97, wherein the first oligonucleotide comprises a primer.
99. The method of any one of claims 89-98, wherein the first oligonucleotide comprises a Unique Molecular Identifier (UMI).
100. The method of any one of claims 89-99, wherein the first oligonucleotide comprises a barcode corresponding to the protein.
101. The method of any one of claims 89 to 100, wherein the first oligonucleotide comprises a mosaic end (ME) transposon end.
102. The method of any one of claims 89-101, wherein the first transposome is coupled to the first antibody via a covalent bond.
103. The method of any one of claims 89-101, wherein the first transposome is coupled to the first antibody via a non-covalent bond.
104. The method of claim 103, wherein the first transposome is coupled to protein A, and wherein the active site of the first antibody is coupled to the protein A.
105. The method of any one of claims 89-104, wherein the first transposome comprises Tn5.
106. The method of any one of claims 89-105, wherein each first complex of the first complexes comprises a fusion protein comprising the first antibody and the first transposome.
107. The method of any one of claims 89-106, wherein the first antibody is coupled to the first oligonucleotide, and wherein the first transposome is coupled to the first antibody via the first oligonucleotide.
108. The method of any one of claims 89-107, further comprising:
contacting the polynucleotide with a mixture of second complexes specific for the first complex,
each second complex of the second complexes comprises a second antibody specific for the first antibody and a second transposome coupled to the second antibody and comprising a second oligonucleotide; and
coupling the second complex with the first complex, respectively;
wherein generating a fragment of the polynucleotide further comprises activating the second transposome to generate a second nick in the polynucleotide and coupling the second oligonucleotide to the second nick; and
wherein the second oligonucleotide is used to amplify the fragment prior to sequencing.
109. The method of any one of claims 89 to 108, wherein the polynucleotide comprises double stranded DNA.
110. A composition, the composition comprising:
a polynucleotide having different types of proteins coupled to respective loci of the polynucleotide; and
a mixture of first complexes specific for different types of proteins,
each of the first complexes comprises a first antibody selective for a type of protein, and a first transposome coupled to the first antibody and comprising a first oligonucleotide corresponding to that type of protein.
111. The composition of claim 110, wherein each first complex of the first complexes comprises a plurality of first transposomes.
112. The composition of claim 111, wherein each first complex of the first complexes comprises two first transposomes.
113. The composition of any one of claims 110 to 112, wherein the first transposomes are inactivated using a condition of a fluid.
114. The composition of claim 113, wherein the condition of the fluid comprises at least one of: (i) the presence of a sufficient amount of EDTA to inhibit the activity of the first transposome, and (ii) the absence of a sufficient amount of magnesium ions for the activity of the first transposome.
115. The composition of any one of claims 110-114, wherein the first transposome is activatable to cleave the polynucleotide and add the first oligonucleotide to the nick.
116. The composition of claim 115, wherein the first transposomes are activatable using a condition of a fluid.
117. The composition of claim 116, wherein the condition of the fluid comprises the presence of a sufficient amount of magnesium ions for the activity of the first transposome.
118. The composition of any one of claims 110 to 117, wherein the first oligonucleotide comprises a primer.
119. The composition of any one of claims 110 to 118, wherein said first oligonucleotide comprises a Unique Molecular Identifier (UMI).
120. The composition of any one of claims 110 to 119, wherein the first oligonucleotide comprises a barcode corresponding to the protein.
121. The composition of any one of claims 110 to 120, wherein the first oligonucleotide comprises a mosaic end (ME) transposon end.
122. The composition of any one of claims 110-121, wherein the first transposome is coupled to the first antibody via a covalent bond.
123. The composition of any one of claims 110-122, wherein the first transposome is coupled to the first antibody via a non-covalent bond.
124. The composition of claim 123, wherein the first transposome is coupled to protein A, and wherein the active site of the first antibody is coupled to the protein A.
125. The composition of any one of claims 110-124, wherein the first transposome comprises Tn5.
126. The composition of any one of claims 110-125, wherein each first complex of the first complexes comprises a fusion protein comprising the first antibody and the first transposome.
127. The composition of any one of claims 110-126, wherein the first antibody is coupled to the first oligonucleotide, and wherein the first transposome is coupled to the first antibody via the first oligonucleotide.
128. The composition of any one of claims 110 to 127, further comprising:
a mixture of second complexes, the second complexes being specific for the first complexes,
each of the second complexes comprises a second antibody coupled to one of the first antibodies and a second transposome comprising a second oligonucleotide.
129. The composition of any one of claims 110 to 128, wherein the polynucleotide comprises double stranded DNA.
CN202280059144.7A 2021-08-11 2022-08-09 Detection of analytes using targeted epigenetic assays, proximity-induced tagging, strand invasion, restriction or ligation Pending CN117881796A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US63/231970 2021-08-11
US202163250574P 2021-09-30 2021-09-30
US63/250574 2021-09-30
PCT/US2022/039853 WO2023018730A1 (en) 2021-08-11 2022-08-09 Detection of analytes using targeted epigenetic assays, proximity-induced tagmentation, strand invasion, restriction, or ligation

Publications (1)

Publication Number Publication Date
CN117881796A (en) 2024-04-12

Family

ID=90588814

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280059144.7A Pending CN117881796A (en) 2021-08-11 2022-08-09 Detection of analytes using targeted epigenetic assays, proximity-induced tagging, strand invasion, restriction or ligation

Country Status (1)

Country Link
CN (1) CN117881796A (en)

Legal Events

Date Code Title Description
PB01 Publication