WO2022187273A1 - Bioreactive compounds and methods of use thereof - Google Patents
- Publication number
- WO2022187273A1 (PCT/US2022/018381)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- receptor
- substituted
- unsubstituted
- biomolecule
- moiety
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P21/00—Preparation of peptides or proteins
- C12P21/02—Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/58—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving labelled substances
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K47/00—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient
- A61K47/50—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates
- A61K47/51—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the non-active ingredient being a modifying agent
- A61K47/62—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the non-active ingredient being a modifying agent the modifying agent being a protein, peptide or polyamino acid
- A61K47/64—Drug-peptide, drug-protein or drug-polyamino acid conjugates, i.e. the modifying agent being a peptide, protein or polyamino acid which is covalently bonded or complexed to a therapeutically active agent
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K47/00—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient
- A61K47/50—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates
- A61K47/51—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the non-active ingredient being a modifying agent
- A61K47/68—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the non-active ingredient being a modifying agent the modifying agent being an antibody, an immunoglobulin or a fragment thereof, e.g. an Fc-fragment
- A61K47/6835—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the non-active ingredient being a modifying agent the modifying agent being an antibody, an immunoglobulin or a fragment thereof, e.g. an Fc-fragment the modifying agent being an antibody or an immunoglobulin bearing at least one antigen-binding site
- A61K47/6849—Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the non-active ingredient being a modifying agent the modifying agent being an antibody, an immunoglobulin or a fragment thereof, e.g. an Fc-fragment the modifying agent being an antibody or an immunoglobulin bearing at least one antigen-binding site the antibody targeting a receptor, a cell surface antigen or a cell surface determinant
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07C—ACYCLIC OR CARBOCYCLIC COMPOUNDS
- C07C305/00—Esters of sulfuric acids
- C07C305/26—Halogenosulfates, i.e. monoesters of halogenosulfuric acids
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07C—ACYCLIC OR CARBOCYCLIC COMPOUNDS
- C07C309/00—Sulfonic acids; Halides, esters, or anhydrides thereof
- C07C309/63—Esters of sulfonic acids
- C07C309/72—Esters of sulfonic acids having sulfur atoms of esterified sulfo groups bound to carbon atoms of six-membered aromatic rings of a carbon skeleton
- C07C309/77—Esters of sulfonic acids having sulfur atoms of esterified sulfo groups bound to carbon atoms of six-membered aromatic rings of a carbon skeleton containing carboxyl groups bound to the carbon skeleton
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07C—ACYCLIC OR CARBOCYCLIC COMPOUNDS
- C07C309/00—Sulfonic acids; Halides, esters, or anhydrides thereof
- C07C309/78—Halides of sulfonic acids
- C07C309/86—Halides of sulfonic acids having halosulfonyl groups bound to carbon atoms of six-membered aromatic rings of a carbon skeleton
- C07C309/89—Halides of sulfonic acids having halosulfonyl groups bound to carbon atoms of six-membered aromatic rings of a carbon skeleton containing carboxyl groups bound to the carbon skeleton
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07C—ACYCLIC OR CARBOCYCLIC COMPOUNDS
- C07C311/00—Amides of sulfonic acids, i.e. compounds having singly-bound oxygen atoms of sulfo groups replaced by nitrogen atoms, not being part of nitro or nitroso groups
- C07C311/30—Sulfonamides, the carbon skeleton of the acid part being further substituted by singly-bound nitrogen atoms, not being part of nitro or nitroso groups
- C07C311/45—Sulfonamides, the carbon skeleton of the acid part being further substituted by singly-bound nitrogen atoms, not being part of nitro or nitroso groups at least one of the singly-bound nitrogen atoms being part of any of the groups, X being a hetero atom, Y being any atom, e.g. N-acylaminosulfonamides
- C07C311/47—Y being a hetero atom
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07D—HETEROCYCLIC COMPOUNDS
- C07D233/00—Heterocyclic compounds containing 1,3-diazole or hydrogenated 1,3-diazole rings, not condensed with other rings
- C07D233/54—Heterocyclic compounds containing 1,3-diazole or hydrogenated 1,3-diazole rings, not condensed with other rings having two double bonds between ring members or between ring members and non-ring members
- C07D233/64—Heterocyclic compounds containing 1,3-diazole or hydrogenated 1,3-diazole rings, not condensed with other rings having two double bonds between ring members or between ring members and non-ring members with substituted hydrocarbon radicals attached to ring carbon atoms, e.g. histidine
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IG], e.g. monoclonal or polyclonal antibodies
- C07K16/18—Immunoglobulins [IG], e.g. monoclonal or polyclonal antibodies against material from animals or humans
- C07K16/28—Immunoglobulins [IG], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants
- C07K16/2863—Immunoglobulins [IG], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants against receptors for growth factors, growth regulators
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/93—Ligases (6)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y601/00—Ligases forming carbon-oxygen bonds (6.1)
- C12Y601/01—Ligases forming aminoacyl-tRNA and related compounds (6.1.1)
- C12Y601/01026—Pyrrolysine-tRNAPyl ligase (6.1.1.26)
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/58—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving labelled substances
- G01N33/582—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving labelled substances with fluorescent label
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/68—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
- G01N33/6872—Intracellular protein regulatory factors and their receptors, e.g. including ion channels
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2317/00—Immunoglobulins specific features
- C07K2317/50—Immunoglobulins specific features characterized by immunoglobulin fragments
- C07K2317/56—Immunoglobulins specific features characterized by immunoglobulin fragments variable (Fv) region, i.e. VH and/or VL
- C07K2317/569—Single domain, e.g. dAb, sdAb, VHH, VNAR or nanobody®
Definitions
- SuFEx click chemistry via the latent aryl fluorosulfate group has demonstrated value in aiding modular organic synthesis, chemical biology, and drug development.
- the inventors incorporated fluorosulfate-L-tyrosine (FSY) into proteins for protein crosslinking and generating covalent protein drugs.
- FSY fluorosulfate-L-tyrosine
- FSK fluorosulfonyloxybenzoyl-L-lysine
- R1 is a biomolecule.
- R1 is a peptidyl moiety, a nucleic acid moiety, a carbohydrate moiety, or a small molecule. In aspects, R1 is a peptidyl moiety.
- biomolecule conjugates comprising a first biomolecule moiety conjugated to a second biomolecule moiety through a bioconjugate linker, wherein the bioconjugate linker has the structure of Formula (D).
- biomolecules comprising FSK, wherein FSK has a side chain having the structure of Formula (F). In aspects, the biomolecule is a protein.
- the protein is an antibody, an antibody variant (e.g., an antigen-binding fragment, a single-domain antibody, a single-chain variable fragment, an affibody), or a membrane receptor.
- FIGS.1A-1C show site-specific incorporation of FSK into proteins.
- FIG.1A Schematic illustration of FSK incorporation into proteins via genetic code expansion.
- FIG.1B SDS-PAGE of purified ubiquitin (6FSK).
- FIG.1C Mass spectrum of the intact ubiquitin (6FSK). Theoretical molecular weight: 9589.9 Da; observed: 9590.1 Da.
- FIGS.2A-2F show genetically encoded FSK enables protein crosslinking at long distance.
- FIG.2A Chemical structures of FSY and FSK, and schematic illustration of aryl fluorosulfate reacting with nucleophilic residues (e.g., Lys, Tyr, His) in proximity via SuFEx chemistry.
- FIG.2B Reaction distances of FSY and FSK measured from the C ⁇ to the F atom at their energy minimized states: 9.0 angstroms (FSY) and 13.8 angstroms (FSK).
- FIG.2C Crystal structure of ecGST (PDB code: 1A0F) showing the distances between Glu65 and the adjacent nucleophilic residues in yellow dotted lines.
- FIG.2D Western blot analysis of ecGST dimeric crosslinking induced by FSK or FSY incorporated at site 65 of ecGST.
- FIG.2E Crystal structure of sjGST (PDB code: 1Y6E) showing the distance between the C ⁇ s of Lys44 and Ala97.
- FIG.2F Western blot analysis of sjGST dimeric crosslinking induced by FSK or FSY. * indicates other proteins interacting with sjGST in E. coli.
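The reach values quoted for FIG.2B (9.0 angstroms for FSY, 13.8 angstroms for FSK) suggest a simple way to screen candidate target residues by distance. The following is a minimal sketch only: the coordinates and residue labels are hypothetical, invented for illustration, and a straight-line cutoff ignores side-chain orientation and flexibility, so it is not the inventors' method.

```python
import math

# Reach values quoted in the FIG.2B description (angstroms).
REACH_ANGSTROM = {"FSY": 9.0, "FSK": 13.8}


def residues_within_reach(uaa, site_xyz, targets):
    """Return {name: distance} for targets no farther than the reach of `uaa`."""
    reach = REACH_ANGSTROM[uaa]
    hits = {}
    for name, xyz in targets.items():
        d = math.dist(site_xyz, xyz)  # Euclidean distance in angstroms
        if d <= reach:
            hits[name] = round(d, 1)
    return hits


# Hypothetical coordinates (angstroms); not taken from any PDB entry.
site = (0.0, 0.0, 0.0)
targets = {
    "Lys_A": (8.0, 0.0, 0.0),
    "His_B": (0.0, 12.0, 5.0),
    "Tyr_C": (10.0, 10.0, 10.0),
}

print(residues_within_reach("FSY", site, targets))  # {'Lys_A': 8.0}
print(residues_within_reach("FSK", site, targets))  # {'Lys_A': 8.0, 'His_B': 13.0}
```

In this toy case the residue at 13.0 angstroms is reachable only by FSK, which mirrors the longer crosslinking distance described for FSK.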
- FIGS.3A-3B show FSK mediated intramolecular covalent crosslinking in ubiquitin.
- FIG.3A Structure of ubiquitin (PDB code: 1AAR) showing Glu18 for FSK incorporation to target Lys29.
- FIG.3B ESI-MS of Ub(18FSK). The peak of 9587.9 Da corresponds to the intact Ub(18FSK) (calculated MW: 9587.9 Da).
- the peak of 9568.6 Da corresponds to the intramolecularly cross-linked Ub via FSK18 reacting with Lys29 and losing HF (calculated MW: 9567.9).
- the peak 9506.8 Da corresponds to Ub(18FSK) losing SO2F (calculated MW: 9506.9), which could be due to the impurity of FSK.
- the tandem mass spectrum (not shown) of the cross-linked peptide identified from the trypsin-digested Ub(18FSK) showed that FSK reacted with Lys29 as designed.
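The mass assignment for the cross-linked species in FIG.3B can be checked with simple arithmetic: the cross-linked mass is the intact mass minus one HF. The sketch below uses standard average atomic masses for H and F; the function names are illustrative and not from the source.

```python
# Standard average atomic masses (Da), rounded to three decimals.
MASS_H = 1.008
MASS_F = 18.998
MASS_HF = MASS_H + MASS_F  # 20.006 Da, lost when the fluorosulfate reacts


def crosslinked_mass(intact_mass):
    """Expected mass after an intramolecular SuFEx crosslink (loss of HF)."""
    return intact_mass - MASS_HF


def mass_error(observed, calculated):
    """Signed difference between observed and calculated mass (Da)."""
    return observed - calculated


intact = 9587.9  # calculated MW of intact Ub(18FSK), from the FIG.3B description
print(round(crosslinked_mass(intact), 1))      # 9567.9
print(round(mass_error(9568.6, 9567.9), 1))    # 0.7
```

The result agrees with the calculated value of 9567.9 Da given for the cross-linked ubiquitin.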
- FIGS.4A-4F show FSK enabled 7D12 nanobody to covalently target the EGFR receptor.
- FIG.4A Structure of nanobody 7D12 in complex with EGFR (PDB code: 4KRL), showing Arg30 and Ser31 of 7D12 for FSK incorporation to target His359 of EGFR.
- FIG.4B ESI-MS analysis of 7D12(31FSK). Calculated MW: 14673.1 Da (forming 1 pair of disulfide bond); measured MW: 14673.2 Da.
- FIG.4C SDS-PAGE analysis of covalent crosslinking of nanobody 7D12 with EGFR in vitro.
- FIG.4D Western blot analysis of covalent crosslinking of nanobody 7D12 with EGFR in vitro. Only 7D12(31FSK) cross-linked EGFR covalently.
- FIG. 4E Western blot analysis of nanobody 7D12 crosslinking with native EGFR expressed on the A431 cell surface. 7D12 and 7D12(31FSK) were incubated with A431 cells for the indicated time intervals, and cell lysates were immunoblotted with anti-His antibody to detect the nanobody 7D12.
- FIG. 4F Schematic of the distance between 7D12 (Tyr109) and EGFR (Lys443), where Lys443 is shown in its state after site mutagenesis in PDB structure 4KRL.
- FIGS.5A-5C show genetic incorporation of FSK into proteins for protein crosslinking in mammalian cells.
- FIG.5A Fluorescence microscopic images of HeLa-EGFP(182TAG) reporter cells under different conditions. Cells were transfected with or without pNEU-FSKRS, and grown with or without 1 mM FSK. Top: bright field; bottom: GFP fluorescence channel.
- FIG.5B Western blot analysis of FSK incorporation into EGFP in HeLa cells. Samples from FIG.5A were lysed and detected using an anti-GFP antibody. GAPDH expression level was used as reference.
- FIG.5C Western blot analysis of FSK-mediated ecGST crosslinking in mammalian cells.
- FIGS.6A-6B show primers used for cloning as described in the Examples.
- FIG.7 compares the FSK incorporation efficiency of the FSKRS mutants induced at 18 °C for 24 hr.
- FIG.8 compares the FSK incorporation efficiency of the FSKRS mutants induced at 30 °C for 6 hr.
- FIG.9 shows incorporation of FSK into EGFP (182TAG) detected by Western blot. FSKRS was co-transformed with pBAD-EGFP(182TAG), and protein expression was induced with or without 1 mM FSK. Successful incorporation of FSK into EGFP was detected by Western blot using anti-His antibody.
- FIG.10 shows incorporation of FSK into sfGFP (2TAG) and sfGFP(151TAG).
- FIG.11 compares the FSY and FSK mediated GST crosslinking in short distance proximity.
- FIGS.12A-12B compare the FSY and FSK mediated E. coli GST crosslinking at the 86th position.
- FIG.12A is a schematic of the FSY/FSK crosslinking at Val86.
- FIG.12B shows the result of pEVOL-FSYRS and pEVOL-FSKRS co-expressed with ecGST WT or pBAD-GST (86TAG) in the presence of 1 mM FSK or FSY at 37 °C for 6 hr.
- the WT GST was used as a negative control.
- the GST dimer crosslinking was detected by Western blot using anti-His antibody.
- FIG.13 compares the crosslinking efficiency of FSK and FSY in mediating Trx and CysH crosslinking.
- FIG.14 shows purification of 7D12 (30FSK) and 7D12 (31FSK) in the presence and absence of 1 mM FSK during expression.
- FIG.15 shows utilization of 7D12 (30FSK), 7D12 (30FSY), 7D12 (109FSK) and 7D12 (109FSY) for crosslinking with EGFR in vitro.
- FIG.16 shows the utilization of Trx (59FSK), Trx (62FSK), Trx (59FSY), Trx (62FSY), for crosslinking with unknown substrates in vivo.
- FIG.17 is a schematic illustration of using FSK or FSY to identify Trx substrate proteins through genetically encoded chemical crosslinking in live cells.
- FIG.18 shows the scheme for the synthesis of fluorosulfonyloxybenzoyl-L-lysine (FSK).
- FIG.19 shows the incorporation of FSK into sfGFP(151TAG) using different FSKRS in the absence of FSK in the media or in the presence of 1 mM FSK in the media (where + indicates the presence of 1 mM FSK in the media).
- Cells were grown at 37 °C and induced for 5.5 h.
- sfGFP fluorescence intensity was measured and normalized to cell optical density. “NThis” means Hisx6 was appended at the N-terminus; “CThis” means Hisx6 was appended at the C-terminus.
- FIG.20 shows the incorporation of FSK into sfGFP(151TAG) using different FSKRS in the absence and presence of 1 mM FSK in the media.
- FIGS.21A-21B show a Western blot analysis of covalent crosslinking of nanobody 7D12 with EGFR in vitro, wherein nanobody 7D12 contained FSY at position 109, 113, and 116 (FIG.21A) or FSY at position 1, 109, or 113 (FIG.21B). Nanobody 7D12 was incubated with 500 nM EGFR in 15 μL PBS at 37 °C for 20 hours. Nanobody 7D12 is set forth as SEQ ID NO:88.
- FIGS.22A-22B show an SDS-PAGE analysis (FIG.22A) and a Western blot analysis (FIG.22B) of covalent crosslinking of nanobody 7D12 with EGFR in vitro, wherein nanobody 7D12 contained FSY at position 109. 2 μM of purified nanobody 7D12 was incubated with 500 nM EGFR in 15 μL PBS at 37 °C for 20 hours.
- FIG.23 is a Western Blot analysis of covalent crosslinking of nanobody 7D12 WT or nanobody 7D12 containing FSY at position 109 with the A431 cell line.
- FIGS.24A-24B show an SDS-PAGE analysis (FIG.24A) and a Western blot analysis (FIG.24B) of covalent crosslinking of nanobody 7D12 with EGFR in vitro, wherein nanobody 7D12 contained FSK at position 30 or position 31, or wherein nanobody 7D12 contained FSY at position 109. 2 μM of purified nanobody 7D12 was incubated with 500 nM EGFR in 15 μL PBS at 37 °C for 20 hours.
- FIGS.25A-25B show an SDS-PAGE analysis (FIG.25A) and a Western blot analysis (FIG.25B) of covalent crosslinking of nanobody 7D12 with EGFR in vitro, wherein nanobody 7D12 contained FSK at position 31, or wherein nanobody 7D12 contained FSY at position 109. 2 μM of purified nanobody 7D12 was incubated with 500 nM EGFR in 15 μL PBS at 37 °C for 20 hours.
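The incubation conditions stated for FIGS.22, 24, and 25 (reading the nanobody concentration as 2 micromolar, with 500 nM EGFR in 15 microliters) imply a fixed molar excess of nanobody over receptor. A short sketch of that arithmetic follows; the helper name is illustrative.

```python
def picomoles(conc_molar, volume_liters):
    """Amount of substance in picomoles for a given molarity and volume."""
    return conc_molar * volume_liters * 1e12


VOLUME_L = 15e-6  # 15 microliters of PBS

nanobody_pmol = picomoles(2e-6, VOLUME_L)   # 2 micromolar nanobody 7D12
egfr_pmol = picomoles(500e-9, VOLUME_L)     # 500 nanomolar EGFR

print(round(nanobody_pmol, 2))              # 30.0
print(round(egfr_pmol, 2))                  # 7.5
print(round(nanobody_pmol / egfr_pmol, 2))  # 4.0
```

Under these assumptions each reaction contains about 30 pmol of nanobody and 7.5 pmol of EGFR, a fourfold molar excess of nanobody.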
- DETAILED DESCRIPTION. Definitions. The abbreviations used herein have their conventional meaning within the chemical and biological arts.
- the alkyl may include a designated number of carbons (e.g., C1-C10 means one to ten carbons).
- Alkyl is an uncyclized chain.
- saturated hydrocarbon radicals include, but are not limited to, groups such as methyl, ethyl, n-propyl, isopropyl, n-butyl, t-butyl, isobutyl, sec-butyl, methyl, homologs and isomers of, for example, n-pentyl, n-hexyl, n-heptyl, n-octyl, and the like.
- An unsaturated alkyl group is one having one or more double bonds or triple bonds.
- Examples of unsaturated alkyl groups include, but are not limited to, vinyl, 2-propenyl, crotyl, 2-isopentenyl, 2-(butadienyl), 2,4-pentadienyl, 3-(1,4-pentadienyl), ethynyl, 1- and 3-propynyl, 3-butynyl, and the higher homologs and isomers.
- An alkoxy is an alkyl attached to the remainder of the molecule via an oxygen linker (-O-).
- An alkyl moiety may be an alkenyl moiety.
- An alkyl moiety may be an alkynyl moiety.
- An alkyl moiety may be fully saturated.
- An alkenyl may include more than one double bond and/or one or more triple bonds in addition to the one or more double bonds.
- An alkynyl may include more than one triple bond and/or one or more double bonds in addition to the one or more triple bonds.
- alkylene by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from an alkyl, as exemplified, but not limited by, -CH2CH2CH2CH2-.
- an alkyl (or alkylene) group will have from 1 to 24 carbon atoms, with those groups having 10 or fewer carbon atoms being preferred herein.
- a “lower alkyl” or “lower alkylene” is a shorter chain alkyl or alkylene group, generally having eight or fewer carbon atoms.
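The carbon-count conventions above (a label such as C1-C10, a general range of 1 to 24 carbons, 10 or fewer preferred, and "lower" groups generally having eight or fewer) can be expressed as a small lookup. This is a sketch only; the function names and the category labels are illustrative, not terms from the source.

```python
import re


def parse_carbon_range(label):
    """Parse a label such as 'C1-C10' into (min_carbons, max_carbons)."""
    match = re.fullmatch(r"C(\d+)-C(\d+)", label.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognized carbon-range label: {label!r}")
    low, high = int(match.group(1)), int(match.group(2))
    if low > high:
        raise ValueError("lower bound exceeds upper bound")
    return low, high


def classify_alkyl(n_carbons):
    """Map a carbon count onto the size categories described in the text."""
    if not 1 <= n_carbons <= 24:
        return "outside the 1 to 24 carbon range"
    if n_carbons <= 8:
        return "lower alkyl"
    if n_carbons <= 10:
        return "preferred (10 or fewer carbons)"
    return "alkyl (11 to 24 carbons)"


print(parse_carbon_range("C1-C10"))  # (1, 10)
print(classify_alkyl(4))             # lower alkyl
print(classify_alkyl(12))            # alkyl (11 to 24 carbons)
```

Note that a lower alkyl also falls within the preferred range; the function simply reports the narrowest category.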
- alkenylene by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from an alkene.
- heteroalkyl by itself or in combination with another term, means, unless otherwise stated, a stable straight or branched chain, or combinations thereof, including at least one carbon atom and at least one heteroatom (e.g., O, N, P, Si, and S), and wherein the nitrogen and sulfur atoms may optionally be oxidized, and the nitrogen heteroatom may optionally be quaternized.
- heteroatom(s) may be placed at any interior position of the heteroalkyl group or at the position at which the alkyl group is attached to the remainder of the molecule.
- a heteroalkyl moiety may include one heteroatom.
- a heteroalkyl moiety may include two optionally different heteroatoms.
- a heteroalkyl moiety may include three optionally different heteroatoms.
- a heteroalkyl moiety may include four optionally different heteroatoms.
- a heteroalkyl moiety may include five optionally different heteroatoms.
- a heteroalkyl moiety may include up to 8 optionally different heteroatoms.
- the term “heteroalkenyl,” by itself or in combination with another term, means, unless otherwise stated, a heteroalkyl including at least one double bond.
- a heteroalkenyl may optionally include more than one double bond and/or one or more triple bonds in addition to the one or more double bonds.
- a heteroalkynyl may optionally include more than one triple bond and/or one or more double bonds in addition to the one or more triple bonds.
- heteroalkylene by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from heteroalkyl, as exemplified, but not limited by, -CH2-CH2-S-CH2-CH2- and -CH2-S-CH2-CH2-NH-CH2-.
- heteroatoms can also occupy either or both of the chain termini (e.g., alkyleneoxy, alkylenedioxy, alkyleneamino, alkylenediamino, and the like).
- heteroalkyl groups include those groups that are attached to the remainder of the molecule through a heteroatom, such as -C(O)R', -C(O)NR', -NR'R'', -OR', -SR', and/or -SO2R'.
- Where “heteroalkyl” is recited, followed by recitations of specific heteroalkyl groups, such as -NR'R'' or the like, it will be understood that the terms heteroalkyl and -NR'R'' are not redundant or mutually exclusive. Rather, the specific heteroalkyl groups are recited to add clarity. Thus, the term “heteroalkyl” should not be interpreted herein as excluding specific heteroalkyl groups, such as -NR'R'' or the like.
- Cycloalkyl and heterocycloalkyl are not aromatic. Additionally, for heterocycloalkyl, a heteroatom can occupy the position at which the heterocycle is attached to the remainder of the molecule.
- Examples of cycloalkyl include, but are not limited to, cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, 1-cyclohexenyl, 3-cyclohexenyl, cycloheptyl, and the like.
- heterocycloalkyl examples include, but are not limited to, 1-(1,2,5,6-tetrahydropyridyl), 1-piperidinyl, 2-piperidinyl, 3-piperidinyl, 4-morpholinyl, 3-morpholinyl, tetrahydrofuran-2-yl, tetrahydrofuran-3-yl, tetrahydrothien-2-yl, tetrahydrothien-3-yl, 1-piperazinyl, 2-piperazinyl, and the like.
- a “cycloalkylene” and a “heterocycloalkylene,” alone or as part of another substituent, mean a divalent radical derived from a cycloalkyl and a heterocycloalkyl, respectively.
- cycloalkyl means a monocyclic, bicyclic, or a multicyclic cycloalkyl ring system.
- monocyclic ring systems are cyclic hydrocarbon groups containing from 3 to 8 carbon atoms, where such groups can be saturated or unsaturated, but not aromatic.
- cycloalkyl groups are fully saturated.
- monocyclic cycloalkyls include cyclopropyl, cyclobutyl, cyclopentyl, cyclopentenyl, cyclohexyl, cyclohexenyl, cycloheptyl, and cyclooctyl.
- Bicyclic cycloalkyl ring systems are bridged monocyclic rings or fused bicyclic rings.
- bridged monocyclic rings contain a monocyclic cycloalkyl ring where two non-adjacent carbon atoms of the monocyclic ring are linked by an alkylene bridge of between one and three additional carbon atoms (i.e., a bridging group of the form (CH2)w, where w is 1, 2, or 3).
- bicyclic ring systems include, but are not limited to, bicyclo[3.1.1]heptane, bicyclo[2.2.1]heptane, bicyclo[2.2.2]octane, bicyclo[3.2.2]nonane, bicyclo[3.3.1]nonane, and bicyclo[4.2.1]nonane.
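The bicyclo[x.y.z] names listed above follow the von Baeyer convention, in which the three bracketed numbers give the atoms in each bridge and the two bridgehead atoms are counted separately. As a sanity check on the listed examples, the sketch below parses the descriptor and computes the total ring-atom count; the function name is illustrative.

```python
import re


def bicyclo_ring_atoms(name):
    """Total ring atoms for a bicyclo[x.y.z] name: x + y + z + 2 bridgeheads."""
    match = re.search(r"bicyclo\[(\d+)\.(\d+)\.(\d+)\]", name)
    if match is None:
        raise ValueError(f"no bicyclo descriptor found in {name!r}")
    bridges = [int(group) for group in match.groups()]
    return sum(bridges) + 2


# Examples from the text, paired with the atom count implied by the parent name.
EXAMPLES = {
    "bicyclo[3.1.1]heptane": 7,
    "bicyclo[2.2.1]heptane": 7,
    "bicyclo[2.2.2]octane": 8,
    "bicyclo[3.2.2]nonane": 9,
    "bicyclo[3.3.1]nonane": 9,
    "bicyclo[4.2.1]nonane": 9,
}

for name, expected in EXAMPLES.items():
    assert bicyclo_ring_atoms(name) == expected
print("all descriptors consistent with their parent names")
```

Each listed example also has a shortest bridge of one to three carbons, in line with the (CH2)w bridge where w is 1, 2, or 3.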
- fused bicyclic cycloalkyl ring systems contain a monocyclic cycloalkyl ring fused to either a phenyl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, a monocyclic heterocyclyl, or a monocyclic heteroaryl.
- the bridged or fused bicyclic cycloalkyl is attached to the parent molecular moiety through any carbon atom contained within the monocyclic cycloalkyl ring.
- cycloalkyl groups are optionally substituted with one or two groups which are independently oxo or thia.
- the fused bicyclic cycloalkyl is a 5 or 6 membered monocyclic cycloalkyl ring fused to either a phenyl ring, a 5 or 6 membered monocyclic cycloalkyl, a 5 or 6 membered monocyclic cycloalkenyl, a 5 or 6 membered monocyclic heterocyclyl, or a 5 or 6 membered monocyclic heteroaryl, wherein the fused bicyclic cycloalkyl is optionally substituted by one or two groups which are independently oxo or thia.
- multicyclic cycloalkyl ring systems are a monocyclic cycloalkyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a bicyclic aryl, a monocyclic or bicyclic heteroaryl, a monocyclic or bicyclic cycloalkyl, a monocyclic or bicyclic cycloalkenyl, and a monocyclic or bicyclic heterocyclyl.
- multicyclic cycloalkyl is attached to the parent molecular moiety through any carbon atom contained within the base ring.
- multicyclic cycloalkyl ring systems are a monocyclic cycloalkyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a monocyclic heteroaryl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, and a monocyclic heterocyclyl.
- a cycloalkyl is a cycloalkenyl.
- the term “cycloalkenyl” is used in accordance with its plain ordinary meaning.
- a cycloalkenyl is a monocyclic, bicyclic, or a multicyclic cycloalkenyl ring system.
- monocyclic cycloalkenyl ring systems are cyclic hydrocarbon groups containing from 3 to 8 carbon atoms, where such groups are unsaturated (i.e., containing at least one annular carbon-carbon double bond), but not aromatic.
- monocyclic cycloalkenyl ring systems include cyclopentenyl and cyclohexenyl.
- bicyclic cycloalkenyl rings are bridged monocyclic rings or fused bicyclic rings.
- bridged monocyclic rings contain a monocyclic cycloalkenyl ring where two non-adjacent carbon atoms of the monocyclic ring are linked by an alkylene bridge of between one and three additional carbon atoms (i.e., a bridging group of the form (CH2)w, where w is 1, 2, or 3).
- bicyclic cycloalkenyls include, but are not limited to, norbornenyl and bicyclo[2.2.2]oct-2-enyl.
- fused bicyclic cycloalkenyl ring systems contain a monocyclic cycloalkenyl ring fused to either a phenyl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, a monocyclic heterocyclyl, or a monocyclic heteroaryl.
- the bridged or fused bicyclic cycloalkenyl is attached to the parent molecular moiety through any carbon atom contained within the monocyclic cycloalkenyl ring.
- cycloalkenyl groups are optionally substituted with one or two groups which are independently oxo or thia.
- multicyclic cycloalkenyl rings contain a monocyclic cycloalkenyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two ring systems independently selected from the group consisting of a phenyl, a bicyclic aryl, a monocyclic or bicyclic heteroaryl, a monocyclic or bicyclic cycloalkyl, a monocyclic or bicyclic cycloalkenyl, and a monocyclic or bicyclic heterocyclyl.
- multicyclic cycloalkenyl is attached to the parent molecular moiety through any carbon atom contained within the base ring.
- multicyclic cycloalkenyl rings contain a monocyclic cycloalkenyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two ring systems independently selected from the group consisting of a phenyl, a monocyclic heteroaryl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, and a monocyclic heterocyclyl.
- a heterocycloalkyl is a heterocyclyl.
- heterocyclyl as used herein, means a monocyclic, bicyclic, or multicyclic heterocycle.
- the heterocyclyl monocyclic heterocycle is a 3, 4, 5, 6 or 7 membered ring containing at least one heteroatom independently selected from the group consisting of O, N, and S where the ring is saturated or unsaturated, but not aromatic.
- the 3 or 4 membered ring contains 1 heteroatom selected from the group consisting of O, N and S.
- the 5 membered ring can contain zero or one double bond and one, two or three heteroatoms selected from the group consisting of O, N and S.
- the 6 or 7 membered ring contains zero, one or two double bonds and one, two or three heteroatoms selected from the group consisting of O, N and S.
- the heterocyclyl monocyclic heterocycle is connected to the parent molecular moiety through any carbon atom or any nitrogen atom contained within the heterocyclyl monocyclic heterocycle.
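The ring-size rules above for a monocyclic heterocycle can be sketched as a small check. This is only an illustration under stated assumptions: the `MonocyclicRing` record and the function name are hypothetical and do not come from the disclosure, and the check encodes only the ring size, heteroatom count, and double-bond limits stated in the text.

```python
from dataclasses import dataclass

# Heteroatoms permitted in the monocyclic heterocycle definition above.
ALLOWED_HETEROATOMS = {"O", "N", "S"}


@dataclass(frozen=True)
class MonocyclicRing:
    """Hypothetical summary of a single ring (not part of the disclosure)."""
    ring_atoms: tuple      # element symbols of the ring members, e.g. ("N", "C", "C", "C", "C")
    double_bonds: int      # number of double bonds within the ring
    aromatic: bool = False


def is_monocyclic_heterocycle(ring: MonocyclicRing) -> bool:
    """Return True if the ring satisfies the size/heteroatom/double-bond rules."""
    size = len(ring.ring_atoms)
    hetero = [a for a in ring.ring_atoms if a != "C"]
    # Must be non-aromatic and contain at least one heteroatom from O, N, S.
    if ring.aromatic or not hetero:
        return False
    if any(a not in ALLOWED_HETEROATOMS for a in hetero):
        return False
    if size in (3, 4):
        # The text states only the heteroatom count for 3- and 4-membered rings.
        return len(hetero) == 1
    if size == 5:
        return ring.double_bonds in (0, 1) and 1 <= len(hetero) <= 3
    if size in (6, 7):
        return ring.double_bonds in (0, 1, 2) and 1 <= len(hetero) <= 3
    return False


pyrrolidine = MonocyclicRing(("N", "C", "C", "C", "C"), double_bonds=0)
cyclohexane = MonocyclicRing(("C",) * 6, double_bonds=0)
print(is_monocyclic_heterocycle(pyrrolidine), is_monocyclic_heterocycle(cyclohexane))
```

The sketch deliberately ignores attachment points and substituents, which the definition treats separately.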
- heterocyclyl monocyclic heterocycles include, but are not limited to, azetidinyl, azepanyl, aziridinyl, diazepanyl, 1,3-dioxanyl, 1,3-dioxolanyl, 1,3-dithiolanyl, 1,3-dithianyl, imidazolinyl, imidazolidinyl, isothiazolinyl, isothiazolidinyl, isoxazolinyl, isoxazolidinyl, morpholinyl, oxadiazolinyl, oxadiazolidinyl, oxazolinyl, oxazolidinyl, piperazinyl, piperidinyl, pyranyl, pyrazolinyl, pyrazolidinyl, pyrrolinyl, pyrrolidinyl, tetrahydrofuranyl, tetrahydrothienyl
- the heterocyclyl bicyclic heterocycle is a monocyclic heterocycle fused to either a phenyl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, a monocyclic heterocycle, or a monocyclic heteroaryl.
- the heterocyclyl bicyclic heterocycle is connected to the parent molecular moiety through any carbon atom or any nitrogen atom contained within the monocyclic heterocycle portion of the bicyclic ring system.
- bicyclic heterocyclyls include, but are not limited to, 2,3-dihydrobenzofuran-2-yl, 2,3-dihydrobenzofuran-3-yl, indolin-1-yl, indolin-2-yl, indolin-3-yl, 2,3-dihydrobenzothien-2-yl, decahydroquinolinyl, decahydroisoquinolinyl, octahydro-1H-indolyl, and octahydrobenzofuranyl.
- heterocyclyl groups are optionally substituted with one or two groups which are independently oxo or thia.
- the bicyclic heterocyclyl is a 5 or 6 membered monocyclic heterocyclyl ring fused to a phenyl ring, a 5 or 6 membered monocyclic cycloalkyl, a 5 or 6 membered monocyclic cycloalkenyl, a 5 or 6 membered monocyclic heterocyclyl, or a 5 or 6 membered monocyclic heteroaryl, wherein the bicyclic heterocyclyl is optionally substituted by one or two groups which are independently oxo or thia.
- Multicyclic heterocyclyl ring systems are a monocyclic heterocyclyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a bicyclic aryl, a monocyclic or bicyclic heteroaryl, a monocyclic or bicyclic cycloalkyl, a monocyclic or bicyclic cycloalkenyl, and a monocyclic or bicyclic heterocyclyl.
- multicyclic heterocyclyl is attached to the parent molecular moiety through any carbon atom or nitrogen atom contained within the base ring.
- multicyclic heterocyclyl ring systems are a monocyclic heterocyclyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a monocyclic heteroaryl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, and a monocyclic heterocyclyl.
- multicyclic heterocyclyl groups include, but are not limited to, 10H-phenothiazin-10-yl, 9,10-dihydroacridin-9-yl, 9,10-dihydroacridin-10-yl, 10H-phenoxazin-10-yl, 10,11-dihydro-5H-dibenzo[b,f]azepin-5-yl, 1,2,3,4-tetrahydropyrido[4,3-g]isoquinolin-2-yl, 12H-benzo[b]phenoxazin-12-yl, and dodecahydro-1H-carbazol-9-yl.
- halo or “halogen,” by themselves or as part of another substituent, mean, unless otherwise stated, a fluorine, chlorine, bromine, or iodine atom. Additionally, terms such as “haloalkyl” are meant to include monohaloalkyl and polyhaloalkyl.
- halo(C 1 -C 4 )alkyl includes, but is not limited to, fluoromethyl, difluoromethyl, trifluoromethyl, 2,2,2-trifluoroethyl, 4-chlorobutyl, 3-bromopropyl, and the like.
- acyl means, unless otherwise stated, -C(O)R where R is a substituted or unsubstituted alkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- aryl means, unless otherwise stated, a polyunsaturated, aromatic, hydrocarbon substituent, which can be a single ring or multiple rings (preferably from 1 to 3 rings) that are fused together (i.e., a fused ring aryl) or linked covalently.
- a fused ring aryl refers to multiple rings fused together wherein at least one of the fused rings is an aryl ring.
- heteroaryl refers to aryl groups (or rings) that contain at least one heteroatom such as N, O, or S, wherein the nitrogen and sulfur atoms are optionally oxidized, and the nitrogen atom(s) are optionally quaternized.
- heteroaryl includes fused ring heteroaryl groups (i.e., multiple rings fused together wherein at least one of the fused rings is a heteroaromatic ring).
- a 5,6-fused ring heteroarylene refers to two rings fused together, wherein one ring has 5 members and the other ring has 6 members, and wherein at least one ring is a heteroaryl ring.
- a 6,6-fused ring heteroarylene refers to two rings fused together, wherein one ring has 6 members and the other ring has 6 members, and wherein at least one ring is a heteroaryl ring.
- a 6,5-fused ring heteroarylene refers to two rings fused together, wherein one ring has 6 members and the other ring has 5 members, and wherein at least one ring is a heteroaryl ring.
- a heteroaryl group can be attached to the remainder of the molecule through a carbon or heteroatom.
- Non-limiting examples of aryl and heteroaryl groups include phenyl, naphthyl, pyrrolyl, pyrazolyl, pyridazinyl, triazinyl, pyrimidinyl, imidazolyl, pyrazinyl, purinyl, oxazolyl, isoxazolyl, thiazolyl, furyl, thienyl, pyridyl, pyrimidyl, benzothiazolyl, benzoxazolyl, benzimidazolyl, benzofuranyl, isobenzofuranyl, indolyl, isoindolyl, benzothiophenyl, isoquinolyl, quinoxalinyl, quinolyl, 1-naphthyl, 2-naphthyl, 4-biphenyl, 1-pyrrolyl, 2-pyrrolyl, 3-pyrrolyl, 3-pyrazolyl, 2-imidazolyl, 4-imid
- Substituents for each of the above noted aryl and heteroaryl ring systems are selected from the group of acceptable substituents described below.
- a heteroaryl group substituent may be -O- bonded to a ring heteroatom nitrogen.
- a fused ring heterocycloalkyl-aryl is an aryl fused to a heterocycloalkyl.
- a fused ring heterocycloalkyl-heteroaryl is a heteroaryl fused to a heterocycloalkyl.
- a fused ring heterocycloalkyl-cycloalkyl is a heterocycloalkyl fused to a cycloalkyl.
- a fused ring heterocycloalkyl-heterocycloalkyl is a heterocycloalkyl fused to another heterocycloalkyl.
- Fused ring heterocycloalkyl-aryl, fused ring heterocycloalkyl-heteroaryl, fused ring heterocycloalkyl-cycloalkyl, or fused ring heterocycloalkyl-heterocycloalkyl may each independently be unsubstituted or substituted with one or more of the substituents described herein.
- Spirocyclic rings are two or more rings wherein adjacent rings are attached through a single atom.
- the individual rings within spirocyclic rings may be identical or different.
- Individual rings in spirocyclic rings may be substituted or unsubstituted and may have different substituents from other individual rings within a set of spirocyclic rings.
- Possible substituents for individual rings within spirocyclic rings are the possible substituents for the same ring when not part of spirocyclic rings (e.g. substituents for cycloalkyl or heterocycloalkyl rings).
- Spirocyclic rings may be substituted or unsubstituted cycloalkyl, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkyl or substituted or unsubstituted heterocycloalkylene and individual rings within a spirocyclic ring group may be any of the immediately previous list, including having all rings of one type (e.g. all rings being substituted heterocycloalkylene wherein each ring may be the same or different substituted heterocycloalkylene).
- heterocyclic spirocyclic rings means spirocyclic rings wherein at least one ring is a heterocyclic ring and wherein each ring may be a different ring.
- substituted spirocyclic rings means that at least one ring is substituted and each substituent may optionally be different.
- alkylsulfonyl means a moiety having the formula -S(O 2 )-R', where R' is a substituted or unsubstituted alkyl group as defined above. R' may have a specified number of carbons (e.g., “C 1 -C 4 alkylsulfonyl”).
- alkylarylene is an arylene moiety covalently bonded to an alkylene moiety (also referred to herein as an alkylene linker). In aspects, the alkylarylene group has the formula: [structure not reproduced in the source text] (e.g., benzylene).
- An alkylarylene moiety may be substituted (e.g.
- alkylarylene moiety is unsubstituted.
- alkyl e.g., “alkyl,” “heteroalkyl,” “cycloalkyl,” “heterocycloalkyl,” “aryl,” and “heteroaryl”
- Preferred substituents for each type of radical are provided below.
- R, R', R'', R''', and R'''' each preferably independently refer to hydrogen, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl (e.g., aryl substituted with 1-3 halogens), substituted or unsubstituted heteroaryl, substituted or unsubstituted alkyl, alkoxy, or thioalkoxy groups, or arylalkyl groups.
- each of the R groups is independently selected, as are each R', R'', R''', and R'''' group when more than one of these groups is present.
- R' and R'' are attached to the same nitrogen atom, they can be combined with the nitrogen atom to form a 4-, 5-, 6-, or 7-membered ring.
- -NR'R'' includes, but is not limited to, 1-pyrrolidinyl and 4-morpholinyl.
- alkyl is meant to include groups including carbon atoms bound to groups other than hydrogen groups, such as haloalkyl (e.g., -CF 3 and -CH 2 CF 3 ) and acyl (e.g., -C(O)CH 3 , -C(O)CF 3 , -C(O)CH 2 OCH 3 , and the like).
- each of the R groups is independently selected, as are each R', R'', R''', and R'''' group when more than one of these groups is present.
- Substituents for rings e.g. cycloalkyl, heterocycloalkyl, aryl, heteroaryl, cycloalkylene, heterocycloalkylene, arylene, or heteroarylene
- substituents on the ring may be depicted as substituents on the ring rather than on a specific atom of a ring (commonly referred to as a floating substituent).
- the substituent may be attached to any of the ring atoms (obeying the rules of chemical valency) and in the case of fused rings or spirocyclic rings, a substituent depicted as associated with one member of the fused rings or spirocyclic rings (a floating substituent on a single ring), may be a substituent on any of the fused rings or spirocyclic rings (a floating substituent on multiple rings).
- the multiple substituents may be on the same atom, same ring, different atoms, different fused rings, different spirocyclic rings, and each substituent may optionally be different.
- a point of attachment of a ring to the remainder of a molecule is not limited to a single atom (a floating substituent)
- the attachment point may be any atom of the ring and in the case of a fused ring or spirocyclic ring, any atom of any of the fused rings or spirocyclic rings while obeying the rules of chemical valency.
- a ring, fused rings, or spirocyclic rings contain one or more ring heteroatoms and the ring, fused rings, or spirocyclic rings are shown with one or more floating substituents (including, but not limited to, points of attachment to the remainder of the molecule), the floating substituents may be bonded to the heteroatoms.
- the ring heteroatoms are shown bound to one or more hydrogens (e.g. a ring nitrogen with two bonds to ring atoms and a third bond to a hydrogen) in the structure or formula with the floating substituent, when the heteroatom is bonded to the floating substituent, the substituent will be understood to replace the hydrogen, while obeying the rules of chemical valency.
- Two or more substituents may optionally be joined to form aryl, heteroaryl, cycloalkyl, or heterocycloalkyl groups.
- Such so-called ring-forming substituents are typically, though not necessarily, found attached to a cyclic base structure.
- the ring-forming substituents are attached to adjacent members of the base structure.
- two ring-forming substituents attached to adjacent members of a cyclic base structure create a fused ring structure.
- the ring-forming substituents are attached to a single member of the base structure.
- two ring-forming substituents attached to a single member of a cyclic base structure create a spirocyclic structure.
- the ring-forming substituents are attached to non-adjacent members of the base structure.
- Two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally form a ring of the formula -T-C(O)-(CRR') q -U-, wherein T and U are independently -NR-, -O-, -CRR'-, or a single bond, and q is an integer of from 0 to 3.
- two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally be replaced with a substituent of the formula -A-(CH 2 ) r -B-, wherein A and B are independently -CRR'-, -O-, -NR-, -S-, -S(O)-, -S(O) 2 -, -S(O) 2 NR'-, or a single bond, and r is an integer of from 1 to 4.
- One of the single bonds of the new ring so formed may optionally be replaced with a double bond.
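As a worked illustration of the -A-(CH2)r-B- formula above, the size of the newly formed ring can be counted from its parts: the two adjacent aryl or heteroaryl ring atoms, the atoms contributed by A and B, and the r methylene carbons. The sketch below is an assumption-labeled helper, not part of the disclosure; the table and function names are hypothetical, and -S(O)2NR'- is counted as two ring atoms because both S and N lie in the linking chain.

```python
# Ring atoms contributed by each allowed choice of A or B in -A-(CH2)r-B-.
# (Hypothetical lookup table for illustration only.)
LINKER_ATOMS = {
    "-CRR'-": 1,
    "-O-": 1,
    "-NR-": 1,
    "-S-": 1,
    "-S(O)-": 1,
    "-S(O)2-": 1,
    "-S(O)2NR'-": 2,   # both S and N are members of the new ring
    "single bond": 0,
}


def fused_ring_size(a: str, r: int, b: str) -> int:
    """Count the members of the ring formed by -A-(CH2)r-B- on adjacent ring atoms."""
    if not 1 <= r <= 4:
        raise ValueError("r must be an integer from 1 to 4")
    # 2 accounts for the two adjacent atoms of the parent aryl/heteroaryl ring.
    return 2 + LINKER_ATOMS[a] + r + LINKER_ATOMS[b]


# A = B = -O-, r = 1 gives a five-membered methylenedioxy-type ring.
print(fused_ring_size("-O-", 1, "-O-"))
```

Under these counting assumptions the formula spans ring sizes from 3 (both A and B single bonds, r = 1) to 10 (both A and B -S(O)2NR'-, r = 4).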
- two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally be replaced with a substituent of the formula -(CRR') s -X'-(C''R''R''') d -, where s and d are independently integers of from 0 to 3, and X' is -O-, -NR'-, -S-, -S(O)-, -S(O) 2 -, or -S(O) 2 NR'-.
- R, R', R'', and R''' are preferably independently selected from hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, and substituted or unsubstituted heteroaryl.
- heteroatom or “ring heteroatom” are meant to include oxygen (O), nitrogen (N), sulfur (S), phosphorus (P), and silicon (Si).
- a “substituent group,” as used herein, means a group selected from the following moieties: (A) oxo, halogen, -CCl 3 , -CBr 3 , -CF 3 , -CI 3 , -CN, -OH, -NH 2 , -COOH, -CONH 2 , -NO 2 , -SH, -SO 3 H, -SO 4 H, -SO 2 NH 2 , -NHNH 2 , -ONH 2 , -NHC(O)NHNH 2 , -NHC(O)NH 2 , -NHSO 2 H, -NHC(O)H, -NHC(O)OH, -NHOH, -OCCl 3 , -OCF 3 , -OCBr 3 , -OCI 3 , -OCHCl 2 , -OCHBr 2 , -OCHI 2 , -OCHF 2 , unsubstituent
- a “size-limited substituent” or “size-limited substituent group,” as used herein, means a group selected from all of the substituents described above for a “substituent group,” wherein each substituted or unsubstituted alkyl is a substituted or unsubstituted C 1 -C 20 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 20 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C 3 -C 8 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 8 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C 6 -C 10 aryl, and each substituted or unsubstituted heteroary
- a “lower substituent” or “lower substituent group,” as used herein, means a group selected from all of the substituents described above for a “substituent group,” wherein each substituted or unsubstituted alkyl is a substituted or unsubstituted C 1 -C 8 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 8 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C 3 -C 7 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 7 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C 6 -C 10 aryl, and each substituted or unsubstituted heteroaryl is
- each substituted group described in the compounds herein is substituted with at least one substituent group. More specifically, in aspects, each substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene described in the compounds herein are substituted with at least one substituent group. In aspects, at least one or all of these groups are substituted with at least one size-limited substituent group. In aspects, at least one or all of these groups are substituted with at least one lower substituent group.
- each substituted or unsubstituted alkyl may be a substituted or unsubstituted C 1 -C 20 alkyl
- each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 20 membered heteroalkyl
- each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C 3 -C 8 cycloalkyl
- each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 8 membered heterocycloalkyl
- each substituted or unsubstituted aryl is a substituted or unsubstituted C 6 -C 10 aryl
- each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 10 membered heteroaryl.
- each substituted or unsubstituted alkylene is a substituted or unsubstituted C 1 -C 20 alkylene
- each substituted or unsubstituted heteroalkylene is a substituted or unsubstituted 2 to 20 membered heteroalkylene
- each substituted or unsubstituted cycloalkylene is a substituted or unsubstituted C 3 -C 8 cycloalkylene
- each substituted or unsubstituted heterocycloalkylene is a substituted or unsubstituted 3 to 8 membered heterocycloalkylene
- each substituted or unsubstituted arylene is a substituted or unsubstituted C 6 -C 10 arylene
- each substituted or unsubstituted heteroarylene is a substituted or unsubstituted 5 to 10 membered heteroarylene.
- each substituted or unsubstituted alkyl is a substituted or unsubstituted C 1 -C 8 alkyl
- each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 8 membered heteroalkyl
- each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C 3 -C 7 cycloalkyl
- each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 7 membered heterocycloalkyl
- each substituted or unsubstituted aryl is a substituted or unsubstituted C 6 -C 10 aryl
- each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 9 membered heteroaryl.
- each substituted or unsubstituted alkylene is a substituted or unsubstituted C 1 -C 8 alkylene
- each substituted or unsubstituted heteroalkylene is a substituted or unsubstituted 2 to 8 membered heteroalkylene
- each substituted or unsubstituted cycloalkylene is a substituted or unsubstituted C 3 -C 7 cycloalkylene
- each substituted or unsubstituted heterocycloalkylene is a substituted or unsubstituted 3 to 7 membered heterocycloalkylene
- each substituted or unsubstituted arylene is a substituted or unsubstituted C 6 -C 10 arylene
- each substituted or unsubstituted heteroarylene is a substituted or unsubstituted 5 to 9 membered heteroarylene.
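The two size tiers listed above (size-limited and lower) can be summarized as a lookup table. The following is a minimal sketch assuming a hypothetical `SIZE_LIMITS` table and `within_limit` helper, neither of which appears in the disclosure. For alkyl, cycloalkyl, and aryl the number is a carbon count; for the hetero groups it is a member count. The "-ene" forms (alkylene, heteroalkylene, and so on) use the same ranges as their parent groups in the lists above, so they are not repeated.

```python
# Inclusive (min, max) size ranges per tier, transcribed from the lists above.
SIZE_LIMITS = {
    "size-limited": {
        "alkyl": (1, 20),
        "heteroalkyl": (2, 20),
        "cycloalkyl": (3, 8),
        "heterocycloalkyl": (3, 8),
        "aryl": (6, 10),
        "heteroaryl": (5, 10),
    },
    "lower": {
        "alkyl": (1, 8),
        "heteroalkyl": (2, 8),
        "cycloalkyl": (3, 7),
        "heterocycloalkyl": (3, 7),
        "aryl": (6, 10),
        "heteroaryl": (5, 9),
    },
}


def within_limit(tier: str, group: str, size: int) -> bool:
    """Return True if a group of the given size fits the named tier."""
    low, high = SIZE_LIMITS[tier][group]
    return low <= size <= high


# A C12 alkyl fits the size-limited tier but not the lower tier.
print(within_limit("size-limited", "alkyl", 12), within_limit("lower", "alkyl", 12))
```

Every range in the lower tier sits inside the corresponding size-limited range, so any group that qualifies as a lower substituent also qualifies as size-limited.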
- a substituted or unsubstituted moiety e.g., substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, and/or substituted or unsubstituted heteroarylene) is unsubstituted (e.g., is an unsubstituted alkyl, unsubstituted heteroalkyl, unsubstituted cycloalkyl, unsubstituted
- a substituted or unsubstituted moiety e.g., substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, and/or substituted or unsubstituted heteroarylene) is substituted (e.g., is a substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene
- a substituted moiety e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene
- is substituted with at least one substituent group wherein if the substituted moiety is substituted with a plurality of substituent groups, each substituent group may optionally be different. In aspects, if the substituted moiety is substituted with a plurality of substituent groups, each substituent group is different.
- a substituted moiety e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene
- is substituted with at least one size-limited substituent group wherein if the substituted moiety is substituted with a plurality of size-limited substituent groups, each size-limited substituent group may optionally be different.
- each size-limited substituent group is different.
- a substituted moiety is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted moiety is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size- limited substituent group, and/or lower substituent group may optionally be different.
- the substituted moiety is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group is different.
- Certain compounds of the present disclosure possess asymmetric carbon atoms (optical or chiral centers) or double bonds; the enantiomers, racemates, diastereomers, tautomers, geometric isomers, stereoisomeric forms that may be defined, in terms of absolute stereochemistry, as (R)- or (S)- or, as (D)- or (L)- for amino acids, and individual isomers are encompassed within the scope of the present disclosure.
- the compounds of the present disclosure do not include those that are known in the art to be too unstable to synthesize and/or isolate.
- the present disclosure is meant to include compounds in racemic and optically pure forms.
- Optically active (R)- and (S)-, or (D)- and (L)-isomers may be prepared using chiral synthons or chiral reagents, or resolved using conventional techniques.
- the compounds described herein contain olefinic bonds or other centers of geometric asymmetry, and unless specified otherwise, it is intended that the compounds include both E and Z geometric isomers.
- the term “isomers” refers to compounds having the same number and kind of atoms, and hence the same molecular weight, but differing in respect to the structural arrangement or configuration of the atoms.
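The "same number and kind of atoms" condition in the definition of isomers can be illustrated by comparing atom counts. This is a rough sketch under stated assumptions: the helper names are hypothetical, and the parser handles only simple formulas without parentheses, charges, or isotope labels. Matching atom counts is a necessary condition only; the definition also requires that the structural arrangement or configuration differ, which this check does not examine.

```python
import re
from collections import Counter


def atom_counts(formula: str) -> Counter:
    """Count atoms in a simple formula such as 'C2H6O' or 'CH3OCH3'."""
    counts = Counter()
    # Each match is an element symbol followed by an optional count.
    for symbol, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(n) if n else 1
    return counts


def same_atoms(formula_a: str, formula_b: str) -> bool:
    """True if both formulas contain the same number and kind of atoms."""
    return atom_counts(formula_a) == atom_counts(formula_b)


# Ethanol and dimethyl ether share the composition C2H6O.
print(same_atoms("CH3CH2OH", "CH3OCH3"))
```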
- the term “tautomer,” as used herein, refers to one of two or more structural isomers which exist in equilibrium and which are readily converted from one isomeric form to another. It will be apparent to one skilled in the art that certain compounds of this disclosure may exist in tautomeric forms, all such tautomeric forms of the compounds being within the scope of the disclosure.
- structures depicted herein are also meant to include all stereochemical forms of the structure; i.e., the R and S configurations for each asymmetric center. Therefore, single stereochemical isomers as well as enantiomeric and diastereomeric mixtures of the present compounds are within the scope of the disclosure.
- structures depicted herein are also meant to include compounds which differ only in the presence of one or more isotopically enriched atoms. For example, compounds having the present structures except for the replacement of a hydrogen by a deuterium or tritium, or the replacement of a carbon by 13 C- or 14 C-enriched carbon are within the scope of this disclosure.
- the compounds of the present disclosure may also contain unnatural proportions of atomic isotopes at one or more of the atoms that constitute such compounds.
- the compounds may be radiolabeled with radioactive isotopes, such as for example tritium ( 3 H), iodine-125 ( 125 I), or carbon-14 ( 14 C). All isotopic variations of the compounds of the present disclosure, whether radioactive or not, are encompassed within the scope of the present disclosure.
- each member of the Markush group should be considered separately, thereby comprising another embodiment, and the Markush group is not to be read as a single unit.
- “Analog,” or “analogue” is used in accordance with its plain ordinary meaning within Chemistry and Biology and refers to a chemical compound that is structurally similar to another compound (i.e., a so-called “reference” compound) but differs in composition, e.g., in the replacement of one atom by an atom of a different element, or in the presence of a particular functional group, or the replacement of one functional group by another functional group, or the absolute stereochemistry of one or more chiral centers of the reference compound.
- an analog is a compound that is similar or comparable in function and appearance but not in structure or origin to a reference compound.
- “a” or “an,” as used herein, means one or more.
- substituted with a[n] means the specified group may be substituted with one or more of any or all of the named substituents.
- a group such as an alkyl or heteroaryl group
- the group may contain one or more unsubstituted C 1 -C 20 alkyls, and/or one or more unsubstituted 2 to 20 membered heteroalkyls.
- R substituent the group may be referred to as “R-substituted.”
- R-substituted the moiety is substituted with at least one R substituent and each R substituent is optionally different.
- a “detectable agent” or “detectable moiety” is a composition detectable by appropriate means such as spectroscopic, photochemical, biochemical, immunochemical, chemical, magnetic resonance imaging, or other physical means.
- useful detectable agents include 18 F, 32 P, 33 P, 45 Ti, 47 Sc, 52 Fe, 59 Fe, 62 Cu, 64 Cu, 67 Cu, 67 Ga, 68 Ga, 77 As, 86 Y, 90 Y, 89 Sr, 89 Zr, 94 Tc, 94 Tc, 99m Tc, 99 Mo, 105 Pd, 105 Rh, 111 Ag, 111 In, 123 I, 124 I, 125 I, 131 I, 142 Pr, 143 Pr, 149 Pm, 153 Sm, 154-158 Gd, 161 Tb, 166 Dy, 166 Ho, 169 Er, 175 Lu, 177 Lu, 186 Re, 188 Re, 189 Re, 194 Ir, 198 Au, 199 Au, 211 At, 211 Pb, 212 Bi, 212 Pb, 213 Bi, 223 Ra, 225 Ac, Cr, V, Mn, Fe, Co, Ni, Cu, La, Ce, Pr, Nd, Pm,
- fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, paramagnetic molecules, paramagnetic nanoparticles, ultrasmall superparamagnetic iron oxide (“USPIO”) nanoparticles, USPIO nanoparticle aggregates, superparamagnetic iron oxide (“SPIO”) nanoparticles, SPIO nanoparticle aggregates, monocrystalline iron oxide nanoparticles, monocrystalline iron oxide, nanoparticle contrast agents, liposomes or other delivery vehicles containing Gadolinium chelate (“Gd-chelate”) molecules, Gadolinium, radioisotopes, radionuclides (e.g.
- microbubbles (e.g., including microbubble shells including albumin, galactose, lipid, and/or polymers; microbubble gas core including air, heavy gas(es), perfluorocarbon, nitrogen, octafluoropropane, perflexane lipid microsphere, perflutren, etc.)
- iodinated contrast agents e.g.
- a detectable moiety is a monovalent detectable agent or a detectable agent capable of forming a bond with another composition.
- Radioactive substances e.g., radioisotopes
- Radioactive substances include, but are not limited to, 18 F, 32 P, 33 P, 45 Ti, 47 Sc, 52 Fe, 59 Fe, 62 Cu, 64 Cu, 67 Cu, 67 Ga, 68 Ga, 77 As, 86 Y, 90 Y.
- Paramagnetic ions that may be used as additional imaging agents in accordance with the embodiments of the disclosure include, but are not limited to, ions of transition and lanthanide metals (e.g., metals having atomic numbers of 21-29, 42, 43, 44, or 57-71).
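The atomic-number ranges given above for paramagnetic ions (21-29, 42, 43, 44, and 57-71) can be written out as a membership test. This is a small sketch; the constant and function names are hypothetical, and the element symbols are standard periodic-table assignments added here for readability rather than taken from the text.

```python
# Atomic numbers named in the text: 21-29, 42, 43, 44, and 57-71 (inclusive).
PARAMAGNETIC_Z = set(range(21, 30)) | {42, 43, 44} | set(range(57, 72))

# Standard element symbols for those atomic numbers (added for illustration).
SYMBOLS = {
    21: "Sc", 22: "Ti", 23: "V", 24: "Cr", 25: "Mn", 26: "Fe", 27: "Co",
    28: "Ni", 29: "Cu", 42: "Mo", 43: "Tc", 44: "Ru", 57: "La", 58: "Ce",
    59: "Pr", 60: "Nd", 61: "Pm", 62: "Sm", 63: "Eu", 64: "Gd", 65: "Tb",
    66: "Dy", 67: "Ho", 68: "Er", 69: "Tm", 70: "Yb", 71: "Lu",
}


def in_listed_range(atomic_number: int) -> bool:
    """True if the atomic number falls in the ranges given in the text."""
    return atomic_number in PARAMAGNETIC_Z


# Gadolinium (Z = 64) is in range; zinc (Z = 30) is not.
print(in_listed_range(64), in_listed_range(30))
```

Membership in these ranges identifies the metal only; whether a given ion is actually paramagnetic depends on its oxidation state and electron configuration, which the sketch does not model.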
- Descriptions of compounds of the present disclosure are limited by principles of chemical bonding known to those skilled in the art. Accordingly, where a group may be substituted by one or more of a number of substituents, such substitutions are selected so as to comply with principles of chemical bonding and to give compounds which are not inherently unstable and/or would be known to one of ordinary skill in the art as likely to be unstable under ambient conditions, such as aqueous, neutral, and several known physiological conditions.
- a heterocycloalkyl or heteroaryl is attached to the remainder of the molecule via a ring heteroatom in compliance with principles of chemical bonding known to those skilled in the art thereby avoiding inherently unstable compounds.
- for a variable (e.g., moiety or linker) of a compound or of a compound genus (e.g., a genus described herein), the unfilled valence(s) of the variable will be dictated by the context in which the variable is used.
- when a variable of a compound as described herein is connected (e.g., bonded) to the remainder of the compound through a single bond, that variable is understood to represent a monovalent form (i.e., capable of forming a single bond due to an unfilled valence) of a standalone compound (e.g., if the variable is named “methane” in an embodiment but the variable is known to be attached by a single bond to the remainder of the compound, a person of ordinary skill in the art would understand that the variable is actually a monovalent form of methane, i.e., methyl or -CH3).
- variable is the divalent form of a standalone compound (e.g., if the variable is assigned to “PEG” or “polyethylene glycol” in an embodiment but the variable is connected by two separate bonds to the remainder of the compound, a person of ordinary skill in the art would understand that the variable is a divalent (i.e., capable of forming two bonds through two unfilled valences) form of PEG instead of the standalone compound PEG).
- Nucleic acid refers to nucleotides (e.g., deoxyribonucleotides or ribonucleotides) and polymers thereof in either single-, double- or multiple-stranded form, or complements thereof.
- the terms “polynucleotide,” “oligonucleotide,” “oligo,” or the like refer, in the usual and customary sense, to a linear sequence of nucleotides.
- nucleotide refers, in the usual and customary sense, to a single unit of a polynucleotide, i.e., a monomer.
- Nucleotides can be ribonucleotides, deoxyribonucleotides, or modified versions thereof.
- Examples of polynucleotides contemplated herein include single and double stranded DNA, single and double stranded RNA, and hybrid molecules having mixtures of single and double stranded DNA and RNA.
- Examples of nucleic acid, e.g. polynucleotides contemplated herein include any types of RNA, e.g. mRNA, siRNA, miRNA, and guide RNA and any types of DNA, genomic DNA, plasmid DNA, and minicircle DNA, and any fragments thereof.
- nucleic acids can be linear or branched.
- nucleic acids can be a linear chain of nucleotides or the nucleic acids can be branched, e.g., such that the nucleic acids comprise one or more arms or branches of nucleotides.
- the branched nucleic acids are repetitively branched to form higher ordered structures such as dendrimers and the like.
- Nucleic acids including e.g., nucleic acids with a phosphothioate backbone, can include one or more reactive moieties.
- the term reactive moiety includes any group capable of reacting with another molecule, e.g., a nucleic acid or polypeptide through covalent, non-covalent or other interactions.
- the nucleic acid can include an amino acid reactive moiety that reacts with an amino acid on a protein or polypeptide through a covalent, non-covalent or other interaction.
- the terms also encompass nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides.
- Examples of such analogs include, without limitation, phosphodiester derivatives including, e.g., phosphoramidate, phosphorodiamidate, phosphorothioate (also known as phosphothioate, having double bonded sulfur replacing oxygen in the phosphate), phosphorodithioate, phosphonocarboxylic acids, phosphonocarboxylates, phosphonoacetic acid, phosphonoformic acid, methyl phosphonate, boron phosphonate, or O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press) as well as modifications to the nucleotide bases such as in 5-methyl cytidine or pseudouridine and peptide nucleic acid backbones and linkages.
- nucleic acids include those with positive backbones; non-ionic backbones, modified sugars, and non-ribose backbones (e.g., phosphorodiamidate morpholino oligos or locked nucleic acids (LNA) as known in the art), including those described in U.S. Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, Carbohydrate Modifications in Antisense Research, Sanghvi & Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acids.
- LNA locked nucleic acids
- Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip.
- Mixtures of naturally occurring nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.
- the internucleotide linkages in DNA are phosphodiester, phosphodiester derivatives, or a combination of both.
- Nucleic acids can include nonspecific sequences.
- nonspecific sequence refers to a nucleic acid sequence that contains a series of residues that are not designed to be complementary to or are only partially complementary to any other nucleic acid sequence. By way of example, a nonspecific nucleic acid sequence is a sequence of nucleic acid residues that does not function as an inhibitory nucleic acid when contacted with a cell or organism.
- a polynucleotide is typically composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); and thymine (T) (uracil (U) for thymine (T) when the polynucleotide is RNA).
- polynucleotide sequence is the alphabetical representation of a polynucleotide molecule; alternatively, the term may be applied to the polynucleotide molecule itself.
- This alphabetical representation can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching.
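The alphabetical representation described above lends itself to a short illustration. This is a sketch under stated assumptions: the helper names `is_dna_letters` and `dna_letters_to_rna_letters` are hypothetical, and only the four standard bases are accepted (non-standard nucleotides and analogs mentioned elsewhere in this section are not handled).

```python
DNA_ALPHABET = frozenset("ACGT")


def is_dna_letters(sequence: str) -> bool:
    """Return True if the string is non-empty and uses only A, C, G, T."""
    return bool(sequence) and set(sequence.upper()) <= DNA_ALPHABET


def dna_letters_to_rna_letters(sequence: str) -> str:
    """Rewrite a DNA letter string as RNA letters by substituting U for T."""
    if not is_dna_letters(sequence):
        raise ValueError("expected a non-empty string over A, C, G, T")
    return sequence.upper().replace("T", "U")


if __name__ == "__main__":
    print(dna_letters_to_rna_letters("gattaca"))
```

The substitution only changes the letters used to write the sequence; it does not model transcription or strand orientation.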
- Polynucleotides may optionally include one or more non-standard nucleotide(s), nucleotide analog(s) and/or modified nucleotides.
- complement refers to a nucleotide (e.g., RNA or DNA) or a sequence of nucleotides capable of base pairing with a complementary nucleotide or sequence of nucleotides.
- a complement may include a sequence of nucleotides that base pair with corresponding complementary nucleotides of a second nucleic acid sequence.
- the nucleotides of a complement may partially or completely match the nucleotides of the second nucleic acid sequence. Where the nucleotides of the complement completely match each nucleotide of the second nucleic acid sequence, the complement forms base pairs with each nucleotide of the second nucleic acid sequence. Where the nucleotides of the complement partially match the nucleotides of the second nucleic acid sequence only some of the nucleotides of the complement form base pairs with nucleotides of the second nucleic acid sequence.
- Examples of complementary sequences include coding and non-coding sequences, wherein the non-coding sequence contains complementary nucleotides to the coding sequence and thus forms the complement of the coding sequence.
- a further example of complementary sequences are sense and antisense sequences, wherein the sense sequence contains complementary nucleotides to the antisense sequence and thus forms the complement of the antisense sequence.
- the complementarity of sequences may be partial, in which only some of the nucleic acids match according to base pairing, or complete, where all the nucleic acids match according to base pairing.
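Complete versus partial complementarity, as described above, can be sketched for DNA letter strings. The function names are hypothetical, and the sketch assumes Watson-Crick pairing only (A with T, G with C), equal-length strands, and antiparallel alignment with no gaps; none of these restrictions come from the text.

```python
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(WATSON_CRICK[base] for base in reversed(sequence.upper()))


def paired_fraction(strand: str, other: str) -> float:
    """Fraction of positions at which two antiparallel strands base pair.

    Both strands are written 5' to 3'. A value of 1.0 corresponds to complete
    complementarity; anything between 0 and 1 is partial complementarity.
    """
    if not strand or len(strand) != len(other):
        raise ValueError("strands must be non-empty and of equal length")
    other_reversed = other.upper()[::-1]
    paired = sum(
        1 for a, b in zip(strand.upper(), other_reversed) if WATSON_CRICK.get(a) == b
    )
    return paired / len(strand)


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(paired_fraction("ATGC", "GCAT"))  # complete complementarity
    print(paired_fraction("ATGC", "GCAA"))  # partial complementarity
```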
- two sequences that are complementary to each other may have a specified percentage of nucleotides that are the same (i.e., about 60% identity, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region).
- amino acid refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids.
- Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.
- Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.
- Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.
- amino acid side chain refers to the functional substituent contained on amino acids.
- an amino acid side chain may be the side chain of a naturally occurring amino acid.
- Naturally occurring amino acids are those encoded by the genetic code (e.g., alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, or valine), as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.
- the amino acid side chain may be a non-natural amino acid side chain.
- the amino acid side chain is H,
- non-natural amino acid side chain or “unnatural amino acid side chain” or “Uaa” refers to the functional substituent of compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium, allylalanine, 2-aminoisobutyric acid.
- Non-natural amino acids are non-proteinogenic amino acids that either occur naturally or are chemically synthesized.
- Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.
- Non-limiting examples include exo-cis-3-aminobicyclo[2.2.1]hept-5-ene-2-carboxylic acid hydrochloride, cis-2-aminocycloheptane-carboxylic acid hydrochloride, cis-6-Amino-3-cyclohexene-1-carboxylic acid hydrochloride, cis-2-amino-2-methylcyclohexanecarboxylic acid hydrochloride, cis-2-amino-2-methylcyclopentane-carboxylic acid hydrochloride, 2-(Boc-aminomethyl)benzoic acid, 2-(Boc-amino)octanedioic acid, Boc-4,5-dehydro-Leu-OH (dicyclohexylammonium), Boc-4-(
- the unnatural amino acid is fluorosulfonyloxybenzoyl-L-lysine (FSK) having the following formula:
- FSK fluorosulfonyloxybenzoyl-L-lysine
- Conservatively modified variants applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, “conservatively modified variants” refers to those nucleic acids that encode identical or essentially identical amino acid sequences. Because of the degeneracy of the genetic code, a number of nucleic acid sequences will encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine.
- nucleic acid variations are "silent variations," which are one species of conservatively modified variations. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible silent variation of the nucleic acid.
- AUG which is ordinarily the only codon for methionine
- TGG which is ordinarily the only codon for tryptophan
- each silent variation of a nucleic acid which encodes a polypeptide is implicit in each described sequence.
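The codon degeneracy behind silent variations can be illustrated with the standard genetic code. This is a sketch, not part of the disclosure: the names `CODON_TABLE`, `synonymous_codons`, and `is_silent_variation` are hypothetical, the table is the standard nuclear code written in RNA letters, and DNA-style codons such as TGG are accepted by rewriting T as U.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def normalize_codon(codon: str) -> str:
    """Upper-case a codon and rewrite DNA letters (T) as RNA letters (U)."""
    return codon.upper().replace("T", "U")


def synonymous_codons(amino_acid: str) -> list[str]:
    """Return every codon that encodes the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def is_silent_variation(codon_a: str, codon_b: str) -> bool:
    """True if the two codons differ but encode the same amino acid."""
    a, b = normalize_codon(codon_a), normalize_codon(codon_b)
    return a != b and CODON_TABLE[a] == CODON_TABLE[b]


if __name__ == "__main__":
    print(sorted(synonymous_codons("A")))  # the four alanine codons
    print(synonymous_codons("M"), synonymous_codons("W"))
    print(is_silent_variation("GCA", "GCU"))
```

Methionine and tryptophan each map to a single codon in this table, which is why no silent variation exists at those positions.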
- as to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a "conservatively modified variant" where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the disclosure.
- the following eight groups each contain amino acids that are conservative substitutions for one another: (1) Alanine (A), Glycine (G); (2) Aspartic acid (D), Glutamic acid (E); (3) Asparagine (N), Glutamine (Q); (4) Arginine (R), Lysine (K); (5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); (6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); (7) Serine (S), Threonine (T); and (8) Cysteine (C), Methionine (M). (see, e.g., Creighton, Proteins (1984)).
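The eight groups listed above can be encoded directly as sets of one-letter codes. A minimal sketch, assuming a substitution counts as conservative when both residues share at least one group; the name `is_conservative_substitution` is hypothetical. Methionine appears in groups (5) and (8), so it is conservative with respect to members of either group.

```python
# The eight groups from the text, as one-letter amino acid codes.
CONSERVATIVE_GROUPS = (
    frozenset("AG"),    # (1) Alanine, Glycine
    frozenset("DE"),    # (2) Aspartic acid, Glutamic acid
    frozenset("NQ"),    # (3) Asparagine, Glutamine
    frozenset("RK"),    # (4) Arginine, Lysine
    frozenset("ILMV"),  # (5) Isoleucine, Leucine, Methionine, Valine
    frozenset("FYW"),   # (6) Phenylalanine, Tyrosine, Tryptophan
    frozenset("ST"),    # (7) Serine, Threonine
    frozenset("CM"),    # (8) Cysteine, Methionine
)


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if two different residues share at least one of the listed groups."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative_substitution("D", "E"))  # same group (2)
    print(is_conservative_substitution("K", "E"))  # different groups
```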
- polypeptide refers to a polymer of amino acid residues, wherein the polymer may in embodiments be conjugated to a moiety that does not consist of amino acids.
- the terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers.
- a “fusion protein” refers to a chimeric protein encoding two or more separate protein sequences that are recombinantly expressed as a single moiety.
- amino acid or nucleotide base "position" is denoted by a number that sequentially identifies each amino acid (or nucleotide base) in the reference sequence based on its position relative to the N-terminus (or 5'-end). Due to deletions, insertions, truncations, fusions, and the like that must be taken into account when determining an optimal alignment, in general the amino acid residue number in a test sequence determined by simply counting from the N-terminus will not necessarily be the same as the number of its corresponding position in the reference sequence. For example, in a case where a variant has a deletion relative to an aligned reference sequence, there will be no amino acid in the variant that corresponds to a position in the reference sequence at the site of deletion.
- an amino acid residue in a protein "corresponds" to a given residue when it occupies the same essential structural position within the protein as the given residue.
- a selected residue in a selected protein corresponds to Tyr126 of the PylRS protein of SEQ ID NO:1 when the selected residue occupies the same essential spatial or other structural relationship as Tyr126 in the PylRS protein of SEQ ID NO:1.
- the position in the aligned selected protein aligning with Tyr126 is said to correspond to Tyr126.
- a three dimensional structural alignment can also be used, e.g., where the structure of the selected protein is aligned for maximum correspondence with the PylRS protein and the overall structures compared. In this case, an amino acid that occupies the same essential position as Tyr126 in the structural model is said to correspond to the Tyr126 residue.
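For the sequence-alignment case, the idea that residue numbers in a test sequence need not match the reference numbering can be sketched as follows. The function name `map_reference_positions` and the toy sequences are hypothetical; the sketch assumes the two sequences have already been aligned (by any method) into equal-length strings with "-" marking gaps, and it does not cover three dimensional structural alignment.

```python
from typing import Optional


def map_reference_positions(
    aligned_reference: str, aligned_test: str, gap: str = "-"
) -> dict[int, Optional[int]]:
    """Map each 1-based reference position to the corresponding test position.

    A reference position maps to None when the test sequence has a deletion
    (a gap) at that column of the alignment.
    """
    if len(aligned_reference) != len(aligned_test):
        raise ValueError("aligned sequences must have the same length")
    mapping: dict[int, Optional[int]] = {}
    ref_pos = 0
    test_pos = 0
    for ref_char, test_char in zip(aligned_reference, aligned_test):
        if ref_char != gap:
            ref_pos += 1
        if test_char != gap:
            test_pos += 1
        if ref_char != gap:
            mapping[ref_pos] = test_pos if test_char != gap else None
    return mapping


if __name__ == "__main__":
    # Toy example: the test sequence lacks the residue at reference position 3.
    print(map_reference_positions("MKTAYL", "MK-AYL"))
```

In the toy example, reference position 4 corresponds to test position 3, which is the renumbering effect the text describes for deletions.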
- "Percentage of sequence identity" is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
- the percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
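The calculation described above (matched positions divided by total positions in the comparison window, times 100) can be sketched for two pre-aligned strings. The function name `percent_identity` is hypothetical, and the sketch assumes the whole aligned string is the comparison window, with "-" marking a gap; gap columns count toward the window length but never as matches.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of sequence identity over an aligned comparison window."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be non-empty and of equal length")
    # Count columns where both sequences carry the same base or residue.
    matched = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    # Divide by the total number of positions in the window and scale to percent.
    return 100.0 * matched / len(aligned_a)


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))      # 3 of 4 positions match
    print(percent_identity("ACGT-A", "ACGTTA"))  # 5 of 6 positions match
```

Alignment tools may define the denominator differently (for example, excluding gap columns), so values from this sketch are not guaranteed to match any particular program's reported identity.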
- the terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using a BLAST or BLAST 2.0 sequence comparison algorithm with default parameters described below, or by manual alignment and visual inspection (e.g., ncbi.nlm.nih.gov/BLAST/ or the like).
- Such sequences are then said to be “substantially identical.”
- This definition also refers to, or may be applied to, the complement of a test sequence.
- the definition also includes sequences that have deletions and/or additions, as well as those that have substitutions.
- the preferred algorithms can account for gaps and the like.
- identity exists over a region that is at least about 25 amino acids or nucleotides in length, or more preferably over a region that is 50-100 amino acids or nucleotides in length.
- antibody is used according to its commonly known meaning in the art. Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized fragments produced by digestion with various peptidases.
- pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond.
- the F(ab)'2 may be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab)'2 dimer into an Fab' monomer.
- the Fab' monomer is essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed. 1993)).
- While various antibody fragments are defined in terms of the digestion of an intact antibody, one of skill will appreciate that such fragments may be synthesized de novo either chemically or by using recombinant DNA methodology.
- antibody also includes antibody fragments either produced by the modification of whole antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those identified using phage display libraries (e.g., McCafferty et al., Nature 348:552-554 (1990)).
- Each light chain and heavy chain in turn consists of two regions: a variable (“V”) region involved in binding the target antigen, and a constant (“C”) region that interacts with other components of the immune system.
- the light and heavy chain variable regions come together in 3-dimensional space to form a variable region that binds the antigen (for example, a receptor on the surface of a cell).
- Within each light or heavy chain variable region, there are three short segments (averaging 10 amino acids in length) called the complementarity determining regions (“CDRs”).
- CDRs complementarity determining regions
- An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one “light” (about 25 kD) and one “heavy” chain (about 50-70 kD). The N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition.
- variable light chain (VL) and variable heavy chain (VH) refer to these light and heavy chains respectively.
- the Fc i.e. fragment crystallizable region
- the Fc region is the “base” or “tail” of an immunoglobulin and is typically composed of two heavy chains that contribute two or three constant domains depending on the class of the antibody. By binding to specific proteins the Fc region ensures that each antibody generates an appropriate immune response for a given antigen.
- the Fc region also binds to various cell receptors, such as Fc receptors, and other immune molecules, such as complement proteins.
- An “antibody variant” as provided herein refers to a polypeptide capable of binding to a receptor protein or an antigen and including one or more structural domains of an antibody or fragment thereof.
- Non-limiting examples of antibody variants include single-domain antibodies (nanobodies), affibodies (polypeptides smaller than monoclonal antibodies (e.g., about 6 kDa) and capable of binding receptor proteins or antigens with high affinity and imitating monoclonal antibodies), an antigen-binding fragment (Fab), Fab dimer (monospecific Fab2, bispecific Fab2), trispecific Fab3, monovalent IgGs, single-chain variable fragments (scFv), bispecific diabodies, trispecific triabodies, scFv-Fc, minibodies, IgNAR, V-NAR, hcIgG, VhH, or peptibodies.
- Fab antigen-binding fragment
- Fab dimer monospecific Fab2, bispecific Fab2
- scFv single-chain variable fragments
- a “peptibody” as provided herein refers to a peptide moiety attached (through a covalent or non-covalent linker) to the Fc domain of an antibody.
- Further non-limiting examples of antibody variants known in the art include antibodies produced by cartilaginous fish or camelids. A general description of antibodies from camelids and the variable regions thereof and methods for their production, isolation, and use may be found in references WO 97/49805 and WO 97/49805, which are incorporated by reference herein in their entirety and for all purposes. Likewise, antibodies from cartilaginous fish and the variable regions thereof and methods for their production, isolation, and use may be found in WO2005/118629, which is incorporated by reference herein in its entirety and for all purposes.
- a “single-domain antibody” or “nanobody” refers to an antibody fragment having a single monomeric variable antibody domain. Like a whole antibody, it is able to bind selectively to a specific antigen.
- the single domain antibody is a human or humanized single-domain antibody.
- a single-chain variable fragment (scFv) is typically a fusion protein of the variable regions of the heavy (VH) and light chains (VL) of immunoglobulins, connected with a short linker peptide of 10 to about 25 amino acids.
- the linker may usually be rich in glycine for flexibility, as well as serine or threonine for solubility.
- the linker can either connect the N-terminus of the VH with the C-terminus of the VL, or vice versa.
- the term "antigen” as provided herein refers to molecules capable of binding to the antibody binding domain provided herein.
- An "antigen binding domain” as provided herein is a region of an antibody that binds to an antigen (epitope).
- the antigen binding domain may include one constant and one variable domain of each of the heavy and the light chain (VL, VH, CL and CH1, respectively).
- the antigen binding domain includes a light chain variable domain and a heavy chain variable domain.
- the antigen binding domain includes light chain variable domain and does not include a heavy chain variable domain and/or a heavy chain constant domain.
- the paratope or antigen-binding site is formed on the N-terminus of the antigen binding domain.
- the two variable domains of an antigen binding domain may bind the epitope of an antigen.
- Antibodies exist, for example, as intact immunoglobulins or as a number of well-characterized fragments produced by digestion with various peptidases. Thus, for example, pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)’2, a dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond.
- the F(ab)’2 may be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab)’2 dimer into an Fab’ monomer.
- the Fab’ monomer is essentially the antigen binding portion with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed. 1993)). While various antibody fragments are defined in terms of the digestion of an intact antibody, one of skill will appreciate that such fragments may be synthesized de novo either chemically or by using recombinant DNA methodology.
- antibody also includes antibody fragments either produced by the modification of whole antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those identified using phage display libraries (see, e.g., McCafferty et al., Nature 348:552-554 (1990)).
- the epitope of an antibody is the region of its antigen to which the antibody binds. Two antibodies bind to the same or overlapping epitope if each competitively inhibits (blocks) binding of the other to the antigen.
- a 1x, 5x, 10x, 20x or 100x excess of one antibody inhibits binding of the other by at least 30% but preferably 50%, 75%, 90% or even 99% as measured in a competitive binding assay (see, e.g., Junghans et al., Cancer Res. 50:1495, 1990).
- two antibodies have the same epitope if essentially all amino acid mutations in the antigen that reduce or eliminate binding of one antibody reduce or eliminate binding of the other.
- Two antibodies have overlapping epitopes if some amino acid mutations that reduce or eliminate binding of one antibody reduce or eliminate binding of the other.
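The competition criterion above (at least 30% inhibition of binding) can be sketched numerically. This is an illustration under assumptions, not the assay protocol from the cited reference: the function names are hypothetical, and percent inhibition is assumed to be computed as the relative drop in binding signal when the competing antibody is present in excess.

```python
def percent_inhibition(signal_without_competitor: float,
                       signal_with_competitor: float) -> float:
    """Relative drop in binding signal, in percent (assumed definition)."""
    if signal_without_competitor <= 0:
        raise ValueError("uninhibited signal must be positive")
    return 100.0 * (1.0 - signal_with_competitor / signal_without_competitor)


def competes(signal_without_competitor: float,
             signal_with_competitor: float,
             threshold_percent: float = 30.0) -> bool:
    """True if inhibition meets the threshold (30% by default, per the text)."""
    inhibition = percent_inhibition(signal_without_competitor, signal_with_competitor)
    return inhibition >= threshold_percent


if __name__ == "__main__":
    # Hypothetical readings: binding signal drops from 1.0 to 0.25 with competitor.
    print(percent_inhibition(1.0, 0.25))
    print(competes(1.0, 0.25))
```

Raising `threshold_percent` to 50, 75, 90, or 99 applies the stricter preferred levels named in the text.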
- Antibodies (e.g., recombinant, monoclonal, or polyclonal antibodies) can be prepared by many techniques known in the art (see, e.g., Kohler & Milstein, Nature 256:495-497 (1975); Kozbor et al., Immunology Today 4: 72 (1983); Cole et al., pp. 77-96 in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc. (1985); Coligan, Current Protocols in Immunology (1991); Harlow & Lane, Antibodies, A Laboratory Manual (1988); and Goding, Monoclonal Antibodies: Principles and Practice (2d ed. 1986)).
- the genes encoding the heavy and light chains of an antibody of interest can be cloned from a cell, e.g., the genes encoding a monoclonal antibody can be cloned from a hybridoma and used to produce a recombinant monoclonal antibody.
- Gene libraries encoding heavy and light chains of monoclonal antibodies can also be made from hybridoma or plasma cells. Random combinations of the heavy and light chain gene products generate a large pool of antibodies with different antigenic specificity (see, e.g., Kuby, Immunology (3rd ed. 1997)). Techniques for the production of single chain antibodies or recombinant antibodies (U.S. Patent 4,946,778, U.S. Patent No. 4,816,567) can be adapted to produce antibodies to polypeptides.
- transgenic mice, or other organisms such as other mammals may be used to express humanized or human antibodies (see, e.g., U.S. Patent Nos.5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016, Marks et al., Bio/Technology 10:779-783 (1992); Lonberg et al., Nature 368:856-859 (1994); Morrison, Nature 368:812-13 (1994); Fishwild et al., Nature Biotechnology 14:845-51 (1996); Neuberger, Nature Biotechnology 14:826 (1996); and Lonberg & Huszar, Intern.
- phage display technology can be used to identify antibodies and heteromeric Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty et al., Nature 348:552-554 (1990); Marks et al., Biotechnology 10:779-783 (1992)).
- Antibodies can also be made bispecific, i.e., able to recognize two different antigens (see, e.g., WO 93/08829, Traunecker et al., EMBO J.10:3655-3659 (1991); and Suresh et al., Methods in Enzymology 121:210 (1986)).
- Antibodies can also be heteroconjugates, e.g., two covalently joined antibodies, or immunotoxins (see, e.g., U.S. Patent No. 4,676,980, WO 91/00360; WO 92/200373; and EP 03089).
- Humanized antibodies are further described in, e.g., Winter and Milstein (1991) Nature 349:293.
- a humanized antibody has one or more amino acid residues introduced into it from a source which is non-human.
- humanized antibodies are chimeric antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact human variable domain has been substituted by the corresponding sequence from a non-human species.
- humanized antibodies are typically human antibodies in which some CDR residues and possibly some FR residues are substituted by residues from analogous sites in rodent antibodies.
- polynucleotides comprising a first sequence coding for humanized immunoglobulin framework regions and a second sequence set coding for the desired immunoglobulin complementarity determining regions can be produced synthetically or by combining appropriate cDNA and genomic DNA segments.
- Human constant region DNA sequences can be isolated in accordance with well known procedures from a variety of human cells.
- a "chimeric antibody” is an antibody molecule in which (a) the constant region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site (variable region) is linked to a constant region of a different or altered class, effector function and/or species, or an entirely different molecule which confers new properties to the chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable region, or a portion thereof, is altered, replaced or exchanged with a variable region having a different or altered antigen specificity.
- the antibodies described herein include humanized and/or chimeric monoclonal antibodies.
- the phrase “specifically (or selectively) binds” to an antibody or an antigen or “specifically (or selectively) immunoreactive with” when referring to a protein or peptide refers to a binding reaction that is determinative of the presence of the protein, often in a heterogeneous population of proteins and other biologics.
- the specified antibodies bind to a particular protein at least two times the background and more typically more than 10 to 100 times background.
- Specific binding to an antibody under such conditions requires an antibody that is selected for its specificity for a particular protein.
- polyclonal antibodies can be selected to obtain only a subset of antibodies that are specifically immunoreactive with the selected antigen and not with other proteins.
- This selection may be achieved by subtracting out antibodies that cross-react with other molecules.
- a variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein.
- solid-phase ELISA immunoassays are routinely used to select antibodies specifically immunoreactive with a protein (see, e.g., Harlow & Lane, Using Antibodies, A Laboratory Manual (1998) for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity).
- “Receptor protein” or “membrane receptor” refers to a receptor (protein) that is embedded in the plasma membrane of a cell.
- the receptor protein is located in the extracellular domain of a cell, the transmembrane domain of a cell, or the intracellular domain of a cell.
- the receptor protein is a cell-surface receptor.
- the receptor protein is in the extracellular domain.
- the receptor protein is in the transmembrane domain.
- the receptor protein is an ion channel-linked receptor, an enzyme-linked receptor, or a G protein-coupled receptor.
- the receptor protein is a hormone receptor.
- biomolecule refers to a protein.
- biomolecule refers to a nucleic acid.
- biomolecule refers to a carbohydrate.
- the protein is a single-domain antibody.
- the protein is a membrane receptor.
- biomolecule moiety refers to a peptidyl moiety, a carbohydrate moiety, a lipid moiety, or a nucleic acid moiety that forms a biomolecule.
- peptidyl moiety refers to a protein, protein fragment, or peptide that may form part of a biomolecule or a biomolecule conjugate.
- the peptidyl moiety forms part of a biomolecule (e.g., protein). In aspects, the peptidyl moiety forms part of a biomolecule (e.g., protein) conjugate. The peptidyl moiety may also be substituted with additional chemical moieties (e.g., additional R substituents). In aspects, the peptidyl moiety forms part of a single-domain antibody. In aspects, the peptidyl moiety forms part of a membrane receptor.
- amino acid moiety refers to a monovalent amino acid, such that the amino acid can be linked to another compound or moiety, such as the compound of Formula (B) described herein.
- carbohydrate moiety refers to carbohydrates, for example, polyhydroxy aldehydes, ketones, alcohols, acids, their simple derivatives and their polymers having linkages of the acetal type, that may form part of a biomolecule or a biomolecule conjugate.
- carbohydrate moiety forms part of a biomolecule.
- carbohydrate moiety forms part of a biomolecule conjugate.
- the carbohydrate moiety may also be substituted with additional chemical moieties (e.g., additional R substituents).
- nucleic acid moiety refers to nucleic acids, for example, DNA, and RNA, that may form part of a biomolecule or biomolecule conjugate. In aspects, the nucleic acid moiety forms part of a biomolecule. In aspects, the nucleic acid moiety forms part of a biomolecule conjugate. The nucleic acid moiety may also be substituted with additional chemical moieties (e.g., additional R substituents).
- a “small molecule” is a low molecular weight organic compound, having a molecular weight of 10,000 Daltons or less, of natural or synthetic nature.
- a “small molecule moiety” refers to a small molecule that may form part of a biomolecule or that may contain one or more FSK amino acid side chains represented by Formula (F). In embodiments, a small molecule moiety is a monovalent small molecule.
- pyrrolysyl-tRNA synthetase refers to an enzyme (including homologs, isoforms, and functional fragments thereof) with pyrrolysyl-tRNA synthetase activity.
- Pyrrolysyl-tRNA synthetase is an aminoacyl-tRNA synthetase (aaRS) that catalyzes the reaction necessary to attach the α-amino acid pyrrolysine to the cognate tRNA (tRNA Pyl), thereby allowing incorporation of pyrrolysine during proteinogenesis at amber stop codons (e.g., TAG).
- the term includes any recombinant or naturally-occurring form of pyrrolysyl-tRNA synthetase or variants, homologs, or isoforms thereof that maintain pyrrolysyl-tRNA synthetase activity (e.g. within at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 100% activity compared to wild-type pyrrolysyl-tRNA synthetase).
- the variants, homologs, or isoforms have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g., a 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring pyrrolysyl-tRNA synthetase.
- the pyrrolysyl-tRNA synthetase comprises the sequence set forth by SEQ ID NO:1.
- the pyrrolysyl-tRNA synthetase is the sequence set forth by SEQ ID NO:1.
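The sequence-identity criterion above (identity across the whole sequence or across a contiguous portion such as 50, 100, 150 or 200 amino acids) can be sketched as follows. This is a minimal illustration that assumes the two sequences are already aligned, gap-free, and of equal length; real comparisons would normally use an alignment tool first. The function names and the toy sequences in the usage note are hypothetical and are not taken from SEQ ID NO:1.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical residues in two pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(a: str, b: str, window: int) -> float:
    """Highest percent identity over any contiguous window of the given length."""
    if len(a) != len(b):
        raise ValueError("sequences must be of equal length")
    if window <= 0 or window > len(a):
        raise ValueError("window must be between 1 and the sequence length")
    return max(
        percent_identity(a[i:i + window], b[i:i + window])
        for i in range(len(a) - window + 1)
    )
```

With the toy inputs `"MKTAYIAKQR"` and `"MKTAYIAKQL"`, whole-sequence identity is 90%, while a five-residue window reaches 100%, which shows why the text distinguishes identity over the whole sequence from identity over a portion.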
- mutant pyrrolysyl-tRNA synthetase or “mutant PylRS” or “variant pyrrolysyl-tRNA synthetase” or “variant PylRS” refers to any pyrrolysyl-tRNA synthetase that has a different amino acid sequence from a wild-type amino acid sequence.
- the variant PylRS refers to any pyrrolysyl-tRNA synthetase that has a different amino acid sequence from a wild-type amino acid sequence of Methanomethylophilus alvus pyrrolysyl-tRNA synthetase set forth as SEQ ID NO:1.
- mutant pyrrolysyl-tRNA synthetase refers to any pyrrolysyl-tRNA synthetase that catalyzes the attachment of fluorosulfonyloxybenzoyl-L-lysine (FSK) to a tRNA pyl .
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having mutations at one or more residues selected from the group consisting of tyrosine at position 126, methionine at position 129, valine at position 168, histidine at position 227, tyrosine at position 228, and leucine at position 229.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having the following five mutations: (i) Y126G; (ii) M129A; (iii) V168F; (iv) H227T, H227S, or H227I; and (v) Y228P.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having the following six mutations: (i) Y126G; (ii) M129A; (iii) V168F; (iv) H227T, H227S, or H227I; (v) Y228P; and (vi) L229V or L229I.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having the following six mutations: Y126G; M129A; V168F; H227T; Y228P; and L229I.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having the following six mutations: Y126G; M129A; V168F; H227S; Y228P; and L229V. In aspects, the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having the following five mutations: Y126G; M129A; V168F; H227I; and Y228P. In aspects, the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having the following six mutations: Y126G; M129A; V168F; H227S; Y228P; and L229I.
- the mutant pyrrolysyl-tRNA synthetase includes the sequence set forth by SEQ ID NO:2. In aspects, the mutant pyrrolysyl-tRNA synthetase is the sequence set forth by SEQ ID NO:2. In aspects, the mutant pyrrolysyl-tRNA synthetase is encoded by the sequence set forth by SEQ ID NO:2. In aspects, the mutant pyrrolysyl-tRNA synthetase further comprises six histidine residues at the N-terminus and/or the C-terminus. In aspects, the mutant pyrrolysyl-tRNA synthetase further comprises six histidine residues at the N-terminus.
- the mutant pyrrolysyl-tRNA synthetase further comprises six histidine residues at the C-terminus.
- the mutant pyrrolysyl-tRNA synthetase includes the sequence set forth by SEQ ID NO:86.
- the mutant pyrrolysyl-tRNA synthetase is the sequence set forth by SEQ ID NO:86.
- the mutant pyrrolysyl-tRNA synthetase is encoded by the sequence set forth by SEQ ID NO:86.
- the mutant pyrrolysyl-tRNA synthetase includes the sequence set forth by SEQ ID NO:87.
- the mutant pyrrolysyl-tRNA synthetase is the sequence set forth by SEQ ID NO:87. In aspects, the mutant pyrrolysyl-tRNA synthetase is encoded by the sequence set forth by SEQ ID NO:87. In aspects, “mutant pyrrolysyl-tRNA synthetase” is referred to as “pyrrolysyl-tRNA synthetase,” and the skilled artisan will readily recognize whether the pyrrolysyl-tRNA synthetase is mutant based on a comparison to the wild-type SEQ ID NO:1.
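The point-mutation labels used above (e.g., Y126G, meaning the wild-type residue Y at position 126 is replaced by G) follow the common one-letter convention of wild-type residue, 1-based position, new residue. The sketch below shows one way such labels could be applied to a sequence string while checking that the stated wild-type residue is actually present. The function name is hypothetical, and the short sequence in the usage note is a toy example, not SEQ ID NO:1.

```python
import re
from typing import Iterable

# One-letter wild-type residue, 1-based position, one-letter replacement residue.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_point_mutations(sequence: str, mutations: Iterable[str]) -> str:
    """Apply labels such as 'Y126G' to a protein sequence (positions are 1-based)."""
    residues = list(sequence)
    for label in mutations:
        match = _MUTATION.match(label)
        if match is None:
            raise ValueError(f"unrecognized mutation label: {label!r}")
        wild_type, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3)
        )
        if not 1 <= position <= len(sequence):
            raise ValueError(f"{label}: position outside the sequence")
        # Check against the original sequence so a wrong label is caught early.
        if sequence[position - 1] != wild_type:
            raise ValueError(
                f"{label}: expected {wild_type}, found {sequence[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)
```

On the toy sequence `"MYKHL"`, the labels `["Y2G", "H4T"]` yield `"MGKTL"`, whereas `"K2G"` raises an error because position 2 is not K. The same check would flag a label whose wild-type letter does not match the reference sequence.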
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 1 to 10 histidine residues at the C-terminus and/or the N-terminus (e.g., after the M residue); and having mutations at one or more residues selected from the group consisting of tyrosine at position 126, methionine at position 129, valine at position 168, histidine at position 227, tyrosine at position 228, and leucine at position 229.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 1 to 10 histidine residues at the C-terminus and/or the N-terminus (e.g., after the M residue); and having the following five mutations: (i) Y126G; (ii) M129A; (iii) V168F; (iv) H227T, H227S, or H227I; and (v) Y228P.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 1 to 10 histidine residues at the C-terminus and/or the N-terminus (e.g., after the M residue); and having the following six mutations: (i) Y126G; (ii) M129A; (iii) V168F; (iv) H227T, H227S, or H227I; (v) Y228P; and (vi) L229V or L229I.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 1 to 10 histidine residues at the C-terminus and/or the N-terminus (e.g., after the M residue); and having the following six mutations: Y126G; M129A; V168F; H227T; Y228P; and L229I.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 6 histidine residues at the C-terminus; and having the following six mutations: Y126G; M129A; V168F; H227T; Y228P; and L229I.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 6 histidine residues at the N-terminus (after the M residue); and having the following six mutations: Y126G; M129A; V168F; H227T; Y228P; and L229I.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 1 to 10 histidine residues at the C-terminus and/or the N-terminus (e.g., after the M residue); and having the following six mutations: Y126G; M129A; V168F; H227S; Y228P; and L229V.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 6 histidine residues at the C-terminus; and having the following six mutations: Y126G; M129A; V168F; H227S; Y228P; and L229V.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 6 histidine residues at the N-terminus (after the M residue); and having the following six mutations: Y126G; M129A; V168F; H227S; Y228P; and L229V.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 1 to 10 histidine residues at the C-terminus and/or the N-terminus (e.g., after the M residue); and having the following five mutations: Y126G; M129A; V168F; H227I; and Y228P.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 6 histidine residues at the C-terminus; and having the following five mutations: Y126G; M129A; V168F; H227I; and Y228P.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 6 histidine residues at the N-terminus (after the M residue); and having the following five mutations: Y126G; M129A; V168F; H227I; and Y228P.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 1 to 10 histidine residues at the C-terminus and/or the N-terminus (e.g., after the M residue); and having the following six mutations: Y126G; M129A; V168F; H227S; Y228P; and L229I.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 6 histidine residues at the C-terminus; and having the following six mutations: Y126G; M129A; V168F; H227S; Y228P; and L229I.
- the mutant pyrrolysyl-tRNA synthetase includes SEQ ID NO:1 having 6 histidine residues at the N-terminus (after the M residue); and having the following six mutations: Y126G; M129A; V168F; H227S; Y228P; and L229I.
- the mutant pyrrolysyl-tRNA synthetase includes the sequence set forth by SEQ ID NO:86. In aspects, the mutant pyrrolysyl-tRNA synthetase is the sequence set forth by SEQ ID NO:86. In aspects, the mutant pyrrolysyl-tRNA synthetase is encoded by the sequence set forth by SEQ ID NO:86. In aspects, the mutant pyrrolysyl-tRNA synthetase includes the sequence set forth by SEQ ID NO:87. In aspects, the mutant pyrrolysyl-tRNA synthetase is the sequence set forth by SEQ ID NO:87.
- the mutant pyrrolysyl-tRNA synthetase is encoded by the sequence set forth by SEQ ID NO:87.
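The histidine-tag variants above (1 to 10 histidine residues at the C-terminus, or at the N-terminus after the initial M residue) can be sketched as a simple string operation. The function name and its defaults are hypothetical. One assumption worth stating: because an N-terminal tag shifts every downstream index in the tagged string, this sketch treats mutation positions such as 126 as referring to the untagged SEQ ID NO:1 numbering, so any point mutations would be applied before tagging.

```python
def add_his_tag(sequence: str, count: int = 6, terminus: str = "C") -> str:
    """Return the sequence with `count` histidines added at the chosen terminus.

    For terminus="N" the tag is placed directly after the initial methionine.
    """
    if not 1 <= count <= 10:
        raise ValueError("count must be between 1 and 10")
    tag = "H" * count
    if terminus == "C":
        return sequence + tag
    if terminus == "N":
        if not sequence.startswith("M"):
            raise ValueError("N-terminal tagging expects an initial M residue")
        return "M" + tag + sequence[1:]
    raise ValueError("terminus must be 'N' or 'C'")
```

For a toy input `"MKT"`, an N-terminal six-histidine tag gives `"MHHHHHHKT"` and a C-terminal two-histidine tag gives `"MKTHH"`.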
- tRNA Pyl refers to a single-stranded RNA molecule containing about 50 to about 100 nucleotides that folds via intrastrand base pairing to form a characteristic cloverleaf structure, carries a specific amino acid (e.g., pyrrolysine, FSK), and matches it to its corresponding codon (i.e., a codon complementary to the anticodon of the tRNA) on an mRNA during protein synthesis.
- the abbreviation “Pyl” of tRNA Pyl stands for pyrrolysine.
- the anticodon comprises CUA, TTA, or TCA. In embodiments, the anticodon comprises CUA. In embodiments, the anticodon comprises TTA. In embodiments, the anticodon comprises TCA. In embodiments, the anticodon comprises at least one non-canonical base. Anticodon CUA is complementary to the amber stop codon.
- tRNA Pyl is attached to FSK. In aspects, tRNA Pyl refers to a single-stranded RNA molecule containing about 50 to about 100 nucleotides. In aspects, tRNA Pyl refers to a single-stranded RNA molecule containing about 60 to about 90 nucleotides.
- tRNA Pyl refers to a single-stranded RNA molecule containing about 65 to about 85 nucleotides. In aspects, tRNA Pyl refers to a single-stranded RNA molecule containing about 70 to about 90 nucleotides. In aspects, tRNA Pyl refers to a single-stranded RNA molecule containing about 60 to about 80 nucleotides.
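The statement that anticodon CUA is complementary to the amber stop codon can be checked with a short reverse-complement sketch. It assumes strict Watson-Crick pairing (no wobble), reads both anticodon and codon 5' to 3', and treats a T in the input as U, since the text writes some anticodons with T. The function name is hypothetical.

```python
# Watson-Crick partners for RNA bases.
_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_for_anticodon(anticodon: str) -> str:
    """Return the mRNA codon (5'->3') that pairs with a tRNA anticodon (5'->3')."""
    rna = anticodon.upper().replace("T", "U")
    if len(rna) != 3 or any(base not in _COMPLEMENT for base in rna):
        raise ValueError("anticodon must be three canonical bases")
    # Pairing is antiparallel, so reverse the anticodon before complementing.
    return "".join(_COMPLEMENT[base] for base in reversed(rna))
```

Under these assumptions, CUA pairs with UAG (the amber codon, written TAG in DNA), while TTA and TCA pair with UAA and UGA, the other two standard stop codons.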
- substrate-binding site refers to residues located in the enzyme active site that form temporary bonds or interactions with the substrate.
- the substrate-binding site of pyrrolysyl-tRNA synthetase refers to residues located in the active site of pyrrolysyl-tRNA synthetase that form temporary bonds or interactions with the amino acid substrate.
- the substrate-binding site of pyrrolysyl-tRNA synthetase includes one or more of the following residues: tyrosine at position 126, methionine at position 129, valine at position 168, histidine at position 227, tyrosine at position 228, and leucine at position 229 as set forth in the amino acid sequence of SEQ ID NO:1.
- plasmid refers to a nucleic acid molecule that encodes for genes and/or regulatory elements necessary for the expression of genes. Expression of a gene from a plasmid can occur in cis or in trans. If a gene is expressed in cis, the gene and the regulatory elements are encoded by the same plasmid. Expression in trans refers to the instance where the gene and the regulatory elements are encoded by separate plasmids.
- complex refers to a composition that includes two or more components, where the components bind together to make a functional unit.
- a complex described herein includes a mutant pyrrolysyl-tRNA synthetase described herein and an amino acid substrate (e.g., FSK).
- a complex described herein includes a mutant pyrrolysyl-tRNA synthetase described herein and a tRNA (e.g., tRNA Pyl ).
- a complex described herein includes a mutant pyrrolysyl-tRNA synthetase described herein, an amino acid substrate (e.g., FSK) and a tRNA (e.g., tRNA Pyl ).
- a complex described herein includes at least two components selected from the group consisting of a mutant pyrrolysyl-tRNA synthetase described herein, an amino acid substrate (e.g., FSK), a polypeptide containing FSK, and a tRNA (e.g., tRNA Pyl ).
- the terms “transfection”, “transduction”, “transfecting” or “transducing” can be used interchangeably and are defined as a process of introducing a nucleic acid molecule or a protein to a cell. Nucleic acids are introduced to a cell using non-viral or viral-based methods. The nucleic acid molecules may be gene sequences encoding complete proteins or functional portions thereof.
- Non-viral methods of transfection include any appropriate transfection method that does not use viral DNA or viral particles as a delivery system to introduce the nucleic acid molecule into the cell.
- Exemplary non-viral transfection methods include calcium phosphate transfection, liposomal transfection, nucleofection, sonoporation, transfection through heat shock, magnetofection and electroporation.
- the nucleic acid molecules are introduced into a cell using electroporation following standard procedures well known in the art.
- any useful viral vector may be used in the methods described herein. Examples for viral vectors include, but are not limited to retroviral, adenoviral, lentiviral and adeno-associated viral vectors.
- the nucleic acid molecules are introduced into a cell using a retroviral vector following standard procedures well known in the art.
- the terms “transfection” or “transduction” also refer to introducing proteins into a cell from the external environment. Typically, transduction or transfection of a protein relies on attachment of a peptide or protein capable of crossing the cell membrane to the protein of interest. See, e.g., Ford et al. (2001) Gene Therapy 8:1-4 and Prochiantz (2007) Nat. Methods 4:119-20.
- isolated when applied to a nucleic acid or protein, denotes that the nucleic acid or protein is essentially free of other cellular components with which it is associated in the natural state.
- “Contacting” is used in accordance with its plain ordinary meaning and refers to the process of allowing at least two distinct species (e.g. chemical compounds including biomolecules, biomolecule moieties, or cells) to become sufficiently proximal to react, interact or physically touch.
- the resulting reaction product can be produced directly from a reaction between the added reagents or from an intermediate from one or more of the added reagents that can be produced in the reaction mixture.
- the term “contacting” may include allowing two species to react, interact, or physically touch, wherein the two species may be biomolecules and/or biomolecule moieties as described herein.
- contacting includes allowing two biomolecule moieties as described herein to interact, wherein the biomolecule moieties covalently bond to form a conjugate.
- bioconjugate reactive moiety and “bioconjugate reactive group” refers to a moiety or group capable of forming a bioconjugate (e.g., covalent linker) as a result of the association between atoms or molecules of bioconjugate reactive groups.
- the association can be direct or indirect.
- a conjugate between a first bioconjugate reactive group e.g., –NH 2 , –COOH, –N-hydroxysuccinimide, or –maleimide
- a second bioconjugate reactive group e.g., sulfhydryl, sulfur-containing amino acid, amine, amine sidechain containing amino acid, or carboxylate
- covalent bond or linker e.g. a first linker or second linker
- indirect e.g., by non-covalent bond (e.g. electrostatic interactions (e.g. ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g.
- bioconjugates or bioconjugate linkers are formed using bioconjugate chemistry (i.e., the association of two bioconjugate reactive groups) including, but not limited to, nucleophilic substitutions (e.g., reactions of amines and alcohols with acyl halides, active esters), electrophilic substitutions (e.g., enamine reactions), and additions to carbon-carbon and carbon-heteroatom multiple bonds (e.g., Michael reaction, Diels-Alder addition).
- the first bioconjugate reactive group (e.g., maleimide moiety) is covalently attached to the second bioconjugate reactive group (e.g., a sulfhydryl).
- the first bioconjugate reactive group e.g., haloacetyl moiety
- the second bioconjugate reactive group e.g.
- the first bioconjugate reactive group (e.g., pyridyl moiety) is covalently attached to the second bioconjugate reactive group (e.g., a sulfhydryl).
- the first bioconjugate reactive group (e.g., –N-hydroxysuccinimide moiety) is covalently attached to the second bioconjugate reactive group (e.g., an amine).
- the first bioconjugate reactive group e.g., maleimide moiety
- the first bioconjugate reactive group (e.g., –sulfo–N-hydroxysuccinimide moiety) is covalently attached to the second bioconjugate reactive group (e.g. an amine).
- bioconjugate reactive moieties used for bioconjugate chemistries herein include, for example: (a) carboxyl groups and various derivatives thereof including, but not limited to, N-hydroxysuccinimide esters, N-hydroxybenztriazole esters, acid halides, acyl imidazoles, thioesters, p-nitrophenyl esters, alkyl, alkenyl, alkynyl and aromatic esters; (b) hydroxyl groups which can be converted to esters, ethers, aldehydes, etc.; (c) haloalkyl groups wherein the halide can be later displaced with a nucleophilic group such as, for example, an amine, a carboxylate anion, thi
- bioconjugate reactive groups can be chosen such that they do not participate in, or interfere with, the chemical stability of the conjugate described herein. Alternatively, a reactive functional group can be protected from participating in the crosslinking reaction by the presence of a protecting group.
- the bioconjugate comprises a molecular entity derived from the reaction of an unsaturated bond, such as a maleimide, and a sulfhydryl group.
- an in vitro translation system refers to a system that provides for the in vitro synthesis of proteins in cell-free extracts that may provide for the identification of gene products (e.g., proteomics), localization of mutations through synthesis of truncated gene products, protein folding studies, and incorporation of modified or unnatural amino acids into proteins.
- an in vitro translation system refers to a system that provides for the incorporation of modified or unnatural amino acids (e.g., FSK) into proteins.
- An exemplary in vitro translation system is PURExpress® In Vitro Protein Synthesis Kit by New England BioLabs, Inc.
- Exemplary components of an in vitro translation system include amino acids, wheat germ extract, cellular components for protein synthesis (e.g., tRNA, ribosomes, initiation factors, elongation factors, termination factors), salts (e.g., Mg2+, K+), and the like.
- the in vitro translation system is a rabbit reticulocyte system or a wheat germ extract system.
- the terms “fluorosulfate-L-tyrosine” and “FSY” refer to the unnatural amino acid having the following structure: [structure not reproduced]. FSY comprises the amino acid side chain of the formula: [structure not reproduced].
- fluorosulfonyloxybenzoyl-L-lysine and “FSK” refer to the unnatural amino acid having the structure of Formula (A). FSK comprises the amino acid side chain of Formula (F).
- FSK biomolecule refers to a biomolecule comprising the FSK unnatural amino acid and/or the amino acid side chain thereof.
- biomolecule conjugate or “FSK biomolecule conjugate” refers to any biomolecule comprising a bioconjugate linker (“FSK bioconjugate linker”) having the structure of Formula (D):
- FSK protein refers to a protein comprising the FSK unnatural amino acid and/or the amino acid side chain thereof.
- protein conjugate or “FSK protein conjugate” refers to any protein comprising a bioconjugate linker having the structure of Formula (D):
- Sulfur-fluoride exchange reaction or “SuFEx” refers to a type of click chemistry as described in detail by, e.g., Dong et al., Angewandte Chemie, 53(36):9430-9448 (2014); Wang et al., J. Am. Chem. Soc., 140(15):4995-4999 (2018); and as described in the examples herein.
- proximally-enabled SuFEx refers to the sulfur-fluoride exchange reaction occurring when the reactive species are proximal to each other, i.e., spatially close enough for the SuFEx reaction to occur.
- the proximity may occur within a single biomolecule (e.g., protein) or between two different biomolecules (e.g., proteins).
- the skilled artisan could readily determine whether the reactive species are sufficiently proximal for the reaction to occur (e.g., sulfur-fluoride exchange reaction between FSK and lysine, histidine, or tyrosine to form the bioconjugate, the moiety of Formula (A), (B), or (C), or the protein of Formula (I), (II), or (III)).
- intermolecular linker refers to a linking group between two different biomolecules.
- the peptidyl moiety of R 1 is a first protein and the peptidyl moiety of R 2 is a second protein, such that the first protein and the second protein are covalently bonded via the moiety of Formula (E), (I), (II), or (III).
- the first protein and the second protein are different proteins, e.g., providing an intermolecular linker between two different proteins, such as a single-domain antibody and a membrane receptor.
- intramolecular linker refers to a linking group within a single biomolecule.
- when the compound of Formula (E), (I), (II), or (III) has an intramolecular linker, the peptidyl moiety of R 1 and the peptidyl moiety of R 2 are in the same protein.
- the first protein and the second protein are the same protein, i.e., providing an intramolecular linker within a single protein.
- Biomolecules and Biomolecule Conjugates
- Provided herein are biomolecules and biomolecule conjugates formed through the interaction of latent bioreactive unnatural amino acids with naturally occurring amino acids.
- Fluorosulfonyloxybenzoyl-L-lysine (FSK or N6-(4-((fluorosulfonyl)oxy)benzoyl)-L-lysine), a latent bioreactive unnatural amino acid, facilitates formation of covalent bonds with proximal target amino acid residues (e.g., lysine, histidine, tyrosine) by undergoing a click chemistry reaction (e.g., sulfur-fluoride exchange reaction (SuFEx)).
- FSK may be inserted into or replace an amino acid in a naturally occurring protein, thereby endowing the protein with the ability to form a covalent bond with proximally positioned target amino acid residues (e.g., lysine, histidine, tyrosine) on the protein itself or with proteins it naturally interacts with.
- FSK may be used to facilitate the formation of covalent bonds between or within proteins in both in vitro and in vivo conditions, owing, at least in part, to its being non-toxic to cells.
- the latent bioreactive unnatural amino acid FSK is useful for covalently linking biomolecules (e.g., proteins, carbohydrates, nucleic acids) to form biomolecule conjugates.
- the latent bioreactive unnatural amino acid FSK is useful for covalently linking biomolecule moieties (e.g., peptidyl moieties) within a single biomolecule (e.g., protein).
- the latent bioreactive unnatural amino acid FSK is useful for covalently linking biomolecule moieties (e.g., peptidyl moieties) in different biomolecules (e.g., covalently linking two proteins).
- the latent bioreactive unnatural amino acid FSK is useful for covalently linking single domain antibodies to membrane receptors.
- FSK, as a latent bioreactive unnatural amino acid, has shown excellent chemical functionality (i.e., superior properties) compared to previously described bioreactive unnatural amino acids.
- FSK is stable, nontoxic and nonreactive inside cells, yet when placed in proximity to target residues it becomes reactive under cellular conditions.
- FSK is able to react with lysine, histidine, and tyrosine specifically with great selectivity via proximity-enabled SuFEx reaction within and between proteins under physiological conditions.
- biomolecules comprising one or more latent bioreactive unnatural amino acids.
- the biomolecule is a protein, a nucleic acid, or a carbohydrate.
- the biomolecule is a protein.
- FSK and the lysine, histidine, or tyrosine are in an α-strand of the protein.
- FSK and the lysine, histidine, or tyrosine are in a β-strand of the protein.
- the protein is a single-domain antibody.
- the protein is a membrane receptor.
- the latent bioreactive unnatural amino acid is fluorosulfonyloxybenzoyl-L-lysine (FSK) having the structure of Formula (A). In aspects, the biomolecule is a protein comprising the FSK unnatural amino acid.
- the protein comprises at least one FSK.
- the protein comprises one FSK.
- the protein comprises two or more FSK. In aspects, the protein comprises two FSK. In aspects, the protein comprises three FSK.
- the biomolecule is a protein comprising the FSK amino acid side chain represented by Formula (F). In aspects, the protein comprises FSK that is proximal to lysine, histidine, tyrosine, or a combination of two or more thereof. In aspects, the protein comprises FSK that is proximal to lysine. In aspects, the protein comprises FSK that is proximal to histidine. In aspects, the protein comprises FSK that is proximal to tyrosine. In aspects, the protein is an antibody or an antibody variant.
- the protein is an antibody, an antigen-binding fragment, a single-chain variable fragment, a single-domain antibody, or an affibody.
- Proximal means that FSK and lysine, histidine, or tyrosine are close enough to each other for a SuFEx reaction to successfully occur.
- proximal means that FSK is within 1 to 50 amino acids of a lysine, histidine, or tyrosine.
- proximal means that FSK is within 1 to 45 amino acids of a lysine, histidine, or tyrosine.
- proximal means that FSK is within 1 to 40 amino acids of a lysine, histidine, or tyrosine. In aspects “proximal” means that FSK is within 1 to 35 amino acids of a lysine, histidine, or tyrosine. In aspects “proximal” means that FSK is within 1 to 30 amino acids of a lysine, histidine, or tyrosine. In aspects “proximal” means that FSK is within 1 to 25 amino acids of a lysine, histidine, or tyrosine. In aspects “proximal” means that FSK is within 1 to 20 amino acids of a lysine, histidine, or tyrosine.
- proximal means that FSK is within 1 to 15 amino acids of a lysine, histidine, or tyrosine. In aspects “proximal” means that FSK is within 1 to 10 amino acids of a lysine, histidine, or tyrosine. In aspects “proximal” means that FSK is within 1 to 9 amino acids of a lysine, histidine, or tyrosine. In aspects “proximal” means that FSK is within 1 to 8 amino acids of a lysine, histidine, or tyrosine. In aspects “proximal” means that FSK is within 1 to 7 amino acids of a lysine, histidine, or tyrosine.
- proximal means that FSK is within 1 to 6 amino acids of a lysine, histidine, or tyrosine. In aspects “proximal” means that FSK is within 1 to 5 amino acids of a lysine, histidine, or tyrosine. In aspects “proximal” means that FSK is within 1 to 4 amino acids of a lysine, histidine, or tyrosine. In aspects “proximal” means that FSK is within 1 to 3 amino acids of a lysine, histidine, or tyrosine. In aspects “proximal” means that FSK is within 1 to 2 amino acids of a lysine, histidine, or tyrosine.
- proximal means that FSK is adjacent to a lysine, histidine, or tyrosine.
- bioconjugate linker has the structure of Formula (D). In aspects, the first biomolecule moiety and the second biomolecule moiety are each independently a peptidyl moiety.
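The sequence-distance readings of “proximal” above (FSK within 1 to N amino acids of a lysine, histidine, or tyrosine) can be sketched as a scan along a sequence string. This is only an illustration under stated assumptions: the FSK site is given as a 0-based index into the string, the placeholder letter `X` for FSK in the usage note is hypothetical, and distance along the sequence is a rough proxy, since the text defines proximity as being close enough for a SuFEx reaction to occur.

```python
from typing import List, Tuple


def proximal_targets(
    sequence: str, fsk_index: int, max_distance: int, targets: str = "KHY"
) -> List[Tuple[int, str, int]]:
    """List target residues within `max_distance` positions of the FSK site.

    Returns (0-based index, residue letter, distance in residues) tuples.
    """
    if not 0 <= fsk_index < len(sequence):
        raise ValueError("fsk_index is outside the sequence")
    if max_distance < 1:
        raise ValueError("max_distance must be at least 1")
    hits = []
    for index, residue in enumerate(sequence):
        distance = abs(index - fsk_index)
        if 1 <= distance <= max_distance and residue in targets:
            hits.append((index, residue, distance))
    return hits
```

For the toy sequence `"AKXGGHAAY"` with FSK at index 2, a window of 3 finds the lysine at index 1 and the histidine at index 5 but not the tyrosine at index 8; a window of 1 corresponds to the “adjacent” case and finds only the lysine.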
- the biomolecule conjugate is a protein conjugate.
- the biomolecule conjugate is a protein conjugate, wherein the bioconjugate linker is an intramolecular linker.
- the protein conjugate comprises a plurality of intramolecular linkers.
- the biomolecule conjugate is a protein conjugate, wherein the bioconjugate linker is an intermolecular linker.
- the protein conjugate comprises a plurality of intermolecular linkers.
- the protein conjugate comprises intramolecular linkers and intermolecular linkers.
- the biomolecule conjugate has the structure of Formula (E): the second bioconjugate moiety; L 1 is a bond or a first covalent linker; L 2 is a bond or a second covalent linker; and X 1 is –NR 5 -, -O-, -S-, or , wherein ring A is a substituted or unsubstituted heteroarylene or substituted or unsubstituted heterocycloalkylene, and wherein the nitrogen in A is attached to the bioconjugate linker.
- R 5 is hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- L 1 is a bond, -S(O) 2 -, -NR 3A -, -O-, -S-, -C(O)-, -C(O)NR 3A -, -NR 3A C(O)-, -NR 3A C(O)NR 3B -, -C(O)O-, -OC(O)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene.
- R 3A and R 3B are independently hydrogen, substituted or unsubstituted alkylyl, substituted or unsubstituted heteroalkylyl, substituted or unsubstituted cycloalkylyl, substituted or unsubstituted heterocycloalkylyl, substituted or unsubstituted arylyl, or substituted or unsubstituted heteroarylyl.
- L 2 is a bond, -S(O) 2 -, -NR 4A -, -O-, -S-, -C(O)-, -C(O)NR 4A -, -NR 4A C(O)-, -NR 4A C(O)NR 4B -, -C(O)O-, -OC(O)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, substituted or unsubstituted heteroarylene, or substituted or unsubstituted alkylarylene.
- L 2 is a bond, -S(O) 2 -, -NR 4A -, -O-, -S-, -C(O)-, -C(O)NR 4A -, -NR 4A C(O)-, -NR 4A C(O)NR 4B -, -C(O)O-, -OC(O)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, substituted or unsubstituted heteroarylene.
- R 4A and R 4B are independently hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- X 1 is –NR 5 -, -O-, -S-, or wherein ring A is a substituted or unsubstituted heteroarylene or substituted or unsubstituted heterocycloalkylene. In aspects, X 1 is –NR 5 -. In aspects X 1 is -O-.
- X 1 is -S-. In aspects, X 1 is , wherein ring A is a substituted or unsubstituted heteroarylene or substituted or unsubstituted heterocycloalkylene. In aspects, ring A is substituted or unsubstituted heteroarylene. In aspects, ring A is substituted or unsubstituted heterocycloalkylene. In aspects, ring A is unsubstituted heteroarylene. In aspects, ring A is unsubstituted heterocycloalkylene.
- ring A is substituted heterocycloalkylene (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered).
- ring A is unsubstituted heterocycloalkylene (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered).
- ring A is substituted or unsubstituted heteroarylene (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
- ring A is substituted heteroarylene (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered). In aspects, ring A is unsubstituted heteroarylene (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered). In embodiments, X 1 is a bond.
- R 5 is hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- R 5 is hydrogen.
- R 5 is hydrogen, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted alkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heteroalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted cycloalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heterocycloalkyl
- R 5 is hydrogen, substituted or unsubstituted (e.g., C 1 -C 20 , C 1 -C 10 , C 1 -C 5 ) alkyl, substituted or unsubstituted (e.g., 2 to 20 membered, 2 to 10 membered, 2 to 5 membered) heteroalkyl, substituted or unsubstituted (e.g., C 3 -C 8 , C 3 -C 6 , C 3 -C 5 ) cycloalkyl, substituted or unsubstituted (e.g., 3 to 8 membered, 3 to 6 membered, 3 to 5 membered) heterocycloalkyl, substituted or unsubstituted (e.g., C 6 -C 10 , C 6 -C 8 , C 6 -C 5 ) aryl or substituted or unsubstituted (e.g., 5 to 10 membered, 5 to 8 member
- R 5 is hydrogen, unsubstituted (e.g., C 1 -C 20 , C 1 -C 10 , C 1 -C 5 ) alkyl, unsubstituted (e.g., 2 to 20 membered, 2 to 10 membered, 2 to 5 membered) heteroalkyl, unsubstituted (e.g., C 3 -C 8 , C 3 -C 6 , C 3 -C 5 ) cycloalkyl, unsubstituted (e.g., 3 to 8 membered, 3 to 6 membered, 3 to 5 membered) heterocycloalkyl, unsubstituted (e.g., C 6 -C 10 , C 6 -C 8 , C 6 -C 5 ) aryl or unsubstituted (e.g., 5 to 10 membered, 5 to 8 membered, 5 to 6 membered) heteroaryl.
- L 1 is a bond, -S(O) 2 -, -NR 3A -, -O-, -S-, -C(O)-, -C(O)NR 3A -, -NR 3A C(O)-, -NR 3A C(O)NR 3B -, -C(O)O-, -OC(O)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene.
- L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene. In aspects, L 1 is a bond, unsubstituted alkylene, or unsubstituted heteroalkylene. In aspects, L 1 is unsubstituted alkylene. In aspects, L 1 is unsubstituted heteroalkylene. In aspects, L 1 is a bond.
- L 1 is –O-, -S-, R 32 -substituted or unsubstituted C 1 -C 2 alkylene (e.g., C 1 or C 2 ) or R 32 -substituted or unsubstituted 2 membered heteroalkylene.
- L 1 is R 32 - substituted or unsubstituted alkylene (e.g., C 1 -C 8 alkylene, C 1 -C 6 alkylene, or C 1 -C 4 alkylene), R 32 -substituted or unsubstituted heteroalkylene (e.g., 2 to 8 membered heteroalkylene, 2 to 6 membered heteroalkylene, or 2 to 4 membered heteroalkylene), R 32 -substituted or unsubstituted cycloalkylene (e.g., C 3 -C 8 cycloalkylene, C 3 -C 6 cycloalkylene, or C 5 -C 6 cycloalkylene), R 32 - substituted or unsubstituted heterocycloalkylene (e.g., 3 to 8 membered heterocycloalkylene, 3 to 6 membered heterocycloalkylene, or 5 to 6 membered heterocycloalkylene), R 32 - substitute
- L 1 is independently –O-, -S-, unsubstituted C 1 -C 2 alkylene (e.g., C 1 or C 2 ) or unsubstituted 2 membered heteroalkylene.
- L 1 is independently unsubstituted methylene.
- L 1 is independently unsubstituted ethylene.
- L 1 is substituted 2 membered heteroalkylene.
- L 1 is substituted 3 membered heteroalkylene.
- L 1 is substituted 4 membered heteroalkylene.
- L 1 is an unsubstituted 2 membered heteroalkylene.
- L 1 is an unsubstituted 3 membered heteroalkylene.
- L 1 is an unsubstituted 4 membered heteroalkylene.
- X 32 is independently –F, -Cl, -Br, or –I.
- In embodiments, R 32 is independently unsubstituted methyl. In aspects, R 32 is independently unsubstituted ethyl.
- X 33 is independently –F, -Cl, -Br, or –I.
- In embodiments, R 33 is independently unsubstituted methyl. In aspects, R 33 is independently unsubstituted ethyl.
- X 34 is independently –F, -Cl, -Br, or –I.
- R 34 is independently unsubstituted methyl.
- R 34 is independently unsubstituted ethyl.
- R 3A is hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- R 3A is hydrogen, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted alkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heteroalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted cycloalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heterocycloalky
- R 3A is hydrogen, substituted or unsubstituted (e.g., C 1 -C 20 , C 1 -C 10 , C 1 -C 5 ) alkyl, substituted or unsubstituted (e.g., 2 to 20 membered, 2 to 10 membered, 2 to 5 membered) heteroalkyl, substituted or unsubstituted (e.g., C 3 -C 8 , C 3 -C 6 , C 3 -C 5 ) cycloalkyl, substituted or unsubstituted (e.g., 3 to 8 membered, 3 to 6 membered, 3 to 5 membered) heterocycloalkyl, substituted or unsubstituted (e.g., C 6 -C 10 , C 6 -C 8 , C 6 -C 5 ) aryl or substituted or unsubstituted (e.g., 5 to 10 membered, 5 to 8
- R 3A is hydrogen, unsubstituted (e.g., C 1 -C 20 , C 1 -C 10 , C 1 -C 5 ) alkyl, unsubstituted (e.g., 2 to 20 membered, 2 to 10 membered, 2 to 5 membered) heteroalkyl, unsubstituted (e.g., C 3 -C 8 , C 3 -C 6 , C 3 -C 5 ) cycloalkyl, unsubstituted (e.g., 3 to 8 membered, 3 to 6 membered, 3 to 5 membered) heterocycloalkyl, unsubstituted (e.g., C 6 -C 10 , C 6 -C 8 , C 6 -C 5 ) aryl or unsubstituted (e.g., 5 to 10 membered, 5 to 8 membered, 5 to 6 membered) heteroaryl.
- R 3B is hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- R 3B is hydrogen, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted alkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heteroalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted cycloalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heterocycloalky
- R 3B is hydrogen, substituted or unsubstituted (e.g., C 1 -C 20 , C 1 -C 10 , C 1 - C 5 ) alkyl, substituted or unsubstituted (e.g., 2 to 20 membered, 2 to 10 membered, 2 to 5 membered) heteroalkyl, substituted or unsubstituted (e.g., C 3 -C 8 , C 3 -C 6 , C 3 -C 5 ) cycloalkyl, substituted or unsubstituted (e.g., 3 to 8 membered, 3 to 6 membered, 3 to 5 membered) heterocycloalkyl, substituted or unsubstituted (e.g., C 6 -C 10 , C 6 -C 8 , C 6 -C 5 ) aryl or substituted or unsubstituted (e.g., 5 to 10 membered, 5 to 8
- R 3B is hydrogen, unsubstituted (e.g., C 1 -C 20 , C 1 -C 10 , C 1 -C 5 ) alkyl, unsubstituted (e.g., 2 to 20 membered, 2 to 10 membered, 2 to 5 membered) heteroalkyl, unsubstituted (e.g., C 3 -C 8 , C 3 -C 6 , C 3 -C 5 ) cycloalkyl, unsubstituted (e.g., 3 to 8 membered, 3 to 6 membered, 3 to 5 membered) heterocycloalkyl, unsubstituted (e.g., C 6 -C 10 , C 6 -C 8 , C 6 -C 5 ) aryl or unsubstituted (e.g., 5 to 10 membered, 5 to 8 membered, 5 to 6 membered) heteroaryl.
- L 2 is a bond, -S(O) 2 -, -NR 4A -, -O-, -S-, -C(O)-, -C(O)NR 4A -, -NR 4A C(O)-, -NR 4A C(O)NR 4B -, -C(O)O-, -OC(O)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, substituted or unsubstituted heteroarylene, or substituted or unsubstituted alkylarylene.
- L 2 is a bond, -S(O) 2 -, -NR 4A -, -O-, -S-, -C(O)-, -C(O)NR 4A -, -NR 4A C(O)-, -NR 4A C(O)NR 4B -, -C(O)O-, -OC(O)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene.
- L 2 is a bond, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, or substituted or unsubstituted alkylarylene. In embodiments, L 2 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene. In aspects, L 2 is a bond, unsubstituted alkylene, or unsubstituted heteroalkylene. In aspects, L 2 is unsubstituted alkylene. In aspects, L 2 is unsubstituted heteroalkylene. In aspects, L 2 is a bond.
- L 2 is a bond, or substituted or unsubstituted alkylarylene. In aspects, L 2 is a bond or unsubstituted alkylarylene. In aspects, L 2 is unsubstituted alkylarylene. In aspects, L 2 is benzylene.
- In embodiments, L 2 is –O-, -S-, R 35 -substituted or unsubstituted C 1 -C 2 alkylene (e.g., C 1 or C 2 ) or R 35 -substituted or unsubstituted 2 membered heteroalkylene.
- L 2 is R 35 - substituted or unsubstituted alkylene (e.g., C 1 -C 8 alkylene, C 1 -C 6 alkylene, or C 1 -C 4 alkylene), R 35 -substituted or unsubstituted heteroalkylene (e.g., 2 to 8 membered heteroalkylene, 2 to 6 membered heteroalkylene, or 2 to 4 membered heteroalkylene), R 35 -substituted or unsubstituted cycloalkylene (e.g., C 3 -C 8 cycloalkylene, C 3 -C 6 cycloalkylene, or C 5 -C 6 cycloalkylene), R 35 - substituted or unsubstituted heterocycloalkylene (e.g., 3 to 8 membered heterocycloalkylene, 3 to 6 membered heterocycloalkylene, or 5 to 6 membered heterocycloalkylene), R 35 - substitute
- L 2 is –O-, -S-, unsubstituted C 1 -C 2 alkylene (e.g., C 1 or C 2 ) or unsubstituted 2 membered heteroalkylene.
- L 2 is unsubstituted methylene.
- L 2 is unsubstituted ethylene.
- L 2 is substituted 2 membered heteroalkylene.
- L 2 is substituted 3 membered heteroalkylene.
- L 2 is substituted 4 membered heteroalkylene.
- L 2 is an unsubstituted 2 membered heteroalkylene.
- L 2 is an unsubstituted 3 membered heteroalkylene.
- L 2 is an unsubstituted 4 membered heteroalkylene.
- X 35 is independently –F, -Cl, -Br, or –I.
- In embodiments, R 35 is independently unsubstituted methyl. In aspects, R 35 is independently unsubstituted ethyl.
- X 36 is independently –F, -Cl, -Br, or –I.
- In embodiments, R 36 is independently unsubstituted methyl. In aspects, R 36 is independently unsubstituted ethyl.
- X 37 is independently –F, -Cl, -Br, or –I.
- R 37 is independently unsubstituted methyl.
- R 37 is independently unsubstituted ethyl.
- R 4A is hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- R 4A is hydrogen, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted alkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heteroalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted cycloalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heterocycloalky
- R 4A is hydrogen, substituted or unsubstituted (e.g., C 1 -C 20 , C 1 -C 10 , C 1 -C 5 ) alkyl, substituted or unsubstituted (e.g., 2 to 20 membered, 2 to 10 membered, 2 to 5 membered) heteroalkyl, substituted or unsubstituted (e.g., C 3 -C 8 , C 3 -C 6 , C 3 -C 5 ) cycloalkyl, substituted or unsubstituted (e.g., 3 to 8 membered, 3 to 6 membered, 3 to 5 membered) heterocycloalkyl, substituted or unsubstituted (e.g., C 6 -C 10 , C 6 -C 8 , C 6 -C 5 ) aryl or substituted or unsubstituted (e.g., 5 to 10 membered, 5 to 8
- R 4A is hydrogen, unsubstituted (e.g., C 1 -C 20 , C 1 -C 10 , C 1 -C 5 ) alkyl, unsubstituted (e.g., 2 to 20 membered, 2 to 10 membered, 2 to 5 membered) heteroalkyl, unsubstituted (e.g., C 3 -C 8 , C 3 -C 6 , C 3 -C 5 ) cycloalkyl, unsubstituted (e.g., 3 to 8 membered, 3 to 6 membered, 3 to 5 membered) heterocycloalkyl, unsubstituted (e.g., C 6 -C 10 , C 6 -C 8 , C 6 -C 5 ) aryl or unsubstituted (e.g., 5 to 10 membered, 5 to 8 membered, 5 to 6 membered) heteroaryl.
- R 4B is hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- R 4B is hydrogen, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted alkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heteroalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted cycloalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with a substituent group(s), a size-limited substituent or a lower substituent group(s)) or unsubstituted heterocycloalky
- R 4B is hydrogen, substituted or unsubstituted (e.g., C 1 -C 20 , C 1 -C 10 , C 1 -C 5 ) alkyl, substituted or unsubstituted (e.g., 2 to 20 membered, 2 to 10 membered, 2 to 5 membered) heteroalkyl, substituted or unsubstituted (e.g., C 3 -C 8 , C 3 -C 6 , C 3 -C 5 ) cycloalkyl, substituted or unsubstituted (e.g., 3 to 8 membered, 3 to 6 membered, 3 to 5 membered) heterocycloalkyl, substituted or unsubstituted (e.g., C 6 -C 10 , C 6 -C 8 , C 6 -C 5 ) aryl or substituted or unsubstituted (e.g., 5 to 10 membered, 5 to 8
- R 4B is hydrogen, unsubstituted (e.g., C 1 -C 20 , C 1 -C 10 , C 1 -C 5 ) alkyl, unsubstituted (e.g., 2 to 20 membered, 2 to 10 membered, 2 to 5 membered) heteroalkyl, unsubstituted (e.g., C 3 -C 8 , C 3 -C 6 , C 3 -C 5 ) cycloalkyl, unsubstituted (e.g., 3 to 8 membered, 3 to 6 membered, 3 to 5 membered) heterocycloalkyl, unsubstituted (e.g., C 6 -C 10 , C 6 -C 8 , C 6 -C 5 ) aryl or unsubstituted (e.g., 5 to 10 membered, 5 to 8 membered, 5 to 6 membered) heteroaryl.
- X 1 is imidazolylene, -NH- or -O-. In aspects, X 1 is imidazolylene (i.e., a divalent imidazole). In aspects, X 1 is -NH-. In aspects, X 1 is -O-.
- In embodiments, the first biomolecule moiety is a peptidyl moiety. In aspects, the second biomolecule moiety is a peptidyl moiety. In aspects, the first biomolecule moiety is a peptidyl moiety and the second biomolecule moiety is a peptidyl moiety. In aspects, the peptidyl moieties in the first biomolecule moiety and the second biomolecule moiety are in the same protein.
- the peptidyl moieties in the first biomolecule moiety and the second biomolecule moiety are in different proteins.
- the different proteins are a single-domain antibody and a membrane receptor.
- the different proteins are an antibody and a membrane receptor.
- the different proteins are an antigen-binding fragment and a membrane receptor.
- the different proteins are an affibody and a membrane receptor.
- the different proteins are a single-chain variable fragment and a membrane receptor.
- the peptidyl moieties of –L 1 -R 1 and –L 2 -R 2 are in the same protein. In aspects, the peptidyl moieties of –L 1 -R 1 and –L 2 -R 2 are in different proteins. In aspects, L 1 is a bond. In aspects, L 2 is a bond. In aspects, L 1 and L 2 are a bond. In embodiments, the different proteins are a single-domain antibody and a membrane receptor. In embodiments, the different proteins are an antibody and a membrane receptor. In embodiments, the different proteins are a single-chain variable fragment and a membrane receptor. In embodiments, the different proteins are an affibody and a membrane receptor.
- the different proteins are an antigen-binding fragment and a membrane receptor.
- the first biomolecule moiety is a nucleic acid moiety or a carbohydrate moiety. In embodiments, the first biomolecule moiety is a nucleic acid moiety. In embodiments, the first biomolecule moiety is a carbohydrate moiety.
- the second biomolecule moiety is a nucleic acid moiety or a carbohydrate moiety. In embodiments, the second biomolecule moiety is a nucleic acid moiety. In embodiments, the second biomolecule moiety is a carbohydrate moiety.
- –L 1 -R 1 is a nucleic acid moiety or a carbohydrate moiety.
- –L 1 -R 1 is a nucleic acid moiety. In aspects, –L 1 -R 1 is a carbohydrate moiety. In aspects, –L 2 -R 2 is a nucleic acid moiety or a carbohydrate moiety. In aspects, –L 2 -R 2 is a nucleic acid moiety. In aspects, –L 2 -R 2 is a carbohydrate moiety. In aspects, L 1 is a bond. In aspects, L 2 is a bond. In aspects, L 1 and L 2 are a bond.
- the first biomolecule moiety is selected from the group consisting of a small molecule moiety, a peptidyl moiety, a nucleic acid moiety, and a carbohydrate moiety.
- the second biomolecule moiety is selected from the group consisting of a small molecule moiety, a peptidyl moiety, a nucleic acid moiety, and a carbohydrate moiety.
- the first biomolecule moiety is the same as the second biomolecule moiety.
- the first biomolecule moiety is different from the second biomolecule moiety.
- the first biomolecule moiety and the second biomolecule moiety are within the same biomolecule.
- the first biomolecule moiety and the second biomolecule moiety are in different biomolecules.
- the first biomolecule moiety is a small molecule moiety and the second biomolecule moiety is a peptidyl moiety.
- the first biomolecule moiety is a peptidyl moiety and the second biomolecule moiety is a small molecule moiety.
- the first biomolecule moiety is selected from the group consisting of a peptidyl moiety, a nucleic acid moiety, and a carbohydrate moiety.
- the second biomolecule moiety is selected from the group consisting of a peptidyl moiety, a nucleic acid moiety, and a carbohydrate moiety.
- the first biomolecule moiety is the same as the second biomolecule moiety.
- the first biomolecule moiety is different from the second biomolecule moiety. In aspects, the first biomolecule moiety and the second biomolecule moiety are within the same biomolecule. In aspects, the first biomolecule moiety and the second biomolecule moiety are in different biomolecules. In aspects, the first biomolecule moiety and the second biomolecule moiety are each independently a peptidyl moiety.
- In embodiments, –L 1 -R 1 is selected from the group consisting of a small molecule moiety, a peptidyl moiety, a nucleic acid moiety and a carbohydrate moiety.
- –L 2 -R 2 is selected from the group consisting of a small molecule moiety, a peptidyl moiety, a nucleic acid moiety and a carbohydrate moiety.
- –L 1 -R 1 is a small molecule moiety.
- –L 2 -R 2 is a small molecule moiety.
- L 1 is a bond.
- L 2 is a bond.
- L 1 and L 2 are a bond.
- –L 1 -R 1 is selected from the group consisting of a peptidyl moiety, a nucleic acid moiety and a carbohydrate moiety.
- –L 2 -R 2 is selected from the group consisting of a peptidyl moiety, a nucleic acid moiety and a carbohydrate moiety.
- –L 1 -R 1 is the same as –L 2 -R 2 .
- –L 1 -R 1 is different from –L 2 -R 2 .
- –L 1 -R 1 and –L 2 -R 2 are each independently a peptidyl moiety.
- L 1 is a bond.
- L 2 is a bond.
- L 1 and L 2 are a bond.
- the disclosure provides a protein comprising a moiety of Formula (IV), a moiety of Formula (V), a moiety of Formula (VI), or a combination of two or more thereof:
- the protein comprises a moiety of Formula (IV).
- the protein comprises a moiety of Formula (V).
- the protein comprises a moiety of Formula (VI).
- the protein comprises a moiety of Formula (IV) and a moiety of Formula (V).
- the protein comprises a moiety of Formula (IV) and a moiety of Formula (VI).
- the protein comprises a moiety of Formula (V) and a moiety of Formula (VI).
- the protein comprises a moiety of Formula (IV), a moiety of Formula (V), and a moiety of Formula (VI).
- the moieties of Formula (IV), (V), (VI), or a combination thereof form intramolecular covalent bonds.
- the moiety of Formula (IV) forms an intramolecular covalent bond.
- the moiety of Formula (V) forms an intramolecular covalent bond.
- the moiety of Formula (VI) forms an intramolecular covalent bond.
- the moieties of Formula (IV) and (V) form intramolecular covalent bonds.
- the moieties of Formula (IV) and (VI) form intramolecular covalent bonds.
- the moieties of Formula (V) and (VI) form intramolecular covalent bonds.
- the moieties of Formula (IV), (V), and (VI) form intramolecular covalent bonds.
- the moieties of Formula (IV), (V), (VI), or a combination thereof form intermolecular covalent bonds.
- the moiety of Formula (IV) forms an intermolecular covalent bond.
- the moiety of Formula (V) forms an intermolecular covalent bond.
- the moiety of Formula (VI) forms an intermolecular covalent bond.
- the moieties of Formula (IV) and (V) form intermolecular covalent bonds.
- the moieties of Formula (IV) and (VI) form intermolecular covalent bonds.
- the moieties of Formula (V) and (VI) form intermolecular covalent bonds.
- the moieties of Formula (IV), (V), and (VI) form intermolecular covalent bonds.
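The combinations listed above (one, two, or all three of the moieties of Formula (IV), (V), and (VI)) are the non-empty subsets of a three-item set. The following is a minimal sketch that enumerates them; the labels come from the text, while the function name `moiety_combinations` is a hypothetical helper introduced only for illustration, and the chemical structures themselves are not modeled.

```python
from itertools import combinations

# Formula labels as named in the text; structures are not represented here.
FORMULAS = ("IV", "V", "VI")


def moiety_combinations(formulas=FORMULAS):
    """Return every non-empty subset of the formula labels, smallest subsets first."""
    return [
        combo
        for size in range(1, len(formulas) + 1)
        for combo in combinations(formulas, size)
    ]


# Print each combination in the style used by the list above.
for combo in moiety_combinations():
    print(" + ".join(f"Formula ({label})" for label in combo))
```

The same seven combinations apply whether the moieties form intramolecular or intermolecular covalent bonds, which is why the list above repeats the pattern for each case.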
- the disclosure provides a protein of Formula (I), Formula (II), or Formula (III): wherein R 1 and R 2 are each independently a peptidyl moiety that are joined together, i.e., the protein of Formula (I), (II), and (III) comprises an intramolecular covalent bond.
- the protein is Formula (I).
- the protein is Formula (II).
- the protein is Formula (III).
- the peptidyl moiety of R 1 and the peptidyl moiety of R 2 comprise a protein ⁇ - strand. In aspects, the peptidyl moiety of R 1 and the peptidyl moiety of R 2 comprise a protein ⁇ - strand. In aspects, the peptidyl moiety of R 1 comprises a protein ⁇ -strand and the peptidyl moiety of R 2 comprises a protein ⁇ -strand. In aspects, the peptidyl moiety of R 1 comprises a protein ⁇ -strand and the peptidyl moiety of R 2 comprises a protein ⁇ -strand.
- the disclosure provides a protein of Formula (I), Formula (II), or Formula (III): wherein R 1 is a peptidyl moiety of a first protein and R 2 is a peptidyl moiety of a second protein, i.e., there is an intermolecular covalent bond between two proteins.
- the intermolecular bond is between two different proteins.
- the intermolecular bond is between two of the same proteins (e.g., two proteins having the same amino acid sequence that are intermolecularly bonded).
- the first protein is covalently bonded to the second protein via the moiety of Formula (IV) to form an intermolecularly bonded protein of Formula (I).
- the first protein is covalently bonded to the second protein via the moiety of Formula (V) to form an intermolecularly bonded protein of Formula (II). In aspects, the first protein is covalently bonded to the second protein via the moiety of Formula (VI) to form an intermolecularly bonded protein of Formula (III). In aspects, the first protein is covalently bonded to the second protein via the moiety of Formula (IV) and the moiety of Formula (V). In aspects, the first protein is covalently bonded to the second protein via the moiety of Formula (IV) and the moiety of Formula (VI). In aspects, the first protein is covalently bonded to the second protein via the moiety of Formula (V) and the moiety of Formula (VI).
- the first protein is covalently bonded to the second protein via the moiety of Formula (IV), the moiety of Formula (V), and the moiety of Formula (VI).
- the first protein is a hormone and the second protein is the receptor for the hormone.
- the first protein is an antibody or an antibody variant.
- the second protein is a membrane receptor.
- the first protein is an antibody and the second protein is a membrane receptor.
- the first protein is an antibody variant and the second protein is a membrane receptor.
- the first protein is an antibody, an antigen-binding fragment, a single-chain variable fragment, a single-domain antibody, or an affibody and the second protein is a membrane receptor.
- the first protein is an antigen-binding fragment and the second protein is a membrane receptor. In aspects, the first protein is a single-chain variable fragment and the second protein is a membrane receptor. In aspects, the first protein is a single-domain antibody and the second protein is a membrane receptor. In aspects, the first protein is an affibody and the second protein is a membrane receptor. In aspects, the first protein is a single-domain antibody and the second protein is a hormone receptor. In aspects, the peptidyl moiety R 1 and R 2 comprise a protein ⁇ - strand. In aspects, the peptidyl moiety R 1 and R 2 comprise a protein ⁇ -strand.
- the peptidyl moiety R 1 comprises a protein ⁇ -strand and the peptidyl moiety R 2 comprises a protein ⁇ -strand. In aspects, the peptidyl moiety R 1 comprises a protein ⁇ -strand and the peptidyl moiety R 2 comprises a protein ⁇ -strand.
- R 1 is an antibody or an antibody variant, and R 2 is a membrane receptor. In aspects, R 1 is an antibody and R 2 is a membrane receptor. In aspects, R 1 is an antibody variant and R 2 is a membrane receptor. In aspects, R 1 is an antibody, an antigen-binding fragment, a single-chain variable fragment, a single-domain antibody, or an affibody, and R 2 is a membrane receptor.
- R 1 is an antigen-binding fragment and R 2 is a membrane receptor.
- R 1 is a single-chain variable fragment and R 2 is a membrane receptor.
- R 1 is a single-domain antibody and R 2 is a membrane receptor.
- R 1 is an affibody and R 2 is a membrane receptor.
- R 1 is a membrane receptor and R 2 is an antibody or an antibody variant.
- R 1 is a membrane receptor and R 2 is an antibody.
- R 1 is a membrane receptor and R 2 is an antibody variant.
- R 1 is a membrane receptor and R 2 is an antibody, an antigen-binding fragment, a single-chain variable fragment, a single-domain antibody, or an affibody.
- R 1 is a membrane receptor and R 2 is an antigen-binding fragment. In aspects, R 1 is a membrane receptor and R 2 is a single-chain variable fragment. In aspects, R 1 is a membrane receptor and R 2 is a single-domain antibody. In aspects, R 1 is a membrane receptor and R 2 is an affibody.
- In aspects, the protein conjugates may comprise three or more different and/or separate proteins.
- the first protein is covalently bonded to the second protein via a moiety of Formula (IV), a moiety of Formula (V), a moiety of Formula (VI), or a combination of two or more thereof; and the second protein is covalently bonded to a third protein via a moiety of Formula (IV), a moiety of Formula (V), a moiety of Formula (VI), or a combination of two or more thereof.
- the first protein is covalently bonded to the second protein via a moiety of Formula (IV), a moiety of Formula (V), a moiety of Formula (VI), or a combination of two or more thereof; and the first protein is also covalently bonded to a third protein via a moiety of Formula (IV), a moiety of Formula (V), a moiety of Formula (VI), or a combination of two or more thereof.
- the first protein, the second protein, and the third protein may each optionally further comprise a moiety of Formula (IV), a moiety of Formula (V), a moiety of Formula (VI), or a combination of two or more thereof, wherein the peptidyl moiety of R 1 and R 2 form intramolecular bonds within the first protein, the second protein, or the third protein, respectively.
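The two three-protein arrangements described above differ only in which protein the third protein is bonded to: the second protein (a chain) or the first protein (a hub). A minimal sketch of that connectivity follows. The labels `P1`, `P2`, and `P3`, the helper `partners`, and the particular formula moieties assigned to each linkage are hypothetical placeholders chosen for illustration; the text allows any of Formula (IV), (V), (VI), or a combination of two or more, on each linkage.

```python
# Hypothetical labels: P1, P2, P3 stand for the first, second, and third protein.
# Each key is one intermolecular linkage; the value names the formula moieties
# forming it (illustrative choices only, not taken from the source).
chain = {("P1", "P2"): {"IV"}, ("P2", "P3"): {"V", "VI"}}  # first-second, second-third
hub = {("P1", "P2"): {"IV"}, ("P1", "P3"): {"VI"}}         # first-second, first-third


def partners(conjugate, protein):
    """Return the sorted list of proteins covalently bonded to `protein`."""
    found = set()
    for a, b in conjugate:
        if a == protein:
            found.add(b)
        elif b == protein:
            found.add(a)
    return sorted(found)


print(partners(chain, "P2"))  # proteins bonded to the second protein in the chain
print(partners(hub, "P1"))    # proteins bonded to the first protein in the hub
```

Optional intramolecular bonds within a single protein could be recorded the same way with a key such as `("P1", "P1")`, though that extension is likewise an assumption of this sketch.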
- the disclosure provides a small molecule moiety, a membrane receptor, an antibody, an antigen-binding fragment, a single-chain variable fragment, a single-domain antibody, or an affibody comprising an unnatural amino acid; wherein the unnatural amino acid has a side chain of Formula (F):
- the disclosure provides an antibody, an antigen-binding fragment, a single-chain variable fragment, a single-domain antibody, or an affibody comprising the unnatural amino acid side chain of Formula (F).
- the disclosure provides a membrane receptor comprising the unnatural amino acid side chain of Formula (F).
- the disclosure provides a small molecule moiety comprising the unnatural amino acid side chain of Formula (F).
- the disclosure provides an antibody comprising the unnatural amino acid side chain of Formula (F).
- the disclosure provides an antigen-binding fragment comprising the unnatural amino acid side chain of Formula (F).
- the disclosure provides a single-chain variable fragment comprising the unnatural amino acid side chain of Formula (F).
- the disclosure provides a single-domain antibody comprising the unnatural amino acid side chain of Formula (F).
- the disclosure provides an affibody comprising the unnatural amino acid side chain of Formula (F).
- the biomolecules and proteins described herein comprise a membrane receptor.
- the membrane receptor is a programmed cell death protein 1 (PD-1) receptor, a programmed death ligand 1 (PD-L1) receptor, a 5-hydroxytryptamine receptor, an acetylcholine receptor, an adenosine receptor, an adenosine A2A receptor, an adenosine A2B receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor (EGFR), a formyl peptide receptor, a free fatty acid receptor, a galanin receptor, a ghrelin receptor, a glycoprotein hormone receptor, a gonadotrophin-releasing hormone receptor, a G
- the membrane receptor is a PD-1 receptor or a PD-L1 receptor. In embodiments, the membrane receptor is a PD-1 receptor. In embodiments, the membrane receptor is a PD-L1 receptor. [0224] In embodiments, the membrane receptor is a receptor expressed on a cancer cell. In embodiments, the membrane receptor is a receptor overexpressed on a cancer cell relative to a control. [0225] In embodiments, the membrane receptor is a G protein-coupled receptor. In embodiments, the membrane receptor is a receptor tyrosine kinase. In embodiments, the receptor protein is an ErbB receptor. In embodiments, the membrane receptor is an epidermal growth factor receptor (EGFR).
- EGFR epidermal growth factor receptor
- the membrane receptor is epidermal growth factor receptor 1 (HER1). In embodiments, the membrane receptor is epidermal growth factor receptor 2 (HER2). In embodiments, the membrane receptor is epidermal growth factor receptor 3 (HER3). In embodiments, the membrane receptor is epidermal growth factor receptor 4 (HER4). [0226] In embodiments, the membrane receptor is EGFR. In embodiments, the membrane receptor is EGFR expressed on a cancer cell. In embodiments, the membrane receptor is EGFR that is overexpressed on a cancer cell relative to a control. [0227] Provided herein is nanobody 7D12 modified with FSK or FSY.
- Nanobody 7D12 is set forth as SEQ ID NO:88, wherein CDR1 is as set forth in SEQ ID NO:95, CDR2 is as set forth in SEQ ID NO:96, and CDR3 is as set forth in SEQ ID NO:97.
- SEQ ID NO:88 QVKLEESGGG SVQTGGSLRL TCAASGRTSR SYGMGWFRQA PGKEREFVSG ISWRGDSTGY ADSVKGRFTI SRDNAKNTVD LQMNSLKPED TAIYYCAAAA GSAWYGTLYE YDYWGQGTQV TVSS [0229]
- SEQ ID NO:95 RTSRSYGMG [0230]
- SEQ ID NO:96 GISWRGDS [0231]
- SEQ ID NO:97 AAGSAWYGTLYEYDY [0232]
- nanobody 7D12 wherein at least one amino acid in the nanobody is FSK.
- the nanobody comprises CDR1 as set forth in SEQ ID NO:95, CDR2 as set forth in SEQ ID NO:96, and CDR3 as set forth in SEQ ID NO:97; wherein the amino acid at the position corresponding to position 30 or position 31 is FSK.
- the nanobody comprises CDR1 as set forth in SEQ ID NO:95, CDR2 as set forth in SEQ ID NO:96, and CDR3 as set forth in SEQ ID NO:97; wherein the amino acid at the position corresponding to position 30 is FSK (i.e., wherein position 30 corresponds to position 4 in SEQ ID NO:95).
- the nanobody comprises CDR1 as set forth in SEQ ID NO:95, CDR2 as set forth in SEQ ID NO:96, and CDR3 as set forth in SEQ ID NO:97; wherein the amino acid at the position corresponding to position 31 is FSK (i.e., wherein position 31 corresponds to position 5 in SEQ ID NO:95).
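The correspondence between full-length numbering and CDR numbering stated above (position 30 of SEQ ID NO:88 is position 4 of SEQ ID NO:95, and position 31 is position 5) can be checked with a minimal Python sketch. The function name `cdr_position` is hypothetical and chosen only for illustration; the sketch assumes 1-based numbering and that the CDR occurs as an exact substring of the full-length sequence.

```python
# 7D12 as listed for SEQ ID NO:88; the spaces of the listing are stripped.
SEQ_88 = (
    "QVKLEESGGG SVQTGGSLRL TCAASGRTSR SYGMGWFRQA PGKEREFVSG ISWRGDSTGY "
    "ADSVKGRFTI SRDNAKNTVD LQMNSLKPED TAIYYCAAAA GSAWYGTLYE YDYWGQGTQV TVSS"
).replace(" ", "")
CDR1 = "RTSRSYGMG"  # SEQ ID NO:95


def cdr_position(full_seq: str, cdr: str, full_pos: int) -> int:
    """Convert a 1-based position in full_seq to a 1-based position in cdr."""
    start = full_seq.find(cdr)  # 0-based index of the first exact match
    if start < 0:
        raise ValueError("CDR not found in the full-length sequence")
    pos = full_pos - start  # 1-based position inside the CDR
    if not 1 <= pos <= len(cdr):
        raise ValueError("position lies outside the CDR")
    return pos


print(cdr_position(SEQ_88, CDR1, 30))  # 4
print(cdr_position(SEQ_88, CDR1, 31))  # 5
```

The same helper can be applied to CDR2 or CDR3 by passing SEQ ID NO:96 or SEQ ID NO:97 as the `cdr` argument.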
- the nanobody comprises CDR1 as set forth in SEQ ID NO:98, CDR2 as set forth in SEQ ID NO:96, and CDR3 as set forth in SEQ ID NO:97.
- the nanobody comprises CDR1 as set forth in SEQ ID NO:99, CDR2 as set forth in SEQ ID NO:96, and CDR3 as set forth in SEQ ID NO:97.
- nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:35 or SEQ ID NO:88, wherein at least one amino acid in the amino acid sequence is FSK. In embodiments, nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:35, wherein at least one amino acid in the amino acid sequence is FSK. In embodiments, nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:88, wherein at least one amino acid in the amino acid sequence is FSK.
- nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:88, wherein the amino acid at the position corresponding to position 30 or position 31 is FSK. In embodiments, nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:88, wherein the amino acid at the position corresponding to position 30 is FSK (i.e., SEQ ID NO:89). In embodiments, nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:88, wherein the amino acid at the position corresponding to position 31 is FSK (i.e., SEQ ID NO:90). [0236] In embodiments, the nanobody comprises SEQ ID NO:89. In embodiments, the nanobody is as set forth at SEQ ID NO:89.
- the nanobody has at least 85% sequence identity to SEQ ID NO:89. In embodiments, the nanobody has at least 90% sequence identity to SEQ ID NO:89. In embodiments, the nanobody has at least 92% sequence identity to SEQ ID NO:89. In embodiments, the nanobody has at least 94% sequence identity to SEQ ID NO:89. In embodiments, the nanobody has at least 95% sequence identity to SEQ ID NO:89. In embodiments, the nanobody has at least 96% sequence identity to SEQ ID NO:89. In embodiments, the nanobody has at least 98% sequence identity to SEQ ID NO:89. In embodiments where there is less than 100% sequence identity, the nanobody must contain FSK at a position corresponding to position 30 in SEQ ID NO:89.
- when the nanobody has less than 100% sequence identity to SEQ ID NO:89, the nanobody has 100% sequence identity to CDR1, CDR2, and CDR3 within SEQ ID NO:89.
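The combined conditions described for SEQ ID NO:89 (a minimum percent identity, FSK retained at position 30, and unchanged CDRs) can be illustrated with the following sketch. It is a simplification under stated assumptions: identity is computed over an ungapped, equal-length comparison, whereas in practice sequence identity is normally determined with an alignment tool; `X` is used as a placeholder for FSK; and all function names are hypothetical.

```python
# 7D12 as listed for SEQ ID NO:88; the spaces of the listing are stripped.
SEQ_88 = (
    "QVKLEESGGG SVQTGGSLRL TCAASGRTSR SYGMGWFRQA PGKEREFVSG ISWRGDSTGY "
    "ADSVKGRFTI SRDNAKNTVD LQMNSLKPED TAIYYCAAAA GSAWYGTLYE YDYWGQGTQV TVSS"
).replace(" ", "")
WILD_TYPE_CDRS = ("RTSRSYGMG", "GISWRGDS", "AAGSAWYGTLYEYDY")  # SEQ ID NO:95-97
UAA = "X"  # placeholder symbol for the unnatural amino acid (FSK here)


def with_uaa(seq: str, pos: int) -> str:
    """Replace the residue at 1-based position pos with the placeholder."""
    return seq[: pos - 1] + UAA + seq[pos:]


def cdr_spans(seq: str, cdrs) -> list:
    """Locate each CDR in seq and return 0-based half-open (start, end) spans."""
    spans = []
    for cdr in cdrs:
        start = seq.find(cdr)
        if start < 0:
            raise ValueError(f"CDR {cdr} not found")
        spans.append((start, start + len(cdr)))
    return spans


def percent_identity(a: str, b: str) -> float:
    """Percent identity over an ungapped comparison of equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def meets_embodiment(candidate, reference, spans, uaa_pos, threshold) -> bool:
    """Check identity threshold, retained placeholder, and unchanged CDR spans."""
    if len(candidate) != len(reference):
        return False
    if candidate[uaa_pos - 1] != UAA:
        return False
    if any(candidate[s:e] != reference[s:e] for s, e in spans):
        return False
    return percent_identity(candidate, reference) >= threshold


SPANS = cdr_spans(SEQ_88, WILD_TYPE_CDRS)
SEQ_89 = with_uaa(SEQ_88, 30)                  # FSK placeholder at position 30
framework_variant = "QI" + SEQ_89[2:]          # illustrative change at position 2
cdr_variant = SEQ_89[:51] + "A" + SEQ_89[52:]  # illustrative change inside CDR2

print(meets_embodiment(framework_variant, SEQ_89, SPANS, 30, 98.0))
print(meets_embodiment(cdr_variant, SEQ_89, SPANS, 30, 85.0))
```

The two variants are invented for illustration only: the first changes a single framework residue and so keeps the CDRs and the placeholder intact, while the second alters a CDR2 residue and is rejected regardless of the identity threshold.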
- X FSK is FSK.
- SEQ ID NO:89 QVKLEESGGG SVQTGGSLRL TCAASGRTSX FSK SYGMGWFRQA PGKEREFVSG ISWRGDSTGY ADSVKGRFTI SRDNAKNTVD LQMNSLKPED TAIYYCAAAA GSAWYGTLYE YDYWGQGTQV TVSS [0238]
- the nanobody comprises SEQ ID NO:90.
- the nanobody is as set forth at SEQ ID NO:90. In embodiments, the nanobody has at least 85% sequence identity to SEQ ID NO:90. In embodiments, the nanobody has at least 90% sequence identity to SEQ ID NO:90. In embodiments, the nanobody has at least 92% sequence identity to SEQ ID NO:90. In embodiments, the nanobody has at least 94% sequence identity to SEQ ID NO:90. In embodiments, the nanobody has at least 95% sequence identity to SEQ ID NO:90. In embodiments, the nanobody has at least 96% sequence identity to SEQ ID NO:90. In embodiments, the nanobody has at least 98% sequence identity to SEQ ID NO:90.
- the nanobody must contain FSK at a position corresponding to position 31 in SEQ ID NO:90. In embodiments when the nanobody has less than 100% sequence identity to SEQ ID NO:90, then the nanobody has 100% sequence identity to CDR1, CDR2, and CDR3 within SEQ ID NO:90. In SEQ ID NO:90, X FSK is FSK.
- nanobody 7D12 comprises CDR1 as set forth in SEQ ID NO:95, CDR2 as set forth in SEQ ID NO:96, and CDR3 as set forth in SEQ ID NO:97; wherein the amino acid at the position corresponding to position 109 or position 113 is FSY.
- nanobody 7D12 comprises CDR1 as set forth in SEQ ID NO:95, CDR2 as set forth in SEQ ID NO:96, and CDR3 as set forth in SEQ ID NO:97; wherein the amino acid at the position corresponding to position 109 is FSY (i.e., wherein position 109 corresponds to position 11 in SEQ ID NO:97).
- nanobody 7D12 comprises CDR1 as set forth in SEQ ID NO:95, CDR2 as set forth in SEQ ID NO:96, and CDR3 as set forth in SEQ ID NO:100. In embodiments, nanobody 7D12 comprises CDR1 as set forth in SEQ ID NO:95, CDR2 as set forth in SEQ ID NO:96, and CDR3 as set forth in SEQ ID NO:101. In the sequences, X FSY is FSY.
- nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:35 or SEQ ID NO:88, wherein at least one amino acid in the amino acid sequence is FSY. In embodiments, nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:35, wherein at least one amino acid in the amino acid sequence is FSY. In embodiments, nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:88, wherein at least one amino acid in the amino acid sequence is FSY.
- nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:88, wherein the amino acid at the position corresponding to position 1, position 109, position 113, or position 116 is FSY. In embodiments, nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:88, wherein the amino acid at the position corresponding to position 1 is FSY (i.e., SEQ ID NO:91). In embodiments, nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:88, wherein the amino acid at the position corresponding to position 109 is FSY (i.e., SEQ ID NO:92).
- nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:88, wherein the amino acid at the position corresponding to position 113 is FSY (i.e., SEQ ID NO:93). In embodiments, nanobody 7D12 has the amino acid sequence set forth in SEQ ID NO:88, wherein the amino acid at the position corresponding to position 116 is FSY (i.e., SEQ ID NO:94). [0244] In embodiments, the nanobody comprises SEQ ID NO:91. In embodiments, the nanobody is as set forth at SEQ ID NO:91. In embodiments, the nanobody has at least 85% sequence identity to SEQ ID NO:91. In embodiments, the nanobody has at least 90% sequence identity to SEQ ID NO:91.
- the nanobody has at least 92% sequence identity to SEQ ID NO:91. In embodiments, the nanobody has at least 94% sequence identity to SEQ ID NO:91. In embodiments, the nanobody has at least 95% sequence identity to SEQ ID NO:91. In embodiments, the nanobody has at least 96% sequence identity to SEQ ID NO:91. In embodiments, the nanobody has at least 98% sequence identity to SEQ ID NO:91. In embodiments where there is less than 100% sequence identity, the nanobody must contain FSY at a position corresponding to position 1 in SEQ ID NO:91.
- when the nanobody has less than 100% sequence identity to SEQ ID NO:91, the nanobody has 100% sequence identity to CDR1, CDR2, and CDR3 within SEQ ID NO:91, and the nanobody has FSY at a position corresponding to position 1 in SEQ ID NO:91.
- X FSY is FSY.
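As an illustration of how the FSY variants relate to SEQ ID NO:88, the sketch below substitutes a placeholder at each of the four named positions (1, 109, 113, and 116) and reports the residue of SEQ ID NO:88 that is replaced. The symbol `X` stands in for FSY, and the helper name `substitute` is hypothetical.

```python
# 7D12 as listed for SEQ ID NO:88; the spaces of the listing are stripped.
SEQ_88 = (
    "QVKLEESGGG SVQTGGSLRL TCAASGRTSR SYGMGWFRQA PGKEREFVSG ISWRGDSTGY "
    "ADSVKGRFTI SRDNAKNTVD LQMNSLKPED TAIYYCAAAA GSAWYGTLYE YDYWGQGTQV TVSS"
).replace(" ", "")
FSY_POSITIONS = (1, 109, 113, 116)


def substitute(seq: str, pos: int, placeholder: str = "X"):
    """Return (replaced residue, variant sequence) for a 1-based position."""
    if not 1 <= pos <= len(seq):
        raise ValueError("position outside the sequence")
    return seq[pos - 1], seq[: pos - 1] + placeholder + seq[pos:]


variants = {pos: substitute(SEQ_88, pos) for pos in FSY_POSITIONS}
for pos, (replaced, variant) in variants.items():
    print(pos, replaced, variant)
```

Each variant keeps the length of SEQ ID NO:88, so the position numbering used throughout this section is unchanged by the substitution.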
- the nanobody comprises SEQ ID NO:92.
- the nanobody is as set forth at SEQ ID NO:92.
- the nanobody has at least 85% sequence identity to SEQ ID NO:92.
- the nanobody has at least 90% sequence identity to SEQ ID NO:92.
- the nanobody has at least 92% sequence identity to SEQ ID NO:92. In embodiments, the nanobody has at least 94% sequence identity to SEQ ID NO:92. In embodiments, the nanobody has at least 95% sequence identity to SEQ ID NO:92. In embodiments, the nanobody has at least 96% sequence identity to SEQ ID NO:92. In embodiments, the nanobody has at least 98% sequence identity to SEQ ID NO:92. In embodiments where there is less than 100% sequence identity, the nanobody must contain FSY at a position corresponding to position 109 in SEQ ID NO:92.
- when the nanobody has less than 100% sequence identity to SEQ ID NO:92, the nanobody has 100% sequence identity to CDR1, CDR2, and CDR3 within SEQ ID NO:92.
- X FSY is FSY.
- SEQ ID NO:92 QVKLEESGGG SVQTGGSLRL TCAASGRTSR SYGMGWFRQA PGKEREFVSG ISWRGDSTGY ADSVKGRFTI SRDNAKNTVD LQMNSLKPED TAIYYCAAAA GSAWYGTLX FSY E YDYWGQGTQV TVSS [0248]
- the nanobody comprises SEQ ID NO:93.
- the nanobody is as set forth at SEQ ID NO:93. In embodiments, the nanobody has at least 85% sequence identity to SEQ ID NO:93. In embodiments, the nanobody has at least 90% sequence identity to SEQ ID NO:93. In embodiments, the nanobody has at least 92% sequence identity to SEQ ID NO:93. In embodiments, the nanobody has at least 94% sequence identity to SEQ ID NO:93. In embodiments, the nanobody has at least 95% sequence identity to SEQ ID NO:93. In embodiments, the nanobody has at least 96% sequence identity to SEQ ID NO:93. In embodiments, the nanobody has at least 98% sequence identity to SEQ ID NO:93.
- the nanobody must contain FSY at a position corresponding to position 113 in SEQ ID NO:93. In embodiments when the nanobody has less than 100% sequence identity to SEQ ID NO:93, then the nanobody has 100% sequence identity to CDR1, CDR2, and CDR3 within SEQ ID NO:93. In SEQ ID NO:93, X FSY is FSY.
- the nanobody comprises SEQ ID NO:94.
- the nanobody is as set forth at SEQ ID NO:94.
- the nanobody has at least 85% sequence identity to SEQ ID NO:94.
- the nanobody has at least 90% sequence identity to SEQ ID NO:94.
- the nanobody has at least 92% sequence identity to SEQ ID NO:94. In embodiments, the nanobody has at least 94% sequence identity to SEQ ID NO:94. In embodiments, the nanobody has at least 95% sequence identity to SEQ ID NO:94. In embodiments, the nanobody has at least 96% sequence identity to SEQ ID NO:94. In embodiments, the nanobody has at least 98% sequence identity to SEQ ID NO:94. In embodiments where there is less than 100% sequence identity, the nanobody must contain FSY at a position corresponding to position 116 in SEQ ID NO:94.
- when the nanobody has less than 100% sequence identity to SEQ ID NO:94, the nanobody has 100% sequence identity to CDR1, CDR2, and CDR3 within SEQ ID NO:94, and the nanobody has FSY at a position corresponding to position 116 in SEQ ID NO:94.
- X FSY is FSY.
- the disclosure provides a pharmaceutical composition comprising nanobody 7D12 wherein at least one amino acid in the amino acid sequence is FSK (including embodiments as described herein) and a pharmaceutically acceptable excipient.
- the pharmaceutical composition comprises SEQ ID NO:89 (including embodiments thereof) and a pharmaceutically acceptable carrier.
- the pharmaceutical composition comprises SEQ ID NO:90 (including embodiments thereof) and a pharmaceutically acceptable excipient.
- the pharmaceutical composition comprises a nanobody comprising CDR1 as set forth in SEQ ID NO:98, CDR2 as set forth in SEQ ID NO:96, and CDR3 as set forth in SEQ ID NO:97, and a pharmaceutically acceptable excipient.
- the pharmaceutical composition comprises a nanobody comprising CDR1 as set forth in SEQ ID NO:99, CDR2 as set forth in SEQ ID NO:96, and CDR3 as set forth in SEQ ID NO:97, and a pharmaceutically acceptable excipient.
- the disclosure provides a biomolecule conjugate comprising nanobody 7D12 wherein at least one amino acid in the amino acid sequence is FSK (including embodiments as described herein) covalently bonded via the amino acid side chain of FSK to a lysine, histidine, or tyrosine amino acid in the EGFR protein.
- the disclosure provides a biomolecule conjugate comprising nanobody 7D12 wherein at least one amino acid in the amino acid sequence is FSK (including embodiments as described herein) covalently bonded via the amino acid side chain of FSK to a lysine in the EGFR protein.
- the disclosure provides a biomolecule conjugate comprising nanobody 7D12 wherein at least one amino acid in the amino acid sequence is FSK (including embodiments as described herein) covalently bonded via the amino acid side chain of FSK to a histidine in the EGFR protein.
- the disclosure provides a biomolecule conjugate comprising nanobody 7D12 wherein at least one amino acid in the amino acid sequence is FSK (including embodiments as described herein) covalently bonded via the amino acid side chain of FSK to a tyrosine in the EGFR protein.
- the biomolecule conjugate comprises SEQ ID NO:89 (including embodiments thereof) covalently bonded via FSK to a lysine, histidine, or tyrosine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:89 (including embodiments thereof) covalently bonded via FSK to a lysine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:89 (including embodiments thereof) covalently bonded via FSK to a histidine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:89 (including embodiments thereof) covalently bonded via FSK to a tyrosine amino acid in EGFR.
- the biomolecule conjugate comprises SEQ ID NO:90 (including embodiments thereof) covalently bonded via FSK to a lysine, histidine, or tyrosine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:90 (including embodiments thereof) covalently bonded via FSK to a lysine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:90 (including embodiments thereof) covalently bonded via FSK to a histidine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:90 (including embodiments thereof) covalently bonded via FSK to a tyrosine amino acid in EGFR.
- the disclosure provides a pharmaceutical composition comprising nanobody 7D12 wherein at least one amino acid in the amino acid sequence is FSY (including embodiments as described herein) and a pharmaceutically acceptable carrier.
- the pharmaceutical composition comprises SEQ ID NO:91 (including embodiments thereof) and a pharmaceutically acceptable carrier.
- the pharmaceutical composition comprises SEQ ID NO:92 (including embodiments thereof) and a pharmaceutically acceptable carrier.
- the pharmaceutical composition comprises SEQ ID NO:93 (including embodiments thereof) and a pharmaceutically acceptable carrier.
- the pharmaceutical composition comprises SEQ ID NO:94 (including embodiments thereof) and a pharmaceutically acceptable carrier.
- the pharmaceutical composition comprises a nanobody comprising CDR1 as set forth in SEQ ID NO:95, CDR2 as set forth in SEQ ID NO:96, and CDR3 as set forth in SEQ ID NO:100, and a pharmaceutically acceptable excipient.
- the pharmaceutical composition comprises a nanobody comprising CDR1 as set forth in SEQ ID NO:95, CDR2 as set forth in SEQ ID NO:96, and CDR3 as set forth in SEQ ID NO:101, and a pharmaceutically acceptable excipient.
- the disclosure provides a biomolecule conjugate comprising nanobody 7D12 wherein at least one amino acid in the amino acid sequence is FSY (including embodiments as described herein) covalently bonded via the amino acid side chain of FSY to a lysine, histidine, or tyrosine amino acid in the EGFR protein.
- the disclosure provides a biomolecule conjugate comprising nanobody 7D12 wherein at least one amino acid in the amino acid sequence is FSY (including embodiments as described herein) covalently bonded via the amino acid side chain of FSY to a lysine in the EGFR protein.
- the disclosure provides a biomolecule conjugate comprising nanobody 7D12 wherein at least one amino acid in the amino acid sequence is FSY (including embodiments as described herein) covalently bonded via the amino acid side chain of FSY to a histidine in the EGFR protein.
- the disclosure provides a biomolecule conjugate comprising nanobody 7D12 wherein at least one amino acid in the amino acid sequence is FSY (including embodiments as described herein) covalently bonded via the amino acid side chain of FSY to a tyrosine in the EGFR protein.
- the biomolecule conjugate comprises SEQ ID NO:91 (including embodiments thereof) covalently bonded via FSY to a lysine, histidine, or tyrosine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:91 (including embodiments thereof) covalently bonded via FSY to a lysine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:91 (including embodiments thereof) covalently bonded via FSY to a histidine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:91 (including embodiments thereof) covalently bonded via FSY to a tyrosine amino acid in EGFR.
- the biomolecule conjugate comprises SEQ ID NO:92 (including embodiments thereof) covalently bonded via FSY to a lysine, histidine, or tyrosine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:92 (including embodiments thereof) covalently bonded via FSY to a lysine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:92 (including embodiments thereof) covalently bonded via FSY to a histidine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:92 (including embodiments thereof) covalently bonded via FSY to a tyrosine amino acid in EGFR.
- the biomolecule conjugate comprises SEQ ID NO:93 (including embodiments thereof) covalently bonded via FSY to a lysine, histidine, or tyrosine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:93 (including embodiments thereof) covalently bonded via FSY to a lysine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:93 (including embodiments thereof) covalently bonded via FSY to a histidine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:93 (including embodiments thereof) covalently bonded via FSY to a tyrosine amino acid in EGFR.
- the biomolecule conjugate comprises SEQ ID NO:94 (including embodiments thereof) covalently bonded via FSY to a lysine, histidine, or tyrosine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:94 (including embodiments thereof) covalently bonded via FSY to a lysine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:94 (including embodiments thereof) covalently bonded via FSY to a histidine amino acid in EGFR. In embodiments, the biomolecule conjugate comprises SEQ ID NO:94 (including embodiments thereof) covalently bonded via FSY to a tyrosine amino acid in EGFR.
- an unnatural amino acid e.g., FSK
- a biomolecule e.g., protein
- tRNA transfer RNA molecule
- the attachment of unnatural amino acids to tRNA may not necessarily be accomplished by the naturally occurring aminoacyl-tRNA synthetase.
- Engineered aminoacyl-tRNA synthetases e.g., mutant pyrrolysyl-tRNA synthetase (PylRS)
- PylRS mutant pyrrolysyl-tRNA synthetase
- a PylRS mutant library was generated. Compared to previously described PylRS mutant libraries, the PylRS mutant library generated herein was constructed using the new small-intelligent mutagenesis approach, which allows a greater number of amino acid residues to be mutated simultaneously (e.g., 10 amino acid residues).
- the disclosure provides a mutant pyrrolysyl-tRNA synthetase, including at least 5 amino acid residue substitutions within the substrate-binding site of the mutant pyrrolysyl-tRNA synthetase.
- the mutant pyrrolysyl-tRNA synthetase comprises at least 5 amino acid residue substitutions in the amino acid sequence of SEQ ID NO:1.
- the at least 5 amino acid substitutions occur at the residues tyrosine at position 126, methionine at position 129, valine at position 168, histidine at position 227, and tyrosine at position 228 as set forth in the amino acid sequence of SEQ ID NO:1.
- the at least 5 amino acid substitutions are: (i) Y126G; (ii) M129A; (iii) V168F; (iv) H227T, H227S, or H227I; and (v) Y228P, in the amino acid sequence of SEQ ID NO:1.
- the mutant pyrrolysyl-tRNA synthetase comprises at least 6 amino acid residue substitutions in the amino acid sequence of SEQ ID NO:1.
- the at least 6 amino acid substitutions occur at the residues tyrosine at position 126, methionine at position 129, valine at position 168, histidine at position 227, tyrosine at position 228, and leucine at position 229 as set forth in the amino acid sequence of SEQ ID NO:1.
- the at least 6 amino acid residue substitutions are: (i) Y126G; (ii) M129A; (iii) V168F; (iv) H227T, H227S, or H227I; (v) Y228P; and (vi) L229V or L229I, in the amino acid sequence of SEQ ID NO:1.
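The substitution notation used above (for example, Y126G means that the tyrosine at position 126 is replaced by glycine) can be applied programmatically. The sketch below is illustrative only: SEQ ID NO:1 is not reproduced in this passage, so the code builds a hypothetical 230-residue placeholder sequence that merely carries the stated wild-type residues at the stated positions. The names `toy_wild_type` and `apply_substitutions` are likewise hypothetical.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def toy_wild_type(length: int = 230) -> str:
    """Hypothetical stand-in for SEQ ID NO:1 (NOT the real sequence)."""
    residues = ["A"] * length
    stated = {126: "Y", 129: "M", 168: "V", 227: "H", 228: "Y", 229: "L"}
    for pos, aa in stated.items():
        residues[pos - 1] = aa
    return "".join(residues)


def apply_substitutions(seq: str, mutations) -> str:
    """Apply substitutions such as 'Y126G', checking each wild-type residue."""
    residues = list(seq)
    for mutation in mutations:
        match = MUTATION.match(mutation)
        if match is None:
            raise ValueError(f"unrecognized substitution: {mutation}")
        wild_type, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != wild_type:
            raise ValueError(
                f"{mutation}: expected {wild_type} at position {pos}, "
                f"found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


six_substitutions = ["Y126G", "M129A", "V168F", "H227T", "Y228P", "L229V"]
mutant = apply_substitutions(toy_wild_type(), six_substitutions)
print(mutant[225:229])
```

Checking the wild-type letter before substituting catches numbering or residue mismatches early, which is useful when a substitution list is transferred between sequences with different numbering.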
- the mutant pyrrolysyl-tRNA synthetase is encoded by the nucleic acid sequence of SEQ ID NO:2.
- the mutant pyrrolysyl-tRNA synthetase is encoded by a nucleic acid sequence including the sequence of SEQ ID NO:2.
- the mutant pyrrolysyl-tRNA synthetase is encoded by a nucleic acid sequence that has at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO:2.
- the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 80% identity to SEQ ID NO:2.
- the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 85% identity to SEQ ID NO:2.
- the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 90% identity to SEQ ID NO:2. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 95% identity to SEQ ID NO:2. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 98% identity to SEQ ID NO:2. SEQ ID NO:2 is alternatively referred to as FSKRS. [0260] In embodiments, the mutant pyrrolysyl-tRNA synthetase is encoded by the nucleic acid sequence of SEQ ID NO:86.
- the mutant pyrrolysyl-tRNA synthetase is encoded by a nucleic acid sequence including the sequence of SEQ ID NO:86. In aspects, the mutant pyrrolysyl-tRNA synthetase is encoded by a nucleic acid sequence that has at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO:86. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 80% identity to SEQ ID NO:86.
- the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 85% identity to SEQ ID NO:86. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 90% identity to SEQ ID NO:86. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 95% identity to SEQ ID NO:86. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 98% identity to SEQ ID NO:86.
- when the mutant pyrrolysyl-tRNA synthetase has less than 100% sequence identity to SEQ ID NO:86, the first seven amino acids at the N-terminus are always MH6 (methionine followed by six histidines).
- SEQ ID NO:86 is alternatively referred to as FSKRSNThis.
- the mutant pyrrolysyl-tRNA synthetase is encoded by the nucleic acid sequence of SEQ ID NO:87.
- the mutant pyrrolysyl-tRNA synthetase is encoded by a nucleic acid sequence including the sequence of SEQ ID NO:87.
- the mutant pyrrolysyl-tRNA synthetase is encoded by a nucleic acid sequence that has at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO:87.
- the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 80% identity to SEQ ID NO:87.
- the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 85% identity to SEQ ID NO:87.
- the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 90% identity to SEQ ID NO:87. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 95% identity to SEQ ID NO:87. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 98% identity to SEQ ID NO:87. In aspects, when the pyrrolysyl-tRNA synthetase has less than 100% sequence identity to SEQ ID NO:87, the last six amino acids at the C-terminus are always histidine.
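The terminal constraints stated for the variants of SEQ ID NO:86 (first seven residues are M followed by six histidines) and SEQ ID NO:87 (last six residues are histidine) reduce to simple prefix and suffix checks. The sketch below uses invented toy sequences, since SEQ ID NO:86 and SEQ ID NO:87 are not reproduced in this passage, and the function names are hypothetical.

```python
HIS6 = "H" * 6  # six consecutive histidines


def has_n_terminal_mh6(seq: str) -> bool:
    """True if the first seven residues are M followed by six histidines."""
    return seq.startswith("M" + HIS6)


def has_c_terminal_his6(seq: str) -> bool:
    """True if the last six residues are all histidine."""
    return seq.endswith(HIS6)


toy_core = "GSAKTLE"  # invented placeholder, not a real synthetase sequence
n_tagged = "M" + HIS6 + toy_core
c_tagged = "M" + toy_core + HIS6

print(has_n_terminal_mh6(n_tagged), has_c_terminal_his6(n_tagged))
print(has_n_terminal_mh6(c_tagged), has_c_terminal_his6(c_tagged))
```

A check of this kind would be applied in addition to, not instead of, the percent identity requirement.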
- compositions e.g., mutant pyrrolysyl-tRNA synthetase, tRNA Pyl
- compositions may be delivered to cells using methods well known in the art.
- a vector including a nucleic acid sequence encoding a mutant pyrrolysyl-tRNA synthetase as described herein, including embodiments thereof.
- the vector comprises a nucleic acid sequence encoding a mutant pyrrolysyl-tRNA synthetase that comprises at least 5 amino acid residue substitutions within the substrate-binding site of the mutant pyrrolysyl-tRNA synthetase.
- the vector further includes a nucleic acid sequence encoding tRNA Pyl.
- the vector comprises a nucleic acid sequence encoding a mutant pyrrolysyl-tRNA synthetase, including at least 5 amino acid residue substitutions within the substrate-binding site of the mutant pyrrolysyl-tRNA synthetase.
- the vector comprises a nucleic acid sequence encoding a mutant pyrrolysyl-tRNA synthetase comprising at least 5 amino acid residue substitutions in the amino acid sequence of SEQ ID NO:1.
- the at least 5 amino acid substitutions occur at the residues tyrosine at position 126, methionine at position 129, valine at position 168, histidine at position 227, and tyrosine at position 228 as set forth in the amino acid sequence of SEQ ID NO:1.
- the at least 5 amino acid substitutions are: (i) Y126G; (ii) M129A; (iii) V168F; (iv) H227T, H227S, or H227I; and (v) Y228P, in the amino acid sequence of SEQ ID NO:1.
- the mutant pyrrolysyl-tRNA synthetase comprises at least 6 amino acid residue substitutions in the amino acid sequence of SEQ ID NO:1.
- the at least 6 amino acid substitutions occur at the residues tyrosine at position 126, methionine at position 129, valine at position 168, histidine at position 227, tyrosine at position 228, and leucine at position 229 as set forth in the amino acid sequence of SEQ ID NO:1.
- the at least 6 amino acid residue substitutions are: (i) Y126G; (ii) M129A; (iii) V168F; (iv) H227T, H227S, or H227I; (v) Y228P; and (vi) L229V or L229I, in the amino acid sequence of SEQ ID NO:1.
- the vector comprises a nucleic acid sequence encoding the amino acid sequence of SEQ ID NO:2. In aspects, the vector comprises a nucleic acid sequence encoding the amino acid sequence of SEQ ID NO:86. In aspects, the vector comprises a nucleic acid sequence encoding the amino acid sequence of SEQ ID NO:87. In aspects, the vector further includes a nucleic acid sequence encoding tRNA Pyl. [0264] In embodiments, the nucleic acid sequence encoding tRNA Pyl is: GGGGGACGGTCCGGCGACCAGCGGGTCTCTAAAACCTAGCCAGCGGGGTTCGACGCCCCGGTCTCTCGCCA (SEQ ID NO:3).
- the nucleic acid sequence encoding tRNA Pyl comprises the sequence set forth in SEQ ID NO:3. In aspects, the nucleic acid sequence encoding tRNA Pyl has a sequence that has at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO:3. In aspects, the nucleic acid sequence encoding tRNA Pyl has a sequence that has at least 80% sequence identity to SEQ ID NO:3. In aspects, the nucleic acid sequence encoding tRNA Pyl has a sequence that has at least 85% sequence identity to SEQ ID NO:3.
- the nucleic acid sequence encoding tRNA Pyl has a sequence that has at least 90% sequence identity to SEQ ID NO:3. In aspects, the nucleic acid sequence encoding tRNA Pyl has a sequence that has at least 95% sequence identity to SEQ ID NO:3. In aspects, the nucleic acid sequence encoding tRNA Pyl has a sequence that has at least 98% sequence identity to SEQ ID NO:3.
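To make the relationship between the encoding nucleic acid and the tRNA concrete, the sketch below converts the DNA listing of SEQ ID NO:3 into the corresponding RNA letters. It assumes that the listed strand is the coding (sense) strand, so that the transcript has the same sequence with U in place of T; that assumption is not stated in the text, and the function name `transcribe` is hypothetical.

```python
# SEQ ID NO:3 as listed (DNA), with the line-break space of the listing removed.
SEQ_3_DNA = (
    "GGGGGACGGTCCGGCGACCAGCGGGTCTCTAAAACCTAGCCAGCGGGGTTCGACGC"
    "CCCGGTCTCTCGCCA"
)


def transcribe(dna: str) -> str:
    """Return the RNA letters for a coding-strand DNA sequence (T -> U)."""
    dna = dna.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return dna.replace("T", "U")


trna_pyl = transcribe(SEQ_3_DNA)
print(len(trna_pyl))
print(trna_pyl)
```

The validation step rejects stray whitespace or ambiguity codes, which is a common source of error when sequences are copied out of a formatted listing.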
- vector refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked.
- plasmid refers to a linear or circular double stranded DNA loop into which additional DNA segments can be ligated.
- vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome.
- the disclosure provides a genome of a cell comprising a nucleic acid sequence encoding the pyrrolysyl-tRNA synthetase described herein (e.g., SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:86, or SEQ ID NO:87, including embodiments and aspects thereof).
- a nucleic acid sequence encoding the pyrrolysyl-tRNA synthetase described herein (e.g., SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:86, or SEQ ID NO:87, including embodiments and aspects thereof).
- the disclosure provides a genome of a cell comprising a nucleic acid sequence encoding the pyrrolysyl-tRNA synthetase described herein (e.g., SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:86, or SEQ ID NO:87, including embodiments and aspects thereof) and a nucleic acid encoding tRNA Pyl (e.g., SEQ ID NO:3, including embodiments and aspects thereof).
- certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as “expression vectors.” In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
- plasmid and “vector” can be used interchangeably as the plasmid is the most commonly used form of vector.
- the disclosure is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions. Additionally, some viral vectors are capable of targeting a particular cell type either specifically or non-specifically. Exemplary vectors that can be used include, but are not limited to, pEvol vector, pMP vector, pET vector, pTak vector, and pBad vector.
- a complex including a mutant pyrrolysyl-tRNA synthetase as described herein, including embodiments thereof; and fluorosulfonyloxybenzoyl-L-lysine (FSK) of Formula (A):
- the complex comprises a mutant pyrrolysyl-tRNA synthetase, including at least 5 amino acid residue substitutions within the substrate-binding site of the mutant pyrrolysyl-tRNA synthetase.
- the complex comprises a nucleic acid sequence encoding a mutant pyrrolysyl-tRNA synthetase comprising at least 5 amino acid residue substitutions in the amino acid sequence of SEQ ID NO:1.
- the at least 5 amino acid substitutions occur at the residues tyrosine at position 126, methionine at position 129, valine at position 168, histidine at position 227, and tyrosine at position 228 as set forth in the amino acid sequence of SEQ ID NO:1.
- the at least 5 amino acid substitutions are: (i) Y126G; (ii) M129A; (iii) V168F; (iv) H227T, H227S, or H227I; and (v) Y228P, in the amino acid sequence of SEQ ID NO:1.
- the mutant pyrrolysyl-tRNA synthetase comprises at least 6 amino acid residue substitutions in the amino acid sequence of SEQ ID NO:1.
- the at least 6 amino acid substitutions occur at the residues tyrosine at position 126, methionine at position 129, valine at position 168, histidine at position 227, tyrosine at position 228, and leucine at position 229 as set forth in the amino acid sequence of SEQ ID NO:1.
- the at least 6 amino acid residue substitutions are: (i) Y126G; (ii) M129A; (iii) V168F; (iv) H227T, H227S, or H227I; (v) Y228P; and (vi) L229V or L229I, in the amino acid sequence of SEQ ID NO:1.
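The substitution notation used above (e.g., Y126G: wild-type residue, 1-based position, replacement residue) can be applied programmatically. The following Python sketch is illustrative only: the function name `apply_substitutions` is hypothetical, and because SEQ ID NO:1 is not reproduced in this passage, the usage example runs on a short toy sequence with toy positions rather than on the real synthetase.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, substitutions: list[str]) -> str:
    """Return a copy of `sequence` with each point substitution applied.

    Each substitution is checked against the original sequence so that a
    mismatch between the stated wild-type residue and the actual residue
    (for example, a numbering offset) raises instead of passing silently.
    """
    residues = list(sequence)
    for sub in substitutions:
        match = _SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        wild_type, position, replacement = (
            match.group(1),
            int(match.group(2)),
            match.group(3),
        )
        index = position - 1  # notation is 1-based
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if sequence[index] != wild_type:
            raise ValueError(
                f"{sub}: expected {wild_type} at position {position}, "
                f"found {sequence[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# Toy sequence and toy positions (hypothetical, not SEQ ID NO:1).
toy_variant = apply_substitutions(
    "MYMVHYL", ["Y2G", "M3A", "V4F", "H5T", "Y6P", "L7V"]
)
```

The wild-type check is the main design choice here: it catches a substitution whose stated residue does not match the sequence at that position.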
- the complex comprises a nucleic acid sequence encoding the amino acid sequence of SEQ ID NO:2. In aspects, the complex comprises a nucleic acid sequence encoding the amino acid sequence of SEQ ID NO:86. In aspects, the complex comprises a nucleic acid sequence encoding the amino acid sequence of SEQ ID NO:87. [0269] In embodiments, the complex comprises a mutant pyrrolysyl-tRNA synthetase as described herein, including embodiments thereof; fluorosulfonyloxybenzoyl-L-lysine (FSK); and tRNA Pyl as described herein, including embodiments thereof. In aspects, the tRNA Pyl comprises the amino acid sequence encoded by SEQ ID NO:3.
- the tRNA Pyl has at least 80% sequence identity to the amino acid sequence encoded by SEQ ID NO:3. In aspects, the tRNA Pyl has at least 85% sequence identity to the amino acid sequence encoded by SEQ ID NO:3. In aspects, the tRNA Pyl has at least 90% sequence identity to the amino acid sequence encoded by SEQ ID NO:3. In aspects, the tRNA Pyl has at least 95% sequence identity to the amino acid sequence encoded by SEQ ID NO:3.
- the compositions comprise fluorosulfonyloxybenzoyl-L-lysine (FSK) having Formula (A).
- the compositions further comprise components of an in vitro translation system, a variant pyrrolysyl-tRNA synthetase as described herein (including embodiments and aspects thereof), a tRNA Pyl as described herein (including embodiments and aspects thereof), a vector as described herein (including embodiments and aspects thereof), a complex as described herein (including embodiments and aspects thereof), or a combination of two or more thereof.
- the compositions comprise a variant pyrrolysyl-tRNA synthetase as described herein, including embodiments and aspects thereof.
- the compositions comprise a FSK, a tRNA Pyl as described herein (including embodiments and aspects thereof), a complex as described herein (including embodiments and aspects thereof), a vector as described herein (including embodiments and aspects thereof), or a combination of two or more thereof.
- the compositions comprise a tRNA Pyl as described herein, including embodiments and aspects thereof.
- the cell further comprises FSK, a variant pyrrolysyl-tRNA synthetase as described herein (including embodiments and aspects thereof), a complex as described herein (including embodiments and aspects thereof), a vector as described herein (including embodiments and aspects thereof), or a combination of two or more thereof.
- compositions comprise FSK having Formula (A) and one or more compounds selected from the group consisting of tRNA (e.g., as described herein), a cofactor (e.g., initiation factor, elongation factor, termination factor), an energy regenerating system (e.g., creatine phosphate and/or creatine phosphokinase for a eukaryotic system, and phosphoenol pyruvate and/or pyruvate kinase for a bacterial system), a peptide, a salt (e.g., a magnesium salt, a potassium salt), a protein, and a ribosome (e.g., 70S ribosomes, 80S ribosomes).
- compositions comprise FSK having Formula (A) and a compound selected from the group consisting of tRNA, a cofactor, an energy regenerating system, a salt, a protein, a ribosome, and a combination of two or more thereof.
- compositions comprise FSK having Formula (A) and a compound selected from the group consisting of a cofactor, an energy regenerating system, a salt, a ribosome, and a combination of two or more thereof.
- the compositions comprise FSK having Formula (A) and a compound selected from the group consisting of tRNA, an initiation factor, an elongation factor, a termination factor, creatine phosphate, creatine phosphokinase, a magnesium salt, a potassium salt, an 80S ribosome, and a combination of two or more thereof.
- the compositions comprise FSK having Formula (A) and a compound selected from the group consisting of tRNA, an initiation factor, an elongation factor, a termination factor, phosphoenol pyruvate, pyruvate kinase, a magnesium salt, a potassium salt, a 70S ribosome, and a combination of two or more thereof.
- the disclosure provides an in vitro translation system comprising a biomolecule as described herein, e.g., a biomolecule of Formula (B), Formula (C), Formula (D), Formula (E), Formula (I), Formula (II), Formula (III), including embodiments and aspects thereof.
- the in vitro translation system is a wheat germ extract in vitro translation system or a rabbit reticulocyte lysate in vitro translation system.
- the in vitro translation system is a wheat germ extract in vitro translation system.
- the in vitro translation system is a rabbit reticulocyte lysate in vitro translation system.
- the disclosure provides cells comprising the compositions and complexes provided herein, including embodiments and aspects thereof.
- the cell comprises fluorosulfonyloxybenzoyl-L-lysine (FSK) having Formula (A).
- the cell further comprises a variant pyrrolysyl-tRNA synthetase as described herein (including embodiments and aspects thereof), a tRNA Pyl as described herein (including embodiments and aspects thereof), a vector as described herein (including embodiments and aspects thereof), a complex as described herein (including embodiments and aspects thereof), or a combination of two or more thereof.
- the cell comprises a variant pyrrolysyl-tRNA synthetase as described herein, including embodiments and aspects thereof.
- the cell further comprises a FSK, a tRNA Pyl as described herein (including embodiments and aspects thereof), a complex as described herein (including embodiments and aspects thereof), a vector as described herein (including embodiments and aspects thereof), or a combination of two or more thereof.
- the cell comprises a tRNA Pyl as described herein, including embodiments and aspects thereof.
- the cell further comprises FSK, a variant pyrrolysyl-tRNA synthetase as described herein (including embodiments and aspects thereof), a complex as described herein (including embodiments and aspects thereof), a vector as described herein (including embodiments and aspects thereof), or a combination of two or more thereof.
- the cell comprises a vector as described herein, including embodiments and aspects thereof.
- the cell further comprises FSK, a variant pyrrolysyl-tRNA synthetase as described herein (including embodiments and aspects thereof), a tRNA Pyl as described herein (including embodiments and aspects thereof), a complex as described herein (including embodiments and aspects thereof), or a combination of two or more thereof.
- the cell comprises a complex as described herein, including embodiments and aspects thereof.
- the cell further comprises FSK, a variant pyrrolysyl-tRNA synthetase as described herein (including embodiments and aspects thereof), a tRNA Pyl as described herein (including embodiments and aspects thereof), a vector as described herein (including embodiments and aspects thereof) or a combination of two or more thereof.
- FSK is biosynthesized inside the cell, thereby generating a cell containing FSK.
- FSK is contained in the medium outside the cell and penetrates into the cell, thereby generating a cell containing FSK.
- the cell comprises an FSK biomolecule.
- the cell comprises an FSK protein. In aspects, the cell comprises an FSK biomolecule that is synthesized inside the cell. In aspects, the cell comprises an FSK protein that is synthesized inside the cell. In aspects, the cell comprises an FSK biomolecule that is synthesized outside a cell, and that penetrates into the cell. In aspects, the cell comprises an FSK protein that is synthesized outside a cell, and that penetrates into the cell. [0284] In embodiments, the cell comprises the biomolecule conjugates described herein.
- the cell comprises a biomolecule conjugate comprising a first biomolecule moiety conjugated to a second biomolecule moiety through a bioconjugate linker, wherein the bioconjugate linker has Formula (D). In aspects, the cell comprises a biomolecule conjugate of the formula R 1 -L 1 -A-X 1 -L 2 -R 2 , wherein the substituents are as defined herein.
- the first and second biomolecule moieties are each independently a peptidyl moiety, a nucleic acid moiety, or a carbohydrate moiety. In aspects, the first and second biomolecule moieties are each a peptidyl moiety within the same protein.
- the first and second biomolecule moieties are each a peptidyl moiety within different proteins (e.g., within a single- domain antibody and within a membrane receptor).
- the cell comprises a protein which comprises a moiety of Formula (IV), a moiety of Formula (V), or a moiety of Formula (VI):
- the cell comprises a protein of Formula (I), Formula (II), or Formula (III), wherein R 1 and R 2 are each independently a peptidyl moiety.
- R 1 and R 2 are bonded together, such that the protein of Formula (I), (II), or (III) comprises an intramolecular bond.
- R 1 and R 2 are a peptidyl moiety in two different proteins, such that the protein of Formula (I), (II), and (III) comprises an intermolecular bond between two proteins.
- R 1 is a peptidyl moiety in a single-domain antibody and R 2 is a peptidyl moiety in a membrane receptor. In embodiments, R 1 is a peptidyl moiety in a membrane receptor and R 2 is a peptidyl moiety in a single-domain antibody.
- a cell can be any prokaryotic or eukaryotic cell. In aspects, the cell is prokaryotic. In aspects, the cell is eukaryotic. In aspects, the cell is a bacterial cell, a fungal cell, a plant cell, an archaeal cell, or an animal cell. In aspects, the animal cell is an insect cell or a mammalian cell. In aspects, the cell is a bacterial cell.
- the cell is a fungal cell. In aspects, the cell is a plant cell. In aspects, the cell is an archaeal cell. In aspects, the cell is an animal cell. In aspects, the cell is an insect cell. In aspects, the cell is a mammalian cell. In aspects, the cell is a human cell.
- any of the compositions described herein can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such as HeLa cells, Chinese hamster ovary (CHO) cells, or COS cells).
- the cell is a premature mammalian cell, i.e., a pluripotent stem cell. In aspects, the cell is derived from other human tissue.
- compositions provided herein are useful for forming a biomolecule or biomolecule conjugate.
- a method of forming an FSK biomolecule by contacting a biomolecule, a mutant pyrrolysyl-tRNA synthetase, a tRNA Pyl , and fluorosulfonyloxybenzoyl-L-lysine (FSK) having Formula (A), thereby producing the FSK biomolecule, i.e., a biomolecule comprising the unnatural amino acid of FSK.
- the biomolecule produced by the method will comprise the unnatural amino acid side chain of Formula (F):
- the mutant pyrrolysyl-tRNA synthetase used in the method of producing the biomolecule is any described herein.
- the tRNA Pyl used in the method of producing the biomolecule is any described herein.
- the biomolecule is a protein.
- the biomolecule is a nucleic acid.
- the biomolecule is a carbohydrate.
- the reaction is performed in vitro.
- the reaction is performed in vivo.
- the reaction is performed in one or more living cells.
- the reaction is performed in one or more living bacterial cells.
- the reaction is performed in one or more living mammalian cells.
- the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, and an animal cell. In aspects, the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, an insect cell, and a mammalian cell.
- the disclosure provides methods for producing an FSK protein by contacting a protein, a mutant pyrrolysyl-tRNA synthetase, a tRNA Pyl , and fluorosulfonyloxybenzoyl-L-lysine (FSK) having Formula (A), thereby producing the FSK protein, i.e., a protein comprising the unnatural amino acid of FSK.
- the protein produced by the method will comprise the unnatural amino acid side chain of Formula (F):
- the mutant pyrrolysyl-tRNA synthetase used in the method of producing the protein is any described herein.
- the tRNA Pyl used in the method of producing the protein is any described herein.
- the FSK protein further comprises lysine, histidine, tyrosine, or two or more thereof. In aspects, the FSK protein comprises FSK that is proximal to lysine, histidine, tyrosine, or two or more thereof. In aspects, the FSK protein comprises FSK that is proximal to lysine. In aspects, the FSK protein comprises FSK that is proximal to histidine. In aspects, the FSK protein comprises FSK that is proximal to tyrosine. The term “proximal” is described herein. In aspects, the reaction is performed in vitro. In aspects, the reaction is performed in vivo. In aspects, the reaction is performed in one or more living cells.
- the reaction is performed in one or more living bacterial cells. In aspects, the reaction is performed in one or more living mammalian cells. In aspects, the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, and an animal cell. In aspects, the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, an insect cell, and a mammalian cell.
- the disclosure provides methods for forming a biomolecule conjugate by contacting a first biomolecule moiety which comprises FSK with a second biomolecule moiety, wherein the second biomolecule moiety is reactive with the FSK in the first biomolecule moiety, thereby forming a biomolecule conjugate.
- the first biomolecule moiety which comprises FSK is a compound of Formula (B), wherein the substituents are as defined herein.
- the first biomolecule moiety which comprises FSK is a compound of Formula (C), wherein the substituents are as defined herein.
- the first biomolecule moiety which comprises FSK is a biomolecule having an amino acid side chain of Formula (F). In aspects, the second biomolecule moiety comprises a lysine, histidine, or tyrosine that is reactive with the FSK in the first biomolecule.
- the reaction to form the biomolecule conjugate occurs by proximity-enabled, click chemistry (e.g., between the FSK on the first biomolecule moiety and the lysine, histidine, or tyrosine on the second biomolecule moiety).
- the reaction to form the biomolecule conjugate occurs by a sulfur-fluoride exchange reaction (e.g., between the FSK on the first biomolecule moiety and the lysine, histidine, or tyrosine on the second biomolecule moiety).
- the reaction to form the biomolecule conjugate occurs by a proximity-enabled, sulfur-fluoride exchange reaction (e.g., between the FSK on the first biomolecule moiety and the lysine, histidine, or tyrosine on the second biomolecule moiety).
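The proximity requirement described above can be sketched as a simple geometric filter: given the position of the FSK reactive group and the positions of candidate side chains, keep only lysine, histidine, and tyrosine residues within a distance cutoff. This Python sketch is a rough illustration. The function name, the input layout, and the cutoff are hypothetical, and the cutoff is left as a required argument because this passage does not give a numerical definition of "proximal".

```python
import math

# Residue types named in the text as reactive toward FSK.
REACTIVE_RESIDUES = {"LYS", "HIS", "TYR"}


def proximal_reactive_residues(fsk_xyz, residues, cutoff_angstrom):
    """List reactive residues within `cutoff_angstrom` of the FSK group.

    fsk_xyz: (x, y, z) of the FSK reactive group.
    residues: iterable of (three_letter_name, residue_number, (x, y, z)),
        where the coordinate is that of the nucleophilic side-chain atom.
    cutoff_angstrom: caller-supplied distance cutoff (an assumption of
        this sketch, not a value taken from the disclosure).
    Returns (name, number, distance) tuples sorted by distance.
    """
    hits = []
    for name, number, xyz in residues:
        if name.upper() not in REACTIVE_RESIDUES:
            continue  # only Lys, His, and Tyr are considered
        distance = math.dist(fsk_xyz, xyz)
        if distance <= cutoff_angstrom:
            hits.append((name.upper(), number, distance))
    return sorted(hits, key=lambda hit: hit[2])


# Toy coordinates for illustration only.
example_hits = proximal_reactive_residues(
    (0.0, 0.0, 0.0),
    [
        ("LYS", 10, (3.0, 4.0, 0.0)),
        ("TYR", 20, (0.0, 0.0, 20.0)),
        ("ALA", 30, (1.0, 0.0, 0.0)),
        ("HIS", 40, (0.0, 2.0, 0.0)),
    ],
    cutoff_angstrom=6.0,
)
```

A distance filter of this kind only nominates candidate sites; whether a given pair actually reacts also depends on orientation and local environment, which the sketch ignores.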
- the reaction is performed in vitro.
- the reaction is performed in vivo.
- the reaction is performed in one or more living cells.
- the reaction is performed in one or more living bacterial cells.
- the reaction is performed in one or more living mammalian cells. In aspects, the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, and an animal cell. In aspects, the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, an insect cell, and a mammalian cell. [0292] In embodiments, the disclosure provides proteins comprising one or more intramolecular covalent bonds (e.g., a protein conjugate).
- FSK and the proximal lysine, histidine, or tyrosine undergo a reaction to form the intramolecular covalent bond, resulting in a moiety of Formula (IV), a moiety of Formula (V), or a moiety of Formula (VI), or a combination of two or more thereof. In aspects, FSK and the lysine, histidine, or tyrosine that are proximal thereto can be on an α-strand of the protein and/or a β-strand of the protein.
- the reaction to form the intramolecular covalent bond between FSK and the lysine, histidine, or tyrosine is accomplished through click chemistry.
- the reaction to form the intramolecular covalent bond between FSK and the lysine, histidine, or tyrosine is accomplished through proximity-enabled, click chemistry. In aspects, the reaction to form the intramolecular covalent bond between FSK and the lysine, histidine, or tyrosine is accomplished through a sulfur-fluoride exchange reaction. In aspects, the reaction to form the intramolecular covalent bond between FSK and the lysine, histidine, or tyrosine is accomplished through a proximity-enabled, sulfur-fluoride exchange reaction. In aspects, the reaction is performed in vitro. In aspects, the reaction is performed in vivo. In aspects, the reaction is performed in one or more living cells.
- the reaction is performed in one or more living bacterial cells. In aspects, the reaction is performed in one or more living mammalian cells. In aspects, the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, and an animal cell. In aspects, the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, an insect cell, and a mammalian cell.
- the disclosure provides protein conjugates of Formula (I), (II), or (III) wherein R 1 and R 2 are each independently a peptidyl moiety. In aspects, R 1 and R 2 are joined together to form an intramolecularly conjugated protein. In aspects, R 1 and R 2 are not joined together.
- the reaction to form the protein conjugates is accomplished through click chemistry.
- the reaction to form the protein conjugate is accomplished through proximity-enabled, click chemistry.
- the reaction to form the protein conjugate is accomplished through a sulfur-fluoride exchange reaction.
- the reaction to form the protein conjugate is accomplished through a proximity-enabled, sulfur-fluoride exchange reaction.
- the reaction is performed in vitro. In aspects, the reaction is performed in vivo.
- the reaction is performed in one or more living cells. In aspects, the reaction is performed in one or more living bacterial cells. In aspects, the reaction is performed in one or more living mammalian cells. In aspects, the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, and an animal cell. In aspects, the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, an insect cell, and a mammalian cell. [0294] In embodiments, two or more proteins can be covalently linked by the methods and compositions described herein.
- FSK is an unnatural amino acid in a first protein and lysine, histidine, or tyrosine are amino acids in a second protein, wherein the first protein and the second protein are different.
- the FSK in the first protein undergoes a reaction with the lysine, histidine, or tyrosine in the second protein to form an intermolecular covalent bond between the first and second proteins.
- the intermolecular covalent bond linking the two proteins is represented by a moiety of Formula (IV), moiety of Formula (V), moiety of Formula (VI), or a combination of two or more thereof. In aspects, FSK and the lysine, histidine, or tyrosine can be on an α-strand of their respective proteins and/or a β-strand of their respective proteins.
- the reaction to form the intermolecular covalent bond between FSK in the first protein and the lysine, histidine, or tyrosine in the second protein is accomplished through click chemistry. In aspects, the reaction to form the intermolecular covalent bond between FSK in the first protein and the lysine, histidine, or tyrosine in the second protein is accomplished through proximity-enabled, click chemistry. In aspects, the reaction to form the intermolecular covalent bond between FSK in the first protein and the lysine, histidine, or tyrosine in the second protein is accomplished through sulfur-fluoride exchange.
- the reaction to form the intermolecular covalent bond between FSK in the first protein and the lysine, histidine, or tyrosine in the second protein is accomplished through proximity-enabled, sulfur-fluoride exchange.
- the reaction is performed in vitro.
- the reaction is performed in vivo.
- the reaction is performed in one or more living cells.
- the reaction is performed in one or more living bacterial cells.
- the reaction is performed in one or more living mammalian cells.
- the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, and an animal cell.
- the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, an insect cell, and a mammalian cell.
- the disclosure provides biomolecule conjugates comprising a first biomolecule moiety conjugated to a second biomolecule moiety through a bioconjugate linker, wherein the bioconjugate linker has Formula (D). In aspects, the biomolecule conjugate has the formula R 1 -L 1 -A-X 1 -L 2 -R 2 , where the substituents are as defined herein.
- the reaction to form the biomolecule conjugates is accomplished through click chemistry.
- the reaction to form the biomolecule conjugate is accomplished through proximity-enabled, click chemistry. In aspects, the reaction to form the biomolecule conjugate is accomplished through a sulfur-fluoride exchange reaction. In aspects, the reaction to form the biomolecule conjugate is accomplished through a proximity-enabled, sulfur-fluoride exchange reaction. In aspects, the reaction is performed in vitro. In aspects, the reaction is performed in vivo. In aspects, the reaction is performed in one or more living cells. In aspects, the reaction is performed in one or more living bacterial cells. In aspects, the reaction is performed in one or more living mammalian cells.
- the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, and an animal cell. In aspects, the reaction is performed in one or more cells selected from the group consisting of a bacterial cell, a fungal cell, a plant cell, an archaeal cell, an insect cell, and a mammalian cell.
Methods of Binding a Target
- [0297] Provided herein are biomolecules having the structure of Formula (C), wherein R 1 is a small molecule moiety, an amino acid moiety, or a peptidyl moiety. In embodiments, R 1 is a small molecule moiety.
- R 1 is an amino acid moiety or a peptidyl moiety. In embodiments, R 1 is an amino acid moiety. In embodiments, R 1 is a peptidyl moiety. In embodiments, R 1 is an antibody, an antigen-binding fragment, a single-chain variable fragment, a single-domain antibody, or an affibody. In embodiments, R 1 is an antibody. In embodiments, R 1 is an antigen-binding fragment. In embodiments, R 1 is a single-chain variable fragment. In embodiments, R 1 is a single-domain antibody. In embodiments, R 1 is an affibody. In embodiments, R 1 is capable of binding to a target.
- R 1 is capable of binding to a target on a surface of a cell.
- the target on the surface of the cell is a receptor.
- the receptor is a membrane receptor or a hormone receptor.
- the target is a receptor selected from the group consisting of an acetylcholine receptor, an adenosine receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor (EGFR), a formyl peptide receptor, a free fatty acid receptor, a galanin receptor, a ghrelin receptor, a glycoprotein hormone receptor, a gonadotrophin-releasing hormone receptor,
- the target is PD-1 or PD-L1. In embodiments, the target is PD-1. In embodiments, the target is PD-L1. In embodiments, the target is a protein, a nucleic acid, or a carbohydrate. In embodiments, the target is a protein. In embodiments, the target is a nucleic acid. In embodiments, the target is a carbohydrate. [0299] Provided herein are methods of binding a target on a cell comprising contacting the cell with the biomolecule of Formula (B) or the biomolecule of Formula (C), wherein the biomolecule is capable of specifically binding to the target on the surface of the cell, whereby the biomolecule forms a covalent bond with the target.
- the method comprises contacting the cell with the biomolecule of Formula (B), wherein the biomolecule is capable of specifically binding to the target on the surface of the cell, whereby the biomolecule forms a covalent bond with the target.
- the method comprises contacting the cell with the biomolecule of Formula (C), wherein the biomolecule is capable of specifically binding to the target on the surface of the cell, whereby the biomolecule forms a covalent bond with the target.
- the covalent bond is formed through a sulfur-fluoride exchange reaction.
- the covalent bond is formed through a proximity-enabled, sulfur-fluoride exchange reaction.
- the biomolecule and the target are covalently linked by a bioconjugate linker having the structure of Formula (D). [0300]
- “Target” refers to any compound which is capable of binding covalently or non- covalently with R 1 (e.g., a protein).
- a “target” comprises, without limitation, small molecules, peptides, proteins, enzymes, antibodies, antigens, lipids, metabolites, hormones, carbohydrates, nucleic acids, cells, receptors, viruses, or any other moiety which is capable of binding covalently or non-covalently with R 1 .
- Both R 1 and the amino acid side chain thereof (i.e., Formula (F)) can bind the target.
- R 1 may engage the target first through non-covalent binding, followed by covalent binding through the FSK amino acid side chain.
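The two-step engagement described above (reversible non-covalent binding followed by covalent bond formation) is often summarized with the standard two-step kinetic scheme for covalent binders. The scheme below is offered as an illustrative assumption, not as a model stated in the disclosure. B denotes the FSK-containing biomolecule, T the target, B·T the non-covalent complex, and B–T the covalent adduct; the rate constants k_on, k_off, and k_cov are hypothetical symbols, and the expression for k_obs assumes B is in large excess over T.

```latex
\[
\mathrm{B} + \mathrm{T}
\;\underset{k_{\mathrm{off}}}{\overset{k_{\mathrm{on}}}{\rightleftharpoons}}\;
\mathrm{B}\cdot\mathrm{T}
\;\xrightarrow{\,k_{\mathrm{cov}}\,}\;
\mathrm{B}\text{--}\mathrm{T}
\]
\[
k_{\mathrm{obs}} = \frac{k_{\mathrm{cov}}\,[\mathrm{B}]}{K_{I} + [\mathrm{B}]},
\qquad
K_{I} = \frac{k_{\mathrm{off}} + k_{\mathrm{cov}}}{k_{\mathrm{on}}}
\]
```

Under this scheme the non-covalent step sets how much complex forms at a given concentration, while the covalent step sets how quickly the complex is converted into the linked product.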
Embodiments 1 to 57
- [0302] Embodiment 1. A biomolecule conjugate comprising a first biomolecule moiety conjugated to a second biomolecule moiety through a bioconjugate linker, wherein the bioconjugate linker has the formula: . [0303] Embodiment 2.
- Embodiment 3. The biomolecule conjugate of Embodiment 2, wherein L 1 is a bond, -S(O) 2 -, -NR 3A -, -O-, -S-, -C(O)-, -C(O)NR 3A -, -NR 3A C(O)-, -NR 3A C(O)NR 3B -, -C(O)O-, -OC(O)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene; L 2 is a bond, -S(O) 2 -, -NR 4A -, -O-, -S-, -C(O)-, -C(O)
- Embodiment 4. The biomolecule conjugate of Embodiment 2 or 3, wherein X 1 is -NH-, -O-, or imidazolylene.
- Embodiment 5. The biomolecule conjugate of any one of Embodiments 1 to 4, wherein the first biomolecule moiety is a peptidyl moiety, a nucleic acid moiety, or a carbohydrate moiety.
- Embodiment 6. The biomolecule conjugate of Embodiment 5, wherein the first biomolecule moiety is a peptidyl moiety; and wherein the peptidyl moiety is covalently bonded to the bioconjugate linker via lysine, histidine, or tyrosine.
- Embodiment 7. The biomolecule conjugate of any one of Embodiments 1 to 4, wherein the second biomolecule moiety is a peptidyl moiety, a nucleic acid moiety, or a carbohydrate moiety.
- Embodiment 8. The biomolecule conjugate of Embodiment 7, wherein the second biomolecule moiety is a peptidyl moiety; and wherein the peptidyl moiety is covalently bonded to the bioconjugate linker via lysine, histidine, or tyrosine.
- Embodiment 9. The biomolecule conjugate of any one of Embodiments 2 to 4, wherein –L 1 -R 1 is a peptidyl moiety, a nucleic acid moiety, or a carbohydrate moiety.
- Embodiment 10. The biomolecule conjugate of any one of Embodiments 2 to 4, wherein –L 2 -R 2 is a peptidyl moiety, a nucleic acid moiety, or a carbohydrate moiety.
- Embodiment 11. The biomolecule of any one of Embodiments 5 to 10, wherein the peptidyl moiety comprises a single-domain antibody or a membrane receptor.
- Embodiment 13. The biomolecule conjugate of any one of Embodiments 1 to 11, wherein the bioconjugate linker is an intermolecular linker.
- Embodiment 14. The biomolecule conjugate of any one of Embodiments 1 to 11, wherein the bioconjugate linker is an intramolecular linker.
- Embodiment 16. The protein of Embodiment 15, wherein the protein is of Formula (I).
- Embodiment 17. The protein of Embodiment 15, wherein the protein is of Formula (II).
- Embodiment 18. The protein of Embodiment 15, wherein the protein is of Formula (III).
- Embodiment 19. The protein of any one of Embodiments 15 to 18, wherein R 1 and R 2 each independently comprise a protein α-strand or a protein β-strand.
- Embodiment 20 The protein of any one of Embodiments 15 to 19, wherein R 1 and R 2 are not joined together.
- Embodiment 21 The protein of any one of Embodiments 15 to 20, wherein the peptidyl moiety of R 1 comprises a single-domain antibody and the peptidyl moiety of R 2 comprises a membrane receptor.
- Embodiment 22 The protein of any one of Embodiments 15 to 20, wherein the peptidyl moiety of R 1 comprises a membrane receptor and the peptidyl moiety of R 2 comprises a single-domain antibody.
- Embodiment 23 Embodiment 23.
- Embodiment 24 A pyrrolysyl-tRNA synthetase comprising at least 6 amino acid residue substitutions within the substrate-binding site of the pyrrolysyl-tRNA synthetase having the amino acid sequence of SEQ ID NO:1; wherein the substrate-binding site comprises residues tyrosine at position 126, methionine at position 129, valine at position 168, histidine at position 227, tyrosine at position 228, and leucine at position 229 as set forth in the amino acid sequence of SEQ ID NO:1.
- Embodiment 25 The pyrrolysyl-tRNA synthetase of Embodiment 24, wherein the at least 6 amino acid residue substitutions in the amino acid sequence of SEQ ID NO:1 are: (i) Y126G; (ii) M129A; (iii) V168F; (iv) H227T, H227S, or H227I; (v) Y228P; and (vi) L229V or L229I.
- Embodiment 26 The pyrrolysyl-tRNA synthetase of Embodiment 24, comprising an amino acid sequence of SEQ ID NO:2.
- Embodiment 27 A vector comprising a nucleic acid sequence encoding the pyrrolysyl-tRNA synthetase of any one of Embodiments 24 to 26.
- Embodiment 28 The vector of Embodiment 27, further comprising a nucleic acid encoding tRNA Pyl .
- Embodiment 29 A complex comprising the pyrrolysyl-tRNA synthetase of any one of Embodiments 24 to 26 and a fluorosulfonyloxybenzoyl-L-lysine having the following formula: .
- Embodiment 30 The complex of Embodiment 29, further comprising a tRNA Pyl .
- Embodiment 31 A cell comprising the biomolecule conjugate of any one of Embodiments 1 to 14.
- Embodiment 32 A cell comprising the protein of any one of Embodiments 15 to 23.
- Embodiment 33 A cell comprising the pyrrolysyl-tRNA synthetase of any one of Embodiments 24 to 26.
- Embodiment 34 A cell comprising the vector of Embodiment 27 or 28.
- Embodiment 35 A cell comprising the complex of Embodiment 29 or 30.
- Embodiment 36 A cell comprising fluorosulfonyloxybenzoyl-L-lysine of the formula: .
- Embodiment 37 The cell of Embodiment 36, further comprising a pyrrolysyl-tRNA synthetase comprising at least 6 amino acid residue substitutions within the substrate-binding site of the pyrrolysyl-tRNA synthetase having the amino acid sequence of SEQ ID NO:1; wherein the substrate-binding site comprises residues tyrosine at position 126, methionine at position 129, valine at position 168, histidine at position 227, tyrosine at position 228, and leucine at position 229 as set forth in the amino acid sequence of SEQ ID NO:1.
- Embodiment 38 The cell of Embodiment 37, wherein the at least 6 amino acid residue substitutions in the amino acid sequence of SEQ ID NO:1 are: (i) Y126G; (ii) M129A; (iii) V168F; (iv) H227T, H227S, or H227I; (v) Y228P; and (vi) L229V or L229I.
- Embodiment 39 The cell of Embodiment 37, wherein the pyrrolysyl-tRNA synthetase comprises an amino acid sequence of SEQ ID NO:2.
- Embodiment 40 The cell of any one of Embodiments 36 to 39, further comprising a tRNA Pyl .
- Embodiment 41 The cell of any one of Embodiments 31 to 40, wherein the cell is a bacterial cell or a mammalian cell.
- Embodiment 42 A method of forming the biomolecule conjugate of Embodiment 13, the method comprising contacting a fluorosulfonyloxybenzoyl-L-lysine moiety within a fluorosulfonyloxybenzoyl-L-lysine biomolecule with a compound comprising the second biomolecule moiety, wherein the second biomolecule is reactive with the fluorosulfonyloxybenzoyl-L-lysine moiety; thereby forming the biomolecule conjugate having an intermolecular linker.
- Embodiment 43 A method of forming the biomolecule conjugate of Embodiment 14, the method comprising contacting a fluorosulfonyloxybenzoyl-L-lysine moiety within a fluorosulfonyloxybenzoyl-L-lysine biomolecule with a second biomolecule moiety in the fluorosulfonyloxybenzoyl-L-lysine biomolecule, wherein the second biomolecule is reactive with the fluorosulfonyloxybenzoyl-L-lysine moiety; thereby forming the biomolecule conjugate having an intramolecular linker.
- Embodiment 44 A method of forming the biomolecule conjugate of Embodiment 14, the method comprising contacting a fluorosulfonyloxybenzoyl-L-lysine moiety within a fluorosulfonyloxybenzoyl-L-lysine biomolecule with a second biomolecule moiety in the fluorosulfonyloxybenzoy
- Embodiment 45 The method of any one of Embodiments 42 to 44, further comprising, prior to contacting, the step contacting a biomolecule, a pyrrolysyl-tRNA synthetase of any one of Embodiments 24 to 26, a tRNA Pyl , and a fluorosulfonyloxybenzoyl-L-lysine having the fluorosulfonyloxybenzoyl-L-lysine biomolecule.
- Embodiment 46 Embodiment 46.
- a fluorosulfonyloxybenzoyl-L-lysine protein with a second protein comprising lysine, histidine, or tyrosine; thereby forming the intramolecularly conjugated protein.
- Embodiment 46 or 47 further comprising producing the fluorosulfonyloxybenzoyl-L-lysine protein, the method comprising contacting a protein, a pyrrolysyl-tRNA synthetase of any one of Embodiments 24 to 26, a tRNA Pyl , and fluorosulfonyloxybenzoyl-L-lysine having the formula: thereby producing the fluorosulfonyloxybenzoyl-L-lysine protein.
- Embodiment 49 The method of Embodiment 48, wherein contacting comprises a sulfur-fluoride exchange reaction.
- Embodiment 50 The method of Embodiment 48, wherein contacting comprises a proximity-enabled, sulfur-fluoride exchange reaction.
- Embodiment 51 The method of any one of Embodiments 46 to 50, wherein contacting is performed within a cell.
- Embodiment 52 A protein comprising an unnatural amino acid; wherein the unnatural amino acid has a side chain of formula: .
- Embodiment 53 The protein of Embodiment 52, wherein the protein is a single- domain antibody.
- Embodiment 54 The protein of Embodiment 52, wherein the protein is a membrane receptor.
- Embodiment 55 Embodiment 55.
- Embodiment 56 A protein comprising a moiety of Formula (IV), a moiety of Formula (V), a moiety of Formula (VI), or a combination of two or more thereof:
- Embodiment 57 A cell comprising the protein of any one of Embodiments 52 to 56.
- FSK complements FSY enabling the introduction of covalent bonds via SuFEx chemistry into a broader range of protein sites for general applications.
- EXAMPLES
- The following examples are intended to further illustrate certain embodiments and aspects of the disclosure. The examples are not intended to limit the spirit or scope of the disclosure or claims.
- Example 1 In nature, protein side chains spontaneously form covalent linkages only through cysteine residues. This natural barrier has been overcome by introducing into proteins new covalent bonds formed between a genetically encoded latent bioreactive unnatural amino acid (Uaa) and a nearby natural residue via proximity-enabled reactivity. (Refs: 1, 2).
- Fluorosulfate is of particular interest for its exceptional biocompatibility, proximity-dependent reactivity, and multi-targeting ability.
- It is an excellent latent group that does not react randomly with non-interacting proteins but reacts efficiently with nucleophilic residues, including His, Lys, and Tyr, only when they are located in close proximity.
- The inventors recently genetically encoded fluorosulfate-L-tyrosine (FSY) and demonstrated its use not only for protein crosslinking but also for generating covalent protein drugs against cancer in vivo. (Refs: 7, 18).
- FSY has a relatively rigid side chain and a limited reaction radius, so it cannot crosslink a target residue located further away in space.
- fluorosulfate for generating covalent bonds for proteins
- The inventors designed and genetically encoded fluorosulfonyloxybenzoyl-L-lysine (FSK), which bears a long aliphatic side chain offering greater flexibility and a longer reaction distance than FSY.
- FIG.7-8. Western blot analysis of EGFP(182TAG) expression also showed that FSKRS incorporates only FSK and no natural amino acids (FIG.9).
- The inventors also tested FSK incorporation into superfolder GFP (sfGFP) at site 2 and site 151. At both positions, strong sfGFP fluorescence was detected when FSK was added to the growth media, in comparison to control samples without FSK (FIG. 10), confirming the specificity of FSKRS for the Uaa.
- FSK enables inter-protein crosslinking at distances unreachable with FSY in cells
- The inventors next tested the ability of FSK to crosslink target residues that were too far away for FSY.
- The inventors chose to incorporate FSY or FSK at site 65 of ecGST, around which multiple nucleophilic residues (Lys93, Tyr100, Lys132, Tyr135) reside at distances to the alpha carbon spanning from 9.2 to 13.3 Å (FIG.2C). This distance should be favorable for FSK to react but too far for FSY.
- Trx: E. coli thioredoxin
- PAPS: 3’-phosphoadenosine-5’-phosphosulfate
- FSK enables covalent bonding of proteins intramolecularly
- Genetically introducing intramolecular crosslinks within a peptide or protein is an innovative way to staple or bridge protein residues for engineering properties such as thermostability and cell permeability. Current methods mainly rely on disulfide bond formation between two Cys residues or on targeting the thiol group of Cys with a halogen or a Michael acceptor installed on a bioreactive Uaa. This greatly limits the number of conformations and configurations that can be created for the crosslinked peptides or proteins.
- FSK and FSY enable covalent targeting of EGFR with a nanobody at different sites
- The ability to covalently target native receptors on cells and in vivo with various protein binders, such as antibodies and nanobodies, would afford powerful avenues for imaging, diagnostics, and therapeutics. EGFR is a valuable marker for various cancers, so we aimed to covalently target it with nanobodies. (Ref: 24).
- EGFR was crosslinked by 7D12(109FSY) but not by 7D12(109FSK) in SDS-PAGE and Western blot analysis (FIG.15). Furthermore, we also assessed the ability of 7D12(31FSK) to crosslink native EGFR receptors expressed on cancer cell surface. A431, a human epidermoid carcinoma cell line, was incubated with 7D12(31FSK) or 7D12(WT).
- FSK incorporation and crosslinking in mammalian cells
- To enable the application of FSK in mammalian cells, we tested FSK incorporation and in vivo crosslinking in human HeLa cells. Plasmid pNEU-FSKRS expressing the FSKRS and tRNA Pyl was transfected into the HeLa-EGFP(182TAG) reporter cells. (Ref: 26). Suppression of the 182TAG codon of the genome-integrated EGFP gene would produce full-length EGFP, rendering cells green fluorescent. Strong EGFP fluorescence was observed from cells using confocal microscopy only when FSK was added to the cell culture (FIG.5A).
- Because FSK and FSY have multi-targeting ability toward Lys, His, and Tyr, we reasoned that they can be used to capture a broader range of interacting proteins that lack Cys but have Lys, His, or Tyr at the interaction interface.
- FSY and FSK can be incorporated at the periphery of the binding interface, rather than in the active site or inside the binding interface, to minimize potential interference with protein interaction.
- GECX genetically encoded chemical crosslinking
- Trx coli cells. These two sites are away from the Trx active site and likely located in the peripheral of the binding interface of Trx and its substrate.
- FSK and FSY efficiently crosslinked potential substrate proteins, in comparison with the WT Trx (FIG.16). These crosslinked proteins were pulled down, digested with trypsin, and analyzed with tandem mass spectrometry.
- Using OpenUaa software for analysis of crosslinked proteins, we identified 12 substrate proteins for Trx from the Trx(FSK) sample and the Trx(FSY) sample. Among these substrate proteins, AHPC, TPX, SDHA, HPTG, and CH10 are well-known Trx substrates previously reported. (Ref: 27).
- FSK could also be used for intramolecular crosslinking, which will greatly expand the diversity of protein crosslinking patterns to facilitate the engineering of novel protein properties.
- FSK could be incorporated into nanobodies to convert them into covalent binders for EGFR, which irreversibly bound to EGFR in vitro and on the cancer cell surface and may provide novel avenues for cancer imaging and therapeutics.
- FSK could be incorporated into proteins and generate covalent protein crosslinks in both bacterial and mammalian cells. While sharing the same multi-targeting ability toward His, Lys, and Tyr, FSK complements FSY with a longer and more flexible side chain.
- pBAD-ubiquitin (6TAG) and pBAD-ecGST WT and ecGST mutants were used as previously described. (Liu et al, Journal of the American Chemical Society 2019, 141 (24), 9458-9462).
- ecGST HindIII-pCDNA and ecGST XhoI-pCDNA primers were used to clone ecGST WT and ecGST (86TAG), ecGST (86TAG/92A), ecGST (86TAG/92A/72A) into pCDNA 3.1. Primers used for cloning are shown in FIG.6.
- FSKRS amino acid sequence is shown as SEQ ID NO:2.
- SEQ ID NO:2 MTVKYTDAQI QRLREYGNGT YEQKVFEDLA SRDAAFSKEM SVASTDNEKK IKGMIANPSR HGLTQLMNDI ADALVAEGFI EVRTPIFISK DALARMTITE DKPLFKQVFW IDEKRALRPM LAPNLGSVAR DLRDHTDGPV KIFEMGSCFR KESHSGMHLE EFTMLNLFDM GPRGDATEVL KNYISVVMKA AGLPDYDLVQ EESDVYKETI DVEINGQEVC SAAVGPTPID AAHDVHEPWS GAGFGLERLL TIREKYSTVK KGGASISYLN GAKIN [0391] sfGFP (2TAG).
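The substitution list given for the synthetase (Y126G, M129A, V168F, H227T/S/I, Y228P, L229V/I) can be checked against the FSKRS sequence listed as SEQ ID NO:2. The following is a minimal sketch, not part of the disclosure: the helper names `residue_at` and `check_sites` are illustrative, and it assumes that positions are 1-based and counted from the initial methionine of the listed sequence.

```python
# FSKRS sequence as printed for SEQ ID NO:2; the listing's spaces are removed.
FSKRS = (
    "MTVKYTDAQI QRLREYGNGT YEQKVFEDLA SRDAAFSKEM SVASTDNEKK "
    "IKGMIANPSR HGLTQLMNDI ADALVAEGFI EVRTPIFISK DALARMTITE "
    "DKPLFKQVFW IDEKRALRPM LAPNLGSVAR DLRDHTDGPV KIFEMGSCFR "
    "KESHSGMHLE EFTMLNLFDM GPRGDATEVL KNYISVVMKA AGLPDYDLVQ "
    "EESDVYKETI DVEINGQEVC SAAVGPTPID AAHDVHEPWS GAGFGLERLL "
    "TIREKYSTVK KGGASISYLN GAKIN"
).replace(" ", "")

# Residues permitted at each substrate-binding-site position after
# substitution, taken from the substitution list in the text.
ALLOWED = {126: "G", 129: "A", 168: "F", 227: "TSI", 228: "P", 229: "VI"}


def residue_at(seq, position):
    """Return the residue at a 1-based position (assumed numbering)."""
    return seq[position - 1]


def check_sites(seq, allowed=ALLOWED):
    """Map each position to (residue found, True if it is a listed substitution)."""
    return {
        pos: (residue_at(seq, pos), residue_at(seq, pos) in options)
        for pos, options in allowed.items()
    }


if __name__ == "__main__":
    for pos, (res, ok) in sorted(check_sites(FSKRS).items()):
        print(pos, res, "listed" if ok else "not listed")
```

Under these assumptions the six positions read G, A, F, T, P, and I, which corresponds to the Y126G, M129A, V168F, H227T, Y228P, and L229I options.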
- pBAD-ecGST (65TAG) was constructed by site-directed mutagenesis with primers ecGST65TAG-For and ecGST65TAG-Rev (SEQ ID NO:30, where Bold underline: amber codon TAG at 65th position).
- SEQ ID NO:30 MKLFYKPGACSLASHITLRESGKDFTLVSVDLMKKRLENGDDYFAVNPKGQVPALLLD DGTLLTXGVAIMQYLADSVPDRQLLAPVNSISRYKTIEWLNYIATELHKGFTPLFRPDTP EEYKPTVRAQLEKKLQYVNEALKDEHWICGQRFTIADAYLFTVLRWAYAVKLNLEGL EHIAAFMQRMAERPEVQDALSAEGLKHHHHHH
- ecGST (86TAG/92A).
- pBAD-ecGST (86TAG/92A) was constructed by site-directed mutagenesis with primers ecGST86TAG92A-For and ecGST86TAG92A-Rev (SEQ ID NO:31, where Bold underline: amber codon TAG at 86th position).
- pBAD-ecGST (86TAG/92A/72A) was constructed by site-directed mutagenesis with primers ecGST86TAG92A72A-For and ecGST86TAG92A72A-Rev (SEQ ID NO:32, where Bold underline: amber codon TAG at 86th position).
- SEQ ID NO:33 MTSMSPILGYWKIKGLVQPTRLLLEYLEEKYEEHLYERDEGDKWRNKKFELGLEFPNLP YYIDGDVKLTQSMAIIRYIADKHNMLGGCPKERAEISMLEGAVLDIRYGVSRIAYSKDF ETLKVDFLSKLPEMLKMFEDRLCHKTYLNGDHVTHPDFMLYDALDVVLYMDPMCLD AFPKLVCFKKRIEAIPQIDKYLKSSKYIAWPLQGWQATFGGGDHPPKSDLVPRGSHHHH HH [0403] sjGST (97TAG) and sjGST (97TAG/44 mutants).
- pBAD-sjGST (97TAG) and pBAD- sjGST (97TAG/44A) were constructed by primers HR-sjGST NdeI, sjGST sjGST97TAG-For, sjGST97TAG-Rev, HR-sjGST HindIII rev, sjGST44A-For, and sjGST44A-Rev.
- primers set 44S-For, 44S-Rev, 44T-For, 44T-Rev, 44Y-For, 44Y-Rev, 44H-For, 44H-Rev were used to prepare pBAD-sjGST (97TAG/44S), pBAD-sjGST (97TAG/44T), pBAD-sjGST (97TAG/44Y) and pBAD-sjGST (97TAG/44H).
- SEQ ID NO:34 where Bold underline: amber codon TAG at 97 th position.
- pBAD-7D12 (30TAG) was constructed by site-directed mutagenesis with primers 7D1230TAG-For and 7D1230TAG-Rev (SEQ ID NO:36, where Bold underline: amber codon TAG at 30th position).
- SEQ ID NO:36 MGQVKLEESGGGSVQTGGSLRLTCAASGRXSRSYGMGWFRQAPGKEREFVSGISWRG DSTGYADSVKGRFTISRDNAKNTVDLQMNSLKPEDTAIYYCAAAAGSAWYGTLYEYD YWGQGTQVTVSS
- 7D12 31TAG.
- pBAD-7D12 (31TAG) was constructed by site-directed mutagenesis with primers 7D1231TAG-For and 7D1231TAG-Rev (SEQ ID NO:37, where Bold underline: amber codon TAG at 31 st position.)
- SEQ ID NO:37 MGQVKLEESGGGSVQTGGSLRLTCAASGRTXRSYGMGWFRQAPGKEREFVSGISWRG DSTGYADSVKGRFTISRDNAKNTVDLQMNSLKPEDTAIYYCAAAAGSAWYGTLYEYD YWGQGTQVTVSS
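The stated amber-codon positions for the 7D12 nanobody mutants can be cross-checked against the listed sequences. The sketch below assumes that the letter X in SEQ ID NO:36 and SEQ ID NO:37 marks the residue encoded by the TAG codon (the bold underline of the original listing is not preserved here) and that numbering is 1-based from the initial methionine. The function name `amber_positions` is illustrative only.

```python
# 7D12 mutant sequences as listed; X is assumed to mark the amber (TAG) site.
SEQ_ID_36 = (
    "MGQVKLEESGGGSVQTGGSLRLTCAASGRXSRSYGMGWFRQAPGKEREFVSGISWRG"
    "DSTGYADSVKGRFTISRDNAKNTVDLQMNSLKPEDTAIYYCAAAAGSAWYGTLYEYD"
    "YWGQGTQVTVSS"
)
SEQ_ID_37 = (
    "MGQVKLEESGGGSVQTGGSLRLTCAASGRTXRSYGMGWFRQAPGKEREFVSGISWRG"
    "DSTGYADSVKGRFTISRDNAKNTVDLQMNSLKPEDTAIYYCAAAAGSAWYGTLYEYD"
    "YWGQGTQVTVSS"
)


def amber_positions(seq, marker="X"):
    """Return the 1-based positions at which the marker residue occurs."""
    return [i + 1 for i, residue in enumerate(seq) if residue == marker]


if __name__ == "__main__":
    print("SEQ ID NO:36:", amber_positions(SEQ_ID_36))
    print("SEQ ID NO:37:", amber_positions(SEQ_ID_37))
```

With this numbering the marker falls at position 30 in SEQ ID NO:36 and position 31 in SEQ ID NO:37, matching the 30TAG and 31TAG construct names.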
- Library construction and FSKRS mutant selection
- To screen for an efficient synthetase for the incorporation of FSK, the primers MaPylRS NdeI to MaPylRS PstI were used to randomize the active site of Methanomethylophilus alvus PylRS-tRNA synthetase (SEQ ID NO:1) and create the library.
- A single colony was picked and inoculated into 1 mL 2xYT (5 g/L NaCl, 16 g/L Tryptone, 10 g/L Yeast extract).
- The cells were grown at 37 °C and 220 rpm with good aeration, overnight, to an OD of 0.5.
- The cells were diluted 10-fold in fresh 2xYT supplemented with the relevant antibiotics and 0.2% arabinose, with or without 1 mM FSK.
- The cells were then induced at either 30 °C for 6 hr or 18 °C overnight.
- The fluorescence was checked by a plate reader as described above.
- pBAD-ecGST WT, pBAD-ecGST (86TAG), pBAD-ecGST (65TAG), pBAD-ecGST (86TAG/92A), pBAD-ecGST (86TAG/92A/72A), or sjGST WT, sjGST (97TAG), sjGST (97TAG/44A, S, T, H, or Y) was co-transformed with either pEVOL-FSYRS or pEVOL-FSKRS into DH10b cells. FSY or FSK, respectively, was added with 0.2% arabinose to the cells for induction when the cells had grown to an OD of around 0.5.
- The cells were grown for protein expression at 37 °C for 6 hr, then harvested by centrifugation with a benchtop centrifuge, treated with 2xSDS loading dye containing 100 mM DTT, and boiled for 5 min at 95 °C. The dimerization of GST due to crosslinking was monitored by Western blot using anti-His antibody.
- The cell culture was diluted 100-fold and then regrown to an OD of 0.5 at 30 to 100 mL scale, with good aeration and the relevant antibiotic selection. The medium was then supplemented with 0.2% arabinose with or without 1 mM FSY or FSK, and expression was carried out at 30 °C for 12 hr. Protein purification was carried out with Ni-NTA affinity chromatography.
- GAPDH, detected with anti-GAPDH antibody, was used as a reference protein.
- Genetic incorporation of FSK into Hela GFP (182TAG)
- The plasmid pNEU-FSKRS (1 µg) was transfected into Hela-GFP 182(TAG) cells with 3 µL polyethylenimine (PEI) in 2 mL RPMI 1640 media when the cell population reached 80% confluence. A blank Hela-GFP 182(TAG) cell group was used as a negative control. The cells were treated with or without 1 mM FSK 6 hr after transfection and cultured for an additional 48 hr.
- The plasmid pNEU-FSKRS (1.5 µg) was co-transfected with 1 µg pCDNA 3.1 ecGST WT, 1.5 µg ecGST (86TAG), 1.5 µg ecGST(86TAG/92A), and 1.5 µg ecGST(86TAG/92A/72A), respectively, into HEK (293T) cells with 9 µL polyethylenimine (PEI) in 2 mL DMEM media when the cell population reached 80% confluence.
- The cells were treated with or without 1 mM FSK 6 hr after transfection and cultured for an additional 48 hr. The cells were harvested and analyzed by Western blot using anti-His antibody.
- Mass spectrometry Mass spectrometric measurements were performed as previously described. (Liu et al, Journal of the American Chemical Society 2017, 139 (9), 3430-3437). Briefly for electrospray ionization mass spectrometry, mass spectra of intact proteins were obtained using a QTOF Ultima (Waters) mass spectrometer, operating under positive electrospray ionization (+ESI) mode, connected to an LC-20AD (Shimadzu) liquid chromatography unit.
- Protein samples were separated from small molecules by reverse-phase chromatography on a Waters Xbridge BEH C4 column (300 Å, 3.5 µm, 2.1 mm x 50 mm), using an acetonitrile gradient from 30-71.4%, with 0.1% formic acid. Each analysis was 25 min under a constant flow rate of 0.2 mL/min at RT. Data were acquired from m/z 350 to 2500, at a rate of 1 sec/scan. Alternatively, spectra were acquired by Xevo G2-S QTOF on a Waters ACQUITY UPLC Protein BEH C4 reverse-phase column (300 Å, 1.7 µm, 2.1 mm x 150 mm).
- Data acquisition by Q Exactive Orbitrap was as follows: 10 µL of trypsin-digested protein was loaded on an Ace UltraCore super C18 reverse-phase column (300 Å, 2.5 µm, 75 mm × 2.1 mm) via an autosampler. An acetonitrile gradient from 5%-95% was used with 0.1% formic acid, over a run time of 45 min and a constant flow rate of 0.2 mL/min at RT. MS data were acquired using a data-dependent top10 method, dynamically choosing the most abundant precursor ions from the survey scan for HCD fragmentation using a stepped normalized collision energy of 28, 30, 35 eV. Survey scans were acquired at a resolution of 70,000 at m/z 200 on the Q Exactive.
- N-Boc-Lys-tBu (4, 0.34 g, 1 mmol, 1 equiv.) was then added, after which Et3N (0.15 mL, 1.1 mmol, 1.1 equiv.) was added dropwise.
- The reaction mixture was stirred at r.t. overnight.
- The reaction was quenched with 20 mL of H2O and washed with 1 M HCl (20 mL x 2).
- The aqueous phase was combined and extracted with ethyl acetate (20 mL x 2).
- The organic fractions were combined, dried over anhydrous sodium sulfate, and concentrated under vacuum.
- The crude product was then purified by column chromatography using MeOH:CH2Cl2 (1:100).
- The product, N-Boc-FSK-tBu, was isolated as a yellow oil (0.25 g, 0.50 mmol, 50%).
- N-Boc-FSK-tBu (0.25 g, 0.50 mmol) was added to a scintillation vial and dissolved in 4 M HCl in dioxane (10 mL). The reaction was stirred overnight.
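The reported 50% yield follows from the quantities stated for the coupling step. The sketch below is an illustrative check only. It assumes that N-Boc-Lys-tBu (1 mmol, 1 equiv.) is the basis for the yield, which the text implies but does not state outright, and the function names are hypothetical.

```python
def percent_yield(product_mmol, basis_mmol):
    """Percent yield relative to the reagent taken as 1 equiv."""
    return 100.0 * product_mmol / basis_mmol


def implied_molar_mass(mass_g, amount_mmol):
    """Molar mass in g/mol implied by a reported mass and molar amount."""
    return mass_g / (amount_mmol / 1000.0)


if __name__ == "__main__":
    # Values reported for N-Boc-FSK-tBu: 0.25 g, 0.50 mmol, from 1 mmol of amine.
    print("yield (%):", percent_yield(0.50, 1.0))
    # The rounded figures imply a molar mass of roughly 500 g/mol.
    print("implied molar mass (g/mol):", implied_molar_mass(0.25, 0.50))
```

Because the reported mass and molar amount are rounded to two significant figures, the implied molar mass is approximate.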
- The FL/OD for +UAA for FSKRS was 71685 and –UAA for FSKRS was 3274; the FL/OD for +UAA for FSKRS-CThis was 76214 and –UAA for FSKRS-CThis was 2602; and the FL/OD for +UAA for FSKRS-NThis was 53687 and –UAA for FSKRS-NThis was 4055.
- The fluorescence intensity ratio of +FSK over -FSK was higher for FSKRS-CTHisx6 (29.3 fold) than for FSKRS (21.9 fold), mainly due to a lower background for FSKRS-CTHisx6 in the absence of FSK.
- The fluorescence intensity ratio of +FSK over -FSK for FSKRS-NTHisx6 was 13.2 fold. Comparison of results at 37 °C and 18 °C indicated that the Hisx6 tag appended at the C-terminus of FSKRS enhanced the thermostability of the synthetase. Therefore, the enhancing effect of the Hisx6 tag on FSK incorporation efficiency will be effective at temperatures from about 18 °C to about 37 °C. In embodiments, the temperatures are from about 25 °C to about 30 °C.
- Similar experiments performed with FSYRS to incorporate FSY into sfGFP(151TAG) showed no such effect, suggesting the effect of Hisx6 on FSKRS may be unique. Other tags may have a similar effect on FSKRS.
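The fold ratios quoted above can be reproduced from the reported FL/OD values. The following is a minimal sketch of that arithmetic. The dictionary layout and the function name `fold_ratio` are illustrative choices, and it assumes the fold value is simply the +FSK reading divided by the -FSK reading.

```python
# Reported FL/OD readings: (with FSK, without FSK) for each synthetase variant.
FL_OD = {
    "FSKRS": (71685, 3274),
    "FSKRS-CTHisx6": (76214, 2602),
    "FSKRS-NTHisx6": (53687, 4055),
}


def fold_ratio(with_uaa, without_uaa):
    """Signal with the unnatural amino acid divided by the background without it."""
    return with_uaa / without_uaa


def fold_ratios(readings=FL_OD):
    """Return the fold ratio for each variant, rounded to one decimal place."""
    return {name: round(fold_ratio(plus, minus), 1)
            for name, (plus, minus) in readings.items()}


if __name__ == "__main__":
    for name, ratio in fold_ratios().items():
        print(f"{name}: {ratio} fold")
```

The C-terminal tag gives the highest ratio because its background reading without FSK (2602) is the lowest of the three, while its signal with FSK is only modestly higher than that of untagged FSKRS.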
- SEQ ID NO:86 (FSKRS-NTHis6) MHHHHHHTVKYTDAQIQRLREYGNGTYEQKVFEDLASRDAAFSKEMSVASTDNEKKI KGMIANPSRHGLTQLMNDIADALVAEGFIEVRTPIFISKDALARMTITEDKPLFKQVFWI DEKRALRPMLAPNLGSVARDLRDHTDGPVKIFEMGSCFRKESHSGMHLEEFTMLNLFD MGPRGDATEVLKNYISVVMKAAGLPDYDLVQEESDVYKETIDVEINGQEVCSAAVGPT PIDAAHDVHEPWSGAGFGLERLLTIREKYSTVKKGGASISYLNGAKIN* [0445] SEQ ID NO:87 (FSKRS-CTHis6) MTVKYTDAQIQRLREYGNGTYEQKVFEDLASRDAAFSKEMSVAST
Priority Applications (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP22763916.8A EP4301767A4 (en) | 2021-03-01 | 2022-03-01 | BIOREACTIVE COMPOUNDS AND METHODS OF USE THEREOF |
| CN202280022559.7A CN117098768A (zh) | 2021-03-01 | 2022-03-01 | 生物反应性化合物及其使用方法 |
| CA3212360A CA3212360A1 (en) | 2021-03-01 | 2022-03-01 | Bioreactive compounds and methods of use thereof |
| JP2023553044A JP2024512297A (ja) | 2021-03-01 | 2022-03-01 | 生体反応性化合物及びその使用方法 |
| US18/279,463 US20250283138A1 (en) | 2021-03-01 | 2022-03-01 | Bioreactive compounds and methods of use thereof |
| AU2022231099A AU2022231099A1 (en) | 2021-03-01 | 2022-03-01 | Bioreactive compounds and methods of use thereof |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202163155222P | 2021-03-01 | 2021-03-01 | |
| US63/155,222 | 2021-03-01 | ||
| US202163214432P | 2021-06-24 | 2021-06-24 | |
| US63/214,432 | 2021-06-24 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022187273A1 true WO2022187273A1 (en) | 2022-09-09 |
Family
ID=83154435
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2022/018381 Ceased WO2022187273A1 (en) | 2021-03-01 | 2022-03-01 | Bioreactive compounds and methods of use thereof |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US20250283138A1 |
| EP (1) | EP4301767A4 |
| JP (1) | JP2024512297A |
| AU (1) | AU2022231099A1 |
| CA (1) | CA3212360A1 |
| WO (1) | WO2022187273A1 |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2023122753A1 (en) * | 2021-12-22 | 2023-06-29 | Enlaza Therapeutics, Inc. | Crosslinking antibodies |
| CN117003660A (zh) * | 2023-08-09 | 2023-11-07 | 四川大学 | 基于三氟甲基光脱氟-酰氟交换交联的非天然氨基酸及其用途 |
| EP4330214A4 (en) * | 2021-04-28 | 2025-11-26 | Univ California | BIOREACTIVE PROTEINS CONTAINING NON-NATURAL AMINO ACIDS |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN121087130A (zh) * | 2025-11-12 | 2025-12-09 | 康码芯(上海)智能科技有限公司 | 一种用于插入非天然氨基酸的方法及其应用 |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2020072674A1 (en) * | 2018-10-02 | 2020-04-09 | The Regents Of The University Of California | Multi-target crosslinkers and uses thereof |
| US20200397719A1 (en) * | 2014-06-06 | 2020-12-24 | The Scripps Research Institute | Sulfur (vi) fluoride compounds and methods for the preparation thereof |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP5590649B2 (ja) * | 2007-09-20 | 2014-09-17 | 独立行政法人理化学研究所 | 変異体ピロリジル−tRNA合成酵素及びこれを用いる非天然アミノ酸組み込みタンパク質の製造方法 |
| EP3845661A4 (en) * | 2018-08-31 | 2022-06-22 | Riken | PYRROLYSYL TRNA SYNTHETASE |
| CN111302980A (zh) * | 2019-12-31 | 2020-06-19 | 上海交通大学医学院附属仁济医院 | 含硫酰氟基团的氨基酸类似物及其制备方法和应用 |
-
2022
- 2022-03-01 WO PCT/US2022/018381 patent/WO2022187273A1/en not_active Ceased
- 2022-03-01 US US18/279,463 patent/US20250283138A1/en active Pending
- 2022-03-01 JP JP2023553044A patent/JP2024512297A/ja active Pending
- 2022-03-01 CA CA3212360A patent/CA3212360A1/en active Pending
- 2022-03-01 AU AU2022231099A patent/AU2022231099A1/en active Pending
- 2022-03-01 EP EP22763916.8A patent/EP4301767A4/en active Pending
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20200397719A1 (en) * | 2014-06-06 | 2020-12-24 | The Scripps Research Institute | Sulfur (vi) fluoride compounds and methods for the preparation thereof |
| WO2020072674A1 (en) * | 2018-10-02 | 2020-04-09 | The Regents Of The University Of California | Multi-target crosslinkers and uses thereof |
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4301767A4 * |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP4330214A4 (en) * | 2021-04-28 | 2025-11-26 | Univ California | BIOREACTIVE PROTEINS CONTAINING NON-NATURAL AMINO ACIDS |
| WO2023122753A1 (en) * | 2021-12-22 | 2023-06-29 | Enlaza Therapeutics, Inc. | Crosslinking antibodies |
| CN117003660A (zh) * | 2023-08-09 | 2023-11-07 | 四川大学 | 基于三氟甲基光脱氟-酰氟交换交联的非天然氨基酸及其用途 |
| CN117003660B (zh) * | 2023-08-09 | 2026-03-27 | 四川大学 | 基于三氟甲基光脱氟-酰氟交换交联的非天然氨基酸及其用途 |
Also Published As
| Publication number | Publication date |
|---|---|
| CA3212360A1 (en) | 2022-09-09 |
| AU2022231099A1 (en) | 2023-08-31 |
| EP4301767A4 (en) | 2025-05-28 |
| US20250283138A1 (en) | 2025-09-11 |
| EP4301767A1 (en) | 2024-01-10 |
| JP2024512297A (ja) | 2024-03-19 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22763916 Country of ref document: EP Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 3212360 Country of ref document: CA Ref document number: 2022231099 Country of ref document: AU Date of ref document: 20220301 Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023553044 Country of ref document: JP |
| | WWE | Wipo information: entry into national phase | Ref document number: 202280022559.7 Country of ref document: CN |
| | WWE | Wipo information: entry into national phase | Ref document number: 2022763916 Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2022763916 Country of ref document: EP Effective date: 20231002 |
| | WWP | Wipo information: published in national office | Ref document number: 18279463 Country of ref document: US |