WO2006021553A1

WO2006021553A1 - Method for protein purification and labeling based on a chemoselective reaction

Info

Publication number: WO2006021553A1
Application number: PCT/EP2005/054114
Authority: WO
Inventors: Maik Kindermann
Original assignee: Covalys Biosciences Ag
Priority date: 2004-08-23
Filing date: 2005-08-22
Publication date: 2006-03-02

Abstract

The present invention relates to methods and reagents for labeling a target compound and a chemoselective reaction that simultaneously cleaves and ligates different labels to a target compound. Such reagents are phosphines of formula (1) wherein A is a group that specifically binds to or reacts with a binding partner B comprising a target protein of interest, R1 and R3 are linkers, R2 is an electrophilic functional group, X, Y and Z are aryl or another group, L1 is a label or another group, and R3 is bound either to X or to Z, and azides of formula (2), N3-R4-L2 (2), wherein L2 is a label or another group and R4 is a linker.

Description

Method for Protein Purification and Labeling Based on a Chemoselective Reaction

Field of the Invention

The present invention relates to methods and reagents for labeling a target compound and a chemoselective reaction that simultaneously cleaves and ligates different labels to a target compound.

Background of the invention

The emerging field of "proteomics" aims for a functional analysis of proteins, their interactions among each other and their interaction with small organic molecules. This requires primarily the identification and characterization of all proteins involved in a complex cellular system. The isolation of unknown binding partners of a protein which is known to be part of a certain metabolic pathway is a key issue e.g. in drug discovery.

Today, advanced techniques in molecular biology allow producing proteins of interest very rapidly whereas the possibilities to characterize the corresponding proteins and their interactions in vivo and in vitro are limited.

Several techniques in biological sciences require the attachment of different labels to bio-macromolecules (such as peptides), which allow either the purification of the macromolecules for in vitro applications or their localization in vivo. Numerous strategies towards this goal are based on the construction of fusion proteins between a protein of interest and a second protein (polypeptide) with the intention to couple the high selectivity of the protein tag (second protein) for a specific probe to the protein of interest. Examples of such protein tags in fusion proteins include the 6xHis tag, glutathione S-transferase, maltose binding protein, epitope tags, yeast two-hybrid system, split-ubiquitin, and green fluorescent protein (GFP).

Linkers and their associated strategies play a pivotal role in the successful implementation of protein labeling and purification. The bond between a linker and a label, e.g. a label for immobilization, is usually sensitive to certain reaction conditions leading to bond cleavage and the release of the final compound from the label. In case of covalent labeling of proteins as well as for some non-covalent cases, the problem may arise that a molecular probe, once introduced to the fusion protein via the specific reaction of the protein tag, cannot be easily removed or changed without disruption of biological activity of the protein of interest. An immobilization strategy for proteins of interest using O⁶-alkylguanine-DNA alkyltransferase (AGT) fusion proteins that combines a high specificity and the stability of a covalent bond was disclosed in International Patent Application WO 02/083937. International Patent Application WO 2004/104588 discloses a method for covalent labeling of acyl carrier protein (ACP) fusion proteins with a wide variety of different labels. The method relies on the transfer of a label from a coenzyme A type substrate to an ACP fusion protein using a holo-acyl carrier protein synthase (ACPS) or a homologue thereof. The method allows detecting and manipulating the fusion protein, both in vitro and in vivo, by attaching molecules to the fusion proteins that introduce a new physical or chemical property to the fusion protein.

A number of conditions/parameters must be matched for successful protein purification via reversible immobilization to a solid support. Proteins must remain hydrated, kept in their native state and handled at ambient temperature and physiological pH values. In addition, conditions for removal of the purified fusion proteins from the solid support have to be chosen carefully in order not to destroy the biological function/activity of the protein of interest. Thus, maintaining protein activity during covalent immobilization and subsequent removal requires new strategies for substrate and cleavable linker design.

Most tags for in vitro protein purification comprise affinity labels that - upon contact with their counterpart - form non-covalent complexes of various stability. Elution of the protein of interest is then achieved by using a specific competitor for the binding site or a pH shift of the elution buffer: e.g. 6xHis tag with imidazole, or streptavidin (strep tag™) with desthiobiotin. Photocleavable linkers have been used for the cleavage of entirely covalent entities, with the drawback that the light of the wavelength necessary may harm some proteins, and the work has to be carried out in the dark before the cleavage event.

The present invention makes use of the "Staudinger reaction". One of the oldest and most common transformations of azides is the reaction with trivalent phosphorus compounds to form iminophosphoranes (phospha-aza-ylides). This transformation proceeds under very mild conditions in a variety of solvents with almost quantitative yield. The nucleophilic nitrogen atom in those iminophosphoranes can react with numerous electrophiles, and the simple hydrolysis of this intermediate leads to an amine and a phosphine oxide. This reaction has been known for almost a century as the Staudinger reaction (Staudinger et al., Helv. Chim. Acta 1919, 2, 635). Bertozzi et al. introduced a modification of the Staudinger reaction, wherein the intermediate iminophosphorane is trapped through an intramolecular electrophile, e.g. an ester (E. Saxon, C. R. Bertozzi, Science 2000, 287, 2007-2010). The reaction proceeds through the formation of an intermediate oxaphosphetane, which hydrolyzes to form a stable amide and a phosphine oxide as the second product. This reaction is now known as the Staudinger ligation and has found several applications, especially in the chemoselective ligation of bioconjugates (M. Köhn, R. Breinbauer, Angew. Chem. 2004, 116, 3169-3178). A further extension of the same type of reaction was introduced by Bertozzi et al. (E. Saxon, J. I. Armstrong, C. R. Bertozzi, Org. Lett. 2000, 2, 2141-2143) and Raines et al. (B. L. Nilsson, L. L. Kiessling, R. T. Raines, Org. Lett. 2000, 2, 1939-1941), wherein the phosphine oxide moiety is cleaved off during the ligation reaction between an azide and a phosphine. This modification is called "traceless Staudinger ligation".

Summary of the invention

The invention relates to two classes of chemical compounds, to their use and their manufacture.

In particular the invention relates to compounds of formula (1)

wherein

A is a group that specifically binds to or reacts with a binding partner B;

R₁ and R₃ are, independently of each other, a linker; R₂ is an electrophilic functional group able to react intramolecularly with an aza-ylide;

X, Y and Z are, independently of each other, aryl or heteroaryl, or an optionally substituted saturated or unsaturated alkyl, cycloalkyl or heterocyclyl group, or X and Y, X and Z or Y and Z together with the phosphorus atom represent a ring;

L₁ is a label, a plurality of same or different labels, a bond connecting R₃ to A forming a cyclic substrate, or a further group A; and R₃ is bound either to X or to Z;

and to a method of manufacture of such compounds. The invention further relates to novel azides of formula (2)

N₃-R₄-L₂ (2)

wherein L₂ is a label, a plurality of same or different labels, a group A or hydrogen, and R₄ is a linker;

and to a method of manufacture of such compounds.

The invention further relates to a method for detecting and/or manipulating a protein of interest, wherein the protein of interest is incorporated into a binding partner B, the binding partner B is contacted with a compound of formula (1) comprising a label, and the reaction product of binding partner B and the compound of formula (1) is detected and/or further manipulated using the label. In particular the invention relates to such a method wherein the reaction product of binding partner B and the compound of formula (1) is further reacted with a compound of formula (2).

Detailed description of the invention

The first class of compounds comprises specially designed phosphine moieties of formula (1)

in particular of formula (1A) or formula (1B)

[Structural formulae (1A) and (1B). Connectivity as described in the text: (1A) A-R₁-R₂-X(-R₃-L₁)-PYZ; (1B) A-R₁-R₂-X-P(Y)-Z-R₃-L₁]

wherein

A is a group that specifically binds to or reacts with a binding partner B; R₁ and R₃ are, independently of each other, a linker; R₂ is an electrophilic functional group able to react intramolecularly with an aza-ylide;

X, Y and Z are, independently of each other, aryl or heteroaryl, or an optionally substituted saturated or unsaturated alkyl, cycloalkyl or heterocyclyl group, or X and Y, X and Z or Y and Z together with the phosphorus atom represent a ring; and L₁ is a label, a plurality of same or different labels, a bond connecting R₃ to A forming a cyclic substrate, or a further group A.

In formula (1A), R₃ (and R₂ and the phosphorus atom further carrying Y and Z) are connected to X. In formula (1B), R₃ is connected to Z, which is one of the three ligands (X, Y, Z) of the phosphorus atom. Formula (1) is to be understood to mean that R₃ may be connected to either X or Z.

Preferred are compounds of formula (1A).

The second class of compounds comprises specially designed organic azides of the general formula (2)

N₃-R₄-L₂ (2)

wherein L₂ is a label, a plurality of same or different labels, a group A or hydrogen, and R₄ is a linker.

A is a group that specifically binds to or reacts with a binding partner B. The interaction of the group A with a binding partner B is either through a covalent bond after a chemical reaction, or through a non-covalent interaction such as a complex formation. Preferably, the binding partner B consists of a peptide sequence comprising a protein of interest and a protein tag. For example, A is a group recognized as a substrate by a binding partner representing a fusion protein with a suitable protein tag, e.g. a substrate for an O⁶-alkylguanine-DNA alkyltransferase (AGT), for acyl carrier protein (ACP), or for a fragment of AGT or ACP or a mutant of AGT or ACP, preferably a substrate for an O⁶-alkylguanine-DNA alkyltransferase (AGT) or for acyl carrier protein (ACP).

Particular groups A recognized as a substrate by an optionally modified O⁶-alkylguanine-DNA alkyltransferase (AGT) or a fragment thereof are those disclosed in International Patent Application WO 2004/031405. Such a particular group is, e.g., a para-substituted O⁶-benzylguanine residue of the formula (3)

or corresponding 4'-substituted O⁶-thiophenyl-2'-methyl guanine residues.

Further particular groups A recognized as a substrate by an acyl carrier protein (ACP) in the presence of a holo-acyl carrier protein synthase (ACPS) are coenzyme A type substrates described in International Patent Application WO 2004/104588, for example of the formula (4).

In a further embodiment, A is a haloalkane, reacting specifically with an active-site variant of dehalogenase (DhaA). Haloalkane dehalogenases are enzymes that hydrolyze carbon-halogen bonds in a broad range of substrates. In the catalytic mechanism for dehalogenation, a nucleophilic attack of an Asp (aspartic acid) residue on the halogen-substituted carbon atom of the substrate occurs forming a covalent alkyl-enzyme intermediate. In wild-type dehalogenases, this intermediate is subsequently hydrolyzed by a water molecule which is activated by His and Asp residues (T. Bosma et al., Biochemistry, 2003, 42, 8047-8053). A point mutation in DhaA involving the substitution of those His and/or Asp residues impairs the hydrolysis step leading to a stable covalent enzyme substrate intermediate. Particular groups A recognized as a substrate by a dehalogenase mutant are omega-haloalkanes with a carbon chain of 1-20, preferably 2-8, carbon atoms. Binding partner B may then be a fusion protein of such a dehalogenase mutant and a protein of interest (WO 2004/072232).

In a further embodiment A is a biotin moiety that can be captured by a binding partner B representing an avidin derivative, e.g. streptavidin or a protein of interest carrying streptavidin.

In a further embodiment A is a synthetic ligand (SLF') which interacts with a binding partner B comprising FKBP12(F36V) in the nM range as disclosed in K. M. Marks et al., Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 9982-9987.

In a further embodiment A is an activity based chemical probe that targets a binding partner B representing proteins of interest fused to specific classes of enzymes, for example a fluorophosphonate reacting specifically with members of the serine hydrolase superfamily (D. Kidd, B. F. Cravatt, Biochemistry, 2001, 40, 4005-4015), an (acyloxy)methyl ketone reacting with cysteine proteases (N. A. Thornberry et al., Biochemistry, 1994, 33, 3934-3940), or a 4-fluoromethyl-1-phosphaphenyl group reacting with tyrosine phosphatases (J. K. Meyers et al., Science, 1993, 262, 1451-1453).

In a further embodiment, A is a bis-arsenical dye reacting specifically with a binding partner B representing a tetracysteine-tagged protein (S. R. Adams, R. Y. Tsien et al., J. Am. Chem. Soc. 2002, 124, 6063-6076).

Preferably A is an O⁶-benzylguanine residue of formula (3) or a coenzyme A type substrate of formula (4).

R₁, R₃ and R₄ are, independently of each other, a linker. A linker R₁ is a linker connecting group A to the electrophilic functional group R₂ bound to ligand X of phosphine PXYZ. A linker R₃ is a linker connecting a label L₁ (or L₁ with another meaning) to the X group bearing R₂ (in the compound of formula (1A)), or to another phosphine ligand Z (in the compound of formula (1B)). A linker R₄ is a linker connecting a label L₂ (or L₂ with another meaning) to the azide function -N₃. Linkers R₁, R₃ and R₄ are preferably flexible linkers. Linker units R₁ and R₃ are chosen in the context of the envisioned application, i.e. the reaction with a binding partner B, such as the transfer of the group bound to A to a fusion protein comprising AGT or ACP, or the reaction with a derivative of a dehalogenase mutant, of avidin, of FKBP12(F36V), of members of the serine hydrolase superfamily, of cysteine proteases, of tyrosine phosphatases, or reaction with tetracysteine-tagged proteins, respectively. Likewise linker R₄ is chosen in the context of the envisioned application, i.e. the reaction of the compound of formula (2) with a compound of formula (1) connected to a binding partner B. The linkers are also supposed to increase the solubility of the substrates in the appropriate solvent. The linkers used are chemically stable under the conditions of the actual application. The linkers R₁, R₃ and R₄ do not interfere with the interaction of group A with the binding partner B nor with the detection of the labels L₁ and L₂, respectively, but may be constructed such as to be cleaved at some point in time after the reaction of the compound of formula (1) with the binding partner B or after the reaction of compound of formula (2) with compound (1) carrying the binding partner B, respectively.

Linkers R₁, R₃ and R₄ considered are those disclosed in International Patent Application WO 2004/031405 (as a linker R₄). Such a particular linker is e.g. a straight or branched chain alkylene group with 1 to 300 carbon atoms, wherein optionally

(a) one or more carbon atoms are replaced by oxygen, in particular wherein every third carbon atom is replaced by oxygen, e.g. a polyethyleneoxy group with 1 to 100 ethyleneoxy units;

(b) one or more carbon atoms are replaced by nitrogen carrying a hydrogen atom, and the adjacent carbon atoms are substituted by oxo, representing an amide function -NH-CO-;

(c) one or more carbon atoms are replaced by oxygen, and the adjacent carbon atoms are substituted by oxo, representing an ester function -O-CO-;

(d) the bond between two adjacent carbon atoms is a double or a triple bond, representing a function -CH=CH- or -C≡C-;

(e) one or more carbon atoms are replaced by a phenylene, a saturated or unsaturated cycloalkylene, a saturated or unsaturated bicycloalkylene, a bridging heteroaromatic or a bridging saturated or unsaturated heterocyclyl group;

(f) two adjacent carbon atoms are replaced by a disulfide linkage -S-S-;

or a combination of two or more, especially two or three, alkylene and/or modified alkylene groups as defined under (a) to (f) hereinbefore, optionally containing substituents.

A particularly preferred linker R₁, R₃ and R₄ is a straight chain alkylene group of 10 to 40 carbon atoms wherein 3 to 12 carbon atoms are replaced by oxygen, and optionally one carbon atom is replaced by a 1,4-phenylene unit. Another particularly preferred linker R₁, R₃ and R₄ is a straight chain alkylene group of 10 to 40 carbon atoms optionally substituted by oxo wherein 3 to 12 carbon atoms are replaced by oxygen and one or two carbon atoms are replaced by nitrogen. Another particularly preferred linker R₁, R₃ and R₄ is a straight chain alkylene group of 6 to 40 carbon atoms wherein 2 to 12 carbon atoms are replaced by oxygen and one or two bonds between two adjacent carbon atoms is a double bond representing a function -CH=CH-.
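The first particularly preferred linker definition can be read as a set of numeric constraints. The following is a minimal sketch, assuming a linker is summarized only by counts of chain positions and replacements; the names `LinkerSpec`, `polyethyleneoxy` and `is_first_preferred_linker` are hypothetical and not part of the disclosure, and counting one ethyleneoxy unit as three chain positions follows the statement that every third carbon atom is replaced by oxygen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkerSpec:
    """Hypothetical summary of a straight-chain alkylene linker.

    chain_atoms: chain positions, counted as carbon atoms before replacement
                 (an assumption about how the ranges are counted).
    oxygen:      chain positions replaced by oxygen.
    phenylene:   chain positions replaced by a 1,4-phenylene unit.
    """
    chain_atoms: int
    oxygen: int = 0
    phenylene: int = 0


def polyethyleneoxy(units: int) -> LinkerSpec:
    """Describe a polyethyleneoxy chain of 1 to 100 ethyleneoxy units.

    Each -CH2-CH2-O- unit occupies three chain positions, one of them oxygen.
    """
    if not 1 <= units <= 100:
        raise ValueError("expected 1 to 100 ethyleneoxy units")
    return LinkerSpec(chain_atoms=3 * units, oxygen=units)


def is_first_preferred_linker(spec: LinkerSpec) -> bool:
    """Check: 10 to 40 carbon atoms, 3 to 12 replaced by oxygen,
    and optionally one replaced by a 1,4-phenylene unit."""
    return (
        10 <= spec.chain_atoms <= 40
        and 3 <= spec.oxygen <= 12
        and spec.phenylene in (0, 1)
    )
```

Under these counting assumptions, `polyethyleneoxy(4)` (12 chain positions, 4 oxygen atoms) satisfies the check, whereas `polyethyleneoxy(2)` is too short and `polyethyleneoxy(13)` carries too many oxygen atoms.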

In cases wherein the linker R₁ or R₃ features stereogenic centers, especially at the α-carbon to R₂, the Staudinger ligation, e.g. the reaction corresponding to the reaction of compound of formula (1) with compound of formula (2), proceeds without detectable racemization (M. B. Soellner et al. J. Org. Chem. 2002, 67, 4993-4996).

The central subunit PXYZ consists of a trisubstituted phosphorus atom. X, Y and Z are, independently of each other, aryl or heteroaryl, or an optionally substituted saturated or unsaturated alkyl, cycloalkyl or heterocyclyl group. Y and Z differ from X in that X is not only bound to phosphorus, but further to the electrophilic functional group R₂ and, in compounds of formula (1A), to the linker R₃. In compounds of formula (1B) Z differs further from Y in that Z is bound to the linker R₃.

When X and Y, X and Z or Y and Z together with the phosphorus atom represent a ring, this ring may have 4, 5, 6, 7 or 8 ring members selected from carbon, nitrogen and oxygen atoms, preferably carbon atoms. The ring members may be substituted by the substituents listed below under alkyl, and may be annelated to an aromatic ring, e.g. phenyl. Particular ring residues bound to phosphorus considered are e.g. 1,4-butylene, 1,5-pentylene, 1,6-hexylene, or o-phenylene-dimethylene.

X, Y and Z are preferably aryl, in particular optionally substituted phenyl, wherein the substituents preferably have the meanings listed below under aryl. The rate of reaction of the Staudinger ligation depends on the electronic properties of X, Y and Z, and substituents are chosen to take this into account. Most preferably, X, Y and Z are unsubstituted phenyl.

Further preferred are phosphines wherein X is an optionally substituted methyl group, in particular methyl or benzyl, or imidazolyl, and Y and Z represent aryl, preferably phenyl.

R₂ is an electrophilic functional group able to react intramolecularly with an aza-ylide, in particular with an iminophosphorane. R₂ is bound to linker R₁ and to group X, and is preferably separated by two carbon atoms from -PYZ, e.g. located in the ortho position relative to PYZ if X is aryl (e.g. phenyl) or heteroaryl, or by one carbon atom from -PYZ, e.g. bound to the methyl group in an optionally substituted methyl (e.g. benzyl) group X. An electrophilic functional group R₂ is, for example, a derivative of carboxylic acid such as an ester, thioester or carboxamide, or a sulfonic acid ester, preferably a carboxylic acid ester. This electrophilic functional group R₂ is bound to X in such a way that, on reaction with a nucleophile, X represents the leaving group, i.e. the sequence -R₁-R₂-X- corresponds to partial formula -R₁-CO-O-X-, -R₁-CO-S-X-, -R₁-CO-NH-X-, and -R₁-SO₂-O-X-, respectively. A further partial formula corresponding to this definition is

wherein the amide nitrogen is actually part of heteroaryl X with the meaning imidazole.

Aryl is an aromatic group comprising 6 to 10 carbon atoms, and is preferably phenyl or naphthyl, in particular phenyl. Aryl may be (further) substituted by lower alkyl, such as methyl, lower alkoxy, such as methoxy or ethoxy, halogen, e.g. chlorine, bromine or fluorine, halogenated lower alkyl, such as trifluoromethyl, or hydroxy.

Heteroaryl is mono- or bicyclic heteroaryl comprising zero, one, two, three or four ring nitrogen atoms and zero or one oxygen atom and zero or one sulfur atom, with the proviso that at least one ring carbon atom is replaced by a nitrogen, oxygen or sulfur atom, and which has 5 to 12, preferably 5 or 6 ring atoms; and which may be (further) substituted by lower alkyl, such as methyl, lower alkoxy, such as methoxy or ethoxy, halogen, e.g. chlorine, bromine or fluorine, halogenated lower alkyl, such as trifluoromethyl, or hydroxy.
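The ring-atom counts in the heteroaryl definition can be expressed as a simple predicate. This is a minimal sketch that checks only the stated counts (it does not test aromaticity, ring fusion or substituents); the function name `matches_heteroaryl_counts` and its parameters are hypothetical.

```python
def matches_heteroaryl_counts(ring_atoms: int,
                              nitrogen: int = 0,
                              oxygen: int = 0,
                              sulfur: int = 0) -> bool:
    """Check the ring composition stated for heteroaryl.

    5 to 12 ring atoms, zero to four ring nitrogen atoms, zero or one
    oxygen atom, zero or one sulfur atom, and at least one heteroatom.
    """
    heteroatoms = nitrogen + oxygen + sulfur
    return (
        5 <= ring_atoms <= 12
        and 0 <= nitrogen <= 4
        and oxygen in (0, 1)
        and sulfur in (0, 1)
        and heteroatoms >= 1
    )


# Ring compositions of some groups named in the text
IMIDAZOLYL = dict(ring_atoms=5, nitrogen=2)
THIAZOLYL = dict(ring_atoms=5, nitrogen=1, sulfur=1)
TETRAZOLYL = dict(ring_atoms=5, nitrogen=4)
PHENYL = dict(ring_atoms=6)  # aryl, not heteroaryl: no ring heteroatom
```

For example, `matches_heteroaryl_counts(**IMIDAZOLYL)` holds, while `matches_heteroaryl_counts(**PHENYL)` does not because no ring carbon atom is replaced by a heteroatom.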

Preferably heteroaryl is pyrrolyl, imidazolyl, benzimidazolyl, pyridyl, pyrimidinyl, oxazolyl, isoxazolyl, thiazolyl, isothiazolyl, triazolyl, tetrazolyl, thiophenyl, or furanyl.

Alkyl has preferably 1 to 10 carbon atoms, is linear or branched, and includes lower alkyl of 1 to 7 carbon atoms, in particular 1 to 4 carbon atoms, e.g. methyl, ethyl, butyl, such as n-butyl, sec-butyl, isobutyl or tert-butyl, and propyl, such as n-propyl or isopropyl, and also pentyl, hexyl, heptyl, octyl, nonyl, or decyl, e.g. n-hexyl. Substituents considered are aryl, heteroaryl, cycloalkyl, lower alkoxy, such as methoxy or ethoxy, halogen, e.g. chlorine, bromine or fluorine, hydroxy, lower acyloxy, amino, e.g. methylamino, dimethylamino or lower acylamino such as acetylamino, carboxy, lower alkoxycarbonyl, carbamoyl, and cyano. Substituted alkyl is preferably aryl-lower alkyl, in particular arylmethyl, such as benzyl.

Unsaturated alkyl has preferably 2 to 10 carbons and corresponds to the definitions given for alkyl, further containing one, two or three double or triple bonds. Unsaturated alkyl is e.g. 1-alkenyl or 1-alkynyl. Substituents considered are for example aryl, e.g. phenyl, lower alkoxy, e.g. methoxy, lower acyloxy, e.g. acetoxy, or halogen, e.g. chloro.

Cycloalkyl has preferably 3 to 7 carbon atoms, and is e.g. cyclopropyl, cyclopentyl, cyclohexyl or cycloheptyl, in particular cyclohexyl, optionally substituted by lower alkyl, e.g. methyl, lower alkoxy, e.g. methoxy, lower acyloxy, e.g. acetoxy, or halogen, e.g. chloro.

Unsaturated cycloalkyl corresponds to the definitions given for cycloalkyl, further containing one or two double bonds. Unsaturated cycloalkyl is e.g. cyclopentenyl or cyclohexenyl, being optionally substituted by substituents listed under cycloalkyl.

Heterocyclyl has preferably 3 to 12 atoms comprising 1 to 5 hetero atoms selected from nitrogen, oxygen and sulfur, and is, for example, pyrrolidinyl, tetrahydrofuranyl, piperidinyl, morpholinyl or dioxolanyl. Substituents considered are e.g. lower alkyl, e.g. methyl, benzyl, lower alkoxy, e.g. methoxy, lower acyloxy, e.g. acetoxy, or halogen, e.g. chloro. Lower alkyl or benzyl substituents may be connected to a ring nitrogen atom.

Unsaturated heterocyclyl corresponds to the definitions given for heterocyclyl, further containing one or two double bonds. Unsaturated heterocyclyl is e.g. dihydro- or tetrahydropyridyl, and is optionally substituted by substituents listed under heterocyclyl.

L₁ is a label, a plurality of same or different labels, a bond connecting R₃ to A forming a cyclic substrate, or a further group A.

L₂ is a label, a plurality of same or different labels, a group A, or hydrogen, preferably a label, a plurality of same or different labels, or a group A.

Labels L₁ and L₂ can be chosen by those skilled in the art dependent on the application for which the compounds of formula (1) and (2), respectively, are intended. The labels may be e.g. such that the binding partner B after reaction with the compound of formula (1) then carrying label L₁ is easily detected or separated from its environment. Other labels considered are those which are capable of sensing and inducing changes in the environment of the labeled binding partner B of compound (1), or labels which aid in manipulating the binding partner B of compound (1) by the physical and/or chemical properties of the compound of formula (1) carrying the label. Furthermore the combination of labels L₁ and L₂ can be chosen such that, e.g., after reaction of binding partner B with a compound of formula (1) the label L₁ is used to isolate and purify B, and then on reaction with a compound of formula (2) the label L₂ replacing label L₁ is used to further manipulate B, as will be described in more detail hereinbelow.

Labels L₁ and L₂ considered are those disclosed in International Patent Application WO 2004/031405 (as a label L).

Examples of labels L₁ and L₂ include a spectroscopic probe such as a fluorophore, a chromophore, a magnetic probe or a contrast reagent; a radioactively labelled molecule; a molecule which is one part of a specific binding pair which is capable of specifically binding to a partner; a molecule that is suspected to interact with other biomolecules; a library of molecules that are suspected to interact with other biomolecules; a molecule which is capable of crosslinking to other molecules; a molecule which is capable of generating hydroxyl radicals upon exposure to H₂O₂ and ascorbate, such as a tethered metal-chelate; a molecule which is capable of generating reactive radicals upon irradiation with light, such as malachite green; a molecule covalently attached to a solid support, where the support may be a glass slide, a microtiter plate or any polymeric structure known to those proficient in the art; a nucleic acid or a derivative thereof capable of undergoing base-pairing with its complementary strand; a lipid or other hydrophobic molecule with membrane-inserting properties; a biomolecule with desirable enzymatic, chemical or physical properties; or a molecule possessing a combination of any of the properties listed above. Preferred labels L₁ are spectroscopic probes, molecules which are one part of a specific binding pair which is capable of specifically binding to a partner, so- called affinity labels, and molecules covalently attached to a solid support. Preferred labels L₂ are spectroscopic probes, and molecules which are one part of a specific binding pair which is capable of specifically binding to a partner.

When the label L₁ or L₂ is a fluorophore, a chromophore, a magnetic label, a radioactive label or the like, detection is by standard means adapted to the label and whether the method is used in vitro or in vivo. The method can be compared to the applications of the green fluorescent protein (GFP) which is genetically fused to a protein of interest and allows protein investigation in the living cell. Particular examples of labels L₁ and L₂ are also boron compounds displaying non-linear optical properties, or a member of a FRET pair which changes its spectroscopic properties on reaction of the compound of formula (1) with the binding partner B.

Depending on the properties of the label L₁ or L₂, the binding partner B may be bound to a solid support. The label L₁ or L₂ may already be attached to a solid support when entering into reaction with binding partner B, or with binding partner B bound to the compound of formula (1), respectively, or may subsequently, i.e. after the corresponding reaction, be used to attach the binding partner B to a solid support. The label may be one member of a specific binding pair, the other member of which is attached or attachable to the solid support, either covalently or by any other means. A specific binding pair considered is e.g. biotin and avidin or streptavidin. Either member of the specific binding pair may be the label L₁ or L₂, the other being attached to the solid support. Further examples of labels allowing convenient binding to a solid support are e.g. maltose binding protein, glycoproteins, FLAG tags, or reactive substituents allowing chemoselective reaction between such substituent with a complementary functional group on the surface of the solid support. Examples of such pairs of reactive substituents and complementary functional group are e.g. amine and activated carboxy group forming an amide, azide and a propiolic acid derivative undergoing a 1,3-dipolar cycloaddition reaction, amine and another amine functional group reacting with an added bifunctional linker reagent of the type of activated dicarboxylic acid derivative giving rise to two amide bonds, or other combinations known in the art.

Examples of a convenient solid support are e.g. glass surfaces such as glass slides, microtiter plates, and suitable sensor elements, in particular functionalized polymers (e.g. in the form of beads), chemically modified oxidic surfaces, e.g. silicon dioxide, tantalum pentoxide or titanium dioxide, or also chemically modified metal surfaces, e.g. noble metal surfaces such as gold or silver surfaces. Irreversibly attaching and/or spotting a compound of formula (1) may then be used to attach binding partners B in a spatially resolved manner, particularly through spotting, on the solid support representing protein microarrays, DNA microarrays or arrays of small molecules.

When the label L₁ or L₂ is capable of generating reactive radicals, such as hydroxyl radicals, upon exposure to an external stimulus, the generated radicals can then inactivate the binding partner B, e.g. an AGT fusion protein, as well as those proteins that are in close proximity of such AGT fusion protein, allowing to study the role of these proteins. Examples of such labels are tethered metal-chelate complexes that produce hydroxyl radicals upon exposure to H₂O₂ and ascorbate, and chromophores such as malachite green that produce hydroxyl radicals upon laser irradiation. The use of chromophores and lasers to generate hydroxyl radicals is also known in the art as chromophore assisted laser induced inactivation (CALI).

Other labels considered are for example fullerenes, boranes for neutron capture treatment, nucleotides or oligonucleotides, e.g. for self-addressing chips, peptide nucleic acids, and metal chelates, e.g. platinum chelates that bind specifically to DNA.

If L₁ or L₂ represents two or more labels, these labels may be identical or different. Particularly preferred combinations are two different affinity labels, or one affinity label and one chromophore label, in particular one affinity label and one fluorophore label. L₁ or L₂ representing a plurality of labels may consist of up to 100 same or different labels, preferably up to 5 same or different labels, and may also comprise appropriately functionalized dendritic structures, where the outer sphere of such a dendritic structure carries the same or different label molecules.

If L₁ or L₂ is a group A, the reaction with a binding partner B leads to dimerization of such binding partner B. The chemical structure of such dimers may be either symmetrical (homodimers) or unsymmetrical (heterodimers).

If L₂ is hydrogen, the compound of formula (2) represents a simple azide N₃-R₄-H. On reaction of such a simple azide, this compound serves as a scavenger reagent, cleaving the entity

from the group A and the binding partner B without introducing another label, as will be described in more detail hereinbelow.

The present invention relates also to novel azides of formula (2)

N₃-R₄-L₂ (2)

wherein L₂ is a label, a plurality of same or different labels, or a group A, and R₄ is a linker. Such novel azides are e.g. compounds of formula (2) wherein L₂ is a group A, in particular a group A which is recognized as a substrate by an optionally modified O⁶-alkylguanine-DNA alkyltransferase or by an acyl carrier protein, such as the substituted benzylguanine (47) described below. Other novel azides are e.g. compounds of formula (2) with a particular combination of linker R₄ and label L₂, for example compounds (40), (42) and (44) described below.

Use of compounds of formula (1) and (2)

The present invention provides a new method and reagents for labeling proteins of interest and for chemoselective cleavage and re-labeling of labeled proteins of interest, leaving the proteins of interest in their native state. This approach is particularly useful for in vitro protein purification and labeling. The reaction can be carried out under a wide range of mild reaction conditions, making this application particularly suitable for the chemical modification of protein tags and fusion proteins with proteins of interest.

The method features the specific transfer of a "molecular probe" (L₁) to a target peptide sequence (peptide of interest incorporated in binding partner B). In some embodiments, the interaction between the molecular probe L₁ and the target peptide sequence is non-covalent, i.e. when the interaction of group A with the binding partner B is non-covalent. Most preferred is the specific and covalent introduction of the molecular probe L₁ to an AGT or ACP fusion protein with the protein of interest. At a particular time, the molecular probe L₁ can be cleaved off and replaced by a new (the same or different) label L₂ by a chemoselective ligation/cleavage reaction that can be carried out under a variety of different mild reaction conditions, which do not interfere with the native state of the protein of interest. The reagent used to re-label or cleave the first label employs the functional group of an azide. Azides (e.g. compounds of formula (2)) do not appear in biomolecules and are therefore termed "bio-orthogonal". Because they also have a high intrinsic reactivity towards phosphines, they are well suited as reagents for the chemoselective reaction in the Staudinger ligation as described in this invention.

In the preferred embodiment, the present invention provides a possibility for covalent labeling of AGT or ACP fusion proteins (as the binding partner B) and to exchange the first label L₁ by a second label L₂ in a highly specific manner, and at the same time conserving the covalent nature of the bond between fusion protein and label.

In particular, the invention relates to a method for detecting and/or manipulating a protein of interest, wherein the protein of interest is incorporated into a binding partner B, most preferably an AGT or ACP fusion protein, the binding partner B is contacted with a compound of formula (1), i.e. a particular substrate for binding partner B carrying a properly substituted phosphine entity and a label as described hereinbefore, and the reaction product of binding partner B and the compound of formula (1) is detected and/or further manipulated using the label. "Manipulation" is understood to mean any physical or chemical treatment. For instance, manipulation may mean isolation from cells, purification, quantification, reaction with a compound of formula (2), and/or reaction with a reaction partner for the label. Such manipulation may be dependent on the label L₁ and/or label L₂, and may comprise several steps in any sequence.

In the method for detecting and/or manipulating, if the reaction product of binding partner B and the compound of formula (1) is both manipulated and detected, detection may be before or after manipulation, or may occur during manipulation as defined herein.

The reaction of binding partner B with the compound of formula (1), and optionally also with a compound of formula (2), can generally be performed in vitro, either in cell extracts or with already purified or enriched forms of binding partner B.

An overview of the application is illustrated by Scheme 1, which, however, in no way limits the invention as described. A protein or peptide of interest is fused to a tag that specifically recognizes (reacts with) the group A of compound (1). Binding partner B as defined hereinbefore then corresponds to the entity "protein of interest — tag" (Scheme 1). The "tags" are preferably mutants of O⁶-alkylguanine-DNA alkyltransferase (AGT) or an acyl carrier protein (ACP). The protein or peptide of interest may be of any length, with or without secondary, tertiary or quaternary structure, and preferably consists of at least twelve amino acids and up to 2000 amino acids, preferably between 50 and 1000 amino acids.

A-R₁-R₂-X(PYZ)-R₃-L₁ (1)

[protein of interest]-[tag]-R₁-R₂-X(PYZ)-R₃-L₁ (6)

N₃-R₄-L₂ (2)

[protein of interest]-[tag]-R₁-R₂-NH-R₄-L₂ (7)

Scheme 1

The "protein of interest — tag" is contacted with a particular substrate of formula (1). Conditions of reaction are selected such that the "tag" reacts with the compound of formula (1) and thereby transfers the label L₁ to the binding partner B, i.e. to the protein of interest. Usual conditions are a buffer solution in a pH range of 5 to 9, and a temperature between 4 °C and 50 °C, i.e. conditions under which the protein of interest remains unchanged. However, it is understood that some proteins of interest and corresponding binding partners B react also under a variety of other conditions, and the conditions mentioned here do not limit the scope of the invention.

In the resulting product of formula (6), group A may still be present or may be displaced as a result of the reaction with the binding partner B. This is the case, for example, in the preferred method wherein group A is a purine type group of formula (3) and the "tag" is AGT, or wherein group A is a coenzyme A type group of formula (4) and the "tag" is ACP. The next two steps in Scheme 1 then demonstrate the reaction of compound (6) with a compound of formula (2) corresponding to the Staudinger ligation. The intermediate formed in the reaction of the phosphine with the azide under loss of nitrogen, the iminophosphorane, is further converted to the final product of formula (7), now carrying the label L₂ connected to the protein of interest, while simultaneously cleaving off a phosphine oxide together with label L₁, which was originally bound to the protein of interest. In this reaction the original electrophilic functional group R₂ is changed to a related functional group now carrying an amino residue. If L₂ is hydrogen, the conversion of compound (6) to compound (7) corresponds to a simple cleavage of label L₁ from the binding partner B.
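The label bookkeeping of the conversion of (6) into (7) can be sketched in a few lines of Python. This is only an illustrative model of which label ends up where, not a chemical simulation; the class and function names (`LabeledProtein`, `Azide`, `staudinger_exchange`) are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LabeledProtein:
    """Compound (6): tagged protein carrying the phosphine and label L1."""
    protein: str
    tag: str      # e.g. "AGT" or "ACP"
    label: str    # L1


@dataclass(frozen=True)
class Azide:
    """Compound (2), N3-R4-L2; label=None stands for L2 = hydrogen."""
    label: Optional[str]


@dataclass(frozen=True)
class ExchangeResult:
    product_label: Optional[str]  # label on compound (7); None means cleavage only
    released_label: str           # L1, leaving together with the phosphine oxide
    byproduct: str                # nitrogen lost on iminophosphorane formation


def staudinger_exchange(p: LabeledProtein, azide: Azide) -> ExchangeResult:
    """Replace L1 by L2 (or simply cleave L1 when L2 is hydrogen)."""
    return ExchangeResult(product_label=azide.label,
                          released_label=p.label,
                          byproduct="N2")


if __name__ == "__main__":
    bound = LabeledProtein(protein="protein of interest", tag="AGT", label="biotin")
    print(staudinger_exchange(bound, Azide("fluorescein")))  # re-labeling
    print(staudinger_exchange(bound, Azide(None)))           # scavenger azide
```

The second call corresponds to the scavenger case in which the azide carries no label and L₁ is only cleaved off.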

This reaction with an azide of formula (2) is accomplished in aqueous buffered solution. Usual conditions are again a buffer solution in a pH range of 5 to 9, and a temperature between 4 °C and 50 °C, i.e. conditions under which the protein of interest remains unchanged. The buffer used depends upon the properties of the protein of interest incorporated in the binding partner B. The preferred concentration of the compound of formula (2) is in the range of 1 μM up to 100 mM. The reaction time can be varied between 1 min and 24 hours.
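As a minimal sketch, the usual ranges stated above can be written down as a small lookup that flags planned conditions falling outside them. The names `Conditions`, `USUAL_RANGES` and `outside_usual` are hypothetical, and the pH of 7.5 in the usage example is an assumption for illustration; the ranges themselves are described in the text as usual or preferred, not as limits of the invention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conditions:
    ph: float
    temp_c: float       # temperature in degrees Celsius
    azide_molar: float  # concentration of the compound of formula (2), mol/l
    time_min: float     # reaction time in minutes


# Usual/preferred ranges as stated in the description (not limiting).
USUAL_RANGES = {
    "ph": (5.0, 9.0),
    "temp_c": (4.0, 50.0),
    "azide_molar": (1e-6, 100e-3),  # 1 uM to 100 mM
    "time_min": (1.0, 24 * 60.0),   # 1 min to 24 h
}


def outside_usual(c: Conditions) -> list:
    """Return the names of all parameters outside the usual ranges."""
    return [name for name, (lo, hi) in USUAL_RANGES.items()
            if not lo <= getattr(c, name) <= hi]


if __name__ == "__main__":
    # 0.5 mM azide, 25 degrees C, 1 h; pH 7.5 is an assumed value.
    print(outside_usual(Conditions(ph=7.5, temp_c=25, azide_molar=0.5e-3, time_min=60)))
```

An empty list means that all four parameters lie within the usual ranges.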

Method of manufacture

Methods of manufacture of novel substrates and intermediates are also an object of this invention. These methods are generally known in the art, are chosen so as to best produce the preferred substrates of the invention, and are exemplified hereinbelow. Particular methods for the synthesis of the A group of compound (1), wherein A is a substrate for AGT, and of combinations of A with a linker, of labels combined with a linker, and the use of such intermediates are disclosed in International Patent Application WO 2004/031405.

For example, compounds of formula (1) are manufactured by reacting

A) a compound of formula (1C)

with a compound of formula A-R₁^B, optionally in protected form, wherein the substituents have the meaning as given under formula (1) and R₁^A and R₁^B react with each other to give linker R₁;

B) a compound of formula (1D)

with a compound of formula R₃^B-L₁, optionally in protected form, wherein the substituents have the meaning as given under formula (1) and R₃^A and R₃^B react with each other to give linker R₃;

C) for the synthesis of compounds of formula (1) wherein R₂ is an ester, a compound of formula (1E)

with a carboxylic acid of the formula A-R₁-COOH, optionally in protected form, wherein the substituents have the meaning as given under formula (1);

and in an obtainable compound optional protecting groups are cleaved, and an obtainable compound of formula (1) is optionally converted into another compound of formula (1).

In particular, in the preferred synthesis of a compound of formula (1), wherein a precursor comprising residue X is elaborated by building chains that finally carry the entities A and L₁, use is made of orthogonally protected functional groups. Such a choice of protective groups allows for separate deprotection, so that each functionality released in turn can be further chemically manipulated, either to attach a label to it or to introduce a further extension of a linker. Appropriate protecting groups for the envisioned functionalities can be chosen by those skilled in the art, and are e.g. summarized in T. W. Greene and P. G. M. Wuts, "Protective Groups in Organic Synthesis", John Wiley & Sons, New York 1991.

Most preferred are the synthetic methods and intermediates described in the Schemes and Examples.

Triarylphosphines containing diverse substituted aromatic substituents are accessible via palladium-catalyzed P-C coupling reactions between appropriate phosphines and aromatic iodo- or bromo-aryl compounds (D. J. Brauer et al., J. Organomet. Chem. 2002, 645, 14-26). In most cases, no protecting groups are required.

The synthesis of an intermediate useful in the preparation of compounds of formula (1) wherein R₂ is an ester group is summarized in Scheme 2. The iodo compound (9) is obtained in a Sandmeyer type reaction from the corresponding commercially available aniline (8). Palladium-catalyzed P-C coupling is performed as described in D. J. Brauer et al., loc. cit., to yield phosphine (10). The commercially available diamine (11) is coupled to (10) under standard peptide coupling conditions using benzotriazol-1-yloxy-tripyrrolidinophosphonium hexafluorophosphate (PyBOP) in dimethylformamide (DMF) as an activation reagent for the phosphine-carboxylic acid (10). The building block (12) is subsequently coupled to a carboxylic acid N-succinimide ester activated solid support of formula (13). The benzylguanine derivative (15) is activated in situ (PyBOP, triethylamine; DMF) and coupled to the phenolic functional group in (14) to yield the appropriate functionalized surface (16) for the immobilization of AGT fusion proteins.

Scheme 2: compounds (8), (9), (10), (12), (14); reagent PyBOP

The synthesis of an intermediate useful in the preparation of compounds of formula (1) wherein R₂ is a carbamate group (i.e. an ester group bound to nitrogen) is summarized in Scheme 3. The phenolic group of the functionalized solid support (14) is activated by the reaction with 4-nitrophenyl chloroformate (17) to yield the carbonate (18). The subsequent reaction of (18) with the amino-benzylguanine (19) forms the desired carbamate of formula (20).

Scheme 3: compounds (14), (18), (20)

Scheme 4 provides a method for the synthesis of an intermediate useful in the context of a compound of formula (1) wherein X is a substituted benzyl group. The synthesis is based on the addition of diphenylphosphine to 4-formylbenzoic acid (21) to form alcohol (22). This building block is extended with the polyethylene glycol linker (11), and the resulting compound (23) is coupled to the solid support (13), followed by esterification with benzylguanine-carboxylic acid derivative (15). This leads to the desired modified solid support of formula (24).

Scheme 4: reagent PyBOP (two coupling steps)

The synthesis of an intermediate useful in the preparation of compounds of formula (1) wherein A is phosphopantetheine (a derivative of coenzyme A) is shown in Scheme 5. Esterification of the phenolic hydroxy group in phosphine (10) with the maleimido-carboxylic acid (25) and subsequent immobilization of the resulting compound (26) to an amino-functionalized solid support (27) yields a thiol reactive surface (28). The free thiol group of a suitable coenzyme A derivative is added to the double bond of maleimide (28) to give (29), a compound of formula (1) representing a CoA covered surface suitable for reaction with an ACP fusion protein.

Scheme 5: compounds (28), (29)

The synthesis of a compound of formula (1) wherein R₂ is an ester group and L₁ is biotin is summarized in Scheme 6. Phosphine (10) and the benzylguanine-carboxylic acid derivative (15) are coupled by in situ activation of (15) to yield ester (30). This compound is coupled to amine (31) (E. Saxon, C. R. Bertozzi, Science 2000, 287, 2007-2010) using 1-hydroxybenzotriazole, 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide or PyBOP.

Scheme 6

The synthesis of an intermediate useful in the synthesis of compounds of formula (1) wherein R₂ is a protected precursor of a thioester group is summarized in Scheme 7. Phenylphosphine and m-iodobenzoic acid (33) are coupled with palladium catalysis (D. J. Brauer et al., J. Organomet. Chem. 2002, 645, 14-26), reacted with paraformaldehyde and converted to the air-stable protected borane-diphenylphosphine complex (35) with diborane. Subsequently (35) is coupled to the free amino group of an appropriate solid support (27) via standard peptide coupling chemistry (PyBOP, triethylamine). The alcohol group in the phosphine-borane complex (35) is activated by methanesulfonyl chloride (MsCl) and converted to the protected thiol (37) with thioacetic acid (using triethylamine as a base). The phosphinothiol (37) may be further elaborated to a compound of formula (1) in analogy to the reactions in Schemes 2 to 6.

Scheme 7: compounds (36), (37)

Examples for the synthesis of compounds of formula (2) are shown in Schemes 8 to 10. A useful intermediate is 11-azido-3,6,9-trioxaundecanamine (38) prepared according to A. W. Schwabacher et al., J. Org. Chem. 1998, 63, 1727-1729. The synthesis of several compounds of formula (2) is summarized in Scheme 8. (38) is coupled with N-succinimidyl esters of formula (39), (41) and (43) bearing a biotin, digoxigenin or fluorophore label, respectively, to give azides of formula (40), (42) and (44), respectively.

Scheme 8: compounds (41), (42)

A compound of formula (2) wherein L₂ is a group A, namely a benzylguanine type compound useful for reaction with AGT, is shown in Scheme 9. Aminomethyl-benzylguanine (45) is acylated with chloroacetic anhydride in methanol to give the intermediate (46). Displacement of the halide with sodium azide provides the desired compound (47).

Scheme 9: compounds (45), (46), (47)

As it will be readily appreciated by those skilled in the art, various modifications of the described reactions with compounds bearing different functional substructures can be easily made, and these modifications are well within the scope of the invention.

Examples

Example 1: O⁶-[4-(13-{4-[4-(13-Biotinylamido-4,7,10-trioxa-tridecyl-aminocarbonyl)-2-diphenylphosphino-phenyloxycarbonyl]-butanoylamino}-2,5,8,11-tetraoxa-tridecyl)-benzyl]guanine (32)

O⁶-[4-(13-Glutarylamido-2,5,8,11-tetraoxa-tridecyl)-benzyl]guanine (Example 1a, 50 mg, 0.09 mmol) is dissolved in 2 ml dimethylacetamide, and benzotriazol-1-yloxy-tripyrrolidinophosphonium hexafluorophosphate (PyBOP, 51 mg, 0.1 mmol) and diisopropylethylamine (20 μl) are added. After stirring at room temperature for 15 min, N-(13-biotinylamido-4,7,10-trioxa-tridecyl)-3-diphenylphosphino-4-hydroxybenzamide (Example 1b, 67 mg, 0.09 mmol) is added and the reaction mixture heated to 50 °C for 1 min, then stirred at room temperature for 8 h. The crude product is precipitated in 50 ml diethyl ether, all organic solvents are decanted and the precipitate is dried in vacuo. The precipitate is further purified via reversed phase medium pressure liquid chromatography (MPLC).

Example 1a: O⁶-[4-(13-Glutarylamido-2,5,8,11-tetraoxa-tridecyl)-benzyl]guanine (15)

O⁶-[4-(13-Amino-2,5,8,11-tetraoxa-tridecyl)-benzyl]guanine (491 mg, 1.10 mmol) is dissolved in 10 ml dimethylacetamide by heating the solution to 80 °C for 5 min. After cooling to room temperature, glutaric anhydride (126 mg, 1.10 mmol), N,N-dimethylaminopyridine (DMAP, 60 mg, 0.48 mmol) and diisopropylethylamine (170 μl) are added and the reaction mixture is stirred at room temperature for 20 h. The reaction mixture is poured into 200 ml of diethyl ether and the organic phase decanted. The resulting residue is dried in vacuo and the product used without further purification.

Example 1b: N-(13-Biotinylamido-4,7,10-trioxa-tridecyl)-3-diphenylphosphino-4-hydroxy-benzamide

N-(13-Amino-4,7,10-trioxa-tridecyl)-3-diphenylphosphino-4-hydroxy-benzamide (Example 1c, 53 mg, 0.1 mmol) is dissolved in 1 ml DMF followed by the addition of N-biotinyl-succinimide (Biotin-NHS, 38 mg, 0.11 mmol) and triethylamine (15 μl). The reaction mixture is stirred at room temperature overnight and the product purified via reversed phase MPLC.

Example 1c: N-(13-Amino-4,7,10-trioxa-tridecyl)-3-diphenylphosphino-4-hydroxy-benzamide (12)

3-Diphenylphosphino-4-hydroxy-benzoic acid (Example 1d, 100 mg, 0.31 mmol) and PyBOP (177 mg, 0.34 mmol) are dissolved in 5 ml DMF/CH₃CN (1:1) and stirred for 5 min at room temperature. 4,7,10-Trioxa-1,13-tridecanediamine (136 mg, 0.62 mmol) and triethylamine (46 μl) are added and the reaction mixture is stirred at room temperature for 48 h. All volatiles are removed in vacuo and the residue is purified by flash column chromatography (CH₂Cl₂/MeOH 95:5) to yield 142 mg (0.27 mmol, 87%) of the title compound as a colorless wax. ESI-MS 525.2 [M+H]⁺.
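The stoichiometry of this coupling can be checked from the stated masses with a short calculation. The sketch below is illustrative only; the molecular weights are assumptions computed from the compound names (they are not given in the text), and the helper name `mmol` is hypothetical.

```python
# Average molecular weights in g/mol; assumed values derived from the
# compound names, not taken from the application.
MW = {
    "acid_10": 322.30,     # 3-diphenylphosphino-4-hydroxy-benzoic acid, C19H15O3P
    "PyBOP": 520.39,
    "diamine_11": 220.31,  # 4,7,10-trioxa-1,13-tridecanediamine, C10H24N2O3
    "amide_12": 524.60,    # title compound, C29H37N2O5P
}


def mmol(mass_mg: float, mw: float) -> float:
    """Amount of substance in mmol from a mass in mg and a molecular weight in g/mol."""
    return mass_mg / mw


acid = mmol(100, MW["acid_10"])                        # limiting reagent
equiv_pybop = mmol(177, MW["PyBOP"]) / acid            # equivalents of activating agent
equiv_diamine = mmol(136, MW["diamine_11"]) / acid     # equivalents of diamine
yield_percent = 100 * mmol(142, MW["amide_12"]) / acid

if __name__ == "__main__":
    print(f"PyBOP: {equiv_pybop:.1f} equiv, diamine: {equiv_diamine:.1f} equiv")
    print(f"isolated yield: {yield_percent:.0f}%")
```

With these assumed molecular weights the masses correspond to about 1.1 equivalents of PyBOP and about 2 equivalents of diamine relative to the acid, and the isolated 142 mg corresponds to the stated yield of 87%.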

Example 1d: 3-Diphenylphosphino-4-hydroxy-benzoic acid (10)

4-Hydroxy-3-iodo-benzoic acid (Example 1e, 264 mg, 1 mmol) and Pd(OAc)₂ (2.2 mg) are suspended in 4 ml dry acetonitrile, and 0.5 ml triethylamine is added. The mixture is degassed in an ultrasonic bath while passing dry argon through it for 30 min. Diphenylphosphine (0.175 ml, 1 mmol) is added and the mixture heated to reflux for 48 h. After cooling to room temperature all volatiles are removed in vacuo. The residue is dissolved in KOH (1 N, 2.5 ml) and washed with diethyl ether (3 times 10 ml). The aqueous solution is cooled in an ice bath and 2 N HCl is added until pH 2. The aqueous phase is decanted, the precipitate redissolved in diethyl ether, the solution dried over MgSO₄ and the solvent evaporated to give a slightly yellow foam (155 mg, 48%). ESI-MS 323.6 [M+H]⁺.
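The catalyst loading is not stated explicitly in this example but follows from the quantities given. The sketch below estimates it together with the yield; the molecular weights are assumptions (Pd(OAc)₂ as the simple formula unit, the product as C₁₉H₁₅O₃P) and the variable names are hypothetical.

```python
PD_OAC2_MW = 224.51   # g/mol, Pd(OAc)2 formula unit (assumed)
PRODUCT_MW = 322.30   # g/mol, C19H15O3P (assumed from the compound name)

substrate_mmol = 1.0  # 4-hydroxy-3-iodo-benzoic acid, as stated in the text

pd_mmol = 2.2 / PD_OAC2_MW                        # 2.2 mg of catalyst
loading_mol_percent = 100 * pd_mmol / substrate_mmol

product_mmol = 155 / PRODUCT_MW                   # 155 mg of isolated foam
yield_percent = 100 * product_mmol / substrate_mmol

if __name__ == "__main__":
    print(f"Pd loading: {loading_mol_percent:.2f} mol%")
    print(f"yield: {yield_percent:.0f}%")
```

Under these assumptions the palladium loading is close to 1 mol%, and the calculated yield agrees with the stated 48%.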

Example 1e: 4-Hydroxy-3-iodo-benzoic acid (9)

3-Amino-4-hydroxybenzoic acid (1.0 g, 6.53 mmol) is dissolved in 10 ml conc. HCl and cooled to 0 °C in an ice bath. Under vigorous stirring NaNO₂ (540 mg, 7.84 mmol) dissolved in 2 ml of water is added dropwise and the mixture stirred at room temperature for 40 min. The mixture is subsequently filtered through glass wool into a solution of KI (10.82 g, 65 mmol) in 14 ml H₂O. The solution is stirred for 1.5 h, diluted with CH₂Cl₂ (150 ml) and washed twice with 50 ml sat. Na₂SO₃, 50 ml water, and 50 ml of brine. The organic layer is dried over MgSO₄ and the solvent evaporated. The residue is purified by flash column chromatography (cyclohexane:ethyl acetate 10:1) to yield 250 mg (0.95 mmol, 15%) of the title compound as a colorless solid. ESI-MS 265.3 [M+H]⁺.

Example 2: O⁶-[4-{4-[4-(13-Biotinylamido-4,7,10-trioxa-tridecyl-aminocarbonyl)-2-diphenylphosphino-phenyloxycarbonyl]-butanoylaminomethyl}-benzyl]guanine

O⁶-(4-Glutarylamidomethyl-benzyl)guanine (Example 2a, 54 mg, 0.14 mmol) is suspended in 3 ml dimethylacetamide, and PyBOP (84 mg, 0.16 mmol) and diisopropylethylamine (50 μl) are added. After 5 min a clear solution is formed. After stirring at room temperature for 15 min, N-(13-biotinylamido-4,7,10-trioxa-tridecyl)-3-diphenylphosphino-4-hydroxy-benzamide (Example 1b, 105 mg, 0.14 mmol) is added and the reaction mixture heated to 50 °C for 1 min, then stirred at room temperature for 8 h. The crude product is precipitated in 50 ml diethyl ether, all organic solvents are decanted and the precipitate is dried in vacuo. The precipitate is further purified by reversed phase MPLC.

Example 2a: O⁶-(4-Glutarylamidomethyl-benzyl)guanine

O⁶-(4-Aminomethyl-benzyl)guanine (300 mg, 1.10 mmol) is dissolved in 7 ml dimethylacetamide by heating the solution to 80 °C for 5 min. After cooling to room temperature, glutaric anhydride (126 mg, 1.10 mmol), N,N-dimethylaminopyridine (DMAP, 60 mg, 0.48 mmol) and diisopropylethylamine (170 μl) are added and the reaction mixture is stirred at room temperature for 20 h. The reaction mixture is poured into 70 ml water, resulting in a clear solution. The pH is adjusted to 4-5 with 0.5 N HCl and the white precipitate isolated by filtration. The precipitate is washed several times with water and dried in vacuo, yielding 350 mg (0.91 mmol, 82%) of the title compound. ¹H-NMR (400 MHz, DMSO-d₆): 12.10 (br. s, 1H, COOH), 8.32 (t, J=6 Hz, 1H, CONH), 7.81 (br. s, 1H, H-8), 7.43 (d, J=8 Hz, 2H, ArH), 7.24 (d, J=8 Hz, 2H, ArH), 6.27 (br. s, 2H, NH₂), 5.44 (s, 2H, CH₂Ar), 4.24 (d, J=6 Hz, 2H, NHCH₂Ar), 2.17 (m, 4H, CH₂), 1.72 (m, 2H, CH₂). ¹³C-NMR (100.6 MHz, DMSO-d₆): δ 175.1, 172.5, 160.6, 140.5, 136.1, 129.5, 128.2, 67.5, 42.7, 35.3, 34.0, 21.6. MS (ESI) m/z 385.2 [M+H]⁺.
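The reported [M+H]⁺ signal can be compared with the value expected for the title compound. The following sketch computes a monoisotopic mass from a molecular formula; the formula C₁₈H₂₀N₆O₄ is an assumption derived from the compound name, and the function name `monoisotopic_mass` is hypothetical.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647  # mass of a proton in u


def monoisotopic_mass(formula: str) -> float:
    """Sum isotope masses for a simple formula such as 'C18H20N6O4'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


# Assumed formula of O6-(4-glutarylamidomethyl-benzyl)guanine.
expected_mh = monoisotopic_mass("C18H20N6O4") + PROTON

if __name__ == "__main__":
    print(f"expected [M+H]+: {expected_mh:.2f}")
```

Under this assumption the expected value rounds to 385.2, in line with the reported ESI signal.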

Example 3: Substrate (16)

O⁶-[4-(13-Glutarylamido-2,5,8,11-tetraoxa-tridecyl)-benzyl]guanine (Example 1a, 50 mg, 0.09 mmol) is dissolved in 2 ml abs. dimethylacetamide, and PyBOP (51 mg, 0.1 mmol) and diisopropylethylamine (20 μl) are added. After stirring at room temperature for 15 min, 4-dimethylaminopyridine (5 mg) is added, and the NHS-Sepharose 4 Fast Flow (GE-Amersham) modified with N-(13-amino-4,7,10-trioxa-tridecyl)-3-diphenylphosphino-4-hydroxy-benzamide (Example 3a) is incubated with this solution at 2 mM overnight. The supernatant solution is removed and the solid support is washed with 0.5 M NaCl (2 times 20 ml).

Example 3a: Immobilized N-(13-amino-4,7,10-trioxa-tridecyl)-3-diphenylphosphino-4-hydroxy-benzamide (14)

NHS-Sepharose 4 Fast Flow (GE-Amersham) is modified with N-(13-amino-4,7,10-trioxa-tridecyl)-3-diphenylphosphino-4-hydroxy-benzamide (Example 1c) at 1 mM according to the manufacturer's instructions (coupling at 1 mM, blocking with 0.5 M ethanolamine, washing). An appropriate quantity of the resulting resin is dried by brief centrifugation. This and all following steps are performed on Biorad Microbiospin columns.

Example 4: Substrate (20)

NPC-activated solid support (18, Example 4a) is treated with a freshly prepared solution of O⁶-[4-(13-amino-2,5,8,11-tetraoxa-tridecyl)-benzyl]guanine (5 mM) and DMAP (5 mM) in 1 ml anhydrous DMF for 14 h under vigorous shaking. Excess reagent is removed by washing twice with DMF for 5 min. Residual p-nitrophenyl carbonate groups are subsequently quenched by treatment with ethanolamine. The resin is immersed in a 0.5 M solution of ethanolamine in anhydrous DMF for 15 min at room temperature under vigorous shaking, rinsed with anhydrous DMF, washed in anhydrous DMF for 5 min, rinsed extensively with deionized water, and finally washed in deionized water twice for 5 min.

Example 4a: p-Nitrophenoxycarbonyl-activated phosphine-containing beads (18)

Phosphine modified solid support (14, Example 3a) is activated by applying a freshly prepared mixture of p-nitrophenyl chloroformate (NPC, 1 mM) and triethylamine (1 mM) in anhydrous THF for 1 h at room temperature under vigorous shaking. Excess NPC is removed from the solid support by several washings with THF, water and again THF. The resulting resin is dried by brief centrifugation under a flow of nitrogen, then stored in a nitrogen atmosphere until used for further functionalization.

Example 5: Substrate (29)

Coenzyme A disodium salt (5 mg, 0.006 mmol) in 200 μl DMF and 50 μl of 50 mM Tris-Cl pH 7.5 are added to an appropriate quantity of modified beads (28) of Example 5a. The mixture is shaken for 4 hours at room temperature. It is washed several times with CH₃CN/H₂O 1:4, and finally washed in deionized water twice for 5 min.

Example 5a: Immobilized 3-diphenylphosphino-4-[4-(N-maleimidomethyl)-cyclohexyl-carbonyloxy]-benzoic acid (28)

EAH Sepharose 4B (GE-Amersham) is modified with 3-diphenylphosphino-4-[4-(N-maleimidomethyl)-cyclohexyl-carbonyloxy]-benzoic acid (26, Example 5b) at 1 mM according to the manufacturer's instructions (coupling at 1 mM, blocking, washing). An appropriate quantity of the resulting resin is dried by brief centrifugation. This and all following steps are performed on Biorad Microbiospin columns.

Example 5b: 3-Diphenylphosphino-4-[4-(N-maleimidomethyl)-cyclohexyl-carbonyloxy]-benzoic acid (26)

4-N-Maleimidomethyl-cyclohexanecarboxylic acid (100 mg, 0.42 mmol) and PyBOP (219 mg, 0.42 mmol) are dissolved in 3 ml dry DMF and stirred at room temperature for 15 min. 3-Diphenylphosphino-4-hydroxy-benzoic acid (135 mg, 0.42 mmol) and 60 μl triethylamine are added and the reaction mixture is stirred for 8 h. The solvent is removed in vacuo and the product purified by flash column chromatography (CH₂Cl₂/MeOH 10:1).

Example 6: 11-Azido-1-(6-biotinylamido-caproylamino)-3,6,9-trioxa-undecane

6-Biotinylamido-caproic acid N-succinimide (250 mg, 0.55 mmol) and 1-amino-11-azido-3,6,9-trioxa-undecane (240 mg, 1.1 mmol) are dissolved in 5 ml dry DMF with 150 μl TEA. The reaction mixture is stirred at room temperature overnight, then all volatiles are removed in vacuo. The crude product is adsorbed on silica gel and purified by flash column chromatography (gradient CH₂Cl₂:MeOH 50:1 to 10:1), yielding 210 mg (0.37 mmol, 68%) of the title compound. MS (ESI) m/z 558.4 [M+H]⁺.

Example 7: 11-Azido-1-(fluoresceine-5(6)-carboxamido)-3,6,9-trioxa-undecane

5(6)-Carboxyfluoresceine N-succinimide (70 mg, 0.15 mmol) and 1-amino-11-azido-3,6,9-trioxa-undecane (50 mg, 0.22 mmol) are dissolved in 4 ml dry DMF with 70 μl TEA. The reaction mixture is stirred at room temperature overnight, then all volatiles are removed in vacuo. The crude product is purified by MPLC, yielding 52 mg (0.09 mmol, 60%) of the title compound. MS (ESI) m/z 577.4 [M+H]⁺.

Example 8: Immobilization and release of AGT from a modified resin

Resin (16) (100 μl, Example 3) is dried by brief centrifugation on a Biorad Microbiospin column. A solution of AGT (100 μl, 25 μM, a recombinantly expressed variant of human alkylguanine-DNA-alkyltransferase genetically optimized for reactivity, available from Covalys under the tradename SNAP26) is loaded onto the resin and left on the resin at room temperature for 1 h. Subsequently the solution is separated from the resin and the resin is washed 2 times with buffer (100 μl). The washout and the two rinsing solutions are combined. The protein content of the combined washout solutions is compared with an identical protein sample pretreated for 30 min with 100 μM benzylguanine. The protein content indicates that at least 20% of the AGT is bound to the resin.

Afterwards the resin is mixed with 100 μl of 0.5 mM 1-azido-3,6,9-trioxadecane (C. J. Hawker et al., J. Org. Chem. 1994, 59, 3503) and incubated for 1 hour at 25 °C. The solution is removed from the column by centrifugation and combined with 2 times 100 μl wash solution. The protein content is analyzed from an aliquot. At least 20% of the immobilized protein is removed from the column. For further use, eluted protein is separated from low molecular weight labeling compounds by sequential separation over two NAP5 columns (GE-Amersham).
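The molar amounts behind these figures can be worked out directly from the stated volumes and concentrations. The sketch below is a simple illustration; the helper name `nmol` is hypothetical, and the 20% values are the lower bounds given in the text, so the derived protein amounts are lower bounds as well.

```python
def nmol(volume_ul: float, conc_um: float) -> float:
    """Amount in nmol from a volume in microliters and a concentration in micromolar."""
    return volume_ul * conc_um / 1000.0


loaded = nmol(100, 25)            # AGT applied to the resin
bound_min = 0.20 * loaded         # at least 20% bound
released_min = 0.20 * bound_min   # at least 20% of the bound protein released
overall_min = released_min / loaded

azide = nmol(100, 500)            # 100 ul of 0.5 mM (= 500 uM) azide
azide_excess_min = azide / loaded # excess even if all loaded AGT were bound

if __name__ == "__main__":
    print(f"loaded {loaded} nmol, bound >= {bound_min} nmol, released >= {released_min:.2f} nmol")
    print(f"overall recovery >= {overall_min:.0%}, azide excess >= {azide_excess_min:.0f}-fold")
```

The two lower bounds multiply to an overall recovery of at least 4% of the protein loaded, and the azide is present in at least 20-fold molar excess over the protein.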

Example 9: Binding and release with attachment of fluorescent group

Resin (16) (100 μl, Example 3) is loaded with the AGT (SNAP26 as described in Example 8) under identical conditions. For the elution step the resin is mixed with 100 μl of 0.5 mM fluorescein-PEG-azide of Example 7. The eluted protein is sequentially purified over two NAP5 columns to reduce the level of free fluorescein label. The protein solution is loaded onto the resin, afterwards buffer is added to a total volume of 700 μl. Protein is subsequently eluted by further adding buffer and collecting the eluate. From the first column 500 μl of eluate are used for the second purification step. A sample of the resin not loaded with protein is treated in the same way to establish a background value. Fluorescence is read on a plate reader instrument (Tecan, Safire; fluorescein settings). The protein level is estimated after background subtraction to be greater than 10 nM by comparison with fluorescein solutions.

In a further experiment resin (29) (Example 5, carrying CoA) instead of (16) and recombinantly expressed E. coli acyl carrier protein (ACP, 25 μM) mixed with acyl carrier protein synthase (ACPS, 5 μM) instead of AGT are used (volumes identical to the AGT experiment). Incubation and subsequent separation are done under the same conditions as described for the AGT in this example. The protein level is again estimated to be greater than 10 nM by comparison with fluorescein solutions.

Example 10: Binding and release with attachment of affinity group

Example 8 is repeated, however AGT is added at 100 μM concentration. Subsequently AGT is eluted with 100 μl of 0.5 mM biotin-PEG-azide of Example 6. Again at least 4% of the protein initially loaded onto the resin is recovered. About 1 μg of the protein is loaded in duplicate onto an SDS-gel as described below (Example 12). Separation, western blotting, and detection are done as described below. The resulting protein bands are clearly biotin-labeled, while a control with preblocked protein shows no significant bands on the western blot.

Example 11: Docking of AGT with fluorescein

A solution of AGT (100 μl, 20 μM) is labeled with the benzylguanine-phosphine-biotin compound (32) (Example 1, 30 μM) for 1 hour at 25 °C. The resulting mixture is purified over a Biorad Microbiospin column. A fraction of the eluate is mixed with 0.5 mM fluorescein-PEG-azide of Example 7 and incubated for 1 hour at 25 °C. The resulting solution is again separated sequentially over 2 Biorad Microbiospin columns to reduce the concentration of free fluorescein-PEG-azide. As a control the same reaction is performed with AGT preblocked with 100 μM benzylguanine for 30 min at 25 °C. Subsequently protein solutions treated with the benzylguanine-phosphine-biotin compound (32) and protein solutions untreated with this substrate are loaded on an SDS-gel and separated. Quantities of about 1 μg and 5 μg per well are loaded onto the gel. After separation analysis is done on a UV-transilluminator, followed by Coomassie Blue staining. Protein bands are clearly visible for all three samples, while only the sample not preblocked and treated with the fluorescein-PEG-azide shows significant green fluorescence.

Example 12: Docking of AGT with biotin

All steps of the reaction are carried out as given above in Example 11, but the fluorescein-PEG-azide of Example 7 is replaced with the biotin-PEG-azide of Example 6. All samples are loaded in duplicate. After separation the gel is cut in half and one half is stained with Coomassie Blue. The other half is transferred to a semi-dry blotting system (Biorad), proteins are transferred to nitrocellulose and subsequently a western blot detection is performed with streptavidin-horseradish peroxidase conjugate and chromogenic substrate (all materials from Pierce; all conditions used as published by Pierce). In the Coomassie Blue stain all bands are clearly visible, while on the western blot only the bands of non-blocked protein incubated with the biotin-PEG-azide of Example 6 give a clear signal.

Example 13: Docking of ACP with fluorescein

ACP (20 μM) is loaded with phosphine-CoA compound (29) (Example 5, 30 μM) by coincubating with ACP-Synthase (ACPS, 5 μM) for 1 h. Labeled protein is separated over a BioRad Microbiospin column to remove phosphine-CoA not bound to protein.

Fluorescein-PEG-azide of Example 7 is used for the modification of the protein. As a reference, a sample of ACP is incubated with non-modified coenzyme A (50 μM) and ACPS (5 μM) at 25°C for 1 hour. Again unreacted substrate is removed by spin-dialysis. The undiluted samples and a sample diluted 1:5 are loaded on an SDS-gel. After separation, analysis is done on a UV-transilluminator, followed by Coomassie Blue staining. Protein bands are clearly visible for the control and for the sample not preblocked and treated with the fluorescein-PEG-azide, while only the sample not preblocked and treated with the fluorescein-PEG-azide shows significant green fluorescence.

Example 14: Docking of ACP with biotin

All steps of the reaction are carried out as given above in Example 13, but the fluorescein-PEG-azide is replaced with biotin-PEG-azide of Example 6. All samples are loaded in duplicate. After separation the gel is cut in half and one half is stained with Coomassie Blue. The other half is transferred to a semi-dry blotting system (Biorad), proteins are transferred to nitrocellulose and subsequently a western blot detection is performed with streptavidin-horseradish peroxidase conjugate and chromogenic substrate (all materials from Pierce; all conditions as published by Pierce). In the Coomassie Blue stain all bands are clearly visible, while on the western blot only the bands of non-blocked protein incubated with the biotin-PEG-azide derivative give a clear signal.

Example 15: Binding and release of ACP-tag to phosphine-modified beads

Binding and release of an ACP-tagged protein is done at the same concentrations as in Example 14. Resin material offering a phosphine-CoA substrate (29) is used. For binding to the resin, ACP-fusion protein (25 μM) and ACP-Synthase (5 μM) are loaded onto the resin and left on the resin for 1 h. At least 20% of the ACP is retained on the resin. Afterwards the resin is mixed with 1-azido-3,6,9-trioxadecane (100 μl, 0.5 mM) and incubated for 1 hour at 25°C. The solution is removed from the column by centrifugation and combined with 2 times 100 μl wash solution. The protein content is analyzed from an aliquot. During that step at least 20% of the immobilized protein is removed from the column. For further use, eluted protein is separated from the low molecular weight labeling compounds by sequential separation over two NAP5 columns (GE-Amersham) as described in Example 9.
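The two lower bounds stated in Example 15 combine multiplicatively: if at least 20% of the loaded protein is retained on the resin and at least 20% of the retained protein is released by the azide, then at least 4% of the loaded protein is recovered overall, the same figure as the minimum recovery quoted in Example 10. A minimal sketch of this arithmetic follows; the function name is hypothetical and used for illustration only.

```python
def overall_recovery(fraction_bound: float, fraction_released: float) -> float:
    """Fraction of loaded protein recovered after one bind-and-release cycle."""
    for fraction in (fraction_bound, fraction_released):
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    # Release acts only on protein that was bound, so the fractions multiply.
    return fraction_bound * fraction_released


# Lower bounds from Example 15: at least 20% bound, at least 20% of that released.
min_recovery = overall_recovery(0.20, 0.20)
print(f"minimum overall recovery: {min_recovery:.0%}")
```

Because both inputs are lower bounds, the result is itself only a lower bound on the yield of the bind-and-release cycle.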

Claims

1. Compounds of formula (1)

wherein

A is a group that specifically binds to or reacts with a binding partner B;

R₁ and R₃ are, independently of each other, a linker;

R₂ is an electrophilic functional group able to react intramolecularly with an aza-ylide;

X, Y and Z are, independently of each other, aryl or heteroaryl, or an optionally substituted saturated or unsaturated alkyl, cycloalkyl or heterocyclyl group, or X and Y, X and Z or Y and Z together with the phosphorus atom represent a ring;

L₁ is a label, a plurality of same or different labels, a bond connecting R₃ to A forming a cyclic substrate, or a further group A; and

R₃ is bound either to X or to Z.

2. The compound of formula (1) according to claim 1 which is a compound of formula (1A)

A-R₁-R₂-X(-R₃-L₁)-PYZ (1A)

or a compound of formula (1B)

3. The compound according to claim 1 of formula (1A) wherein the substituents are as defined for a compound of formula (1) in claim 1.

4. The compound of formula (1) according to claim 1 wherein A is a group that specifically binds to or reacts with a binding partner B which consists of a peptide sequence comprising a protein of interest and a protein tag.

5. The compound of formula (1) according to claim 4 wherein A is a group recognized as a substrate by an O⁶-alkylguanine-DNA alkyltransferase (AGT), by acyl carrier protein (ACP), by a fragment of AGT or ACP, or by a mutant of AGT or ACP.

6. The compound of formula (1) according to claim 5 wherein A is a group recognized as a substrate by an O⁶-alkylguanine-DNA alkyltransferase (AGT) or by acyl carrier protein (ACP).

7. The compound of formula (1) according to claim 1 wherein

A is a para-substituted O⁶-benzylguanine residue of the formula (3)

or a corresponding 4'-substituted O⁶-thiophenyl-2'-methyl guanine residue specifically reacting with AGT, an AGT fragment or an AGT mutant; a coenzyme A type residue of the formula (4)

specifically reacting with ACP, an ACP fragment or an ACP mutant; a haloalkane reacting specifically with an active-site variant of dehalogenase; a biotin moiety; a synthetic ligand (SLF) which interacts with FKBP12(F36V); a fluorophosphonate reacting specifically with members of the serine hydrolase superfamily; an (acyloxy)methyl ketone reacting with cysteine proteases; a 4-fluoromethyl-1-phosphaphenyl group reacting with tyrosine phosphatases; or a bis-arsenical dye reacting specifically with a tetracysteine-tagged protein.

8. The compound of formula (1) according to claim 7 wherein A is an O⁶-benzylguanine residue of the formula (3) or a coenzyme A type residue of the formula (4).

9. The compound of formula (1) according to claim 1 wherein R₁ and R₃ are, independently of each other, a straight or branched chain alkylene group with 1 to 300 carbon atoms, wherein optionally

(a) one or more carbon atoms are replaced by oxygen, in particular wherein every third carbon atom is replaced by oxygen, e.g. a polyethyleneoxy group with 1 to 100 ethyleneoxy units;

(b) one or more carbon atoms are replaced by nitrogen carrying a hydrogen atom, and the adjacent carbon atoms are substituted by oxo, representing an amide function -NH-CO-;

(c) one or more carbon atoms are replaced by oxygen, and the adjacent carbon atoms are substituted by oxo, representing an ester function -O-CO-;

(d) the bond between two adjacent carbon atoms is a double or a triple bond, representing a function -CH=CH- or -C≡C-;

(e) one or more carbon atoms are replaced by a phenylene, a saturated or unsaturated cycloalkylene, a saturated or unsaturated bicycloalkylene, a bridging heteroaromatic or a bridging saturated or unsaturated heterocyclyl group;

(f) two adjacent carbon atoms are replaced by a disulfide linkage -S-S-;

or a combination of two or more, especially two or three, alkylene and/or modified alkylene groups as defined under (a) to (f) hereinbefore, optionally containing substituents.

10. The compound of formula (1) according to claim 9 wherein R₁ and R₃ are, independently of each other, a straight chain alkylene group of 10 to 40 carbon atoms optionally substituted by oxo wherein 3 to 12 carbon atoms are replaced by oxygen and one or two carbon atoms are replaced by nitrogen.

11. The compound of formula (1) according to claim 1 wherein R₂ is an ester, thioester, carboxamide, or a sulfonic acid ester functional group.

12. The compound of formula (1) according to claim 11 wherein R₂ is an ester functional group.

13. The compound of formula (1) according to claim 11 wherein the sequence -R₁-R₂-X- of formula (1) is -R₁-CO-O-X-.

14. The compound of formula (1) according to claim 1 wherein X, Y and Z are, independently of each other, aryl, or wherein X is an optionally substituted methyl group or imidazolyl, and Y and Z are, independently of each other, aryl.

15. The compound of formula (1) according to claim 14 wherein X, Y and Z are, independently of each other, optionally substituted phenyl.

16. The compound of formula (1) according to claim 14 wherein X, Y and Z are phenyl.

17. The compound of formula (1) according to claim 15 wherein substituent R₂ is in ortho position to substituent -PYZ.

18. The compound of formula (1) according to claim 1 wherein L₁ is a spectroscopic probe, a chromophore, a magnetic probe, a contrast reagent, a radioactively labelled molecule, a molecule which is one part of a specific binding pair which is capable of specifically binding to a partner, a molecule which is capable of crosslinking to other molecules, a molecule which is capable of generating hydroxyl radicals upon exposure to H₂O₂ and ascorbate, a molecule which is capable of generating reactive radicals upon irradiation with light, a molecule covalently attached to a solid support, a nucleic acid or a derivative thereof capable of undergoing base-pairing with its complementary strand, a lipid or other hydrophobic molecule with membrane-inserting properties, a biomolecule with desirable enzymatic, chemical or physical properties, or a molecule possessing a combination of any of the properties listed above.

19. The compound of formula (1) according to claim 18 wherein L₁ is a spectroscopic probe, a molecule which is one part of a specific binding pair which is capable of specifically binding to a partner, or a molecule covalently attached to a solid support.

20. The compound of formula (1) according to claim 1 wherein L₁ is a plurality of labels.

21. A compound of formula (2)

N₃-R₄-L₂ (2)

wherein L₂ is a label, a plurality of same or different labels, or a group A, and R₄ is a linker.

22. The compound of formula (2) according to claim 21 wherein R₄ is a straight or branched chain alkylene group with 1 to 300 carbon atoms, wherein optionally

23. The compound of formula (2) according to claim 22 wherein R₄ is a straight chain alkylene group of 10 to 40 carbon atoms optionally substituted by oxo wherein 3 to 12 carbon atoms are replaced by oxygen and one or two carbon atoms are replaced by nitrogen.

24. The compound of formula (2) according to claim 21 wherein L₂ is a spectroscopic probe, a chromophore, a magnetic probe, a contrast reagent, a radioactively labelled molecule, a molecule which is one part of a specific binding pair which is capable of specifically binding to a partner, a molecule which is capable of crosslinking to other molecules, a molecule which is capable of generating hydroxyl radicals upon exposure to H₂O₂ and ascorbate, a molecule which is capable of generating reactive radicals upon irradiation with light, a molecule covalently attached to a solid support, a nucleic acid or a derivative thereof capable of undergoing base-pairing with its complementary strand, a lipid or other hydrophobic molecule with membrane-inserting properties, a biomolecule with desirable enzymatic, chemical or physical properties, or a molecule possessing a combination of any of the properties listed above.

25. The compound of formula (2) according to claim 24 wherein L₂ is a spectroscopic probe or a molecule which is one part of a specific binding pair which is capable of specifically binding to a partner.

26. A method for detecting and/or manipulating a protein of interest, wherein the protein of interest is incorporated into a binding partner B, the binding partner B is contacted with a compound of formula (1) according to claim 1 comprising a label, and the reaction product of binding partner B and the compound of formula (1) is detected and/or further manipulated using the label.

27. The method of claim 26 further comprising reacting the reaction product of binding partner B and the compound of formula (1) with a compound of formula (2)

N₃-R₄-L₂ (2)

28. The method of claim 26 or 27 wherein binding partner B is an AGT or ACP fusion protein.

29. A method of manufacture of a compound of formula (1) as defined in claim 1 by reacting

A) a compound of formula (1C)

with a compound of formula A-R₁^B, optionally in protected form, wherein the substituents have the meaning as defined under formula (1) and R₁^A and R₁^B react with each other to give linker R₁;

B) a compound of formula (1D)

with a compound of formula R₃^B-L₁, optionally in protected form, wherein the substituents have the meaning as defined under formula (1) and R₃^A and R₃^B react with each other to give linker R₃;

C) for the synthesis of compounds of formula (1), wherein R₂ is an ester, a compound of formula (1E)

with a carboxylic acid of the formula A-R₁-COOH, optionally in protected form, wherein the substituents have the meaning as defined under formula (1);