CN114940979B - Method for improving cation-pi interaction by utilizing genetic code expansion and application - Google Patents

Info

Publication number
CN114940979B
CN114940979B CN202210140263.7A CN202210140263A CN114940979B CN 114940979 B CN114940979 B CN 114940979B CN 202210140263 A CN202210140263 A CN 202210140263A CN 114940979 B CN114940979 B CN 114940979B
Authority
CN
China
Prior art keywords
tryptophan
protein
phd
decoding
cation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210140263.7A
Other languages
Chinese (zh)
Other versions
CN114940979A (en
Inventor
林世贤
赵红霞
刘超
方誉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Chihua Hesheng Pharmaceutical Technology Co ltd
Original Assignee
Hangzhou Chihua Hesheng Pharmaceutical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Chihua Hesheng Pharmaceutical Technology Co ltd
Priority to CN202210140263.7A
Publication of CN114940979A
Application granted
Publication of CN114940979B
Legal status: Active
Anticipated expiration

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/93: Ligases (6)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y: ENZYMES
    • C12Y601/00: Ligases forming carbon-oxygen bonds (6.1)
    • C12Y601/01: Ligases forming aminoacyl-tRNA and related compounds (6.1.1)
    • C12Y601/0102: Phenylalanine-tRNA ligase (6.1.1.20)

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Zoology (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biochemistry (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Biotechnology (AREA)
  • Biomedical Technology (AREA)
  • Medicinal Chemistry (AREA)
  • Peptides Or Proteins (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Cation-pi interactions are an important class of non-covalent intermolecular interactions and play important roles in biology and chemistry. Although great progress has been made in understanding the origin and biological function of cation-pi interactions, little research has addressed the design and synthesis of stronger cation-pi interactions. The invention provides a method for improving cation-pi interactions by genetic code expansion, and applications thereof. Taking histone methylation decoding proteins as an example, a tryptophan analogue bearing a strongly electron-donating side-chain substituent is introduced by genetic code expansion at a tryptophan site in the aromatic cage of the decoding protein. This improves the affinity of the decoding protein for the histone methylation modification and establishes a super-affinity molecular recognition system for recognizing histone methylation modifications.

Description

Method for improving cation-pi interaction by utilizing genetic code expansion and application
Technical Field
The invention relates to a method for improving cation-pi interaction, in particular to a method for improving cation-pi interaction by utilizing genetic code expansion and application thereof, belonging to the technical field of biology.
Background
Non-covalent interactions regulate the structure and function of biomolecules and play a key role in molecular folding and recognition. They include cation-pi interactions, hydrogen bonds, ionic interactions and hydrophobic interactions. The cation-pi interaction is a strong non-covalent interaction between a cation and a pi electron cloud, and it plays an important role in biomolecular self-assembly, molecular recognition, molecular adhesion and molecular folding. A series of recent studies on the origin and principles of cation-pi interactions suggests that they are critical in substrate-receptor binding and in the recognition of histone post-translational modifications. It has been reported that replacing aromatic amino acids in aromatic cages with fluorine-substituted tryptophan analogues weakens the cation-pi interaction, because fluorine is electron-withdrawing. In addition, mutating the aromatic amino acids in the aromatic cage of a protein can significantly reduce or abolish the interaction. Despite great progress in understanding the origin and biological function of cation-pi interactions, the design and synthesis of stronger cation-pi interactions remains essentially unexplored.
Take histone methylation decoding proteins as an example. Histone methylation refers to methyltransferase-mediated methylation of arginine or lysine residues in the N-terminal tails of histones H3 and H4. Decoding proteins recognize these modifications and thereby take part in regulating important life processes such as gene expression, DNA replication, DNA damage repair and cell cycle control. Studying the distribution and abundance of histone methylation is therefore the basis for understanding the histone code and the molecular mechanisms of chromatin regulation. At present, antibodies against histone methylation are the main tools for detecting the genomic distribution and site specificity of histone methylation. Unfortunately, antibodies suffer from sequence-dependent affinity, low substrate resolution, non-specific recognition and suitability for in vitro experiments only, which limits their use and the precise analysis of histone methylation function. New methods for high-affinity detection of histone methylation modifications are therefore needed.
Studies show that histone methylation decoding proteins form a hydrophobic pocket (the aromatic cage) from 2-4 aromatic amino acids and specifically recognize histone methylation modifications through cation-pi interactions. Because of this specificity, detection methods based on decoding protein domains have become an alternative to specific antibodies and have attracted wide attention. For example, the ADD domain of ATRX and the PWWP domain of DNMT3A have been used to capture H3K9me3 and H3K36me3, respectively; the MBT2 domain of L3MBTL1 broadly recognizes mono- and di-methylated lysine and has therefore been developed into a method for capturing the methyl-lysine proteome. Detection methods based on decoding protein domains are easy to engineer, economical and able to capture a variety of PTMs, but the affinity of decoding protein domains for histone methylation modifications is only in the micromolar range, which limits wide application of the technology. High-affinity histone methylation decoding proteins are therefore needed to promote the use of decoding protein domains in the enrichment, imaging and sequencing of histone methylation modifications.
Genetic code expansion (GCE) site-specifically introduces unnatural amino acids with novel structures and unique properties into proteins. It expands the set of building blocks available for protein synthesis and provides a powerful tool for precise protein manipulation and for identifying and optimizing protein function. Invention patent No. ZL 2019 1 0440254.8, entitled "construction of orthogonal aminoacyl-tRNA synthetase/tRNA system using chimeric design method", describes using a protein chimera design to transplant the universal orthogonality of the pyrrolysyl-tRNA synthetase (PylRS)/tRNA pair onto the human mitochondrial phenylalanyl-tRNA synthetase (PheRS)/tRNA pair, yielding a chimeric phenylalanyl-tRNA synthetase (chPheRS)/tRNA system with universal orthogonality. This broadens the range of unnatural amino acids that can be recognized and provides a new tool for genetic code expansion. The chimeric phenylalanyl-tRNA synthetase of this system can specifically recognize tryptophan analogues such as 6-methyl-tryptophan, 7-methyl-tryptophan, 6-chloro-tryptophan, 7-chloro-tryptophan, 6-cyano-tryptophan and 7-cyano-tryptophan.
Disclosure of Invention
The invention aims to provide a method for improving cation-pi interaction by using genetic code expansion, which replaces tryptophan forming the cation-pi interaction in a biological molecule with tryptophan analogues by using genetic code expansion technology, thereby improving the binding energy of the cation-pi interaction.
The technical scheme adopted for solving the technical problems is as follows:
a method for increasing cation-pi interactions using genetic code expansion, which replaces tryptophan forming an aromatic cage of cation-pi interactions in a biomolecule with tryptophan analogues using genetic code expansion techniques to increase binding energy of the cation-pi interactions.
Non-covalent interactions regulate the structure and function of biomolecules and play a key role in molecular folding and molecular recognition. The cation-pi interaction is a strong non-covalent interaction between a cation and a pi electron cloud, and it plays an important role in biomolecular self-assembly, molecular recognition, molecular adhesion and molecular folding, yet methods for enhancing cation-pi interactions have so far been lacking. The invention provides a synthesis method for tryptophan compounds bearing electron-donating groups and develops a method for replacing tryptophan in an aromatic cage by genetic code expansion; the synthesized compounds A1-A6 were verified to significantly increase the binding energy of the cation-pi interaction. Histone methylation modifications play an important role in regulating gene expression, DNA replication, DNA damage repair, the cell cycle and other important life processes. Current methods for detecting histone methylation modifications rely mainly on specific antibodies, but antibodies suffer from sequence-dependent affinity, low substrate resolution, non-specific recognition and suitability for in vitro experiments only. The method is used to establish a super-affinity molecular recognition system for histone methylation modifications, which is applied to the detection, imaging, sequencing and other analyses of histone methylation modifications.
In the invention, a series of tryptophan analogues are site-specifically introduced into the aromatic cage of a decoding protein to tune its binding affinity for histone methylation modifications. This strategy improves the affinity between histone methylation and its decoding protein by 4-8 fold. With a tandem repeat design, the affinity of the PHD decoding protein domain for H3K4me3 reaches the nanomolar level, and a super-affinity molecular recognition system for detecting histone methylation modifications is developed on this basis.
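The link between a fold change in affinity and the "binding energy" mentioned above can be made concrete with the standard thermodynamic relation ΔΔG = -RT ln(fold), where fold is the ratio of the old to the new dissociation constant. The following is a minimal sketch, not part of the patent; the temperature of 298.15 K and the function name `ddg_from_fold` are assumptions chosen for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold(fold, temperature_k=298.15):
    """Change in binding free energy (kcal/mol) when Kd drops by `fold` times.

    The temperature is an assumption; the source does not state the
    temperature at which the affinities were measured.
    """
    if fold <= 0:
        raise ValueError("fold must be positive")
    return -R_KCAL * temperature_k * math.log(fold)


# The 4-8 fold range stated in the text
for fold in (4, 8):
    print(f"{fold}-fold tighter binding: ddG = {ddg_from_fold(fold):.2f} kcal/mol")
```

Under these assumptions a 4-8 fold gain in affinity corresponds to roughly 0.8 to 1.2 kcal/mol of additional binding free energy.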
The method of the invention can be applied to study any biomacromolecule that forms a cation-pi interaction.
Preferably, the method comprises the steps of:
S1, designing and synthesizing a tryptophan analogue substituted with a strongly electron-donating side-chain group, wherein the tryptophan analogue is an unnatural amino acid selected from one of 6-methyl-tryptophan (A1), 6-methoxy-tryptophan (A2), 7-methyl-tryptophan (A3), 7-methoxy-tryptophan (A4), 6,7-methoxy-tryptophan (A5), 6,7-methyl-tryptophan (A6), 7,8-dihydrofuran-tryptophan (A7), 6,7-dihydrofuran-tryptophan (A8), 7,8-furan-tryptophan (A9), 6,7-furan-tryptophan (A10), 6,7-dioxole-tryptophan (A11) or 6,7-cyclopentane-tryptophan (A12), and the structural formulas of the tryptophan analogues A1 to A12 are as follows: [structural formulas A1 to A12]
S2, screening chimeric phenylalanyl-tRNA synthetase mutants that specifically recognize the tryptophan analogues A1 to A12;
S3, taking a biomolecule that forms a cation-pi interaction as the research object, and using genetic code expansion to site-specifically introduce the tryptophan analogue into the biomolecule by means of the chimeric phenylalanyl-tRNA synthetase mutant, thereby obtaining a protein bearing the tryptophan analogue.
Preferably, indoles B substituted at different positions are used as reactants to obtain the target products. The chemical structural formula of the substituted indoles is as follows: [structural formula]. The general structural formula of the target product is as follows: [structural formula], wherein X is selected from one of an oxygen atom or a carbon atom.
Preferably, the synthesis method of the tryptophan analogues A1 to A12 comprises the following steps:
Step one: synthesis of starting material compound B:
starting material B is selected from one of 6-methyl-indole (B1), 6-methoxy-indole (B2), 7-methyl-indole (B3), 7-methoxy-indole (B4), 6,7-methoxy-indole (B5), 6,7-methyl-indole (B6), 7,8-dihydrofuran-indole (B7), 6,7-dihydrofuran-indole (B8), 7,8-furan-indole (B9), 6,7-furan-indole (B10), 6,7-dioxole-indole (B11) or 6,7-cyclopentane-indole (B12), the structural formulas of the indole analogues B1 to B12 being: [structural formulas B1 to B12]
(1) Synthesis of compounds B6, B7, B8, B9 and B10: aniline (G6, G7, G8, G9 or G10) and triethanolamine are used as reactants, with RuCl3·nH2O, SnCl2·2H2O and PPh3 as catalysts, and reacted in anhydrous dioxane to obtain starting material compound B; (2) Synthesis of compounds B11 and B12: aniline (G11 or G12), chloral hydrate and hydroxylamine hydrochloride are used as reactants, with sulfuric acid as the catalyst and water as the solvent, to obtain a crude product; the crude product is reacted with methanesulfonic acid to obtain an isatin product, which is finally reduced with lithium aluminum hydride to obtain starting material compound B;
Step two: synthesis of compound C: starting material compound B and iodine are used as reactants, with potassium hydroxide as the base and anhydrous N,N-dimethylformamide as the solvent, and reacted to obtain intermediate compound C;
Step three: synthesis of compound D: compound C and di-tert-butyl dicarbonate are used as reactants, with triethylamine as the base and DMAP as the catalyst, and reacted in anhydrous dichloromethane to obtain intermediate compound D;
Step four: synthesis of compound E: compound D and Boc-3-iodo-L-alanine methyl ester are used as reactants, with palladium acetate as the catalyst and S-Phos as the ligand, and reacted in anhydrous N,N-dimethylformamide under nitrogen to obtain intermediate compound E;
Step five: synthesis of compound F: compound E is reacted with potassium hydroxide as the base, in methanol and water as solvents, to obtain intermediate compound F;
Step six: synthesis of compound A: compound F is reacted with trifluoroacetic acid in anhydrous dichloromethane as the solvent to obtain the target products (tryptophan analogues A1 to A12).
Preferably, in step four, the palladium acetate catalyst is used at 2 mol% relative to the substrate (compound D);
the reaction time is 1-2 h in step one, 2 h in step two, 8 h in step three, 5 h in step four, 2-3 h in step five and 2 h in step six;
the reaction temperature is 90 ℃ in step one, 0 ℃ in step two, 0 ℃ in step three, 40 ℃ in step four, 25 ℃ in step five and 0 ℃ in step six.
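The six-step route and its preferred conditions can be collected into a small data structure, which makes the total reaction time and the catalyst charge for step four easy to compute. This is a sketch for illustration only: the class and function names are hypothetical, the times and temperatures are the preferred values stated above, and work-up and purification time is not included.

```python
from dataclasses import dataclass

PD_OAC2_G_PER_MOL = 224.51  # molar mass of palladium(II) acetate, Pd(OAc)2


@dataclass(frozen=True)
class Step:
    name: str
    product: str   # compound letter obtained in this step
    hours: tuple   # (shortest, longest) reaction time in hours
    temp_c: int    # reaction temperature in degrees Celsius


# Preferred conditions as listed in the text
ROUTE = [
    Step("step one", "B", (1, 2), 90),
    Step("step two", "C", (2, 2), 0),
    Step("step three", "D", (8, 8), 0),
    Step("step four", "E", (5, 5), 40),
    Step("step five", "F", (2, 3), 25),
    Step("step six", "A", (2, 2), 0),
]


def total_hours(route):
    """Summed reaction time as (shortest, longest), excluding work-up."""
    return (sum(s.hours[0] for s in route), sum(s.hours[1] for s in route))


def pd_acetate_mg(substrate_mmol, loading=0.02):
    """Mass of Pd(OAc)2 in mg for `substrate_mmol` of compound D at 2 mol%."""
    return substrate_mmol * loading * PD_OAC2_G_PER_MOL


print(total_hours(ROUTE))
print(round(pd_acetate_mg(1.0), 2))
```

For example, 1 mmol of compound D calls for about 4.5 mg of palladium acetate at the stated 2 mol% loading.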
Preferably, in S2: (1) chimeric phenylalanyl-tRNA synthetase mutants that specifically recognize the tryptophan analogues are screened by constructing a saturation mutagenesis library of the residues in the amino acid binding pocket of the chimeric phenylalanyl-tRNA synthetase; (2) the efficiency and specificity with which the phenylalanyl-tRNA synthetase mutants recognize the analogues are identified by GFP fluorescence and LC-MS mass spectrometry; (3) the chimeric phenylalanyl-tRNA synthetase mutants obtained by screening are used for expression in hosts such as bacteria, cells or viruses.
To screen for chimeric phenylalanyl-tRNA synthetase mutants that specifically recognize 6-methoxy-tryptophan (A2), 7-methoxy-tryptophan (A4), 6,7-methoxy-tryptophan (A5), 6,7-methyl-tryptophan (A6), 7,8-dihydrofuran-tryptophan (A7), 6,7-dihydrofuran-tryptophan (A8), 7,8-furan-tryptophan (A9), 6,7-furan-tryptophan (A10), 6,7-dioxole-tryptophan (A11) and 6,7-cyclopentane-tryptophan (A12), a saturation mutagenesis library is constructed at amino acid binding pocket residues including E391, V393, F464, T467, M490 and A507 and subjected to positive and negative screening. The screening finally yields a phenylalanyl-tRNA synthetase mutant containing the six mutations E391D, V393G, M490V, F464V, T467G and A507G, whose nucleotide sequence and amino acid sequence are shown in SEQ ID NO: 1-2.
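The scale of such a saturation mutagenesis screen can be estimated with simple combinatorics. The sketch below assumes that all six mutated positions are randomized simultaneously with NNK codons (32 codons per site), which the source does not state; the function names and the Poisson coverage model are likewise illustrative assumptions.

```python
import math

# Positions carrying the six mutations reported in the text
POSITIONS = ["E391", "V393", "F464", "T467", "M490", "A507"]


def library_sizes(n_positions, codons_per_site=32, amino_acids_per_site=20):
    """Return (DNA variants, protein variants) for a fully randomized library.

    codons_per_site=32 corresponds to NNK randomization, an assumption.
    """
    return codons_per_site ** n_positions, amino_acids_per_site ** n_positions


def clones_for_coverage(dna_variants, coverage=0.95):
    """Clones to sample so that a given variant is present with probability
    `coverage`, assuming equal representation and Poisson sampling:
    P(seen) = 1 - exp(-N / V), so N = -V * ln(1 - coverage)."""
    return math.ceil(-dna_variants * math.log(1 - coverage))


dna, protein = library_sizes(len(POSITIONS))
print(f"DNA variants: {dna:,}")
print(f"Protein variants: {protein:,}")
print(f"Clones for 95% coverage: {clones_for_coverage(dna):,}")
```

Under these assumptions the library holds about 1.07 billion DNA variants encoding 64 million protein variants, which is why positive and negative selection, rather than clone-by-clone testing, is the practical way to find active mutants.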
Preferably, in S3: (1) the codon of the tryptophan that forms the aromatic cage of the decoding protein is mutated to the amber stop codon (TAG); (2) the decoding protein mutant is co-expressed with the chimeric phenylalanyl-tRNA synthetase mutant, and the corresponding tryptophan analogue (typically 1 mM) is added during expression; (3) the decoding protein variant is purified by the GST-tag protein purification method, and its fidelity is verified by LC-MS.
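Sub-step (1) above amounts to swapping one tryptophan codon (TGG) for the amber codon (TAG) in the coding sequence. A minimal sketch follows; the function name `to_amber` and the toy sequence are hypothetical, and residue numbering here is simply 1-based within the supplied sequence, which may differ from the domain numbering (for example W18 or W28) used in the patent.

```python
def to_amber(cds, residue_number):
    """Replace the Trp codon (TGG) at 1-based `residue_number` with TAG.

    Raises ValueError if the sequence is not a whole number of codons or if
    the requested position does not encode tryptophan.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    start = 3 * (residue_number - 1)
    codon = cds[start:start + 3] if start >= 0 else ""
    if codon != "TGG":
        raise ValueError(f"residue {residue_number} is not a TGG (Trp) codon")
    return cds[:start] + "TAG" + cds[start + 3:]


# Hypothetical toy sequence encoding M-A-W-K-W-stop; not a sequence from the patent
TOY_CDS = "ATGGCTTGGAAATGGTAA"
print(to_amber(TOY_CDS, 3))
```

Checking that the target codon really is TGG before editing guards against off-by-one numbering errors, which are easy to make when a domain is numbered differently from the full-length protein.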
Preferably, the nucleotide sequence and the amino acid sequence of the chimeric phenylalanyl-tRNA synthetase mutant recognizing tryptophan analogues A1-A6 are set forth in SEQ ID NO: 1-2.
Preferably, the method is directed to a histone methylation decoding protein domain, said decoding protein domain being any one of the Chromo, PHD, PWWP, Tudor, MBT, CW, SPIN and BAH domains.
Use of a protein bearing a tryptophan analogue obtained by the method for constructing a super-affinity decoding protein recognition system that specifically recognizes histone methylation modifications. Taking H3K4me3 as an example, a super-affinity variant of the decoding protein KDM5A-PHD3 that recognizes H3K4me3 is established and named PHD super-affinity; taking H3K9me3 as an example, a super-affinity variant of the decoding protein CDY1-Chromo that recognizes H3K9me3 is established and named Chromo super-affinity; taking H3K27me3 as an example, a super-affinity variant of the decoding protein BAHD1-BAH that recognizes H3K27me3 is established and named BAH super-affinity; taking H3K36me3 as an example, a super-affinity variant of the decoding protein DNMT3B-PWWP that recognizes H3K36me3 is established and named PWWP super-affinity.
The super-affinity decoding proteins that specifically recognize histone methylation modifications reach nanomolar affinity and outperform histone methylation-specific antibodies in titer, and are used to detect histone methylation modifications in biological samples. When labeled with a fluorophore, they can be used to detect histone methylation modifications in biological samples by imaging. The super-affinity recognition system can be applied to live imaging and can dynamically monitor changes in histone methylation modifications. It can also be used to enrich histone methylation modifications from a sample and applied to single-cell sequencing. Tryptophan analogues substituted with strongly electron-donating side chains improve the affinity of the decoding protein for histone methylation modifications by 4-8 fold, and tandem repeats of the decoding protein also improve this affinity.
Taking KDM5A PHD3 as an example:
(1) Analysis of the complex structure (PDB: 2KGI) shows that the aromatic cage that specifically recognizes H3K4me3 is formed by W18 and W28. The W18 site and the W28 site of KDM5A PHD3 are therefore each mutated to a stop codon; the nucleotide sequences are shown in SEQ ID NO: 3 to 4, respectively.
(2) Using the chimeric phenylalanine translation system, tryptophan analogues are site-specifically introduced at the W18 and W28 sites of the PHD3 decoding protein domain, respectively, to obtain variant proteins of the PHD3 decoding protein domain. The tryptophan analogue is any one of 6-cyano-tryptophan, 7-cyano-tryptophan, 6-chloro-tryptophan, 7-chloro-tryptophan, 6-methyl-tryptophan (A1), 6-methoxy-tryptophan (A2), 7-methyl-tryptophan (A3), 7-methoxy-tryptophan (A4), 6,7-methoxy-tryptophan (A5), 6,7-methyl-tryptophan (A6), 7,8-dihydrofuran-tryptophan (A7), 6,7-dihydrofuran-tryptophan (A8), 7,8-furan-tryptophan (A9), 6,7-furan-tryptophan (A10), 6,7-dioxole-tryptophan (A11) and 6,7-cyclopentane-tryptophan (A12).
(3) The affinity of the PHD3 decoding protein domain variants for H3K4me3 is measured by microscale thermophoresis (MST). When the unnatural amino acid A2 is inserted at the W28 site of the PHD3 decoding protein domain, the affinity of PHD3 for H3K4me3 is improved 8-fold; this decoding protein variant is PHD3-W28-A2 and is named PHD. The amino acid sequence of the specific H3K4me3 polypeptide is shown in Table 2.
(4) In some specific embodiments, site-specific introduction of 6-methoxy-tryptophan (A2) into other histone methylation decoding protein domains increases the affinity of those domains for the corresponding histone methylation modifications.
Taking H3K9me3 as an example, the Chromo domain of CDY1 is selected as the study object: after 6-methoxy-tryptophan is inserted at the W28 site of the Chromo domain of CDY1, the affinity of the Chromo domain for H3K9me3 is improved 2-fold. Taking H3K27me3 as an example, the BAH domain of BAHD1 is selected as the study object: after 6-methoxy-tryptophan is inserted at the W667 site of the BAH domain of BAHD1, the affinity of the BAH domain for H3K27me3 is improved 5-fold. Taking H3K36me3 as an example, the PWWP domain of DNMT3B is selected as the study object: after 6-methoxy-tryptophan is inserted at the W263 site of the PWWP domain of DNMT3B, the affinity of the PWWP domain for H3K36me3 is improved 7-fold. The nucleotide sequence and the protein sequence of the Chromo domain of CDY1 are shown in SEQ ID NO: 5-6; those of the BAH domain of BAHD1 are shown in SEQ ID NO: 7-8; and those of the PWWP domain of DNMT3B are shown in SEQ ID NO: 9-10.
The following illustrates the application of several aspects of the invention.
1. The invention provides a method for constructing tandem repeats of a histone methylation decoding protein to improve its affinity for the histone methylation modification.
Taking PHD3 as an example:
(1) Multiple repeats of the decoding protein variant are constructed. Using SEQ ID NO: 4 as a template, double and triple decoding protein mutants are constructed, which carry 2 and 3 amber stop codons (TAG), respectively; the nucleotide sequences of the double and triple mutants are shown in SEQ ID NO: 11 to 12.
(2) Two or three 6-methoxy-tryptophan (A2) residues are site-specifically introduced at the amber stop codon sites of the double or triple PHD protein by genetic code expansion, giving double and triple PHD protein variants, named 2x PHD and 3x PHD, respectively.
The affinity of the double and triple PHD protein variants for H3K4me3 is determined by microscale thermophoresis (MST). The double and triple PHD variants show 14.7-fold and 62.9-fold increases in affinity for H3K4me3, respectively.
In some specific embodiments, the above strategy is equally applicable to other histone methylation decoding protein domains.
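The fold changes reported for the PHD variants can be related to absolute dissociation constants by simple arithmetic. The sketch below back-calculates implied Kd values from the 7 nM figure that the document states for the triple design, assuming that every fold change is expressed relative to the same wild-type PHD3 reference. That shared reference is an assumption, and the resulting numbers are estimates for orientation, not measurements reported in the patent; the function names are hypothetical.

```python
# Fold improvements in affinity for H3K4me3 stated in the text
FOLD_VS_WILD_TYPE = {"PHD (W28-A2)": 8.0, "2x PHD": 14.7, "3x PHD": 62.9}

# Affinity stated in the document for the triple design
KD_3X_PHD_NM = 7.0


def implied_wild_type_kd_nm(kd_variant_nm, fold):
    """Wild-type Kd implied by a variant Kd and its fold improvement."""
    return kd_variant_nm * fold


def implied_kds_nm(folds, anchor_name, anchor_kd_nm):
    """Estimate Kd for each variant from one anchored Kd and the fold table.

    Assumes all folds share one wild-type reference (not stated in the source).
    """
    wild_type = implied_wild_type_kd_nm(anchor_kd_nm, folds[anchor_name])
    estimates = {"wild type": wild_type}
    for name, fold in folds.items():
        estimates[name] = wild_type / fold
    return estimates


kds = implied_kds_nm(FOLD_VS_WILD_TYPE, "3x PHD", KD_3X_PHD_NM)
for name, kd in kds.items():
    print(f"{name}: ~{kd:.0f} nM")
```

Under this assumption the implied wild-type Kd is about 440 nM, with the single-site variant near 55 nM and the double repeat near 30 nM.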
2. The invention provides a method for detecting histone methylation modifications with the histone methylation super-affinity molecular recognition system.
(1) The 6-methoxy-tryptophan (A2)-substituted PHD protein variants are expressed and purified to obtain PHD protein, 2x PHD protein and 3x PHD protein.
(2) Taking HeLa cell lysate as an example, HeLa cells are lysed and the lysate is serially diluted to different concentrations; the protein samples are then separated by SDS-PAGE and transferred to PVDF membranes.
(3) After the PVDF membranes are blocked with milk, they are incubated overnight with H3K4me3-specific antibody, PHD protein, 2x PHD protein or 3x PHD protein, respectively.
(4) The PVDF membrane incubated with the H3K4me3-specific antibody is then incubated with the corresponding secondary antibody. The PVDF membranes incubated with PHD proteins are further incubated with a GST-specific antibody and finally with the corresponding secondary antibody.
(5) Chemiluminescent imaging. The 2x PHD protein and the 3x PHD protein show higher detection sensitivity than the H3K4me3-specific antibody.
3. The invention provides a method for detecting histone methylation modifications by imaging with the histone methylation super-affinity molecular recognition system. The method can be applied to live-cell imaging and also to in vitro immunofluorescence imaging. It can be applied to different cells, such as the HEK 293T cell line, the HeLa cell line, the NCI-60 cell lines, the CHO cell line, and the like.
Applied to live-cell imaging, the method comprises the following steps:
(1) Constructing plasmids for expression in cells: the PHD-W28TAG fragment is cloned using SEQ ID NO: 4 as a template, the vector fragment is cloned using pEGFP-EGFP as a template, and the pEGFP-PHD-W28TAG-EGFP plasmid is constructed by Gibson assembly; the phenylalanyl-tRNA synthetase (chPheRS) fragment is cloned using SEQ ID NO: 1 as a template, the vector fragment is cloned using pCDNA3.1 as a template, and the pCDNA3.1-chPheRS plasmid is constructed by Gibson assembly. The plasmid maps are shown in FIG. 1.
(2) HEK 293T cells are transfected with the two plasmids at a 1:1 molar ratio. After 6-8 hours, 2 mM 6-methoxy-tryptophan (A2) is added.
(3) Changes in the EGFP fluorescence signal are detected with a live-cell imaging microscope.
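Two small calculations arise when carrying out step (2) above: splitting a DNA mass between two plasmids so that they are equimolar, and diluting an amino acid stock to the 2 mM working concentration. The sketch below is illustrative only; the plasmid sizes, the total DNA amount and the 100 mM stock concentration in the example are hypothetical values, not figures from the patent.

```python
def split_dna_mass(total_ng, size_a_bp, size_b_bp):
    """Masses (ng) of two plasmids that give a 1:1 molar ratio.

    For double-stranded DNA the mass per mole scales with length, so equal
    moles means masses in proportion to plasmid size.
    """
    total_bp = size_a_bp + size_b_bp
    return total_ng * size_a_bp / total_bp, total_ng * size_b_bp / total_bp


def stock_volume_ul(final_mm, final_volume_ul, stock_mm):
    """Volume of stock (uL) needed, from C1 * V1 = C2 * V2."""
    return final_mm * final_volume_ul / stock_mm


# Hypothetical example: 1000 ng total DNA, plasmids of 6000 bp and 4000 bp
print(split_dna_mass(1000, 6000, 4000))

# Hypothetical example: 2 mM A2 in 1000 uL of medium from a 100 mM stock
print(stock_volume_ul(2, 1000, 100))
```

Transfecting equal masses instead of equal moles would over-represent the smaller plasmid, which is why the split is done by size.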
Applied to in vitro immunofluorescence imaging, the method comprises the following steps:
(1) The 6-methoxy-tryptophan-substituted PHD protein variants are expressed and purified to obtain PHD protein, 2x PHD protein and 3x PHD protein.
(2) The PHD protein, 2x PHD protein and 3x PHD protein are each labeled with NHS-activated Cy5 ester.
(3) Taking the HeLa cell line as an example, formaldehyde-fixed cells are incubated with Cy5-labeled PHD protein or with a histone methylation-specific antibody, respectively; after incubation, the localization of the corresponding histone methylation modification is detected by confocal microscopy. The 2x PHD protein variant and the 3x PHD protein variant give a higher signal-to-noise ratio than the H3K4me3-specific antibody.
4. The invention provides a method for identifying the histone methylation interactome with the histone methylation super-affinity molecular recognition system by proximity labeling.
Proximity labeling has been developed as a complement to traditional methods for studying intermolecular interactions in living cells. It typically uses CRISPR gene editing or plasmid-based expression to express, in cells, a proximity biotinylating enzyme fused to a bait protein. After exogenous biotin is added, proteins adjacent to the bait protein are biotinylated; they can be enriched with streptavidin-coupled magnetic beads and then identified by mass spectrometry. The H3K4me3 super-affinity molecular recognition system provided by the invention has an affinity of 7 nM for H3K4me3. PHD-W28TAG can be expressed as a fusion with the ascorbate peroxidase APEX/APEX2; when the PHD variant binds specifically to H3K4me3, APEX2 labels the proteome adjacent to H3K4me3 with biotin upon hydrogen peroxide stimulation, and the labeled proteins are enriched with streptavidin and finally analyzed by LC-MS. The method is not limited to APEX2-based proximity labeling and is also applicable to other proximity biotinylation techniques, such as horseradish peroxidase (HRP) and the biotin ligases BioID, BASU, TurboID and miniTurbo.
Compared with the prior art, the invention has the main advantages that:
1. The invention provides a method for improving cation-pi interactions that can be applied to any biomacromolecule with a cation-pi interaction. The invention provides synthetic routes for 6-methyl-tryptophan (A1), 6-methoxy-tryptophan (A2), 7-methyl-tryptophan (A3), 7-methoxy-tryptophan (A4), 6,7-methoxy-tryptophan (A5), 6,7-methyl-tryptophan (A6), 7,8-dihydrofuran-tryptophan (A7), 6,7-dihydrofuran-tryptophan (A8), 7,8-furan-tryptophan (A9), 6,7-furan-tryptophan (A10), 6,7-dioxole-tryptophan (A11) and 6,7-cyclopentane-tryptophan (A12), which can be used in genetic code expansion and can improve cation-pi interactions.
2. The invention has a wide range of applications, including but not limited to: combined with live-cell imaging, detecting dynamic changes in histone methylation modifications; combined with immunofluorescence, detecting histone methylation modifications in biological samples; combined with genome sequencing and similar techniques, analyzing the genomic regions associated with histone methylation modifications; and combined with proximity labeling and similar techniques, identifying the proteome that interacts with histone methylation modifications. Specifically: (1) Introducing 6-methoxy-tryptophan (A2) by genetic code expansion at a tryptophan site that forms a cation-pi interaction can increase the affinity between biomolecules by 4-8 fold. (2) Site-specific introduction of 6-methoxy-tryptophan (A2) into the PHD decoding protein of H3K4me3 improves the affinity of PHD for H3K4me3 by 8 fold, and after the triple tandem design the affinity of the PHD variant for H3K4me3 reaches 7 nM. (3) The histone methylation super-affinity molecular recognition system provided by the invention recognizes histone methylation modifications with high sensitivity, and with higher specificity and sensitivity than histone methylation-specific antibodies. (4) The histone methylation super-affinity molecular recognition system is easy to engineer, economical and able to capture a variety of PTMs, and can be developed into a range of super-affinity molecular recognition systems directed at specific methylation modifications.
Drawings
FIG. 1 is a plasmid map;
FIG. 2 shows the chemical synthesis routes: A is the synthesis route of 6-methoxy-tryptophan (A2) and 7-methoxy-tryptophan (A4), and B is the synthesis route of 6,7-cyclopentane-tryptophan (A12);
FIG. 3 shows the strategy of the invention for modulating cation-pi interactions between histone methylation and its decoding proteins using genetic code expansion. (A) The number of tryptophan residues among the aromatic cage components of different histone methylation decoding proteins. (B) Flow diagram for developing a histone methylation super-philic molecular recognition system using genetic code expansion. Taking H3K4me3 as an example, tryptophan in the aromatic cage of the decoding protein is replaced with tryptophan analogues by genetic code expansion, so as to regulate the cation-pi interaction in the aromatic cage and obtain an unnatural amino acid analogue that markedly improves the affinity of the decoding protein for histone methylation. (C) Structural formulae of the unnatural amino acids used in the invention;
FIG. 4 shows the efficiency and specificity with which the chimeric phenylalanyl-tRNA synthetases recognize A2 and A4, wherein: GFP fluorescence reporter experiments identify the efficiency with which the chimeric phenylalanyl-tRNA synthetases A2RS (A) and A4RS (B) recognize A2 and A4, respectively, and mass spectrometry identifies the fidelity with which the chimeric phenylalanyl-tRNA synthetases A2RS (C) and A4RS (D) recognize A2 and A4, respectively;
FIG. 5 shows PHD domain variant proteins designed to increase affinity for H3K4me3. (A) Complex structure of the KDM5A PHD3 protein and the H3K4me3 polypeptide (PDB: 2KGI), wherein PHD3 and the polypeptide are displayed in cartoon mode and the aromatic amino acids and the H3K4me3 modification are displayed as sticks. (B) Coomassie brilliant blue staining of the PHD-W18-UAA and PHD-W28-UAA variants. (C) Affinity of the PHD-W18-UAA variants for H3K4me3 determined by microscale thermophoresis, wherein H3K4me3 is labeled with a FITC fluorophore. (D) Affinity of the PHD-W28-UAA variants for H3K4me3 determined by microscale thermophoresis, wherein H3K4me3 is labeled with a FITC fluorophore;
FIG. 6 shows multivalent tandem repeat PHD domains designed to recognize H3K4me3. (A) Cartoon of the multivalent tandem repeat PHD domain design. (B) Purity of the tandem PHD protein variants identified by Coomassie blue staining. (C) Affinity of the tandem PHD protein variants for H3K4me3 determined by microscale thermophoresis, wherein H3K4me3 is labeled with a FITC fluorophore;
FIG. 7 shows the detection and imaging of H3K4me3 using the histone methylation super-philic molecular recognition system. (A) Strategy diagram for applying the histone methylation super-philic molecular recognition system to detection and imaging. (B) Detection of the H3K4me3 level of HeLa cells using histone methylation super-philic molecules. The H3-specific antibody and the H3K4me3-specific antibody are used as controls, and PHD-WT, 2xPHD and 3xPHD are each used to detect H3K4me3. (C) Application of the histone methylation super-philic molecular recognition system to fluorescence imaging of H3K4me3 localization in cells, wherein the PHD protein is labeled with Cy5;
FIG. 8 shows flow cytometry detection of the efficiency with which the screened chimeric phenylalanyl-tRNA synthetase system recognizes 6-methoxy-tryptophan, 6,7-methoxy-tryptophan and 6,7-methyl-tryptophan in mammalian cells;
FIG. 9 shows the experimental flow chart (A) for applying the histone methylation super-philic molecular recognition system established by genetic code expansion to proximity labeling detection of the protein interactome, and the GO analysis data.
Detailed Description
The technical scheme of the invention is further described below through specific examples. It should be understood that the practice of the invention is not limited to the following examples, and that any modifications and/or variations made to the invention fall within its scope.
In the present invention, unless otherwise specified, all parts and percentages are by weight, and the equipment, materials, etc. used are commercially available or are conventional in the art. The methods in the following examples are conventional in the art unless otherwise specified.
The primer sequences used in the construction of the vector of the present invention in the specific examples are shown in Table 1:
Table 1: Primer sequences for constructing vectors
The strategy of the present invention for modulating the cation-pi interactions between histone methylation and its decoding proteins using genetic code expansion is shown in FIG. 3; the following examples illustrate specific methods.
Example 1: Chemical synthesis of compounds A2 and A4
To a solution of B (2.0 g, 13.6 mmol) in 50 mL of anhydrous N,N-dimethylformamide was added potassium hydroxide (1.68 g, 29.9 mmol), and the mixture was stirred at room temperature for 20 min. A solution of iodine (4.14 g, 16.3 mmol) in 30 mL of anhydrous N,N-dimethylformamide was added dropwise to the reaction flask, and stirring was continued at room temperature for 2 hours. The reaction mixture was poured into ice water containing 0.1% sodium thiosulfate and placed in a refrigerator to ensure complete precipitation. The precipitate was filtered, washed with cold water and dried in vacuo. 3-Iodo-1H-indole B (B2 yield 90%, B4 yield 93%) was obtained as a pale yellow solid, which was used in the next step without further purification.
The solid B (2.73 g, 10.0 mmol) obtained in the first step was dissolved in 30 mL of anhydrous N,N-dimethylformamide. 60% NaH (391.2 mg, 16.3 mmol) was washed with hexane and suspended in 10 mL of anhydrous N,N-dimethylformamide under nitrogen. The solution of B was slowly added to the suspension under ice-bath conditions and stirred for 10 min, then p-toluenesulfonyl chloride (2.1 g, 11.0 mmol) was added and the mixture was stirred at 25 ℃ for 5 h. The mixture was poured into water and extracted three times with ethyl acetate, and the combined ethyl acetate layer was washed with saturated brine and dried over anhydrous sodium sulfate. The organic phase was then concentrated under reduced pressure. Column chromatography with petroleum ether and ethyl acetate gave compound C as a white solid (C2 yield 85%, C4 yield 81%).
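The reagent quantities in the two steps above can be cross-checked by converting each weighed mass to millimoles. The following is a minimal sketch, not part of the original procedure; the molar masses are standard values supplied here as assumptions, and the helper names `mmol` and `equivalents` are hypothetical.

```python
# Average molar masses in g/mol (standard values; not stated in the source text).
MOLAR_MASS = {
    "KOH": 56.11,
    "I2": 253.81,
    "TsCl": 190.65,  # p-toluenesulfonyl chloride, C7H7ClO2S
}


def mmol(mass_g: float, reagent: str) -> float:
    """Convert a weighed mass in grams to millimoles."""
    return mass_g / MOLAR_MASS[reagent] * 1000.0


def equivalents(reagent_mmol: float, substrate_mmol: float) -> float:
    """Molar equivalents of a reagent relative to the substrate."""
    return reagent_mmol / substrate_mmol


# Masses as given in the text.
koh = mmol(1.68, "KOH")
iodine = mmol(4.14, "I2")
tscl = mmol(2.1, "TsCl")

# Substrate amounts as given in the text: 13.6 mmol (iodination), 10.0 mmol (tosylation).
print(f"KOH  {koh:.1f} mmol, {equivalents(koh, 13.6):.1f} eq")
print(f"I2   {iodine:.1f} mmol, {equivalents(iodine, 13.6):.1f} eq")
print(f"TsCl {tscl:.1f} mmol, {equivalents(tscl, 10.0):.1f} eq")
```

Under these assumed molar masses the computed amounts agree with the stated 29.9 mmol, 16.3 mmol and 11.0 mmol.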
Dry, degassed N,N-dimethylformamide was charged under nitrogen into a vessel containing zinc powder (3.9 g, 50.0 mmol). TMSCl (108.6 mg, 1.0 mmol) was added and the mixture was stirred vigorously at room temperature for 30 min, after which stirring was stopped and the zinc was allowed to settle. The supernatant was withdrawn with a syringe under a flow of nitrogen, and fresh N,N-dimethylformamide was added to the zinc. After stirring for 2 min, stirring was stopped to let the zinc powder settle and the supernatant was removed as before; this step was repeated two more times. 1,2-Dibromoethane (751.4 mg, 4.0 mmol) was then added to the vessel and the mixture was stirred at 80 ℃ for 30 min. After the mixture had cooled to 25 ℃, TMSCl (325.8 mg, 3.0 mmol) was added and the resulting mixture was stirred for a further 30 min. Boc-3-iodo-L-alanine methyl ester (3.95 g, 12 mmol) was dissolved in 10 mL of N,N-dimethylformamide and added to the activated zinc powder, and the mixture was stirred vigorously. After the exotherm subsided (controlled with an ice bath), stirring was continued for a further 30 min, then stopped, and the zinc was allowed to settle. The supernatant was gently withdrawn with a syringe and transferred to a clean reaction flask under a flow of nitrogen. It was then transferred by syringe to a mixture of compound D (2.13 g, 5.0 mmol), Pd(OAc)2 (112.2 mg, 0.5 mmol) and S-Phos (410.5 mg, 1.0 mmol). The reaction was carried out for 4 h under nitrogen protection. After completion of the reaction, the mixture was poured into water and extracted with ethyl acetate; the upper organic layer was washed with brine, dried over anhydrous sodium sulfate, concentrated under reduced pressure, and purified by column chromatography with petroleum ether and ethyl acetate to give compound E (E2 yield 57%, E4 yield 45%) as a pale yellow oil.
Product E was analyzed and the results were as follows:
E2: 1H NMR (500 MHz, CDCl3) δ 7.70 (d, J=8.5 Hz, 2H), 7.48 (d, J=2.3 Hz, 1H), 7.30 (d, J=8.7 Hz, 1H), 7.22 (d, J=8.1 Hz, 3H), 6.84 (dd, J=8.7, 2.3 Hz, 1H), 5.05 (d, J=8.0 Hz, 1H), 4.60 (d, J=7.1 Hz, 1H), 3.86 (s, 3H), 3.62 (s, 3H), 3.14 (qd, J=14.7, 5.6 Hz, 2H), 2.34 (s, 3H), 1.49–1.26 (m, 9H). 13C NMR (125 MHz, CDCl3) δ 172.15, 158.25, 155.13, 145.01, 136.29, 135.28, 129.98, 126.82, 123.22, 120.13, 117.44, 112.55, 98.09, 80.26, 55.91, 53.69, 52.47, 28.46, 21.70. HRMS (ESI) m/z calcd. for C20H23N2O5S+ (M-Boc)+ 403.1322, found 403.1331.
E4: 1H NMR (500 MHz, CDCl3) δ 7.69 (d, J=8.1 Hz, 2H), 7.62 (s, 1H), 7.24 (d, J=8.1 Hz, 2H), 7.16–7.06 (m, 2H), 6.67 (dd, J=7.3, 1.5 Hz, 1H), 5.15 (d, J=8.0 Hz, 1H), 4.65 (dt, J=8.0, 5.5 Hz, 1H), 3.71 (s, 3H), 3.67 (s, 3H), 3.32–3.11 (m, 2H), 2.37 (s, 3H), 1.44 (s, 9H). 13C NMR (125 MHz, CDCl3) δ 172.28, 155.16, 147.49, 144.21, 137.36, 133.77, 129.43, 127.30, 126.91, 124.79, 124.04, 114.95, 112.05, 107.16, 80.12, 55.54, 53.85, 52.49, 28.41, 28.01, 26.99, 21.67. HRMS (ESI) m/z calcd. for C20H23N2O5S+ (M-Boc)+ 403.1322, found 403.1334.
Compound E (973.2 mg, 2.0 mmol) was dissolved in 50 mL of methanol, and NaOH (1.2 g, 30.0 mmol) dissolved in 20 mL of H2O was added. The mixture was heated under reflux for 8 h, then methanol was evaporated under reduced pressure to about half the reaction volume. The mixture was acidified with ice-cold 2 M dilute hydrochloric acid and the pH was adjusted to 3. The aqueous solution was extracted with cold ethyl acetate, and the upper organic layer was washed with saturated brine, dried over anhydrous sodium sulfate, and evaporated in vacuo to give carbamate F as a colorless oil, which was used without further purification. F was then dissolved in dichloromethane and trifluoroacetic acid (112.2 mg, 0.5 mmol) was added for deprotection to give the title compound A (A2 yield 74%, A4 yield 68%) as a pale yellow solid. The complete synthetic route is shown in FIG. 2.
Product A was analyzed and the results were as follows:
A2: 1H NMR (500 MHz, D2O) δ 7.49 (d, J=8.7 Hz, 1H), 7.01 (s, 1H), 6.95 (d, J=2.4 Hz, 1H), 6.72 (dd, J=8.7, 2.4 Hz, 1H), 3.75 (s, 3H), 3.45 (dd, J=7.3, 5.2 Hz, 1H), 3.03 (dd, J=14.4, 5.2 Hz, 1H), 2.86 (dd, J=14.4, 7.3 Hz, 1H). 13C NMR (125 MHz, D2O) δ 182.83, 155.19, 136.74, 123.18, 122.03, 119.51, 110.63, 108.80, 95.26, 56.42, 55.78, 30.43. HRMS (ESI) m/z calcd. for C12H15N2O3+ (M+H)+ 235.1077, found 235.1081.
A4: 1H NMR (500 MHz, D2O) δ 7.27–7.22 (m, 2H), 7.08 (td, J=7.9, 0.9 Hz, 1H), 6.78 (d, J=7.7 Hz, 1H), 4.31 (ddd, J=6.3, 5.4, 0.9 Hz, 1H), 3.94 (s, 3H), 3.44 (ddt, J=15.4, 5.3, 0.9 Hz, 1H), 3.36 (dd, J=15.4, 7.3 Hz, 1H). 13C NMR (125 MHz, D2O) δ 171.83, 146.06, 128.04, 126.58, 124.97, 120.10, 117.44, 115.12, 106.92, 55.63, 53.27, 25.83. HRMS (ESI) m/z calcd. for C12H13N2O3- (M-H)- 233.0932, found 233.0939.
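The calculated HRMS values quoted for E and A can be reproduced from monoisotopic atomic masses. The sketch below is illustrative only; the function names are hypothetical and the isotope masses are standard values not given in the source. It shows that the calculated values for A2 and A4 (235.1077 and 233.0932) correspond to the sulfur-free compositions C12H15N2O3+ and C12H13N2O3-, while 403.1322 corresponds to C20H23N2O5S+.

```python
# Monoisotopic masses in Da (standard values; not stated in the source text).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}
ELECTRON = 0.00054858


def ion_mz(ion_formula: dict, charge: int) -> float:
    """m/z of an ion whose elemental composition is given as {element: count}."""
    atoms = sum(MONOISOTOPIC[el] * n for el, n in ion_formula.items())
    # A cation has lost electrons, an anion has gained them.
    return (atoms - charge * ELECTRON) / abs(charge)


def ppm_error(found: float, calcd: float) -> float:
    """Mass error of a measurement in parts per million."""
    return (found - calcd) / calcd * 1e6


a2 = ion_mz({"C": 12, "H": 15, "N": 2, "O": 3}, +1)          # (M+H)+ of A2
a4 = ion_mz({"C": 12, "H": 13, "N": 2, "O": 3}, -1)          # (M-H)- of A4
e = ion_mz({"C": 20, "H": 23, "N": 2, "O": 5, "S": 1}, +1)   # fragment ion of E

print(f"A2 calcd {a2:.4f}, error {ppm_error(235.1081, a2):+.1f} ppm")
print(f"A4 calcd {a4:.4f}, error {ppm_error(233.0939, a4):+.1f} ppm")
print(f"E  calcd {e:.4f}, error {ppm_error(403.1331, e):+.1f} ppm")
```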
Example 2: Library construction of chimeric phenylalanyl-tRNA synthetase mutants and positive and negative selection
In this example, the gene sequence of the chimeric phenylalanyl-tRNA synthetase chPheRS is shown in SEQ ID NO: 1.
(1) With reference to the structure of human mitochondrial phenylalanyl-tRNA synthetase, the following sites of the chimeric phenylalanyl-tRNA synthetase were selected: the amino acid binding sites F464, T467 and A507, and the amino acids around the binding pocket E391, V393 and M490.
(2) Using the chimeric phenylalanyl-tRNA synthetase (T467G and A507G) as a template, gene fragments were amplified with the primers chPheRS-E391NNK-V393NNK-R/F, chPheRS-M490NNK-R/F and chPheRS-F464NNK-R/F, whose nucleotide sequences are shown in SEQ ID NO: 19-24. The mutant library was cloned into the pBK vector by Gibson assembly to generate a chPheRS mutant gene library (E391NNK, V393NNK, M490NNK, F464NNK, T467G and A507G).
(3) pNEG-chPheT-Barnase-2TAG was transformed into E. coli DH10B to prepare negative selection competent cells; the plasmid map is shown in FIG. 1. pNEG-3C11-CAT-112TAG-GFP190TAG was transformed into E. coli DH10B to prepare positive selection competent cells; the plasmid map is shown in FIG. 1.
(4) The library from (2) was transformed into negative selection competent cells, and the culture was spread on LB plates (kanamycin, 50 μg/mL; ampicillin, 100 μg/mL; 0.2% L-arabinose) and incubated at 37 ℃.
(5) The clones from (4) were collected and their plasmids extracted; the plasmids were transformed into positive selection competent cells, and all of the culture was plated on LB agar plates supplemented with unnatural amino acid (kanamycin, 50 μg/mL; ampicillin, 100 μg/mL; chloramphenicol, 10 μg/mL; 0.2% L-arabinose; 2 mM unnatural amino acid), cultured at 37 ℃ for 12 hours, and then at 30 ℃ for a further 48 hours.
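The size of the library described in step (2) follows from the NNK degeneracy (N = A/C/G/T, K = G/T) at the four randomized positions E391, V393, M490 and F464. The sketch below is an illustration, not part of the original method: it enumerates the NNK codons against the standard genetic code and estimates how many independent transformants would be needed for about 95% coverage, assuming equally represented variants and Poisson sampling (an assumption not made in the source).

```python
import math
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = [a + b + c for a, b, c in product("ACGT", "ACGT", "GT")]
amino_acids = {CODON_TABLE[c] for c in nnk_codons} - {"*"}
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]

N_SITES = 4  # E391, V393, M490 and F464 are randomized; T467G and A507G are fixed
dna_diversity = len(nnk_codons) ** N_SITES
protein_diversity = len(amino_acids) ** N_SITES


def clones_for_coverage(library_size: int, coverage: float = 0.95) -> int:
    """Transformants needed so that a given variant is sampled with probability `coverage`."""
    return math.ceil(-math.log(1.0 - coverage) * library_size)


print(len(nnk_codons), "codons,", len(amino_acids), "amino acids, stops:", stop_codons)
print("DNA diversity:", dna_diversity, "protein diversity:", protein_diversity)
print("clones for ~95% coverage:", clones_for_coverage(dna_diversity))
```

The only stop codon in an NNK set is the amber codon TAG, which is one reason NNK is commonly preferred over fully random NNN codons.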
Example 3: Screening chimeric phenylalanyl-tRNA synthetase mutants that specifically recognize unnatural amino acids by GFP fluorescence reporter experiments
(1) After two rounds of positive selection, single colonies showing a fluorescent signal in Example 2 were picked and cultured overnight.
(2) The overnight culture was inoculated at a ratio of 1:100 and cultured at 37 ℃ until OD600 = 0.6-0.8; 0.2% L-arabinose was added to induce expression, and at the same time 1 mL of culture was taken, supplemented with 1 mM of the corresponding unnatural amino acid, and expressed at 30 ℃ for 20 h.
(3) 750 μL of the culture from (2) was centrifuged, 150 μL of 1x BugBuster (Millipore, lot: 3492682) was added, and the mixture was kept at 25 ℃ for 30 min and centrifuged again; 100 μL of the supernatant was transferred to a 96-well plate, and 100 μL of the culture from (2) was taken in parallel. The GFP fluorescence signal intensity and OD600 of the corresponding clone were measured with a Bio Tek Synergy NEO2 microplate reader, and the efficiency with which the mutant recognizes the unnatural amino acid was calculated.
(4) The chimeric phenylalanyl-tRNA synthetase mutants that recognize the corresponding unnatural amino acid with high efficiency were sequenced to obtain the specific mutant sequences, and the plasmids of the corresponding clones were stored at -20 ℃ for later use.
(6) Finally, a chimeric phenylalanyl-tRNA synthetase mutant was identified that recognizes 6-methoxy-tryptophan, 7-methoxy-tryptophan, 6,7-methyl-tryptophan and 6,7-methoxy-tryptophan; it comprises six mutations (E391D, V393G, M490V, F464V, T467G and A507G) and is designated chPheRS9, and its nucleotide and amino acid sequences are shown in SEQ ID NO: 1-2.
(7) The efficiency with which the chimeric phenylalanine translation system recognizes unnatural amino acids at different unnatural amino acid concentrations was determined using GFP fluorescence reporter experiments. The efficiency and fidelity with which the chimeric phenylalanyl-tRNA synthetases recognize 6-methoxy-tryptophan and 7-methoxy-tryptophan are shown in FIG. 4.
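The text does not state how the recognition efficiency is calculated from the plate-reader readings. One common approach, shown here purely as an assumption, is to normalize GFP fluorescence by OD600 and compare cultures grown with and without the unnatural amino acid. All function names and readings below are hypothetical.

```python
def normalized_fluorescence(gfp: float, od600: float,
                            gfp_blank: float = 0.0, od_blank: float = 0.0) -> float:
    """Blank-corrected GFP signal per unit of cell density."""
    density = od600 - od_blank
    if density <= 0:
        raise ValueError("blank-corrected OD600 must be positive")
    return (gfp - gfp_blank) / density


def incorporation_ratio(with_uaa: float, without_uaa: float) -> float:
    """Ratio of normalized signal with the unnatural amino acid to that without it."""
    return with_uaa / without_uaa


# Hypothetical readings for one clone (arbitrary fluorescence units).
plus_uaa = normalized_fluorescence(gfp=12000.0, od600=0.80)
minus_uaa = normalized_fluorescence(gfp=900.0, od600=0.75)
print(f"+UAA {plus_uaa:.0f}, -UAA {minus_uaa:.0f}, "
      f"ratio {incorporation_ratio(plus_uaa, minus_uaa):.1f}")
```

A high ratio indicates that full-length GFP is produced mainly when the unnatural amino acid is supplied, that is, that the synthetase does not readily charge natural amino acids at the amber codon.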
Example 4: Construction of a series of plasmids for the KDM5A PHD3 (PHD) domain
Unless otherwise specified, all plasmids were constructed by Gibson assembly. The construction of a series of plasmids for the KDM5A PHD3 (PHD) domain is taken as an example.
1. PHD wild-type plasmid: the GST tag was amplified from the pGEX-6p vector as template with the primers pNEG-GST-F/R, whose nucleotide sequences are shown in SEQ ID NO: 25-26; the PHD domain (UniProt ID: P29375, nucleotides 1598-1663) was amplified from cDNA as template with the primers pNEG-PHD-F/R, whose nucleotide sequences are shown in SEQ ID NO: 27-28; the vector was amplified from the pNEG-2xchPheT vector as template with the primers pNEG-PHD-V-F/R, whose nucleotide sequences are shown in SEQ ID NO: 29-30; and the plasmid pNEG-2xchPheT-PHD-GST was constructed by Gibson assembly.
2. PHD mutant plasmids: using pNEG-2xchPheT-PHD-GST as a template, an amber codon was introduced at W28 of the PHD domain by site-directed mutagenesis with the primers pNEG-PHD-W28TAG-F/R, and the plasmid pNEG-2xchPheT-PHD-W28TAG-GST was constructed by Gibson assembly; likewise, an amber codon was introduced at W18 of the PHD domain with the primers pNEG-PHD-W18TAG-F/R, and the plasmid pNEG-2xchPheT-PHD-W18TAG-GST was constructed by Gibson assembly. The nucleotide sequences of the primers are shown in SEQ ID NO: 31-34.
3. Multivalent tandem repeat PHD domain plasmids: using pNEG-2xchPheT-PHD-W28TAG-GST as a template, the PHD-W28TAG fragment containing a 6x-Linker (GGSGGS) was amplified with the primers pNEG-2xPHD-F/R, whose nucleotide sequences are shown in SEQ ID NO: 35-36; the vector was amplified with the primers pNEG-2xPHD-V-R and pNEG-PHD-V-F, whose nucleotide sequences are shown in SEQ ID NO: 37 and SEQ ID NO: 29; and the duplex and triplex PHD plasmids pNEG-2xchPheT-2xPHD-W28TAG-GST and pNEG-2xchPheT-3xPHD-W28TAG-GST were constructed by Gibson assembly.
4. Multicomponent tandem repeat PHD-Chromo domain plasmid: using pNEG-2xchPheT-PHD-W28-GST as a template, the vector was amplified with the primers pNEG-PHD-V-F and pNEG-2xPHD-V-R; using pNEG-2xchPheT-CDY1-W28TAG-GST as a template, the CDY1-W28TAG fragment was amplified with the primers pNEG-PHD-CDY1-F/R, whose nucleotide sequences are shown in SEQ ID NO: 38-39. The multicomponent tandem plasmid pNEG-2xchPheT-PHD-W28TAG-CDY1-W28TAG-GST was constructed by Gibson assembly. The plasmid map is shown in FIG. 1.
5. Eukaryotic cell expression plasmids: Construction of pEGFP-PHD3-W28TAG-EGFP: the vector was amplified from pEGFP-EGFP as template with the primers pEGFP-PHD-V-F/R, whose nucleotide sequences are shown in SEQ ID NO: 40-41; the PHD domain was amplified from pNEG-2xchPheT-PHD-W28TAG-GST as template with the primers pEGFP-PHD-F/R, whose nucleotide sequences are shown in SEQ ID NO: 42-43; the plasmid was constructed by Gibson assembly, and the plasmid map is shown in FIG. 1. Construction of pcDNA3.1-chPheRS9: primers were designed to amplify the chimeric phenylalanyl-tRNA synthetase (chPheRS9), which was cloned into the pcDNA3.1 vector, under the control of the CMV and U6 promoters respectively; the cloned genes and the primers of the vector are shown in SEQ ID NO: 44-47, and the plasmid map is shown in FIG. 1.
Sequencing of plasmids was done by Peking Optimaceae. The construction of the remaining plasmids was the same as above.
Example 5: Expression and purification of KDM5A PHD3 (PHD) wild-type and mutant proteins
1. Expression of KDM5A PHD3 (PHD) wild-type protein
1. Plasmid transformation: DH10B chemically competent cells were taken from the -80 ℃ freezer and immediately placed on ice; after the cells had thawed, the plasmid pNEG-2xchPheT-PHD-GST was added and mixed gently by flicking the tube. The cells were kept on ice for 30 min, heat-shocked at 42 ℃ for 90 s, and returned to ice for 2 min; antibiotic-free LB liquid medium was added and the cells were recovered at 37 ℃ for 40 min; 200 μL of the culture was then spread on an LB agar plate (ampicillin, 100 μg/mL) and cultured at 37 ℃ overnight.
2. Induction of expression: a single colony was picked from the above resistance plate into 3 mL of LB liquid medium (ampicillin, 100 μg/mL) and cultured overnight with shaking (37 ℃, 220 rpm); the culture was inoculated at a ratio of 1:100 and grown at 37 ℃ to OD600 = 0.6-0.8, then L-arabinose (final concentration: 0.2%) and ZnCl2 (final concentration: 0.1 mM) were added and expression was induced at 22 ℃ for 24 hours.
2. Expression of KDM5A PHD3 mutant proteins
The PHD3-W28-6MeOW mutant is taken as an example.
1. The plasmids pNEG-2xchPheT-PHD-W28TAG-GST and pBK-chPheRS9 were co-transformed into E. coli DH10B in the same manner as above.
2. Induction of expression: a single colony was picked from the above resistance plate into 3 mL of LB liquid medium (ampicillin, 100 μg/mL; kanamycin, 50 μg/mL) and cultured overnight with shaking (37 ℃, 220 rpm); the culture was inoculated at a ratio of 1:100 into 100 mL of LB liquid medium and grown at 37 ℃ to OD600 = 0.6-0.8, then L-arabinose (final concentration: 0.2%), ZnCl2 (final concentration: 0.1 mM) and the unnatural amino acid (final concentration: 0.5 mM) were added and expression was induced at 22 ℃ for 24 hours.
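The volumes to add at induction follow from C1·V1 = C2·V2. The sketch below works this out for the 100 mL culture; the stock concentrations (20% L-arabinose, 100 mM ZnCl2, 100 mM unnatural amino acid) are hypothetical values chosen for illustration and are not given in the source, and the small volume increase on addition is ignored.

```python
def stock_volume_ml(final_conc: float, stock_conc: float, culture_volume_ml: float) -> float:
    """Volume of stock (mL) giving `final_conc` in the culture.

    `final_conc` and `stock_conc` must share the same unit (e.g. both % or both mM).
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * culture_volume_ml / stock_conc


CULTURE_ML = 100.0  # culture volume stated in the text

# Final concentrations are from the text; stock concentrations are assumptions.
additions = {
    "overnight inoculum (1:100)": CULTURE_ML / 100.0,
    "L-arabinose, 20% stock": stock_volume_ml(0.2, 20.0, CULTURE_ML),
    "ZnCl2, 100 mM stock": stock_volume_ml(0.1, 100.0, CULTURE_ML),
    "unnatural amino acid, 100 mM stock": stock_volume_ml(0.5, 100.0, CULTURE_ML),
}

for name, volume in additions.items():
    print(f"{name}: {volume:.2f} mL")
```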
3. Purification of KDM5A PHD3 (PHD)
1. Harvesting. The culture was centrifuged (4 ℃, 4000 rpm, 20 min) to collect the cell pellet.
2. Resuspension of the cells. The cells were resuspended in lysis buffer (20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1 mM ZnCl2, 2 mM β-ME, protease inhibitors PMSF and aprotinin).
3. Sonication. The sonicator program was set to 2 s on, 5 s off, 60% power, and sonication was performed at 4 ℃.
4. The lysate was centrifuged (4 ℃, 12000 rpm, 20 min) and the supernatant was collected.
5. 0.5 mL of GST beads was loaded onto a gravity column, the beads were washed with ddH2O, and the column was equilibrated with 10 column volumes of lysis buffer.
6. The supernatant from step 4 was added to the equilibrated GST column.
7. The column was washed with 20 column volumes of lysis buffer (20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1 mM ZnCl2, 2 mM β-ME, protease inhibitors PMSF and aprotinin) to remove non-specifically adsorbed contaminating proteins.
8. The protein was eluted with 10 column volumes of elution buffer (20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 20 mM reduced glutathione), and the eluate, containing the target protein, was collected.
9. Protein purity was determined by SDS polyacrylamide gel electrophoresis (SDS-PAGE), and protein concentration was measured using a NanoDrop (micro-volume and fluorescence spectrophotometer, Siemens). The protein was used for subsequent SDS-PAGE analysis, mass spectrometry identification and MST experiments.
As shown in FIG. 5B, SDS-PAGE showed the purified protein to be at least 90% pure.
4. LC-MS identification of proteins
Purified proteins were analyzed on a SCIEX Triple TOF 6600 mass spectrometer using electrospray ionization and SCIEX Analyst TF software. A PHENOMENEX AERIS wide-pore C4 chromatographic column (2.1 x 50 mm, 3.6 μm) was used. Mobile phase A was 0.1% formic acid in water and mobile phase B was 0.1% formic acid in acetonitrile. The constant flow rate was set at 0.2 mL/min. Mass spectra were deconvoluted and analyzed using SCIEX OS-Q software (version 2.0, SCIEX Corporation). The molecular weight of the protein was predicted using the ExPASy Compute pI/Mw tool.
The LC-MS identification results are shown in FIGS. 4C and 4D: the theoretical molecular weight of the target protein is 33378 Da, and the observed molecular weights are 33377 Da and 33378 Da, respectively, demonstrating that the chimeric phenylalanyl-tRNA synthetase mutant specifically recognizes 6-methoxy-tryptophan and 7-methoxy-tryptophan.
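The fidelity argument rests on the mass difference between the intended product and plausible misincorporation products. Replacing tryptophan with methoxy-tryptophan adds CH2O (about 30.03 Da, average mass), so a protein carrying natural tryptophan at the amber site would be expected near 33348 Da, and one carrying phenylalanine near 33309 Da. The sketch below assigns an observed intact mass to the nearest candidate; the candidate list, the 2 Da tolerance and the function names are assumptions made for illustration.

```python
# Average masses in Da (standard values; not stated in the source text).
CH2O_AVG = 12.011 + 2 * 1.008 + 15.999  # methoxy-Trp minus Trp
TRP_RESIDUE_AVG = 186.213
PHE_RESIDUE_AVG = 147.177


def candidate_masses(theoretical_mass: float) -> dict:
    """Expected intact masses for the intended product and two misincorporation products."""
    trp_mass = theoretical_mass - CH2O_AVG
    return {
        "methoxy-Trp": theoretical_mass,
        "Trp": trp_mass,
        "Phe": trp_mass - (TRP_RESIDUE_AVG - PHE_RESIDUE_AVG),
    }


def assign(observed: float, theoretical_mass: float, tol_da: float = 2.0):
    """Name of the closest candidate, or None if nothing lies within the tolerance."""
    name, mass = min(candidate_masses(theoretical_mass).items(),
                     key=lambda item: abs(observed - item[1]))
    return name if abs(observed - mass) <= tol_da else None


THEORETICAL = 33378.0  # Da, from the text
for observed in (33377.0, 33378.0):  # observed masses from the text
    print(observed, "->", assign(observed, THEORETICAL))
```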
Example 6: a microphoresis Meter (MST) determines the affinity of the decoded protein domain variant to the histone methylation-modified polypeptide. The peptides used in the experiments were all synthesized by Beijing cloisonne midbody biotechnology Co., ltd, and the C-terminal of the peptide was labeled with Fluorescein Isothiocyanate (FITC), and the specific sequences are shown in Table 2.
Table 2. Polypeptide sequence information used in mst experiments
The MST assay is specifically described using the decoding proteins PHD and H3K4me3 as examples.
(1) Desalting of protein samples. Protein samples were dialyzed against 2L of MST buffer (20 mM Tris-HCl,50mM NaCl,1 mM DTT,0.05%Tween-20, pH 7.5) and repeated 3 times.
(2) Concentrating the protein. Protein samples were concentrated to the appropriate concentration using a 10Kd protein concentrate tube (Millipore).
(3) Preparing 16 PCR tubes, adding 10 mu L of MST buffer into the No. 2-16 PCR tubes, taking 20 mu L of protein sample to the No. 1 tube, pipetting 10 mu L of protein sample from the No. 1 tube to the No. 2 tube, and iteratively diluting the protein sample;
(4) Adding 10 mu L of polypeptide molecules with final concentration of 100nM into each tube, and fully mixing to obtain 20 mu L;
(5) And (5) loading a capillary tube.
(6) Kd value measurement. This was done using a NT.115Monolith instrument (Nano Temper Technologies, munich, germany) using a blue LED excitation light source at a constant temperature of 25 ℃. The instrument is provided with: 20% of blue LED excitation power and 40% of infrared laser power. All measurements were performed using standard glass capillaries (Nano Temper Technologies, # catMO-K022), and each set of experiments was repeated 3 times, unless otherwise specified.
(7) And (5) data processing. By means of NT analysis software, the target protein and fluorescent peptide fragment were combined according to the following sequence 1: and 1, fitting the combined model in proportion to obtain the dissociation constant Kd of the target protein. All data were analyzed by Origin software process.
(8) Other histone methylation modified decoding proteins are identical to histone methylation modified affinity assay procedures.
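Steps (3), (4) and (7) above can be sketched numerically. The code below is illustrative only: it assumes that the serial dilution is two-fold (10 μL carried into 10 μL of buffer), that adding an equal volume of peptide halves each protein concentration, and that the 1:1 model is the standard quadratic binding equation with the labeled peptide held constant. The 20 μM top concentration is hypothetical.

```python
import math


def serial_dilution(top_conc: float, n_tubes: int = 16, factor: float = 2.0) -> list:
    """Concentrations in tubes 1..n_tubes of a serial dilution, highest first."""
    return [top_conc / factor ** i for i in range(n_tubes)]


def fraction_bound(protein: float, peptide: float, kd: float) -> float:
    """Fraction of labeled peptide bound in a 1:1 complex (all concentrations in nM)."""
    total = protein + peptide + kd
    return (total - math.sqrt(total * total - 4.0 * protein * peptide)) / (2.0 * peptide)


PEPTIDE_NM = 100.0     # final peptide concentration, from the text
TOP_PROTEIN_NM = 20000.0  # hypothetical protein concentration in tube 1 before mixing

# Mixing 10 uL of protein with 10 uL of peptide halves the protein concentration.
final_protein = [c / 2.0 for c in serial_dilution(TOP_PROTEIN_NM)]

for kd in (440.0, 52.0):  # Kd values reported in the text, nM
    curve = [fraction_bound(p, PEPTIDE_NM, kd) for p in final_protein]
    print(f"Kd {kd:.0f} nM: bound fraction from {curve[-1]:.4f} to {curve[0]:.3f}")
```

In an actual fit, Kd would be the free parameter adjusted so that this curve, scaled between the unbound and bound signal levels, matches the measured thermophoresis response.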
Experimental results: experimental data as shown in fig. 5C and 5D: the affinity of the PHD wild-type domain and H3K4me3 is 440nM, the affinity of the PHD variant with the 6-methoxy-tryptophan introduced by the W28 site specificity and the H3K4me3 is 52nM, and compared with the PHD wild-type domain of PHD3, the affinity of the PHD variant with the 6-methoxy-tryptophan introduced by the W28 site specificity and the H3K4me3 is increased by 8 times, and the affinity of other electron donating tryptophan analogues of the PHD protein and the H3K4me3 is increased by 2-6 times. Similarly, the site-specific introduction of 6-methoxy-tryptophan into other decoding protein domains also increases the affinity of the decoding protein for its corresponding histone methylation modification by 2-4 fold.
Example 7: construction of multivalent tandem repeat PHD domains to increase their affinity for H3K4me3
1. Duplex and triplex repeat PHD domain plasmids were constructed as described in Example 4: pNEG-2xchPheT-2xPHD-W28TAG-GST (2xPHD) and pNEG-2xchPheT-3xPHD-W28TAG-GST (3xPHD).
2. The duplex and triplex PHD variants with 6-methoxy-tryptophan site-specifically introduced at W28 (2xPHD, 3xPHD) were expressed and purified as described in Example 5; protein purity was identified by SDS polyacrylamide gel electrophoresis (SDS-PAGE) and protein molecular weight was identified by LC-MS.
3. The affinities of the multivalent tandem repeat PHD domains 2xPHD and 3xPHD for H3K4me3 were determined as described in Example 6.
4. The strategy is not limited to the interaction between the PHD domain and H3K4me3 and can be extended to other decoding proteins and other histone methylation modifications; experiments show that the strategy improves the affinity between different decoding protein domains and their corresponding histone methylation modifications.
Experimental results: experimental data are shown in fig. 6C: the affinity of PHD wild-type domain and H3K4me3 was 440nM, and the affinity of the W28 site-specifically introduced duplex and triplex PHD variants of 6-methoxy-tryptophan with H3K4me3 was 30nM and 7nM, respectively, as compared to the PHD wild-type domain of PHD3, the W28 site-specifically introduced duplex and triplex PHD variants of 6-methoxy-tryptophan with H3K4me3 affinity was up to 14.7-fold and 62.9-fold. The affinity assay results that extend this strategy to other decoded proteins and their corresponding histone methylation modifications indicate that: the multivalent tandem repeat histone methylation modification can increase the affinity of the multivalent tandem repeat histone methylation modification to the corresponding histone methylation modification.
Example 8: far-Western Blot assessment of the efficiency of PHD super-parent molecule recognition on H3K4me3
1. Protein expression. Purified 6-methoxy-tryptophan-substituted PHD protein variants were expressed as described in example 4, example 5, to obtain PHD protein, 2x PHD protein and 3x PHD protein.
2. SDS polyacrylamide gel electrophoresis. Taking HeLa cell lysate as an example, after Hela cells are lysed, the protein samples are subjected to gradient dilution to different concentrations and separated by SDS-PAGE running gel
3. And (5) transferring films. The proteins were transferred to PVDF membranes. Constant current 300mA, transfer film 2.5h.
4. Closing: the membrane was placed in a plastic box containing 5% skim milk/TBST, placed on a shaker, sealed for 1h and the sealing solution was poured off. Washed 3 times with TBST for 10min.
5. Incubating the bait protein. H3K4me 3-specific antibodies, PHD protein, 2x PHD protein and 3x PHD protein were incubated overnight, respectively. TBST was washed 3 times, once for 10min.
6. And (5) incubating the antibody. PVDF membranes incubated with H3K4me 3-specific antibodies the corresponding secondary antibodies were incubated at room temperature. Incubation of PVDF membrane of PHD protein GST-specific antibodies (Sigma-Aldrich, cat#G7781) were further incubated, followed by final incubation of the corresponding secondary antibodies (Proteintech, cat#SA 00001-2). TBST was washed 3 times, once for 10min.
7. Chemiluminescent imaging. The PVDF film was covered on the developing solution, taking care of the uniform coverage, left at room temperature for 3 minutes, and then imaged and developed on a multifunctional imager.
Experimental results: experimental data as shown in fig. 7B, the 2x PHD protein and the 3x PHD protein exhibited higher signal-to-noise ratios than the H3K4me3 specific antibodies. The experiment is not limited to interaction of PHD and H3K4me3, and the same is applicable to detection of corresponding histone methylation modification of different decoding proteins.
Example 9: detection of histone methylation modifications using the histone methylation super-affinity molecular recognition system combined with immunofluorescence
The histone methylation mark H3K4me3 and the PHD decoding protein are taken as an example.
1. Labeling the PHD variant. The PHD decoding protein domain was labeled with Cy5 dye. NHS-Cy5 was dissolved in DMSO and the PHD decoding protein in PBS. NHS-Cy5 and PHD protein were mixed at a 1:2 molar ratio and incubated at 37 °C in the dark for 1 h; the reaction was then stopped with 50 mM Tris-HCl solution, pH 8.0.
2. The labeled PHD protein was purified using a PD MiniTrap G-25 desalting column (Cytiva, cat# 28918007).
3. Preparation of cell slides. HeLa cells were seeded into a culture dish containing a pre-treated coverslip and cultured at 37 °C.
4. Cell fixation. After the cells had fully adhered, the medium was removed, and the cells were rinsed once with PBS, fixed with 4% paraformaldehyde (4% PFA/PBS) for 10 min at room temperature, and rinsed 3 times with PBS.
5. Cell permeabilization. Cells were permeabilized with PBS containing 0.5% Triton X-100 for 10min and rinsed 3 times with PBS.
6. Blocking. Cells were blocked with PBS containing 3% BSA for 30 min at room temperature.
7. Primary incubation. Cells were incubated for 2 h at room temperature with either the Cy5-labeled PHD protein from step (2) (wild type or mutant) or the H3K4me3 antibody (Abcam, cat# ab8580).
8. Secondary antibody incubation. Cells incubated with the H3K4me3 antibody were rinsed 3 times with PBS, 10 min each, then incubated with DyLight 488 goat anti-rabbit IgG (Abbkine, cat# A23220) for 1 h at room temperature and rinsed again 3 times with PBS, 10 min each. Cells incubated with Cy5-labeled PHD protein were rinsed directly 3 times with PBS, 10 min each.
9. Mounting. The coverslips were mounted face down on glass slides with DAPI-containing mounting medium (Abcam, cat# ab104139) and left overnight in the dark before imaging.
10. Imaging. Images were acquired at room temperature on an LSM 710 confocal microscope (Zeiss) with a 63x oil-immersion objective. All images were analyzed and processed using Zeiss ZEN 2.3 lite software.
Experimental results: as shown in FIG. 7C, the immunofluorescence results indicate that the Cy5-labeled PHD super-affinity molecule can detect co-localization of H3K4me3 with condensed M-phase chromosomes during mitosis. The PHD super-affinity molecule has a higher signal-to-noise ratio than the commercially available H3K4me3 antibody (Abcam, cat# ab8580).
Example 10: flow cytometry analysis of the efficiency of chimeric phenylalanine translation systems in mammalian cells
1. Cell transfection. 293T cells were transfected following a standard transient plasmid transfection procedure. The experimental group was co-transfected with plasmid pCDNA3.1-chPheRS9, which expresses the chimeric phenylalanine translation system, and the fluorescent reporter plasmid pEGFP-mCherry-T2A-EGFP-190TAG; the control groups were cells transfected with pEGFP-mCherry or pEGFP-EGFP alone.
2. 48 h after transfection, the medium was aspirated and residual medium was washed off with 1x PBS.
3. The PBS was aspirated, and the cells were digested with trypsin, resuspended in 1 mL DMEM medium, and transferred to a 1.5 mL centrifuge tube.
4. Flow cytometer setup. 293T cells were used to set the forward- and side-scatter gates, mCherry-expressing cells to set the parameters and gate of the PE channel, and EGFP-expressing cells to set the parameters and gate of the FITC channel.
5. The experimental group cells were analyzed, with 50,000 cells collected per sample. Data were analyzed using FlowJo software.
Experimental results: as shown in FIG. 8, the flow cytometry results show that the chimeric phenylalanine aminoacyl-tRNA synthetase (chPheRS9) efficiently recognizes each of the following unnatural amino acids in mammalian cells: 6-methoxy-tryptophan (6MeOW), 7-methoxy-tryptophan (7MeOW), 6,7-methyl-tryptophan (67MW), and 6,7-methoxy-tryptophan (67MeOW). The procedure was demonstrated in the 293T cell line but is not limited to it and is applicable to other cell lines.
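The readout of the mCherry-T2A-EGFP-190TAG reporter in Example 10 can be summarized per sample once the per-cell intensities have been exported. The following is a minimal sketch under stated assumptions: the function name, the gate values, and the choice of summary statistics (fraction of mCherry-positive cells that are also EGFP-positive, and the median EGFP/mCherry ratio) are illustrative and are not the analysis described in this example, which was performed in FlowJo.

```python
from statistics import median


def readthrough_summary(mcherry, egfp, mcherry_gate, egfp_gate):
    """Summarize amber-suppression readout for one sample.

    mcherry, egfp: per-cell fluorescence intensities (same length, same order).
    mcherry_gate, egfp_gate: positive thresholds; hypothetical values that
    would in practice be set from the single-colour control samples.
    """
    if len(mcherry) != len(egfp):
        raise ValueError("mcherry and egfp must have the same length")

    # Keep only transfected cells, i.e. cells above the mCherry gate.
    transfected = [(m, g) for m, g in zip(mcherry, egfp) if m > mcherry_gate]
    if not transfected:
        return {
            "n_transfected": 0,
            "fraction_egfp_positive": None,
            "median_egfp_per_mcherry": None,
        }

    # EGFP is only produced when the TAG codon is read through.
    n_positive = sum(1 for _, g in transfected if g > egfp_gate)
    ratios = [g / m for m, g in transfected]
    return {
        "n_transfected": len(transfected),
        "fraction_egfp_positive": n_positive / len(transfected),
        "median_egfp_per_mcherry": median(ratios),
    }


# Synthetic intensities for illustration only; not data from this example.
demo = readthrough_summary(
    mcherry=[50, 1000, 2000, 4000],
    egfp=[10, 100, 1000, 3000],
    mcherry_gate=200,
    egfp_gate=200,
)
print(demo)
```

Restricting the calculation to mCherry-positive cells separates suppression efficiency from transfection efficiency, since mCherry is translated upstream of the TAG codon regardless of whether the unnatural amino acid is incorporated.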
Example 11: capturing proteomes interacting with histone methylation modifications using proximity labeling techniques
1. Plasmid construction. The vector backbone was amplified from the pT3 vector with primers pT3-PHD-APEX-V-F/R (nucleotide sequences shown in SEQ ID NO: 52-53); the PHD gene fragment was amplified from pNEG-2XchPheT-PHD-W28TAG-GST with primers pT3-PHD-F/R (nucleotide sequences shown in SEQ ID NO: 56-57); and the APEX2 gene fragment was amplified with primers pT3-APEX2-F/R (nucleotide sequences shown in SEQ ID NO: 54-55). The nucleotide and amino acid sequences of APEX2 are shown in SEQ ID NO: 13-14. Plasmid pT3-PHD-APEX2 was constructed by Gibson assembly. The plasmid map is shown in FIG. 1.
2. Construction of a stable cell line expressing the PHD-APEX2 fusion protein. Plasmid pCMV-SB100, which carries the Sleeping Beauty transposon system (see FIG. 1 for the plasmid map), was co-transfected with plasmid pT3-APEX2-PHD into HeLa cells. The cells were cultured at 37 °C for 24 h, after which the medium was replaced periodically with DMEM containing 2 μg/mL puromycin. After all cells in the blank group had died, the experimental group cells were cultured in DMEM containing 1 μg/mL puromycin to obtain a mixed-clone stable cell line.
3. Cell transfection. pCDNA3.1-chPheRS9 was overexpressed in the mixed-clone stable cell line in the presence of 2 mM 6-methoxy-tryptophan, and the cells were cultured for 36 h.
4. APEX2-catalyzed proximity labeling. The stable cell line from step (3) was incubated with DMEM containing 500 μM biotin-phenol at 37 °C for 30 min. The medium was replaced with PBS containing 1 mM H2O2, and the cells were left at room temperature for 5 min. Cells were rinsed 4 times with pre-chilled 20 mM ascorbic acid/PBS and then once with PBS. The cells were digested with trypsin, neutralized with DMEM, and centrifuged (1000 g, 1 min), and the supernatant was discarded. Finally, the cells were resuspended in PBS and centrifuged (1000 g, 1 min), the supernatant was discarded, and this wash was repeated once.
5. Isolation of nuclei. The cells obtained in step (4) were resuspended in 1.5 mL of hypotonic buffer (10 mM HEPES, 10 mM KCl, 0.05% NP40), left on ice for 10 min, and centrifuged (4 °C, 12000 rpm, 20 min), and the supernatant was discarded. These steps were repeated 5-8 times.
6. Lysis of nuclei. The pellet from step (5) was resuspended in 400 μL of lysis buffer (25 mM TEOA pH 7.5, 150 mM NaCl, 0.1% SDS, 1% Triton X-100, 0.5% sodium deoxycholate, 1 mM PMSF, 1x PIC) together with 20 μL of DNase, left at room temperature for 20 min, and centrifuged (4 °C, 18000 rpm, 15 min), and the supernatant was collected.
7. Enrichment of biotin-labeled proteins. Streptavidin-conjugated magnetic beads (Streptavidin Beads) were added to the supernatant collected in step (6) and incubated overnight at 4 °C. The beads were washed once with 0.01% NP40/PBS buffer, then washed 3 times with 0.01% NP40/PBS buffer containing 500 mM NaCl, followed by 0.01% NP40/PBS buffer containing 0.2% SDS and 0.01% NP40/PBS buffer containing 2 M urea, and washed once more with 0.01% NP40/PBS buffer. Finally, the beads were resuspended in 100 μL of 1x SDS loading buffer and boiled at 100 °C for 10 min.
8. SDS polyacrylamide gel electrophoresis. The protein samples in step (7) were separated by SDS polyacrylamide gel electrophoresis (SDS-PAGE), stained with Coomassie brilliant blue G250, and destained.
9. LC-MS/MS detection of the proteome interacting with histone methylation modifications. Protein bands were excised from the gel in step (8) with a clean blade and subjected in turn to destaining, dehydration, drying, reduction, alkylation, enzymatic digestion, peptide extraction, desalting, isotope labeling, and desalting. The processed samples were analyzed on a Q Exactive Orbitrap mass spectrometer with Proxeon nanospray ionization, coupled to a Proxeon Easy-nLC II HPLC system. Samples were loaded onto a 100 μm x 20 mm Magic C18 5U reverse-phase column for desalting and then separated on a 75 μm x 100 mm Magic C18 3U reverse-phase column. The elution flow rate was set to 300 nL/min and the elution time to 60 min to obtain the MS/MS results. Data processing: the experimental results were analyzed with MaxQuant and pfbel software.
The workflow of this example is shown in FIG. 9. It combines proximity labeling with the histone methylation super-affinity recognition system to capture the proteome interacting with a histone methylation modification (H3K4me3), which is useful for analyzing the biological functions of H3K4me3 in life processes.
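Example 11 gives some centrifugation steps in g (1000 g) and others in rpm (12000 rpm, 18000 rpm) without stating the rotor. The two are related by the standard formula RCF = 1.118 x 10^-5 x r x rpm^2, with the rotor radius r in centimetres. The sketch below applies that formula; the 8 cm radius is a hypothetical value chosen only for illustration and is not taken from this document.

```python
RCF_CONSTANT = 1.118e-5  # standard constant for radius in cm and speed in rpm


def rpm_to_rcf(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) for a rotor speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rcf_to_rpm(rcf, radius_cm):
    """Rotor speed in rpm needed to reach a given relative centrifugal force."""
    return (rcf / (RCF_CONSTANT * radius_cm)) ** 0.5


# Hypothetical rotor radius; substitute the value for the rotor actually used.
ASSUMED_RADIUS_CM = 8.0

for speed in (12000, 18000):
    print(f"{speed} rpm at {ASSUMED_RADIUS_CM} cm is about "
          f"{rpm_to_rcf(speed, ASSUMED_RADIUS_CM):.0f} x g")
```

Because the force scales with the square of the speed and linearly with the radius, reporting the g value (or the rotor radius alongside the rpm) is what makes a spin reproducible on a different centrifuge.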
In summary, the invention provides a method for synthesizing tryptophan analogues and uses these compounds to markedly strengthen the cation-pi interaction. It thereby provides a research method for studying biomacromolecules that rely on cation-pi interactions, a theoretical basis for developing super-affinity biotechnology based on decoding proteins of histone methylation modifications, and the possibility of further applications, and it has considerable clinical value as well as development and application value.
It should be understood that the foregoing detailed description is provided for illustration only, and that the invention is not limited to the technical solutions described in the embodiments. Those skilled in the art should understand that the invention may be modified or equivalently substituted to achieve the same technical effects; as long as the use requirements are met, such variants fall within the protection scope of the invention.
Sequence listing
<110> Zhejiang University
<120> Method for improving cation-pi interaction by utilizing genetic code expansion and application thereof
<130> ZJDX-002
<160> 49
<170> SIPOSequenceListing 1.0
<210> 1
<211> 1668
<212> DNA
<213> Synthesis (synthetic sequence)
<400> 1
atggataaga agccgctgga tgttctgatc tctgcgaccg gtctgtggat gtcccgtacc 60
ggcacgctgc acaagatcaa gcactatgag atttctcgtt ctaaaatcta catcgaaatg 120
gcgtgtggtg accatctggt tgtgaacaac tctcgttctt gtcgtcccgc acgtgcattc 180
cgttatcata aataccgtaa aacctgcaaa cgttgtcgtg tttctgacga agatatcaac 240
aacttcctga cccgttctac cgaaggcaaa acctctgtta aagttaaagt tgtttctgag 300
ccgaaagtga aaaaagcgat gccgaaatct gtttctcgtg cgccgaaacc gctggaaaat 360
ccggtttctg cgaaagcgtc taccgacacc tctcgttctg ttccgtctcc ggcgaaatct 420
accccgaact ctccggttcc gacctctgca agcgccccag ctctgactaa atcccagacg 480
gaccgtctgg aggtgctgct gaacccaaag gatgaaatct ctctgaacag cggcaagcct 540
ttccgtgagc tggaaagcga gctgctgtct cgtcgtaaaa aggatctgca acagatctac 600
gctgaggaac gcgagggtgg cggaagcggc ggcggtggcg gaagcggcgg cggtggcgga 660
agcggcggcg gtggaagcca ggcctgggga tcgaggcctc ctgcagcaga gtgtgccacc 720
caaagagctc caggcagtgt ggtggagctg ctgggcaaat cctaccctca ggacgaccac 780
agcaacctca cccggaaggt cctcaccaga gttggcagga acctgcacaa ccagcagcat 840
caccctctgt ggctgatcaa ggagagggtg ttggagcact tcaacaagca gtatgtgggc 900
agctctggga ccccgttgtt ctcggtctat gacaaccttt cgccagtggt cacgacctgg 960
cagaactttg acagcctgct catcccagct gatcacccct gcaggaagaa gggggacaac 1020
tattacctga atcggactca catgctgaga gcgcacacgt ccgcacacca gtgggacttg 1080
ctgcacgcgg gactggatgc cttcctggtg gtgggtgatg tctacaggcg tgaccagatc 1140
gactcccagc actaccctat tttccaccag ctggacgccg gtcggctctt ctctaagcat 1200
gagttatttg ctggtataaa ggatggggaa agcctgcagc tctttgaaca aagttctcgc 1260
tctgcgcata aacaagagac acacaccatg gaggccgtga agcttgttga gtttgatctt 1320
aagcaaacgc ttaccaggct catggcacat ctttttggag atgagccgga gataaggtgg 1380
gtagactgct acgttccttt tggacatcct tcctttgaga tggagatcaa ctttcatgga 1440
gaatggctgg aagttcttgg ctgcggggtg gttgaacaac aactggtcaa ttcagctggt 1500
gctcaagacc gaatcggctg gggatttggc ctagggttag aaaggctagc catgatcctc 1560
tacgacatcc ctgatatccg tctcttctgg tgtgaggacg agcgcttcct gaagcagttc 1620
tgtgtatcca acattaatca gaaggtgaag tttcagcctc ttagcaaa 1668
<210> 2
<211> 556
<212> PRT
<213> Synthesis (synthetic sequence)
<400> 2
Met Asp Lys Lys Pro Leu Asp Val Leu Ile Ser Ala Thr Gly Leu Trp
1 5 10 15
Met Ser Arg Thr Gly Thr Leu His Lys Ile Lys His Tyr Glu Ile Ser
20 25 30
Arg Ser Lys Ile Tyr Ile Glu Met Ala Cys Gly Asp His Leu Val Val
35 40 45
Asn Asn Ser Arg Ser Cys Arg Pro Ala Arg Ala Phe Arg Tyr His Lys
50 55 60
Tyr Arg Lys Thr Cys Lys Arg Cys Arg Val Ser Asp Glu Asp Ile Asn
65 70 75 80
Asn Phe Leu Thr Arg Ser Thr Glu Gly Lys Thr Ser Val Lys Val Lys
85 90 95
Val Val Ser Glu Pro Lys Val Lys Lys Ala Met Pro Lys Ser Val Ser
100 105 110
Arg Ala Pro Lys Pro Leu Glu Asn Pro Val Ser Ala Lys Ala Ser Thr
115 120 125
Asp Thr Ser Arg Ser Val Pro Ser Pro Ala Lys Ser Thr Pro Asn Ser
130 135 140
Pro Val Pro Thr Ser Ala Ser Ala Pro Ala Leu Thr Lys Ser Gln Thr
145 150 155 160
Asp Arg Leu Glu Val Leu Leu Asn Pro Lys Asp Glu Ile Ser Leu Asn
165 170 175
Ser Gly Lys Pro Phe Arg Glu Leu Glu Ser Glu Leu Leu Ser Arg Arg
180 185 190
Lys Lys Asp Leu Gln Gln Ile Tyr Ala Glu Glu Arg Glu Gly Gly Gly
195 200 205
Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Gly Gly Gly
210 215 220
Gly Ser Gln Ala Trp Gly Ser Arg Pro Pro Ala Ala Glu Cys Ala Thr
225 230 235 240
Gln Arg Ala Pro Gly Ser Val Val Glu Leu Leu Gly Lys Ser Tyr Pro
245 250 255
Gln Asp Asp His Ser Asn Leu Thr Arg Lys Val Leu Thr Arg Val Gly
260 265 270
Arg Asn Leu His Asn Gln Gln His His Pro Leu Trp Leu Ile Lys Glu
275 280 285
Arg Val Leu Glu His Phe Asn Lys Gln Tyr Val Gly Ser Ser Gly Thr
290 295 300
Pro Leu Phe Ser Val Tyr Asp Asn Leu Ser Pro Val Val Thr Thr Trp
305 310 315 320
Gln Asn Phe Asp Ser Leu Leu Ile Pro Ala Asp His Pro Cys Arg Lys
325 330 335
Lys Gly Asp Asn Tyr Tyr Leu Asn Arg Thr His Met Leu Arg Ala His
340 345 350
Thr Ser Ala His Gln Trp Asp Leu Leu His Ala Gly Leu Asp Ala Phe
355 360 365
Leu Val Val Gly Asp Val Tyr Arg Arg Asp Gln Ile Asp Ser Gln His
370 375 380
Tyr Pro Ile Phe His Gln Leu Asp Ala Gly Arg Leu Phe Ser Lys His
385 390 395 400
Glu Leu Phe Ala Gly Ile Lys Asp Gly Glu Ser Leu Gln Leu Phe Glu
405 410 415
Gln Ser Ser Arg Ser Ala His Lys Gln Glu Thr His Thr Met Glu Ala
420 425 430
Val Lys Leu Val Glu Phe Asp Leu Lys Gln Thr Leu Thr Arg Leu Met
435 440 445
Ala His Leu Phe Gly Asp Glu Pro Glu Ile Arg Trp Val Asp Cys Tyr
450 455 460
Val Pro Phe Gly His Pro Ser Phe Glu Met Glu Ile Asn Phe His Gly
465 470 475 480
Glu Trp Leu Glu Val Leu Gly Cys Gly Val Val Glu Gln Gln Leu Val
485 490 495
Asn Ser Ala Gly Ala Gln Asp Arg Ile Gly Trp Gly Phe Gly Leu Gly
500 505 510
Leu Glu Arg Leu Ala Met Ile Leu Tyr Asp Ile Pro Asp Ile Arg Leu
515 520 525
Phe Trp Cys Glu Asp Glu Arg Phe Leu Lys Gln Phe Cys Val Ser Asn
530 535 540
Ile Asn Gln Lys Val Lys Phe Gln Pro Leu Ser Lys
545 550 555
<210> 3
<211> 201
<212> DNA
<213> Human (Homo sapiens)
<400> 3
atgagcggtg cagaagaatc agatgatgaa aatgcagttt gtgcagcaca gaattgtcag 60
cgcccgtgta aagataaagt tgattaggtt cagtgtgatg gtggttgtga tgaatggttt 120
catcaggttt gtgttggtgt tagcccggaa atggcagaaa atgaagatta tatttgcatc 180
aactgcgcaa aaaaacaggg t 201
<210> 4
<211> 201
<212> DNA
<213> Human (Homo sapiens)
<400> 4
atgagcggtg cagaagaatc agatgatgaa aatgcagttt gtgcagcaca gaattgtcag 60
cgcccgtgta aagataaagt tgattgggtt cagtgtgatg gtggttgtga tgaatagttt 120
catcaggttt gtgttggtgt tagcccggaa atggcagaaa atgaagatta tatttgcatc 180
aactgcgcaa aaaaacaggg t 201
<210> 5
<211> 198
<212> DNA
<213> Human (Homo sapiens)
<400> 5
atggcaagtc aggaatttga agtagaagca attgttgata aacgtcaaga taaaaacggt 60
aatacccaat atctggttcg ttggaaaggt tatgataaac aggatgatac atgggaaccg 120
gaacagcatc tgatgaattg tgaaaaatgt gtgcatgatt tcaaccgtcg ccaaaccgaa 180
aaacagaaag gtggaagc 198
<210> 6
<211> 66
<212> PRT
<213> Human (Homo sapiens)
<400> 6
Met Ala Ser Gln Glu Phe Glu Val Glu Ala Ile Val Asp Lys Arg Gln
1 5 10 15
Asp Lys Asn Gly Asn Thr Gln Tyr Leu Val Arg Trp Lys Gly Tyr Asp
20 25 30
Lys Gln Asp Asp Thr Trp Glu Pro Glu Gln His Leu Met Asn Cys Glu
35 40 45
Lys Cys Val His Asp Phe Asn Arg Arg Gln Thr Glu Lys Gln Lys Gly
50 55 60
Gly Ser
65
<210> 7
<211> 579
<212> DNA
<213> Human (Homo sapiens)
<400> 7
atgaatggct gggtacctgt tggggctgcg tgtgagaagg ctgtgtatgt cttggatgag 60
ccggagccag ccatccgaaa gagctaccag gcggtagagc ggcatgggga gacaatccga 120
gtccgggaca ccgtccttct caaatcaggc ccacgaaaga cctccacacc ttatgtggcc 180
aagatctctg ccctctggga gaaccccgag tcaggagagc tgatgatgag cctcctgtgg 240
tattacagac ctgagcactt acagggaggc cgcagtccca gcatgcacga gcccttgcag 300
aatgaagtgt ttgcatcgcg acatcaggac cagaacagtg tggcctgcat tgaggagaag 360
tgctatgtgc tgacttttgc cgagtactgc aggttctgtg ccatggccaa gcgccgaggt 420
gaaggcctcc ccagccgaaa gacagcactg gttcccccct ctgcagacta ttccacccca 480
ccccaccgca cagtgccaga ggacacggac cctgagctgg tgttcctttg ccgccatgtc 540
tatgacttcc gccacgggcg catccttaag aacccccag 579
<210> 8
<211> 193
<212> PRT
<213> Human (Homo sapiens)
<400> 8
Met Asn Gly Trp Val Pro Val Gly Ala Ala Cys Glu Lys Ala Val Tyr
1 5 10 15
Val Leu Asp Glu Pro Glu Pro Ala Ile Arg Lys Ser Tyr Gln Ala Val
20 25 30
Glu Arg His Gly Glu Thr Ile Arg Val Arg Asp Thr Val Leu Leu Lys
35 40 45
Ser Gly Pro Arg Lys Thr Ser Thr Pro Tyr Val Ala Lys Ile Ser Ala
50 55 60
Leu Trp Glu Asn Pro Glu Ser Gly Glu Leu Met Met Ser Leu Leu Trp
65 70 75 80
Tyr Tyr Arg Pro Glu His Leu Gln Gly Gly Arg Ser Pro Ser Met His
85 90 95
Glu Pro Leu Gln Asn Glu Val Phe Ala Ser Arg His Gln Asp Gln Asn
100 105 110
Ser Val Ala Cys Ile Glu Glu Lys Cys Tyr Val Leu Thr Phe Ala Glu
115 120 125
Tyr Cys Arg Phe Cys Ala Met Ala Lys Arg Arg Gly Glu Gly Leu Pro
130 135 140
Ser Arg Lys Thr Ala Leu Val Pro Pro Ser Ala Asp Tyr Ser Thr Pro
145 150 155 160
Pro His Arg Thr Val Pro Glu Asp Thr Asp Pro Glu Leu Val Phe Leu
165 170 175
Cys Arg His Val Tyr Asp Phe Arg His Gly Arg Ile Leu Lys Asn Pro
180 185 190
Gln
<210> 9
<211> 207
<212> DNA
<213> Human (Homo sapiens)
<400> 9
atggagtatc aggatgggaa ggagtttgga ataggggacc tcgtgtgggg aaagatcaag 60
ggcttctcct ggtggcccgc catggtggtg tcttggaagg ccacctccaa gcgacaggct 120
atgtctggca tgcggtgggt ccagtggttt ggcgatggca agttctccga ggtctctgca 180
gacaaactgg tggcactggg gctgttc 207
<210> 10
<211> 69
<212> PRT
<213> Human (Homo sapiens)
<400> 10
Met Glu Tyr Gln Asp Gly Lys Glu Phe Gly Ile Gly Asp Leu Val Trp
1 5 10 15
Gly Lys Ile Lys Gly Phe Ser Trp Trp Pro Ala Met Val Val Ser Trp
20 25 30
Lys Ala Thr Ser Lys Arg Gln Ala Met Ser Gly Met Arg Trp Val Gln
35 40 45
Trp Phe Gly Asp Gly Lys Phe Ser Glu Val Ser Ala Asp Lys Leu Val
50 55 60
Ala Leu Gly Leu Phe
65
<210> 11
<211> 417
<212> DNA
<213> Human (Homo sapiens)
<400> 11
atgagcggtg cagaagaatc agatgatgaa aatgcagttt gtgcagcaca gaattgtcag 60
cgcccgtgta aagataaagt tgattgggtt cagtgtgatg gtggttgtga tgaatagttt 120
catcaggttt gtgttggtgt tagcccggaa atggcagaaa atgaagatta tatttgcatc 180
aactgcgcaa aaaaacaggg tggcagcagc ggcagcagca gcggtgcaga agaatcagat 240
gatgaaaatg cagtttgtgc agcacagaat tgtcagcgcc cgtgtaaaga taaagttgat 300
tgggttcagt gtgatggtgg ttgtgatgaa tagtttcatc aggtttgtgt tggtgttagc 360
ccggaaatgg cagaaaatga agattatatt tgcatcaact gcgcaaaaaa acagggt 417
<210> 12
<211> 636
<212> DNA
<213> Human (Homo sapiens)
<400> 12
atgagcggtg cagaagaatc agatgatgaa aatgcagttt gtgcagcaca gaattgtcag 60
cgcccgtgta aagataaagt tgattgggtt cagtgtgatg gtggttgtga tgaatggttt 120
catcaggttt gtgttggtgt tagcccggaa atggcagaaa atgaagatta tatttgcatc 180
aactgcgcaa aaaaacaggg tggcagcagc ggcagcagca gcggtgcaga agaatcagat 240
gatgaaaatg cagtttgtgc agcacagaat tgtcagcgcc cgtgtaaaga taaagttgat 300
tgggttcagt gtgatggtgg ttgtgatgaa tggtttcatc aggtttgtgt tggtgttagc 360
ccggaaatgg cagaaaatga agattatatt tgcatcaact gcgcaaaaaa acagggtctg 420
gtgccgcgcg gcagcagcag cggtgcagaa gaatcagatg atgaaaatgc agtttgtgca 480
gcacagaatt gtcagcgccc gtgtaaagat aaagttgatt gggttcagtg tgatggtggt 540
tgtgatgaat ggtttcatca ggtttgtgtt ggtgttagcc cggaaatggc agaaaatgaa 600
gattatattt gcatcaactg cgcaaaaaaa cagggt 636
<210> 13
<211> 747
<212> DNA
<213> Soybean (Glycine max)
<400> 13
ggaaagtctt acccaactgt gagtgctgat taccaggacg ccgttgagaa ggcgaagaag 60
aagctcagag gcttcatcgc tgagaagaga tgcgctcctc taatgctccg tttggcattc 120
cactctgctg gaacctttga caagggcacg aagaccggtg gacccttcgg aaccatcaag 180
caccctgccg aactggctca cagcgctaac aacggtcttg acatcgctgt taggcttttg 240
gagccactca aggcggagtt ccctattttg agctacgccg atttctacca gttggctggc 300
gttgttgccg ttgaggtcac gggtggacct aaggttccat tccaccctgg aagagaggac 360
aagcctgagc caccaccaga gggtcgcttg cccgatccca ctaagggttc tgaccatttg 420
agagatgtgt ttggcaaagc tatggggctt actgaccaag atatcgttgc tctatctggg 480
ggtcacacta ttggagctgc acacaaggag cgttctggat ttgagggtcc ctggacctct 540
aatcctctta ttttcgacaa ctcatacttc acggagttgt tgagtggtga gaaggaaggt 600
ctccttcagc taccttctga caaggctctt ttgtctgacc ctgtattccg ccctctcgtt 660
gacaaatatg cagcggacga agatgccttc tttgctgatt acgctgaggc tcaccaaaag 720
ctttccgagc ttgggtttgc tgatgcc 747
<210> 14
<211> 249
<212> PRT
<213> Soybean (Glycine max)
<400> 14
Gly Lys Ser Tyr Pro Thr Val Ser Ala Asp Tyr Gln Asp Ala Val Glu
1 5 10 15
Lys Ala Lys Lys Lys Leu Arg Gly Phe Ile Ala Glu Lys Arg Cys Ala
20 25 30
Pro Leu Met Leu Arg Leu Ala Phe His Ser Ala Gly Thr Phe Asp Lys
35 40 45
Gly Thr Lys Thr Gly Gly Pro Phe Gly Thr Ile Lys His Pro Ala Glu
50 55 60
Leu Ala His Ser Ala Asn Asn Gly Leu Asp Ile Ala Val Arg Leu Leu
65 70 75 80
Glu Pro Leu Lys Ala Glu Phe Pro Ile Leu Ser Tyr Ala Asp Phe Tyr
85 90 95
Gln Leu Ala Gly Val Val Ala Val Glu Val Thr Gly Gly Pro Lys Val
100 105 110
Pro Phe His Pro Gly Arg Glu Asp Lys Pro Glu Pro Pro Pro Glu Gly
115 120 125
Arg Leu Pro Asp Pro Thr Lys Gly Ser Asp His Leu Arg Asp Val Phe
130 135 140
Gly Lys Ala Met Gly Leu Thr Asp Gln Asp Ile Val Ala Leu Ser Gly
145 150 155 160
Gly His Thr Ile Gly Ala Ala His Lys Glu Arg Ser Gly Phe Glu Gly
165 170 175
Pro Trp Thr Ser Asn Pro Leu Ile Phe Asp Asn Ser Tyr Phe Thr Glu
180 185 190
Leu Leu Ser Gly Glu Lys Glu Gly Leu Leu Gln Leu Pro Ser Asp Lys
195 200 205
Ala Leu Leu Ser Asp Pro Val Phe Arg Pro Leu Val Asp Lys Tyr Ala
210 215 220
Ala Asp Glu Asp Ala Phe Phe Ala Asp Tyr Ala Glu Ala His Gln Lys
225 230 235 240
Leu Ser Glu Leu Gly Phe Ala Asp Ala
245
<210> 15
<211> 47
<212> DNA
<213> Artificial sequence (synthetic sequence)
<220>
<221> misc_feature
<222> (21)..(22)
<223> n is a, c, g, or t
<400> 15
taagatgggt agactgctac nnkccttttg gtcatccttc ttttgag 47
<210> 16
<211> 25
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 16
gtagcagtct acccatctta tctcc 25
<210> 17
<211> 46
<212> DNA
<213> Artificial sequence (synthetic sequence)
<220>
<221> misc_feature
<222> (21)..(22)
<223> n is a, c, g, or t
<400> 17
aagttcttgg ctgcggggtg nnkgaacaac aactggtcaa ttcagc 46
<210> 18
<211> 20
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 18
caccccgcag ccaagaactt 20
<210> 19
<211> 50
<212> DNA
<213> Artificial sequence (synthetic sequence)
<220>
<221> misc_feature
<222> (21)..(22)
<223> n is a, c, g, or t
<220>
<221> misc_feature
<222> (27)..(28)
<223> n is a, c, g, or t
<400> 19
accctatttt ccaccagctg nnkgccnnkc ggctcttctc caagcatgag 50
<210> 20
<211> 24
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 20
cagctggtgg aaaatagggt agtg 24
<210> 21
<211> 46
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 21
ctggtgccgc gcggcagcat gtcccctata ctaggttatt ggaaaa 46
<210> 22
<211> 40
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 22
gtggcgacca tcctccaaaa tgaagcatgc accattcctt 40
<210> 23
<211> 45
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 23
taagaaggag atatacatat gagcggtgca gaagaatcag atgat 45
<210> 24
<211> 44
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 24
catgctgccg cgcggcacca gaccctgttt ttttgcgcag ttga 44
<210> 25
<211> 22
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 25
tgaagcatgc accattcctt gc 22
<210> 26
<211> 58
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 26
catatgtata tctccttctt aaagttaaac aaaattattt ctagcccaaa aaaacggg 58
<210> 27
<211> 46
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 27
cgtgtaaaga taaagttgat taggttcagt gtgatggtgg ttgtga 46
<210> 28
<211> 25
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 28
atcaacttta tctttacacg ggcgc 25
<210> 29
<211> 27
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 29
tttcatcagg tttgtgttgg tgttagc 27
<210> 30
<211> 47
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 30
ccaacacaaa cctgatgaaa ctattcatca caaccaccat cacactg 47
<210> 31
<211> 59
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 31
actgcgcaaa aaaacagggt ggcagcagcg gcagcagcag cggtgcagaa gaatcagat 59
<210> 32
<211> 42
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 32
atgctgccgc gcggcaccag accctgtttt tttgcgcagt tg 42
<210> 33
<211> 39
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 33
accctgtttt tttgcgcagt tgatgcaaat ataatcttc 39
<210> 34
<211> 42
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 34
atgctgccgc gcggcaccag gcttccacct ttctgttttt cg 42
<210> 35
<211> 59
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 35
actgcgcaaa aaaacagggt ggcagcagcg gcagcagcgc aagtcaggaa tttgaagta 59
<210> 36
<211> 40
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 36
ggcagcagcg gcagcagcgt gagcaagggc gaggagctgt 40
<210> 37
<211> 22
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 37
catggtggcg accggtagcg ct 22
<210> 38
<211> 42
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 38
cgctaccggt cgccaccatg agcggtgcag aagaatcaga tg 42
<210> 39
<211> 43
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 39
cgctgctgcc gctgctgcca ccctgttttt ttgcgcagtt gat 43
<210> 40
<211> 43
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 40
ctgcacggaa gcttgccacc atggataaga agccgctgga tgt 43
<210> 41
<211> 46
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 41
tagtgatggt gatggtggtg tttgctaaga ggctgaaact tcacct 46
<210> 42
<211> 25
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 42
caccaccatc accatcacta aaccc 25
<210> 43
<211> 22
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 43
ggtggcaagc ttccgtgcag tt 22
<210> 44
<211> 23
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 44
taactagtcc actgagatcg acg 23
<210> 45
<211> 26
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 45
cttatcgtcg tcatccttgt agtcca 26
<210> 46
<211> 46
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 46
tctggcagcg gttctgctag cggaaagtct tacccaactg tgagtg 46
<210> 47
<211> 40
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 47
cgatctcagt ggactagtta ggcatcagca aacccaagct 40
<210> 48
<211> 41
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 48
acaaggatga cgacgataag agcggtgcag aagaatcaga t 41
<210> 49
<211> 51
<212> DNA
<213> Artificial sequence (synthetic sequence)
<400> 49
gctagcagaa ccgctgccag aaccgctgcc accctgtttt tttgcgcagt t 51
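
Primers SEQ ID NO: 15, 17 and 19 in the listing above contain the degenerate codon nnk (n = a, c, g or t; k = g or t under the IUPAC nucleotide code). As a rough illustration of what such a codon can encode, the sketch below expands a degenerate codon against the standard genetic code. The helper names are illustrative, and only the two IUPAC symbols needed here are included.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Only the IUPAC ambiguity codes used by the nnk primers.
IUPAC = {"N": "ACGT", "K": "GT"}


def expand_degenerate_codon(codon):
    """Return every concrete codon matched by a degenerate codon such as NNK."""
    pools = [IUPAC.get(base, base) for base in codon.upper()]
    return ["".join(choice) for choice in product(*pools)]


def summarize(codon):
    """Count codons, list encoded amino acids, and list stop codons."""
    codons = expand_degenerate_codon(codon)
    amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})
    stops = [c for c in codons if CODON_TABLE[c] == "*"]
    return len(codons), amino_acids, stops


n_codons, amino_acids, stops = summarize("NNK")
print(n_codons, "codons;", len(amino_acids), "amino acids;", "stop codons:", stops)
```

For a primer carrying two nnk positions, such as SEQ ID NO: 19, the number of codon combinations is the square of the single-site count.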

Claims (5)

1. A method for improving cation-pi interaction by using genetic code expansion, characterized in that tryptophan forming the aromatic cage of a cation-pi interaction in a biomolecule is replaced with a tryptophan analogue by genetic code expansion technology so as to increase the binding energy of the cation-pi interaction, the method comprising the following specific steps:
S1, designing and synthesizing a tryptophan analogue bearing a strongly electron-donating side-chain substituent, wherein the tryptophan analogue is an unnatural amino acid selected from one of 6-methoxy-tryptophan (A2), 7-methyl-tryptophan (A3) and 7-methoxy-tryptophan (A4), and the structural formulas of the tryptophan analogues A2 to A4 are as follows:
S2, providing a chimeric phenylalanine aminoacyl-tRNA synthetase mutant capable of specifically recognizing the tryptophan analogues A2 to A4, wherein the nucleotide sequence and the amino acid sequence of the chimeric phenylalanine aminoacyl-tRNA synthetase mutant are shown in SEQ ID NO: 1-2;
S3, taking a biomolecule that forms a cation-pi interaction as the research object, and using genetic code expansion technology to introduce the tryptophan analogue site-specifically into the biomolecule by means of the chimeric phenylalanine aminoacyl-tRNA synthetase mutant, to obtain a protein bearing the tryptophan analogue.
2. The method according to claim 1, characterized in that the tryptophan analogues are synthesized by: reacting an indole B substituted at different positions as the reactant to obtain the target product,
the chemical structural formula of the indole substituted at different positions is as follows: [structural formula not reproduced]; the general structural formula of the obtained target product is as follows: [structural formula not reproduced]; wherein X is selected from one of an oxygen atom or a carbon atom.
3. The method according to claim 1, characterized in that, in step S3:
(1) the tryptophan site of the decoding protein that forms the aromatic cage is mutated to a stop codon,
(2) the decoding protein mutant and the chimeric phenylalanine aminoacyl-tRNA synthetase mutant are co-transformed, and the corresponding tryptophan analogue is added during expression,
(3) the decoding protein variant is purified by the GST-tag protein purification method, and the fidelity of the decoding protein variant is verified by LC-MS.
4. The method according to claim 1, characterized in that: the method takes a histone methylation decoding protein domain as the research object, wherein the decoding protein domain is any one of the Chromo, PHD, PWWP, Tudor, MBT, CW, SPIN and BAH domains.
5. Use of a protein bearing a tryptophan analogue obtained by the method of claim 1 for establishing a decoding protein super-affinity recognition system that specifically recognizes histone methylation modifications.
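
Step (1) of claim 3 replaces a tryptophan codon with a stop codon, and SEQ ID NO: 3 and SEQ ID NO: 4 in the sequence listing differ from each other only in where TGG has been changed to TAG. A small sketch for locating in-frame codons is given below. It uses the first 120 nucleotides of each sequence as printed in the listing; the function name is illustrative, and the reported positions count the ATG start codon as codon 1, which need not match the residue numbering used in construct names such as PHD-W28TAG.

```python
def find_inframe_codons(dna, codon="TAG"):
    """Return 1-based codon positions where `codon` occurs in reading frame 1.

    Whitespace in `dna` is ignored so that sequence-listing blocks of ten
    nucleotides can be pasted directly.
    """
    seq = "".join(dna.split()).upper()
    target = codon.upper()
    return [
        i // 3 + 1
        for i in range(0, len(seq) - 2, 3)
        if seq[i:i + 3] == target
    ]


# First 120 nt of SEQ ID NO: 3 and SEQ ID NO: 4, copied from the listing.
SEQ3_START = (
    "atgagcggtg cagaagaatc agatgatgaa aatgcagttt gtgcagcaca gaattgtcag "
    "cgcccgtgta aagataaagt tgattaggtt cagtgtgatg gtggttgtga tgaatggttt"
)
SEQ4_START = (
    "atgagcggtg cagaagaatc agatgatgaa aatgcagttt gtgcagcaca gaattgtcag "
    "cgcccgtgta aagataaagt tgattgggtt cagtgtgatg gtggttgtga tgaatagttt"
)

for name, seq in (("SEQ ID NO: 3", SEQ3_START), ("SEQ ID NO: 4", SEQ4_START)):
    print(name,
          "amber (TAG) codons:", find_inframe_codons(seq, "TAG"),
          "Trp (TGG) codons:", find_inframe_codons(seq, "TGG"))
```

Checking only reading frame 1 matters here: a TAG that appears out of frame does not terminate translation and is not a site for amber suppression.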
CN202210140263.7A 2022-02-16 2022-02-16 Method for improving cation-pi interaction by utilizing genetic code expansion and application Active CN114940979B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210140263.7A CN114940979B (en) 2022-02-16 2022-02-16 Method for improving cation-pi interaction by utilizing genetic code expansion and application

Publications (2)

Publication Number Publication Date
CN114940979A CN114940979A (en) 2022-08-26
CN114940979B true CN114940979B (en) 2024-01-23

Family

ID=82905867

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210140263.7A Active CN114940979B (en) 2022-02-16 2022-02-16 Method for improving cation-pi interaction by utilizing genetic code expansion and application

Country Status (1)

Country Link
CN (1) CN114940979B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2006330947A1 (en) * 2005-12-22 2007-07-05 Pacific Biosciences Of California, Inc. Polymerases for nucleotide analogue incorporation
CN110172467A (en) * 2019-05-24 2019-08-27 浙江大学 It is a kind of to construct orthogonal aminoacyl-tRNA synthetase/tRNA system using chimeric design method
CN111118048A (en) * 2019-11-11 2020-05-08 浙江大学 Use of chimeric phenylalanyl-tRNA synthetases/tRNAs
CN116990270A (en) * 2023-06-26 2023-11-03 浙江大学绍兴研究院 Method for improving pi effect in living cells by utilizing genetic code expansion and application

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
An expression system for the efficient incorporation of an expanded set of tryptophan analogues; Petrovic, DM et al.; Amino Acids; Vol. 44 (No. 5); pp. 1329-1336 *
Effects of lysine acetylation in a beta-hairpin peptide: Comparison of an amide-pi and a cation-pi interaction; Hughes, RM et al.; Journal of the American Chemical Society; Vol. 128 (No. 41); pp. 13586-13591 *
Probing the Active Site of Deubiquitinase USP30 with Noncanonical Tryptophan Analogues; Han-Kai Jiang et al.; Biochemistry; Vol. 59; pp. 2205-2209 *
Strictly Conserved Residues in Euphorbia tirucalli β-Amyrin Cyclase: Trp612 Stabilizes Transient Cation through Cation-π Interaction and CH-π Interaction of Tyr736 with Leu734 Confers Robust Local Protein Architecture; Aiba, Y et al.; Chembiochem; Vol. 19 (No. 5); pp. 486-495 *
The Role of Tryptophan in π Interactions in Proteins: An Experimental Approach; Jinfeng Shao et al.; Journal of the American Chemical Society; Vol. 144; pp. 13815-13822 *
Research progress on reader recognition mechanisms of histone methylation; Zhao Shuai; Su Xiaonan; Li Yuanyuan; Li Haitao; Science & Technology Review (No. 08); full text *
Application of genetic code expansion technology in the study of protein post-translational modifications; Zang Jia; China Doctoral Dissertations Full-text Database (Electronic Journal), Basic Sciences (No. 2); full text *
Quantum chemical study of the interaction between lithium ions and aromatic systems; Ma Guozheng et al.; Acta Chimica Sinica; Vol. 62 (No. 19); pp. 1871-1876 *

Also Published As

Publication number Publication date
CN114940979A (en) 2022-08-26

Similar Documents

Publication Publication Date Title
EP2438174B1 (en) METHOD FOR INCORPORATING ALIPHATIC AMINO ACIDS COMPRISING ALKYNE, AZIDE OR ALIPHATIC KETONE FUNCTIONAL GROUPS USING APPROPRIATE tRNA/tRNA SYNTHASE PAIRS
JP6603390B2 (en) Novel peptide library and use thereof
US20170152287A1 (en) Methods and compositions for site-specific labeling of peptides and proteins
AU2017342062B2 (en) Archaeal pyrrolysyl tRNA synthetases for orthogonal use
CN110577564A (en) Polypeptides and methods
CN112358414B (en) Unnatural amino acids and their use in protein site-directed modification and protein interactions
US20120083599A1 (en) Biomolecular Labelling Using Multifunctional Biotin Analogues
EP3030545A2 (en) Amino acid derivatives
CN114940979B (en) Method for improving cation-pi interaction by utilizing genetic code expansion and application
WO2011024887A1 (en) Conjugate containing cyclic peptide and method for producing same
CN114292270B (en) BTK inhibitor and preparation method and application thereof
KR101910169B1 (en) Methods for identification of proteins using phenolic compounds for protein labeling
CN114231553A (en) High-throughput screening method of signal peptide library based on fluorescent probe Rho-IDA-CoII
JP5686385B2 (en) Methods for fluorescently labeling proteins
US20040138446A1 (en) Biotin derivatives, methods for making same and uses thereof as vectors
US20230139680A1 (en) A genetically encoded, phage-displayed cyclic peptide library and methods of making the same
CN117904060B (en) PARKIN-based targeted protein ubiquitination degradation agent and preparation method and application thereof
WO2017113428A1 (en) Preparation of unnatural cyanoamino acids and application thereof in bioorthogonal raman detection
CN117730076A (en) Amino acids containing tetrazine moieties
Metts Synthesis and Validation of a Trifunctional Trimethoprim-based Probe for Use with Degradation Domain System
WO2024061353A1 (en) Crystal form of quinazoline compound and preparation method therefor
CN117292741A (en) Method for developing cell membrane-binding and/or serum albumin lipidation analogues by using computer aided design and application
WO2023004687A1 (en) Unnatural amino acid and application thereof, recombinant protein containing same, and recombinant protein conjugate
KR101868917B1 (en) Phenolic compound for labeling to protein and preparation method thereof
JP2024126404A (en) How to fluorescently label proteins

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 2023-01-16

Address after: 311100 Room 520, Building 2, No. 366, Tongyun Street, Liangzhu Street, Yuhang District, Hangzhou City, Zhejiang Province

Applicant after: Hangzhou Chihua Hesheng Pharmaceutical Technology Co.,Ltd.

Address before: 310058 No. 866 Yuhangtang Road, Xihu District, Hangzhou City, Zhejiang Province

Applicant before: ZHEJIANG University

GR01 Patent grant