CN114276460B

CN114276460B - Modified GPR75 and use thereof

Info

Publication number: CN114276460B
Application number: CN202210205737.1A
Authority: CN
Inventors: 衡杰; 郭涵博; 杨怡然; 何鋆彤; 李京; 卓微; 倪晓丹
Original assignee: Shuimu Future Beijing Technology Co ltd
Current assignee: Shuimu Future Beijing Technology Co ltd
Priority date: 2022-03-04
Filing date: 2022-03-04
Publication date: 2022-06-24
Anticipated expiration: 2042-03-04
Also published as: CN114276460A; WO2023165108A1

Abstract

The invention discloses a modified GPR75 and uses thereof. The modified GPR75 comprises: a first domain comprising an amino acid sequence derived from the β2 adrenergic receptor; and a second domain in which, relative to wild-type GPR75, the random coil sequence between the fifth and sixth transmembrane helices and the N-terminal and C-terminal random coil sequences are deleted, and the fifth and sixth transmembrane helices are linked by an amino acid sequence derived from a BRIL fusion protein. The modified GPR75 provided by the invention can be used for GPR75 structural analysis, fluorescent molecular labeling, fusion with phosphorylated polypeptides or signaling proteins, GPR75 activity analysis, screening of nucleic acid-encoded small-molecule libraries, computer-aided drug design and drug screening.

Description

Modified GPR75 and use thereof

Technical Field

The invention belongs to the technical field of biology, and particularly relates to a modified GPR75 and application thereof.

Background

G protein-coupled receptors are the largest class of cell membrane receptors in the human body. The completion of the Human Genome Project provided a basis for analyzing the distribution, sequences and functions of the family members¹. The human body has over 800 G protein-coupled receptors, of which about 370 are non-olfactory receptors and over 400 are olfactory receptors. These receptors are classified into six subfamilies: the rhodopsin family, the adhesion family, the secretin receptor family, the glutamate receptor family, the Frizzled family and the taste family. The different G protein-coupled receptors mediate a series of important biological functions, ranging from chemical perception and recognition (vision, smell, taste) to regulation by endocrine molecules². This functional importance stems from the fact that receptors of this family recognize a wide variety of ligands. Common ligands include monoamines (dopamine, norepinephrine, serotonin, histamine), amino acid transmitters (glutamate, γ-aminobutyric acid), polypeptides (tachykinins, neurotensin, somatostatin, pancreatin, glucagon-like peptide-1, endocrine releasing factor), lipid derivatives (lysophosphatidic acid, sphingosine phosphate, eicosanoids), odorants, and the like. Statistics indicate that about 35% of all FDA-approved drugs on the market target about 135 different G protein-coupled receptors³. In ongoing clinical studies, more than 20% of the tested drugs target G protein-coupled receptors⁴. Nevertheless, about 50% of non-olfactory G protein-coupled receptors, although potential therapeutic targets, have not yet been the subject of any clinical drug trial and await further research and development. The value and potential of the G protein-coupled receptor family in new drug development is therefore evident.

GPR75 (G protein-coupled receptor 75) is a member of the G protein-coupled receptor family, and its endogenous agonist ligands include the metabolite 20-HETE⁵ and the chemokine CCL5/RANTES⁶. GPR75 is expressed in a large number of cell types; GPR75 expressed in islets, when activated by CCL5, regulates insulin release and is involved in regulating glucose homeostasis in humans⁶. In addition, a large-scale data study published in the journal Science indicates that the GPR75 gene is involved in regulating obesity in mice and is a clinically important target for obesity treatment⁷. Human GPR75 contains 540 amino acids and has the seven-transmembrane topology typical of G protein-coupled receptor family members⁸. The C-terminus of the receptor has a random coil sequence about 140 amino acids long. Because of this relatively long random coil and the lack of an aspartate/arginine/tyrosine (DRY) motif, GPR75 is classified as an atypical chemokine receptor (ACR)⁹. Activation of the GPR75 receptor on the cell membrane by CCL5 leads to phospholipase C-mediated up-regulation of intracellular IP3 (inositol triphosphate) and Ca²⁺ concentrations¹⁰.

In 2017, the Nobel Prize in Chemistry was awarded to three scientists who made outstanding contributions to the development of cryo-electron microscopy¹¹, marking the entry of structural biology into a new era¹². In the same year, cryo-electron microscopy was applied to the structural analysis of receptor-G protein complexes. In the following 4 years, high-resolution structures of about 45 independent receptor-G protein complexes were solved¹³.

Structure-based drug design is an innovation over traditional drug development approaches¹⁴, ¹⁵. The field of G protein-coupled receptor structural research has been transformed over the last 20 years. In 2000, the first high-resolution structure of a G protein-coupled receptor, rhodopsin, was solved, laying the foundation for structural and functional research in this field¹⁶. In 2007, Professor Brian Kobilka and his collaborators successfully solved the high-resolution crystal structure of the β2 adrenergic receptor by using an antibody to stabilize the receptor conformation or by fusing T4 lysozyme¹⁷, ¹⁸. In 2011, Professor Kobilka further solved the crystal structure of the agonist-β2 adrenergic receptor-Gs protein ternary complex¹⁹, and he shared the 2012 Nobel Prize in Chemistry with his mentor Robert J. Lefkowitz. In the era of traditional protein crystallography, crystal structures of about 50 independent G protein-coupled receptors were solved. This high-resolution structural information laid a foundation for structure-based drug design. One of the most notable examples is that Professor Brian Kobilka and his co-workers used the high-resolution crystal structure of the μ opioid receptor to dock more than 3 million small-molecule compounds, ultimately obtaining the opioid analgesic PZM21, which has fewer side effects²⁰.

In the era of AI-enabled drug discovery, there is an urgent need for ways to use protein sequence information to carry out structural analysis and to design and screen active small-molecule drugs by computer-aided drug design (CADD) or structure-based drug design (SBDD).

To develop drug molecules targeting the GPR75 receptor, structural analysis of the receptor is needed. In recent years, a mainstream approach in the field has been to solve the active-state structure of a receptor by forming an agonist-G protein-coupled receptor-G protein ternary complex. The GPR75 receptor has been reported to play an important role in controlling obesity in humans: carriers of truncated forms of the human GPR75 receptor show a lower incidence of obesity. Knockout of GPR75 in mice significantly inhibits obesity and improves glycemic control in a high-fat diet model⁷. Inhibition of GPR75 receptor activity has therefore been suggested as a strategy for the clinical management of obesity. A clinically valuable molecule targeting the GPR75 receptor thus needs to be an inhibitor. Since the mainstream approach described above yields active-state structures, obtaining a receptor structure in the inactive state and developing a specific inhibitor of receptor activity against it is very challenging.

Disclosure of Invention

Problems to be solved by the invention

Since clinically valuable molecules targeting the GPR75 receptor need to be inhibitors, and a structure of the receptor in its inactive state is currently lacking, the present invention provides a modified GPR75.

Means for solving the problems

A first aspect of the invention provides a modified GPR75 comprising:

a first domain comprising an amino acid sequence derived from the β2 adrenergic receptor; and

a second domain in which, relative to wild-type GPR75, the random coil sequence between the fifth and sixth transmembrane helices and the N-terminal and C-terminal random coil sequences are deleted, and the fifth and sixth transmembrane helices are linked by an amino acid sequence derived from a BRIL fusion protein.

In some embodiments of the invention, the first domain comprises the amino acid sequence shown as SEQ ID NO. 4 or an amino acid sequence having at least 80% homology to the amino acid sequence shown as SEQ ID NO. 4.

In some embodiments of the invention, the second domain comprises the amino acid sequence shown as SEQ ID NO. 3 or an amino acid sequence having at least 80% homology to the amino acid sequence shown as SEQ ID NO. 3.

In some embodiments of the invention, the modified GPR75 comprises one or more of the following sequences:

(i) an amino acid sequence as shown in SEQ ID NO. 5;

(ii) an amino acid sequence that has at least 80%, 82%, 85%, 87%, 90%, 92%, 95%, 96%, 97%, 98% or 99% identity to the amino acid sequence shown in SEQ ID NO. 5 and retains the specific ligand-binding function of the amino acid sequence shown in SEQ ID NO. 5;

(iii) an amino acid sequence in which one or more amino acid residues are added, substituted, deleted or inserted in the amino acid sequence shown in SEQ ID NO. 5 and which retains the specific ligand-binding function of the amino acid sequence shown in SEQ ID NO. 5; or

(iv) an amino acid sequence encoded by a nucleotide sequence that hybridizes under stringent conditions to a polynucleotide sequence encoding the amino acid sequence shown in SEQ ID NO. 5 and which retains the specific ligand-binding function of the amino acid sequence shown in SEQ ID NO. 5, said stringent conditions being medium stringency conditions, medium-high stringency conditions, high stringency conditions or very high stringency conditions.

In some embodiments of the invention, the modified GPR75 further comprises a tag, a protease cleavage site, a signal peptide, a peptide linker, or any combination thereof.

In some embodiments of the invention, the modified GPR75 comprises a tag at its N-terminus and/or C-terminus.

In some embodiments of the invention, the modified GPR75 comprises a signal peptide at its N-terminus.

In some embodiments of the invention, the protease cleavage site is located between two adjacent elements; the element is selected from the group consisting of a first domain, a second domain, a tag, a signal peptide, and a peptide linker.

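In some embodiments of the invention, the modified GPR75 comprises one or more of the following sequences:

(i) an amino acid sequence as shown in SEQ ID NO. 13;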

(ii) an amino acid sequence that has at least 80%, 82%, 85%, 87%, 90%, 92%, 95%, 96%, 97%, 98% or 99% identity to the amino acid sequence shown in SEQ ID NO. 13 and retains the specific ligand-binding function of the amino acid sequence shown in SEQ ID NO. 13;

(iii) an amino acid sequence in which one or more amino acid residues are added, substituted, deleted or inserted in the amino acid sequence shown in SEQ ID NO. 13 and which retains the specific ligand-binding function of the amino acid sequence shown in SEQ ID NO. 13; or

(iv) an amino acid sequence encoded by a nucleotide sequence that hybridizes under stringent conditions to a polynucleotide sequence encoding the amino acid sequence shown in SEQ ID NO. 13 and which retains the specific ligand-binding function of the amino acid sequence shown in SEQ ID NO. 13, said stringent conditions being medium stringency conditions, medium-high stringency conditions, high stringency conditions or very high stringency conditions.

In a second aspect the invention provides a polynucleotide encoding a modified GPR75 according to the first aspect of the invention.

In a third aspect, the present invention provides an expression vector comprising a polynucleotide according to the second aspect of the invention.

In a fourth aspect, the present invention provides a host cell comprising an expression vector according to the third aspect of the invention.

A fifth aspect of the invention provides the use of a modified GPR75 according to the first aspect of the invention, a polynucleotide according to the second aspect of the invention, an expression vector according to the third aspect of the invention or a host cell according to the fourth aspect of the invention in GPR75 structural analysis, fluorescent molecular labeling, fusion with phosphorylated polypeptides or signaling proteins, GPR75 activity analysis, screening of nucleic acid-encoded small-molecule libraries, computer-aided drug design and drug screening.

Advantageous effects of the invention

The invention innovatively combines molecular structure prediction and molecular design: the protein sequences that make the wild-type receptor unsuitable for inactive-state structural studies are truncated, a fusion protein is introduced, and a 24-amino-acid β2 adrenergic receptor sequence is fused at the N-terminus, which improves the stability and expression level of the receptor. In addition, a Flag tag sequence, a Strep tag sequence and a 6×His tag sequence are designed at the N-terminus and C-terminus of the receptor, so that the heterologously expressed receptor can be purified conveniently. Finally, protease cleavage sites (3C/TEV) and a Sortase A fusion sequence are designed into the protein sequence, so that reverse purification, immobilization on a matrix, site-specific fluorescent labeling and the like can later be carried out on the purified protein as required by different protocols, and the modified sequence can be conveniently applied to nucleic acid-encoded small-molecule drug libraries, drug binding experiments and the like. Compared with wild-type GPR75, the modified GPR75 provided by the invention has better stability and a higher expression level.
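The construct logic described above (delete the terminal tails and the loop between the fifth and sixth transmembrane helices, bridge the gap with BRIL, then place the β2 adrenergic receptor fragment at the N-terminus) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the boundary indices and the toy strings are assumptions made for the example, and the real sequences are those given in the sequence listing (SEQ ID NO. 3, 4 and 5), which are not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Element:
    """One named piece of the fusion construct (hypothetical helper type)."""
    name: str
    sequence: str


def build_second_domain(wild_type: str, core_start: int, tm5_end: int,
                        tm6_start: int, core_end: int, bril: str) -> str:
    """Drop the N- and C-terminal tails and the TM5-TM6 loop, then bridge
    TM5 and TM6 with the BRIL sequence. Indices are 0-based, end-exclusive."""
    if not 0 <= core_start < tm5_end <= tm6_start < core_end <= len(wild_type):
        raise ValueError("boundaries must be ordered and inside the sequence")
    return wild_type[core_start:tm5_end] + bril + wild_type[tm6_start:core_end]


def assemble(elements: List[Element]) -> str:
    """Concatenate elements from N-terminus to C-terminus."""
    return "".join(e.sequence for e in elements)


# Toy placeholder strings, NOT real receptor sequences:
# N-tail + (TM1..TM5) + TM5-TM6 loop + (TM6..TM7) + C-tail
toy_wild_type = "NNNN" + "HHHHHH" + "LLLL" + "KKKKKK" + "CCCC"

second_domain = build_second_domain(
    toy_wild_type, core_start=4, tm5_end=10, tm6_start=14, core_end=20,
    bril="XXX",  # placeholder standing in for the BRIL-derived sequence
)
construct = assemble([
    Element("beta2AR N-terminal fragment", "MMM"),  # placeholder first domain
    Element("second domain", second_domain),
])
print(second_domain)
print(construct)
```

Keeping each element as a named record makes it straightforward to try alternative fusion boundaries by changing only the index arguments.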

Drawings

FIGS. 1A and 1B show structure prediction models of the GPR75 protein. FIG. 1A shows the AlphaFold2 and RoseTTAFold prediction models of full-length GPR75, and FIG. 1B shows the predicted transmembrane region of GPR75. FIG. 1A shows that GPR75 has a long random coil structure and is therefore not suitable for direct structural biology studies.

FIG. 2 shows the GPR75 protein engineering concept. First, we used the AlphaFold2 prediction to truncate GPR75 to its transmembrane region and inserted the BRIL fusion protein at the position of the third intracellular loop. Subsequently, we optimized the fusion sites stepwise and selected the best fusion site for the final version of the fusion protein.

FIG. 3 shows the optimization of the fusion site of GPR75 protein.

FIG. 4 is a graph showing the difference in the effect of the optimization of the fusion site of GPR75 protein.

FIG. 5 shows the difference in expression levels between wild-type GPR75 and modified GPR75 as detected by Western blot. Lane 1 is wild-type GPR75 and lane 2 is modified GPR75. The Western blot shows that the expression level of the modified GPR75 is markedly higher, whereas wild-type GPR75 is expressed in very low amounts and can hardly be detected under the same expression conditions.

FIG. 6 compares the gel filtration chromatography UV 280 nm absorption peaks of the BRIL-fused modified GPR75 and a truncated GPR75 (without BRIL fusion and without deletion of the N-terminal random sequence of wild-type GPR75) in an example of the invention. In the figure, the solid trace is the BRIL-fused modified GPR75, and the dashed trace is the truncated GPR75.

FIGS. 7A and 7B show an SDS-PAGE gel (FIG. 7A) and a cryo-electron micrograph (FIG. 7B) of modified GPR75 in an example of the present invention. The purified modified GPR75 has high purity, and the particles in the frozen sample are well dispersed.

FIGS. 8A and 8B show basal activity assays of modified GPR75 in test examples of the present invention. Ligand-free modified GPR75 was able to accelerate the GTP hydrolysis activity of the Gq protein, while the ligand 20-HETE inhibited GPR75 protein activity (FIG. 8A). In the experiment, the IC₅₀ of 20-HETE was measured to be approximately 2 nM (FIG. 8B).

FIGS. 9A-9C illustrate the preparation of the complex of modified GPR75 with an anti-BRIL Fab fragment and the analysis of cryo-electron microscopy data in test examples of the present invention. The modified GPR75 co-migrated with the anti-BRIL Fab fragment in gel filtration chromatography (FIG. 9A), and SDS-PAGE further demonstrated that they form a stable complex (FIG. 9B). Two-dimensional classification of single particles by cryo-electron microscopy revealed features of the receptor in complex with the anti-BRIL Fab fragment (FIG. 9C).

Detailed Description

In order that the invention may be more readily understood, certain technical and scientific terms are specifically defined below. Unless otherwise defined herein, all other technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

In the present specification, a numerical range expressed as "numerical value A to numerical value B" means a range that includes the endpoint values A and B.

In the present specification, the term "substantially" or "essentially" means that the standard deviation from the theoretical model or theoretical data is within 5%, preferably 3%, and more preferably 1%.

In the present specification, the meaning of "may" includes both the meaning of performing a certain process and the meaning of not performing a certain process.

In this specification, "optional" or "optionally" means that the subsequently described event or circumstance may or may not occur, and that the description includes instances where the event occurs and instances where it does not.

Reference in the specification to "some specific/preferred embodiments," "other specific/preferred embodiments," "embodiments," and so forth, means that a particular element (e.g., feature, structure, property, and/or characteristic) described in connection with the embodiment is included in at least one embodiment described herein, and may or may not be present in other embodiments. In addition, it is to be understood that the described elements may be combined in any suitable manner in the various embodiments.

According to the present invention, the terms "polypeptide", "protein", "peptide" are used interchangeably herein to refer to a polymeric form of amino acids of any length, and may include coded and non-coded amino acids, chemically or biochemically modified or derivatized amino acids, and polypeptides having a similar peptide backbone.

According to the present invention, the terms "nucleic acid molecule", "polynucleotide", "polynucleic acid", "nucleic acid" are used interchangeably and refer to a polymeric form of nucleotides of any length, whether deoxyribonucleotides or ribonucleotides, or analogues thereof. The polynucleotide may have any three-dimensional structure and may perform any known or unknown function. Non-limiting examples of polynucleotides include genes, gene fragments, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, control regions, isolated RNA of any sequence, nucleic acid probes, and primers. The nucleic acid molecule may be linear or circular.

According to the present invention, the term "G protein-coupled receptor" or "GPCR" or "GPR" refers to a transmembrane receptor capable of transmitting a signal from outside the cell to inside the cell via the G protein pathway and/or the arrestin pathway. Hundreds of such receptors are known in the art; see, e.g., Fredriksson et al., Mol. Pharmacol. 63: 1256-1272 (2003); and Vassilatis et al., Proc. Natl. Acad. Sci. USA 100: 4903-4908 (2003), each of which is incorporated herein by reference. G protein-coupled receptors are polypeptides sharing a common structural motif: seven regions of 22 to 24 hydrophobic amino acids that form seven alpha helices, each spanning the cell membrane. Each span is identified by number, i.e., transmembrane-1 (TM1), transmembrane-2 (TM2), etc., and may also be referred to in the present invention as the first transmembrane helix, the second transmembrane helix, etc. The transmembrane helices are connected by regions of amino acids between transmembrane-2 and transmembrane-3, transmembrane-4 and transmembrane-5, and transmembrane-6 and transmembrane-7 on the outside, or "extracellular" side, of the cell membrane; these are referred to as "extracellular" regions 1, 2 and 3, respectively (EC1, EC2 and EC3). The transmembrane helices are also connected by regions of amino acids between transmembrane-1 and transmembrane-2, transmembrane-3 and transmembrane-4, and transmembrane-5 and transmembrane-6 on the inside, or "intracellular" side, of the cell membrane; these are referred to as "intracellular" regions 1, 2 and 3, respectively (IC1, IC2 and IC3). The "carboxy" ("C") terminus of the receptor is located in the intracellular space within the cell, and the "amino" ("N") terminus of the receptor is located in the extracellular space outside the cell. Any of the above regions can be readily identified by analysis of the primary amino acid sequence of the GPCR.
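The naming scheme above can be illustrated with a short sketch that, given the boundaries of the seven transmembrane helices, labels every stretch of the sequence as N-terminus, TM1 to TM7, IC1 to IC3, EC1 to EC3 or C-terminus. The function name and the helix coordinates are hypothetical and chosen only for the example; they are not the coordinates of GPR75.

```python
from typing import List, Tuple

# Loops in N-to-C order: the loop after TM1 is intracellular, after TM2
# extracellular, and so on, matching the definitions in the text.
LOOP_NAMES = ["IC1", "EC1", "IC2", "EC2", "IC3", "EC3"]


def label_regions(length: int,
                  tm_spans: List[Tuple[int, int]]) -> List[Tuple[str, int, int]]:
    """Return (name, start, end) for every region, 0-based and end-exclusive."""
    if len(tm_spans) != 7:
        raise ValueError("a GPCR has exactly seven transmembrane helices")
    regions = []
    cursor = 0
    for i, (start, end) in enumerate(tm_spans):
        if start < cursor or end <= start or end > length:
            raise ValueError("helix spans must be ordered and non-overlapping")
        gap_name = "N-terminus" if i == 0 else LOOP_NAMES[i - 1]
        if start > cursor:
            regions.append((gap_name, cursor, start))
        regions.append((f"TM{i + 1}", start, end))
        cursor = end
    if cursor < length:
        regions.append(("C-terminus", cursor, length))
    return regions


# Hypothetical helix boundaries for a 240-residue toy receptor (23 residues each).
toy_spans = [(10, 33), (40, 63), (70, 93), (100, 123),
             (130, 153), (160, 183), (190, 213)]
toy_regions = label_regions(240, toy_spans)
for name, start, end in toy_regions:
    print(f"{name:<10} {start:>3}-{end}")
```

In this layout IC3 is the stretch between TM5 and TM6, which is the region replaced by the BRIL-derived sequence in the modified GPR75.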

According to the present invention, the term "ligand" or "receptor ligand" means a molecule that specifically binds to a GPCR, either intracellularly or extracellularly. The ligand may be, without limitation, a protein, a (poly)peptide, a lipid, a small molecule, a protein scaffold, an antibody, an antibody fragment, a nucleic acid or a carbohydrate. Ligands may be synthetic or naturally occurring. The term "ligand" includes "natural ligands", which are the endogenous, natural ligands of a natural GPCR. In most cases, the ligand is a "modulator" that increases or decreases the intracellular response when it contacts (e.g., binds to) a GPCR expressed by the cell. Examples of ligands that act as modulators include agonists, partial agonists, inverse agonists and antagonists. An "agonist" is a ligand that increases the signaling activity of a receptor by binding to it. Full agonists can stimulate the receptor maximally; partial agonists do not elicit full activity even at saturating concentrations. Partial agonists may also function as "blockers" by preventing the binding of more potent agonists. An "antagonist" is a ligand that binds to a receptor without stimulating any activity. Antagonists are also referred to as "blockers" because they prevent the binding of other ligands and thus block agonist-induced activity. An "inverse agonist" is an antagonist that, in addition to blocking the agonist effect, reduces the basal or constitutive activity of the receptor below that of the unliganded receptor.

According to the invention, the three-letter and one-letter amino acid codes used are as described in J. Biol. Chem. 243, p. 3558 (1968).

According to the present invention, the term "host cell" refers to a cell into which an expression vector has been introduced. Host cells may include bacterial, microbial, plant or animal cells. Bacteria susceptible to transformation include members of the Enterobacteriaceae, such as strains of Escherichia coli or Salmonella; Bacillaceae, such as Bacillus subtilis; Pneumococcus; Streptococcus; and Haemophilus influenzae. Suitable microorganisms include Saccharomyces cerevisiae and Pichia pastoris. Suitable animal host cell lines include CHO (Chinese hamster ovary cell line) and NS0 cells.

According to the present invention, an amino acid "addition" refers to the addition of amino acids at the C-terminus or N-terminus of an amino acid sequence. An amino acid "deletion" means that 1, 2, or 3 or more amino acids are deleted from the amino acid sequence. An amino acid "insertion" refers to the insertion of amino acid residues at appropriate positions within the amino acid sequence; the inserted residues may all be adjacent to each other, may be partly adjacent, or none of them may be adjacent to each other.

According to the present invention, an amino acid "substitution" refers to the replacement of an amino acid residue at a certain position in an amino acid sequence with another amino acid residue; wherein "substitution" may be a conservative amino acid substitution.
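The four kinds of sequence change defined above (addition, deletion, insertion and substitution) can be made concrete with simple string operations. The following is a minimal sketch with hypothetical function names and an arbitrary toy peptide; positions are 1-based, as is usual for protein numbering.

```python
def add(seq: str, residues: str, terminus: str = "C") -> str:
    """Addition: extend the sequence at the N- or C-terminus."""
    if terminus == "N":
        return residues + seq
    if terminus == "C":
        return seq + residues
    raise ValueError("terminus must be 'N' or 'C'")


def delete(seq: str, position: int, count: int = 1) -> str:
    """Deletion: remove `count` residues starting at 1-based `position`."""
    if position < 1 or count < 1 or position - 1 + count > len(seq):
        raise ValueError("deleted span must lie inside the sequence")
    return seq[:position - 1] + seq[position - 1 + count:]


def insert_after(seq: str, position: int, residues: str) -> str:
    """Insertion: place residues inside the sequence, after 1-based `position`."""
    if not 1 <= position < len(seq):
        raise ValueError("insertion point must lie inside the sequence")
    return seq[:position] + residues + seq[position:]


def substitute(seq: str, position: int, new_residue: str) -> str:
    """Substitution: replace the residue at 1-based `position`."""
    if not 1 <= position <= len(seq) or len(new_residue) != 1:
        raise ValueError("position must be inside the sequence; one residue only")
    return seq[:position - 1] + new_residue + seq[position:]


toy = "MKTAYIAK"  # arbitrary toy peptide, not part of any SEQ ID
print(add(toy, "HHHHHH"))         # C-terminal addition
print(delete(toy, 4, 2))          # delete residues 4-5
print(insert_after(toy, 2, "GG")) # insert after residue 2
print(substitute(toy, 3, "S"))    # T3S substitution
```

Whether a given substitution counts as conservative depends on the residue classes in the table of exemplary conservative substitutions, which this sketch does not encode.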

According to the present invention, "conservative modification", "conservative substitution" or "conservative replacement" refers to the replacement of an amino acid in a protein with another amino acid having similar characteristics (e.g., charge, side chain size, hydrophobicity/hydrophilicity, backbone conformation, rigidity, etc.), so that changes can frequently be made without altering the biological activity of the protein. It is known to the person skilled in the art that, in general, a single amino acid substitution in a non-essential region of a polypeptide does not substantially alter the biological activity (see, for example, Watson et al. (1987) Molecular Biology of the Gene, The Benjamin/Cummings Pub. Co., p. 224 (4th edition)). In addition, substitution of structurally or functionally similar amino acids is unlikely to abolish biological activity. Exemplary conservative substitutions are set forth in the following table "Exemplary amino acid conservative substitutions".

Exemplary amino acid conservative substitutions

According to the present invention, "medium to very high stringency conditions" include "medium stringency conditions", "medium-high stringency conditions", "high stringency conditions" and "very high stringency conditions", which describe conditions for nucleic acid hybridization and washing. For guidance in performing hybridization reactions see Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated herein by reference. Aqueous and non-aqueous methods are described in that reference, and either may be used. For example, specific hybridization conditions are as follows: (1) low stringency hybridization conditions: hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45 ℃, followed by 2 washes in 0.2× SSC, 0.1% SDS at a temperature of at least 50 ℃ (for low stringency conditions, the wash temperature can be raised to 55 ℃); (2) medium stringency hybridization conditions: hybridization in 6× SSC at about 45 ℃, followed by 1 or more washes in 0.2× SSC, 0.1% SDS at 60 ℃; (3) high stringency hybridization conditions: hybridization in 6× SSC at about 45 ℃, followed by 1 or more washes in 0.2× SSC, 0.1% SDS at 65 ℃; and preferably (4) very high stringency hybridization conditions: hybridization in 0.5 M sodium phosphate, 7% SDS at 65 ℃, followed by 1 or more washes in 0.2× SSC, 1% SDS at 65 ℃.
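For quick reference, the stringency levels listed above can be captured in a small lookup table. The sketch below is an assumption-laden convenience structure, not part of the specification: the class and function names are hypothetical, and the "medium-high" entry is left empty because the text names that level without spelling out its conditions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stringency:
    rank: int                      # ordering from least to most stringent
    hybridization: Optional[str]   # hybridization buffer and temperature
    wash: Optional[str]            # wash buffer
    wash_temp_c: Optional[int]     # wash temperature in degrees Celsius


CONDITIONS = {
    "low": Stringency(0, "6x SSC, about 45 C", "0.2x SSC, 0.1% SDS", 50),
    "medium": Stringency(1, "6x SSC, about 45 C", "0.2x SSC, 0.1% SDS", 60),
    # Named in the text, but its conditions are not spelled out there.
    "medium-high": Stringency(2, None, None, None),
    "high": Stringency(3, "6x SSC, about 45 C", "0.2x SSC, 0.1% SDS", 65),
    "very high": Stringency(4, "0.5 M sodium phosphate, 7% SDS, 65 C",
                            "0.2x SSC, 1% SDS", 65),
}


def is_medium_or_higher(level: str) -> bool:
    """True if the level falls in the 'medium to very high' range."""
    return CONDITIONS[level].rank >= CONDITIONS["medium"].rank


for level, cond in CONDITIONS.items():
    print(level, cond.wash_temp_c, is_medium_or_higher(level))
```

The table makes the main pattern visible at a glance: across the 6× SSC conditions, stringency rises with the wash temperature (50, 60, then 65 ℃).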

According to the present invention, "exogenous" refers to a substance produced outside an organism, cell or human body depending on the case. "endogenous" refers to a substance produced in a cell, organism, or human body as the case may be.

According to the invention, "homology" refers to sequence similarity between two polynucleotide sequences or between two polypeptides. When a position in both of the two compared sequences is occupied by the same base or amino acid monomer subunit, e.g., if a position in each of two DNA molecules is occupied by adenine, then the molecules are homologous at that position. The percent homology between two sequences is the number of matching or homologous positions shared by the two sequences divided by the number of positions compared, multiplied by 100. For example, if 6 of 10 positions match or are homologous when the two sequences are optimally aligned, the two sequences are 60% homologous; if 95 of 100 positions match or are homologous, the two sequences are 95% homologous. Typically, the comparison is made when the two sequences are aligned to give the maximum percent homology. For example, the comparison may be performed with the BLAST algorithm, with the parameters of the algorithm selected to give the maximum match between the respective sequences over the entire length of the respective reference sequence. The following references relate to the BLAST algorithms often used for sequence analysis: Altschul, S.F. et al. (1990) J. Mol. Biol. 215: 403-410; Gish, W. et al. (1993) Nature Genet. 3: 266-272; Madden, T.L. et al. (1996) Meth. Enzymol. 266: 131-141; Altschul, S.F. et al. (1997) Nucleic Acids Res. 25: 3389-3402; Zhang, J. et al. (1997) Genome Res. 7: 649-656. Other conventional BLAST algorithms, such as those provided by NCBI BLAST, are also well known to those skilled in the art.
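The percent homology calculation described above reduces to counting matching columns in an alignment. The following minimal sketch assumes the two sequences have already been aligned (for example by BLAST) and padded with "-" for gaps; the function name is hypothetical, and counting gap columns in the denominator is an assumption of this example rather than something the text specifies.

```python
def percent_homology(aligned_a: str, aligned_b: str) -> float:
    """Matching positions divided by positions compared, times 100.

    Both inputs must be the same length (already aligned, '-' for gaps).
    A column containing a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("cannot compare empty sequences")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b)
                  if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


# The worked example from the text: 6 matches over 10 compared positions.
seq_a = "ACGTACGTAC"
seq_b = "ACGTACTGCA"
print(percent_homology(seq_a, seq_b))  # 60.0
```

The same function applies unchanged to aligned protein sequences, since it only compares characters column by column.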

According to the present invention, the term "codon optimized" means that the nucleotide sequence encoding the polypeptide has been configured to comprise codons preferred by the host cell or organism to improve gene expression and increase translation efficiency in the host cell or organism.

According to the present invention, the term "tag" refers to a short peptide that is fused or linked to a protein of interest (e.g., the modified GPR75 of the present invention) and thereby facilitates soluble expression, detection, and/or purification of the recombinant protein. The tag may be fused or linked to the N-terminus and/or C-terminus of the protein of interest (optionally via a linker or protease cleavage site). Such tags are well known to those skilled in the art and have been described in detail in the prior art literature. Such tags include, for example, but are not limited to, histidine tag (Sockolosky, J.T. and F.C. Szoka (2013) Protein Expr Purif 87(2):129-135), glutathione transferase (GST) tag (Hayashi, K. and C. Kojima (2008) Protein Expr Purif 62(1):120-127), maltose-binding protein (MBP) tag (Bataille, L., W. Dieryck, A. Hocquellet, C. Cabanne, K. Bathany, S. Lecommandoux, B. Garbay and E. Garanger (2015) Protein Expression and Purification 110:165-171), thioredoxin (Trx) tag, disulfide bond isomerase DsbA tag (Zhang, Y., D.R. Olsen, K.B. Nguyen, P.S. Olson, E.T. Rhodes and D. Mascarenhas (1998) Protein Expr Purif 12(2):159-165), DsbC tag (Kurokawa, Y., H. Yanagi and T. Yura (2001) J Biol Chem 276(17):14393-14399), SUMO tag (Marblestone, J.G., S.C. Edavettal, Y. Lim, P. Lim, X. Zuo and T.R. Butt (2006) Protein Sci 15(1):182-189), MsyB tag, … (…, Shuster, Jeffrey R. and Barr, Philip J. (1989) Nature Biotechnology 7(7):705-709), Myc tag, Flag tag, fluorescent protein (e.g., GFP) tag (Pedelacq, J.D., S. Cabantous, T. Tran, T.C. Terwilliger and G.S. Waldo (2006) Nat Biotechnol 24(1):79-88), biotin tag, and avidin tag.

According to the present invention, the term "signal peptide", "signal sequence" or "signal peptide sequence" refers to a short peptide which, when fused to a protein of interest (e.g., the modified GPR75 of the present invention), promotes secretion of the protein of interest expressed by a cell onto or out of the cell membrane. Signal peptides are typically located at the N-terminus of the protein of interest, and various signal peptides are known to those skilled in the art, such as, but not limited to, the hemagglutinin signal sequence, the human insulin signal sequence, the human interleukin 2 signal sequence, the albumin signal sequence, and the like.

According to the present invention, the term "protease cleavage site" refers to a site that is specifically recognized and cleaved by a protease. Various specific proteases and their recognition sites are well known to those skilled in the art and are found in many prior art documents. The skilled worker can, depending on the circumstances, use suitable protease cleavage sites in the fusion protein and cleave with the corresponding proteases. The use of protease cleavage sites may be advantageous, for example, in that they may be used to cleave signal peptides and/or tags from fusion proteins, thereby obtaining mature proteins with the desired activity.
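As a minimal sketch of how a recognition site translates into cleavage products, the following Python snippet scans a protein sequence for the TEV (ENLYFQG) and HRV 3C (LEVLFQGP) sites that appear in the example construct later in this description, and splits the chain after the glutamine. The cut position (between Q and G) reflects the commonly described behaviour of these two proteases and is an assumption here, as are the function name and the toy input; none of this is part of the invention.

```python
# Illustrative sketch: locate protease recognition sites and split the chain.
# Assumption: both proteases cut between Q and G within their recognition site.
SITES = {
    "TEV": ("ENLYFQG", 6),     # cut after the 6th residue of the motif (Q)
    "HRV3C": ("LEVLFQGP", 6),  # cut after the 6th residue of the motif (Q)
}

def cleave(sequence, protease):
    """Return the fragments produced by cutting at every recognition site."""
    motif, offset = SITES[protease]
    fragments, start = [], 0
    pos = sequence.find(motif)
    while pos != -1:
        cut = pos + offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(motif, pos + 1)
    fragments.append(sequence[start:])
    return fragments

# Toy input (hypothetical): a tag, a TEV site, and a short stretch of "protein".
toy = "DYKDDDDA" + "ENLYFQG" + "GLQDLIHT"
print(cleave(toy, "TEV"))
```

The residue following the cut (G) stays on the downstream fragment, which is why a protein released by TEV cleavage typically begins with a leftover glycine.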

According to the present invention, the term "peptide linker" refers to a short peptide used to connect two molecules (e.g., proteins). Typically, a fusion protein, such as protein of interest 1-peptide linker-protein of interest 2, is obtained by introducing (e.g., by PCR amplification or ligase) a polynucleotide sequence encoding the short peptide between two DNA fragments respectively encoding the two proteins of interest to be ligated, and performing protein expression.

According to the present invention, the term "vector" refers to a nucleic acid vehicle into which a polynucleotide may be inserted. When a vector is capable of expressing a protein encoded by an inserted polynucleotide, the vector is referred to as an expression vector. The vector may be introduced into a host cell by transformation, transduction, or transfection, and the genetic material elements carried thereby are expressed in the host cell. Vectors are well known to those skilled in the art and include, but are not limited to: plasmids, phages, cosmids, and the like.

According to the present invention, the terms "cell," "cell line," and "cell culture" are used interchangeably, and all such designations include progeny. Thus, the words "transformant" and "transformed cell" include the primary subject cell and cultures derived therefrom, regardless of the number of transfers. It is also understood that not all progeny may be precisely identical in DNA content, owing to deliberate or inadvertent mutations. Mutant progeny that have the same function or biological activity as that screened for in the originally transformed cell are included. Where distinct designations are intended, this will be clear from the context.

The invention is further illustrated by the following examples in conjunction with the accompanying drawings, but is not to be construed as being limited thereto. The following provides specific materials and sources thereof used in embodiments of the present invention. However, it should be understood that these are exemplary only and not intended to limit the invention, and that materials of the same or similar type, quality, nature or function as the following reagents and instruments may be used in the practice of the invention. The experimental procedures used in the following examples are all conventional procedures unless otherwise specified. Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.

Example: preparation of modified GPR75

First, sequence optimization

In this example, the protein structure prediction software used was AlphaFold local edition (v2.1.0) and RoseTTAFold.

1.1 truncation and engineering of wild-type GPR75

Wild-type GPR75 has long random sequence regions and is therefore not suitable for protein structure analysis. In this example, the secondary structure of GPR75 was predicted with both the AlphaFold2 and RoseTTAFold models (FIG. 1A), and the two predictions of the transmembrane regions were very similar. The random sequence regions were predicted with very low confidence, so we truncated the random sequence regions of the GPR75 receptor and retained the 7 transmembrane regions critical to the receptor (FIG. 1B). The wild-type sequence and its truncation are shown below (see in particular SEQ ID NO: 1): the receptor was cut between the fifth and sixth transmembrane helices of wild-type GPR75, and the random sequences at the N- and C-termini and the random sequence between the fifth and sixth transmembrane helices of wild-type GPR75 were deleted.
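The idea of using prediction confidence to locate regions to truncate can be sketched as follows. This is a minimal illustration, not the procedure used in this example: it assumes a per-residue confidence score such as AlphaFold's pLDDT is available as a plain list, and the cutoff of 50, the minimum segment length of 5, and the score values are all hypothetical choices.

```python
def low_confidence_segments(plddt, cutoff=50.0, min_len=5):
    """Return 1-based inclusive (start, end) ranges where the score is below cutoff.

    Segments shorter than min_len residues are ignored.
    """
    segments, start = [], None
    for i, score in enumerate(plddt, start=1):
        if score < cutoff:
            if start is None:
                start = i          # a low-confidence run begins here
        elif start is not None:
            if i - start >= min_len:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the C-terminus.
    if start is not None and len(plddt) - start + 1 >= min_len:
        segments.append((start, len(plddt)))
    return segments

# Hypothetical scores: low-confidence termini around a confident core.
scores = [30.0] * 6 + [90.0] * 10 + [40.0] * 3 + [85.0] * 4 + [25.0] * 7
print(low_confidence_segments(scores))
```

In this toy input the three-residue dip in the middle is skipped because it is shorter than `min_len`; whether short dips should count as removable is a design decision that depends on the receptor.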

Wild type GPR75 (GPR 75-WT) amino acid sequence (SEQ ID NO: 1):

MNSTGHLQDAPNATSLHVPHSQEGNSTSLQE GLQDLIHTATLVTCTFLLAVIFCLGSYGNFIVFLSFFD PAFRKFRTNFDFMILNLSFCDLFICGVTAPMFTFVLFFSSASSIPDAFCFTFHLTSSGFIIMSLKTVAVIALHRLRM VLGKQPNRTASFPCTVLLTLLLWATSFTLATLATLKTSKSHLCLPMSSLIAGKGKAILSLYVVDFTFCVAVVSVSYI MIAQTLRKNAQVRKCPPVITVDASRPQPFMGVPVQGGGDPIQCAMPALYRNQNYNKLQHVQTRGYTKSPNQLVTPAASRLQLVSAINLSTAKDSKAVVTCVIIVLSVLVCCLPLGISLVQVVLSSNGSFILYQFELFGFTLIFFKSGLNPFIYS RNSAGLRRKVLWCLQYIGLGFFCCKQKTRLRAMGKGNLEVNRNKSSHHETNSAYMLSPKPQKKFVDQACGPSHSKESMVSPKISAGHQHCGQSSSTPINTRIEPYYSIYNSSPSQEESSPCNLQPVNSFGFANSYIAMHYHTTNDLVQEYDSTSAKQIPVPSV

Note: in the above wild-type GPR75 amino acid sequence, the amino acid sequences retained in modified GPR75 are single underlined and correspond, respectively, to the first to fifth transmembrane helices and the sixth to seventh transmembrane helices counted from the N-terminus of wild-type GPR75; the italic portion is a sequence retained in truncated GPR75 and deleted in modified GPR75.
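The truncation itself amounts to keeping selected residue ranges and concatenating them. The sketch below shows this on a toy sequence. The ranges in `RETAINED` (32-235 and 307-396) are our reading from comparing SEQ ID NO: 1 with SEQ ID NO: 3 using the numbering of the sequence listing; they are not stated explicitly in the text and should be treated as an assumption. The function name is hypothetical.

```python
# Assumed retained ranges (1-based, inclusive), read off by comparing
# SEQ ID NO: 1 with SEQ ID NO: 3; not stated explicitly in the description.
RETAINED = [(32, 235), (307, 396)]

def keep_ranges(sequence, ranges):
    """Concatenate the 1-based inclusive residue ranges that are retained."""
    return "".join(sequence[start - 1:end] for start, end in ranges)

# Toy 20-residue sequence (hypothetical), keeping residues 3-8 and 13-18.
toy_wt = "MNSTGHLQDAPNATSLHVPH"
print(keep_ranges(toy_wt, [(3, 8), (13, 18)]))
```

Applying `keep_ranges` with `RETAINED` to the full wild-type sequence (with whitespace removed) would give the two GPR75 segments that flank the BRIL insert.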

1.2 ligation of BRIL fusion proteins to truncated GPR75

A fusion protein was considered for joining the cut between the fifth and sixth transmembrane helices (FIG. 2). This example uses the classical BRIL fusion protein, which is derived from bacterial soluble cytochrome b562, has 4 helices, and has been used in solving multiple GPCR crystal structures (e.g., PDB ID: 7F83, 7VOD, 6LPJ, 6KO5, 6OS0, etc.) and electron microscopy structures (e.g., PDB ID: 7S8O, 6WW2, 6USF, etc.). The BRIL fusion protein can increase the size of the extracellular region of the modified GPR75, and can be used as a marker in electron microscope structure analysis.

The fusion site of the BRIL fusion protein needs to be optimized. In this example, 16 candidate fusion sites for the BRIL fusion protein were screened (FIG. 3). Based on the AlphaFold2 protein structure predictions, this example selected the modified sequence in which BRIL is capable of forming a stable helix with the fifth and sixth transmembrane helices, and used it to fuse and connect the fifth and sixth transmembrane helices.

The BRIL fusion protein sequence employed in the modified GPR75 in this example is (SEQ ID NO: 2):

ARRQLADLEDNWETLNDNLKVIEKADNAAQVKDALTKMRAAALDAQKATPPKLEDKSPDSPEMKDFRHGFDILVGQIDDALKLANEGKVKEAQAAAEQLKTTRNAYIQKYLERAR

To further optimize the location of the fusion site, we shifted the junctions between BRIL and the fifth and sixth transmembrane helices one amino acid at a time and used AlphaFold2 to predict the secondary structure of each fusion protein. Using the AlphaFold2 prediction results (FIG. 4), we compared the 16 candidate fusion sites. The sequence obtained by joining the truncated fifth and sixth transmembrane helices described above through the BRIL fusion protein sequence is (SEQ ID NO: 3):

GLQDLIHTATLVTCTFLLAVIFCLGSYGNFIVFLSFFDPAFRKFRTNFDFMILNLSFCDLFICGVTAPM FTFVLFFSSASSIPDAFCFTFHLTSSGFIIMSLKTVAVIALHRLRMVLGKQPNRTASFPCTVLLTLLLWATSFTLAT LATLKTSKSHLCLPMSSLIAGKGKAILSLYVVDFTFCVAVVSVSYIMIAQTLRKNAQVARRQLADLEDNWETLNDNLKVIEKADNAAQVKDALTKMRAAALDAQKATPPKLEDKSPDSPEMKDFRHGFDILVGQIDDALKLANEGKVKEAQAAAEQLKTTRNAYIQKYLERARSAINLSTAKDSKAVVTCVIIVLSVLVCCLPLGISLVQVVLSSNGSFILYQFELFGFTL IFFKSGLNPFIYSRNSAGLRRKVLWCLQYIGL
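The single-amino-acid scan of the junctions can be illustrated with a small enumeration. The description does not say how the 16 candidate sites were laid out, so the 4 × 4 grid below (four trim positions at the end of the fifth helix crossed with four at the start of the sixth helix) is purely an assumption, as are the function name and the shortened toy fragments. Each generated sequence would then be passed to a structure predictor.

```python
from itertools import product

def junction_variants(n_part, insert, c_part, n_shifts, c_shifts):
    """Yield (n_shift, c_shift, sequence) for each pair of junction shifts.

    n_shift trims that many residues from the end of n_part;
    c_shift trims that many residues from the start of c_part.
    """
    for n_shift, c_shift in product(n_shifts, c_shifts):
        left = n_part[:len(n_part) - n_shift]
        right = c_part[c_shift:]
        yield n_shift, c_shift, left + insert + right

# Toy fragments (hypothetical lengths): end of TM5, start of BRIL, start of TM6.
variants = list(junction_variants("TLRKNAQV", "ARRQ", "SAINLSTA",
                                  range(4), range(4)))
print(len(variants))
for n_shift, c_shift, seq in variants[:3]:
    print(n_shift, c_shift, seq)
```

A real scan might also extend the helices rather than only trimming them; the grid here is kept one-directional for brevity.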

1.3 N-terminal fusion of the β2 adrenoceptor sequence

In this example, a 24-amino-acid sequence of the human β2 adrenoceptor (UniProt: P07550) was fused to the N-terminus to improve receptor stability and expression.

The sequence of the β 2 adrenoceptor employed in the modified GPR75 in this example is (SEQ ID NO: 4):

MGQPGNGSAFLLAPNRSHAPDHDV

In this example, the sequence of modified GPR75 with the added β2 adrenoceptor sequence is as follows (SEQ ID NO: 5):

MGQPGNGSAFLLAPNRSHAPDHDVGLQDLIHTATLVTCTFLLAVIFCLGSYGNFIVFLSFFDPAFRKFR TNFDFMILNLSFCDLFICGVTAPMFTFVLFFSSASSIPDAFCFTFHLTSSGFIIMSLKTVAVIALHRLRMVLGKQPN RTASFPCTVLLTLLLWATSFTLATLATLKTSKSHLCLPMSSLIAGKGKAILSLYVVDFTFCVAVVSVSYIMIAQTLR KNAQVARRQLADLEDNWETLNDNLKVIEKADNAAQVKDALTKMRAAALDAQKATPPKLEDKSPDSPEMKDFRHGFDILVGQIDDALKLANEGKVKEAQAAAEQLKTTRNAYIQKYLERARSAINLSTAKDSKAVVTCVIIVLSVLVCCLPLGIS LVQVVLSSNGSFILYQFELFGFTLIFFKSGLNPFIYSRNSAGLRRKVLWCLQYIGL

1.4 addition of Signal peptide, cleavage site and tag sequence

In this example, a Flag tag sequence was designed at the N-terminus of modified GPR75, and a Strep tag sequence and a 6 × His tag sequence at the C-terminus, to facilitate purification of the heterologously expressed modified GPR75. Finally, protease cleavage sites (3C/TEV) and a Sortase A fusion sequence were designed into the protein sequence, so that the purified protein can later be subjected, as needed, to reverse purification, immobilization on a matrix, site-specific fluorescent labeling and the like, and so that the modified sequence can conveniently be applied to nucleic acid-encoded small molecule library screening, drug binding experiments and the like.

Specifically, the modified GPR75 in this embodiment is, in order from the N-terminus to the C-terminus:

the hemagglutinin signal sequence (MKTIIALSYIFCLVFA; SEQ ID NO: 6);

Flag tag sequence (DYKDDDDA; SEQ ID NO: 7);

the beta 2 adrenoceptor sequence (MGQPGNGSAFLLAPNRSHAPDHDV; SEQ ID NO: 4);

TEV protease cleavage site (ENLYFQG; SEQ ID NO: 8);

the sequence obtained for truncated GPR75 linked by the BRIL fusion protein sequence (SEQ ID NO: 3);

Sortase A fusion sequence (LPETG; SEQ ID NO: 9), which is the ligation site of the Sortase A enzyme and serves as a general-purpose attachment point, comparable to a USB interface;

Strep tag sequence (SAWSHPQFEK; SEQ ID NO: 10);

HRV 3C protease cleavage site (LEVLFQGP; SEQ ID NO: 11);

6 × His tag sequence (HHHHHH; SEQ ID NO: 12).

The HRV 3C protease cleavage site is linked to the 6 × His tag sequence by a GS linker.
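The N-to-C order listed above can be expressed as a simple concatenation. The sketch below assembles the construct from the listed modules, with a short placeholder standing in for the SEQ ID NO: 3 core. It follows the complete sequence given in this example in two respects: the final L of the core doubles as the first residue of LPETG, and the His tag consists of six histidines. The `assemble` function and the placeholder core are hypothetical.

```python
# Modules in N-to-C order, taken from the list in the description.
PARTS = [
    ("HA signal", "MKTIIALSYIFCLVFA"),
    ("Flag", "DYKDDDDA"),
    ("beta2AR N-term", "MGQPGNGSAFLLAPNRSHAPDHDV"),
    ("TEV site", "ENLYFQG"),
    ("core", None),              # placeholder slot for SEQ ID NO: 3
    ("Sortase A", "LPETG"),
    ("Strep", "SAWSHPQFEK"),
    ("HRV 3C site", "LEVLFQGP"),
    ("GS linker", "GS"),
    ("6xHis", "HHHHHH"),
]

def assemble(core):
    """Concatenate all modules around the given core sequence."""
    seq = ""
    for name, part in PARTS:
        part = core if part is None else part
        # The core's C-terminal L is shared with the first residue of LPETG.
        if name == "Sortase A" and seq.endswith("L"):
            part = part[1:]
        seq += part
    return seq

# Placeholder core (hypothetical): only the first and last residues of SEQ ID NO: 3.
toy_core = "GLQDL" + "QYIGL"
print(assemble(toy_core))
```

Replacing `toy_core` with the full SEQ ID NO: 3 (whitespace removed) should give a sequence that starts and ends the same way as the complete construct.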

The complete amino acid sequence of modified GPR75 (GPR 75-M) is (SEQ ID NO: 13):

MKTIIALSYIFCLVFADYKDDDDAMGQPGNGSAFLLAPNRSHAPDHDVENLYFQGGLQDLIHTATLVTC TFLLAVIFCLGSYGNFIVFLSFFDPAFRKFRTNFDFMILNLSFCDLFICGVTAPMFTFVLFFSSASSIPDAFCFTFH LTSSGFIIMSLKTVAVIALHRLRMVLGKQPNRTASFPCTVLLTLLLWATSFTLATLATLKTSKSHLCLPMSSLIAGK GKAILSLYVVDFTFCVAVVSVSYIMIAQTLRKNAQVARRQLADLEDNWETLNDNLKVIEKADNAAQVKDALTKMRAAALDAQKATPPKLEDKSPDSPEMKDFRHGFDILVGQIDDALKLANEGKVKEAQAAAEQLKTTRNAYIQKYLERARSAI NLSTAKDSKAVVTCVIIVLSVLVCCLPLGISLVQVVLSSNGSFILYQFELFGFTLIFFKSGLNPFIYSRNSAGLRRK VLWCLQYIGLPETGSAWSHPQFEKLEVLFQGPGSHHHHHH

Note: the C-terminal residue of the sequence shown in SEQ ID NO: 3 and the N-terminal residue of the Sortase A fusion sequence are one shared amino acid, L.

Codon-optimized (for Sf9 insect cells) nucleic acid sequence of GPR75-M (SEQ ID NO: 14):

ATGAAAACGATTATCGCACTGTCTTACATCTTCTGCCTGGTTTTTGCAGACTACAAAGACGACGATGATGCAATGGGTCAACCCGGAAACGGTTCAGCATTTTTGTTGGCGCCGAATCGTTCACACGCTCCCGATCACGACGTGGAGAATCTGTATTTCCAAGGCGGTCTGCAGGACTTGATACACACGGCTACGCTTGTCACCTGCACTTTTCTTCTTGCTGTAATATTTTGTTTGGGATCGTACGGCAATTTCATAGTCTTCCTGTCATTTTTCGATCCGGCTTTCCGCAAGTTTAGGACCAATTTTGACTTCATGATCCTTAACCTCTCTTTCTGTGATTTGTTCATATGCGGTGTGACTGCGCCTATGTTTACATTCGTGCTGTTTTTCTCAAGCGCATCATCCATACCCGATGCTTTTTGCTTCACGTTCCATTTGACCTCCTCGGGCTTCATCATTATGTCTTTGAAGACTGTTGCAGTAATAGCACTTCATAGGCTTCGTATGGTCCTCGGCAAACAGCCTAATCGCACTGCGTCGTTCCCTTGCACTGTCCTCTTGACCCTGCTCCTTTGGGCGACATCGTTTACTCTTGCCACCTTGGCTACACTCAAAACAAGCAAGTCTCATCTCTGTTTGCCAATGAGTAGTCTCATTGCCGGTAAAGGAAAGGCAATTTTGTCTTTGTACGTGGTTGACTTTACTTTCTGCGTTGCCGTTGTGTCAGTTTCTTACATCATGATTGCGCAGACACTGCGTAAAAATGCGCAGGTCGCAAGGAGACAGCTCGCCGATCTTGAAGACAATTGGGAGACGTTGAATGACAACCTGAAAGTGATTGAGAAAGCAGACAATGCAGCGCAAGTAAAGGATGCACTCACTAAGATGCGTGCCGCTGCGCTCGACGCGCAAAAGGCAACTCCGCCTAAATTGGAGGATAAGTCCCCTGACTCACCAGAGATGAAGGATTTCAGACATGGCTTCGACATCCTGGTAGGACAGATTGACGATGCGTTGAAGCTCGCGAACGAAGGAAAGGTGAAAGAGGCCCAGGCAGCGGCTGAACAGCTCAAGACCACAAGGAACGCATACATACAAAAATACCTGGAGCGTGCAAGGTCAGCTATAAATCTTTCAACCGCTAAAGATTCCAAGGCGGTAGTCACCTGCGTAATTATAGTACTTTCCGTCTTGGTTTGTTGTCTCCCGTTGGGCATATCCCTCGTACAAGTGGTCCTTTCGAGTAATGGTTCCTTCATTCTGTATCAATTCGAGCTTTTCGGCTTCACTCTGATATTCTTTAAGTCAGGTCTGAATCCCTTTATTTACTCCCGTAATTCAGCGGGACTCAGACGCAAGGTGCTCTGGTGTCTCCAGTACATCGGCCTGCCCGAAACCGGTTCGGCATGGTCTCACCCCCAGTTTGAAAAACTCGAGGTTCTCTTTCAAGGACCGGGAAGTCATCATCACCATCATCATTAG
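A quick consistency check between a codon-optimized nucleic acid sequence and its intended protein is to translate the DNA with the standard genetic code. The following sketch does this with the standard library only; the `translate` function is a hypothetical helper, and the two test fragments are the first 30 and last 27 nucleotides of the sequence above.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()   # tolerate whitespace from copy-paste
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

head = "ATGAAAACGATTATCGCACTGTCTTACATC"   # first 30 nt of SEQ ID NO: 14
tail = "GGAAGTCATCATCACCATCATCATTAG"      # last 27 nt of SEQ ID NO: 14
print(translate(head), translate(tail))
```

Translating the full sequence and comparing it with SEQ ID NO: 13 would extend the same check to the whole construct.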

1.5 Gene Synthesis and plasmid construction

The sequence-optimized genes, including the wild-type GPR75, truncated GPR75 and modified GPR75 genes, were sent to Beijing Honghong biotechnology limited for gene synthesis with NotI and HindIII restriction sites, and were ligated into the pFastbac-1 plasmid.

The amino acid sequence of truncated GPR75 is (SEQ ID NO: 15):

MKTIIALSYIFCLVFADYKDDDDAMGQPGNGSAFLLAPNRSHAPDHDVENLYFQGMNSTGHLQDAPNATSLHVPHSQEGNSTSLQEGLQDLIHTATLVTCTFLLAVIFCLGSYGNFIVFLSFFDPAFRKFRTNFDFMILNLSFCDLFICGVTAPMFTFVLFFSSASSIPDAFCFTFHLTSSGFIIMSLKTVAVIALHRLRMVLGKQPNRTASFPCTVLLTLLLWATSFTLATLATLKTSKSHLCLPMSSLIAGKGKAILSLYVVDFTFCVAVVSVSYIMIAQTLRKNAQVRKCPPVITVDASRPQPFMGVPVQGGGDPIQCAMPALYRNQNYNKLQHVQTRGYTKSPNQLVTPAASRLQLVSAINLSTAKDSKAVVTCVIIVLSVLVCCLPLGISLVQVVLSSNGSFILYQFELFGFTLIFFKSGLNPFIYSRNSAGLRRKVLWCLQYIGLPETGSAWSHPQFEKLEVLFQGPGSHHHHHH

Truncated GPR75 contains no BRIL fusion protein sequence, and the N-terminal random sequence of wild-type GPR75 is not deleted; the signal peptide, protease cleavage sites and tag sequences are the same as those of modified GPR75.

Second, preparation of recombinant baculovirus

2.1 The recombinant pFastbac plasmid containing the gene of interest was introduced into E. coli DH10Bac competent cells (Bomeide organism) by heat shock transformation, and the cells were cultured at 37 ℃ for 48 hours on LB solid medium containing 50 μg/mL kanamycin (BioBomei), 7 μg/mL gentamicin (BioBomei), 10 μg/mL tetracycline (BioBomei), 200 μg/mL X-gal (inalco) and 40 μg/mL IPTG (inalco). Uniform white colonies were picked into 3 mL of LB liquid medium containing the three antibiotics (50 μg/mL kanamycin, 7 μg/mL gentamicin, 10 μg/mL tetracycline) and cultured overnight at 37 ℃ and 200 rpm; when the OD₆₀₀ of the culture reached about 0.6, the recombinant baculovirus plasmid was extracted.

2.2 1 mL of medium (Grace's) was incubated with 15 μL of transfection reagent (FuGENE) and 5 μg of the recombinant baculovirus plasmid at room temperature for 15 minutes. The mixture was used to resuspend 10-12 × 10⁶ Sf9 insect cells (obtained by centrifuging a cell culture at 500 rpm for 10 minutes at room temperature), which were cultured at 27 ℃ and 200 rpm for 4 hours; 5 mL of ESF921 insect cell culture medium (Expression Systems) was then added, and the culture was continued at 27 ℃ and 200 rpm for 48 hours. The Sf9 cells cultured for 48 hours were transferred to a 100 mL Erlenmeyer flask and cultured at 27 ℃ and 110 rpm until the cell density reached 2-4 × 10⁶/mL. The culture was then centrifuged at 2500 rpm for 10 minutes at room temperature; the supernatant was the P1-generation recombinant baculovirus.

2.3 The P1-generation recombinant baculovirus was used to infect, at a ratio of 1:10000, 100 mL of Sf9 insect cells at a density of 1.5 × 10⁶/mL. The cells were cultured at 27 ℃ and 110 rpm until the cell density reached 6 × 10⁶/mL and the cells were enlarged and relatively uniform in size. The culture was then centrifuged at 2500 rpm for 10 minutes at room temperature, and the supernatant was filtered through a 0.22 μm syringe filter to obtain the P2-generation recombinant baculovirus.

Third, small-scale protein expression test

The P2-generation recombinant baculovirus was used to infect, at a ratio of 1:50, 20 mL of Sf9 insect cells at a density of 4 × 10⁶/mL, and the cells were cultured at 27 ℃ and 110 rpm for 48 hours. After the culture was finished, samples were taken for Western blot detection to determine the expression of the target protein.

We performed parallel Western blot experiments on wild-type GPR75 and modified GPR75. No clear band was observed for wild-type GPR75, whereas the target band was observed for modified GPR75 (FIG. 5); that is, the expression level of GPR75 was improved by the modification. Since no expression of wild-type GPR75 was detected, the subsequent large-scale expression and purification comparison was performed between truncated GPR75 and modified GPR75.

Fourth, large-scale protein expression and purification

The P2-generation recombinant baculovirus was used to infect, at a ratio of 1:50, 1 L of Sf9 insect cells at a density of 4 × 10⁶/mL, and the cells were cultured at 27 ℃ and 110 rpm for 48 hours.
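The inoculation ratios used in the three infection steps translate into virus volumes as sketched below. This assumes the ratios are volume-to-volume (virus stock to cell culture), which the description does not state explicitly; the function name and the step labels are hypothetical.

```python
def virus_volume_ml(culture_ml, ratio):
    """Virus stock volume (mL) for a 1:ratio (v/v) inoculation of a culture."""
    return culture_ml / ratio

# (culture volume in mL, ratio denominator) for each step in the description.
steps = {
    "P1 -> P2 amplification": (100, 10000),
    "small-scale expression": (20, 50),
    "large-scale expression": (1000, 50),
}

for label, (culture_ml, ratio) in steps.items():
    volume = virus_volume_ml(culture_ml, ratio)
    print(f"{label}: {volume * 1000:.0f} uL of virus for {culture_ml} mL culture")
```

Under this assumption the amplification step needs only about 10 μL of P1 virus, while the 1 L expression culture consumes about 20 mL of P2 virus.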

After the cell culture was finished, the cells were collected by centrifugation at 4 ℃ and 4000 rpm for 20 minutes. The cell pellet was resuspended in 100 mL of Buffer A and stirred at 4 ℃ for 10 minutes to fully lyse the cells. The lysate was centrifuged at 4 ℃ and 15000 rpm for 10 minutes and the supernatant was discarded. The pellet was resuspended in 25 mL of Buffer B and homogenized with a homogenizer, and the membranes in the homogenate were solubilized at 4 ℃ for 90 minutes. After solubilization, the mixture was centrifuged at 4 ℃ and 37000 rpm for 15 minutes, and the supernatant was incubated with nickel affinity chromatography resin for 1 hour. Contaminating proteins were then washed off with Buffer C and Buffer D, and the target protein was eluted with Buffer E. The eluate containing the target protein was loaded onto a Flag affinity chromatography column; contaminating proteins were washed off with Buffer F and the target protein was eluted with Buffer G. The target protein was then concentrated with a 50 kDa ultrafiltration tube to a volume of about 500 μL and subjected to gel filtration chromatography on a Superdex 200 Increase 10/300 GL column (Cytiva) in Buffer H. Protein samples were collected at the UV-280 absorption peak and analyzed by SDS-PAGE gel electrophoresis to determine the content and purity of the target protein.

The specific compositions of Buffers A-H are as follows:

Buffer A: 20 mM Tris, pH 7.5, 2 mg/mL iodoacetamide;

Buffer B: 20 mM Tris, pH 7.5, 1 mg/mL iodoacetamide, 750 mM NaCl, 0.5% LMNG, 0.03% CHS, 0.2% sodium cholate, 1/1000 protease inhibitor;

Buffer C: 20 mM Tris, pH 7.5, 150 mM NaCl, 0.05% LMNG, 0.003% CHS, 0.02% sodium cholate, 1/1000 protease inhibitor, 20 mM imidazole;

Buffer D: 20 mM Tris, pH 7.5, 150 mM NaCl, 0.05% LMNG, 0.003% CHS, 0.02% sodium cholate, 1/1000 protease inhibitor, 30 mM imidazole;

Buffer E: 20 mM Tris, pH 7.5, 150 mM NaCl, 0.05% LMNG, 0.003% CHS, 0.02% sodium cholate, 1/1000 protease inhibitor, 250 mM imidazole;

Buffer F: 20 mM Tris, pH 7.5, 150 mM NaCl, 0.05% LMNG, 0.003% CHS, 0.02% sodium cholate, 1/1000 protease inhibitor;

Buffer G: 20 mM Tris, pH 7.5, 150 mM NaCl, 0.05% LMNG, 0.003% CHS, 0.02% sodium cholate, 1/1000 protease inhibitor, 0.13 mg/mL Flag peptide;

Buffer H: 20 mM Tris, pH 7.5, 150 mM NaCl, 0.00075% LMNG, 0.0001% CHS, 0.00025% GDN, 1/1000 protease inhibitor, 100 μM tris(2-carboxyethyl)phosphine (TCEP).
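For bench preparation, the molar components of these buffers convert to masses as sketched below, using Buffer E as the example. The molecular weights are standard values for Tris base, sodium chloride and imidazole; the helper name and the 0.5 L batch size are hypothetical, and pH adjustment, detergents and protease inhibitor are left out of the calculation.

```python
# Molecular weights in g/mol (Tris base, sodium chloride, imidazole).
MW = {"Tris": 121.14, "NaCl": 58.44, "imidazole": 68.08}

def grams_needed(conc_mM, component, volume_L):
    """Mass (g) of a component for the given millimolar concentration and volume."""
    return conc_mM / 1000.0 * MW[component] * volume_L

# Molar components of Buffer E as listed in the description (mM).
buffer_e = {"Tris": 20, "NaCl": 150, "imidazole": 250}

for name, conc in buffer_e.items():
    print(f"{name}: {grams_needed(conc, name, 0.5):.2f} g per 0.5 L")
```

Buffers C, D and E differ only in imidazole concentration (20, 30 and 250 mM), so in practice they could also be made by mixing an imidazole-free base buffer with a concentrated imidazole stock.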

The experimental results are as follows. As shown in FIG. 6, in the gel filtration chromatography experiment the UV-280 absorption peak of modified GPR75 began to appear at a Buffer H elution volume of 11.7 mL and reached its maximum absorbance of 228.2 mAU at an elution volume of 12.6 mL. As also shown in FIG. 6, the expression of modified GPR75 was significantly higher than that of truncated GPR75 (which lacks the BRIL fusion and retains the N-terminal random sequence of wild-type GPR75), and modified GPR75 had a higher proportion of monomer peak. The molecular weight of the modified GPR75 protein is 58.03 kDa; as shown in FIG. 7A and FIG. 7B, SDS-PAGE gel electrophoresis analysis showed that the protein obtained after affinity chromatography and gel filtration chromatography purification was about 90% pure. As shown in FIG. 7A and FIG. 7B, cryo-electron microscopy data showed that the sample was in a good aggregation state and could be used for further structural analysis.
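A theoretical molecular weight can be estimated from an amino acid sequence by summing average residue masses and adding one water. The sketch below does this with standard average residue masses; it is not necessarily how the 58.03 kDa figure above was obtained, and it ignores signal peptide removal and any post-translational modification, so its result for the full construct need not match that value. The function name is hypothetical.

```python
# Average residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015  # one water per chain accounts for the free termini

def molecular_weight(sequence):
    """Average molecular weight (Da) of an unmodified polypeptide chain."""
    sequence = "".join(sequence.split()).upper()
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER

# Small sanity checks on short peptides.
print(round(molecular_weight("GG"), 2))
print(round(molecular_weight("HHHHHH"), 2))
```

Passing SEQ ID NO: 13 (with or without the signal peptide) to `molecular_weight` gives a quick estimate to compare against SDS-PAGE and gel filtration results.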

Test example

Test example 1 identification of Activity of modified GPR75 protein

To verify whether the modified GPR75 protein prepared and purified in the examples is active, this test example exploited the fact that a G protein-coupled receptor can activate its downstream Gq protein²¹, and measured how efficiently the receptor promotes GTP hydrolysis by the Gq protein. The experimental procedure followed the instructions of the GTPase-Glo™ Assay kit (Promega). The principle of the method is as follows: 10 μM GTP is added to the experimental system, and because the Gq protein has GTP hydrolysis activity, the GTP in the system is gradually consumed. Modified GPR75 protein that is not bound to a ligand has a background level of activating activity, which accelerates GTP depletion in the system.

The specific steps were as follows. First, the buffer used for the experiment was prepared: 20 mM Tris pH 7.5, 100 mM NaCl, 0.01% MNG, 100 μM TCEP, 5 mM MgCl₂. In the experiment, five samples (control buffer and the purified Gq protein, modified GPR75 protein, modified GPR75 protein-Gq protein complex, and modified GPR75 protein-Gq protein-20-HETE ligand complex) were each mixed with 10 μM GTP; the concentrations of the modified GPR75 protein and the Gq protein were both 3 μM, and the concentration of 20-HETE was 10 μM. The samples were incubated at room temperature for 2 hours. Subsequently, the Glo reaction solution was prepared: the Glo reagent in the kit was diluted 500-fold into double distilled water and 10 μM ADP was added. The Glo reaction solution was added to each of the 5 samples from the previous step at a volume ratio of 1:1 and incubated for 30 minutes. Subsequently, 40 μL of Glo detection solution was added to each sample from the previous step at a volume ratio of 1:1. Finally, the samples were aliquoted into 384-well plates and read using an EnSight plate reader (PerkinElmer).
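The readout of this type of assay can be converted into an approximate percentage of GTP consumed, as sketched below. The sketch assumes that luminescence is proportional to the GTP remaining at the end of the incubation and that the buffer-only sample represents no hydrolysis; the luminescence values are invented for illustration and are not data from this test example.

```python
def percent_gtp_hydrolyzed(rlu_sample, rlu_buffer):
    """Percentage of GTP consumed, assuming luminescence scales with remaining GTP."""
    return 100.0 * (1.0 - rlu_sample / rlu_buffer)

# Hypothetical plate-reader values (relative light units) for the five samples.
readings = {
    "buffer": 100000,
    "Gq": 80000,
    "GPR75-M": 98000,
    "GPR75-M + Gq": 55000,
    "GPR75-M + Gq + 20-HETE": 75000,
}

for sample, rlu in readings.items():
    consumed = percent_gtp_hydrolyzed(rlu, readings["buffer"])
    print(f"{sample}: {consumed:.1f}% GTP hydrolyzed")
```

With numbers of this shape, receptor-stimulated hydrolysis shows up as the difference between the "GPR75-M + Gq" and "Gq" samples.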

Compared with the GTP hydrolysis activity of the Gq protein alone, the hydrolytic activity of the Gq protein was enhanced after addition of the modified GPR75 protein (FIG. 8A). This indicates that the modified GPR75 protein in the ligand-free state has a certain level of background activity. This test example also shows that 20-HETE, a ligand of the GPR75 protein reported in the literature, inhibits the activation caused by the modified GPR75 protein. To further investigate the effect of 20-HETE, the IC₅₀ of 20-HETE was measured in this test example (FIG. 8B), and the results showed that the IC₅₀ of 20-HETE was approximately 2 nM. The modified GPR75 thus has a certain activity, and the modified GPR75 provided by the invention can bind a ligand of wild-type GPR75, such as 20-HETE, and can be applied to GPR75 structure analysis, GPR75 activity analysis, screening of related nucleic acid-encoded small molecule libraries, computer-assisted drug design, drug screening and the like. In addition, as demonstrated in the examples, the modified GPR75 provided by the invention has a higher expression level than wild-type and truncated GPR75 and is suitable for subsequent structural analysis and other applications.

Test example 2 Use of modified GPR75 protein for structural analysis

For further structural analysis work using the modified GPR75 protein, the modified GPR75 protein prepared and purified in the examples was incubated with an anti-BRIL Fab fragment, and the complex was purified by molecular sieve chromatography (FIG. 9A). The purified complex was verified in a further SDS-PAGE experiment (FIG. 9B). Furthermore, as shown in FIG. 9C, single-particle two-dimensional classification by cryo-electron microscopy revealed features of the receptor and of its complex with the anti-BRIL Fab fragment. Test examples 1 and 2 demonstrate that, in the modified GPR75, the ligand binding site of wild-type GPR75 is partially retained, and the BRIL fusion protein can be recognized by its antibody and used as a marker in electron microscopy structural analysis. Subsequently, in this test example, the stable complex obtained was subjected to cryo-sample preparation, electron microscopy data collection, and structure analysis. The single-particle two-dimensional classification data show that the modified GPR75 protein and the anti-BRIL Fab fragment form a stable complex, which lays a solid foundation for further structure analysis of the modified GPR75 protein.
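An IC₅₀ can be read from a dose-response series in several ways; the sketch below uses a simple log-linear interpolation at the half-maximal response rather than a full curve fit, and is not necessarily the method used for FIG. 8B. The concentration series and the simulated responses (generated from a four-parameter logistic curve with an IC₅₀ of 2 nM) are hypothetical, as are the function names.

```python
import math

def four_pl(conc, bottom, top, ic50, hill=1.0):
    """Four-parameter logistic (inhibition) curve."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)

def estimate_ic50(concs, responses):
    """Estimate the IC50 by log-linear interpolation at the half-maximal response.

    concs must be sorted in ascending order and be strictly positive.
    """
    half = (max(responses) + min(responses)) / 2.0
    points = list(zip(concs, responses))
    for (c1, r1), (c2, r2) in zip(points, points[1:]):
        if (r1 - half) * (r2 - half) <= 0 and r1 != r2:
            frac = (r1 - half) / (r1 - r2)
            log_c = math.log10(c1) + frac * (math.log10(c2) - math.log10(c1))
            return 10 ** log_c
    raise ValueError("response never crosses the half-maximal level")

# Hypothetical 20-HETE concentrations (nM) and simulated responses.
concs_nM = [0.1, 0.3, 1, 3, 10, 30, 100]
signal = [four_pl(c, 0.0, 100.0, 2.0) for c in concs_nM]
print(round(estimate_ic50(concs_nM, signal), 2))
```

Because the observed maximum and minimum stand in for the true plateaus, the interpolated value is only approximate; a nonlinear fit of all four parameters would be the more rigorous choice when the plateaus are not well sampled.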

The above description of exemplary embodiments has been presented only to illustrate the technical solution of the invention and is not intended to be exhaustive or to limit the invention to the precise form described. Obviously, many modifications and variations are possible in light of the above teaching to those skilled in the art. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and its practical application to thereby enable others skilled in the art to understand, implement and utilize the invention in various exemplary embodiments and with various alternatives and modifications. It is intended that the scope of the invention be defined by the following claims and their equivalents.

References:

1. Venter, J. C. et al. The Sequence of the Human Genome. Science 291, 1304–1351 (2001).

2. Wise, A., Jupe, S. C. & Rees, S. The identification of ligands at orphan G-protein coupled receptors. Pharmacol Toxicol 44, 43–66 (2004).

3. Sriram, K. & Insel, P. A. GPCRs as targets for approved drugs: How many targets and how many drugs. Mol Pharmacol 93, mol.117.111062 (2018).

4. Flock, T. et al. Selectivity determinants of GPCR-G-protein binding. Nature 545, 317–322 (2017).

5. Garcia, V. et al. 20-HETE Signals Through G-Protein–Coupled Receptor GPR75 (Gq) to Affect Vascular Function and Trigger Hypertension. Circ Res 120, 1776–1788 (2017).

6. Liu, B. et al. The novel chemokine receptor, G-protein-coupled receptor 75, is expressed by islets and is coupled to stimulation of insulin secretion and improved glucose homeostasis. Diabetologia 56, 2467–2476 (2013).

7. Akbari, P. et al. Sequencing of 640,000 exomes identifies GPR75 variants associated with protection from obesity. Science 373, eabf8683 (2021).

8. Tarttelin, E. E. et al. Cloning and Characterization of a Novel Orphan G-Protein-Coupled Receptor Localized to Human Chromosome 2p16. Biochem Bioph Res Co 260, 174–180 (1999).

9. Pease, J. E. Tails of the unexpected – an atypical receptor for the chemokine RANTES/CCL5 expressed in brain. Brit J Pharmacol 149, 460–462 (2006).

10. Ignatov, A., Robert, J., Gregory‐Evans, C. & Schaller, H. C. RANTES stimulates Ca2+ mobilization and inositol trisphosphate (IP3) formation in cells transfected with G protein‐coupled receptor 75. Brit J Pharmacol 149, 490–497 (2006).

11. Shen, P. S. The 2017 Nobel Prize in Chemistry: cryo-EM comes of age. Anal Bioanal Chem 410, 2053–2057 (2018).

12. Callaway, E. Revolutionary cryo-EM is taking over structural biology. Nature 578, 201–201 (2020).

13. Kooistra, A. J. et al. GPCRdb in 2021: integrating GPCR sequence, structure and function. Nucleic Acids Res 49, gkaa1080- (2020).

14. Schneider, G. & Fechner, U. Computer-based de novo design of drug-like molecules. Nat Rev Drug Discov 4, 649–663 (2005).

15. Renaud, J.-P. et al. Cryo-EM in drug discovery: achievements, limitations and prospects. Nat Rev Drug Discov 17, 471–492 (2018).

16. Palczewski, K. et al. Crystal Structure of Rhodopsin: A G Protein-Coupled Receptor. Science 289, 739–745 (2000).

17. Rasmussen, S. G. F. et al. Crystal structure of the human β2 adrenergic G-protein-coupled receptor. Nature 450, 383–387 (2007).

18. Cherezov, V. et al. High-Resolution Crystal Structure of an Engineered Human β2-Adrenergic G Protein–Coupled Receptor. Science 318, 1258–1265 (2007).

19. Rasmussen, S. G. F. et al. Crystal structure of the β2 adrenergic receptor-Gs protein complex. Nature 477, 549–555 (2011).

20. Manglik, A. et al. Structure-based discovery of opioid analgesics with reduced side effects. Nature 537, 185–190 (2016).

21. Gregorio, G. G. et al. Single-molecule analysis of ligand efficacy in β2AR-G-protein activation. Nature 547, 68–73 (2017).

sequence listing

<110> Shuimu future (Beijing) Tech Co Ltd

<120> modified GPR75 and use thereof

<130> 6C95-2128763IP

<160> 15

<170> SIPOSequenceListing 1.0

<210> 1

<211> 540

<212> PRT

<213> human ()

<400> 1

Met Asn Ser Thr Gly His Leu Gln Asp Ala Pro Asn Ala Thr Ser Leu

1 5 10 15

His Val Pro His Ser Gln Glu Gly Asn Ser Thr Ser Leu Gln Glu Gly

20 25 30

Leu Gln Asp Leu Ile His Thr Ala Thr Leu Val Thr Cys Thr Phe Leu

35 40 45

Leu Ala Val Ile Phe Cys Leu Gly Ser Tyr Gly Asn Phe Ile Val Phe

50 55 60

Leu Ser Phe Phe Asp Pro Ala Phe Arg Lys Phe Arg Thr Asn Phe Asp

65 70 75 80

Phe Met Ile Leu Asn Leu Ser Phe Cys Asp Leu Phe Ile Cys Gly Val

85 90 95

Thr Ala Pro Met Phe Thr Phe Val Leu Phe Phe Ser Ser Ala Ser Ser

100 105 110

Ile Pro Asp Ala Phe Cys Phe Thr Phe His Leu Thr Ser Ser Gly Phe

115 120 125

Ile Ile Met Ser Leu Lys Thr Val Ala Val Ile Ala Leu His Arg Leu

130 135 140

Arg Met Val Leu Gly Lys Gln Pro Asn Arg Thr Ala Ser Phe Pro Cys

145 150 155 160

Thr Val Leu Leu Thr Leu Leu Leu Trp Ala Thr Ser Phe Thr Leu Ala

165 170 175

Thr Leu Ala Thr Leu Lys Thr Ser Lys Ser His Leu Cys Leu Pro Met

180 185 190

Ser Ser Leu Ile Ala Gly Lys Gly Lys Ala Ile Leu Ser Leu Tyr Val

195 200 205

Val Asp Phe Thr Phe Cys Val Ala Val Val Ser Val Ser Tyr Ile Met

210 215 220

Ile Ala Gln Thr Leu Arg Lys Asn Ala Gln Val Arg Lys Cys Pro Pro

225 230 235 240

Val Ile Thr Val Asp Ala Ser Arg Pro Gln Pro Phe Met Gly Val Pro

245 250 255

Val Gln Gly Gly Gly Asp Pro Ile Gln Cys Ala Met Pro Ala Leu Tyr

260 265 270

Arg Asn Gln Asn Tyr Asn Lys Leu Gln His Val Gln Thr Arg Gly Tyr

275 280 285

Thr Lys Ser Pro Asn Gln Leu Val Thr Pro Ala Ala Ser Arg Leu Gln

290 295 300

Leu Val Ser Ala Ile Asn Leu Ser Thr Ala Lys Asp Ser Lys Ala Val

305 310 315 320

Val Thr Cys Val Ile Ile Val Leu Ser Val Leu Val Cys Cys Leu Pro

325 330 335

Leu Gly Ile Ser Leu Val Gln Val Val Leu Ser Ser Asn Gly Ser Phe

340 345 350

Ile Leu Tyr Gln Phe Glu Leu Phe Gly Phe Thr Leu Ile Phe Phe Lys

355 360 365

Ser Gly Leu Asn Pro Phe Ile Tyr Ser Arg Asn Ser Ala Gly Leu Arg

370 375 380

Arg Lys Val Leu Trp Cys Leu Gln Tyr Ile Gly Leu Gly Phe Phe Cys

385 390 395 400

Cys Lys Gln Lys Thr Arg Leu Arg Ala Met Gly Lys Gly Asn Leu Glu

405 410 415

Val Asn Arg Asn Lys Ser Ser His His Glu Thr Asn Ser Ala Tyr Met

420 425 430

Leu Ser Pro Lys Pro Gln Lys Lys Phe Val Asp Gln Ala Cys Gly Pro

435 440 445

Ser His Ser Lys Glu Ser Met Val Ser Pro Lys Ile Ser Ala Gly His

450 455 460

Gln His Cys Gly Gln Ser Ser Ser Thr Pro Ile Asn Thr Arg Ile Glu

465 470 475 480

Pro Tyr Tyr Ser Ile Tyr Asn Ser Ser Pro Ser Gln Glu Glu Ser Ser

485 490 495

Pro Cys Asn Leu Gln Pro Val Asn Ser Phe Gly Phe Ala Asn Ser Tyr

500 505 510

Ile Ala Met His Tyr His Thr Thr Asn Asp Leu Val Gln Glu Tyr Asp

515 520 525

Ser Thr Ser Ala Lys Gln Ile Pro Val Pro Ser Val

530 535 540
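The listing above alternates rows of three-letter residue codes with rows of position markers. As a minimal sketch (not part of the filing), the following Python reads that flattened layout into a one-letter string and collects the markers so the residue count can be checked against them. The function name `parse_listing` and the variable names are illustrative assumptions; the sample text is the last two residue rows of SEQ ID NO: 1 exactly as printed above.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_listing(text):
    """Return (one_letter_sequence, position_markers) from a flattened listing.

    Hypothetical helper: rows made only of integers are treated as position
    markers; every other non-empty row is treated as three-letter residues.
    """
    residues, markers = [], []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank separator lines
        if all(t.isdigit() for t in tokens):
            markers.extend(int(t) for t in tokens)
        else:
            residues.extend(THREE_TO_ONE[t] for t in tokens)
    return "".join(residues), markers


# Last two residue rows of SEQ ID NO: 1 as printed above (positions 513 to 540).
tail = """
Ile Ala Met His Tyr His Thr Thr Asn Asp Leu Val Gln Glu Tyr Asp
515 520 525
Ser Thr Ser Ala Lys Gln Ile Pro Val Pro Ser Val
530 535 540
"""

seq, markers = parse_listing(tail)
print(seq)          # one-letter form of the 28 residues
print(markers[-1])  # final position marker
```

Because the final marker of the full listing is 540, the same check on the complete sequence gives a quick guard against dropped or duplicated rows in an extracted copy.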

<210> 2

<211> 115

<212> PRT

<213> Artificial sequence

<220>

Claims

1. A modified GPR75, characterized in that the modified GPR75 comprises:

a first domain comprising an amino acid sequence derived from a β2 adrenergic receptor; and

a second domain, which is wild-type GPR75 from which the random sequence between the fifth and sixth transmembrane helices and the N-terminal and C-terminal random sequences have been deleted, and in which the fifth and sixth transmembrane helices are linked by an amino acid sequence derived from a BRIL fusion protein;

the amino acid sequence of the modified GPR75 is as shown in SEQ ID NO: 13.

2. The modified GPR75 of claim 1, wherein the first domain comprises the amino acid sequence set forth in SEQ ID NO: 4; and/or

the second domain comprises the amino acid sequence set forth in SEQ ID NO: 3.

3. The modified GPR75 according to claim 1 or 2, characterized in that the modified GPR75 comprises the following sequence:

(i) the amino acid sequence shown in SEQ ID NO: 5.

4. A polynucleotide encoding the modified GPR75 of any one of claims 1 to 3.

5. The polynucleotide of claim 4, having a nucleotide sequence as set forth in SEQ ID NO: 14.

6. An expression vector comprising the polynucleotide of claim 4 or 5.

7. A host cell comprising the expression vector of claim 6.

8. Use of the modified GPR75 according to any one of claims 1 to 3, the polynucleotide according to claim 4 or 5, the expression vector according to claim 6, or the host cell according to claim 7 in GPR75 structural analysis, fluorescent molecular labeling, fusion of phosphorylated polypeptides or signaling proteins, GPR75 activity analysis, screening of nucleic acid-encoded small molecule libraries, computer-aided drug design, and drug screening, wherein the use is not a method for the treatment or diagnosis of disease.
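The construct architecture recited in claim 1 (terminal sequences and the loop between the fifth and sixth transmembrane helices removed, a BRIL-derived sequence joining those helices, plus a first domain) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function name `build_modified_receptor`, the boundary indices, and the toy strings are hypothetical and are not taken from SEQ ID NO: 1, 3, 4, 5 or 13. Placing the first domain at the N-terminus is likewise an assumption, since the claims as worded here do not fix the order.

```python
def build_modified_receptor(wild_type, n_term_end, loop_start, loop_end,
                            c_term_start, first_domain, bril):
    """Assemble a chimeric sequence from string pieces.

    All indices are 0-based Python slice bounds and are hypothetical:
      wild_type[:n_term_end]           N-terminal sequence to delete
      wild_type[loop_start:loop_end]   TM5-TM6 loop to replace with `bril`
      wild_type[c_term_start:]         C-terminal sequence to delete
    """
    bounds = (0, n_term_end, loop_start, loop_end, c_term_start, len(wild_type))
    if list(bounds) != sorted(bounds):
        raise ValueError("boundaries must be ordered and lie within the sequence")
    tm1_to_tm5 = wild_type[n_term_end:loop_start]   # kept segment up to the loop
    tm6_to_tm7 = wild_type[loop_end:c_term_start]   # kept segment after the loop
    # Assumed order: first domain, then the truncated receptor with the insert.
    return first_domain + tm1_to_tm5 + bril + tm6_to_tm7


# Toy strings only, not real sequences: 4 N-terminal, 5 kept, 3 loop,
# 4 kept and 2 C-terminal characters.
toy_wild_type = "nnnn" + "AAAAA" + "lll" + "BBBB" + "cc"
toy_construct = build_modified_receptor(
    toy_wild_type, n_term_end=4, loop_start=9, loop_end=12, c_term_start=16,
    first_domain="F", bril="RR",
)
print(toy_construct)
```

Keeping the boundaries as explicit parameters makes it straightforward to compare alternative truncation points; the actual boundaries are defined only by the sequences referenced in the claims.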