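WO2014179983A1 - Method for engineering non-antibody proteins to produce binding molecules, products produced thereby, and a long-acting GLP-1 receptor agonist - Google Patents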


Info

Publication number
WO2014179983A1
WO2014179983A1 PCT/CN2013/075460 CN2013075460W WO2014179983A1 WO 2014179983 A1 WO2014179983 A1 WO 2014179983A1 CN 2013075460 W CN2013075460 W CN 2013075460W WO 2014179983 A1 WO2014179983 A1 WO 2014179983A1
Authority
WO
WIPO (PCT)
Prior art keywords
protein
polypeptide
sequence
seq
variable region
Prior art date
Application number
PCT/CN2013/075460
Other languages
English (en)
French (fr)
Inventor
王瑞
黄金
张伟
卢水秀
史孟君
王军亮
司武亮
Original Assignee
北京华金瑞清生物医药技术有限公司
Priority date
Filing date
Publication date
Application filed by 北京华金瑞清生物医药技术有限公司
Priority to PCT/CN2013/075460 (WO2014179983A1/zh)
Priority to CN201380075612.0A (CN105143250B/zh)
Publication of WO2014179983A1


Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/575 Hormones
    • C07K14/605 Glucagons
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K47/00 Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient
    • A61K47/50 Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates
    • A61K47/51 Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the non-active ingredient being a modifying agent
    • A61K47/62 Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the non-active ingredient being a modifying agent the modifying agent being a protein, peptide or polyamino acid
    • A61K47/64 Drug-peptide, drug-protein or drug-polyamino acid conjugates, i.e. the modifying agent being a peptide, protein or polyamino acid which is covalently bonded or complexed to a therapeutically active agent
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P3/00 Drugs for disorders of the metabolism
    • A61P3/08 Drugs for disorders of the metabolism for glucose homeostasis
    • A61P3/10 Drugs for disorders of the metabolism for glucose homeostasis for hyperglycaemia, e.g. antidiabetics
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide

Definitions

  • The present invention relates to biopharmaceuticals, namely macromolecules called mimetic antibodies (or non-natural antibodies) that can bind to a target.
  • the present invention also relates to the fields of diabetes, obesity, cardiovascular disease, and neurodegenerative diseases, and more particularly to GLP-1 receptor agonists.
  • Protein molecules with engineering potential can specifically recognize and bind specific targets after artificial engineering. Such targeting protein molecules and engineering techniques have therefore been widely applied in biotechnology fields such as pharmaceuticals and diagnostics.
  • monoclonal antibody molecules are natural antigen-binding molecules that produce strong specific binding capabilities for a wide variety of drug targets.
  • Antibodies and their derivatives still have many key defects and cannot meet actual needs. This is mainly reflected in the following aspects: First, the molecular weight of a monoclonal antibody is generally more than 100,000 daltons, so its ability to penetrate tissues is weak, which limits its efficacy against diseases such as tumors.
  • The antigen-binding interface of monoclonal antibodies and their derived fragments is relatively flat, which makes it difficult for them to bind ion channel proteins, catalytic sites of enzymes and similar sites that are very important drug targets.
  • Antibodies are composed of two different polypeptide chains, and the cloning step is complicated, sometimes leading to structural instability.
  • Antibody fragments made into single chains, or other derivatives such as single-chain antibodies (in which the VH and VL subunits are directly linked by a linker peptide), can in most cases be produced only in small batches.
  • The production and purification of antibodies and their functional fragments is costly and often requires post-translational modifications such as complex glycosylation, so they must be produced in expensive mammalian cells with low yields.
  • The invention disclosed in the present application relates to a method for identifying the potential of a protein template to produce a mimetic antibody, comprising: (i) preliminarily selecting a protein; and (ii) using the structural information of the protein itself to identify one or more regions (referred to as variable regions) into which changes can be introduced without substantially affecting the protein structure, thereby identifying the potential of the protein template to produce a mimetic antibody.
  • The method may further comprise using the sequence information of the protein itself to preferentially select among the one or more variable regions identified from the structural information of the protein.
  • The above method may further comprise verifying the potential of the protein template to produce a mimetic antibody, the verification comprising: (i) in the identified variable region, introducing a point mutation, inserting one or more polypeptides that may participate in the interaction of the protein template with other proteins and exhibit a non-linear structure, or that may exhibit a non-linear structure by themselves (non-linear polypeptides, NLPs), or partially or completely substituting the variable region with one or more of said polypeptides; and (ii) analyzing the properties of the protein variant resulting from the changes introduced in the variable region, wherein the properties of the protein variant demonstrate the potential of the protein template to produce a mimetic antibody.
  • The protein display may be one of: (i) phage display; (ii) yeast display; (iii) mRNA display; and (iv) ribosome display.
  • The properties of the protein variant or fusion protein analyzed as described above may include: (i) thermal stability; (ii) enzyme stability; (iii) solubility; (iv) whether the introduced polypeptide retains its original target affinity; and (v) expression level.
  • The invention disclosed in the present application also relates to a method for producing a mimetic antibody, comprising: (i) preliminarily selecting a protein; and (ii) using the structural information of the protein itself to identify one or more regions (referred to as variable regions) into which changes can be introduced without substantially affecting the protein structure, thereby identifying the potential of the protein template to produce a mimetic antibody.
  • The polypeptides described above may be one or more polypeptides that may participate in the interaction of the protein template with other proteins and exhibit a non-linear structure, or that may exhibit a non-linear structure by themselves.
  • The method further comprises replacing one or more of the identified variable regions with part or all of a polypeptide whose length is close to that of the corresponding variable region.
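The replacement and insertion operations described above can be pictured at the sequence level with a small sketch. It is illustrative only: the function names `replace_region` and `insert_after`, the toy template sequence and the toy peptide are assumptions made for this example and do not come from the application.

```python
def replace_region(template: str, start: int, end: int, peptide: str) -> str:
    """Replace residues start..end (1-based, inclusive) of the template with the peptide."""
    if not (1 <= start <= end <= len(template)):
        raise ValueError("region lies outside the template sequence")
    return template[:start - 1] + peptide + template[end:]


def insert_after(template: str, position: int, peptide: str) -> str:
    """Insert the peptide after the residue at the given 1-based position."""
    if not (0 <= position <= len(template)):
        raise ValueError("position lies outside the template sequence")
    return template[:position] + peptide + template[position:]


# Toy data (hypothetical): a 16-residue template and a 4-residue peptide.
toy_template = "MKTAYIAKQRQISFVK"
toy_peptide = "CGGC"

print(replace_region(toy_template, 5, 8, toy_peptide))  # MKTACGGCQRQISFVK
print(insert_after(toy_template, 4, toy_peptide))       # MKTACGGCYIAKQRQISFVK
```

Replacement keeps the template length unchanged only when the peptide is as long as the region it replaces, which is why the text asks for a peptide of similar length.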
  • The polypeptide inserted above may be derived from one of the following: (i) a polypeptide that is capable of forming a cyclic structure and has a targeted binding ability; (ii) a part of an antibody complementarity determining region (CDR); and (iii) a part of the binding interface between two interacting natural proteins.
  • The polypeptide inserted above can be produced by one of the following methods: (i) selecting one or more known polypeptides that bind to a target protein; (ii) screening for a polypeptide that binds to a target by protein display; (iii) screening for a disulfide-bonded non-linear polypeptide (NLP); (iv) making an antibody to a target, and then substituting or truncating the complementarity determining regions (CDRs) of the antibody to make one or more polypeptide segments; and (v) selecting a segment from the binding interface between two interacting natural proteins as the polypeptide.
  • the protein display described above can be one of the following methods: (i) phage display; (ii) yeast display; (iii) mRNA display; and (iv) ribosome display.
  • The method of making a mimetic antibody disclosed in the present application may further comprise altering regions outside the variable regions (referred to as non-variable regions) to further improve the prepared mimetic antibody.
  • The alterations to the non-variable regions described above may include: (i) adding or deleting N-terminal or C-terminal sequences of the non-variable region; (ii) modifying the N-terminus or C-terminus to a sequence suitable for expression in the host; and (iii) replacing residues of the junction regions that join secondary structures in the non-variable region with residues having shorter side chains.
  • The shorter side chain residues described above may be glycine, alanine or serine.
  • The method for identifying the potential of a protein template to produce a mimetic antibody as described above includes: (i) preliminarily selecting a protein; and (ii) using the structural information of the protein itself to identify one or more regions (referred to as variable regions) into which changes can be introduced without substantially affecting the protein structure, thereby identifying the potential of the protein template to produce a mimetic antibody, wherein the variable region is identified by the following method:
  • the data describing the protein structure above may be three-dimensional Euclidean spatial coordinate data.
  • The objects expressed by the three-dimensional Euclidean space coordinate data may be all atoms of the protein, carbon alpha (Cα), carbon beta (Cβ), carbon gamma (Cγ), carbon delta (Cδ), carbon epsilon (Cε) or other atom types, or a combination of the above atom types.
  • the data describing the structure of the protein can also be a protein contact map.
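As a rough illustration of how three-dimensional coordinate data relate to a contact map, the sketch below marks two residues as in contact when their Cα atoms lie within a distance cutoff. The 8 Å cutoff, the function name `contact_map` and the toy coordinates are assumptions made for this example; the application does not specify how its contact maps are built.

```python
import math


def contact_map(ca_coords, cutoff=8.0):
    """Return a symmetric boolean matrix; True where two C-alpha atoms are within cutoff (angstroms)."""
    n = len(ca_coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = True
    return cmap


# Toy C-alpha coordinates (hypothetical): three residues in a row, one far away.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
for row in toy_map:
    print("".join("#" if in_contact else "." for in_contact in row))
```

The diagonal is left False here because a residue is not counted as being in contact with itself.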
  • The mathematical model for describing the randomness of incomplete data in the method for identifying the potential of a protein template to produce a mimetic antibody can be a hidden Markov model, wherein each node of the hidden Markov model has three states: an M (Match, homologous conserved) state, an I (Insert, random insertion) state and a D (Deletion) state; these three states follow certain probability distributions; and wherein step (iv) identifies as a variable region a structural region that tends to deviate from the M state across the set of protein structures, or that exhibits the I state.
  • The probability distributions followed by the three states in the above method may be a Gaussian distribution, a Beta distribution, or an Exponential distribution.
  • The method for identifying the potential of a protein template to produce a mimetic antibody as described above can also set explicit parameters to distinguish protein structure flexibility caused by three factors: (i) self-flexibility due to thermal stability; (ii) self-flexibility not due to thermal stability; and (iii) deviations in protein structure that can be tolerated during natural or artificial evolution.
  • The structures of the protein set described above can be regarded as random three-dimensional lattices, each produced from a random path (A) that follows a certain graph (G) and a random variable (Y) generated according to certain emission probabilities, through certain rotation (R) and translation (V) operations; the random sampling method may be a Monte Carlo method, and the model parameters updated by random sampling may be the graph (G), the random path (A), the random variable (Y), the rotation (R) and the translation (V).
  • the joint probability or conditional probability involved in the random path process described above is derived by the Forward or Viterbi algorithm.
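For readers unfamiliar with the Viterbi algorithm mentioned above, here is a minimal log-space sketch on a deliberately simplified model. It is not the model of the application: it uses only two states (M and I, no D state, no graph, rotation or translation), Gaussian emissions over a single made-up per-residue deviation value, and transition and emission parameters chosen purely for illustration.

```python
import math


def log_gauss(x, mu, sigma):
    """Log density of a normal distribution N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most probable state path for the observations (log-space Viterbi)."""
    best = [{s: log_start[s] + log_emit[s](obs[0]) for s in states}]
    back = []
    for x in obs[1:]:
        row, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            row[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s](x)
            ptr[s] = prev
        best.append(row)
        back.append(ptr)
    path = [max(states, key=lambda s: best[-1][s])]
    for ptr in reversed(back):  # trace the back-pointers from the last position
        path.append(ptr[path[-1]])
    return path[::-1]


# All numbers below are hypothetical illustration values.
states = ["M", "I"]
log_start = {"M": math.log(0.9), "I": math.log(0.1)}
log_trans = {
    "M": {"M": math.log(0.9), "I": math.log(0.1)},
    "I": {"M": math.log(0.2), "I": math.log(0.8)},
}
log_emit = {
    "M": lambda x: log_gauss(x, 0.5, 0.5),  # conserved positions: small deviation
    "I": lambda x: log_gauss(x, 4.0, 2.0),  # inserted positions: large deviation
}
deviations = [0.3, 0.5, 4.2, 5.0, 3.8, 0.4, 0.6]  # toy per-residue deviations

path = viterbi(deviations, states, log_start, log_trans, log_emit)
print("".join(path))  # MMIIIMM
```

In this toy run the three residues with large deviations are assigned the I state, which mirrors the idea that residues leaving the M state are candidates for a variable region.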
  • The random sampling described above can be performed at least 100 times, and further includes, for each sample, checking the state of the node corresponding to each residue of the protein structure: (i) if the node state is the I state, the residue is marked as belonging to a potential variable region; (ii) if the node state is the M state and the spatial position of the residue deviates greatly from the emission probability distribution of the M state, the residue is marked as belonging to a potential variable region.
  • Residues whose cumulative number of marks as belonging to a potential variable region exceeds a certain proportion can be regarded as a variable region.
  • A large deviation may mean that the emission probability is less than 0.05.
  • The criterion for being regarded as a variable region may be that the residue is marked as belonging to a potential variable region in more than 95% of the samples.
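The marking rule above (I state, or M state with emission probability below 0.05, counted over at least 100 samples and kept when the fraction exceeds 95%) can be sketched as follows. The data layout, in which each sample is a list of `(state, emission_probability)` pairs, one per residue, and the synthetic samples are assumptions made for this example.

```python
def find_variable_residues(samples, p_cut=0.05, frac_cut=0.95, min_samples=100):
    """Return 1-based residue numbers marked in more than frac_cut of the samples."""
    if len(samples) < min_samples:
        raise ValueError("at least %d samples are required" % min_samples)
    n_residues = len(samples[0])
    counts = [0] * n_residues
    for sample in samples:
        for i, (state, p_emit) in enumerate(sample):
            # Mark the residue if it is in the I state, or in the M state
            # with an emission probability below the cutoff.
            if state == "I" or (state == "M" and p_emit < p_cut):
                counts[i] += 1
    return [i + 1 for i, c in enumerate(counts) if c / len(samples) > frac_cut]


def make_demo_samples(n=100):
    """Build synthetic samples for five residues (hypothetical data)."""
    samples = []
    for k in range(n):
        samples.append([
            ("M", 0.60),                          # residue 1: always conserved
            ("I" if k % 2 == 0 else "M", 0.60),   # residue 2: marked in 50% of samples
            ("I", 0.30),                          # residue 3: always inserted
            ("M", 0.01 if k < 97 else 0.50),      # residue 4: marked in 97% of samples
            ("D", 0.00),                          # residue 5: deleted, never marked
        ])
    return samples


print(find_variable_residues(make_demo_samples()))  # [3, 4]
```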
  • The invention disclosed in the present application also relates to a method for identifying the potential of a protein template to produce a mimetic antibody, comprising: (i) preliminarily selecting a protein; (ii) using the structural information of the protein itself to identify one or more regions (referred to as variable regions) into which changes can be introduced without substantially affecting the protein structure, thereby identifying the potential of the protein template to produce a mimetic antibody; and (iii) using the sequence information of the protein itself to preferentially select among the variable regions identified in (ii), including:
  • (c) using the positional score obtained in step (b) to preferentially select the variable region, that is, the lower the score, the more likely the position is to belong to the variable region and thus to be preferred.
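A minimal sketch of the preference rule in step (c), assuming that a per-position score is already available and that a region is ranked by the mean score of its positions. How the score of step (b) is actually computed is not given in this text, so the scores, the averaging and the function name `rank_regions` are assumptions made for illustration.

```python
def rank_regions(scores, regions):
    """Sort candidate regions (1-based, inclusive) by mean positional score, lowest first."""
    def mean_score(region):
        start, end = region
        window = scores[start - 1:end]
        return sum(window) / len(window)
    return sorted(regions, key=mean_score)


# Toy per-position scores and candidate regions (hypothetical values).
toy_scores = [0.9, 0.8, 0.2, 0.1, 0.3, 0.9, 0.7, 0.6, 0.5, 0.9]
toy_regions = [(7, 9), (3, 5)]

print(rank_regions(toy_scores, toy_regions))  # [(3, 5), (7, 9)]
```

The region covering positions 3 to 5 has the lower mean score, so it is listed first as the preferred variable region.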
  • The invention disclosed in the present application also relates to a polypeptide or protein whose sequence may be one of the following sequences or more than 75% homologous to one of the following sequences: (i) SEQ ID NO: 1, wherein the variable region comprises the regions between amino acids 32 and 43, between amino acids 55 and 58, and between amino acids 90 and 93; (ii) SEQ ID NO: 15, wherein the variable region comprises the region between amino acids 72 and 81; (iii) SEQ ID NO: 16, wherein the variable region comprises the regions between amino acids 10 and 15 and between amino acids 45 and 68; and (iv) SEQ ID NO: 17, wherein the variable region comprises the regions between amino acids 67 and 71, between amino acids 86 and 91, and between amino acids 96 and 101.
  • the homology described above may also be: 80% or more homology, 85% or more homology, 90% or more homology, 95% or more homology, and 99% or more homology.
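The percentage thresholds above can be made concrete with a small sketch that treats "homology" as percent identity over two pre-aligned sequences of equal length. That reading is an assumption: the text does not define how homology is measured, and the function names and toy sequences are made up for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues (gap columns count as mismatches)."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if percent identity is strictly greater than the threshold."""
    return percent_identity(aligned_a, aligned_b) > threshold


# Toy aligned sequences (hypothetical): 8 of 10 positions are identical.
seq_a = "HAEGTFTSDV"
seq_b = "HGEGTFTSDL"

print(percent_identity(seq_a, seq_b))     # 80.0
print(meets_threshold(seq_a, seq_b, 75))  # True
print(meets_threshold(seq_a, seq_b, 85))  # False
```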
  • One of the following polypeptides, or a sequence more than 75% homologous to one of the following polypeptides, is inserted into the variable region thereof, or part or all of the variable region is replaced by part or all of the polypeptide sequence:
  • the homology described above may also be: 80% or more homology, 85% or more homology, 90% or more homology, 95% or more homology, and 99% or more homology.
  • the invention disclosed in this application also relates to an isolated nucleic acid molecule encoding the polypeptide or protein described above.
  • the invention disclosed in the present application also relates to an expression vector comprising the above nucleic acid molecule.
  • the invention disclosed in the present application also relates to an expression vector which can express the above polypeptide or protein.
  • The invention disclosed in the present application also relates to an expression vector having the sequence of SEQ ID NO: 14 or a sequence at least 75% homologous to SEQ ID NO: 14.
  • the homology may also be: more than 80% homology, more than 85% homology, more than 90% homology, more than 95% homology, and more than 99% homology.
  • The invention disclosed in the present application also relates to a polypeptide or protein whose sequence may be SEQ ID NO: 1 or a sequence in which at least one amino acid differs from the wild-type sequence of the gene corresponding to SEQ ID NO: 1 (i.e., 1x5j).
  • The invention disclosed in the present application also relates to a polypeptide or protein whose sequence may be SEQ ID NO: 16 or a sequence in which at least one amino acid differs from the wild-type sequence of the gene corresponding to SEQ ID NO: 16 (i.e., 1klg).
  • the invention disclosed in the present application also relates to a macromolecule comprising the following two parts:
  • a biologically functional polypeptide or protein having a sequence that is one of the following sequences or has more than 75% homology to one of the following sequences: (a) SEQ ID NO: 25; (b) SEQ ID NO: 26 ; and (c) SEQ ID NO: 43.
  • a serum albumin targeting polypeptide or protein having a sequence that is one of the following sequences or has more than 75% homology to one of the following sequences: (a) SEQ ID NO: 27; (b) SEQ ID NO: 28 ;
  • The macromolecule described above may further comprise a linker molecule (third part) between the biologically functional polypeptide (first part) and the serum albumin targeting polypeptide (second part), the linker molecule having a molecular weight between 300 and 5,500.
  • the biologically functional polypeptide described above may be the following mutant of SEQ ID NO: 26 (i.e., GLP-1):
  • His1 GLP-1 modified mutants, specifically including: deaminated GLP-1, (D-His1)GLP-1, N-sorbitol-GLP-1, N-imidazole-GLP-1, N-α-methyl-GLP-1, N-methyl-GLP-1, N-acetyl-GLP-1 and N-pyroglutamyl-GLP-1;
  • Ala2 GLP-1 mutants, specifically including: (D-Ala2)GLP-1, (Gly2)GLP-1, (Ser2)GLP-1, (Aha2)GLP-1, (Thr2)GLP-1, (Aib2)GLP-1, (Abu2)GLP-1 and (Val2)GLP-1;
  • Glu3 GLP-1 mutants, specifically including: (Asp3)GLP-1, (Ala3)GLP-1, (Pro3)GLP-1, (Phe3)GLP-1, (Lys3)GLP-1 and (Tyr3)GLP-1;
  • The linker molecule in the macromolecule described above may be a non-polypeptide molecule.
  • The non-polypeptide molecule can be, but is not limited to, one or any combination of the following: polyethylene glycol, polypropylene glycol, ethylene glycol/propylene glycol copolymer, polyoxyethylene, polyurethane, polyphosphazene, polysaccharide, dextran, polyvinyl alcohol, polyvinylpyrrolidone, polyvinyl ethyl ether, polyacrylamide, polypropylene, polycyano, lipid polymer, chitin, hyaluronic acid and heparin.
  • The linker molecule in the macromolecule described above may be a polypeptide molecule, which may be composed of natural or non-natural amino acids.
  • the natural amino acid can be a natural amino acid that can form a protein.
  • the natural amino acid can be a natural amino acid directly encoded by the genetic code.
  • the polypeptide as a linker molecule may also be one of the following sequences or more than 75% homologous to one of the following sequences: (a) SEQ ID NO: 36; (b) SEQ ID NO: 37;
  • the homology may also be: more than 80% homologous; more than 85% homologous; more than 90% homologous; more than 95% homologous; and more than 99% homologous.
  • The biologically functional polypeptide of the macromolecule described above is at least 80% homologous to the listed sequences, and the serum albumin targeting polypeptide is at least 80% homologous to the listed sequences (a) SEQ ID NO: 27, (b) SEQ ID NO: 28, (c) SEQ ID NO: 29, (d) SEQ ID NO: 30, (e) SEQ ID NO: 31, (f) SEQ ID NO: 32, (g) SEQ ID NO: 33, (h) SEQ ID NO: 34, and (i) SEQ ID NO: 35.
  • the homology may also be more than 85% homologous, more than 90% homologous, more than 95% homologous, and more than 99% homologous.
  • the invention disclosed in the present application also relates to an isolated nucleic acid molecule encoding a polypeptide or protein in the above polymer.
  • the invention disclosed in the present application also relates to an expression vector comprising the above nucleic acid molecule.
  • the invention disclosed in the present application also relates to an expression vector which expresses a polypeptide or protein in the above polymer.
  • The invention disclosed in the present application also relates to a medicament or vaccine comprising any one of the polypeptides or proteins described above, or any of the macromolecules described above, or any of the nucleic acid molecules described above, or any of the expression vectors described above.
  • Figure 1 is a flow chart of some of the methods of the present invention.
  • Figure 2 is a logic flow diagram of the ability of a test template protein to produce a quasi-antibody.
  • Figure 3 is a logic flow diagram illustrating the generation of a quasi-antibody based on a given target.
  • Figure 4 shows the structure of 1x5j and its structurally similar proteins.
  • Figure 5 is a probabilistic estimate of whether each residue of 1x5j belongs to a structural element.
  • Figure 6 shows the results of the analysis of 1x5j by combining the sequence and structural profile results.
  • Figure 7 is a diagram showing a typical plasmid used to express 1x5j and its protein variants.
  • Figure 8 is a phage ELISA result for detecting the expression of a 1x5j variant on the surface of a phage.
  • Figure 9 shows the test results of the binding ability of the fusion protein formed by the template protein and the NLP polypeptide to serum albumin.
  • Figure 10 shows the phage ELISA results for 1fna, 1hms and 1klg.
  • Figure 11 is an electrophoresis pattern of the BMT library mutation product of the 1hms template protein.
  • Figure 12 shows the expression of a GLP-1 receptor agonist fusion protein.
  • Figure 13 is an electropherogram of the purified fusion proteins Ex4-1fna-sabl, Ex4-1hms-sabl and Ex4-1x5j-sabl.
  • Figure 14 is an electrophoresis pattern of the fusion proteins Ex4-1fna-sabl, Ex4-1hms-sabl and Ex4-1x5j-sabl after enterokinase digestion.
  • Figure 15 is an electropherogram of the fusion protein after digestion and purification by enterokinase.
  • Figure 16 is an ELISA result detecting the binding ability of Ex4-1fna-sabl to human serum albumin.
  • Figure 17 shows the effect of Ex4-1fna-sabl on blood glucose concentration in normal mice.
  • Figure 18 is a comparison of the hypoglycemic effect of Ex-4 and the fusion protein Ex4-1fna-sabl in mice 2 hours after administration.
  • Figure 19 is a comparison of the hypoglycemic effect of Ex-4 and the fusion protein Ex4-1fna-sabl in mice 12 hours after administration.
  • Figure 20 is a comparison of the hypoglycemic effects of the fusion protein Ex4-1fna-sabl and Ex4 in mice.
  • Figure 21 is a pharmacokinetic profile of Ex4 in mouse plasma.
  • Figure 22 is a pharmacokinetic profile of the fusion protein Ex4-1fna-sabl in mouse plasma.
  • Figure 23 shows the hypoglycemic effect of the fusion protein Ex4-1fna-sabl in Beagle dogs.
  • Figure 24 is a pharmacokinetic profile of the fusion protein Ex4-1fna-sabl in Beagle dogs.
  • Detailed description of the invention
  • Following monoclonal antibodies, a novel class of targeting proteins called "mimetic antibodies" has been discovered.
  • These novel targeting proteins are obtained by engineering natural protein templates. They have compact, thermally stable structures, are small (5-20 kDa), and have large mutable surface areas, so that engineering and directed evolution do not seriously damage the stability of the original protein.
  • These mimetic antibody template proteins have sequences completely different from those of antibodies but bind specific antigens specifically, and they generally have better solubility, tissue penetration, heat stability and enzyme stability. They can also be produced in large quantities in prokaryotic systems (such as E. coli) or simple eukaryotic systems (such as yeast).
  • The antigen-binding region (i.e., the variable region) of a mimetic antibody protein has no clear boundary with the structural region and often varies with the target, so a large amount of mutation work is required to confirm it.
  • Artificial design is therefore required to construct a large-capacity variant library, and individual targets must be screened to test whether the template has sufficient potential to produce variants with sufficient structure and diversity.
  • Early stages such as the design, screening and optimization of mimetic antibody drug precursors are thus often highly dependent on the construction of large-capacity protein libraries, which is time-consuming and costly. As a result, compared with antibodies, which have hundreds of candidates in clinical trials and dozens of marketed products, mimetic antibodies have only about a dozen candidates in clinical trials and one marketed product; their development is restricted by the above technical bottlenecks.
  • Dahiyat et al. pointed out that "a protein with a length of 500 amino acids has 20^500 possible variants, and library screening methods cannot screen so many possibilities"; such methods can only "test a small portion of protein variants that may improve function" (Dahiyat BI et al., U.S. Patent No. 7,379,822). This document is incorporated herein by reference in its entirety.
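To give a sense of the scale behind the Dahiyat quotation (20 amino acid choices at each of 500 positions), the short calculation below expresses the number of variants as a power of ten and compares it with a display library. The library size of 10^10 members is an assumption chosen for illustration and is not a figure from this text.

```python
import math

residues = 500          # protein length used in the quotation
alphabet = 20           # number of standard amino acids
library_log10 = 10      # assumed library size of 10^10 members (illustrative)

log10_variants = residues * math.log10(alphabet)
digits = len(str(alphabet ** residues))  # exact integer, thanks to Python big ints

print(f"20^500 is about 10^{log10_variants:.1f} ({digits} decimal digits)")
print(f"a 10^{library_log10} library covers about 10^{library_log10 - log10_variants:.1f} of that space")
```

Under these assumptions the sequence space has 651 decimal digits, so even a very large library samples a vanishingly small fraction of it.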
  • Bes et al. found that the HCDR1 loop of a CD4 antibody can be inserted into the protein inhibitor of neuronal nitric oxide synthase, and each of the molecules thus formed has the ability to bind to CD4 (Bes C. et al., Chardes T. PIN-bodies: a new class of antibody-like proteins with CD4 specificity derived from the protein inhibitor of neuronal nitric oxide synthase. Biochem. Biophys. Res. Commun. 2006;343:334-344). This document is incorporated by reference in its entirety.
  • Kiss et al. found that grafting an antigen-binding loop region of a lysozyme antibody onto green fluorescent protein (GFP) can produce a fluorescent protein that binds to lysozyme (Kiss et al. Antibody binding loop insertions as diversity elements. Nucl. Acids Res. (2006) 34(19): e132). This document is incorporated by reference in its entirety.
  • An entire epitope of the HIV-1 C-peptide isolated from the C-terminus of the HIV-1 gp41 protein (i.e., a solvent-accessible surface region of more than about 2000 square angstroms formed by 19 non-contiguous amino acids) can be inserted into the surface of the GCN4 leucine zipper protein to form an artificial ligand whose antiviral ability is close to that of the natural ligand (Samuel K. Sia et al. Protein grafting of an HIV-1-inhibiting epitope. PNAS 2003 100(17) 9756-9761; doi: 10.1073/pnas.1733910100). This document is incorporated by reference in its entirety.
  • This one-step "protein grafting" can directly generate a mimetic antibody, skipping the steps of constructing a library and performing multiple rounds of screening and optimization. It thus provides a new approach, independent of library construction, for directly generating mimetic antibodies and quickly discovering potential new mimetic antibody template proteins.
  • Follow-up studies have also found that these proteins, which can accommodate peptide grafting, have the potential to become new mimetic antibody templates. Randomization of the corresponding sites can be used to create libraries for screening against more targets and producing mimetic antibodies.
  • Recent inventions have also directly introduced antibody CDRs to obtain mimetic antibodies, such as No. 20100322930 of Novartis. This document is incorporated by reference in its entirety.
  • The important disadvantage of the above methods is that they do not use the structural information of the template protein and offer no active, systematic way to judge the ability of a template protein to produce mimetic antibodies; instead they rely heavily on luck to find individual proteins in which a specific site or region can accommodate polypeptide replacement or insertion.
  • The present invention provides a method and a system that do not require the time-consuming and labor-intensive construction of a large-capacity protein library followed by high-throughput screening. Instead, by identifying the variable elements (also called variable regions) of the protein to be engineered and using information on known targets and non-linear polypeptide sequences, they determine the ability of any unknown protein template to produce a mimetic antibody and rapidly generate a mimetic antibody for a particular target.
  • Figure 1 depicts the general structure of the present invention.
  • The present invention analyzes the structural profile of a protein template to find continuous regions (i.e., variable elements) that deviate significantly from its structural profile; then, for example, one or more non-linear polypeptides having specific binding ability for a target are directly substituted for or inserted into these regions; finally, it is determined whether the resulting protein variant still binds the target specifically, thereby determining the ability of the protein template to produce a mimetic antibody.
  • the invention may also engineer known or unknown protein templates for any target to produce a quasi-antibody.
  • For any target, a relatively simple short-peptide library display technique is first used to screen for multiple non-linear polypeptides, which are then directly substituted for or inserted into the variable elements, identified by structural profile analysis, of one or more protein templates. Finally, among the protein variants thus obtained, a mimetic antibody that specifically binds to the target is found.
  • The advantage of the present invention is that, unlike the prior art, it bypasses the difficult operation of directly displaying a larger protein in a library: it uses a structural profile to identify relatively independent variable elements in a protein template and produces the final mimetic antibody by operations such as polypeptide replacement.
  • It also breaks through the bottleneck of the prior art, in which the variable element and its template protein are treated as an inseparable whole during screening or design, so that screening is confined to a single protein template. The invention can screen multiple structural elements (corresponding to multiple protein templates) and multiple variable elements (corresponding to multiple non-linear polypeptides) simultaneously, which reduces difficulty, increases throughput and success rate, and overcomes the problem that the variable region boundary changes with the target.
  • The function of a mimetic antibody molecule rests on a structure with some flexibility rather than on a rigid structure that is hard to change. When such molecules interact with their target molecules, they can change their structure to bind the target more efficiently. From the standpoint of antibody-like drug design, a more flexible part tolerates large-scale mutation better: it is less likely to compromise structural stability (druggability), and it more readily forms a stronger bond with the target molecule.
  • Because the protein structure information obtained with current technology is mainly a static image (such as an X-ray crystal structure or an NMR structure), it captures flexibility only on a very small scale arising from thermal stability. It is difficult to reveal the larger-scale flexibility that matters more for the design of antibody-like drugs, and even more difficult to predict the changes caused by peptide sequence replacement or insertion.
  • The present invention provides a method that uses a fully probabilistic mathematical model covering three factors: intrinsic flexibility due to thermal stability, intrinsic flexibility due to non-thermal stability, and the deviation in protein structure that can be tolerated during natural or artificial evolution. The method sets explicit model parameters and estimates them by comparing the target protein structure with the structures of other homologous proteins, so that pseudo-antibody proteins can be designed accurately and effectively by polypeptide sequence substitution or insertion.
  • the adsorbed fusion protein drug is gradually dissociated from the serum albumin, thereby maintaining the drug concentration in the blood and maintaining the drug effect for a long time.
  • Liraglutide, one of the once-daily injectable GLP-1 products, uses serum albumin targeting to extend its half-life to about 14 hours.
  • the key point of this technical route is to make the prototype peptide drug produce sufficient serum albumin targeting.
  • The existing methods are as follows: (1) Targeting is achieved by chemically modifying the prototype polypeptide (e.g., acylation, AlbuTag). The binding strength produced by this type of method is very limited: the Kd is generally on the order of µM, which essentially cannot meet the half-life requirement of weekly injection. (2) The prototype polypeptide is recombined with a serum albumin targeting polypeptide (such as the albumin affinity peptides of Genentech, Dyax, Isogenics and other companies) or with an albumin-binding Fab fragment.
  • The binding strength produced by this method is slightly better, with a Kd of several hundred nM to several µM, but the resulting drug is generally a polypeptide of 50-60 residues and is costly to produce. (3) The prototype polypeptide is recombined with a serum albumin targeting antibody (such as the dAbs of GSK/Domantis, the VHH of Ablynx and BAC, or Affibody).
  • Prototype proteins useful for the production of such artificial targeting proteins include: Staphylococcus aureus A domain protein (US5831012, EP0739353), human fibronectin (US6818418, EP1266025) and the like. The strength of the bond produced by this type of method can meet the needs of weekly or even longer injections and can be adjusted.
  • The cost of this type of GLP-1 drug can be greatly reduced by high-density fermentation. For the same production cost, productivity can be nearly 100 times that of chemical synthesis.
  • All literature cited in this paragraph is incorporated into this application by reference.
  • a method of producing a quasi-antibody without the need for bulk library construction uses five parts: Target System 100, Non-Linear Peptide (NLP) System 200, Template System 300, Design Unit 400, and Test System 500, with two input points and one output point. Depending on the purpose, it corresponds to a different logic flow.
  • the aim is to test the ability of any template protein to produce a mimetic antibody.
  • This logic flow is primarily to identify the variable regions of the template protein and does not directly care what target the antibody will bind to.
  • This logic flow starts from the template system 300. One branch passes through the target system 100 and the NLP system 200 to reach the design unit 400; the other reaches the design unit 400 directly. After the design unit 400 completes the design, the flow enters the test system 500, which outputs the conclusion with the help of the target system 100. Specifically, as shown in FIG.
  • The template system 300 includes an information acquisition unit 310 for obtaining basic information on the protein to be tested, and an analysis unit 320 for analyzing the variable element region of the protein to be tested;
  • the target system 100 includes an information acquisition unit 110 for selecting a suitable reference target, and a synthesis unit 120 for synthesizing the target;
  • the NLP system 200 includes an information acquisition unit 210 for selecting suitable non-linear polypeptide sequences, and a test screening unit 220 for screening non-linear polypeptides against the selected targets; the design unit 400 integrates the results of the target system, the NLP system and the template system to design variants of the protein to be tested;
  • the test system 500 includes a synthesis unit 510 for synthesizing the variant protein, and a test unit 520 for testing the binding ability of the variant protein to the reference target and outputting the final evaluation result.
  • Another logic flow is shown by reference numeral 2 in FIG.
  • The goal is to rapidly produce, for a particular target, the mimetic antibody that best fits that target.
  • This logic flow is centered on the target and asks which mimetic antibody binds the specific target best, so it does not directly care whether the mimetic antibody being tested is a good pseudo-antibody for most targets in general.
  • This logic flow starts from the target system 100. One branch passes through the template system 300 to reach the design unit 400; the other passes through the NLP system 200 to reach the design unit 400. After the design unit 400 completes the design, the flow enters the test system 500, which outputs the conclusion with the help of the target system 100. Specifically, as shown in FIG.
  • The target system 100 includes an information acquisition unit 110 for obtaining information on the given target, and a synthesis unit 120 for synthesizing that target;
  • the template system 300 includes an information acquisition unit 310 for obtaining information on template proteins that have, or may have, the potential to produce a pseudo-antibody, and an analysis unit 320 for analyzing the variable element region of the template protein;
  • the NLP system 200 includes an information acquisition unit 210 for selecting non-linear polypeptide sequences known to bind the given target, and a test screening unit 220 for screening non-linear polypeptides against the given target; the design unit 400 integrates the results of the target system, the NLP system and the template system to design variants of the template protein;
  • the test system 500 includes a synthesis unit 510 for synthesizing the variant protein, and a test unit 520 for testing the specific binding ability of the variant protein to the target and outputting the results.
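  • As an illustrative sketch only, the two logic flows described above can be written as small dependency graphs over the five parts. The node names and the use of Python's standard `graphlib` module are assumptions made for illustration; the patent does not specify any software representation.

```python
from graphlib import TopologicalSorter

# Each key lists the parts whose output it needs (its predecessors).
# Node names are hypothetical labels for the five parts in the text.
FLOW_1 = {  # starts from the template system 300
    "target_system_100": {"template_system_300"},
    "nlp_system_200": {"target_system_100"},
    "design_unit_400": {"nlp_system_200", "template_system_300"},
    "test_system_500": {"design_unit_400", "target_system_100"},
}

FLOW_2 = {  # starts from the target system 100
    "template_system_300": {"target_system_100"},
    "nlp_system_200": {"target_system_100"},
    "design_unit_400": {"template_system_300", "nlp_system_200"},
    "test_system_500": {"design_unit_400", "target_system_100"},
}


def run_order(flow):
    """Return one valid execution order in which every part runs after its inputs."""
    return list(TopologicalSorter(flow).static_order())


if __name__ == "__main__":
    print(run_order(FLOW_1))
    print(run_order(FLOW_2))
```

  • In both graphs the design unit 400 runs only after its two input branches, and the test system 500 always runs last, which matches the single output point described in the text.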
  • Human glucagon-like peptide-1 (GLP-1) receptor agonist polymeric drugs are an important new class of drugs for the treatment of type 2 diabetes.
  • GLP-1 receptor agonist drugs differ from previous diabetes drugs in the following unique therapeutic mechanisms and safety features: (1) the hypoglycemic effect is blood glucose concentration-dependent and is the closest to the physiological state of endocrine glucose lowering; it avoids the hypoglycemic adverse reactions of oral hypoglycemic agents and insulin, can be administered at a fixed dose, is superior to insulin in this respect, and is suitable for long-term use. (2) They protect islet β cells and promote their proliferation, and they stimulate the islet β cell response more than oral hypoglycemic agents and exogenous insulin.
  • GLP1 receptor agonist drugs have obvious advantages over traditional western medicine.
  • Specifically: (1) Blood glucose concentration-dependent hypoglycemic effect. GLP-1 receptor agonists do not cause significant clinical hypoglycemia and are indicated for patients poorly controlled by diet or sulfonylureas and for patients requiring insulin therapy. (2) They improve insulin sensitivity and islet β-cell function, and can prevent and fundamentally cure diabetes in patients and in susceptible individuals with impaired glucose tolerance (IGT). Moreover, they stimulate the response phase of β cells more than exogenous insulin does. (3) They reduce weight and control diet better than sulfonylureas, thiazolone and insulin. (4) In combination with metformin, the effect is better than metformin alone or metformin combined with glimepiride or glargine. (5) They provide cardiovascular protection: they can lower blood pressure, reduce the cardiovascular complications of diabetes, and improve the body's ability to respond to stress.
  • GLP-1 receptor agonists also reduce weight through a variety of pathways, including inhibition of gastrointestinal motility and gastric secretion, suppression of appetite and food intake, and delayed gastric emptying.
  • GLP-1 receptor agonists can also act on the central nervous system (especially the hypothalamus), resulting in a feeling of fullness and loss of appetite.
  • GLP-1 receptor agonists have many other biological properties and functions. For example, they may exert lipid-lowering and antihypertensive effects, thereby protecting the cardiovascular system, and they may protect nerves by acting on the central nervous system to enhance learning and memory.
  • The GLP-1 receptor agonist Exenatide (a 39 amino acid peptide), synthesized by Amylin Pharmaceuticals of the United States, was launched in 2005, and its long-acting sustained-release preparation was launched in 2012.
  • GLP-1 receptor agonist products in the clinical stage include: GLP1-Fc, LY548806 and GLP1-PEG of Eli Lilly/Amylin; Semaglutide of Novo Nordisk; PC-DAC of ConjuChem; GSK's Albiglutide; Roche/Ipsen's Taspoglutide; Aventis/Zealand Pharma's Lixisenatide; Intarcia's ITCA650; and domestic Haosen's GLP1-PEG.
  • Although GLP-1 receptor agonist drugs have great market potential, the drugs on the market require daily injections and have a high incidence of adverse reactions.
  • Introducing a polymer or protein molecule about 10 times larger than the GLP-1 receptor agonist polypeptide leads to loss of drug activity.
  • Moreover, most of the products under development require complex chemical synthesis, and the annual cost per patient is about 20,000 to 30,000 yuan. This is more than five times the annual per-patient drug expenditure of Chinese diabetic patients (4,000 yuan) and 10 times the cost of insulin injection. Even generic versions are limited by the production process, and their expected price is more than 10,000 yuan.
  • The GLP-1 drugs on the market and under development can therefore meet the needs of only a "high-end market" of no more than 100,000 people, just 0.1% of Chinese type 2 diabetes patients. There is thus an urgent need for new long-acting genetically engineered GLP-1 drugs that are more convenient to use and less expensive.
  • The present invention provides a series of polymeric drugs based on GLP-1 receptor agonists combined with serum albumin targeting polypeptides. The polymer is characterized in that it comprises: an amino acid sequence shown in any of SEQ ID NO: 25, 26 and 43, or a sequence similar to one of them, which activates the GLP-1 receptor; and an amino acid sequence shown in any of SEQ ID NO: 27-35, or a sequence similar to one of them, which binds serum albumin in a targeted manner.
  • the polymeric drug described above may also include a linker molecule.
  • The main purpose of the linker molecule is to separate the two parts (the GLP-1 receptor activating moiety and the serum albumin targeting polypeptide moiety) by a certain distance in space, so that each can better exert its biological effect.
  • The chemical composition of the linker molecule is not critical; only its size affects the separation (i.e., the final biological function).
  • the linker molecule can be a non-polypeptide or polypeptide.
  • Non-polypeptide linking molecules can be natural or non-natural.
  • The non-polypeptide linking molecule can be, but is not limited to, polyethylene glycol, polypropylene glycol, (ethylene/propylene) copolymerized ethylene glycol, polyoxyethylene, polyurethane, polyphosphazene, polysaccharide, dextran, polyvinyl alcohol, polyvinylpyrrolidone, polyvinyl ethyl ether, polyacrylamide, polypropylene, polycyano, lipid polymer, chitin, hyaluronic acid and heparin.
  • the amino acid in the linker molecule of the polypeptide may be any amino acid, both natural and non-natural, and may be a D amino acid or an L amino acid.
  • the linker molecule of the polypeptide may be SEQ ID NO: 36-42 or a sequence similar to SEQ ID NO: 36-42.
  • the amino acid sequence of GLP-1 includes the following mutants in addition to SEQ ID NO: 26 above:
  • His1 GLP-1 modified mutants, specifically including: deamino-GLP-1, (D-His1) GLP-1, sorbitol-GLP-1, N-imidazole-GLP-1, N-α-methyl-GLP-1, N-methyl-GLP-1, N-acetyl-GLP-1 and N-pyroglutamyl-GLP-1;
  • Ala2 GLP-1 mutants, specifically including: (D-Ala2) GLP-1, (Gly2) GLP-1, (Ser2) GLP-1, (Aha2) GLP-1, (Thr2) GLP-1, (Aib2) GLP-1, (Abu2) GLP-1 and (Val2) GLP-1;
  • Glu3 GLP-1 mutants, specifically including: (Asp3) GLP-1, (Ala3) GLP-1, (Pro3) GLP-1, (Phe3) GLP-1, (Lys3) GLP-1 and (Tyr3) GLP-1;
  • a mutant, KGLP-1, with a lysine residue added at the N-terminus of GLP-1.
  • The invention also provides a series of nucleic acid molecules encoding the polypeptide and fusion protein.
  • the polypeptides and fusion proteins of the invention can be produced by chemical synthesis or genetic engineering recombinant expression.
  • the recombinant expression of the gene is generally preferred, as follows:
  • the nucleic acid encoding the molecule is inserted into an expression vector.
  • Expression control sequences include, but are not limited to, promoters, signal sequences, enhancer elements, and transcription termination sequences.
  • expression vectors are typically replicated in the host as part of the episome or host chromosomal DNA.
  • the expression vector contains a selectable marker (e.g., ampicillin resistance, tetracycline resistance, etc.) to detect those host cells expressing the desired DNA sequence.
  • Hosts include, but are not limited to, E. coli, yeast, and the like.
  • polypeptide and fusion protein of the present invention can be purified according to standard methods in the art, including ammonium sulfate precipitation, affinity column, column chromatography, HPLC purification, gel electrophoresis, and the like.
  • a substantially pure product of at least about 90-95% purity is preferred.
  • the polypeptide or fusion protein of the invention may be combined with one or more pharmaceutically acceptable excipients to form a pharmaceutical composition.
  • excipients include: water soluble fillers, pH adjusters, stabilizers, water for injection, osmotic pressure regulators, and the like.
  • The pharmaceutical composition can be administered by intramuscular, intravenous or subcutaneous injection, among other routes, and the preferred dosage form is a lyophilized powder or a solution for injection.
  • the water-soluble filler adjuvants include, but are not limited to, one or a combination of mannitol, low molecular weight dextran, sorbitol, polyethylene glycol, glucose, lactose, galactose, and the like.
  • The pH adjusting agent includes, but is not limited to, phytic acid, phosphoric acid, hydrochloric acid, potassium, sodium or ammonium hydroxide, sodium, potassium or ammonium salts, sodium, potassium or ammonium hydrogencarbonate, and physiologically acceptable organic or inorganic acids, bases and salts, used alone or in combination.
  • The stabilizers include, but are not limited to, one or a combination of: EDTA-2Na, sodium thiosulfate, sodium metabisulfite, sodium sulfite, dipotassium hydrogen phosphate, sodium hydrogencarbonate, sodium carbonate, arginine, glutamic acid, polyethylene glycol, sodium dodecyl sulfate, tris(hydroxymethyl)aminomethane and the like.
  • the osmotic pressure adjusting agent includes, but not limited to, a combination of one or more of sodium chloride, potassium chloride and the like.
  • the pharmaceutical compositions of this invention may also be administered in combination therapy, i.e., in combination with other agents.
  • a combination therapy can include a composition of the invention along with at least one or more additional therapeutic agents, such as anti-inflammatory agents, anticancer drugs, and chemotherapeutic agents.
  • The potential of a given protein (PDB ID: 1x5j) to produce a quasi-antibody using the present invention is illustrated below with a specific example.
  • This embodiment starts from the template system 300.
  • The unit collects the known data on the 1x5j protein, including but not limited to its primary sequence information, secondary structure information, tertiary structure information, production information (such as production process and expression efficiency) and functional information (such as subcellular localization and enzyme stability).
  • Common methods include: Database query and document mining.
  • The primary sequence of 1x5j, obtained by querying the SCOP database, is SEQ ID NO 1.
  • KPNTLYEFSVMVTKGRRSSTWSMTAHGTTFEL SEQ ID NO 1
  • This unit also collects information on other proteins whose sequences are similar to that of the 1x5j protein.
  • a common method is database query.
  • The information acquisition unit 310 used the position-specific iterated BLAST (PSI-BLAST) algorithm to search the SWISS-PROT database and collected 301 other proteins with sequences similar to that of the 1x5j protein.
  • PSI-BLAST and BLAST are sequence database search algorithms commonly used in the art (Altschul et al., Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl. Acids Res.
  • BLAST is the abbreviation of Basic Local Alignment Search Tool.
  • First, the sequence of the target protein is used as the query sequence, and the SWISS-PROT database is searched with the BLAST algorithm to obtain alignments with a number of similar sequences, from which a position-specific score matrix is built.
  • The score matrix is then used as the query, and the BLAST algorithm is used to search the SWISS-PROT database again to find new similar protein sequences and update the score matrix.
  • Building and updating the position-specific score matrix can be carried out as described in the Altschul 1997 article. The process is repeated until no new similar sequences are found.
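  • To make the idea of a position-specific score matrix concrete, here is a minimal sketch that builds log-odds scores from a set of equal-length, gap-free hit sequences and scores a candidate against them. It is a deliberately simplified stand-in, not the PSI-BLAST procedure of the Altschul 1997 article: the uniform background frequency, the pseudocount value, and the function names `build_pssm` and `score` are all assumptions for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Build per-position log-odds scores from equal-length, gap-free sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequency
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(column)
    return pssm


def score(pssm, seq):
    """Sum the position-specific scores of a sequence of matching length."""
    return sum(column[aa] for column, aa in zip(pssm, seq))


if __name__ == "__main__":
    hits = ["KPNT", "KPNS", "KPQT"]  # hypothetical aligned hits
    matrix = build_pssm(hits)
    print(score(matrix, "KPNT"), score(matrix, "WWWW"))
```

  • In an iterative search, newly found sequences would be added to `hits` and the matrix rebuilt, stopping once a round adds nothing new.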
  • This unit also collects information on other proteins whose structures are similar to that of the 1x5j protein.
  • By querying the SCOP database, the information acquisition unit 310 obtained five other human proteins (SCOP identifiers: d1x5fa1, d1x5ha1, d1x5ka1, d1x5ga1 and d1x5ia1) that are structurally similar to 1x5j.
  • The SCOP database is a protein structure classification database that describes the structural and evolutionary relationships between proteins of known structure, covering all entries in the structural database PDB. Its classification is obtained mainly by manual inspection and comparison, and its levels include family, superfamily, fold and so on. The CATH database is a similar resource.
  • The unit analyzes the sequence profile of 1x5j and the proteins with similar sequences, finds rapidly evolving sites in the 1x5j protein, and scores them.
  • The general procedure is: perform a multiple sequence alignment of the above sequences, build a phylogenetic tree, calculate the evolution rate of each site according to a specific molecular evolution model, and score the sites.
  • Common methods for multiple sequence alignment include the CLUSTAL algorithm (reference: Larkin et al., Clustal W and Clustal X version 2.0. Bioinformatics (2007) 23(21): 2947-2948; this article is incorporated into this application by reference in its entirety).
  • Common methods for building phylogenetic trees include: the Neighbor-Joining algorithm, the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) algorithm, the Minimum Evolution (ME) algorithm, the Maximum Parsimony (MP) algorithm, the Maximum Likelihood (ML) algorithm, Bayesian algorithms, etc.
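  • As a rough illustration of scoring sites by variability, the sketch below computes the Shannon entropy of each column of a multiple sequence alignment. This is a simple proxy chosen for illustration and is an assumption: it is not the tree-based evolution-rate calculation under a molecular evolution model that the text describes, and the example alignment is hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gap characters."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    return -sum((k / n) * math.log2(k / n) for k in Counter(residues).values())


def site_scores(alignment):
    """Return one variability score per column of an equal-length alignment."""
    return [column_entropy(col) for col in zip(*alignment)]


if __name__ == "__main__":
    alignment = ["KPNT", "KPQT", "KPET", "KP-T"]  # hypothetical aligned sequences
    for position, value in enumerate(site_scores(alignment), start=1):
        print(position, round(value, 3))
```

  • Columns with higher entropy vary more across the aligned sequences and would be the first candidates to examine as rapidly evolving sites.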
  • This unit analyzes the structural profile of the 1x5j protein and the other proteins with similar structures, and finds the variable elements of the 1x5j protein.
  • The structural profile can be composed of the three-dimensional Euclidean coordinates of all atoms of the protein, of the Cα atoms, or of other types of atoms.
  • The Cα structural profile of any set of protein structures is described using hidden Markov models.
  • Hidden Markov models are mathematical models commonly used in the field. They are widely used to describe the randomness and underlying structure of incomplete data, particularly in describing protein sequence or structure profiles.
  • Reference: Eddy, Profile Hidden Markov Models.
  • The hidden Markov model used to describe the protein structural profile has n nodes.
  • Each node has three states: M, D, and I.
  • the M state of the kth node can only be transferred to the M or D state of the k+1th node, or the I state of the kth node; the D or I state of the kth node can only be to the kth
  • This state transition probability matrix is set to an unknown parameter, but does not depend on the sequence number of the node in which it is located.
  • The M state of each node k corresponds to an unknown emission probability distribution (a three-dimensional normal distribution) with expected value parameters $(x_k, y_k, z_k)$ and variance parameter $\sigma_j^2$, from which three-dimensional spatial coordinates are generated; in particular, $\sigma_j^2$ is defined separately for each protein structure $j$ and does not vary with node position.
  • The I states of all nodes share a single unknown three-dimensional Gaussian probability distribution, with expected value parameters $(x, y, z)$ and variance parameter $\sigma^2$, which likewise generates three-dimensional spatial coordinates.
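  • The emission step described above can be sketched as the log-density of a three-dimensional Gaussian. The sketch assumes an isotropic covariance (one variance shared by the three axes), which is how the single variance parameter in the text is read here; the function names and the numeric values in the example are hypothetical.

```python
import math


def log_emission(point, mean, variance):
    """Log-density of an isotropic 3D Gaussian at `point` (assumed covariance: variance * I)."""
    squared_distance = sum((p - m) ** 2 for p, m in zip(point, mean))
    return -1.5 * math.log(2 * math.pi * variance) - squared_distance / (2 * variance)


def more_likely_state(point, match_mean, match_var, insert_mean, insert_var):
    """Compare emission likelihoods only; transition probabilities are ignored here."""
    match_ll = log_emission(point, match_mean, match_var)
    insert_ll = log_emission(point, insert_mean, insert_var)
    return "M" if match_ll >= insert_ll else "I"


if __name__ == "__main__":
    node_mean = (0.0, 0.0, 0.0)  # hypothetical M-state mean for one node
    print(more_likely_state((0.5, 0.0, 0.0), node_mean, 1.0, node_mean, 100.0))
    print(more_likely_state((10.0, 0.0, 0.0), node_mean, 1.0, node_mean, 100.0))
```

  • A coordinate close to a node's mean is better explained by the tight M-state distribution, while a coordinate far from it is better explained by the broad I-state distribution, which is the intuition behind flagging poorly fitting stretches as variable.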
  • The three-dimensional structure of any protein can be regarded as a random three-dimensional array of points generated by: a random path (A) that follows a certain graph (G), random variables (Y) emitted according to certain emission probabilities, and a certain rotation (R) and translation (V) operation. The joint or conditional probabilities involved in this stochastic process can be inferred with the Forward or Viterbi algorithm known in the art.
  • The question above, of how the Great Wall runs and how each stone was thrown, is analogous to the problem of finding a variable region in a protein template.
  • The imaginary Great Wall is the structural profile.
  • The five colors of stone are like the proteins in the protein group (assuming five proteins). All that can be seen today are the five proteins (the stones of five colors); the question is how these five proteins came to be as they are today.
  • The beacon towers passed while climbing along the imaginary Great Wall are like the nodes of the hidden Markov model above. Gently tossing a stone is like the M (Match, homologous) state of a node. Throwing a stone hard is like the I (Insert, random space) state of a node.
  • Skipping a beacon tower is like the D (Delete) state of a node.
  • The variable regions of 1x5j identified by the above methods lie between amino acids 32 and 43, between amino acids 55 and 58, and between amino acids 90 and 93.
  • Information on target proteins suitable for the evaluation of 1x5j can be obtained by unit 110.
  • Unit 120 is used for the synthesis and purification of targets, including the expression and purification of full-length target proteins, of target-protein-specific fragments, or of target proteins expressed on stably or transiently transfected cell lines, by methods known in the art.
  • Serum albumin from human, mouse and rabbit was selected as the target and purchased from Sigma-Aldrich.
  • NLP nonlinear polypeptide
  • Common methods include database searches and literature searches, and simple secondary design by adding or deleting residues can be performed on that basis.
  • NLP sequences that meet the search criteria include, but are not limited to, CDR sequences of antibodies against a given target, non-linear polypeptide sequences with targeted binding ability obtained by natural or artificial selection, and binding-site sequences of known ligands.
  • Where the information obtained by unit 210 is insufficient or incomplete, or where otherwise necessary, the screening unit 220 screens non-linear polypeptides against the given target; methods include phage display, mRNA display and the like.
  • NLP polypeptide screening was performed using an M13 phage peptide library (Ph.D.-C7C phage library, New England Biolabs).
  • The Ph.D.-C7C phage display peptide library is a combinatorial library constructed by fusing a random heptapeptide to the M13 phage minor capsid protein (pIII).
  • The displayed random peptide is flanked by a cysteine (Cys) on each side. Under non-reducing conditions, the two cysteines spontaneously form a disulfide bond that cyclizes the displayed peptide.
  • Disulfide-constrained heptapeptide libraries of this kind have been shown to be useful for identifying structural epitopes and mirror-image ligands of D-amino acid target molecules, and for developing peptide-based therapeutic drugs.
  • The peptide expressed from the library sits at the tip of the phage minor capsid protein. The first cysteine is preceded by an alanine residue, and the second cysteine is joined to the wild-type pIII sequence through the short linker peptide Gly-Gly-Gly-Ser.
  • The peptide library consists of about 10^9 different clones, amplified once to give a phage library with approximately 100 copies of each sequence per 10 µl.
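  • The display format described above (an alanine, a cysteine, seven random residues, a second cysteine, then the Gly-Gly-Gly-Ser linker) suggests a simple way to pull the random heptapeptide out of a translated sequencing read. The sketch below is illustrative only: the function name and the example reads are hypothetical, and a real read would first need translation in the correct frame.

```python
import re

# Ala-Cys, seven random residues, Cys, then the Gly-Gly-Gly-Ser linker.
DISPLAY_PATTERN = re.compile(r"AC([ACDEFGHIKLMNPQRSTVWY]{7})CGGGS")


def extract_heptapeptide(translated_read):
    """Return the seven displayed residues, or None if the expected frame is absent."""
    match = DISPLAY_PATTERN.search(translated_read)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_heptapeptide("ACKPNTLYECGGGSAET"))  # hypothetical read with a 7-residue insert
    print(extract_heptapeptide("ACKPNTLYCGGGS"))      # hypothetical read with only 6 residues
```

  • Reads that do not match the pattern, for example because of a frameshift or a truncated insert, return `None` and can be set aside before any binding analysis.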
  • HSA Human serum albumin
  • the phage concentration can be assessed by the E. coli ER2738 strain titration test.
  • the eluted phage was propagated by the E. coli ER2738 strain.
  • The overnight cell culture was diluted 1:100 in LB medium, and 1 ml of the diluted culture was dispensed into each culture tube. Using one end of a sterile wooden stick, a blue plaque was picked from a plate bearing fewer than about 100 plaques, transferred to a tube containing the diluted culture, and incubated at 37 °C for 5 hours with shaking.
  • the culture solution was placed in a microcentrifuge tube and centrifuged at 0.01 rpm for 10 minutes.
  • The supernatant contained a large amount of amplified phage particles. The upper 80% of the supernatant was removed and stored at 4 °C, where it can be kept for several weeks without a change in titer.
  • Phage particles randomly selected in step 4 were sequenced using the universal primer 96gIII (e.g., 5'-CCCTCATAGTTAGCGTAACG-3'), and ELISA assays with rabbit anti-M13 phage antibodies were used to assess their binding to HSA. A 1/6 volume of PEG/NaCl solution was added to the amplified phage supernatant for overnight precipitation, followed by centrifugation at 12,000 rpm for 10 minutes.
  • A row of ELISA plate wells was coated with 200 µL of HSA at a concentration of 100 µg/ml, diluted in 0.1 M NaHCO3, and kept in an airtight humidified box at 4 °C overnight.
  • Another well plate was loaded with serially diluted phage. Both plates were blocked with 1% casein dissolved in 0.1% NaHCO3.
  • The phage were diluted in 200 µl/well of TBS containing 0.1% Tween-20.
  • The first well contained 10^12 virions, and the last (12th) well contained 2.4 × 10^5 virions.
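  • The text gives only the first and last wells of the dilution series, not the dilution factor. The short sketch below infers the factor from those two endpoints and rebuilds the series; that the steps are evenly spaced is an assumption, and the function names are hypothetical. Under that assumption the endpoints are consistent with a roughly four-fold dilution per well.

```python
def infer_fold(first, last, wells):
    """Per-step dilution factor implied by the first and last wells of an even series."""
    return (first / last) ** (1.0 / (wells - 1))


def dilution_series(first, wells, fold):
    """Virions per well for a constant-fold serial dilution."""
    return [first / fold ** i for i in range(wells)]


if __name__ == "__main__":
    fold = infer_fold(1e12, 2.4e5, 12)
    print(round(fold, 2))
    for well, virions in enumerate(dilution_series(1e12, 12, 4), start=1):
        print(well, f"{virions:.2e}")
```

  • With an exact four-fold step, the 12th well holds 10^12 / 4^11 virions, which rounds to the 2.4 × 10^5 stated above.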
  • Each row of phage dilutions was transferred to the HSA-coated plate using a multichannel pipette.
  • the plate was incubated for 1 hour at room temperature with shaking and then washed with a TBS solution containing 0.1% Tween-20.
  • The plate was then incubated with rabbit anti-M13 phage antibody, and bound phage was detected with horseradish peroxidase-conjugated goat anti-rabbit IgG.
  • The amount of bound horseradish peroxidase was measured as the absorbance at 405 nm after color development with the substrate ABTS/H2O2. Each sample was measured 5 times. No phage was added to the control group.
  • the background absorbance at 405 nm should be subtracted from each reading.
  • the absorbance value at 405 nm in the ELISA assay correlates with the amount of bound phage. Phage clones with significantly enhanced absorbance compared to the control group were picked and sequenced to obtain the NLP polypeptide sequence contained therein (partially shown in Table 2).
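  • The readout described above (replicate absorbance readings, background subtraction, comparison with a no-phage control) can be sketched as follows. The two-fold cut-off used to call a clone "significantly enhanced" is an assumption, since the text gives no numeric criterion, and the clone names and absorbance values are invented for illustration.

```python
from statistics import mean


def corrected_signal(readings, background):
    """Mean of replicate A405 readings minus the background absorbance."""
    return mean(readings) - background


def select_binders(samples, control, background, fold=2.0):
    """Names of clones whose corrected signal exceeds `fold` times the control signal.

    The fold threshold is an assumed criterion, not one stated in the source.
    """
    control_signal = corrected_signal(control, background)
    return sorted(
        name
        for name, readings in samples.items()
        if corrected_signal(readings, background) > fold * control_signal
    )


if __name__ == "__main__":
    samples = {  # hypothetical readings, five replicates per clone
        "clone_A": [1.2, 1.1, 1.3, 1.2, 1.2],
        "clone_B": [0.16, 0.15, 0.17, 0.15, 0.16],
    }
    control = [0.15, 0.14, 0.16, 0.15, 0.15]
    print(select_binders(samples, control, background=0.05))
```

  • A statistical test across the five replicates would be a more rigorous criterion than a fixed fold change; the fold threshold is used here only to keep the sketch short.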
  • All or part of the NLP polypeptide sequence obtained in the above step is inserted into, or substituted for part of, the sequence of the target protein 1x5j to modify it.
  • the analysis results of the sequence spectrum and the structure spectrum are first integrated to determine the position at which the nonlinear polypeptide sequence can be inserted.
  • The three circled regions (A, B, C) are the variable elements identified by analyzing the 1x5j protein structural profile with the methods disclosed in this application.
  • the region having a small spherical structure in Fig. 6 is a non-conserved sequence identified after the protein sequence analysis.
  • Such structural information and sequence information are both utilized for analyzing and identifying variable elements.
  • the methods disclosed in this application do not necessarily use sequence information. Using only the structural information of the target protein (i.e., the template protein) itself, the variable elements can also be analyzed and identified using the methods disclosed in this application.
  • Variable elements A and C also contain non-conserved sequences, which evolve rapidly, so they can be used for NLP modification.
  • The non-linear polypeptide sequences average about 10-20 residues, far exceeding the size of variable element C and close to the size of variable element A. It is therefore preferred to transplant all or part of the NLP sequence into variable element A.
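  • The size argument above can be illustrated with a small sketch that picks the variable element whose length is closest to that of the peptide to be grafted. Mapping the labels A, B and C to the three residue ranges reported earlier, in that order, is an assumption, as are the function names; the text itself only says that C is much smaller than a typical NLP and A is close to it.

```python
# Assumed mapping of labels to the residue ranges reported for 1x5j (inclusive).
VARIABLE_ELEMENTS = {"A": (32, 43), "B": (55, 58), "C": (90, 93)}


def element_length(span):
    """Number of residues in an inclusive (start, end) range."""
    start, end = span
    return end - start + 1


def best_fit(elements, nlp_length, candidates=None):
    """Name of the candidate element whose length is closest to the peptide length."""
    names = candidates if candidates is not None else list(elements)
    return min(names, key=lambda name: abs(element_length(elements[name]) - nlp_length))


if __name__ == "__main__":
    # Only A and C are considered, since the text notes both contain non-conserved sequence.
    for length in (10, 15, 20):
        print(length, best_fit(VARIABLE_ELEMENTS, length, candidates=["A", "C"]))
```

  • Length is only one consideration; the text also weighs sequence conservation, which is why the example restricts the candidates to A and C.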
  • Common operations include adding to, deleting from or converting the N-terminal or C-terminal sequence into one suitable for expression in the host, replacing residues in the junction regions that join secondary structures with residues having shorter side chains, such as serine, and replacing cysteine with serine or the like.
  • the primary objective is to reduce the hydrophobicity of the altered region, thereby increasing the solubility and other properties of the variant protein.
  • the above-designed protein variants are synthesized by the synthesis unit 510, and the usual methods include chemical synthesis, enzymatic digestion, bioreactor expression, and the like.
  • The prokaryotic E. coli expression method was used to produce the target protein as follows:
  • The DNA sequence encoding the constructed protein can be obtained by artificial synthesis and PCR.
  • The present invention employs whole-gene synthesis to prepare the full-length double-stranded DNA of a protein variant.
  • HIS and FLAG tags are added to the N-terminus of the variant, and the fusion protein sequence is:
  • The coding DNA synthesized in this example has the following characteristics: the 5' end carries an NcoI cleavage site for joining to the 3' end of the expression vector pET28a; the 3' end carries an XhoI cleavage site for joining to the 5' end of pET28a.
  • the product was double-digested (NcoI/XhoI) and purified.
  • Using restriction digestion and cohesive-end ligation methods conventional in the art, the template DNA and the expression plasmid are digested separately and then joined with DNA ligase to obtain the desired expression vector, as shown in Fig. 7.
  • the expression vector pET28a was double-digested (NcoI/XhoI) and then ligated with the purified product in the above step.
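The cloning design described above (an NcoI site at one end and an XhoI site at the other end of the synthesized coding DNA) can be sanity-checked in silico before a gene is ordered. The following is a minimal sketch under stated assumptions: the function names and the toy insert are hypothetical and do not come from the patent; only the recognition sequences CCATGG (NcoI) and CTCGAG (XhoI) are standard.

```python
# Standard recognition sequences; everything else here is illustrative.
NCOI_SITE = "CCATGG"  # NcoI
XHOI_SITE = "CTCGAG"  # XhoI


def find_sites(seq: str, site: str) -> list:
    """Return every 0-based start position of `site` in `seq` (overlaps allowed)."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def check_insert(seq: str) -> dict:
    """Check that an insert starts with NcoI, ends with XhoI, and has no internal copies."""
    seq = seq.upper()
    last = len(seq) - len(XHOI_SITE)  # expected start of the terminal XhoI site
    ncoi = find_sites(seq, NCOI_SITE)
    xhoi = find_sites(seq, XHOI_SITE)
    internal = [p for p in ncoi if p != 0] + [p for p in xhoi if p != last]
    return {
        "ncoi_at_5_end": 0 in ncoi,
        "xhoi_at_3_end": last in xhoi,
        "internal_sites": sorted(internal),
    }


# Hypothetical toy insert, NOT a sequence from the patent.
toy_insert = "CCATGG" + "GCTAGCAAAGGT" + "CTCGAG"
report = check_insert(toy_insert)
print(report)
```

An insert with a non-empty `internal_sites` list would be cut into fragments by the NcoI/XhoI double digest, so such a design would normally be recoded before synthesis.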
  • The ligation product was transformed into DH5α competent cells by heat shock.
  • Heat shock transformation is a conventional technique in the art. The plasmid is then extracted and measured.
  • the expression vector encoding the complete fusion protein was transformed into BL21 (DE3) competent cells for expression.
  • Expression of the fusion protein in the present invention uses the lac-promoter-based E. coli fusion protein expression technique conventional in the art, with IPTG used to induce production of the fusion protein.
  • BL21 (DE3) competent cells containing the fusion protein expression vector were precultured overnight in LB medium containing kanamycin.
  • The overnight culture was diluted 1:100 with fresh LB medium containing kanamycin and grown at 37 °C until the OD600 reached 0.6.
  • The temperature was lowered to 25 °C, and expression of the fusion protein was induced with IPTG.
  • the vector sequence is: (SEQ ID NO 14)
  • CCCCGCCC GCCGGCG GCGGGG GCGCGGGGCC2CCTATA ATAMTTAATAAAAAATA TAT- GCGGGCC C2C2CCGGCG C3 ⁇ 4CGC3 ⁇ 4CGGCGGCG GC TTAATTMT TTT TTTATAAATAT-
  • GC2CG2CCGCCGC CCCCC ⁇ CCCCGTATTT TTATT TAAAAATAAA ⁇ TATAA- CCCGCGGGCC GGGCC GC3 ⁇ 4GGG CGCCGCC2CCG AATAT TATAAATATTTAATTA TAATT
  • G3 ⁇ 4GGC GGCGCCC GGGG3 ⁇ 4 CCGGGG3 ⁇ 4G3 ⁇ 4 GCC3 ⁇ 4ATATATAAAAAAATTA TAATAAAAA
  • the fusion protein of the invention can be purified by standard protein isolation and purification techniques. For example, crude purification of the fusion protein is carried out based on the tag of the fusion protein expressed in vivo.
  • The purified fusion protein can be concentrated to the required concentration by ion exchange, ultrafiltration, or the like. After 16 hours of fusion protein expression, the bacterial culture was centrifuged, and the cells were collected and disrupted by sonication 10 times for 15 seconds each. The lysate was centrifuged at high speed, and the supernatant was passed through a nickel column. The 6×HIS-tagged fusion protein was captured on the nickel column and eluted with an imidazole concentration gradient. The eluted protein was collected, dialyzed, concentrated, and UV-sterilized.
  • In test unit 520, the ability of the protein variant to bind to the target protein is tested.
  • Common techniques include ELISA, FACS (fluorescence-activated cell sorting), SPR (surface plasmon resonance), and the like, the details of which are well known in the art.
  • An indirect ELISA method was used, as follows: wells were coated with a 5 μg/ml solution of the target protein (human serum albumin) prepared in 40 mM NaHCO3 (pH 9.5) and incubated overnight at 4 °C; for each sample, one well was left uncoated with the target protein as a blank control. The coating solution was poured off, and the wells were washed once with deionized water and once with PBS.
  • Phage library display was performed as follows. 50 μl/well of Anti-V5-labeled antibody solution diluted 1000-fold with coating buffer (50 mM NaHCO3, pH 9) was added to a MaxiSorp microplate, while the negative control wells were coated with antibody-free coating buffer. Incubate for 1 hour at room temperature in a humidified box (or overnight at 4 °C). Wash once with TBST. Add 200 μl of blocking buffer and incubate overnight at 4 °C in a humidified box or for 1 hour at room temperature. Wash once with TBST.
  • Phagemid samples diluted with TBST were added (sample concentrations were 10^6, 10^7 and 10^8 phagemids/well, respectively). Incubate for 40 minutes at room temperature on a rotary shaker. Wash 5 times with TBST. Add 50 μl of HRP-labeled anti-phage antibody solution (diluted 2500-fold with TBST/BSA) and incubate for 40 minutes at room temperature on a rotary shaker. Wash 5 times with TBST and twice with TBS. Add 50 μl of 1-Step Turbo TMB-ELISA chromogenic solution (Pierce) and incubate until blue appears. The reaction was terminated by adding 50 μl of 2 M H2SO4 using a filter pipette tip. The absorbance at 450 nm was measured using a microplate reader.
  • TBS: 50 mM Tris, 150 mM NaCl, pH 7.5.
  • Blocking solution: TBS containing 0.5% BSA.
  • TBST: TBS buffer containing 0.1% Tween 20.
  • TBST/BSA: TBST containing 1 mg/ml BSA.
  • The optimal conditions found after exploratory testing were: an IPTG concentration of 0.2 mM, a baffled flask, and a 20 ml culture volume. Under these conditions, the phage titer was 1 × 10^7/ml.
  • Example 3 Generation of non-antibody targeting proteins for a given target (human serum albumin)
  • the goal is to generate a protein binding molecule for a given target.
  • Three non-antibody proteins, 1fna, 1hms, and 1klg, were selected as template proteins by the information collection unit 310 of the template system, and information on proteins with similar sequences or similar structures was obtained for each.
  • 1fna is a non-antibody protein template known to be capable of producing quasi-antibodies (US6818418, which is incorporated into this application by reference) and has been studied thoroughly over the past ten years.
  • A novel 1fna-based targeting protein that binds human serum albumin (described in U.S. Patent Application Serial Nos. 13/098,851 and 12/989,494, the entire disclosures of which are incorporated into this application by reference) was obtained by a large-scale phage library display method.
  • This example discloses the use of the present invention to produce new, artificially engineered protein binding molecules targeted to human serum albumin.
  • 1hms is a protein template known to have some ability to produce quasi-antibodies. Patent application CN201210186485.9 (the entire disclosure of which is hereby incorporated by reference) discloses a specific sequence of a human serum albumin binding molecule based on this protein, together with some of its variable regions. This example discloses the technical details of obtaining such novel binding molecules using the present invention, as well as additional variable regions. Finally, 1klg is a non-antibody protein whose ability to produce quasi-antibodies is unknown.
  • variable elements of the three template proteins are identified by analysis unit 320.
  • One identified variable region of 1fna lies between amino acids 72 and 81; some of the variable regions of 1klg (SEQ ID NO: 16) lie between amino acids 10 and 15 and between amino acids 45 and 68.
  • Some of the variable regions of 1hms lie between amino acids 12 and 38, between amino acids 67 and 71, between amino acids 86 and 91, and between amino acids 96 and 101.
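The variable regions listed above can be held as residue ranges, and replacing a region with an NLP sequence then becomes a simple string operation. The sketch below is an illustration only: it assumes the quoted numbers are 1-based, inclusive residue positions, and the `graft` helper, the placeholder template and the placeholder NLP are all hypothetical rather than sequences or methods from the patent.

```python
# Residue ranges as quoted in the text, read here as 1-based inclusive positions (assumption).
VARIABLE_REGIONS = {
    "1fna": [(72, 81)],
    "1klg": [(10, 15), (45, 68)],
    "1hms": [(12, 38), (67, 71), (86, 91), (96, 101)],
}


def graft(template: str, region: tuple, nlp: str) -> str:
    """Replace residues start..end (1-based, inclusive) of `template` with `nlp`."""
    start, end = region
    if not 1 <= start <= end <= len(template):
        raise ValueError(f"region {region} lies outside a template of length {len(template)}")
    return template[: start - 1] + nlp + template[end:]


# Placeholders only: NOT the real 1fna sequence and NOT an NLP from Table 2.
toy_template = "G" * 94
toy_nlp = "CAYWLTC"

variant = graft(toy_template, VARIABLE_REGIONS["1fna"][0], toy_nlp)
print(len(variant))  # 94 - 10 replaced residues + 7 grafted residues
```

Because the grafted NLP need not have the same length as the region it replaces, residue numbering downstream of the graft shifts; any second graft into the same variant would have to account for that offset.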
  • The operation of target system 100 and NLP system 200 was similar to that in Example 1, and the NLP polypeptide sequences shown in Table 2 were output to design system 400.
  • A plurality of protein variants designed by design system 400 were synthesized by experimental system 500.
  • Partial results of testing the binding to serum albumin of the fusion proteins formed from the three protein templates and the NLP polypeptides are shown in Fig. 9, where HSA is human serum albumin, BSA is bovine serum albumin, and NaHCO3 is the negative control.
  • the protein variants produced in this example with human serum albumin targeted binding ability are as follows:
  • A large-capacity phage library was constructed by randomly mutating the variable regions and other specific residues.
  • A single colony of CJ236 containing the target plasmid was picked from a freshly prepared plate, inoculated into 1 ml of 2×YT medium containing 100 μg/ml ampicillin, and cultured at 37 °C for 6 hours with shaking (until the culture became cloudy).
  • Helper phage M13K07 (about 20 μl) was added to a final titer of 10^10 pfu/ml in the culture, which was incubated for 10 minutes, then transferred to 30 ml of preheated 2×YT containing 100 μg/ml ampicillin and 0.25 μg/ml uridine and cultured overnight at 37 °C with vigorous shaking.
  • The culture was transferred to a sterile 50 ml centrifuge tube and centrifuged at 8000 rpm for 10 minutes at 4 °C. The supernatant was then transferred to a new sterile 50 ml centrifuge tube containing 6 ml of 20% PEG8000/2.5 M NaCl and mixed well. Leave at room temperature for 5 minutes. Centrifuge at 8000 rpm for 10 minutes at 4 °C. Discard the supernatant, centrifuge briefly, and remove the remaining supernatant with a pipette. The phagemid pellet was resuspended in 1 ml of TBS and transferred to a microcentrifuge tube. Centrifuge at maximum speed for 2 minutes to remove insoluble material.
  • U-ss DNA was purified using the QIAprep Spin M13 Kit. Take a sample and run it on an agarose gel.
  • the synthesis reaction system can be scaled up correspondingly according to the annealing reaction system.
  • Mix well, centrifuge briefly, and incubate at 37 °C for 30 minutes. Heat-inactivate at 75 °C for 15 minutes and cool to room temperature. Take 1 μl of the sample and run it on an agarose gel.
  • XL-1 Blue or DH5α competent cells were transformed with 0.5 μl of the sample. The next day, compare the number of colonies from the reaction tube containing primer with that from the control tube (without primer). If the ratio is about 10:1 or greater, the reaction is likely to have succeeded. Colony PCR with template-specific primers was used to determine the ratio of wild type to mutant (about 10 colonies).
  • The Kunkel product was purified using the Wizard® SV Gel and PCR Clean-Up System. Place the purified Kunkel product and two 2 mm electroporation cuvettes on ice for more than 5 minutes. Half of the purified, pre-cooled Kunkel product was mixed with 350 μl of SS-320 electrocompetent cells, kept on ice for 5 minutes, and then transferred to a pre-cooled cuvette. A second transformation is prepared with the remaining Kunkel product. Draw up 1 ml of SOC with a P-1000 pipette. Electroporate the competent cells at 2,500 V (BTX ECM395). A beep will be heard after about 4 ms. Immediately add the SOC medium drawn up beforehand.
  • the phagemid was prepared the next day.
  • the method for preparing the phagemid is basically the same as the second step in the present embodiment, namely the PEG/NaCl secondary precipitation method.
  • When the optical density (OD) of the culture was near 0.8 (after about 2-3 hours), the culture was chilled on ice for 10 minutes. Transfer the culture to 2 pre-cooled 500 ml centrifuge bottles (approximately 250 ml each).
  • Super broth medium (500 ml) is prepared by mixing 425 ml of deionized water, 12 g of yeast extract, 6 g of tryptone and 25 ml of 10% glycerol, autoclaving, and then adding 50 ml of autoclaved potassium phosphate solution (0.11 M KH2PO4, 0.72 M K2HPO4).
  • The amount of U-ss DNA used for Kunkel mutagenesis is about 10-15 μg, and the amount of cccDNA (covalently closed circular DNA) obtained after Kunkel mutagenesis is about 25 μg.
  • The electropherogram of the Kunkel mutagenesis product is shown in Fig. 11.
  • The amount of cccDNA used for electroporation is about 20 μg.
  • The total number of transformants obtained after electrotransformation is 1.25 × 10^9.
  • The titer of the unpurified phage supernatant was 3.06 × 10^7/ml; after purification and 30-fold concentration, the titer of the phage preparation was 8.6 × 10^8/ml, a purification recovery of 94%.
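The recovery figure can be checked from the two titers and the concentration factor. The short sketch below assumes the garbled exponents in the source read as 3.06 × 10^7/ml before purification and 8.6 × 10^8/ml after a 30-fold concentration; those readings are an interpretation chosen because they reproduce the stated 94%, and the function name is hypothetical.

```python
def purification_recovery(titer_before: float, titer_after: float,
                          concentration_factor: float) -> float:
    """Fraction of phage recovered when the sample volume shrinks by `concentration_factor`."""
    # With 100% recovery the titer would rise exactly by the concentration factor.
    return titer_after / (titer_before * concentration_factor)


# Assumed readings of the garbled titers (per ml).
recovery = purification_recovery(titer_before=3.06e7,
                                 titer_after=8.6e8,
                                 concentration_factor=30)
print(f"{recovery:.0%}")
```

Under these assumed readings the computed recovery is about 0.937, which rounds to the 94% quoted in the text.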
  • Phage ELISA results using unpurified phage supernatants are shown in Table 5.
  • Samples were diluted 5-fold and 25-fold.
  • NaHCO3 control: 0.052, 0.055, 0.032
  • Clones from the transformation plate were sequenced directly, with the following results: both loop regions were correctly mutated in 3/5 of the clones; in 1/5 of the clones the A loop region was correctly mutated but the B loop region was not mutated; the remaining clone(s) carried erroneous mutations.
  • Sequencing results of infected plates: clones from the phage-infected sample plates were sequenced, with the following results: both loop regions were correctly mutated in 3/9 of the clones; in 1/9 of the clones the B loop region was correctly mutated but the A loop region was not mutated; in 3/9 of the clones the A loop region was correctly mutated but the B loop region was not mutated; 1/9 of the clones showed erroneous mutations.
  • Detect whether the phage itself binds the target protein: Coat a MaxiSorp microplate with 100 μl of target protein diluted to 0.5 μM in TBS (use a target protein known to bind phage as a positive control). Coat negative control wells with coating buffer containing no target protein. The sample is phage that does not display a foreign protein. Incubate for 1 hour at room temperature. Wash once with TBST (TBS containing 0.1% Tween 20). Block with 200 μl of blocking solution (TBS containing 0.5% BSA) for 1 hour at room temperature or overnight at 4 °C. Wash once with TBST. Add 100 μl of phage solution (10^8-10^9 phage) diluted 100-fold with blocking solution and incubate for 40 minutes at room temperature. Wash 5 times with TBST. Add 50 μl of anti-phage HRP antibody diluted 2500-fold with blocking solution and incubate for 30 minutes at room temperature. Wash 5 times with TBST and twice with TBS. Add 50 μl of TMB and incubate for 5-10 minutes at room temperature until blue appears.
  • The reaction was stopped by adding 50 μl of 2 M H2SO4, which turns the blue color yellow. The absorbance at 450 nm was measured. If the signal from the target protein is more than 10 times that of the negative control, the target protein cannot be used for screening (the final TMB signal should be below 0.2 after 10 minutes of color development).
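The acceptance rule in this control experiment (a target is unsuitable if bare phage gives more than 10 times the negative-control signal) can be written as a one-line check. This is a minimal sketch; the function name and the example absorbance values are hypothetical and are not data from the patent.

```python
def target_unsuitable(target_a450: float, negative_a450: float,
                      max_ratio: float = 10.0) -> bool:
    """Return True when non-displaying phage binds the target too strongly to allow screening."""
    if negative_a450 <= 0:
        raise ValueError("negative-control absorbance must be positive")
    return target_a450 / negative_a450 > max_ratio


# Hypothetical absorbance readings at 450 nm.
print(target_unsuitable(target_a450=0.90, negative_a450=0.05))  # ratio 18
print(target_unsuitable(target_a450=0.12, negative_a450=0.05))  # ratio 2.4
```

The threshold is exposed as `max_ratio` so that a stricter or looser cutoff can be tried without changing the function.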
  • Label the target protein with biotin and test whether the biotinylation can be cleaved by DTT.
  • HPDP-Biotin stock solution: Add 2.2 mg of HPDP-Biotin to 1.0 ml of solvent (e.g., DMF) to obtain a 4 mM HPDP-Biotin stock solution. To ensure complete dissolution, warm the mixture to 37 °C and gently vortex or sonicate. Dispense the stock solution into aliquots and store frozen.
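The stock concentration quoted for HPDP-Biotin follows from the mass, the volume and the molecular weight. The sketch below assumes a molecular weight of 539.78 g/mol for HPDP-Biotin, which is a supplier value not given in the source, and uses a hypothetical helper name.

```python
HPDP_BIOTIN_MW = 539.78  # g/mol; assumed supplier value, not stated in the source


def millimolar(mass_mg: float, volume_ml: float, mw_g_per_mol: float) -> float:
    """Concentration in mM of `mass_mg` of solute dissolved in `volume_ml` of solvent."""
    millimoles = mass_mg / mw_g_per_mol      # mg / (g/mol) = mmol
    return millimoles / volume_ml * 1000.0   # mmol/ml = mol/l, so x1000 gives mM


stock_mM = millimolar(mass_mg=2.2, volume_ml=1.0, mw_g_per_mol=HPDP_BIOTIN_MW)
print(round(stock_mM, 2))
```

With the assumed molecular weight the result is about 4.08 mM, consistent with the nominal 4 mM stock.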
  • Reaction buffer PBS+1 mM EDTA
  • Biotinylated HSA: Dissolve 2 mg of HSA in 1 ml of PBS/EDTA buffer. Mix 5 μl of HPDP-Biotin stock solution with 95 μl of DMSO, then add the mixture to the 1 ml of HSA solution. Vortex to mix and incubate for 2 hours at room temperature. Desalt the reaction mixture on a desalting column equilibrated with reaction buffer.
  • A 20 mM Tris (pH 8) solution containing 100 mM DTT was added to a tube containing 100 μl of magnetic beads, and the mixture was vortexed at room temperature for 10 minutes. Collect the beads and keep the supernatant (sample 3-100). Add 12.5 μl of 1× SDS-PAGE sample buffer to the beads and boil for 5 minutes. Take the supernatant (sample 4). Run SDS-PAGE. Sample 1: original protein solution; Sample 2 (2-50, 2-100): protein not bound to the magnetic beads; Sample 3: protein eluted by DTT; Sample 4: protein still bound to the magnetic beads after DTT elution. Compare the bands of the samples, estimate the amount of target protein the magnetic beads can bind, and assess the DTT cleavage reaction. If 100 μl of magnetic beads cannot bind that much target protein, the appropriate amount of beads can be estimated by comparing samples 2-50 and 2-100.
  • Round 1 Screening In the first round of screening, the target protein is first bound to the magnetic beads and then the phage is added. Manual screening was used due to the large amount of (1 ml) phage library sample solution required.
  • Prepare log-phase XL-1 cells. Take 1 ml of streptavidin magnetic beads and place them in a microtube. Place the microtube on the magnetic stand for about 1 minute and discard the supernatant. Add 1 ml of TBS, resuspend the beads, place on the magnetic stand for 1 minute, and discard the supernatant. Repeat the wash once. The beads were then resuspended in 1 ml of TBS.
  • The infected mixture was transferred to 30 ml of 2×YT + ampicillin (Ap) + 30 μl helper phage (final titer of about 10^8/ml) + 0.2 mM IPTG and cultured overnight at 37 °C with shaking.
  • The phage were prepared by two rounds of PEG/NaCl precipitation, and the final phage pellet was resuspended in 300 μl of TBS.
  • Round 2 screening: From the second round onward, the solution capture method was used, and screening was performed on a KingFisher magnetic bead purifier. Before starting the screening, prepare the eluent (it must be made fresh immediately before use), 100 μl/sample: 20 mM Tris (pH 8), 100 mM DTT (4 μl of 0.5 M Tris and 1.54 mg of DTT per 100 μl). Preparation of the binding solution: mix 60 μl of phage solution, 10 pmol of biotinylated target protein (cleavable biotinylation) and 10 μl of 50 mg/ml BSA, and bring to a final volume of 100 μl with TBS. The final concentration of the target protein was 100 nM.
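The eluent recipe above (4 μl of 0.5 M Tris and 1.54 mg of DTT per 100 μl) can be reproduced from the target concentrations. This is a small sketch with hypothetical helper names; the DTT molecular weight of 154.25 g/mol is an assumed literature value that does not appear in the source.

```python
DTT_MW = 154.25  # g/mol; assumed literature value, not stated in the source


def stock_volume_ul(final_mM: float, final_volume_ul: float, stock_mM: float) -> float:
    """Volume of stock needed so that the final solution reaches `final_mM` (C1*V1 = C2*V2)."""
    return final_mM * final_volume_ul / stock_mM


def solid_mass_mg(final_mM: float, final_volume_ul: float, mw_g_per_mol: float) -> float:
    """Mass of solid needed for `final_mM` in `final_volume_ul`."""
    moles = (final_mM * 1e-3) * (final_volume_ul * 1e-6)  # mol/l x l
    return moles * mw_g_per_mol * 1e3                      # g -> mg


tris_ul = stock_volume_ul(final_mM=20, final_volume_ul=100, stock_mM=500)
dtt_mg = solid_mass_mg(final_mM=100, final_volume_ul=100, mw_g_per_mol=DTT_MW)
print(tris_ul, round(dtt_mg, 2))
```

Scaling the recipe to several samples only requires multiplying `final_volume_ul` by the number of samples, since both helpers are linear in volume.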
  • Round 3 screening The screening method for the third round is the same as for the second round, except that the preparation method and amplification steps of the binding solution are slightly different.
  • Preparation of the binding solution (this step differs from the second round): mix 10 μl of phage solution, 2 pmol of biotinylated target protein (cleavable biotinylation) and 10 μl of 50 mg/ml BSA, and bring to a final volume of 100 μl with TBS. The final concentration of the target protein was 20 nM.
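The target concentrations quoted for the second and third rounds follow directly from the amount of target and the final volume, and the calculation shows how the selection pressure is tightened between rounds. The helper name below is hypothetical; the amounts and volumes are those given in the text.

```python
def nanomolar(amount_pmol: float, volume_ul: float) -> float:
    """Concentration in nM for `amount_pmol` of target in `volume_ul` of binding solution."""
    # pmol/ul equals umol/l (uM); multiply by 1000 to express it in nM.
    return amount_pmol * 1000.0 / volume_ul


round2_nM = nanomolar(amount_pmol=10, volume_ul=100)
round3_nM = nanomolar(amount_pmol=2, volume_ul=100)
print(round2_nM, round3_nM)
```

Lowering the target from 10 pmol to 2 pmol in the same volume gives a five-fold drop in concentration, which favors phage clones with higher affinity.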
  • Infected with 100 ul log phase of XL-1 cells. 2 ml of 2 X YT+Ap+2 ul helper phage + 0.2 mM IPTG medium was added and cultured overnight at 37 °C. Store the remaining eluate at 4 °C.
  • Amplification of single clones (48 clones/day): 24 single clones were picked from the titer assay plates. Each was inoculated into 150 μl of 2×YT + Ap (100 μg/ml) medium. After 3 hours of culture, 150 μl of 2×YT + Ap + 0.3 μl helper phage + 0.4 mM IPTG medium was added. Culture overnight. Centrifuge at 5000 rpm for 10 minutes (using a 96-well plate basket rotor). Transfer 50 μl of supernatant to a new 96-well plate. Perform the KingFisher phage ELISA with and without the target protein (these samples should be arranged side by side), using 5 μl of phage supernatant.
  • The nucleic acid sequences of the fusion proteins Ex4-1fna-sabl, Ex4-1hms-sabl and Ex4-1x5j-sabl were cloned into the pET-32a(+) expression vector and co-expressed with thioredoxin (Trx) to produce Trx fusion proteins and increase soluble expression levels.
  • The cryopreserved BL21 (DE3) strain containing the expression vector was thawed on ice, streaked with an inoculating loop onto LB agar medium containing 100 μg/ml ampicillin to activate the strain, and incubated at 37 °C overnight.
  • A single colony of the activated strain was picked into 30 ml of LB medium containing 100 μg/ml ampicillin and cultured overnight at 37 °C on a shaker at 200 rpm.
  • the overnight cultured bacterial culture was inoculated into 1 liter of LB medium containing 100 ug/ml ampicillin at 2% of the inoculum and shaken at 37 ° C at 200 rpm until the optical density (0D,) at 600 nm reached 0. . 5 mM ⁇
  • The culture was placed at room temperature until its temperature fell to 25 °C, and the inducer isopropyl-β-D-thiogalactoside (IPTG) was added to a final concentration of 0.5 mM.
  • the bacterial culture after addition of IPTG was placed in a 25 ° C shaker and shaken at 200 rpm for 4 hours to induce intracellular expression in E. coli. The results of the expression are shown in Figure 12.
  • Example 6 Purification of a Trx-tagged fusion protein
  • E. coli cells were harvested by centrifugation at 6000 × g for 10 minutes and resuspended in 20 ml of loading buffer (50 mM sodium phosphate, 0.5 M sodium chloride, 20 mM imidazole, pH 7.4); lysozyme and the protease inhibitor phenylmethylsulfonyl fluoride (PMSF) were then added to final concentrations of 0.2 mg/ml and 1 mM, respectively. After incubation on ice for one hour, the bacterial suspension was sonicated intermittently for 2 minutes. The protein suspension was centrifuged at 15000 × g for 1 hour, and the supernatant was collected and filtered through a 0.45 μm microporous membrane.
  • The Trx-tagged fusion proteins (Ex4-1fna-sabl, Ex4-1hms-sabl and Ex4-1x5j-sabl) purified by immobilized metal ion affinity chromatography were dialyzed overnight at 4 °C against dialysis buffer (10 mM Tris, 30 mM sodium chloride, 2 mM CaCl2, 20 mM L-Arg HCl, 20 mM L-Glu HCl, pH 8.0). Different amounts of recombinant enterokinase (EK, GenScript) were added for digestion, and the mixtures were left at room temperature overnight. Cleavage of the fusion protein at the different EK dosages was assessed by SDS-PAGE. The results are shown in Fig. 14.
  • Example 8 Purification of fusion protein after digestion
  • The digested fusion protein, in the original dialysis buffer (10 mM Tris, 30 mM sodium chloride, 2 mM CaCl2, 20 mM L-Arg HCl, 20 mM L-Glu HCl, pH 8.0), was dialyzed into a new dialysis buffer (40 mM Na2HPO4, 20 mM L-Arg HCl, 20 mM L-Glu HCl, pH 7.4). Pre-equilibrate 3 ml of Ni-NTA resin packed in a disposable column with 10 column volumes of the new dialysis buffer. Slowly load the filtered protein solution onto the column and collect the flow-through, which contains the digested fusion protein.
  • The functional-molecule activity retained by the fusion protein was determined by the following experiment (see: Establishment and application of a drug screening cell model targeting GLP-1 receptor. ⁇ , Shen Zhufang, Journal of Pharmaceutical Sciences 2009, 44 (3): 309-313), which is incorporated herein by reference in its entirety. The experimental steps are briefly as follows: First, a recombinant vector, Peak12RIP-CRE6X GFP, carrying six copies of a specific sequence (RIP-CRE) and the reporter gene E-GFP and regulated by the GLP-1 receptor signaling pathway, was constructed. This vector was transfected into the islet NIT-1 cell line, which activates expression of the reporter gene when stimulated by GLP-1 analogs.
  • Protein name concentration (mol/1) EC 50 EC 1. 5 1X10-1X10 1X10
  • Each well was incubated with 50 μl of 5 μg/ml HSA solution in coating buffer (50 mM NaHCO3, pH 9) overnight at 4 °C. Negative control wells were coated in parallel with a solution containing no target protein. Wash once with PBST.
  • Blocking: 200 μl of blocking solution in PBS was added to each well and incubated in a humid chamber at 4 °C overnight or at room temperature for 2-4 hours. Wash once with PBS.
  • Add sample: Add 50 μM of the fusion protein sample diluted with PBST or sodium acetate (pH 5.5) and incubate for 1 hour at room temperature on a rotary shaker. Wash 3 times with PBST.
  • Detection: Add 50 μl of 1-Step Turbo TMB-ELISA and incubate at room temperature until blue appears. The reaction was terminated by adding 50 μl of 2 M H2SO4. The absorbance at 450 nm was measured using a microplate reader.
  • PBS: 0.1 M phosphate buffer, pH 7.4.
  • PBST: PBS solution containing 0.1% Tween 20.
  • Blocking solution: 1% Ficoll 400 in PBS.
  • Ex4-1fna-sabl bound human serum albumin more strongly than the negative control protein (i.e., the unmodified template protein), and its binding ability was essentially unaffected at pH 5.5.
  • Example 11 Like Ex4, Ex4-1fna-sabl does not lower blood glucose at normal physiological blood glucose levels and does not cause adverse reactions
  • Experimental animals: Kunming mice, weighing 22-24 g, half male and half female.
  • Fifty healthy mice were randomly divided into a control group, an exenatide group (1.3 μg/kg) and Ex4-1fna-sabl groups (64 μg/kg, 128 μg/kg and 320 μg/kg). The control group was given an equal volume of phosphate buffer. After fasting for 12 hours, the corresponding drug or physiological saline was injected subcutaneously, and blood glucose levels were measured at 0, 0.5, 1, 2, 4, 8, 12, and 24 hours after administration.
  • As can be seen from Fig. 17, the blood glucose levels of the Ex4-1fna-sabl groups at all three doses did not differ significantly from those of the phosphate buffer (PBS) control group or the Ex4 control group, and no hypoglycemic adverse effect was observed in normal mice.
  • Example 12 Unlike Ex4, Ex4-1fna-sabl maintains its hypoglycemic effect in mice 12 hours after administration
  • Experimental animals: Kunming mice, weighing 22-24 g, half male and half female.
  • Eighteen healthy mice were randomly divided into a control group (phosphate buffer), an exenatide group (1.3 μg/kg, i.e., 0.31 nmol/kg) and an Ex4-1fna-sabl group (320 μg/kg, i.e., 21.33 nmol/kg). After a 12-hour fast, blood glucose was measured and the drugs were administered subcutaneously. Two hours after administration, a 1.5 g/kg glucose solution was given by gavage, and blood glucose levels were measured 30 minutes before and 0.5, 15, 30, 60, and 120 minutes after the glucose gavage (see Fig. 18).
  • Eighteen healthy mice were randomly divided into a control group (phosphate buffer), an exenatide group (1.3 μg/kg, i.e., 0.31 nmol/kg) and an Ex4-1fna-sabl group (320 μg/kg, i.e., 21.33 nmol/kg).
  • After a 12-hour fast, blood glucose was measured and the drugs were administered subcutaneously.
  • A 1.5 g/kg glucose solution was given by gavage, and blood glucose levels were measured 30 minutes before and 0.5, 15, 30, 60, and 120 minutes after the glucose gavage (see Fig. 19).
  • Exendin-4 (Heloderma suspectum) enzyme-linked kit (Phoenix Pharmaceuticals, catalog number EK-070-94); ICR mouse plasma; ACCU-CHEK Performa blood glucose meter (Roche).
  • Mice were divided into an exenatide (Ex4) control group (1.3 μg/kg, i.e., 0.31 nmol/kg) and an Ex4-1fna-sabl group (320 μg/kg, i.e., 21.33 nmol/kg).
  • Each group of mice was dosed subcutaneously; at 0.08, 0.25, 0.5, 1, 2, 4, 6, 10, 24 and 48 hours after administration, 30-40 μl of blood was collected, and the concentration of Ex4 in each blood sample was determined using the enzyme-linked kit and a previously established standard curve.
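The mass doses and molar doses quoted for the mouse studies can be cross-checked with a unit conversion. The sketch below assumes a molecular weight of about 4186.6 g/mol for exenatide, a literature value not given in the source; the molecular weight of the fusion protein is not stated either, so it is only back-calculated here from the quoted 320 μg/kg and 21.33 nmol/kg. The function names are hypothetical.

```python
EXENATIDE_MW = 4186.6  # g/mol; assumed literature value, not stated in the source


def ug_to_nmol_per_kg(dose_ug_per_kg: float, mw_g_per_mol: float) -> float:
    """Convert a dose in ug/kg to nmol/kg."""
    # ug / (g/mol) = umol; x1000 converts umol to nmol.
    return dose_ug_per_kg / mw_g_per_mol * 1000.0


def implied_mw(dose_ug_per_kg: float, dose_nmol_per_kg: float) -> float:
    """Molecular weight (g/mol) implied by a pair of mass and molar doses."""
    return dose_ug_per_kg / dose_nmol_per_kg * 1000.0


exenatide_nmol = ug_to_nmol_per_kg(1.3, EXENATIDE_MW)
fusion_mw = implied_mw(320, 21.33)
print(round(exenatide_nmol, 2), round(fusion_mw))
```

Under these assumptions 1.3 μg/kg of exenatide corresponds to about 0.31 nmol/kg, in agreement with the value stated in Example 12, and the fusion protein doses imply a molecular weight of roughly 15 kDa.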
  • Figure 21 is a PK curve of the Ex4 control group.
  • Figure 22 is a PK curve of Ex4-1fna-sabl.
  • Ex4-1fna-sabl can maintain a hypoglycemic effect in Beagle dogs for at least 5 days
  • Experiment 1: Two healthy animals were assigned to a control group (equal volume of phosphate buffer) and an Ex4-1fna-sabl group (1 mg/kg). After a 12-hour fast, the drug was administered intravenously; the animals were then allowed to move freely for about 12 hours and fasted for a further 12 hours, so that 24 hours (1 day) after administration a 4 g/kg glucose solution was given by gavage. Blood glucose levels were measured 30 minutes before and 5, 10, 20, 30, 45, 60 and 120 minutes after the glucose gavage.
  • Experiment 2: Two healthy animals were assigned to a control group (equal volume of phosphate buffer) and an Ex4-1fna-sabl group (1 mg/kg). After a 12-hour fast, the drug was administered intravenously; the animals were then allowed to move and eat freely for about 60 hours and fasted for a further 12 hours, so that 72 hours (3 days) after administration a 4 g/kg glucose solution was given by gavage. Blood glucose levels were measured 30 minutes before and 5, 10, 20, 30, 45, 60 and 120 minutes after the glucose gavage.
  • Experiment 3: Two healthy animals were assigned to a control group (equal volume of phosphate buffer) and an Ex4-1fna-sabl group (1 mg/kg). After a 12-hour fast, the drug was administered intravenously; the animals were then allowed to move and eat freely for about 108 hours and fasted for a further 12 hours, so that 120 hours (5 days) after administration a 4 g/kg glucose solution was given by gavage. Blood glucose levels were measured 30 minutes before and 5, 10, 20, 30, 45, 60 and 120 minutes after the glucose gavage.
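The timing of the three dog experiments follows one pattern: a free-feeding period after dosing, a final 12-hour fast, and then the glucose challenge. The sketch below tabulates that schedule using only the durations given in the text; the variable names are hypothetical.

```python
FINAL_FAST_H = 12  # hours of fasting immediately before the glucose challenge

# Free-movement/feeding period (hours) after intravenous dosing, per experiment.
free_period_h = {"Experiment 1": 12, "Experiment 2": 60, "Experiment 3": 108}

# Time of the glucose challenge, in hours after dosing.
challenge_h = {name: hours + FINAL_FAST_H for name, hours in free_period_h.items()}
challenge_days = {name: hours // 24 for name, hours in challenge_h.items()}

# Blood sampling times in minutes relative to the glucose gavage (negative = before).
sampling_min = [-30, 5, 10, 20, 30, 45, 60, 120]

for name in free_period_h:
    print(name, challenge_h[name], "h =", challenge_days[name], "day(s) after dosing")
```

The computed challenge times of 24, 72 and 120 hours match the 1-day, 3-day and 5-day time points stated for the three experiments.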
  • The EIA kit measures the concentration of Ex4-1fna-sabl in the blood sample (i.e., the concentration of Exendin-4, since the EIA kit recognizes the Ex-4 (i.e., Exendin-4) portion of the Ex4-1fna-sabl fusion protein).


Abstract

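The present invention discloses a novel protein template for generating non-antibody targeted binding molecules. The protein template and the targeted binding molecules derived from it can form fusion proteins or protein conjugates with polypeptides, proteins, or chemical functional molecules. The latter can thereby acquire adjustable targeting or half-life properties while retaining structural stability and the original activity of the functional molecule, and they have broad application prospects in pharmaceuticals and molecular diagnostics. The invention also discloses a series of GLP-1 receptor agonist fusion proteins; these macromolecules comprise a GLP-1 receptor agonist polypeptide and a targeting peptide. The targeting peptide is an artificially engineered peptide capable of reversibly binding serum albumin. The macromolecule may further include a linker between the targeting peptide and the GLP-1 receptor agonist polypeptide. Such macromolecular agents retain the activity of the GLP-1 receptor agonist polypeptide while having a longer in vivo half-life, and they show good prospects in the treatment of diabetes, obesity, neurodegenerative diseases, and related fields.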

Description

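A method for engineering non-antibody proteins to produce binding molecules, the products produced thereby, and a long-acting GLP-1 receptor agonist
Field of the Invention
The present invention relates to biopharmaceuticals, namely quasi-antibody (also called non-natural antibody) macromolecules that can bind to targets. The invention also relates to the fields of diabetes, obesity, cardiovascular disease, and neurodegenerative disease, and in particular to GLP-1 receptor agonists.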
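Background
Protein molecules with engineering potential can, after artificial modification, specifically recognize and bind particular targets. Such targeting protein molecules and the associated engineering techniques are therefore used in biotechnology fields such as pharmaceuticals and diagnostics, and their importance grows by the day. Over the past few decades, this work has focused mainly on the development of monoclonal antibodies. As an important component of an organism's immune system, monoclonal antibody molecules are natural antigen-binding molecules that can develop strong, specific binding to a wide variety of drug targets. From an industrial standpoint, however, antibodies and their derivatives still have a number of key shortcomings and cannot meet practical needs. This is mainly reflected in the following aspects. First, monoclonal antibodies generally have a molecular weight above 100,000 daltons and penetrate tissue poorly, which limits their efficacy against major diseases such as solid tumors. Second, the antigen-binding interface of monoclonal antibodies and their derived fragments is relatively flat, which makes it difficult for them to bind important classes of drug targets such as ion channel proteins and enzyme catalytic sites. In addition, antibodies consist of two different polypeptide chains, so the cloning steps are complex and the structure is sometimes unstable. Single-chain engineered antibody fragments or other derivatives, such as single-chain antibodies (in which the VH and VL subunits are joined directly by a linker peptide), can in most cases be produced only in small batches. Finally, producing and purifying antibodies and their functional fragments is costly: they often require complex post-translational modifications such as glycosylation and must be produced in expensive mammalian cells, with low yields.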
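Summary of the Invention
The invention disclosed in this application relates to a method of assessing the potential of a protein template to produce antibody mimetics, comprising: (i) initially selecting a protein; and (ii) using the structural information of the protein itself to identify one or more regions of the protein (called variable regions) into which changes can be introduced without substantially affecting the structure of the protein, thereby assessing the potential of the protein template to produce antibody mimetics.
One way of carrying out the above method may further comprise using the sequence information of the protein itself to prioritize among the one or more variable regions identified from the structural information of the protein.
One way of carrying out the above method may further comprise verifying, after the assessment, the potential of the protein template to produce antibody mimetics. The verification comprises either: (i) introducing point mutations into an identified variable region, inserting one or more polypeptides (NLPs) that can take part in forming the interface between the protein template and other proteins and that adopt a non-linear structure, or that can adopt a non-linear structure by themselves, or replacing the variable region partly or wholly with one or more such polypeptides, and then analyzing the properties of the resulting protein variants, where the quality of those properties verifies the assessed potential of the protein template; or (ii) cloning the protein template into the display vector of a commonly used protein display method and inserting random oligonucleotides into the identified variable region, so as to build a library in which the variable region of the protein template is partly or wholly replaced by random polypeptides, and then using the display method to screen the library for proteins (called "fusion proteins") with affinity for one or more given targets, where the quality of the properties of the selected fusion proteins verifies the potential of the protein template to produce antibody mimetics.
The protein display may be one of the following: (i) phage display; (ii) yeast display; (iii) mRNA display; and (iv) ribosome display.
The properties of the protein variants or fusion proteins analyzed above may include: (i) thermal stability; (ii) stability against enzymes; (iii) solubility; (iv) whether the variant retains the original affinity of the introduced polypeptide for its target; and (v) expression level.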
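The invention disclosed in this application also relates to a method of making an antibody mimetic, comprising: (i) initially selecting a protein; and (ii) using the structural information of the protein itself to identify one or more regions of the protein (called variable regions) into which changes can be introduced without substantially affecting the structure of the protein, thereby assessing the potential of the protein template to produce antibody mimetics; and then further introducing point mutations into one or more of the identified variable regions, inserting one or more polypeptides, or replacing the variable region partly or wholly with one or more such polypeptides.
The polypeptide mentioned above may be one or more polypeptides that can take part in forming the interface between the protein template and other proteins and that adopt a non-linear structure, or that can adopt a non-linear structure by themselves.
One way of carrying out the above method may further comprise replacing one or more identified variable regions with part or all of a polypeptide whose length is close to that of the corresponding variable region.
The inserted polypeptide may derive from one of the following: (i) a polypeptide that can itself form a cyclic structure and has targeted binding ability; (ii) part of an antibody complementarity determining region (CDR); (iii) part of the binding interface between two interacting natural proteins.
The inserted polypeptide may be produced by one of the following methods: (i) selecting one or more polypeptides known to bind a target protein; (ii) screening by protein display for polypeptides that bind a given target; (iii) screening disulfide-bonded non-linear peptides (NLPs); (iv) raising an antibody against a target and then making one or more polypeptides according to part or all of the sequence of the complementarity determining regions (CDRs) of that antibody; and (v) selecting a segment from the binding interface between two interacting natural proteins as the polypeptide.
The protein display mentioned above may be one of the following: (i) phage display; (ii) yeast display; (iii) mRNA display; and (iv) ribosome display.
The method of making an antibody mimetic disclosed in this application may further comprise modifying the regions outside the variable regions (called invariable regions) to further improve the antibody mimetic.
The modifications to the invariable regions may include: (i) adding to or deleting from the N-terminal or C-terminal sequence of the invariable region; (ii) changing the N-terminus or C-terminus into a sequence suited to the expression host; and (iii) replacing residues in the linker regions that connect secondary structure elements within the invariable region with residues having shorter side chains.
The residues with shorter side chains may be glycine, alanine, and serine.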
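In the above method of assessing the potential of a protein template to produce antibody mimetics, comprising (i) initially selecting a protein and (ii) using the structural information of the protein itself to identify one or more variable regions into which changes can be introduced without substantially affecting the structure of the protein, the variable regions are identified as follows:
(i) select one or more proteins structurally similar to the protein, which together with the protein form a protein structure group;
(ii) describe the structural features of the protein structure group (called the structural profile) using one or more kinds of data commonly used to describe protein structure and one or more mathematical models that describe the randomness of incomplete data;
(iii) update the model and its parameters by random sampling until the model converges, thereby estimating the structural profile;
(iv) identify as variable regions those structural regions of the protein group that tend to deviate from the most common structural state and instead show uncommon states.
The data describing protein structure may be coordinate data in three-dimensional Euclidean space. The objects described by these coordinates may be all atoms of the protein, carbon alpha (Cα), carbon beta (Cβ), carbon gamma (Cγ), carbon delta (Cδ), carbon epsilon (Cε), or other atom types, or any combination of these atom types. The data describing protein structure may also be a protein contact map.
In the above method, the mathematical model describing the randomness of incomplete data may be a hidden Markov model in which each node has three states: the M (Match, homologous and conserved) state, the I (Insert, random in space) state, and the D (Deletion, absent) state. These three states follow certain probability distributions. In step (iv), structural regions of the protein group that tend to deviate from the M state, or that show the I state, are identified as variable regions.
The probability distributions followed by the three states may be Gaussian, Beta, or exponential distributions.
The above method may also set explicit parameters to distinguish the following three sources of protein structural flexibility: (i) intrinsic flexibility caused by thermal stability; (ii) intrinsic flexibility not caused by thermal stability; and (iii) deviations in protein structure that are tolerated during natural or artificial evolution.
The structures of the protein group may be regarded as random three-dimensional point sets produced as follows: a random path (A) arises according to a profile graph (G), random variables (Y) are generated according to emission probabilities, and rotation (R) and translation (V) operations are applied. The random sampling method may be a Monte Carlo method, and the model parameters updated by random sampling may be the profile graph (G), the random path (A), the random variables (Y), the rotation (R), and the translation (V).
The joint or conditional probabilities involved in the random path process are obtained with the Forward or Viterbi algorithm.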
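As an illustration of the Forward algorithm named above, the following is a minimal sketch for a generic discrete hidden Markov model. It is not the structural-profile model of the application, whose emissions are three-dimensional coordinates; the function name `forward` and the toy probabilities in the usage note are assumptions made for illustration only.

```python
def forward(init, trans, emit, observations):
    """Probability of an observation sequence under a discrete HMM.

    init[s]      -- probability of starting in state s
    trans[p][s]  -- probability of moving from state p to state s
    emit[s][o]   -- probability that state s emits symbol o
    """
    n_states = len(init)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [init[s] * emit[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            emit[s][obs] * sum(alpha[p] * trans[p][s] for p in range(n_states))
            for s in range(n_states)
        ]
    return sum(alpha)
```

With a toy two-state model, `forward([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.2, 0.8]], [0])` sums the two ways of emitting symbol 0 in the first step.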
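The random sampling may be performed at least 100 times and may further comprise: (i) for each sample, checking the node state assigned to each residue of the protein structure and, if the node state is the I state, flagging the residue as belonging to a potential variable region; and (ii) if the node state is the M state but the spatial position of the residue deviates greatly from the emission probability distribution of that M state, flagging the residue as belonging to a potential variable region.
In the above method, residues that are flagged as belonging to a potential variable region in more than a certain proportion of the samples may be regarded as variable regions.
In the above method, "deviates greatly" may mean an emission probability below 0.05.
In the above method, being flagged as belonging to a potential variable region in more than 95% of the samples may be used as the criterion for being regarded as a variable region.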
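The flagging rule described above can be sketched as follows. This is a hedged illustration only: the input layout (one list of `(state, emission_probability)` pairs per sample) and the function name `call_variable_residues` are assumptions, and the sampling that would produce such data is not shown.

```python
def call_variable_residues(samples, min_fraction=0.95, emission_cutoff=0.05):
    """Return 0-based indices of residues called variable.

    samples -- list of samples; each sample holds one (state, emission_prob)
               pair per residue, with state in {"M", "I", "D"}.
    A residue is flagged in a sample if it is in the I state, or in the
    M state with an emission probability below emission_cutoff.
    """
    n_residues = len(samples[0])
    flagged = [0] * n_residues
    for sample in samples:
        for i, (state, prob) in enumerate(sample):
            if state == "I" or (state == "M" and prob < emission_cutoff):
                flagged[i] += 1
    # "More than 95%" is read here as a strict inequality.
    return [i for i, c in enumerate(flagged) if c / len(samples) > min_fraction]
```

With 100 samples, a residue must therefore be flagged in at least 96 of them to be called variable.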
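The invention disclosed in this application also relates to a method of assessing the potential of a protein template to produce antibody mimetics, comprising: (i) initially selecting a protein; (ii) using the structural information of the protein itself to identify one or more regions of the protein (called variable regions) into which changes can be introduced without substantially affecting the structure of the protein, thereby assessing the potential of the protein template to produce antibody mimetics; and (iii) using the sequence information of the protein itself to prioritize among the variable regions identified in (ii), comprising:
(a) selecting one or more proteins similar in sequence to the protein, which together with the protein form a protein group;
(b) performing a multiple sequence alignment of the protein group, building a phylogenetic tree, and, according to a model of molecular evolution, calculating the evolutionary rate of each site and scoring the conservation of each site;
(c) using the site scores from step (b) to prioritize variable regions, that is, sites with lower scores are more likely to belong to a variable region and are therefore selected preferentially.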
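Step (c) can be illustrated with a small ranking sketch. It assumes that per-site scores are already available (the alignment, tree, and rate calculation of step (b) are not reproduced) and that a lower score means a less conserved site, as stated in step (c). The function name `rank_regions` and the averaging of scores over a region are assumptions, not something the application specifies.

```python
def rank_regions(regions, site_scores):
    """Order candidate variable regions, least conserved first.

    regions     -- list of (start, end) residue ranges, 1-based and inclusive
    site_scores -- dict mapping residue position to its conservation score
    """
    def mean_score(region):
        start, end = region
        values = [site_scores[pos] for pos in range(start, end + 1)]
        return sum(values) / len(values)

    return sorted(regions, key=mean_score)
```

For example, calling `rank_regions([(32, 43), (55, 58), (90, 93)], scores)` returns the three ranges reordered by their mean score.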
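The invention disclosed in this application also relates to a polypeptide or protein whose sequence may be one of the following sequences or at least 75% homologous to one of them: (i) SEQ ID NO: 1, in which the variable regions include amino acids 32 to 43, amino acids 55 to 58, and amino acids 90 to 93; (ii) SEQ ID NO: 15, in which the variable regions include amino acids 72 to 81; (iii) SEQ ID NO: 16, in which the variable regions include amino acids 10 to 15 and amino acids 45 to 68; (iv) SEQ ID NO: 17, in which the variable regions include amino acids 67 to 71, amino acids 86 to 91, and amino acids 96 to 101.
The homology may also be: at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%.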
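The application does not state how percent homology is computed. One common reading is percent identity over aligned positions, sketched below under that assumption; the function name `percent_identity` and the treatment of gap characters are illustrative choices, and the inputs must already be aligned to equal length.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)
```

Applied to the first ten residues of SEQ ID NO: 25 and SEQ ID NO: 26 (`"HGEGTFTSDL"` and `"HAEGTFTSDV"`), eight of ten positions match.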
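In the above polypeptide or protein, the variable region has inserted into it one of the following polypeptides or a sequence at least 75% homologous to one of them, or the variable region is partly or wholly replaced by part or all of such a polypeptide sequence:
(i) SEQ ID NO: 2; (ii) SEQ ID NO: 3; (iii) SEQ ID NO: 4; (iv) SEQ ID NO: 5; (v) SEQ ID NO: 6; (vi) SEQ ID NO: 7; and (vii) SEQ ID NO: 8.
The homology may also be: at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%.
The invention disclosed in this application also relates to an isolated nucleic acid molecule that encodes the above polypeptide or protein.
The invention disclosed in this application also relates to an expression vector comprising the above nucleic acid molecule.
The invention disclosed in this application also relates to an expression vector that can express the above polypeptide or protein. The invention disclosed in this application also relates to an expression vector whose sequence is SEQ ID NO: 14 or a sequence at least 75% homologous to SEQ ID NO: 14. The homology may also be: at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%.
The invention disclosed in this application also relates to a polypeptide or protein whose sequence may be SEQ ID NO: 1 or a sequence in which at least one amino acid differs from the wild-type sequence of the gene corresponding to SEQ ID NO: 1 (that is, 1x5j).
The invention disclosed in this application also relates to a polypeptide or protein whose sequence may be SEQ ID NO: 16 or a sequence in which at least one amino acid differs from the wild-type sequence of the gene corresponding to SEQ ID NO: 16 (that is, 1klg).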
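The invention disclosed in this application also relates to a macromolecule comprising the following two parts:
(i) a biologically functional polypeptide or protein whose sequence is one of the following sequences or at least 75% homologous to one of them: (a) SEQ ID NO: 25; (b) SEQ ID NO: 26; and (c) SEQ ID NO: 43;
(ii) a serum albumin targeting polypeptide or protein whose sequence is one of the following sequences or at least 75% homologous to one of them: (a) SEQ ID NO: 27; (b) SEQ ID NO: 28; (c) SEQ ID NO: 29; (d) SEQ ID NO: 30; (e) SEQ ID NO: 31; (f) SEQ ID NO: 32; (g) SEQ ID NO: 33; (h) SEQ ID NO: 34; and (i) SEQ ID NO: 35.
The above macromolecule may further comprise a linker (the third part) between the biologically functional polypeptide (the first part) and the serum albumin targeting polypeptide (the second part); the molecular weight of the linker is between 300 and 5,500.
The biologically functional polypeptide may be one of the following mutants of SEQ ID NO: 26 (that is, GLP-1):
(i) the A8G, R36G, and G37K mutants;
(ii) His1 GLP-1 modified mutants, specifically: desamino-GLP-1, (D-His1)GLP-1, N-glucitol-GLP-1, N-imidazole-GLP-1, N-α-methyl-GLP-1, N-methyl-GLP-1, N-acetyl-GLP-1, and N-pyroglutamyl-GLP-1;
(iii) Ala2 GLP-1 mutants, specifically: (D-Ala2)GLP-1, (Gly2)GLP-1, (Ser2)GLP-1, (Aha2)GLP-1, (Thr2)GLP-1, (Aib2)GLP-1, (Abu2)GLP-1, and (Val2)GLP-1;
(iv) Glu3 GLP-1 mutants, specifically: (Asp3)GLP-1, (Ala3)GLP-1, (Pro3)GLP-1, (Phe3)GLP-1, (Lys3)GLP-1, and (Tyr3)GLP-1;
(v) the mutant KGLP-1, in which a lysine residue is added at the N-terminus of GLP-1.
The three parts of the above macromolecule may be joined as a fusion protein or by conjugation.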
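The linker in the above macromolecule may be a non-polypeptide molecule. The non-polypeptide molecule may be (but is not limited to) one or any combination of the following: polyethylene glycol, polypropylene glycol, (ethylene/propylene) copolymer glycol, polyoxyethylene, polyurethane, polyphosphazene, polysaccharide, dextran, polyvinyl alcohol, polyvinylpyrrolidone, polyvinyl ethyl ether, polyacrylamide, polypropylene, polycyano compounds, lipid polymers, chitin, hyaluronic acid, and heparin.
The linker in the above macromolecule may be a polypeptide, which may consist of natural or non-natural amino acids. The natural amino acids may be proteinogenic natural amino acids. The natural amino acids may be natural amino acids directly encoded by the genetic code. The polypeptide serving as the linker may also be one of the following sequences or at least 75% homologous to one of them: (a) SEQ ID NO: 36; (b) SEQ ID NO: 37; (c) SEQ ID NO: 38; (d) SEQ ID NO: 39; (e) SEQ ID NO: 40; (f) SEQ ID NO: 41; and (g) SEQ ID NO: 42. The homology may also be: at least 80%; at least 85%; at least 90%; at least 95%; or at least 99%.
In the above macromolecule, the biologically functional polypeptide is at least 80% homologous to the listed sequences (a) SEQ ID NO: 25, (b) SEQ ID NO: 26, and (c) SEQ ID NO: 43, and the serum albumin targeting polypeptide is at least 80% homologous to the listed sequences (a) SEQ ID NO: 27, (b) SEQ ID NO: 28, (c) SEQ ID NO: 29, (d) SEQ ID NO: 30, (e) SEQ ID NO: 31, (f) SEQ ID NO: 32, (g) SEQ ID NO: 33, (h) SEQ ID NO: 34, and (i) SEQ ID NO: 35. The homology may also be at least 85%, at least 90%, at least 95%, or at least 99%.
The invention disclosed in this application also relates to an isolated nucleic acid molecule that encodes the polypeptide or protein of the above macromolecule.
The invention disclosed in this application also relates to an expression vector comprising the above nucleic acid molecule.
The invention disclosed in this application also relates to an expression vector that can express the polypeptide or protein of the above macromolecule.
The invention disclosed in this application also relates to a medicament or vaccine comprising any of the polypeptides or proteins described above, any of the macromolecules described above, any of the nucleic acid molecules described above, or any of the expression vectors described above.
Brief Description of the Drawings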
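The foregoing summary and the detailed embodiments that follow will be better understood in conjunction with the accompanying drawings. The drawings are given by way of example and are not to be taken as limiting the claims. In the drawings:
Figure 1 is a flow chart of some of the methods of the invention.
Figure 2 is a logic flow chart for testing the ability of a template protein to produce antibody mimetics.
Figure 3 is a logic flow chart for producing antibody mimetics against a given target.
Figure 4 shows the structural profile of 1x5j and its structurally similar proteins.
Figure 5 shows the probabilistic estimate of whether each residue of 1x5j belongs to a structural element.
Figure 6 shows the analysis of 1x5j combining the sequence profile and structural profile results.
Figure 7 is a map of a typical plasmid used to express 1x5j and its protein variants.
Figure 8 shows phage ELISA results for the display of 1x5j variants on the phage surface.
Figure 9 shows test results for the binding of fusion proteins formed from the template protein and NLP polypeptides to serum albumin.
Figure 10 shows phage ELISA results for 1fna, 1hms, and 1klg.
Figure 11 is an electrophoresis image of the BMT library mutagenesis products of the 1hms template protein.
Figure 12 shows the expression of GLP-1 receptor agonist fusion proteins.
Figure 13 is an electrophoresis image of the purified fusion proteins Ex4-1fna-sab1, Ex4-1hms-sab1, and Ex4-1x5j-sab1.
Figure 14 is an electrophoresis image of the fusion proteins Ex4-1fna-sab1, Ex4-1hms-sab1, and Ex4-1x5j-sab1 after enterokinase cleavage.
Figure 15 is an electrophoresis image of the fusion proteins after enterokinase cleavage and purification.
Figure 16 shows ELISA results for the binding of Ex4-1fna-sab1 to human serum albumin.
Figure 17 shows the effect of Ex4-1fna-sab1 on blood glucose concentration in normal mice.
Figure 18 compares the glucose-lowering effect of Ex-4 and the fusion protein Ex4-1fna-sab1 in mice 2 hours after administration.
Figure 19 compares the glucose-lowering effect of Ex-4 and the fusion protein Ex4-1fna-sab1 in mice 12 hours after administration.
Figure 20 compares the glucose-lowering effect of the fusion protein Ex4-1fna-sab1 and Ex4 in mice.
Figure 21 is the pharmacokinetic curve of Ex4 in mouse plasma.
Figure 22 is the pharmacokinetic curve of the fusion protein Ex4-1fna-sab1 in mouse plasma.
Figure 23 shows the glucose-lowering effect of the fusion protein Ex4-1fna-sab1 in beagle dogs.
Figure 24 is the pharmacokinetic curve of the fusion protein Ex4-1fna-sab1 in beagle dogs.
Detailed Description of the Invention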
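Because of the shortcomings of monoclonal antibodies, a new class of targeting proteins called "antibody mimetics" was discovered after them. These new targeting proteins are obtained by engineering natural protein templates. They have compact, thermostable structures and a small size (5-20 kDa), yet offer a large mutable surface area for engineering and directed evolution, and modifying these regions does not severely disrupt the stability of the original protein. These template proteins have sequences entirely unrelated to antibodies, but their specific binding to particular antigens is in no way inferior. They usually also have better solubility, tissue penetration, thermal stability, and stability against enzymes, and they can be produced in large quantities in prokaryotic systems (such as E. coli) or simple eukaryotic systems (such as yeast). To date, close to 60 antibody mimetic templates with this potential have been discovered, and they are gradually replacing monoclonal antibodies in drug therapy, diagnostics, and other areas of biotechnology.
Unlike monoclonal antibodies, however, antibody mimetic proteins have no clearly defined boundary between the antigen-binding region (the variable region) and the structural region, and the boundary often changes with the target, so a large amount of mutagenesis work is needed to confirm it. Once the variable region of an antibody mimetic template (scaffold) has been confirmed, a large variant library must be designed and constructed and then screened against individual targets to test whether the template has enough potential to produce variants of sufficient structural diversity. The early stages of design, screening, and optimization of antibody mimetic drug precursors therefore usually depend heavily on building large protein libraries, which is slow and costly. As a result, whereas monoclonal antibody drugs have hundreds of candidates in clinical stages and dozens on the market, antibody mimetic drugs currently have only a dozen or so candidates in clinical stages and one marketed product; their development is constrained by the technical bottlenecks above.
Persons of ordinary skill in the art recognized the limitations of conventional library screening long ago. For example, Dahiyat et al. pointed out that "a protein 500 amino acids long has 20 to the power of 500 possible variants, and library screening cannot screen so many possibilities" and can only "test a very small fraction of the protein variants that might improve function" (Dahiyat B. I. et al., US Patent 7,379,822, which is incorporated herein by reference in its entirety). Kiss et al. pointed out that this kind of library construction depends heavily on randomly generated oligonucleotides (encoding amino acid codon sequences) to produce sequence diversity in the library, and that one fatal weakness is the tendency to generate stop codons and combinations of amino acid residues that severely impair protein folding (Kiss et al. Nucleic Acids Res. 2006, 34(19):e132). This document is incorporated herein by reference in its entirety.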
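The size of the sequence space quoted above can be checked with a few lines of arithmetic. The sketch below only evaluates 20 to the power of 500 and counts its decimal digits; the variable names are arbitrary.

```python
import math

# Number of possible sequences for a 500-residue protein over 20 amino acids.
n_variants = 20 ** 500

# Count decimal digits directly and via logarithms as a cross-check.
digits = len(str(n_variants))
digits_from_log = math.floor(500 * math.log10(20)) + 1
```

Both counts agree that the number has several hundred digits, which is far beyond what any physical library can hold.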
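In particular, compared with the very mature display technology for short peptide libraries (<3 kDa), library display of antibody mimetics is very difficult because antibody mimetic proteins are usually 5 kDa to 20 kDa in size. It requires a high degree of experimental skill, and display methods for different antibody mimetic templates generally cannot be transferred from one template to another. Many antibody mimetic templates have been abandoned because no positive clones could be selected in the testing stage, owing to excessively difficult library construction, limited library capacity, or low screening efficiency. Even for templates that pass testing, the screening methods for different targets in actual drug screening differ considerably and must be explored and optimized separately (reference: Ruigrok et al. Alternative affinity tools: more attractive than antibodies. Biochem J. 2011 May 15;436(1):1-13. doi: 10.1042/BJ20101860). This document is incorporated herein by reference in its entirety.
In view of these shortcomings of library methods, new ideas and methods have been developed in recent years in an attempt to overcome the limitations of conventional library screening. For example, studies have found that specific sites or regions of certain proteins can accommodate polypeptide replacement or insertion while preserving the original targeting of the grafted polypeptide or antibody CDR. For example, grafting the somatostatin polypeptide sequence into the CDR3-like region of the template protein CTLA4 yields a chimeric protein with targeted binding to the somatostatin receptor (reference: Design and expression of soluble CTLA-4 variable domain as a scaffold for the display of functional polypeptides, Proteins. 1999 Aug 1;36(2):217-27). This document is incorporated herein by reference in its entirety. As another example, the HCDR1 loop of a CD4 antibody can be inserted into a protein inhibitor of neuronal nitric oxide synthase, and every molecule formed in this way is able to bind CD4 (Bes C. et al., Chardes T. PIN-bodies: a new class of antibody-like proteins with CD4 specificity derived from the protein inhibitor of neuronal nitric oxide synthase. Biochem. Biophys. Res. Commun. 2006;343:334-344). This document is incorporated herein by reference in its entirety. Bes et al. showed that only polypeptides isolated from five of the six CDRs (complementarity determining regions) of an anti-CD4 antibody, and not polypeptides isolated from other parts of the variable region of that antibody, were able to bind CD4 in soluble, cyclic form (Bes C. et al. Efficient CD4 binding and immunosuppressive properties of the 13B8.2 monoclonal antibody are displayed by its CDR-H1-derived peptide CB1. FEBS Lett. 2001;508:67-74). This document is incorporated herein by reference in its entirety.
As another example, grafting the antigen-binding loop of a lysozyme antibody into green fluorescent protein (GFP) produces a fluorescent protein that binds lysozyme (Kiss et al. Antibody binding loop insertions as diversity elements. Nucl. Acids Res. (2006) 34(19):e132). This document is incorporated herein by reference in its entirety.
As another example, inserting the entire epitope of an HIV-1 C-peptide isolated from the C-terminus of the HIV-1 gp41 protein (a solvent-accessible surface region formed by 19 non-contiguous amino acids, about 2000 square angstroms or more) onto the surface of the leucine zipper of the GCN4 protein forms an artificial ligand whose antiviral activity is close to that of the natural ligand (Samuel K. Sia et al. Protein grafting of an HIV-1-inhibiting epitope. PNAS 2003 100(17) 9756-9761; doi: 10.1073/pnas.1733910100). This document is incorporated herein by reference in its entirety.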
Further examples can be found in the following references:
Norman, T. C. et al. Genetic selection of peptide inhibitors of biological pathways. Science 285, 591-595 (1999);
Colas, P. et al. Genetic selection of peptide aptamers that recognize and inhibit cyclin-dependent kinase 2. Nature 380, 548-550 (1996);
Kwan, A. H. et al. Engineering a protein scaffold from a PHD finger. Structure (Camb) 11, 803-813 (2003);
Karlsson, G. B. et al. Activation of p53 by scaffold-stabilised expression of Mdm2-binding peptides: visualisation of reporter gene induction at the single-cell level. Br. J. Cancer 91, 1488-1494 (2004);
Vita, C. et al. Scorpion toxins as natural scaffolds for protein engineering. Proc. Natl. Acad. Sci. USA 92, 6404-6408 (1995);
Martin, L. et al. Rational design of a CD4 mimic that inhibits HIV-1 entry and exposes cryptic neutralization epitopes. Nat. Biotechnol. 21, 71-76 (2003).
The references cited above are incorporated herein by reference in their entirety.
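This kind of "one-shot" protein grafting produces an antibody mimetic directly and skips the steps of building a library and carrying out multiple rounds of screening and optimization. It offers a new way to produce antibody mimetics directly, and to discover potential new template proteins quickly, without depending on library construction. Later studies also found that proteins able to accommodate polypeptide grafts often have the potential to become new antibody mimetic templates: the corresponding sites can be randomized to build libraries, which can then be screened against more targets to produce antibody mimetics. In addition, for some existing antibody mimetic templates (scaffolds) that have already been studied extensively, there are recent inventions that introduce antibody CDRs directly to obtain antibody mimetics, for example US20100322930 from Novartis. This document is incorporated herein by reference in its entirety.
An important shortcoming of the above methods, however, is that they do not use the structural information of the template protein and offer no active, systematic way to judge the ability of a template protein to produce antibody mimetics. Instead they rely largely on luck to find individual proteins in which a particular site or region can accommodate polypeptide replacement or insertion.
There are also methods that use computation to find a few key disembodied amino acids on the binding surface of the known structure of a target protein with one of its binding partners (for example an antibody), and then engineer a protein template to mimic those amino acids so that the engineered template can bind the target protein. For example, Baker et al. computationally engineered protein templates to mimic several disembodied amino acids in the interaction between the target protein influenza hemagglutinin (HA) and an HA antibody, so that the engineered templates could bind HA (Baker et al., Computational Design of Proteins Targeting the Conserved Stem Region of Influenza Hemagglutinin, Science, Vol 332, 816-821 (2011)). This document is incorporated herein by reference in its entirety. The limitation of this approach is that the chosen target protein must have a known binding partner, and the detailed structure of the binding surface between the target protein and that partner must be known.
The invention provides a method and a system that do not require the time-consuming and laborious construction of large protein libraries or high-throughput screening. Instead, by identifying the variable elements (also called variable regions) of the protein to be engineered and combining this with information on known targets and non-linear peptide sequences, the invention judges the ability of any unknown protein template to produce antibody mimetics and can rapidly produce antibody mimetics against a specific target.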
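Figure 1 describes the overall structure of the invention. In brief, the invention analyzes the structural profile of a protein template and finds contiguous regions that deviate significantly from that profile (the variable elements). One or more non-linear peptides known to bind a target specifically are then substituted for, or inserted into, such a region. Finally, the resulting protein variant is tested for whether it still binds the target specifically, and the ability of the protein template to produce antibody mimetics is judged accordingly. The invention can also engineer known or unknown protein templates to produce antibody mimetics against any target. In brief, for any target, the relatively simple technique of short peptide library display is first used to select a number of non-linear peptides. These are substituted for, or inserted into, the variable elements identified by structural profile analysis in one or more protein templates, and antibody mimetics that bind the target specifically are then found among the resulting protein variants. The advantage of the invention is that, unlike existing techniques, it bypasses the difficult operation of displaying larger proteins directly in a library: it uses the structural profile to identify relatively independent variable elements in the protein template and produces the final antibody mimetic by polypeptide replacement and similar means. The invention thus overcomes the bottleneck of the prior art, which treats the variable element and its template protein as an indivisible whole in screening or design and is therefore limited to screening around a single protein template. Multiple structural elements (corresponding to multiple protein templates) and multiple variable elements (corresponding to multiple non-linear peptides) can be screened at the same time, which lowers the difficulty, raises throughput and success rate, and overcomes the problem that the boundary of the variable region changes with the target.
In particular, the function of an antibody mimetic rests on a certain structural flexibility rather than on a specific, rigid structure that is hard to change. When interacting with its target molecule, it can change its own structure so as to bind the target more effectively. From the standpoint of antibody mimetic drug design, more flexible regions tolerate large-scale mutation better: they are less likely to compromise structural stability (druggability) and more likely to form stronger binding with the target molecule. The protein structural information obtained with current techniques consists mainly of static images (such as X-ray crystal structures or NMR structures). These capture only flexibility on a very small scale that arises from thermal stability; they hardly reveal the larger-scale flexibility that matters more for antibody mimetic drug design, and they cannot predict the changes caused by polypeptide sequence replacement or insertion. The invention provides a method that uses a fully probabilistic mathematical model with explicit model parameters for three factors: intrinsic flexibility caused by thermal stability, intrinsic flexibility not caused by thermal stability, and deviations in protein structure tolerated during natural or artificial evolution. The parameters are estimated by comparing the structure of the target protein with other proteins of homologous structure, which makes it possible to design antibody mimetic proteins accurately and effectively by polypeptide sequence replacement or insertion.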
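In recent years, approaches to extending polypeptide half-life by targeting serum albumin have attracted attention. This technique genetically fuses an engineered small protein with serum albumin targeting (generally only about 100 residues) to the polypeptide whose half-life is to be extended, without greatly reducing activity. After the resulting fusion protein drug enters the circulation, most of it is adsorbed onto serum albumin and a small fraction remains free. The adsorbed fusion protein drug is protected from degradation or excretion by its reversible binding to serum albumin (half-life: 19-20 days). As the free fusion protein drug is consumed or cleared, adsorbed drug gradually dissociates from serum albumin, which maintains the drug concentration in the blood and sustains the effect over a long period. Liraglutide, one of the once-daily injectable GLP-1 products currently on the market, uses serum albumin targeting to raise its half-life to about 14 hours.
The key to this approach is to give the parent polypeptide drug sufficient serum albumin targeting. Existing methods include the following. (1) Chemical modification of the parent polypeptide (for example acylation, AlbuTag, and the like) to produce targeting. The binding strength produced by such methods is very limited, with Kd values generally in the μM range, and essentially cannot meet the half-life requirement for weekly injection. (2) Recombination of the parent polypeptide with a serum albumin targeting polypeptide (such as the albumin affinity peptides of Genentech, Dyax, Isogenics, and others) or an albumin-binding Fab fragment. The binding strength is somewhat better, with Kd values of several hundred nM to a few μM, but the resulting drug is generally a 50-60 polypeptide (unit not stated in the source) and production cost is relatively high. (3) Recombination of the parent polypeptide with a serum albumin targeting antibody mimetic (such as the dAbs of Domantis/GSK, the VHH of Ablynx and BAC, and Affibody). Parent proteins that can be used to produce such engineered targeting proteins include the Staphylococcus aureus protein A domain (US5831012, EP0739353) and human fibronectin (US6818418, EP1266025), among others. The binding strength produced by such methods can meet the requirement for injection weekly or at even longer intervals, and it is tunable. GLP-1 drugs of this kind can be produced by high-density fermentation, which greatly reduces cost; for the same production cost, capacity can be nearly a hundred times that of chemical synthesis. The documents cited in this paragraph are incorporated herein by reference in their entirety.
Key references are as follows (these documents are incorporated herein by reference in their entirety).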
Hopp et al. The effects of affinity and valency of an albumin-binding domain (ABD) on the half-life of a single-chain diabody-ABD fusion protein. Protein Engineering, Design & Selection vol. 23 no. 11 pp. 827-834, 2010
Stork et al. Biodistribution of a Bispecific Single-chain Diabody and Its Half-life Extended Derivatives. The Journal of Biological Chemistry vol. 284, no. 38, pp. 25612-25619, September 18, 2009
Jonsson et al. Engineering of a femtomolar affinity binding protein to human serum albumin. Protein Engineering, Design & Selection vol. 21 no. 8 pp. 515-527, 2008
Stork et al. A novel tri-functional antibody fusion protein with improved pharmacokinetic properties generated by fusing a bispecific single-chain diabody with an albumin-binding domain from streptococcal protein G. Protein Engineering, Design & Selection vol. 20 no. 11 pp. 569-576, 2007
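One aspect of the invention provides a set of methods for producing antibody mimetics without constructing large libraries. It uses five parts: a target system 100, a non-linear peptide (NLP) system 200, a template system 300, a design unit 400, and a test system 500, with two input points and one output point. Different purposes correspond to different logic flows.
One logic flow is shown as ① in Figure 1. Its purpose is to test the ability of an arbitrary template protein to produce antibody mimetics. This flow is mainly concerned with identifying the variable regions of the template protein and is not directly concerned with what target the antibody mimetic will bind. The flow starts from the template system 300. One path passes through the target system 100 and the NLP system 200 to the design unit 400; the other path carries information directly to the design unit 400. After the design unit 400 completes the design, the flow enters the test system 500, which outputs a conclusion with the help of the target system 100. Specifically, as shown in Figure 2, under this logic the template system 300 comprises an information collection unit 310 for obtaining basic information on the protein under test and an analysis unit 320 for analyzing the variable element regions of the protein under test. The target system comprises an information collection unit 110 for choosing a suitable reference target and a synthesis unit 120 for synthesizing that target. The NLP system 200 comprises an information collection unit 210 for choosing suitable non-linear peptide sequences and an experimental screening unit 220 for screening non-linear peptides against the chosen target. The design unit 400 combines the results of the target system, the NLP system, and the template system to design variants of the protein under test. The test system 500 comprises a synthesis unit 510 for synthesizing the variant proteins and a testing unit 520 for testing the binding of the variant proteins to the reference target and outputting the final evaluation.
The other logic flow is shown as ② in Figure 1. Its purpose is to rapidly produce the antibody mimetic best suited to a specific target. This flow is centered on the target and asks which antibody mimetic binds that target best, so it is not directly concerned with whether the antibody mimetic under test is a good one for most targets overall. The flow starts from the target system 100. One path passes through the template system 300 to the design unit 400; the other path passes through the NLP system 200 to the design unit 400. After the design unit 400 completes the design, the flow enters the test system 500, which outputs a conclusion with the help of the target system 100. Specifically, as shown in Figure 3, under this logic the target system 100 comprises an information collection unit 110 for obtaining information on the given target and a synthesis unit 120 for synthesizing that target. The template system 300 comprises an information collection unit 310 for obtaining information on template proteins that have, or may have, the potential to produce antibody mimetics, and an analysis unit 320 for analyzing the variable element regions of those template proteins. The NLP system 200 comprises an information collection unit 210 for choosing non-linear peptide sequences known to bind the given target and an experimental screening unit 220 for screening non-linear peptides against the given target. The design unit 400 combines the results of the target system, the NLP system, and the template system to design variants of the template proteins. The test system 500 comprises a synthesis unit 510 for synthesizing the variant proteins and a testing unit 520 for testing the specific binding of the variant proteins to the reference target and outputting the result.
Persons of ordinary skill in the art use the term template (or "scaffold") to describe a protein framework (Binz et al., Nature Biotechnology, Vol. 23, 1257 (2005), which is incorporated herein by reference in its entirety). Unlike ordinary protein frameworks, such a framework can usually tolerate certain changes while the template structure remains stable. These changes include, but are not limited to: (1) changes to some amino acids in one or more regions of the template; (2) insertion of foreign amino acid sequences into one or more regions of the template; and (3) complete or partial replacement of one or more regions of the template by foreign amino acid sequences. Some of these changes can alter the function of the template protein, including (but not limited to) raising or lowering the affinity of the template for its original binding target and enabling the template to bind targets it could not bind before.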
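Another aspect of the invention relates to GLP-1 (human glucagon-like peptide-1) receptor agonist macromolecular drugs, an important new class of drugs for treating type 2 diabetes. Before GLP-1 receptor agonist drugs appeared, patients with type 2 diabetes inevitably progressed to failure of pancreatic islet β-cell function and to complications as the disease advanced, whichever class of glucose-lowering drug they started with. In general, a single oral glucose-lowering drug fails at a rate of 5% to 20% per year, which means that monotherapy remains effective for about 5 years at most. For example, about 53% of patients on sulfonylurea monotherapy for more than 6 years must add insulin. Results from 9 years of follow-up showed that only 25% of patients on sulfonylurea or metformin monotherapy reached their blood glucose targets. As diabetes progresses, the incidence of chronic complications also rises. Complications are the main cause of death and disability in diabetic patients; cardiovascular and cerebrovascular complications in particular have become the leading cause of death in diabetes (up to 75%), and about 70% of diabetic patients are hospitalized because of cardiovascular disease.
The appearance of GLP-1 receptor agonist drugs changed this situation: they can prevent islet β-cell failure, improve insulin sensitivity, and reduce the cardiovascular and cerebrovascular complications of diabetes. Specifically, GLP-1 receptor agonist drugs differ from earlier diabetes drugs in the following aspects of mechanism and safety. (1) Their glucose-lowering action is "glucose concentration dependent" and is closest to the physiological endocrine state; it avoids the hypoglycemic adverse reactions of oral glucose-lowering drugs and insulin, allows fixed dosing, is superior to insulin, and is suited to long-term use. (2) They protect islet β cells and promote their proliferation, stimulate the islet β-cell response better than oral glucose-lowering drugs or exogenous insulin, and are the only class of drug that may halt the progressive worsening of type 2 diabetes. (3) They delay gastric emptying, control appetite, and reduce body weight, avoiding the weight-related adverse effects of oral glucose-lowering drugs, and are better accepted by patients. (4) They have combined cardiovascular protective and nervous system effects and therefore a broader range of potential uses, especially for patients with metabolic syndrome.
In terms of clinical effect, GLP-1 receptor agonist drugs have clear advantages over conventional drugs. (1) Glucose-lowering action that depends on blood glucose concentration. GLP-1 receptor agonists do not cause significant clinical hypoglycemia and are suitable for patients poorly controlled by diet or sulfonylureas and for patients who need insulin. (2) They improve insulin sensitivity and islet β-cell function, and can prevent and fundamentally cure diabetes in diabetic patients and in people with impaired glucose tolerance (IGT), who are susceptible to diabetes. They also promote the phased β-cell stimulation response better than exogenous insulin. (3) They reduce body weight and control food intake better than sulfonylureas, thiazolidinediones, and insulin. (4) In combination with metformin, their efficacy is better than monotherapy, combination with glimepiride, or combination with insulin glargine. (5) They protect the cardiovascular system, can lower blood pressure, reduce the cardiovascular complications of diabetes, and improve the stress response of the body.
In addition, studies show that GLP-1 receptor agonists reduce body weight through several routes, including inhibiting gastrointestinal motility and gastric secretion, suppressing appetite and food intake, and delaying gastric emptying. GLP-1 receptor agonists also act on the central nervous system (especially the hypothalamus), producing a feeling of fullness and reduced appetite. They have many other biological properties and functions as well: for example, they may lower lipids and blood pressure and so protect the cardiovascular system, and they may act centrally to enhance learning and memory and protect nerves.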
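The GLP-1 receptor agonist Exenatide (a polypeptide of 39 amino acids), synthesized by the US company Amylin Pharmaceuticals, came to market in 2005, and its long-acting sustained-release formulation came to market in 2012. Liraglutide, another GLP-1 receptor agonist, from Novo Nordisk, came to market in 2010. GLP-1 receptor agonists currently in clinical stages also include: GLP1-Fc, LY548806, and GLP1-PEG from Eli Lilly/Amylin; Semaglutide from Novo Nordisk; PC-DAC from ConjuChem; Albiglutide from GSK; Taspoglutide from Roche/Ipsen; Lixisenatide from Aventis/Zealand Pharma; ITCA650 from Intarcia; and GLP1-PEG from the Chinese company Hansoh Pharma.
Although GLP-1 receptor agonist drugs have great market potential, the drugs on the market require daily injection and have a high incidence of adverse reactions. Conventional techniques for extending polypeptide half-life (such as PEG chemical modification and serum albumin or Fc fusion) introduce polymers or protein molecules more than 10 times larger than the GLP-1 receptor agonist polypeptide, which causes loss of drug activity. In addition, the great majority of products, including those in development, require complex chemical synthesis, and the annual cost per patient is about 20,000 to 30,000 yuan. This is more than 5 times the average annual drug expenditure of a diabetic patient in China (4,000 yuan) and 10 times the cost of insulin injection, which is very expensive. Even generic products, constrained by the production process, are expected to sell for more than 10,000 yuan. The GLP-1 drugs on the market and in development can therefore meet the needs of only a "high-end market" of no more than 100,000 people, just 0.1% of the type 2 diabetes population in China. New long-acting genetically engineered GLP-1 drugs that are more convenient to use and cost less per year are therefore urgently needed.
The invention provides a series of GLP-1 receptor agonist macromolecular drugs based on serum albumin targeting polypeptides. They are characterized in that the macromolecule comprises the amino acid sequences shown in SEQ ID NO: 25, 26, and 43, or sequences similar to SEQ ID NO: 25, 26, and 43, which can activate the GLP-1 receptor; and the amino acid sequences shown in SEQ ID NO: 27-35, or sequences similar to SEQ ID NO: 27-35, which can bind serum albumin in a targeted manner.
The above macromolecular drug may also include a linker. The main purpose of the linker is to separate the two parts above (the GLP-1 receptor activating part and the serum albumin targeting polypeptide part) by a certain distance in space so that both can exert their biological effects better. Because its role is separation, the chemical composition of the linker is not important; only its size affects the separation (that is, the final biological function). The linker may therefore be a non-polypeptide or a polypeptide. A non-polypeptide linker may be natural or non-natural. For example, a non-polypeptide linker may be (but is not limited to) polyethylene glycol, polypropylene glycol, (ethylene/propylene) copolymer glycol, polyoxyethylene, polyurethane, polyphosphazene, polysaccharide, dextran, polyvinyl alcohol, polyvinylpyrrolidone, polyvinyl ethyl ether, polyacrylamide, polypropylene, polycyano compounds, lipid polymers, chitin, hyaluronic acid, and heparin. The amino acids in a polypeptide linker may be any amino acids, natural or non-natural, D-amino acids or L-amino acids; they may be proteinogenic or non-proteinogenic, and they may or may not be directly encoded by the genetic code. For example, a polypeptide linker may be SEQ ID NO: 36-42 or a sequence similar to SEQ ID NO: 36-42.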
SEQ ID NO: 25
Amino acid sequence of Exendin-4
HGEGTFTSDLSKQMEEEAVRLFIEWLKNGGPSSGAPPPS
SEQ ID NO: 26
Amino acid sequence of GLP-1
HAEGTFTSDVSSYLEGQAAKEFIAWLVKGRG
SEQ ID NO: 27
sab1
VSSVi
VYAEVRSFCTDWPAEKSCKPLRGP I S INYRT
SEQ ID NO : 28
sab2
VSSVF
VYAVTDWPAEKSPI S INYRT
SEQ ID NO : 29
sab3
KVKSIVTLDGGKLVHLQKWDGQETTLVRELIDGKLILTLTHGTAVCTRTYEKE Of :0N ai (53S
(9-2=u) Vu (MVVV3) V
6C :0N ai (53S
(9-T=u) u(soOf)SOO)
8C :0N ai (53S
(9-T=u) u(SOOOO)
ZC :0N ai (53S
VVVT 9C :0N ai (53S
3AaidVSSaOSOHMNi5V adIAAIVdnNi5I
IHdVS
Figure imgf000014_0001
9C :0N ai (53S
SV¾d丄 M6AV3SMi) )¾SNav6 IM人人 d丄 (Π丄 Ί3δ I (5HnHNOAAd
:0N ai (53S
SiadllOHVMSMISSaaOMIA SdaAlINcDnOIAlAS
:ON QI 3S
Siad!IOHVMSMISSaaOMIA SdaAlINcDnOIAl gqus :ON QI 3S
SiadII0HVMSMISSaa0MIA Sd3AlIN cDnOIAlASlIIVNVNMAMINVdlNI aAIAAamOIOddADHaaVMIiaiiaHSlISViiAOAddffldOS gq¾s
TC :ON QI (53S
SiadllOHVMSMISSaaOMIA SdaAlINcDnOIAlAS
:ON QI 3S
(PEAPTD)n (n=1-5)
SEQ ID NO : 41
IEGR
SEQ ID NO : 42
FNPRG (P/A/S)
SEQ ID NO : 43
Amino acid sequence of an Exendin-4 variant
HHGEGTFTSDLSKQMEEEAVRLFIEWLKNGGPSSGAPPSKKKKKK
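In addition to SEQ ID NO: 26 above, the amino acid sequence of GLP-1 also includes the following mutants:
(1) the A8G, R36G, and G37K mutants;
(2) His1 GLP-1 modified mutants, specifically: desamino-GLP-1, (D-His1)GLP-1, N-glucitol-GLP-1, N-imidazole-GLP-1, N-α-methyl-GLP-1, N-methyl-GLP-1, N-acetyl-GLP-1, and N-pyroglutamyl-GLP-1;
(3) Ala2 GLP-1 mutants, specifically: (D-Ala2)GLP-1, (Gly2)GLP-1, (Ser2)GLP-1, (Aha2)GLP-1, (Thr2)GLP-1, (Aib2)GLP-1, (Abu2)GLP-1, and (Val2)GLP-1;
(4) Glu3 GLP-1 mutants, specifically: (Asp3)GLP-1, (Ala3)GLP-1, (Pro3)GLP-1, (Phe3)GLP-1, (Lys3)GLP-1, and (Tyr3)GLP-1;
(5) the mutant KGLP-1, in which a lysine residue is added at the N-terminus of GLP-1.
The invention also provides a series of nucleic acid molecules that encode the polypeptides and fusion proteins.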
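The polypeptides and fusion proteins of the invention can be produced by chemical synthesis or by recombinant expression. Recombinant expression is generally preferred and proceeds as follows. The nucleic acid encoding the molecule is inserted into an expression vector, in which the DNA segment encoding the molecule is operably linked to control sequences that ensure its expression. Expression control sequences include, but are not limited to, promoters, signal sequences, enhancer elements, and transcription termination sequences. Once the vector has been introduced into a suitable host, the host is maintained under conditions suitable for high-level expression of the nucleic acid sequence and for collection and purification of the polypeptide or fusion protein. These expression vectors usually replicate in the host as episomes or as part of the host chromosomal DNA. Expression vectors usually contain selection markers (such as ampicillin resistance or tetracycline resistance) so that host cells expressing the desired DNA sequence can be detected. Hosts include, but are not limited to, E. coli and yeast.
Once expressed, the polypeptides and fusion proteins of the invention can be purified by standard methods in the art, including ammonium sulfate precipitation, affinity columns, column chromatography, HPLC purification, and gel electrophoresis. For pharmaceutical use, a substantially pure product of at least about 90-95% purity is preferred.
The polypeptide or fusion protein of the invention can be formulated into a pharmaceutical composition with one or more pharmaceutically acceptable excipients. These excipients include water-soluble fillers, pH adjusters, stabilizers, water for injection, osmotic pressure adjusters, and so on. The pharmaceutical composition can be administered by intramuscular, intravenous, subcutaneous, or other injection routes; the preferred dosage form is a lyophilized or solution injection. The water-soluble fillers include, but are not limited to, one or a combination of mannitol, low-molecular-weight dextran, sorbitol, polyethylene glycol, glucose, lactose, and galactose. The pH adjusters include, but are not limited to, one or a combination of physiologically acceptable organic or inorganic acids, bases, and salts such as citric acid, phosphoric acid, hydrochloric acid, potassium, sodium, or ammonium hydroxide, sodium, potassium, or ammonium carbonate, and sodium, potassium, or ammonium bicarbonate. The stabilizers include, but are not limited to, one or a combination of EDTA-2Na, sodium thiosulfate, sodium metabisulfite, sodium sulfite, dipotassium hydrogen phosphate, sodium bicarbonate, sodium carbonate, arginine, glutamic acid, polyethylene glycol, sodium dodecyl sulfate, and tris(hydroxymethyl)aminomethane. The osmotic pressure adjusters include, but are not limited to, one or a combination of sodium chloride and potassium chloride. The pharmaceutical composition of the invention can also be administered in combination therapy, that is, combined with other agents. For example, combination therapy may include the composition of the invention together with at least one other therapeutic agent, such as anti-inflammatory drugs, anticancer drugs, and chemotherapeutic drugs.
Detailed Embodiments
Reference is made here to the specific examples in the description of the drawings. In the detailed description below, many specific details are set out to provide a thorough understanding of the invention. The examples are given only to illustrate the invention and not to limit its scope. In the following examples, processes and methods not described in detail are conventional methods well known in the art. The source and trade name of each reagent, and its composition where necessary, are given at first mention; the same reagent used thereafter is as first described unless otherwise stated.
Example 1: Assessing the potential of an unknown protein template to produce antibody mimetics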
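The following specific example illustrates the use of the invention to assess the potential of a given protein (PDB ID: 1x5j) to produce antibody mimetics. The example starts from the template system 300. In the information collection unit 310 of the template system 300, the unit first collects the known data on the 1x5j protein, including but not limited to its primary sequence, secondary structure information, tertiary structure information, production information (such as production process and expression efficiency), and functional information (such as subcellular localization and stability against enzymes). Common methods include database queries and literature mining. In this example, the primary sequence of 1x5j, SEQ ID NO 1, was obtained by querying the SCOP database.
KPNTLYEFSVMVTKGRRSSTWSMTAHGTTFEL (SEQ ID NO 1)
The unit also collects information on other proteins similar in sequence to 1x5j. The usual method is a database query. In this example, the information collection unit 310 used the position-specific iterated BLAST (PSI-BLAST) algorithm to search the SWISS-PROT database and collected a total of 301 other proteins with sequences similar to 1x5j. PSI-BLAST and BLAST are sequence database search algorithms commonly used in the art (Altschul et al., Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl. Acids Res. (1997) 25(17):3389-3402 doi: 10.1093/nar/25.17.3389) ("the Altschul 1997 article"). This document is incorporated herein by reference in its entirety. BLAST stands for Basic Local Alignment Search Tool. Several well-known bioinformatics centers around the world provide web-based BLAST servers.
Specifically, the sequence of the target protein is first used as the query to search the SWISS-PROT database with the BLAST algorithm, which yields alignments of multiple similar sequences, and a position-specific scoring matrix is built from them. The scoring matrix is then used as the query to search the SWISS-PROT database again with the BLAST algorithm, in order to find new similar protein sequences and update the scoring matrix. For example, both building and updating the position-specific scoring matrix can follow the Altschul 1997 article. This process is iterated until no new similar sequences are found.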
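The iterate-until-no-new-hits loop described above can be sketched generically. This is not PSI-BLAST itself: `search` and `build_profile` are hypothetical callables standing in for a database search and for the construction of a position-specific scoring matrix, and `max_rounds` is an assumed safety limit.

```python
def iterate_until_converged(query, search, build_profile, max_rounds=20):
    """Repeat profile searches until a round finds no new sequences.

    search(items)       -- hypothetical: returns the set of hits for a query or profile
    build_profile(hits) -- hypothetical: turns the current hits into the next query
    """
    hits = set(search([query]))
    for _ in range(max_rounds):
        profile = build_profile(hits)
        new_hits = set(search(profile)) - hits
        if not new_hits:
            break  # converged: the search found nothing new
        hits |= new_hits
    return hits
```

In a real pipeline both callables would wrap an existing search tool; here they are left abstract on purpose.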
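The unit also collects information on other proteins structurally similar to 1x5j. In this example, the information collection unit 310 queried the SCOP database and obtained 5 other proteins that are structurally similar to 1x5j and likewise of human origin (SCOP identifiers: d1x5fa1, d1x5ha1, d1x5ka1, d1x5ga1, and d1x5ia1). SCOP is a database for the structural classification of proteins. It provides information on the structural and evolutionary relationships among proteins of known structure, and it covers all entries in the PDB structure database. Its classification is derived mainly by manual inspection and comparison, and its levels include structural family, structural superfamily, and fold. The CATH database is a similar resource.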
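Next, in the analysis unit 320 of the template system 300, the unit first analyzes the profile of 1x5j and its sequence-similar proteins, finds the rapidly evolving sites in the 1x5j protein, and scores them. The general procedure is to perform a multiple sequence alignment of the sequences, build a phylogenetic tree, calculate the evolutionary rate of each site according to a particular model of molecular evolution, and assign scores. Common methods for multiple sequence alignment include the CLUSTAL algorithm (see Larkin et al. Clustal W and Clustal X version 2.0. Bioinformatics (2007) 23(21):2947-2948, which is incorporated herein by reference in its entirety) and the Dialign algorithm (see Morgenstern et al. DIALIGN: finding local similarities by multiple sequence alignment. Bioinformatics (1998) 14(3):290-294). This document is incorporated herein by reference in its entirety. Common methods for building phylogenetic trees include the neighbor-joining algorithm, the unweighted pair group method (UPGMA), the minimum evolution (ME) algorithm, the maximum parsimony (MP) algorithm, the maximum likelihood (ML) algorithm, and Bayesian algorithms. The art has several well-known hypotheses and models for sequence evolution analysis, including base substitution rate models and among-site rate variation models, which can be used to calculate evolutionary rates (Johnson et al. Model selection in ecology and evolution, Trends in Ecology & Evolution, 19(2):101-108 (2004)). This document is incorporated herein by reference in its entirety. In this example, scoring was carried out with the ConSurf software (Glaser et al. ConSurf: identification of functional regions in proteins by surface-mapping of phylogenetic information. Bioinformatics. 2003 Jan;19(1):163-4, which is incorporated herein by reference in its entirety), and a conservation score greater than 1 was used as the selection criterion to obtain the qualifying rapidly evolving sites. The results are shown in Table 1.
Table 1. Scores of rapidly evolving sites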
Residue no.  Residue type  Conservation score  Similar sequences / total  Residue types at the same position
13  V  1.049  269/301  S,F,T,N,K,Y,E,V,Q,M,C,L,A,P,H,D,R,I,G
31  D  1.385  33/301   A,F,S,T,P,K,E,Y,H,M,D,L
36  K  2.309  294/301  S,T,N,K,Y,E,V,Q,M,L,A,P,H,D,I,R,G
37  H  1.11   294/301  F,S,T,N,K,E,Y,V,Q,M,C,L,A,P,H,D,I,R,G
53  N  1.092  292/301  F,S,T,N,K,Y,E,V,Q,M,L,A,W,P,H,D,R,I,G
54  I  1.111  239/301  S,F,T,N,K,Y,E,V,Q,M,C,L,A,W,P,H,D,I,R,G
55  P  1.349  293/301  F,S,T,N,K,Y,E,V,Q,C,L,A,P,H,D,I,R,G
56  A  2.302  293/301  F,S,T,N,K,Y,E,V,Q,C,L,A,P,H,D,I,R,G
57  N  2.167  293/301  F,S,T,N,K,Y,E,V,Q,M,L,A,P,H,D,R,I,G
58  T  1.546  265/301  S,F,T,N,K,E,Y,V,Q,M,L,A,W,P,H,D,I,R,G
59  K  1.766  288/301  F,S,A,T,N,P,K,Y,E,V,H,Q,M,D,R,I,G,L
60  Y  2.279  293/301  F,S,T,N,K,Y,E,V,Q,M,C,L,A,W,P,H,D,R,I,G
61  K  2.187  299/301  S,F,T,N,K,E,Y,V,Q,M,C,L,A,W,P,H,D,I,R,G
62  N  1.447  300/301  F,S,T,N,K,Y,E,V,Q,M,C,L,A,W,P,H,D,R,I,G
64  N  1.101  301/301  F,S,T,N,K,Y,E,V,Q,M,C,L,A,W,P,H,D,I,R,G
66  T  1.025  299/301  S,A,T,N,P,K,E,V,H,Q,M,D,R,I,G,L
67  T  1.874  300/301  F,S,T,N,K,E,V,Q,M,C,L,A,P,H,D,R,I,G
The unit also analyzes the structural profile of the 1x5j protein and the other proteins structurally similar to it, and finds the variable elements of 1x5j from that profile. The structural profile can be built from three-dimensional Euclidean coordinates of all atoms of the protein, of Cα atoms, or of other atom types. In this example, hidden Markov models are used to describe the Cα structural profile of any group of protein structures. Hidden Markov models are mathematical models commonly used in the art. They are widely used to describe the randomness and latent structure of incomplete data, and they have important applications in describing protein sequence profiles and structural profiles. For an introduction, see Eddy, Profile Hidden Markov Models. Bioinformatics 14(9):755-763 (1998). This document is incorporated herein by reference in its entirety. In this example, the hidden Markov model used to describe the protein structural profile has n nodes. Each node has three states, M, D, and I. The M state of node k can transition only to the M or D state of node k+1 or to the I state of node k; the D or I state of node k can likewise transition only to the M or D state of node k+1 or to the I state of node k.
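The transition rule just described can be written down as a small helper. This is a sketch under assumptions: nodes are numbered 1 to n, the function name `allowed_transitions` is hypothetical, and the handling of the last node (no successor node) is an interpretation that the text does not spell out.

```python
def allowed_transitions(state, k, n_nodes):
    """List the (state, node) pairs reachable from `state` at node k.

    Per the description, M, D and I at node k can each move to the I state
    of node k, or to the M or D state of node k + 1.
    """
    if state not in {"M", "D", "I"}:
        raise ValueError("state must be 'M', 'D' or 'I'")
    targets = [("I", k)]
    if k + 1 <= n_nodes:  # assumption: the last node has no k + 1 successor
        targets += [("M", k + 1), ("D", k + 1)]
    return targets
```

For example, `allowed_transitions("M", 3, 10)` lists the I state of node 3 and the M and D states of node 4.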
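This state transition probability matrix is treated as an unknown parameter, but it does not depend on the index of the node. The M state of each node k corresponds to its own unknown emission probability distribution (a three-dimensional normal distribution) with mean parameter $(x_k, y_k, z_k)$ and variance parameter $\sigma_j^2$, from which three-dimensional coordinates are generated. In particular, $\sigma_j^2$ is defined separately for each protein structure ($X_j$) and does not vary with node position. By contrast, the I states of all nodes correspond to one and the same unknown three-dimensional Gaussian distribution with mean parameter $(x, y, z)$ and variance parameter $\sigma^2$, from which three-dimensional coordinates are generated. Under this model, the three-dimensional structure ($X_j$) of any protein can be regarded as a random three-dimensional point set produced as follows: a random path (A) arises according to a profile graph (G), random variables (Y) are generated according to emission probabilities, and rotation (R) and translation (V) operations are applied. Because the joint or conditional probabilities involved in this random process can all be obtained with the Forward or Viterbi algorithm, which are well known in the art (see Eddy, Profile Hidden Markov Models. Bioinformatics 14(9):755-763 (1998), which is incorporated herein by reference in its entirety), a random sampling method well known in the art, such as the Monte Carlo method, can be used to update the unknown parameters G, A, Y, R, and V of the hidden Markov model until convergence, giving the final protein structural profile. Figure 4 shows the final structural profile of one group of protein structures. The thicker the line between node states, the higher the probability that the observed protein structures occupy that position. It can be seen that at nodes -1 to 2, nodes 10 to 22, nodes 42 to 51, and nodes 112 to 116, this group of protein structures tends to deviate from the M state of the model, that is, from the homologous, conserved spatial position, and is instead either absent (D state) or randomly placed in space (I state). Structural regions with these profile features are defined in this system as variable elements. Variable elements can be obtained by visual inspection and qualitative analysis of the structural profile, as in the example above, or they can be selected precisely from the structural profile by statistical methods, as in the example below. For a particular protein structure ($X_j$), 100 samples of the random path ($A_1, \ldots, A_{100}$) and the corresponding $(R, V)_1, \ldots, (R, V)_{100}$ can be generated during the parameter estimation by random sampling described above. For each sample, the node state corresponding to each residue of the protein structure is checked. If it is the I state, the residue is flagged as a potential variable element. If it is the M state but the spatial position of the residue deviates greatly from the emission probability distribution of that M state (for example, the emission probability is below 0.05), the residue is also flagged as a potential variable element. Over the 100 samples, residues flagged as potential variable elements in more than a certain proportion of samples (for example, 95%) are finally regarded as variable elements. Figure 5 shows the variable elements of 1x5j obtained by this method.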
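To make the hidden Markov model described above easier to understand, the process of identifying variable elements can be compared to the following thought experiment. Suppose someone climbs the Great Wall carrying pebbles of five different colors and, at most of the beacon towers, stops and throws one pebble of each color. Most of the time the person uses little force, so the pebbles simply drop to the ground nearby. Sometimes, however, the person throws pebbles of one or more colors very far. Sometimes the person also jumps off the wall at one beacon tower, skips the next tower, climbs back up at the tower after that, and carries on along the wall throwing pebbles. Now suppose an observer did not follow this person around. The observer sees only the trail formed in the end by many pebbles of five colors and cannot even see the wall. The observer's task is to infer the course of the wall and how each pebble was thrown, in other words how the pebbles came to lie as they do today.
Inferring the course of the wall and how each pebble was thrown is like the problem of finding variable regions in a protein template described above. The wall in the thought experiment is the structural profile. The pebbles of five colors are like the proteins in the protein group (assuming five proteins). All that can be seen today are these five proteins (the five colors of pebbles), and the question is how they came to be as they are. The beacon towers are like the nodes of the hidden Markov model. Dropping a pebble gently is like the M (Match, homologous and conserved) state of a node. Throwing a pebble hard is like the I (Insert, random in space) state. Skipping a beacon tower is like the D (Delete, absent) state.
Some of the variable regions of 1x5j identified by the above method lie at amino acids 32 to 43, amino acids 55 to 58, and amino acids 90 to 93.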
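As described above, after the variable regions of 1x5j have been identified, unit 110 of the target system obtains information on target proteins suitable for evaluating 1x5j. Unit 120 synthesizes and purifies the target, which includes expressing and purifying the full-length target protein, specific fragments of the target protein, or target protein expressed on stably or transiently transfected cell lines; the usual methods are well known in the art. In this example, serum albumin of human, mouse, and rabbit origin was chosen as the target and purchased from Sigma-Aldrich.
Next, in the NLP system 200, the information collection unit 210 obtains the sequences of non-linear peptides (NLPs) known to bind the given target. Common methods include database searches and literature searches, and simple secondary design by addition or deletion can be carried out on that basis. NLP sequences that meet the search criteria include, but are not limited to, CDR sequences from antibodies to the given target, non-linear peptide sequences with targeted binding that are natural or obtained by artificial selection, and binding-site sequences of known ligands. When the information obtained by unit 210 is insufficient or incomplete, or when otherwise necessary, the screening unit 220 screens non-linear peptides against the given target, usually by phage display, mRNA display, or similar methods.
In this example, NLP polypeptides were screened with an M13 phage peptide library (Ph.D.-C7C phage library, New England Biolabs). The Ph.D.-C7C phage display peptide library is a combinatorial library built by fusing random heptapeptides to the minor coat protein (pIII) of M13 phage. The displayed random peptide is flanked on each side by a cysteine (Cys). Under non-reducing conditions the two cysteines spontaneously form a disulfide bond, which cyclizes the displayed peptide. Libraries of 7-mers constrained within a disulfide loop have been shown to identify epitope structures and mirror-image ligands for D-amino acid targets, and to support the development of peptide-based therapeutics. The random peptide expressed by this library is at the N-terminus of the phage minor coat protein pIII; the first cysteine is preceded by an alanine residue, and the linker peptide Gly-Gly-Gly-Ser lies between the second cysteine and the wild-type pIII sequence. The library consists of 10^9 different clones and is amplified once to give the phage library, in which each sequence is present at about 100 copies per 10 μl.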
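As a rough check on what a library of this size can cover, the sketch below compares the number of possible random heptapeptides with the stated number of clones. It assumes all 20 amino acids are allowed at each of the 7 random positions and ignores codon bias and duplicate clones, so the coverage it computes is only an upper bound; the variable names are arbitrary.

```python
# Theoretical number of distinct 7-residue peptides over 20 amino acids.
theoretical_diversity = 20 ** 7

# Library complexity as stated in the text (10^9 independent clones).
library_clones = 10 ** 9

# Upper bound on the fraction of sequence space the library could contain.
max_coverage = min(1.0, library_clones / theoretical_diversity)
```

Under these assumptions the library is of the same order of magnitude as the full heptapeptide space, which is not possible for a 5-20 kDa protein library.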
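The details of screening against the target protein are as follows. Human serum albumin (HSA, Sigma-Aldrich) was dissolved in 0.1 M NaHCO3 to give an HSA solution of 100 μg/ml (pH 8.6), which was then immobilized by physical adsorption onto 96-well ELISA plates (Nunc MaxiSorp). In the first three rounds of biopanning, 1.5 ml of HSA solution was added to each polystyrene Petri dish (60 x 15 mm, Corning, USA), and the dishes were placed in a humidified container and rocked gently overnight at 4 °C. In the fourth round the HSA concentration was changed to 10 μg/ml. The wells were blocked for 1 hour with 1% ovalbumin prepared in 0.1 M NaHCO3 (pH 8.6). 10 μl of the phage library was suspended in Tris-buffered saline (TBS) containing 0.1% Tween-20. In the first three rounds the phage were allowed to bind HSA for 1 hour at room temperature with gentle rocking; in the fourth round binding was for 20 minutes. Unbound phage were removed by repeated washing with TBS, which contained 0.1% (v/v) Tween-20 in the first three rounds and 0.3% (v/v) Tween-20 in the fourth round. Phage were eluted by adding 2 M glycine-HCl (pH 2.2) to the phage-bound HSA for 10 minutes, and the eluted phage were neutralized by adding 150 μl of 1 M Tris-HCl (pH 9.2). Phage concentration can be estimated by titration on E. coli strain ER2738. The eluted phage were propagated in E. coli strain ER2738. An overnight culture was diluted 1:100 in LB medium, and 1 ml of the diluted culture was dispensed into each culture tube. Using one end of a sterile wooden stick, blue plaques were picked from plates with fewer than about 100 plaques, transferred to the tubes of diluted culture, and incubated with shaking at 37 °C for 5 hours. The cultures were centrifuged in microcentrifuge tubes at 12,000 rpm for 10 minutes; the supernatant contains large numbers of amplified phage particles. The upper 80% of the supernatant was removed and stored at 4 °C, where it keeps for several weeks without loss of titer. After 4 rounds of selection, phage particles were identified by DNA sequencing with the universal primer 96gIII (5'-CCCTCATAGTTAGCGTAACG-3'), and their binding to HSA was assessed by ELISA with a rabbit anti-M13 phage antibody. One-sixth volume of PEG/NaCl solution was added to the amplified phage supernatant for overnight precipitation, followed by centrifugation at 12,000 rpm for 10 minutes. One row of ELISA wells was coated with 200 μl of HSA at 100 μg/ml diluted in 0.1 M NaHCO3 and incubated overnight at 4 °C in an airtight humidified box. Another plate was used for the serial dilution of phage. Both plates were blocked with 1% casein dissolved in 0.1 M NaHCO3. Phage were diluted in 4-fold steps in 200 μl per well of TBS containing 0.1% Tween-20; the first well contained 10^12 virions and the last, the 12th well, contained 2.4 x 10^5 virions. Each row of phage was transferred with a multichannel pipette to the HSA-coated plate. The plate was incubated with shaking at room temperature for 1 hour and then washed with TBS containing 0.3% Tween-20. It was then incubated with rabbit anti-M13 phage antibody, and bound phage were detected with horseradish peroxidase-conjugated goat anti-rabbit IgG. The amount of bound horseradish peroxidase was measured as the absorbance at 405 nm after color development with the substrate ABTS/H2O2. Each sample was measured 5 times. No phage was added to the control. The background absorbance at 405 nm was subtracted from every reading. The absorbance at 405 nm in the ELISA correlates with the amount of bound phage. Phage clones whose absorbance was significantly higher than the control were picked and sequenced to obtain the NLP polypeptide sequences they contain (some are shown in Table 2).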
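The dilution series in the ELISA can be reproduced with a short calculation. The sketch below assumes an ideal 4-fold dilution at every step, starting from 10^12 virions in the first well; the function name `serial_dilution` is an illustrative choice.

```python
def serial_dilution(start, fold, wells):
    """Virions per well for an ideal serial dilution across `wells` wells."""
    return [start / fold ** i for i in range(wells)]

# 12 wells, 4-fold steps, 10^12 virions in the first well.
titers = serial_dilution(1e12, 4, 12)
last_well = titers[-1]
```

The twelfth well comes out at roughly 2.4 x 10^5 virions, consistent with the figure given in the protocol.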
Table 2: NLP sequences used in this example
Figure imgf000020_0001
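Next, in the design system 400, all or part of the NLP peptide sequences obtained above are inserted into, or substituted for part of, the sequence of the target protein 1x5j in order to engineer it. In this example, the results of the sequence-profile and structure-profile analyses were first combined to determine the positions at which a non-linear polypeptide sequence can be inserted. In Figure 6, the three circled regions (A, B, C) are the variable elements identified by analyzing the structure profile of the 1x5j protein with the method disclosed in this application, while the regions drawn as small spheres are the non-conserved sequences identified by analyzing its sequence profile. Both structural and sequence information are thus used to analyze and identify variable elements. The methods disclosed in this application do not, however, require sequence information: variable elements can also be analyzed and identified using only the structural information of the target protein (that is, the template protein) itself.
In practice, it is generally recommended that the structure-profile results play the leading role and the sequence-profile results a supporting one. For example, variable elements A and C also contain non-conserved sequences and show signs of rapid evolution, so both can be used for NLP engineering. A non-linear polypeptide (NLP) is on average about 10-20 residues long, which is far larger than variable element C and close to the size of variable element A. The first choice is therefore to graft all or part of the NLP sequence into variable element A. In addition, to optimize the stability, solubility, and other properties of the variant protein, it is generally also recommended to replace certain residues outside the variable elements. Common operations include adding to, deleting from, or adapting the N-terminal or C-terminal sequence to suit the expression host; replacing residues in the linker regions that connect secondary-structure elements with residues that have shorter side chains, such as serine; and replacing cysteine with serine. The main purpose is to reduce the hydrophobicity of the modified region and thereby improve the solubility and other properties of the variant protein.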
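The grafting step described above amounts to replacing one stretch of the template sequence with an NLP sequence. The following is a minimal sketch of that operation, not the design system itself. The helper name `graft` is hypothetical, and the boundaries of the replaced stretch (`NSLPKHQKITDS`) are an assumption inferred by comparing the 1x5j template (SEQ ID NO 1) with the variant of SEQ ID NO 9; the patent does not state them explicitly. Only a short fragment of the template is used.

```python
def graft(template, region, nlp):
    """Replace the first occurrence of `region` in `template` with `nlp`."""
    start = template.find(region)
    if start < 0:
        raise ValueError("region not found in template")
    return template[:start] + nlp + template[start + len(region):]


# Fragment of the 1x5j template around the grafted loop (from SEQ ID NO 1).
TEMPLATE_FRAGMENT = "ITWADNSLPKHQKITDSRYYTVRW"
# Assumed extent of the replaced stretch, inferred from SEQ ID NO 1 vs. SEQ ID NO 9.
VARIABLE_ELEMENT = "NSLPKHQKITDS"
# NLP sequence listed as SEQ ID NO 2.
NLP = "EVRSFCTDWPAEKSCKPLRG"

variant = graft(TEMPLATE_FRAGMENT, VARIABLE_ELEMENT, NLP)
print(variant)  # ITWADEVRSFCTDWPAEKSCKPLRGRYYTVRW
```

The printed fragment matches the corresponding stretch of the variant listed as SEQ ID NO 9.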
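The sequences of some of the engineered 1x5j variants are shown in Table 3:
Table 3: Sequences of 1x5j variants and their targeted binding strength to serum albumin
ID | Variant sequence after engineering template protein 1x5j | Human serum albumin | Mouse serum albumin | Rabbit serum albumin
SEQ ID NO 9 | SGPMMPPVGVQASILSHDTIRITWADEVRSFCTDWPAEKSCKPLRGRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGTTFELS | 1.034 | 0.511 | 0.727
SEQ ID NO 10 | SGPMMPPVGVQASILSHDTIRITWADEMCYFPGICWMRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGTTFELS | 0.711 | 0.489 | 0.205
SEQ ID NO 11 | SGPMMPPVGVQASILSHDTIRITWADRLIEDICLPRWGCLWEDDRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGTTFELS | 0.366 | 0.415 | 0.623
Next, in the experimental system 500, the synthesis unit 510 synthesizes the protein variants designed above. Common methods include chemical synthesis, enzymatic cleavage, and expression in bioreactors. In this example, the target protein was produced by expression in the prokaryote Escherichia coli (E. coli), as follows: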
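1) Construction of DNA encoding the protein variant. The DNA sequence encoding the protein can be constructed by artificial synthesis or by PCR. The present invention uses whole-gene synthesis to prepare the full-length double-stranded DNA of the protein variant. In this example, HIS and FLAG tags were added to the N-terminus of the variant to facilitate subsequent purification and detection. The fusion protein sequence (SEQ ID NO 12) is:
Figure imgf000021_0001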
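A quick way to check that a synthesized coding sequence carries the intended N-terminal tags is to translate its first codons. The sketch below uses the standard genetic code and the opening 66 nucleotides of the coding DNA listed as SEQ ID NO 13 in Table 7; the helper `translate` is a hypothetical illustration and not a tool named in the patent.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 22 codons of SEQ ID NO 13 (start codon, His tag, FLAG tag).
TAG_DNA = "ATGGGCCATCATCACCATCATCACCACCATCACCATAGCAGCGACTACAAAGACGACGATGACAAA"
print(translate(TAG_DNA))  # MGHHHHHHHHHHSSDYKDDDDK
```

The translation shows ten consecutive histidines followed by the FLAG epitope DYKDDDDK, consistent with the HIS and FLAG tags described for the fusion protein.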
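The coding DNA synthesized in this example has the following features: an NcoI restriction site at the 5' end, used for ligation to the 3' end of the expression vector pET28a, and an XhoI restriction site at the 3' end, used for ligation to the 5' end of pET28a. The product was double-digested (NcoI/XhoI) and purified.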
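Before ordering or digesting a synthetic insert, it is useful to confirm that each restriction site occurs exactly where intended and nowhere else. The sketch below scans a sequence for the NcoI (CCATGG) and XhoI (CTCGAG) recognition sites. The function and the example insert are hypothetical; the 12-nucleotide filler stands in for a real coding region.

```python
# Recognition sequences of the two enzymes used for cloning into pET28a.
SITES = {"NcoI": "CCATGG", "XhoI": "CTCGAG"}


def find_sites(dna, sites=SITES):
    """Return 1-based start positions of each recognition site in `dna`.

    Both sites are palindromic, so scanning one strand is sufficient.
    """
    dna = dna.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = dna.find(motif)
        while start != -1:
            positions.append(start + 1)
            start = dna.find(motif, start + 1)
        hits[enzyme] = positions
    return hits


# Hypothetical insert: NcoI site, a short placeholder coding region, XhoI site.
insert = "CCATGG" + "GCTAGCAAAGGT" + "CTCGAG"
print(find_sites(insert))  # {'NcoI': [1], 'XhoI': [19]}
```

An insert suitable for directional cloning should show a single hit for each enzyme, one at each end.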
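2) Construction of the expression vector for the protein variant and expression.
Using restriction digestion and sticky-end ligation as routinely practiced in the art, the template DNA and the expression plasmid are each digested and then joined with DNA ligase to give the desired expression vector, as shown in Figure 7. In this example, the expression vector pET28a was double-digested (NcoI/XhoI) and ligated to the purified product from the step above. The ligation product was transformed into DH5α competent cells by heat shock, a routine technique in the art. Plasmid was then extracted and sequenced. Expression vectors confirmed by sequencing to encode the complete fusion protein were transformed into BL21(DE3) competent cells for expression. The fusion protein of the present invention is expressed with the lac-promoter-based E. coli fusion protein expression technology routine in the art, with IPTG used to induce production of the fusion protein. BL21(DE3) cells carrying the fusion protein expression vector were pre-cultured overnight in LB medium containing kanamycin. The overnight culture was diluted 1:100 into fresh LB medium containing kanamycin and grown at 37 °C until the OD600 reached 0.6; the temperature was then lowered to 25 °C and expression of the fusion protein was induced with IPTG.
The vector sequence (SEQ ID NO 14) is: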
1 ATCCGGATAT AGTTCCTCCT TTCAGCAAAA AACCCCTCAA GACCCGTTTA GAGGCCCCAA
61 GGGGTTATGC TAGTTATTGC TCAGCGGTGG CAGCAGCCAA CTCAGCTTCC TTTCGGGCTT
121 TGTTAGCAGC CGGATCTCAG TGGTGGTGGT GGTGGTGCTC GAGTTACTAG CTCAGTTCAA
[Positions 181-3120 of the vector sequence are garbled in the source text and are not reproduced here.]
3121 GCGTTTCGGT GATGACGGTG AAAACCTCTG ACACATGCAG CTCCCGGAGA CGGTCACAGC
3181 TTGTCTGTAA GCGGATGCCG GGAGCAGACA AGCCCGTCAG GGCGCGTCAG CGGGTGTTGG
3241 CGGGTGTCGG GGCGCAGCCA TGACCCAGTC ACGTAGCGAT AGCGGAGTGT ATACTGGCTT
3301 AACTATGCGG CATCAGAGCA GATTGTACTG AGAGTGCACC ATATATGCGG TGTGAAATAC
3361 CGCACAGATG CGTAAGGAGA AAATACCGCA TCAGGCGCTC TTCCGCTTCC TCGCTCACTG
3421 ACTCGCTGCG CTCGGTCGTT CGGCTGCGGC GAGCGGTATC AGCTCACTCA AAGGCGGTAA
3481 TACGGTTATC CACAGAATCA GGGGATAACG CAGGAAAGAA CATGTGAGCA AAAGGCCAGC
3541 AAAAGGCCAG GAACCGTAAA AAGGCCGCGT TGCTGGCGTT TTTCCATAGG CTCCGCCCCC
3601 CTGACGAGCA TCACAAAAAT CGACGCTCAA GTCAGAGGTG GCGAAACCCG ACAGGACTAT
3661 AAAGATACCA GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG CTCTCCTGTT CCGACCCTGC
3721 CGCTTACCGG ATACCTGTCC GCCTTTCTCC CTTCGGGAAG CGTGGCGCTT TCTCATAGCT
3781 CACGCTGTAG GTATCTCAGT TCGGTGTAGG TCGTTCGCTC CAAGCTGGGC TGTGTGCACG
3841 AACCCCCCGT TCAGCCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT GAGTCCAACC
3901 CGGTAAGACA CGACTTATCG CCACTGGCAG CAGCCACTGG TAACAGGATT AGCAGAGCGA
3961 GGTATGTAGG CGGTGCTACA GAGTTCTTGA AGTGGTGGCC TAACTACGGC TACACTAGAA
4021 GGACAGTATT TGGTATCTGC GCTCTGCTGA AGCCAGTTAC CTTCGGAAAA AGAGTTGGTA
4081 GCTCTTGATC CGGCAAACAA ACCACCGCTG GTAGCGGTGG TTTTTTTGTT TGCAAGCAGC
4141 AGATTACGCG CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT ACGGGGTCTG
4201 ACGCTCAGTG GAACGAAAAC TCACGTTAAG GGATTTTGGT CATGAACAAT AAAACTGTCT
4261 GCTTACATAA ACAGTAATAC AAGGGGTGTT ATGAGCCATA TTCAACGGGA AACGTCTTGC
4321 TCTAGGCCGC GATTAAATTC CAACATGGAT GCTGATTTAT ATGGGTATAA ATGGGCTCGC
4381 GATAATGTCG GGCAATCAGG TGCGACAATC TATCGATTGT ATGGGAAGCC CGATGCGCCA
4441 GAGTTGTTTC TGAAACATGG CAAAGGTAGC GTTGCCAATG ATGTTACAGA TGAGATGGTC
4501 AGACTAAACT GGCTGACGGA ATTTATGCCT CTTCCGACCA TCAAGCATTT TATCCGTACT
4561 CCTGATGATG CATGGTTACT CACCACTGCG ATCCCCGGGA AAACAGCATT CCAGGTATTA
4621 GAAGAATATC CTGATTCAGG TGAAAATATT GTTGATGCGC TGGCAGTGTT CCTGCGCCGG
4681 TTGCATTCGA TTCCTGTTTG TAATTGTCCT TTTAACAGCG ATCGCGTATT TCGTCTCGCT
4741 CAGGCGCAAT CACGAATGAA TAACGGTTTG GTTGATGCGA GTGATTTTGA TGACGAGCGT
4801 AATGGCTGGC CTGTTGAACA AGTCTGGAAA GAAATGCATA AACTTTTGCC ATTCTCACCG
4861 GATTCAGTCG TCACTCATGG TGATTTCTCA CTTGATAACC TTATTTTTGA CGAGGGGAAA
4921 TTAATAGGTT GTATTGATGT TGGACGAGTC GGAATCGCAG ACCGATACCA GGATCTTGCC
4981 ATCCTATGGA ACTGCCTCGG TGAGTTTTCT CCTTCATTAC AGAAACGGCT TTTTCAAAAA
5041 TATGGTATTG ATAATCCTGA TATGAATAAA TTGCAGTTTC ATTTGATGCT CGATGAGTTT
5101 TTCTAAGAAT TAATTCATGA GCGGATACAT ATTTGAATGT ATTTAGAAAA ATAAACAAAT
5161 AGGGGTTCCG CGCACATTTC CCCGAAAAGT GCCACCTGAA ATTGTAAACG TTAATATTTT
5221 GTTAAAATTC GCGTTAAATT TTTGTTAAAT CAGCTCATTT TTTAACCAAT AGGCCGAAAT
5281 CGGCAAAATC CCTTATAAAT CAAAAGAATA GACCGAGATA GGGTTGAGTG TTGTTCCAGT
5341 TTGGAACAAG AGTCCACTAT TAAAGAACGT GGACTCCAAC GTCAAAGGGC GAAAAACCGT
5401 CTATCAGGGC GATGGCCCAC TACGTGAACC ATCACCCTAA TCAAGTTTTT TGGGGTCGAG
5461 GTGCCGTAAA GCACTAAATC GGAACCCTAA AGGGAGCCCC CGATTTAGAG CTTGACGGGG
5521 AAAGCCGGCG AACGTGGCGA GAAAGGAAGG GAAGAAAGCG AAAGGAGCGG GCGCTAGGGC
5581 GCTGGCAAGT GTAGCGGTCA CGCTGCGCGT AACCACCACA CCCGCCGCGC TTAATGCGCC
5641 GCTACAGGGC GCGTCCCATT CGCCA
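3) Purification of the fusion protein of the present invention.
Once the fusion protein of the present invention has been expressed in a suitable host cell, it can be purified by standard protein isolation and purification techniques, for example by a crude purification based on the tag expressed with the fusion protein. In the present invention, the purified fusion protein can be concentrated to the desired concentration by ion exchange, ultrafiltration, or similar methods. After 16 hours of expression, the culture was centrifuged and the cells were collected and disrupted by sonication, 10 times for 15 seconds each. The lysate was centrifuged at high speed and the supernatant was loaded onto a nickel column. The 6×HIS-tagged fusion protein was captured on the nickel column and eluted with an imidazole concentration gradient. The eluted protein was collected, dialyzed, concentrated, and sterilized by ultraviolet irradiation.
Next, in the testing unit 520, the protein variants are tested for their ability to bind the target protein. Common techniques include ELISA, FACS (high-precision sorting flow cytometry), and SPR (surface plasmon resonance), the details of which are well known in the art. In this example, an indirect ELISA was used, as follows. Plate wells were coated with 100 μl of a 5 μg/ml solution of the target protein (human serum albumin) prepared in 40 mM NaHCO3 (pH 9.5) and incubated overnight at 4 °C; for each sample, one well was left uncoated as a blank control. The coating solution was discarded, and the wells were washed once with deionized water and once with PBS. Each well was then blocked for 2 hours with 300 μl of 1% Ficoll 400 in PBS. The blocking solution was discarded, the wells were washed once with PBS, and 100 μl of protein sample diluted in PBST (PBS + 0.1% Tween 20) was added and incubated at room temperature for 1 hour. After three washes with PBST, 100 μl of FLAG® M2 monoclonal antibody (diluted 1000-fold in PBST) was added to each well and incubated at room temperature for 1 hour. After three more washes with PBST, 100 μl of HRP-labeled goat anti-mouse IgG diluted 1000-fold in PBST was added to each well and incubated at room temperature for 1 hour. The wells were washed four times with PBST and twice with PBS, 100 μl of 1-Step™ Turbo TMB-ELISA chromogenic substrate was added according to the manufacturer's instructions, and the reaction was stopped within 30 minutes by adding 100 μl of 2 M H2SO4. The absorbance at 450 nm was measured with a microtiter plate spectrophotometer; the results are shown in Table 3. After engineering according to the present invention, 1x5j can give rise to variants directed against the target protein, demonstrating its capacity to produce antibody mimetics.
Example 2: Phage ELISA to detect display of 1x5j variants on the phage surface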
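After the above 1x5j variants had been obtained and their binding to the target protein verified, phage library display was carried out as follows. 50 μl per well of Anti-V5 tag antibody solution, diluted 1000-fold in coating buffer (50 mM NaHCO3, pH 9), was added to a MaxiSorp microplate, and negative control wells were coated with coating buffer without antibody. The plate was incubated in a humidified box at room temperature for 1 hour (or overnight at 4 °C) and washed once with TBST. 200 μl of blocking buffer was added, and the plate was incubated in a humidified box overnight at 4 °C or for 1 hour at room temperature, then washed once with TBST. 50 μl of phagemid sample diluted in TBST was added (10^6, 10^7, and 10^8 phagemid particles per well) and incubated on an orbital shaker at room temperature for 40 minutes, followed by five washes with TBST. 50 μl of HRP-labeled anti-phage antibody solution (diluted 2500-fold in TBST/BSA) was added and incubated on an orbital shaker at room temperature for 40 minutes. The plate was washed five times with TBST and then twice with TBS. 50 μl of 1-Step Turbo TMB-ELISA substrate (Pierce) was added and incubated until a blue color developed. The reaction was stopped by adding 50 μl of 2 M H2SO4 with a filter tip, and the absorbance at 450 nm was read on a microplate reader.
TBS: 50 mM Tris, 150 mM NaCl, pH 7.5.
Blocking buffer: TBS containing 0.5% BSA.
TBST: TBS containing 0.1% Tween 20.
TBST/BSA: TBST containing 1 mg/ml BSA.
As shown in Figure 8, the optimal conditions found were an IPTG concentration of 0.2 mM, a baffled Erlenmeyer flask, and a culture volume of 20 ml. Under these conditions the phage titer was 1 × 10^7/ml.
Example 3: Generating non-antibody targeting proteins against a given target (human serum albumin)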
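In this example, the goal is to produce protein binding molecules against a given target. Through the information collection unit 310 of the template system, three non-antibody proteins, lfna, lhms, and lklg, were selected as template proteins, and information was obtained for proteins with similar sequences or similar structures to each.
Of these, lfna is a non-antibody protein template known to be capable of producing antibody mimetics (US6818418, which is incorporated by reference in its entirety into this application) and has been studied extensively over the past decade or more. Novel lfna-based targeting proteins that bind human serum albumin are described in US patent applications 13/098851 and 12/989494 (both incorporated by reference in their entirety into this application) and were obtained by large-scale phage library display. This example discloses protein binding molecules targeting human serum albumin that were produced using the present invention and that include entirely new artificial targeting binding sequences. lhms is a protein template known to have some capacity to produce antibody mimetics; the applicant's earlier patent application CN201210186485.9 (incorporated by reference in its entirety into this application) discloses specific sequences of human serum albumin binding molecules based on this protein, together with some of its variable regions. This example discloses the technical details of obtaining such novel binding molecules with the present invention, as well as additional variable regions. Finally, lklg is a non-antibody protein whose capacity to produce antibody mimetics was unknown.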
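>lfna
RDLEVVAATPTSLLISWDAPAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVDYTITVYAVTGRGDSPASSPISINYRTEI (SEQ ID NO 15)
>lklg
TRVSDKVMIPQDEYPEINFVGLLIGPRGNTLKNIEKECNAKIMIRGKGSVKEGKVGRKDGQMLPGEDEPLHALVTANTMENVKKAVEQIRNILKQGIETPEDQNDLRKMQLRELARLNGTLR (SEQ ID NO 16)
>lhms
VDAFLGTWKLVDSKNFDDYMKSLGVGFATRQVASMTKPTTIIEKNGDILTLKTHSTFKNTEISFKLGVEFDETTADDRKVKSIVTLDGGKLVHLQKWDGQETTLVRELIDGKLILTLTHGTAVCTRTYEKE (SEQ ID NO 17)
The analysis unit 320 was used to identify the variable elements of these three template proteins. One variable region of lfna (SEQ ID NO 15) was identified between amino acids 72 and 81. Variable regions of lklg (SEQ ID NO 16) include amino acids 10 to 15 and amino acids 45 to 68. Variable regions of lhms (SEQ ID NO 17) include amino acids 12 to 38, amino acids 67 to 71, amino acids 86 to 91, and amino acids 96 to 101.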
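Variable regions given as residue ranges can be pulled out of a template sequence with simple 1-based slicing. The sketch below does this for the lfna variable region at residues 72 to 81, using the lfna sequence as listed for SEQ ID NO 15 in Table 7. The helper `region` is a hypothetical name, and the example assumes the range is inclusive at both ends, which the text does not state explicitly.

```python
# lfna template sequence as listed for SEQ ID NO 15 in Table 7 (90 residues).
LFNA = (
    "RDLEVVAATPTSLLISWDAPAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKP"
    "GVDYTITVYAVTGRGDSPASSPISINYRTEI"
)


def region(seq, first, last):
    """Return residues first..last (1-based, inclusive) of a protein sequence."""
    return seq[first - 1:last]


print(len(LFNA))            # 90
print(region(LFNA, 72, 81)) # GRGDSPASSP
```

Under that assumption the extracted stretch is the loop containing the RGD motif, which is the stretch replaced by NLP sequences in the lfna-based variants listed later in this example.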
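The target system 100 and the NLP system 200 operate as in Example 1 and output the NLP peptide sequences shown in Table 2 to the design system 400. The protein variants designed by the design system 400 were synthesized by the experimental system 500. Some of the test results for the binding of serum albumin by the fusion proteins formed from the three protein templates and the NLP peptides are shown in Figure 9, where HSA is human serum albumin, BSA is bovine serum albumin, and NaHCO3 is the negative control.
The protein variants with targeted binding to human serum albumin produced in this example are as follows: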
Table 4: Protein variants with targeted binding to human serum albumin
ID | Variant sequence | Template protein
SEQ ID NO 18 | VSSVPTKLEVVAATPTSLLISWDASSSSVSYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVDYTITVYAEVRSFCTDWPAEKSCKPLRGPISINYRT | lfna
SEQ ID NO 19 | VSSVPTKLEVVAATPTSLLISWDASSSSVSYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVDYTITVYATDWPAEKSPISINYRT | lfna
SEQ ID NO 20 | TRVSDKVMIPQDEYPEINFVGLLIGPRGNTLKNIEKESNAKIMIRGKGSVKEGTDWPAEKSQMLPGEDEPLHALVTANTMENVKKAVEQIRNILKQGIETPEDQNDLRKMQLRELA | lklg
SEQ ID NO 21 | TRVSDKVMIPQDEYPEINFVGLLIGPRGNTLKNIEKESNAKIMIRGKGSVKEGLPQWGQMLPGEDEPLHALVTANTMENVKKAVEQIRNILKQGIETPEDQNDLRKMQLRELA | lklg
SEQ ID NO 22 | TRVSDKVMIPQDEYPEINFVGLLIGPRGNTLKNIEKESNAKIMIRGKGSVKEGEVRSFCTDWPAEKSCKPLRGQMLPGEDEPLHALVTANTMENVKKAVEQIRNILKQGIETPEDQNDLRKMQLRELA | lklg
SEQ ID NO 23 | TRVSDKVMIPQDEYPEINFVGLLIGPRGNTLKNIEKESNAKIMIRGKGSVKEGRLIEDICLPRWGCLWEDDQMLPGEDEPLHALVTANTMENVKKAVEQIRNILKQGIETPEDQNDLRKMQLRELA | lklg
SEQ ID NO 24 | VDAFLGTWKLVEVRSFCTDWPAEKSCKPLRGTTIIEKNGDILTLKTHSTFKNTEISFKLGVEFDETTADDRKVKSIVTLDGGKLVHLQKWDGQETTLVRELIDGKLILTLTHGTAVCTRTYEKE | lhms
The three protein templates above, lfna (10Fn3), lhms, and lklg, are all well suited to phage display. As shown in Figure 10, like lfna (10Fn3), both lhms and lklg can be displayed well by phage display, so libraries can be built on them for further screening of antibody mimetics.
Example 4: Construction and multi-round screening of a BMT library based on the lhms template protein
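Using the lhms variant obtained above, which has targeted binding to serum albumin, as the template, a large phage library was built by randomly mutating specific residues in the variable regions described above and at other positions.
1. Preparation of uracil-containing phagemid template
A single CJ236 colony carrying the target plasmid was picked from a freshly prepared plate and inoculated into 1 ml of 2YT medium containing 100 μg/ml ampicillin, then shaken at 37 °C for 6 hours (until the culture became turbid). Helper phage M13K07 (~20 μl) was added to a final titer of 10^10 pfu/ml in the culture, which was incubated for 10 minutes and then transferred to 30 ml of pre-warmed 2×YT medium containing 100 μg/ml ampicillin and 0.25 μg/ml uridine and grown overnight at 37 °C with vigorous shaking.
2. Phage precipitation
The culture was transferred to a sterile 50 ml centrifuge tube and centrifuged at 8000 rpm for 10 minutes at 4 °C. The supernatant was transferred to a fresh sterile 50 ml centrifuge tube containing 6 ml of 20% PEG8000/2.5 M NaCl, mixed thoroughly, and left at room temperature for 5 minutes, then centrifuged at 8000 rpm for 10 minutes at 4 °C. The supernatant was discarded, the tube was spun briefly, and the remaining supernatant was removed with a pipette. The phagemid pellet was resuspended in 1 ml TBS and transferred to a microcentrifuge tube. Insoluble material was removed by centrifuging at top speed for 2 minutes. The supernatant was transferred to a fresh microcentrifuge tube containing 200 μl PEG/NaCl, mixed thoroughly, and kept on ice for 10 minutes. After centrifuging for 10 minutes, the supernatant was discarded, the tube was spun briefly, and all liquid was removed with a pipette. The pellet was resuspended in 1 ml TBS. The phagemid titer was determined with log-phase (0.6-0.9) XL-1 Blue and CJ236 cells. The two titers should differ by more than 10^4-fold. The titer measured with CJ236 cells should be on the order of 10^12 per ml or higher.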
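3. Preparation of U-ss DNA
U-ss DNA was purified with the QIAprep Spin M13 Kit. 1 μl of sample was run on an agarose gel.
4. Oligonucleotide phosphorylation
The following were combined: 0.3 nmol (~5 μg) oligonucleotide (BMT library), 3 μl 10× T4 polynucleotide kinase reaction buffer, 1.5 μl 10 mM ATP, and 0.5 μl T4 polynucleotide kinase (5 U), with water added to a final volume of 30 μl. The mixture was incubated at 37 °C for 2 hours, heat-inactivated at 65 °C for 15 minutes, and stored at -80 °C.
5. Kunkel reaction
The following were combined: 3-6 μl U-ssDNA (for library preparation, the amount is 1 4g), 3 μl phosphorylated oligonucleotide (6-9 pmol; for libraries this can be increased to 30 pmol), and 1 μl 10× annealing buffer (final concentrations: 20 mM Tris-Cl (pH 7.4), 2 mM MgCl2, 50 mM NaCl), in a final volume of 10 μl (made up with water if necessary). The reaction can be scaled up as needed. A control tube without primer was set up in parallel. The annealing program on the PCR machine was: 98 °C, 2 min; 70 °C, 5 min; 37 °C, 30 min. The tubes were then placed on ice. For the synthesis reaction, the following were added to the annealing mixture (still on ice): 1 μl 10× synthesis buffer (final concentrations: 0.4 mM dNTPs (equimolar mix), 0.75 mM ATP, 17.5 mM Tris-Cl (pH 7.4), 3.75 mM MgCl2, 1.5 mM DTT), 1 μl T4 DNA ligase (diluted 2-fold in T7 dilution buffer: 20 mM potassium phosphate buffer, pH 7.4, 1 mM DTT, 0.1 mM EDTA, 50% glycerol), and 1 μl NEB T7 DNA polymerase (diluted to 0.5 U/μl in T7 dilution buffer). The synthesis reaction can be scaled up in proportion to the annealing reaction. The mixture was mixed thoroughly, spun briefly, and incubated at 37 °C for 30 minutes, then heat-inactivated at 75 °C for 15 minutes and cooled to room temperature. 1 μl of sample was run on an agarose gel, and 0.5 μl was used to transform XL-1 Blue or DH5α competent cells. The next day, colony numbers from the primer-containing reaction and the no-primer control were compared. A ratio of about 10:1 or greater indicates that the reaction probably succeeded. Colony PCR with template-specific primers (~10 colonies) was used to check the ratio of wild type to mutant.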
使用 Wizard® SV Gel and PCR Clean- Up System纯化 Kunkel产物。将纯化后的 Kunkel 产物以及 2个 2 mm电击杯冰上放置 5分钟以上。取一半的己纯化预冷 Kunkel产物与 350 μΐ SS-320电转感受态细胞混合, 冰浴 5分钟, 然后转移至预冷的电击杯中。 对于剩余 的 kunkel产物, 准备进行另外一个组转化。用 p-1000自动移液器取好 1 ml S0C。 2, 500 V 电击感受态细胞 (BTX ECM395 )。 〜4 ms 后会听到嘟嘟声。 立即加入预先取好的 S0C 培养基。 重悬细胞, 并转移至 250 ml三角瓶中。 取 1 ml S0C培养基到空电击杯中, 充 分重悬剩余细胞, 将细胞悬液合并至三角瓶中。 再洗一次电击杯, 合并细胞悬液。 对于 另外一组 Kunkel产物与 SS-320电转感受态细胞混合物, 重复上述过程, 并将细胞悬液 转移至同一三角瓶中。 加入 19 ml S0C培养基, 最终体积约为 25 ml。 37°C振摇培养 45 分钟。 取 1 μΐ细胞悬液, 与 99 μΐ水混合, 分别将 1 μΐ稀释后的细胞悬液 (需要先在 平板上加 100 μΐ水, 然后将 1 μΐ稀释后的细胞悬液加到水中, 涂匀) 和剩余的细胞悬 液 (〜99 μΐ ) 涂到含氨苄青霉素的 LB平板上。 37 °C过夜培养。 次日, 数菌落数, 然 后得到滴度 (数菌落数适宜的平板)。 例如, 如果在涂 1 μΐ 100倍稀释菌液的平板上得 到 40个菌落, 则滴度是:
Figure imgf000027_0001
0 χ 108个总单克隆数。
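The titer arithmetic in the worked example (40 colonies from 1 μl of a 100-fold dilution of a 25 ml culture) can be written as a small helper. This is only an illustrative sketch; the function and parameter names are hypothetical.

```python
def total_transformants(colonies, dilution_factor, plated_ul, culture_ml):
    """Estimate the total number of clones in a culture from one plate count."""
    per_ul = colonies * dilution_factor / plated_ul  # clones per microliter of culture
    return per_ul * culture_ml * 1000                # scale to the whole culture


# Worked example from the text: 40 colonies, 1 ul plated, 100-fold dilution, 25 ml.
print(f"{total_transformants(40, 100, 1, 25):.1e}")  # 1.0e+08
```

The same function applies to any plate in the dilution series, provided the count comes from a plate with well-separated colonies.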
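7. Phagemid preparation
The recovered culture from the electroporation step above was inoculated into 500 ml of 2×YT + ampicillin + 0.2 mM IPTG + 10^12 pfu helper phage (~3 ml) and shaken overnight at 37 °C. Phagemid particles were prepared the next day essentially as in step 2 of this example, that is, by two rounds of PEG/NaCl precipitation.
8. Preparation of SS-320 electrocompetent cells
SS-320 was inoculated into 20 ml of 2×YT medium containing 10 μl of Tc stock solution (10 mg/ml) and shaken overnight at 37 °C. 1 mM HEPES, two sterile 500 ml centrifuge bottles, three 2 mm electroporation cuvettes, and three capped microcentrifuge tubes were pre-chilled on ice. 500 ml of superbroth medium was pre-warmed to 37 °C, 5 ml of the overnight SS-320 preculture was added, and the culture was shaken at 37 °C. When the OD600 approached 0.8 (about 2-3 hours), the culture was chilled on ice for 10 minutes and transferred to the two pre-chilled 500 ml centrifuge bottles (about 250 ml each), then centrifuged at 5,000 rpm for 5 minutes at 2 °C. The supernatant was discarded. The cells were resuspended in a small volume (~20 ml) of ice-cold 1 mM HEPES by swirling the bottle on ice, and ice-cold 1 mM HEPES was added to about 250 ml. The cells were centrifuged again and the supernatant discarded. This HEPES wash was repeated once (resuspension in ~20 ml, topping up to 250 ml, centrifugation, and removal of the supernatant). The cells were then resuspended in a small volume (~20 ml) of ice-cold water by swirling on ice. The suspensions from the two bottles were combined in one bottle, and the empty bottle was rinsed with ice-cold water that was added to the suspension. Ice-cold water was added to 300 ml, and the cells were centrifuged again and the supernatant discarded. The bottle was placed on ice and tilted so that the cell pellet separated from the water as far as possible, and the supernatant at the bottom of the bottle was removed with a pipette and discarded. 300 μl of ice-cold water was added to resuspend the cells (total volume about 1000 μl). The electrocompetent cells were dispensed into the three pre-chilled microcentrifuge tubes, 350 μl per tube.
Note: superbroth medium (500 ml) is prepared by mixing 425 ml deionized water, 12 g yeast extract, 6 g tryptone, and 25 ml 10% glycerol, autoclaving, and then adding 50 ml of autoclaved potassium phosphate solution (0.17 M KH2PO4, 0.72 M K2HPO4).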
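Results:
About 10-15 μg of U-ss DNA was used for Kunkel mutagenesis, and about 25 μg of cccDNA (covalently closed circular DNA) was obtained afterwards; the electrophoresis pattern of the Kunkel mutagenesis product is shown in Figure 11. About 20 μg of cccDNA was used for electroporation, giving a total of 1.25 × 10^9 transformants.
The titer of the unpurified phage supernatant was 3.06 × 10^7/ml, and after purification and 30-fold concentration the titer was 8.6 × 10^8/ml, a purification recovery of 94%. Phage ELISA results obtained with the unpurified phage supernatant are shown in Table 5.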
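The recovery figure can be reproduced from the two titers and the concentration factor. The sketch below reads the titers as 3.06 × 10^7/ml before and 8.6 × 10^8/ml after purification; this reading is an assumption, because the exponents are partly garbled in the source text, but it is the reading under which the stated 94% recovery comes out. The function name is hypothetical.

```python
def recovery(titer_before, titer_after, concentration_factor):
    """Fraction of phage recovered after purification and concentration."""
    return titer_after / (titer_before * concentration_factor)


# Assumed titers (per ml) and the 30-fold concentration stated in the text.
r = recovery(titer_before=3.06e7, titer_after=8.6e8, concentration_factor=30)
print(f"{r:.0%}")  # 94%
```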
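Table 5: Phage ELISA results for the BMT library based on the lhms template protein
Sample | Undiluted | Diluted 5-fold | Diluted 25-fold
NaHCO3 control | 0.052 | 0.055 | 0.032
Anti-V5 | 1.809 | 1.211 | 0.498
Net value | 1.757 | 1.156 | 0.466
Sequencing results for the transformation plate:
Colonies from the transformation plate were sequenced directly. Both loop regions were correctly mutated in 3/5 of the clones; in 1/5 the A loop was correctly mutated but the B loop was not mutated; and the remaining 1/5 carried erroneous mutations.
Sequencing results for the infection plate:
Colonies from the plate of mutated, phage-infected sample were sequenced. Both loop regions were correctly mutated in 3/9 of the clones; in 1/9 the B loop was correctly mutated but the A loop was not mutated; in 3/9 the A loop was correctly mutated but the B loop was not mutated; and 1/9 carried erroneous mutations.
Multi-round screening method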
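1. Check whether the phage itself binds the target protein: coat a MaxiSorp microplate with 100 μl of 0.5 μM target protein diluted in TBS (use a target protein known to bind phage as the positive control). Coat negative control wells with coating buffer without target protein. Use phage that display no foreign protein as the sample. Incubate at room temperature for 1 hour. Wash once with TBST (TBS containing 0.1% Tween 20). Block with 200 μl of blocking buffer (TBS containing 0.5% BSA) for 1 hour at room temperature or overnight at 4 °C. Wash once with TBST. Add 100 μl of phage solution diluted 100-fold in blocking buffer (10^8-10^9 phage) and incubate at room temperature for 40 minutes. Wash five times with TBST. Add 50 μl of anti-phage HRP antibody diluted 2500-fold in blocking buffer and incubate at room temperature for 30 minutes. Wash five times with TBST and twice with TBS. Add 50 μl TMB and incubate at room temperature for 5-10 minutes until a blue color develops. Add 50 μl of 2 M H2SO4 to stop the reaction; the blue turns yellow. Read the absorbance at 450 nm. If the signal for the target protein is more than 10 times that of the negative control, the target protein cannot be used for screening (the final TMB signal after 10 minutes of color development should be below 0.2).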
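2. Label the target protein with biotin and check whether the biotinylation can be cleaved by DTT.
HPDP-biotin stock solution: add 2.2 mg HPDP-biotin to 1.0 ml of solvent (such as DMF) to give a 4 mM HPDP-biotin stock solution. To ensure complete dissolution, warm the mixture to 37 °C and vortex gently or sonicate. Aliquot the stock solution and store frozen.
Reaction buffer: PBS + 1 mM EDTA
Biotin labeling of HSA: dissolve 2 mg HSA in 1 ml PBS/EDTA buffer. Mix 5 μl HPDP-biotin stock solution with 95 μl DMSO and add to the 1 ml HSA solution. Vortex to mix and incubate at room temperature for 2 hours. Desalt the reaction mixture on a desalting column equilibrated with reaction buffer.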
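3. Determine the amount of streptavidin magnetic beads needed to bind the biotin-labeled target protein: take 150 μl of streptavidin paramagnetic particles (Promega, Z5481/2). Wash the beads twice with 150 μl TBS each time. Divide the beads into 100 μl and 50 μl portions and collect the beads. Add 10 μl of 10-20 μM target protein solution to each tube and mix by rotation at room temperature for 15 minutes. Collect the beads and keep the supernatants (samples 2-50 and 2-100). To the tube containing 100 μl of beads, add 20 mM Tris (pH 8) containing 100 mM DTT and mix by rotation at room temperature for 10 minutes. Collect the beads and keep the supernatant (sample 3-100). Add 12.5 μl of 1× SDS-PAGE sample buffer to the beads and boil for 5 minutes. Take the supernatant (sample 4). Run SDS-PAGE. Sample 1: original protein solution; sample 2 (2-50, 2-100): protein not bound to the beads; sample 3: protein eluted by DTT; sample 4: protein still bound to the beads after DTT elution. Compare the bands of the samples to estimate the amount of target protein the beads can bind and to check the DTT cleavage reaction. If 100 μl of beads cannot bind this much target protein, the appropriate amount of beads can be estimated by comparing samples 2-50 and 2-100.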
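4. Round 1 screening (manual): in round 1 the target protein is first bound to the magnetic beads and the phage are added afterwards. Because a large volume (1 ml) of phage library sample is needed, the screening is done manually. Prepare log-phase XL-1 cells. Place 1 ml of streptavidin magnetic beads in a microtube. Leave the tube on a magnetic rack for ~1 minute and discard the supernatant. Add 1 ml TBS, resuspend the beads, leave on the magnetic rack for 1 minute, and discard the supernatant. Repeat the wash once, then resuspend the beads in 1 ml TBS. In a microtube, mix 1 nmol of biotin-labeled target protein (100 μl of a 10 μM target protein stock) with the twice-washed streptavidin beads. After mixing with phage, the final target protein concentration is 1 μM. Incubate for 15 minutes to allow binding. Place the tube on the magnetic rack for ~1 minute until the supernatant clears, and discard the supernatant. Add biotin to a final concentration of 5 μM and incubate for 5 minutes. Using the magnetic rack, wash twice with 500 μl TBST and discard the supernatant. Add 10^12-10^13 phage (resuspended in 1 ml TBST/BSA (0.5%)) to the target protein-bead complex. Mix and incubate for 15 minutes. Discard the supernatant and wash twice with 1 ml TBST. Resuspend the beads in 0.5 ml TBS. Keep 0.2 ml of the bead suspension at 4 °C as a backup (in case amplification fails). To the remaining ~0.3 ml of bead suspension add 3 ml of log-phase XL-1 cells and infect for 20 minutes. Transfer the infected mixture to 30 ml of 2×YT + ampicillin (Ap) + 30 μl helper phage (final titer ~10^8/ml) + 0.2 mM IPTG and grow overnight at 37 °C with shaking. Prepare phage by two rounds of PEG/NaCl precipitation and finally resuspend the phage pellet in 300 μl TBS.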
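5. Round 2 screening: from round 2 onward, solution capture is used and screening is performed on a KingFisher magnetic bead purification instrument. Before screening, prepare the elution buffer (must be made fresh), 100 μl per sample: 20 mM Tris (pH 8), 100 mM DTT (4 μl 0.5 M Tris, 1.54 mg DTT per 100 μl). Prepare the binding mixture: combine 60 μl phage solution, 10 pmol biotin-labeled target protein (cleavable biotinylation), and 10 μl of 50 mg/ml BSA, and make up to a final volume of 100 μl with TBS. The final target protein concentration is 100 nM. After screening, add 1.2 ml of log-phase XL-1 cells to the eluate. Infect at room temperature for 20 minutes, then proceed to phage amplification for the next round. Transfer the infected mixture to 30 ml of 2×YT + Ap + 30 μl helper phage (final titer ~10^8/ml) + 0.2 mM IPTG medium and grow overnight at 37 °C with shaking. Prepare phage by two rounds of PEG/NaCl precipitation and finally resuspend in 300 μl TBS.
6. Round 3 screening: round 3 is the same as round 2 except for slight differences in the binding mixture and the amplification step. Prepare the binding mixture (this step differs from round 2): combine 10 μl phage solution, 2 pmol biotin-labeled target protein (cleavable biotinylation), and 10 μl of 50 mg/ml BSA, and make up to 100 μl with TBS. The final target protein concentration is 20 nM. Infect 100 μl of log-phase XL-1 cells. Add 2 ml of 2×YT + Ap + 2 μl helper phage + 0.2 mM IPTG medium and grow overnight at 37 °C. Store the remaining eluate at 4 °C.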
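The stringency of each round is set by the final target concentration, which follows directly from the amount of target and the 100 μl binding volume. The sketch below checks the stated values for rounds 2 and 3, along with the DTT mass needed for the elution buffer. The function names are hypothetical, and the DTT molecular weight of 154.25 g/mol is an assumption (a commonly quoted value) that does not appear in the source.

```python
def conc_nM(amount_pmol, volume_ul):
    """Concentration in nM for an amount in pmol dissolved in a volume in microliters."""
    return amount_pmol * 1000 / volume_ul  # pmol/ul equals uM; x1000 gives nM


def mass_mg(conc_mM, volume_ul, mw_g_per_mol):
    """Mass in mg needed for a given mM concentration in a given volume."""
    moles = conc_mM * 1e-3 * volume_ul * 1e-6  # mol
    return moles * mw_g_per_mol * 1e3          # g -> mg


DTT_MW = 154.25  # g/mol, assumed value

print(conc_nM(10, 100))                     # round 2: 100.0 nM
print(conc_nM(2, 100))                      # round 3: 20.0 nM
print(round(mass_mg(100, 100, DTT_MW), 2))  # 1.54 mg DTT per 100 ul
```

Lowering the target amount from 10 pmol to 2 pmol between rounds reduces the final concentration fivefold, which favors higher-affinity binders.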
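7. Round 4 screening (enrichment check only)
Take 33 μl of culture supernatant, add 2 pmol biotin-labeled target protein (cleavable biotinylation) and 10 μl of 50 mg/ml BSA, and make up to 100 μl with TBS. The final target protein concentration is 20 nM. Set up a negative control well without target protein alongside; the negative control well should be adjacent to the sample well containing target protein. Infect 100 μl of log-phase XL-1 cells with 10 μl of the solution. Make serial dilutions and plate. At least 24 well-separated clones must be obtained for sequencing.
8. Amplification of single clones (48 clones/day): pick 24 single clones from the titer plate and inoculate each into 150 μl of 2×YT + Ap (100 μg/ml) medium. After 3 hours of growth, add 150 μl of 2×YT + Ap + 0.3 μl helper phage + 0.4 mM IPTG medium and grow overnight. Centrifuge at 5000 rpm for 10 minutes (using a swing-bucket rotor for 96-well plates). Transfer 50 μl of supernatant to a new 96-well plate. Perform KingFisher phage ELISA with and without target protein (these samples should be arranged side by side), using 5 μl of phage supernatant.
For target proteins that contain free cysteines and must be kept in a reduced state (such as the estrogen receptor), block the free cysteine residues before biotinylation. Add 0.1 ml of log-phase XL-1 cells, infect at room temperature for 20 minutes, and then add 2 ml of 2×YT + Ap (+0.2 mM IPTG) + 2 μl helper phage medium.
Example 5: Construction of GLP1 receptor agonist expression vectors and expression of the fusion proteins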
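The nucleic acid sequences of the fusion proteins Ex4-lfna-sab1, Ex4-lhms-sab1, and Ex4-1x5j-sab1 were cloned into the pET-32a(+) expression vector and expressed as fusions with thioredoxin (Trx), giving Trx-fusion proteins with increased soluble expression.
A frozen stock of BL21(DE3) carrying the expression vector was thawed slightly on ice and streaked with an inoculating loop onto LB agar containing 100 μg/ml ampicillin, then incubated overnight at 37 °C. A single colony was picked into 30 ml of LB medium containing 100 μg/ml ampicillin and shaken overnight at 37 °C and 200 rpm. The overnight culture was inoculated at 2% into 1 liter of LB medium containing 100 μg/ml ampicillin and shaken at 37 °C and 200 rpm until the optical density at 600 nm (OD600) reached 0.5. The culture was left at room temperature to cool to 25 °C, and the inducer isopropyl-β-D-thiogalactoside (IPTG) was added to a final concentration of 0.5 mM. The culture was then shaken at 25 °C and 200 rpm for 4 hours to induce intracellular expression in E. coli. The expression results are shown in Figure 12.
Example 6: Purification of Trx-tagged fusion proteins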
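E. coli cells were harvested by centrifugation at 6000×g for 10 minutes and resuspended in 20 ml of loading buffer (50 mM sodium phosphate, 0.5 M sodium chloride, 20 mM imidazole, pH 7.4); lysozyme and the protease inhibitor phenylmethylsulfonyl fluoride (PMSF) were then added to final concentrations of 0.2 mg/ml and 1 mM, respectively. After incubation on ice for one hour, the cell suspension was disrupted by intermittent sonication for 2 minutes. The lysate was centrifuged at 15000×g for 1 hour, and the supernatant was collected and passed through a 0.45 μm microporous filter. 3 ml of Ni-NTA resin packed in a disposable column was pre-equilibrated with 10 column volumes of loading buffer, the filtered protein solution was loaded slowly onto the column, and the flow-through was collected and reloaded onto the Ni-NTA resin. After loading, the Ni-NTA column was washed slowly with loading buffer until no more protein eluted. Bound protein was eluted with elution buffer (50 mM sodium phosphate, 0.5 M sodium chloride, 0.5 M imidazole, pH 7.4), collecting one fraction per 2 ml. The optical density at 280 nm (OD280) of each fraction was measured. The SDS-PAGE results are shown in Figure 13; the molecular weight of the purified sample is consistent with the expected size of the protein.
Example 7: Enzymatic cleavage of the fusion proteins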
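The Trx-tagged fusion proteins purified by immobilized metal ion affinity chromatography (Ex4-lfna-sab1, Ex4-lhms-sab1, and Ex4-1x5j-sab1) were dialyzed overnight at 4 °C against dialysis buffer (10 mM Tris, 30 mM sodium chloride, 2 mM CaCl2, 20 mM L-Arg HCl, 20 mM L-Glu HCl, pH 8.0). For cleavage, different amounts of recombinant enterokinase (EK, GenScript) were added and the reactions were left overnight at room temperature. The cleavage of the fusion proteins at different EK doses was checked by SDS-PAGE; the results are shown in Figure 14.
Example 8: Purification of the fusion proteins after cleavage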
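After cleavage, the buffer of the fusion proteins was exchanged by dialysis from the original dialysis buffer (10 mM Tris, 30 mM sodium chloride, 2 mM CaCl2, 20 mM L-Arg HCl, 20 mM L-Glu HCl, pH 8.0) to a new dialysis buffer (40 mM Na2HPO4, 20 mM L-Arg HCl, 20 mM L-Glu HCl, pH 7.4). 3 ml of Ni-NTA resin packed in a disposable column was pre-equilibrated with 10 column volumes of the new dialysis buffer, the filtered protein solution was loaded slowly onto the column, and the flow-through, which contains the cleaved fusion protein, was collected. The optical density of the target protein at 280 nm (OD280) was measured. The SDS-PAGE results are shown in Figure 15; the molecular weight of the purified sample is consistent with the expected mass of the protein.
Example 9: The GLP1 receptor agonist activity of the fusion proteins is similar to that of Ex4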
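The activity retained by the functional molecule in the fusion proteins was assessed by the following experiment (for details see: Huan Yi and Shen Zhufang, "Establishment and application of a cell model for screening drugs targeting the GLP-1 receptor", Acta Pharmaceutica Sinica 2009, 44(3): 309-313, which is incorporated by reference in its entirety into this application). Briefly, a recombinant vector, Peak12RIP-CRE6×GFP, was first constructed carrying six copies of the specific response element regulated by the GLP1 receptor signaling pathway (RIP-CRE) and the reporter gene E-GFP. The vector was transfected into the pancreatic islet NIT-1 cell line; in this cell model, stimulation by GLP1 analogs activates expression of the reporter gene. The cells were then stimulated with the fusion proteins at different concentrations (1×10^-11, 1×10^-10, 1×10^-9, 1×10^-8, 1×10^-7, and 1×10^-6 M), and after 48 hours the change in fluorescence at each concentration was read on a fluorescence microplate reader. To avoid batch-to-batch errors caused by cell condition, sample loading, and delays in reading, a dual-reporter assay combining an internal reference gene with the target gene was used, and the fluorescence result is the ratio of the target-gene fluorescence reading to the internal-reference fluorescence reading. The positive control drug was exenatide injection (Eli Lilly).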
Table 6: GLP1 receptor agonist activity of the fusion proteins
Protein name | Concentration (mol/l): 1×10^-11, 1×10^-10, 1×10^-9, 1×10^-8, 1×10^-7, 1×10^-6 | EC50 | EC1.5
Ex4 - lhms- sab 1. 2 2. 9 4. 0 1. 86E- - 0 3. 96E- - 1
1. 16 1. 17 4. 18
1 9 7 6 9 0
Ex4- lx5j- sab 0. 8 1. 1 3. 2 1. 30E- - 0 1. 45E- - 1
0. 84 1. 27 4. 30
1 3 4 3 9 0
Ex4-lfna- sab 1. 7 2. 0 3. 3 4. 6 IE- - 1 2. 46E- - 1
1. 01 1. 11 3. 03
1 1 5 0 0 0
2. 2 4. 0 4. 5 2. 74E- - 0 3. 45E- - 1
Exendin -4 1. 11 1. 54 4. 84
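3 1 6 9 0
The results in the table above show that the activity of the fusion protein samples increases with concentration, a clear dose-response relationship. One measure is the half-maximal effective concentration (EC50); the other is the concentration at which the agonist reaches 1.5-fold activation (EC1.5). The results show that the dose-response curves of the fusion protein samples are nearly parallel to that of the positive control drug and that their activities are comparable, demonstrating that fusing the human-protein-based binding molecule template to the functional molecule (Exendin-4) does not markedly reduce the original activity of the functional molecule.
Example 10: Testing the binding of Ex4-lfna-sab1 to human serum albumin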
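Coating: using Nunc MaxiSorp microplates, coat each well with 50 μl of 5 μg/ml HSA solution prepared in coating buffer (50 mM NaHCO3, pH 9) and incubate overnight at 4 °C. Coat negative control wells with solution containing no target protein. Wash once with PBST.
Blocking: add 200 μl per well of blocking solution prepared in PBS and incubate in a humidified box overnight at 4 °C or for 2-4 hours at room temperature. Wash once with PBS.
Sample: add 50 μl of fusion protein sample diluted in PBST or sodium acetate (pH 5.5) and incubate on an orbital shaker at room temperature for 1 hour. Wash three times with PBST.
Primary antibody: add 50 μl of 1000-fold diluted Flag tag antibody solution and incubate on an orbital shaker at room temperature for 40 minutes. Wash three times with PBST.
Secondary antibody: add 50 μl of HRP-labeled secondary antibody solution (diluted 1000-fold in PBST) and incubate on an orbital shaker at room temperature for 40 minutes. Wash five times with PBST and then twice with PBS.
Detection: add 50 μl of 1-Step Turbo TMB-ELISA and incubate at room temperature until a blue color develops. Add 50 μl of 2 M H2SO4 to stop the reaction. Read the absorbance at 450 nm on a microplate reader.
Reagents:
PBS: 0.1 M phosphate buffer, pH 7.4.
PBST: PBS containing 0.1% Tween 20.
Blocking solution: PBS containing 1% Ficoll 400.
Results: as shown in Figure 16, compared with the negative control protein (the template protein with unmodified variable regions), Ex4-lfna-sab1 binds human serum albumin strongly, and its binding is essentially unaffected at pH 5.5.
Example 11: Like Ex4, Ex4-lfna-sab1 does not lower blood glucose at normal physiological glucose concentrations and so does not cause adverse reactions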
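Experimental animals: Kunming mice, body weight 22-24 g, half male and half female.
50 healthy mice were randomly divided into a control group, an exenatide group (1.3 μg/kg), and Ex4-lfna-sab1 groups (64 μg/kg, 128 μg/kg, and 320 μg/kg). The control group received an equal volume of phosphate buffer. After a 12-hour fast, the corresponding drug or saline was injected subcutaneously, and blood glucose was measured 0, 0.5, 1, 2, 4, 8, 12, and 24 hours after dosing.
As Figure 17 shows, blood glucose in the Ex4-lfna-sab1 groups at each time point and at all three doses did not differ significantly from that of the phosphate buffer (PBS) control group or the Ex4 control group, so the protein does not cause hypoglycemia in normal mice.
Example 12: Unlike Ex4, the glucose-lowering effect of Ex4-lfna-sab1 in mice persists beyond 12 hours
Experimental animals: Kunming mice, body weight 22-24 g, half male and half female.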
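Experiment 1: 18 healthy mice were randomly divided into a control group (phosphate buffer), an exenatide group (1.3 μg/kg, i.e. 0.31 nmol/kg), and an Ex4-lfna-sab1 group (320 μg/kg, i.e. 21.33 nmol/kg). After a 12-hour fast, blood glucose was measured and the drug was given subcutaneously. Two hours after dosing, 1.5 g/kg glucose solution was given by gavage, and blood glucose was measured 30 minutes before and 0.5, 15, 30, 60, and 120 minutes after the glucose load (see Figure 18).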
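The mass and molar doses quoted for the two drugs are related through molecular weight. The sketch below reproduces the stated conversions. Both molecular weights are assumptions and do not appear in the source: about 4186.6 g/mol is a commonly quoted value for exendin-4, and 15,000 g/mol for the fusion protein is simply the value implied by the stated 320 μg/kg and 21.33 nmol/kg.

```python
def ug_per_kg_to_nmol_per_kg(dose_ug_per_kg, mw_g_per_mol):
    """Convert a mass dose (ug/kg) to a molar dose (nmol/kg)."""
    return dose_ug_per_kg / mw_g_per_mol * 1000


EXENDIN4_MW = 4186.6  # g/mol, assumed
FUSION_MW = 15000.0   # g/mol, assumed (back-calculated from the stated doses)

print(round(ug_per_kg_to_nmol_per_kg(1.3, EXENDIN4_MW), 2))  # 0.31
print(round(ug_per_kg_to_nmol_per_kg(320, FUSION_MW), 2))    # 21.33
```

On a molar basis the fusion protein dose is therefore roughly 69 times the exenatide dose, which is worth keeping in mind when comparing the two groups.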
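Experiment 2: 18 healthy mice were randomly divided into a control group (phosphate buffer), an exenatide group (1.3 μg/kg, i.e. 0.31 nmol/kg), and an Ex4-lfna-sab1 group (320 μg/kg, i.e. 21.33 nmol/kg). After a 12-hour fast, blood glucose was measured and the drug was given subcutaneously. Twelve hours after dosing, 1.5 g/kg glucose solution was given by gavage, and blood glucose was measured 30 minutes before and 0.5, 15, 30, 60, and 120 minutes after the glucose load (see Figure 19).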
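Glucose tolerance curves like those measured in Experiments 1 and 2 are commonly summarized as an area under the curve (AUC). The sketch below uses the trapezoidal rule, which is an assumption about how such an area could be computed; the patent does not state its method. The glucose readings are invented placeholder values at the post-load time points, not data from the experiments.

```python
def auc_trapezoid(times, values):
    """Area under a curve by the trapezoidal rule; times must be increasing."""
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    points = list(zip(times, values))
    return sum(
        (t1 - t0) * (v0 + v1) / 2
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    )


# Time points after the glucose load (minutes) and hypothetical readings (mmol/l).
minutes = [0.5, 15, 30, 60, 120]
glucose = [6.0, 11.0, 9.5, 7.5, 6.0]
print(auc_trapezoid(minutes, glucose))
```

A lower AUC in a treated group than in the control group indicates a glucose-lowering effect over the measured interval.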
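Comparing the areas under the blood glucose curves (Figure 20) for the two experiments above (glucose lowering 2 hours and 12 hours after dosing) shows that both Ex4 and Ex4-lfna-sab1 lowered glucose significantly 2 hours after dosing (p<0.05), whereas 12 hours after dosing only Ex4-lfna-sab1 still did so (p<0.05).
Example 13: Pharmacokinetics of Ex4 and its derivative in mouse plasma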
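Materials: Exendin-4 (Heloderma suspectum) enzyme immunoassay kit (Phoenix Pharmaceuticals, catalog no. EK-070-94); ICR mouse plasma; ACCU-CHEK Performa blood glucose meter (Roche).
Experimental animals: male ICR mice
Methods:
The 20× assay buffer was diluted with 950 ml of distilled water to give 1× assay buffer, which was used to dilute all other reagents in the kit and the samples. The Ex4 derivative was diluted in assay buffer to sample concentrations of 0.5, 2.5, 5, 10, 25, and 50 ng/ml. 1 μl of working solution was mixed with 9 μl of blank plasma to serve as a standard. Mouse plasma was diluted with 40 μl of assay buffer and quantified according to the kit instructions.
Ten mice were divided into an exenatide (Ex4) control group (1.3 μg/kg, i.e. 0.31 nmol/kg) and an Ex4-lfna-sab1 group (320 μg/kg, i.e. 21.33 nmol/kg). After a 10-hour fast, 30-40 μl of blood was taken 0.08, 0.25, 0.5, 1, 2, 4, 6, 10, 24, and 48 hours after subcutaneous dosing, and the Ex4 concentration in the blood samples was determined with the enzyme immunoassay kit and the previously established standard curve. Figure 21 shows the PK curve of the Ex4 control group, and Figure 22 shows the PK curve of Ex4-lfna-sab1.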
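Concentration-time data of this kind are often summarized by an elimination half-life under first-order kinetics, where the concentration decays as C(t) = C0·exp(-k·t) and the half-life is ln 2 / k. The sketch below estimates k from two points on the terminal phase. The function names and the example concentrations are hypothetical and are not taken from the PK curves.

```python
import math


def elimination_rate(t1, c1, t2, c2):
    """First-order elimination rate constant (1/h) from two concentration points."""
    return math.log(c1 / c2) / (t2 - t1)


def half_life(k):
    """Elimination half-life for a first-order rate constant k."""
    return math.log(2) / k


# Hypothetical terminal-phase points: 20 ng/ml at 4 h and 10 ng/ml at 24 h.
k = elimination_rate(4, 20.0, 24, 10.0)
print(round(half_life(k), 2))  # 20.0
```

With more than two points, a linear fit of ln(concentration) against time over the terminal phase gives a more robust estimate of k than a single pair of points.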
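These results show that after subcutaneous dosing Ex4-lfna-sab1 reaches its peak blood concentration at about 0.5 hours and then declines rapidly, entering a clear plateau after 4 hours at about 15 ng/ml (about 1 nM, twice the EC50 of the drug) that lasts to 48 hours or longer. This explains why it still lowers glucose significantly 12 hours after dosing in mice (Example 12). In contrast, the blood concentration of the Ex4 control drug falls rapidly by 6 hours after dosing to an almost completely cleared state (0.5 ng/ml, about 0.1 nM, far below its EC50). Calculated by first-order elimination kinetics, the elimination half-life of Ex4 is 14.19 hours and that of Ex4-lfna-sab1 is 25.85 hours.
Example 14: The glucose-lowering effect of Ex4-lfna-sab1 in beagle dogs lasts at least 5 days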
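Experimental animals: beagle dogs, body weight 13-16 kg.
Experiment 1: two healthy animals were assigned to a control group (equal volume of phosphate buffer) and an Ex4-lfna-sab1 group (1 mg/kg). After a 12-hour fast the drug was given intravenously; the animals were allowed to move and feed freely for about 12 hours and were then fasted again for 12 hours. At 24 hours (1 day) after dosing, 4 g/kg glucose solution was given by gavage, and blood glucose was measured 30 minutes before and 5, 10, 20, 30, 45, 60, and 120 minutes after the glucose load.
Experiment 2: two healthy animals were assigned to a control group (equal volume of phosphate buffer) and an Ex4-lfna-sab1 group (1 mg/kg). After a 12-hour fast the drug was given intravenously; the animals were allowed to move and feed freely for about 60 hours and were then fasted again for 12 hours. At 72 hours (3 days) after dosing, 4 g/kg glucose solution was given by gavage, and blood glucose was measured 30 minutes before and 5, 10, 20, 30, 45, 60, and 120 minutes after the glucose load.
Experiment 3: two healthy animals were assigned to a control group (equal volume of phosphate buffer) and an Ex4-lfna-sab1 group (1 mg/kg). After a 12-hour fast the drug was given intravenously; the animals were allowed to move and feed freely for about 108 hours and were then fasted again for 12 hours. At 120 hours (5 days) after dosing, 4 g/kg glucose solution was given by gavage, and blood glucose was measured 30 minutes before and 5, 10, 20, 30, 45, 60, and 120 minutes after the glucose load.
The results of the three experiments are shown in Figure 23. After oral glucose, the animals given Ex4-lfna-sab1 did not show the sharp blood glucose peak seen in the control animals, demonstrating a marked glucose-lowering effect that lasts at least 5 days.
Example 15: Pharmacokinetics of Ex4-lfna-sab1 in beagle dogs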
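Experimental animals: beagle dogs, body weight 13-16 kg.
One healthy animal was fasted for 12 hours and dosed intravenously (20 μg/kg). 30-40 μl of blood was taken 0.08, 1, 4, 12, 24, 48, 72, 96, and 144 hours later, and the concentration of Ex4-lfna-sab1 in the blood samples was measured with an EIA kit (this is the Exendin-4 concentration, because the EIA kit recognizes the Ex-4 (Exendin-4) portion of the Ex4-lfna-sab1 fusion protein). The results are shown in Figure 24.
The results show that in dogs Ex4-lfna-sab1 enters a clear plateau 4 hours after injection, at about 20-30 ng/ml (1.5-2 nM, more than about 3 times the IC50 of the drug), which lasts to 144 hours or longer. This explains why it still lowers glucose significantly 144 hours after dosing in beagle dogs (Example 14).
Table 7: Sequences appearing in this patent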
No. | Sequence name | Sequence
1 | 1x5j | GSSGSSGPMMPPVGVQASILSHDTIRITWADNSLPKHQKITDSRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGTTFEL
2 | NLP sequence used to engineer protein templates | EVRSFCTDWPAEKSCKPLRG
3 | NLP sequence used to engineer protein templates | RAPESFVCYWETICFERSEQ
4 | NLP sequence used to engineer protein templates | EMCYFPGICWM
5 | NLP sequence used to engineer protein templates | QRQMVDFCLPQWGCLWGDGF
6 | NLP sequence used to engineer protein templates | RLIEDICLPRWGCLWEDD
7 | NLP sequence used to engineer protein templates | GEWWEDICLPRWGCLWEEED
8 | NLP sequence used to engineer protein templates | NVCLPKWGCLWE
9 | Variant sequence after engineering template protein 1x5j | SGPMMPPVGVQASILSHDTIRITWADEVRSFCTDWPAEKSCKPLRGRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGTTFELS
10 | Variant sequence after engineering template protein 1x5j | SGPMMPPVGVQASILSHDTIRITWADEMCYFPGICWMRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGTTFELS
11 | Variant sequence after engineering template protein 1x5j | SGPMMPPVGVQASILSHDTIRITWADRLIEDICLPRWGCLWEDDRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGTTFELS
12 | | MGHHHHHHHHHHSSDYKDDDDKGENLYFQGSSGPMMPPVGVQASILSHDTIRITWADEVRSFCTDWPAEKSCKPLRGRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGTTFELS
13 | | ATGGGCCATCATCACCATCATCACCACCATCACCATAGCAGCGACTACAAAGACGACGATGACAAAGGTGAAAACCTGTACTTCCAGGGATCCAGCGGCCCAATGATGCCGCCAGTGGGCGTGCAGGCAAGCATTCTGAGCCATGATACCATTCGTATTACCTGGGCGGATGAGGTGCGTAGCTTTTGCACCGATTGGCCGGCAGAAAAAAGCTGCAAACCGCTGCGTGGCCGTTATTACACGGTGCGTTGGAAAACCAACATTCCGGCAAACACGAAATACAAAAACGCGAACGCGACCACCCTGAGCTATCTGGTTACGGGCCTGAAGCCGAATACGCTGTATGAGTTCAGCGTGATGGTGACCAAAGGCCGTCGTAGCAGCACCTGGAGCATGACCGCGCATGGCACGACCTTTGAACTGAGCTA
14 | Vector sequence | see the vector sequence listed in Example 1
15 | lfna | RDLEVVAATPTSLLISWDAPAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVDYTITVYAVTGRGDSPASSPISINYRTEI
16 | lklg | TRVSDKVMIPQDEYPEINFVGLLIGPRGNTLKNIEKECNAKIMIRGKGSVKEGKVGRKDGQMLPGEDEPLHALVTANTMENVKKAVEQIRNILKQGIETPEDQNDLRKMQLRELARLNGTLR
17 | lhms | VDAFLGTWKLVDSKNFDDYMKSLGVGFATRQVASMTKPTTIIEKNGDILTLKTHSTFKNTEISFKLGVEFDETTADDRKVKSIVTLDGGKLVHLQKWDGQETTLVRELIDGKLILTLTHGTAVCTRTYEKE
18 | | VSSVPTKLEVVAATPTSLLISWDASSSSVSYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVDYTITVYAEVRSFCTDWPAEKSCKPLRGPISINYRT
19 | | VSSVPTKLEVVAATPTSLLISWDASSSSVSYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVDYTITVYATDWPAEKSPISINYRT
20 | | TRVSDKVMIPQDEYPEINFVGLLIGPRGNTLKNIEKESNAKIMIRGKGSVKEGTDWPAEKSQMLPGEDEPLHALVTANTMENVKKAVEQIRNILKQGIETPEDQNDLRKMQLRELA
21 | | TRVSDKVMIPQDEYPEINFVGLLIGPRGNTLKNIEKESNAKIMIRGKGSVKEGLPQWGQMLPGEDEPLHALVTANTMENVKKAVEQIRNILKQGIETPEDQNDLRKMQLRELA
22 | | TRVSDKVMIPQDEYPEINFVGLLIGPRGNTLKNIEKESNAKIMIRGKGSVKEGEVRSFCTDWPAEKSCKPLRGQMLPGEDEPLHALVTANTMENVKKAVEQIRNILKQGIETPEDQNDLRKMQLRELA
23 | | TRVSDKVMIPQDEYPEINFVGLLIGPRGNTLKNIEKESNAKIMIRGKGSVKEGRLIEDICLPRWGCLWEDDQMLPGEDEPLHALVTANTMENVKKAVEQIRNILKQGIETPEDQNDLRKMQLRELA
24 | | VDAFLGTWKLVEVRSFCTDWPAEKSCKPLRGTTIIEKNGDILTLKTHSTFKNTEISFKLGVEFDETTADDRKVKSIVTLDGGKLVHLQKWDGQETTLVRELIDGKLILTLTHGTAVCTRTYEKE
 | Exendin-4 | HGEGTFTSDLSKQMEEEAVRLFIEWLKNGGPSSGAPPPS
 | GLP-1 | HAEGTFTSDVSSYLEGQAAKEFIAWLVKGRG
 | sab1 | VSSVPTKLEVVAATPTSLLISWDASSSSVSYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVDYTITVYAEVRSFCTDWPAEKSCKPLRGPISINYRT
 | sab2 | VSSVPTKLEVVAATPTSLLISWDASSSSVSYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVDYTITVYAVTDWPAEKSPISINYRT
 | sab3 | VDAFLGTWKLVEVRSFCTDWPAEKSCKPLRGTTIIEKNGDILTLKTHSTFKNTEISFKLGVEFDETTADDRKVKSIVTLDGGKLVHLQKWDGQETTLVRELIDGKLILTLTHGTAVCTRTYEKE
 | sab4 | SGPMMPPVGVQASILSHDTIRITWADEVRSFCTDWPAEKSCKPLRGRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGTTFELS
 | sab5 | SGPMMPPVGVQASILSHDTIRITWADEMCYFPGICWMRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGTTFELS
 | sab6 | SGPMMPPVGVQASILSHDTIRITWADRLIEDICLPRWGCLWEDDRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGTTFELS
 | sab7 | SGPMMPPVGVQASILSHDTIRITWADEVRSFCTDWPAEKSCKPLRGRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGTTFELS
 | sab8 | LVPTSPPKDVTVVTDWPAEKSKTIIVNWQPPSEANGKITGYIIYYSTEVRSFCTDWPAEKSCKPLRGWVIEPVVGNRLTHQIQELTLDTPYYFKIQARNSKGMGPMSEAVQFRTPKAS
 | sab9 | SAPRDVVASLVSTRFIKLTWRTPEVRSFCTDWPAEKSCKPLRGTYSVFYTKEGIARERVENTSHPGEMQVTIQNLMPATVYIFRVMAQNKHGSGESSAPLRVE
 | Linker1 | LAAA
 | Linker2 | (GGGGS)n (n=1-6)
 | Linker3 | (GGSGGGS)n (n=1-5)
 | Linker4 | A(EAAAK)nA (n=2-5)
 | Linker5 | (PEAPTD)n (n=1-5)
 | | IEGR
 | | FNPRG (P/A/S)
 | Exendin-4 variant | HHGEGTFTSDLSKQMEEEAVRLFIEWLKNGGPSSGAPPSKKKKKK

Claims

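1. A method for identifying the potential of a protein template to produce antibody mimetics, comprising: (i) preliminarily selecting a protein; and (ii) using structural information of the protein itself to identify one or more regions of the protein into which changes can be introduced without substantially affecting the structure of the protein (referred to as variable regions), thereby identifying the potential of the protein template to produce antibody mimetics.
2. The method of claim (1), further comprising using sequence information of the protein itself to prioritize the variable regions identified in claim (1).
3. The method of claim (1) or (2), further comprising verifying the potential of the protein template to produce antibody mimetics after it has been identified, wherein the verification comprises: (i) introducing point mutations into an identified variable region, inserting one or more polypeptides that can take part in forming the interface between the protein template and other proteins and adopt a non-linear structure, or that can adopt a non-linear structure on their own (NLPs), or replacing the variable region with part or all of one or more such polypeptides, and then analyzing the properties of the resulting protein variant carrying these changes in the variable region, wherein the quality of the properties of the protein variant verifies the identified potential of the protein template to produce antibody mimetics; or (ii) cloning the protein template into a display vector for a commonly used protein display method, inserting random oligonucleotides into the identified variable region to build a library in which the variable region of the protein template is partly or wholly replaced by random polypeptides, and then using a commonly used protein display method to screen the library for proteins with affinity for one or more given targets (referred to as "fusion proteins"), wherein the quality of the properties of the selected fusion proteins verifies the potential of the protein template to produce antibody mimetics.
4. The method of claim (3), wherein the protein display is one of: (i) phage display; (ii) yeast display; (iii) mRNA display; and (iv) ribosome display.
5. The method of claim (3), wherein the properties of the variant protein or fusion protein that are analyzed include: (i) thermal stability, (ii) enzymatic stability, (iii) solubility, (iv) whether it retains the introduced polypeptide's original affinity for its target, and (v) expression level.
6. A method for making an antibody mimetic, comprising the method of claim (1) or (2) and further comprising introducing point mutations into one or more identified variable regions, inserting one or more polypeptides, or partly or wholly replacing the variable region with part or all of one or more such polypeptides.
7. The method of claim (6), wherein the polypeptide is one that can take part in forming the interface between the protein template and other proteins and adopt a non-linear structure, or that can adopt a non-linear structure on its own (NLP).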
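8. The method of claim (6) or (7), further comprising replacing one or more identified variable regions with part or all of a said polypeptide that is close in length to the corresponding variable region.
9. The method of claim (6) or (7), wherein the inserted polypeptide is derived from one of: (i) a polypeptide that can itself form a cyclic structure and has targeted binding ability; (ii) part of an antibody complementarity determining region (CDR); (iv) part of the binding interface between two interacting natural proteins.
10. The method of claim (6) or (7), further comprising obtaining or making the polypeptide by one of the following: (i) selecting one or more polypeptides known to bind a target protein; (ii) screening by protein display for polypeptides that bind a target; (iii) screening disulfide-bonded non-linear polypeptides (NLPs); (iv) raising an antibody against a target and then making one or more polypeptides according to part or all of the sequence of the complementarity determining regions (CDRs) of that antibody; and (v) selecting a segment of the binding interface between two interacting natural proteins as the polypeptide.
11. The method of claim (10), wherein the protein display is one of: (i) phage display; (ii) yeast display; (iii) mRNA display; and (iv) ribosome display.
12. The method of claim (6) or (7), further comprising altering regions outside the variable regions (referred to as invariable regions) to further improve the antibody mimetic.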
权利要求 (12) 的方法, 其中对不可变区的改变包括: (i) 将不可变区的 N端 或 C端序列进行增删, ( ii ) 将所述 N端或 C端改造为适合表达宿主的序列, 和
(iii) 将不可变区中连接二级结构的连接区的残基替换为侧链较短的残基。 权利要求 (13) 的方法, 其中侧链较短的残基是甘氨酸、 丙氨酸和丝氨酸。 权利要求 (1) 的方法, 可变区用以下方法鉴定:
(i) 选择一或多个与该蛋白结构相似的蛋白, 和该蛋白组成一个蛋白结构组;
(ϋ) 用一或多种常用来描述蛋白结构的数据和一或多个描述不完全数据的随 机性的数学模型来描述该蛋白结构组的结构特征 (称为结构谱);
(iii) 用随机抽样方法来更新所述模型及有关参数, 直至模型收敛, 从而估计 出结构谱;
(iv) 将该蛋白组结构中倾向于偏离了结构最常见状态, 而呈现出不常见状态 的结构区域鉴定为可变区。
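16. The method of claim 15, wherein the data describing protein structure are three-dimensional Euclidean coordinates.
17. The method of claim 16, wherein the three-dimensional Euclidean coordinates describe all atoms of the protein, the alpha carbon (Cα), beta carbon (Cβ), gamma carbon (Cγ), delta carbon (Cδ), epsilon carbon (Cε) or other atom types, or a combination of these atom types.
18. The method of claim 15, wherein the data describing protein structure are a protein contact map.
19. The method of claim 15, wherein the mathematical model describing the randomness of incomplete data in step (ii) is a hidden Markov model in which each node has three states, M (Match, homologous and conserved), I (Insert, random space) and D (Deletion), and these three states follow certain probability distributions; and wherein step (iv) identifies as variable regions those structural regions of the protein group that tend to deviate from the M state or that adopt the I state.
20. The method of claim 19, wherein the probability distribution followed by the three states is a Gaussian, Beta or exponential distribution.
21. The method of claim 15, wherein the mathematical model describing the randomness of incomplete data has explicitly defined parameters that distinguish protein structural flexibility arising from the following three factors: (i) intrinsic flexibility caused by thermal stability; (ii) intrinsic flexibility not caused by thermal stability; and (iii) structural deviations that can be tolerated in the course of natural or artificial evolution.
22. The method of claim 19, wherein the structures of the protein group are treated as random three-dimensional point sets generated by a random path (A) that follows a profile (G), random variables (Y) produced according to emission probabilities, and rotation (R) and translation (V) operations; wherein the random sampling method is a Monte Carlo method, and the model parameters updated by random sampling are the profile (G), random path (A), random variables (Y), rotation (R) and translation (V).
23. The method of claim 22, wherein the joint or conditional probabilities involved in the random path are obtained by the Forward or Viterbi algorithm.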
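
The Forward algorithm named in the preceding claim can be illustrated with a toy model. The sketch below is a deliberately simplified stand-in, not the model of the application: it uses only two states (M and I), discrete observations ("c" for a position close to the profile, "f" for far from it) instead of continuous three-dimensional coordinates, and invented probabilities. All names and numbers are hypothetical.

```python
# Toy parameters, invented for illustration only.
START = {"M": 0.8, "I": 0.2}
TRANS = {"M": {"M": 0.9, "I": 0.1}, "I": {"M": 0.2, "I": 0.8}}
EMIT = {"M": {"c": 0.9, "f": 0.1}, "I": {"c": 0.5, "f": 0.5}}


def forward_probability(observations, start, trans, emit):
    """Return P(observations) under a discrete HMM, summed over all state paths."""
    if not observations:
        raise ValueError("observations must be non-empty")
    states = list(start)
    # Initialisation: start probability times emission of the first symbol.
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    # Recursion: propagate through the transitions, then emit the next symbol.
    for obs in observations[1:]:
        alpha = {
            s: emit[s][obs] * sum(alpha[p] * trans[p][s] for p in states)
            for s in states
        }
    # Termination: sum over the final states.
    return sum(alpha.values())
```

Replacing the sum in the recursion with a maximum, and recording which predecessor achieved it, turns the same loop into the Viterbi algorithm, which returns the single most probable state path rather than the total probability.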
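24. The method of claim 22, wherein random sampling is performed at least 100 times, further comprising, for each sample, checking the node state corresponding to each residue of the protein structure: (i) if the node state is the I state, marking the residue as belonging to a potential variable region; and (ii) if the node state is the M state but the spatial position of the residue deviates greatly from the emission probability distribution of that M state, marking the residue as belonging to a potential variable region.
25. The method of claim 24, wherein, over the total of at least 100 samples, residues whose cumulative count of being marked as belonging to a potential variable region exceeds a certain proportion are finally regarded as variable regions.
26. The method of claim 24, wherein deviating greatly means an emission probability of less than 0.05.
27. The method of claim 25, wherein a residue is finally regarded as belonging to a variable region if its cumulative count of being marked as belonging to a potential variable region exceeds 95%.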
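
The marking and voting rule of claims 24 to 27 can be sketched as follows. This is a minimal illustration under stated assumptions: each sample is assumed to be already summarised as one (state, emission probability) pair per residue, the Monte Carlo sampler that produces those pairs is not shown, and the function names and data layout are hypothetical.

```python
def is_flagged(state: str, emission_prob: float, cutoff: float = 0.05) -> bool:
    """Flag one residue in one sample: I state, or M state with a low emission probability."""
    if state == "I":
        return True
    return state == "M" and emission_prob < cutoff


def variable_residues(samples, min_samples: int = 100, percent: int = 95) -> list:
    """Return indices of residues flagged in more than `percent` % of the samples.

    `samples` is a list of samples; each sample is a list with one
    (state, emission_prob) tuple per residue.
    """
    if len(samples) < min_samples:
        raise ValueError("need at least %d samples" % min_samples)
    n_residues = len(samples[0])
    counts = [0] * n_residues
    for sample in samples:
        if len(sample) != n_residues:
            raise ValueError("all samples must cover the same residues")
        for i, (state, prob) in enumerate(sample):
            if is_flagged(state, prob):
                counts[i] += 1
    # Integer comparison avoids floating-point edge cases at exactly 95 %.
    return [i for i, c in enumerate(counts) if c * 100 > percent * len(samples)]
```

The comparison is strict because claim 27 speaks of the count exceeding 95%; a residue flagged in exactly 95 of 100 samples is therefore not selected in this sketch.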
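28. The method of claim 2, wherein using the protein's own sequence information to prioritize the identified variable regions comprises:
(i) selecting one or more proteins similar in sequence to the protein, which together with the protein form a protein group;
(ii) performing a multiple sequence alignment of the protein group, building a phylogenetic tree, and, according to a molecular evolution model, calculating the evolutionary rate of each site and scoring the conservation of each site; and
(iii) using the site scores obtained in step (ii) to prioritize variable regions, that is, sites with lower scores are more likely to belong to variable regions and are therefore selected preferentially.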
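
The ranking idea in claim 28 can be sketched with a much simpler conservation score than the one claimed. The example below swaps in per-column Shannon entropy for the tree-based evolutionary rate, so it is a stand-in and not the claimed calculation; it assumes an already aligned set of sequences with "-" for gaps, and every function name is illustrative.

```python
import math
from collections import Counter


def column_conservation(column: str) -> float:
    """Score one alignment column: 1.0 is fully conserved, lower is more variable."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    entropy = -sum(
        (k / total) * math.log2(k / total) for k in Counter(residues).values()
    )
    # Normalise by the maximum entropy of a 20-letter amino acid alphabet.
    return 1.0 - entropy / math.log2(20)


def conservation_scores(alignment) -> list:
    """Score every column of an alignment given as equal-length strings."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [column_conservation("".join(col)) for col in zip(*alignment)]


def rank_regions(scores, regions) -> list:
    """Order candidate regions (start, end), end exclusive, lowest mean score first."""
    def mean_score(region):
        start, end = region
        return sum(scores[start:end]) / (end - start)
    return sorted(regions, key=mean_score)
```

Regions at the front of the list returned by `rank_regions` are the least conserved and, following step (iii), would be preferred as variable regions.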
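29. A polypeptide or protein whose sequence is SEQ ID NO: 1, or in which at least one amino acid is changed compared with the wild-type sequence of the gene corresponding to SEQ ID NO: 1 (i.e. 1x5j).
30. A polypeptide or protein whose sequence is SEQ ID NO: 16, or in which at least one amino acid is changed compared with the wild-type sequence of the gene corresponding to SEQ ID NO: 16 (i.e. 1k1g).
31. A polypeptide or protein whose sequence is one of the following sequences or is at least 75% homologous to one of them:
(i) SEQ ID NO: 1, wherein the variable regions comprise the stretches between amino acids 32 and 43, between amino acids 55 and 58, and between amino acids 90 and 93; (ii) SEQ ID NO: 15, wherein the variable region comprises the stretch between amino acids 72 and 81; (iii) SEQ ID NO: 16, wherein the variable regions comprise the stretches between amino acids 10 and 15 and between amino acids 45 and 68; (iv) SEQ ID NO: 17, wherein the variable regions comprise the stretches between amino acids 67 and 71, between amino acids 86 and 91, and between amino acids 96 and 101.
32. The polypeptide or protein of claim 31, which is at least 80% homologous to the listed sequences.
33. The polypeptide or protein of claim 31, which is at least 85% homologous to the listed sequences.
34. The polypeptide or protein of claim 31, which is at least 90% homologous to the listed sequences.
35. The polypeptide or protein of claim 31, which is at least 95% homologous to the listed sequences.
36. The polypeptide or protein of claim 31, which is at least 99% homologous to the listed sequences.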
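
The claims state homology thresholds from 75% to 99% without saying how the percentage is computed. One common reading is percent identity over a pairwise alignment, which the sketch below assumes. It takes two sequences that are already aligned (equal length, "-" for gaps), counts a gap opposite a residue as a mismatch, and ignores columns where both are gaps. These conventions and the function names are assumptions, not definitions taken from the application.

```python
THRESHOLDS = (75, 80, 85, 90, 95, 99)  # the tiers that appear in the claims


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


def highest_tier(identity: float):
    """Return the highest claimed threshold that `identity` satisfies, or None."""
    satisfied = [t for t in THRESHOLDS if identity >= t]
    return max(satisfied) if satisfied else None
```

A sequence differing from an eight-residue reference at one position is 87.5% identical under this convention, so it would meet the 85% tier but not the 90% tier.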
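37. The polypeptide or protein of any of claims 31 to 36, wherein a variable region has inserted into it one of the following polypeptides or a sequence at least 75% homologous to one of them, or wherein a variable region is partially or wholly replaced by part or all of the sequence of said polypeptide:
(i) SEQ ID NO: 2; (ii) SEQ ID NO: 3; (iii) SEQ ID NO: 4; (iv) SEQ ID NO: 5;
(v) SEQ ID NO: 6; (vi) SEQ ID NO: 7; and (vii) SEQ ID NO: 8.
38. The polypeptide or protein of claim 37, wherein a variable region has inserted into it one of said polypeptides or a sequence at least 80% homologous to one of them, or wherein a variable region is partially or wholly replaced by part or all of the sequence of said polypeptide.
39. The polypeptide or protein of claim 37, wherein a variable region has inserted into it one of said polypeptides or a sequence at least 85% homologous to one of them, or wherein a variable region is partially or wholly replaced by part or all of the sequence of said polypeptide.
40. The polypeptide or protein of claim 37, wherein a variable region has inserted into it one of said polypeptides or a sequence at least 90% homologous to one of them, or wherein a variable region is partially or wholly replaced by part or all of the sequence of said polypeptide.
41. The polypeptide or protein of claim 35, wherein a variable region has inserted into it one of said polypeptides or a sequence at least 95% homologous to one of them, or wherein a variable region is partially or wholly replaced by part or all of the sequence of said polypeptide.
42. The polypeptide or protein of claim 37, wherein a variable region has inserted into it one of said polypeptides or a sequence at least 99% homologous to one of them, or wherein a variable region is partially or wholly replaced by part or all of the sequence of said polypeptide.
43. An isolated nucleic acid molecule encoding the polypeptide or protein of any of claims 29 to 42.
44. An expression vector comprising the nucleic acid molecule of claim 43.
45. An expression vector capable of expressing the polypeptide or protein of any of claims 29 to 42.
46. An expression vector whose sequence is SEQ ID NO: 14 or a sequence at least 75% homologous to SEQ ID NO: 14.
47. The expression vector of claim 46, whose sequence is SEQ ID NO: 14 or a sequence at least 80% homologous to SEQ ID NO: 14.
48. The expression vector of claim 47, whose sequence is SEQ ID NO: 14 or a sequence at least 85% homologous to SEQ ID NO: 14.
49. The expression vector of claim 48, whose sequence is SEQ ID NO: 14 or a sequence at least 90% homologous to SEQ ID NO: 14.
50. The expression vector of claim 49, whose sequence is SEQ ID NO: 14 or a sequence at least 95% homologous to SEQ ID NO: 14.
51. The expression vector of claim 50, whose sequence is SEQ ID NO: 14 or a sequence at least 99% homologous to SEQ ID NO: 14.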
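52. A macromolecule comprising the following two parts:
(i) a biologically functional polypeptide or protein whose sequence is one of the following sequences or is at least 75% homologous to one of them: (a) SEQ ID NO: 25; (b) SEQ ID NO: 26; and (c) SEQ ID NO: 43; and
(ii) a serum albumin-targeting polypeptide or protein whose sequence is one of the following sequences or is at least 75% homologous to one of them: (a) SEQ ID NO: 27; (b) SEQ ID NO: 28; (c) SEQ ID NO: 29; (d) SEQ ID NO: 30; (e) SEQ ID NO: 31; (f) SEQ ID NO: 32; (g) SEQ ID NO: 33; (h) SEQ ID NO: 34; and (i) SEQ ID NO: 35.
53. The macromolecule of claim 52, further comprising, between the biologically functional polypeptide (first part) and the serum albumin-targeting polypeptide (second part), a linker molecule (third part) whose molecular weight is between 300 and 5,500.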
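
For a peptide linker, the molecular-weight window of claim 53 can be checked from the sequence alone. The sketch below assumes the claimed figures are in daltons and uses standard average residue masses plus one water molecule for the free termini; the claim itself does not state the unit or the calculation method, and the function names are illustrative.

```python
# Standard average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def peptide_average_mass(sequence: str) -> float:
    """Average molecular mass of a linear, unmodified peptide."""
    if not sequence:
        raise ValueError("sequence must be non-empty")
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def linker_in_window(sequence: str, low: float = 300.0, high: float = 5500.0) -> bool:
    """True if the peptide's average mass lies within the claimed window."""
    return low <= peptide_average_mass(sequence) <= high
```

Under these assumptions a single GGGGS unit weighs about 333 Da, just above the lower bound, which suggests that even the shortest linkers in the sequence table fall within the claimed window.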
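54. The macromolecule of claim 52 or 53, wherein the biologically functional polypeptide is one of the following mutants of SEQ ID NO: 26 (i.e. GLP-1):
(i) the A8G, R36G and G37K mutants;
(ii) His1-modified GLP-1 mutants, specifically: desamino-GLP-1, (D-His1)GLP-1, N-glucitol-GLP-1, N-imidazole-GLP-1, N-α-methyl-GLP-1, N-methyl-GLP-1, N-acetyl-GLP-1 and N-pyroglutamyl-GLP-1;
(iii) Ala2 GLP-1 mutants, specifically: (D-Ala2)GLP-1, (Gly2)GLP-1, (Ser2)GLP-1, (Aha2)GLP-1, (Thr2)GLP-1, (Aib2)GLP-1, (Abu2)GLP-1 and (Val2)GLP-1;
(iv) Glu3 GLP-1 mutants, specifically: (Asp3)GLP-1, (Ala3)GLP-1, (Pro3)GLP-1, (Phe3)GLP-1, (Lys3)GLP-1 and (Tyr3)GLP-1;
(v) the mutant KGLP-1, in which a lysine residue is added to the N-terminus of GLP-1.
55. The macromolecule of claim 53 or 54, wherein the three parts are joined together as a fusion protein or by conjugation.
56. The macromolecule of any of claims 53 to 55, wherein the linker molecule is a non-polypeptide molecule.
57. The macromolecule of claim 56, wherein the linker molecule is one of the following molecules or any combination of them: polyethylene glycol, polypropylene glycol, (ethylene/propylene) glycol copolymer, polyoxyethylene, polyurethane, polyphosphazene, polysaccharide, dextran, polyvinyl alcohol, polyvinylpyrrolidone, polyvinyl ethyl ether, polyacrylamide, polypropylene, polycyano polymer, lipid polymer, chitin, hyaluronic acid and heparin.
58. The macromolecule of any of claims 53 to 55, wherein the linker molecule is a polypeptide, which may consist of natural or non-natural amino acids.
59. The macromolecule of claim 58, wherein the polypeptide of the linker molecule consists of natural amino acids.
60. The macromolecule of claim 59, wherein the natural amino acids forming the polypeptide are natural amino acids that can form proteins.
61. The macromolecule of claim 60, wherein the natural amino acids forming the polypeptide are natural amino acids directly encoded by the genetic code.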
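62. The macromolecule of claim 61, wherein the polypeptide sequence is one of the following sequences or is at least 75% homologous to one of them: (a) SEQ ID NO: 36; (b) SEQ ID NO: 37; (c) SEQ ID NO: 38; (d) SEQ ID NO: 39; (e) SEQ ID NO: 40; (f) SEQ ID NO: 41; and (g) SEQ ID NO: 42.
63. The macromolecule of claim 62, wherein the polypeptide sequence is at least 80% homologous to the listed sequences.
64. The macromolecule of claim 63, wherein the polypeptide sequence is at least 85% homologous to the listed sequences.
65. The macromolecule of claim 64, wherein the polypeptide sequence is at least 90% homologous to the listed sequences.
66. The macromolecule of claim 65, wherein the polypeptide sequence is at least 95% homologous to the listed sequences.
67. The macromolecule of claim 66, wherein the polypeptide sequence is at least 99% homologous to the listed sequences.
68. The macromolecule of any of claims 52 to 67, wherein the biologically functional polypeptide is at least 80% homologous to the listed sequences and the serum albumin-targeting polypeptide is at least 80% homologous to the listed sequences.
69. The macromolecule of any of claims 52 to 67, wherein the biologically functional polypeptide is at least 85% homologous to the listed sequences and the serum albumin-targeting polypeptide is at least 85% homologous to the listed sequences.
70. The macromolecule of any of claims 52 to 67, wherein the biologically functional polypeptide is at least 90% homologous to the listed sequences and the serum albumin-targeting polypeptide is at least 90% homologous to the listed sequences.
71. The macromolecule of any of claims 52 to 67, wherein the biologically functional polypeptide is at least 95% homologous to the listed sequences and the serum albumin-targeting polypeptide is at least 95% homologous to the listed sequences.
72. The macromolecule of any of claims 52 to 67, wherein the biologically functional polypeptide is at least 99% homologous to the listed sequences and the serum albumin-targeting polypeptide is at least 99% homologous to the listed sequences.
73. An isolated nucleic acid molecule encoding the polypeptide or protein in the macromolecule of any of claims 52 to 72.
74. An expression vector comprising the nucleic acid molecule of claim 73.
75. An expression vector capable of expressing the polypeptide or protein in the macromolecule of any of claims 52 to 72.
76. A medicament or vaccine comprising the polypeptide or protein of any of claims 29 to 42, or the macromolecule of any of claims 52 to 72, or the nucleic acid molecule of claim 43 or 73, or the expression vector of any of claims 44 to 51 and 74 to 75.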
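PCT/CN2013/075460 2013-05-10 2013-05-10 Method for engineering non-antibody proteins to produce binding molecules, products produced thereby, and a long-acting GLP-1 receptor agonist WO2014179983A1 (zh)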

Priority Applications (2)

Application Number Priority Date Filing Date Title
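PCT/CN2013/075460 WO2014179983A1 (zh) 2013-05-10 2013-05-10 Method for engineering non-antibody proteins to produce binding molecules, products produced thereby, and a long-acting GLP-1 receptor agonist
CN201380075612.0A CN105143250B (zh) 2013-05-10 2013-05-10 Method for engineering non-antibody proteins to produce binding molecules, products thereof, and a long-acting GLP-1 receptor agonist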

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
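PCT/CN2013/075460 WO2014179983A1 (zh) 2013-05-10 2013-05-10 Method for engineering non-antibody proteins to produce binding molecules, products produced thereby, and a long-acting GLP-1 receptor agonist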

Publications (1)

Publication Number Publication Date
WO2014179983A1 true WO2014179983A1 (zh) 2014-11-13

Family

ID=51866644

Family Applications (1)

Application Number Title Priority Date Filing Date
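PCT/CN2013/075460 WO2014179983A1 (zh) 2013-05-10 2013-05-10 Method for engineering non-antibody proteins to produce binding molecules, products produced thereby, and a long-acting GLP-1 receptor agonist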

Country Status (2)

Country Link
CN (1) CN105143250B (zh)
WO (1) WO2014179983A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106110325A (zh) * 2016-06-08 2016-11-16 上海朗安生物技术有限公司 Preparation method of a novel GLP-1 receptor agonist and its application in the treatment of neurodegenerative diseases
US11123405B2 (en) 2015-12-23 2021-09-21 The Johns Hopkins University Long-acting GLP-1R agonist as a therapy of neurological and neurodegenerative conditions

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112420124B (zh) * 2021-01-19 2021-04-13 腾讯科技(深圳)有限公司 Data processing method, apparatus, computer device and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6818418B1 (en) * 1998-12-10 2004-11-16 Compound Therapeutics, Inc. Protein scaffolds for antibody mimics and other binding proteins
CN102775487A (zh) * 2012-06-07 2012-11-14 北京华金瑞清生物医药技术有限公司 A class of artificial targeting fusion proteins and protein conjugates, and preparation method and application thereof

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
AKIKO KOIDE ET AL.: "The Fibronectin Type III Domain as a Scaffold for Novel Binding Proteins.", J. MOL. BIOL., vol. 284, 1998, pages 1141 - 1151 *
LIU Z ET AL.: "Chain A, Structural Basis For Recognition Of The Intron Branch Site Rna By Splicing Factor 1.", PDB: 1K1G_A, 25 September 2001 (2001-09-25) *
TOCHIO N ET AL.: "Chain A, The Solution Structure of the Fifth Fibronectin Type III Domain of Human Neogenin.", PDB: 1X5J_A, 15 May 2005 (2005-05-15) *
YUAN LI ET AL.: "Overview of Scaffold Protein Used for Selection of Artificial Binding Proteins.", CHINA BIOTECHNOLOGY, vol. 33, no. 1, January 2013 (2013-01-01), pages 95 - 103 *

Also Published As

Publication number Publication date
CN105143250B (zh) 2020-11-03
CN105143250A (zh) 2015-12-09

Similar Documents

Publication Publication Date Title
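JP7217783B2 (ja) Muteins of human lipocalin 2 with affinity for glypican-3 (GPC3)
KR101516023B1 (ko) Tear lipocalin muteins and methods for obtaining the same
JP4907542B2 (ja) Protein complexes for use in therapy, diagnosis and chromatography
CN103459415B (zh) Designed repeat proteins binding to serum albumin
JP4989638B2 (ja) Scaffold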
WO2014071978A1 (en) Nucleic acids encoding chimeric polypeptides for library screening
US9492572B2 (en) Dimeric binding proteins based on modified ubiquitins
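KR20210111761A (ko) Reagents and methods for controlling protein function and interaction
CN115461068A (zh) Identification of biomimetic viral peptides and uses thereof
WO2014179983A1 (zh) Method for engineering non-antibody proteins to produce binding molecules, products produced thereby, and a long-acting GLP-1 receptor agonist
JP6629325B2 (ja) Affinity proteins and uses thereof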
US20100055715A1 (en) Nucleic and amino acid sequences of prokaryotic ubiquitin-like protein and methods of use thereof
US9097721B2 (en) Compositions comprising engineered phosphothreonine affinity reagents, methods of making, and methods of use
Shrivastava et al. Plasmodium falciparum FIKK9. 1 is a monomeric serine–threonine protein kinase with features to exploit as a drug target.
US9340584B2 (en) Engineered thioredoxin-like fold proteins
Gold et al. Engineering AKAP-selective regulatory subunits of PKA through structure-based phage selection
Mechulam et al. Translation initiation
US20050191628A1 (en) Antibiotics based upon bacteriophage lysis proteins
Gopinathan Nair Regulation of replication dependent nucleosome assembly
Yong Structure and Functional Characterization of ERG and SPOP
Leopold HP1 protein Chp2 selectively recruits nucleosome remodeler through non-canonical interaction
Deyle Development of Protein-Catalyzed Capture (PCC) Agents with Application to the Specific Targeting of the E17K Point Mutation of Akt1
Hackel Fibronectin domain engineering
JP2020029403A (ja) Peptides that strongly bind to and activate cathepsin E
WO2008024128A2 (en) Loop-variant pdz domains as biotherapeutics, diagnostics and research reagents

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase (Ref document number: 201380075612.0; Country of ref document: CN)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 13884254; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 13884254; Country of ref document: EP; Kind code of ref document: A1)