CN116333055A

CN116333055A - Ultrahigh affinity small protein targeting COVID-19 virus S protein and application thereof

Info

Publication number: CN116333055A
Application number: CN202111585512.5A
Authority: CN
Inventors: 赵磊; 张帆; 胡毅; 姚咏明
Original assignee: Chinese PLA General Hospital
Current assignee: Chinese PLA General Hospital
Priority date: 2021-12-19
Filing date: 2021-12-19
Publication date: 2023-06-27

Abstract

The invention provides an ultrahigh-affinity small protein targeting the Spike protein (S protein) of the Delta mutant of the novel coronavirus (COVID-19 virus), and applications thereof. In particular, the invention provides a binding protein that targets the Spike protein of the novel coronavirus Delta mutant strain with ultrahigh affinity. The protein binds within the RBD region of the S protein and blocks binding of the novel coronavirus to the ACE2 receptor, thereby blocking invasion of host cells by the virus. The invention also provides an ultrahigh-affinity fusion protein comprising the small protein targeting the novel coronavirus Delta mutant S protein. The fusion protein has good neutralizing protective activity not only against the COVID-19 virus Delta mutant but also against the wild-type novel coronavirus and its main mutants Alpha, Beta and Gamma, and thus has broad-spectrum blocking protective activity against the novel coronavirus.

Description

Ultrahigh affinity small protein targeting COVID-19 virus S protein and application thereof

Technical Field

The invention belongs to the field of biotechnology and medicine, and particularly relates to an ultrahigh-affinity small protein targeting the COVID-19 virus S protein and a fusion protein thereof.

Background

The COVID-19 virus is a single-stranded RNA virus that invades the body mainly through the Spike proteins (S proteins) distributed on the viral surface, which bind specifically to the host ACE2 receptor protein.

The coronavirus surface spike protein (S) is a large class I fusion protein. The S protein forms a trimeric complex that can be functionally divided into two distinct subunits, S1 and S2, separated by a protease cleavage site. The S1 subunit comprises a receptor binding domain (RBD) that interacts with host cell receptor proteins, triggering membrane fusion. The S2 subunit comprises the membrane fusion machinery, including a hydrophobic fusion peptide and alpha-helical heptad repeat regions. Coronaviruses mediate viral invasion by binding to host cell surface receptors via the RBD of the S protein. The S protein is therefore regarded as an important target for preventing and treating coronavirus infection of host cells. At present, both vaccines (recombinant and mRNA vaccines) and neutralizing antibodies against COVID-19 take the S protein as their main target for vaccine preparation and antibody screening.

In view of the foregoing, there is a strong need in the art for the development of a drug that can more effectively block the invasion of new coronaviruses into a host.

Disclosure of Invention

The invention aims to provide a novel coronavirus S protein targeted ultrahigh affinity small protein which can more efficiently block the combination of the S protein and ACE2 protein so as to further block the invasion of a novel coronavirus into host cells.

The invention also aims to provide a fusion protein based on the ultrahigh-affinity small protein targeting the novel coronavirus S protein, and a preparation method thereof.

In a first aspect of the invention, there is provided a small protein targeting the S protein of a novel coronavirus, which small protein is capable of specifically targeting the S protein of a novel coronavirus, exhibiting a super strong affinity, and being capable of competing with ACE2 for binding to the S protein, effectively blocking the binding of the S protein of the wild type of a novel coronavirus and its alpha, beta, gamma and Delta mutants to the ACE2 protein.

In another preferred embodiment, the small protein has a peptide chain that forms mainly three alpha-helical secondary structures.

In another preferred embodiment, the small protein is capable of specifically targeting and binding the S protein of the novel coronavirus, exhibits a strong affinity, and exhibits strong binding activity against the S proteins of the novel coronavirus wild type, Alpha, Beta, Gamma and Delta, respectively; wherein the small protein is composed of one peptide chain and mainly forms three alpha-helix secondary structures;

and the amino acid sequence of the small protein comprises or is as shown in SEQ ID NO: 1.

In another preferred embodiment, the small protein is capable of specifically targeting the S protein of the novel coronavirus Delta mutant, exhibits an ultra-strong affinity, and is capable of competing with the ACE2 receptor for binding to the novel coronavirus S protein, effectively blocking the binding of the novel coronavirus S protein to the ACE2 receptor protein; and it shows good neutralizing protective activity against the wild type, Alpha, Beta and Gamma forms of the novel coronavirus; wherein the small protein is composed of one peptide chain and mainly forms three alpha-helix secondary structures;

and the amino acid sequence of the small protein comprises or is as shown in SEQ ID NO: 3 or 5.

In another preferred embodiment, the amino acid sequence of the small protein is as set forth in SEQ ID NO: 1, 3 or 5. The invention also provides a recombinant protein comprising two or more S protein-targeting small proteins of the invention in tandem.

In a second aspect of the invention, there is provided a fusion protein comprising a first polypeptide and/or a second polypeptide;

wherein the first polypeptide has a structure shown in a formula I from the N end to the C end, the second polypeptide has a structure shown in a formula II from the N end to the C end,

P-Mx-H-Fc (formula I)

P-Fc-H-Mx (formula II)

Wherein,

P is absent or is a signal peptide sequence;

M is an S protein binding region (or binding element) whose amino acid sequence is derived from the amino acid sequence of a small protein that targets an S protein as described in the first aspect;

H is a hinge region;

Fc is absent, or is a constant region of an immunoglobulin or a fragment thereof;

"-" means a peptide bond or a connecting peptide to which the above element is attached;

x is a positive integer from 1 to 4.
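As an illustration only, the element order in formulas I and II can be sketched in Python. The bracketed tokens below are hypothetical placeholders and not the sequences of any SEQ ID NO; the function name `assemble_fusion` and its parameters are assumptions made for this sketch. The "-" of the formulas is modelled as a direct peptide bond unless a connecting peptide is passed as `linker`.

```python
def assemble_fusion(m, x, hinge, fc="", signal="", formula="I", linker=""):
    """Join elements from N-terminus to C-terminus.

    formula "I":  P-Mx-H-Fc
    formula "II": P-Fc-H-Mx
    Empty elements (P or Fc absent) are skipped.
    """
    if not isinstance(x, int) or not 1 <= x <= 4:
        raise ValueError("x must be an integer from 1 to 4")
    if formula not in ("I", "II"):
        raise ValueError("formula must be 'I' or 'II'")

    # Mx: x copies of the S protein binding element in tandem
    mx = linker.join([m] * x)

    if formula == "I":
        parts = [signal, mx, hinge, fc]
    else:
        parts = [signal, fc, hinge, mx]
    return linker.join(p for p in parts if p)


# Hypothetical placeholder tokens, for illustration only
print(assemble_fusion("[M]", 2, "[H]", "[Fc]", "[P]"))        # formula I
print(assemble_fusion("[M]", 1, "[H]", "[Fc]", formula="II"))  # formula II
```

With x = 2 and formula I, the sketch places two copies of M in tandem ahead of the hinge and Fc, matching the preferred value of x stated later in this section.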

In another preferred embodiment, the "amino acid sequence from the S protein targeting small protein" means that the amino acid sequence of the S protein binding domain (or binding element) is identical or substantially identical (i.e., 90% or more, preferably 95% or more, more preferably 98% or more homologous) to the amino acid sequence of the S protein targeting small protein, and the S protein binding domain (or binding element) retains binding activity (preferably 70% or more, more preferably 80% or more binding activity) to the S protein of the novel coronavirus Delta mutant.
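The 90%, 95% and 98% thresholds above can be checked numerically. The text does not state how homology is computed, so the following is only a minimal sketch that assumes two sequences that are already aligned and of equal length, and counts identical positions. The function names and the toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percentage of identical positions between two pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    # A gap character never counts as a match
    matches = sum(1 for p, q in zip(a, b) if p == q and p != "-")
    return 100.0 * matches / len(a)


def homology_tier(identity):
    """Map an identity percentage onto the tiers named in the text."""
    if identity >= 98:
        return "more preferred (98% or more)"
    if identity >= 95:
        return "preferred (95% or more)"
    if identity >= 90:
        return "substantially identical (90% or more)"
    return "below the 90% threshold"


# Toy 20-residue strings differing at one position (hypothetical)
reference = "A" * 20
candidate = "A" * 19 + "G"
identity = percent_identity(reference, candidate)
print(identity, homology_tier(identity))
```

A real comparison would first align the sequences with an established alignment tool; this sketch covers only the thresholding step.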

In another preferred embodiment, the amino acid sequence of P is selected from the group consisting of:

(i) A sequence as shown in SEQ ID NO. 17;

(ii) An amino acid sequence obtained, on the basis of SEQ ID NO. 17, by substitution, deletion, alteration or insertion of one or more amino acid residues, or by addition of 1 to 10 amino acid residues, more preferably 1 to 5 amino acid residues, at the N-terminus or C-terminus thereof.

In another preferred embodiment, the nucleotide sequence encoding said P is shown as SEQ ID NO. 18.

In another preferred embodiment, the fusion protein is a monomer or dimer.

In another preferred embodiment, the fusion protein is a homodimer or a heterodimer.

In another preferred example, disulfide bonds may be formed between the first polypeptide and the first polypeptide, between the second polypeptide and the second polypeptide, or between the first polypeptide and the second polypeptide through cysteine C on the respective Fc.

In another preferred embodiment, the dimer is selected from the group consisting of: a homodimer formed by two first polypeptides, a homodimer formed by two second polypeptides, or a heterodimer formed by a first polypeptide and a second polypeptide.

In another preferred embodiment, the fusion protein is a homodimer formed from two first polypeptides.

In another preferred embodiment, the sequence of M is shown in SEQ ID NO. 1, 3 or 5.

In another preferred embodiment, said x is 1, 2, 3 or 4, preferably 2.

In another preferred embodiment, the H is the hinge region of a human immunoglobulin.

In another preferred embodiment, the human immunoglobulin is selected from the group consisting of: IgG1, IgG4, or a combination thereof.

In another preferred embodiment, the human immunoglobulin is IgG1.

In another preferred embodiment, the amino acid sequence of H is selected from the group consisting of:

(i) A sequence shown in SEQ ID NO. 7;

(ii) An amino acid sequence obtained by substitution, deletion, alteration or insertion of one or more amino acid residues, or addition of 1 to 10 amino acid residues, more preferably 1 to 5 amino acid residues, at the N-terminus or C-terminus thereof, based on SEQ ID NO. 7.

In another preferred embodiment, the nucleotide sequence encoding said H is shown in SEQ ID NO. 8.

In another preferred embodiment, the Fc is a constant region of a human immunoglobulin or a fragment thereof.

In another preferred embodiment, the Fc is a tandem sequence of the CH2 and CH3 regions of a human immunoglobulin, or is only the CH3 region of a human immunoglobulin.

In another preferred embodiment, the amino acid sequence of the Fc is selected from the group consisting of:

(i) A sequence shown as SEQ ID NO. 9;

(ii) An amino acid sequence obtained, on the basis of SEQ ID NO. 9, by substitution, deletion, alteration or insertion of one or more amino acid residues, or by addition of 1 to 30 amino acid residues, preferably 1 to 10 amino acid residues, more preferably 1 to 5 amino acid residues, at the N-terminus or C-terminus thereof.

In another preferred embodiment, the nucleotide sequence encoding the Fc is shown as SEQ ID NO. 10.

In another preferred embodiment, the amino acid sequence of the first polypeptide is selected from the group consisting of:

(i) A sequence as shown in SEQ ID NO. 11, 13 or 15;

(ii) An amino acid sequence obtained, on the basis of SEQ ID NO. 11, 13 or 15, by substitution, deletion, alteration or insertion of one or more amino acid residues, or by addition of 1 to 30 amino acid residues, preferably 1 to 10 amino acid residues, more preferably 1 to 5 amino acid residues, at the N-terminus or C-terminus thereof.

In another preferred embodiment, the amino acid sequence of the first polypeptide is shown in SEQ ID NO. 11, 13 or 15, and the nucleotide sequence encoding the first polypeptide is shown in SEQ ID NO. 12, 14 or 16.

In a third aspect of the invention, there is provided a polynucleotide encoding a small protein of the first aspect of the invention that targets an S protein, or a recombinant protein thereof, or a fusion protein of the second aspect of the invention.

In another preferred embodiment, the polynucleotide has the sequence shown in SEQ ID NO. 2, 4, 6, 12, 14 or 16.

In a fourth aspect of the invention there is provided a vector comprising a polynucleotide according to the third aspect of the invention.

In another preferred embodiment, the vector is: a pET vector, a pGEM-T vector, pcDNA3.1, or a combination thereof.

In a fifth aspect of the invention there is provided a host cell comprising the vector of the fourth aspect or having integrated into its genome the polynucleotide of the third aspect.

In a sixth aspect of the invention, there is provided an immunoconjugate comprising:

(a) The small protein targeting the S protein or the recombinant protein thereof according to the first aspect of the present invention or the fusion protein according to the second aspect; and

(b) A coupling moiety selected from the group consisting of: a detectable label, drug, toxin, cytokine, radionuclide, or enzyme.

In another preferred embodiment, the coupling moiety is a drug or a toxin.

In another preferred embodiment, the coupling moiety is a detectable label.

In another preferred embodiment, the conjugate is selected from the group consisting of: fluorescent or luminescent markers, radioactive markers, MRI (magnetic resonance imaging) or CT (computerized tomography) contrast agents.

In a seventh aspect of the present invention, there is provided a pharmaceutical composition comprising:

(a) The small protein targeting the S protein or the recombinant protein thereof according to the first aspect of the invention or the fusion protein according to the second aspect of the invention or the encoding gene thereof; or an immunoconjugate according to the sixth aspect of the invention; and

(b) A pharmaceutically acceptable carrier.

In another preferred embodiment, the pharmaceutical composition is used for diagnosing or treating a novel coronavirus expressing the S protein.

In another preferred embodiment, the content of component (a) is 0.1 to 99.9wt%, preferably 10 to 99.9wt%, more preferably 70 to 99.9wt%.

In another preferred embodiment, the pharmaceutical composition is in the form of an oral dosage form, an injection, or an external pharmaceutical dosage form.

In another preferred embodiment, the dosage form of the pharmaceutical composition comprises a tablet, a granule, a capsule, an oral liquid, or an injection.

In another preferred embodiment, the pharmaceutical composition or formulation is selected from the group consisting of: suspension, liquid or lyophilized formulations.

In another preferred embodiment, the liquid formulation is a water injection formulation.

In another preferred embodiment, the shelf life of the liquid formulation is one to three years, preferably one to two years, more preferably one year.

In another preferred embodiment, the liquid formulation has a storage temperature of from 0 ℃ to 16 ℃, preferably from 0 ℃ to 10 ℃, more preferably from 2 ℃ to 8 ℃.

In another preferred embodiment, the shelf life of the lyophilized formulation is from half a year to two years, preferably from half a year to one year, more preferably half a year.

In another preferred embodiment, the lyophilized formulation has a storage temperature of 42 ℃ or less, preferably 37 ℃ or less, more preferably 30 ℃ or less.

In another preferred embodiment, the pharmaceutically acceptable carrier comprises: surfactants, solution stabilizers, isotonicity adjusting agents, buffers, or combinations thereof.

In another preferred embodiment, the pharmaceutically acceptable carrier is selected from the group consisting of: infusion and/or injection carriers, preferably said carrier is one or more carriers selected from the group consisting of: normal saline, dextrose saline, or combinations thereof.

In another preferred embodiment, the solution stabilizer is selected from the group consisting of: a saccharide solution stabilizer, an amino acid solution stabilizer, an alcohol solution stabilizer, or a combination thereof.

In another preferred embodiment, the saccharide solution stabilizer is selected from the group consisting of: reducing saccharide solution stabilizers or non-reducing saccharide solution stabilizers.

In another preferred embodiment, the amino acid solution stabilizer is selected from the group consisting of: monosodium glutamate or histidine.

In another preferred embodiment, the alcoholic solution stabilizer is selected from the group consisting of: triols, higher sugar alcohols, propylene glycol, polyethylene glycols, or combinations thereof.

In another preferred embodiment, the isotonicity adjusting agent is selected from the group consisting of: sodium chloride or mannitol.

In another preferred embodiment, the buffer is selected from the group consisting of: TRIS, histidine buffer, phosphate buffer, or a combination thereof.

In another preferred embodiment, the subject to which the pharmaceutical composition or formulation is administered is a human or non-human animal.

In another preferred embodiment, the non-human animal comprises: rodents (e.g., rats, mice), primates (e.g., monkeys).

In another preferred embodiment, in the administration of the pharmaceutical composition or formulation, the amount administered is 0.01-10 g/day, preferably 0.05-5000 mg/day, more preferably 0.1-3000 mg/day.

In another preferred embodiment, the pharmaceutical composition or formulation is for use in inhibiting and/or treating viral invasion of a host, preferably for use in inhibiting and/or treating viral infection; more preferably for inhibiting and/or treating a new coronavirus infection.

In another preferred embodiment, the inhibition and/or treatment of a viral infection comprises a delay in the development of associated symptoms following a viral infection and/or a reduction in the severity of such symptoms.

In another preferred embodiment, the inhibition and/or treatment of viral infection further includes alleviation of the symptoms associated with an existing viral infection and prevention of the appearance of other symptoms.

In another preferred embodiment, the pharmaceutical composition or formulation may be administered in combination with other anti-viral or anti-inflammatory agents for the treatment of tumors.

In another preferred embodiment, the additional antiviral or anti-inflammatory agent administered in combination is selected from the group consisting of: inhibitors of viral replication, hormonal anti-inflammatory agents, biological response modifiers, monoclonal antibodies, or combinations thereof.

In another preferred embodiment, the viral replication inhibitor comprises: agents that affect viral nucleic acid synthesis and replication, agents that affect reverse transcription of nucleic acids.

In another preferred embodiment, the agent that affects viral nucleic acid synthesis and replication comprises: remdesivir.

In another preferred embodiment, the agent that acts on reverse transcription of a nucleic acid is selected from the group consisting of: molnupiravir, baloxavir, favipiravir, or combinations thereof.

In another preferred example, the hormonal anti-inflammatory drug comprises an antiestrogen, an aromatase inhibitor or an anti-androgen; preferably, the antiestrogen is selected from the group consisting of: tamoxifen, droloxifene, exemestane, or a combination thereof; the aromatase inhibitor is selected from the group consisting of: aminoglutethimide, Lentaron, letrozole, Arimidex, or a combination thereof; the anti-androgen is selected from the group consisting of: flutamide, the LH-RH agonists/antagonists Zoladex and Enantone, or a combination thereof.

In another preferred embodiment, the biological response modifier comprises: interferon, interleukin-2, thymus peptides, or a combination thereof.

In another preferred embodiment, the monoclonal antibody comprises an antibody that blocks viral entry, or an anti-inflammatory antibody; preferably, the antibody blocking viral invasion comprises: LY-CoV555, LY-CoV016, REGN10933, REGN10987, AZD8895, AZD1061, VIR-7831, BRII-196, DXP-604, or combinations thereof; the anti-inflammatory antibody comprises: tocilizumab, sarilumab, or a combination thereof.

In an eighth aspect of the invention, there is provided a method for preparing a small protein targeting the S protein of the first aspect of the invention or a recombinant protein thereof or a fusion protein according to the second aspect of the invention, comprising the steps of:

(a) Culturing the host cell according to the fifth aspect of the invention under suitable conditions, thereby obtaining a culture comprising said small protein or recombinant protein or fusion protein thereof; and

(b) Purifying and/or separating the culture obtained in the step (a) to obtain the small protein targeting the S protein or the recombinant protein or the fusion protein thereof.

In a ninth aspect of the invention, there is provided the use of a small protein targeting the S protein of the first aspect of the invention, or a fusion protein thereof, or an immunoconjugate thereof, for the preparation of a medicament, reagent, assay plate or kit; wherein the reagent, assay plate or kit is for: detecting an S protein or a new coronavirus in the sample; wherein the medicament is for the treatment and/or prophylaxis of a novel coronavirus infection.

In another preferred embodiment, the agent is used to block the invasion of a new coronavirus into a human.

In another preferred embodiment, the agent is one or more agents selected from the group consisting of: isotope tracer, contrast agent, flow detection reagent, cell immunofluorescence detection reagent, nano magnetic particle and imaging agent.

In another preferred embodiment, the agent for detecting a novel coronavirus S protein in the sample is a contrast agent for detecting a novel coronavirus or a viral S protein (in vivo).

In another preferred embodiment, the assay is an in vivo assay or an in vitro assay.

In another preferred embodiment, the detection comprises a flow assay, a cellular immunofluorescence assay, or a combination thereof.

In another preferred embodiment, the agent is used to block the interaction of the novel coronavirus S protein and ACE2 protein.

In another preferred embodiment, the novel coronaviruses include, but are not limited to: wild-type novel coronaviruses, alpha mutants, beta mutants, gamma mutants, delta mutants or combinations thereof.

In a tenth aspect of the present invention, there is provided a method of treating and/or preventing a novel coronavirus infection comprising the steps of: administering to a subject in need thereof a safe and effective amount of a small protein targeting the S protein according to the first aspect of the invention or a recombinant protein thereof or the fusion protein according to the second aspect, or the immunoconjugate according to the sixth aspect, or the pharmaceutical composition according to the seventh aspect.

In another preferred embodiment, the treatment and/or prevention of a new coronavirus infection comprises blocking the invasion of the human body by a new coronavirus.

It is understood that within the scope of the present invention, the above-described technical features of the present invention and the technical features specifically described below (e.g., in the examples) may be combined with each other to constitute new or preferred technical solutions. For reasons of space, these combinations are not described one by one herein.

Drawings

FIG. 1 shows a simulated structure of the complex between the S protein and the ultrahigh-affinity binding small protein targeting the novel coronavirus S protein. Wherein A is the protein structure of the human ACE2 and S protein complex, and B is a structural simulation of the binding complex of the small protein NC_139_error_ (2) and the S protein.

FIG. 2 shows the binding activity of the ultrahigh-affinity small proteins targeting the novel coronavirus Delta mutant S protein, as detected by flow cytometry. The ultrahigh-affinity small protein targeting the S protein is displayed on the yeast surface; yeast displaying the small protein is traced with an anti-Myc tag antibody, FITC (ab1394), and yeast cells bound to the Fc-tagged S protein are traced with an F(ab')₂ sheep anti-human IgG Fc secondary antibody, PE (H10104).

FIG. 3 shows the competitive binding activity of the ultrahigh-affinity small protein targeting the S protein against human ACE2 protein, as detected by flow cytometry.

Human ACE2 protein at different concentrations was incubated with Fc-tagged Delta mutant S protein at room temperature, and the mixture was then incubated with yeast displaying the ultrahigh-affinity small protein targeting the S protein. Competitive binding between the small protein and human ACE2 protein was assessed by double staining with the anti-Myc tag antibody, FITC (ab1394), and the F(ab')₂ sheep anti-human IgG Fc secondary antibody, PE (H10104).

FIG. 4 shows the affinity of the ultrahigh-affinity small proteins targeting the S protein, as measured by biolayer interferometry (BLI). NC_139_error_Delta_ (3) and NC_139_error_Delta_ (9) show ultra-strong binding activity to the S protein of the novel coronavirus Delta mutant strain, with affinities of 7.58 × 10^-10 M and 6.083 × 10^-9 M, respectively.

FIG. 5 shows the affinity of the ultrahigh-affinity small protein targeting the S protein, as measured by biolayer interferometry (BLI). Fc-tagged wild-type, Alpha, Beta, Gamma or Delta mutant S proteins were each coated on a detection probe, and the affinity between each S protein and different concentrations of the ultrahigh-affinity small protein targeting the S protein (NC_139_error_ (2)) was then measured.

FIG. 6 shows the thermal stability of the ultrahigh-affinity small protein targeting the S protein, as measured by CD spectroscopy. The circular dichroism spectrum of NC_139_error_Delta_ (3) was recorded at 25 ℃, after heating to 95 ℃, and after cooling back to 25 ℃, and the change in the secondary structure of the protein before and after heating was evaluated.

FIG. 7 shows the Tm value of the ultrahigh-affinity small protein targeting the S protein, as determined by CD spectroscopy. The circular dichroism signal of NC_139_error_Delta_ (3) was monitored while the temperature was raised gradually from 25 ℃ to 95 ℃, and the Tm value of the protein was calculated from the circular dichroism signal recorded at each time point.

FIG. 8 shows a schematic representation of several structural combinations of S-protein targeting high affinity small proteins and their fusion proteins.

Wherein A is a short peptide chain of the small protein targeting the S protein;

B is a polypeptide chain formed by connecting in series the small protein targeting the S protein, an antibody hinge region (hinge) or a linker, and CH2 and CH3, whereby the high-affinity small protein (or fragment) targeting the S protein provided by the invention forms a single- or multi-targeting fusion protein targeting the S protein;

C is a polypeptide chain formed by connecting in series the small protein targeting the S protein, an antibody hinge region (hinge) or a linker, and CH3, whereby the high-affinity small protein (or fragment) targeting the S protein provided by the invention forms a single- or multi-targeting fusion protein targeting the S protein;

D is a polypeptide chain formed by connecting in series the small protein targeting the S protein, an antibody hinge region (hinge) or a linker, and CH3, whereby the high-affinity small protein (or fragment) targeting the S protein provided by the invention forms a single- or multi-targeting fusion protein targeting the S protein;

E is a polypeptide chain formed by connecting one small protein targeting the S protein to another through a linker sequence, and then connecting them in series with an antibody hinge region (hinge) or a linker, and CH2 and CH3, whereby the high-affinity small protein (or fragment) targeting the S protein provided by the invention forms a single- or multi-targeting fusion protein targeting the S protein;

F is a polypeptide chain formed by connecting one small protein targeting the S protein to another through a linker sequence, and then connecting them in series with an antibody hinge region (hinge) or a linker, and CH3, whereby the high-affinity small protein (or fragment) targeting the S protein provided by the invention forms a single- or multi-targeting fusion protein targeting the S protein;

G is a polypeptide chain formed by connecting one small protein targeting the S protein to another through a linker sequence, and then connecting them in series with an antibody hinge region (hinge) or a linker, and CH3, whereby the high-affinity small protein (or fragment) targeting the S protein provided by the invention forms a single- or multi-targeting fusion protein targeting the S protein.

FIG. 9 shows the in vitro pseudovirus neutralizing protective activity of NC_139_error_Delta_ (3) against the wild type and the major mutants Alpha, Beta, Gamma and Delta of the novel coronavirus, respectively. NC_139_error_Delta_ (3), at the concentrations shown in FIG. 9, was incubated with each novel coronavirus pseudovirus carrying a luciferase reporter gene, and neutralizing protective activity was then determined using 293T cells that highly express human ACE2.

Detailed Description

Through extensive and intensive research, and based on the structure of the complex between the novel coronavirus S protein and the ACE2 protein, the inventors designed and obtained a class of ultrahigh-affinity small proteins targeting the novel coronavirus S protein, directed at the interaction surface between ACE2 and the S protein. The binding site of the small protein almost completely covers the binding site of the ACE2 protein on the S protein. Experiments show that the high-affinity small protein NC_139_error_ (2) provided by the invention binds broadly to the main mutant strains Alpha, Beta, Gamma and Delta of the novel coronavirus. On this basis, the ultrahigh-affinity small proteins NC_139_error_Delta_ (3) and NC_139_error_Delta_ (9), which target the S protein of the novel coronavirus Delta mutant, were obtained by further optimization. These proteins have good neutralizing protective activity against the novel coronavirus Delta mutant strain as well as against the novel coronavirus wild type, Alpha, Beta and Gamma, and thus have broad-spectrum neutralizing protective activity against the novel coronavirus. Compared with conventional antibodies, the small proteins have a lower molecular weight and potentially better tissue penetration and structural stability. The present invention has been completed on the basis of these findings.

In particular, representative ultra-high affinity small proteins targeting S proteins are less than about 60 amino acids in length, have a molecular weight much less than conventional antibodies, and have no antibody Fc portion, and thus have better tissue penetration. In addition, the S protein targeted ultrahigh affinity small protein has higher affinity and can be used as a potential novel coronavirus diagnosis and in-vivo tracer reagent.

Ultrahigh-affinity small proteins of the invention targeting the S protein, and fusion proteins thereof

In the present invention, there are provided a class of ultra-high affinity small proteins targeting the S protein and a fusion protein comprising said small proteins or conjugates thereof.

As used herein, the terms "small protein of the invention", "ultra-high affinity small protein of the invention targeting the novel coronavirus S protein", "ultra-high affinity small protein of the invention targeting the S protein" are used interchangeably and refer to small proteins having an ultra-high affinity for the novel coronavirus S protein as described in the first aspect of the invention.

As used herein, the term "S protein of the invention" includes the S protein of the novel coronavirus wild type and alpha, beta, gamma and Delta mutants thereof.

Preferably, the small protein of the invention has an amino acid sequence as shown in SEQ ID NO. 1, 3 or 5.

As used herein, the term "fusion protein of the invention" refers to a fusion protein formed by an ultrahigh-affinity small protein targeting the S protein together with other fusion elements, e.g., a fusion protein formed by a small protein of the invention together with elements such as a hinge region and an Fc region. The fusion protein of the invention has ultrahigh affinity for ACE2.

As used herein, the term "having an ultrahigh affinity for ACE2" means that the affinity of the small protein or fusion protein of the invention for the S protein is much higher than the affinity of the ACE2 protein for the Delta mutant S protein. For example, the affinity Q1 of the small protein or fusion protein of the invention for the Delta mutant S protein is at least 1.5-fold, or at least 2-fold or more, the affinity Q0 of the ACE2 protein for the Delta mutant S protein; alternatively, the ratio (Z1/Z0) of the Kd value Z1 of the small protein or fusion protein of the invention for the Delta mutant S protein to the Kd value Z0 of the ACE2 protein for the Delta mutant S protein is 1/1.5 or less, preferably 1/2 or less, or 1/3 or less. Preferably, the ultra-high affinity fusion protein of the invention may be any fusion protein comprising at least the entire S protein-targeting ultra-high affinity small protein or a partial amino acid fragment thereof (typically a fragment of at least 70% of its length).
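The Kd-ratio criterion in this definition can be checked with simple arithmetic. The sketch below assumes that both Kd values are measured against the same Delta mutant S protein and are expressed in the same unit; the function names and the example values in the usage section are hypothetical and are not measurements from this document.

```python
# Check the Z1/Z0 criterion: a lower Kd means tighter binding, so the binder
# qualifies when its Kd is at most a given fraction of the ACE2 Kd.

def kd_ratio(kd_binder: float, kd_ace2: float) -> float:
    """Return Z1/Z0, with both Kd values in the same unit (e.g. molar)."""
    if kd_binder <= 0 or kd_ace2 <= 0:
        raise ValueError("Kd values must be positive")
    return kd_binder / kd_ace2


def meets_threshold(kd_binder: float, kd_ace2: float, max_ratio: float = 1 / 1.5) -> bool:
    """True if Z1/Z0 is at or below max_ratio (1/1.5 by default; 1/2 or 1/3 are stricter)."""
    return kd_ratio(kd_binder, kd_ace2) <= max_ratio


if __name__ == "__main__":
    # Hypothetical Kd values chosen only to exercise the functions.
    z1, z0 = 1e-9, 2e-9
    print(kd_ratio(z1, z0), meets_threshold(z1, z0), meets_threshold(z1, z0, 1 / 3))
```

Because affinity is inversely related to Kd, a ratio Z1/Z0 of 1/1.5 corresponds to the 1.5-fold affinity criterion Q1/Q0 stated in the same definition.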

Typically, the fusion proteins of the invention may have the following structure:

a Y-shaped structure: S protein-targeting ultra-high affinity small protein or fragment thereof - hinge - CH2 - CH3;

a Y-shaped structure: S protein-targeting ultra-high affinity small protein or fragment thereof - hinge - CH3;

S protein-targeting ultra-high affinity small protein or fragment thereof - tracer tag;

S protein-targeting ultra-high affinity small protein or fragment thereof.

It should be understood that the above structural types are exemplary only and do not limit the present invention. Some representative structures are shown in fig. 8. Wherein the S protein-targeting ultrahigh affinity small protein or fragment thereof may be single or multiple (e.g., 2, 3, or 4 ultrahigh affinity small proteins or fragments thereof in tandem form, e.g., fig. 8E, 8F, and 8G).

As used herein, the term "ultra-high affinity small protein targeting an S protein" or "fusion protein" also includes variants having S protein binding activity and ACE2/S protein blocking activity. These variants include (but are not limited to): deletion, insertion and/or substitution of 1 to 3 (usually 1 to 2, more preferably 1) amino acids; addition or deletion of one or several (usually 3 or fewer, preferably 2 or fewer, more preferably 1) amino acids at the C-terminus and/or N-terminus; or addition, at the N-terminus or C-terminus of the small protein, of an amino acid fragment with small side chains (e.g., glycine, serine, etc.) as a linker. For example, in the art, substitution with amino acids of close or similar properties does not generally alter the function of the protein. As another example, the addition or deletion of one or more amino acids at the C-terminus and/or N-terminus generally does not alter the structure or function of the protein. Furthermore, the term also includes polypeptides of the invention in monomeric and multimeric form. The term also includes linear as well as non-linear polypeptides (e.g., cyclic peptides).

The invention also includes active fragments, derivatives and analogues of the small protein or fusion protein (particularly fusion protein with Fc fragment) targeting S protein described above. As used herein, the terms "fragment," "derivative" and "analog" refer to polypeptides that substantially retain the function or activity of the ultra-high affinity small protein or fusion protein of the subject invention that targets an S protein.

The polypeptide fragment, derivative or analogue of the present invention may be (i) a polypeptide in which one or several conserved or non-conserved amino acid residues, preferably conserved amino acid residues, are substituted, or (ii) a polypeptide having a substituent group in one or more amino acid residues, or (iii) a polypeptide formed by fusing the polypeptide with another compound, such as a compound which extends the half-life of the polypeptide, for example polyethylene glycol, or (iv) a polypeptide formed by fusing an additional amino acid sequence to the polypeptide sequence (a fusion protein formed with a leader sequence, a secretory sequence or a tag sequence such as 6His). Such fragments, derivatives and analogs are within the purview of one skilled in the art in light of the teachings herein.

A preferred class of active derivatives refers to polypeptides in which up to 5, preferably up to 3, more preferably up to 1 amino acid is replaced by an amino acid of close or similar nature, as compared to the amino acid sequence of the invention. These conservatively variant polypeptides are preferably generated by amino acid substitutions according to Table A.

Table A

Initial residue	Representative substitutions	Preferred substitution
Ala (A)	Val; Leu; Ile	Val
Arg (R)	Lys; Gln; Asn	Lys
Asn (N)	Gln; His; Lys; Arg	Gln
Asp (D)	Glu	Glu
Cys (C)	Ser	Ser
Gln (Q)	Asn	Asn
Glu (E)	Asp	Asp
Gly (G)	Pro; Ala	Ala
His (H)	Asn; Gln; Lys; Arg	Arg
Ile (I)	Leu; Val; Met; Ala; Phe	Leu
Leu (L)	Ile; Val; Met; Ala; Phe	Ile
Lys (K)	Arg; Gln; Asn	Arg
Met (M)	Leu; Phe; Ile	Leu
Phe (F)	Leu; Val; Ile; Ala; Tyr	Leu
Pro (P)	Ala	Ala
Ser (S)	Thr	Thr
Thr (T)	Ser	Ser
Trp (W)	Tyr; Phe	Tyr
Tyr (Y)	Trp; Phe; Thr; Ser	Phe
Val (V)	Ile; Leu; Met; Phe; Ala	Leu
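
As an illustration of how Table A can be applied, the following minimal sketch encodes the "preferred substitution" column as a lookup and enumerates every single-residue conservative variant of a sequence. The function name and the toy peptide in the usage section are hypothetical; the peptide is not one of the sequences of the invention.

```python
# Preferred conservative substitutions from Table A (one-letter codes).
PREFERRED_SUBSTITUTION = {
    "A": "V", "R": "K", "N": "Q", "D": "E", "C": "S",
    "Q": "N", "E": "D", "G": "A", "H": "R", "I": "L",
    "L": "I", "K": "R", "M": "L", "F": "L", "P": "A",
    "S": "T", "T": "S", "W": "Y", "Y": "F", "V": "L",
}


def single_substitution_variants(sequence: str) -> list[tuple[str, str]]:
    """Return (mutation label, variant sequence) for each position,
    replacing one residue at a time with its preferred substitution."""
    sequence = sequence.upper()
    variants = []
    for index, residue in enumerate(sequence):
        if residue not in PREFERRED_SUBSTITUTION:
            raise ValueError(f"unknown residue {residue!r} at position {index + 1}")
        substitute = PREFERRED_SUBSTITUTION[residue]
        label = f"{residue}{index + 1}{substitute}"  # e.g. A1V, 1-based position
        variants.append((label, sequence[:index] + substitute + sequence[index + 1:]))
    return variants


if __name__ == "__main__":
    # Toy peptide for illustration only.
    for label, variant in single_substitution_variants("ACDK"):
        print(label, variant)
```

Variants with two or three substitutions could be built by applying the same lookup repeatedly; whether any given variant retains binding activity would still have to be determined experimentally.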

The invention also provides analogs of the fusion proteins of the invention. These analogs may differ from the polypeptides of the invention in amino acid sequence, in modifications that do not affect the sequence, or in both. Analogs also include analogs having residues other than the natural L-amino acids (e.g., D-amino acids), as well as analogs having non-naturally occurring or synthetic amino acids (e.g., β- and γ-amino acids). It is to be understood that the polypeptides of the present invention are not limited to the representative polypeptides exemplified above.

In addition, the ultra-high affinity small protein or fusion protein of the targeted S protein can be modified. Modified (typically without altering the primary structure) forms include: chemically derivatized forms of polypeptides such as acetylation or carboxylation, in vivo or in vitro. Modifications also include glycosylation, such as those resulting from glycosylation modifications during synthesis and processing of the polypeptide or during further processing steps. Such modification may be accomplished by exposing the polypeptide to an enzyme that performs glycosylation (e.g., mammalian glycosylase or deglycosylase). Modified forms also include sequences having phosphorylated amino acid residues (e.g., phosphotyrosine, phosphoserine, phosphothreonine). Also included are polypeptides modified to improve their proteolytic resistance or to optimize solubility.

The term "polynucleotide of the invention" may be a polynucleotide comprising a sequence encoding the S protein-targeting ultra-high affinity small protein or fusion protein of the invention, or may be a polynucleotide further comprising additional coding and/or non-coding sequences.

The invention also relates to variants of the above polynucleotides which encode fragments, analogs and derivatives of polypeptides or fusion proteins having the same amino acid sequence as the invention. These nucleotide variants include substitution variants, deletion variants and insertion variants. As known in the art, an allelic variant is an alternative form of a polynucleotide, which may involve a substitution, deletion or insertion of one or more nucleotides, but does not substantially alter the function of the encoded S protein-targeting ultrahigh affinity small protein or fusion protein.

The invention also relates to polynucleotides which hybridize to the sequences described above and which have at least 50%, preferably at least 70%, more preferably at least 80% identity between the two sequences. The invention relates in particular to polynucleotides which hybridize under stringent conditions to the polynucleotides of the invention. In the present invention, "stringent conditions" means: (1) hybridization and elution at lower ionic strength and higher temperature, e.g., 0.2× SSC, 0.1% SDS, 60 ℃; or (2) hybridization in the presence of a denaturant, such as 50% (v/v) formamide, 0.1% calf serum/0.1% Ficoll, 42 ℃; or (3) hybridization occurring only when the identity between the two sequences is at least 90%, more preferably at least 95%.
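
The identity thresholds mentioned above can be made concrete with a small calculation. The sketch below assumes that the two sequences have already been aligned to equal length (gaps written as "-") and counts identity over the full alignment length; other conventions for the denominator exist, and this document does not specify one. The sequences in the usage section are arbitrary examples.

```python
# Percent identity between two pre-aligned sequences of equal length.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical, non-gap positions as a percentage of alignment length."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal (aligned) length")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if identity is at least the threshold (e.g. 50, 70, 80, 90 or 95)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Arbitrary example sequences differing at one of ten positions.
    print(percent_identity("ATGCATGCAT", "ATGCATGCAA"))
    print(meets_identity("ATGCATGCAT", "ATGCATGCAA", 95.0))
```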

The ultra-high affinity small protein or fusion protein targeting the S protein and polynucleotide of the invention are preferably provided in isolated form, and more preferably purified to homogeneity.

The full-length polynucleotide sequence of the present invention can be obtained by PCR amplification, recombinant methods or artificial synthesis. For the PCR amplification method, primers can be designed according to the nucleotide sequences disclosed in the present invention, particularly the open reading frame sequences, and amplified to obtain the relevant sequences using a commercially available cDNA library or a cDNA library prepared according to a conventional method known to those skilled in the art as a template. When the sequence is longer, it is often necessary to perform two or more PCR amplifications, and then splice the amplified fragments together in the correct order.

Once the relevant sequences are obtained, recombinant methods can be used to obtain the relevant sequences in large quantities. This is usually done by cloning it into a vector, transferring it into a cell, and isolating the relevant sequence from the propagated host cell by conventional methods.

Furthermore, the sequences concerned, in particular fragments of short length, can also be synthesized by artificial synthesis. In general, fragments of very long sequences are obtained by first synthesizing a plurality of small fragments and then ligating them.

At present, it is already possible to obtain the DNA sequences encoding the proteins of the invention (or fragments thereof, or derivatives thereof) entirely by chemical synthesis. The DNA sequence can then be introduced into a variety of existing DNA molecules (or vectors, for example) and cells known in the art.

Methods of amplifying DNA/RNA using PCR techniques are preferred for obtaining polynucleotides of the invention. In particular, when it is difficult to obtain full-length cDNA from a library, the RACE method (rapid amplification of cDNA ends) is preferably used; primers for PCR can be appropriately selected according to the sequence information of the present invention disclosed herein and synthesized by a conventional method. The amplified DNA/RNA fragments can be isolated and purified by conventional methods, such as gel electrophoresis.

Expression vector

The invention also relates to vectors comprising the polynucleotides of the invention, as well as host cells genetically engineered with the vectors of the invention or the coding sequences of the ultra-high affinity small proteins or fusion proteins of the invention targeting the S protein, and methods for producing the polypeptides of the invention by recombinant techniques.

The polynucleotide sequences of the present invention may be used to express or produce recombinant fusion proteins by conventional recombinant DNA techniques. Generally, there are the following steps:

(1) Transforming or transducing a suitable host cell with a polynucleotide (or variant) encoding a fusion protein of the invention, or with a recombinant expression vector comprising the polynucleotide;

(2) Culturing the host cell in a suitable medium;

(3) Isolating and purifying the protein from the culture medium or the cells.

In the present invention, the polynucleotide sequence encoding the fusion protein may be inserted into a recombinant expression vector. The term "recombinant expression vector" refers to bacterial plasmids, phage, yeast plasmids, plant cell viruses, mammalian cell viruses such as adenoviruses, retroviruses, or other vectors well known in the art. Any plasmid or vector may be used as long as it is capable of replicating and remaining stable in the host. An important feature of expression vectors is that they generally contain an origin of replication, a promoter, a marker gene and translational control elements.

In the preparation method of the S protein targeting ultrahigh affinity small protein or fusion protein thereof, any suitable vector can be used, and can be selected from one of pET, pDR1, pcDNA3.1 (+), pcDNA3.1/ZEO (+), and pDHFR, and the expression vector comprises a fusion DNA sequence connected with a proper transcription and translation regulatory sequence.

Eukaryotic or prokaryotic host cells can be used for expressing the S protein-targeting ultrahigh-affinity small protein or fusion protein thereof. The eukaryotic host cell is preferably a mammalian or insect host cell culture system, with cells such as COS, CHO, NS0, Sf9 and Sf21 being preferred; the prokaryotic host cell is preferably one of Lemo21, DH5α, BL21(DE3) and TG1.

Methods well known to those skilled in the art can be used to construct expression vectors containing the DNA sequences encoding the fusion proteins of the invention and appropriate transcriptional/translational control signals. These methods include in vitro recombinant DNA techniques, DNA synthesis techniques, in vivo recombinant techniques, and the like. The DNA sequence may be operably linked to an appropriate promoter in an expression vector to direct mRNA synthesis. Representative examples of these promoters are: the lac or trp promoter of E.coli; a lambda phage PL promoter; eukaryotic promoters include the CMV immediate early promoter, the HSV thymidine kinase promoter, the early and late SV40 promoters, LTRs from retroviruses, and other known promoters that control the expression of genes in prokaryotic or eukaryotic cells or viruses thereof. The expression vector also includes a ribosome binding site for translation initiation and a transcription terminator.

In addition, the expression vector preferably comprises one or more selectable marker genes to provide phenotypic traits for selection of transformed host cells, such as dihydrofolate reductase, neomycin resistance and Green Fluorescent Protein (GFP) for eukaryotic cell culture, or tetracycline or ampicillin resistance for E.coli.

Vectors comprising the appropriate DNA sequences as described above, as well as appropriate promoter or control sequences, may be used to transform appropriate host cells to enable expression of the protein.

The host cell may be a prokaryotic cell, such as a bacterial cell; or a lower eukaryotic cell, such as a yeast cell; or a higher eukaryotic cell, such as a mammalian cell. Representative examples are: E. coli, Streptomyces; bacterial cells of Salmonella typhimurium; fungal cells such as yeast; plant cells (e.g., ginseng cells).

When the polynucleotide of the present invention is expressed in higher eukaryotic cells, transcription will be enhanced if an enhancer sequence is inserted into the vector. Enhancers are cis-acting elements of DNA, usually about 10 to 300 base pairs, that act on a promoter to increase the transcription of a gene. Examples include the SV40 enhancer 100 to 270 base pairs on the late side of the origin of replication, the polyoma enhancer on the late side of the origin of replication, and adenovirus enhancers.

It will be clear to a person of ordinary skill in the art how to select appropriate vectors, promoters, enhancers and host cells.

Transformation of host cells with recombinant DNA can be performed using conventional techniques well known to those skilled in the art. When the host is a prokaryote such as E. coli, competent cells capable of taking up DNA can be harvested after the exponential growth phase and treated by the CaCl₂ method, using procedures well known in the art. Another approach is to use MgCl₂. If desired, transformation can also be performed by electroporation. When the host is a eukaryote, the following DNA transfection methods may be used: calcium phosphate co-precipitation, conventional mechanical methods such as microinjection, electroporation, liposome encapsulation, etc.

The transformant obtained can be cultured by a conventional method to express the polypeptide encoded by the gene of the present invention. The medium used in the culture may be selected from various conventional media depending on the host cell used. The culture is carried out under conditions suitable for the growth of the host cell. After the host cells have grown to the appropriate cell density, the selected promoters are induced by suitable means (e.g., temperature switching or chemical induction) and the cells are cultured for an additional period of time.

The recombinant polypeptide in the above method may be expressed in a cell, or on a cell membrane, or secreted outside the cell. If desired, the recombinant proteins can be isolated and purified by various separation methods using their physical, chemical and other properties. Such methods are well known to those skilled in the art. Examples of such methods include, but are not limited to: conventional renaturation treatment, treatment with protein precipitants (salting-out method), centrifugation, osmotic lysis, ultrasonic treatment, ultracentrifugation, molecular sieve chromatography (gel filtration), adsorption chromatography, ion exchange chromatography, high-performance liquid chromatography (HPLC), various other liquid chromatography techniques, and combinations of these methods.

The S protein targeting ultrahigh affinity small proteins or fusion proteins thereof disclosed by the invention can be separated and purified by utilizing an affinity chromatography method, and the S protein targeting ultrahigh affinity small proteins or fusion proteins thereof bound on the affinity column can be eluted by utilizing conventional methods such as high-salt buffer, pH change and the like according to the characteristics of the utilized affinity column.

By the above method, the S protein-targeted ultrahigh affinity small protein or fusion protein thereof can be purified as a substantially homogeneous substance, for example, as a single band on SDS-PAGE electrophoresis.

Pharmaceutical composition

In the present invention, there is also provided a pharmaceutical composition comprising the small protein or fusion protein targeting the S protein of the present invention or an immunoconjugate thereof.

The pharmaceutical compositions of the invention contain a safe and effective amount (e.g., 0.001-99 wt.%, preferably 0.01-90 wt.%, more preferably 0.1-80 wt.%) of the small protein or fusion protein of the invention (or conjugates thereof) and a pharmaceutically acceptable carrier or excipient. Such carriers include (but are not limited to): saline, buffers, dextrose, water, glycerol, ethanol, and combinations thereof. The pharmaceutical formulation should be compatible with the mode of administration. The pharmaceutical compositions of the invention may be formulated as injections, for example, by conventional methods using physiological saline or aqueous solutions containing glucose and other adjuvants. Pharmaceutical compositions such as injections and solutions are preferably manufactured under sterile conditions. The amount of active ingredient administered is a therapeutically effective amount, for example, from about 10 micrograms per kilogram of body weight to about 50 milligrams per kilogram of body weight per day. In addition, the polypeptides of the invention may also be used with other therapeutic agents. The small protein or fusion protein targeting the S protein, or the immunoconjugate thereof, can be combined with pharmaceutically acceptable excipients to form a pharmaceutical preparation so as to exert its curative effect more stably. The preparation can preserve the structural integrity of the amino acid core sequence of the small protein or fusion protein targeting the S protein while protecting the functional groups of the protein from degradation (including but not limited to aggregation, deamination or oxidation). The formulations may be in a variety of forms; in general, liquid formulations are stable for at least one year at 2 ℃ to 8 ℃, and lyophilized formulations are stable for at least six months at 30 ℃.
The preparation can be a suspension, an aqueous injection, a lyophilized preparation, etc., as commonly used in the pharmaceutical field, preferably an aqueous injection or a lyophilized preparation.

For the S protein-targeting pharmaceutical composition of the invention (such as an aqueous injection or a lyophilized preparation), the pharmaceutically acceptable excipients comprise one or a combination of a surfactant, a solution stabilizer, an isotonicity regulator and a buffer. The surfactant includes nonionic surfactants such as polyoxyethylene sorbitan fatty acid esters (Tween 20 or 80); poloxamers (such as poloxamer 188); Triton; sodium dodecyl sulfate (SDS); sodium lauryl sulfate; tetradecyl, linoleyl or octadecyl sarcosine; Pluronics; MONAQUAT™, etc., added in an amount that minimizes the tendency of the protein to form particles. The solution stabilizer may be a saccharide, an amino acid or an alcohol: the saccharides include reducing and non-reducing saccharides; the amino acids include monosodium glutamate or histidine; and the alcohols include one or a combination of trihydric alcohols, higher sugar alcohols, propylene glycol, polyethylene glycol, etc. The solution stabilizer is added in an amount such that the resulting formulation remains stable for a period of time that those skilled in the art would regard as stable. The isotonicity regulator may be one of sodium chloride and mannitol, and the buffer may be one of TRIS, histidine buffer and phosphate buffer.

In the case of pharmaceutical compositions, a safe and effective amount of the small protein or fusion protein of the invention or an immunoconjugate thereof, is administered to a mammal, wherein the safe and effective amount is typically at least about 50 micrograms per kilogram of body weight and in most cases no more than about 100 milligrams per kilogram of body weight, preferably the dose is from about 100 micrograms per kilogram of body weight to about 50 milligrams per kilogram of body weight. Of course, the particular dosage should also take into account factors such as the route of administration, the health of the patient, etc., which are within the skill of the skilled practitioner. Typically, the total dose cannot exceed a certain range, for example, the intravenous dose is 10 to 3000 mg/day/50 kg, preferably 100 to 1000 mg/day/50 kg.

The small protein targeting the S protein or the fusion protein thereof, and pharmaceutical preparations containing it, can be used as an anti-tumor drug for tumor treatment. The anti-tumor drug refers to a drug for inhibiting and/or treating tumors; its effects may include delaying the development of symptoms associated with tumor growth and/or reducing the severity of those symptoms, alleviating symptoms associated with existing tumor growth, preventing the occurrence of other symptoms, and reducing or preventing metastasis.

The small protein targeting the S protein or the fusion protein thereof, and the pharmaceutical preparation thereof, can be combined with other antitumor drugs for treating tumors. The antitumor drugs for combined administration include but are not limited to:

1. Cytotoxic drugs: (1) drugs acting on the chemical structure of DNA: alkylating agents such as nitrogen mustards, nitrosamines and methylsulfonates; platinum compounds such as cisplatin, carboplatin and oxaliplatin; mitomycin (MMC); (2) drugs affecting nucleic acid synthesis: dihydrofolate reductase inhibitors such as methotrexate (MTX) and Alimta; thymidylate synthase inhibitors such as the fluorouracils (5-FU, FT-207, capecitabine); purine nucleoside synthetase inhibitors such as 6-mercaptopurine (6-MP) and 6-TG; nucleotide reductase inhibitors such as hydroxyurea (HU); DNA polymerase inhibitors such as cytarabine (Ara-C) and Gemz; (3) drugs acting on nucleic acid transcription: drugs that selectively act on DNA templates to inhibit DNA-dependent RNA polymerase and thereby inhibit RNA synthesis, such as actinomycin D, daunorubicin, doxorubicin, epirubicin, aclarubicin and mithramycin; (4) drugs acting mainly on tubulin synthesis: paclitaxel, Taxotere, vinblastine, vinorelbine, podophyllotoxins and homoharringtonine; (5) other cytotoxic agents: asparaginase, which mainly inhibits protein synthesis.

2. Hormones: antiestrogens: tamoxifen, droloxifene, exemestane, and the like; aromatase inhibitors: aminoglutethimide, lantelong, letrozole, laningde, and the like; antiandrogens: flutamide; LH-RH agonists/antagonists: norided, etalum, etc.

3. Biological response modifiers, which inhibit tumors mainly through the body's immune function: interferon; interleukin-2; thymosins.

4. Monoclonal antibodies: rituximab (MabThera); cetuximab (C225); trastuzumab (Herceptin); bevacizumab (Avastin); ipilimumab (Yervoy); nivolumab (Opdivo); pembrolizumab (Keytruda); atezolizumab (Tecentriq).

5. Others, including some drugs whose mechanisms are currently unknown and remain to be studied further; cell differentiation inducers such as retinoids; apoptosis inducers.

The main advantages of the invention include:

1) The binding site of the small protein of the targeted novel coronavirus S protein provided by the invention can almost cover the binding site of ACE2 and S protein.

2) The small protein has a small molecular weight and a length of less than about 60 amino acids, and has better tissue penetration.

3) The small protein provided by the invention has ultrahigh affinity to the novel coronavirus S protein, and can effectively block the invasion of the novel coronavirus into host cells.

4) The small protein of the invention has ultrahigh structural stability, and the Tm value is more than 95 ℃.

The invention will be further illustrated with reference to specific examples. It is to be understood that these examples are illustrative of the present invention and are not intended to limit its scope. Experimental procedures for which specific conditions are not given in the examples below generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the conditions recommended by the manufacturer. Percentages and parts are by weight unless otherwise indicated.

The sequences of the present invention are shown in Table 1 below

TABLE 1 sequences of the invention

Example 1: Synthesis of high-affinity small proteins targeting the novel coronavirus S protein

1.1 Screening of high-affinity small proteins targeting the novel coronavirus S protein

Candidate proteins were screened using yeast display library technology. First, the synthesized candidate protein genes were mixed with the pETCON vector fragment at a ratio of 2:1 and electroporated into EBY-100 yeast cells. After 2 days of incubation at 30 ℃ on double-dropout (-Ura/-Trp) plates, the electrotransformation efficiency was confirmed (greater than 1 × 10⁵). The electrotransformed yeast cells were cultured in double-dropout medium (30 ℃, 250 rpm) for two days. At a dilution ratio of 1:100, expression of the displayed proteins was induced in lactose-rich induction medium. When OD600 = 0.5, Fc-tagged S protein was used as the target protein, and the cells were double-stained for flow cytometry with F(ab')₂ goat anti-human IgG Fc secondary antibody, PE (H10104) and anti-Myc tag antibody, FITC (ab1394). FITC-positive cells are yeast cells displaying the protein, and PE/FITC double positivity indicates that the displayed protein binds the target S protein. PE/FITC double-positive yeast cells corresponding to ultrahigh affinity were sorted according to affinity, and the gene sequences of the candidate proteins capable of binding the target protein (namely, the ultrahigh affinity small proteins targeting the S protein) were then obtained by gene sequencing.

1.2 Synthesis of high-affinity small proteins targeting the S protein

Using total gene synthesis, genes for ultra-high affinity small proteins targeting the S protein were synthesized and named NC_139_error_ (2), NC_139_error_Delta_ (3) and NC_139_error_Delta_ (9). NC_139_error_ (2) is a broad-spectrum high-affinity small protein targeting the novel coronavirus S protein; its amino acid sequence is shown in SEQ ID NO: 1 and its nucleotide sequence in SEQ ID NO: 2.

NC_139_error_Delta_ (3) and NC_139_error_Delta_ (9) are ultrahigh affinity small proteins targeting the novel coronavirus Delta mutant S protein. The amino acid sequence of NC_139_error_Delta_ (3) is shown in SEQ ID NO: 3 and its nucleotide sequence in SEQ ID NO: 4. The amino acid sequence of NC_139_error_Delta_ (9) is shown in SEQ ID NO: 5 and its nucleotide sequence in SEQ ID NO: 6. After a start codon was added at the N-terminus of each synthesized nucleotide sequence, the sequence was cloned into the pET29b (+) expression vector at the XhoI and NdeI cleavage sites.
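
The cloning preparation described above can be sketched as a simple sequence check. The code below prepends an ATG start codon when one is missing and reports any internal NdeI (CATATG) or XhoI (CTCGAG) recognition sites, which would interfere with cloning at those sites. The function names and the toy coding sequence are hypothetical; the actual sequences are those of SEQ ID NO: 2, 4 and 6, which are not reproduced here.

```python
# Standard recognition sequences of the two restriction enzymes.
NDEI_SITE = "CATATG"
XHOI_SITE = "CTCGAG"
START_CODON = "ATG"


def add_start_codon(cds: str) -> str:
    """Return the coding sequence in upper case, prefixed with ATG if absent."""
    cds = cds.upper()
    return cds if cds.startswith(START_CODON) else START_CODON + cds


def find_sites(sequence: str, site: str) -> list[int]:
    """Return 0-based start positions of every occurrence of a recognition site."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)  # allow overlapping matches
    return positions


if __name__ == "__main__":
    # Toy insert for illustration only; it deliberately contains an XhoI site.
    insert = add_start_codon("gctgaagaactcgagaaa")
    print(insert)
    print("NdeI sites:", find_sites(insert, NDEI_SITE))
    print("XhoI sites:", find_sites(insert, XHOI_SITE))
```

An insert with an internal site would normally be recoded with synonymous codons before synthesis so that digestion cuts only at the flanking sites.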

1.3 Structural simulation of high-affinity small proteins targeting the novel coronavirus S protein

The structure of the obtained small protein was predicted with RoseTTAFold; as shown in FIG. 1B, the small protein adopts an α-helical structure. FIG. 1A, rendered with ChimeraX, shows the structure of the complex of the novel coronavirus S protein and the ACE2 protein (PDB: 6M0J). The small protein was docked to the novel coronavirus S protein using molecular docking software. As shown in FIG. 1, the binding site of the small protein on the novel coronavirus S protein almost completely overlaps the binding site of the ACE2 protein on the S protein.

Example 2: Expression and purification of ultra-high affinity small proteins

After the vector was transformed into E. coli, the cells were cultured in LB medium at 37 ℃ and 270 rpm to OD₆₀₀ = 0.6. Protein expression was then induced overnight with 1 mM IPTG. After the cells were harvested, Protease Inhibitor Cocktail and nuclease were added, the cells were disrupted by sonication (4 min, 10 s on, 10 s off, 80% Amp), and the supernatant was collected. After purification on a Ni column, the concentrated sample was further purified by molecular sieve chromatography. Protein expression and purification were assessed by SDS-PAGE and Coomassie brilliant blue staining. The concentration of the protein was further determined by means of the method.

The high purity candidate protein is obtained by the method for subsequent experiments.

Example 3: detection of high-affinity small protein binding activity of targeted novel coronavirus S protein

In this example, the synthetic small protein nucleotide sequence was loaded with the pETCON vector at the XhoI and NedI cleavage sites after addition of the start codon at the N-terminus. The small protein gene-loaded vector was transferred to EBY-100 yeast cells with the aid of a yeast transformation kit. After 2 days of incubation at 30℃with the aid of double-defect (-Ura/-Trp) plates, the electrotransformation efficiency was confirmed (greater than 1X 10) ⁵ ). The electrotransformed yeast cells were cultured in double-defect medium (30 ℃ C., 225 rpm) for two days. According to 1:100 dilution ratio, induction expression of display proteins was performed in lactose-rich induction medium. When od600=0.5, the new coronavirus Delta mutant S protein with Fc tag was used as target protein and was diluted as shown in the concentration and incubated with yeast cells for 45 min at room temperature. By means of F (ab') ₂ Sheep anti-human IgG Fc secondary antibody, PE (H10104) and anti-Myc tag antibody FITC (ab 1394) were double-stained by flow-through. Wherein FITC positive cells are yeast cells displaying proteins, and PE/FITC double positivity indicates that the displaying proteins can bind to target proteins.

As shown in FIG. 2, the candidate protein NC_139_error (2) displayed on the surface of yeast cells showed strong binding activity at target protein concentrations of 500 pM and 100 pM, giving PE/FITC double-positive signals. To further characterize the ultra-strong binding of NC_139_error_Delta_ (3) and NC_139_error_Delta_ (9) to the Delta mutant S protein, the target protein was diluted to 24.5 pM and 12.25 pM. Both proteins still showed strong binding activity at 24.5 pM and detectable binding activity at 12.25 pM, with PE/FITC double-positive signals.
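The two-color readout described in this example can be summarized as the fraction of displaying (FITC-positive) cells that also bind the target (PE-positive). The following is a minimal sketch of that calculation, not the analysis actually used here: the function name, the gate thresholds and the four event intensities are hypothetical values chosen for illustration only.

```python
def double_positive_fraction(events, fitc_gate, pe_gate):
    """Fraction of displaying (FITC+) cells that also bind target (PE+).

    events: iterable of (fitc_intensity, pe_intensity) pairs, one per cell.
    fitc_gate, pe_gate: hypothetical positivity thresholds.
    """
    displaying = [e for e in events if e[0] > fitc_gate]
    if not displaying:
        return 0.0
    binding = [e for e in displaying if e[1] > pe_gate]
    return len(binding) / len(displaying)


# Hypothetical (FITC, PE) intensities; these are not data from this example.
events = [(1200.0, 900.0), (1500.0, 40.0), (30.0, 20.0), (2000.0, 1100.0)]
fraction = double_positive_fraction(events, fitc_gate=500.0, pe_gate=300.0)
print(f"double-positive fraction of displaying cells: {fraction:.2f}")
```

Normalizing to FITC-positive cells rather than to all events keeps the result independent of how many yeast cells failed to display the protein.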

Example 4: competitive binding activity assay of high-affinity small proteins targeting the S protein

This example further confirms that the high-affinity small protein targeting the S protein competes with ACE2 for binding to the novel coronavirus S protein. Biotin-labeled human ACE2 protein at different concentrations was incubated with a novel coronavirus S protein fused to a human antibody Fc region for 20 minutes at room temperature. The mixture was then incubated with yeast cells displaying the high-affinity small protein targeting the S protein, and competitive binding activity was evaluated by two-color flow cytometry using F(ab')₂ goat anti-human IgG Fc secondary antibody, PE (H10104) and anti-Myc tag antibody, FITC (ab1394). FITC-positive cells are yeast cells displaying the protein, and PE/FITC double positivity indicates that the displayed small protein is capable of binding to the S protein.

As shown in FIG. 3, ACE2 protein at a concentration of 200 nM or 0 nM was incubated with 100 pM of the target protein S_Delta Fc for 30 min at room temperature. The protein mixture was then incubated with yeast cells expressing the candidate protein for 45 minutes at room temperature, and the competitive binding activity of the candidate proteins was assessed by two-color flow cytometry. The candidate binding protein still showed good competitive protective activity at a 2 nM concentration (supersaturating concentration) of the competitor protein ACE2.

Example 5: affinity assay of high-affinity small proteins targeting the S protein

In this example, the affinity of the high-affinity blocking proteins was measured with a ForteBio Octet instrument. First, 3 μg/ml of the Fc-tagged novel coronavirus S protein was loaded onto a detection probe coupled to Protein A (300 s), and unbound Fc-tagged S protein was washed off in PBST solution. The S protein-loaded probe was then immersed in two-fold serial dilutions of the high-affinity protein targeting the S protein, and the binding signal was recorded (300 s). The probe was then immersed in PBST to record the dissociation signal of the binding protein. Finally, the affinity of the high-affinity blocking binding protein was calculated.

As shown in FIG. 4, NC_139_error_Delta_ (3) and NC_139_error_Delta_ (9) showed ultra-strong binding activity to the S protein of the novel coronavirus Delta mutant strain, with affinities of 7.58×10⁻¹⁰ M and 6.083×10⁻⁹ M, respectively. As shown in FIG. 5, NC_139_error_ (2) showed good binding activity against the wild-type novel coronavirus and the major mutants Alpha, Beta, Gamma and Delta, with affinities of 9.399×10⁻⁹ M, 9.022×10⁻⁹ M, 8.444×10⁻⁹ M, 1.799×10⁻⁸ M and 1.448×10⁻⁸ M, respectively, indicating broad-spectrum binding activity against the novel coronavirus S protein.
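How an affinity (KD) can be derived from the association and dissociation signals may be illustrated with a minimal sketch. It assumes a simple 1:1 binding model, in which KD = koff / kon; the text does not state which fitting model was used. The rate constants, analyte concentration and function names below are hypothetical and serve only to generate noise-free synthetic curves; they are not measurements from this example.

```python
import math


def association(t, conc, kon, koff, rmax):
    """Signal at time t (s) during association, 1:1 binding model."""
    kobs = kon * conc + koff
    r_eq = rmax * conc / (conc + koff / kon)
    return r_eq * (1.0 - math.exp(-kobs * t))


def dissociation(t, r0, koff):
    """Signal at time t (s) after transfer to buffer, starting from r0."""
    return r0 * math.exp(-koff * t)


def decay_rate(times, signal):
    """koff from the least-squares slope of ln(signal) versus time."""
    logs = [math.log(s) for s in signal]
    mean_t = sum(times) / len(times)
    mean_y = sum(logs) / len(logs)
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    den = sum((t - mean_t) ** 2 for t in times)
    return -num / den


def observed_rate(times, signal):
    """kobs from three equally spaced points on the association curve."""
    mid = (len(times) - 1) // 2
    r0, r1, r2 = signal[0], signal[mid], signal[2 * mid]
    step = times[mid] - times[0]
    return -math.log((r2 - r1) / (r1 - r0)) / step


# Hypothetical parameters, used only to generate synthetic curves.
KON, KOFF, RMAX, CONC = 1.0e5, 1.0e-4, 1.0, 10.0e-9  # 1/(M*s), 1/s, nm, M
times = [float(t) for t in range(0, 301)]  # 300 s phases, as in the text
assoc = [association(t, CONC, KON, KOFF, RMAX) for t in times]
dissoc = [dissociation(t, assoc[-1], KOFF) for t in times]

koff_est = decay_rate(times, dissoc)
kon_est = (observed_rate(times, assoc) - koff_est) / CONC
kd_est = koff_est / kon_est
print(f"KD = {kd_est:.3e} M")
```

The three-point estimate of kobs is exact only for noise-free data; with real sensorgrams, a global nonlinear fit across the whole dilution series would normally be preferred.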

Example 6: detection of the structural stability of high-affinity small proteins targeting the S protein

The structural stability of the protein was examined by means of a JASCO-1500 circular dichroism spectrometer. Spectra were recorded over the wavelength range of 190 nm to 260 nm. First, the circular dichroism signal of NC_139_error_Delta_ (3) protein (0.1 mg/ml) was measured at 25 ℃. The signal was then measured after the temperature had been raised to 95 ℃. Finally, the temperature was returned to 25 ℃ and held for 5 minutes, and the circular dichroism signal was measured again. The changes in secondary-structure conformation at the different temperatures were used to evaluate the structural stability of the binding protein.

As shown in FIG. 6, NC_139_error_Delta_ (3) exhibits a high content of α-helical secondary structure at 25 ℃. When the temperature is raised to 95 ℃, the secondary structure of the protein changes to some extent under the influence of the high temperature. However, once the temperature is lowered to 25 ℃ again, the circular dichroism signals overlap almost completely, indicating that the secondary structure of the protein returns to its state before heating. The protein shows very high thermal stability.

Example 7: Tm value determination of the high-affinity blocking binding protein targeting the S protein

The circular dichroism signal of NC_139_error_Delta_ (3) protein (0.1 mg/ml) was measured by means of a JASCO-1500 instrument, starting at 25 ℃. The signal at a wavelength of 222 nm was monitored while the temperature was gradually raised from 25 ℃ to 95 ℃ at 2 ℃/min, with 30 seconds of equilibration per minute. The Tm value of the protein was then obtained.

As shown in FIG. 7, the circular dichroism signal increased with increasing temperature, but only slightly, even at the upper detection limit of 95 ℃. From the signal curve, the Tm of the protein was determined to exceed the upper temperature limit of the instrument, that is, Tm is greater than 95 ℃. The protein shows very high thermal stability.
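The reasoning that Tm lies above 95 ℃ can be illustrated with a minimal sketch that takes Tm as the temperature at which the 222 nm signal is halfway between the folded and unfolded baselines, and reports no Tm when that midpoint is never reached within the scan. The two-state assumption, the baseline values, the logistic test curves and all function names are hypothetical; the text does not describe how the curve was actually analyzed.

```python
import math


def estimate_tm(temps, signals, folded, unfolded):
    """Temperature at which half the protein is unfolded, or None.

    folded / unfolded: hypothetical baseline signals at 222 nm.
    Returns None when the midpoint is not crossed within the scan,
    i.e. the Tm lies above the highest temperature measured.
    """
    fractions = [(s - folded) / (unfolded - folded) for s in signals]
    for i in range(len(temps) - 1):
        f0, f1 = fractions[i], fractions[i + 1]
        if f0 < 0.5 <= f1:
            # Linear interpolation between the two bracketing points.
            return temps[i] + (0.5 - f0) * (temps[i + 1] - temps[i]) / (f1 - f0)
    return None


def synthetic_melt(temps, tm, width, folded, unfolded):
    """Noise-free logistic melting curve, for illustration only."""
    return [
        folded + (unfolded - folded) / (1.0 + math.exp(-(t - tm) / width))
        for t in temps
    ]


temps = list(range(25, 96, 2))    # 25 to 95 degrees C in 2-degree steps
FOLDED, UNFOLDED = -30.0, -5.0    # hypothetical baselines (mdeg)

low = synthetic_melt(temps, 60.0, 3.0, FOLDED, UNFOLDED)
high = synthetic_melt(temps, 120.0, 3.0, FOLDED, UNFOLDED)
tm_low = estimate_tm(temps, low, FOLDED, UNFOLDED)
tm_high = estimate_tm(temps, high, FOLDED, UNFOLDED)
print(tm_low, tm_high)
```

A curve whose midpoint lies beyond the scan range returns None, which corresponds to the situation described above, where only a lower bound on Tm can be stated.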

Example 8: expression and purification of fusion proteins

In this example, fusion proteins of the ultra-high affinity small proteins were prepared. The structure of the prepared fusion protein is shown as B in FIG. 8, and the amino acid sequence is SEQ ID NO: 11, 13 or 15. The method comprises the following steps:

The signal peptide sequence (SEQ ID NO: 17) was linked to each of the fusion protein coding sequences (SEQ ID NO: 12, 14 or 16), and each construct was inserted into the multiple cloning site of the pcDNA3.1 vector. The vectors were transfected into 293F cells, which were cultured on a cell culture shaker for 6 days. After the cell culture supernatant was harvested and filtered, the sample was purified on a Protein A column and further concentrated by ultrafiltration. Protein expression and purification were assessed by SDS-PAGE and Coomassie brilliant blue staining.
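As a consistency check on the construct layout described above, the following sketch joins three coding sequences copied from the sequence listing and translates them with the standard genetic code: SEQ ID NO: 18 (signal peptide), SEQ ID NO: 4 (small protein) and SEQ ID NO: 8 (hinge). The Fc coding sequence (SEQ ID NO: 10) is omitted for brevity. The helper names are hypothetical, and the sketch illustrates the sequence arithmetic only, not the cloning procedure.

```python
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional t/c/a/g ordering.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq):
    """Remove the spaces used in the listing and lower-case the sequence."""
    return "".join(seq.split()).lower()


def translate(dna):
    """Translate a coding sequence into one-letter amino acid codes."""
    dna = clean(dna)
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Sequences copied from the sequence listing (spacing as printed there).
SIGNAL_DNA = "atgggatggt catgtatcat cctttttcta gtagcaactg caaccggtgt acattcc"
SMALL_DNA = (
    "gacctgcgtg aacgtatcga acgtttcatc gaagatgcga aacgtaacct ggaagaaggt "
    "aacccggaaa ttgcgcgtcg tctgctggaa gcggcgaaag acatcgcgga acagctgggt "
    "gatgacgaac tccgccgtga agctgaacgc ctgctgaaag aactgaaa"
)
HINGE_DNA = "gagcccaaat ctggtgacaa aactcacaca tgcccaccgt gccca"

# Signal peptide, then small protein, then hinge (Fc would follow).
construct = clean(SIGNAL_DNA) + clean(SMALL_DNA) + clean(HINGE_DNA)
protein = translate(construct)
print(len(construct), len(protein))
print(protein)
```

The translated small protein and hinge segments agree with SEQ ID NO: 3 and SEQ ID NO: 7, and the segment lengths are consistent with the listing: 56 + 15 + 217 residues give the 288 residues of SEQ ID NO: 11.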

In addition, the binding of the fusion protein to the S protein was determined using the BLI method of Example 5, and the results indicate that the prepared fusion protein binds to the S protein with ultra-high affinity.

Example 9: in vitro neutralization protective activity test of the ultra-high affinity small protein targeting the S protein

In this example, the in vitro neutralizing protective activity of NC_139_error_Delta_ (3) against pseudoviruses of the wild-type novel coronavirus and the major mutants Alpha, Beta, Gamma and Delta was evaluated. NC_139_error_Delta_ (3), at the concentrations shown in FIG. 9, was incubated with novel coronavirus pseudovirus carrying a luciferase reporter gene for 1 hour at 37 ℃, and the mixture was then incubated with 293T cells highly expressing human ACE2 for 24 hours at 37 ℃. After removal of the culture supernatant, 100 μl of D-luciferin was added and incubated for 2 minutes. The mixture was diluted 10-fold, luciferase activity was measured in 150 μl, and the in vitro pseudovirus neutralization protective activity of the protein against the novel coronavirus wild type and the major mutants Alpha, Beta, Gamma and Delta was calculated.
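One common way to turn luciferase readings into a neutralization value is sketched below. It assumes that virus-only and cells-only control wells are available, which this example does not state, and the function name and the readings in the usage line are hypothetical.

```python
def percent_neutralization(rlu_sample, rlu_virus_only, rlu_cells_only):
    """Percent neutralization from relative light units (RLU).

    rlu_virus_only: signal with pseudovirus but no protein (assumed control).
    rlu_cells_only: background signal without pseudovirus (assumed control).
    """
    signal_range = rlu_virus_only - rlu_cells_only
    if signal_range <= 0:
        raise ValueError("virus-only control must exceed the background")
    infection = (rlu_sample - rlu_cells_only) / signal_range
    return 100.0 * (1.0 - infection)


# Hypothetical readings; these are not data from this example.
example = percent_neutralization(550.0, 1050.0, 50.0)
print(f"{example:.1f}% neutralization")
```

Computing this value at each protein concentration yields a dose-response curve from which a half-maximal inhibitory concentration could be read.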

All documents mentioned in this application are incorporated by reference as if each was individually incorporated by reference. Further, it will be appreciated that various changes and modifications may be made by those skilled in the art after reading the above teachings, and such equivalents are intended to fall within the scope of the claims appended hereto.

Sequence listing

<110> Chinese PLA General Hospital

<120> Ultrahigh affinity small protein targeting COVID-19 virus S protein and application thereof

<130> P2021-2600

<160> 18

<170> PatentIn version 3.5

<210> 1

<211> 56

<212> PRT

<213> Artificial sequence (Artificial Sequence)

<400> 1

Asp Leu Arg Glu Arg Ile Glu Arg Phe Ile Glu Asp Ala Lys Arg Asn

1 5 10 15

Leu Glu Glu Gly Asn Pro Glu Ile Ala Arg Arg Leu Leu Glu Ala Ala

20 25 30

Lys Asn Ile Ala Glu Gln Leu Gly Asp Asp Glu Leu Arg Arg Glu Val

35 40 45

Glu Arg Leu Leu Lys Glu Leu Lys

50 55

<210> 2

<211> 168

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 2

gacctgcgtg aacgtatcga acgtttcatc gaagacgcga aacgtaacct ggaagaaggt 60

aacccggaaa ttgcgcgtcg tctgctggaa gcggcgaaaa acatcgcgga acagctgggt 120

gatgacgaac tccgccgtga agttgaacgc ctgctgaaag aactgaaa 168

<210> 3

<211> 56

<212> PRT

<213> Artificial sequence (Artificial Sequence)

<400> 3

Asp Leu Arg Glu Arg Ile Glu Arg Phe Ile Glu Asp Ala Lys Arg Asn

1 5 10 15

Leu Glu Glu Gly Asn Pro Glu Ile Ala Arg Arg Leu Leu Glu Ala Ala

20 25 30

Lys Asp Ile Ala Glu Gln Leu Gly Asp Asp Glu Leu Arg Arg Glu Ala

35 40 45

Glu Arg Leu Leu Lys Glu Leu Lys

50 55

<210> 4

<211> 168

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 4

gacctgcgtg aacgtatcga acgtttcatc gaagatgcga aacgtaacct ggaagaaggt 60

aacccggaaa ttgcgcgtcg tctgctggaa gcggcgaaag acatcgcgga acagctgggt 120

gatgacgaac tccgccgtga agctgaacgc ctgctgaaag aactgaaa 168

<210> 5

<211> 56

<212> PRT

<213> Artificial sequence (Artificial Sequence)

<400> 5

Asp Leu Arg Glu Arg Ile Glu Arg Phe Ile Glu Asp Ala Lys Arg Asn

1 5 10 15

Leu Glu Glu Gly Asn Pro Glu Ile Ala Arg Arg Leu Leu Glu Ala Ala

20 25 30

Lys Asn Ile Ala Glu Gln Leu Gly Asp Asp Glu Leu Arg Arg Glu Val

35 40 45

Glu Arg Leu Leu Lys Glu Leu Lys

50 55

<210> 6

<211> 168

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 6

gacctgcgtg aacgtatcga acgtttcatc gaagacgcga aacgtaacct ggaagaaggt 60

aacccggaaa ttgcgcgtcg tttgctggaa gcggcgaaaa acatcgcgga acagctgggt 120

gatgacgaac tccgccgtga agttgaacgc ctgctgaaag aactgaaa 168

<210> 7

<211> 15

<212> PRT

<213> Artificial sequence (Artificial Sequence)

<400> 7

Glu Pro Lys Ser Gly Asp Lys Thr His Thr Cys Pro Pro Cys Pro

1 5 10 15

<210> 8

<211> 45

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 8

gagcccaaat ctggtgacaa aactcacaca tgcccaccgt gccca 45

<210> 9

<211> 217

<212> PRT

<213> Artificial sequence (Artificial Sequence)

<400> 9

Ala Pro Glu Leu Leu Gly Gly Pro Ser Val Phe Leu Phe Pro Pro Lys

1 5 10 15

Pro Lys Asp Thr Leu Met Ile Ser Arg Thr Pro Glu Val Thr Cys Val

20 25 30

Val Val Asp Val Ser His Glu Asp Pro Glu Val Lys Phe Asn Trp Tyr

35 40 45

Val Asp Gly Val Glu Val His Asn Ala Lys Thr Lys Pro Arg Glu Glu

50 55 60

Gln Tyr Asn Ser Thr Tyr Arg Val Val Ser Val Leu Thr Val Leu His

65 70 75 80

Gln Asp Trp Leu Asn Gly Lys Glu Tyr Lys Cys Lys Val Ser Asn Lys

85 90 95

Ala Leu Pro Ala Pro Ile Glu Lys Thr Ile Ser Lys Ala Lys Gly Gln

100 105 110

Pro Arg Glu Pro Gln Val Tyr Thr Leu Pro Pro Ser Arg Asp Glu Leu

115 120 125

Thr Lys Asn Gln Val Ser Leu Thr Cys Leu Val Lys Gly Phe Tyr Pro

130 135 140

Ser Asp Ile Ala Val Glu Trp Glu Ser Asn Gly Gln Pro Glu Asn Asn

145 150 155 160

Tyr Lys Thr Thr Pro Pro Val Leu Asp Ser Asp Gly Ser Phe Phe Leu

165 170 175

Tyr Ser Lys Leu Thr Val Asp Lys Ser Arg Trp Gln Gln Gly Asn Val

180 185 190

Phe Ser Cys Ser Val Met His Glu Ala Leu His Asn His Tyr Thr Gln

195 200 205

Lys Ser Leu Ser Leu Ser Pro Gly Lys

210 215

<210> 10

<211> 651

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 10

gcacctgaac tcctgggggg accgtcagtc ttcctcttcc ccccaaaacc caaggacacc 60

ctcatgatct cccggacccc tgaggtcaca tgcgtggtgg tggacgtgag ccacgaagac 120

cctgaggtca agttcaactg gtacgtggac ggcgtggagg tgcataatgc caagacaaag 180

ccgcgggagg agcagtacaa cagcacgtac cgtgtggtca gcgtcctcac cgtcctgcac 240

caggactggc tgaatggcaa ggagtacaag tgcaaggtct ccaacaaagc cctcccagcc 300

cccatcgaga aaaccatctc caaagccaaa gggcagcccc gagaaccaca ggtgtacacc 360

ctgcccccat cccgggatga gctgaccaag aaccaggtca gcctgacctg cctggtcaaa 420

ggcttctatc ccagcgacat cgccgtggag tgggagagca atgggcagcc ggagaacaac 480

tacaagacca cgcctcccgt gctggactcc gacggctcct tcttcctcta cagcaagctc 540

accgtggaca agagcaggtg gcagcagggg aacgtcttct catgctccgt gatgcatgag 600

gctctgcaca accactacac gcagaagagc ctctccctgt ctccgggtaa a 651

<210> 11

<211> 288

<212> PRT

<213> Artificial sequence (Artificial Sequence)

<400> 11

Asp Leu Arg Glu Arg Ile Glu Arg Phe Ile Glu Asp Ala Lys Arg Asn

1 5 10 15

Leu Glu Glu Gly Asn Pro Glu Ile Ala Arg Arg Leu Leu Glu Ala Ala

20 25 30

Lys Asp Ile Ala Glu Gln Leu Gly Asp Asp Glu Leu Arg Arg Glu Ala

35 40 45

Glu Arg Leu Leu Lys Glu Leu Lys Glu Pro Lys Ser Gly Asp Lys Thr

50 55 60

His Thr Cys Pro Pro Cys Pro Ala Pro Glu Leu Leu Gly Gly Pro Ser

65 70 75 80

Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu Met Ile Ser Arg

85 90 95

Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser His Glu Asp Pro

100 105 110

Glu Val Lys Phe Asn Trp Tyr Val Asp Gly Val Glu Val His Asn Ala

115 120 125

Lys Thr Lys Pro Arg Glu Glu Gln Tyr Asn Ser Thr Tyr Arg Val Val

130 135 140

Ser Val Leu Thr Val Leu His Gln Asp Trp Leu Asn Gly Lys Glu Tyr

145 150 155 160

Lys Cys Lys Val Ser Asn Lys Ala Leu Pro Ala Pro Ile Glu Lys Thr

165 170 175

Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln Val Tyr Thr Leu

180 185 190

Pro Pro Ser Arg Asp Glu Leu Thr Lys Asn Gln Val Ser Leu Thr Cys

195 200 205

Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val Glu Trp Glu Ser

210 215 220

Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro Pro Val Leu Asp

225 230 235 240

Ser Asp Gly Ser Phe Phe Leu Tyr Ser Lys Leu Thr Val Asp Lys Ser

245 250 255

Arg Trp Gln Gln Gly Asn Val Phe Ser Cys Ser Val Met His Glu Ala

260 265 270

Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu Ser Pro Gly Lys

275 280 285

<210> 12

<211> 864

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 12

gacctgcgtg aacgtatcga acgtttcatc gaagatgcga aacgtaacct ggaagaaggt 60

aacccggaaa ttgcgcgtcg tctgctggaa gcggcgaaag acatcgcgga acagctgggt 120

gatgacgaac tccgccgtga agctgaacgc ctgctgaaag aactgaaaga gcccaaatct 180

ggtgacaaaa ctcacacatg cccaccgtgc ccagcacctg aactcctggg gggaccgtca 240

gtcttcctct tccccccaaa acccaaggac accctcatga tctcccggac ccctgaggtc 300

acatgcgtgg tggtggacgt gagccacgaa gaccctgagg tcaagttcaa ctggtacgtg 360

gacggcgtgg aggtgcataa tgccaagaca aagccgcggg aggagcagta caacagcacg 420

taccgtgtgg tcagcgtcct caccgtcctg caccaggact ggctgaatgg caaggagtac 480

aagtgcaagg tctccaacaa agccctccca gcccccatcg agaaaaccat ctccaaagcc 540

aaagggcagc cccgagaacc acaggtgtac accctgcccc catcccggga tgagctgacc 600

aagaaccagg tcagcctgac ctgcctggtc aaaggcttct atcccagcga catcgccgtg 660

gagtgggaga gcaatgggca gccggagaac aactacaaga ccacgcctcc cgtgctggac 720

tccgacggct ccttcttcct ctacagcaag ctcaccgtgg acaagagcag gtggcagcag 780

gggaacgtct tctcatgctc cgtgatgcat gaggctctgc acaaccacta cacgcagaag 840

agcctctccc tgtctccggg taaa 864

<210> 13

<211> 288

<212> PRT

<213> Artificial sequence (Artificial Sequence)

<400> 13

Asp Leu Arg Glu Arg Ile Glu Arg Phe Ile Glu Asp Ala Lys Arg Asn

1 5 10 15

Leu Glu Glu Gly Asn Pro Glu Ile Ala Arg Arg Leu Leu Glu Ala Ala

20 25 30

Lys Asn Ile Ala Glu Gln Leu Gly Asp Asp Glu Leu Arg Arg Glu Val

35 40 45

Glu Arg Leu Leu Lys Glu Leu Lys Glu Pro Lys Ser Gly Asp Lys Thr

50 55 60

His Thr Cys Pro Pro Cys Pro Ala Pro Glu Leu Leu Gly Gly Pro Ser

65 70 75 80

Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu Met Ile Ser Arg

85 90 95

Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser His Glu Asp Pro

100 105 110

Glu Val Lys Phe Asn Trp Tyr Val Asp Gly Val Glu Val His Asn Ala

115 120 125

Lys Thr Lys Pro Arg Glu Glu Gln Tyr Asn Ser Thr Tyr Arg Val Val

130 135 140

Ser Val Leu Thr Val Leu His Gln Asp Trp Leu Asn Gly Lys Glu Tyr

145 150 155 160

Lys Cys Lys Val Ser Asn Lys Ala Leu Pro Ala Pro Ile Glu Lys Thr

165 170 175

Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln Val Tyr Thr Leu

180 185 190

Pro Pro Ser Arg Asp Glu Leu Thr Lys Asn Gln Val Ser Leu Thr Cys

195 200 205

Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val Glu Trp Glu Ser

210 215 220

Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro Pro Val Leu Asp

225 230 235 240

Ser Asp Gly Ser Phe Phe Leu Tyr Ser Lys Leu Thr Val Asp Lys Ser

245 250 255

Arg Trp Gln Gln Gly Asn Val Phe Ser Cys Ser Val Met His Glu Ala

260 265 270

Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu Ser Pro Gly Lys

275 280 285

<210> 14

<211> 864

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 14

gacctgcgtg aacgtatcga acgtttcatc gaagacgcga aacgtaacct ggaagaaggt 60

aacccggaaa ttgcgcgtcg tctgctggaa gcggcgaaaa acatcgcgga acagctgggt 120

gatgacgaac tccgccgtga agttgaacgc ctgctgaaag aactgaaaga gcccaaatct 180

ggtgacaaaa ctcacacatg cccaccgtgc ccagcacctg aactcctggg gggaccgtca 240

gtcttcctct tccccccaaa acccaaggac accctcatga tctcccggac ccctgaggtc 300

acatgcgtgg tggtggacgt gagccacgaa gaccctgagg tcaagttcaa ctggtacgtg 360

gacggcgtgg aggtgcataa tgccaagaca aagccgcggg aggagcagta caacagcacg 420

taccgtgtgg tcagcgtcct caccgtcctg caccaggact ggctgaatgg caaggagtac 480

aagtgcaagg tctccaacaa agccctccca gcccccatcg agaaaaccat ctccaaagcc 540

aaagggcagc cccgagaacc acaggtgtac accctgcccc catcccggga tgagctgacc 600

aagaaccagg tcagcctgac ctgcctggtc aaaggcttct atcccagcga catcgccgtg 660

gagtgggaga gcaatgggca gccggagaac aactacaaga ccacgcctcc cgtgctggac 720

tccgacggct ccttcttcct ctacagcaag ctcaccgtgg acaagagcag gtggcagcag 780

gggaacgtct tctcatgctc cgtgatgcat gaggctctgc acaaccacta cacgcagaag 840

agcctctccc tgtctccggg taaa 864

<210> 15

<211> 288

<212> PRT

<213> Artificial sequence (Artificial Sequence)

<400> 15

Asp Leu Arg Glu Arg Ile Glu Arg Phe Ile Glu Asp Ala Lys Arg Asn

1 5 10 15

Leu Glu Glu Gly Asn Pro Glu Ile Ala Arg Arg Leu Leu Glu Ala Ala

20 25 30

Lys Asn Ile Ala Glu Gln Leu Gly Asp Asp Glu Leu Arg Arg Glu Val

35 40 45

Glu Arg Leu Leu Lys Glu Leu Lys Glu Pro Lys Ser Gly Asp Lys Thr

50 55 60

His Thr Cys Pro Pro Cys Pro Ala Pro Glu Leu Leu Gly Gly Pro Ser

65 70 75 80

Val Phe Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu Met Ile Ser Arg

85 90 95

Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser His Glu Asp Pro

100 105 110

Glu Val Lys Phe Asn Trp Tyr Val Asp Gly Val Glu Val His Asn Ala

115 120 125

Lys Thr Lys Pro Arg Glu Glu Gln Tyr Asn Ser Thr Tyr Arg Val Val

130 135 140

Ser Val Leu Thr Val Leu His Gln Asp Trp Leu Asn Gly Lys Glu Tyr

145 150 155 160

Lys Cys Lys Val Ser Asn Lys Ala Leu Pro Ala Pro Ile Glu Lys Thr

165 170 175

Ile Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln Val Tyr Thr Leu

180 185 190

Pro Pro Ser Arg Asp Glu Leu Thr Lys Asn Gln Val Ser Leu Thr Cys

195 200 205

Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val Glu Trp Glu Ser

210 215 220

Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro Pro Val Leu Asp

225 230 235 240

Ser Asp Gly Ser Phe Phe Leu Tyr Ser Lys Leu Thr Val Asp Lys Ser

245 250 255

Arg Trp Gln Gln Gly Asn Val Phe Ser Cys Ser Val Met His Glu Ala

260 265 270

Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu Ser Pro Gly Lys

275 280 285

<210> 16

<211> 864

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 16

gacctgcgtg aacgtatcga acgtttcatc gaagacgcga aacgtaacct ggaagaaggt 60

aacccggaaa ttgcgcgtcg tttgctggaa gcggcgaaaa acatcgcgga acagctgggt 120

gatgacgaac tccgccgtga agttgaacgc ctgctgaaag aactgaaaga gcccaaatct 180

ggtgacaaaa ctcacacatg cccaccgtgc ccagcacctg aactcctggg gggaccgtca 240

gtcttcctct tccccccaaa acccaaggac accctcatga tctcccggac ccctgaggtc 300

acatgcgtgg tggtggacgt gagccacgaa gaccctgagg tcaagttcaa ctggtacgtg 360

gacggcgtgg aggtgcataa tgccaagaca aagccgcggg aggagcagta caacagcacg 420

taccgtgtgg tcagcgtcct caccgtcctg caccaggact ggctgaatgg caaggagtac 480

aagtgcaagg tctccaacaa agccctccca gcccccatcg agaaaaccat ctccaaagcc 540

aaagggcagc cccgagaacc acaggtgtac accctgcccc catcccggga tgagctgacc 600

aagaaccagg tcagcctgac ctgcctggtc aaaggcttct atcccagcga catcgccgtg 660

gagtgggaga gcaatgggca gccggagaac aactacaaga ccacgcctcc cgtgctggac 720

tccgacggct ccttcttcct ctacagcaag ctcaccgtgg acaagagcag gtggcagcag 780

gggaacgtct tctcatgctc cgtgatgcat gaggctctgc acaaccacta cacgcagaag 840

agcctctccc tgtctccggg taaa 864

<210> 17

<211> 19

<212> PRT

<213> Artificial sequence (Artificial Sequence)

<400> 17

Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly

1 5 10 15

Val His Ser

<210> 18

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 18

atgggatggt catgtatcat cctttttcta gtagcaactg caaccggtgt acattcc 57

Claims

1. A small protein targeting a novel coronavirus S protein, wherein the small protein is capable of specifically binding to the novel coronavirus S protein, exhibits strong affinity, and exhibits strong binding activity against the novel coronavirus wild type and the Alpha, Beta, Gamma and Delta mutants, respectively;

wherein the small protein is composed of one peptide chain and mainly forms three α-helical secondary structures;

and the amino acid sequence of the small protein is shown as SEQ ID NO: 1.

2. A small protein targeting a novel coronavirus Delta mutant S protein, characterized in that the small protein can specifically bind to the novel coronavirus Delta mutant S protein, shows ultra-strong affinity, competes with the ACE2 receptor for binding to the novel coronavirus S protein, and effectively blocks the binding of the novel coronavirus S protein to the ACE2 receptor protein; and shows good neutralization protective activity against the wild type, Alpha, Beta and Gamma mutants of the novel coronavirus;

and the amino acid sequence of the small protein is shown as SEQ ID NO:3 or 5.

3. A recombinant protein comprising two or more S protein targeting small proteins of claim 1 or 2 in tandem.

4. A fusion protein comprising a first polypeptide of formula I and/or a second polypeptide of formula II:

P-Mx-H-Fc (formula I)

P-Fc-H-Mx (formula II)

wherein,

P is absent or is a signal peptide sequence;

M is an S protein binding domain (or binding member) having an amino acid sequence derived from the amino acid sequence of the small protein targeting an S protein as set forth in claim 1 or 2;

H is a hinge region;

Fc is absent, or is a constant region of an immunoglobulin or a fragment thereof;

x is a positive integer from 1 to 4.

5. A polynucleotide encoding the S protein-targeting small protein of claim 1 or 2, the recombinant protein of claim 3, or the fusion protein of claim 4.

6. A vector comprising the polynucleotide of claim 5.

7. A host cell comprising the vector of claim 6 or having integrated into its genome the polynucleotide of claim 5.

8. An immunoconjugate, the immunoconjugate comprising:

(a) A small protein targeting an S protein according to claim 1 or 2, a recombinant protein according to claim 3, or a fusion protein according to claim 4; and

9. A pharmaceutical composition comprising:

(a) The S protein-targeting small protein of claim 1 or 2, or the recombinant protein of claim 3, or the fusion protein of claim 4, or a gene encoding the same; or the immunoconjugate of claim 8; and

(b) A pharmaceutically acceptable carrier.

10. A method for preparing the S protein-targeting small protein of claim 1 or 2, or the recombinant protein of claim 3 or the fusion protein of claim 4, comprising the steps of:

(a) Culturing the host cell of claim 7 under suitable conditions, thereby obtaining a culture comprising the small protein, recombinant protein or fusion protein; and

(b) Purifying and/or separating the culture obtained in step (a) to obtain the S protein-targeting small protein, recombinant protein or fusion protein.