CN113999292B - Polypeptide inhibitors targeting respiratory viruses - Google Patents

Publication number
CN113999292B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111253551.5A
Other languages
Chinese (zh)
Other versions
CN113999292A (en)
Inventor
陈忠周
董晓飞
吴玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Agricultural University
Original Assignee
China Agricultural University
Application filed by China Agricultural University
Priority to CN202111253551.5A
Publication of CN113999292A
Application granted
Publication of CN113999292B
Legal status: Active

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P: SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00: Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/12: Antivirals
    • A61P31/14: Antivirals for RNA viruses
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00: Medicinal preparations containing peptides
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K2319/00: Fusion polypeptide
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2760/00: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssRNA viruses negative-sense
    • C12N2760/00011: Details
    • C12N2760/18011: Paramyxoviridae
    • C12N2760/18611: Respirovirus, e.g. Bovine, human parainfluenza 1,3
    • C12N2760/18622: New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2760/00: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssRNA viruses negative-sense
    • C12N2760/00011: Details
    • C12N2760/18011: Paramyxoviridae
    • C12N2760/18611: Respirovirus, e.g. Bovine, human parainfluenza 1,3
    • C12N2760/18633: Use of viral protein as therapeutic agent other than vaccine, e.g. apoptosis inducing or anti-inflammatory
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30: Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Abstract

The invention discloses a polypeptide, which may be a polypeptide having the amino acid sequence shown in SEQ ID No. 3 or a polypeptide having the amino acid sequence shown at positions 22-78 of SEQ ID No. 3. The affinity of the polypeptide provided by the invention for HPIV3 N is significantly higher than that of the P polypeptide for HPIV3 N; moreover, the polypeptide significantly inhibits the binding of the HPIV3 N protein to RNA, thereby inhibiting the proliferation of HPIV3. The polypeptide provided by the invention has important application value in controlling the infection and transmission of parainfluenza viruses (PIVs), and has important industrial value in the biomedical industry and animal husbandry.

Description

Polypeptide inhibitors targeting respiratory viruses
Technical Field
The invention relates to antiviral drug research in the field of biomedicine, and in particular to a polypeptide inhibitor targeting respiratory viruses.
Background
Parainfluenza viruses (PIVs) are a class of viruses that cause respiratory disease. They infect humans and domestic animals such as cattle, dogs, pigs and goats, causing upper and lower respiratory tract infections. Infection of respiratory epithelial cells causes cell degeneration and death, leading to mucosal ulceration; infection of the lower respiratory tract damages the alveolar epithelium and interstitial cells, causing symptoms such as pneumonia. Whereas infected adults typically show symptoms such as cough and sore throat, infants and young children are more susceptible to parainfluenza virus, which is the second leading cause of acute respiratory disease in children and infects more than 1.5 million children each year. Parainfluenza virus infection in animals often receives little attention, but under intense stress it leads to immunosuppression, tissue damage and secondary infection, causing pneumonia and atypical interstitial pneumonia and bringing huge economic losses to animal husbandry.
Parainfluenza viruses belong to the family Paramyxoviridae of the order Mononegavirales and can be classified into five types, PIV1, PIV2, PIV3, PIV4 and PIV5, based on differences in genetic and serological characteristics. Four subtypes infect humans: HPIV1, HPIV2, HPIV3 and HPIV4. HPIV1 and HPIV3 belong to the genus Respirovirus, and HPIV2 and HPIV4 belong to the genus Rubulavirus. Parainfluenza viruses are transmitted mainly through airborne droplets and have circulated every year since their discovery in the 1950s. HPIV1 and HPIV2 tend to circulate in odd-numbered and even-numbered years, respectively, while HPIV3 outbreaks occur mainly from April to June each year. HPIV1 and HPIV2 generally cause milder, cold-like symptoms, whereas HPIV3 tends to cause more severe lower respiratory tract disease.
Virus particles of the genus Respirovirus appear pleomorphic and roughly spherical under the electron microscope, with diameters of 100-500 nm. Their genome is a negative-sense, non-segmented, single-stranded RNA that mainly encodes six proteins: nucleoprotein (N), the polymerase cofactor phosphoprotein (P), matrix (M) protein, fusion (F) protein, hemagglutinin-neuraminidase (HN) protein, and the large polymerase (L) protein. The N protein is one of the core components of viral proliferation, and only a viral genome encapsidated by N protein can serve as an effective template for transcription and replication. Given the importance of the N protein to the virus, and given that no FDA-approved drugs specifically directed against respiroviruses currently exist, the development of corresponding targeted inhibitors is of great value.
Disclosure of Invention
It is an object of the present invention to provide inhibitors of respiratory viruses.
To achieve the above object, in a first aspect, the present invention provides a polypeptide comprising a P polypeptide and an N polypeptide, which may be any one of A1) to A10):
A1) the amino acid sequence of the P polypeptide is shown at positions 22-42 of SEQ ID No. 3, and the amino acid sequence of the N polypeptide is shown at positions 51-78 of SEQ ID No. 3;
A2) the amino acid sequence of the P polypeptide is shown at positions 1-42 of SEQ ID No. 3, and the amino acid sequence of the N polypeptide is shown at positions 51-78 of SEQ ID No. 3;
A3) the amino acid sequence of the P polypeptide is shown at positions 22-42 of SEQ ID No. 16, and the amino acid sequence of the N polypeptide is shown at positions 51-78 of SEQ ID No. 16;
A4) the amino acid sequence of the P polypeptide is shown at positions 1-42 of SEQ ID No. 16, and the amino acid sequence of the N polypeptide is shown at positions 51-78 of SEQ ID No. 16;
A5) the amino acid sequence of the P polypeptide is shown at positions 22-42 of SEQ ID No. 18, and the amino acid sequence of the N polypeptide is shown at positions 51-78 of SEQ ID No. 18;
A6) the amino acid sequence of the P polypeptide is shown at positions 1-42 of SEQ ID No. 18, and the amino acid sequence of the N polypeptide is shown at positions 51-78 of SEQ ID No. 18;
A7) the amino acid sequence of the P polypeptide is shown at positions 22-42 of SEQ ID No. 20, and the amino acid sequence of the N polypeptide is shown at positions 51-78 of SEQ ID No. 20;
A8) the amino acid sequence of the P polypeptide is shown at positions 1-42 of SEQ ID No. 20, and the amino acid sequence of the N polypeptide is shown at positions 51-78 of SEQ ID No. 20;
A9) the amino acid sequence of the P polypeptide is shown at positions 22-42 of SEQ ID No. 22, and the amino acid sequence of the N polypeptide is shown at positions 51-78 of SEQ ID No. 22;
A10) the amino acid sequence of the P polypeptide is shown at positions 1-42 of SEQ ID No. 22, and the amino acid sequence of the N polypeptide is shown at positions 51-78 of SEQ ID No. 22.
Further, the polypeptide also includes a linker peptide connecting the P polypeptide and the N polypeptide, including but not limited to (GS)4, whose amino acid sequence is shown at positions 43-50 of SEQ ID No. 3.
Common linker peptides are divided into flexible and rigid linker peptides. Flexible linker peptides are pliable, linear amino acid sequences that bend easily, commonly (GGGGS)n repeats (n=2-4) and (GS)n repeats (n=4); rigid linker peptides are amino acid sequences of fixed length that form stable helical structures, commonly (EAAAK)n repeats (n=2-5).
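As an illustrative sketch only (not part of the patent), the repeat-based linkers named above can be written out programmatically. The helper name `repeat_linker` is hypothetical; the motifs and repeat ranges are those stated in the text.

```python
def repeat_linker(motif: str, n: int) -> str:
    """Return the linker formed by repeating a one-letter-code motif n times."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return motif * n

# (GS)4: the 8-residue flexible linker at positions 43-50 of SEQ ID No. 3
gs4 = repeat_linker("GS", 4)

# Flexible (GGGGS)n linkers for n = 2-4 and rigid (EAAAK)n linkers for n = 2-5
flexible = [repeat_linker("GGGGS", n) for n in range(2, 5)]
rigid = [repeat_linker("EAAAK", n) for n in range(2, 6)]

print(gs4, len(gs4))
```

The length of (GS)4, 8 residues, matches the span 43-50 given for the linker in SEQ ID No. 3.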
Further, the polypeptide may be any one of C1) to C10):
C1) the amino acid sequence of the polypeptide is shown at positions 22-78 of SEQ ID No. 3;
C2) the amino acid sequence of the polypeptide is shown as SEQ ID No. 3;
C3) the amino acid sequence of the polypeptide is shown at positions 22-78 of SEQ ID No. 16;
C4) the amino acid sequence of the polypeptide is shown as SEQ ID No. 16;
C5) the amino acid sequence of the polypeptide is shown at positions 22-78 of SEQ ID No. 18;
C6) the amino acid sequence of the polypeptide is shown as SEQ ID No. 18;
C7) the amino acid sequence of the polypeptide is shown at positions 22-78 of SEQ ID No. 20;
C8) the amino acid sequence of the polypeptide is shown as SEQ ID No. 20;
C9) the amino acid sequence of the polypeptide is shown at positions 22-78 of SEQ ID No. 22;
C10) the amino acid sequence of the polypeptide is shown as SEQ ID No. 22.
Further, the polypeptide specifically targets and inhibits respiratory viruses. In particular, the polypeptide of A1), A2), C1) and/or C2) specifically targets and inhibits human parainfluenza virus HPIV3; the polypeptide of A3), A4), C3) and/or C4) specifically targets and inhibits human parainfluenza virus HPIV1; the polypeptide of A5), A6), C5) and/or C6) specifically targets and inhibits Sendai virus SeV; the polypeptide of A7), A8), C7) and/or C8) specifically targets and inhibits porcine parainfluenza virus PRV1; and the polypeptide of A9), A10), C9) and/or C10) specifically targets and inhibits bovine parainfluenza virus BPIV3.
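The correspondence between options A1) to A10), options C1) to C10), the SEQ ID numbers and the target viruses follows a regular pattern, which the following minimal sketch tabulates. The names `TARGETS` and `enumerate_options` are hypothetical and used for illustration only; the numbers are taken from the text.

```python
# SEQ ID No. of each full-length polypeptide and the virus it targets (from the text)
TARGETS = {3: "HPIV3", 16: "HPIV1", 18: "SeV", 20: "PRV1", 22: "BPIV3"}

def enumerate_options():
    """Return rows of (A label, C label, SEQ ID No., P range, N range, virus)."""
    rows = []
    for i, (seq_id, virus) in enumerate(TARGETS.items()):
        # Odd-numbered options use the short P fragment (22-42),
        # even-numbered options use the full P fragment (1-42).
        for j, p_range in enumerate([(22, 42), (1, 42)]):
            k = 2 * i + j + 1
            rows.append((f"A{k}", f"C{k}", seq_id, p_range, (51, 78), virus))
    return rows

for row in enumerate_options():
    print(row)
```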
In a second aspect, the present invention provides a biological material associated with the polypeptide described above, which biological material may be any of the following:
B1) a nucleic acid molecule encoding the above polypeptide;
B2) an expression cassette comprising the nucleic acid molecule of B1);
B3) a recombinant vector comprising the nucleic acid molecule of B1) or the expression cassette of B2);
B4) a recombinant microorganism comprising the nucleic acid molecule of B1), the expression cassette of B2) or the recombinant vector of B3);
B5) an animal cell line comprising the nucleic acid molecule of B1), the expression cassette of B2) or the recombinant vector of B3);
B6) a plant cell line comprising the nucleic acid molecule of B1), the expression cassette of B2) or the recombinant vector of B3);
B7) a recombinant cell producing the polypeptide described above.
Further, in the biological material, the coding sequence of the coding strand of the nucleic acid molecule of B1) may be any one of D1) to D10):
D1) the coding sequence of the coding strand is shown at positions 64-237 of SEQ ID No. 4;
D2) the coding sequence of the coding strand is shown as SEQ ID No. 4;
D3) the coding sequence of the coding strand is shown at positions 64-237 of SEQ ID No. 17;
D4) the coding sequence of the coding strand is shown as SEQ ID No. 17;
D5) the coding sequence of the coding strand is shown at positions 64-237 of SEQ ID No. 19;
D6) the coding sequence of the coding strand is shown as SEQ ID No. 19;
D7) the coding sequence of the coding strand is shown at positions 64-237 of SEQ ID No. 21;
D8) the coding sequence of the coding strand is shown as SEQ ID No. 21;
D9) the coding sequence of the coding strand is shown at positions 64-237 of SEQ ID No. 23;
D10) the coding sequence of the coding strand is shown as SEQ ID No. 23.
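The start position 64 of the truncated coding sequences follows directly from the residue numbering, since residue 22 begins at nucleotide 3 × 21 + 1 = 64. The sketch below shows the conversion; the helper name `codon_range` is hypothetical. Residue 78 ends at nucleotide 234, so the stated end position 237 spans one additional codon, presumably a stop codon. That reading is an assumption, because the sequence listing is not reproduced in this text.

```python
def codon_range(aa_start: int, aa_end: int) -> tuple:
    """Map a 1-based inclusive residue range to its 1-based inclusive nucleotide range."""
    if not 1 <= aa_start <= aa_end:
        raise ValueError("expected 1 <= aa_start <= aa_end")
    # Residue k occupies nucleotides 3*(k-1)+1 through 3*k of the coding strand
    return 3 * (aa_start - 1) + 1, 3 * aa_end

print(codon_range(22, 78))  # residues 22-78 (truncated polypeptide)
print(codon_range(1, 42))   # P(1-42)
print(codon_range(43, 50))  # (GS)4 linker
```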
In a third aspect, the present invention provides the use of the polypeptide described above and/or the biological material described above in the preparation of a medicament for inhibiting a respiratory virus.
Further, in the above use, the respiratory viruses include, but are not limited to, human parainfluenza virus HPIV1, human parainfluenza virus HPIV3, Sendai virus SeV, porcine parainfluenza virus PRV1, and bovine parainfluenza virus BPIV3.
In a fourth aspect, the invention provides a medicament or pharmaceutical composition comprising a polypeptide as described above.
In a fifth aspect, the present invention provides a method for preparing an N protein which binds to the polypeptide described above, said method comprising subjecting P of an N-P fusion protein to a hydrophobic mutation.
Further, the P obtained by the hydrophobic mutation may be E1) or E2):
E1) a polypeptide having the amino acid sequence shown as SEQ ID No. 5;
E2) a polypeptide having the amino acid sequence shown as SEQ ID No. 14.
In the method, the amino acid sequence of the N-P fusion protein is shown as SEQ ID No. 1, and the coding sequence of the coding strand of the N-P fusion protein is shown as SEQ ID No. 2.
The affinity of the polypeptide provided by the invention for HPIV3 N is significantly higher than that of P for HPIV3 N; moreover, the polypeptide significantly inhibits the binding of the HPIV3 N protein to RNA, thereby inhibiting the proliferation of HPIV3. The polypeptide provided by the invention has important application value in controlling the infection and transmission of parainfluenza viruses (PIVs), and has important industrial value in the biomedical industry and animal husbandry.
Drawings
FIG. 1 shows the purification of the HPIV3 N(29-374)-(GS)4-P(1-42) protein.
FIG. 2 shows a crystal from the initial screen and its diffraction pattern; A in FIG. 2 is the HPIV3 N(29-374)-(GS)4-P(1-42) crystal used for diffraction; B in FIG. 2 is the diffraction pattern of the HPIV3 N(29-374)-(GS)4-P(1-42) crystal, with the numbers on the pattern indicating resolution.
FIG. 3 shows the interaction between the HPIV3 N(29-374) and P(1-42) proteins; A in FIG. 3 shows the hydrophobic surfaces of N and P, with hydrophobicity increasing as the color becomes lighter; B in FIG. 3 shows the N-P interaction interface as a cartoon, with the amino acids involved in the interaction shown as sticks.
FIG. 4 is a schematic representation of the design of the P-derived polypeptide.
FIG. 5 shows the purification of the soluble monomeric HPIV3 N protein.
FIG. 6 shows the purification of PARM1.
FIG. 7 shows MST measurement of the affinity of PARM1 and PARM2 for HPIV3 N.
FIG. 8 shows MST measurement of the affinity of different P fragments for HPIV3 N; the three P polypeptide fragments are P(1-42), P(1-21) and P(22-42).
FIG. 9 shows MST measurement of the affinity between HPIV3 N and the P polypeptide in the control.
FIG. 10 is an EMSA showing that PARM affects the binding of HPIV3 N to RNA more strongly than the P polypeptide does.
FIG. 11 shows MST measurement of the affinity between HPIV1-PARM and HPIV1 N.
FIG. 12 shows MST measurement of the affinity between SeV-PARM and SeV N.
FIG. 13 shows MST measurement of the affinity between PRV1-PARM and PRV1 N.
FIG. 14 shows MST measurement of the affinity between BPIV3-PARM and BPIV3 N.
Detailed Description
The following detailed description of the invention is provided in connection with the accompanying drawings that are presented to illustrate the invention and not to limit the scope thereof. The examples provided below are intended as guidelines for further modifications by one of ordinary skill in the art and are not to be construed as limiting the invention in any way.
The experimental methods in the following examples, unless otherwise specified, are conventional methods, and are carried out according to techniques or conditions described in the literature in the field or according to the product specifications. Materials, reagents and the like used in the examples described below are commercially available unless otherwise specified.
In the present invention, pET28a-TEV and BL21(DE3) are described in the literature: Manfeng Zhang, Xiaorong Li, Zengqin Deng, Zhenhang Chen, Yang Liu, Yina Gao, Wei Wu, and Zhongzhou Chen, "Structural Biology of the Arterivirus nsp11 Endoribonucleases", ASM Journals, Journal of Virology, Vol. 91, No. 1, 16 December 2016, DOI: 10.1128/JVI.01309-16. pET28a-TEV is the "modified pET-28a (N-terminal His tag) vector in which the thrombin recognition site was replaced by a Tobacco Etch Virus (TEV) protease recognition site" described in that paper. The above biological materials are available to the public from the applicant solely for repeating the experiments of the present invention and may not be used for other purposes.
In the present invention, pGEX-6P-1 vector was purchased from NovoPro Bioscience Inc under the trade designation V010912.
In the present invention, pET28a vector is purchased from NOVAGEN under the trade designation 69864.
In the present invention, the BamH I and Xho I restriction enzymes were purchased from Thermo Fisher.
The seamless junction kit was purchased from Zhongmeitai and company under the trade designation C5891.
Example 1 Design and purification of PARM
1.1 Determination of the crystal structure of the HPIV3 N(29-374)-(GS)4-P(1-42) protein
The GenBank accession numbers of the N and P genes of the HPIV3 strain NIH47885 (genus Respirovirus) used here are D10025.1 (2010-6-15) and M14932.1 (1995-5-17), respectively. N(29-374) and P(1-42) were joined by the (GS)4 flexible linker of 8 amino acid residues to design N(29-374)-(GS)4-P(1-42). The amino acid sequence of N(29-374)-(GS)4-P(1-42) is shown as SEQ ID No. 1, wherein amino acid residues 1-346 of SEQ ID No. 1 represent N(29-374), amino acid residues 347-354 represent the linker peptide (GS)4, and amino acid residues 355-396 represent P(1-42).
The coding sequence of the coding strand of N(29-374)-(GS)4-P(1-42) is shown as SEQ ID No. 2, wherein nucleotides 1-1038 of SEQ ID No. 2 are the coding sequence of N(29-374), nucleotides 1039-1062 are the coding sequence of the linker peptide (GS)4, and nucleotides 1063-1188 are the coding sequence of P(1-42).
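The residue and nucleotide boundaries quoted for SEQ ID No. 1 and SEQ ID No. 2 can be cross-checked from the segment lengths alone. The following is a minimal sketch; the function name `layout` is hypothetical, and the only inputs are the lengths implied by the text (346, 8 and 42 residues).

```python
def layout(segments):
    """Given ordered (name, length) pairs, return (name, start, end), 1-based inclusive."""
    out, pos = [], 1
    for name, length in segments:
        out.append((name, pos, pos + length - 1))
        pos += length
    return out

# Lengths in residues: N(29-374) has 374 - 29 + 1 = 346; (GS)4 has 8; P(1-42) has 42
protein = layout([("N(29-374)", 346), ("(GS)4", 8), ("P(1-42)", 42)])

# The coding strand uses three nucleotides per residue
dna = layout([(name, 3 * (end - start + 1)) for name, start, end in protein])

print(protein)
print(dna)
```

The computed boundaries agree with the positions stated for both sequences, which indicates that SEQ ID No. 2 as described contains no additional codons beyond the 396 residues.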
The N(29-374)-(GS)4-P(1-42) encoding gene, whose nucleotide sequence is shown as SEQ ID No. 2, was used to replace the small fragment between the BamH I and Xho I recognition sites of the pET28a-TEV vector, leaving the other nucleotide sequences of the vector unchanged, to yield a recombinant expression vector for the N(29-374)-(GS)4-P(1-42) gene, designated pET28a-TEV-N(29-374)-(GS)4-P(1-42).
After the construct was verified by sequencing, pET28a-TEV-N(29-374)-(GS)4-P(1-42) was transformed by heat shock at 42°C into the E. coli expression strain BL21(DE3). The cells were cultured at 37°C and 200 rpm until OD600nm reached 0.8-1.0, 200 μM IPTG was added to induce expression overnight at 18°C, and the cells were then collected by low-speed centrifugation at 4000 rpm. The cells were disrupted by sonication (ultrasonic cell disruptor purchased from Ningbo Xinzhi Biotechnology Co., Ltd.) and centrifuged at high speed at 20000 rpm (centrifuge purchased from Anhui, middle-Kagaku scientific instruments Co., Ltd.). The supernatant was collected, filtered through a 0.45 μm filter, and purified on a Ni affinity chromatography column (Ni-NTA, Thermo Fisher). The column was eluted with buffers of stepwise increasing imidazole concentration (20 mM BTP pH 8.5, 1000 mM NaCl, containing 20 mM, 50 mM or 300 mM imidazole). The eluate containing the target protein was collected and concentrated in an ultrafiltration tube to reduce the imidazole concentration, TEV protease was added, and the sample was digested overnight at 4°C. The next day the sample was purified again on the Ni affinity column, and the target protein was collected from the flow-through, concentrated, and subjected to gel filtration (size-exclusion) chromatography on a Superdex 200 10/300 GL column in 20 mM BTP pH 8.5, 1000 mM NaCl. The target protein eluted at about 15.5 mL as a single symmetrical peak (FIG. 1). Protein from the peak apex was collected and concentrated to 7 mg/mL for crystal screening by vapor diffusion.
A in FIG. 1 is the SDS-PAGE analysis of the Ni affinity column purification of the HPIV3 N(29-374)-(GS)4-P(1-42) protein, wherein the leftmost lane M is the molecular weight marker, BEF is the supernatant after high-speed centrifugation that was loaded onto the Ni column, FT is the flow-through, and 20, 50 and 300 are the imidazole concentrations (in mM) of the Ni column elution buffers; B in FIG. 1 is the SDS-PAGE analysis of the Ni affinity column purification of the HPIV3 N(29-374)-(GS)4-P(1-42) protein after TEV protease cleavage, wherein BEF is the TEV-cleaved protein sample loaded onto the Ni column, and 20 and 300 are the imidazole concentrations (in mM) of the Ni column elution buffers; C in FIG. 1 shows the gel filtration chromatography result (GE Superdex 200 10/300 GL), with the abscissa representing elution volume (in mL) and the ordinate representing the UV absorbance of the protein (in AU); D in FIG. 1 is the SDS-PAGE analysis of the protein sample from the gel filtration peak apex.
Crystals from the initial screen grew at 4°C in 0.1 M magnesium chloride, 0.1 M HEPES pH 7.0, 15% PEG 3350, 0.005 M nickel(II) chloride hexahydrate. Crystals were optimized for data collection in 20% (w/v) PEG 6,000, 0.1 M sodium cacodylate pH 6.5, 20 mM nickel(II) chloride, and data were collected at beamline BL17U1 of the Shanghai Synchrotron Radiation Facility (FIG. 2). The wavelength was set to [value given as image BDA0003323155350000051 in the original; not reproduced here], the crystal-to-detector (CCD) distance was set to 550 mm, and one diffraction image was collected per 1°, giving a data set at a resolution of [value given as image BDA0003323155350000052 in the original; not reproduced here]. Molecular replacement was carried out using the MeV N protein structure (PDB ID: 6H5Q) as the search model, the output PDB and MTZ files were refined with the Refmac5 program, the model was built manually in COOT, and the structure was finally analyzed.
1.2 Design of PARM
Based on the solved crystal structure of the HPIV3 N(29-374)-(GS)4-P(1-42) protein, HPIV3 P(1-42) and HPIV3 N(375-402) were joined by the (GS)4 flexible linker peptide, and the resulting polypeptide was named PARM1. Its amino acid sequence is shown as SEQ ID No. 3, wherein positions 1-42 of SEQ ID No. 3 are HPIV3 P(1-42), positions 43-50 are the linker peptide (GS)4, and positions 51-78 are HPIV3 N(375-402). The coding sequence of the coding strand of PARM1 is shown as SEQ ID No. 4, wherein positions 1-126 of SEQ ID No. 4 are the coding sequence of HPIV3 P(1-42), positions 127-150 are the coding sequence of the linker peptide (GS)4, and positions 151-237 are the coding sequence of HPIV3 N(375-402).
Likewise, HPIV3 P(22-42) and HPIV3 N(375-402) were joined by the (GS)4 flexible linker peptide, and the resulting polypeptide was named PARM2. Its amino acid sequence is shown at positions 22-78 of SEQ ID No. 3, wherein positions 22-42 of SEQ ID No. 3 are HPIV3 P(22-42), positions 43-50 are the linker peptide (GS)4, and positions 51-78 are HPIV3 N(375-402). The coding sequence of the coding strand of PARM2 is shown at positions 64-237 of SEQ ID No. 4, wherein positions 64-126 of SEQ ID No. 4 are the coding sequence of HPIV3 P(22-42), positions 127-150 are the coding sequence of the linker peptide (GS)4, and positions 151-237 are the coding sequence of HPIV3 N(375-402).
FIG. 4 is a schematic representation of the design of the P-derived polypeptide. A in FIG. 4 is a superposition of HPIV3 N-P with N in the SeV nucleocapsid, showing SeV N (wheat), NTARM (cyan), CTARM (dark green), RNA (purple), HPIV3 N (blue) and HPIV3 P (magenta); the lower left panel shows the position of the aligned SeV N trimer in the nucleocapsid. B in FIG. 4 is a schematic of the derived peptide PARM together with a sequence alignment of the P amino terminus and the N CTARM within the genus Respirovirus, using sequences from the UniProt database: HPIV1 (human respirovirus 1) P (P32530) and N (P26590), SeV P (P14252) and N (Q07097), PRV1 (porcine respirovirus 1) P (S5LVM8) and N (S5LSJ0), BPIV3 (bovine respirovirus 3) P (P06163) and N (P06161), HPIV3 P (P06162) and N (P06159).
1.3 Preparation and purification of PARM
The PARM1 encoding gene, whose nucleotide sequence is shown as SEQ ID No. 4, was used to replace the small fragment between the BamH I and Xho I recognition sites of the pGEX-6P-1 vector, leaving the other nucleotide sequences of the vector unchanged, to yield a recombinant expression vector for the PARM1 gene, designated pGEX-6P-1-PARM1.
pGEX-6P-1-PARM1 was transformed into competent E. coli BL21(DE3) cells, which were cultured in 20 mL of LB liquid medium at 37°C and 200 rpm. After about 6 hours the culture was transferred to 1 L of LB liquid medium and grown at 37°C and 200 rpm until OD600nm reached 0.8-1.0. Expression was then induced overnight (about 12 hours) at a low temperature of 16°C with 0.5 mM IPTG (isopropyl-β-D-thiogalactoside) as the inducer, giving the induced culture.
The purification procedure was as follows. The induced culture was centrifuged at 4000 rpm for 10 min, and the cell pellet was collected for further purification. The pellet was resuspended in buffer (20 mM Tris pH 7.5, 500 mM NaCl, 2 mM EDTA, 1 mM of the protease inhibitor phenylmethylsulfonyl fluoride (PMSF), 0.1% (v/v) Triton X-100), and the cells were lysed in an ice-salt bath with an ultrasonic disruptor using the following program: 2 s on, 4 s off, for a total time of 12-16 min. The lysate was centrifuged at 20000×g for 50 min at 4°C, and the supernatant was filtered through a 0.45 μm filter. In a 4°C refrigerator, the protein supernatant was loaded onto a gravity column packed with GST affinity resin (Glutathione Sepharose Fast Flow; column volume 2 mL), and the flow-through was collected in a clean beaker. This was repeated so that the protein supernatant passed through the GST affinity column 2-3 times. The protein was eluted with buffer containing 10 mM reduced glutathione (20 mM Tris pH 7.5, 300 mM NaCl). In preparation for anion exchange (GE Mono Q), the salt concentration was then reduced to 50 mM or less by repeated concentration in an ultrafiltration tube and addition of salt-free buffer.
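Bringing the eluate from 300 mM NaCl to 50 mM or less requires at least a 6-fold overall dilution. The sketch below estimates how many concentrate-and-dilute cycles that takes for a given per-cycle dilution factor. The function name `cycles_needed` and the per-cycle factors in the example are hypothetical; the text does not state the volumes used.

```python
def cycles_needed(start_mM: float, target_mM: float, fold_per_cycle: float) -> int:
    """Count concentrate-and-dilute cycles needed to reach target_mM or below."""
    if fold_per_cycle <= 1:
        raise ValueError("each cycle must dilute by more than 1-fold")
    cycles, conc = 0, start_mM
    while conc > target_mM:
        conc /= fold_per_cycle  # one cycle: concentrate, then refill with salt-free buffer
        cycles += 1
    return cycles

# 300 mM NaCl in the elution buffer, target of 50 mM or less (both from the text)
print(cycles_needed(300, 50, 6))  # a single 6-fold dilution
print(cycles_needed(300, 50, 3))  # assumed 3-fold dilution per cycle
```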
Ion-exchange chromatography was performed at 4°C on a Bio-Rad FPLC instrument using a Mono Q™ 5/50 GL column from GE. The column was washed with 5 column volumes of high-salt buffer (20 mM Tris-HCl pH 8.0, 1 M NaCl) and equilibrated with 10 column volumes of salt-free buffer (20 mM Tris-HCl pH 8.0). The protein sample was concentrated to 1 mL and loaded, and the set program was run: flow rate 1 mL/min, with the proportion (by volume) of high-salt buffer in the mixed buffer increasing linearly during elution. Protein samples at the peak positions were collected and analyzed by SDS-PAGE. The higher-purity fractions were pooled for subsequent gel filtration chromatography.
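Because the two buffers differ only in NaCl (0 M and 1 M), the salt concentration during the linear gradient is simply the high-salt fraction multiplied by 1000 mM. The following sketch makes this explicit. The gradient length used in the example (20 mL) is an assumption for illustration; the text does not give it, and the function names are hypothetical.

```python
def high_salt_fraction(volume_mL: float, gradient_mL: float) -> float:
    """Fraction of high-salt buffer after volume_mL of a 0-100% linear gradient."""
    if gradient_mL <= 0:
        raise ValueError("gradient length must be positive")
    return min(max(volume_mL / gradient_mL, 0.0), 1.0)

def nacl_mM(fraction: float, high_salt_mM: float = 1000.0) -> float:
    """NaCl concentration of a mix of salt-free buffer and high-salt buffer."""
    return fraction * high_salt_mM

# Assumed 20 mL gradient: NaCl concentration a quarter of the way through
print(nacl_mM(high_salt_fraction(5.0, 20.0)))
```

Reading the conductance trace alongside such a calculation indicates roughly at which salt concentration a given peak elutes.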
Gel filtration (size-exclusion) chromatography was performed at 4°C on a Bio-Rad FPLC instrument using a Superdex 200 10/300 GL column from GE. The column was equilibrated with buffer (20 mM Tris-HCl pH 7.5, 150 mM NaCl; at least 1 column volume, 24 mL) at a flow rate of 0.3 mL/min until the detection traces (OD280nm and OD260nm) were zeroed and flat. The target protein was concentrated to 1 mL and centrifuged at 12000×g for 10 min at 4°C to remove bubbles and precipitate from the solution. The run program was set as follows: flow rate 0.3 mL/min; automatic sample injection with an injection volume of 2 mL; automatic fraction collection from 8 mL to 24 mL, 0.5 mL per tube. The protein sample was loaded into the 1 mL sample loop of the FPLC and the program was started. Protein samples at the peak positions were collected for SDS-PAGE analysis. The purified sample (PARM1) was used for subsequent experiments.
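The collection program above defines a fixed set of fraction windows, which makes it easy to work out which tube contains a given elution volume. The sketch below uses the stated values (8-24 mL, 0.5 mL per tube, 0.3 mL/min); the function names and the example elution volume of 12.3 mL are hypothetical.

```python
def fraction_windows(start_mL=8.0, end_mL=24.0, tube_mL=0.5):
    """Return (tube number, start volume, end volume) for each collected fraction."""
    n_tubes = round((end_mL - start_mL) / tube_mL)
    return [(i + 1, start_mL + i * tube_mL, start_mL + (i + 1) * tube_mL)
            for i in range(n_tubes)]

def tube_for_volume(volume_mL, start_mL=8.0, tube_mL=0.5):
    """Return the 1-based tube number that contains the given elution volume."""
    if volume_mL < start_mL:
        raise ValueError("volume elutes before collection starts")
    return int((volume_mL - start_mL) // tube_mL) + 1

windows = fraction_windows()
run_minutes = 24.0 / 0.3  # time to pass one 24 mL column volume at 0.3 mL/min

print(len(windows), windows[0], windows[-1])
print(tube_for_volume(12.3))  # hypothetical peak position
```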
FIG. 6 shows the purification of PARM1. FIG. 6A is the SDS-PAGE detection of PARM1 purified by GST affinity chromatography, in which the rightmost lane M is the standard molecular weight marker; FIG. 6B is the anion exchange chromatography profile of PARM1, in which the abscissa is the elution volume (in mL), the ordinate is the UV (280/260 nm) absorbance of the protein (in AU), and the secondary ordinate is the conductivity of the eluent (in mS/cm); FIG. 6C is the SDS-PAGE detection corresponding to the peak positions in FIG. 6B; FIG. 6D shows the further purification of PARM1 by gel filtration exclusion chromatography (GE Superdex 200 10/300 GL), in which the abscissa is the elution volume (in mL) and the ordinate is the UV (280/260 nm) absorbance of the protein (in AU); FIG. 6E is the SDS-PAGE detection corresponding to the peak position in FIG. 6D.
PARM2 was prepared in the same way as PARM1, except that the coding sequence of PARM2 is positions 64-237 of SEQ ID No.4.
Example 2 Determination of the affinity of PARM for the HPIV3 N protein
2.1 Preparation of the HPIV3 N protein
The HPIV3 N protein was prepared as follows. Structural analysis showed that hydrophobic interactions dominate between the N protein and the P polypeptide (FIG. 3); mutation of the P polypeptide was therefore attempted in order to disrupt the N-P interaction and thereby obtain N protein monomers. Unexpectedly, the L29E or I25E mutation of the P polypeptide disrupted the hydrophobic interactions, so that N protein monomers could be obtained. The amino acid sequence of the P polypeptide carrying the L29E mutation is shown in SEQ ID No.5, and its coding sequence is positions 1060-1185 of SEQ ID No.7; the amino acid sequence of the P polypeptide carrying the I25E mutation is shown in SEQ ID No.14, and its coding sequence is shown in SEQ ID No.15.
Taking the L29E mutation as an example, a chimera of the mutated P polypeptide and the N protein was constructed: N(29-374)-TEV-P(1-42)(L29E)-His-Tag. Its amino acid sequence is shown in SEQ ID No.6, in which positions 1-346 are amino acid residues 29-374 of the N protein, positions 347-353 are the amino acid residues of the TEV protease cleavage site, positions 354-395 are amino acid residues 1-42 of the P polypeptide carrying the L29E mutation, and positions 396-403 are the His tag.
The coding sequence of N(29-374)-TEV-P(1-42)(L29E)-His-Tag is shown in SEQ ID No.7, in which positions 1-1038 are the coding sequence of the N protein (29-374), positions 1039-1059 are the coding sequence of the TEV protease cleavage site, positions 1060-1185 are the coding sequence of positions 1-42 of the P polypeptide carrying the L29E mutation, and positions 1186-1212 are the coding sequence of the His tag.
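The amino acid positions in SEQ ID No.6 and the nucleotide positions in SEQ ID No.7 are related by the codon rule: residue n occupies nucleotides 3(n-1)+1 to 3n. The sketch below, with hypothetical function and variable names, applies that rule to the segment boundaries stated above. For the His tag it yields 1186-1209, whereas the text gives 1186-1212; one possible reading, which is an assumption and not stated in the text, is that the stated range also includes a three-nucleotide stop codon.

```python
def aa_to_nt(aa_start, aa_end):
    """Map an inclusive, 1-based amino acid range to the nucleotide range of its codons."""
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("expected 1 <= aa_start <= aa_end")
    return 3 * (aa_start - 1) + 1, 3 * aa_end


# Segment boundaries of SEQ ID No.6 as stated in the text.
SEGMENTS_AA = {
    "N(29-374)": (1, 346),
    "TEV site": (347, 353),
    "P(1-42)(L29E)": (354, 395),
    "His tag": (396, 403),
}

SEGMENTS_NT = {name: aa_to_nt(*span) for name, span in SEGMENTS_AA.items()}

for name, (start, end) in SEGMENTS_NT.items():
    print(f"{name}: nt {start}-{end}")
```

The same rule reproduces the fragment coordinates used elsewhere in the description, for example residues 22-42 mapping to nucleotides 64-126.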
The N(29-374)-TEV-P(1-42)(L29E)-His-tag coding gene, whose nucleotide sequence is shown in SEQ ID No.7, was used to replace nucleotides 5074-5233 of the pET28a vector by seamless cloning, with the other nucleotide sequences of the pET28a vector kept unchanged, to obtain the recombinant expression vector of the N(29-374)-TEV-P(1-42)(L29E)-His-tag gene, named pET28a-N(29-374)-TEV-P(1-42)(L29E)-His-tag.
The constructed recombinant plasmid was transformed into competent E. coli BL21(DE3) cells for induced expression. Using the same purification method as described in 1.3, the protein was purified on a Ni affinity chromatography column, cleaved with TEV protease, and passed through the Ni affinity chromatography column again, and the flow-through was collected. After the separation of N and P had been verified by Western blot, the flow-through was further purified by gel filtration exclusion chromatography (FIG. 5). The buffer used for Ni affinity chromatography was 20 mM BTP pH 8.5, 500 mM NaCl, and the buffer used for gel filtration exclusion chromatography was 20 mM BTP pH 8.5, 300 mM NaCl. The purified protein was the N protein monomer (amino acid sequence: positions 1-346 of SEQ ID No.6) and was used for subsequent experiments.
FIG. 3 shows the interaction between HPIV3 N(29-374) and P(1-42). A in the figure shows the hydrophobic surfaces of N and P; the lighter the color, the higher the hydrophobicity. The interaction interface between N and P is shown as a cartoon, and the amino acids involved in the interaction are shown as sticks.
FIG. 5 shows the purification of the soluble monomeric HPIV3 N protein. FIG. 5A is the SDS-PAGE detection of the HPIV3 N-P mutant protein after purification on a Ni affinity chromatography column and digestion with TEV protease; the rightmost lane M is the standard molecular weight marker, 100 and 200 denote the imidazole concentrations (in mM) of the eluents from the Ni column, NO TEV denotes protein not digested with TEV, and TEV CUT denotes protein digested with TEV. FIG. 5B is the SDS-PAGE detection of the second Ni affinity column purification after TEV cleavage; BEF denotes the sample before purification, FT denotes the flow-through sample, and 200 denotes the imidazole eluent (in mM). FIG. 5C is the Western blot corresponding to the boxed region in FIG. 5B, in which the primary antibody is a murine anti-His-tag antibody. FIG. 5D shows the further purification of the protein by gel filtration exclusion chromatography (GE Superdex 200 10/300), in which the abscissa is the elution volume (in mL) and the ordinate is the UV (280 nm) absorbance of the protein (in AU); the SDS-PAGE detection of the sample at the peak position is shown on the right.
2.2 Determination of the affinity of PARM for the HPIV3 N protein
2.2.1 Determination of the affinity between the HPIV3 N protein and PARM using MST technology
MST was used to determine the affinity between the HPIV3 N protein (amino acid sequence: positions 1-346 of SEQ ID No.6) and PARM (PARM1, amino acid sequence shown in SEQ ID No.3; PARM2, amino acid sequence at positions 22-78 of SEQ ID No.3).
The HPIV3 N protein was first labeled. 100 μL of HPIV3 N protein at a concentration of 1 μM was mixed with 6 μL of the fluorescent dye NT-647 (cysteine-reactive, NanoTemper Technologies) and 94 μL of labeling buffer (20 mM HEPES pH 7.5, 0.3 M NaCl) and incubated at room temperature for 30 min in the dark. The dye-protein mixture was applied to a B column (NanoTemper Technologies) pre-equilibrated with buffer, and after the solution had completely entered the column, 300 μL of buffer was added to rinse the column. Then 600 μL of buffer was added for elution; the first two drops of eluate were discarded, and about 500 μL of eluate was collected starting from the third drop and stored in the dark. PARM was then serially diluted two-fold across 16 PCR tubes, mixed with the fluorescently labeled N protein (binding buffer: 20 mM HEPES pH 7.3, 300 mM NaCl, 1% [w/v] BSA) and incubated at room temperature for 10 min in the dark. Samples 1-16 were each drawn into standard capillaries by capillary action and placed in a Monolith NT.115 microscale thermophoresis instrument for measurement. The LED color was blue, with an intensity of 20%; the MST power was set to 20%; the measurement temperature was set to 24 °C and the time to 30 sec. The measured data were processed with MO.Affinity Analysis software. The same procedure was used to measure the affinity between the HPIV3 P1-42 polypeptide and the HPIV3 N protein.
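The titration design above (a two-fold dilution series over 16 tubes against a fixed amount of labeled N protein) can be sketched together with the standard 1:1 binding model that is commonly fitted to such data. This is a minimal illustration only: the top ligand concentration and the labeled-protein concentration used below are hypothetical values, not taken from the text, and the text does not state which model the analysis software applied.

```python
import math


def twofold_series(top_conc_nM, n_tubes=16):
    """Ligand concentrations for a two-fold serial dilution, highest first."""
    return [top_conc_nM / 2 ** i for i in range(n_tubes)]


def fraction_bound(ligand_nM, target_nM, kd_nM):
    """Fraction of labeled target in complex for 1:1 binding (quadratic solution).

    Solves (L - C)(T - C) = Kd * C for the complex concentration C and
    returns C / T, where L and T are total ligand and total target.
    """
    s = ligand_nM + target_nM + kd_nM
    complex_nM = (s - math.sqrt(s * s - 4.0 * ligand_nM * target_nM)) / 2.0
    return complex_nM / target_nM


# Hypothetical inputs for illustration: 2000 nM top ligand, 10 nM labeled target.
series = twofold_series(2000.0)
for conc in series[::5]:
    print(f"{conc:10.4f} nM ligand -> fraction bound {fraction_bound(conc, 10.0, 4.0):.3f}")
```

Sixteen two-fold steps span a factor of 2^15 in concentration, which is why such a series can cover both the unbound and the saturated plateau around a nanomolar dissociation constant.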
MST experiments showed that the affinity of PARM was almost 10-fold higher than that of the HPIV3 P polypeptide (KD: PARM1, 4.0 ± 1.3 nM; PARM2, 4.7 ± 1.4 nM; P polypeptide, 43.0 ± 3.9 nM) (FIG. 7).
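For orientation, the dissociation constants reported above can be converted into standard binding free energies with the textbook relation ΔG = RT ln(KD). The sketch below assumes a 1 M standard state and uses the 24 °C measurement temperature given in the protocol; it uses the central KD values only and ignores the reported uncertainties. The function and variable names are hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3   # gas constant in kcal/(mol*K)
TEMP_K = 24.0 + 273.15           # measurement temperature stated in the protocol


def delta_g_kcal(kd_nM, temp_K=TEMP_K):
    """Standard binding free energy (kcal/mol) from a KD given in nM (1 M standard state)."""
    return R_KCAL_PER_MOL_K * temp_K * math.log(kd_nM * 1e-9)


KD_NM = {"PARM1": 4.0, "PARM2": 4.7, "P polypeptide": 43.0}  # central values from the text

dg = {name: delta_g_kcal(kd) for name, kd in KD_NM.items()}
for name in ("PARM1", "PARM2"):
    ddg = dg[name] - dg["P polypeptide"]
    print(f"{name}: dG = {dg[name]:.2f} kcal/mol, ddG vs P = {ddg:.2f} kcal/mol")
```

On this scale, a roughly 10-fold tighter KD corresponds to a gain of a little over 1 kcal/mol in binding free energy.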
The P polypeptide of HPIV3 (HPIV3 P1-42) was prepared as follows: the coding sequence of the HPIV3 P1-42 polypeptide gene (nucleotide sequence at positions 1-126 of SEQ ID No.4) was used to replace the fragment between the BamH I and Xho I recognition sites of the pGEX-6P-1 vector (the small fragment between the BamH I and Xho I recognition sites), with the other nucleotide sequences of the pGEX-6P-1 vector kept unchanged, to obtain the recombinant expression vector of the HPIV3 P1-42 gene, named pGEX-6P-1-P.
pGEX-6P-1-P was transformed into E. coli BL21(DE3), and the P polypeptide was obtained by the same induced expression and purification procedure as used for PARM in Example 1.3.
2.2.2 Determination of the affinity between the HPIV3 N protein and three fragments of HPIV3 P using MST technology
The affinity between the HPIV3 N protein (amino acid sequence: positions 1-346 of SEQ ID No.6) and three fragments of HPIV3 P was also determined using the MST method described in 2.2.1; the three P fragments were P1-42, P1-21 and P22-42. The amino acid sequence of fragment P1-42 is positions 1-42 of SEQ ID No.3, and its coding sequence is positions 1-126 of SEQ ID No.4; the amino acid sequence of fragment P1-21 is positions 1-21 of SEQ ID No.3, and its coding sequence is positions 1-63 of SEQ ID No.4; the amino acid sequence of fragment P22-42 is positions 22-42 of SEQ ID No.3, and its coding sequence is positions 64-126 of SEQ ID No.4.
The P1-21 and P22-42 fragments of HPIV3 were prepared as follows: the nucleotide sequence of the gene encoding the HPIV3 P1-21 fragment, shown at positions 1063-1125 of SEQ ID No.2, was used to replace the fragment between the BamH I and Xho I recognition sites of the pGEX-6P-1 vector (the small fragment between the BamH I and Xho I recognition sites), with the other nucleotide sequences of the pGEX-6P-1 vector kept unchanged, to obtain the recombinant expression vector of the HPIV3 P1-21 fragment gene, named pGEX-6P-1-P1-21.
pGEX-6P-1-P1-21 was transformed into E. coli BL21(DE3), and the P1-21 polypeptide was obtained by the same induced expression and purification procedure as used for PARM in Example 1.3.
The nucleotide sequence of the gene encoding the HPIV3 P22-42 fragment, shown at positions 1126-1188 of SEQ ID No.2, was used to replace the fragment between the BamH I and Xho I recognition sites of the pGEX-6P-1 vector (the small fragment between the BamH I and Xho I recognition sites), with the other nucleotide sequences of the pGEX-6P-1 vector kept unchanged, to obtain the recombinant expression vector of the HPIV3 P22-42 fragment gene, named pGEX-6P-1-P22-42.
pGEX-6P-1-P22-42 was transformed into E. coli BL21(DE3), and the P22-42 polypeptide was obtained by the same induced expression and purification procedure as used for PARM in Example 1.3.
The affinity assay results are shown in FIG. 8. The affinity of P22-42 differed little from that of P1-42, with KD values of about 42 nM, whereas the binding of P1-21 to N was markedly weaker, by three orders of magnitude. The P22-42 fragment is therefore the main region of P1-42 that interacts with the N protein.
2.2.3 Determination of the affinity between the HPIV3 N protein and the P polypeptide of the reference document using MST technology
The affinity between the HPIV3 N protein (amino acid sequence: positions 1-346 of SEQ ID No.6) and the P polypeptide of the reference document (CN 113072624; amino acid sequence MESDAKNYQI MDSWEEESRD KSTNISSALN IIEFILSTDP) was determined using the MST method described in 2.2.1. The results are shown in FIG. 9: the KD for the HPIV3 N protein and the P polypeptide of the reference document (CN 113072624) was 50.2 ± 10.3 nM, an affinity about 10-fold lower than that between the HPIV3 N protein and PARM in 2.2.1.
Example 3 Targeted inhibition by PARM
Further, we assessed the effect of PARM on the binding of N protein to RNA. MBP-N(1-403) and GST-P(1-42) (GST-tagged P1-42 protein), or MBP-N(1-403) and GST-PARM1, at 0.1, 0.5, 1.0 and 2.0 μM, were each incubated overnight with 0.1 μM ssRNA (sequence FAM-AAAAAAAAAAAAAAAAAAAAAAAA AAAAAA, 30 nt, with a FAM label at the 5' end for detection) in a 20 μL reaction system (20 mM HEPES pH 7.5, 100 mM NaCl, 2 mM EDTA, 6% glycerol), and 4 μL was taken for electrophoresis on a 7% non-denaturing polyacrylamide gel (80:1). The running buffer was 0.5× TBE (45 mM Tris pH 8.0, 45 mM boric acid, 1 mM EDTA). The FAM-ssRNA signal was detected with 488 nm excitation. In sharp contrast to GST-P(1-42), the RNA-binding capacity of MBP-N(1-403) was significantly reduced in the presence of GST-PARM1 (FIG. 10). An appropriate affinity between N and P is important for assembly of the viral nucleocapsid: on the one hand, N needs to interact with P to maintain its open conformation and to be carried by P to the viral replication site; on the other hand, when N binds viral RNA at the replication site to form the nucleocapsid, P needs to leave N at the right time under the combined action of the adjacent N protein and RNA. PARM1, however, binds N much more tightly and inhibits the binding of N to RNA; that is, PARM1 hijacks N. Given that the sequences used to design PARM1 are highly conserved within the respiratory virus genus, and given the important role of N in viral RNA transcription and replication, PARM1, with its nanomolar affinity for N, is useful as a polypeptide inhibitor against respiratory viruses that specifically targets N.
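The EMSA titration above fixes the ssRNA at 0.1 μM and varies the protein from 0.1 to 2.0 μM in a 20 μL reaction, of which 4 μL is loaded. The sketch below computes the resulting protein-to-RNA molar ratios and the amount of RNA per reaction and per lane. It assumes that the stated protein concentrations refer to the final concentration of the N-containing complex in the reaction, which the text does not spell out; the names are hypothetical.

```python
def pmol(conc_uM, volume_uL):
    """Amount in pmol: concentration in micromolar times volume in microliters."""
    return conc_uM * volume_uL


PROTEIN_UM = (0.1, 0.5, 1.0, 2.0)  # titration points from the text
RNA_UM = 0.1                       # FAM-ssRNA concentration from the text
REACTION_UL = 20.0
LOADED_UL = 4.0

ratios = [p / RNA_UM for p in PROTEIN_UM]
rna_per_reaction = pmol(RNA_UM, REACTION_UL)
rna_per_lane = pmol(RNA_UM, LOADED_UL)

for p, r in zip(PROTEIN_UM, ratios):
    print(f"{p:.1f} uM protein : {RNA_UM:.1f} uM RNA -> molar ratio {r:.0f}:1")
print(f"RNA per reaction: {rna_per_reaction:.1f} pmol; RNA per lane: {rna_per_lane:.1f} pmol")
```

Since the RNA amount per lane is constant across the titration, differences in the free and shifted RNA bands between lanes reflect the protein concentration rather than loading.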
FIG. 10 shows the measurement of the targeted inhibition by PARM1 and PARM2. EMSA shows that GST-PARM1 (labeled PARM1 in FIG. 10) affects the binding of N protein to RNA more strongly than GST-P(1-42) (labeled P in FIG. 10). PARM2 (labeled PARM2 in FIG. 10) behaves similarly to PARM1, apart from the exposure time.
The amino acid sequence of MBP-N(1-403) is shown in SEQ ID No.8, in which amino acid residues 1-405 are the MBP tag and amino acid residues 406-808 are N1-403; the coding sequence of MBP-N(1-403) is shown in SEQ ID No.9, in which positions 1-1215 are the coding sequence of the MBP tag and positions 1216-2427 are the coding sequence of N1-403.
The amino acid sequence of GST-P1-42 is shown in SEQ ID No.10, in which amino acid residues 1-228 are the GST tag and amino acid residues 229-270 are P1-42; the coding sequence of GST-P1-42 is shown in SEQ ID No.11, in which positions 1-684 are the coding sequence of the GST tag and positions 685-810 are the coding sequence of P1-42.
The amino acid sequence of GST-PARM1 is shown in SEQ ID No.12, in which amino acid residues 1-228 are the GST tag and amino acid residues 229-306 are PARM1; the coding sequence of GST-PARM1 is shown in SEQ ID No.13, in which positions 1-684 are the coding sequence of the GST tag and positions 685-918 are the coding sequence of PARM1.
The nucleotide sequence of MBP-N(1-403) shown in SEQ ID No.9 was ligated into the pET28a vector by seamless cloning (the kit was purchased from Zhongmeitai and company and is numbered C5891) to obtain the recombinant expression vector pET28a-MBP-N(1-403), in which the fragment between positions 5110 and 5207 of pET28a is replaced by the nucleotide sequence shown in SEQ ID No.9 and the other nucleotide sequences of the pET28a vector are kept unchanged.
The coding gene of GST-PARM1, whose nucleotide sequence is shown in SEQ ID No.13, was used to replace the fragment between BamH I and Xho I (the small fragment between BamH I and Xho I) of the pGEX-6P-1 vector, with the other nucleotide sequences of the pGEX-6P-1 vector kept unchanged, to obtain the recombinant expression vector of the GST-PARM1 gene, named pGEX-6P-1-GST-PARM1.
The constructed recombinant expression vectors pET28a-MBP-N(1-403) and pGEX-6P-1-GST-PARM1 were co-transformed into competent BL21(DE3) cells for expression and purification. Induced expression and purification were carried out as in Example 1.3, to obtain the complex of MBP-N(1-403) and GST-PARM1.
The complex of MBP-N(1-403) and GST-PARM2 was prepared in the same way as the complex of MBP-N(1-403) and GST-PARM1, except that the PARM portion of GST-PARM2 is encoded by the coding sequence of PARM2.
The coding gene of GST-P1-42, whose nucleotide sequence is shown in SEQ ID No.11, was used to replace the fragment between BamH I and Xho I (the small fragment between BamH I and Xho I) of the pGEX-6P-1 vector, to obtain the recombinant expression vector of the GST-P1-42 gene, named pGEX-6P-1-GST-P1-42.
The constructed recombinant expression vectors pET28a-MBP-N(1-403) and pGEX-6P-1-GST-P1-42 were co-transformed into competent BL21(DE3) cells for expression and purification. Induced expression and purification were carried out as in Example 1.3, to obtain the complex of MBP-N(1-403) and GST-P1-42.
The culture after induced expression was centrifuged at 4000 rpm for 10 min, and the bacterial cells were collected for further purification. The collected cells were resuspended in buffer (20 mM Tris pH 7.5, 500 mM NaCl, 2 mM EDTA, 1 mM of the protease inhibitor phenylmethylsulfonyl fluoride (PMSF), 0.1% (v/v) Triton X-100) and lysed in an ice-salt bath using an ultrasonic disrupter with the following program: 2 sec on, 4 sec off, total time 12-16 min. The disrupted cell suspension was centrifuged at 20,000 × g for 50 min at 4 °C, and the supernatant was filtered through a 0.45 μm filter. In a refrigerator at 4 °C, the protein supernatant was loaded onto a gravity column packed with amylose affinity chromatography resin (NEB; column volume 2 mL), and the flow-through was collected in a clean beaker. This process was repeated 2 times. The protein was eluted with buffer (20 mM Tris pH 7.5, 300 mM NaCl) containing 10 mM maltose, and the eluate was loaded onto a gravity column packed with GST affinity chromatography resin (Glutathione Sepharose Fast Flow; column volume 2 mL), with the flow-through collected in a clean beaker. This was repeated so that the protein solution passed through the GST affinity chromatography column 2-3 times. The protein was eluted with buffer (20 mM Tris pH 7.5, 300 mM NaCl) containing 10 mM reduced glutathione, and the salt concentration was then reduced to 20 mM or less by repeated concentration in an ultrafiltration concentrator tube and addition of salt-free buffer, in preparation for anion exchange chromatography (GE Mono Q).
Ion-exchange chromatography was performed at 4 °C on a Bio-Rad FPLC instrument using a Mono Q™ 5/50 GL column (GE). The column was washed with 5 column volumes of high-salt buffer (20 mM Tris-HCl pH 8.0, 1 M NaCl) and equilibrated with 10 column volumes of salt-free buffer (20 mM Tris-HCl pH 8.0). The protein sample was concentrated to 1 mL and loaded, and the preset program was run: flow rate 1 mL/min; during elution, the proportion (by volume) of high-salt buffer in the mixed buffer was increased linearly. Protein samples at the peak positions were collected for subsequent gel filtration exclusion chromatography.
Gel filtration exclusion chromatography was performed at 4 °C on a Bio-Rad FPLC instrument using a Superdex 200 10/300 GL column (GE). The column was equilibrated with buffer (20 mM Tris-HCl pH 7.5, 150 mM NaCl) at a flow rate of 0.3 mL/min for at least 1 column volume (24 mL), until the detection baselines (OD 280 nm and OD 260 nm) were zeroed and flat. The target protein was concentrated to 1 mL and centrifuged at 12,000 × g for 10 min at 4 °C to remove bubbles and precipitates from the solution. The run program was set as follows: flow rate 0.3 mL/min; automatic sample loading with a loading volume of 2 mL; automatic collection from 8 mL to 24 mL, with 0.5 mL collected per tube. The protein sample was loaded into the 1 mL sample loop of the FPLC and the program was started. Protein complex samples at the peak positions were collected for subsequent experiments.
Example 4 Determination of the affinity of HPIV1-PARM for the HPIV1 N protein
For human parainfluenza virus type 1 (HPIV1), the affinity between HPIV1-PARM and the HPIV1 N protein and the affinity between HPIV1-P and the HPIV1 N protein were each determined using MST technology.
HPIV1-PARM comprises HPIV1-PARM1 and HPIV1-PARM2. The amino acid sequence of HPIV1-PARM1 is shown in SEQ ID No.16, in which positions 1-42 are amino acid residues 1-42 of the HPIV1 P polypeptide, positions 43-50 are the amino acid residues of the linker peptide (GS)4, and positions 51-78 are amino acid residues 376-403 of the HPIV1 N protein. The coding sequence of HPIV1-PARM1 is shown in SEQ ID No.17, in which positions 1-126 encode amino acids 1-42 of the HPIV1 P polypeptide, positions 127-150 encode the linker peptide (GS)4, and positions 151-237 encode amino acids 376-403 of the HPIV1 N protein.
The amino acid sequence of HPIV1-PARM2 is positions 22-78 of SEQ ID No.16, in which positions 22-42 are amino acid residues 22-42 of the HPIV1 P polypeptide, positions 43-50 are the amino acid residues of the linker peptide (GS)4, and positions 51-78 are amino acid residues 376-403 of the HPIV1 N protein. The coding sequence of HPIV1-PARM2 is positions 64-237 of SEQ ID No.17, in which positions 64-126 encode amino acids 22-42 of the HPIV1 P polypeptide, positions 127-150 encode the linker peptide (GS)4, and positions 151-237 encode amino acids 376-403 of the HPIV1 N protein.
The coding genes of N and P of HPIV1 have GenBank accession numbers D01070.1 (2007-12-7) and M74081.1 (2008-10-17), respectively.
The amino acid sequences of N and P of HPIV1 were obtained by translation of the above coding genes.
The HPIV1 N protein was prepared with reference to the preparation method of the HPIV3 N protein in Example 2.1.
HPIV1-PARM was expressed and purified with reference to Example 1.3, and its affinity was measured with reference to Example 2.2.1.
The results are shown in FIG. 11: the KD values for the binding of HPIV1-PARM1, HPIV1-PARM2 and HPIV1-P to the HPIV1 N protein were 4.0 ± 1.1 nM, 3.6 ± 1.2 nM and 49.0 ± 3.6 nM, respectively.
Example 5 Determination of the affinity of SeV-PARM for the SeV N protein
For Sendai virus (SeV), the affinity between SeV-PARM and the SeV N protein and the affinity between SeV-P and the SeV N protein were each determined using MST technology.
SeV-PARM comprises SeV-PARM1 and SeV-PARM2. The amino acid sequence of SeV-PARM1 is shown in SEQ ID No.18, in which positions 1-42 are amino acid residues 1-42 of the SeV P polypeptide, positions 43-50 are the amino acid residues of the linker peptide (GS)4, and positions 51-78 are amino acid residues 376-403 of the SeV N protein. The coding sequence of SeV-PARM1 is shown in SEQ ID No.19, in which positions 1-126 encode amino acids 1-42 of the SeV P polypeptide, positions 127-150 encode the linker peptide (GS)4, and positions 151-237 encode amino acids 376-403 of the SeV N protein.
The amino acid sequence of SeV-PARM2 is positions 22-78 of SEQ ID No.18, in which positions 22-42 are amino acid residues 22-42 of the SeV P polypeptide, positions 43-50 are the amino acid residues of the linker peptide (GS)4, and positions 51-78 are amino acid residues 376-403 of the SeV N protein. The coding sequence of SeV-PARM2 is positions 64-237 of SEQ ID No.19, in which positions 64-126 encode amino acids 22-42 of the SeV P polypeptide, positions 127-150 encode the linker peptide (GS)4, and positions 151-237 encode amino acids 376-403 of the SeV N protein.
SeV-PARM was expressed and purified with reference to Example 1.3, and its affinity was measured with reference to Example 2.2.1.
The results are shown in FIG. 12: the KD values for the binding of SeV-PARM1, SeV-PARM2 and SeV-P to the SeV N protein were 3.7 ± 1.1 nM, 4.3 ± 1.3 nM and 42.0 ± 3.1 nM, respectively. MST experiments showed that the affinity of SeV-PARM was nearly 10-fold higher than that of SeV-P.
The coding genes of N and P of SeV have GenBank accession numbers X17218.1 (1993-5-11) and X17008.1 (2005-4-18), respectively.
The amino acid sequences of N and P of SeV were obtained by translation of the above coding genes.
The SeV N protein was prepared with reference to the preparation method of the HPIV3 N protein in Example 2.1.
Example 6 Determination of the affinity of PRV1-PARM for the PRV1 N protein
For porcine parainfluenza virus PRV1, the affinity between PRV1-PARM and the PRV1 N protein and the affinity between PRV1-P and the PRV1 N protein were each determined using MST technology.
PRV1-PARM comprises PRV1-PARM1 and PRV1-PARM2. The amino acid sequence of PRV1-PARM1 is shown in SEQ ID No.20, in which positions 1-42 are amino acid residues 1-42 of the PRV1 P polypeptide, positions 43-50 are the amino acid residues of the linker peptide (GS)4, and positions 51-78 are amino acid residues 376-403 of the PRV1 N protein. The coding sequence of PRV1-PARM1 is shown in SEQ ID No.21, in which positions 1-126 encode amino acids 1-42 of the PRV1 P polypeptide, positions 127-150 encode the linker peptide (GS)4, and positions 151-237 encode amino acids 376-403 of the PRV1 N protein.
The amino acid sequence of PRV1-PARM2 is positions 22-78 of SEQ ID No.20, in which positions 22-42 are amino acid residues 22-42 of the PRV1 P polypeptide, positions 43-50 are the amino acid residues of the linker peptide (GS)4, and positions 51-78 are amino acid residues 376-403 of the PRV1 N protein. The coding sequence of PRV1-PARM2 is positions 64-237 of SEQ ID No.21, in which positions 64-126 encode amino acids 22-42 of the PRV1 P polypeptide, positions 127-150 encode the linker peptide (GS)4, and positions 151-237 encode amino acids 376-403 of the PRV1 N protein.
PRV1-PARM was expressed and purified with reference to Example 1.3, and its affinity was measured with reference to Example 2.2.1.
The results are shown in FIG. 13: the KD values for the binding of PRV1-PARM1 and PRV1-PARM2 to the PRV1 N protein were 5.1 ± 1.6 nM and 4.8 ± 1.3 nM, respectively; the KD value for the binding of PRV1-P to the PRV1 N protein was 59.0 ± 5.6 nM. MST experiments showed that the affinity of PRV1-PARM was nearly 10-fold higher than that of PRV1-P.
The coding genes of N and P of PRV1 are nucleotides 120-1700 and 1850-3574, respectively, of GenBank accession number JX857410.1 (2013-9-26).
The amino acid sequences of N and P of PRV1 were obtained by translation of the above coding genes.
The PRV1 N protein was prepared with reference to the preparation method of the HPIV3 N protein in Example 2.1.
Example 7 Determination of the affinity of BPIV3-PARM for the BPIV3 N protein
For bovine parainfluenza virus BPIV3, the affinity between BPIV3-PARM and the BPIV3 N protein and the affinity between BPIV3-P and the BPIV3 N protein were each determined using MST technology.
BPIV3-PARM comprises BPIV3-PARM1 and BPIV3-PARM2. The amino acid sequence of BPIV3-PARM1 is shown in SEQ ID No.22, in which positions 1-42 are amino acid residues 1-42 of the BPIV3 P polypeptide, positions 43-50 are the amino acid residues of the linker peptide (GS)4, and positions 51-78 are amino acid residues 376-403 of the BPIV3 N protein. The coding sequence of BPIV3-PARM1 is shown in SEQ ID No.23, in which positions 1-126 encode amino acids 1-42 of the BPIV3 P polypeptide, positions 127-150 encode the linker peptide (GS)4, and positions 151-237 encode amino acids 376-403 of the BPIV3 N protein.
The amino acid sequence of BPIV3-PARM2 is positions 22-78 of SEQ ID No.22, in which positions 22-42 are amino acid residues 22-42 of the BPIV3 P polypeptide, positions 43-50 are the amino acid residues of the linker peptide (GS)4, and positions 51-78 are amino acid residues 376-403 of the BPIV3 N protein. The coding sequence of BPIV3-PARM2 is positions 64-237 of SEQ ID No.23, in which positions 64-126 encode amino acids 22-42 of the BPIV3 P polypeptide, positions 127-150 encode the linker peptide (GS)4, and positions 151-237 encode amino acids 376-403 of the BPIV3 N protein.
BPIV3-PARM was expressed and purified with reference to Example 1.3, and its affinity was measured with reference to Example 2.2.1.
The results are shown in FIG. 14: the KD values for the binding of BPIV3-PARM1 and BPIV3-PARM2 to the BPIV3 N protein were 4.1 ± 1.3 nM and 4.6 ± 0.9 nM, respectively; the KD value for the binding of BPIV3-P to the BPIV3 N protein was 57.0 ± 6.6 nM. MST experiments showed that the affinity of BPIV3-PARM was nearly 10-fold higher than that of BPIV3-P.
The coding genes of N and P of BPIV3 are nucleotides 111-1658 and 1784-3574, respectively, of GenBank accession number Y00114.1 (2005-6-22).
The amino acid sequences of N and P of BPIV3 were obtained by translation of the above coding genes.
The BPIV3 N protein was prepared with reference to the preparation method of the HPIV3 N protein in Example 2.1.
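The KD values reported in Examples 2 and 4 to 7 can be gathered to compare the affinity gain of PARM over the corresponding P polypeptide across the five viruses. The sketch below is only a tabulation of the central values stated in the text; it ignores the reported uncertainties, and the variable and function names are hypothetical.

```python
# Central KD values in nM, copied from Examples 2 and 4-7.
KD_NM = {
    "HPIV3": {"PARM1": 4.0, "PARM2": 4.7, "P": 43.0},
    "HPIV1": {"PARM1": 4.0, "PARM2": 3.6, "P": 49.0},
    "SeV":   {"PARM1": 3.7, "PARM2": 4.3, "P": 42.0},
    "PRV1":  {"PARM1": 5.1, "PARM2": 4.8, "P": 59.0},
    "BPIV3": {"PARM1": 4.1, "PARM2": 4.6, "P": 57.0},
}


def fold_table(kd=KD_NM):
    """Ratio KD(P) / KD(PARM) per virus; larger values mean tighter PARM binding."""
    return {
        virus: {name: vals["P"] / vals[name] for name in ("PARM1", "PARM2")}
        for virus, vals in kd.items()
    }


FOLDS = fold_table()
for virus, folds in FOLDS.items():
    print(f"{virus:6s} PARM1 x{folds['PARM1']:.1f}  PARM2 x{folds['PARM2']:.1f}")
```

On the central values alone, every ratio falls between roughly 9 and 14, in line with the approximately 10-fold improvement stated in the examples.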
The present invention is described in detail above. It will be apparent to those skilled in the art that the present invention can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with respect to specific embodiments, it will be appreciated that the invention may be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. The application of some of the basic features may be done in accordance with the scope of the claims that follow.
Sequence listing
<110> China Agricultural University
<120> polypeptide inhibitors targeting respiratory viruses
<160> 23
<170> SIPOSequenceListing 1.0
<210> 1
<211> 396
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 1
Lys Asn Thr Val Ser Ile Phe Ala Leu Gly Pro Thr Ile Thr Asp Asp
1 5 10 15
Asp Glu Lys Met Thr Leu Ala Leu Leu Phe Leu Ser His Ser Leu Asp
20 25 30
Asn Glu Lys Gln His Ala Gln Arg Ala Gly Phe Leu Val Ser Leu Leu
35 40 45
Ser Met Ala Tyr Ala Asn Pro Glu Leu Tyr Leu Thr Thr Asn Gly Ser
50 55 60
Asn Ala Asp Val Lys Tyr Val Ile Tyr Met Ile Glu Lys Asp Leu Lys
65 70 75 80
Arg Gln Lys Tyr Gly Gly Phe Val Val Lys Thr Arg Glu Met Ile Tyr
85 90 95
Glu Lys Thr Thr Glu Trp Ile Phe Gly Ser Asp Leu Asp Tyr Asp Gln
100 105 110
Glu Thr Met Leu Gln Asn Gly Arg Asn Asn Ser Thr Ile Glu Asp Leu
115 120 125
Val His Thr Phe Gly Tyr Pro Ser Cys Leu Gly Ala Leu Ile Ile Gln
130 135 140
Ile Trp Ile Val Leu Val Lys Ala Ile Thr Ser Ile Ser Gly Leu Arg
145 150 155 160
Lys Gly Phe Phe Thr Arg Leu Glu Ala Phe Arg Gln Asp Gly Thr Val
165 170 175
Gln Ala Gly Leu Val Leu Ser Gly Asp Thr Val Asp Gln Ile Gly Ser
180 185 190
Ile Met Arg Ser Gln Gln Ser Leu Val Thr Leu Met Val Glu Thr Leu
195 200 205
Ile Thr Met Asn Thr Ser Arg Asn Asp Leu Thr Thr Ile Glu Lys Asn
210 215 220
Ile Gln Ile Val Gly Asn Tyr Ile Arg Asp Ala Gly Leu Ala Ser Phe
225 230 235 240
Phe Asn Thr Ile Arg Tyr Gly Ile Glu Thr Arg Met Ala Ala Leu Thr
245 250 255
Leu Ser Thr Leu Arg Pro Asp Ile Asn Arg Leu Lys Ala Leu Met Glu
260 265 270
Leu Tyr Leu Ser Lys Gly Pro Arg Ala Pro Phe Ile Cys Ile Leu Arg
275 280 285
Asp Pro Ile His Gly Glu Phe Ala Pro Gly Asn Tyr Pro Ala Ile Trp
290 295 300
Ser Tyr Ala Met Gly Val Ala Val Val Gln Asn Arg Ala Met Gln Gln
305 310 315 320
Tyr Val Thr Gly Arg Ser Tyr Leu Asp Ile Asp Met Phe Gln Leu Gly
325 330 335
Gln Ala Val Ala Arg Asp Ala Glu Ala Gln Gly Ser Gly Ser Gly Ser
340 345 350
Gly Ser Met Glu Ser Asp Ala Lys Asn Tyr Gln Ile Met Asp Ser Trp
355 360 365
Glu Glu Glu Pro Arg Asp Lys Ser Thr Asn Ile Ser Ser Ala Leu Asn
370 375 380
Ile Ile Glu Phe Ile Leu Ser Thr Asp Pro Gln Glu
385 390 395
<210> 2
<211> 1188
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 2
aaaaatactg tctccatatt tgcccttgga ccgacaataa ctgatgacga tgagaaaatg 60
acattagctc ttctatttct atctcattca ctagataatg agaaacaaca tgcacaaagg 120
gcagggttct tggtgtcttt attgtcaatg gcttatgcca atccagagct ttacctgaca 180
acaaatggaa gtaatgcaga tgttaaatat gtcatatata tgattgagaa agatctaaaa 240
cggcaaaagt atggaggatt tgtggttaag acgagagaga tgatatatga aaagacaact 300
gagtggatat ttggaagtga cctggattat gaccaggaaa ctatgctgca gaacggcaga 360
aacaattcaa cgattgaaga tcttgttcac acatttgggt atccatcatg tttaggagct 420
cttataatac agatctggat agttttggtc aaagccatca ctagcatctc agggttaaga 480
aaaggctttt tcactcgatt agaggctttc agacaagatg gaacagtgca agcagggctg 540
gtattgagcg gtgacacagt ggatcagatt gggtcaatca tgcggtctca acagagcttg 600
gtaactctta tggttgagac attaataaca atgaatacta gcagaaatga cctcacaacc 660
atagaaaaga atatacaaat tgttggtaac tacataagag atgcaggtct tgcttcattc 720
ttcaatacaa tcaggtatgg aattgagact agaatggcag ctttgactct atctactctc 780
agaccagata tcaatagatt aaaagctctg atggaattgt atttatcaaa gggaccacgc 840
gctcctttta tctgtatcct cagagatcct atacatggtg agttcgcacc aggcaactat 900
cctgccatat ggagttatgc aatgggggtg gcagttgtac aaaacagagc catgcaacag 960
tatgtgacgg gaagatcata tctagatatt gatatgttcc agctgggaca agcagtagca 1020
cgtgatgctg aagctcaggg tagcggtagc ggtagcggta gcatggaaag cgatgctaaa 1080
aactatcaaa tcatggattc ttgggaagag gaaccaagag ataaatcaac taatatctcc 1140
tcggccctca acatcattga attcatactc agcaccgacc cccaagaa 1188
<210> 3
<211> 78
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 3
Met Glu Ser Asp Ala Lys Asn Tyr Gln Ile Met Asp Ser Trp Glu Glu
1 5 10 15
Glu Pro Arg Asp Lys Ser Thr Asn Ile Ser Ser Ala Leu Asn Ile Ile
20 25 30
Glu Phe Ile Leu Ser Thr Asp Pro Gln Glu Gly Ser Gly Ser Gly Ser
35 40 45
Gly Ser Met Ser Ser Thr Leu Glu Asp Glu Leu Gly Val Thr His Glu
50 55 60
Ala Lys Glu Ser Leu Lys Arg His Ile Arg Asn Ile Asn Ser
65 70 75
<210> 4
<211> 237
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 4
atggaaagcg atgctaaaaa ctatcaaatc atggattctt gggaagagga accaagagat 60
aaatcaacta atatctcctc ggccctcaac atcattgaat tcatactcag caccgacccc 120
caagaaggta gcggtagcgg tagcggtagc atgagctcaa cactggaaga tgaacttgga 180
gtgacacacg aagccaaaga aagcttgaaa agacatataa ggaacataaa cagttaa 237
<210> 5
<211> 42
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 5
Met Glu Ser Asp Ala Lys Asn Tyr Gln Ile Met Asp Ser Trp Glu Glu
1 5 10 15
Glu Pro Arg Asp Lys Ser Thr Asn Ile Ser Ser Ala Glu Asn Ile Ile
20 25 30
Glu Phe Ile Leu Ser Thr Asp Pro Gln Glu
35 40
<210> 6
<211> 403
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 6
Lys Asn Thr Val Ser Ile Phe Ala Leu Gly Pro Thr Ile Thr Asp Asp
1 5 10 15
Asp Glu Lys Met Thr Leu Ala Leu Leu Phe Leu Ser His Ser Leu Asp
20 25 30
Asn Glu Lys Gln His Ala Gln Arg Ala Gly Phe Leu Val Ser Leu Leu
35 40 45
Ser Met Ala Tyr Ala Asn Pro Glu Leu Tyr Leu Thr Thr Asn Gly Ser
50 55 60
Asn Ala Asp Val Lys Tyr Val Ile Tyr Met Ile Glu Lys Asp Leu Lys
65 70 75 80
Arg Gln Lys Tyr Gly Gly Phe Val Val Lys Thr Arg Glu Met Ile Tyr
85 90 95
Glu Lys Thr Thr Glu Trp Ile Phe Gly Ser Asp Leu Asp Tyr Asp Gln
100 105 110
Glu Thr Met Leu Gln Asn Gly Arg Asn Asn Ser Thr Ile Glu Asp Leu
115 120 125
Val His Thr Phe Gly Tyr Pro Ser Cys Leu Gly Ala Leu Ile Ile Gln
130 135 140
Ile Trp Ile Val Leu Val Lys Ala Ile Thr Ser Ile Ser Gly Leu Arg
145 150 155 160
Lys Gly Phe Phe Thr Arg Leu Glu Ala Phe Arg Gln Asp Gly Thr Val
165 170 175
Gln Ala Gly Leu Val Leu Ser Gly Asp Thr Val Asp Gln Ile Gly Ser
180 185 190
Ile Met Arg Ser Gln Gln Ser Leu Val Thr Leu Met Val Glu Thr Leu
195 200 205
Ile Thr Met Asn Thr Ser Arg Asn Asp Leu Thr Thr Ile Glu Lys Asn
210 215 220
Ile Gln Ile Val Gly Asn Tyr Ile Arg Asp Ala Gly Leu Ala Ser Phe
225 230 235 240
Phe Asn Thr Ile Arg Tyr Gly Ile Glu Thr Arg Met Ala Ala Leu Thr
245 250 255
Leu Ser Thr Leu Arg Pro Asp Ile Asn Arg Leu Lys Ala Leu Met Glu
260 265 270
Leu Tyr Leu Ser Lys Gly Pro Arg Ala Pro Phe Ile Cys Ile Leu Arg
275 280 285
Asp Pro Ile His Gly Glu Phe Ala Pro Gly Asn Tyr Pro Ala Ile Trp
290 295 300
Ser Tyr Ala Met Gly Val Ala Val Val Gln Asn Arg Ala Met Gln Gln
305 310 315 320
Tyr Val Thr Gly Arg Ser Tyr Leu Asp Ile Asp Met Phe Gln Leu Gly
325 330 335
Gln Ala Val Ala Arg Asp Ala Glu Ala Gln Glu Asn Leu Tyr Phe Gln
340 345 350
Gly Met Glu Ser Asp Ala Lys Asn Tyr Gln Ile Met Asp Ser Trp Glu
355 360 365
Glu Glu Pro Arg Asp Lys Ser Thr Asn Ile Ser Ser Ala Glu Asn Ile
370 375 380
Ile Glu Phe Ile Leu Ser Thr Asp Pro Gln Glu Leu Glu His His His
385 390 395 400
His His His
<210> 7
<211> 1212
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 7
aaaaatactg tctccatatt tgcccttgga ccgacaataa ctgatgacga tgagaaaatg 60
acattagctc ttctatttct atctcattca ctagataatg agaaacaaca tgcacaaagg 120
gcagggttct tggtgtcttt attgtcaatg gcttatgcca atccagagct ttacctgaca 180
acaaatggaa gtaatgcaga tgttaaatat gtcatatata tgattgagaa agatctaaaa 240
cggcaaaagt atggaggatt tgtggttaag acgagagaga tgatatatga aaagacaact 300
gagtggatat ttggaagtga cctggattat gaccaggaaa ctatgctgca gaacggcaga 360
aacaattcaa cgattgaaga tcttgttcac acatttgggt atccatcatg tttaggagct 420
cttataatac agatctggat agttttggtc aaagccatca ctagcatctc agggttaaga 480
aaaggctttt tcactcgatt agaggctttc agacaagatg gaacagtgca agcagggctg 540
gtattgagcg gtgacacagt ggatcagatt gggtcaatca tgcggtctca acagagcttg 600
gtaactctta tggttgagac attaataaca atgaatacta gcagaaatga cctcacaacc 660
atagaaaaga atatacaaat tgttggtaac tacataagag atgcaggtct tgcttcattc 720
ttcaatacaa tcaggtatgg aattgagact agaatggcag ctttgactct atctactctc 780
agaccagata tcaatagatt aaaagctctg atggaattgt atttatcaaa gggaccacgc 840
gctcctttta tctgtatcct cagagatcct atacatggtg agttcgcacc aggcaactat 900
cctgccatat ggagttatgc aatgggggtg gcagttgtac aaaacagagc catgcaacag 960
tatgtgacgg gaagatcata tctagatatt gatatgttcc agctgggaca agcagtagca 1020
cgtgatgctg aagctcagga aaacctgtat tttcagggaa tggaaagcga tgctaaaaac 1080
tatcaaatca tggattcttg ggaagaggaa ccaagagata aatcaactaa tatctcctcg 1140
gccgaaaaca tcattgaatt catactcagc accgaccccc aagaactcga gcaccaccac 1200
caccaccact ga 1212
<210> 8
<211> 808
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 8
Met Gly His His His His His His His Glu Glu Gly Lys Leu Val Ile
1 5 10 15
Trp Ile Asn Gly Asp Lys Gly Tyr Asn Gly Leu Ala Glu Val Gly Lys
20 25 30
Lys Phe Glu Lys Asp Thr Gly Ile Lys Val Thr Val Glu His Pro Asp
35 40 45
Lys Leu Glu Glu Lys Phe Pro Gln Val Ala Ala Thr Gly Asp Gly Pro
50 55 60
Asp Ile Ile Phe Trp Ala His Asp Arg Phe Gly Gly Tyr Ala Gln Ser
65 70 75 80
Gly Leu Leu Ala Glu Ile Thr Pro Asp Lys Ala Phe Gln Asp Lys Leu
85 90 95
Tyr Pro Phe Thr Trp Asp Ala Val Arg Tyr Asn Gly Lys Leu Ile Ala
100 105 110
Tyr Pro Ile Ala Val Glu Ala Leu Ser Leu Ile Tyr Asn Lys Asp Leu
115 120 125
Leu Pro Asn Pro Pro Lys Thr Trp Glu Glu Ile Pro Ala Leu Asp Lys
130 135 140
Glu Leu Lys Ala Lys Gly Lys Ser Ala Leu Met Phe Asn Leu Gln Glu
145 150 155 160
Pro Tyr Phe Thr Trp Pro Leu Ile Ala Ala Asp Gly Gly Tyr Ala Phe
165 170 175
Lys Tyr Glu Asn Gly Lys Tyr Asp Ile Lys Asp Val Gly Val Asp Asn
180 185 190
Ala Gly Ala Lys Ala Gly Leu Thr Phe Leu Val Asp Leu Ile Lys Asn
195 200 205
Lys His Met Asn Ala Asp Thr Asp Tyr Ser Ile Ala Glu Ala Ala Phe
210 215 220
Asn Lys Gly Glu Thr Ala Met Thr Ile Asn Gly Pro Trp Ala Trp Ser
225 230 235 240
Asn Ile Asp Thr Ser Lys Val Asn Tyr Gly Val Thr Val Leu Pro Thr
245 250 255
Phe Lys Gly Gln Pro Ser Lys Pro Phe Val Gly Val Leu Ser Ala Gly
260 265 270
Ile Asn Ala Ala Ser Pro Asn Lys Glu Leu Ala Lys Glu Phe Leu Glu
275 280 285
Asn Tyr Leu Leu Thr Asp Glu Gly Leu Glu Ala Val Asn Lys Asp Lys
290 295 300
Pro Leu Gly Ala Val Ala Leu Lys Ser Tyr Glu Glu Glu Leu Ala Lys
305 310 315 320
Asp Pro Arg Ile Ala Ala Thr Met Glu Asn Ala Gln Lys Gly Glu Ile
325 330 335
Met Pro Asn Ile Pro Gln Met Ser Ala Phe Trp Tyr Ala Val Arg Thr
340 345 350
Ala Val Ile Asn Ala Ala Ser Gly Arg Gln Thr Val Asp Glu Ala Leu
355 360 365
Lys Asp Ala Gln Thr Asn Ser Ser Ser Asn Asn Asn Asn Asn Asn Asn
370 375 380
Asn Asn Asn Leu Gly Ile Glu Gly Arg Gly Glu Asn Leu Tyr Phe Gln
385 390 395 400
Gly His Met Gly Ser Met Leu Ser Leu Phe Asp Thr Phe Asn Ala Arg
405 410 415
Arg Gln Glu Asn Ile Thr Lys Ser Ala Gly Gly Ala Ile Ile Pro Gly
420 425 430
Gln Lys Asn Thr Val Ser Ile Phe Ala Leu Gly Pro Thr Ile Thr Asp
435 440 445
Asp Asp Glu Lys Met Thr Leu Ala Leu Leu Phe Leu Ser His Ser Leu
450 455 460
Asp Asn Glu Lys Gln His Ala Gln Arg Ala Gly Phe Leu Val Ser Leu
465 470 475 480
Leu Ser Met Ala Tyr Ala Asn Pro Glu Leu Tyr Leu Thr Thr Asn Gly
485 490 495
Ser Asn Ala Asp Val Lys Tyr Val Ile Tyr Met Ile Glu Lys Asp Leu
500 505 510
Lys Arg Gln Lys Tyr Gly Gly Phe Val Val Lys Thr Arg Glu Met Ile
515 520 525
Tyr Glu Lys Thr Thr Glu Trp Ile Phe Gly Ser Asp Leu Asp Tyr Asp
530 535 540
Gln Glu Thr Met Leu Gln Asn Gly Arg Asn Asn Ser Thr Ile Glu Asp
545 550 555 560
Leu Val His Thr Phe Gly Tyr Pro Ser Cys Leu Gly Ala Leu Ile Ile
565 570 575
Gln Ile Trp Ile Val Leu Val Lys Ala Ile Thr Ser Ile Ser Gly Leu
580 585 590
Arg Lys Gly Phe Phe Thr Arg Leu Glu Ala Phe Arg Gln Asp Gly Thr
595 600 605
Val Gln Ala Gly Leu Val Leu Ser Gly Asp Thr Val Asp Gln Ile Gly
610 615 620
Ser Ile Met Arg Ser Gln Gln Ser Leu Val Thr Leu Met Val Glu Thr
625 630 635 640
Leu Ile Thr Met Asn Thr Ser Arg Asn Asp Leu Thr Thr Ile Glu Lys
645 650 655
Asn Ile Gln Ile Val Gly Asn Tyr Ile Arg Asp Ala Gly Leu Ala Ser
660 665 670
Phe Phe Asn Thr Ile Arg Tyr Gly Ile Glu Thr Arg Met Ala Ala Leu
675 680 685
Thr Leu Ser Thr Leu Arg Pro Asp Ile Asn Arg Leu Lys Ala Leu Met
690 695 700
Glu Leu Tyr Leu Ser Lys Gly Pro Arg Ala Pro Phe Ile Cys Ile Leu
705 710 715 720
Arg Asp Pro Ile His Gly Glu Phe Ala Pro Gly Asn Tyr Pro Ala Ile
725 730 735
Trp Ser Tyr Ala Met Gly Val Ala Val Val Gln Asn Arg Ala Met Gln
740 745 750
Gln Tyr Val Thr Gly Arg Ser Tyr Leu Asp Ile Asp Met Phe Gln Leu
755 760 765
Gly Gln Ala Val Ala Arg Asp Ala Glu Ala Gln Met Ser Ser Thr Leu
770 775 780
Glu Asp Glu Leu Gly Val Thr His Glu Ala Lys Glu Ser Leu Lys Arg
785 790 795 800
His Ile Arg Asn Ile Asn Ser Ser
805
<210> 9
<211> 2427
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 9
atgggccatc atcaccatca ccatcacgaa gaaggtaaac tggtaatctg gattaacggc 60
gataaaggct ataacggtct cgctgaagtc ggtaagaaat tcgagaaaga taccggaatt 120
aaagtcaccg ttgagcatcc ggataaactg gaagagaaat tcccacaggt tgcggcaact 180
ggcgatggcc ctgacattat cttctgggca cacgaccgct ttggtggcta cgctcaatct 240
ggcctgttgg ctgaaatcac cccggacaaa gcgttccagg acaagctgta tccgtttacc 300
tgggatgccg tacgttacaa cggcaagctg attgcttacc cgatcgctgt tgaagcgtta 360
tcgctgattt ataacaaaga tctgctgccg aacccgccaa aaacctggga agagatcccg 420
gcgctggata aagaactgaa agcgaaaggt aagagcgcgc tgatgttcaa cctgcaagaa 480
ccgtacttca cctggccgct gattgctgct gacgggggtt atgcgttcaa gtatgaaaac 540
ggcaagtacg acattaaaga cgtgggcgtg gataacgctg gcgcgaaagc gggtctgacc 600
ttcctggttg acctgattaa aaacaaacac atgaatgcag acaccgatta ctccatcgca 660
gaagctgcct ttaataaagg cgaaacagcg atgaccatca acggcccgtg ggcatggtcc 720
aacatcgaca ccagcaaagt gaattatggt gtaacggtac tgccgacctt caagggtcaa 780
ccatccaaac cgttcgttgg cgtgctgagc gcaggtatta acgccgccag tccgaacaaa 840
gagctggcaa aagagttcct cgaaaactat ctgctgactg atgaaggtct ggaagcggtt 900
aataaagaca aaccgctggg tgccgtagcg ctgaagtctt acgaggaaga gttggcgaaa 960
gatccacgta ttgccgccac catggaaaac gcccagaaag gtgaaatcat gccgaacatc 1020
ccgcagatgt ccgctttctg gtatgccgtg cgtactgcgg tgatcaacgc cgccagcggt 1080
cgtcagactg tcgatgaagc cctgaaagac gcgcagacta attcgagctc gaacaacaac 1140
aacaataaca ataacaacaa cctcgggatc gagggaaggg gagaaaatct ttattttcaa 1200
ggtcatatgg gatccatgtt gagcctattt gatacattta atgcacgtag gcaagaaaac 1260
ataacaaaat cagctggtgg agctatcatt cctggacaga aaaatactgt ctccatattt 1320
gcccttggac cgacaataac tgatgacgat gagaaaatga cattagctct tctatttcta 1380
tctcattcac tagataatga gaaacaacat gcacaaaggg cagggttctt ggtgtcttta 1440
ttgtcaatgg cttatgccaa tccagagctt tacctgacaa caaatggaag taatgcagat 1500
gttaaatatg tcatatatat gattgagaaa gatctaaaac ggcaaaagta tggaggattt 1560
gtggttaaga cgagagagat gatatatgaa aagacaactg agtggatatt tggaagtgac 1620
ctggattatg accaggaaac tatgctgcag aacggcagaa acaattcaac gattgaagat 1680
cttgttcaca catttgggta tccatcatgt ttaggagctc ttataataca gatctggata 1740
gttttggtca aagccatcac tagcatctca gggttaagaa aaggcttttt cactcgatta 1800
gaggctttca gacaagatgg aacagtgcaa gcagggctgg tattgagcgg tgacacagtg 1860
gatcagattg ggtcaatcat gcggtctcaa cagagcttgg taactcttat ggttgagaca 1920
ttaataacaa tgaatactag cagaaatgac ctcacaacca tagaaaagaa tatacaaatt 1980
gttggtaact acataagaga tgcaggtctt gcttcattct tcaatacaat caggtatgga 2040
attgagacta gaatggcagc tttgactcta tctactctca gaccagatat caatagatta 2100
aaagctctga tggaattgta tttatcaaag ggaccacgcg ctccttttat ctgtatcctc 2160
agagatccta tacatggtga gttcgcacca ggcaactatc ctgccatatg gagttatgca 2220
atgggggtgg cagttgtaca aaacagagcc atgcaacagt atgtgacggg aagatcatat 2280
ctagatattg atatgttcca gctgggacaa gcagtagcac gtgatgctga agctcagatg 2340
agctcaacac tggaagatga acttggagtg acacacgaag ccaaagaaag cttgaaaaga 2400
catataagga acataaacag ttcatga 2427
<210> 10
<211> 270
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 10
Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro
1 5 10 15
Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu
20 25 30
Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu
35 40 45
Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys
50 55 60
Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn
65 70 75 80
Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu
85 90 95
Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser
100 105 110
Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu
115 120 125
Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn
130 135 140
Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp
145 150 155 160
Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu
165 170 175
Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr
180 185 190
Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala
195 200 205
Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Glu Asn Leu Tyr
210 215 220
Phe Gln Gly Ser Met Glu Ser Asp Ala Lys Asn Tyr Gln Ile Met Asp
225 230 235 240
Ser Trp Glu Glu Glu Pro Arg Asp Lys Ser Thr Asn Ile Ser Ser Ala
245 250 255
Leu Asn Ile Ile Glu Phe Ile Leu Ser Thr Asp Pro Gln Glu
260 265 270
<210> 11
<211> 813
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 11
atgtccccta tactaggtta ttggaaaatt aagggccttg tgcaacccac tcgacttctt 60
ttggaatatc ttgaagaaaa atatgaagag catttgtatg agcgcgatga aggtgataaa 120
tggcgaaaca aaaagtttga attgggtttg gagtttccca atcttcctta ttatattgat 180
ggtgatgtta aattaacaca gtctatggcc atcatacgtt atatagctga caagcacaac 240
atgttgggtg gttgtccaaa agagcgtgca gagatttcaa tgcttgaagg agcggttttg 300
gatattagat acggtgtttc gagaattgca tatagtaaag actttgaaac tctcaaagtt 360
gattttctta gcaagctacc tgaaatgctg aaaatgttcg aagatcgttt atgtcataaa 420
acatatttaa atggtgatca tgtaacccat cctgacttca tgttgtatga cgctcttgat 480
gttgttttat acatggaccc aatgtgcctg gatgcgttcc caaaattagt ttgttttaaa 540
aaacgtattg aagctatccc acaaattgat aagtacttga aatccagcaa gtatatagca 600
tggcctttgc agggctggca agccacgttt ggtggtggcg accatcctcc aaaatcggat 660
gaaaacctgt attttcaggg atccatggaa agcgatgcta aaaactatca aatcatggat 720
tcttgggaag aggaaccaag agataaatca actaatatct cctcggccct caacatcatt 780
gaattcatac tcagcaccga cccccaagaa taa 813
<210> 12
<211> 306
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 12
Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro
1 5 10 15
Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu
20 25 30
Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu
35 40 45
Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys
50 55 60
Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn
65 70 75 80
Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu
85 90 95
Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser
100 105 110
Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu
115 120 125
Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn
130 135 140
Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp
145 150 155 160
Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu
165 170 175
Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr
180 185 190
Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala
195 200 205
Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Glu Asn Leu Tyr
210 215 220
Phe Gln Gly Ser Met Glu Ser Asp Ala Lys Asn Tyr Gln Ile Met Asp
225 230 235 240
Ser Trp Glu Glu Glu Pro Arg Asp Lys Ser Thr Asn Ile Ser Ser Ala
245 250 255
Leu Asn Ile Ile Glu Phe Ile Leu Ser Thr Asp Pro Gln Glu Gly Ser
260 265 270
Gly Ser Gly Ser Gly Ser Met Ser Ser Thr Leu Glu Asp Glu Leu Gly
275 280 285
Val Thr His Glu Ala Lys Glu Ser Leu Lys Arg His Ile Arg Asn Ile
290 295 300
Asn Ser
305
<210> 13
<211> 921
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 13
atgtccccta tactaggtta ttggaaaatt aagggccttg tgcaacccac tcgacttctt 60
ttggaatatc ttgaagaaaa atatgaagag catttgtatg agcgcgatga aggtgataaa 120
tggcgaaaca aaaagtttga attgggtttg gagtttccca atcttcctta ttatattgat 180
ggtgatgtta aattaacaca gtctatggcc atcatacgtt atatagctga caagcacaac 240
atgttgggtg gttgtccaaa agagcgtgca gagatttcaa tgcttgaagg agcggttttg 300
gatattagat acggtgtttc gagaattgca tatagtaaag actttgaaac tctcaaagtt 360
gattttctta gcaagctacc tgaaatgctg aaaatgttcg aagatcgttt atgtcataaa 420
acatatttaa atggtgatca tgtaacccat cctgacttca tgttgtatga cgctcttgat 480
gttgttttat acatggaccc aatgtgcctg gatgcgttcc caaaattagt ttgttttaaa 540
aaacgtattg aagctatccc acaaattgat aagtacttga aatccagcaa gtatatagca 600
tggcctttgc agggctggca agccacgttt ggtggtggcg accatcctcc aaaatcggat 660
gaaaacctgt attttcaggg atccatggaa agcgatgcta aaaactatca aatcatggat 720
tcttgggaag aggaaccaag agataaatca actaatatct cctcggccct caacatcatt 780
gaattcatac tcagcaccga cccccaagaa ggtagcggta gcggtagcgg tagcatgagc 840
tcaacactgg aagatgaact tggagtgaca cacgaagcca aagaaagctt gaaaagacat 900
ataaggaaca taaacagtta a 921
<210> 14
<211> 42
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 14
Met Glu Ser Asp Ala Lys Asn Tyr Gln Ile Met Asp Ser Trp Glu Glu
1 5 10 15
Glu Pro Arg Asp Lys Ser Thr Asn Glu Ser Ser Ala Leu Asn Ile Ile
20 25 30
Glu Phe Ile Leu Ser Thr Asp Pro Gln Glu
35 40
<210> 15
<211> 129
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 15
atggaaagcg atgctaaaaa ctatcaaatc atggattctt gggaagagga accaagagat 60
aaatcaacta atgaatcctc ggccctcaac atcattgaat tcatactcag caccgacccc 120
caagaataa 129
<210> 16
<211> 78
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 16
Met Asp Gln Asp Ala Phe Phe Phe Glu Arg Asp Pro Glu Ala Glu Gly
1 5 10 15
Glu Ala Pro Arg Lys Gln Glu Ser Leu Ser Asp Val Ile Gly Leu Leu
20 25 30
Asp Val Val Leu Ser Tyr Lys Pro Thr Glu Gly Ser Gly Ser Gly Ser
35 40 45
Gly Ser Ile Ser Ser Ala Leu Glu Glu Glu Leu Asn Val Thr Asp Thr
50 55 60
Ala Lys Glu Arg Leu Arg His His Leu Thr Asn Leu Ser Gly
65 70 75
<210> 17
<211> 237
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 17
atggatcagg atgccttctt ttttgagagg gatcctgaag ccgaaggaga ggcaccacga 60
aaacaagaat cactctcaga tgtcatcgga ctccttgacg tcgtcctatc ctacaagccc 120
accgaaggta gcggtagcgg tagcggtagc atcagcagtg ctctggagga agaactcaat 180
gtgacggaca cagcaaaaga gagactaaga caccatctga caaacctttc aggataa 237
<210> 18
<211> 78
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 18
Met Asp Gln Asp Ala Phe Ile Leu Lys Glu Asp Ser Glu Val Glu Arg
1 5 10 15
Glu Ala Pro Gly Gly Arg Glu Ser Leu Ser Asp Val Ile Gly Phe Leu
20 25 30
Asp Ala Val Leu Ser Ser Glu Pro Thr Asp Gly Ser Gly Ser Gly Ser
35 40 45
Gly Ser Ile Ser Ser Ala Leu Glu Asp Glu Leu Gly Val Thr Asp Thr
50 55 60
Ala Lys Glu Arg Leu Arg His His Leu Ala Asn Leu Ser Gly
65 70 75
<210> 19
<211> 237
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 19
atggatcaag atgccttcat tcttaaagaa gattctgaag ttgagaggga ggcgccagga 60
ggaagagagt cgctctcgga tgttatcgga ttcctcgatg ctgtcctgtc gagtgaacca 120
actgacggta gcggtagcgg tagcggtagc atcagcagtg ccctggaaga tgagttagga 180
gtgacggata cagccaagga gaggctcaga catcatctgg caaacttgtc cggttaa 237
<210> 20
<211> 78
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 20
Met Asp Gln Asp Ala Leu Phe Pro Glu Glu Ser Met Glu Asp Gln Glu
1 5 10 15
Glu Gly His Ser Thr Thr Ser Thr Leu Thr Ser Ala Val Gly Leu Ile
20 25 30
Asp Ile Ile Leu Ala Ser Glu Pro Thr Asp Gly Ser Gly Ser Gly Ser
35 40 45
Gly Ser Ile Ser Asn Ala Leu Glu Ser Glu Leu Gly Ile Thr Glu Asn
50 55 60
Ala Lys Asp Arg Leu Lys His His Leu Ala Asn Leu Ser Gly
65 70 75
<210> 21
<211> 237
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 21
atggatcagg atgccttctt ttttgaagaa tctatggagg accagaagaa ggaacaccca 60
acaaccagca cactcactag tgcagtcgga ctcatcgaca ttatccttgc cagtgagcct 120
acagacggta gcggtagcgg tagcggtagc attagtaatg ccttagaagg tgaattaggt 180
ataactgaaa gtgccaaaga caggctcaaa catcatcttg ctaatctttc tggataa 237
<210> 22
<211> 78
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 22
Met Glu Asn Asn Ala Lys Asp Asn Gln Ile Met Asp Ser Trp Glu Glu
1 5 10 15
Gly Ser Gly Asp Lys Ser Ser Asp Ile Ser Ser Ala Leu Asp Ile Ile
20 25 30
Glu Phe Ile Leu Ser Thr Asp Ser Gln Glu Gly Ser Gly Ser Gly Ser
35 40 45
Gly Ser Met Ser Ser Ile Leu Glu Asp Glu Leu Gly Val Thr Gln Glu
50 55 60
Ala Lys Gln Ser Leu Lys Lys His Met Lys Asn Ile Ser Ser
65 70 75
<210> 23
<211> 237
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 23
atggaaaaca atgctaaaga caatcaaatc atggattctt gggaagaggg atcaggagac 60
aagtcatctg acatctcatc ggccctcgac atcattgaat tcatactcag caccgactcc 120
caagagggta gcggtagcgg tagcggtagc atgagctcaa tactagagga tgaactaggg 180
gtcacacagg aagccaagca aagcttaaag aaacatatga agaacatcag cagttaa 237

Claims (7)

1. A polypeptide, characterized in that the polypeptide is any one of C1) to C10):
C1) the amino acid sequence of the polypeptide is shown at positions 22-78 of SEQ ID No. 3;
C2) the amino acid sequence of the polypeptide is shown as SEQ ID No. 3;
C3) the amino acid sequence of the polypeptide is shown at positions 22-78 of SEQ ID No. 16;
C4) the amino acid sequence of the polypeptide is shown as SEQ ID No. 16;
C5) the amino acid sequence of the polypeptide is shown at positions 22-78 of SEQ ID No. 18;
C6) the amino acid sequence of the polypeptide is shown as SEQ ID No. 18;
C7) the amino acid sequence of the polypeptide is shown at positions 22-78 of SEQ ID No. 20;
C8) the amino acid sequence of the polypeptide is shown as SEQ ID No. 20;
C9) the amino acid sequence of the polypeptide is shown at positions 22-78 of SEQ ID No. 22;
C10) the amino acid sequence of the polypeptide is shown as SEQ ID No. 22.
2. A biological material related to the polypeptide of claim 1, characterized in that the biological material is any one of the following:
B1) a nucleic acid molecule encoding the polypeptide of claim 1;
B2) an expression cassette comprising the nucleic acid molecule of B1);
B3) a recombinant vector comprising the nucleic acid molecule of B1) or the expression cassette of B2);
B4) a recombinant microorganism comprising the nucleic acid molecule of B1), the expression cassette of B2), or the recombinant vector of B3);
B5) an animal cell line comprising the nucleic acid molecule of B1), the expression cassette of B2), or the recombinant vector of B3);
B6) a plant cell line comprising the nucleic acid molecule of B1), the expression cassette of B2), or the recombinant vector of B3);
B7) a recombinant cell that produces the polypeptide of claim 1.
3. The biological material according to claim 2, characterized in that the coding sequence of the coding strand of the nucleic acid molecule of B1) is any one of D1) to D10):
D1) the coding sequence of the coding strand is shown at positions 64-237 of SEQ ID No. 4;
D2) the coding sequence of the coding strand is shown as SEQ ID No. 4;
D3) the coding sequence of the coding strand is shown at positions 64-237 of SEQ ID No. 17;
D4) the coding sequence of the coding strand is shown as SEQ ID No. 17;
D5) the coding sequence of the coding strand is shown at positions 64-237 of SEQ ID No. 19;
D6) the coding sequence of the coding strand is shown as SEQ ID No. 19;
D7) the coding sequence of the coding strand is shown at positions 64-237 of SEQ ID No. 21;
D8) the coding sequence of the coding strand is shown as SEQ ID No. 21;
D9) the coding sequence of the coding strand is shown at positions 64-237 of SEQ ID No. 23;
D10) the coding sequence of the coding strand is shown as SEQ ID No. 23.
4. Use of the polypeptide of claim 1 and/or the biological material of claim 2 or 3 in the manufacture of a medicament for inhibiting a respiratory virus.
5. The use according to claim 4, characterized in that the respiratory virus is HPIV1, HPIV3, SeV, PPIV1 or BPIV3.
6. A medicament or pharmaceutical composition characterized in that: the medicament or pharmaceutical composition contains the polypeptide of claim 1.
7. A method for producing an N protein that binds to the polypeptide of claim 1, characterized in that the method comprises introducing a hydrophobic mutation into the P polypeptide of an N-P fusion protein and then digesting the fusion protein with TEV protease to obtain the N protein; the amino acid sequence of the N-P fusion protein is shown as SEQ ID No. 1;
the P polypeptide obtained by the hydrophobic mutation is E1) or E2):
E1) the amino acid sequence is shown as SEQ ID No. 5;
E2) the amino acid sequence is shown as SEQ ID No. 14.
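For illustration only, and not forming part of the claims, the substitutions that distinguish SEQ ID No. 5 and SEQ ID No. 14 from the unmutated P polypeptide can be located by a position-by-position comparison. The sketch below assumes that residues 355-396 of SEQ ID No. 1 (identical to residues 1-42 of SEQ ID No. 3) represent the unmutated P polypeptide. The one-letter strings are transcriptions from the sequence listing, and `substitutions` is a hypothetical helper.

```python
# One-letter transcriptions from the sequence listing (42 residues each).
P_UNMUTATED = (  # SEQ ID No. 1, residues 355-396 (assumed unmutated reference)
    "MESDAKNYQIMDSWEE" "EPRDKSTNISSALNII" "EFILSTDPQE"
)
SEQ_ID_5 = (
    "MESDAKNYQIMDSWEE" "EPRDKSTNISSAENII" "EFILSTDPQE"
)
SEQ_ID_14 = (
    "MESDAKNYQIMDSWEE" "EPRDKSTNESSALNII" "EFILSTDPQE"
)


def substitutions(reference, variant):
    """List point substitutions as <reference residue><1-based position><variant residue>."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        f"{ref}{pos}{var}"
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


print("SEQ ID No. 5:", substitutions(P_UNMUTATED, SEQ_ID_5))
print("SEQ ID No. 14:", substitutions(P_UNMUTATED, SEQ_ID_14))
```

Under this assumption, each variant differs from the reference at a single position where a hydrophobic residue (leucine or isoleucine) is replaced by glutamic acid.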
CN202111253551.5A 2021-10-27 2021-10-27 Polypeptide inhibitors targeting respiratory viruses Active CN113999292B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111253551.5A CN113999292B (en) 2021-10-27 2021-10-27 Polypeptide inhibitors targeting respiratory viruses

Publications (2)

Publication Number Publication Date
CN113999292A CN113999292A (en) 2022-02-01
CN113999292B true CN113999292B (en) 2023-05-26

Family

ID=79924173

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111253551.5A Active CN113999292B (en) 2021-10-27 2021-10-27 Polypeptide inhibitors targeting respiratory viruses

Country Status (1)

Country Link
CN (1) CN113999292B (en)

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant