CN118388630A

CN118388630A - Method for biosynthesis of human structural material type VI collagen

Info

Publication number: CN118388630A
Application number: CN202410467176.1A
Authority: CN
Inventors: 宋海红; 杨霞; 兰小宾; 王玲玲; 张永健; 何振瑞; 张国梁
Original assignee: Shanxi Jinbo Bio Pharmaceutical Co ltd
Current assignee: Shanxi Jinbo Bio Pharmaceutical Co ltd
Priority date: 2023-07-18
Filing date: 2023-07-18
Publication date: 2024-07-26
Also published as: CN118373898A; CN116948014B; CN116948014A; CN118373899A

Abstract

Methods for the biosynthesis of human structural material type VI collagen are provided. The polypeptides described herein comprise an amino acid sequence selected from SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 or 21. The recombinant human type VI collagen has cell adhesion activity, is easy to separate and purify, and can be used as a biological dressing, a human biomimetic material, a plastic and cosmetic surgery material, and the like.

Description

Method for biosynthesis of human structural material type VI collagen

This application is a divisional application of the Chinese patent application filed on July 18, 2023, with application number 202310883890.4, entitled "Method for biosynthesis of human structural material type VI collagen".

Technical Field

The application relates to the field of biosynthesis, and in particular to a method for preparing human structural material type VI collagen by biosynthesis.

Background

Collagen is an important extracellular matrix component in animals and plays an important role in cell migration, cell metabolism, cell signaling pathway responses, platelet aggregation, and the maintenance, regulation and injury repair of the normal physiological functions of cells, tissues and organs. As an important natural biological protein, collagen has good biocompatibility, bioactivity and degradability, and can be widely applied in fields such as the chemical industry, medicine, food and cosmetics.

At present, collagen is mainly obtained as an extract by treating animal tissues with acid, alkali or enzymatic hydrolysis. Although the technology for extracting collagen from animal tissues is mature, the collagen peptides obtained in this way have non-uniform properties and large batch-to-batch variation, carry a potential risk of viral contamination during production, and have amino acid sequences that differ considerably from those of human proteins, so they readily cause immunogenic and allergic reactions. Collagen can also be obtained by expression using genetic engineering techniques. At present, expression systems such as Escherichia coli, Pichia pastoris, mammalian cells, insect cells and plants are mainly used, and each has drawbacks: products expressed in E. coli are difficult to purify downstream; mammalian cells have high expression costs and low yields; insect cell expression systems are costly and low-yielding, and their post-translational modifications differ considerably from those of human cells; plants have long expression cycles and are not suitable for industrial production; and although the Pichia pastoris expression system offers advantages for large-scale industrial production such as high-density fermentation, very low culture cost, short cycles and high expression, it has not been confirmed to be suitable for expressing all types of collagen.

At present, most recombinant human collagens are type I, type II and type III, and type VI collagen has been little studied. Type VI collagen is present in all extracellular matrices (ECMs) and can bind different ECM components, thereby bridging cells to the surrounding connective tissue and organizing the three-dimensional tissue structure of skeletal muscle, tendon, bone and cartilage. By binding to type IV collagen and perlecan in the basal lamina, type VI collagen can establish a tight connection between muscle cells and the ECM. As a major component of the extracellular matrix, type VI collagen can promote fibroblast adhesion during wound repair, enhance chondrocyte regeneration, and play an important role in wound healing.

There is a need in the art for methods of biosynthesis of human structural material type VI collagen.

Disclosure of Invention

The inventors performed large-scale screening of human type VI collagen and selected 11 human recombinant type VI collagens that are suitable for separation and purification. Activity measurements showed that these recombinant type VI collagens have cell adhesion activity and can be used industrially. The inventors also found that some recombinant type VI collagens (the C6c, C6g, C6i and C6k proteins) have higher output, yield and/or purity than the others. The invention thus provides a method for efficiently obtaining human structural material type VI collagen.

In one aspect, the invention provides a polypeptide comprising one or more repeat units linked directly or through a linker, the repeat units comprising an amino acid sequence selected from SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19 or 21, or a variant thereof, said variant being (1) an amino acid sequence in which one or more amino acid residues are mutated in said amino acid sequence or (2) an amino acid sequence having at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to said amino acid sequence.

The repeat sequence is as follows:

GERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGREGPVGVPGDPGEAGPIGPKGYRGDEGPPGSEGARGAPGPAGPPGDPGLMGERGEDGPA(SEQ ID NO:1)

GCKGSPGFDGIQGPPGPKGDPGAFGLKGEKGEPGADGEAGRP(SEQ ID NO:3)

GERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGRE(SEQ ID NO:5)

GTEGFPGFPGYPGNRGAPGINGTKGYPGLKGDEGEAGDPGDD(SEQ ID NO:7)

GPPGLRGDPGFEGERGKPGLPGEKGEAGDPGRPGDLGPVGYQGMKGEKGSRGEKGSRGPKGYKGEKGKRGIDGVDGVKGEMGYPGLPGCKGSPGFDGIQGPPGPKGDPGAFGLKGEKGEPGADGEAGRPGSSGPSGDEGQPGEPGPPGEKGEAGDEGNPGPDGAP(SEQ ID NO:9)

GIQGPPGPKGDPGAFGLKGEKGEPGADGEAGRPGSSGPSGDEGQPGEPGPPGEKGEAGDEGNPGPDGAPGERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGREGPVGVPGDPGEAGPIGPKGYRGDEGPP(SEQ ID NO:11)

GPPGLRGDPGFEGERGKPGLPGEKGEAGDPGRPGDL(SEQ ID NO:13)

GMKGEKGSRGEKGSRGPKGYKGEKGKRGIDGVDGVKGEM(SEQ ID NO:15)

GSSGPSGDEGQPGEPGPPGEKGEAGDEGNPGPDGAP(SEQ ID NO:17)

GDPGEAGPIGPKGYRGDEGPPGSEGARGAPGPAGPPGDPGLMGERGEDGPA(SEQ ID NO:19)

GPPGLRGDPGFEGERGKPGLPGEKGEAGDPGRPGDLGPVGYQGMKGEKGSRGEKGSRGPKGYKGEKGKRGIDGVDGVKGEMGYPGLPGCKGSPGFDGIQGPPGPKGDPGAFGLKGEKGEPGADGEAGRPGSSGPSGDEGQPGEPGPPGEKGEAGDEGNPGPDGAPGERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGREGPVGVPGDPGEAGPIGPKGYRGDEGPPGSEGARGAPGPAGPPGDPGLMGERGEDGPAGNGTEGFPGFPGYPGNRGAPGINGTKGYPGLKGDEGEAGDPGDDNNDIAPRGVKGAKGYRGPEGPQGPPGHQGPPGPD(SEQ ID NO:21).

In one embodiment, the plurality of repeat units is 2-80 repeat units, such as 2-70, 2-60, 2-50, 2-45, 2-40, 2-35, 2-30, 2-25, 2-20, 2-15, or 2-10 repeat units, such as 2, 3, 4, 5, 6, 7, 8 or 9 repeat units.

In one embodiment, the linker comprises one or more amino acid residues, e.g., 1-30, 1-20, 1-10, 1-9, 1-8, 1-7, 1-6, 1-5, 1-4, 1-3, or 1-2 amino acid residues.
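The construction described above (repeat units joined directly or through a short linker) can be illustrated with a minimal Python sketch. It uses SEQ ID NO:5 as printed in the repeat-unit list; the function name `assemble_polypeptide`, the four-fold repetition and the two-residue linker `GS` are assumptions chosen purely for illustration and are not taken from the application.

```python
from typing import Sequence

# SEQ ID NO:5 as printed in the repeat-unit list of this document.
SEQ_ID_NO_5 = "GERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGRE"


def assemble_polypeptide(repeat_units: Sequence[str], linker: str = "") -> str:
    """Join repeat units head to tail, optionally separated by a linker.

    An empty linker corresponds to units that are linked directly.
    """
    if not repeat_units:
        raise ValueError("at least one repeat unit is required")
    if len(linker) > 30:
        # 1-30 residues is the widest linker range named in this embodiment.
        raise ValueError("linker exceeds the 1-30 residue range")
    return linker.join(repeat_units)


# Four directly linked copies, and four copies joined by a hypothetical linker.
direct = assemble_polypeptide([SEQ_ID_NO_5] * 4)
linked = assemble_polypeptide([SEQ_ID_NO_5] * 4, linker="GS")
print(len(direct), len(linked))
```

Joining n units inserts n-1 linkers, so the linked construct is longer than the direct one by (n-1) times the linker length.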

In one embodiment, the mutation is selected from the group consisting of a substitution, addition, insertion, or deletion.

In one embodiment, the substitution is a conservative amino acid substitution.

In one embodiment, the polypeptide is recombinant collagen. In one embodiment, the polypeptide is recombinant type VI collagen or human recombinant type VI collagen.

In one embodiment, the polypeptide has cell adhesion activity.

In one embodiment, the polypeptide comprises an amino acid sequence selected from SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 or 21, or a variant thereof, said variant being (1) an amino acid sequence in which one or more amino acid residues are mutated in said amino acid sequence or (2) an amino acid sequence having at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to said amino acid sequence.

In one embodiment, the mutation is selected from the group consisting of a substitution, addition, insertion, or deletion. In one embodiment, the substitution is a conservative amino acid substitution.

In another aspect, nucleic acids are provided that encode the polypeptides described herein. In one embodiment, the nucleic acid comprises a codon-optimized nucleotide sequence. In one embodiment, the nucleotide sequence is codon-optimized for expression in E. coli. In one embodiment, the nucleic acid comprises a nucleotide sequence selected from SEQ ID NO: 22-32.

In another aspect, a vector is provided comprising a nucleic acid as described herein. In one embodiment, the vector comprises expression control elements, nucleotides encoding a purification tag and/or nucleotides encoding a leader sequence, operably linked to the nucleic acid. In one embodiment, the expression control element is selected from a promoter, terminator or enhancer. In one embodiment, the purification tag is selected from a His tag, a GST tag, an MBP tag, a SUMO tag, or a NusA tag. In one embodiment, the vector is an expression vector or cloning vector, preferably pET-28a(+). pET-28a(+) can provide an N-terminal His tag, thrombin cleavage site and T7 tag, and a C-terminal His tag. In this context, the N-terminus of the polypeptide may comprise a cleavage site to facilitate purification, e.g., a TEV cleavage site.

In another aspect, a host cell is provided comprising a nucleic acid or vector as described herein. In one embodiment, the host cell is a eukaryotic cell or a prokaryotic cell. In one embodiment, the eukaryotic cell is a yeast cell, an animal cell, and/or an insect cell. In one embodiment, the prokaryotic cell is an E. coli cell, such as E. coli BL21.

In another aspect, compositions are provided that comprise one or more of the polypeptides, nucleic acids, vectors, and host cells described herein. In one embodiment, the composition is a kit. In one embodiment, the composition is one or more of a biological dressing, a human biomimetic material, a cosmetic material, an organoid culture material, a cardiovascular stent material, a coating material, a tissue injection filling material, an ophthalmic material, a gynaecological biomaterial, a nerve repair regenerating material, a liver tissue material, and a vascular repair regenerating material, a 3D printed artificial organ biomaterial, a cosmetic material, a pharmaceutical adjuvant, and a food additive. In one embodiment, the composition is an injectable composition or an oral composition.

In another aspect, there is provided the use of a polypeptide, nucleic acid, vector, host cell and/or composition herein in one or more of a biological dressing, a human biomimetic material, a cosmetic plastic material, an organoid culture material, a cardiovascular stent material, a coating material, a tissue injection filling material, an ophthalmic material, a gynaecological biomaterial, a nerve repair regenerative material, a liver tissue material and a vascular repair regenerative material, a 3D printed artificial organ biomaterial, a cosmetic material, a pharmaceutical adjuvant and a food additive.

In another aspect, methods of promoting cell adhesion are provided that include the step of contacting a polypeptide, nucleic acid, vector, host cell and/or composition herein with a cell (e.g., an animal cell, mammalian cell, or human cell).

In another aspect, there is provided a method of cosmetic shaping, tissue injection filling, ophthalmic treatment, nerve repair, or vascular repair of a subject in need thereof, comprising administering to the subject a polypeptide herein. In one embodiment, the administration is oral administration or administration by injection. In one embodiment, the subject has a disease or disorder associated with a type VIII collagen deficiency, such as anterior ocular segment hypoplasia.

In another aspect, there is provided a method of producing a polypeptide described herein, comprising:

(1) Culturing a host cell described herein under suitable culture conditions;

(2) Harvesting the host cells and/or culture medium comprising the polypeptide; and

(3) Purifying the polypeptide.

In one embodiment, the host cell is an E. coli cell, preferably an E. coli BL21(DE3) cell.

In one embodiment, step (1) comprises culturing E. coli cells in LB medium and inducing expression with IPTG.

In one embodiment, step (2) comprises harvesting the E. coli cells, resuspending them in an equilibration working solution, homogenizing the E. coli cells, preferably by high-pressure homogenization, and separating the supernatant. In one embodiment, the equilibration working solution comprises 100-500 mM sodium chloride, 10-50 mM Tris and 10-50 mM imidazole, pH 7-9.

In one embodiment, step (3) comprises crude purification, enzymatic cleavage, fine purification and/or reverse nickel-column purification. In one embodiment, step (3) comprises crude purification and one or more of the following: enzymatic cleavage, fine purification and/or reverse nickel-column purification.

In one embodiment, the crude purification comprises purifying the supernatant on a Ni-Sepharose column to obtain an eluate containing the target protein, wherein the eluent comprises 100-500 mM sodium chloride, 10-50 mM Tris and 100-500 mM imidazole, preferably at pH 7-9.

In one embodiment, the fine purification comprises gradient elution of the eluate containing the target protein, or of the cleavage product, on a strong anion exchange chromatography column; preferably, the gradient elution comprises 0-15% solution B over 1-5 minutes followed by 3 column volumes, 15-30% solution B over 1-5 minutes followed by 3 column volumes, 30-50% solution B over 1-5 minutes followed by 3 column volumes, and 50-100% solution B over 1-5 minutes followed by 3 column volumes; wherein solution B comprises 10-50 mM Tris and 0.5-5 M sodium chloride, pH 7-9.
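The stepped anion-exchange gradient described above can be written down as a small table and converted into salt concentrations. The following Python sketch is only an illustration: the passage does not state the sodium chloride content of the low-salt solution, so `nacl_a_mm` (defaulting to 0 mM) is an assumption, as is the choice of 1000 mM for solution B (a value inside the stated 0.5-5 M range). The names `GradientStep`, `STEPS` and `nacl_mm` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GradientStep:
    start_pct_b: float  # % solution B at the start of the ramp
    end_pct_b: float    # % solution B at the end of the ramp
    hold_cv: float      # column volumes run after the ramp


# The four segments listed in the embodiment; the 1-5 minute ramp
# duration is a free parameter in the text and is not modelled here.
STEPS = [
    GradientStep(0, 15, 3),
    GradientStep(15, 30, 3),
    GradientStep(30, 50, 3),
    GradientStep(50, 100, 3),
]


def nacl_mm(pct_b: float, nacl_b_mm: float, nacl_a_mm: float = 0.0) -> float:
    """NaCl concentration (mM) of a mixture containing pct_b percent solution B.

    nacl_a_mm is the NaCl content of the other solution; 0 mM is an
    assumption made for this sketch, not a value from the application.
    """
    if not 0 <= pct_b <= 100:
        raise ValueError("pct_b must be between 0 and 100")
    frac = pct_b / 100.0
    return nacl_a_mm * (1 - frac) + nacl_b_mm * frac


for step in STEPS:
    conc = nacl_mm(step.end_pct_b, nacl_b_mm=1000)
    print(f"{step.end_pct_b:>5.0f}% B -> {conc:.0f} mM NaCl, then {step.hold_cv} CV")
```

Under these assumptions each segment ends at a salt concentration proportional to its percentage of solution B; a different solution B concentration simply rescales the values.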

In one embodiment, the reverse nickel-column purification comprises purifying the cleavage product on a Ni-agarose gel column; preferably, the eluent comprises 10-50 mM Tris, 10-50 mM sodium chloride and 0.5-5 M imidazole, pH 7-9.

In this context, cleavage may be performed with TEV protease.

The advantages of the invention include: (1) human recombinant type VI collagens are provided that have cell adhesion activity and can be used industrially; and (2) methods are provided for preparing human recombinant type VI collagen with high output, yield and/or purity.

Drawings

FIG. 1 shows an electrophoretogram of C6a.

FIG. 2 shows an electrophoretogram of C6d.

FIG. 3 shows an electrophoretogram of C6e.

FIG. 4 shows an electrophoretogram of C6h.

FIG. 5 shows an electrophoretogram of C6j.

FIG. 6 shows an electrophoretogram of C6f.

FIG. 7 shows an electrophoretogram of C6b.

FIG. 8 shows an electrophoretogram of C6c.

FIG. 9 shows an electrophoretogram of C6i.

FIG. 10 shows an electrophoretogram of C6g.

FIG. 11 shows an electrophoretogram of C6k.

FIG. 12 shows the cell adhesion results.

Detailed Description

To make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below. The described embodiments are some, but not all, embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of the embodiments of the invention without inventive effort fall within the scope of the invention.

As used herein, "type VI collagen" is a type of collagen that is present in all extracellular matrices (ECMs) and is capable of binding to different ECM components, thereby bridging cells to the surrounding connective tissue and organizing the three-dimensional tissue structure of skeletal muscle, tendon, bone and cartilage.

As used herein, "polypeptide" refers to a plurality of amino acid residues joined by peptide bonds. Herein, a polypeptide comprises one or more repeat units. The repeat unit may be derived from human type VI collagen. Thus, the polypeptide may be human recombinant type VI collagen. Multiple repeat units may be joined by a linker, which may consist of the natural amino acid residues flanking the repeat unit in human type VI collagen, for example 1 to 80 amino acid residues, for example 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79 or 80 amino acid residues. The repeat units may be SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19 or 21. The polypeptide may be SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 or 21.

As used herein, "human recombinant type VI collagen" refers to a recombinant protein consisting of or consisting essentially of a sequence derived from human type VI collagen. In this context, human recombinant type VI collagen may consist of or consist essentially of a fragment, or multiple repeats of a fragment, derived from human type VI collagen.

As used herein, the term "variant" means a polypeptide having cell adhesion activity that includes alterations (i.e., substitutions, additions, insertions, and/or deletions) at one or more positions. Substitution means that an amino acid occupying a certain position is replaced with a different amino acid; deletion means the removal of an amino acid occupying a certain position; and insertion means adding an amino acid adjacent to and immediately after the amino acid occupying a certain position. Addition refers to the addition of one or more amino acid residues at the C- and/or N-terminus of an amino acid sequence. Substitutions may be conservative substitutions. A variant of a repeat unit may be a sequence in which one or more amino acid residues of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19 or 21 are altered or mutated (i.e., substituted, added, inserted and/or deleted). A variant of a polypeptide may be a sequence in which one or more amino acid residues of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 or 21 are altered or mutated (i.e., substituted, added, inserted and/or deleted).

In the context of the present invention, conservative substitutions may be defined by substitutions within the class of amino acids reflected in one or more of the following tables:

Conserved classes of amino acid residues:

Acidic residues: D and E
Basic residues: K, R and H
Hydrophilic uncharged residues: S, T, N and Q
Aliphatic uncharged residues: G, A, V, L and I
Nonpolar uncharged residues: C, M and P
Aromatic residues: F, Y and W

Alternative physical and functional classifications of amino acid residues:

Residues containing alcohol groups: S and T
Aliphatic residues: I, L, V and M
Cycloalkenyl-associated residues: F, H, W and Y
Hydrophobic residues: A, C, F, G, H, I, L, M, R, T, V, W and Y
Negatively charged residues: D and E
Polar residues: C, D, E, H, K, N, Q, R, S and T
Positively charged residues: H, K and R
Small residues: A, C, D, G, N, P, S, T and V
Very small residues: A, G and S
Residues involved in turn formation: A, C, D, E, G, H, K, N, Q, R, S, P and T
Flexible residues: Q, T, K, S, G, P, D, E and R
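As an illustration of how the first classification above could be applied, the following Python sketch treats a substitution as conservative when both residues fall in the same conserved class. The dictionary mirrors that first table only; the function names `residue_class` and `is_conservative` are hypothetical, and the application itself allows conservative substitutions to be defined by any one or more of the tables.

```python
from typing import Optional

# Conserved residue classes, transcribed from the first table above
# (one-letter amino acid codes).
CONSERVED_CLASSES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "hydrophilic uncharged": set("STNQ"),
    "aliphatic uncharged": set("GAVLI"),
    "nonpolar uncharged": set("CMP"),
    "aromatic": set("FYW"),
}


def residue_class(residue: str) -> Optional[str]:
    """Return the conserved class of a residue, or None if it is not listed."""
    for name, members in CONSERVED_CLASSES.items():
        if residue.upper() in members:
            return name
    return None


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues differ and belong to the same conserved class."""
    if original.upper() == replacement.upper():
        return False  # not a substitution at all
    cls = residue_class(original)
    return cls is not None and cls == residue_class(replacement)


print(is_conservative("D", "E"))  # both acidic
print(is_conservative("D", "K"))  # acidic versus basic
```

The six classes in this table are disjoint and together cover the 20 standard amino acids, so each residue maps to exactly one class.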

As used herein, "cell adhesion" refers to adhesion between cells and collagen. Collagen (e.g., a polypeptide described herein) can promote adhesion between cells and a container in which the cells are cultured.

As used herein, the term "expression" includes any step involving the production of a polypeptide, including, but not limited to: transcription, post-transcriptional modification, translation, post-translational modification, and secretion.

As used herein, the term "expression vector" means a linear or circular DNA molecule comprising a polynucleotide encoding a polypeptide and operably linked to control sequences that provide for its expression.

As used herein, the term "host cell" means any cell type that is readily transformed, transfected, transduced, or the like with a nucleic acid construct or expression vector comprising a polynucleotide of the present invention. The term "host cell" encompasses any parent cell progeny that are not identical to the parent cell due to mutations that occur during replication.

As used herein, the term "nucleic acid" means a single- or double-stranded nucleic acid molecule that is isolated from a naturally occurring gene, or that has been modified to contain segments of nucleic acid in a manner that would not otherwise exist in nature, or that is synthetic, and that may contain one or more control sequences. The nucleic acid may be SEQ ID NO: 22-32. The nucleic acid may be a codon-optimized nucleic acid, e.g., a nucleic acid that is codon-optimized for expression in E. coli cells.
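To make the idea of codon optimization concrete, the sketch below back-translates a protein sequence using one codon per amino acid. This is a deliberately naive scheme and is not the procedure used to obtain SEQ ID NO: 22-32, which are not reproduced in this section. The codon chosen for each amino acid is an assumption (a codon commonly reported as frequently used in E. coli); real optimization tools also balance codon usage, GC content and mRNA structure. The names `PREFERRED_CODON`, `back_translate` and `translate` are hypothetical.

```python
# One codon per amino acid. Every entry is a valid codon of the standard
# genetic code for that residue; the particular choice is an assumption.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

# Reverse lookup, sufficient to decode sequences produced by back_translate.
CODON_TO_RESIDUE = {codon: aa for aa, codon in PREFERRED_CODON.items()}


def back_translate(protein: str) -> str:
    """Return a DNA coding sequence using one fixed codon per residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


def translate(dna: str) -> str:
    """Decode DNA produced by back_translate into its protein sequence."""
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join(CODON_TO_RESIDUE[c] for c in codons)


# SEQ ID NO:5 as printed in the repeat-unit list of this document.
SEQ_ID_NO_5 = "GERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGRE"
dna = back_translate(SEQ_ID_NO_5)
print(dna[:30], len(dna))
```

Translating the result back recovers the input protein, which is a quick sanity check that the codon table is internally consistent.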

The term "operably linked" means a configuration in which a control sequence is placed at an appropriate position relative to the coding sequence of a polynucleotide such that the control sequence directs the expression of the coding sequence.

The degree of relatedness between two amino acid sequences or between two nucleotide sequences is described by the parameter "sequence identity". For the purposes of the present invention, sequence identity between two amino acid sequences is determined using the Needleman-Wunsch algorithm (Needleman and Wunsch, 1970, J. Mol. Biol. 48:443-453) as implemented in the Needle program of the EMBOSS software package (EMBOSS: The European Molecular Biology Open Software Suite, Rice et al., 2000, Trends Genet. 16:276-277), preferably version 5.0.0 or later. The parameters used are a gap opening penalty of 10, a gap extension penalty of 0.5, and the EBLOSUM62 (the EMBOSS version of BLOSUM62) substitution matrix. The output of Needle labeled "longest identity" (obtained using the -nobrief option) is used as the percent identity and is calculated as follows:

(identical residues × 100) / (alignment length − total number of gaps in the alignment)

For the purposes of the present invention, sequence identity between two deoxyribonucleotide sequences is determined using the Needleman-Wunsch algorithm (Needleman and Wunsch, 1970, supra) as implemented in the Needle program of the EMBOSS software package (EMBOSS: The European Molecular Biology Open Software Suite, Rice et al., 2000, supra), preferably version 5.0.0 or later. The parameters used are a gap opening penalty of 10, a gap extension penalty of 0.5, and the EDNAFULL (the EMBOSS version of NCBI NUC4.4) substitution matrix. The output of Needle labeled "longest identity" (obtained using the -nobrief option) is used as the percent identity and is calculated as follows:

(identical deoxyribonucleotides × 100) / (alignment length − total number of gaps in the alignment)
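The percent-identity formula above can be checked by hand on a small example. The Python sketch below applies it to two sequences that have already been aligned (gaps written as `-`); it does not perform the Needleman-Wunsch alignment itself, which the text assigns to the EMBOSS Needle program. The function name `percent_identity` and the two short aligned strings are made up for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Implements (identical positions x 100) /
    (alignment length - total number of gaps in the alignment).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = list(zip(aligned_a, aligned_b))
    # A column counts as a gap if either sequence has a gap character there.
    gaps = sum(1 for a, b in pairs if a == gap or b == gap)
    identical = sum(1 for a, b in pairs if a == b and a != gap)
    denominator = len(pairs) - gaps
    if denominator == 0:
        raise ValueError("alignment contains only gap columns")
    return identical * 100 / denominator


# Hypothetical 9-column alignment: 1 gap column, 7 identical columns.
a = "GERGGPGER"
b = "GER-GPGDR"
print(percent_identity(a, b))
```

For this toy alignment the calculation is 7 × 100 / (9 − 1), so gap columns are excluded from the denominator rather than counted as mismatches.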

Polypeptides

The present invention provides polypeptides comprising one or more repeat units linked directly or through a linker, the repeat units comprising an amino acid sequence selected from SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19 or 21, or a variant thereof. A variant may be (1) an amino acid sequence in which one or more amino acid residues are mutated in the amino acid sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19 or 21 or (2) an amino acid sequence having at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to the amino acid sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19 or 21. In the polypeptides described herein, the mutation may be selected from substitution, addition, insertion or deletion. Preferably, the substitutions are conservative amino acid substitutions.

The polypeptides described herein may comprise a plurality of repeat units, e.g., 2-80 repeat units, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49 or 50 repeat units.

The linker in the polypeptides described herein may comprise one or more amino acid residues, e.g., 1-50, 1-20, 1-10, 1-9, 1-8, 1-7, 1-6, 1-5, 1-4, 1-3 or 1-2 amino acid residues, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48 or 49 amino acid residues.

The polypeptides described herein are recombinant collagens, in particular recombinant type VI collagens, preferably having cell adhesion activity. In this context, recombinant type VI collagen is human recombinant type VI collagen.

The polypeptides described herein may also comprise an amino acid sequence selected from SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 or 21, or a variant thereof, said variant being (1) an amino acid sequence in which one or more amino acid residues are mutated in said amino acid sequence or (2) an amino acid sequence having at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to said amino acid sequence.

Nucleic acid constructs

The invention also relates to nucleic acid constructs comprising a nucleic acid of the invention operably linked to one or more control sequences that direct the expression of the coding sequence in a suitable host cell under conditions compatible with the control sequences. The vector may comprise a nucleic acid construct.

The nucleic acid can be manipulated in a variety of ways to provide for expression of the polypeptide. Depending on the expression vector, it may be desirable or necessary to manipulate the nucleic acid prior to insertion into the vector. Techniques for modifying nucleic acids using recombinant DNA methods are well known in the art.

The control sequence may be a promoter, i.e., a polynucleotide recognized by a host cell for expression of a polynucleotide encoding a polypeptide of the invention. Promoters comprise transcriptional control sequences that mediate the expression of a polypeptide. The promoter may be any nucleic acid that exhibits transcriptional activity in the host cell, including variant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.

Examples of suitable promoters for directing transcription of the vector or nucleic acid construct of the invention in a bacterial host cell are those obtained from: the Bacillus amyloliquefaciens alpha-amylase gene (amyQ), Bacillus licheniformis alpha-amylase gene (amyL), Bacillus licheniformis penicillinase gene (penP), Bacillus stearothermophilus maltogenic amylase gene (amyM), Bacillus subtilis levansucrase gene (sacB), Bacillus subtilis xylA and xylB genes, Bacillus thuringiensis cryIIIA gene, E. coli lac operon, and E. coli trc promoter.

In yeast hosts, useful promoters are obtained from the following genes: Saccharomyces cerevisiae enolase (ENO-1), Saccharomyces cerevisiae galactokinase (GAL1), Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH1, ADH2/GAP), Saccharomyces cerevisiae triose phosphate isomerase (TPI), Saccharomyces cerevisiae metallothionein (CUP1), and Saccharomyces cerevisiae 3-phosphoglycerate kinase.

The control sequence may also be a transcription terminator which is recognized by a host cell to terminate transcription. The terminator is operably linked to the 3' terminus of the polynucleotide encoding the polypeptide. Any terminator which is functional in the host cell may be used in the present invention.

Preferred terminators for bacterial host cells are obtained from the following genes: Bacillus clausii alkaline protease (aprH), Bacillus licheniformis alpha-amylase (amyL), and E. coli ribosomal RNA (rrnB).

Preferred terminators for yeast host cells are obtained from the following genes: Saccharomyces cerevisiae enolase, Saccharomyces cerevisiae cytochrome C (CYC1), and Saccharomyces cerevisiae glyceraldehyde-3-phosphate dehydrogenase. Other useful terminators for yeast host cells are described by Romanos et al. (1992, supra).

The control sequence may also be an mRNA stabilizing region downstream of the promoter and upstream of the coding sequence of the gene, which enhances expression of the gene.

Examples of suitable mRNA stabilizing regions are obtained from the following genes: the Bacillus thuringiensis cryIIIA gene (WO 94/25612) and the Bacillus subtilis SP82 gene (Hue et al., 1995, Journal of Bacteriology 177:3465-3471).

The control sequence may also be a leader sequence, i.e., an untranslated region of an mRNA that is important for translation by the host cell. The leader sequence is operably linked to the 5' terminus of the polynucleotide encoding the polypeptide. Any leader sequence that is functional in the host cell may be used.

Suitable leader sequences for yeast host cells are obtained from the following genes: Saccharomyces cerevisiae enolase (ENO-1), Saccharomyces cerevisiae 3-phosphoglycerate kinase, Saccharomyces cerevisiae alpha-factor, and Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH2/GAP).

The control sequence may also be a polyadenylation sequence, a sequence operably linked to the 3' terminus of the polynucleotide and which, when transcribed, is recognized by the host cell as a signal to add polyadenosine residues to transcribed mRNA. Any polyadenylation sequence which is functional in the host cell may be used.

Useful polyadenylation sequences for yeast host cells are described by Guo and Sherman, 1995, Mol. Cellular Biol. 15:5983-5990.

The control sequence may also be a signal peptide coding region encoding a signal peptide linked to the N-terminus of the polypeptide and directing the polypeptide into the cell's secretory pathway. The 5' -end of the coding sequence of the polynucleotide may itself contain a signal peptide coding sequence naturally linked in translation reading frame to a segment of the coding sequence encoding a polypeptide. Alternatively, the 5' -end of the coding sequence may contain a signal peptide coding sequence that is foreign to the coding sequence. In cases where the coding sequence does not naturally contain a signal peptide coding sequence, an exogenous signal peptide coding sequence may be required. Alternatively, the foreign signal peptide coding sequence may simply replace the natural signal peptide coding sequence in order to enhance secretion of the polypeptide. However, any signal peptide coding sequence that directs the expressed polypeptide into the secretory pathway of a host cell may be used.

Effective signal peptide coding sequences for bacterial host cells are those obtained from the following genes: Bacillus NCIB 11837 maltogenic amylase, Bacillus licheniformis subtilisin, Bacillus licheniformis beta-lactamase, Bacillus stearothermophilus alpha-amylase, Bacillus stearothermophilus neutral proteases (nprT, nprS, nprM), and Bacillus subtilis prsA. Additional signal peptides are described by Simonen and Palva, 1993, Microbiological Reviews 57:109-137.

Useful signal peptides for yeast host cells are obtained from the following genes: Saccharomyces cerevisiae alpha-factor and Saccharomyces cerevisiae invertase. Other useful signal peptide coding sequences are described by Romanos et al. (1992, supra).

Type VI collagen

Collagen plays an important role in maintaining the structure and function of the cell matrix. Type VI collagen forms a fine network distributed among the type I, type III and type V collagen fibers and connects these fibers to the basement membrane. It serves as an anchor point between collagens and as an attachment site for cells in the matrix, and plays an important role in matrix homeostasis. Under the electron microscope, type VI collagen is dumbbell-shaped, with a central rod-like portion of about 150 nm and globular regions at both ends. Type VI collagen consists of three different peptide chains, α1, α2 and α3. The human type VI collagen α1 and α2 chain genes are located at 21q22.3, are 36 kb in length and consist of 30 exons. The α3 chain gene is located at 2q37. Each peptide chain contains a helical region flanked by large globular regions at the N- and C-termini. The two globular regions consist mainly of von Willebrand factor A repeat regions, each with a molecular weight of about 21 kDa, and the number of such repeats at the N- and C-terminal ends differs among the α1, α2 and α3 chains. Studies have shown that the A domain is centered on a beta sheet flanked by 3 alpha helices. Type VI collagen contains 18 non-helical von Willebrand factor A repeat regions, each with a repeat sequence of 200 amino acid residues, of which α3 contains 12 and α1 and α2 each contain 3. Within the microfibrils, interruptions in the Gly-X-Y repeat divide the helical structure into multiple segments. Type VI collagen forms monomers, dimers and tetramers through disulfide bonds. Two monomers associate in an antiparallel manner with a 75 nm overlap to form a dimer, with the globular regions remaining at both ends; the C-terminus can be stabilized by disulfide bonds to the adjacent helical structure. Two dimers associate in parallel to form a tetramer.
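The Gly-X-Y periodicity mentioned above can be inspected directly on a sequence. The following Python sketch reads a sequence in a fixed frame of three residues and reports every triplet that does not begin with glycine. It is a naive check under stated assumptions: the frame starts at the first residue, and a real interruption may shift the frame for everything after it. The function name `gly_xy_interruptions` is hypothetical; SEQ ID NO:5 is taken from the repeat-unit list earlier in this document.

```python
from typing import List


def gly_xy_interruptions(sequence: str) -> List[int]:
    """Return 0-based start positions of triplets that do not begin with Gly.

    The sequence is read in a fixed frame of three starting at position 0;
    an empty list means every triplet fits the Gly-X-Y pattern.
    """
    seq = sequence.upper()
    return [i for i in range(0, len(seq), 3) if seq[i] != "G"]


# SEQ ID NO:5 as printed in the repeat-unit list of this document.
SEQ_ID_NO_5 = "GERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGRE"
print(gly_xy_interruptions(SEQ_ID_NO_5))

# A made-up sequence whose third triplet starts with Ala instead of Gly.
print(gly_xy_interruptions("GPPGAPAPP"))
```

Because the frame is fixed, positions reported after the first interruption should be read with caution; a more careful analysis would re-synchronize the frame on the next glycine.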
Recent studies have shown that the C-terminal region, although a non-helical region of collagen type VI, is important for its function, and is required for the formation of both dimers and tetramers. Extracellular, tetramer terminal-to-terminal linkage forms a microfibrillar structure, while tetramer-to-tetramer bonding is non-covalent, which is assumed to be a region a interaction. Type II collagen and aggrecan bind to type VI collagen via a leucine-rich repeat-rich proteoglycan complex. By binding in this way, type VI collagen plays a dominant role in tissue formation, construction and self-stabilization, either at an early stage of tissue formation or during repair of wounds or fractures. Type VI collagen can also interweave with type V collagen between fibers, in combination with mucopolysaccharide to maintain the structure of the epidermal matrix. In addition to the above several collagens, type VI collagen can be bound to other macromolecules such as type IV collagen, type XIV collagen, decorin, microfibrillar associated glycoprotein 1, hyaluronic acid, 1B2 and ot2pl binding elements, and cell surface proteoglycans NG, etc., and the VI collagen stabilizes the structure of the extracellular matrix by binding to these macromolecules. By maintaining the structure and function of the extracellular matrix, type VI collagen is able to maintain the integrity of blood vessels, lungs, cartilage, muscle, skin, etc., and if the gene encoding type VI collagen is mutated, bethlem myopathy and Ullrich syndrome will develop, resulting in muscle weakness and wasting. Loss of type VI collagen can also lead to loss of mitochondrial function and apoptosis. On the other hand, the increase or accumulation of type VI collagen also causes diseases such as superficial fibroids, neurofibromas, keloids, pulmonary fibrosis, liver fibrosis, diabetic kidney injury, and rheumatoid arthritis. 
Type VI collagen also affects cell differentiation, adhesion, migration, proliferation, and survival. The cardiomyocytes were cultured in a matrix containing collagen VI and differentiation of myofibroblasts was found to occur due to induction of pro-VI, as well as in vivo experiments. In the study of the function of type VI collagen, type VI collagen has strong adhesion to various oriented hematopoietic stem cells, and the position where such adhesion is exerted is defined in the helical region of 3 peptide chains. Collagen VI also promotes proliferation of various cells, which can be blocked by the individual peptide chains of collagen VI. While enhanced fiber cell diffusion and migration is achieved by the combination of type VI collagen with cell surface proteoglycans NG, decorin, syndecan, hyaluronic acid, and other types of collagen.

Expression vector

The invention also relates to recombinant expression vectors comprising the nucleic acids of the invention, promoters, and transcriptional and translational stop signals. The nucleic acid and control sequences may be linked together to produce a recombinant expression vector, which may include one or more convenient restriction sites to allow for insertion or substitution of a polynucleotide encoding the polypeptide at such sites. Alternatively, the polynucleotide may be expressed by inserting the nucleic acid or a nucleic acid construct comprising the nucleic acid into an appropriate vector for expression. In generating the expression vector, the coding sequence is located in the vector such that the coding sequence is operably linked to appropriate control sequences for expression.

The recombinant expression vector may be any vector (e.g., a plasmid or virus) that can be conveniently subjected to recombinant DNA procedures and that can cause expression of the polynucleotide. The choice of vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced. The vector may be a linear or closed circular plasmid.

The vector may be an autonomously replicating vector, i.e., a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome. The vector may contain any means for ensuring self-replication. Alternatively, the vector may be one that, when introduced into a host cell, integrates into the genome and replicates together with one or more chromosomes into which it has been integrated. Furthermore, a single vector or plasmid or two or more vectors or plasmids may be used, which together contain the total DNA to be introduced into the genome of the host cell, or transposons may be used.

The vector preferably contains one or more selectable markers that allow convenient selection of cells, such as transformed cells, transfected cells, transduced cells, or the like. A selectable marker is a gene the product of which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like.

Examples of bacterial selectable markers are the Bacillus licheniformis or Bacillus subtilis dal genes, or markers that confer antibiotic resistance (e.g., ampicillin, chloramphenicol, kanamycin, neomycin, spectinomycin, or tetracycline resistance). Suitable markers for yeast host cells include, but are not limited to: ADE2, HIS3, LEU2, LYS2, MET3, TRP1, and URA3.

The selectable marker may be a dual selectable marker system as described in WO 2010/039889. In one aspect, the dual selectable marker is an hph-tk dual selectable marker system.

The vector may contain elements that allow the vector to integrate into the host cell genome or the vector to autonomously replicate in the cell independent of the genome.

For integration into the host cell genome, the vector may rely on the polynucleotide sequence encoding the polypeptide or on any other element of the vector for integration into the genome by homologous or non-homologous recombination. Alternatively, the vector may contain additional polynucleotides for directing integration by homologous recombination at precise locations in the chromosomes of the host cell genome. To increase the likelihood of integration at a precise location, the integration element should contain a sufficient number of nucleotides, for example 100 to 10,000 base pairs, 400 to 10,000 base pairs, or 800 to 10,000 base pairs, which have a high degree of sequence identity with the corresponding target sequence to enhance the probability of homologous recombination. The integration element may be any sequence homologous to a target sequence within the host cell genome. Furthermore, the integration elements may be non-coding or coding polynucleotides. On the other hand, the vector may be integrated into the genome of the host cell by non-homologous recombination.

For autonomous replication, the vector may further comprise an origin of replication which makes autonomous replication of the vector in the host cell in question possible. The origin of replication may be any plasmid replicon that mediates autonomous replication that functions in a cell. The term "origin of replication" or "plasmid replicon" means a polynucleotide that enables a plasmid or vector to replicate in vivo.

Examples of bacterial origins of replication are the origins of replication of plasmids pBR322, pUC19, pACYC177, and pACYC184, which allow replication in E. coli, and the origins of replication of plasmids pUB110, pE194, pTA1060, and pAMβ1, which allow replication in Bacillus.

Examples of replication origins for use in yeast host cells are the 2 micron origin of replication, ARS1, ARS4, a combination of ARS1 and CEN3, and a combination of ARS4 and CEN6.

More than one copy of a polynucleotide of the invention may be inserted into a host cell to enhance production of the polypeptide. Increased copy number of a polynucleotide may be obtained by integrating at least one additional copy of the sequence into the host cell genome or by including an amplifiable selectable marker gene with the polynucleotide, wherein cells comprising amplified copies of the selectable marker gene and thereby additional copies of the polynucleotide may be selected by culturing the cells in the presence of an appropriate selectable agent.

Procedures for ligating the elements described above to construct recombinant expression vectors of the invention are well known to those of ordinary skill in the art (see, e.g., Sambrook et al., 1989).

Host cells

The invention also relates to recombinant host cells comprising a polynucleotide of the invention operably linked to one or more control sequences that direct the production of a polypeptide of the invention. The construct or vector comprising the polynucleotide is introduced into a host cell such that the construct or vector is maintained as a chromosomal integrant or as an autonomously replicating extrachromosomal vector, as described earlier. The term "host cell" encompasses any progeny of a parent cell that is not identical to the parent cell due to mutations that occur during replication. The choice of host cell will depend to a large extent on the gene encoding the polypeptide and its source.

The host cell may be any cell useful in the recombinant production of the polypeptides of the invention, e.g., a prokaryote or eukaryote.

The prokaryotic host cell may be any gram-positive or gram-negative bacterium. Gram-positive bacteria include, but are not limited to: Bacillus, Clostridium, Enterococcus, Geobacillus, Lactobacillus, Lactococcus, Oceanobacillus, Staphylococcus, Streptococcus and Streptomyces. Gram-negative bacteria include, but are not limited to: Campylobacter, Escherichia coli, Flavobacterium, Fusobacterium, Helicobacter, Ilyobacter, Neisseria, Pseudomonas, Salmonella and Ureaplasma.

The host cell may also be a eukaryotic organism, such as a mammalian, insect, plant or fungal cell. Plant cells herein do not include plant cells that can be regenerated into plants. Animal cells also do not include cells that produce an animal body.

The host cell may be a fungal cell, such as a cell of the Basidiomycota, Chytridiomycota, Zygomycota, Oomycota, and the like. The fungal host cell may be a yeast cell, including ascosporogenous yeast (Endomycetales), basidiosporogenous yeast, and yeast belonging to the Fungi Imperfecti (Blastomycetes). The yeast host cell may be a Candida, Hansenula, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces or Yarrowia cell, such as a Kluyveromyces lactis, Saccharomyces carlsbergensis, Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii, Saccharomyces kluyveri, Saccharomyces norbensis, Saccharomyces oviformis or Yarrowia lipolytica cell.

Production method

The invention also relates to the production of a polypeptide as described herein, comprising:

(1) Culturing a host cell described herein under suitable culture conditions;

(3) Purifying the polypeptide.

The host cells are cultured in a nutrient medium suitable for producing the polypeptides using methods known in the art. For example, the cells may be cultured by shake flask culture, or by small-scale or large-scale fermentation (including continuous, batch, fed-batch, or solid-state fermentation) in laboratory or industrial fermentors, performed in a suitable medium and under conditions allowing the polypeptide to be expressed and/or isolated. Culturing takes place in a suitable nutrient medium containing carbon and nitrogen sources and inorganic salts, using procedures known in the art. Suitable media are available from commercial suppliers or may be prepared according to published compositions (e.g., in catalogues of the American Type Culture Collection). If the polypeptide is secreted into the nutrient medium, the polypeptide can be recovered directly from the medium. If the polypeptide is not secreted, it can be recovered from the cell lysate.

The polypeptides may be detected using methods known in the art that are specific for the polypeptides. These detection methods include, but are not limited to: the use of specific antibodies, the formation of enzyme products or the disappearance of enzyme substrates. For example, an enzyme assay may be used to determine the activity of a polypeptide.

Methods known in the art may be used to recover the polypeptide. For example, the polypeptide may be recovered from the nutrient medium by conventional procedures including, but not limited to, collection, centrifugation, filtration, extraction, spray-drying, evaporation, or precipitation. In one aspect, a fermentation broth comprising the polypeptide is recovered.

The polypeptides may be purified by a variety of procedures known in the art, including, but not limited to, chromatography (e.g., ion exchange chromatography, affinity chromatography, hydrophobic chromatography, chromatofocusing, and size exclusion chromatography), electrophoretic procedures (e.g., preparative isoelectric focusing), differential solubility (e.g., ammonium sulfate precipitation), SDS-PAGE, or extraction, in order to obtain a substantially pure polypeptide.

Step (1) may comprise one or more of the following steps. An expression plasmid is constructed, for example by inserting the coding nucleotide sequence into a pET-28a-Trx-His expression vector to obtain a recombinant expression plasmid. The successfully constructed expression plasmid can be transformed into E. coli cells (e.g., E. coli competent cells BL21(DE3)). The specific process can be as follows: (1) adding the plasmid to be transformed to E. coli competent cells BL21(DE3); (2) placing the mixture in an ice bath (for example, 10-60 min, for example, 30 min), then heat-shocking it in a water bath (for example, at 40-50 °C, for example, 42 °C, for 45-90 s), taking it out, and placing it in an ice bath again (for example, 1-5 min, for example, 2 min); (3) adding liquid LB medium and culturing (e.g., at 35-40 °C, e.g., 37 °C, and 150-300 rpm, e.g., 220 rpm, for 40-80 min, e.g., 60 min); (4) plating the bacterial liquid and picking single colonies. For example, the bacterial liquid is spread evenly on an LB plate containing ampicillin sodium, and the plate is incubated at 37 °C for 15-17 hours until colonies of uniform size have grown.

Step (2) may comprise culturing the single colony in LB medium containing the antibiotic stock (e.g., at 150-300 rpm, e.g., 220 rpm, and 35-40 °C, e.g., 37 °C, in a thermostatic shaker for 5-10 h, e.g., 7 h). After cultivation, the shake flask is cooled to 10-20 °C, for example 16 °C, IPTG is added to induce expression for a period of time, and the cells are then collected (for example, by centrifugation).

Step (3) may include resuspending the bacterial cells in an equilibration working solution, cooling the suspension to 15 °C or below, homogenizing it (e.g., by high-pressure homogenization, e.g., 1-5 times, e.g., 2 times), and separating the homogenate to obtain a supernatant. The equilibration working solution may comprise 100-500 mM sodium chloride, 10-50 mM Tris and 10-50 mM imidazole, pH 7-9. For example, the concentration of sodium chloride may be 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480 or 490 mM. The concentration of Tris may be 10, 15, 20, 25, 30, 35, 40, 45 or 50 mM. The concentration of imidazole may be 10, 15, 20, 25, 30, 35, 40, 45 or 50 mM. The pH may be 7, 7.5, 8, 8.5 or 9.

Step (3) may comprise purifying and cleaving the polypeptide. Purification may be crude purification, in which the supernatant is purified on a Ni-Sepharose column to obtain an eluate containing the target protein. Crude purification may include washing the column with water, e.g., 2-10 column volumes (CV), e.g., 5 CV. The column may then be equilibrated with an equilibration solution (200 mM sodium chloride, 25 mM Tris, 20 mM imidazole, pH 8.0), for example 2-10 CV, for example 5 CV. The equilibration solution may comprise 100-500 mM sodium chloride, 10-50 mM Tris and 10-50 mM imidazole, pH 7-9. For example, the concentration of sodium chloride may be 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480 or 490 mM. The concentration of Tris may be 10, 15, 20, 25, 30, 35, 40, 45 or 50 mM. The concentration of imidazole may be 10, 15, 20, 25, 30, 35, 40, 45 or 50 mM. The pH may be 7, 7.5, 8, 8.5 or 9.
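As an illustration of how the example equilibration solution above (200 mM sodium chloride, 25 mM Tris, 20 mM imidazole) translates into weighed amounts, the following minimal Python sketch converts millimolar concentrations into grams per batch. The molar masses are standard reference values and are not stated in this application; Tris is assumed to be weighed as Tris base, and the function name `grams_needed` is hypothetical. Adjusting the pH to 8.0 is not modelled.

```python
# Molar masses in g/mol. These are standard reference values, assumed here;
# the application itself does not state them. Tris is taken as Tris base.
MOLAR_MASS = {
    "sodium chloride": 58.44,
    "Tris": 121.14,
    "imidazole": 68.08,
}


def grams_needed(conc_mM, volume_L):
    """Return the mass in grams of each solute for the given mM concentrations."""
    return {
        name: mM / 1000.0 * volume_L * MOLAR_MASS[name]
        for name, mM in conc_mM.items()
    }


# Example equilibration solution taken from the text (pH adjustment not modelled).
EQUILIBRATION = {"sodium chloride": 200, "Tris": 25, "imidazole": 20}

if __name__ == "__main__":
    for name, grams in grams_needed(EQUILIBRATION, 1.0).items():
        print(f"{name}: {grams:.2f} g per litre")
```

The same function can be applied to the wash and elution solutions by passing a different concentration dictionary.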

Step (3) may include loading the supernatant onto the column and washing the column with a wash solution. The wash solution may comprise 100-500 mM sodium chloride, 10-50 mM Tris and 10-50 mM imidazole, pH 7-9. For example, the concentration of sodium chloride may be 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480 or 490 mM. The concentration of Tris may be 10, 15, 20, 25, 30, 35, 40, 45 or 50 mM. The concentration of imidazole may be 10, 15, 20, 25, 30, 35, 40, 45 or 50 mM. The pH may be 7, 7.5, 8, 8.5 or 9. Then an elution solution may be added and the flow-through collected. The elution solution may comprise 100-500 mM sodium chloride, 10-50 mM Tris and 100-500 mM imidazole, pH 8.0. For example, the concentration of sodium chloride may be 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480 or 490 mM. The concentration of Tris may be 10, 15, 20, 25, 30, 35, 40, 45 or 50 mM. The concentration of imidazole may be 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480 or 490 mM. The pH may be 7, 7.5, 8, 8.5 or 9.

Cleavage may include adding TEV protease (at a ratio of total protein to total TEV enzyme of 10-100:1, e.g., 50:1, at 10-20 °C, e.g., 16 °C, for 2-8 h, e.g., 4 h). The digested protein solution is dialyzed, e.g., in a dialysis bag, at 1-6 °C, e.g., 4 °C, for 1-8 hours, e.g., 2 hours, and then transferred to fresh dialysate for overnight dialysis at 1-6 °C, e.g., 4 °C.
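The enzyme dosage implied by the protein-to-TEV ratio can be worked out with a short calculation. The sketch below assumes the ratio is a mass ratio (the text does not say whether it is by mass or by moles) and uses a hypothetical helper name, `tev_amount_mg`; it only enforces the 10-100:1 range given above.

```python
def tev_amount_mg(total_protein_mg, protein_to_tev_ratio=50.0):
    """Return the TEV enzyme mass (mg) for a given total protein mass (mg).

    Assumption: the ratio is a mass ratio of total protein to total TEV enzyme.
    """
    if not 10.0 <= protein_to_tev_ratio <= 100.0:
        raise ValueError("ratio outside the 10-100:1 range given in the text")
    return total_protein_mg / protein_to_tev_ratio


if __name__ == "__main__":
    # 100 mg of eluted protein at the example ratio of 50:1
    print(f"{tev_amount_mg(100.0):.1f} mg TEV")
```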

Purification may include fine purification (e.g., for a protein with an isoelectric point > 8.0). Preferably, fine purification involves gradient elution of the eluate containing the protein of interest, or of the cleaved product (e.g., the cleaved and dialyzed product), on a strong anion exchange chromatography column (e.g., at pH 7, 7.5, 8, 8.5, or 9). Gradient elution involves 0-15% solution B over 1-5 minutes followed by 1-5, e.g., 3, column volumes; 15-30% solution B over 1-5 minutes followed by 1-5, e.g., 3, column volumes; 30-50% solution B over 1-5 minutes followed by 1-5, e.g., 3, column volumes; and 50-100% solution B over 1-5 minutes followed by 1-5, e.g., 3, column volumes. Solution B may comprise 10-50 mM Tris and 0.5-5 M sodium chloride, pH 7-9. For example, the concentration of Tris is 15, 20, 25, 30, 35, 40 or 45 mM. The concentration of sodium chloride is 1, 2, 3 or 4 M. The pH may be 7, 7.5, 8, 8.5 or 9. The purification may involve equilibrating the column with solution A and loading the sample, followed by gradient elution. Solution A may comprise 10-50 mM Tris and 10-50 mM sodium chloride, pH 7-9. For example, the concentration of Tris is 15, 20, 25, 30, 35, 40 or 45 mM. The concentration of sodium chloride is 15, 20, 25, 30, 35, 40 or 45 mM. The pH may be 7, 7.5, 8, 8.5 or 9.
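To make the gradient steps concrete, the following sketch computes the sodium chloride concentration reached at each gradient boundary by linear mixing of solutions A and B. It assumes ideal volumetric mixing and picks one allowed composition from the ranges above for illustration (solution A with 20 mM sodium chloride, solution B with 1 M sodium chloride); these specific values and the function name `salt_at_percent_b` are illustrative choices, not values fixed by the text.

```python
def salt_at_percent_b(percent_b, salt_a_mM, salt_b_mM):
    """Sodium chloride concentration (mM) of an A/B mixture at a given %B."""
    if not 0.0 <= percent_b <= 100.0:
        raise ValueError("percent_b must be between 0 and 100")
    fraction_b = percent_b / 100.0
    return salt_a_mM * (1.0 - fraction_b) + salt_b_mM * fraction_b


# Illustrative compositions chosen from the stated ranges (assumptions).
SALT_A_MM = 20.0     # solution A: 10-50 mM sodium chloride
SALT_B_MM = 1000.0   # solution B: 0.5-5 M sodium chloride

if __name__ == "__main__":
    # Boundaries of the four gradient segments described in the text.
    for pct in (0, 15, 30, 50, 100):
        conc = salt_at_percent_b(pct, SALT_A_MM, SALT_B_MM)
        print(f"{pct:>3}% B -> {conc:.0f} mM sodium chloride")
```

Under these assumptions the four segments span roughly 20-167, 167-314, 314-510 and 510-1000 mM sodium chloride.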

Purification may include reverse nickel column purification (e.g., for a protein with an isoelectric point < 8.0). Reverse nickel column purification may include purifying the digested product (e.g., the dialyzed product) on a Ni-agarose gel column. The elution solution may comprise 10-50 mM (e.g., 15, 20, 25, 30, 35, 40, or 45 mM) Tris, 10-50 mM (e.g., 15, 20, 25, 30, 35, 40, or 45 mM) sodium chloride, and 0.5-5 M (e.g., 1, 2, 3, or 4 M) imidazole, at pH 7-9 (e.g., 7, 7.5, 8, 8.5, or 9).

The invention has the following advantages: 1. the polypeptide of the invention is derived from type VI collagen and is a recombinant type VI collagen; 2. the polypeptide of the invention is suitable for production in Escherichia coli and can be separated and purified; 3. the polypeptide of the invention is expressed at a high level and is suitable for subsequent purification.

The following examples are further provided to illustrate the invention.

Examples

The invention is further illustrated by the following examples, but any examples or combinations thereof should not be construed as limiting the scope or embodiments of the invention. The scope of the present invention is defined by the appended claims, and the scope of the claims will be apparent to those skilled in the art from consideration of the specification and the common general knowledge in the field. Any modifications or variations of the technical solution of the present invention may be carried out by those skilled in the art without departing from the spirit and scope of the present invention, and such modifications and variations are also included in the scope of the present invention.

Example 1: Construction, expression and screening of type VI collagen fragments

1. Large-scale screening of functional regions yielded the following target-gene functional regions of recombinant humanized type VI collagen with different functions

Amino acid sequence

(1)C6a

GPPGLRGDPGFEGERGKPGLPGEKGEAGDPGRPGDLGPVGYQGMKGEK

GSRGEKGSRGPKGYKGEKGKRGIDGVDGVKGEMGYPGLPGCKGSPGFD

GIQGPPGPKGDPGAFGLKGEKGEPGADGEAGRPGSSGPSGDEGQPGEPGP

PGEKGEAGDEGNPGPDGAPGERGGPGERGPRGTPGTRGPRGDPGEAGPQ

GDQGREGPVGVPGDPGEAGPIGPKGYRGDEGPPGSEGARGAPGPAGPPG

DPGLMGERGEDGPAGNGTEGFPGFPGYPGNRGAPGINGTKGYPGLKGDEGEAGDPGDDNNDIAPRGVKGAKGYRGPEGPQGPPGHQGPPGPD(SEQ ID NO:21)

(2)C6b

GPPGLRGDPGFEGERGKPGLPGEKGEAGDPGRPGDLGPVGYQGMKG

EKGSRGEKGSRGPKGYKGEKGKRGIDGVDGVKGEMGYPGLPGCKGSPG

FDGIQGPPGPKGDPGAFGLKGEKGEPGADGEAGRPGSSGPSGDEGQPGEPGPPGEKGEAGDEGNPGPDGAP(SEQ ID NO:9)

GPPGLRGDPGFEGERGKPGLPGEKGEAGDPGRPGDLGPVGYQGMKG

EKGSRGEKGSRGPKGYKGEKGKRGIDGVDGVKGEMGYPGLPGCKGSPG

FDGIQGPPGPKGDPGAFGLKGEKGEPGADGEAGRPGSSGPSGDEGQPGEPGPPGEKGEAGDEGNPGPDGAP (the amino acid sequence of the repeating unit of C6b is SEQ ID NO:9, the number of repeating units is 2, the amino acid sequence of C6b is SEQ ID NO: 10)

(3)C6c

GERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGREGPVGVPGDPG

EAGPIGPKGYRGDEGPPGSEGARGAPGPAGPPGDPGLMGERGEDGPA(SEQ ID NO:1)

GERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGREGPVGVPGDPG

EAGPIGPKGYRGDEGPPGSEGARGAPGPAGPPGDPGLMGERGEDGPA

GERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGREGPVGVPGDPG

EAGPIGPKGYRGDEGPPGSEGARGAPGPAGPPGDPGLMGERGEDGPA (the amino acid sequence of the repeating unit of C6c is SEQ ID NO:1, the number of repeating units is 3, the amino acid sequence of C6c is SEQ ID NO: 2)

(4)C6d

GIQGPPGPKGDPGAFGLKGEKGEPGADGEAGRPGSSGPSGDEGQPGE

PGPPGEKGEAGDEGNPGPDGAPGERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGREGPVGVPGDPGEAGPIGPKGYRGDEGPP(SEQ ID NO:11)

GIQGPPGPKGDPGAFGLKGEKGEPGADGEAGRPGSSGPSGDEGQPGE

PGPPGEKGEAGDEGNPGPDGAPGERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGREGPVGVPGDPGEAGPIGPKGYRGDEGPP (the amino acid sequence of the repeating unit of C6d is SEQ ID NO:11, the number of repeating units is 2, the amino acid sequence of C6d is SEQ ID NO: 12)

(5)C6e

GPPGLRGDPGFEGERGKPGLPGEKGEAGDPGRPGDL(SEQ ID NO:13)

GPPGLRGDPGFEGERGKPGLPGEKGEAGDPGRPGDL

GPPGLRGDPGFEGERGKPGLPGEKGEAGDPGRPGDL (the amino acid sequence of the repeating unit of C6e is SEQ ID NO:13, the number of repeating units is 6, the amino acid sequence of C6e is SEQ ID NO: 14)

(6)C6f

GMKGEKGSRGEKGSRGPKGYKGEKGKRGIDGVDGVKGEM(SEQ ID NO:15)

GMKGEKGSRGEKGSRGPKGYKGEKGKRGIDGVDGVKGEM

GMKGEKGSRGEKGSRGPKGYKGEKGKRGIDGVDGVKGEM (the amino acid sequence of the repeating unit of C6f is SEQ ID NO:15, the number of repeating units is 6, the amino acid sequence of C6f is SEQ ID NO: 16)

(7)C6g

GCKGSPGFDGIQGPPGPKGDPGAFGLKGEKGEPGADGEAGRP(SEQ ID NO:3)

GCKGSPGFDGIQGPPGPKGDPGAFGLKGEKGEPGADGEAGRP

GCKGSPGFDGIQGPPGPKGDPGAFGLKGEKGEPGADGEAGRP (the amino acid sequence of the repeating unit of C6g is SEQ ID NO:3, the number of repeating units is 6, the amino acid sequence of C6g is SEQ ID NO: 4)

(8)C6h

GSSGPSGDEGQPGEPGPPGEKGEAGDEGNPGPDGAP(SEQ ID NO:17)

GSSGPSGDEGQPGEPGPPGEKGEAGDEGNPGPDGAP

GSSGPSGDEGQPGEPGPPGEKGEAGDEGNPGPDGAP (the amino acid sequence of the repeat unit of C6h is SEQ ID NO:17, the number of repeat units is 6, the amino acid sequence of C6h is SEQ ID NO: 18)

(9)C6i

GERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGRE(SEQ ID NO:5)

GERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGRE

GERGGPGERGPRGTPGTRGPRGDPGEAGPQGDQGRE (the amino acid sequence of the repeating unit of C6i is SEQ ID NO:5, the number of repeating units is 6, the amino acid sequence of C6i is SEQ ID NO: 6)

(10)C6j

GDPGEAGPIGPKGYRGDEGPPGSEGARGAPGPAGPPGDPGLMGERGEDGPA(SEQ ID NO:19)

GDPGEAGPIGPKGYRGDEGPPGSEGARGAPGPAGPPGDPGLMGERGE

DGPA

GDPGEAGPIGPKGYRGDEGPPGSEGARGAPGPAGPPGDPGLMGERGE

DGPA

GDPGEAGPIGPKGYRGDEGPPGSEGARGAPGPAGPPGDPGLMGERGEDGPA (the amino acid sequence of the repeating unit of C6j is SEQ ID NO:19, the number of repeating units is 4, the amino acid sequence of C6j is SEQ ID NO: 20)

(11)C6k

GTEGFPGFPGYPGNRGAPGINGTKGYPGLKGDEGEAGDPGDD(SEQ ID NO:7)

GTEGFPGFPGYPGNRGAPGINGTKGYPGLKGDEGEAGDPGDD

GTEGFPGFPGYPGNRGAPGINGTKGYPGLKGDEGEAGDPGDD (the amino acid sequence of the repeating unit of C6k is SEQ ID NO:7, the number of repeating units is 6, the amino acid sequence of C6k is SEQ ID NO: 8)
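The relationship between a repeating unit and the corresponding full-length sequence described above can be sketched in a few lines of Python. The example uses the C6e repeating unit (SEQ ID NO:13) and its stated repeat count of 6, and assumes the full sequence is a plain head-to-tail concatenation of the units with no linker residues, which is how the listing above reads but is not stated explicitly. The helper names `tandem_repeat` and `is_gly_xy` are hypothetical. The second function checks the Gly-X-Y pattern, that is, glycine at every third position.

```python
def tandem_repeat(unit, copies):
    """Concatenate `copies` head-to-tail copies of a repeating unit."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return unit * copies


def is_gly_xy(seq):
    """Return True if the sequence is an uninterrupted run of Gly-X-Y triplets."""
    return len(seq) % 3 == 0 and all(
        seq[i] == "G" for i in range(0, len(seq), 3)
    )


# Repeating unit of C6e, copied from the listing (SEQ ID NO:13).
C6E_UNIT = "GPPGLRGDPGFEGERGKPGLPGEKGEAGDPGRPGDL"

if __name__ == "__main__":
    c6e = tandem_repeat(C6E_UNIT, 6)  # 6 repeating units, per the text
    print(len(C6E_UNIT), len(c6e), is_gly_xy(c6e))
```

Because the unit length is a multiple of three, concatenation preserves the triplet register across unit boundaries, so the Gly-X-Y check applies to the whole construct.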

Nucleotide sequence

1.C6a

GGACCCCCAGGGTTGCGTGGTGATCCGGGCTTCGAAGGTGAACGCGGCAAACCGGGTCTGCCTGGCGAGAAGGGCGAGGCGGGTGACCCGGGCCGCCCTGGTGACCTGGGTCCGGTGGGTTATCAGGGTATGAAAGGCGAGAAGGGCTCGCGTGGCGAGAAGGGTAGCCGTGGTCCGAAAGGGTATAAAGGCGAGAAGGGCAAGCGCGGCATTGATGGCGTTGATGGTGTTAAAGGTGAAATGGGTTACCCGGGCCTGCCAGGGTGCAAAGGCTCTCCGGGTTTTGATGGTATCCAGGGTCCGCCGGGTCCGAAAGGCGACCCGGGCGCGTTTGGTTTGAAGGGCGAGAAGGGTGAGCCGGGGGCGGATGGTGAAGCCGGCCGTCCGGGTAGCAGCGGTCCGAGCGGTGATGAAGGTCAGCCGGGTGAACCGGGTCCGCCGGGAGAAAAGGGCGAGGCAGGCGACGAAGGAAACCCGGGTCCGGACGGCGCTCCGGGTGAGCGTGGCGGTCCGGGTGAGCGCGGTCCGCGTGGCACCCCGGGAACCCGTGGCCCGCGTGGCGACCCAGGCGAGGCGGGTCCGCAAGGTGACCAAGGTAGAGAAGGTCCGGTTGGCGTGCCGGGCGACCCGGGTGAGGCGGGTCCCATCGGTCCGAAGGGTTACCGTGGTGACGAGGGTCCTCCGGGCAGCGAAGGTGCGCGTGGAGCCCCTGGACCGGCAGGCCCACCGGGCGACCCAGGTCTGATGGGTGAGCGCGGTGAAGATGGTCCGGCGGGTAACGGTACGGAAGGCTTCCCGGGGTTCCCGGGATACCCGGGCAACCGTGGTGCGCCAGGTATCAATGGCACCAAAGGCTATCCGGGCCTTAAGGGCGACGAAGGCGAGGCTGGTGACCCGGGCGATGATAATAACGATATTGCACCGCGTGGCGTCAAAGGTGCTAAAGGTTACCGCGGTCCAGAAGGTCCGCAAGGTCCACCGGGTCACCAGGGTCCGCCGGGTCCGGAT(SEQ ID NO:22)

2.C6b

GGACCACCCGGGCTCCGTGGTGACCCGGGCTTCGAAGGCGAGCGCGGCAAACCGGGTTTGCCGGGTGAAAAGGGTGAAGCAGGCGACCCGGGTCGCCCAGGCGACCTGGGTCCGGTTGGTTACCAAGGTATGAAAGGGGAGAAGGGATCTAGAGGCGAAAAAGGCTCCCGCGGTCCGAAAGGCTATAAGGGCGAAAAGGGCAAGCGTGGCATTGATGGTGTCGATGGTGTTAAAGGCGAAATGGGTTATCCGGGTCTGCCGGGCTGCAAAGGTAGCCCGGGTTTTGATGGTATCCAGGGTCCGCCTGGTCCGAAGGGCGACCCGGGTGCGTTTGGTCTGAAAGGTGAGAAGGGCGAGCCGGGCGCGGATGGTGAGGCGGGTCGTCCGGGCAGCAGCGGTCCGAGCGGTGACGAAGGTCAGCCGGGCGAGCCGGGTCCGCCTGGTGAAAAGGGTGAGGCCGGTGACGAGGGCAACCCGGGTCCGGATGGCGCGCCGGGCCCACCGGGCTTACGTGGTGACCCGGGCTTCGAAGGTGAACGTGGTAAGCCGGGTTTGCCGGGTGAGAAGGGCGAAGCTGGCGACCCGGGCCGTCCGGGCGACCTGGGTCCGGTGGGTTATCAGGGTATGAAAGGCGAGAAGGGTTCTCGTGGCGAAAAGGGTTCCCGTGGTCCGAAAGGCTACAAAGGTGAAAAGGGAAAGCGCGGCATTGATGGTGTGGATGGCGTGAAAGGTGAGATGGGTTACCCGGGTCTGCCTGGTTGTAAAGGTTCCCCGGGATTCGACGGCATCCAGGGTCCGCCCGGTCCGAAAGGTGACCCGGGCGCGTTTGGCCTGAAAGGCGAAAAGGGCGAGCCGGGTGCCGATGGCGAGGCAGGACGTCCGGGGTCGAGCGGCCCAAGCGGTGATGAAGGCCAACCGGGTGAGCCGGGCCCACCGGGCGAGAAGGGTGAGGCTGGCGACGAAGGTAATCCGGGTCCGGATGGCGCGCCG(SEQ ID NO:23)

3.C6c

GGAGAAAGGGGGGGCCCGGGCGAGCGCGGCCCGCGTGGCACCCCGGGGACCCGTGGCCCGCGTGGTGACCCGGGCGAGGCTGGCCCGCAAGGTGATCAAGGTCGTGAAGGTCCGGTGGGCGTGCCGGGTGATCCGGGTGAGGCGGGCCCCATCGGTCCGAAAGGTTACCGTGGCGATGAGGGCCCTCCGGGTAGCGAAGGCGCGCGTGGCGCTCCGGGTCCGGCGGGTCCGCCAGGCGACCCGGGCCTGATGGGTGAACGCGGCGAAGATGGTCCGGCGGGCGAACGCGGTGGTCCGGGCGAGCGCGGTCCGCGTGGCACGCCGGGCACTCGCGGCCCACGTGGTGATCCGGGTGAGGCGGGTCCGCAGGGTGACCAGGGTCGTGAAGGTCCAGTTGGTGTTCCGGGCGACCCGGGTGAAGCCGGTCCGATTGGTCCGAAGGGTTACCGCGGCGACGAAGGCCCACCGGGTAGCGAAGGTGCCCGTGGCGCACCGGGTCCGGCAGGCCCACCGGGCGACCCGGGCTTGATGGGCGAGCGCGGCGAGGACGGCCCGGCTGGCGAGCGCGGTGGTCCGGGTGAGCGCGGCCCGCGTGGCACCCCGGGCACCCGTGGTCCGCGTGGTGACCCGGGTGAGGCAGGTCCTCAAGGTGATCAGGGCAGAGAAGGTCCGGTCGGTGTTCCGGGCGACCCGGGAGAGGCGGGTCCGATCGGCCCGAAGGGTTATCGTGGTGACGAAGGTCCGCCTGGTAGCGAAGGCGCGCGTGGCGCGCCAGGCCCTGCCGGTCCACCGGGTGACCCGGGCCTGATGGGTGAGCGTGGTGAGGATGGTCCGGCG(SEQ ID NO:24)

4.C6d

GGAATACAAGGGCCACCGGGGCCGAAGGGCGACCCGGGTGCTTTCGGCCTGAAAG

GCGAGAAAGGCGAGCCGGGCGCGGATGGTGAGGCGGGTCGTCCGGGTAGCAGCG

GTCCGTCTGGCGACGAGGGCCAGCCGGGTGAACCGGGTCCGCCAGGAGAAAAGG

GTGAGGCCGGCGACGAAGGTAATCCGGGTCCGGATGGCGCGCCTGGCGAGCGCGG

CGGCCCGGGCGAACGCGGACCGCGTGGCACGCCGGGCACCCGTGGTCCGCGTGGT

GATCCGGGCGAAGCAGGTCCCCAAGGCGATCAGGGTCGCGAAGGCCCGGTTGGTG

TGCCGGGCGACCCGGGCGAGGCGGGTCCGATTGGTCCGAAAGGCTACCGCGGCGA

CGAAGGTCCACCGGGTATCCAGGGTCCACCGGGTCCCAAGGGCGACCCGGGTGCG

TTTGGTTTGAAAGGCGAGAAGGGTGAACCGGGTGCAGATGGTGAGGCGGGCAGAC

CTGGCAGCTCGGGCCCGTCCGGTGACGAAGGCCAACCGGGCGAACCGGGTCCGCC

AGGTGAGAAGGGCGAGGCCGGTGACGAGGGCAACCCGGGTCCGGATGGTGCACC

GGGGGAGCGCGGTGGTCCGGGCGAGCGTGGTCCGCGTGGCACCCCGGGCACCCGT

GGTCCGCGTGGTGACCCTGGTGAGGCGGGTCCGCAGGGCGATCAAGGCCGTGAAG

GTCCGGTGGGTGTTCCGGGCGACCCGGGCGAAGCTGGCCCGATCGGTCCGAAAGGTTATCGTGGTGATGAAGGTCCGCCA(SEQ ID NO:25)

5.C6e

GGGCCCCCAGGATTGCGCGGCGATCCGGGTTTCGAGGGCGAGCGCGGCAAGCCGG

GTCTGCCGGGTGAAAAAGGTGAGGCTGGTGACCCGGGCCGTCCGGGCGACCTGGG

TCCGCCTGGCCTGCGTGGTGATCCGGGTTTTGAAGGTGAACGTGGCAAGCCGGGTC

TGCCGGGTGAAAAAGGTGAGGCCGGTGATCCGGGCCGTCCGGGCGACCTTGGCCC

ACCGGGCTTGCGTGGCGACCCGGGCTTCGAGGGCGAGCGCGGTAAGCCGGGCCTG

CCCGGTGAAAAGGGTGAGGCAGGCGATCCGGGACGCCCAGGCGACCTGGGTCCGC

CAGGCCTGCGTGGTGACCCGGGTTTCGAAGGCGAACGCGGTAAACCGGGTTTGCC

GGGTGAGAAAGGTGAGGCGGGTGATCCGGGCCGTCCGGGCGACCTGGGTCCGCCT

GGATTACGTGGTGACCCGGGCTTTGAAGGTGAGCGCGGAAAACCGGGTCTGCCAG

GCGAGAAGGGCGAAGCGGGTGACCCGGGTCGTCCGGGCGACCTGGGCCCACCGG

GTTTGAGAGGTGATCCGGGCTTTGAAGGTGAGCGTGGTAAGCCGGGTCTGCCGGG

TGAGAAAGGCGAAGCGGGCGATCCGGGTCGTCCGGGCGATCTC(SEQ ID NO:26)

6.C6f

GGGATGAAAGGAGAGAAGGGCTCCCGTGGTGAAAAGGGCAGCCGTGGTCCGAAA

GGTTATAAAGGTGAGAAGGGCAAACGCGGCATTGATGGTGTTGATGGCGTGAAGG

GCGAGATGGGTATGAAAGGTGAGAAGGGCAGCCGTGGTGAAAAAGGTAGCCGTG

GCCCAAAGGGTTACAAAGGTGAAAAGGGCAAACGCGGGATCGATGGTGTTGACG

GCGTGAAAGGCGAGATGGGTATGAAAGGCGAGAAGGGTTCTCGTGGTGAAAAGG

GCAGCCGCGGTCCGAAAGGCTACAAAGGCGAGAAGGGTAAACGTGGGATCGACG

GCGTTGATGGCGTCAAGGGCGAAATGGGCATGAAGGGCGAAAAGGGTAGCCGTG

GTGAAAAGGGTTCACGTGGTCCGAAAGGTTATAAAGGTGAAAAGGGCAAACGTG

GTATTGATGGTGTGGACGGCGTCAAAGGAGAGATGGGTATGAAAGGTGAAAAGG

GTAGCCGTGGTGAGAAGGGTTCTCGCGGTCCGAAAGGTTATAAAGGTGAGAAGGG

CAAACGTGGCATCGACGGTGTGGATGGCGTTAAAGGCGAGATGGGTATGAAGGGC

GAGAAGGGTTCCCGCGGTGAGAAGGGTAGCCGTGGTCCGAAAGGCTACAAGGGT

GAAAAGGGCAAGCGCGGTATTGACGGCGTTGACGGCGTGAAAGGCGAAATG(SEQ ID NO:27)

7.C6g

GGGTGTAAAGGATCGCCGGGCTTTGACGGCATCCAAGGTCCGCCTGGTCCGAAAG

GAGATCCGGGTGCGTTCGGTCTGAAAGGTGAAAAGGGCGAGCCGGGTGCCGACGG

CGAAGCAGGCCGTCCTGGCTGTAAAGGTAGCCCGGGTTTCGACGGCATTCAAGGT

CCTCCGGGTCCGAAAGGCGACCCGGGTGCGTTTGGCTTGAAGGGCGAAAAGGGCG

AACCGGGTGCTGATGGCGAGGCAGGTCGCCCAGGCTGCAAAGGTTCTCCGGGTTT

TGATGGTATTCAGGGTCCACCGGGCCCCAAGGGCGATCCGGGCGCGTTCGGTCTG

AAGGGTGAAAAGGGCGAACCGGGCGCGGACGGTGAGGCGGGTCGTCCGGGCTGT

AAAGGTAGCCCGGGTTTTGATGGTATCCAGGGTCCGCCAGGCCCGAAAGGCGACC

CGGGTGCGTTTGGTTTAAAGGGCGAGAAAGGCGAGCCGGGTGCGGACGGCGAGG

CTGGTCGTCCGGGGTGCAAAGGCAGCCCGGGCTTCGATGGCATTCAGGGTCCACC

GGGTCCGAAGGGTGACCCGGGTGCGTTCGGCCTGAAAGGTGAAAAGGGCGAGCC

GGGTGCTGATGGTGAAGCCGGTCGCCCTGGCTGCAAGGGCAGCCCGGGATTCGAC

GGCATCCAGGGCCCGCCGGGTCCGAAGGGCGATCCGGGTGCCTTCGGCCTGAAAG

GGGAGAAGGGTGAGCCGGGTGCGGACGGTGAAGCAGGTCGTCCG(SEQ ID NO:28)

8.C6h

GGGTCAAGTGGACCGTCTGGTGATGAAGGTCAGCCGGGCGAACCGGGTCCGCCGG

GTGAAAAAGGTGAGGCGGGTGACGAAGGTAACCCGGGTCCGGATGGCGCCCCTG

GCTCCAGCGGTCCGTCAGGCGACGAGGGTCAGCCGGGTGAACCGGGCCCTCCGGG

CGAGAAGGGCGAAGCTGGCGACGAGGGAAACCCGGGCCCGGACGGTGCTCCGGG

TAGCAGCGGTCCGTCTGGCGATGAGGGTCAACCGGGTGAGCCGGGCCCACCGGGT

GAAAAGGGTGAGGCGGGTGACGAGGGCAATCCGGGTCCGGATGGCGCGCCGGGC

AGCAGCGGCCCATCCGGCGACGAGGGCCAACCGGGCGAACCGGGCCCACCGGGC

GAGAAGGGTGAAGCGGGTGATGAGGGGAACCCGGGTCCGGACGGCGCACCGGGC

AGCAGCGGTCCGTCTGGTGACGAAGGCCAGCCGGGCGAGCCGGGTCCGCCGGGCG

AGAAAGGTGAGGCCGGTGACGAAGGTAATCCGGGTCCGGATGGTGCGCCAGGTTC

GAGCGGTCCGTCCGGCGATGAAGGTCAACCGGGCGAGCCTGGTCCACCGGGTGAA

AAAGGCGAGGCAGGCGACGAAGGCAACCCGGGTCCGGATGGAGCGCCA(SEQ ID NO:29)

9.C6i

GGAGAAAGGGGGGGTCCGGGCGAGCGCGGCCCACGTGGCACCCCCGGCACTCGC

GGTCCGCGCGGCGATCCGGGTGAAGCAGGCCCACAGGGTGATCAAGGTCGCGAGG

GCGAGCGTGGCGGTCCGGGTGAACGCGGGCCGCGTGGCACCCCGGGCACCCGTGG

TCCGCGTGGTGATCCGGGTGAAGCGGGTCCGCAGGGTGATCAGGGTCGTGAAGGC

GAGCGCGGAGGCCCAGGCGAGCGTGGTCCGCGTGGCACCCCGGGTACGCGTGGTC

CGCGTGGTGACCCGGGCGAGGCGGGTCCGCAAGGTGACCAGGGCCGTGAAGGTG

AACGTGGTGGCCCGGGCGAGCGCGGCCCGCGTGGCACCCCGGGCACCCGTGGTCC

GCGCGGCGACCCGGGTGAGGCCGGTCCGCAGGGCGACCAAGGTCGTGAAGGTGA

ACGCGGCGGTCCGGGTGAGCGCGGCCCACGTGGCACGCCGGGTACGCGTGGCCCG

AGAGGCGACCCTGGTGAGGCTGGCCCTCAAGGTGATCAGGGCCGTGAAGGTGAGA

GAGGTGGTCCGGGCGAGCGCGGTCCGAGAGGCACCCCGGGCACCCGTGGTCCGCGTGGTGACCCGGGTGAAGCGGGTCCGCAAGGCGACCAGGGTCGTGAA(SEQ ID NO:30)

10.C6j

GGAGATCCCGGAGAGGCAGGTCCGATTGGTCCGAAAGGCTATCGTGGTGATGAAG

GCCCGCCAGGCTCCGAGGGCGCGCGTGGTGCCCCGGGCCCGGCTGGTCCGCCGGG

CGACCCGGGCCTGATGGGCGAGCGTGGCGAGGATGGTCCGGCGGGTGACCCGGGT

GAAGCCGGTCCGATTGGCCCCAAGGGCTATCGTGGCGACGAAGGTCCGCCGGGGT

CTGAAGGTGCGCGTGGTGCTCCGGGTCCGGCTGGCCCGCCGGGCGATCCGGGCCT

GATGGGTGAGCGCGGTGAAGATGGTCCGGCAGGTGATCCGGGCGAGGCCGGTCCG

ATCGGCCCGAAGGGTTACCGCGGTGATGAAGGTCCGCCTGGCAGCGAAGGTGCGC

GTGGTGCGCCTGGTCCAGCAGGCCCGCCGGGCGACCCGGGCCTGATGGGTGAGCG

CGGCGAAGATGGTCCGGCGGGTGACCCGGGTGAGGCAGGTCCGATCGGTCCGAAA

GGTTACCGCGGTGACGAGGGTCCGCCTGGCAGCGAAGGTGCGAGAGGCGCGCCAG

GCCCGGCTGGCCCACCGGGCGACCCGGGCTTGATGGGTGAACGTGGTGAGGACGGCCCGGCG(SEQ ID NO:31)

11.C6k

AGGGACAGAAGGATTCCCGGGCTTCCCAGGTTATCCGGGCAACCGCGGTGCGCCA

GGCATTAACGGCACCAAAGGTTATCCGGGTTTGAAGGGCGACGAAGGCGAGGCGG

GTGATCCGGGAGACGACGGCACCGAAGGTTTTCCGGGCTTTCCGGGCTACCCGGG

TAATCGTGGTGCACCGGGGATCAACGGCACCAAGGGTTACCCGGGCCTGAAAGGT

GATGAGGGCGAGGCGGGTGATCCGGGCGATGATGGCACCGAGGGCTTCCCGGGGT

TCCCGGGTTATCCGGGTAACCGTGGCGCCCCCGGCATTAATGGTACGAAGGGTTAC

CCGGGCCTGAAAGGTGATGAAGGTGAAGCGGGTGACCCGGGGGACGACGGCACC

GAAGGTTTTCCGGGCTTCCCGGGTTACCCGGGAAACCGCGGTGCGCCAGGCATCA

ATGGCACCAAGGGTTACCCGGGTCTGAAAGGTGACGAGGGCGAAGCGGGTGATCC

GGGTGACGACGGTACGGAGGGTTTTCCGGGTTTCCCGGGTTACCCGGGTAATCGTG

GTGCACCAGGGATCAACGGCACCAAAGGCTATCCGGGTTTGAAGGGTGATGAAGG

CGAGGCCGGTGACCCGGGCGACGATGGTACTGAGGGTTTCCCTGGCTTTCCGGGCT

ACCCGGGAAACCGTGGTGCTCCGGGCATTAACGGTACGAAAGGCTATCCTGGCCTGAAGGGCGACGAGGGTGAAGCTGGTGACCCGGGTGATGAT(SEQ ID NO:32)

Plasmid cloning

Each of the above coding nucleotide sequences was commercially synthesized. A TEV protease cleavage site was added at the 5' end of each coding nucleotide sequence (amino acid sequence of the cleavage site: ENLYFQ; nucleotide sequence: GAAAACCTGTATTTCCAG), and the resulting sequence was inserted between the KpnI and XhoI restriction sites of the pET-28a-Trx-His expression vector to obtain the recombinant expression plasmid.
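As a quick consistency check, the cleavage-site nucleotide sequence given above can be translated with the standard genetic code to confirm that it encodes ENLYFQ. The following is a minimal sketch; the function and variable names are illustrative and are not part of the original application.

```python
# Standard genetic code, built from the conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Nucleotide sequence of the cleavage site as stated in the text.
cleavage_site = translate("GAAAACCTGTATTTCCAG")
print(cleavage_site)  # ENLYFQ
```

The same helper can be applied to any of the coding sequences listed above, provided the line-wrapped fragments are first joined into one string.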

Host cell transformation

The successfully constructed expression plasmid was transformed into E.coli BL21(DE3) competent cells. The specific process is as follows: (1) E.coli BL21(DE3) competent cells were taken out of the ultra-low temperature freezer and placed on ice; when half-thawed, 2 μl of the plasmid to be transformed was added and the mixture was gently mixed 2-3 times. (2) The mixture was kept on ice for 30 min, heat-shocked in a 42℃ water bath for 45-90 s, and then placed on ice for 2 min. (3) The cells were transferred to a biosafety cabinet, 700 μl of liquid LB medium was added, and the culture was incubated at 37℃ and 220 rpm for 60 min. (4) 200 μl of the bacterial suspension was spread evenly on an LB plate containing ampicillin sodium. (5) The plate was incubated at 37℃ for 15-17 h until colonies of uniform size had grown.

From the transformed LB plate, 5-6 single colonies were picked into a shake flask containing LB medium supplemented with antibiotic stock solution and cultured on a shaker at 220 rpm and 37℃ for 7 h. The culture was then cooled to 16℃, IPTG was added to induce expression for a period of time, and the bacterial suspension was dispensed into centrifuge bottles and centrifuged at 8000 rpm and 4℃ for 10 min. The bacterial cells were collected and weighed, and a sample (labeled: bacterial liquid) was taken for electrophoresis detection.

Polypeptide isolation and purification

The collected bacterial cells were resuspended in equilibration working solution (200 mM sodium chloride, 25 mM Tris, 20 mM imidazole, pH 8.0), the suspension was cooled to 15℃ or below and homogenized at high pressure twice, and the homogenate was collected. The homogenate was dispensed into centrifuge bottles and centrifuged at 17000 rpm and 4℃ for 30 min. The supernatant was collected, and samples of the supernatant (labeled: supernatant) and of the precipitate were taken for electrophoresis detection.

The recombinant type VI humanized collagen was purified and digested as follows:

(1) Crude purification: a. The column (Ni 6FF, Cytiva) was washed with water for 5 column volumes (CV). b. The column was equilibrated with equilibration solution (200 mM sodium chloride, 25 mM Tris, 20 mM imidazole, pH 8.0) for 5 CV. c. Loading: the supernatant was loaded onto the column, and the flow-through was sampled for electrophoresis detection (labeled: flow-through). d. Washing of impurity proteins: 25 mL of wash solution (200 mM sodium chloride, 25 mM Tris, 20 mM imidazole) was added and allowed to drain completely, and the wash flow-through was sampled for electrophoresis detection (labeled: wash). e. Collection of the target protein: 20 mL of elution solution (200 mM sodium chloride, 25 mM Tris, 250 mM imidazole, pH 8.0) was added and the eluate was collected (labeled: elution); the protein concentration was measured, the protein amount was calculated, and the eluate was subjected to electrophoresis detection. f. The column was washed with 1 M imidazole working solution (labeled: 1M wash). g. The column was washed with purified water.

(2) Enzyme digestion: TEV enzyme was added at a ratio of total protein to total TEV enzyme of 50:1, digestion was carried out at 16℃ for 4 h, and a sample was taken for electrophoresis detection (labeled: after cutting). The digested protein solution was placed in a dialysis bag and dialyzed at 4℃ for 2 h, then transferred into fresh dialysate and dialyzed overnight at 4℃ (labeled: exchange A solution).

(3) Fine purification (protein isoelectric point > 8.0): a. Column equilibration (Capto Q, Cytiva): the column was equilibrated with solution A (20 mM Tris, 20 mM sodium chloride, pH 8.0) at a flow rate of 10 ml/min. b. Loading: the sample was loaded at a flow rate of 5 ml/min, and the flow-through (labeled: QFL) was collected and subjected to electrophoresis detection. c. Gradient elution: the gradient was set to 0-15% solution B (20 mM Tris, 1 M sodium chloride, pH 8.0) for 2 min then 3 CV, 15-30% B for 2 min then 3 CV, 30-50% B for 2 min then 3 CV, and 50-100% B for 2 min then 3 CV; the eluted peaks were collected and subjected to electrophoresis detection (labeled: B wash). d. The column was cleaned. The protein was stored at 4℃.

(4) Reverse nickel column purification (Ni 6FF, Cytiva) (protein isoelectric point < 8.0): a. Column equilibration: the column was equilibrated with solution A (20 mM Tris, 20 mM sodium chloride, 20 mM imidazole, pH 8.0) for 5 CV. b. Loading: the digested, buffer-exchanged protein was loaded onto the column and allowed to flow through completely, and the flow-through was sampled for electrophoresis detection (labeled: reverse nickel). c. The column was washed with 1 M imidazole working solution (20 mM Tris, 20 mM sodium chloride, 1 M imidazole, pH 8.0) (labeled: 1M wash). d. The column was washed with purified water. The protein was stored at 4℃.
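To make the gradient elution of step (3) easier to interpret, the sodium chloride concentration at each gradient boundary can be estimated from the stated compositions of solution A (20 mM sodium chloride) and solution B (1 M sodium chloride). The sketch below assumes ideal linear mixing of the two solutions; the function name and default parameters are illustrative only.

```python
def nacl_mM(percent_b: float, a_mM: float = 20.0, b_mM: float = 1000.0) -> float:
    """NaCl concentration (mM) of a mixture containing percent_b % solution B.

    Assumes ideal linear mixing of solution A (a_mM NaCl) and solution B (b_mM NaCl).
    """
    if not 0 <= percent_b <= 100:
        raise ValueError("percent_b must be between 0 and 100")
    return (a_mM * (100 - percent_b) + b_mM * percent_b) / 100


# Gradient segments as stated in step (3): start % B -> end % B.
segments = [(0, 15), (15, 30), (30, 50), (50, 100)]
for start, end in segments:
    print(f"{start}-{end}% B: {nacl_mM(start):.0f} -> {nacl_mM(end):.0f} mM NaCl")
```

Under this assumption the four segments span roughly 20 to 167, 167 to 314, 314 to 510 and 510 to 1000 mM sodium chloride, while the Tris concentration (20 mM in both solutions) stays constant.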

Concentration detection

An appropriate amount of sample was accurately measured, diluted 10-50 fold with eluent, and mixed thoroughly with a glass rod. The absorbance at 280 nm was measured with an ultraviolet-visible spectrophotometer, and the protein concentration was calculated according to the formula C (mg/ml) = A280 × absorption coefficient × dilution factor (note: the absorption coefficient can be obtained from the amino acid sequence; the absorbance reading should lie between 0.1 and 1).
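The formula above can be sketched in a few lines. The example uses the values reported for C6a in this document (absorption coefficient 1.91, A280 0.665, dilution factor 1, eluted volume 20 ml); the function name and the range check are illustrative assumptions rather than part of the original method.

```python
def protein_concentration_mg_per_ml(a280: float,
                                    absorption_coefficient: float,
                                    dilution_factor: float) -> float:
    """C (mg/ml) = A280 x absorption coefficient x dilution factor."""
    # The text requires the absorbance reading to lie between 0.1 and 1.
    if not 0.1 <= a280 <= 1.0:
        raise ValueError("A280 should be between 0.1 and 1; adjust the dilution")
    return a280 * absorption_coefficient * dilution_factor


# Values reported for C6a in this document.
concentration = protein_concentration_mg_per_ml(0.665, 1.91, 1)
eluted_volume_ml = 20
protein_amount_mg = concentration * eluted_volume_ml
print(f"{concentration:.2f} mg/ml, {protein_amount_mg:.2f} mg")  # 1.27 mg/ml, 25.40 mg
```

The protein amount is simply the concentration multiplied by the eluted volume.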

The concentration detection results were as follows:

| Plasmid | Absorption coefficient | A280 | Dilution factor | Concentration | Eluted protein volume | Protein amount |
|---|---|---|---|---|---|---|
| C6a | 1.91 | 0.665 | 1 | 1.27mg/ml | 20ml | 25.40mg |
| C6b | 2.01 | 0.330 | 5 | 3.32mg/ml | 20ml | 66.33mg |
| C6c | 2.18 | 0.121 | 5 | 1.32mg/ml | 20ml | 26.38mg |
| C6d | 2.28 | 0.238 | 2 | 1.08mg/ml | 20ml | 21.66mg |
| C6e | 2.48 | 0.595 | 1 | 1.48mg/ml | 20ml | 29.51mg |
| C6f | 1.69 | 0.360 | 10 | 6.08mg/ml | 20ml | 121.68mg |
| C6g | 2.65 | 0.291 | 5 | 3.94mg/ml | 20ml | 78.71mg |
| C6h | 2.37 | 0.130 | 5 | 1.54mg/ml | 20ml | 30.81mg |
| C6i | 2.49 | 0.277 | 5 | 3.45mg/ml | 20ml | 68.97mg |
| C6j | 1.68 | 0.329 | 5 | 2.76mg/ml | 20ml | 55.27mg |
| C6k | 1.25 | 0.392 | 5 | 2.46mg/ml | 20ml | 49.25mg |

Protein expression level: C6f > C6g > C6i > C6b > C6j > C6k > C6h > C6e > C6c > C6a > C6d
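The ranking above follows directly from the reported protein amounts. As an illustration, sorting those amounts in descending order reproduces the stated order; the dictionary below simply restates the values from the concentration results, and the variable names are illustrative.

```python
# Protein amounts (mg) as reported in the concentration detection results.
reported_amount_mg = {
    "C6a": 25.40, "C6b": 66.33, "C6c": 26.38, "C6d": 21.66,
    "C6e": 29.51, "C6f": 121.68, "C6g": 78.71, "C6h": 30.81,
    "C6i": 68.97, "C6j": 55.27, "C6k": 49.25,
}

# Sort plasmid names by reported amount, highest first.
ranking = sorted(reported_amount_mg, key=reported_amount_mg.get, reverse=True)
print(" > ".join(ranking))
```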

Electrophoresis detection

The specific process is as follows: 40 μl of sample solution was mixed with 10 μl of 5× protein loading buffer (250 mM Tris-HCl (pH 6.8), 10% SDS, 0.5% bromophenol blue, 50% glycerol, 5% β-mercaptoethanol) and heated in boiling water at 100℃ for 10 min. Then 10 μl per well was loaded onto an SDS-PAGE protein gel and run at 80 V for 2 h. The gel was stained for 20 min with Coomassie Brilliant Blue staining solution (0.1% Coomassie Brilliant Blue R-250, 25% isopropanol, 10% glacial acetic acid) and then destained with destaining solution (10% acetic acid, 5% ethanol).

FIG. 1 shows an electrophoretogram of C6a. FIG. 2 shows an electrophoretogram of C6d. FIG. 3 shows an electrophoretogram of C6e. FIG. 4 shows an electrophoretogram of C6h. FIG. 5 shows an electrophoretogram of C6j. FIG. 6 shows an electrophoretogram of C6f. FIG. 7 shows an electrophoretogram of C6b. FIG. 8 shows an electrophoretogram of C6c. FIG. 9 shows an electrophoretogram of C6i. FIG. 10 shows an electrophoretogram of C6g. FIG. 11 shows an electrophoretogram of C6k.

The electrophoresis results show that: the crude purification yields of C6a, C6d and C6h were low (FIGS. 1, 2 and 4); the crude purification yield of C6e was low and its enzyme digestion was poor (FIG. 3); the crude purification yield of C6j was low and more impurity proteins were present (FIG. 5); after elution of C6f, some impurity bands were present, and little protein remained after enzyme digestion and purification (FIG. 6); after elution of C6b, some impurity bands were present and the enzyme digestion was incomplete (FIG. 7); the crude purification yields of C6c, C6g, C6i and C6k were high, and the purity of the target protein after reverse nickel column purification was high (FIGS. 8, 9, 10 and 11).

From the purification results, the recombinant type VI humanized collagens prepared from the four plasmids C6c, C6g, C6i and C6k performed best, and mass spectrometry was therefore performed on the C6c, C6g, C6i and C6k proteins.

Example 2: mass spectrometric detection of recombinant VI humanized collagen

Experimental method

Protein samples were reduced with DTT, alkylated with iodoacetamide, and then digested overnight with trypsin. The peptides obtained after digestion were desalted with a C18 ZipTip, mixed with the matrix α-cyano-4-hydroxycinnamic acid (CHCA), and spotted onto the target plate. Finally, analysis was performed on a matrix-assisted laser desorption ionization time-of-flight mass spectrometer, MALDI-TOF/TOF (ultrafleXtreme™, Bruker, Germany) (for peptide fingerprinting techniques, see Protein J. 2016; 35:212-7).

Database searching was performed through the MS/MS Ion Search page of the local Mascot website. The protein identification results were obtained from the primary mass spectra of the peptide fragments generated by enzymatic digestion. Search parameters: trypsin digestion, with two missed cleavage sites allowed; alkylation of cysteine was set as a fixed modification; oxidation of methionine was set as a variable modification. The database used for identification was NCBIprot.
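The search parameters above (trypsin, up to two missed cleavage sites) determine which theoretical peptides a search engine considers. The sketch below illustrates the idea with a simple in silico digestion. It assumes the commonly used rule that trypsin cleaves after K or R except when the next residue is P; the toy sequence and the function name are hypothetical and do not come from the application.

```python
def tryptic_peptides(sequence: str, max_missed: int = 2) -> list:
    """Return theoretical tryptic peptides allowing up to max_missed missed cleavages.

    Assumed rule: cleave C-terminal to K or R unless the next residue is P.
    """
    # Collect cleavage positions (indices where a new peptide starts).
    cuts = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            cuts.append(i + 1)
    cuts.append(len(sequence))

    peptides = []
    last = len(cuts) - 1
    for i in range(last):
        # Joining j - i adjacent fragments corresponds to j - i - 1 missed cleavages.
        for j in range(i + 1, min(i + 1 + max_missed, last) + 1):
            peptides.append(sequence[cuts[i]:cuts[j]])
    return peptides


# Hypothetical toy sequence, used only to illustrate the digestion rule.
toy = "GEKGSRGPKGYK"
print(tryptic_peptides(toy, max_missed=0))
print(tryptic_peptides(toy, max_missed=2))
```

With no missed cleavages the toy sequence yields four peptides; allowing two missed cleavages adds the longer peptides formed by joining two or three adjacent fragments.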

Table 1: recombinant VI humanized collagen C6c mass spectrum detection molecular weight and corresponding polypeptide

Compared with a theoretical sequence, the coverage rate of the detected polypeptide fragment is 100%, and the detection result is very reliable.

Table 2: recombinant VI humanized collagen C6g mass spectrum detection molecular weight and corresponding polypeptide

Table 3: recombinant VI humanized collagen C6i mass spectrum detection molecular weight and corresponding polypeptide

Table 4: recombinant VI humanized collagen C6k mass spectrum detection molecular weight and corresponding polypeptide

The coverage rate of the detected polypeptide fragment is 78.57% compared with the theoretical sequence, and the detection result is very reliable.
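Sequence coverage of this kind is generally understood as the percentage of residues in the theoretical sequence that fall within at least one detected peptide. The following is a minimal sketch of that calculation under this assumption; the toy protein and peptides are hypothetical and are not the data behind the values reported above.

```python
def sequence_coverage(protein: str, peptides) -> float:
    """Percentage of protein residues covered by at least one detected peptide."""
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        # Mark every occurrence of the peptide in the protein sequence.
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Hypothetical example: 9 of 12 residues are covered.
print(f"{sequence_coverage('GEKGSRGPKGYK', ['GEK', 'GPKGYK']):.2f}%")  # 75.00%
```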

Example 3: biological activity detection of recombinant VI humanized collagen

Collagen activity can be detected by the following method, according to the reference: Juming Yao, Satoshi Yanagisawa, Tetsuo Asakura, Design, Expression and Characterization of Collagen-Like Proteins Based on the Cell Adhesive and Crosslinking Sequences Derived from Native Collagens, J Biochem. 136, 643-649 (2004):

(1) The concentrations of the protein samples to be tested were determined by the ultraviolet absorption method. The samples comprised bovine type I collagen (PC, China food and drug inspection institute, number: 38002) and the recombinant type VI humanized collagens C6c, C6g, C6i and C6k provided by the invention.

Specifically, the ultraviolet absorbance of each sample was measured at 215 nm and 225 nm, and the protein concentration was calculated using the empirical formula C (μg/mL) = 144 × (A215 - A225), with measurements taken at A215 < 1.5. The principle of the method is as follows: the characteristic absorption of the peptide bond in the far ultraviolet is measured, so the result is not affected by chromophore content, there are few interfering substances, and the operation is simple; the method is therefore suitable for human collagen and its analogues, which are not readily stained by Coomassie Brilliant Blue (reference: Walker JM. The Protein Protocols Handbook, second edition. Humana Press. 43-45). After the protein concentration was determined, the concentration of all proteins to be tested was adjusted to 1 mg/mL with PBS.

(2) Collagen solutions at different concentrations, together with the positive and negative controls, were added to an ELISA plate at 100 μL per well, with 5 replicate wells per group, and incubated overnight at 4℃.

(3) The supernatant was discarded, 100 μL of 1% BSA (heat-inactivated at 56℃ for 30 min) was added, and the plate was incubated at 37℃ for 60 min. The supernatant was discarded and the wells were washed 3 times with D-PBS solution (NC).

(4) 10⁵ 3T3/NIH cells resuspended in D-PBS were added to each well and incubated at 37℃ for 120 min. Each well was washed 3 times with D-PBS solution.

(5) The absorbance at 450 nm (OD₄₅₀) was measured using a CCK8 detection kit. The cell attachment rate can be calculated from the values of the blank. The calculation formula is as follows: The cell attachment rate reflects the activity of the collagen: the higher the activity of the protein, the better the environment it can provide for cells within a short time, thereby assisting cell attachment.
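The far-ultraviolet concentration estimate used in step (1) can be sketched as follows. The absorbance readings and volumes in the example are hypothetical, the helper names are illustrative, and the dilution helper simply applies C1·V1 = C2·V2 to reach the 1 mg/mL target stated in step (1).

```python
def concentration_ug_per_ml(a215: float, a225: float) -> float:
    """Empirical estimate from step (1): C (ug/mL) = 144 x (A215 - A225)."""
    # The text specifies that measurements are taken at A215 < 1.5.
    if a215 >= 1.5:
        raise ValueError("A215 must be below 1.5; dilute the sample and re-measure")
    return 144.0 * (a215 - a225)


def pbs_to_add_ml(stock_mg_per_ml: float, stock_volume_ml: float,
                  target_mg_per_ml: float = 1.0) -> float:
    """Volume of PBS needed to dilute a stock to the target concentration."""
    if stock_mg_per_ml < target_mg_per_ml:
        raise ValueError("stock is already below the target concentration")
    return stock_volume_ml * (stock_mg_per_ml / target_mg_per_ml - 1.0)


# Hypothetical readings, for illustration only.
print(f"{concentration_ug_per_ml(0.80, 0.55):.1f} ug/mL")
print(f"{pbs_to_add_ml(3.6, 1.0):.2f} mL PBS")
```

If the measured sample was itself diluted before reading, the result would additionally be multiplied by that dilution factor.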

The results are shown in FIG. 12. Compared with the D-PBS group, the positive control significantly promoted cell adhesion, and the recombinant humanized collagens also promoted cell adhesion at the experimental concentration (the ordinate is cell adhesion activity, in %, relative to the D-PBS group). The results are statistically significant, where * P < 0.05; ** P < 0.01; *** P < 0.001.

The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to them. Any other change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the present invention shall be regarded as an equivalent replacement and is included in the protection scope of the present invention.

The present invention provides the following:

1. A polypeptide comprising one or more repeat units, said repeat units being linked directly or through a linker, said repeat units comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19 or 21, or a variant thereof, said variant being (1) an amino acid sequence in which one or more amino acid residues are mutated in said amino acid sequence or (2) an amino acid sequence having at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to said amino acid sequence;

preferably, wherein the plurality of repeating units is 2-50 repeating units, for example 2-45, 2-40, 2-35, 2-30, 2-25, 2-20, 2-15, 2-10, 2-8 or 2-6 repeating units;

Preferably, wherein the linker comprises one or more amino acid residues, e.g. 1-10, 1-9, 1-8, 1-7, 1-6, 1-5, 1-4, 1-3, or 1-2 amino acid residues;

preferably, wherein the mutation is selected from the group consisting of substitution, addition, insertion or deletion;

Preferably, wherein the substitution is a conservative amino acid substitution;

Preferably, wherein the polypeptide is recombinant collagen; preferably recombinant type VI collagen; preferably human recombinant type VI collagen;

preferably, the polypeptide has cell adhesion activity.

2. The polypeptide of item 1, comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 or 21, or a variant thereof, said variant being (1) an amino acid sequence in which one or more amino acid residues are mutated in said amino acid sequence or (2) an amino acid sequence having at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to said amino acid sequence;

preferably, the substitutions are conservative amino acid substitutions.

3. A nucleic acid encoding the polypeptide according to item 1 or 2,

Preferably, wherein the nucleic acid comprises a codon optimized nucleotide sequence,

Preferably, a nucleotide sequence wherein codon optimization is performed for E.coli expression;

preferably, wherein the nucleic acid comprises a nucleotide sequence selected from the group consisting of: SEQ ID NOS.22-32.

4. A vector comprising the nucleic acid according to item 3,

Preferably, wherein the vector comprises an expression control element, a nucleotide of a purification tag and/or a nucleotide of a leader sequence operably linked to said nucleic acid,

Preferably, wherein the expression control element is selected from a promoter, terminator or enhancer;

Preferably, wherein the purification tag is selected from His tag, GST tag, MBP tag, SUMO tag or NusA tag;

preferably, the vector is an expression vector or a cloning vector, preferably pET-28a (+).

5. A host cell comprising the nucleic acid of item 3 or the vector of item 4; wherein preferably the host cell is a eukaryotic cell or a prokaryotic cell; wherein preferably the eukaryotic cell is a yeast cell, an animal cell and/or an insect cell and/or the prokaryotic cell is an E.coli cell, e.g.E.coli BL21.

6. A composition comprising one or more of the polypeptide according to item 1 or 2, the nucleic acid according to item 3, the vector according to item 4, and the host cell according to item 5, preferably the composition is a kit; preferably, the composition is one or more of a biological dressing, a human biomimetic material, a cosmetic material for plastic, an organoid culture material, a cardiovascular stent material, a coating material, a tissue injection filling material, an ophthalmic material, a gynaecological biomaterial, a nerve repair regenerating material, a liver tissue material and a blood vessel repair regenerating material, a 3D printing artificial organ biomaterial, a cosmetic material, a pharmaceutical adjuvant and a food additive, preferably wherein the composition is an injectable composition or an oral composition.

7. Use of the polypeptide according to item 1 or 2, the nucleic acid according to item 3, the vector according to item 4, the host cell according to item 5 and/or the composition according to item 6 in one or more of a biological dressing, a human biomimetic material, a cosmetic material, an organoid culture material, a cardiovascular scaffold material, a coating material, a tissue injection filling material, an ophthalmic material, a gynaecological biomaterial, a nerve repair regenerative material, a liver tissue material and a vascular repair regenerative material, a 3D printing artificial organ biomaterial, a cosmetic material, a pharmaceutical adjuvant and a food additive.

8. A method of promoting cell adhesion comprising the step of contacting a polypeptide according to item 1 or 2, a nucleic acid according to item 3, a vector according to item 4, a host cell according to item 5 and/or a composition according to item 6 with a cell, preferably the cell is an animal cell, preferably a mammalian cell, preferably a human cell.

9. A method of cosmetic shaping, tissue injection filling, ophthalmic treatment, nerve repair or vascular repair of a subject in need thereof, comprising administering to the subject the polypeptide of item 1 or 2, preferably orally or by injection; preferably, the subject is a human.

10. A method of producing the polypeptide of item 1 or 2, comprising:

(1) Culturing the host cell according to item 5 under suitable culture conditions;

(2) Harvesting host cells and/or culture medium comprising the polypeptide; and

(3) Purifying the polypeptide, and optionally excising the tag;

preferably, wherein the host cell is an E.coli cell, preferably an E.coli BL21 (DE 3) cell;

Preferably, step (1) comprises culturing E.coli cells in LB medium and inducing expression by IPTG;

Preferably, step (2) comprises harvesting the E.coli cells, re-suspending in an equilibration working fluid, homogenizing the E.coli cells, preferably high pressure homogenizing, and separating the supernatant; preferably, the equilibration working fluid comprises 100-500mM sodium chloride, 10-50mM Tris, 10-50mM imidazole, pH7-9;

preferably, step (3) comprises crude purification, enzyme digestion, fine purification and/or reverse nickel column purification;

Preferably, wherein the crude purification comprises subjecting the supernatant to a Ni-agarose gel column purification to obtain an eluate comprising the protein of interest, wherein the eluate comprises 100-500mM sodium chloride, 10-50mM Tris and 100-500mM imidazole, preferably pH7-9;

preferably, the cleavage comprises cleavage with TEV, preferably at a total protein to total TEV ratio of 10-100:1 for 2-8h;

Preferably, the fine purification comprises gradient elution of the eluent containing the target protein or the product after enzyme digestion by a strong anion exchange chromatographic column; preferably, the gradient elution comprises 0-15% solution B for 1-5 minutes then 3 column volumes, 15-30% solution B for 1-5 minutes then 3 column volumes, 30-50% solution B for 1-5 minutes then 3 column volumes, 50-100% solution B for 1-5 minutes then 3 column volumes; wherein the solution B comprises 10-50mM Tris, 0.5-5M sodium chloride, pH7-9;

Preferably, wherein the reverse nickel column purification comprises Ni-agarose gel column purification of the digested product; preferably, the eluate comprises 10-50mM Tris,10-50mM sodium chloride, 0.5-5M imidazole, pH7-9.

Claims

1. A recombinant collagen, wherein the amino acid sequence of the recombinant collagen is SEQ ID NO. 4 or 10.

2. A nucleic acid encoding the recombinant collagen according to claim 1;

Preferably, the nucleic acid comprises a codon optimized nucleotide sequence;

preferably, the nucleic acid comprises a nucleotide sequence that is codon optimized for E.coli expression;

Preferably, the nucleic acid comprises the nucleotide sequence shown as SEQ ID NO. 28 or 23.

3. A vector comprising the nucleic acid of claim 2;

Preferably, the vector comprises an expression control element, a nucleotide of a purification tag and/or a nucleotide of a leader sequence operably linked to the nucleic acid;

Preferably, the vector is an expression vector or a cloning vector;

Preferably, the vector is pET-28a (+).

4. A host cell comprising the nucleic acid of claim 2 or the vector of claim 3; wherein the host cell is a eukaryotic cell or a prokaryotic cell; wherein the eukaryotic cell is a yeast cell, an animal cell, and/or an insect cell;

preferably, wherein the prokaryotic cell is an E.coli cell;

preferably, wherein the E.coli cell is E.coli BL21.

5. A composition comprising one or more of the recombinant collagen according to claim 1, the nucleic acid according to claim 2, the vector according to claim 3, and the host cell according to claim 4;

preferably, wherein the composition is a kit;

Preferably, the composition is one or more of biological dressing, human biomimetic material, cosmetic material, organoid culture material, cardiovascular stent material, coating material, tissue injection filling material, ophthalmic material, obstetrical and gynecological biological material, nerve repair regeneration material, liver tissue material, vascular repair regeneration material, 3D printing artificial organ biological material, cosmetic raw material, pharmaceutical adjuvant and food additive;

Preferably, the composition is an injectable composition or an oral composition.

6. Use of the recombinant collagen according to claim 1, the nucleic acid according to claim 2, the vector according to claim 3 and/or the host cell according to claim 4 in one or more of a biological dressing, a human biomimetic material, a cosmetic plastic material, an organoid culture material, a cardiovascular scaffold material, a coating material, a tissue injection filling material, an ophthalmic material, a gynaecological biomaterial, a nerve repair regeneration material, a liver tissue material and a vascular repair regeneration material, a 3D printed artificial organ biomaterial, a cosmetic raw material, a pharmaceutical adjuvant and a food additive.

7. A method of promoting cell adhesion in vitro comprising the step of contacting a recombinant collagen according to claim 1, a nucleic acid according to claim 2, a vector according to claim 3 and/or a host cell according to claim 4 with a cell in vitro;

Preferably, wherein the cell is an animal cell;

preferably, wherein the animal cell is a mammalian cell;

preferably, wherein the mammalian cell is a human cell.

8. A method of producing the recombinant collagen according to claim 1, comprising:

(1) Culturing the host cell of claim 4 under suitable culture conditions;

(2) Harvesting host cells and/or culture medium comprising recombinant collagen; and

(3) Purifying the recombinant collagen and optionally excision of the tag;

preferably, wherein the host cell is an E.coli cell;

preferably, wherein the E.coli cell is an E.coli BL21 (DE 3) cell;

preferably, wherein step (1) comprises culturing E.coli cells in LB medium and inducing expression by IPTG;

Preferably, the step (2) comprises harvesting the escherichia coli thalli, re-suspending the escherichia coli thalli in an equilibrium working solution, homogenizing the escherichia coli thalli, and separating a supernatant;

Preferably, wherein the homogenizing is high pressure homogenizing;

preferably, the balancing working solution comprises 100-500mM sodium chloride, 10-50mM Tris, 10-50mM imidazole, and pH7-9;

Preferably, wherein step (3) comprises crude purification, enzyme digestion, fine purification and/or reverse nickel column purification;

Preferably, wherein the crude purification comprises subjecting the supernatant to Ni-Sepharose column purification to obtain an eluate comprising the target protein, wherein the eluate comprises 100-500mM sodium chloride, 10-50mM Tris, and 100-500mM imidazole;

preferably wherein the pH of the eluent is 7-9;

preferably, wherein the cleavage comprises cleavage with TEV;

preferably, wherein the ratio of the total amount of protein to the total amount of TEV enzyme is 10-100:1 and the digestion is carried out for 2-8h;

Preferably, wherein the fine purification comprises gradient elution of the eluent containing the target protein or the product after enzyme digestion by a strong anion exchange chromatographic column;

Preferably, wherein the gradient elution comprises 0-15% solution B for 1-5 minutes then 3 column volumes, 15-30% solution B for 1-5 minutes then 3 column volumes, 30-50% solution B for 1-5 minutes then 3 column volumes, 50-100% solution B for 1-5 minutes then 3 column volumes; wherein the solution B comprises 10-50mM Tris, 0.5-5M sodium chloride, pH7-9;

Preferably, wherein the reverse nickel column purification comprises Ni-agarose gel column purification of the digested product;

Preferably, the eluent comprises 10-50mM Tris,10-50mM sodium chloride, 0.5-5M imidazole, pH7-9.

9. Use of the recombinant collagen according to claim 1, the nucleic acid according to claim 2, the vector according to claim 3 and/or the host cell according to claim 4 in the preparation of a medicament or composition for promoting cell adhesion.

10. Use of the recombinant collagen according to claim 1, the nucleic acid according to claim 2, the vector according to claim 3 and/or the host cell according to claim 4 in the manufacture of a medicament or composition for the treatment or prevention of a disease or disorder associated with a type VIII collagen deficiency, such as pre-ocular dysplasia.