CN113528574A

CN113528574A - Signal peptide related sequence and application thereof in protein synthesis

Info

Publication number: CN113528574A
Application number: CN202110864773.4A
Authority: CN
Inventors: 郭敏; 于雪
Original assignee: Kangma Healthcode Shanghai Biotech Co Ltd
Current assignee: Kangma Healthcode Shanghai Biotech Co Ltd
Priority date: 2018-08-07
Filing date: 2018-08-07
Publication date: 2021-10-22
Anticipated expiration: 2038-08-07
Also published as: CN110819647A; CN113528575A; CN113584060B; CN113528574B; CN113667685A; CN113584058A; CN113667685B; CN113584058B; CN113584059A; CN113528575B; CN113481226A; CN113481226B; CN113584059B; CN113584060A

Abstract

The invention provides a signal peptide related sequence and its application in protein synthesis, in particular a signal peptide that improves protein expression and a coding sequence thereof. A nucleic acid construct formed by operably linking the coding sequence of the signal peptide to the coding sequence of a foreign protein can significantly improve the efficiency of foreign protein synthesis and simplify the expression and purification of the target foreign protein. The invention also provides a corresponding vector or vector combination, genetically engineered cell and kit, so that the construct can be applied to protein synthesis.

Description

Signal peptide related sequence and application thereof in protein synthesis

Technical Field

The invention relates to the field of biotechnology, in particular to a signal peptide related sequence and application thereof in protein synthesis.

Background

Proteins are important molecules in cells and are involved in almost all cellular functions. Differences in the sequence and structure of proteins determine differences in their function (1, 2). Within the cell, proteins can catalyze biochemical reactions as enzymes, coordinate the activities of the organism as signaling molecules, support biological morphology, store energy, transport molecules, and enable the organism to move (2). In the biomedical field, antibody proteins used as targeted drugs are an important means of treating diseases such as cancer (2).

Signal peptides are short peptides at the N-terminus of proteins that carry the secretory information of proteins and are widely distributed in prokaryotes and eukaryotes (3, 4). Research on signal peptides is relevant to many scientific and industrial fields, including the production of recombinant proteins, disease diagnosis, immunization and many biological experimental techniques (4, 5). Many studies have shown that signal peptides play a very important role in recombinant protein production (6, 7). However, some of the functions of signal peptides in protein expression and transmembrane structure formation remain elusive (4, 8, 9).

Besides intracellular synthesis, proteins can also be synthesized outside the cell. In an in vitro protein synthesis system, components such as an mRNA or DNA template, RNA polymerase, amino acids and ATP are generally added to a lysate of bacteria, fungi, plant cells or animal cells to achieve rapid and efficient translation of foreign proteins (10, 11). Currently, commonly used commercial in vitro protein expression systems include the E. coli extract (ECE), rabbit reticulocyte lysate (RRL), wheat germ extract (WGE), insect cell extract (ICE) and human-derived systems (11, 12). Compared with traditional in vivo recombinant expression systems, in vitro cell-free protein synthesis systems have several advantages: they can express special proteins that are toxic to cells or that contain unnatural amino acids (such as D-amino acids), and they can use PCR products directly as templates to synthesize multiple proteins in parallel, which supports high-throughput drug screening and proteomics research (10-12).

Studies have shown that some signal peptide sequences promote protein expression, whereas DNA templates used for in vitro synthesis usually lack signal peptide related sequences (13). Therefore, in an in vitro protein synthesis system, a short polypeptide sequence of fewer than 30 amino acids is generally inserted at the N-terminus of the target protein to improve its translation efficiency, but inserting some short peptides can significantly affect the structure and function of the target protein (4, 14).

Therefore, there is an urgent need in the art to provide a signal peptide-related sequence that can be applied to an in vitro protein expression system, and can significantly improve the yield of a target protein, reduce the cost of protein expression, and improve the protein translation efficiency.

Disclosure of Invention

The invention aims to provide a signal peptide related sequence which can be applied to an in vitro protein expression system, can obviously improve the yield of target protein, reduce the cost of protein expression and improve the protein translation efficiency.

In a first aspect of the invention, there is provided a nucleic acid construct comprising a first nucleotide sequence encoding a signal peptide operably linked to a second nucleotide sequence encoding a foreign protein, the 3' end of the first nucleotide sequence being upstream of the second nucleotide sequence, and the first nucleotide sequence being selected from the group consisting of:

(a) a nucleotide sequence encoding any one of the following signal peptides: the amino acid sequence is SEQ ID NO: 14-24;

(b) SEQ ID NO: 1-13.

In another preferred embodiment, the nucleic acid construct has a structure of formula I from 5' to 3':

Z1-Z2-Z3 (I)

wherein,

Z1-Z3 are each an element used to construct the construct;

each "-" is independently a bond or a nucleotide linking sequence;

Z1 is the coding sequence of the signal peptide;

Z2 is absent or is a linker sequence;

Z3 is absent or is the coding sequence of a foreign protein;

wherein the coding sequence of the signal peptide is selected from the group consisting of:

(a) a polynucleotide encoding a polypeptide as set forth in SEQ ID No. 14-24;

(b) a polynucleotide having a sequence as set forth in any one of SEQ ID No. 1-13;

(c) a polynucleotide having a nucleotide sequence homology of 75% or more (preferably 85% or more, more preferably 90% or more, 95% or more, 98% or more, or 99% or more) with any one of the sequences shown in SEQ ID No. 1-13;

(d) a polynucleotide in which 1-60 (preferably 1-30, more preferably 1-10) nucleotides are truncated or added at the 5' end and/or 3' end of the polynucleotide shown in any one of SEQ ID No. 1-13;

(e) a polynucleotide complementary to any one of the polynucleotides of (a) - (d).
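The formula I layout can be illustrated with a short Python sketch. The function names `assemble_construct` and `keeps_reading_frame` and the placeholder sequences in the usage note are hypothetical; the placeholders are not SEQ ID No. 1-13 or SEQ ID No. 25 and serve only to show how Z1, an optional Z2 and an optional Z3 are joined in 5' to 3' order.

```python
from typing import Optional

DNA_ALPHABET = set("ACGT")


def assemble_construct(z1: str, z2: Optional[str] = None, z3: Optional[str] = None) -> str:
    """Concatenate Z1-Z2-Z3 in 5'->3' order; Z2 and Z3 may be absent."""
    if not z1:
        raise ValueError("Z1 (signal peptide coding sequence) is required")
    parts = [z1.upper()]
    for element in (z2, z3):
        if element:  # None or "" means the element is absent
            parts.append(element.upper())
    construct = "".join(parts)
    invalid = set(construct) - DNA_ALPHABET
    if invalid:
        raise ValueError(f"non-DNA characters: {sorted(invalid)}")
    return construct


def keeps_reading_frame(z1: str, z2: Optional[str] = None) -> bool:
    """True if Z1 plus Z2 is a whole number of codons, so Z3 stays in frame."""
    return (len(z1) + len(z2 or "")) % 3 == 0
```

For example, with the made-up elements `"ATGAAATTTGCT"` (Z1), `"GGTTCT"` (Z2) and `"ATGGTTTAA"` (Z3), `assemble_construct` returns the three elements joined in that order, and `keeps_reading_frame` confirms that the signal peptide and linker together span whole codons.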

In another preferred embodiment, the nucleic acid construct has a structure of formula II from 5' to 3':

Z1-Z2-Z3 (II)

wherein,

Z1-Z3 are each an element used to construct the construct;

each "-" is independently a bond or a nucleotide linking sequence;

Z1 is the coding sequence of the signal peptide;

Z2 is a linker sequence;

Z3 is absent or is the coding sequence of a foreign protein;

wherein the coding sequence of the signal peptide is selected from the group consisting of:

(a) a polynucleotide encoding a polypeptide as set forth in SEQ ID No. 14-24;

In another preferred embodiment, the nucleic acid construct has a structure of formula III from 5' to 3':

Z1-Z2-Z3 (III)

wherein,

Z1-Z3 are each an element used to construct the construct;

each "-" is independently a bond or a nucleotide linking sequence;

Z1 is the coding sequence of the signal peptide;

Z2 is a linker sequence;

Z3 is the coding sequence of the foreign protein;

wherein the coding sequence of the signal peptide is selected from the group consisting of:

(a) a polynucleotide encoding a polypeptide as set forth in SEQ ID No. 14-24;

In another preferred embodiment, the operable linkage is a direct linkage or a linkage through a linker sequence.

In another preferred embodiment, the linker sequence is SEQ ID NO: 25.

In another preferred embodiment, the amino acid sequence of the linker sequence is as shown in SEQ ID NO. 26.

In another preferred embodiment, the coding sequence of the signal peptide is a codon optimized coding sequence.

In another preferred embodiment, the coding sequence of the signal peptide is as shown in SEQ ID NO. 1-13.

In another preferred embodiment, the amino acid sequence of the signal peptide has the sequence shown in SEQ ID NO. 14-24 or an active fragment thereof, or has a homology of 85% or more (preferably 90% or more; more preferably 95% or more; most preferably 97% or more, such as 98% or more or 99% or more) with the amino acid sequence shown in SEQ ID NO. 14-24 and has the same activity as the sequence shown in SEQ ID NO. 14-24.

In another preferred embodiment, the coding sequence of the signal peptide is shown in SEQ ID NO. 11-13.

In another preferred embodiment, the coding sequence of the signal peptide is as shown in SEQ ID NO. 2-7.

In another preferred embodiment, the coding sequence of the signal peptide is shown in SEQ ID No. 1.

In another preferred embodiment, the coding sequence of the signal peptide is shown in SEQ ID NO. 8-10.

In another preferred embodiment, the amino acid sequence of the signal peptide is shown in SEQ ID NO. 22-24.

In another preferred embodiment, the amino acid sequence of the signal peptide is shown in SEQ ID NO. 15-20.

In another preferred embodiment, the amino acid sequence of the signal peptide is shown in SEQ ID No. 14.

In another preferred embodiment, the amino acid sequence of the signal peptide is shown in SEQ ID No. 21.

In another preferred embodiment, the linker sequence is a codon optimized linker sequence.

In another preferred embodiment, the linker sequence is not prone to secondary structure formation (e.g., it is AT-rich, free of hairpin structures, free of G-quadruplexes, etc.) and is not enriched in rare codons.
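Two of these criteria, AT-richness and the absence of a G-quadruplex motif, can be screened computationally. The sketch below is illustrative only: the function names and the 60% AT cutoff are assumptions that do not come from this document, the regular expression is the commonly used pattern of four runs of at least three G separated by loops of 1 to 7 nucleotides, and only the given strand is scanned. Hairpin prediction and rare-codon analysis are not covered.

```python
import re

# Four runs of >=3 G separated by loops of 1-7 nt (common G-quadruplex pattern).
G4_MOTIF = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def at_fraction(seq: str) -> float:
    """Fraction of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("A") + seq.count("T")) / len(seq)


def has_g4_motif(seq: str) -> bool:
    """True if the given strand contains a putative G-quadruplex motif."""
    return G4_MOTIF.search(seq.upper()) is not None


def linker_looks_unstructured(seq: str, min_at: float = 0.6) -> bool:
    """Heuristic screen; min_at is an assumed cutoff, not a value from the text."""
    return at_fraction(seq) >= min_at and not has_g4_motif(seq)
```

A candidate that passes this screen is only a candidate; the heuristic does not replace structure prediction or experimental testing.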

In another preferred embodiment, the linker sequence is selected from the group consisting of:

(i) a polynucleotide having a sequence as set forth in SEQ ID No. 25;

(ii) a polynucleotide having a nucleotide sequence homology of 75% or more (preferably 85% or more, more preferably 90% or more, 95% or more, 98% or more, or 99% or more) with the sequence shown in SEQ ID No. 25;

(iii) a polynucleotide in which 1 to 60 (preferably 1 to 30, more preferably 1 to 10) nucleotides are truncated or added at the 5' end and/or the 3' end of the polynucleotide shown in SEQ ID No. 25;

(iv) a polynucleotide complementary to any one of the polynucleotides of (i) to (iii).
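The homology thresholds in this list can be illustrated with a simple identity calculation. This is a minimal sketch under a strong assumption: homology is approximated as ungapped percent identity between two sequences of equal length. The document does not state how homology is computed, and sequences of different length would need a proper alignment tool. The function names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped identity (%) between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_threshold(a: str, b: str, threshold: float = 75.0) -> bool:
    """Check a candidate against a reference at the stated minimum (75% by default)."""
    return percent_identity(a, b) >= threshold
```

The stricter preferred levels (85%, 90%, 95%, 98%, 99%) can be checked by passing a different `threshold`.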

In another preferred embodiment, the foreign protein is from a prokaryote or a eukaryote.

In another preferred embodiment, the foreign protein is from an animal, a plant, or a pathogen.

In another preferred embodiment, the foreign protein is from a mammal, preferably a primate or a rodent, including a human, a mouse, or a rat.

In another preferred embodiment, the foreign protein is selected from the group consisting of: luciferin protein, luciferase (e.g., firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, variable region of an antibody, luciferase mutant, alpha-amylase, enteromycin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon alpha A, interleukin-1 beta, lysozyme, serum albumin, single chain antibody fragment (scFV), transthyretin, tyrosinase, xylanase, or a combination thereof.

In another preferred embodiment, the coding sequence for the foreign protein encodes a protein selected from the group consisting of: a luciferin protein, or a luciferase (such as firefly luciferase), a green fluorescent protein, a yellow fluorescent protein, an aminoacyl tRNA synthetase, a glyceraldehyde-3-phosphate dehydrogenase, a catalase, an actin, an antibody or variable region thereof, a luciferase mutant, or a combination thereof.

In another preferred embodiment, the nucleic acid construct further comprises a promoter upstream of the 5' end.

In another preferred embodiment, the promoter comprises a constitutive or inducible promoter.

In another preferred embodiment, the promoter is selected from the group consisting of: a T7 promoter, a T3 promoter, an SP6 promoter, or a combination thereof.

In another preferred embodiment, the nucleic acid construct further comprises an enhancer element, a ribosome binding sequence (RBS), a spacer sequence, other sequences related to RNA transcription or translation, or a combination thereof.

In another preferred embodiment, the enhancer element comprises an internal ribosome entry site element (IRES), a ribosome binding site element, a non-coding sequence, or a combination thereof.

In another preferred embodiment, the IRES element is derived from one or more cells selected from the group consisting of: prokaryotic cells and eukaryotic cells.

In another preferred embodiment, the eukaryotic cells include higher eukaryotic cells.

In another preferred embodiment, the IRES element comprises an endogenous IRES element and an exogenous IRES element.

In another preferred embodiment, the IRES element is derived from one or more cells selected from the group consisting of: human cells, Chinese hamster ovary (CHO) cells, insect cells, wheat germ cells, and rabbit reticulocytes.

In another preferred embodiment, the IRES element is selected from the group consisting of: ScGPR1, ScFLO8, ScNCE102, ScMSN1, KlFLO8, KlNCE102, KlMSN1, GAA, Omega10A, or a combination thereof.

In a second aspect of the invention, there is provided a signal peptide whose amino acid sequence is encoded by the first nucleotide sequence of the first aspect.

In another preferred embodiment, the amino acid sequence of the signal peptide is shown in any one of SEQ ID No. 14-24.

In a third aspect of the invention, there is provided a vector or combination of vectors comprising a nucleic acid construct according to the first aspect of the invention.

In a fourth aspect of the invention, there is provided a genetically engineered cell having a nucleic acid construct according to the first aspect of the invention integrated at one or more sites in its genome or comprising a vector or combination of vectors according to the third aspect of the invention.

In another preferred embodiment, the genetically engineered cell comprises a prokaryotic cell and a eukaryotic cell.

In another preferred embodiment, the genetically engineered cell is selected from the group consisting of: human cells (e.g., HeLa cells), Chinese hamster ovary cells, insect cells, wheat germ cells, rabbit reticulocytes, yeast cells, or combinations thereof.

In another preferred embodiment, the genetically engineered cell is a yeast cell.

In another preferred embodiment, the yeast cell is selected from the group consisting of: Saccharomyces cerevisiae, Kluyveromyces yeast, or a combination thereof.

In another preferred embodiment, the yeast of the genus Kluyveromyces is selected from the group consisting of: Kluyveromyces lactis, Kluyveromyces marxianus, Kluyveromyces multibuyveri, or a combination thereof.

In a fifth aspect of the invention, a kit is provided, wherein the kit comprises reagents selected from one or more of the following groups:

(a) a nucleic acid construct according to the first aspect of the invention;

(b) a vector or combination of vectors according to the third aspect of the invention; and

(c) the genetically engineered cell according to the fourth aspect of the invention.

In another preferred embodiment, the kit further comprises (d) a eukaryotic in vitro biosynthesis system (e.g., a eukaryotic in vitro protein synthesis system).

In another preferred embodiment, the eukaryotic in vitro biosynthetic system is selected from the group consisting of: a yeast in vitro biosynthesis system, a Chinese hamster ovary cell in vitro biosynthesis system, an insect cell in vitro biosynthesis system, a HeLa cell in vitro biosynthesis system, or a combination thereof.

In another preferred embodiment, the eukaryotic in vitro biosynthetic system comprises a eukaryotic in vitro protein synthetic system.

In another preferred embodiment, the eukaryotic in vitro protein synthesis system is selected from the group consisting of: a yeast in vitro protein synthesis system, a Chinese hamster ovary cell in vitro protein synthesis system, an insect cell in vitro protein synthesis system, a HeLa cell in vitro protein synthesis system, or a combination thereof.

In another preferred embodiment, the yeast in vitro biosynthesis system (e.g., yeast in vitro protein synthesis system) is a Kluyveromyces in vitro biosynthesis system (e.g., Kluyveromyces in vitro protein synthesis system), preferably a Kluyveromyces lactis in vitro biosynthesis system (e.g., Kluyveromyces lactis in vitro protein synthesis system).

In another preferred embodiment, the yeast in vitro biosynthesis system is a Kluyveromyces in vitro biosynthesis system.

In a sixth aspect, the invention provides the use of a nucleic acid construct according to the first aspect, a signal peptide according to the second aspect, a vector or combination of vectors according to the third aspect, a genetically engineered cell according to the fourth aspect or a kit according to the fifth aspect in an in vitro protein synthesis system.

In a seventh aspect of the present invention, there is provided an in vitro protein synthesis method, comprising the steps of:

(i) providing an in vitro biosynthetic system comprising a nucleic acid construct according to the first aspect of the invention;

(ii) incubating the in vitro biosynthetic system of step (i) under suitable conditions for a reaction time to synthesize the foreign protein.

In another preferred embodiment, the in vitro protein synthesis method further comprises (iii) isolating or detecting said foreign protein, optionally from said eukaryotic in vitro biosynthesis system.

In another preferred embodiment, in the step (ii), the reaction temperature is 20 to 37 ℃, preferably 22 to 35 ℃.

In another preferred embodiment, in the step (ii), the reaction time is 1 to 72 hours, preferably 2 to 24 hours.
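The stated and preferred windows for step (ii) can be encoded as a simple check, for instance when planning incubations. This is a sketch only; the class and method names are hypothetical, and the numeric limits are the ones given above (20 to 37 ℃ and 1 to 72 hours, preferably 22 to 35 ℃ and 2 to 24 hours).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReactionConditions:
    temperature_c: float  # incubation temperature in degrees Celsius
    time_h: float         # incubation time in hours

    def within_stated_range(self) -> bool:
        """20-37 degrees C and 1-72 h."""
        return 20 <= self.temperature_c <= 37 and 1 <= self.time_h <= 72

    def within_preferred_range(self) -> bool:
        """22-35 degrees C and 2-24 h."""
        return 22 <= self.temperature_c <= 35 and 2 <= self.time_h <= 24
```

For example, `ReactionConditions(30, 3)` falls inside both windows, whereas `ReactionConditions(37, 48)` falls inside the stated window only.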

In another preferred embodiment, the in vitro biosynthesis system may be selected from a yeast in vitro biosynthesis system (e.g., a yeast in vitro protein synthesis system).

In another preferred embodiment, the yeast in vitro biosynthesis system (e.g., yeast in vitro protein synthesis system) is a Kluyveromyces in vitro biosynthesis system (e.g., Kluyveromyces in vitro protein synthesis system) (preferably a Kluyveromyces lactis in vitro biosynthesis system, e.g., Kluyveromyces lactis in vitro protein synthesis system).

In another preferred embodiment, the foreign protein is selected from the group consisting of: luciferin protein, luciferase (e.g., firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, variable region of an antibody, luciferase mutant, alpha-amylase, enteromycin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon alpha A, interleukin-1 beta, lysozyme, serum albumin, single chain antibody fragment (scFV), transthyretin, tyrosinase, xylanase, or a combination thereof.

In another preferred embodiment, the coding sequence of the foreign protein encodes a foreign protein selected from the group consisting of: luciferin protein, luciferase (e.g., firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, variable region of an antibody, luciferase mutant, alpha-amylase, enteromycin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon alpha A, interleukin-1 beta, lysozyme, serum albumin, single chain antibody fragment (scFV), transthyretin, tyrosinase, xylanase, or a combination thereof.

In an eighth aspect, the invention provides an isolated polynucleotide selected from the group consisting of:

(a) a polynucleotide encoding a polypeptide as set forth in SEQ ID No. 14-24;

In another preferred embodiment, the polynucleotide is a nucleotide sequence encoding a signal peptide.

In another preferred embodiment, the polynucleotide comprises a DNA sequence.

In a ninth aspect, the invention provides a linker sequence selected from the group consisting of:

(i) a polynucleotide having a sequence as set forth in SEQ ID No. 25;

It is to be understood that, within the scope of the present invention, the technical features described above and those specifically described below (e.g., in the examples) may be combined with each other to form new or preferred embodiments. For reasons of space, these combinations are not described one by one here.

Drawings

FIG. 1 shows the basic biological process from DNA to protein.

FIG. 2 shows the relative fluorescence unit (RFU) values of enhanced green fluorescent protein (eGFP) synthesized in an in vitro protein synthesis system using the 13 signal peptide related sequences of the present invention.

Detailed Description

After extensive and intensive research, including large-scale screening and investigation, the present inventors have for the first time found a novel signal peptide that improves the protein translation efficiency of an in vitro protein synthesis system, and a nucleic acid construct comprising a signal peptide coding sequence. The construct comprises a first nucleotide sequence encoding a signal peptide (a codon-optimized or non-optimized signal peptide coding sequence) operably linked to a second nucleotide sequence encoding a foreign protein. Experiments show that when the nucleic acid construct or the signal peptide sequence is used in an in vitro protein synthesis system (such as a yeast in vitro protein synthesis system), the signal intensity of the synthesized foreign protein is significantly higher than that of the control group (p < 0.05). The invention also simplifies the expression and purification of the foreign protein.

Protein synthesis system

Protein synthesis refers to the process by which an organism synthesizes a protein according to the genetic information on messenger ribonucleic acid (mRNA) transcribed from deoxyribonucleic acid (DNA), as shown in FIG. 1. Protein biosynthesis is also known as translation, the process by which the sequence of bases in an mRNA molecule is converted into the sequence of amino acids in a protein or polypeptide chain. This is the second step in gene expression and the final stage in producing the protein gene product. Different tissue cells have different physiological functions because they express different genes to produce proteins with special functions. At least 200 components are involved in protein biosynthesis, mainly mRNA, tRNA, ribosomes, related enzymes and protein factors.
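The conversion of a base sequence into an amino acid sequence can be illustrated with the standard genetic code. The sketch below is a simplification: the function name `translate` is hypothetical, translation is assumed to start at the first AUG and stop at the first in-frame stop codon, and initiation factors, untranslated regions and codon usage are ignored. The input sequences in the usage note are made up.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    mrna = mrna.upper().replace("T", "U")  # also accept a DNA coding strand
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)
```

For example, `translate("GGAUGGCUUAA")` reads the codons AUG, GCU and UAA and yields the two-residue peptide `"MA"`.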

In an in vitro protein synthesis system, components such as an mRNA or DNA template, RNA polymerase, amino acids and ATP are generally added to a lysate of bacteria, fungi, plant cells or animal cells to achieve rapid and efficient translation of foreign proteins. Currently, commonly used commercial in vitro protein expression systems include the E. coli extract (ECE), rabbit reticulocyte lysate (RRL), wheat germ extract (WGE), insect cell extract (ICE) and human-derived systems.

Yeast combines the advantages of simple culture, efficient protein folding, and post-translational modification. Saccharomyces cerevisiae and Pichia pastoris are model organisms for expressing complex eukaryotic proteins and membrane proteins, and yeast can also be used as a raw material for preparing in vitro translation systems.

Kluyveromyces is a genus of ascosporogenous yeasts, of which Kluyveromyces marxianus and Kluyveromyces lactis are widely used in industry. Compared with other yeasts, Kluyveromyces lactis has many advantages, such as superior secretion ability, good large-scale fermentation characteristics, food-grade safety, and the ability to modify proteins post-translationally.

In the present invention, one preferred protein synthesis system is an in vitro protein synthesis system. The in vitro protein synthesis system of the present invention is not particularly limited, and one preferred in vitro protein synthesis system is a Kluyveromyces expression system (more preferably, a Kluyveromyces lactis expression system).

In the present invention, the in vitro protein synthesis system comprises: yeast cell extract and optionally a solvent, which is water or an aqueous solvent.

In a particularly preferred embodiment, the in vitro protein synthesis system provided by the present invention further comprises: 4-hydroxyethyl piperazine ethanesulfonic acid, potassium acetate, magnesium acetate, nucleoside triphosphate mixtures, amino acid mixtures, phosphocreatine, Dithiothreitol (DTT), phosphocreatine kinase, RNase inhibitors, luciferin, luciferase DNA, RNA polymerase.

In the present invention, the RNA polymerase is not particularly limited and may be selected from one or more RNA polymerases, and a typical RNA polymerase is T7 RNA polymerase.

In the present invention, the proportion of the yeast cell extract in the in vitro protein synthesis system is not particularly limited, and usually the yeast cell extract accounts for 20 to 70%, preferably 30 to 60%, more preferably 40 to 50% of the in vitro protein synthesis system.

In the present invention, the yeast cell extract does not contain intact cells, and typical yeast cell extracts include ribosomes for protein translation, transfer RNAs, aminoacyl tRNA synthetases, initiation and elongation factors required for protein synthesis, and termination and release factors. In addition, the yeast extract also contains some other proteins, especially soluble proteins, which originate in the cytoplasm of the yeast cell.

In the present invention, the yeast cell extract contains 10-100 mg/mL of protein, preferably 20-80 mg/mL. The protein content is determined by the Coomassie Brilliant Blue assay.

In the present invention, the preparation method of the yeast cell extract is not limited, and a preferred preparation method comprises the steps of:

(i) providing a yeast cell;

(ii) washing the yeast cells to obtain washed yeast cells;

(iii) subjecting the washed yeast cells to cell disruption treatment, thereby obtaining a crude yeast extract;

(iv) subjecting the crude yeast extract to solid-liquid separation to obtain a liquid fraction, which is the yeast cell extract.

In the present invention, the solid-liquid separation method is not particularly limited, but centrifugation is preferable.

In the present invention, the centrifugation conditions are not particularly limited, and the centrifugation conditions are 5000-.

In the present invention, the centrifugation time is not particularly limited, and the centrifugation time is 0.5min to 2h, preferably 20min to 50 min.

In the present invention, the temperature of the centrifugation is not particularly limited; the centrifugation is preferably performed at 1 to 10 ℃, more preferably at 2 to 6 ℃.

In the present invention, the washing treatment is not particularly limited; a preferred washing treatment uses a washing solution at pH 7 to 8 (preferably 7.4). The washing solution is not particularly limited and is typically selected from the group consisting of: potassium 4-hydroxyethylpiperazine ethanesulfonate, potassium acetate, magnesium acetate, or a combination thereof.

In the present invention, the manner of the cell disruption treatment is not particularly limited; preferred cell disruption treatments include high-pressure disruption and freeze-thaw disruption (e.g., low-temperature disruption with liquid nitrogen).

The nucleoside triphosphate mixture in the in vitro protein synthesis system consists of adenosine triphosphate, guanosine triphosphate, cytidine triphosphate and uridine triphosphate. In the present invention, the concentration of each nucleoside triphosphate is not particularly limited and is usually 0.5 to 5 mM, preferably 1.0 to 2.0 mM.

The amino acid mixture in the in vitro protein synthesis system may comprise natural or unnatural amino acids, and may comprise D-or L-amino acids. Representative amino acids include (but are not limited to) the 20 natural amino acids: glycine, alanine, valine, leucine, isoleucine, phenylalanine, proline, tryptophan, serine, tyrosine, cysteine, methionine, asparagine, glutamine, threonine, aspartic acid, glutamic acid, lysine, arginine, and histidine. The concentration of each amino acid is usually 0.01-0.5mM, preferably 0.02-0.2mM, such as 0.05, 0.06, 0.07, 0.08 mM.

In a preferred embodiment, the in vitro protein synthesis system further comprises polyethylene glycol (PEG) or an analog thereof. The concentration of polyethylene glycol or an analog thereof is not particularly limited, and usually, the concentration (w/v) of polyethylene glycol or an analog thereof is 0.1 to 8%, preferably 0.5 to 4%, more preferably 1 to 2%, based on the total weight of the protein synthesis system. Representative PEG examples include (but are not limited to): PEG3000, PEG8000, PEG6000 and PEG 3350. It is understood that the systems of the present invention may also include other polyethylene glycols of various molecular weights (e.g., PEG200, 400, 1500, 2000, 4000, 6000, 8000, 10000, etc.).

In a preferred embodiment, the in vitro protein synthesis system further comprises sucrose. The concentration of sucrose is not particularly limited, and generally, the concentration of sucrose is 0.03 to 40 wt%, preferably 0.08 to 10 wt%, more preferably 0.1 to 5 wt%, based on the total weight of the protein synthesis system.

A particularly preferred in vitro protein synthesis system comprises, in addition to yeast extract, the following components: 22 mM 4-hydroxyethyl piperazine ethanesulfonic acid (pH 7.4), 30-150 mM potassium acetate, 1.0-5.0 mM magnesium acetate, 1.5-4 mM nucleoside triphosphate mixture, 0.08-0.24 mM amino acid mixture, 25 mM creatine phosphate, 1.7 mM dithiothreitol, 0.27 mg/mL phosphocreatine kinase, 1%-4% polyethylene glycol, 0.5%-2% sucrose, 8-20 ng/µL firefly luciferase DNA and 0.027-0.054 mg/mL T7 RNA polymerase.
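Setting up such a reaction means converting final concentrations into pipetting volumes, which follows from C1 x V1 = C2 x V2. The sketch below covers only a subset of the components. The stock concentrations, the chosen points within the stated ranges (90 mM potassium acetate, 3 mM magnesium acetate), the 50% extract fraction and all names are assumptions for illustration, not values prescribed by this document.

```python
# Final concentrations in mM (fixed values from the text, or assumed points within its ranges).
FINAL_MM = {
    "HEPES pH 7.4": 22.0,
    "potassium acetate": 90.0,   # assumed point within 30-150 mM
    "magnesium acetate": 3.0,    # assumed point within 1.0-5.0 mM
    "creatine phosphate": 25.0,
    "DTT": 1.7,
}

# Hypothetical stock concentrations in mM.
STOCK_MM = {
    "HEPES pH 7.4": 1000.0,
    "potassium acetate": 3000.0,
    "magnesium acetate": 1000.0,
    "creatine phosphate": 1000.0,
    "DTT": 100.0,
}


def stock_volume_ul(final_mm: float, stock_mm: float, reaction_ul: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2."""
    if stock_mm <= final_mm:
        raise ValueError("stock must be more concentrated than the final mix")
    return final_mm * reaction_ul / stock_mm


def plan_reaction(reaction_ul: float, extract_fraction: float = 0.5) -> dict:
    """Return pipetting volumes (in microliters) for a reaction of the given size."""
    volumes = {
        name: stock_volume_ul(FINAL_MM[name], STOCK_MM[name], reaction_ul)
        for name in FINAL_MM
    }
    volumes["yeast cell extract"] = extract_fraction * reaction_ul
    used = sum(volumes.values())
    if used > reaction_ul:
        raise ValueError("components exceed the reaction volume")
    volumes["water"] = reaction_ul - used  # make up to the final volume
    return volumes
```

Calling `plan_reaction(100.0)` gives the volumes for a 100 µL reaction; the remaining components (nucleoside triphosphates, amino acids, enzymes, PEG, sucrose and template DNA) would be added to the two dictionaries in the same way, with mass-based concentrations handled in their own units.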

Coding sequence of foreign protein (foreign DNA)

As used herein, the term "coding sequence for a foreign protein" is used interchangeably with "foreign DNA" and refers to a foreign DNA molecule used to direct protein synthesis. Typically, the DNA molecule is linear or circular. The DNA molecule contains a sequence encoding a foreign protein.

In the present invention, examples of the sequence encoding the foreign protein include (but are not limited to): genomic sequences and cDNA sequences. The sequence encoding the foreign protein also comprises a promoter sequence, a 5' untranslated sequence and a 3' untranslated sequence.

In the present invention, the selection of the foreign DNA is not particularly limited, and in general, the foreign DNA encodes a protein selected from the group consisting of: luciferin protein, luciferase (e.g., firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, antibody or variable region thereof, luciferase mutant, or a combination thereof.

The foreign DNA may also encode a protein selected from the group consisting of: alpha-amylase, enteromycin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon alpha A, interleukin-1 beta, lysozyme, serum albumin, single chain antibody fragments (scFv), transthyretin, tyrosinase, xylanase, or a combination thereof.

In a preferred embodiment, the foreign DNA encodes a protein selected from the group consisting of: enhanced green fluorescent protein (eGFP), yellow fluorescent protein (YFP), Escherichia coli β-galactosidase (LacZ), human lysine-tRNA synthetase, human leucine-tRNA synthetase, Arabidopsis thaliana glyceraldehyde-3-phosphate dehydrogenase, murine catalase, or combinations thereof.

Nucleic acid constructs

In a first aspect, the present invention provides a nucleic acid construct comprising a first nucleotide sequence encoding a signal peptide operably linked to a second nucleotide sequence encoding a foreign protein, the 3' end of the first nucleotide sequence being upstream of the second nucleotide sequence, and the first nucleotide sequence being selected from the group consisting of:

(a) a polynucleotide encoding a polypeptide as set forth in SEQ ID NO: 14-24;

(b) a polynucleotide as set forth in SEQ ID NO: 1-13.

The term "operably linked" refers to a functional spatial arrangement of two or more nucleotide regions or nucleotide sequences. For example, the nucleotide sequence encoding the signal peptide is placed at a specific position relative to the nucleotide sequence encoding the foreign protein, so that the expression of the foreign protein is increased. The operable linkage is a direct linkage or a linkage through a linking sequence.

In a preferred embodiment, the nucleic acid construct of the invention has a structure of formula I from 5' to 3':

Z1-Z2-Z3 (I)

wherein,

Z1-Z3 are each an element used to construct the construct;

each "-" is independently a bond or a nucleotide linking sequence;

Z1 is the coding sequence of the signal peptide;

Z2 is absent or a linking sequence;

Z3 is absent or the coding sequence of the foreign protein.

In a preferred embodiment, the nucleic acid construct of the invention has a structure of formula II from 5' to 3':

Z1-Z2-Z3 (II)

wherein,

Z1-Z3 are each an element used to construct the construct;

each "-" is independently a bond or a nucleotide linking sequence;

Z1 is the coding sequence of the signal peptide;

Z2 is a linking sequence;

Z3 is absent or the coding sequence of the foreign protein.

In a preferred embodiment, the nucleic acid construct of the invention has a structure of formula III from 5' to 3':

Z1-Z2-Z3 (III)

wherein,

Z1-Z3 are each an element used to construct the construct;

each "-" is independently a bond or a nucleotide linking sequence;

Z1 is the coding sequence of the signal peptide;

Z2 is a linking sequence;

Z3 is the coding sequence of the foreign protein.

In a preferred embodiment, the amino acid sequence of the signal peptide of the invention is a sequence shown in SEQ ID NO: 14-24 or an active fragment thereof, or a sequence that has ≥85% homology (preferably ≥90%, more preferably ≥95%, most preferably ≥97%, such as ≥98% or ≥99%) with an amino acid sequence shown in SEQ ID NO: 14-24 and has the same activity as the sequence shown in SEQ ID NO: 14-24.
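To make the homology thresholds concrete, the following is a minimal sketch that computes percent identity between two peptides of equal length and compares it with the 85% threshold. It assumes an ungapped, position-by-position comparison; a real homology determination would normally use a sequence alignment tool. The variant peptide and the function name `percent_identity` are hypothetical.

```python
def percent_identity(a, b):
    """Percent of identical positions between two equal-length peptides."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


SEQ_ID_21 = "SGGQIFVKTLTG"   # one-letter form of SEQ ID NO: 21 from the sequence listing
variant = "SGGQIFVKTLSG"     # HYPOTHETICAL variant with one substitution (T11S)

identity = percent_identity(SEQ_ID_21, variant)   # 11 of 12 positions match
meets_85 = identity >= 85.0
print(f"identity = {identity:.1f}%, meets 85% threshold: {meets_85}")
```

For a 12-residue peptide, a single substitution already lowers identity to about 91.7%, so only the ≥85% and ≥90% tiers can be met by a one-residue variant of that length.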

In the present invention, the selection of the coding sequence of the foreign protein is not particularly limited, and generally, the coding sequence of the foreign protein encodes a protein selected from the group consisting of: luciferin protein, luciferase (e.g., firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, antibody or variable region thereof, luciferase mutant, or a combination thereof.

The coding sequence for the foreign protein may also encode a protein selected from the group consisting of: alpha-amylase, enteromycin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon alpha A, interleukin-1 beta, lysozyme, serum albumin, single chain antibody fragments (scFv), transthyretin, tyrosinase, xylanase, or a combination thereof.

In addition, the nucleic acid constructs of the invention may be linear or circular. The nucleic acid construct of the present invention may be single-stranded or double-stranded. The nucleic acid constructs of the invention may be DNA, RNA, or DNA/RNA hybrids.
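The Z1-Z2-Z3 arrangement can be sketched in a few lines of code. The example below joins the signal peptide coding sequence of SEQ ID NO: 1 (Z1) and the linking sequence of SEQ ID NO: 25 (Z2) to a coding sequence (Z3), then translates the result with the standard genetic code to check the reading frame. The Z3 fragment is a short placeholder standing in for a foreign protein coding sequence, and the helper names and the rule of prepending an ATG start codon when Z1 lacks one are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("tcag", repeat=3), AMINO)}


def clean(seq):
    """Remove the spacing used in sequence listings and lower-case the bases."""
    return seq.replace(" ", "").lower()


def translate(dna):
    dna = clean(dna)
    if len(dna) % 3:
        raise ValueError("length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def assemble_construct(z1, z2, z3, start_codon="atg"):
    """Join Z1-Z2-Z3 (5' to 3'); prepend a start codon if Z1 lacks one (assumption)."""
    z1, z2, z3 = clean(z1), clean(z2), clean(z3)
    if not z1.startswith(start_codon):
        z1 = start_codon + z1
    construct = z1 + z2 + z3
    if len(construct) % 3:
        raise ValueError("elements are not in frame")
    return construct


Z1 = "agtgagcaaa gccaattaga tgattcgact atagac"                # SEQ ID NO: 1
Z2 = "gaaaacctgt atttccaagg aggtagtgga ggaagtggtg gaagtgga"   # SEQ ID NO: 25
Z3 = "gtgagcaagggcgaggag"   # PLACEHOLDER fragment standing in for a foreign CDS

construct = assemble_construct(Z1, Z2, Z3)
protein = translate(construct)
print(protein)
```

Translating the joined sequence is a quick way to confirm that the linking sequence keeps the foreign protein in frame and that the TEV recognition motif encoded by the linker (ENLYFQG) is intact.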

Preferred signal peptide sequences of the invention and their nucleotide sequences encoding signal peptides are shown in Table 1.

In another preferred embodiment, said construct further comprises an element or a combination thereof selected from the group consisting of: promoters, terminators, poly (A) elements, transport elements, gene targeting elements, selectable marker genes, enhancers, resistance genes, transposase encoding genes.

A variety of selectable marker genes are applicable to the present invention, including but not limited to: auxotrophic markers, resistance markers and reporter gene markers. A selectable marker is used to screen for recombinant cells (recombinants), so that transformed recipient cells can be clearly distinguished from non-transformed cells. An auxotrophic marker is a marker gene that is introduced to complement a mutant gene of the recipient cell, thereby allowing the recipient cell to exhibit wild-type growth. A resistance marker is a resistance gene transferred into the recipient cell, which enables the recipient cell to show drug resistance at a certain drug concentration. In a preferred mode of the invention, a resistance marker is used for convenient screening of recombinant cells.

In the invention, the nucleic acid construct is applied to an in vitro protein synthesis system, so that the translation efficiency of the foreign protein can be obviously improved. In a preferred embodiment, the use of the nucleic acid construct of the invention in the yeast in vitro protein synthesis system of the invention significantly increases the efficiency of protein translation.

Vector

The invention also provides a vector or combination of vectors comprising the nucleic acid construct of the invention. Preferably, the vector is selected from: bacterial plasmids, bacteriophages, yeast plasmids, animal cell vectors, or shuttle vectors. Further, the vector may be a transposon vector. Methods for preparing recombinant vectors are well known to those of ordinary skill in the art. Any plasmid or vector may be used as long as it can replicate and is stable in the host.

One of ordinary skill in the art can use well-known methods to construct expression vectors containing the promoter and/or gene sequences of interest described herein. These methods include in vitro recombinant DNA techniques, DNA synthesis techniques, in vivo recombinant techniques, and the like.

Genetically engineered cell

The invention also provides a genetically engineered cell, wherein the genetically engineered cell contains the construct, the vector or the vector combination of the invention, or has the construct or the vector integrated into its chromosome. In another preferred embodiment, the genetically engineered cell further comprises a vector containing a gene encoding a transposase, or has a transposase gene integrated into its chromosome.

Preferably, the genetically engineered cell is a eukaryotic cell.

In another preferred embodiment, the eukaryotic cells include, but are not limited to: yeast cells (preferably Kluyveromyces cells, more preferably Kluyveromyces lactis cells).

The constructs or vectors of the invention may be used to transform appropriate genetically engineered cells. The genetically engineered cells may be prokaryotic cells, such as E. coli, Streptomyces or Agrobacterium; lower eukaryotic cells, such as yeast cells; or higher animal cells, such as insect cells. It is clear to one of ordinary skill in the art how to select appropriate vectors and genetically engineered cells. Transformation of genetically engineered cells with recombinant DNA may be carried out using conventional techniques well known to those skilled in the art. When the host is a prokaryote (e.g., Escherichia coli), the cells may be treated by the CaCl₂ method or transformed by electroporation. When the host is a eukaryote, the following DNA transfection methods may be used: calcium phosphate coprecipitation and conventional mechanical methods (e.g., microinjection, electroporation, liposome encapsulation). Plants may be transformed by methods such as Agrobacterium transformation or biolistic transformation, for example the leaf disc method, immature embryo transformation or the flower bud soaking method.

In addition, the genetically engineered cells of the invention can be used to produce or provide the nucleic acid constructs of the invention.

In vitro high-throughput protein synthesis method

The signal peptide and the construct containing the signal peptide coding sequence are particularly suitable for significantly improving the synthesis efficiency or the yield of foreign proteins in an in vitro biosynthesis system.

Correspondingly, the invention also provides an in vitro high-throughput protein synthesis method, which comprises the following steps:

(i) providing a nucleic acid construct according to the first aspect of the invention in the presence of an in vitro protein synthesis system;

(ii) incubating the in vitro protein synthesis system of step (i) under suitable conditions for a period of time T1, thereby synthesizing the foreign protein.

In another preferred example, the method further comprises: (iii) optionally isolating or detecting said foreign protein from said in vitro protein synthesis system.

The main advantages of the invention include:

(1) The invention finds for the first time that a nucleic acid construct combining the signal peptide related sequence with the coding sequence of a foreign protein can be applied to the in vitro protein synthesis system of the invention to improve the translation efficiency of the target protein and to facilitate its expression and purification.

(2) The signal peptide related sequence of the invention can affect the folding of mRNA after the translation initiation codon, thereby changing the translation efficiency of the target protein.

(3) Compared with other cells, Kluyveromyces lactis is safe and efficient and can therefore be used to produce proteins for the food and pharmaceutical fields. In addition, the in vitro protein synthesis system is suitable for high-throughput protein synthesis screening, can synthesize toxic proteins, and has a short reaction time and low cost. The in vitro protein synthesis system derived from Kluyveromyces lactis cells can therefore be widely applied in related fields.

(4) The signal peptide related sequence provided by the invention can improve the translation efficiency of a target foreign protein, and can increase the possibility of synthesizing different proteins of a yeast in-vitro protein synthesis system (such as a Kluyveromyces lactis in-vitro protein synthesis system).

(5) The invention firstly develops a novel signal peptide for improving the protein translation efficiency of an in vitro protein synthesis system and a nucleic acid construct comprising a signal peptide coding sequence.

The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Experimental procedures for which no specific conditions are noted in the following examples generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the manufacturer's recommendations. Unless otherwise indicated, percentages and parts are percentages and parts by weight.

Material

Unless otherwise specified, the materials and reagents used in the examples of the present invention are commercially available products.

The foreign proteins in the examples are exemplified by eGFP.

EXAMPLE 1 determination of eukaryotic Signal peptide related sequences

1.1 Source and determination of signal peptide related sequences: DNA sequences corresponding to the N-terminus of the constructed foreign protein were selected at random, and sequences or elements that significantly improve the expression of the foreign protein were identified experimentally.

Specifically, 30 nucleotide fragments of 36, 54 or other lengths were selected and synthesized. Some bases were modified by substituting synonymous codons in order to improve the success rate of plasmid construction, for example (but not limited to) by reducing the GC content of the signal peptide related sequence, and thereby the annealing temperature of the sequence, or by using preferred codons. Dozens of plasmids were constructed, and 30 plasmids were analyzed, screened and tested for their effect on the expression of the foreign protein. Compared with the control, 13 signal peptide related sequences were experimentally verified to improve protein expression; the amino acid sequences of the corresponding signal peptides and the nucleotide sequences encoding them are listed in Table 1. Sequences with no effect, or no significant effect, on protein expression are not listed.
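The effect of synonymous codon substitution on GC content can be checked directly on sequences from the sequence listing. The sketch below compares SEQ ID NO: 8 and SEQ ID NO: 10: it translates both with the standard genetic code to check that they encode the same peptide, and computes the GC fraction of each. The helper names are hypothetical, and the code only illustrates the principle; it does not reproduce the screening procedure itself.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("tcag", repeat=3), AMINO)}


def clean(seq):
    """Remove the spacing used in sequence listings and lower-case the bases."""
    return seq.replace(" ", "").lower()


def translate(dna):
    dna = clean(dna)
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def gc_fraction(dna):
    dna = clean(dna)
    return (dna.count("g") + dna.count("c")) / len(dna)


SEQ_ID_8 = "tctggtggtc aaattttcgt aaagacgctg accggt"
SEQ_ID_10 = "tctggtggtc aaattttcgt taaaactctt actggt"

same_peptide = translate(SEQ_ID_8) == translate(SEQ_ID_10)
gc_8 = gc_fraction(SEQ_ID_8)
gc_10 = gc_fraction(SEQ_ID_10)

print(translate(SEQ_ID_8), same_peptide)
print(f"GC content: SEQ ID NO: 8 = {gc_8:.1%}, SEQ ID NO: 10 = {gc_10:.1%}")
```

Both sequences encode the same 12-residue peptide, while the second uses codons with fewer G and C bases in its last six codons.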

TABLE 1 plasmids and related nucleic acid sequences

EXAMPLE 2 construction of plasmid for in vitro protein Synthesis System containing Signal peptide-related sequence

Construction of plasmid: the 30 selected signal peptide related sequences, together with the linker sequence (which contains a TEV cleavage site), were amplified using 1 pair of primers, and the plasmid backbone already containing the protein of interest (exemplified by eGFP) was amplified using its corresponding reverse primer. After amplification, the 30 signal peptide related sequence + linker sequence fragments were each inserted at the N-terminus of the target protein. In the final plasmids, the signal peptide related sequence + linker nucleic acid sequence was inserted between the ATG start codon and eGFP of the pD2P-eGFP plasmid. The names of 13 plasmids are respectively: pD2P-1.0SP- (001-.

The specific construction process is as follows:

PCR amplification was performed with the 2 pairs of primers respectively, and 10 μL of the correctly identified amplification products were mixed; 0.5 μL of DpnI was added to 10 μL of the amplification product, followed by incubation at 37 °C for 6 h; 4 μL of the DpnI-treated product was added to 50 μL of DH5α competent cells, and the cells were placed on ice for 30 min, heat-shocked at 42 °C for 45 s and placed on ice for 3 min; 200 μL of LB liquid medium was added, and the cells were cultured with shaking at 37 °C for 4 h and then spread on LB solid medium containing Amp antibiotic for overnight culture; 6 single colonies were picked for expansion culture, sequenced to confirm correctness, and the plasmids were extracted and stored.

EXAMPLE 3 use of Signal peptide-related sequences in vitro protein Synthesis systems

3.1 For all plasmids, the fragment between the T7 transcription start and termination sequences, containing the signal peptide related sequence and pD2P-eGFP, was amplified by PCR using the primers pD2P_F: CGCGAAATTAATACGACTCACTATAGG (SEQ ID No.: 27) and pD2P_R: TCCGGATATAGTTCCTCCTTTCAG (SEQ ID No.: 28).
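A few basic sanity checks on the two primers can be scripted. The sketch below checks the primer lengths against the sequence listing, computes the GC fraction, derives the reverse complement of pD2P_R, and tests whether pD2P_F contains the commonly quoted T7 promoter core sequence. That core sequence and the helper names are assumptions introduced for illustration and are not stated in this document.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


PD2P_F = "CGCGAAATTAATACGACTCACTATAGG"   # SEQ ID No.: 27
PD2P_R = "TCCGGATATAGTTCCTCCTTTCAG"      # SEQ ID No.: 28
T7_PROMOTER_CORE = "TAATACGACTCACTATAG"  # ASSUMED consensus, not given in the text

has_t7 = T7_PROMOTER_CORE in PD2P_F
reverse_site = reverse_complement(PD2P_R)  # sequence as read on the forward strand

print(f"pD2P_F: {len(PD2P_F)} nt, GC {gc_fraction(PD2P_F):.0%}, T7 core present: {has_t7}")
print(f"pD2P_R: {len(PD2P_R)} nt, GC {gc_fraction(PD2P_R):.0%}, binds at {reverse_site}")
```

Under that assumption, the forward primer carries the promoter sequence, which is consistent with the amplified linear fragment serving directly as a template for T7 RNA polymerase.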

The amplified DNA fragments, confirmed to be correct by sequencing, were purified and concentrated by ethanol precipitation: 1/10 volume of 3 M sodium acetate (pH 5.2) was added to the PCR product, followed by 2.5-3 volumes of 95% ethanol, and the mixture was incubated on ice for 15 min; it was then centrifuged at room temperature at more than 14000 g for 30 min and the supernatant was discarded; the pellet was washed with 70% ethanol and centrifuged for 15 min, the supernatant was discarded, the pellet was dissolved in ultrapure water, and the DNA concentration was determined.
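The reagent volumes for this precipitation follow directly from the stated ratios. The sketch below computes them for a given PCR product volume; it assumes that both ratios refer to the starting PCR product volume (the text does not say whether the ethanol volume is relative to the sample before or after adding sodium acetate), and the function name is hypothetical.

```python
def ethanol_precipitation_volumes(sample_ul, ethanol_fold=2.5):
    """Reagent volumes (uL) for ethanol precipitation of a PCR product.

    Assumes both ratios are relative to the starting sample volume:
    1/10 volume of 3 M sodium acetate (pH 5.2) and 2.5-3 volumes of 95% ethanol.
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    if not 2.5 <= ethanol_fold <= 3.0:
        raise ValueError("ethanol_fold should be between 2.5 and 3")
    return {
        "sodium_acetate_3M_ul": sample_ul / 10,
        "ethanol_95_ul": sample_ul * ethanol_fold,
    }


vols = ethanol_precipitation_volumes(100.0)           # 100 uL PCR product
vols_max = ethanol_precipitation_volumes(100.0, 3.0)  # upper end of the ethanol range
print(vols, vols_max)
```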

3.2 The purified DNA fragments were added to the in-house in vitro protein synthesis system according to the instructions. The reaction system was placed at 22-30 °C and incubated without shaking for about 2-6 h. Immediately after the reaction, the reaction mixture was placed in an Envision 2120 multifunctional microplate reader (Perkin Elmer) and read to detect the eGFP signal intensity, with the relative fluorescence unit (RFU) as the activity unit.

PC (positive control) is the group in which only a linker sequence was added to the N-terminus of the enhanced green fluorescent protein, and NC (negative control) is the group to which no nucleic acid construct was added. 1 μl, 2 μl and 3 μl are the amounts of DNA template added to the in vitro protein synthesis system, and the total reaction system volume of all reactions is 30 μl.

Results of the experiment

1. Construction of plasmid for in vitro protein Synthesis System

Through a plurality of attempts, 30 in vitro protein synthesis system plasmids containing signal peptide related sequences are successfully constructed.

2. Application of signal peptide related sequence in vitro protein synthesis system

As shown in FIG. 2, all 13 signal peptide related sequences significantly increased the RFU value of eGFP in the in vitro protein synthesis system: the RFU value exceeded 1500 after 3 hours of reaction and reached as high as 2900. In particular, for pD2P-1.0SP-012 (1 μl of DNA template added), the RFU value reached 2900 after 3 hours of reaction, a 2.6-fold increase over the control PC (RFU value of 800), into which no signal peptide related sequence had been inserted.

For the other 17 unlisted signal peptide sequences, the relative fluorescence unit values were unchanged or not significantly changed compared with PC, mostly 800-830, e.g., 823 for pD2P-1.0SP-019 and 816 for pD2P-1.0SP-027.
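The reported fold increase can be reproduced from the two RFU values given above. The short sketch below computes both the fold increase over the control and the simple ratio to the control; the function names are hypothetical.

```python
def fold_increase(sample_rfu, control_rfu):
    """Increase over the control, expressed in multiples of the control."""
    return (sample_rfu - control_rfu) / control_rfu


def ratio_to_control(sample_rfu, control_rfu):
    """Sample signal divided by control signal."""
    return sample_rfu / control_rfu


PC_RFU = 800       # positive control (linker only), from the text
SP012_RFU = 2900   # pD2P-1.0SP-012 after 3 h, from the text

increase = fold_increase(SP012_RFU, PC_RFU)
ratio = ratio_to_control(SP012_RFU, PC_RFU)
print(f"increase over PC: {increase:.3f}-fold; ratio to PC: {ratio:.3f}x")
```

The stated 2.6-fold figure corresponds to the increase over the control, (2900 - 800) / 800; expressed as a ratio, the signal is a little over 3.6 times that of the control.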

The result of the invention shows that the signal peptide related sequence at the N end of the target protein can obviously improve the yield of the target protein and greatly improve the expression and purification effects of the target protein. The translation efficiency of the target protein is improved, the selectivity of a protein expression and purification mode of an in vitro synthesis system is increased, and the availability of the in vitro protein synthesis system is greatly enhanced.

Further, the present inventors have found that the translation efficiency of a target protein can be further improved by combining 5'-UTR, a strong promoter (e.g., T7 promoter, T3 promoter, SP6 promoter), a different IRES element (e.g., KLNCE102) and a different signal peptide-related sequence, 3' -UTR, and the like.

COMPARATIVE EXAMPLE (PC and NC)

PC (positive control) is an experimental group in which only a linker sequence was added to the N-terminus of the enhanced green fluorescent protein; the RFU value of the foreign protein was 800 when 1 μl of the DNA template was added, and the total volume of the reaction system was 30 μl.

NC (negative control) is an experimental group without any nucleic acid construct; the RFU value of the foreign protein was 20, and the total volume of the reaction system was 30 μl.

In FIG. 2, 1 μl, 2 μl and 3 μl represent the amounts of DNA template added to the in vitro protein synthesis system; the total reaction system volume of all reactions was 30 μl, and the reaction time was 3 hours.

All documents referred to herein are incorporated by reference into this application as if each were individually incorporated by reference. Furthermore, it should be understood that various changes and modifications of the present invention can be made by those skilled in the art after reading the above teachings of the present invention, and these equivalents also fall within the scope of the present invention as defined by the appended claims.

Reference documents:

1.Fromm HJ,Hargrove M.Essentials of Biochemistry.2012.

2.Garcia RA,Riley MR.Applied biochemistry and biotechnology.Humana Press;1981.p.263–264.

3.Martoglio B.Intramembrane proteolysis and post-targeting functions of signal peptides.Biochem Soc Trans.2003；31(6):1243–7.

4.Owji H,Nezafat N,Negahdaripour M,Hajiebrahimi A,Ghasemi Y.A Comprehensive Review of Signal Peptides:Structure,Roles,and Applications.Eur J Cell Biol.2018.

5.Liu H,Wu R,Yuan L,Tian G,Huang X,Wen Y,et al.Introducing a cleavable signal peptide enhances the packaging efficiency of lentiviral vectors pseudotyped with Japanese encephalitis virus envelope proteins.Virus Res.2017；229:9–16.

6.Cui Y,Meng Y,Zhang J,Cheng B,Yin H,Gao C,et al.Efficient secretory expression of recombinant proteins in Escherichia coli with a novel actinomycete signal peptide.Protein Expr Purif.2017；129:69–74.

7.Ling HL,Rahmat Z,Murad AMA,Mahadi NM,Illias RM.Proteome-based identification of signal peptides for improved secretion of recombinant cyclomaltodextrin glucanotransferase in Escherichia coli.Process Biochem.2017；61:47–55.

8.Zhang S,Corin K.Peptide surfactants in membrane protein purification and stabilization.In:Koutsopoulos S,editor.Peptide Applications in Biomedicine,Biotechnology and Bioengineering.Woodhead Publishing;2018.p.485–512.

9.Stone TA,Deber CM.Therapeutic design of peptide modulators of protein-protein interactions in membranes.Biochim Biophys Acta-Biomembr.2017；1859(4):577–85.

10.Katzen F,Chang G,Kudlicki W.The past,present and future of cell-free protein synthesis.Trends Biotechnol.2005；23(3):150–6.

11.Gan R,Jewett MC.A combined cell-free transcription-translation system from Saccharomyces cerevisiae for rapid and robust protein synthesis.Biotechnol J.2014；9(5):641–51.

12.Lu Y.Cell-free synthetic biology:Engineering in an open world.Synth Syst Biotechnol.2017；2(1):23–7.

13.Kralicek A V.,Radjainia M,Mohamad Ali NAB,Carraher C,Newcomb RD,Mitra AK.A PCR-directed cell-free approach to optimize protein expression using diverse fusion tags.Protein Expr Purif.2011；80(1):117–24.

14.Hansted JG,Pietikäinen L,Hög F,Sperling-Petersen HU,Mortensen KK.Expressivity tag:A novel tool for increased expression in Escherichia coli.J Biotechnol.2011；155(3):275–83.

15.Kasi D,Nah HJ,Catherine C,Kim ES,Han K,Ha JC,et al.Enhanced Production of Soluble Recombinant Proteins With an In Situ-Removable Fusion Partner in a Cell-Free Synthesis System.Biotechnol J.2017；12(11):1–6.

16.Gräslund S,Nordlund P,Weigelt J,Hallberg BM,Bray J,Gileadi O,et al.Protein production and purification.Nat Methods.2008；5(2):135–46.

sequence listing

<110> Kangma (Shanghai) Biotech Co., Ltd

<120> signal peptide related sequence and application thereof in protein synthesis

<130> P2018-1332

<141> 2018-08-07

<160> 28

<170> SIPOSequenceListing 1.0

<210> 1

<211> 36

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 1

agtgagcaaa gccaattaga tgattcgact atagac 36

<210> 2

<211> 36

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 2

ctgacaactg ttctccctaa cgtagctaca ttaaac 36

<210> 3

<211> 54

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 3

atgctgacaa ctgttctccc taacgtagct acattaaaca gtatgtttgc cctg 54

<210> 4

<211> 36

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 4

aattgctccg cacattgtat caaaaaggct ttacct 36

<210> 5

<211> 54

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 5

aattgctccg cacattgtat caaaaaggct ttacctgcac agtggatccg ttgc 54

<210> 6

<211> 36

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 6

aaaacacata tagtcagctc agtaacaaca acacta 36

<210> 7

<211> 54

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 7

aaaacacata tagtcagctc agtaacaaca acactattgc taggttccat atta 54

<210> 8

<211> 36

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 8

tctggtggtc aaattttcgt aaagacgctg accggt 36

<210> 9

<211> 36

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 9

tctggtggtc aaattttcgt caaaactcta acaggt 36

<210> 10

<211> 36

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 10

tctggtggtc aaattttcgt taaaactctt actggt 36

<210> 11

<211> 27

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 11

aagcctccag tatacccatc gatttgc 27

<210> 12

<211> 36

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 12

atgattacag aaacatcatc accgttcaga tctata 36

<210> 13

<211> 54

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 13

atggtcgcta gaggtagaac agacgagata tctacagatg tttcagaggc taat 54

<210> 14

<211> 12

<212> PRT

<213> Artificial sequence (artificial sequence)

<400> 14

Ser Glu Gln Ser Gln Leu Asp Asp Ser Thr Ile Asp

1 5 10

<210> 15

<211> 12

<212> PRT

<213> Artificial sequence (artificial sequence)

<400> 15

Leu Thr Thr Val Leu Pro Asn Val Ala Thr Leu Asn

1 5 10

<210> 16

<211> 17

<212> PRT

<213> Artificial sequence (artificial sequence)

<400> 16

Leu Thr Thr Val Leu Pro Asn Val Ala Thr Leu Asn Ser Met Phe Ala

1 5 10 15

Leu

<210> 17

<211> 12

<212> PRT

<213> Artificial sequence (artificial sequence)

<400> 17

Asn Cys Ser Ala His Cys Ile Lys Lys Ala Leu Pro

1 5 10

<210> 18

<211> 18

<212> PRT

<213> Artificial sequence (artificial sequence)

<400> 18

Asn Cys Ser Ala His Cys Ile Lys Lys Ala Leu Pro Ala Gln Trp Ile

1 5 10 15

Arg Cys

<210> 19

<211> 12

<212> PRT

<213> Artificial sequence (artificial sequence)

<400> 19

Lys Thr His Ile Val Ser Ser Val Thr Thr Thr Leu

1 5 10

<210> 20

<211> 18

<212> PRT

<213> Artificial sequence (artificial sequence)

<400> 20

Lys Thr His Ile Val Ser Ser Val Thr Thr Thr Leu Leu Leu Gly Ser

1 5 10 15

Ile Leu

<210> 21

<211> 12

<212> PRT

<213> Artificial sequence (artificial sequence)

<400> 21

Ser Gly Gly Gln Ile Phe Val Lys Thr Leu Thr Gly

1 5 10

<210> 22

<211> 9

<212> PRT

<213> Artificial sequence (artificial sequence)

<400> 22

Lys Pro Pro Val Tyr Pro Ser Ile Cys

1 5

<210> 23

<211> 11

<212> PRT

<213> Artificial sequence (artificial sequence)

<400> 23

Ile Thr Glu Thr Ser Ser Pro Phe Arg Ser Ile

1 5 10

<210> 24

<211> 17

<212> PRT

<213> Artificial sequence (artificial sequence)

<400> 24

Val Ala Arg Gly Arg Thr Asp Glu Ile Ser Thr Asp Val Ser Glu Ala

1 5 10 15

Asn

<210> 25

<211> 48

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 25

gaaaacctgt atttccaagg aggtagtgga ggaagtggtg gaagtgga 48

<210> 26

<211> 16

<212> PRT

<213> Artificial sequence (artificial sequence)

<400> 26

Glu Asn Leu Tyr Phe Gln Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly

1 5 10 15

<210> 27

<211> 27

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 27

cgcgaaatta atacgactca ctatagg 27

<210> 28

<211> 24

<212> DNA

<213> Artificial sequence (artificial sequence)

<400> 28

tccggatata gttcctcctt tcag 24

Claims

1. A nucleic acid construct comprising a first nucleotide sequence encoding a signal peptide operably linked to a second nucleotide sequence encoding a foreign protein, the 3' end of the first nucleotide sequence being upstream of the second nucleotide sequence, and the first nucleotide sequence being a nucleotide sequence encoding a signal peptide whose amino acid sequence is ITETSSPFRSI.

2. The nucleic acid construct of claim 1, wherein: the operable linkage is direct linkage or linkage through a linking sequence.

3. The nucleic acid construct of claim 2, wherein: the connecting sequence is SEQ ID No.: 25.

4. A signal peptide, characterized by: the amino acid sequence of the polypeptide is encoded by the first nucleotide sequence of claim 1.

5. A vector or vector combination comprising the nucleic acid construct of any one of claims 1 to 3.

6. A genetically engineered cell having the nucleic acid construct of any one of claims 1 to 3 integrated at one or more sites in the genome of the genetically engineered cell, or having the vector or combination of vectors of claim 5 incorporated therein;

the genetically engineered cell is selected from a prokaryotic cell, a Chinese hamster ovary cell, an insect cell, a rabbit reticulocyte, a yeast cell, or a combination thereof.

7. A kit comprising reagents selected from one or more of the group consisting of:

(a) the nucleic acid construct of any one of claims 1-3;

(b) the vector or combination of vectors of claim 5; and

(c) the genetically engineered cell of claim 6.

8. The kit of claim 7, further comprising (d) an in vitro biosynthetic system.

9. The kit of claim 8, wherein: the in vitro biosynthetic system is selected from the group consisting of: a yeast in vitro biosynthesis system, a Chinese hamster ovary cell in vitro biosynthesis system, an insect cell in vitro biosynthesis system, a HeLa cell in vitro biosynthesis system, or a combination thereof.

10. The kit of claim 9, wherein: the yeast in-vitro biosynthesis system is a Kluyveromyces in-vitro biosynthesis system.

11. Use of a nucleic acid construct according to any of claims 1 to 3, a signal peptide according to claim 4, a vector or combination of vectors according to claim 5, a genetically engineered cell according to claim 6 or a kit according to any of claims 7 to 9 in an in vitro protein synthesis system.

12. An in vitro protein synthesis method, comprising the steps of:

(i) providing an in vitro biosynthetic system comprising the nucleic acid construct of any of claims 1-3;

(ii) incubating the in vitro biosynthetic system of step (i) under suitable conditions for a reaction time, thereby synthesizing the foreign protein.

13. The in vitro protein synthesis method according to claim 12, wherein the suitable conditions are a reaction temperature of 20-37 ℃ and a reaction time of 1-72 h.

14. The in vitro protein synthesis method of claim 12, further comprising: (iii) isolating or detecting the foreign protein.