CN111778169B - Method for improving in vitro protein synthesis efficiency - Google Patents

Method for improving in vitro protein synthesis efficiency

Info

Publication number
CN111778169B
CN111778169B CN202010673244.1A CN202010673244A CN111778169B CN 111778169 B CN111778169 B CN 111778169B CN 202010673244 A CN202010673244 A CN 202010673244A CN 111778169 B CN111778169 B CN 111778169B
Authority
CN
China
Prior art keywords
glu
genetically engineered
lys
ala
engineered strain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010673244.1A
Other languages
Chinese (zh)
Other versions
CN111778169A (en
Inventor
郭敏
于雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kangma Healthcode Shanghai Biotech Co Ltd
Original Assignee
Kangma Healthcode Shanghai Biotech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kangma Healthcode Shanghai Biotech Co Ltd
Priority to CN202010673244.1A
Publication of CN111778169A
Application granted
Publication of CN111778169B
Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/80 Vectors or expression systems specially adapted for eukaryotic hosts for fungi
    • C12N15/81 Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
    • C12N15/815 Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts for yeasts other than Saccharomyces
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/67 General methods for enhancing the expression
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P21/00 Preparation of peptides or proteins
    • C12P21/02 Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E50/00 Technologies for the production of fuel of non-fossil origin
    • Y02E50/10 Biofuels, e.g. bio-diesel

Abstract

The invention provides a method for improving the efficiency of in vitro protein synthesis, and in particular a genetically engineered strain for in vitro cell-free protein synthesis. A first exogenous gene expression cassette, which expresses a first nucleic acid construct of a first fusion protein, is integrated into the genome of the genetically engineered strain, and the expression or activity of the KlEXN53 gene (a nuclease gene) is reduced in the strain. A cell extract (such as a yeast cell extract) derived from the engineered strain of the invention significantly improves nucleic acid stability, requires no separate addition of T7 RNA polymerase, and significantly improves the efficiency of protein production by an in vitro protein synthesis system.

Description

Method for improving in vitro protein synthesis efficiency
This application is a divisional application of the Chinese invention patent application entitled "Method for improving protein synthesis efficiency in cells", filed on January 31, 2018 with application number 201810093624.0.
Technical Field
The invention relates to the field of biotechnology, and in particular to a method for improving in vitro protein synthesis efficiency.
Background
Proteins are important molecules in cells and are involved in performing almost all cellular functions. Differences in protein sequence and structure determine differences in function. In cells, proteins can act as enzymes that catalyze various biochemical reactions, as signaling molecules that coordinate the activities of the organism, and as components that support biological morphology, store energy, transport molecules, and enable movement. In the biomedical field, antibody proteins used as targeted drugs are an important means of treating diseases such as cancer [1-2].
In a cell, the production of a protein involves both gene transcription and mRNA translation.
Gene transcription is the process of synthesizing an RNA strand from one strand of DNA as a template, using the four NTPs (ATP, CTP, GTP and UTP) as raw materials, under the catalysis of a DNA-dependent RNA polymerase (RNP or RNAP) and according to the principle of complementary base pairing. For some RNA viruses, RNA can also direct the synthesis of RNA.
Translation of mRNA into protein is the process of assembling activated amino acids into polypeptide chains on ribosomes (ribonucleoprotein particles), with the aid of enzymes and cofactors, using mRNA as the template and tRNA as the carrier.
Regulation of protein synthesis, which includes transcriptional and translational regulation, plays an important role in many processes, such as the response to external stresses (for example, nutritional deficiency) and cellular development and differentiation.
Transcriptional regulation is the regulation of RNA synthesis from a DNA template. All cells contain a large number of sequence-specific DNA-binding proteins (trans-acting factors) that accurately recognize and bind specific DNA sequences (cis-acting elements) and act as switches at the transcriptional level. Regulation at the transcriptional level is an important step in the control of eukaryotic gene expression. Depending on whether it is affected by the environment, the regulation of eukaryotic gene expression is divided into developmental regulation and transient regulation. Developmental regulation is the control of gene expression according to a predetermined, ordered program that ensures the growth, development, and differentiation of the organism, and is an irreversible process; transient regulation is adaptive transcriptional regulation by eukaryotes in response to stimuli from the internal and external environment, and is a reversible process.
Translational regulation spans four processes: translation initiation, translation elongation, translation termination, and ribosome recycling, of which translation initiation is the most heavily regulated [3]. During the initiation phase, the small ribosomal subunit (40S) binds the initiator tRNA, (tRNA)iMet, and recognizes the 5' end of the mRNA with the help of translation initiation factors. The small subunit then moves downstream and joins the large ribosomal subunit (60S) at the initiation codon (AUG) to form an intact ribosome, which enters the translation elongation stage [4].
The biosynthesis systems currently in use are in vivo and in vitro biosynthesis systems. In vivo biosynthesis is the general term for the enzyme-catalyzed synthesis of various compounds within a living system, i.e., assimilation reactions in vivo, and includes photosynthesis, gluconeogenesis, and the biosynthesis of nucleotides, nucleic acids, and proteins. Quantitatively, protein synthesis is the most important biosynthetic process in the cell. Protein biosynthesis, also known as translation, is the process by which the base sequence of an mRNA molecule is converted into the amino acid sequence of a protein or polypeptide chain. It is divided into five stages: activation of amino acids, initiation of polypeptide chain synthesis, elongation of the peptide chain, termination and release of the peptide chain, and post-synthesis processing and modification of the protein.
An in vitro biosynthesis system is one in which specific chemical molecules or biological macromolecules (DNA, RNA, proteins) are synthesized efficiently in vitro in a lysate of bacteria, fungi, plant cells, or animal cells by adding exogenous coding nucleic acid (DNA or RNA), substrates, and energy sources. A common in vitro biosynthesis system is the in vitro protein synthesis system (cell-free protein synthesis system), which uses an exogenous mRNA or DNA template and a cell lysate to translate exogenous recombinant proteins rapidly and efficiently [5].
Cell-free systems can be traced back to Buchner, who in 1897 demonstrated ethanol production with a yeast cell-free system and thereby suggested that biosynthesis could be carried out in vitro. However, because of adenosine triphosphate (ATP) imbalance, this system was not suitable for large-scale application [6]. Welch and Scopes addressed this problem in 1985 and obtained high yields of ethanol, but their system had two major drawbacks: it required the addition of costly enzymes, and it did not tolerate temperature changes [7]. The technology still has inherent problems, such as reversibility, instability, leakage, inactivation, and recycling of enzymes, and a lack of stable enzymes, enzyme complexes, and cofactors.
A common commercial in vitro protein synthesis system is the in vitro coupled transcription-translation system (IVTT), in which RNA polymerase transcribes an mRNA intermediate from a DNA template and the foreign protein is then translated efficiently in a single step using amino acids, ATP, and other components. Common commercial in vitro protein expression systems currently include the Escherichia coli extract (ECE) system, the rabbit reticulocyte lysate (RRL) system, the wheat germ extract (WGE) system, the insect cell extract (ICE) system, and human-derived systems.
When cell-free expression systems are used for in vitro protein synthesis, the endogenous regulatory mechanisms they inherit need to be highly versatile. The regulatory mechanisms currently available are limited by the physiological state of the biomass at the time of cell harvest (rapid growth). For example, only one sigma factor is present in cell-free extracts, and transcriptional modularity remains poor.
Therefore, there is an urgent need in the art to develop a novel in vitro translation system that can regulate both transcription and translation, thereby improving stability and resistance, reducing production costs, and increasing the yield of protein synthesis.
Disclosure of Invention
The invention aims to provide a novel in vitro translation system that can be regulated at both the transcriptional and translational levels, thereby improving stability and resistance, reducing production costs, and increasing the yield of protein synthesis.
Another objective of the invention is to provide a method for the stable and efficient expression of foreign proteins, by creating a novel in vitro translation system through multiple types of genome modification and artificial engineering of cells that integrate control of endogenous gene expression, protein translation, and nucleic acid stability.
In a first aspect, the invention provides a genetically engineered strain for in vitro cell-free protein synthesis, the genetically engineered strain having integrated into its genome a first exogenous gene expression cassette expressing a first nucleic acid construct of a first fusion protein; the first fusion protein has the structure of formula Ia or formula Ib:
S-A-B-C (Ia)
S-C-B-A (Ib);
wherein,
A is a PabI element;
B is absent or a linker peptide;
C is an eIF4G element;
S is an optional signal peptide, and each "-" is a peptide bond;
and, in the genetically engineered strain, the expression or activity of KlEXN53 gene (a nuclease gene) is reduced.
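The element orders of formulas Ia and Ib can be illustrated with a short Python sketch. The function name `assemble_fusion`, its argument names, and the toy sequences are hypothetical placeholders introduced here for illustration; they are not the sequences of SEQ ID NO. 1-3. The sketch only shows how the optional elements S and B drop out when they are absent.

```python
def assemble_fusion(element_a, element_c, linker="", signal="", order="Ia"):
    """Join elements from N-terminus to C-terminus.

    order="Ia" gives S-A-B-C, order="Ib" gives S-C-B-A.
    An empty string stands for an absent linker (B) or signal peptide (S).
    All names and inputs are illustrative assumptions.
    """
    if order == "Ia":
        parts = [signal, element_a, linker, element_c]
    elif order == "Ib":
        parts = [signal, element_c, linker, element_a]
    else:
        raise ValueError("order must be 'Ia' or 'Ib'")
    # Concatenation models the peptide bonds written as "-" in the formulas.
    return "".join(parts)


# Toy one-letter sequences, purely for demonstration.
print(assemble_fusion("AAA", "CCC", linker="GGS", order="Ia"))
print(assemble_fusion("AAA", "CCC", linker="GGS", signal="M", order="Ib"))
```

Passing the linker as an empty string corresponds to the case in which B is absent.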
In another preferred embodiment, a second exogenous gene expression cassette is further integrated into the genome of the genetically engineered strain, the cassette expressing a second nucleic acid construct having, from 5' to 3', the structure of formula II:
Z1-Z2 (II)
wherein,
Z1 and Z2 are each an element used to construct the construct;
each "-" is independently a bond or a nucleotide linking sequence;
Z1 is a promoter element selected from the group consisting of: RNR2, PRP38, TEF1, PGK1, AGC1, IES1, MET8, PAK1, UBA3, VPS4, CDC2, UGA2, DUG3, VPS28, APL2, RAP1, TRM2, RPB3, PGA2, BAP2, ARA1, or a combination thereof;
Z2 is the coding sequence of an RNP protein.
In another preferred embodiment, the expression level of the EXN53 gene is less than or equal to 10%, preferably less than or equal to 5%, more preferably less than or equal to 2%.
In another preferred embodiment, the "reduction" refers to the reduction of the expression or activity of the EXN53 gene under the following conditions:
the ratio of A1/A0 is less than or equal to 30%, preferably less than or equal to 10%, more preferably less than or equal to 5%, more preferably less than or equal to 2%, and most preferably 0-2%;
wherein A1 is the expression or activity of the EXN53 gene in the strain, and A0 is the expression or activity of the wild-type EXN53 gene.
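As a worked illustration of the A1/A0 criterion, the following Python sketch reports the tightest of the stated thresholds (2%, 5%, 10%, 30%) that a measured ratio satisfies. The function name and the example measurements are hypothetical; how A1 and A0 are measured is not specified here and is left to the user.

```python
# Thresholds (in percent) taken from the text, ordered tightest first.
THRESHOLDS_PERCENT = (2.0, 5.0, 10.0, 30.0)


def tightest_threshold(a1, a0):
    """Return the smallest listed threshold (percent) that A1/A0 meets.

    Returns None when the ratio exceeds 30%, i.e. the "reduction"
    condition in the text is not met. Units of a1 and a0 must match.
    """
    if a0 <= 0:
        raise ValueError("wild-type level A0 must be positive")
    if a1 < 0:
        raise ValueError("A1 cannot be negative")
    ratio_percent = 100.0 * a1 / a0
    for limit in THRESHOLDS_PERCENT:
        if ratio_percent <= limit:
            return limit
    return None


# Hypothetical measurements in arbitrary but identical units.
print(tightest_threshold(4, 100))   # ratio of 4% meets the 5% threshold
print(tightest_threshold(45, 100))  # ratio of 45% meets no threshold
```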
In another preferred embodiment, the expression or activity of the EXN53 gene (nuclease gene) in the strain is reduced by a means selected from the group consisting of: gene mutation, gene knockout, gene disruption, RNA interference techniques, CRISPR techniques, or combinations thereof.
In another preferred embodiment, the formula Ia or Ib is a structure from N-terminus to C-terminus.
In another preferred embodiment, element A comprises wild-type and mutant PabI sequences.
In another preferred embodiment, the PabI is PabI from a cell (e.g., yeast).
In another preferred embodiment, element A has the sequence shown in SEQ ID NO. 1 or an active fragment thereof, or is a polypeptide which has > 85% homology (preferably > 90%, more preferably > 95%, most preferably > 97%, e.g., 98% or more, or 99% or more) with the amino acid sequence shown in SEQ ID NO. 1 and which has the same activity as the sequence shown in SEQ ID NO. 1.
In another preferred embodiment, the element C comprises wild-type and mutant eIF4G sequences.
In another preferred embodiment, the eIF4G is eIF4G from a cell (e.g., yeast).
In another preferred embodiment, element C has the sequence shown in SEQ ID NO. 2 or an active fragment thereof, or is a polypeptide which has > 85% homology (preferably > 90%, more preferably > 95%, most preferably > 97%, e.g., 98% or more, or 99% or more) with the amino acid sequence shown in SEQ ID NO. 2 and which has the same activity as the sequence shown in SEQ ID NO. 2.
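The homology thresholds for elements A and C can be illustrated with a minimal Python sketch. It assumes that "homology" is read as percent identity over a pairwise alignment that has already been computed; the text does not name an alignment method, so this reading, the function names, and the toy sequences are all assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns where both sequences carry a gap are ignored; a column with
    a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def exceeds(aligned_a, aligned_b, threshold_percent=85.0):
    """True if identity is strictly above the threshold (the '>' in the text)."""
    return percent_identity(aligned_a, aligned_b) > threshold_percent


# Toy ten-residue sequences differing at one position.
print(percent_identity("MKTAYIAKQR", "MKTAYIAKQA"))
print(exceeds("MKTAYIAKQR", "MKTAYIAKQA", 85.0))
```

In practice the alignment itself would come from a dedicated alignment tool, and the choice of tool and scoring scheme affects the reported percentage.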
In another preferred embodiment, the first fusion protein is a recombinant protein, preferably a recombinant protein expressed by cells, including prokaryotic and eukaryotic cells.
In another preferred embodiment, the cell is selected from the group consisting of: E. coli, bacteria, mammalian cells (e.g., HF9, HeLa, CHO, HEK293), plant cells, yeast cells, insect cells, or combinations thereof.
In another preferred embodiment, the cell is selected from the group consisting of: HeLa, CHO, HF9, Eμ-Myc, HEK293, BY-2, yeast, or combinations thereof.
In another preferred embodiment, the first fusion protein is a recombinant protein expressed by yeast.
In another preferred embodiment, said element A is derived from the PabI protein of a cell, such as a yeast.
In another preferred embodiment, the element C is derived from the eIF4G protein of a cell (e.g., yeast).
In another preferred embodiment, the first fusion protein is selected from the group consisting of:
(A) a polypeptide having the amino acid sequence shown in SEQ ID NO. 3;
(B) a polypeptide having not less than 80% homology (preferably not less than 90%, more preferably not less than 95%, most preferably not less than 97%, e.g., 98% or more, or 99% or more) with the amino acid sequence shown in SEQ ID NO. 3, and having the function or activity of improving the expression efficiency of a foreign protein;
(C) a polypeptide derived from the amino acid sequence shown in SEQ ID NO. 3 by substitution, deletion, or addition of 1-15 (preferably 2-10, more preferably 3-8) amino acid residues, and having the function or activity of improving the expression efficiency of a foreign protein.
In another preferred embodiment, the amino acid sequence of the first fusion protein is shown in SEQ ID NO.3.
In another preferred embodiment, the first fusion protein has one or more properties selected from the group consisting of:
(a) The expression efficiency of the foreign protein is improved;
(b) Improving the efficiency of in vitro translation.
In another preferred embodiment, the foreign protein is selected from the group consisting of: luciferin protein, luciferase (e.g., firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, variable region of an antibody, luciferase mutant, alpha-amylase, enteromycin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon alpha A, interleukin-1 beta, lysozyme, serum albumin, single chain antibody fragment (scFv), transthyretin, tyrosinase, xylanase, or a combination thereof.
In another preferred embodiment, Z1 and Z2 are derived from a cell (e.g., yeast).
In another preferred embodiment, the promoter element is selected from the group consisting of: ScRNR2, ScADH1, ScGAPDH, ScTEF1, ScPGK1, ScSED1, KlRNR2, KlADH1, KlGAPDH, KlTEF1, KlPGK1, KlSED1, KlIES1, KlMET8, KlPAK1, KlSOK1, KlUBA3, KlVPS4, KlAGC1, KlCDC2, KlDUG3, KlPRP38, KlUGA2, KlVPS28, ScAPL2, ScARA1, ScBAP2, ScPGA2, ScRAP1, ScRPB3, ScTRM2, or a combination thereof.
In another preferred embodiment, the strength of the promoter element is within ±20% to ±50% of that of the promoter ScRNR2, taking the promoter ScRNR2 as the reference.
In another preferred embodiment, the strength of the promoter ScRNR2 is based on the number of transcripts.
In another preferred embodiment, the promoter strength refers to the ability of a promoter to regulate the transcription expression level of a downstream coding gene, wherein if the promoter strength is increased, the transcription expression level of the downstream coding gene is increased, and if the promoter strength is decreased, the transcription expression level of the downstream coding gene is decreased.
In another preferred embodiment, the RNP protein is selected from the group consisting of: T7 RNAP, T3 RNAP, T4 RNAP, T5 RNAP, SP6 RNAP, SP3, or a combination thereof.
In another preferred embodiment, the peptide linker is 0-50 amino acids, preferably 10-40 amino acids, more preferably 15-25 amino acids in length. In SEQ ID NO.3, the peptide linker has a length of 30 amino acids, and has an amino acid sequence of Gly Gly Gly Ser Gly Gly Gly Gly Gly Ser Thr Gln Asp Glu Val Gln Gly Pro His Ala Gly Lys Ser Thr Val Gly Gly Gly Ser corresponding to positions 593-622 of SEQ ID NO.3.
In another preferred embodiment, the yeast is selected from the group consisting of: Saccharomyces cerevisiae, Kluyveromyces, or combinations thereof.
In another preferred embodiment, the yeast of the genus Kluyveromyces is selected from the group consisting of: Kluyveromyces lactis, Kluyveromyces marxianus, Kluyveromyces dobzhanskii, or a combination thereof.
In another preferred embodiment, the sequence of the second nucleic acid construct is as shown in SEQ ID NO.4.
In another preferred embodiment, the EXN53 gene is derived from one or more yeasts selected from the group consisting of: Pichia pastoris and Kluyveromyces, preferably from Kluyveromyces.
In another preferred embodiment, the Kluyveromyces includes Kluyveromyces marxianus and/or Kluyveromyces lactis.
In another preferred embodiment, the EXN53 has the nucleotide sequence shown in SEQ ID NO. 5.
In another preferred embodiment, the protein sequence of the EXN53 is shown in SEQ ID NO. 6.
In another preferred embodiment, the strain is selected from the group consisting of: a Kluyveromyces strain, a Pichia strain, a Saccharomyces cerevisiae strain, a Schizosaccharomyces strain, or a combination thereof.
In a second aspect, the invention provides a use of the genetically engineered strain of the first aspect of the invention to improve the efficiency of in vitro protein synthesis.
In a third aspect, the present invention provides a cell-free cell extract derived from the genetically engineered strain of the first aspect of the invention.
In another preferred embodiment, the cell extract is a soluble cell extract.
In another preferred embodiment, the cell extract is derived from one or more cells selected from the group consisting of: E. coli, bacteria, mammalian cells (e.g., HF9, HeLa, CHO, HEK293), plant cells, yeast cells, insect cells, or combinations thereof.
In another preferred embodiment, the cell extract is derived from one or more cells selected from the group consisting of: HeLa, CHO, HF9, Eμ-Myc, HEK293, BY-2, yeast, or combinations thereof.
In another preferred embodiment, the cell extract comprises a yeast cell extract.
In another preferred embodiment, the yeast cell is from one or more yeasts selected from the group consisting of: Pichia pastoris, Kluyveromyces, or combinations thereof; preferably, the yeast cell is Kluyveromyces, more preferably Kluyveromyces marxianus and/or Kluyveromyces lactis.
In another preferred embodiment, the yeast cell extract is an aqueous extract of yeast cells.
In another preferred embodiment, the yeast cell extract does not contain long-chain nucleic acid molecules endogenous to yeast.
In another preferred embodiment, the yeast cell extract is prepared by a method comprising the steps of:
(i) Providing a yeast cell;
(ii) Washing the yeast cells to obtain washed yeast cells;
(iii) Performing cell disruption treatment on the washed yeast cells to obtain a crude yeast extract; and
(iv) Carrying out solid-liquid separation on the crude yeast extract to obtain the liquid fraction, which is the yeast cell extract.
In another preferred embodiment, the solid-liquid separation comprises centrifugation.
In another preferred embodiment, the centrifugation is carried out in the liquid state.
In another preferred embodiment, the centrifugation conditions are 5000-100000g, preferably 8000-30000g.
In another preferred embodiment, the centrifugation time is 0.5min to 2h, preferably 20 to 50min.
In another preferred embodiment, the centrifugation is carried out at 1-10 ℃, preferably at 2-6 ℃.
In another preferred embodiment, the washing treatment is carried out using a washing solution at a pH of 7 to 8 (preferably, 7.4).
In another preferred embodiment, the washing solution is selected from the group consisting of: potassium 4-hydroxyethylpiperazine ethanesulfonate, potassium acetate, magnesium acetate, or a combination thereof.
In another preferred embodiment, the cell disruption treatment comprises high-pressure disruption or freeze-thaw disruption (e.g., at liquid nitrogen temperature).
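The centrifugation and washing ranges stated above can be collected into a small Python check, sketched below. The dictionary keys (`rcf_g`, `time_min`, `temp_c`, `wash_ph`), the class name, and the function name are hypothetical and introduced only for this illustration; the numeric ranges are the ones given in the text, with 2 h written as 120 min.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Broad ranges from the text (centrifugal force in g, time in minutes,
# temperature in degrees Celsius, pH of the washing solution).
BROAD = {
    "rcf_g": Range(5000, 100000),
    "time_min": Range(0.5, 120),
    "temp_c": Range(1, 10),
    "wash_ph": Range(7, 8),
}
# Preferred ranges from the text.
PREFERRED = {
    "rcf_g": Range(8000, 30000),
    "time_min": Range(20, 50),
    "temp_c": Range(2, 6),
    "wash_ph": Range(7.4, 7.4),
}


def check_settings(settings):
    """Label each setting as 'preferred', 'allowed', or 'out of range'."""
    report = {}
    for name, value in settings.items():
        if name not in BROAD:
            raise KeyError(f"unknown setting: {name}")
        if PREFERRED[name].contains(value):
            report[name] = "preferred"
        elif BROAD[name].contains(value):
            report[name] = "allowed"
        else:
            report[name] = "out of range"
    return report


print(check_settings({"rcf_g": 12000, "time_min": 30, "temp_c": 4, "wash_ph": 7.4}))
```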
In a fourth aspect, the invention provides an in vitro cell-free protein synthesis system comprising a cell extract according to the third aspect of the invention.
In another preferred embodiment, the protein synthesis system consists of or consists essentially of a cell extract according to the third aspect of the invention.
In another preferred embodiment, no RNP protein is additionally added to the cell-free protein synthesis system.
In another preferred embodiment, the RNP protein is selected from the group consisting of: T7 RNAP, T3 RNAP, T4 RNAP, T5 RNAP, SP6 RNAP, SP3, or a combination thereof.
In another preferred embodiment, the cell-free protein synthesis system further comprises one or more components selected from the group consisting of:
(b) Polyethylene glycol;
(c) Optionally exogenous sucrose; and
(d) Optionally a solvent, which is water or an aqueous solvent.
In another preferred embodiment, the cell-free protein synthesis system further comprises one or more components selected from the group consisting of:
(e1) A substrate for RNA synthesis;
(e2) A substrate for synthesizing a protein;
(e3) Magnesium ions;
(e4) Potassium ions;
(e5) A buffering agent;
(e6) An RNA polymerase;
(e7) An energy regeneration system.
In another preferred embodiment, the cell-free protein synthesis system further comprises one or more components selected from the group consisting of:
(e8) Heme;
(e9) Spermidine.
In another preferred embodiment, the substrate for the synthesis of RNA comprises: nucleoside monophosphates, nucleoside triphosphates, or combinations thereof.
In another preferred embodiment, the substrate for synthesizing a protein comprises: 1-20 kinds of natural amino acids, and non-natural amino acids.
In another preferred embodiment, the magnesium ion is derived from a magnesium ion source selected from the group consisting of: magnesium acetate, magnesium glutamate, or a combination thereof.
In another preferred embodiment, the potassium ion is derived from a potassium ion source selected from the group consisting of: potassium acetate, potassium glutamate, or a combination thereof.
In another preferred embodiment, the energy regeneration system is selected from the group consisting of: a phosphocreatine/phosphocreatine kinase system, a glycolytic pathway and its intermediate energy system, or a combination thereof.
In another preferred embodiment, the cell-free protein synthesis system further comprises (f1) an artificially synthesized tRNA.
In another preferred embodiment, the buffer is selected from the group consisting of: 4-hydroxyethylpiperazine ethanesulfonic acid, tris, or a combination thereof.
In another preferred embodiment, the cell-free protein synthesis system further comprises (g1) an exogenous DNA molecule for directing protein synthesis.
In another preferred embodiment, the DNA molecule is linear.
In another preferred embodiment, the DNA molecule is circular.
In another preferred embodiment, the DNA molecule comprises a sequence encoding a foreign protein.
In another preferred embodiment, the sequence encoding the foreign protein includes a genomic sequence and a cDNA sequence.
In another preferred embodiment, the sequence encoding the foreign protein further comprises a promoter sequence, a 5 'untranslated sequence, and a 3' untranslated sequence.
In another preferred embodiment, the cell-free protein synthesis system comprises a component selected from the group consisting of: 4-hydroxyethylpiperazine ethanesulfonic acid, potassium acetate, magnesium acetate, nucleoside triphosphates, amino acids, phosphocreatine, dithiothreitol (DTT), phosphocreatine kinase, RNA polymerase, or a combination thereof.
In another preferred embodiment, the polyethylene glycol is selected from the group consisting of: PEG3000, PEG8000, PEG6000, PEG3350, or combinations thereof.
In another preferred embodiment, the polyethylene glycol comprises polyethylene glycol having a molecular weight (Da) of 200 to 10000, preferably 3000 to 10000.
In another preferred embodiment, the concentration (w/v, e.g., g/ml) of component (b) in the protein synthesis system is 0.1-8%, preferably 0.5-4%, more preferably 1-2%.
In another preferred embodiment, the concentration of component (c) in the protein synthesis system is 0.03 to 40wt%, preferably 0.08 to 10wt%, more preferably 0.1 to 5wt%, based on the total weight of the protein synthesis system.
In another preferred embodiment, the concentration of component (c) in the protein synthesis system is 0.2 to 4%, preferably 0.5 to 4%, more preferably 0.5 to 1%, based on the total volume of the protein synthesis system.
In another preferred embodiment, the nucleoside triphosphate is selected from the group consisting of: adenosine triphosphate, guanosine triphosphate, cytidine triphosphate, uridine triphosphate, or combinations thereof.
In another preferred embodiment, the concentration of component (e1) in the protein synthesis system is 0.1-5 mM, preferably 0.5-3 mM, more preferably 1-1.5 mM.
In another preferred embodiment, the amino acid is selected from the group consisting of: glycine, alanine, valine, leucine, isoleucine, phenylalanine, proline, tryptophan, serine, tyrosine, cysteine, methionine, asparagine, glutamine, threonine, aspartic acid, glutamic acid, lysine, arginine, histidine, or combinations thereof.
In another preferred embodiment, the amino acids include D-form amino acids and/or L-form amino acids.
In another preferred embodiment, the concentration of component (e2) in the protein synthesis system is 0.01-0.48 mM, preferably 0.04-0.24 mM, more preferably 0.04-0.2 mM, most preferably 0.08 mM.
In another preferred embodiment, the concentration of component (e3) in the protein synthesis system is 1-10 mM, preferably 1-5 mM, more preferably 2-4 mM.
In another preferred embodiment, the concentration of component (e4) in the protein synthesis system is 30-210 mM, preferably 30-150 mM, more preferably 30-60 mM.
In another preferred embodiment, the concentration of component (e6) in the protein synthesis system is 0.01-0.3 mg/mL, preferably 0.02-0.1 mg/mL, more preferably 0.027-0.054 mg/mL.
In another preferred embodiment, the concentration of 4-hydroxyethylpiperazine ethanesulfonic acid in the protein synthesis system is 5 to 50mM, preferably 10 to 50mM, preferably 15 to 30mM, more preferably 20 to 25mM.
In another preferred embodiment, the concentration of potassium acetate in the protein synthesis system is 20-210mM, preferably 30-150mM, more preferably 30-60mM.
In another preferred embodiment, the concentration of magnesium acetate in the protein synthesis system is 1-10mM, preferably 1-5mM, more preferably 2-4mM.
In another preferred embodiment, the concentration of said creatine phosphate in said protein synthesis system is 10 to 50mM, preferably 20 to 30mM, more preferably 25mM.
In another preferred embodiment, the concentration of heme in said protein synthesis system is in the range of 0.01 to 0.1mM, preferably, 0.02 to 0.08mM, more preferably, 0.03 to 0.05mM, most preferably, 0.04mM.
In another preferred embodiment, the concentration of spermidine in the protein synthesis system is 0.05 to 1mM, preferably 0.1 to 0.8mM, more preferably 0.2 to 0.5mM, still more preferably 0.3 to 0.4mM, most preferably 0.4mM.
In another preferred embodiment, the concentration of Dithiothreitol (DTT) in the protein synthesis system is 0.2-15mM, preferably 0.2-7mM, more preferably 1-2mM.
In another preferred embodiment, the concentration of phosphocreatine kinase in the protein synthesis system is 0.1-1mg/mL, preferably 0.2-0.5mg/mL, more preferably 0.27mg/mL.
In another preferred embodiment, the RNA polymerase is T7 RNA polymerase.
In another preferred embodiment, the concentration of the T7 RNA polymerase in the protein synthesis system is 0.01-0.3mg/mL, preferably 0.02-0.1mg/mL, and more preferably 0.027-0.054mg/mL.
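The concentration ranges listed above can be checked mechanically when a reaction recipe is drawn up. The following is a minimal sketch in Python, assuming only the general and "more preferably" ranges stated above for five of the components; the dictionary keys and the function names are illustrative and do not come from the invention.

```python
from typing import Dict, Tuple

Range = Tuple[float, float]

# (general range, "more preferably" range) in mM, copied from the embodiments above.
# Keys are illustrative labels chosen for this sketch.
RANGES_MM: Dict[str, Tuple[Range, Range]] = {
    "potassium_acetate": ((20.0, 210.0), (30.0, 60.0)),
    "magnesium_acetate": ((1.0, 10.0), (2.0, 4.0)),
    "creatine_phosphate": ((10.0, 50.0), (25.0, 25.0)),
    "heme": ((0.01, 0.1), (0.03, 0.05)),
    "DTT": ((0.2, 15.0), (1.0, 2.0)),
}


def classify(component: str, conc_mm: float) -> str:
    """Return which stated range a concentration falls into."""
    general, narrow = RANGES_MM[component]
    if narrow[0] <= conc_mm <= narrow[1]:
        return "more preferred"
    if general[0] <= conc_mm <= general[1]:
        return "general"
    return "out of range"


def check_recipe(recipe_mm: Dict[str, float]) -> Dict[str, str]:
    """Classify every component of a recipe given as {name: concentration in mM}."""
    return {name: classify(name, conc) for name, conc in recipe_mm.items()}
```

For example, `check_recipe({"magnesium_acetate": 3.0, "DTT": 10.0})` labels the first component as more preferred and the second as within the general range only.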
In another preferred embodiment, the cell-free in vitro synthesis system has the following properties:
In the synthesis system, the total amount of protein synthesized reaches 200 μg protein/mL of system.
In another preferred embodiment, the composition of the cell-free protein synthesis system comprises:
Figure BDA0002583106770000101
in another preferred embodiment, the composition of the cell-free protein synthesis system further comprises:
Component        General range      Preferred range
Spermidine       0.2-0.4mM          0.3-0.4mM;
Heme             0.01-0.04mM        0.03-0.04mM.
In another preferred embodiment, the PEG is selected from PEG3350, PEG3000, and/or PEG8000.
In a fifth aspect, the present invention provides a method for in vitro protein synthesis, comprising the steps of:
(i) Providing an in vitro cell-free protein synthesis system according to the fourth aspect of the invention, and adding exogenous DNA molecules for directing protein synthesis;
(ii) Incubating the protein synthesis system of step (i) under suitable conditions for a period of time T1, thereby synthesizing the protein encoded by the exogenous DNA.
In another preferred example, the method further comprises: (iii) Optionally isolating or detecting said protein encoded by the foreign DNA from said protein synthesis system.
In another preferred embodiment, the exogenous DNA is from a prokaryote or a eukaryote.
In another preferred embodiment, the exogenous DNA is from an animal, a plant, or a pathogen.
In another preferred embodiment, the exogenous DNA is from a mammal, preferably a primate or a rodent, including a human, a mouse, or a rat.
In another preferred embodiment, the coding sequence of the foreign protein encodes a foreign protein selected from the group consisting of: luciferin protein, luciferase (e.g., firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, variable region of an antibody, luciferase mutant, alpha-amylase, enteromycin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon alpha A, interleukin-1 beta, lysozyme, serum albumin, single chain antibody fragment (scFV), transthyretin, tyrosinase, xylanase, or a combination thereof.
In another preferred embodiment, the foreign protein is selected from the group consisting of: luciferin protein, luciferase (e.g., firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, variable region of an antibody, luciferase mutant, alpha-amylase, enteromycin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon alpha A, interleukin-1 beta, lysozyme, serum albumin, single chain antibody fragment (scFV), transthyretin, tyrosinase, xylanase, or a combination thereof.
In another preferred embodiment, the exogenous DNA encodes an exogenous protein selected from the group consisting of: luciferin protein, luciferase (e.g., firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, variable region of an antibody, luciferase mutant, alpha-amylase, enteromycin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon alpha A, interleukin-1 beta, lysozyme, serum albumin, single chain antibody fragment (scFV), transthyretin, tyrosinase, xylanase, or a combination thereof.
In another preferred embodiment, the exogenous DNA encodes a protein selected from the group consisting of: luciferin protein, luciferase (e.g., firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, variable region of an antibody, luciferase mutant, alpha-amylase, enteromycin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon alpha A, interleukin-1 beta, lysozyme, serum albumin, single chain antibody fragment (scFV), transthyretin, tyrosinase, xylanase, or a combination thereof.
In another preferred embodiment, in the step (ii), the reaction temperature is 20 to 37 ℃, preferably 20 to 25 ℃.
In another preferred embodiment, in the step (ii), the reaction time is 1-6h, preferably 2-4h.
It is to be understood that, within the scope of the present invention, the above-described features of the present invention and those specifically described below (e.g., in the examples) may be combined with each other to form new or preferred embodiments. For reasons of space, these combinations are not described one by one here.
Drawings
FIG. 1 shows a schematic representation of the regulation of biosynthetic responses by controlling gene transcription, translation, and nucleic acid stability.
FIG. 2 shows a schematic of the design of artificially engineered cells integrating endogenous gene expression, protein translation, and nucleic acid stability control. FIG. 2A shows the CRISPR-based genetic modification of the K. lactis genome that forms the fusion protein PabI-eIF4G, and FIG. 2B shows the CRISPR-based genetic modification of the K. lactis genome in which the nuclease KlEXN53 is knocked out and an RNA polymerase is expressed endogenously under a promoter of specific strength (pScRNR2-T7).
FIG. 3 shows the pHoCas9_SE_tRNA_ScRNR2_KlPAB1-gRNA1 plasmid map. gRNA1 of KlPAB1 is a gRNA located in the ORF of the KlPAB1 gene and carries a tRNA-Tyr promoter and an SNR52 terminator; the plasmid carries a kana selection marker.
FIG. 4 shows the KlPAB1-30Linker-KleIF4G-DD1-pMD18 plasmid map. HR1 and HR2 are gene sequences of about 1000bp upstream and downstream of the ORF of the KlPAB1 gene; the plasmid carries an Amp selection marker.
FIG. 5 shows the pHoCas9_SE_Kana_tRNA_ScRNR2_KlEXN53-gRNA1&3 plasmid map. gRNA1 and gRNA3 of KlEXN53 are two gRNAs located at the 5' end and the 3' end of the KlEXN53 gene ORF, respectively, and carry a tRNA-Tyr promoter and an SNR52 terminator; the plasmid carries a kana selection marker.
FIG. 6 shows the KlEXN53-pScRNR2-T7 RNP-DD1-pMD18 plasmid map. HR1 and HR2 are gene sequences of about 1000bp upstream and downstream of the ORF of the KlEXN53 gene; the plasmid carries an Amp selection marker.
FIG. 7 shows the in vitro translational activity assay data for the engineered strains. The signal intensity of firefly luciferase (Fluc) is used to indicate the capacity of the in vitro biosynthesis system to synthesize recombinant protein. Here, wt denotes wild-type Kluyveromyces and 3in1 denotes the Klexn53Δ-pScRNR2-T7RNP & KlPAB1-KleIF4G yeast strain, i.e., a yeast strain in which the KlEXN53 gene is knocked out and replaced with pScRNR2-T7RNP, which carries a promoter of specific strength, and in which KlPAB1 and KleIF4G form a fusion protein.
FIG. 8 shows the in vitro translational activity assay data for the engineered 3in1 strain with or without the addition of T7 RNP. The signal intensity of firefly luciferase (Fluc) is used to indicate the capacity of the in vitro biosynthesis system to synthesize recombinant protein.
FIG. 9 summarizes how in vitro synthesis is regulated through transcriptional regulation, translational regulation, and nuclease-related stabilization of nucleic acids to achieve stable, efficient, and rapid expression of foreign proteins.
Detailed Description
Through extensive and intensive screening and research, the present inventors have constructed a genetically engineered strain particularly suitable for in vitro protein synthesis. The genome of the strain has integrated a first exogenous gene expression cassette carrying a first nucleic acid construct that expresses a first fusion protein, such as a fusion protein formed by KlPAB1 and KleIF4G; at the same time, the expression or activity of a nuclease (such as KlEXN53) is reduced in the engineered strain of the invention. Experiments show that a cell-free extract based on the engineered strain unexpectedly and synergistically gives a significant improvement in the protein synthesis efficiency of an in vitro cell-free protein synthesis system. Preferably, the engineered strain of the invention may further integrate a second exogenous gene expression cassette carrying a second nucleic acid construct (e.g., promoter pScRNR2-T7 RNP); a cell extract (e.g., a yeast cell extract) derived from such a strain can significantly improve nucleic acid stability and the efficiency of protein production in an in vitro protein synthesis system without any additional manual addition of T7 RNP. On this basis, the present inventors have completed the present invention.
Experiments show that, when no additional T7 RNP is added, the luciferase activity of the strain in IVTT is at least 5 times that of the wild type.
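The comparison against the wild type amounts to a fold-change calculation on replicate luciferase readings. The following Python sketch shows one way such a comparison might be computed; the function names, the use of replicate means, and the default threshold parameter are assumptions made for illustration and are not taken from the experiments of the invention.

```python
from statistics import mean
from typing import Sequence


def fold_change(engineered: Sequence[float], wild_type: Sequence[float]) -> float:
    """Ratio of the mean engineered-strain signal to the mean wild-type signal."""
    wt_mean = mean(wild_type)
    if wt_mean <= 0:
        raise ValueError("wild-type mean signal must be positive")
    return mean(engineered) / wt_mean


def meets_threshold(
    engineered: Sequence[float],
    wild_type: Sequence[float],
    threshold: float = 5.0,  # the "at least 5 times" figure stated above
) -> bool:
    """True if the engineered strain reaches the stated fold improvement."""
    return fold_change(engineered, wild_type) >= threshold
```

Readings would be supplied as relative light units measured under identical reaction conditions for both extracts.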
Terms
As used herein, the terms "engineered strain" and "genetically engineered strain" are used interchangeably and refer to the engineered strain of the invention for use in increasing the efficiency of in vitro protein synthesis, i.e. the strain of the first aspect of the invention.
eIF4F element
In eukaryotes, a variety of translation initiation factors are involved in the protein translation initiation process (Table 1). Among them, eIF4F is responsible for recognizing the "cap structure" and recruiting downstream translation initiation factors and ribosomes. eIF4F consists of three protein subunits: eIF4E, eIF4G and eIF4A. eIF4E binds specifically to the cap structure and thereby anchors eIF4F in the untranslated region at the 5' end of the mRNA; eIF4A is an RNA helicase; eIF4G is a scaffold protein for almost the entire translation initiation process and can interact with a variety of translation initiation factors, playing an important role in the recruitment of downstream factors.
TABLE 1 translation initiation factor in Yeast
Figure BDA0002583106770000141
In the present invention, the in vitro protein synthesis capacity is enhanced by inserting, upstream of eIF4G, a constitutive or inducible promoter (e.g., pScTEF1, pScPGK1, pKLTEF1, pKLPGK1, pScADH1, pScTPI1, pScTDH3, pKLADH1, pKLTPI1, pKLTDH3, etc.) derived from a yeast (e.g., Saccharomyces cerevisiae, Kluyveromyces, etc.).
In a preferred embodiment, the nucleotide sequence of the eIF4G is shown in SEQ ID NO. 7; the protein sequence of the eIF4G is shown in SEQ ID NO. 2.
ATGGGCGAACCTACATCCGATCAGCAACCAGCTGTTGAAGCTCCAGTTGTGCAGGA GGAGACAACCAGTTCTCCGCAAAAAAACAGTGGATATGTCAAGAATACTGCTGGA AGCGGTGCTCCTAGAAATGGGAAATATGATGGTAACAGGAAGAACTCTAGGCCTT ATAACCAAAGAGGTAACAACAACAATAATAATGGTTCTTCCTCGAATAAGCACTAT CAAAAGTATAACCAACCAGCGTACGGTGTTTCTGCGGGATACATTCCGAACTACGG CGTATCGGCAGAGTACAACCCTCTGTACTATAACCAGTACCAACAGCAGCAACAGC TGTACGCTGCTGCTTACCAGACTCCAATGAGCGGACAAGGTTATGTCCCCCCAGTA GTGTCTCCAGCTGCTGTTTCAGCTAAACCAGCGAAGGTTGAGATTACTAACAAGTC TGGTGAACACATAGATATTGCTTCCATTGCTCATCCACATACTCATTCTCATTCTCA ATCTCATTCGCGTGCAGTTCCAGTAGTGTCGCCTCCAGCTAACGTTACCGTCGCTGC TGCTGTATCATCCTCTGTGTCTCCATCAGCTTCTCCAGCTGTCAAAGTACAGAGCCC TGCTGCTAATGGTAAGGAACAATCTCCAGCTAAGCCTGAAGAACCAAAGAAGGAC ACTTTAATTGTGAACGATTTCTTGGAACAAGTTAAAAGACGCAAGGCTGCTTTAGC TGCTAAGAAGGCTGTCGAAGAGAAGGGTCCTGAGGAACCGAAGGAATCTGTCGTT GGAACTGACACTGATGCAAGCGTTGATACTAAGACAGGGCCTACAGCCACTGAAT CTGCCAAGTCTGAAGAAGCTCAATCAGAATCACAAGAAAAGACTAAGGAAGAGGC TCCAGCTGAGCCAAAACCATTGACTTTGGCCGAAAAATTGAGACTTAAGAGGATGG AAGCTGCAAAGCAAGCTTCTGCTAAGACCGAGGAACTAAAGACTGAAGAATCTAA GCCTGAAGAAACAAAGACCGAGGAGCTAAAGACTGAAGAATCTAAGCCTGAAGAA ACAAAGACCGAGGAGCTAAAGACTGAAGAAACAAAGTCCGAGGAACTAAAGACT GAAGAACCTAAGGCGGAAGAATCAAAGGCGGAAGAACCAAAGCCTGAAGAACCA AAGACCGAGGAACCGACGACTGAACAACCAAAGTCAGATGAACCAAAGTCGGAA GAATCAAAAACTGAAGAGCCAAAAACCGAGGTATTAAAGACTGAAGAACCAAAAT CGGAAGAATCAAAGCCTGCAGAACCAAAGACTGAAGAAACAGCAACTGAAGAAA CAGCAACTGAAGCAAACGCCGAAGAAGGTGAACCGGCTCCTGCTGGTCCCGTTGA AACTCCTGCTGATGTTGAAACAAAACCTCGAGAAGAGGCTGAAGTTGAAGACGAT GGAAAGATTACCATGACCGATTTCCTACAGAAGTTGAAAGAGGTTTCTCCAGTTGA TGATATTTATTCCTTCCAATACCCAAGTGACATTACGCCTCCAAATGATAGATATAA AAAGACAAGCATTAAATATGCATACGGACCTGATTTCTTGTATCAGTTCAAAGAAA AGGTCGATGTTAAATACGATCCAGCGTGGATGGCTGAAATGACGAGTAAAATTGTC ATCCCTCCTAAGAAGCCTGGTTCAAGCGGAAGAGGCGAAGATAGATTTAGTAAGG GTAAGGTTGGATCTCTAAGAAGTGAAGGCAGATCGGGTTCCAGGTCCAACTCGAA GAAGAAGTCAAAGAGGGATGATAGAAAATCTAATAGATCATACACTTCCAGAAAG GACCGTGAAAGATTCAGAGAGGAAGAAGTCGAAGAGCCAAAGGTTGAGGTTGCCC CATTGGTCCCAAGTGCTAATAGATGGGTTCCTAAATCTAAGATGAAGAAAACAGAA 
GTCAAGTTAGCTCCAGACGGAACAGAACTTTACGACGCGGAAGAAGCATCAAGAA AGATGAAGTCATTGCTGAATAAATTGACATTAGAAATGTTCGAACCTATTTCTGAT GATATCATGAAGATCGCTAACCAATCTAGATGGGAAGAAAAGGGTGAGACTTTGA AGATTGTCATCCAACAAATTTTCAATAAGGCCTGCGATGAACCTCATTGGTCATCA ATGTACGCGCAATTATGTGGTAAGGTCGTTAAAGACTTAGATGATAGCATTAAAGA CTCAGAAACCCCAGATAAGACTGGTTCTCACTTGGTTTTGCATTACTTAGTCCAAAG ATGTCAAACTGAATTCCAAACAGGATGGACTGATCAACTACCTACAAACGAAGAC GGTACTCCTCTACAACCTGAAATGATGTCCGATGAATACTATAAGATGGCTGCCGC TAAGAGAAGAGGTTTGGGTTTGGTTCGTTTCATTGGTTTCTTGTACCGTTCGAACTT ATTGACTTCCAGAATGGTCTTCTTCTGTTTCAAGAGACTAATGAAGGATATTCAAA ACTCTCCTACTGAAGATACTCTAGAGTCTGTATGTGAACTTTTGGAAACAATTGGTG AACAGTTCGAAGGTGCTCGTATTCAAGTTACTGCAGAAGCTGTCATTGAGGGTTCA AGCTTGCTAGACACACTATTCGACCAAATAAAGAACGTGATCGAAAATGGTGACAT CTCCAGCAGAATCAAGTTTAAGTTGATCGACATTGTCGAACTAAGAGAAAAGAGG AACTGGAATAGTAAAAATAAGAACGATGGTCCAAAGACCATTGCTCAAATTCACG AAGAAGAAGCCTTGAAGAGGGCTTTGGAGGAAAGAGAAAGAGAAAGAGATCGCC ATGGGTCCAGAGGTGGTTCCAGACGTATGAATAGCGAGAGAAACTCTTCTAGAAG AGATTTCTCCTCTCATTCTCACAGTCACAATCAAAATAGAGACGGTTTCACTACTAC CAGATCGTCATCAGTGAGATATTCTGAGCCAAAGAAGGAAGAACAAGCTCCAACT CCAACTAAATCTTCTGGTGGCGCTGCCAACATGTTTGATGCATTGATGGATGCCGA AGATGATTAA(SEQ ID NO.7)
MGEPTSDQQPAVEAPVVQEETTSSPQKNSGYVKNTAGSGAPRNGKYDGNRK NSRPYNQRGNNNNNNGSSSNKHYQKYNQPAYGVSAGYIPNYGVSAEYNPLYYNQ YQQQQQLYAAAYQTPMSGQGYVPPVVSPAAVSAKPAKVEITNKSGEHIDIASIAHP HTHSHSQSHSRAVPVVSPPANVTVAAAVSSSVSPSASPAVKVQSPAANGKEQSPAK PEEPKKDTLIVNDFLEQVKRRKAALAAKKAVEEKGPEEPKESVVGTDTDASVDTKT GPTATESAKSEEAQSESQEKTKEEAPAEPKPLTLAEKLRLKRMEAAKQASAKTEEL KTEESKPEETKTEELKTEESKPEETKTEELKTEETKSEELKTEEPKAEESKAEEPKPE EPKTEEPTTEQPKSDEPKSEESKTEEPKTEVLKTEEPKSEESKPAEPKTEETATEETAT EANAEEGEPAPAGPVETPADVETKPREEAEVEDDGKITMTDFLQKLKEVSPVDDIY SFQYPSDITPPNDRYKKTSIKYAYGPDFLYQFKEKVDVKYDPAWMAEMTSKIVIPP KKPGSSGRGEDRFSKGKVGSLRSEGRSGSRSNSKKKSKRDDRKSNRSYTSRKDRER FREEEVEEPKVEVAPLVPSANRWVPKSKMKKTEVKLAPDGTELYDAEEASRKMKS LLNKLTLEMFEPISDDIMKIANQSRWEEKGETLKIVIQQIFNKACDEPHWSSMYAQL CGKVVKDLDDSIKDSETPDKTGSHLVLHYLVQRCQTEFQTGWTDQLPTNEDGTPL QPEMMSDEYYKMAAAKRRGLGLVRFIGFLYRSNLLTSRMVFFCFKRLMKDIQNSP TEDTLESVCELLETIGEQFEGARIQVTAEAVIEGSSLLDTLFDQIKNVIENGDISSRIKF KLIDIVELREKRNWNSKNKNDGPKTIAQIHEEEALKRALEERERERDRHGSRGGSRR MNSERNSSRRDFSSHSHSHNQNRDGFTTTRSSSVRYSEPKKEEQAPTPTKSSGGAAN MFDALMDAEDD(SEQ ID NO.2)
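The correspondence between a nucleotide sequence and its protein sequence, such as SEQ ID NO.7 and SEQ ID NO.2, can be checked by translating the open reading frame. The following is a minimal Python sketch, assuming the standard genetic code and a sequence that starts in frame; whitespace inside the sequence, as it appears in the listing above, is ignored. The function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean(seq: str) -> str:
    """Remove whitespace and upper-case a sequence copied from the listing."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

Applied to the first thirty nucleotides of SEQ ID NO.7, `translate("ATGGGCGAACCTACATCCGATCAGCAACCA")` gives `MGEPTSDQQP`, which matches the start of SEQ ID NO.2.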
Pab1 element (Pab1 protein or PabI protein)
Pab1 is a 71kDa RNA-binding protein consisting of 4 RRM (RNA recognition motif 1-4) domains and 1 MLLE domain. Each RRM domain contains 2 conserved RNP structures (RNP 1/2) responsible for RNA binding.
In a preferred embodiment, the nucleotide sequence of Pab1 is shown in SEQ ID NO. 8; the protein sequence of Pab1 is shown in SEQ ID NO. 1.
ATGTCTGATATTACTGAAAAAACTGCTGAGCAATTGGAAAACTTGCAGATCA ACGATGATCAGCAACCAGCTCAATCTGCCAGTGCTCCATCCACTTCTGCTTCTGAA AGCGAAGCTTCTTCTGTTTCTAAGGTTGAAAACAACAACGCTTCATTGTACGTTGGT GAATTGGATCCAAACATTACTGAAGCATTGTTGTACGATGTGTTTTCACCATTGGGT CCAATTTCCTCGATCCGTGTTTGTCGTGATGCCGTCACCAAGGCTTCGTTAGGTTAC GCTTACGTTAACTATACTGATTACGAAGCTGGTAAGAAAGCTATTCAAGAATTGAA CTATGCTGAAATCAACGGTAGACCATGTAGAATTATGTGGTCCGAACGTGACCCAG CTATCAGAAAGAAGGGTTCTGGTAACATTTTCATCAAGAACTTGCACCCAGCCATT GACAACAAGGCTTTGCATGAAACTTTCTCCACTTTCGGTGAAGTCTTGTCTTGTAAA GTTGCTTTAGATGAGAATGGAAACTCTAGAGGCTTCGGTTTCGTTCATTTCAAGGA AGAATCCGATGCTAAGGATGCTATTGAAGCCGTCAACGGTATGTTGATGAACGGTT TGGAAGTTTACGTTGCCATGCACGTTCCAAAGAAGGACCGTATCTCCAAGTTGGAA GAAGCCAAGGCTAACTTCACCAACATTTACGTCAAGAACATTGACGTTGAAACCAC TGACGAAGAGTTCGAACAGTTGTTCTCCCAATACGGTGAAATTGTCTCTGCTGCTTT GGAAAAGGATGCTGAGGGTAAGCCAAAGGGTTTCGGTTTCGTTAACTTTGTTGACC ACAACGCCGCTGCCAAGGCCGTTGAAGAGTTGAACGGTAAGGAATTCAAGTCTCA AGCTTTGTACGTTGGCAGAGCTCAAAAGAAGTACGAACGTGCTGAAGAATTGAAG AAACAATACGAACAATACCGTTTGGAAAAATTGGCTAAGTTCCAAGGTGTTAACTT GTTCATCAAGAACTTGGACGATTCCATCGATGACGAAAAATTGAAGGAAGAATTCG CCCCATACGGTACCATCACCTCTGCTAGAGTCATGAGAGACCAAGAGGGTAACTCT AAGGGTTTCGGTTTCGTTTGTTTCTCTTCTCCAGAAGAAGCTACCAAGGCTATGACC GAAAAGAACCAACAAATTGTTGCCGGTAAGCCATTGTACGTTGCCATTGCTCAAAG AAAGGATGTCAGAAGATCCCAATTGGCTCAACAAATTCAAGCCAGAAACCAAATC AGATTCCAACAACAGCAACAACAACAAGCTGCTGCCGCTGCTGCTGGTATGCCAGG CCAATACATGCCACAAATGTTCTATGGTGTTATGGCCCCAAGAGGTTTCCCAGGTC CAAACCCAGGTATGAACGGCCCAATGGGTGCCGGTATTCCAAAGAACGGTATGGT CCCACCACCACAACAATTTGCTGGTAGACCAAACGGTCCAATGTACCAAGGTATGC CACCTCAAAACCAATTCCCAAGACACCAACAACAACACTACATCCAACAACAAAA GCAAAGACAAGCCTTGGGTGAACAATTGTACAAGAAGGTCAGTGCCAAGATTGAC GACGAAAACGCCGCTGGTAAGATCACCGGTATGATCTTGGATCTACCACCACAGCA AGTCATCCAATTGTTGGACAACGACGAACAATTTGAACAGCAATTCCAAGAAGCCT TAGCTGCTTACGAAAACTTCAAGAAGGAACAAGAAGCTCAAGCTTAA(SEQ ID NO.8)
MSDITEKTAEQLENLQINDDQQPAQSASAPSTSASESEASSVSKVENNNASLY VGELDPNITEALLYDVFSPLGPISSIRVCRDAVTKASLGYAYVNYTDYEAGKKAIQE LNYAEINGRPCRIMWSERDPAIRKKGSGNIFIKNLHPAIDNKALHETFSTFGEVLSCK VALDENGNSRGFGFVHFKEESDAKDAIEAVNGMLMNGLEVYVAMHVPKKDRISK LEEAKANFTNIYVKNIDVETTDEEFEQLFSQYGEIVSAALEKDAEGKPKGFGFVNFV DHNAAAKAVEELNGKEFKSQALYVGRAQKKYERAEELKKQYEQYRLEKLAKFQG VNLFIKNLDDSIDDEKLKEEFAPYGTITSARVMRDQEGNSKGFGFVCFSSPEEATKA MTEKNQQIVAGKPLYVAIAQRKDVRRSQLAQQIQARNQIRFQQQQQQQAAAAAAG MPGQYMPQMFYGVMAPRGFPGPNPGMNGPMGAGIPKNGMVPPPQQFAGRPNGP MYQGMPPQNQFPRHQQQHYIQQQKQRQALGEQLYKKVSAKIDDENAAGKITGMIL DLPPQQVIQLLDNDEQFEQQFQEALAAYENFKKEQEAQA(SEQ ID NO.1)
First fusion protein
As used herein, the terms "fusion protein of the invention", "PabI-eIF4G fusion protein of the invention", "first fusion protein", and "PabI-eIF4G fusion protein" are used interchangeably and refer to a fusion protein formed by the fusion of a PabI element and an eIF4G element. In the fusion proteins of the invention, the PabI element and the eIF4G element may or may not contain a linker peptide or flexible linker. Furthermore, the fusion protein may or may not contain the starting Met; may or may not contain a signal peptide; and with or without a tag sequence (e.g., 6His, etc.).
In a preferred embodiment, the fusion protein of the present invention has the structure of formula Ia or Ib as described above. Preferably, the amino acid sequence of the fusion protein of the invention is shown in SEQ ID NO.3.
MSDITEKTAEQLENLQINDDQQPAQSASAPSTSASESEASSVSKVENNNASLY VGELDPNITEALLYDVFSPLGPISSIRVCRDAVTKASLGYAYVNYTDYEAGKKAIQE LNYAEINGRPCRIMWSERDPAIRKKGSGNIFIKNLHPAIDNKALHETFSTFGEVLSCK VALDENGNSRGFGFVHFKEESDAKDAIEAVNGMLMNGLEVYVAMHVPKKDRISK LEEAKANFTNIYVKNIDVETTDEEFEQLFSQYGEIVSAALEKDAEGKPKGFGFVNFV DHNAAAKAVEELNGKEFKSQALYVGRAQKKYERAEELKKQYEQYRLEKLAKFQG VNLFIKNLDDSIDDEKLKEEFAPYGTITSARVMRDQEGNSKGFGFVCFSSPEEATKA MTEKNQQIVAGKPLYVAIAQRKDVRRSQLAQQIQARNQIRFQQQQQQQAAAAAAG MPGQYMPQMFYGVMAPRGFPGPNPGMNGPMGAGIPKNGMVPPPQQFAGRPNGP MYQGMPPQNQFPRHQQQHYIQQQKQRQALGEQLYKKVSAKIDDENAAGKITGMIL DLPPQQVIQLLDNDEQFEQQFQEALAAYENFKKEQEAQAGGGGSGGGGSTQDEVQ GPHAGKSTVGGGGSGEPTSDQQPAVEAPVVQEETTSSPQKNSGYVKNTAGSGAPR NGKYDGNRKNSRPYNQRGNNNNNNGSSSNKHYQKYNQPAYGVSAGYIPNYGVSA EYNPLYYNQYQQQQQLYAAAYQTPMSGQGYVPPVVSPAAVSAKPAKVEITNKSGE HIDIASIAHPHTHSHSQSHSRAVPVVSPPANVTVAAAVSSSVSPSASPAVKVQSPAA NGKEQSPAKPEEPKKDTLIVNDFLEQVKRRKAALAAKKAVEEKGPEEPKESVVGTD TDASVDTKTGPTATESAKSEEAQSESQEKTKEEAPAEPKPLTLAEKLRLKRMEAAK QASAKTEELKTEESKPEETKTEELKTEESKPEETKTEELKTEETKSEELKTEEPKAEE SKAEEPKPEEPKTEEPTTEQPKSDEPKSEESKTEEPKTEVLKTEEPKSEESKPAEPKTE ETATEETATEANAEEGEPAPAGPVETPADVETKPREEAEVEDDGKITMTDFLQKLK EVSPVDDIYSFQYPSDITPPNDRYKKTSIKYAYGPDFLYQFKEKVDVKYDPAWMAE MTSKIVIPPKKPGSSGRGEDRFSKGKVGSLRSEGRSGSRSNSKKKSKRDDRKSNRSY TSRKDRERFREEEVEEPKVEVAPLVPSANRWVPKSKMKKTEVKLAPDGTELYDAE EASRKMKSLLNKLTLEMFEPISDDIMKIANQSRWEEKGETLKIVIQQIFNKACDEPH WSSMYAQLCGKVVKDLDDSIKDSETPDKTGSHLVLHYLVQRCQTEFQTGWTDQLP TNEDGTPLQPEMMSDEYYKMAAAKRRGLGLVRFIGFLYRSNLLTSRMVFFCFKRL MKDIQNSPTEDTLESVCELLETIGEQFEGARIQVTAEAVIEGSSLLDTLFDQIKNVIEN GDISSRIKFKLIDIVELREKRNWNSKNKNDGPKTIAQIHEEEALKRALEERERERDRH GSRGGSRRMNSERNSSRRDFSSHSHSHNQNRDGFTTTRSSSVRYSEPKKEEQAPTPT KSSGGAANMFDALMDAEDD(SEQ ID NO.3)
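Comparing SEQ ID NO.3 with SEQ ID NO.1 and SEQ ID NO.2 suggests the layout Pab1, then a 30-residue linker, then eIF4G without its starting Met. The following Python sketch assembles a fusion sequence in that layout. The linker string is read off SEQ ID NO.3 above; the function name and its parameters are illustrative, and the short fragments in the usage note stand in for the full sequences.

```python
# Linker residues as they appear between the Pab1 and eIF4G parts of SEQ ID NO.3.
LINKER = "GGGGSGGGGSTQDEVQGPHAGKSTVGGGGS"


def build_fusion(
    n_term: str,
    c_term: str,
    linker: str = LINKER,
    drop_c_term_met: bool = True,
) -> str:
    """Join an N-terminal element, a linker, and a C-terminal element.

    Whitespace copied from the sequence listing is removed. If requested,
    the starting Met of the C-terminal element is dropped.
    """
    n_term = "".join(n_term.split())
    c_term = "".join(c_term.split())
    if drop_c_term_met and c_term.startswith("M"):
        c_term = c_term[1:]
    return n_term + linker + c_term
```

With the last residues of SEQ ID NO.1 and the first residues of SEQ ID NO.2, `build_fusion("KKEQEAQA", "MGEPTSDQQ")` reproduces the junction `...EAQAGGGGS...GGGGSGEPTSDQQ` seen in SEQ ID NO.3.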
In the present invention, the fusion protein of the present invention can significantly improve the in vitro protein synthesis capacity of cell-free, in vitro protein synthesis systems, particularly yeast in vitro protein synthesis systems.
Exogenous gene expression cassette
As used herein, the term "exogenous gene expression cassette" refers to a first exogenous gene expression cassette with a first nucleic acid construct that expresses a first fusion protein (e.g., a PabI-eIF4G fusion protein), and optionally a second exogenous gene expression cassette that expresses a second nucleic acid construct (e.g., an exogenous gene with a ScRNR2 promoter (T7 RNAP protein)).
The engineered strain is obtained by protoplast fusion of a recombinant strain into which the first exogenous gene expression cassette is integrated and a recombinant strain into which the second exogenous gene expression cassette is integrated. The engineered strain thus has both the first exogenous gene expression cassette and the second exogenous gene expression cassette integrated.
Design and analysis method for enhancing biosynthesis
A. Analysis of Effect of transcriptional control on Activity during in vivo biosynthesis
T7 RNA polymerase (T7 RNAP or T7 RNP) is an RNA polymerase derived from T7 bacteriophage that recognizes a conserved 23-nt promoter sequence (pT7) and provides strong transcription activity. T7 RNA polymerase depends on the pT7 promoter and catalyzes RNA synthesis exclusively in the 5' to 3' direction; it is highly specific and transcribes only DNA located downstream of a T7 promoter; and it produces longer transcripts than the multi-subunit bacterial RNA polymerases.
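Because transcription depends on the promoter, a DNA template can be screened for the promoter before use. The following Python sketch searches a template for the commonly cited 23-nt T7 consensus promoter, in which position +1 is the first G after TATA. The consensus string is general knowledge and is not given in this document; the function name is illustrative; only exact matches on the given strand are found.

```python
from typing import List

# Commonly cited T7 consensus promoter, positions -17 to +6 (an assumption
# of this sketch; the document states only that the promoter is 23 nt long).
T7_CONSENSUS = "TAATACGACTCACTATAGGGAGA"
PLUS_ONE_OFFSET = 17  # index of the +1 nucleotide within the consensus


def find_t7_transcription_starts(template: str) -> List[int]:
    """Return 0-based indices of the +1 position for each exact promoter match."""
    seq = "".join(template.split()).upper()
    starts = []
    pos = seq.find(T7_CONSENSUS)
    while pos != -1:
        starts.append(pos + PLUS_ONE_OFFSET)
        pos = seq.find(T7_CONSENSUS, pos + 1)
    return starts
```

An empty result means that no exact consensus promoter is present on the strand searched; degenerate promoters and the reverse strand would need a more permissive search.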
B. Analysis of the Effect of translational Regulation on Activity during in vivo biosynthesis
eIF4F element
In organisms, a variety of translation initiation factors are involved in the protein translation initiation process; the translation initiation factors in eukaryotic cells and in human cells are shown in Tables 1 and 2, respectively. In eukaryotic cells, eIF4F is responsible for recognizing the "cap structure" and recruiting downstream translation initiation factors and ribosomes. eIF4F consists of three protein subunits: eIF4E, eIF4G and eIF4A. eIF4E binds specifically to the cap structure and thereby anchors eIF4F in the untranslated region at the 5' end of the mRNA; eIF4A is an RNA helicase; eIF4G is a scaffold protein for almost the entire translation initiation process and can interact with a variety of translation initiation factors, playing an important role in the recruitment of downstream factors.
TABLE 1 translation initiation factor in eukaryotic cells
Figure BDA0002583106770000201
Figure BDA0002583106770000211
TABLE 2 translation initiation factor in human cells
Figure BDA0002583106770000212
Pab1 element (Pab 1 protein)
Pab1 is located on chromosome C and is a 71kDa RNA binding protein consisting of 4 RRM (RNA recognition motif 1-4) domains and 1 MLLE domain. Each RRM domain contains 2 conserved RNP structures (RNP 1/2) responsible for binding to RNA.
At the 3' end of the mRNA, the polyA sequence is recognized and bound by the Pab1 protein. While bound to polyA, Pab1 can interact with eIF4G, thereby connecting the two ends of the mRNA [30]. Thus, the binding of eIF4G to Pab1 allows the many regulatory elements present at the 3' end of the mRNA to regulate protein translation initiation in cells; it is also hypothesized that the loop structure of the mRNA formed by the interaction of eIF4G with Pab1 facilitates translation initiation by allowing protein synthesis factors to be recruited rapidly.
C. Analysis of influence of nuclease stability on Activity during in vivo biosynthesis
Nucleases (also known as nucleodepolymerases or polynucleotidases) are enzymes capable of hydrolyzing the phosphodiester bonds between nucleotides, which is the first step of nucleic acid degradation. According to their site of action, nucleases can be classified into exonucleases and endonucleases. An exonuclease that hydrolyzes nucleotides one by one from the 3' end is called a 3' to 5' exonuclease; an exonuclease that hydrolyzes nucleotides one by one from the 5' end is called a 5' to 3' exonuclease; an endonuclease catalyzes the hydrolysis of phosphodiester bonds within a polynucleotide. Nucleases are further classified into deoxyribonucleases (DNases), which act on DNA, and ribonucleases (RNases), which act on RNA. In an in vitro synthesis system, the stability of the nucleic acids (RNA and DNA) that serve as coding templates for the protein affects the protein yield. During eukaryotic translation, the cap structure at the 5' end of the RNA can be processed immediately, and the 5' cap structure plays a crucial role in RNA stability and translation efficiency.
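The three modes of action can be pictured on a sequence written 5' to 3' as a string. The following Python sketch is a toy model only: it removes a fixed number of nucleotides or cuts at a fixed position and does not model enzyme kinetics or sequence specificity. The function names are illustrative.

```python
from typing import Tuple


def exo_5to3(seq: str, n: int) -> str:
    """5' to 3' exonuclease: remove up to n nucleotides from the 5' end."""
    n = max(0, min(n, len(seq)))
    return seq[n:]


def exo_3to5(seq: str, n: int) -> str:
    """3' to 5' exonuclease: remove up to n nucleotides from the 3' end."""
    n = max(0, min(n, len(seq)))
    return seq[:len(seq) - n]


def endo_cut(seq: str, pos: int) -> Tuple[str, str]:
    """Endonuclease: cleave the internal bond before index pos into two fragments."""
    if not 0 < pos < len(seq):
        raise ValueError("cut position must be internal to the sequence")
    return seq[:pos], seq[pos:]
```

In this picture, knocking out a 5' to 3' exonuclease corresponds to `exo_5to3` never being applied to the transcript.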
In eukaryotes, there are deadenylation-dependent and deadenylation-independent RNA degradation mechanisms, in which the mature or immature 5' cap structure of the RNA can be removed by a recruited decapping complex, and the RNA is also degraded in the 5' to 3' direction by nucleases [32]. In the deadenylation-independent mRNA degradation mechanism, an endonuclease generates a strand break in the middle of the nucleic acid molecule, and an exonuclease then degrades the RNA from 3' to 5'.
In the K. lactis in vitro protein synthesis system, mRNA with a polyA tail at the 3' end is transcribed using exogenous linear or circular DNA as the template and, depending on the promoter and the RNA polymerase used, may or may not carry a mature cap structure at the 5' end. Therefore, DNases, RNases, exonucleases, and endonucleases all affect template stability in the in vitro protein synthesis system, and modifying these enzymes may enhance in vitro biosynthesis activity.
Among these factors, the lifetime of the mRNA is one of the major factors limiting in vitro protein synthesis, because of the particular instability of mRNA and the high activity of RNases in vivo.
The analysis and specific modification of each gene are shown in FIG. 1.
In a preferred embodiment, the design and analysis methods of the invention for enhancing in vitro synthesis are as follows:
I. Enhancing in vitro synthesis by regulating translation, i.e., designing an artificial fusion protein to improve in vitro translation efficiency.
By combining the analysis of the influence of the in vivo translation regulation and control of the eukaryotic cell on the activity, the invention constructs a fusion protein and improves the in vitro protein synthesis efficiency. In the fusion proteins of the invention, the PabI element and the eIF4G element may or may not contain a linker peptide or flexible linker. Furthermore, the fusion protein may or may not contain the starting Met; may or may not contain a signal peptide; and with or without a tag sequence (e.g., 6His, etc.).
II. Enhancing in vitro synthesis by regulating transcription, i.e., reducing nucleases in the in vitro synthesis system to increase nucleic acid stability, and introducing a gene for endogenous expression to facilitate transcription.
Based on the combined analysis of the effects of transcriptional regulation and of nucleases on activity, T7 RNP can provide stronger transcriptional activity, and nuclease activity directly affects nucleic acid stability and therefore the efficiency of in vitro protein synthesis. Accordingly, the nuclease is modified or knocked out to improve nucleic acid stability, and a gene is introduced and expressed endogenously to enhance transcription, thereby improving in vitro protein synthesis activity.
III. Selecting and modifying the genes by using a CRISPR-Cas9 gene editing system; the overall modification design is shown in FIG. 2.
Of course, the present invention is still applicable to the engineering of other eukaryotes to enhance protein synthesis activity in vitro.
The method of the present invention is also suitable for highly expressed proteins in cells.
In vitro expression system
Yeast combines the advantages of simple culture, efficient protein folding, and post-translational modification. Saccharomyces cerevisiae and Pichia pastoris are model organisms for expressing complex eukaryotic proteins and membrane proteins, and yeast can also be used as a raw material for preparing an in vitro translation system.
Kluyveromyces is a genus of ascosporogenous yeast, of which Kluyveromyces marxianus and Kluyveromyces lactis are widely used in industry. Compared with other yeasts, Kluyveromyces lactis has many advantages, such as superior secretion ability, better large-scale fermentation characteristics, food-grade safety, and the ability to carry out post-translational modification of proteins.
In the present invention, the yeast in vitro expression system is not particularly limited, and a preferred yeast in vitro expression system is a Kluyveromyces expression system (more preferably, a Kluyveromyces lactis expression system).
Protein synthesis system
The invention provides an in vitro cell-free protein synthesis system, which comprises:
(a) A cell extract (e.g., a yeast cell extract) derived from the genetically engineered strain of the first aspect of the invention.
In a preferred embodiment, the synthesis system further comprises:
(b) Polyethylene glycol;
(c) Optionally exogenous sucrose; and
(d) Optionally a solvent, which is water or an aqueous solvent.
In a particularly preferred embodiment, the in vitro protein synthesis system provided by the present invention comprises: yeast cell extract, 4-hydroxyethylpiperazine ethanesulfonic acid, potassium acetate, magnesium acetate, adenosine triphosphate (ATP), guanosine triphosphate (GTP), cytidine triphosphate (CTP), uridine triphosphate (UTP), amino acid mixture, creatine phosphate, dithiothreitol (DTT), phosphocreatine kinase, RNase inhibitor, luciferin, luciferase DNA, RNA polymerase.
In the present invention, the RNA polymerase is not particularly limited and may be selected from one or more RNA polymerases, and a typical RNA polymerase is T7RNA polymerase.
In the present invention, the proportion of the yeast cell extract in the in vitro protein synthesis system is not particularly limited, and usually the yeast cell extract occupies 20 to 70%, preferably 30 to 60%, more preferably 40 to 50% of the volume in the in vitro protein synthesis system.
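When a reaction is assembled, the extract volume follows directly from the volume fraction, and the volume of each stock solution follows from the dilution relation C1·V1 = C2·V2. The following Python sketch shows both calculations. The function names are illustrative, and any stock concentrations passed in are assumptions of the user, since this document specifies only final concentrations.

```python
def extract_volume_ul(reaction_volume_ul: float, fraction: float) -> float:
    """Volume of yeast cell extract for a given volume fraction (0.20 to 0.70)."""
    if not 0.20 <= fraction <= 0.70:
        raise ValueError("fraction outside the usual 20-70% range stated above")
    return reaction_volume_ul * fraction


def stock_volume_ul(final_mm: float, stock_mm: float, reaction_volume_ul: float) -> float:
    """Volume of stock needed to reach final_mm in the reaction (C1*V1 = C2*V2)."""
    if stock_mm <= 0 or final_mm > stock_mm:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_mm * reaction_volume_ul / stock_mm
```

For a 100 μL reaction at a 50% extract fraction, `extract_volume_ul(100, 0.5)` gives 50 μL of extract; reaching 3mM magnesium acetate from a hypothetical 100mM stock takes `stock_volume_ul(3, 100, 100)`, i.e. 3 μL.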
In the present invention, the yeast cell extract does not contain intact cells, and typical yeast cell extracts include ribosomes for protein translation, transfer RNAs, aminoacyl tRNA synthetases, initiation and elongation factors required for protein synthesis, and termination and release factors. In addition, the yeast extract also contains some other proteins, especially soluble proteins, which originate in the cytoplasm of the yeast cell.
In the present invention, the content of the protein contained in the yeast cell extract is 20-100mg/mL, preferably 50-100 mg/mL. The method for determining the protein content is a Coomassie brilliant blue determination method.
In the present invention, the preparation method of the yeast cell extract is not limited, and a preferred preparation method comprises the steps of:
(i) Providing a yeast cell;
(ii) Washing the yeast cells to obtain washed yeast cells;
(iii) Performing cell breaking treatment on the washed yeast cells to obtain a yeast crude extract;
(iv) Performing solid-liquid separation on the yeast crude extract to obtain a liquid fraction, which is the yeast cell extract.
In the present invention, the solid-liquid separation method is not particularly limited, and a preferable method is centrifugation.
In a preferred embodiment, the centrifugation is carried out in the liquid state.
In the present invention, the centrifugation conditions are not particularly limited; a preferable centrifugal force is 5000 to 100000 × g, more preferably 8000 to 30000 × g.
In the present invention, the centrifugation time is not particularly limited; a preferable centrifugation time is 0.5 min to 2 h, more preferably 20 to 50 min.
In the present invention, the centrifugation temperature is not particularly limited; centrifugation is preferably performed at 1 to 10 ℃, more preferably at 2 to 6 ℃.
In the present invention, the washing treatment is not particularly limited; a preferable washing treatment uses a washing solution at pH 7 to 8 (preferably 7.4). The washing solution is not particularly limited and is typically selected from the group consisting of: potassium 4-hydroxyethylpiperazine ethanesulfonate, potassium acetate, magnesium acetate, or a combination thereof.
In the present invention, the manner of the cell disruption treatment is not particularly limited; a preferable cell disruption treatment is high-pressure disruption or freeze-thaw disruption (e.g., at liquid nitrogen temperature).
The nucleoside triphosphate mixture in the in vitro protein synthesis system consists of adenosine triphosphate, guanosine triphosphate, cytidine triphosphate and uridine triphosphate. In the present invention, the concentration of each nucleotide is not particularly limited; usually each nucleotide is present at 0.5 to 5 mM, preferably 1.0 to 2.0 mM.
The amino acid mixture in the in vitro protein synthesis system may comprise natural or unnatural amino acids, and may comprise D-or L-form amino acids. Representative amino acids include (but are not limited to) the 20 natural amino acids: glycine, alanine, valine, leucine, isoleucine, phenylalanine, proline, tryptophan, serine, tyrosine, cysteine, methionine, asparagine, glutamine, threonine, aspartic acid, glutamic acid, lysine, arginine, and histidine. The concentration of each amino acid is usually 0.01-0.5mM, preferably 0.02-0.2mM, such as 0.05, 0.06, 0.07, 0.08mM.
In a preferred embodiment, the in vitro protein synthesis system further comprises polyethylene glycol or an analog thereof. The concentration of polyethylene glycol or an analog thereof is not particularly limited, and usually, the concentration (w/v) of polyethylene glycol or an analog thereof is 0.1 to 8%, preferably 0.5 to 4%, more preferably 1 to 2%, based on the total weight of the protein synthesis system. Representative PEG examples include (but are not limited to): PEG3000, PEG8000, PEG6000 and PEG3350. It is understood that the systems of the present invention may also include other polyethylene glycols of various molecular weights (e.g., PEG200, 400, 1500, 2000, 4000, 6000, 8000, 10000, etc.).
In a preferred embodiment, the in vitro protein synthesis system further comprises sucrose. The concentration of sucrose is not particularly limited, and generally, the concentration of sucrose is 0.03 to 40wt%, preferably 0.08 to 10wt%, more preferably 0.1 to 5wt%, based on the total weight of the protein synthesis system.
A particularly preferred in vitro protein synthesis system comprises, in addition to yeast extract, the following components: 22 mM 4-hydroxyethylpiperazine ethanesulfonic acid at pH 7.4, 30-150 mM potassium acetate, 1.0-5.0 mM magnesium acetate, 1.5-4 mM nucleoside triphosphate mixture, 0.08-0.24 mM amino acid mixture, 25 mM creatine phosphate, 1.7 mM dithiothreitol, 0.27 mg/mL phosphocreatine kinase, 1%-4% polyethylene glycol, 0.5%-2% sucrose, 8-20 ng/µL firefly luciferase DNA, and 0.027-0.054 mg/mL T7 RNA polymerase.
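As a worked illustration of the recipe above, the volume of each stock solution needed for one reaction follows from C_stock × V_stock = C_final × V_total. The sketch below is only an example: the function name `stock_volumes`, the stock concentrations, the 30 µL reaction volume and the specific final concentrations chosen inside the stated ranges are all assumptions, not values taken from the invention.

```python
def stock_volumes(final_conc, stock_conc, total_volume_ul):
    """Return the volume (µL) of each stock needed for one reaction.

    Applies C_stock * V_stock = C_final * V_total. The final and stock
    concentration of a component must be given in the same unit.
    """
    volumes = {}
    for name, c_final in final_conc.items():
        c_stock = stock_conc[name]
        if c_final > c_stock:
            raise ValueError(f"{name}: final concentration exceeds stock")
        volumes[name] = c_final * total_volume_ul / c_stock
    return volumes


# Final concentrations (mM) chosen inside the ranges stated in the text.
final_conc = {
    "HEPES pH 7.4": 22.0,
    "potassium acetate": 90.0,
    "magnesium acetate": 3.0,
    "creatine phosphate": 25.0,
    "DTT": 1.7,
}
# Hypothetical stock concentrations (mM); assumptions, not from the source.
stock_conc = {
    "HEPES pH 7.4": 1000.0,
    "potassium acetate": 3000.0,
    "magnesium acetate": 100.0,
    "creatine phosphate": 500.0,
    "DTT": 100.0,
}

volumes = stock_volumes(final_conc, stock_conc, total_volume_ul=30.0)
for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} µL")
```

The remaining volume would be filled by the yeast cell extract, the other listed components and water; those are left out here because their stock concentrations are not stated.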
The main advantages of the present invention include:
(i) The invention constructs, for the first time, a new genetically engineered strain in which (a) a first exogenous gene expression cassette, comprising a first nucleic acid construct that expresses a first fusion protein (such as a fusion protein formed by KlPAB1 and KleIF4G), is integrated into the genome, and (b) the expression or activity of a nuclease (such as KlEXN53) is reduced; optionally, a second exogenous gene expression cassette comprising a second nucleic acid construct (such as pScRNR2-T7RNP, in which T7RNP is driven by the pScRNR2 promoter) is also integrated.
(ii) The invention discovers for the first time that a cell extract (such as a yeast cell extract) derived from the engineered strain of the invention significantly improves nucleic acid stability, requires no exogenously added T7RNP, and significantly improves the efficiency of protein production in an in vitro protein synthesis system.
(iii) The invention discovers for the first time that, without any additionally added T7RNP, the luciferase activity of the strain of the invention in IVTT is at least 5 times that of the wild-type strain.
(iv) The invention discovers for the first time that the luciferase activity in IVTT of the strain of the invention is significantly higher than that of a strain expressing only the first fusion protein of the invention (whose luciferase activity in IVTT is 2.6 times higher than that of the wild type).
(v) The invention discovers for the first time that the luciferase activity in IVTT of the strain of the invention is significantly higher than that of a strain in which only the expression or activity of the nuclease (such as KlEXN53) is reduced (whose luciferase activity in IVTT is improved by 1.46 times compared with the wild type).
(vi) When the engineered strain of the invention has both of the following characteristics: (a) an integrated first exogenous gene expression cassette comprising a first nucleic acid construct that expresses a first fusion protein (e.g., a fusion protein formed by KlPAB1 and KleIF4G), and (b) reduced expression or activity of a nuclease (e.g., KlEXN53), a cell-free in vitro protein synthesis system (IVTT) based on the engineered strain produces exogenous proteins (e.g., luciferase) synergistically and with high efficiency.
(vii) The invention achieves, for the first time, stable integration of T7RNAP in the yeast cell genome and continuous expression of the T7RNAP protein, so that no additional T7RNAP needs to be added when running IVTT in the cell-free synthesis system, which saves time and cost.
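The synergy statement in items (iii) to (vi) can be checked with simple arithmetic. The sketch below assumes that the reported figures (2.6 and 1.46) are activity ratios relative to the wild type and that 5 is the lower bound for the combined strain; these readings, the function name and the two reference models (additive and multiplicative independence) are assumptions made for illustration only.

```python
def expected_combined_fold(fold_a, fold_b):
    """Fold-change expected if two modifications acted independently.

    Returns (additive, multiplicative) expectations relative to wild type.
    """
    additive = 1.0 + (fold_a - 1.0) + (fold_b - 1.0)
    multiplicative = fold_a * fold_b
    return additive, multiplicative


fusion_only = 2.6         # fusion protein strain vs. wild type (from the text)
nuclease_only = 1.46      # nuclease-reduced strain vs. wild type (from the text)
combined_observed = 5.0   # lower bound reported for the combined strain

additive, multiplicative = expected_combined_fold(fusion_only, nuclease_only)
is_synergistic = combined_observed > max(additive, multiplicative)

print(f"additive expectation:       {additive:.2f}")
print(f"multiplicative expectation: {multiplicative:.2f}")
print(f"observed lower bound exceeds both: {is_synergistic}")
```

Under these assumptions the combined strain exceeds both independent-effect expectations, which is the sense in which the effect is described as synergistic.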
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Experimental procedures for which no specific conditions are noted in the following examples generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the manufacturer's recommendations. Unless otherwise indicated, percentages and parts are percentages and parts by weight.
The materials and reagents used in the examples of the present invention are commercially available products unless otherwise specified.
Example 1 Construction of a novel fusion protein to substantially increase in vitro translation efficiency
1.1 The translation initiation factors eIF4G and Pab1 of K. lactis are modified by CRISPR-Cas9 gene editing: KleIF4G is fused with its interacting protein, thereby improving the efficiency of the cell-free in vitro translation system.
1.1.1 KlPab1 sequence search and CRISPR gRNA sequence determination
In the invention, KlPab1 and KleIF4G are fused by CRISPR-Cas9 gene editing to promote their interaction, thereby improving in vitro translation efficiency.
Based on the Pab1 sequence, the KlPab1 gene sequence in Kluyveromyces lactis was obtained (located at 1553322..1555100 on chromosome C). A PAM sequence (NGG) was searched for near the stop codon of the KlPab1 gene, and the gRNA sequence was determined. The gRNA selection criteria are: moderate GC content (in this invention, 40-60%) and no poly-T stretch. The KlPab1 gRNA sequence finally determined in this invention is tgcttacgaaaaacttcaaga, located at 1555058..1555077 on chromosome C.
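The two selection criteria stated above (GC content of 40-60% and no poly-T stretch) can be expressed as a simple filter. This is a minimal sketch: the function names, the definition of a poly-T stretch as four or more consecutive T, and the candidate sequences are all hypothetical and are not taken from the invention.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_grna_criteria(seq, gc_min=0.40, gc_max=0.60, max_t_run=3):
    """True if GC content is within [gc_min, gc_max] and there is no run of
    more than max_t_run consecutive T bases (threshold is an assumption)."""
    seq = seq.upper()
    has_poly_t = "T" * (max_t_run + 1) in seq
    return gc_min <= gc_fraction(seq) <= gc_max and not has_poly_t


# Hypothetical candidate protospacers, for illustration only.
candidates = [
    "GACCTGAAGCTTACGGATCA",  # balanced GC, no poly-T
    "ATTTTTAGCATAATTTAGCA",  # low GC and a poly-T stretch
    "GCGCGGCCGCGGGCCCGCGC",  # GC content too high
]
results = [passes_grna_criteria(c) for c in candidates]
for cand, ok in zip(candidates, results):
    print(cand, f"GC={gc_fraction(cand):.0%}", "accept" if ok else "reject")
```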
1.1.2 CRISPR-Cas9-mediated plasmid construction for KleIF4G integration into the KlPab1 site
CRISPR plasmid construction
PCR amplification was performed using the pCAS plasmid as a template with primers PF16: TGCTTACGAAACTTCAAGTTTTTAAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCG (SEQ ID NO. 9) and PR16: GCTCTAAAACTCTTGAAGTTTTCGTAAGCAAAAGTCCCATTCGCCACCCG (SEQ ID NO. 10). An aliquot of the amplification product was mixed with 1 µL of DpnI and 2 µL of 10× digestion buffer and incubated at 37 ℃ for 3 h. 10 µL of the DpnI-treated product was added to 100 µL of DH5α competent cells, kept on ice for 30 min, heat-shocked at 42 ℃ for 45 s, supplemented with 1 mL of LB liquid medium, cultured with shaking at 37 ℃ for 1 h, spread on Kan-selective LB solid medium, and cultured inverted at 37 ℃ until single clones grew. Five single clones were picked and cultured with shaking in LB liquid medium; after positive PCR detection and sequencing confirmation, the plasmid was extracted and stored, and named pHoCas9_SE_tRNA_ScRNR2_KlPAB1-gRNA1 (FIG. 3).
KlPab1-KleIF4G donor DNA plasmid construction and amplification
To facilitate the storage and amplification of linear donor DNA, the donor DNA was first inserted into the pMD18 plasmid and then amplified by PCR to obtain a linear donor DNA sequence.
PCR amplification was performed using Kluyveromyces lactis genomic DNA as a template with primers PF17: GAGCTCGGTACCCGGGGGATCCTCTAGAGATCCGGTAAGCCAGCCATGCCATTGTACGTGCCAT (SEQ ID NO. 11) and PR17: GCCAAGCTTGCATGCCCTGCAGGTCGAACGTATACCGTCCATGTTGATGATGACT (SEQ ID NO. 12); PCR amplification was also performed using the pMD18 plasmid as a template with primers pMD18-F: ATCGTCGACCTGCAGGCATG (SEQ ID NO. 13) and pMD18-R: ATCTCTAGAGGATCCCCGGG (SEQ ID NO. 14). Aliquots of the two amplification products were mixed, 1 µL of DpnI and 2 µL of 10× digestion buffer were added, and the mixture was incubated at 37 ℃ for 3 h. 10 µL of the DpnI-treated product was added to 100 µL of DH5α competent cells, kept on ice for 30 min, heat-shocked at 42 ℃ for 45 s, supplemented with 1 mL of LB liquid medium, cultured with shaking at 37 ℃ for 1 h, spread on Amp-selective LB solid medium, and cultured inverted at 37 ℃ until single clones grew. Five single clones were picked and cultured with shaking in LB liquid medium; after positive PCR detection and sequencing confirmation, the plasmid was extracted and stored, and named pKM-KlPab1-DD.
Amplification was performed using pKM-KlPab1-DD as a template with primers PF18: GATGCATTGGATGTGCCGAAGATGATTAAACTTGATTTTTTGACCTTGATCCATCGTC (SEQ ID NO. 15) and PR18: CTTGAACTTCATCTTGAGTTGAACCTCCCTCCCAGATCCTCCCTACCAGCTTGAGCTTCTTGTTTTTTTTTAAAATTCTCGTAAGCAGCTAAGGCTTC (SEQ ID NO. 16); amplification was also performed using Kluyveromyces lactis DNA as a template with primers PF19: GTGGAGGTTCAACTCAAGATGAAGTTCAAGGTCCACATGCTGGTTAAGTCTACTGGTGGAGGTGGATCTGGCGAACCTACATCCGATCGACTACGATCAGCG (SEQ ID NO. 17) and PR19: TTAATCATCTTCCGGCATCCATCAATGC (SEQ ID NO. 18). 8.5 µL of each of the two amplification products, 1 µL of DpnI and 2 µL of 10× digestion buffer were mixed and incubated at 37 ℃ for 3 h. 10 µL of the DpnI-treated product was added to 100 µL of DH5α competent cells, kept on ice for 30 min, heat-shocked at 42 ℃ for 45 s, supplemented with 1 mL of LB liquid medium, cultured with shaking at 37 ℃ for 1 h, spread on Amp-selective LB solid medium, and cultured inverted at 37 ℃ until single clones grew. Five single clones were picked and cultured with shaking in LB liquid medium; after positive PCR detection and sequencing confirmation, the plasmid was extracted and stored, and named KlPAB1-30Linker-KleIF4G-DD1-pMD18 (FIG. 4).
The linear donor DNA was obtained by PCR amplification using the KlPAB1-30Linker-KleIF4G-DD1-pMD18 plasmid as a template with primers M13-F: GTAAAACGACGGCCAGT (SEQ ID NO. 19) and M13-R: CAGGAAACAGCTATGAC (SEQ ID NO. 20).
1.1.3 Kluyveromyces lactis transformation and positive identification
I. A Kluyveromyces lactis culture was streaked on YPD solid medium; a single clone was picked and cultured overnight with shaking in 25 mL of 2× YPD liquid medium, and 2 mL of this culture was transferred into 50 mL of 2× YPD liquid medium and cultured with shaking for 2-8 h. The yeast cells were collected by centrifugation at 3000 × g for 5 min at 20 ℃, resuspended in 500 µL of sterile water, and collected again by centrifugation under the same conditions. A competent cell solution (5% v/v glycerol, 10% v/v DMSO) was prepared, and the yeast cells were resuspended in 500 µL of this solution. Aliquots of 50 µL were dispensed into 1.5 mL centrifuge tubes and stored at -80 ℃.
Competent cells were thawed at 37 ℃ for 15-30 s and centrifuged at 13000 × g for 2 min, and the supernatant was removed. The transformation buffer was prepared as follows: 260 µL of PEG3350 (50% (w/v)), 36 µL of LiAc (1.0 M), 20 µL of carrier DNA (5.0 mg/mL), 15 µL of Cas9 & gRNA plasmid and 10 µL of donor DNA, with sterile water added to a final volume of 360 µL. After heat shock, the cells were centrifuged at 13000 × g for 30 s and the supernatant was removed. 1 mL of YPD liquid medium was added and the cells were cultured for 2-3 h; 200 µL was then spread on solid YPD medium (200 µg/mL G418) and cultured for 2-3 days until single colonies appeared.
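The volume of sterile water and the resulting final concentrations in the transformation buffer follow directly from the listed volumes. The sketch below uses only the volumes and stock concentrations given above; the function and variable names are hypothetical.

```python
def water_to_final_volume(component_volumes_ul, final_volume_ul):
    """Volume of water (µL) needed to bring the mix to the final volume."""
    used = sum(component_volumes_ul.values())
    if used > final_volume_ul:
        raise ValueError("components already exceed the final volume")
    return final_volume_ul - used


# Volumes (µL) as listed for the transformation buffer.
components_ul = {
    "PEG3350 50% (w/v)": 260.0,
    "LiAc 1.0 M": 36.0,
    "carrier DNA": 20.0,
    "Cas9 & gRNA plasmid": 15.0,
    "donor DNA": 10.0,
}
final_volume_ul = 360.0

water_ul = water_to_final_volume(components_ul, final_volume_ul)
# Final concentrations after dilution into the full volume.
peg_final_percent = 50.0 * components_ul["PEG3350 50% (w/v)"] / final_volume_ul
liac_final_molar = 1.0 * components_ul["LiAc 1.0 M"] / final_volume_ul

print(f"sterile water: {water_ul:.0f} µL")
print(f"final PEG3350: {peg_final_percent:.1f}% (w/v)")
print(f"final LiAc:    {liac_final_molar:.2f} M")
```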
10-20 single clones were picked from the Kluyveromyces lactis transformation plate and cultured overnight with shaking in 1 mL of YPD liquid medium (200 µg/mL G418). Using the culture as a template, PCR amplification was performed with primer KlPAB1-CICF1 (a primer within the KlPAB1 sequence): TCTCTCCAGAGAAGCTACCAAGGCTA (SEQ ID NO. 21) and primer KleIF4G-CICR2 (a primer within the KleIF4G sequence): TTCTCTTCGACAGCTTCTTAGCAG (SEQ ID NO. 22) to detect insertion of KleIF4G at the KlPAB1 site. Strains that were PCR-positive and confirmed by sequencing were designated positive strains and named KlPAB1-KleIF4G.
Example 2 To increase in vitro translation efficiency, the nuclease KlEXN53 is knocked out and T7RNP is endogenously expressed at the specific promoter strength of ScRNR2, with or without additionally added T7RNP.
2.1 Background and analysis of biosynthesis in vivo
In vivo biosynthesis refers to the process of synthesis of various compounds catalyzed by enzymes in the organism, including photosynthesis, gluconeogenesis, and biosynthesis of macromolecules such as nucleotides, nucleic acids, and proteins. Among them, the synthesis of proteins is quantitatively the most important.
The entire process of protein biosynthesis includes transcription and translation. Transcription is the transfer of genetic information from DNA to RNA, namely the process of synthesizing RNA under the catalysis of RNA polymerase, using one strand of double-stranded DNA as a template and the four nucleoside triphosphates adenosine triphosphate (ATP), cytidine triphosphate (CTP), guanosine triphosphate (GTP) and uridine triphosphate (UTP) as raw materials. Translation is a process in which ribosomes bind to a messenger ribonucleic acid (mRNA) template with the aid of various factors, the triplet codons of the mRNA are recognized by transfer ribonucleic acid (tRNA) and the corresponding amino acids are delivered, and the peptide chain is then synthesized step by step according to the template mRNA information. Transcriptional regulation is affected by transcription factors, nucleases, RNA polymerase, etc., and translational regulation is affected by translation initiation, translation elongation, translation termination, etc.
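The two steps described above can be illustrated with a toy model. The sketch below is a simplification written for this description, not part of the invention: it represents transcription by converting the coding strand to mRNA (T replaced by U) and translation by reading triplet codons with the standard genetic code. The input is the 28-nucleotide sequence of primer PF1 listed in section 2.4.2, which covers the start of the T7RNAP coding sequence; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon).replace("T", "U"): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """mRNA has the coding-strand sequence with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read complete triplet codons from position 0 until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Sequence of primer PF1 (SEQ ID NO. 33), the 5' end of the T7RNAP gene.
coding = "ATGAACACGATTAACATCGCTAAGAACG"
mrna = transcribe(coding)
peptide = translate(mrna)
print(mrna)
print(peptide)
```

The trailing incomplete codon is ignored, so the 28-nucleotide input yields nine amino acids in one-letter code.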
2.2 Design principle of modification of K.lactis genome KlEXN53 to regulate in vitro biosynthetic activity
2.2.1 In vitro biosynthetic System analysis
I. Compositional analysis in different in vitro biosynthetic systems
In vitro biosynthesis systems translate foreign proteins using cell lysates extracted by disrupting different cell types, including microbial, animal and plant cells. To carry out transcription, translation, protein folding, and energy metabolism, the cell lysate must contain the elements for energy regeneration and protein synthesis, including ribosomes, aminoacyl-tRNA synthetases, translation initiation and elongation factors, ribosome release factors, nucleotide cycle enzymes, metabolic enzymes, chaperones, and folding enzymes. Substances that need to be added exogenously include amino acids, nucleotides, DNA templates, energy substrates, cofactors, salt molecules, and the like. The stability of the different components of the system has a significant influence on the duration of the reaction and ultimately on protein yield, and depletion of some substrates (e.g., ATP, cysteine) is a significant cause of reaction termination.
II. Analysis of the influence of the components of the in vitro biosynthesis system on the synthesized product
In the in vitro translation system, the concentrations of components such as energy metabolism substrates (creatine phosphate, etc.), substrates involved in mRNA synthesis (NTP, etc.), substrates involved in protein synthesis (amino acids, etc.) and phosphate, as well as the pH value, change as the reaction proceeds, eventually leading to termination of the reaction. By supplementing the corresponding components (NTP, amino acids, etc.) or stabilizing the corresponding conditions (pH value, phosphate concentration, etc.) through different technical means, the reaction time of an in vitro translation system can be effectively prolonged and the protein yield improved.
III. Analysis of the influence of nucleic acids and nucleases in the in vitro protein synthesis system on protein synthesis
As an important substrate for protein translation, the DNA template and the mRNA produced by its transcription have a significant influence, through their stability, on the duration and yield of the in vitro translation system. After the cells are broken, the cell lysate is collected and used to construct the in vitro translation system; it contains the various components necessary for protein synthesis, but also components unfavorable to the reaction, such as nucleases and proteases. In a system composed of purified components, the efficiency of protein translation is remarkably improved because no such inhibitors are present.
Meanwhile, the stability of nucleic acids in an in vitro translation system can be effectively improved by different technical means (including nuclease gene knockout, addition of stabilizing factors, etc.), which further improves protein translation efficiency. In the present invention, nucleic acid stability is improved by reducing nuclease in the in vitro biosynthesis system; to this end, the nuclease genes in the K. lactis genome are analyzed and specifically modified.
2.2.2 Analysis of nuclease genes in the K.lactis genome
I. Classification and distribution of nuclease genes in the K. lactis genome
Nucleases (also known as nucleodepolymerases or polynucleotidases) are enzymes that hydrolyze the phosphodiester bonds between nucleotides, the first step of nucleic acid degradation. According to their position of action, nucleases can be classified into exonucleases and endonucleases. An exonuclease that hydrolyzes nucleotides one by one from the 3' end is called a 3' to 5' exonuclease, and one that hydrolyzes nucleotides one by one from the 5' end is called a 5' to 3' exonuclease; an endonuclease catalyzes hydrolysis of phosphodiester bonds inside the polynucleotide. Nucleases are further divided into deoxyribonucleases (DNases), which act on DNA, and ribonucleases (RNases), which act on RNA.
Through comparative analysis of gene functions in databases, K. lactis was found to contain 61 nucleases in total, distributed over the six K. lactis chromosomes A-F. The nucleases located on chromosome A are: KlSEN54, KlDNA2, KlTRM2, KlFCF1, KlDOM34, KlRAD2, KlRNH70, KlDIS3, KlNPP1; the nuclease located on chromosome B is: KlOGG1; the nucleases located on chromosome C are: KlPOL2, KlRAD50, KlYSH1, KlRCL1, KlNGL2, KlMRE11, KlPOP3, KlMKT1, KlAPN1, KlRPP1, KlPOP2; the nucleases located on chromosome D are: KlNUC1, KlRPP6, KlDBR1, KlRPS3, KlRPM2, KlSUV3, KlRAD1, KlIRE1, KlPOP1; the nucleases located on chromosome E are: KlPOL3, KlDXO1, KlMUS81, KlPAN3, KlAPN2, KlRAI1, KlEXO1, KlREX4, KlPAN2, KlLCL3; the nucleases located on chromosome F are: KlRAD17, KlPOP4, KlPOP5, KlRAD27, KlRNH201, KlVMA1, KlRAT1, KlNGL1, KlREX2, KlTRL1, KlPOL31, KlCCR4, KlTRZ1, KlSEN2, KlREX3, KlSWT1, KLEXN53, KlRNT1, KlSEN15, KlNTG1, KlNOB1, KlDDP1. By function, the K. lactis nucleases fall into two main classes, DNases (18) and RNases (41), with 2 further nucleases lacking a functional classification, as shown in Table 3. Among the DNases, KlAPN1 has 3' to 5' activity, and KlRAD27 and KlEXO1 have 5' to 3' activity; among the RNases, KlRNH70, KlDIS3, KlNGL2 and KlRPP6 have 3' to 5' activity, and KlDXO1, KlRAT1 and KLEXN53 have 5' to 3' activity.
TABLE 3 nuclease distribution
[Table 3 is presented as images in the original publication.]
II. Selection and analysis of nuclease genes in the K. lactis genome involved in in vitro protein synthesis
The stability of the nucleic acids (RNA and DNA) that serve as protein-coding templates in an in vitro synthesis system influences protein yield. In eukaryotic translation, the 5' end of the RNA is rapidly processed to add a cap structure, and the 5' cap structure plays a crucial role in RNA stability and translation efficiency.
In eukaryotes, there are both deadenylation-dependent and deadenylation-independent RNA degradation mechanisms. In these mechanisms, the mature or immature 5' cap structure of the RNA can be removed by a recruited decapping complex, and the RNA is then degraded 5' to 3' by nuclease action. In the deadenylation-independent mechanism of mRNA degradation, an endonuclease introduces a strand break in the middle of the nucleic acid molecule, and an exonuclease degrades the RNA from 3' to 5'.
In the K. lactis in vitro protein synthesis system, mRNA with a 3' polyA tail is transcribed using exogenous linear or circular DNA as a template; depending on the promoter and the RNA polymerase, the mRNA may or may not carry a mature 5' cap structure. Therefore, DNases, RNases, exonucleases and endonucleases can all affect template stability in the in vitro protein synthesis system, and modifying these enzymes may enhance in vitro biosynthesis activity.
Among these factors, the lifespan of mRNA is one of the major constraints on in vitro protein synthesis, owing to the inherent instability of mRNA and the high activity of RNases in vivo. Therefore, reducing the amount of RNase is the preferred modification. KlEXN53 is required for checkpoint activation when telomeres are uncapped (it promotes single-stranded DNA generation) and is one of the RNA processing proteins; it can regulate telomere metabolism and influences genome stability in eukaryotes. KlEXN53 was therefore ultimately selected for knockout. Of course, the approach of this patent is also applicable to the modification of other nucleases to enhance in vitro protein synthesis activity.
2.3 Analysis and selection of the site for inserting pScRNR2 and T7RNAP into the cell genome
To overcome the drawback that existing in vitro translation systems require T7 RNA polymerase protein to be added exogenously by hand, the T7RNAP gene is integrated into the cell genome using the CRISPR-Cas9 gene editing system, creating a strain that stably expresses an appropriate amount of T7RNAP protein and yielding a simple and efficient in vitro translation system that requires no exogenously added T7RNAP.
In vitro translation systems are sensitive to T7RNAP content: too much or too little T7RNAP reduces system efficiency. Therefore, the weak pScRNR2 promoter was placed upstream of T7RNAP, and this construct was first built into an episomal plasmid to verify the function of the T7RNAP expression cassette in the in vitro translation system.
After functional verification with the episomal plasmid, the invention replaces a gene of the cell with the T7RNAP expression cassette. In the present invention, the KLEXN53 gene was knocked out and replaced with the T7RNAP expression cassette.
2.4 Targeted knockout of the EXN53 gene by CRISPR/Cas9, replacement with the T7RNAP expression cassette
2.4.1 KLEXN53 sequence search and CRISPR gRNA sequence determination
I. Construction of the cloning vector plasmid for targeted gene knockout:
The sequence of the EXN53 gene in Kluyveromyces lactis (SEQ ID NO. 5) was determined by BLAST alignment analysis of the EXN53 gene against the KEGG database; the gene is annotated as KlLA0F22385g and is designated KLEXN53 (located at 2091235..2095596 on chromosome F).
PAM sequences (NGG) were searched for near the start codon and the stop codon of the KLEXN53 gene, and the gRNA sequences were determined. The gRNA selection criteria are: moderate GC content (in this invention, 40-60%) and no poly-T stretch. The KLEXN53 gRNA-1 sequence finally determined in this invention is AGAGTTCGACAATTGTACT (SEQ ID NO. 23), and the KLEXN53 gRNA-3 sequence is CGTCGTGGCCGTAGTAATCG (SEQ ID NO. 24).
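The PAM search described above can be sketched as a scan of both DNA strands for NGG motifs, reporting the 20-nucleotide protospacer immediately 5' of each motif. This is an illustrative sketch only: the function names, the 20-nucleotide protospacer length and the 30-nucleotide input sequence are hypothetical and are not taken from the K. lactis genome.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(seq, length=20):
    """Find NGG PAM sites on both strands.

    Returns a list of (strand, start, protospacer, pam) tuples, where start
    is the 0-based position of the protospacer on the scanned strand.
    """
    hits = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for i in range(length, len(s) - 2):
            pam = s[i:i + 3]
            if pam[1:] == "GG":
                hits.append((strand, i - length, s[i - length:i], pam))
    return hits


# Hypothetical 30-nt region, for illustration only.
region = "GACCTGAAGCTTACGGATCATGGATCAATA"
hits = find_protospacers(region)
for strand, start, protospacer, pam in hits:
    print(strand, start, protospacer, pam)
```

Candidates found this way would then be screened against the GC-content and poly-T criteria stated above.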
The plasmid construction and transformation method is as follows: PCR amplification was performed using the pCAS plasmid as a template with primers pCas9-KLEXN53-F1: AGAGTTCGACAATTTGTACTGTTTTAGAGCTAGAAATAGCAAGTTAAAAAATAAGGCTAGTC (SEQ ID NO. 25) and pCas9-KLEXN53-R1: GCTCTAAAACAGTACAATTGTCGAACTCTAAAGTCCCATTCGCCACCCG (SEQ ID NO. 26), and with primers pCas9-KLEXN53-F2: CGTCGTGGCCGTAGTAATCGGTTTAGAGAGCTAGAAATAGCAAGTTAAAAAATAAGGCTAGTC (SEQ ID NO. 27) and pCas9-KLEXN53-R2: GCTCTAAAACCGATTATTACGCGCCACGACGAAAGTCCCATTCGCACCCG (SEQ ID NO. 28). An aliquot of each amplification product was mixed with 1 µL of Dpn I and 2 µL of 10× digestion buffer and incubated at 37 ℃ for 3 h. 10 µL of the Dpn I-treated product was added to 100 µL of DH5α competent cells, kept on ice for 30 min, heat-shocked at 42 ℃ for 45 s, supplemented with 1 mL of LB liquid medium, cultured with shaking at 37 ℃ for 1 h, spread on Kan-selective LB solid medium, and cultured inverted at 37 ℃ until single clones grew. Five single clones were picked and cultured with shaking in LB liquid medium; after positive PCR detection and sequencing confirmation, the plasmids were extracted and stored, and named pHoCas9_SE_Kana_tRNA_ScRNR2_KLEXN53-gRNA1 and pHoCas9_SE_Kana_tRNA_ScRNR2_KLEXN53-gRNA3, respectively. PCR amplification was then performed using pHoCas9_SE_Kana_tRNA_ScRNR2_KLEXN53-gRNA1 as a template with primers pCas9-F1: TAGGTCTAGAGATCTGTTTTAGCTTGCCTCG (SEQ ID NO. 29) and pCas9-R1: TATCCACTAGACAGAAGTTTTGCGTTC (SEQ ID NO. 30), and using pHoCas9_SE_Kana_tRNA_ScRNR2_KLEXN53-gRNA3 as a template with primers pCas9-F2: TATGGAACGCAAACTTCTGTCTAGTGGATAGTATATGTGTTATGTAGTATACTCTTTCTTCAACAATTAATACTCGG (SEQ ID NO. 31) and pCas9-R2: CGAGGCAAGCTAAACAGATCTCTAGACCTATATCACCATAGACAAGAGTTTGTGTCCTTCC (SEQ ID NO. 32). The PCR amplification products of pCas9-F1/pCas9-R1 and pCas9-F2/pCas9-R2 were mixed according to 1. 10 µL of the Dpn I-treated product was added to 100 µL of DH5α competent cells, kept on ice for 30 min, heat-shocked at 42 ℃ for 45 s, supplemented with 1 mL of LB liquid medium, cultured with shaking at 37 ℃ for 1 h, spread on Kan-selective LB solid medium, and cultured inverted at 37 ℃ until single clones grew. Five single clones were picked and cultured with shaking in LB liquid medium; after positive PCR detection and sequencing confirmation, the plasmid was extracted and stored, and named pHoCas9_SE_Kana_tRNA_ScRNR2_KLEXN53-gRNA1&3 (shown in FIG. 5).
2.4.2 Donor DNA plasmid construction and amplification
In order to facilitate the storage and amplification of linear donor DNA, the donor DNA is firstly inserted into the pMD18 plasmid, and then a linear donor DNA sequence is obtained through PCR amplification.
PCR amplification is carried out by taking a plasmid containing a T7RNAP gene as a template and primers PF1: ATGAACACGATTAACATC GCTAAGAACG (SEQ ID NO. 33) and PR1: TTACGCGAACGCGAAGTCCG (SEQ ID NO. 34); taking Kluyveromyces lactis free plasmid as a template, and carrying out PCR amplification by using primers PF2: ATC TTAGAGTCGGACTTCGCGTTCGCGTAAGAGATGCTTCTGCTCATCATC (SEQ ID NO. 35) and PR2: AGTCGTTCTTAGCGATGTTAATCGTTCATGTTCATGGTAATTGGAC AAATAAATACGTGT (SEQ ID NO. 36). mu.L of each of the two amplification products was mixed, and 1. Mu.L of DpnI, 2. Mu.L of 10 Xdigestion buffer was added, and the mixture was incubated at 37 ℃ for 3 hours. Adding 10 mu L of the product treated by DpnI into 100 mu L of DH5 alpha competent cells, placing on ice for 30min, performing heat shock at 42 ℃ for 45s, adding 1mL of LB liquid culture medium, performing shake culture at 37 ℃ for 1h, coating on Kan resistant LB solid culture, and performing inversion culture at 37 ℃ until single clones grow out. 5 single clones are selected and cultured in LB liquid culture medium in a shaking way, after positive PCR detection and sequencing confirmation, plasmids are extracted and stored, and the name is pKM-T7RNAP1.
Taking Kluyveromyces lactis genome DNA as a template, and taking a primer KLEXN53-F1: GTACCCGGGG ATCCCTAGAGATCCAGTGCAGAGCCTCCGAA (SEQ ID NO. 37) and KLEXN53-R1: performing PCR amplification on CATGCCCTGCAGGTCGATGCGAAACCTTAGCTCTCGAAC (SEQ I D NO. 38); taking pMD18 plasmid as a template, and taking a primer pMD18-F: ATCGTCGA CCTGCAGGCATG (SEQ ID NO. 13) and pMD18-R: ATCTCTAGAGGATCCCCGGG (SEQ ID NO. 14) was subjected to PCR amplification. mu.L of each of the two amplification products was mixed, and 1. Mu.L of Dp n I, 2. Mu.L of 10 Xdigestion buffer was added, and the mixture was incubated at 37 ℃ for 3 hours. Adding 10 mu L of the product after the Dpn I treatment into 100 mu L of DH5 alpha competent cells, standing on ice for 30min, performing heat shock at 42 ℃ for 45s, adding 1mL of LB liquid culture medium, performing shake culture at 37 ℃ for 1h, coating the mixture on Amp resistant LB solid culture, and performing inversion culture at 37 ℃ until single clones grow out. 5 single clones are selected and cultured in LB liquid culture medium in a shaking way, after PCR detection is positive and sequencing is confirmed, plasmids are extracted and stored, and the name is KLEXN53-pMD18.
PCR amplification was carried out using the KLEXN53-pMD18 plasmid as the template with primers KLEXN53-F3: ATGTTGGTTTGAATGGACTATTAACAGAATTATATATATTACACTTCGTTCAATTAATTACGTTTGTGGCTTAACTAAAAAGTCGAACAAGAAGCAGGCAAAG (SEQ ID NO. 39) and KLEXN53-R2: GAAGTGATATAATATTTACTGTTAATAGTCCATTCAAACCAACATTTTATTTTTTAGTTAAGCCAAAACCGTAATTAATTAATTGAACAC (SEQ ID NO. 40) to construct KLEXN53-DD-pMD18. The specific steps were as follows: equal volumes of the two PCR products were mixed, 1 μL of DpnI and 2 μL of 10× digestion buffer were added, and the mixture was incubated at 37 ℃ for 3 h. 10 μL of the DpnI-treated product was added to 100 μL of DH5α competent cells, kept on ice for 30 min, heat-shocked at 42 ℃ for 45 s, supplemented with 1 mL of LB liquid medium, and cultured with shaking at 37 ℃ for 1 h. The culture was then spread on LB solid medium containing Amp and incubated inverted at 37 ℃ until single colonies grew. Five single colonies were picked and cultured with shaking in LB liquid medium; after a positive PCR result and confirmation by sequencing, the plasmid was extracted and stored.
PCR amplification was carried out using pKM-T7RNAP1 as the template with primers pScRNR2-F3: GTGTTCAATTAATTACGGTTTGTGGGCTTAACTAAAAGTCGAACAAAGAAGCAGGCAAAG (SEQ ID NO. 41) and tCYC1-R2: GAAGTGATATAATATTTACTGTTAATAGTCCATTCAAACCAACATCTTCGAGCGTCCAAAACCTTC (SEQ ID NO. 42). A second PCR amplification was carried out using KLEXN53-D-pMD18 as the template with primers KlEXN53-F4: ATGTTGGTTTGAATGGACTATTAACAGTAAATTATATCACTC (SEQ ID NO. 43) and KlEXN53-R4: TTTTAGTTAAGCCACAACCGTAATTAATTGAACAC (SEQ ID NO. 44), to construct the KlEXN53-pScRNR2-T7RNP-DD1-pMD18 plasmid (shown in FIG. 6). The specific steps were as follows: equal volumes of the two PCR products were mixed, 1 μL of DpnI and 2 μL of 10× digestion buffer were added, and the mixture was incubated at 37 ℃ for 3 h. 10 μL of the DpnI-treated product was added to 100 μL of DH5α competent cells, kept on ice for 30 min, heat-shocked at 42 ℃ for 45 s, supplemented with 1 mL of LB liquid medium, and cultured with shaking at 37 ℃ for 1 h. The culture was then spread on LB solid medium containing Amp and incubated inverted at 37 ℃ until single colonies grew. Five single colonies were picked and cultured with shaking in LB liquid medium; after a positive PCR result and confirmation by sequencing, the plasmid was extracted and stored.
2.4.3 Kluyveromyces lactis transformation and positive identification
i. The Kluyveromyces lactis KlPAB1-KleIF4G strain successfully transformed in example 3 was streaked on YPD solid medium and a single colony was picked. The colony was cultured overnight with shaking in 25 mL of 2× YPD liquid medium, and 2 mL of the culture was transferred to 50 mL of 2× YPD liquid medium and cultured with shaking for a further 2-8 h. The yeast cells were collected by centrifugation at 3000 g for 5 min at 20 ℃, resuspended in 500 μL of sterile water, and collected again by centrifugation under the same conditions. A competent cell solution (5% v/v glycerol, 10% v/v DMSO) was prepared and the yeast cells were resuspended in 500 μL of this solution. Aliquots of 50 μL were dispensed into 1.5 mL centrifuge tubes and stored at -80 ℃.
ii. Competent cells were thawed at 37 ℃ for 15-30 s and centrifuged at 13000 g for 2 min, and the supernatant was removed. A transformation buffer was prepared: 260 μL of PEG3350 (50% (w/v)), 36 μL of LiAc (1.0 M), 20 μL of carrier DNA (5.0 mg/mL), 5 μL of Cas9&gRNA plasmid and 10 μL of donor DNA, with sterile water added to a final volume of 360 μL. After heat shock, the cells were centrifuged at 13000 g for 30 s and the supernatant was removed. 1 mL of YPD liquid medium was added and the cells were cultured for 2-3 h; 200 μL was then spread on solid YPD medium (200 μg/mL G418) and cultured for 2-3 days until single colonies appeared.
iii. 10-40 single colonies were picked from the Kluyveromyces lactis transformation plate and cultured overnight with shaking in 1 mL of YPD liquid medium (200 μg/mL G418). Using the culture as the template, the samples were tested by PCR with the CRISPR insertion check primers KLEXN53-CICF1: TTTGCTGGTTGCCCGTATTCCC (SEQ ID NO. 45) and KLEXN53-CICR1: TAATAGCACAGAGGGAATGCACCTT (SEQ ID NO. 46). Strains that gave a positive PCR result and were confirmed by sequencing were identified as positive strains and named klxrn1Δ-pScRNR2-T7RNP&KlPAB1-KleIF4G or KlEXN53Δ-pScRNR2-T7RNP&KlPAB1-KleIF4G, abbreviated as 3in1.
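The composition of the transformation buffer in step ii can be worked through as simple arithmetic. The following sketch is not part of the original protocol. It only derives the volume of sterile water needed to reach 360 μL and the resulting final concentrations from the stated volumes and stock concentrations; the variable and function names are hypothetical.

```python
# Worked arithmetic for the transformation buffer of step ii.
# All volumes and stock concentrations are taken from the text.
FINAL_VOLUME_UL = 360.0

component_volumes_ul = {
    "PEG3350 (50% w/v)": 260.0,
    "LiAc (1.0 M)": 36.0,
    "carrier DNA (5.0 mg/mL)": 20.0,
    "Cas9&gRNA plasmid": 5.0,
    "donor DNA": 10.0,
}

# Sterile water makes up the difference to the final volume.
water_ul = FINAL_VOLUME_UL - sum(component_volumes_ul.values())


def final_concentration(stock_conc: float, volume_ul: float) -> float:
    """Dilution relation C_final = C_stock * V_added / V_final."""
    return stock_conc * volume_ul / FINAL_VOLUME_UL


peg_final_percent = final_concentration(50.0, 260.0)    # % (w/v)
liac_final_molar = final_concentration(1.0, 36.0)       # M
carrier_final_mg_ml = final_concentration(5.0, 20.0)    # mg/mL

print(f"sterile water: {water_ul:.0f} uL")
print(f"PEG3350: {peg_final_percent:.1f} % (w/v)")
print(f"LiAc: {liac_final_molar:.2f} M")
print(f"carrier DNA: {carrier_final_mg_ml:.3f} mg/mL")
```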
Example 3 Preparation of a cell-free in vitro protein synthesis system and measurement of protein synthesis efficiency
The experimental method comprises the following steps:
Preparation of stock solutions for the in vitro protein synthesis system: 1 M 4-hydroxyethylpiperazine ethanesulfonic acid (pH 7.4); 5 M potassium acetate; 250 mM magnesium acetate; a 25 mM mixture of four nucleoside triphosphates (adenosine triphosphate, guanosine triphosphate, cytidine triphosphate and uridine triphosphate); a 1 mM mixture of twenty amino acids (glycine, alanine, valine, leucine, isoleucine, phenylalanine, proline, tryptophan, serine, tyrosine, cysteine, methionine, asparagine, glutamine, threonine, aspartic acid, glutamic acid, lysine, arginine and histidine), with each of the 20 amino acids at 1.0 mM; 1 M creatine phosphate; 1 M dithiothreitol; 6.48 mg/mL creatine phosphate kinase; 1.7 mg/mL T7 RNA polymerase; 20%-50% polyethylene glycol (PEG) 3350 or PEG 8000; and 20%-40% sucrose;
In vitro protein synthesis reaction system: final concentrations of 22 mM 4-hydroxyethylpiperazine ethanesulfonic acid (pH 7.4), 30-150 mM potassium acetate, 1.0-5.0 mM magnesium acetate, 1.5-4 mM nucleoside triphosphate mixture (adenosine triphosphate, guanosine triphosphate, cytidine triphosphate and uridine triphosphate), 0.08-0.24 mM amino acid mixture (glycine, alanine, valine, leucine, isoleucine, phenylalanine, proline, tryptophan, serine, tyrosine, cysteine, methionine, asparagine, glutamine, threonine, aspartic acid, glutamic acid, lysine, arginine and histidine), 25 mM creatine phosphate, 1.7 mM dithiothreitol, 0.27 mg/mL creatine phosphate kinase, 8-20 ng/μL firefly luciferase DNA, 0.027-0.054 mg/mL T7 RNA polymerase and 1-4% polyethylene glycol, with yeast cell extract finally added to 50% by volume;
In vitro protein synthesis reaction: the reaction system is left to stand at 20-30 ℃ and allowed to react for 2-6 h;
Luciferase activity assay: after the reaction is finished, an equal volume of the substrate luciferin is added in a 96-well or 384-well white plate, and the plate is immediately read in an Envision 2120 multifunctional microplate reader (Perkin Elmer) to detect firefly luciferase activity, with the relative light unit (RLU) value as the unit of activity, as shown in FIGS. 7 and 8.
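The stock and final concentrations listed in the steps above are related by the usual dilution relation C1·V1 = C2·V2. The sketch below illustrates that relation for the components that have a single stated stock and final concentration. The 100 μL reaction volume and all names in the code are assumptions made for illustration; the text does not state a reaction volume.

```python
# Dilution sketch: volume of each stock solution needed per reaction.
# REACTION_VOLUME_UL is a hypothetical value; the source gives no reaction volume.
REACTION_VOLUME_UL = 100.0


def stock_volume_ul(stock_conc: float, final_conc: float, reaction_volume_ul: float) -> float:
    """Solve C1 * V1 = C2 * V2 for V1 (stock and final in the same units)."""
    return final_conc * reaction_volume_ul / stock_conc


# (stock concentration, final concentration) as stated in the text.
reagents = {
    "HEPES pH 7.4 (mM)": (1000.0, 22.0),
    "creatine phosphate (mM)": (1000.0, 25.0),
    "dithiothreitol (mM)": (1000.0, 1.7),
    "creatine phosphate kinase (mg/mL)": (6.48, 0.27),
}

volumes_ul = {
    name: stock_volume_ul(stock, final, REACTION_VOLUME_UL)
    for name, (stock, final) in reagents.items()
}

# Yeast cell extract is added to 50% of the reaction volume.
yeast_extract_ul = 0.5 * REACTION_VOLUME_UL

for name, volume in volumes_ul.items():
    print(f"{name}: {volume:.2f} uL of stock")
print(f"yeast cell extract: {yeast_extract_ul:.1f} uL")
```

Components given as ranges (for example potassium acetate or magnesium acetate) can be handled the same way by evaluating the function at both ends of the range.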
The experimental results are as follows:
The results of Example 3 of the invention show that the modified klxrn1Δ-pScRNR2-T7RNP&KlPAB1-KleIF4G (3in1) strain significantly enhances the protein production efficiency of the yeast in vitro protein synthesis system. As can be seen from FIG. 7, the luciferase activity of the wild type (WT) in IVTT was 2.87 × 10^8 RLU, whereas the luciferase activity of the klxrn1Δ-pScRNR2-T7RNP&KlPAB1-KleIF4G (3in1) yeast strain in IVTT was 1.539 × 10^9 RLU, about 5.36 times that of the wild type. As can be seen from FIG. 8, when T7RNP was added exogenously, the activity decreased instead, with an RLU value of 1.148 × 10^9; the highest activity was obtained when the concentration of added T7RNAP was 0, and it was within the normal range (≥ 10^8). These results show that the T7RNAP in the construct is expressed in an appropriate amount that meets the requirements of the in vitro translation system, and that the nucleic acid construct significantly enhances the protein production efficiency of the yeast in vitro protein synthesis system, with an RLU value of 1.539 × 10^9.
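The fold change quoted above follows directly from the reported RLU values. The short calculation below reproduces it as a plain ratio; the variable names are illustrative and no values beyond those in the text are used.

```python
# Fold-change arithmetic using the RLU values reported for FIG. 7 and FIG. 8.
wt_rlu = 2.87e8                  # wild type (WT)
three_in_one_rlu = 1.539e9       # 3in1 strain, no T7RNAP added
three_in_one_plus_t7_rlu = 1.148e9  # 3in1 strain with exogenous T7RNP

fold_vs_wt = three_in_one_rlu / wt_rlu
fold_vs_wt_with_added_t7 = three_in_one_plus_t7_rlu / wt_rlu

print(f"3in1 vs WT: {fold_vs_wt:.2f}-fold")
print(f"3in1 + exogenous T7RNP vs WT: {fold_vs_wt_with_added_t7:.2f}-fold")
```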
Example 4
The method of Example 3 was repeated except that the engineered strain of the invention integrated only the first fusion protein, formed by fusing the Pab1 element with the eIF4G element.
The results show that the engineered strain integrating only the first fusion protein, formed by fusing the Pab1 element with the eIF4G element, increased luciferase activity in IVTT 2.6-fold compared with the wild type.
Example 5
The method of example 3 was repeated except that the engineered strain of the invention knocked out only KlEXN53 nuclease.
The results showed that the engineered strain of the invention, which knocks out only KlEXN53 nuclease, increased luciferase activity in IVTT by 1.46-fold compared to wild type.
Example 6 Summary of the design concept of the in vitro biosynthesis system
As described above, the present invention summarizes a design concept for regulating in vitro synthesis through transcription, translation regulation and nuclease stabilization: a method for improving in vitro biosynthesis activity that integrates knockout of the exonuclease (KlEXN53), endogenous expression of RNA polymerase (T7RNP) and a novel fusion protein (KlPAB1-KleIF4G) in cells, thereby achieving stable, efficient and rapid in vitro protein expression, as shown in FIG. 9.
References:
1. Ayyar, B.V., S. Arora, and R. O'Kennedy, Coming-of-Age of Antibodies in Cancer Therapeutics. Trends Pharmacol Sci, 2016. 37(12): 1009-1028.
2. Scott, A.M., J.D. Wolchok, and L.J. Old, Antibody therapy of cancer. Nature Reviews Cancer, 2012. 12(4): 278.
3. Sonenberg, N. and A.G. Hinnebusch, Regulation of translation initiation in eukaryotes: mechanisms and biological targets. Cell, 2009. 136(4): 731-45.
4. M., D.J.R.G., Nucleic Acid. Encyclopedia of Cell Biology, Elsevier, 2015.
5. Chong, S., Overview of Cell-Free Protein Synthesis: Historic Landmarks, Commercial Systems, and Expanding Applications. 2014: John Wiley & Sons, Inc. 16.30.1-16.30.11.
6. Buchner E, Rapp R., Alkoholische Gährung ohne Hefezellen. European Journal of Inorganic Chemistry, 1897. 30(3): 2668-2678.
7. Welch P, Scopes R K, Studies on cell-free metabolism: Ethanol production by a yeast glycolytic system reconstituted from purified enzymes. Journal of Biotechnology, 1985. 2(5): 257-273.
All documents mentioned in this application are incorporated by reference in this application as if each were individually incorporated by reference. Furthermore, it should be understood that those skilled in the art may make various changes and modifications to the present invention after reading the above teachings, and these equivalents also fall within the scope of the present invention as defined by the appended claims.
Sequence listing
<110> Kangma (Shanghai) Biotechnology Co., Ltd.
<120> a method for improving in vitro protein synthesis efficiency
<130> 2018100936240
<141> 2020-07-14
<160> 46
<170> SIPOSequenceListing 1.0
<210> 1
<211> 592
<212> PRT
<213> Kluyveromyces lactis (Kluyveromyces lactis)
<400> 1
Met Ser Asp Ile Thr Glu Lys Thr Ala Glu Gln Leu Glu Asn Leu Gln
1 5 10 15
Ile Asn Asp Asp Gln Gln Pro Ala Gln Ser Ala Ser Ala Pro Ser Thr
20 25 30
Ser Ala Ser Glu Ser Glu Ala Ser Ser Val Ser Lys Val Glu Asn Asn
35 40 45
Asn Ala Ser Leu Tyr Val Gly Glu Leu Asp Pro Asn Ile Thr Glu Ala
50 55 60
Leu Leu Tyr Asp Val Phe Ser Pro Leu Gly Pro Ile Ser Ser Ile Arg
65 70 75 80
Val Cys Arg Asp Ala Val Thr Lys Ala Ser Leu Gly Tyr Ala Tyr Val
85 90 95
Asn Tyr Thr Asp Tyr Glu Ala Gly Lys Lys Ala Ile Gln Glu Leu Asn
100 105 110
Tyr Ala Glu Ile Asn Gly Arg Pro Cys Arg Ile Met Trp Ser Glu Arg
115 120 125
Asp Pro Ala Ile Arg Lys Lys Gly Ser Gly Asn Ile Phe Ile Lys Asn
130 135 140
Leu His Pro Ala Ile Asp Asn Lys Ala Leu His Glu Thr Phe Ser Thr
145 150 155 160
Phe Gly Glu Val Leu Ser Cys Lys Val Ala Leu Asp Glu Asn Gly Asn
165 170 175
Ser Arg Gly Phe Gly Phe Val His Phe Lys Glu Glu Ser Asp Ala Lys
180 185 190
Asp Ala Ile Glu Ala Val Asn Gly Met Leu Met Asn Gly Leu Glu Val
195 200 205
Tyr Val Ala Met His Val Pro Lys Lys Asp Arg Ile Ser Lys Leu Glu
210 215 220
Glu Ala Lys Ala Asn Phe Thr Asn Ile Tyr Val Lys Asn Ile Asp Val
225 230 235 240
Glu Thr Thr Asp Glu Glu Phe Glu Gln Leu Phe Ser Gln Tyr Gly Glu
245 250 255
Ile Val Ser Ala Ala Leu Glu Lys Asp Ala Glu Gly Lys Pro Lys Gly
260 265 270
Phe Gly Phe Val Asn Phe Val Asp His Asn Ala Ala Ala Lys Ala Val
275 280 285
Glu Glu Leu Asn Gly Lys Glu Phe Lys Ser Gln Ala Leu Tyr Val Gly
290 295 300
Arg Ala Gln Lys Lys Tyr Glu Arg Ala Glu Glu Leu Lys Lys Gln Tyr
305 310 315 320
Glu Gln Tyr Arg Leu Glu Lys Leu Ala Lys Phe Gln Gly Val Asn Leu
325 330 335
Phe Ile Lys Asn Leu Asp Asp Ser Ile Asp Asp Glu Lys Leu Lys Glu
340 345 350
Glu Phe Ala Pro Tyr Gly Thr Ile Thr Ser Ala Arg Val Met Arg Asp
355 360 365
Gln Glu Gly Asn Ser Lys Gly Phe Gly Phe Val Cys Phe Ser Ser Pro
370 375 380
Glu Glu Ala Thr Lys Ala Met Thr Glu Lys Asn Gln Gln Ile Val Ala
385 390 395 400
Gly Lys Pro Leu Tyr Val Ala Ile Ala Gln Arg Lys Asp Val Arg Arg
405 410 415
Ser Gln Leu Ala Gln Gln Ile Gln Ala Arg Asn Gln Ile Arg Phe Gln
420 425 430
Gln Gln Gln Gln Gln Gln Ala Ala Ala Ala Ala Ala Gly Met Pro Gly
435 440 445
Gln Tyr Met Pro Gln Met Phe Tyr Gly Val Met Ala Pro Arg Gly Phe
450 455 460
Pro Gly Pro Asn Pro Gly Met Asn Gly Pro Met Gly Ala Gly Ile Pro
465 470 475 480
Lys Asn Gly Met Val Pro Pro Pro Gln Gln Phe Ala Gly Arg Pro Asn
485 490 495
Gly Pro Met Tyr Gln Gly Met Pro Pro Gln Asn Gln Phe Pro Arg His
500 505 510
Gln Gln Gln His Tyr Ile Gln Gln Gln Lys Gln Arg Gln Ala Leu Gly
515 520 525
Glu Gln Leu Tyr Lys Lys Val Ser Ala Lys Ile Asp Asp Glu Asn Ala
530 535 540
Ala Gly Lys Ile Thr Gly Met Ile Leu Asp Leu Pro Pro Gln Gln Val
545 550 555 560
Ile Gln Leu Leu Asp Asn Asp Glu Gln Phe Glu Gln Gln Phe Gln Glu
565 570 575
Ala Leu Ala Ala Tyr Glu Asn Phe Lys Lys Glu Gln Glu Ala Gln Ala
580 585 590
<210> 2
<211> 1021
<212> PRT
<213> Kluyveromyces lactis (Kluyveromyces lactis)
<400> 2
Met Gly Glu Pro Thr Ser Asp Gln Gln Pro Ala Val Glu Ala Pro Val
1 5 10 15
Val Gln Glu Glu Thr Thr Ser Ser Pro Gln Lys Asn Ser Gly Tyr Val
20 25 30
Lys Asn Thr Ala Gly Ser Gly Ala Pro Arg Asn Gly Lys Tyr Asp Gly
35 40 45
Asn Arg Lys Asn Ser Arg Pro Tyr Asn Gln Arg Gly Asn Asn Asn Asn
50 55 60
Asn Asn Gly Ser Ser Ser Asn Lys His Tyr Gln Lys Tyr Asn Gln Pro
65 70 75 80
Ala Tyr Gly Val Ser Ala Gly Tyr Ile Pro Asn Tyr Gly Val Ser Ala
85 90 95
Glu Tyr Asn Pro Leu Tyr Tyr Asn Gln Tyr Gln Gln Gln Gln Gln Leu
100 105 110
Tyr Ala Ala Ala Tyr Gln Thr Pro Met Ser Gly Gln Gly Tyr Val Pro
115 120 125
Pro Val Val Ser Pro Ala Ala Val Ser Ala Lys Pro Ala Lys Val Glu
130 135 140
Ile Thr Asn Lys Ser Gly Glu His Ile Asp Ile Ala Ser Ile Ala His
145 150 155 160
Pro His Thr His Ser His Ser Gln Ser His Ser Arg Ala Val Pro Val
165 170 175
Val Ser Pro Pro Ala Asn Val Thr Val Ala Ala Ala Val Ser Ser Ser
180 185 190
Val Ser Pro Ser Ala Ser Pro Ala Val Lys Val Gln Ser Pro Ala Ala
195 200 205
Asn Gly Lys Glu Gln Ser Pro Ala Lys Pro Glu Glu Pro Lys Lys Asp
210 215 220
Thr Leu Ile Val Asn Asp Phe Leu Glu Gln Val Lys Arg Arg Lys Ala
225 230 235 240
Ala Leu Ala Ala Lys Lys Ala Val Glu Glu Lys Gly Pro Glu Glu Pro
245 250 255
Lys Glu Ser Val Val Gly Thr Asp Thr Asp Ala Ser Val Asp Thr Lys
260 265 270
Thr Gly Pro Thr Ala Thr Glu Ser Ala Lys Ser Glu Glu Ala Gln Ser
275 280 285
Glu Ser Gln Glu Lys Thr Lys Glu Glu Ala Pro Ala Glu Pro Lys Pro
290 295 300
Leu Thr Leu Ala Glu Lys Leu Arg Leu Lys Arg Met Glu Ala Ala Lys
305 310 315 320
Gln Ala Ser Ala Lys Thr Glu Glu Leu Lys Thr Glu Glu Ser Lys Pro
325 330 335
Glu Glu Thr Lys Thr Glu Glu Leu Lys Thr Glu Glu Ser Lys Pro Glu
340 345 350
Glu Thr Lys Thr Glu Glu Leu Lys Thr Glu Glu Thr Lys Ser Glu Glu
355 360 365
Leu Lys Thr Glu Glu Pro Lys Ala Glu Glu Ser Lys Ala Glu Glu Pro
370 375 380
Lys Pro Glu Glu Pro Lys Thr Glu Glu Pro Thr Thr Glu Gln Pro Lys
385 390 395 400
Ser Asp Glu Pro Lys Ser Glu Glu Ser Lys Thr Glu Glu Pro Lys Thr
405 410 415
Glu Val Leu Lys Thr Glu Glu Pro Lys Ser Glu Glu Ser Lys Pro Ala
420 425 430
Glu Pro Lys Thr Glu Glu Thr Ala Thr Glu Glu Thr Ala Thr Glu Ala
435 440 445
Asn Ala Glu Glu Gly Glu Pro Ala Pro Ala Gly Pro Val Glu Thr Pro
450 455 460
Ala Asp Val Glu Thr Lys Pro Arg Glu Glu Ala Glu Val Glu Asp Asp
465 470 475 480
Gly Lys Ile Thr Met Thr Asp Phe Leu Gln Lys Leu Lys Glu Val Ser
485 490 495
Pro Val Asp Asp Ile Tyr Ser Phe Gln Tyr Pro Ser Asp Ile Thr Pro
500 505 510
Pro Asn Asp Arg Tyr Lys Lys Thr Ser Ile Lys Tyr Ala Tyr Gly Pro
515 520 525
Asp Phe Leu Tyr Gln Phe Lys Glu Lys Val Asp Val Lys Tyr Asp Pro
530 535 540
Ala Trp Met Ala Glu Met Thr Ser Lys Ile Val Ile Pro Pro Lys Lys
545 550 555 560
Pro Gly Ser Ser Gly Arg Gly Glu Asp Arg Phe Ser Lys Gly Lys Val
565 570 575
Gly Ser Leu Arg Ser Glu Gly Arg Ser Gly Ser Arg Ser Asn Ser Lys
580 585 590
Lys Lys Ser Lys Arg Asp Asp Arg Lys Ser Asn Arg Ser Tyr Thr Ser
595 600 605
Arg Lys Asp Arg Glu Arg Phe Arg Glu Glu Glu Val Glu Glu Pro Lys
610 615 620
Val Glu Val Ala Pro Leu Val Pro Ser Ala Asn Arg Trp Val Pro Lys
625 630 635 640
Ser Lys Met Lys Lys Thr Glu Val Lys Leu Ala Pro Asp Gly Thr Glu
645 650 655
Leu Tyr Asp Ala Glu Glu Ala Ser Arg Lys Met Lys Ser Leu Leu Asn
660 665 670
Lys Leu Thr Leu Glu Met Phe Glu Pro Ile Ser Asp Asp Ile Met Lys
675 680 685
Ile Ala Asn Gln Ser Arg Trp Glu Glu Lys Gly Glu Thr Leu Lys Ile
690 695 700
Val Ile Gln Gln Ile Phe Asn Lys Ala Cys Asp Glu Pro His Trp Ser
705 710 715 720
Ser Met Tyr Ala Gln Leu Cys Gly Lys Val Val Lys Asp Leu Asp Asp
725 730 735
Ser Ile Lys Asp Ser Glu Thr Pro Asp Lys Thr Gly Ser His Leu Val
740 745 750
Leu His Tyr Leu Val Gln Arg Cys Gln Thr Glu Phe Gln Thr Gly Trp
755 760 765
Thr Asp Gln Leu Pro Thr Asn Glu Asp Gly Thr Pro Leu Gln Pro Glu
770 775 780
Met Met Ser Asp Glu Tyr Tyr Lys Met Ala Ala Ala Lys Arg Arg Gly
785 790 795 800
Leu Gly Leu Val Arg Phe Ile Gly Phe Leu Tyr Arg Ser Asn Leu Leu
805 810 815
Thr Ser Arg Met Val Phe Phe Cys Phe Lys Arg Leu Met Lys Asp Ile
820 825 830
Gln Asn Ser Pro Thr Glu Asp Thr Leu Glu Ser Val Cys Glu Leu Leu
835 840 845
Glu Thr Ile Gly Glu Gln Phe Glu Gly Ala Arg Ile Gln Val Thr Ala
850 855 860
Glu Ala Val Ile Glu Gly Ser Ser Leu Leu Asp Thr Leu Phe Asp Gln
865 870 875 880
Ile Lys Asn Val Ile Glu Asn Gly Asp Ile Ser Ser Arg Ile Lys Phe
885 890 895
Lys Leu Ile Asp Ile Val Glu Leu Arg Glu Lys Arg Asn Trp Asn Ser
900 905 910
Lys Asn Lys Asn Asp Gly Pro Lys Thr Ile Ala Gln Ile His Glu Glu
915 920 925
Glu Ala Leu Lys Arg Ala Leu Glu Glu Arg Glu Arg Glu Arg Asp Arg
930 935 940
His Gly Ser Arg Gly Gly Ser Arg Arg Met Asn Ser Glu Arg Asn Ser
945 950 955 960
Ser Arg Arg Asp Phe Ser Ser His Ser His Ser His Asn Gln Asn Arg
965 970 975
Asp Gly Phe Thr Thr Thr Arg Ser Ser Ser Val Arg Tyr Ser Glu Pro
980 985 990
Lys Lys Glu Glu Gln Ala Pro Thr Pro Thr Lys Ser Ser Gly Gly Ala
995 1000 1005
Ala Asn Met Phe Asp Ala Leu Met Asp Ala Glu Asp Asp
1010 1015 1020
<210> 3
<211> 1642
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 3
Met Ser Asp Ile Thr Glu Lys Thr Ala Glu Gln Leu Glu Asn Leu Gln
1 5 10 15
Ile Asn Asp Asp Gln Gln Pro Ala Gln Ser Ala Ser Ala Pro Ser Thr
20 25 30
Ser Ala Ser Glu Ser Glu Ala Ser Ser Val Ser Lys Val Glu Asn Asn
35 40 45
Asn Ala Ser Leu Tyr Val Gly Glu Leu Asp Pro Asn Ile Thr Glu Ala
50 55 60
Leu Leu Tyr Asp Val Phe Ser Pro Leu Gly Pro Ile Ser Ser Ile Arg
65 70 75 80
Val Cys Arg Asp Ala Val Thr Lys Ala Ser Leu Gly Tyr Ala Tyr Val
85 90 95
Asn Tyr Thr Asp Tyr Glu Ala Gly Lys Lys Ala Ile Gln Glu Leu Asn
100 105 110
Tyr Ala Glu Ile Asn Gly Arg Pro Cys Arg Ile Met Trp Ser Glu Arg
115 120 125
Asp Pro Ala Ile Arg Lys Lys Gly Ser Gly Asn Ile Phe Ile Lys Asn
130 135 140
Leu His Pro Ala Ile Asp Asn Lys Ala Leu His Glu Thr Phe Ser Thr
145 150 155 160
Phe Gly Glu Val Leu Ser Cys Lys Val Ala Leu Asp Glu Asn Gly Asn
165 170 175
Ser Arg Gly Phe Gly Phe Val His Phe Lys Glu Glu Ser Asp Ala Lys
180 185 190
Asp Ala Ile Glu Ala Val Asn Gly Met Leu Met Asn Gly Leu Glu Val
195 200 205
Tyr Val Ala Met His Val Pro Lys Lys Asp Arg Ile Ser Lys Leu Glu
210 215 220
Glu Ala Lys Ala Asn Phe Thr Asn Ile Tyr Val Lys Asn Ile Asp Val
225 230 235 240
Glu Thr Thr Asp Glu Glu Phe Glu Gln Leu Phe Ser Gln Tyr Gly Glu
245 250 255
Ile Val Ser Ala Ala Leu Glu Lys Asp Ala Glu Gly Lys Pro Lys Gly
260 265 270
Phe Gly Phe Val Asn Phe Val Asp His Asn Ala Ala Ala Lys Ala Val
275 280 285
Glu Glu Leu Asn Gly Lys Glu Phe Lys Ser Gln Ala Leu Tyr Val Gly
290 295 300
Arg Ala Gln Lys Lys Tyr Glu Arg Ala Glu Glu Leu Lys Lys Gln Tyr
305 310 315 320
Glu Gln Tyr Arg Leu Glu Lys Leu Ala Lys Phe Gln Gly Val Asn Leu
325 330 335
Phe Ile Lys Asn Leu Asp Asp Ser Ile Asp Asp Glu Lys Leu Lys Glu
340 345 350
Glu Phe Ala Pro Tyr Gly Thr Ile Thr Ser Ala Arg Val Met Arg Asp
355 360 365
Gln Glu Gly Asn Ser Lys Gly Phe Gly Phe Val Cys Phe Ser Ser Pro
370 375 380
Glu Glu Ala Thr Lys Ala Met Thr Glu Lys Asn Gln Gln Ile Val Ala
385 390 395 400
Gly Lys Pro Leu Tyr Val Ala Ile Ala Gln Arg Lys Asp Val Arg Arg
405 410 415
Ser Gln Leu Ala Gln Gln Ile Gln Ala Arg Asn Gln Ile Arg Phe Gln
420 425 430
Gln Gln Gln Gln Gln Gln Ala Ala Ala Ala Ala Ala Gly Met Pro Gly
435 440 445
Gln Tyr Met Pro Gln Met Phe Tyr Gly Val Met Ala Pro Arg Gly Phe
450 455 460
Pro Gly Pro Asn Pro Gly Met Asn Gly Pro Met Gly Ala Gly Ile Pro
465 470 475 480
Lys Asn Gly Met Val Pro Pro Pro Gln Gln Phe Ala Gly Arg Pro Asn
485 490 495
Gly Pro Met Tyr Gln Gly Met Pro Pro Gln Asn Gln Phe Pro Arg His
500 505 510
Gln Gln Gln His Tyr Ile Gln Gln Gln Lys Gln Arg Gln Ala Leu Gly
515 520 525
Glu Gln Leu Tyr Lys Lys Val Ser Ala Lys Ile Asp Asp Glu Asn Ala
530 535 540
Ala Gly Lys Ile Thr Gly Met Ile Leu Asp Leu Pro Pro Gln Gln Val
545 550 555 560
Ile Gln Leu Leu Asp Asn Asp Glu Gln Phe Glu Gln Gln Phe Gln Glu
565 570 575
Ala Leu Ala Ala Tyr Glu Asn Phe Lys Lys Glu Gln Glu Ala Gln Ala
580 585 590
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Thr Gln Asp Glu Val Gln
595 600 605
Gly Pro His Ala Gly Lys Ser Thr Val Gly Gly Gly Gly Ser Gly Glu
610 615 620
Pro Thr Ser Asp Gln Gln Pro Ala Val Glu Ala Pro Val Val Gln Glu
625 630 635 640
Glu Thr Thr Ser Ser Pro Gln Lys Asn Ser Gly Tyr Val Lys Asn Thr
645 650 655
Ala Gly Ser Gly Ala Pro Arg Asn Gly Lys Tyr Asp Gly Asn Arg Lys
660 665 670
Asn Ser Arg Pro Tyr Asn Gln Arg Gly Asn Asn Asn Asn Asn Asn Gly
675 680 685
Ser Ser Ser Asn Lys His Tyr Gln Lys Tyr Asn Gln Pro Ala Tyr Gly
690 695 700
Val Ser Ala Gly Tyr Ile Pro Asn Tyr Gly Val Ser Ala Glu Tyr Asn
705 710 715 720
Pro Leu Tyr Tyr Asn Gln Tyr Gln Gln Gln Gln Gln Leu Tyr Ala Ala
725 730 735
Ala Tyr Gln Thr Pro Met Ser Gly Gln Gly Tyr Val Pro Pro Val Val
740 745 750
Ser Pro Ala Ala Val Ser Ala Lys Pro Ala Lys Val Glu Ile Thr Asn
755 760 765
Lys Ser Gly Glu His Ile Asp Ile Ala Ser Ile Ala His Pro His Thr
770 775 780
His Ser His Ser Gln Ser His Ser Arg Ala Val Pro Val Val Ser Pro
785 790 795 800
Pro Ala Asn Val Thr Val Ala Ala Ala Val Ser Ser Ser Val Ser Pro
805 810 815
Ser Ala Ser Pro Ala Val Lys Val Gln Ser Pro Ala Ala Asn Gly Lys
820 825 830
Glu Gln Ser Pro Ala Lys Pro Glu Glu Pro Lys Lys Asp Thr Leu Ile
835 840 845
Val Asn Asp Phe Leu Glu Gln Val Lys Arg Arg Lys Ala Ala Leu Ala
850 855 860
Ala Lys Lys Ala Val Glu Glu Lys Gly Pro Glu Glu Pro Lys Glu Ser
865 870 875 880
Val Val Gly Thr Asp Thr Asp Ala Ser Val Asp Thr Lys Thr Gly Pro
885 890 895
Thr Ala Thr Glu Ser Ala Lys Ser Glu Glu Ala Gln Ser Glu Ser Gln
900 905 910
Glu Lys Thr Lys Glu Glu Ala Pro Ala Glu Pro Lys Pro Leu Thr Leu
915 920 925
Ala Glu Lys Leu Arg Leu Lys Arg Met Glu Ala Ala Lys Gln Ala Ser
930 935 940
Ala Lys Thr Glu Glu Leu Lys Thr Glu Glu Ser Lys Pro Glu Glu Thr
945 950 955 960
Lys Thr Glu Glu Leu Lys Thr Glu Glu Ser Lys Pro Glu Glu Thr Lys
965 970 975
Thr Glu Glu Leu Lys Thr Glu Glu Thr Lys Ser Glu Glu Leu Lys Thr
980 985 990
Glu Glu Pro Lys Ala Glu Glu Ser Lys Ala Glu Glu Pro Lys Pro Glu
995 1000 1005
Glu Pro Lys Thr Glu Glu Pro Thr Thr Glu Gln Pro Lys Ser Asp Glu
1010 1015 1020
Pro Lys Ser Glu Glu Ser Lys Thr Glu Glu Pro Lys Thr Glu Val Leu
1025 1030 1035 1040
Lys Thr Glu Glu Pro Lys Ser Glu Glu Ser Lys Pro Ala Glu Pro Lys
1045 1050 1055
Thr Glu Glu Thr Ala Thr Glu Glu Thr Ala Thr Glu Ala Asn Ala Glu
1060 1065 1070
Glu Gly Glu Pro Ala Pro Ala Gly Pro Val Glu Thr Pro Ala Asp Val
1075 1080 1085
Glu Thr Lys Pro Arg Glu Glu Ala Glu Val Glu Asp Asp Gly Lys Ile
1090 1095 1100
Thr Met Thr Asp Phe Leu Gln Lys Leu Lys Glu Val Ser Pro Val Asp
1105 1110 1115 1120
Asp Ile Tyr Ser Phe Gln Tyr Pro Ser Asp Ile Thr Pro Pro Asn Asp
1125 1130 1135
Arg Tyr Lys Lys Thr Ser Ile Lys Tyr Ala Tyr Gly Pro Asp Phe Leu
1140 1145 1150
Tyr Gln Phe Lys Glu Lys Val Asp Val Lys Tyr Asp Pro Ala Trp Met
1155 1160 1165
Ala Glu Met Thr Ser Lys Ile Val Ile Pro Pro Lys Lys Pro Gly Ser
1170 1175 1180
Ser Gly Arg Gly Glu Asp Arg Phe Ser Lys Gly Lys Val Gly Ser Leu
1185 1190 1195 1200
Arg Ser Glu Gly Arg Ser Gly Ser Arg Ser Asn Ser Lys Lys Lys Ser
1205 1210 1215
Lys Arg Asp Asp Arg Lys Ser Asn Arg Ser Tyr Thr Ser Arg Lys Asp
1220 1225 1230
Arg Glu Arg Phe Arg Glu Glu Glu Val Glu Glu Pro Lys Val Glu Val
1235 1240 1245
Ala Pro Leu Val Pro Ser Ala Asn Arg Trp Val Pro Lys Ser Lys Met
1250 1255 1260
Lys Lys Thr Glu Val Lys Leu Ala Pro Asp Gly Thr Glu Leu Tyr Asp
1265 1270 1275 1280
Ala Glu Glu Ala Ser Arg Lys Met Lys Ser Leu Leu Asn Lys Leu Thr
1285 1290 1295
Leu Glu Met Phe Glu Pro Ile Ser Asp Asp Ile Met Lys Ile Ala Asn
1300 1305 1310
Gln Ser Arg Trp Glu Glu Lys Gly Glu Thr Leu Lys Ile Val Ile Gln
1315 1320 1325
Gln Ile Phe Asn Lys Ala Cys Asp Glu Pro His Trp Ser Ser Met Tyr
1330 1335 1340
Ala Gln Leu Cys Gly Lys Val Val Lys Asp Leu Asp Asp Ser Ile Lys
1345 1350 1355 1360
Asp Ser Glu Thr Pro Asp Lys Thr Gly Ser His Leu Val Leu His Tyr
1365 1370 1375
Leu Val Gln Arg Cys Gln Thr Glu Phe Gln Thr Gly Trp Thr Asp Gln
1380 1385 1390
Leu Pro Thr Asn Glu Asp Gly Thr Pro Leu Gln Pro Glu Met Met Ser
1395 1400 1405
Asp Glu Tyr Tyr Lys Met Ala Ala Ala Lys Arg Arg Gly Leu Gly Leu
1410 1415 1420
Val Arg Phe Ile Gly Phe Leu Tyr Arg Ser Asn Leu Leu Thr Ser Arg
1425 1430 1435 1440
Met Val Phe Phe Cys Phe Lys Arg Leu Met Lys Asp Ile Gln Asn Ser
1445 1450 1455
Pro Thr Glu Asp Thr Leu Glu Ser Val Cys Glu Leu Leu Glu Thr Ile
1460 1465 1470
Gly Glu Gln Phe Glu Gly Ala Arg Ile Gln Val Thr Ala Glu Ala Val
1475 1480 1485
Ile Glu Gly Ser Ser Leu Leu Asp Thr Leu Phe Asp Gln Ile Lys Asn
1490 1495 1500
Val Ile Glu Asn Gly Asp Ile Ser Ser Arg Ile Lys Phe Lys Leu Ile
1505 1510 1515 1520
Asp Ile Val Glu Leu Arg Glu Lys Arg Asn Trp Asn Ser Lys Asn Lys
1525 1530 1535
Asn Asp Gly Pro Lys Thr Ile Ala Gln Ile His Glu Glu Glu Ala Leu
1540 1545 1550
Lys Arg Ala Leu Glu Glu Arg Glu Arg Glu Arg Asp Arg His Gly Ser
1555 1560 1565
Arg Gly Gly Ser Arg Arg Met Asn Ser Glu Arg Asn Ser Ser Arg Arg
1570 1575 1580
Asp Phe Ser Ser His Ser His Ser His Asn Gln Asn Arg Asp Gly Phe
1585 1590 1595 1600
Thr Thr Thr Arg Ser Ser Ser Val Arg Tyr Ser Glu Pro Lys Lys Glu
1605 1610 1615
Glu Gln Ala Pro Thr Pro Thr Lys Ser Ser Gly Gly Ala Ala Asn Met
1620 1625 1630
Phe Asp Ala Leu Met Asp Ala Glu Asp Asp
1635 1640
<210> 4
<211> 3542
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 4
agtcgaacaa gaagcaggca aagtttagag cactgcccct ccgcactcaa aaaagaaaaa 60
actaggagga aaataaaatt ctcaaccaca caaacacata aacacataca aatacaaata 120
caagcttatt tacttgacat cgcgcgatct tccactattc agcgccgtcc gccctctctc 180
gtgttttttg tttacgcgac aactatgcga aatccggagc aacgggcaac cgtttgggga 240
aagaccacac ccacgcgcga tcgccatggc aacgaggtcg cacacgcccc acacccagac 300
ctccctgcga gcgggcatgg gtacaatgtc cccgttgcca cagacaccac ttcgtagcac 360
agcgcagagc gtagcgtgtt gttgctgctg acaaaagaaa atttttctta gcaaagcaaa 420
ggaggggaag cacgggcaga tagcaccgta ccataccctt ggaaactcga aatgaacgaa 480
gcaggaaatg agagaatgag agttttgtag gtatatatag cggtagtgtt tgcgcgttac 540
catcatcttc tggatctatc tattgttctt ttcctcatca ctttcccctt tttcgctctt 600
cttcttgtct tttatttctt tctttttttt aattgttccc tcgattggct atctaccaaa 660
gaatccaaac ttaatacacg tatttatttg tccaattacc atgaacacga ttaacatcgc 720
taagaacgac ttctctgaca tcgaactggc tgctatcccg ttcaacactc tggctgacca 780
ttacggtgag cgtttagctc gcgaacagtt ggcccttgag catgagtctt acgagatggg 840
tgaagcacgc ttccgcaaga tgtttgagcg tcaacttaaa gctggtgagg ttgcggataa 900
cgctgccgcc aagcctctca tcactaccct actccctaag atgattgcac gcatcaacga 960
ctggtttgag gaagtgaaag ctaagcgcgg caagcgcccg acagccttcc agttcctgca 1020
agaaatcaag ccggaagccg tagcgtacat caccattaag accactctgg cttgcctaac 1080
cagtgctgac aatacaaccg ttcaggctgt agcaagcgca atcggtcggg ccattgagga 1140
cgaggctcgc ttcggtcgta tccgtgacct tgaagctaag cacttcaaga aaaacgttga 1200
ggaacaactc aacaagcgcg tagggcacgt ctacaagaaa gcatttatgc aagttgtcga 1260
ggctgacatg ctctctaagg gtctactcgg tggcgaggcg tggtcttcgt ggcataagga 1320
agactctatt catgtaggag tacgctgcat cgagatgctc attgagtcaa ccggaatggt 1380
tagcttacac cgccaaaatg ctggcgtagt aggtcaagac tctgagacta tcgaactcgc 1440
acctgaatac gctgaggcta tcgcaacccg tgcaggtgcg ctggctggca tctctccgat 1500
gttccaacct tgcgtagttc ctcctaagcc gtggactggc attactggtg gtggctattg 1560
ggctaacggt cgtcgtcctc tggcgctggt gcgtactcac agtaagaaag cactgatgcg 1620
ctacgaagac gtttacatgc ctgaggtgta caaagcgatt aacattgcgc aaaacaccgc 1680
atggaaaatc aacaagaaag tcctagcggt cgccaacgta atcaccaagt ggaagcattg 1740
tccggtcgag gacatccctg cgattgagcg tgaagaactc ccgatgaaac cggaagacat 1800
cgacatgaat cctgaggctc tcaccgcgtg gaaacgtgct gccgctgctg tgtaccgcaa 1860
ggacaaggct cgcaagtctc gccgtatcag ccttgagttc atgcttgagc aagccaataa 1920
gtttgctaac cataaggcca tctggttccc ttacaacatg gactggcgcg gtcgtgttta 1980
cgctgtgtca atgttcaacc cgcaaggtaa cgatatgacc aaaggactgc ttacgctggc 2040
gaaaggtaaa ccaatcggta aggaaggtta ctactggctg aaaatccacg gtgcaaactg 2100
tgcgggtgtc gataaggttc cgttccctga gcgcatcaag ttcattgagg aaaaccacga 2160
gaacatcatg gcttgcgcta agtctccact ggagaacact tggtgggctg agcaagattc 2220
tccgttctgc ttccttgcgt tctgctttga gtacgctggg gtacagcacc acggcctgag 2280
ctataactgc tcccttccgc tggcgtttga cgggtcttgc tctggcatcc agcacttctc 2340
cgcgatgctc cgagatgagg taggtggtcg cgcggttaac ttgcttccta gtgaaaccgt 2400
tcaggacatc tacgggattg ttgctaagaa agtcaacgag attctacaag cagacgcaat 2460
caatgggacc gataacgaag tagttaccgt gaccgatgag aacactggtg aaatctctga 2520
gaaagtcaag ctgggcacta aggcactggc tggtcaatgg ctggcttacg gtgttactcg 2580
cagtgtgact aagcgttcag tcatgacgct ggcttacggg tccaaagagt tcggcttccg 2640
tcaacaagtg ctggaagata ccattcagcc agctattgat tccggcaagg gtctgatgtt 2700
cactcagccg aatcaggctg ctggatacat ggctaagctg atttgggaat ctgtgagcgt 2760
gacggtggta gctgcggttg aagcaatgaa ctggcttaag tctgctgcta agctgctggc 2820
tgctgaggtc aaagataaga agactggaga gattcttcgc aagcgttgcg ctgtgcattg 2880
ggtaactcct gatggtttcc ctgtgtggca ggaatacaag aagcctattc agacgcgctt 2940
gaacctgatg ttcctcggtc agttccgctt acagcctacc attaacacca acaaagatag 3000
cgagattgat gcacacaaac aggagtctgg tatcgctcct aactttgtac acagccaaga 3060
cggtagccac cttcgtaaga ctgtagtgtg ggcacacgag aagtacggaa tcgaatcttt 3120
tgcactgatt cacgactcct tcggtaccat tccggctgac gctgcgaacc tgttcaaagc 3180
agtgcgcgaa actatggttg acacatatga gtcttgtgat gtactggctg atttctacga 3240
ccagttcgct gaccagttgc acgagtctca attggacaaa atgccagcac ttccggctaa 3300
aggtaacttg aacctccgtg acatcttaga gtcggacttc gcgttcgcgt aaatccgctc 3360
taaccgaaaa ggaaggagtt agacaacctg aagtctaggt ccctatttat ttttttatag 3420
ttatgttagt attaagaacg ttatttatat ttcaaatttt tctttttttt ctgtacagac 3480
gcgtgtacgc atgtaacatt atactgaaaa ccttgcttga gaaggttttg ggacgctcga 3540
ag 3542
<210> 5
<211> 4362
<212> DNA
<213> Kluyveromyces lactis (Kluyveromyces lactis)
<400> 5
atgggtattc caaaattctt tcatttcatc tctgagagat ggcctcaaat ttctcaattg 60
atcgatggat cacagattcc agagttcgac aatttgtact tggatatgaa ttctattttg 120
cataattgta cgcatggtga tggtagcgag gtgaattcaa gactatcaga ggaagaagtt 180
tattccaaaa ttttcagtta tattgatcat cttttccata ctattaaacc aaaacagaca 240
ttttatatgg caatcgatgg tgtggcccca agagcaaaga tgaaccaaca aagagctcgt 300
agattcagaa ctgccatgga tgccgaaaag gctttgcaga aggccattga aaatggtgac 360
gagttgccta agggagagcc atttgattcc aacgctatta ccccaggaac agaatttatg 420
gcaaaattaa ccgagaattt gaaatatttc atccatgaca agatcaccaa cgataccaga 480
tggcagaacg tgaaggttat tttctccggg catgaggttc ctggtgaagg tgaacataag 540
atcatggatt acatcagagc aattagagca caagaggatt acaatccaaa tacaagacat 600
tgtatctacg ggttagatgc tgatttgatc atcctaggtt tatccaccca tgatcaccac 660
ttttgtttat taagagaaga agttactttt ggtaaacgtt cgtcttctgt gaaaactcta 720
gaaacacaga acttcttctt gttgcatttg tctatcttga gagaatattt ggcattagag 780
ttcgaagaaa taacagattc tgtgcagttt gaatacgact ttgaaagagt attagatgat 840
ttcatctttg tattatttac catcggtaat gatttcttac caaatttgcc cgatttgcat 900
ttgaaaaaag gtgcattccc tgtgctatta caaactttta aagaagctct ccaacatatg 960
gatggttaca ttaatgaaca aggtaagata aatttggcaa gattttccat ttggttgaag 1020
tacttgtccg attttgaata ccttaacttt gagaagaaag atattgacgt tgaatggttc 1080
aatcaacaac ttgaaaatat ttccttggaa ggtgagcgta aacgtactag gatgggtaaa 1140
aagttgttga tgaaacaaca aaagaaattg attggcgccg taaaaccatg gttattgaaa 1200
accgttcaac ggaaggtcac ttctgaatta caggatgccg atttcgaaat tttccctctt 1260
gaggataaag aattggttcg ggccaacctg gatttcttga aggaattcgc atttgatttg 1320
ggtttaattc ttgctcattc taaatcaaaa gatttgtact acttcaagtt ggatttagac 1380
tccatcaatg tgcaagaaac agatgaagag catgaagccc gtattcacga aacaagacgt 1440
tctattaaga aatatgaaca aggtattatt attgcatctg aagaagaatt agaagaagag 1500
cgtgaaattt atagcgagag attcgttgag tggaaagatc aatactataa agataaatta 1560
gatttttcca tcaatgatac tgatagtcta aaagaaatga cagaaaacta tgtcggaggt 1620
ttacaatggg ttttatacta ctattatcgt ggatgtccat cttggtcctg gtattacaga 1680
tatcattacg ctcctcgtat atcagatgtg atcaaaggta tagatcagaa catagaattt 1740
cacaagggac agcctttcaa gcccttccaa caattgatgg ccgtcttgcc tgagagatca 1800
aagaatttga ttccagttgt gtacagacca ctcatgtatg atgaacactc tcctatcttg 1860
gacttctatc ctaacgaagt agagttggat ctaaatggca aaacagcaga ctgggaagct 1920
gtcgtcaaaa tttcattcgt ggatcaaaag cgtttagtag aagcaatggc tccttatgat 1980
gctaagcttt ctccggatga aaagaaaaga aattcgttcg gaactgatct gattttcata 2040
ttcaatcctc aggtagacac agtttacaaa acaccactag cggggttgtt taatgatatt 2100
gaacacaatc attgtatcga aagggaattc attccagaat cgatggaaaa cgttaagttc 2160
ttatttggtt tgccaaaggg cgctaaactc ggagctagct ctttggccgg tttcccatcc 2220
ctaaagacac taccactaac tgcagagctt gcttataatt cttctgttgt cttcaatttc 2280
ccttctaaac aacaatctat ggtgttgcat attcaggacc tttacaagga aaatggcatc 2340
tctctgtcag atctagcaaa aagacatatg ggtaagattg tttattcaag atggccgttt 2400
ctaagagaat ctaaactttt gtcgttgatt acagaggaaa ctgtgtatga aggagttaag 2460
tcaggcaaat taacaaaggt cattgaaaga aagcctcagg attttgaaag gaaagaattt 2520
agagagttga agatgactct caaatcgaat tatcaaagga caaaggccat tcttttggat 2580
gacatttctg ctttggctaa agtggttcca gtaaatggat tggtgagaaa ctctgatggt 2640
tcctattcta agtctttcaa tgaaactatc gaatactatc ctttacaatt aatcgtagag 2700
gacgtaaaaa ataaggacga aagatatatt gaaaaagagc cgttaccaat taataaagaa 2760
tttccaaaag gttcaaaagt tgtgttttta ggtgattatg cgtatggtgg ggaggcgacc 2820
gttgatggtt ataatagtga gactagatta aaacttacag tcaaaaaggg ttctctcaga 2880
gcagagccta acatcggaaa agttagagcg aaattggatt ctcaagcctt gagattctac 2940
ccaacacaag tgttttcaaa aatagctcgt gtccaccctc tcttcttgtc aaaaattact 3000
tcaagatatt tggtcaatga ttctaaaaag aaaagccata acgtcggttt gatgatcaaa 3060
ttcaaagcaa gaaatcaaaa agttctcggt tatgccagat gcagctcaaa caaatgggaa 3120
tactctgacg tcgctcttgg tttgttagag cagttcagat ctacattccc tgagttcttt 3180
gcgaaacttt ctaactcgaa ggaacaagca attccatcga tcactgatct cttccctaac 3240
aaatctagcg cggaagctga ttccattttg aaaacagtgg ctgattggct ctcagaagca 3300
agaaaaccat tcgtggtggt gtctttggaa agtgattcgc taaccaaggc ttcgatggca 3360
gctgttgaat ctgaaatcat aaaatacgtt tctttaccag attcaagcga gcagaagaaa 3420
ttagctaagg tcccacgtga ggcaatctta aatgcggaat cgtcatatgt tctattgcgc 3480
tcccaaaggt tccacttagg tgatagggta atgtacattc aagattcagg caaggttcca 3540
cttcacagca aaggtactgt cgttggctac acttcaattg gcaagaacgt ctcaatccaa 3600
gttctatttg acaatgaaat aattgcagga aacaactttg gtggtaggtt gcagaccaga 3660
cgtggtttgg gattggactc ttccttctta ttaaacttgt ctgatagaca attggtatat 3720
cattcaaagg catcgaagag tgctgataag aaaccaaaag cagttcctaa tgataagcaa 3780
gtcgccctcg caaaaaagaa gagagtggag gaactcaaaa aaaagcaagc ccatgagttg 3840
ttaaatcata tcaagaagga taacgcggaa tcaaataccg aatctggatc cgctccgcaa 3900
atagcagtta acactttaaa tccttcggct gctaacaacg tgttcaatgc cgtcttgaac 3960
caaatcaaac caggttctca acagcaaatt caaccacctc cagcaaattc tctgccttac 4020
aatttcactg ttcccccaca tatggttcct ggtggtattc ctcatcctct tatgatgcag 4080
ccgcctttca ttccgaataa tgagcatatt gcttatgcag ctcctcctca gtcacaacct 4140
gtacaaaatc caccattaga taaagaggca tccaggaatc ttaagaatct cctaattaga 4200
gatgagaatg gacgtacagc aaatgtggag aataaagact cagatgatac aaagagatct 4260
tctcattctc gtggcggtcg tcgtggccgt agtaatcgtg gtcgtggtgc ctccgggagg 4320
ggtggtcatt tcaaaaattc tcctaaaaaa actgaaacct ga 4362
<210> 6
<211> 513
<212> PRT
<213> Kluyveromyces lactis (Kluyveromyces lactis)
<400> 6
Met Gly Ile Pro Lys Phe Phe His Phe Ile Ser Glu Arg Trp Pro Gln
1 5 10 15
Ile Ser Gln Leu Ile Asp Gly Ser Gln Ile Pro Glu Phe Asp Asn Leu
20 25 30
Tyr Leu Asp Met Asn Ser Ile Leu His Asn Cys Thr His Gly Asp Gly
35 40 45
Ser Glu Val Asn Ser Arg Leu Ser Glu Glu Glu Val Tyr Ser Lys Ile
50 55 60
Phe Ser Tyr Ile Asp His Leu Phe His Thr Ile Lys Pro Lys Gln Thr
65 70 75 80
Phe Tyr Met Ala Ile Asp Gly Val Ala Pro Arg Ala Lys Met Asn Gln
85 90 95
Gln Arg Ala Arg Arg Phe Arg Thr Ala Met Asp Ala Glu Lys Ala Leu
100 105 110
Gln Lys Ala Ile Glu Asn Gly Asp Glu Leu Pro Lys Gly Glu Pro Phe
115 120 125
Asp Ser Asn Ala Ile Thr Pro Gly Thr Glu Phe Met Ala Lys Leu Thr
130 135 140
Glu Asn Leu Lys Tyr Phe Ile His Asp Lys Ile Thr Asn Asp Thr Arg
145 150 155 160
Trp Gln Asn Val Lys Val Ile Phe Ser Gly His Glu Val Pro Gly Glu
165 170 175
Gly Glu His Lys Ile Met Asp Tyr Ile Arg Ala Ile Arg Ala Gln Glu
180 185 190
Asp Tyr Asn Pro Asn Thr Arg His Cys Ile Tyr Gly Leu Asp Ala Asp
195 200 205
Leu Ile Ile Leu Gly Leu Ser Thr His Asp His His Phe Cys Leu Leu
210 215 220
Arg Glu Glu Val Thr Phe Gly Lys Arg Ser Ser Ser Val Lys Thr Leu
225 230 235 240
Glu Thr Gln Asn Phe Phe Leu Leu His Leu Ser Ile Leu Arg Glu Tyr
245 250 255
Leu Ala Leu Glu Phe Glu Glu Ile Thr Asp Ser Val Gln Phe Glu Tyr
260 265 270
Asp Phe Glu Arg Val Leu Asp Asp Phe Ile Phe Val Leu Phe Thr Ile
275 280 285
Gly Asn Asp Phe Leu Pro Asn Leu Pro Asp Leu His Leu Lys Lys Gly
290 295 300
Ala Phe Pro Val Leu Leu Gln Thr Phe Lys Glu Ala Leu Gln His Met
305 310 315 320
Asp Gly Tyr Ile Asn Glu Gln Gly Lys Ile Asn Leu Ala Arg Phe Ser
325 330 335
Ile Trp Leu Lys Tyr Leu Ser Asp Phe Glu Tyr Leu Asn Phe Glu Lys
340 345 350
Lys Asp Ile Asp Val Glu Trp Phe Asn Gln Gln Leu Glu Asn Ile Ser
355 360 365
Leu Glu Gly Glu Arg Lys Arg Thr Arg Met Gly Lys Lys Leu Leu Met
370 375 380
Lys Gln Gln Lys Lys Leu Ile Gly Ala Val Lys Pro Trp Leu Leu Lys
385 390 395 400
Thr Val Gln Arg Lys Val Thr Ser Glu Leu Gln Asp Ala Asp Phe Glu
405 410 415
Ile Phe Pro Leu Glu Asp Lys Glu Leu Val Arg Ala Asn Leu Asp Phe
420 425 430
Leu Lys Glu Phe Ala Phe Asp Leu Gly Leu Ile Leu Ala His Ser Lys
435 440 445
Ser Lys Asp Leu Tyr Tyr Phe Lys Leu Asp Leu Asp Ser Ile Asn Val
450 455 460
Gln Glu Thr Asp Glu Glu His Glu Ala Arg Ile His Glu Thr Arg Arg
465 470 475 480
Ser Ile Lys Lys Tyr Glu Gln Gly Ile Ile Ile Ala Ser Glu Glu Glu
485 490 495
Leu Glu Glu Glu Arg Glu Ile Tyr Ser Glu Arg Phe Val Glu Trp Lys
500 505 510
Asp
<210> 7
<211> 3066
<212> DNA
<213> Kluyveromyces lactis (Kluyveromyces lactis)
<400> 7
atgggcgaac ctacatccga tcagcaacca gctgttgaag ctccagttgt gcaggaggag 60
acaaccagtt ctccgcaaaa aaacagtgga tatgtcaaga atactgctgg aagcggtgct 120
cctagaaatg ggaaatatga tggtaacagg aagaactcta ggccttataa ccaaagaggt 180
aacaacaaca ataataatgg ttcttcctcg aataagcact atcaaaagta taaccaacca 240
gcgtacggtg tttctgcggg atacattccg aactacggcg tatcggcaga gtacaaccct 300
ctgtactata accagtacca acagcagcaa cagctgtacg ctgctgctta ccagactcca 360
atgagcggac aaggttatgt ccccccagta gtgtctccag ctgctgtttc agctaaacca 420
gcgaaggttg agattactaa caagtctggt gaacacatag atattgcttc cattgctcat 480
ccacatactc attctcattc tcaatctcat tcgcgtgcag ttccagtagt gtcgcctcca 540
gctaacgtta ccgtcgctgc tgctgtatca tcctctgtgt ctccatcagc ttctccagct 600
gtcaaagtac agagccctgc tgctaatggt aaggaacaat ctccagctaa gcctgaagaa 660
ccaaagaagg acactttaat tgtgaacgat ttcttggaac aagttaaaag acgcaaggct 720
gctttagctg ctaagaaggc tgtcgaagag aagggtcctg aggaaccgaa ggaatctgtc 780
gttggaactg acactgatgc aagcgttgat actaagacag ggcctacagc cactgaatct 840
gccaagtctg aagaagctca atcagaatca caagaaaaga ctaaggaaga ggctccagct 900
gagccaaaac cattgacttt ggccgaaaaa ttgagactta agaggatgga agctgcaaag 960
caagcttctg ctaagaccga ggaactaaag actgaagaat ctaagcctga agaaacaaag 1020
accgaggagc taaagactga agaatctaag cctgaagaaa caaagaccga ggagctaaag 1080
actgaagaaa caaagtccga ggaactaaag actgaagaac ctaaggcgga agaatcaaag 1140
gcggaagaac caaagcctga agaaccaaag accgaggaac cgacgactga acaaccaaag 1200
tcagatgaac caaagtcgga agaatcaaaa actgaagagc caaaaaccga ggtattaaag 1260
actgaagaac caaaatcgga agaatcaaag cctgcagaac caaagactga agaaacagca 1320
actgaagaaa cagcaactga agcaaacgcc gaagaaggtg aaccggctcc tgctggtccc 1380
gttgaaactc ctgctgatgt tgaaacaaaa cctcgagaag aggctgaagt tgaagacgat 1440
ggaaagatta ccatgaccga tttcctacag aagttgaaag aggtttctcc agttgatgat 1500
atttattcct tccaataccc aagtgacatt acgcctccaa atgatagata taaaaagaca 1560
agcattaaat atgcatacgg acctgatttc ttgtatcagt tcaaagaaaa ggtcgatgtt 1620
aaatacgatc cagcgtggat ggctgaaatg acgagtaaaa ttgtcatccc tcctaagaag 1680
cctggttcaa gcggaagagg cgaagataga tttagtaagg gtaaggttgg atctctaaga 1740
agtgaaggca gatcgggttc caggtccaac tcgaagaaga agtcaaagag ggatgataga 1800
aaatctaata gatcatacac ttccagaaag gaccgtgaaa gattcagaga ggaagaagtc 1860
gaagagccaa aggttgaggt tgccccattg gtcccaagtg ctaatagatg ggttcctaaa 1920
tctaagatga agaaaacaga agtcaagtta gctccagacg gaacagaact ttacgacgcg 1980
gaagaagcat caagaaagat gaagtcattg ctgaataaat tgacattaga aatgttcgaa 2040
cctatttctg atgatatcat gaagatcgct aaccaatcta gatgggaaga aaagggtgag 2100
actttgaaga ttgtcatcca acaaattttc aataaggcct gcgatgaacc tcattggtca 2160
tcaatgtacg cgcaattatg tggtaaggtc gttaaagact tagatgatag cattaaagac 2220
tcagaaaccc cagataagac tggttctcac ttggttttgc attacttagt ccaaagatgt 2280
caaactgaat tccaaacagg atggactgat caactaccta caaacgaaga cggtactcct 2340
ctacaacctg aaatgatgtc cgatgaatac tataagatgg ctgccgctaa gagaagaggt 2400
ttgggtttgg ttcgtttcat tggtttcttg taccgttcga acttattgac ttccagaatg 2460
gtcttcttct gtttcaagag actaatgaag gatattcaaa actctcctac tgaagatact 2520
ctagagtctg tatgtgaact tttggaaaca attggtgaac agttcgaagg tgctcgtatt 2580
caagttactg cagaagctgt cattgagggt tcaagcttgc tagacacact attcgaccaa 2640
ataaagaacg tgatcgaaaa tggtgacatc tccagcagaa tcaagtttaa gttgatcgac 2700
attgtcgaac taagagaaaa gaggaactgg aatagtaaaa ataagaacga tggtccaaag 2760
accattgctc aaattcacga agaagaagcc ttgaagaggg ctttggagga aagagaaaga 2820
gaaagagatc gccatgggtc cagaggtggt tccagacgta tgaatagcga gagaaactct 2880
tctagaagag atttctcctc tcattctcac agtcacaatc aaaatagaga cggtttcact 2940
actaccagat cgtcatcagt gagatattct gagccaaaga aggaagaaca agctccaact 3000
ccaactaaat cttctggtgg cgctgccaac atgtttgatg cattgatgga tgccgaagat 3060
gattaa 3066
<210> 8
<211> 1779
<212> DNA
<213> Kluyveromyces lactis (Kluyveromyces lactis)
<400> 8
atgtctgata ttactgaaaa aactgctgag caattggaaa acttgcagat caacgatgat 60
cagcaaccag ctcaatctgc cagtgctcca tccacttctg cttctgaaag cgaagcttct 120
tctgtttcta aggttgaaaa caacaacgct tcattgtacg ttggtgaatt ggatccaaac 180
attactgaag cattgttgta cgatgtgttt tcaccattgg gtccaatttc ctcgatccgt 240
gtttgtcgtg atgccgtcac caaggcttcg ttaggttacg cttacgttaa ctatactgat 300
tacgaagctg gtaagaaagc tattcaagaa ttgaactatg ctgaaatcaa cggtagacca 360
tgtagaatta tgtggtccga acgtgaccca gctatcagaa agaagggttc tggtaacatt 420
ttcatcaaga acttgcaccc agccattgac aacaaggctt tgcatgaaac tttctccact 480
ttcggtgaag tcttgtcttg taaagttgct ttagatgaga atggaaactc tagaggcttc 540
ggtttcgttc atttcaagga agaatccgat gctaaggatg ctattgaagc cgtcaacggt 600
atgttgatga acggtttgga agtttacgtt gccatgcacg ttccaaagaa ggaccgtatc 660
tccaagttgg aagaagccaa ggctaacttc accaacattt acgtcaagaa cattgacgtt 720
gaaaccactg acgaagagtt cgaacagttg ttctcccaat acggtgaaat tgtctctgct 780
gctttggaaa aggatgctga gggtaagcca aagggtttcg gtttcgttaa ctttgttgac 840
cacaacgccg ctgccaaggc cgttgaagag ttgaacggta aggaattcaa gtctcaagct 900
ttgtacgttg gcagagctca aaagaagtac gaacgtgctg aagaattgaa gaaacaatac 960
gaacaatacc gtttggaaaa attggctaag ttccaaggtg ttaacttgtt catcaagaac 1020
ttggacgatt ccatcgatga cgaaaaattg aaggaagaat tcgccccata cggtaccatc 1080
acctctgcta gagtcatgag agaccaagag ggtaactcta agggtttcgg tttcgtttgt 1140
ttctcttctc cagaagaagc taccaaggct atgaccgaaa agaaccaaca aattgttgcc 1200
ggtaagccat tgtacgttgc cattgctcaa agaaaggatg tcagaagatc ccaattggct 1260
caacaaattc aagccagaaa ccaaatcaga ttccaacaac agcaacaaca acaagctgct 1320
gccgctgctg ctggtatgcc aggccaatac atgccacaaa tgttctatgg tgttatggcc 1380
ccaagaggtt tcccaggtcc aaacccaggt atgaacggcc caatgggtgc cggtattcca 1440
aagaacggta tggtcccacc accacaacaa tttgctggta gaccaaacgg tccaatgtac 1500
caaggtatgc cacctcaaaa ccaattccca agacaccaac aacaacacta catccaacaa 1560
caaaagcaaa gacaagcctt gggtgaacaa ttgtacaaga aggtcagtgc caagattgac 1620
gacgaaaacg ccgctggtaa gatcaccggt atgatcttgg atctaccacc acagcaagtc 1680
atccaattgt tggacaacga cgaacaattt gaacagcaat tccaagaagc cttagctgct 1740
tacgaaaact tcaagaagga acaagaagct caagcttaa 1779
<210> 9
<211> 62
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 9
tgcttacgaa aacttcaaga gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cg 62
<210> 10
<211> 50
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 10
gctctaaaac tcttgaagtt ttcgtaagca aaagtcccat tcgccacccg 50
<210> 11
<211> 55
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 11
gagctcggta cccggggatc ctctagagat ccggtaagcc attgtacgtt gccat 55
<210> 12
<211> 55
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 12
gccaagcttg catgcctgca ggtcgacgat cagtataccg tccatgttga tgact 55
<210> 13
<211> 20
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 13
atcgtcgacc tgcaggcatg 20
<210> 14
<211> 20
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 14
atctctagag gatccccggg 20
<210> 15
<211> 63
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 15
gatgcattga tggatgccga agatgattaa acttgatttt ttgaccttga tcttcatctt 60
gtc 63
<210> 16
<211> 100
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 16
cttgaacttc atcttgagtt gaacctccac ctccagatcc acctccacca gcttgagctt 60
cttgttcttt tttaaaattc tcgtaagcag ctaaggcttc 100
<210> 17
<211> 93
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 17
gtggaggttc aactcaagat gaagttcaag gtccacatgc tggtaagtct actgttggtg 60
gaggtggatc tggcgaacct acatccgatc agc 93
<210> 18
<211> 27
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 18
ttaatcatct tcggcatcca tcaatgc 27
<210> 19
<211> 17
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 19
gtaaaacgac ggccagt 17
<210> 20
<211> 17
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 20
caggaaacag ctatgac 17
<210> 21
<211> 25
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 21
tctccagaag aagctaccaa ggcta 25
<210> 22
<211> 25
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 22
ttctcttcga cagccttctt agcag 25
<210> 23
<211> 20
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 23
agagttcgac aatttgtact 20
<210> 24
<211> 20
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 24
cgtcgtggcc gtagtaatcg 20
<210> 25
<211> 60
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 25
agagttcgac aatttgtact gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
<210> 26
<211> 50
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 26
gctctaaaac agtacaaatt gtcgaactct aaagtcccat tcgccacccg 50
<210> 27
<211> 60
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 27
cgtcgtggcc gtagtaatcg gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
<210> 28
<211> 50
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 28
gctctaaaac cgattactac ggccacgacg aaagtcccat tcgccacccg 50
<210> 29
<211> 30
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 29
taggtctaga gatctgttta gcttgcctcg 30
<210> 30
<211> 27
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 30
tatccactag acagaagttt gcgttcc 27
<210> 31
<211> 80
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 31
tatggaacgc aaacttctgt ctagtggata gtatatgtgt tatgtagtat actctttctt 60
caacaattaa atactctcgg 80
<210> 32
<211> 57
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 32
cgaggcaagc taaacagatc tctagaccta tatccactag acagaagttt gcgttcc 57
<210> 33
<211> 28
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 33
atgaacacga ttaacatcgc taagaacg 28
<210> 34
<211> 20
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 34
ttacgcgaac gcgaagtccg 20
<210> 35
<211> 52
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 35
atcttagagt cggacttcgc gttcgcgtaa gaagatgctt ctgctcatca tc 52
<210> 36
<211> 56
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 36
agtcgttctt agcgatgtta atcgtgttca tggtaattgg acaaataaat acgtgt 56
<210> 37
<211> 43
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 37
gtacccgggg atcctctaga gatccagttg cagagcctcc gaa 43
<210> 38
<211> 45
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 38
catgcctgca ggtcgacgat gcgaaacctt agctctttat cgaac 45
<210> 39
<211> 104
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 39
atgttggttt gaatggacta ttaacagtaa atattatatc acttcgtgtt caattaatta 60
cggtttgtgg cttaactaaa aagtcgaaca agaagcaggc aaag 104
<210> 40
<211> 84
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 40
gaagtgatat aatatttact gttaatagtc cattcaaacc aacatttatt ttagttaagc 60
cacaaaccgt aattaattga acac 84
<210> 41
<211> 59
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 41
gtgttcaatt aattacggtt tgtggcttaa ctaaaaagtc gaacaagaag caggcaaag 59
<210> 42
<211> 67
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 42
gaagtgatat aatatttact gttaatagtc cattcaaacc aacatcttcg agcgtcccaa 60
aaccttc 67
<210> 43
<211> 45
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 43
atgttggttt gaatggacta ttaacagtaa atattatatc acttc 45
<210> 44
<211> 36
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 44
ttttagttaa gccacaaacc gtaattaatt gaacac 36
<210> 45
<211> 22
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 45
tttgctggtt gcccgtattc cc 22
<210> 46
<211> 23
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 46
taatagcaca gggaatgcac ctt 23
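
Several of the artificial sequences above are related by construction: SEQ ID NO. 25 begins with the 20-nt SEQ ID NO. 23, and SEQ ID NO. 26 contains its reverse complement; the same holds for SEQ ID NOs. 24, 27 and 28. The following Python sketch is not part of the patent. It checks those relationships, and the declared <211> lengths, directly from the listing text. The helper names are hypothetical, and the listing does not state what the shared 20-nt stretch is for, so the sketch only checks the sequence relationships.

```python
# Illustrative check of relationships between entries of the sequence listing.
# Helper names (clean, reverse_complement, related) are hypothetical.

COMPLEMENT = str.maketrans("acgt", "tgca")


def clean(listing_line: str) -> str:
    """Keep only the letters of a listing line, dropping the spaces between
    10-nt blocks and the trailing position counter."""
    return "".join(ch for ch in listing_line if ch.isalpha()).lower()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def related(short: str, forward: str, reverse: str) -> bool:
    """True if `forward` starts with `short` and `reverse` contains the
    reverse complement of `short`."""
    return forward.startswith(short) and reverse_complement(short) in reverse


# Sequences copied from the listing above (SEQ ID NOs. 23 to 28).
seq23 = clean("agagttcgac aatttgtact 20")
seq24 = clean("cgtcgtggcc gtagtaatcg 20")
seq25 = clean("agagttcgac aatttgtact gttttagagc tagaaatagc aagttaaaat aaggctagtc 60")
seq26 = clean("gctctaaaac agtacaaatt gtcgaactct aaagtcccat tcgccacccg 50")
seq27 = clean("cgtcgtggcc gtagtaatcg gttttagagc tagaaatagc aagttaaaat aaggctagtc 60")
seq28 = clean("gctctaaaac cgattactac ggccacgacg aaagtcccat tcgccacccg 50")

# Declared <211> lengths for SEQ ID NOs. 23 to 28.
declared = {23: 20, 24: 20, 25: 60, 26: 50, 27: 60, 28: 50}
actual = {23: seq23, 24: seq24, 25: seq25, 26: seq26, 27: seq27, 28: seq28}

if __name__ == "__main__":
    for number, seq in actual.items():
        print(number, len(seq) == declared[number])
    print(related(seq23, seq25, seq26), related(seq24, seq27, seq28))
```

The same three functions can be reused for any other triplet in the listing by pasting in the corresponding lines.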

Claims (46)

1. A genetically engineered strain for in vitro cell-free protein synthesis, wherein a genome of said genetically engineered strain incorporates a first exogenous gene expression cassette expressing a first nucleic acid construct of a first fusion protein; the first fusion protein has the structure of formula Ia or formula Ib:
S-A-B-C(Ia)
S-C-B-A(Ib);
wherein:
A is a Pab1 element;
B is absent or a linker peptide;
C is an eIF4G element;
S is an optional signal peptide, so that the first fusion protein may or may not contain a signal peptide;
wherein each "-" is a peptide bond;
the genetically engineered strain is Kluyveromyces lactis, and the expression or activity of the KlEXN53 gene is reduced in the genetically engineered strain; the amino acid sequence of the KlEXN53 nuclease expressed by the KlEXN53 gene is SEQ ID NO.6;
wherein A and C satisfy at least one of the following conditions: (1) A is derived from a yeast cell; (2) C is derived from a yeast cell.
2. The genetically engineered strain of claim 1, wherein A and C satisfy at least one of the following conditions: (1) A is derived from a Kluyveromyces lactis cell; (2) C is derived from a Kluyveromyces lactis cell.
3. The genetically engineered strain of claim 1, wherein the first fusion protein contains or does not contain an initial Met.
4. The genetically engineered strain of claim 1, wherein A is a wild-type or mutant Pab1 sequence.
5. The genetically engineered strain of claim 4, wherein A is the amino acid sequence of SEQ ID No.1 or an active fragment thereof.
6. The genetically engineered strain of claim 4, wherein the nucleotide sequence of the Pab1 element is SEQ ID No.8.
7. The genetically engineered strain of claim 1, wherein C is a wild-type or mutant eIF4G sequence, with or without the initial Met.
8. The genetically engineered strain of claim 7, wherein C is the amino acid sequence of SEQ ID No.2 or an active fragment thereof.
9. The genetically engineered strain of claim 7, wherein the nucleotide sequence of the eIF4G element is SEQ ID No.7.
10. The genetically engineered strain of claim 1, wherein B is 0-50 amino acids in length.
11. The genetically engineered strain of claim 10, wherein B is 10-40 amino acids in length.
12. The genetically engineered strain of claim 10, wherein the amino acid sequence of B is Gly Gly Gly Gly Gly Gly Ser Thr Gln Asp Glu Val Gln Gly Pro His Ala Gly Lys Ser Thr Val Gly Gly Gly Gly Gly Gly Gly Ser.
13. The genetically engineered strain of claim 10, wherein B is 15-25 amino acids in length.
14. The genetically engineered strain of claim 1, wherein the amino acid sequence of the first fusion protein is SEQ ID No.3.
15. The genetically engineered strain of any one of claims 1 to 14, further comprising, integrated into its genome, a second exogenous gene expression cassette of a second nucleic acid construct having the structure of formula II from 5' to 3':
Z1-Z2(II)
wherein:
Z1 and Z2 are each an element of the construct;
each "-" is independently a bond or a nucleotide linking sequence;
Z1 is a promoter element, which is ScRNR2;
Z2 is the coding sequence of T7 RNA polymerase.
16. The genetically engineered strain of claim 15, wherein the KlEXN53 gene in the genetically engineered strain is knocked out and the second nucleic acid construct is integrated at the knocked out KlEXN53 gene position.
17. The genetically engineered strain of claim 15, wherein the nucleotide sequence set forth in SEQ ID No.5 is knocked out and the second nucleic acid construct is integrated into the knocked-out KlEXN53 gene location.
18. The genetically engineered strain of claim 15, wherein the nucleotide sequence of the second nucleic acid construct is SEQ ID No.4.
19. The genetically engineered strain of claim 1, wherein "reduced" means that the expression or activity of the KlEXN53 gene satisfies the following condition: the ratio A1/A0 is less than or equal to 30%;
wherein A1 is the expression or activity of the KlEXN53 gene in the genetically engineered strain, and A0 is the expression or activity of the KlEXN53 gene in the wild-type strain.
20. The genetically engineered strain of claim 19, wherein "reduced" means that the expression or activity of the KlEXN53 gene satisfies the following condition: the ratio A1/A0 is less than or equal to 10%.
21. The genetically engineered strain of claim 19, wherein "reduced" means that the expression or activity of the KlEXN53 gene satisfies the following condition: the ratio A1/A0 is less than or equal to 5%.
22. The genetically engineered strain of claim 19, wherein "reduced" means that the expression or activity of the KlEXN53 gene satisfies the following condition: the ratio A1/A0 is less than or equal to 2%.
23. The genetically engineered strain of claim 19, wherein "reduced" means that the expression or activity of the KlEXN53 gene satisfies the following condition: the ratio A1/A0 is from 0 to 2%.
24. A genetically engineered strain for in vitro cell-free protein synthesis, wherein a genome of said genetically engineered strain incorporates a first exogenous gene expression cassette expressing a first nucleic acid construct of a first fusion protein; the first fusion protein has the structure of formula Ia or formula Ib:
S-A-B-C(Ia)
S-C-B-A(Ib);
wherein:
A is a Pab1 element;
B is absent or a linker peptide;
C is an eIF4G element;
S is an optional signal peptide; each "-" is a peptide bond;
the genetically engineered strain is Kluyveromyces lactis, and the expression or activity of the KlEXN53 gene is reduced in the genetically engineered strain; the nucleotide sequence of the KlEXN53 gene is SEQ ID NO.5;
A is the amino acid sequence shown in SEQ ID NO.1 or an active fragment thereof,
and C is the amino acid sequence shown in SEQ ID NO.2 or an active fragment thereof.
25. The genetically engineered strain of claim 24, wherein B is 0-50 amino acids in length.
26. The genetically engineered strain of claim 24, wherein B is 10-40 amino acids in length.
27. The genetically engineered strain of claim 24, wherein the amino acid sequence of B is Gly Ser Gly Ser Thr Gln Asp Glu Val Gln Gly Pro His Ala Gly Lys Ser Thr Val Gly Ser.
28. The genetically engineered strain of claim 24, wherein B is 15-25 amino acids in length.
29. The genetically engineered strain of claim 24, wherein the first fusion protein is a polypeptide having at least 80% homology with the amino acid sequence of SEQ ID No.3, and the polypeptide has a function or activity of increasing the expression efficiency of a foreign protein.
30. The genetically engineered strain of claim 24, wherein the first fusion protein is a polypeptide having a homology of 90% or more with the amino acid sequence of SEQ ID No.3, and the polypeptide has a function or activity of increasing the expression efficiency of the foreign protein.
31. The genetically engineered strain of claim 24, wherein the first fusion protein is a polypeptide having homology of 95% or more with the amino acid sequence of SEQ ID No.3, and the polypeptide has a function or activity of increasing the expression efficiency of the foreign protein.
32. The genetically engineered strain of claim 24, wherein the first fusion protein is a polypeptide having a homology of 97% or more with the amino acid sequence of SEQ ID No.3, and the polypeptide has a function or activity of increasing the expression efficiency of the foreign protein.
33. The genetically engineered strain of claim 24, wherein the first fusion protein is a polypeptide having a homology of 98% or more with the amino acid sequence of SEQ ID No.3, and the polypeptide has a function or activity of increasing the expression efficiency of a foreign protein.
34. The genetically engineered strain of claim 24, wherein the first fusion protein is a polypeptide having a homology of 99% or more with the amino acid sequence of SEQ ID No.3, and the polypeptide has a function or activity of increasing the expression efficiency of a foreign protein.
35. The genetically engineered strain of claim 24, further comprising, integrated into its genome, a second exogenous gene expression cassette of a second nucleic acid construct having the structure of formula II from 5' to 3':
Z1-Z2(II)
wherein:
Z1 and Z2 are each an element of the construct;
each "-" is independently a bond or a nucleotide linking sequence;
Z1 is a promoter element, which is ScRNR2;
Z2 is the coding sequence of T7 RNA polymerase.
36. The genetically engineered strain of claim 35, wherein the nucleotide sequence of the second nucleic acid construct is SEQ ID No.4.
37. The genetically engineered strain of claim 35, wherein the KlEXN53 gene in the genetically engineered strain is knocked out and the second nucleic acid construct is integrated at the knocked out KlEXN53 gene position.
38. Use of the genetically engineered strain of any one of claims 1 to 37 to increase the efficiency of protein synthesis in vitro.
39. A cell-free cell extract derived from the genetically engineered strain of any one of claims 1 to 37;
the cell extract comprises elements of energy regeneration and protein synthesis;
the cell extract comprises the first fusion protein of claim 1 or 24.
40. An in vitro cell-free protein synthesis system, comprising the cell extract of claim 39.
41. The in vitro cell-free protein synthesis system of claim 40, further comprising one or more components selected from the group consisting of:
(b) Polyethylene glycol;
(c) Optionally exogenous sucrose; and
(d) Optionally a solvent, which is water or an aqueous solvent.
42. The in vitro cell-free protein synthesis system of claim 40, wherein the in vitro cell-free protein synthesis system further comprises one or more components selected from the group consisting of:
(e1) A substrate for synthesizing RNA;
(e2) A substrate for synthesizing a protein;
(e3) Magnesium ions;
(e4) Potassium ions;
(e5) A buffering agent;
(e6) An RNA polymerase; and
(e7) An energy regeneration system.
43. The in vitro cell-free protein synthesis system of claim 42, wherein the energy regeneration system is a phosphocreatine/phosphocreatine kinase system.
44. The in vitro cell-free protein synthesis system of claim 40, wherein the in vitro cell-free protein synthesis system further comprises one or more components selected from the group consisting of:
(e8) Heme; and
(e9) Spermidine.
45. A method for synthesizing a protein in vitro, comprising the steps of:
(i) providing the in vitro cell-free protein synthesis system of any one of claims 40 to 44 and adding an exogenous DNA molecule for directing protein synthesis;
(ii) incubating the in vitro cell-free protein synthesis system of step (i) under suitable conditions for a period of time T1, thereby synthesizing the protein encoded by the exogenous DNA.
46. The method for in vitro synthesizing a protein according to claim 45 further comprising: (iii) Optionally isolating or detecting said protein encoded by the exogenous DNA from said in vitro cell-free protein synthesis system.
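
The numeric limits recited in the claims can be checked mechanically. The following Python sketch is illustrative only and not part of the claims. It converts the linker of claim 27 to one-letter code, tests its length against the ranges of claims 25, 26 and 28, and reports the tightest A1/A0 threshold from claims 19 to 22 that a given ratio satisfies. All function names are hypothetical, and A1 and A0 are assumed to be measured in the same units.

```python
# Illustrative checks of numeric limits recited in the claims.
# Function names are hypothetical; nothing here is prescribed by the patent.

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Linker B as recited in claim 27 (three-letter codes).
CLAIM_27_LINKER = (
    "Gly Ser Gly Ser Thr Gln Asp Glu Val Gln Gly Pro His Ala "
    "Gly Lys Ser Thr Val Gly Ser"
)


def to_one_letter(three_letter: str) -> str:
    """Convert space-separated three-letter residues to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split())


def linker_length_ranges(linker: str) -> dict:
    """Report which claimed length ranges (in amino acids) the linker meets."""
    n = len(linker)
    return {"0-50": 0 <= n <= 50, "10-40": 10 <= n <= 40, "15-25": 15 <= n <= 25}


def tightest_reduction_threshold(a1: float, a0: float):
    """Return the smallest claimed A1/A0 limit (as a fraction) that the
    ratio satisfies, or None if the ratio exceeds 30%."""
    if a0 <= 0:
        raise ValueError("A0 (wild-type expression or activity) must be positive")
    ratio = a1 / a0
    for limit in (0.02, 0.05, 0.10, 0.30):  # claims 22, 21, 20, 19
        if ratio <= limit:
            return limit
    return None


if __name__ == "__main__":
    linker = to_one_letter(CLAIM_27_LINKER)
    print(linker, len(linker), linker_length_ranges(linker))
    print(tightest_reduction_threshold(1.0, 100.0))
```

A complete knockout, as in claims 16 and 37, corresponds to A1 = 0, which satisfies every threshold.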
CN202010673244.1A 2018-01-31 2018-01-31 Method for improving in vitro protein synthesis efficiency Active CN111778169B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010673244.1A CN111778169B (en) 2018-01-31 2018-01-31 Method for improving in vitro protein synthesis efficiency

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010673244.1A CN111778169B (en) 2018-01-31 2018-01-31 Method for improving in vitro protein synthesis efficiency
CN201810093624.0A CN110093284B (en) 2018-01-31 2018-01-31 Method for improving protein synthesis efficiency in cell

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201810093624.0A Division CN110093284B (en) 2018-01-31 2018-01-31 Method for improving protein synthesis efficiency in cell

Publications (2)

Publication Number Publication Date
CN111778169A CN111778169A (en) 2020-10-16
CN111778169B true CN111778169B (en) 2023-02-10

Family

ID=67441924

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202010673244.1A Active CN111778169B (en) 2018-01-31 2018-01-31 Method for improving in vitro protein synthesis efficiency
CN201810093624.0A Active CN110093284B (en) 2018-01-31 2018-01-31 Method for improving protein synthesis efficiency in cell

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201810093624.0A Active CN110093284B (en) 2018-01-31 2018-01-31 Method for improving protein synthesis efficiency in cell

Country Status (1)

Country Link
CN (2) CN111778169B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111778169B (en) * 2018-01-31 2023-02-10 康码(上海)生物科技有限公司 Method for improving in vitro protein synthesis efficiency
CN111484998B (en) 2019-05-30 2023-04-21 康码(上海)生物科技有限公司 Method for in vitro quantitative co-expression of multiple proteins and application thereof
CN112111042B (en) 2019-06-21 2024-04-05 康码(上海)生物科技有限公司 Biological magnetic microsphere and preparation method and application method thereof
WO2021104435A1 (en) 2019-11-30 2021-06-03 康码(上海)生物科技有限公司 Biomagnetic microsphere and preparation method therefor and use thereof
CN115109792A (en) * 2022-06-22 2022-09-27 清华大学 Cell-free reaction system based on Escherichia coli and application thereof

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108949801A (en) * 2017-11-24 2018-12-07 康码(上海)生物科技有限公司 Method for regulating in vitro biosynthesis activity by knocking out a nuclease system

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2329513A1 (en) * 1998-05-26 1999-12-02 University Of Medicine And Dentistry Of New Jersey System for reproducing and modulating stability and turnover of rna molecules
US20110004955A1 (en) * 2008-01-30 2011-01-06 Monsanto Technology Llc Transgenic plants with enhanced agronomic traits
CN102459613A (en) * 2009-04-29 2012-05-16 巴斯夫植物科学有限公司 Plants having enhanced yield-related traits and a method for making the same
WO2012019630A1 (en) * 2010-08-13 2012-02-16 Curevac Gmbh Nucleic acid comprising or coding for a histone stem-loop and a poly(a) sequence or a polyadenylation signal for increasing the expression of an encoded protein
CN103060334B (en) * 2013-01-28 2014-05-07 西南大学 Silkworm PABP (polyadenylate-binding protein) interacting factor gene BmPaip1 as well as recombinant expression vector and application thereof
CN106978349B (en) * 2016-09-30 2018-06-22 康码(上海)生物科技有限公司 Kit for in vitro protein synthesis and preparation method thereof
CN106701607B (en) * 2017-01-11 2020-05-08 浙江科技学院 Method for realizing high-accuracy fixed-point gene knockout in saccharomycetes
CN111778169B (en) * 2018-01-31 2023-02-10 康码(上海)生物科技有限公司 Method for improving in vitro protein synthesis efficiency
CN110845622B (en) * 2018-08-21 2021-10-26 康码(上海)生物科技有限公司 Preparation of fusion protein with deletion of different structural domains and application of fusion protein in improvement of protein synthesis

Also Published As

Publication number Publication date
CN110093284B (en) 2020-08-18
CN110093284A (en) 2019-08-06
CN111778169A (en) 2020-10-16

Similar Documents

Publication Publication Date Title
CN111778169B (en) Method for improving in vitro protein synthesis efficiency
JP7246100B2 (en) Preparation of Novel Fusion Proteins and Their Use in Improving Protein Synthesis
KR102345759B1 (en) Methods for modulating biosynthetic activity in vitro by knock-out of nuclease systems
CN110408635B (en) Application of nucleic acid construct containing streptavidin element in protein expression and purification
CN109423496B (en) Nucleic acid construct for endogenously expressing RNA polymerase in cells
WO2018161374A1 (en) Protein synthesis system for protein synthesis in vitro, kit and preparation method thereof
CN110845622B (en) Preparation of fusion protein with deletion of different structural domains and application of fusion protein in improvement of protein synthesis
CN113481226B (en) Signal peptide related sequence and application thereof in protein synthesis
CN110408636B (en) DNA sequence with multiple labels connected in series and application thereof in protein expression and purification system
CN110938649A (en) Protein synthesis system for improving expression quantity of foreign protein and application method thereof
CN110551745A (en) Multiple histidine sequence tag and application thereof in protein expression and purification
WO2019100431A1 (en) Tandem dna element capable of enhancing protein synthesis efficiency
CN112342248A (en) Method for changing in vitro protein synthesis capacity through gene knockout and application thereof
CN111118065A (en) Gene modification method of eukaryote, corresponding gene engineering cell and application thereof
WO2024051855A1 (en) Nucleic acid construct and use thereof in ivtt system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant