CN112048485B - Engineered transaminase polypeptide for preparing sitagliptin - Google Patents

Engineered transaminase polypeptide for preparing sitagliptin

Info

Publication number
CN112048485B
Authority
CN
China
Prior art keywords
gly
polypeptide
leu
asp
val
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910493876.7A
Other languages
Chinese (zh)
Other versions
CN112048485A (en)
Inventor
陈海滨
罗霄
蔡宝琴
尚传洋
孙磊
马可·博科拉
胡东晓
王子坤
俞嘉薇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Enzymaster Ningbo Bio Engineering Co Ltd
Original Assignee
Enzymaster Ningbo Bio Engineering Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Enzymaster Ningbo Bio Engineering Co Ltd
Priority to CN201910493876.7A
Publication of CN112048485A
Application granted
Publication of CN112048485B
Legal status: Active

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10: Transferases (2.)
    • C12N9/1096: Transferases (2.) transferring nitrogenous groups (2.6)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/70: Vectors or expression systems specially adapted for E. coli
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P17/00: Preparation of heterocyclic carbon compounds with only O, N, S, Se or Te as ring hetero atoms
    • C12P17/18: Preparation of heterocyclic carbon compounds with only O, N, S, Se or Te as ring hetero atoms containing at least two hetero rings condensed among themselves or condensed with a common carbocyclic ring system, e.g. rifamycin
    • C12P17/182: Heterocyclic compounds containing nitrogen atoms as the only ring heteroatoms in the condensed system
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y: ENZYMES
    • C12Y206/00: Transferases transferring nitrogenous groups (2.6)
    • C12Y206/01: Transaminases (2.6.1)

Abstract

The present invention provides engineered transaminase polypeptides useful for the synthesis of sitagliptin intermediates under industrially relevant conditions. The invention also provides polynucleotides encoding the engineered transaminase polypeptides, host cells capable of expressing the engineered transaminase polypeptides, and methods of using the engineered transaminase polypeptides to prepare sitagliptin intermediates. The engineered transaminase polypeptides of the invention were developed from a wild-type transaminase from Aspergillus fumigatus by directed evolution. Compared with other transaminases used to prepare sitagliptin intermediates, the engineered transaminase polypeptides have better activity and/or stability, which simplifies both the preparation of the enzyme and the operation of the transamination reaction and gives them good prospects for industrial application.

Description

Engineered transaminase polypeptide for preparing sitagliptin
Technical Field
The invention belongs to the technical field of biological engineering, and particularly relates to an engineered transaminase polypeptide for preparing sitagliptin.
Background
Diabetes mellitus is a chronic, progressive systemic disease caused by metabolic disorder, and its main pathogenic factor is an absolute or relative lack of insulin. Diabetes also causes various complications, such as chronic damage to and dysfunction of various tissues, particularly the eyes, kidneys, heart, blood vessels and nerves. There are four major types of diabetes: type I diabetes, type II diabetes, gestational diabetes, and other special types of diabetes (including secondary diabetes). Among them, type II diabetes (the non-insulin-dependent type) accounts for more than 90% of cases and is most often seen in middle-aged and elderly people over 30 years old. In these patients insulin secretion is not low and may even be high, but the body is insensitive to insulin, so the patient is relatively deficient in insulin; insulin secretion can be stimulated by certain oral drugs. Apart from insulin injection, the main clinical treatment for type II diabetes is oral administration of synthetic hypoglycemic drugs. Current clinical drugs mainly include sulfonylureas (insulin secretagogues), biguanides (insulin sensitivity enhancers), thiazolidinediones (insulin sensitizers), alpha-glucosidase inhibitors and non-sulfonylurea secretagogues. Although these drugs lower blood glucose effectively, they all have certain side effects. Sitagliptin phosphate, a novel dipeptidyl peptidase-4 (DPP-4) inhibitor, has a moderate hypoglycemic effect, lowers blood glucose significantly when used alone or in combination with metformin or pioglitazone, and is safe, well tolerated and associated with few adverse reactions.
Sitagliptin phosphate was developed by Merck (United States) and was approved by the US FDA in October 2006 for the treatment of type II diabetes under the trade name Januvia. The key step in sitagliptin phosphate synthesis is the construction of a chiral amino intermediate. Most of the published synthetic methods require a catalyst and fall into two categories: metal-catalyzed synthesis and enzyme-catalyzed synthesis. For metal-catalyzed synthesis, published patents include WO2004087650, WO2005003135 and WO2007050485. The metal catalysts used include ruthenium, platinum, palladium or rhodium, which are expensive and difficult to remove or recover from the product; in addition, the chiral purity of the chiral amino intermediate produced with a metal catalyst is not high enough, so further steps such as recrystallization are needed to meet pharmacopoeia requirements. For enzyme-catalyzed synthesis, published work includes "Biocatalytic Asymmetric Synthesis of Chiral Amines from Ketones Applied to Sitagliptin Manufacture", Science, 2010, vol. 329 (issue 5989), pp. 305-309, and patents including WO2010099501, CN105018440 and CN107384887. The enzyme catalysts disclosed in these documents for the synthesis of chiral amino intermediates or sitagliptin are all transaminases, specifically engineered transaminases obtained from different wild-type transaminases by artificial mutation and screening or by directed evolution.
Published data show that, because of the particular chemical structure of the sitagliptin chiral amino intermediate, naturally occurring wild-type transaminases either cannot catalyze its synthesis or do so with very low activity, and therefore cannot be used for industrial production. Engineered transaminases obtained by directed evolution, once they have acquired activity for synthesizing the chiral amino intermediate, still need improved stability and improved tolerance to the reaction solvent (the substrate is poorly soluble in water, so an organic solvent must be added to the enzymatic reaction) in order to meet the cost and environmental requirements of industrial production. Compared with metal catalysis, enzymatic synthesis of the chiral amino intermediate is highly stereoselective: the ee value of the transamination product can reach 99% or more (directly meeting pharmacopoeia requirements), and the reaction conditions are mild. Current industrial production therefore uses artificially modified transaminases to catalyze the synthesis of sitagliptin or its intermediates.
The transaminases disclosed so far for the synthesis of sitagliptin intermediates still have a number of deficiencies. For example, the engineered transaminase developed by Codexis (wild type from Arthrobacter sp.) and reported in WO2010099501 tolerates the solvent DMSO well but tolerates alcoholic solvents poorly. DMSO is used in the enzyme-catalyzed reaction and, because of its high boiling point, is difficult to remove from the reaction system, so product losses during purification are large and the cost is high. In CN105018440, Nanjing Boyou Kangyuan Biological Medicine Science and Technology Limited modified a transaminase from Mycobacterium vanbaalenii PYR-1 to obtain an engineered transaminase that is used to synthesize a relatively simple sitagliptin intermediate, (R)-3-amino-4-(2,4,5-trifluorophenyl)butyric acid methyl ester. Although the engineered transaminase disclosed in CN105018440 tolerates alcohol solvents such as ethanol well (the solvent used is 50% ethanol, which helps product purification), its catalytic activity is not high and 10 g/L of enzyme protein must be added. In addition, the transamination product must be Boc-protected, converted into Boc-protected sitagliptin and then deprotected to give sitagliptin, so the overall yield is low and the cost is high. In CN107384887, Onobu et al. screened and engineered a transaminase derived from Burkholderia gladioli. Although the engineered transaminase disclosed in CN107384887 can use sitagliptin precursor ketone directly as a substrate, its activity is not high enough: 50 g/L of wet cells is needed to reach a high conversion, and DMSO is still used as the solvent in the reaction system. The high concentration of wet cells makes the composition of the reaction system extremely complex, which is very unfavorable for post-reaction work-up and product extraction, and the need to remove DMSO also limits the isolated yield of the product.
Disclosure of Invention
1 Overview
The technical problem to be solved by the invention is to overcome the deficiencies of the prior-art engineered transaminases for synthesizing sitagliptin or its chiral amino intermediate, and to provide an engineered transaminase that uses sitagliptin precursor ketone directly as a substrate and has better activity, better tolerance to alcohol solvents and better thermal stability. The engineered transaminase provided by the invention can be purified simply and rapidly from the lysate of the engineered bacteria by heat treatment, and the purified enzyme solution can be used directly in a reaction system with methanol as the solvent to catalyze the synthesis of sitagliptin. The ee value of the product is greater than or equal to 99.7%, which directly meets pharmacopoeia requirements, so that an enzymatic synthesis process for sitagliptin with lower cost, simpler operation and less environmental impact can be realized.
The invention provides an engineered transaminase polypeptide with high catalytic activity, strong stereoselectivity, good solvent tolerance and good thermal stability, which can convert 3-carbonyl-1-[3-(trifluoromethyl)-5,6,7,8-tetrahydro-1,2,4-triazolo[4,3-a]pyrazin-7-yl]-4-(2,4,5-trifluorophenyl)butan-1-one ("sitagliptin precursor ketone" or the compound of formula A-1) into (3R)-3-amino-1-[3-(trifluoromethyl)-5,6,7,8-tetrahydro-1,2,4-triazolo[4,3-a]pyrazin-7-yl]-4-(2,4,5-trifluorophenyl)butan-1-one ("sitagliptin" or the compound of formula A-2). This reaction is shown in Figure 1. The present invention provides an integrated enzymatic-chemical synthesis process for the synthesis of sitagliptin phosphate. The invention also provides a gene encoding the engineered transaminase polypeptide, a recombinant expression vector containing the gene, an engineered bacterium expressing the engineered transaminase polypeptide, a high-density fermentation method for the engineered bacterium, and the use of the engineered transaminase polypeptide in catalyzing the asymmetric transamination of A-1 to synthesize A-2.
The engineered transaminase polypeptides provided by the invention include those that have the activity shown in Figure 1 and whose amino acid sequences differ from the sequence of SEQ ID NO: 2 at one or more of the following residue positions: X3, X4, X9, X15, X20, X27, X36, X40, X41, X53, X60, X61, X62, X76, X78, X92, X93, X97, X98, X101, X113, X114, X115, X116, X122, X126, X128, X130, X137, X148, X153, X155, X159, X186, X187, X188, X190, X194, X209, X214, X226, X235, X244, X253, X262, X263, X266, X271, X273, X275, X279, X287, X290, X312, X323. The engineered transaminase polypeptides provided by the present invention include amino acid sequences that include at least one of the following features: S3, M4, S9, Q15, R20, K27, L36, S40, D41, H53, V60, I61, S62, Q76, I78, A92, L93, K97, N98, A101, F113, V114, E115, V116, L122, R126, S128, P130, N137, V148, N153, L155, E159, L186, T187, K188, L190, M194, N209, S214, I226, R235, D244, I253, Q262, A263, S266, M271, T273, A275, M279, Q287, N290, P312, G323; or, in addition to these differences, an insertion or deletion of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18, 20, 21, 22, 23, 24, 25 or more amino acid residues.
More specifically, in some embodiments, the engineered transaminase polypeptides derived from SEQ ID NO: 2 comprise an amino acid sequence corresponding to SEQ ID NO: 4, 6, 8, 10 or 12.
In some embodiments, the improved engineered transaminase polypeptides comprise an amino acid sequence that is identical to a reference sequence of SEQ ID NO: 4, 6, 8, 10 or 12, or that has at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the reference sequence.
The identity between two amino acid sequences or between two nucleotide sequences can be obtained by algorithms commonly used in the art, for example by calculation using the NCBI Blastp and Blastn software with default parameters, or by using the Clustal W algorithm (Nucleic Acids Research, 22(22): 4673-. For example, using the Clustal W algorithm, the amino acid sequence identity between SEQ ID NO: 2 and SEQ ID NO: 12 is 83.3%.
In another aspect, the invention provides polynucleotide sequences encoding engineered transaminase polypeptides. In some embodiments, the polynucleotide may be part of an expression vector having one or more control sequences for expression of the engineered transaminase polypeptide. In some embodiments, the polynucleotide may comprise a nucleotide sequence corresponding to SEQ ID NO: 3, 5, 7, 9 or 11.
As known to those skilled in the art, because of the degeneracy of the nucleotide codons, the nucleotide sequences encoding SEQ ID NO: 4, 6, 8, 10 and 12 are not limited to the nucleotide sequences of SEQ ID NO: 3, 5, 7, 9 and 11. The nucleic acid sequence encoding the engineered transaminase of the present invention can also be any other nucleic acid sequence encoding the amino acid sequence set forth in SEQ ID NO: 4, 6, 8, 10 or 12.
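The scale of codon degeneracy can be illustrated with a short calculation. The following is a minimal sketch, assuming the standard genetic code; the function name and the four-residue test peptide are hypothetical and are not taken from the sequence listing.

```python
# Number of codons encoding each amino acid in the standard genetic code
# (an assumption: other translation tables differ slightly).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_synonymous_codings(peptide: str) -> int:
    """Return how many distinct coding sequences encode the given peptide.

    The count ignores the stop codon and any start-codon constraints.
    """
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNTS[residue]
    return total


if __name__ == "__main__":
    # Hypothetical four-residue peptide, used only for illustration.
    print(count_synonymous_codings("GLDV"))
```

Because the count is a product over all residues, even a polypeptide of a few hundred residues is encoded by an astronomically large number of distinct nucleotide sequences, which is why the coding sequences are not limited to those listed.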
In another aspect, the disclosure provides expression vectors and host cells that comprise a polynucleotide encoding an engineered transaminase or that are capable of expressing an engineered transaminase. In some embodiments, the host cell may be a bacterial host cell, such as E. coli. The host cell can be used to express and isolate the engineered transaminase described herein, or alternatively can be used directly to convert the substrate to the product.
In some embodiments, the engineered transaminase in the form of intact cells, crude extracts, isolated polypeptides, or purified polypeptides can be used alone or in an immobilized form (e.g., immobilized on a resin).
The present disclosure also provides a method of converting the ketone substrate of structural formula A-1 to the chiral amine compound of structural formula A-2 using an engineered transaminase polypeptide disclosed herein, in which the chiral amine product of structural formula A-2 is produced in excess over the corresponding enantiomer. The method comprises contacting the ketone substrate of structural formula A-1 and an amino donor with the transaminase polypeptide under reaction conditions suitable to convert A-1 to A-2, wherein the transaminase polypeptide is the engineered transaminase polypeptide described herein. In some embodiments, the engineered transaminase polypeptide has at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 2 and is capable of converting the ketone substrate of structural formula A-1 to the amine product of structural formula A-2.
In some embodiments, the amine product of structural formula A-2 is produced in at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or greater enantiomeric excess.
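Enantiomeric excess follows the usual definition ee = (R - S) / (R + S) x 100%, where R and S are the amounts of the desired and undesired enantiomers. The sketch below is illustrative only; the function name and the example amounts are assumptions, and the amounts may be in any common unit such as HPLC peak area.

```python
def enantiomeric_excess(r_amount: float, s_amount: float) -> float:
    """Return the percent enantiomeric excess of the (R)-enantiomer.

    r_amount and s_amount are the measured amounts of the (R)- and
    (S)-enantiomers in the same unit (for example, chiral HPLC peak areas).
    """
    total = r_amount + s_amount
    if total <= 0:
        raise ValueError("total amount must be positive")
    return (r_amount - s_amount) / total * 100.0


if __name__ == "__main__":
    # Hypothetical peak areas: 99.85% (R) and 0.15% (S).
    print(round(enantiomeric_excess(99.85, 0.15), 2))
```

Under this definition, a product with an ee of 99.7% contains 99.85% of the desired enantiomer and 0.15% of the other.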
Specific embodiments of engineered transaminase polypeptides for use in the methods are further provided in the detailed description. An improved engineered transaminase polypeptide useful in the above methods can include a sequence selected from the amino acid sequences corresponding to SEQ ID NO: 4, 6, 8, 10 and 12.
Any of the methods for making the compound of formula A-2 using engineered polypeptides as disclosed herein can be performed under a range of suitable reaction conditions including, but not limited to, ranges of amino donor, pH, temperature, buffer, solvent system, substrate loading, polypeptide loading, cofactor loading, pressure, and reaction time. For example, in some embodiments, the compound of formula A-2 may be prepared under suitable reaction conditions that include: (a) a loading of substrate compound A-1 of about 1 g/L to 200 g/L; (b) about 0.1 g/L to 50 g/L of the engineered polypeptide; (c) an amino donor loading of about 10 g/L to 300 g/L; (d) a PLP cofactor concentration of about 0.1 mM to 5 mM; (e) from about 0% (v/v) to about 60% (v/v) of an organic solvent, where the organic solvent includes, but is not limited to, dimethyl sulfoxide (DMSO), dimethylformamide (DMF), methyl tert-butyl ether (MTBE), isopropyl acetate, methanol, ethanol, or propanol; (f) a pH of about 7.0 to about 11.0; and (g) a temperature of about 10 ℃ to 60 ℃.
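Conditions (a) to (g) can be captured as a small range table for checking a planned reaction. This is a sketch only: the key names are hypothetical, the bounds are treated as hard limits even though the text says "about", and the example values are illustrative rather than taken from any example of the disclosure.

```python
# (low, high) bounds copied from conditions (a)-(g) in the text.
# Key names are hypothetical labels chosen for this sketch.
RANGES = {
    "substrate_g_per_l": (1.0, 200.0),
    "enzyme_g_per_l": (0.1, 50.0),
    "amino_donor_g_per_l": (10.0, 300.0),
    "plp_mm": (0.1, 5.0),
    "organic_solvent_pct_v_v": (0.0, 60.0),
    "ph": (7.0, 11.0),
    "temperature_c": (10.0, 60.0),
}


def out_of_range(conditions: dict) -> list:
    """Return the names of the conditions that fall outside the stated ranges."""
    problems = []
    for name, (low, high) in RANGES.items():
        value = conditions[name]
        if not (low <= value <= high):
            problems.append(name)
    return problems


if __name__ == "__main__":
    # Illustrative values only.
    planned = {
        "substrate_g_per_l": 50.0,
        "enzyme_g_per_l": 2.0,
        "amino_donor_g_per_l": 60.0,
        "plp_mm": 1.0,
        "organic_solvent_pct_v_v": 40.0,
        "ph": 8.5,
        "temperature_c": 45.0,
    }
    print(out_of_range(planned))
```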
2 Detailed Description of the Invention
2.1 Definitions
In relation to the present disclosure, unless otherwise explicitly defined, technical and scientific terms used in the description herein have the meaning commonly understood by one of ordinary skill in the art.
"Protein," "polypeptide," and "peptide" are used interchangeably herein to refer to a polymer of at least two amino acids covalently linked by amide bonds, regardless of length or post-translational modification (e.g., glycosylation, phosphorylation, lipidation, myristoylation, ubiquitination, etc.). This definition includes both D-amino acids and L-amino acids, as well as mixtures of D-amino acids and L-amino acids.
"Engineered transaminase," "engineered transaminase polypeptide," "improved transaminase polypeptide," and "engineered polypeptide" are used interchangeably herein.
"Cell" or "wet cell" refers to a host cell expressing a polypeptide or an engineered polypeptide, including wet cells obtained by the production processes described in Examples 3 and 7.
"Polynucleotide" and "nucleic acid" are used interchangeably herein.
As used herein, "cofactor" refers to a non-protein compound that acts in conjunction with an enzyme in a catalytic reaction. As used herein, "cofactor" is intended to include the vitamin B6 family compounds pyridoxal phosphate (pyridoxal-5'-phosphate, or PLP), pyridoxine (pyridoxol, or PN), pyridoxal (PL), pyridoxamine (PM), pyridoxine phosphate (PNP), and pyridoxamine phosphate (PMP), which are sometimes also referred to as coenzymes.
"PLP", "pyridoxal phosphate", "pyridoxal 5' -phosphate", "PYP" and "P5P" are used interchangeably herein to refer to compounds that are used as cofactors in an enzyme-catalyzed reaction.
"Coding sequence" refers to the nucleic acid portion (e.g., a gene) that encodes the amino acid sequence of a protein.
"Naturally occurring" or "wild type" refers to the form found in nature. For example, a naturally occurring or wild-type polypeptide or polynucleotide sequence is a sequence that exists in an organism, can be isolated from a source in nature, and has not been intentionally modified by human manipulation.
"Recombinant," "engineered," or "non-naturally occurring," when used in reference to, for example, a cell, nucleic acid or polypeptide, refers to a material, or a material corresponding to the natural or native form of the material, that has been altered in a manner not found in nature, or that is identical to the natural form but is produced or obtained from synthetic materials and/or by manipulation using recombinant techniques.
"Sequence identity" is used herein to refer to a comparison between polynucleotides or polypeptides ("sequence identity" is typically expressed as a percentage) and is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may include additions or deletions (i.e., gaps) as compared to the reference sequence for optimal alignment of the two sequences. The percentage can be calculated as follows: determine the number of positions at which identical nucleic acid bases or amino acid residues occur in both sequences to yield the number of matched positions, divide the number of matched positions by the total number of positions in the comparison window, and multiply the result by 100 to yield the percentage of sequence identity. Alternatively, the percentage may be calculated as follows: determine the number of positions at which either the identical nucleic acid base or amino acid residue is present in both sequences or a nucleic acid base or amino acid residue is aligned with a gap to yield the number of matched positions, divide the number of matched positions by the total number of positions in the comparison window, and multiply the result by 100 to yield the percentage of sequence identity. One skilled in the art will recognize that there are many established algorithms that can be used to align two sequences.
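Both percentage calculations described above can be sketched for a pair of sequences that have already been aligned. The function name, the `-` gap character and the short test strings are assumptions made for illustration; producing the alignment itself is outside the scope of this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str,
                     count_gaps_as_match: bool = False) -> float:
    """Percent identity over a comparison window of two aligned sequences.

    Both strings must have the same length and use '-' for gaps.
    With count_gaps_as_match=False, only identical residues are matches.
    With count_gaps_as_match=True, positions where a residue is aligned
    with a gap are also counted as matches (the alternative calculation).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            if count_gaps_as_match:
                matches += 1
        elif a == b:
            matches += 1
    return matches / len(aligned_a) * 100.0


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    print(percent_identity("GL-DV", "GLADV"))
    print(percent_identity("GL-DV", "GLADV", count_gaps_as_match=True))
```

The two variants give different values whenever the window contains gaps, so a reported identity figure is only comparable when the calculation and the alignment method are stated.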
Optimal alignment of sequences for comparison can be achieved, for example, by the local homology algorithm of Smith and Waterman, 1981, Adv. Appl. Math. 2:482, by the homology alignment algorithm of Needleman and Wunsch, 1970, J. Mol. Biol. 48:443, by the similarity search method of Pearson and Lipman, 1988, Proc. Natl. Acad. Sci. USA 85:2444, by computer implementation of these algorithms (GAP, BESTFIT, FASTA and TFASTA in the GCG Wisconsin software package) or by visual inspection (see generally, Current Protocols in Molecular Biology, F.M. Ausubel et al., Current Protocols, Greene Publishing Associates Inc. and John Wiley & Sons, Inc. (1995 Supplement)). Examples of algorithms suitable for determining sequence identity and percent sequence similarity are the BLAST and BLAST 2.0 algorithms described in Altschul et al., 1990, J. Mol. Biol. 215: 403-. Software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information website. The algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word matches (word hits) serve as seeds for initiating searches to find longer HSPs containing them. Word hits are then extended in both directions along each sequence for as long as the cumulative alignment score can be increased. For nucleotide sequences, cumulative scores are calculated using the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score.
The extension of word hits in each direction is terminated when: (i) the cumulative alignment score falls by an amount X from its maximum achieved value; (ii) the cumulative score falls to 0 or below because of the accumulation of one or more negative-scoring residue alignments; or (iii) the end of either sequence is reached. The BLAST algorithm parameters W, T and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word length (W) of 11, an expectation (E) of 10, M = 5, N = -4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a word length (W) of 3, an expectation (E) of 10 and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, 1989, Proc Natl Acad Sci USA 89: 10915). An exemplary determination of sequence alignment and % sequence identity may use the BESTFIT or GAP program in the GCG Wisconsin software package (Accelrys, Madison WI), using the default parameters provided.
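The three termination rules for extending a word hit can be sketched as an ungapped extension in one direction for nucleotide sequences. This is a simplified illustration and not the BLAST implementation: the extension starts from a score of 0 rather than from the score of the seed word, the function name is hypothetical, and the default `x_drop` value of 20 is an arbitrary assumption; only the reward M = 5 and penalty N = -4 come from the text.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 reward: int = 5, penalty: int = -4, x_drop: int = 20):
    """Extend an ungapped alignment to the right from (q_start, s_start).

    Returns (best_score, best_length), where best_length is the number of
    aligned positions that gave the maximum cumulative score.
    Extension stops when the score has dropped by x_drop from its maximum,
    when the cumulative score is 0 or less, or at the end of either sequence.
    """
    score = 0
    best_score = 0
    best_length = 0
    offset = 0
    while q_start + offset < len(query) and s_start + offset < len(subject):
        if query[q_start + offset] == subject[s_start + offset]:
            score += reward
        else:
            score += penalty
        offset += 1
        if score > best_score:
            best_score = score
            best_length = offset
        if score <= 0 or best_score - score >= x_drop:
            break
    return best_score, best_length


if __name__ == "__main__":
    # Hypothetical sequences: four matches followed by mismatches.
    print(extend_right("ACGTTTTT", "ACGTAAAA", 0, 0, x_drop=8))
```

A larger X lets the extension cross longer runs of mismatches before giving up, which increases sensitivity at the cost of speed, consistent with the statement that W, T and X determine the sensitivity and speed of the alignment.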
"Reference sequence" refers to a defined sequence used as a basis for sequence comparison. The reference sequence may be a subset of a larger sequence, e.g., a fragment of a full-length gene or polypeptide sequence. In general, a reference sequence is at least 20 nucleotides or amino acid residues in length, at least 25 residues in length, at least 50 residues in length, or the full length of the nucleic acid or polypeptide. Because two polynucleotides or polypeptides may each (1) include a sequence that is similar between the two sequences (i.e., a portion of the complete sequence), and (2) further include a sequence that differs between the two sequences, sequence comparisons between two (or more) polynucleotides or polypeptides are typically performed by comparing the sequences of the two polynucleotides or polypeptides over a "comparison window" to identify and compare local regions of sequence similarity. In some embodiments, a "reference sequence" is not intended to be limited to a wild-type sequence, and may include engineered or altered sequences. For example, "a reference sequence based on SEQ ID NO: 2 having isoleucine at the residue corresponding to X101" refers to a reference sequence in which the corresponding residue at X101 in SEQ ID NO: 2 (which is alanine) has been changed to isoleucine.
"Comparison window" refers to a conceptual segment of at least about 20 contiguous nucleotide positions or amino acid residues, wherein a sequence can be compared to a reference sequence of at least 20 contiguous nucleotides or amino acids, and wherein the portion of the sequence in the comparison window can include additions or deletions (i.e., gaps) of 20% or less as compared to the reference sequence (which does not include additions or deletions) for optimal alignment of the two sequences. The comparison window may be longer than 20 contiguous residues, and optionally includes windows of 30, 40, 50, 100 or more residues.
In the context of numbering for a specified amino acid or polynucleotide sequence, "corresponding to," "referenced to," or "relative to" refers to the number of specified reference sequence residues when the specified amino acid or polynucleotide sequence is compared to the reference sequence. In other words, the residue number or residue position of a given sequence is specified in terms of the reference sequence, rather than the actual numerical position of the residue within a given amino acid or polynucleotide sequence. For example, a given amino acid sequence, such as the amino acid sequence of an engineered polypeptide, can be aligned with a reference sequence by introducing gaps in order to optimize residue matching between the two sequences. In these cases, residue numbering in a given amino acid or polynucleotide sequence is made relative to a reference sequence to which it has been aligned, although gaps exist.
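Numbering relative to a reference sequence can be sketched by walking a gapped pairwise alignment. The function name, the `-` gap character and the short aligned strings below are hypothetical and serve only to illustrate the convention.

```python
def reference_positions(aligned_ref: str, aligned_query: str) -> list:
    """Map each query residue to its 1-based position in the reference.

    Both strings come from one pairwise alignment and use '-' for gaps.
    The returned list has one entry per query residue, in order; the entry
    is None when the residue is an insertion with no reference counterpart.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    positions = []
    ref_pos = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if ref_res != "-":
            ref_pos += 1  # advance the reference numbering
        if query_res != "-":
            positions.append(ref_pos if ref_res != "-" else None)
    return positions


if __name__ == "__main__":
    # Hypothetical alignment: the query lacks reference residue 2 and has
    # one inserted residue between reference positions 4 and 5.
    print(reference_positions("MSAL-K", "M-ALGK"))
```

In this example the third residue of the query is numbered 4, because numbering follows the reference sequence and not the residue's own index in the query.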
"Amino acid difference" or "residue difference" refers to the difference in an amino acid residue at one position of a polypeptide sequence relative to the amino acid residue at the corresponding position in a reference sequence. The position of an amino acid difference is generally referred to herein as "Xn", where n refers to the corresponding position in the reference sequence on which the residue difference is based. For example, "a residue difference at position X101 compared to SEQ ID NO: 2" refers to a difference in the amino acid residue at the position corresponding to position 101 of the polypeptide of SEQ ID NO: 2. Thus, if the reference polypeptide of SEQ ID NO: 2 has an alanine at position 101, then "a residue difference at position X101 compared to SEQ ID NO: 2" refers to an amino acid substitution of any residue other than alanine at the position of the polypeptide corresponding to position 101 of SEQ ID NO: 2. In most examples herein, a particular amino acid residue difference at one position is denoted as "XnY", where "Xn" refers to the corresponding position as described above, and "Y" is the one-letter identifier of the amino acid found in the engineered polypeptide (i.e., a residue that is different from that in the reference polypeptide). In some examples (e.g., in Table 2), the disclosure also provides specific amino acid differences represented by the conventional notation "AnB," where A is the one-letter identifier of the residue in the reference sequence, "n" is the number of the residue position in the reference sequence, and B is the one-letter identifier of the residue substitution in the sequence of the engineered polypeptide. In some examples, a polypeptide of the disclosure can comprise one or more amino acid residue differences relative to a reference sequence, which are represented by a list of the specific positions at which the residue differences exist relative to the reference sequence.
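The "AnB" notation lends itself to a small parser. The sketch below applies such substitutions to a reference sequence and, in the other direction, lists the differences between two equal-length sequences; the function names and the four-residue reference are hypothetical and are not taken from the sequence listing.

```python
import re

# "AnB": reference residue A, 1-based position n, substituted residue B.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(reference: str, mutations: list) -> str:
    """Return the reference sequence with the given AnB substitutions applied."""
    residues = list(reference)
    for mutation in mutations:
        match = MUTATION.match(mutation)
        if match is None:
            raise ValueError(f"not in AnB form: {mutation!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range: {mutation!r}")
        if residues[position - 1] != old:
            raise ValueError(f"reference residue mismatch: {mutation!r}")
        residues[position - 1] = new
    return "".join(residues)


def residue_differences(reference: str, variant: str) -> list:
    """List substitutions between two equal-length sequences in AnB form."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [f"{ref}{i}{var}"
            for i, (ref, var) in enumerate(zip(reference, variant), start=1)
            if ref != var]


if __name__ == "__main__":
    # Hypothetical four-residue reference with alanine at position 3.
    print(apply_mutations("MSAL", ["A3I"]))
    print(residue_differences("MSAL", "MSIL"))
```

Checking the stated reference residue before substituting catches numbering errors early, which matters because positions are always given relative to the reference sequence.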
"deletion" refers to a modification of a polypeptide by the removal of one or more amino acids from a reference polypeptide. Deletions may include the removal of 1 or more, 2 or more amino acids, 5 or more amino acids, 10 or more amino acids, 15 or more amino acids, or 20 or more amino acids, up to 10% of the total number of amino acids comprising the reference enzyme, or up to 20% of the total number of amino acids comprising the reference enzyme, while retaining the activity of the engineered transaminase polypeptide for the reactions shown in fig. 1. Deletions may involve internal and/or terminal portions of the polypeptide. In various embodiments, a deletion may comprise a continuous segment or may be discontinuous.
"insertion" refers to a modification of a polypeptide by the addition of one or more amino acids to a reference polypeptide. In some embodiments, the engineered polypeptides disclosed herein include one or more amino acid insertions into the naturally occurring transaminase polypeptide, as well as into other engineered polypeptides. Insertions may be in the internal part of the polypeptide or at the carboxy or amino terminus. As used herein, insertions include fusion proteins known in the art. Insertions may be contiguous stretches of amino acids, or separated by one or more amino acids in a naturally occurring polypeptide.
As used herein, a "fragment" refers to a polypeptide having an amino-terminal and/or carboxy-terminal deletion, but in which the remaining amino acid sequence is identical to the corresponding position in the sequence. Fragments may be at least 10 amino acids long, at least 20 amino acids long, at least 50 amino acids long or longer, and up to 70%, 80%, 90%, 95%, 98%, and 99% of the full-length engineered polypeptide.
An "isolated polypeptide" or "purified polypeptide" refers to a polypeptide that is substantially separated from other materials with which it is naturally associated, such as proteins, lipids, and polynucleotides. The term includes polypeptides that have been removed or purified from their naturally occurring environment or expression system (e.g., in a host cell or in vitro synthesis). The engineered polypeptide may be present intracellularly, in cell culture media, or prepared in various forms, such as a lysate or an isolated preparation. As such, in some embodiments, the engineered polypeptide may be an isolated polypeptide.
"chiral center" refers to a carbon atom to which four different groups are attached.
"stereoselectivity" refers to the preferential formation of one stereoisomer over another stereoisomer or stereoisomers in a chemical or enzymatic reaction. Stereoselectivity can be partial, where the formation of one stereoisomer is favored over the other; or complete, where only one stereoisomer is formed. When the stereoisomers are enantiomers, the stereoselectivity is referred to as enantioselectivity, i.e., the fraction (usually reported as a percentage) of one enantiomer in the mixture of the two enantiomers. This fraction is often alternatively reported in the art as the "enantiomeric excess" (ee), calculated according to the following formula: { [ major enantiomer concentration ] - [ minor enantiomer concentration ] }/{ [ major enantiomer concentration ] + [ minor enantiomer concentration ] }.
"stereoisomers", "stereoisomeric forms" and similar expressions are used interchangeably herein and refer to all isomers of a molecule that differ only in the orientation of their atoms in space. These include enantiomers, as well as isomers of compounds that have more than one chiral center and that are not mirror images of each other (i.e., "diastereomers").
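As a worked illustration of the enantiomeric excess formula, the following minimal Python sketch computes ee from two measured amounts. The function name and the example values (e.g., chiral HPLC peak areas) are hypothetical.

```python
def enantiomeric_excess(major, minor):
    """Return ee as a fraction: (major - minor) / (major + minor).

    major and minor are amounts of the two enantiomers in the same unit,
    e.g. molar concentrations or chiral HPLC peak areas.
    """
    total = major + minor
    if major < 0 or minor < 0 or total <= 0:
        raise ValueError("amounts must be non-negative and not both zero")
    return (major - minor) / total


# Hypothetical measurement: 99.5 parts of one enantiomer, 0.5 parts of the other.
ee_percent = 100.0 * enantiomeric_excess(99.5, 0.5)
```

With these assumed values the mixture has an ee of about 99%; a racemic mixture (equal amounts) gives an ee of 0.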
"improved enzymatic properties" refers to an improvement in any enzymatic property exhibited by an engineered polypeptide as compared to a reference sequence, such as the wild-type transaminase of SEQ ID NO: 2. Desirable improved enzyme properties include, but are not limited to, enzyme activity (which may be expressed as a percentage of conversion of substrate), thermostability, solvent stability (e.g., stability against alcohol compounds), pH activity profile, cofactor requirements, tolerance to an inhibitor (e.g., substrate or product inhibition), stereospecificity, and stereoselectivity.
"conversion" refers to the enzymatic conversion of a substrate to the corresponding product. "percent conversion" or "conversion" refers to the percentage of substrate in a reaction system that is converted to product at a specified reaction time under specified reaction conditions. Thus, the "enzymatic activity" or "activity" of a transaminase or engineered polypeptide can be expressed as a "percent conversion" of substrate to product. The conversion is generally calculated by measuring the molar concentration of the product A-2 and the molar concentration of the substrate A-1 in the reaction system by sampling: { molar concentration of A-2 }/{ [ molar concentration of A-1 ] + [ molar concentration of A-2 ] }.
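The conversion formula above can be sketched in a few lines of Python. The function name and the sample concentrations are hypothetical; the calculation itself follows the ratio given in the definition.

```python
def percent_conversion(conc_a1, conc_a2):
    """Percent conversion = 100 * [A-2] / ([A-1] + [A-2]).

    conc_a1: molar concentration of remaining substrate A-1 in the sample
    conc_a2: molar concentration of product A-2 in the same sample
    """
    total = conc_a1 + conc_a2
    if conc_a1 < 0 or conc_a2 < 0 or total <= 0:
        raise ValueError("concentrations must be non-negative and not both zero")
    return 100.0 * conc_a2 / total


# Hypothetical sample: 5 mM substrate remaining, 45 mM product formed.
conversion = percent_conversion(5.0, 45.0)
```

Because the result is a ratio, any concentration unit can be used as long as both values share it.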
"thermostable" refers to an engineered polypeptide that maintains similar activity after exposure to elevated temperatures (e.g., 80 ℃ or greater) for a period of time (e.g., 0.5 hours or greater) as compared to the wild-type enzyme.
"solvent stable" or "solvent tolerant" means that the engineered polypeptide maintains similar activity compared to the wild-type enzyme after exposure to different concentrations (e.g., 5-99%) of a solvent (methanol, ethanol, isopropanol, dimethyl sulfoxide (DMSO), tetrahydrofuran, 2-methyltetrahydrofuran, acetone, toluene, butyl acetate, methyl tert-butyl ether, etc.) for a period of time (e.g., 0.5-24 hours).
"suitable reaction conditions" refer to those conditions (e.g., ranges of enzyme loading, substrate loading, loading of amino donor, cofactor loading, temperature, pH, buffer, co-solvent, etc.) in the biocatalytic reaction solution under which the engineered polypeptide of the present disclosure converts the substrate to the desired product compound. Exemplary "suitable reaction conditions" are provided in the present disclosure and are exemplified by the examples.
The compounds may be designated by their chemical structure and/or chemical name. When a chemical structure conflicts with a chemical name, the chemical structure determines the identity of the compound.
2.2 directed evolution process and the engineered transaminases developed
The invention discloses an engineered transaminase polypeptide obtained from a wild-type transaminase through an inventive directed evolution process involving the substitution, insertion or deletion of a number of amino acid residues. For an introduction to directed evolution, see "Directed Evolution: Bringing New Chemistry to Life", Frances H. Arnold, Angewandte Chemie, November 28, 2017. (Frances H. Arnold received the 2018 Nobel Prize in Chemistry for her pioneering contributions to directed evolution technology.) The wild-type transaminase is derived from Aspergillus fumigatus, and its amino acid sequence is set forth in SEQ ID NO: 2. The inventors found that the wild-type transaminase of SEQ ID NO: 2 has no activity on A-1, cannot catalyze the reaction shown in FIG. 1, and has poor methanol tolerance and poor thermal stability.
In order to develop an engineered transaminase with excellent performance for the reaction shown in fig. 1, the present invention designed and performed 5 stages of directed evolution development, as shown in table 1. The research and development emphasis of each stage is different, and different screening reaction conditions are adopted in a targeted manner; the optimal engineered transaminase polypeptides developed at each stage are shown in table 2.
TABLE 1 five stages of directed evolution
Figure BDA0002087915510000071
In directed evolution stage I, the aim was to develop, starting from SEQ ID NO: 2, a variant that is active in the reaction shown in FIG. 1. Based on the structure of SEQ ID NO: 2 (PDB ID: 4CHI), the inventors used bioinformatics techniques to design and construct site-saturation or multi-site combinatorial mutation libraries for multiple residues, and then screened these libraries under the screening reaction conditions shown in table 1 to obtain the first engineered transaminase polypeptide derived from SEQ ID NO: 2 that is active in the reaction shown in FIG. 1. The amino acid sequence of this engineered transaminase polypeptide is set forth in SEQ ID NO: 4. Compared to SEQ ID NO: 2, SEQ ID NO: 4 has 10 amino acid mutations: V60G, I61F, S62H, F113M, E115K, V148L, T187I, L190M, S214P and A275G; these amino acid mutations are distributed near the catalytically active center.
The mutation libraries may be constructed by site-directed mutagenesis PCR (as shown in example 2) or by multi-site mutagenesis PCR (see "Mutagenesis and Synthesis of Novel Recombinant Genes Using PCR," Chapter 32, in PCR Primer, 2nd edition (eds. Dieffenbach and Dveksler), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, USA, 2003).
In directed evolution stages II and III, the objective was to develop mutants with higher activity in the reaction shown in FIG. 1 while also improving the expression of the engineered transaminase polypeptides in E. coli hosts. The amino acid sequence of the best engineered transaminase obtained in stage II is set forth in SEQ ID NO: 6. Compared to SEQ ID NO: 4, SEQ ID NO: 6 has 8 additional amino acid mutations, which are mainly distributed near the catalytically active center.
The amino acid sequence of the best engineered transaminase obtained in stage III is set forth in SEQ ID NO: 8. Compared to SEQ ID NO: 6, SEQ ID NO: 8 has 18 additional amino acid mutations. These amino acid mutations are distributed primarily near the catalytically active center, and some are located at the N-terminus of the protein sequence (i.e., S3T and S9A); the latter are believed to enhance the expression of the engineered transaminase polypeptides in E. coli hosts.
In the screening reactions of stages I, II and III, DMSO was used as a co-solvent to help dissolve substrate A-1 (i.e., the sitagliptin precursor ketone). In general, DMSO is less harmful to enzymes than alcohol solvents such as methanol, so using DMSO in the screening reaction favors the selection of mutant enzymes with improved specific activity. By the end of stage III, the engineered transaminase of SEQ ID NO: 8 had significant activity on substrate A-1 and also showed appreciable conversion of A-1 in a screening reaction system with methanol as the solvent. Therefore, starting from stage IV, the screening reactions used methanol as the solvent in order to evolve engineered transaminases with good methanol tolerance. Because the reaction shown in FIG. 1 produces acetone, a certain amount of acetone was also added to the screening reactions of stages IV and V in order to develop engineered transaminases with some tolerance to acetone. The amino acid sequence of the best engineered transaminase obtained in stage IV is set forth in SEQ ID NO: 10. Compared to SEQ ID NO: 8, SEQ ID NO: 10 has 8 additional amino acid mutations, which are located not only near the catalytically active center (e.g., E115L and M279C) but also at the interface between monomers in the quaternary structure of the protein (e.g., P130D) and at surface-exposed parts of the quaternary structure (e.g., A101I and N290G).
In order to evolve engineered transaminases with better thermal stability while further improving activity and methanol tolerance, a heat treatment step for the enzyme solution was added to the screening reaction conditions of stage V. The amino acid sequence of the best engineered transaminase obtained in stage V is set forth in SEQ ID NO: 12. Compared to SEQ ID NO: 10, SEQ ID NO: 12 has 9 additional amino acid mutations, which are located not only near the catalytically active center (e.g., Q15K, R20A and K27Q) but also at the interfaces between monomers in the quaternary structure of the protein (e.g., N137E and A263C) and at surface-exposed parts of the quaternary structure (e.g., Q76H, A92P, L93V and K97R).
From the bioinformatics point of view, it is generally considered that the mutation of the catalytic activity center and the amino acids in the vicinity thereof in the quaternary structure of the protein has strong correlation with the catalytic activity; mutations in amino acids at the interfaces between monomers in the quaternary structure of the protein, as well as in amino acids exposed at the surface in the quaternary structure of the protein, are strongly correlated with solvent resistance and thermal stability of the protein.
TABLE 2
Figure BDA0002087915510000081
Figure BDA0002087915510000091
In table 2, each row gives the nucleotide and amino acid sequence numbers for a particular engineered transaminase polypeptide; the residue differences of each of the engineered transaminase polypeptides compared to SEQ ID NO: 2 are also given, as well as the sequence identity compared to SEQ ID NO: 2. The amino acid sequence identity is calculated using the Clustal W algorithm (Nucleic Acids Research, 22(22):4673-4680, 1994).
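To show what a sequence identity value expresses, the following minimal Python sketch computes percent identity from two sequences that are already aligned. It is not the Clustal W algorithm cited above; the function name, the choice of denominator (all alignment columns) and the toy sequences are assumptions made for illustration only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption: the denominator is the full alignment length; other
    conventions (e.g. the shorter sequence length) give different values.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy pair: one substitution in five residues.
identity = percent_identity("MKTAL", "MKTAV")
```

For a variant that differs from its reference only by substitutions, this reduces to 100 × (length − number of substitutions) / length.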
2.3 polynucleotides, control sequences, expression vectors and host cells useful in the preparation of engineered transaminase polypeptides
In another aspect, the disclosure provides polynucleotides encoding the engineered polypeptides described herein having transaminase activity. The polynucleotide may be operably linked to one or more heterologous regulatory sequences that control gene expression to produce a recombinant polynucleotide capable of expressing the polypeptide. An expression construct comprising a heterologous polynucleotide encoding an engineered transaminase can be introduced into a suitable host cell to express the corresponding engineered transaminase polypeptide.
As will be apparent to those skilled in the art, the availability of protein sequences and knowledge of codons corresponding to various amino acids provide an illustration of all polynucleotides that are capable of encoding a protein sequence of interest. The degeneracy of the genetic code, wherein the same amino acid is encoded by alternative or synonymous codons, allows for the production of a very large number of nucleic acids, all of which encode the improved transaminase polypeptides disclosed herein. Thus, after determining a particular amino acid sequence, one skilled in the art can generate any number of different nucleic acids by merely modifying the sequence of one or more codons in a manner that does not alter the amino acid sequence of the protein. In this regard, the present disclosure specifically contemplates each and every possible alteration of polynucleotides that may be prepared by selecting combinations based on possible codon usage, and all such alterations are deemed to be specifically disclosed for any of the polypeptides disclosed herein, including the amino acid sequences of the exemplary engineered polypeptides provided in table 2, as well as any of the polypeptides disclosed as the sequences of SEQ ID No. 4, SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, and SEQ ID No. 12 in the sequence listing incorporated herein by reference.
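The scale of this degeneracy can be illustrated with a short Python sketch that counts how many distinct coding sequences encode a given peptide under the standard genetic code. The function name and the three-residue example peptide are hypothetical; the codon table is the standard code written in the usual T, C, A, G ordering.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons for each amino acid ("*" marks stop codons).
SYNONYMOUS_CODONS = Counter(CODON_TABLE.values())


def count_encodings(peptide):
    """Number of distinct nucleotide sequences that encode the peptide."""
    return prod(SYNONYMOUS_CODONS[aa] for aa in peptide)


# Hypothetical tripeptide: Met (1 codon) x Ala (4 codons) x Leu (6 codons).
n_sequences = count_encodings("MAL")
```

The count multiplies across positions, which is why a full-length polypeptide of several hundred residues can be encoded by an astronomically large number of polynucleotides.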
In various embodiments, the codons are preferably selected to accommodate the host cell in which the protein is produced. For example, preferred codons for bacteria are used to express genes in bacteria; preferred codons for use in yeast for expression in yeast; and preferred codons for mammals are for expression in mammalian cells.
In some embodiments, the polynucleotide encodes a transaminase polypeptide comprising an amino acid sequence that has at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to a reference sequence selected from SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, and SEQ ID NO: 12, wherein the polypeptide has transaminase activity and one or more of the improved properties described herein, e.g., the ability to convert compound A-1 to product compound A-2 with increased activity as compared to the polypeptide of SEQ ID NO: 2.
In some embodiments, the polynucleotide encodes an engineered transaminase polypeptide comprising an amino acid sequence that has the percent identity described above as compared to SEQ ID No. 2 and has one or more amino acid residue differences. In some embodiments, the present disclosure provides an engineered polypeptide having transaminase activity, the engineered polypeptide comprising a combination of at least 80% sequence identity to the reference sequence of SEQ ID No. 2 and a residue difference at a position selected from: x3, X4, X9, X15, X20, X27, X36, X40, X41, X53, X60, X61, X62, X76, X78, X92, X93, X97, X98, X101, X113, X114, X115, X116, X122, X126, X128, X130, X137, X148, X153, X155, X159, X186, X187, X188, X190, X194, X209, X214, X226, X235, X244, X253, X262, X263, X266, X271, X273, X275, X279, X287, X290, X312, X323.
In some embodiments, the polynucleotide encoding the engineered transaminase polypeptide comprises a sequence selected from SEQ ID NO 3, SEQ ID NO 5, SEQ ID NO 7, SEQ ID NO 9, and SEQ ID NO 11.
In some embodiments, the polynucleotide encodes a polypeptide described herein, but has about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or more sequence identity at the nucleotide level to a reference polynucleotide encoding an engineered transaminase. In some embodiments, the reference polynucleotide sequence is selected from the group consisting of the sequences of SEQ ID NO 3, SEQ ID NO 5, SEQ ID NO 7, SEQ ID NO 9 and SEQ ID NO 11.
Isolated polynucleotides encoding engineered transaminase polypeptides can be manipulated to provide for expression of the polypeptides in a variety of ways including further alteration of sequences by codon optimization to improve expression, insertion into appropriate expression elements with or without additional control sequences, and transformation into host cells suitable for expression and production of the polypeptides.
Manipulation of the isolated polynucleotide prior to its insertion into a vector may be desirable or necessary depending on the expression vector. Techniques for modifying polynucleotides and nucleic acid sequences using recombinant DNA methods are well known in the art. Guidance is provided in the following: Sambrook et al., 2001, Molecular Cloning: A Laboratory Manual, third edition, Cold Spring Harbor Laboratory Press; and Current Protocols in Molecular Biology, Ausubel, F. ed., Greene Pub. Associates, 1998, updated 2010.
In another aspect, the disclosure also relates to recombinant expression vectors comprising a polynucleotide encoding an engineered transaminase polypeptide or variant thereof, and one or more expression regulatory regions, such as promoters and terminators, origins of replication, and the like, depending on the type of host into which they are to be introduced. Alternatively, the nucleic acid sequences of the present disclosure may be expressed by inserting the nucleic acid sequence or a nucleic acid construct comprising the sequence into an appropriate expression vector. In creating the expression vector, the coding sequence is located in the vector such that the coding sequence is operably linked to the appropriate control sequences for expression.
The recombinant expression vector may be any vector (e.g., a plasmid or virus) that can be conveniently subjected to recombinant DNA procedures and can bring about the expression of the polynucleotide sequence. The choice of the vector will generally depend on the compatibility of the vector with the host cell into which the vector is to be introduced. The vector may be a linear or closed circular plasmid. The expression vector may be an autonomously replicating vector, i.e., a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome. The vector may comprise any means for assuring self-replication. Alternatively, the vector may be one which, when introduced into a host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated. Furthermore, a single vector or plasmid or two or more vectors or plasmids which together comprise the total DNA to be introduced into the genome of the host cell may be used.
Many expression vectors useful for embodiments of the present disclosure are commercially available. Exemplary expression vectors can be prepared by operably linking a polynucleotide encoding an improved transaminase polypeptide to plasmid pACYC-Duet-1 (Novagen).
In another aspect, the present disclosure provides host cells comprising polynucleotides encoding the improved transaminase polypeptides of the disclosure operably linked to one or more control sequences for expression of the transaminase in the host cell. Host cells for expressing the polypeptides encoded by the expression vectors of the present disclosure are well known in the art and include, but are not limited to, bacterial cells such as E. coli, Arthrobacter sp. KNK168, Streptomyces, and Salmonella typhimurium cells; fungal cells such as yeast cells (e.g., Saccharomyces cerevisiae or Pichia pastoris); insect cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS, BHK, 293, and Bowes melanoma cells; and plant cells. An exemplary host cell is E. coli BL21(DE3). The host cell may be wild-type or may be engineered by genome editing, such as by knocking out wild-type transaminase genes carried in the host cell genome. Suitable media and growth conditions for the above-described host cells are well known in the art.
Polynucleotides for expressing a transaminase can be introduced into a cell by a variety of methods known in the art. Techniques include, among others, electroporation, biolistic particle bombardment, liposome-mediated transfection, calcium chloride transfection, and protoplast fusion. Various methods of introducing polynucleotides into cells will be apparent to those skilled in the art.
2.4 methods for producing engineered transaminase polypeptides
Engineered transaminases can be obtained by subjecting polynucleotides encoding the transaminase to mutagenesis and/or directed evolution methods. An exemplary directed evolution technique can be found in "Biocatalysis for the Pharmaceutical Industry: Discovery, Development, and Manufacturing" (2009, John Wiley & Sons Asia (Pte) Ltd., ISBN: 978-0-470-82314-9).
When the sequence of the engineered polypeptide is known, the polynucleotide encoding the polypeptide can be prepared by standard solid phase methods according to known synthetic methods. In some embodiments, fragments of up to about 100 bases can be individually synthesized and then ligated (e.g., by enzymatic or chemical ligation methods or polymerase-mediated methods) to form any desired contiguous sequence. For example, polynucleotides and oligonucleotides of the disclosure can be prepared by chemical synthesis using, for example, the methods described in Beaucage et al., 1981, Tet Lett 22:1859-69, or the methods described in Matthes et al., 1984, EMBO J. 3:801-05, e.g., as typically practiced in automated synthesis methods. According to the phosphoramidite method, oligonucleotides are synthesized (for example, in an automated DNA synthesizer), purified, annealed, ligated and cloned into suitable vectors. In addition, substantially any nucleic acid can be obtained from any of a variety of commercial sources.
In some embodiments, the present disclosure also provides a method for preparing or manufacturing an engineered transaminase polypeptide, where the method includes culturing a host cell capable of expressing a polynucleotide encoding the engineered polypeptide under culture conditions suitable for expression of the polypeptide. In some embodiments, the method of making a polypeptide further comprises isolating the polypeptide. The engineered polypeptide may be expressed in suitable cells and isolated (or recovered) from the host cells and/or culture medium using any one or more of the well-known techniques for protein purification, including lysozyme treatment, sonication, filtration, salting out, ultracentrifugation, and chromatography, among others.
2.5 methods of using engineered transaminases and compounds prepared therewith
In another aspect, the improved engineered transaminase polypeptides described herein can convert A-1 to A-2 in the presence of an amino donor. In some embodiments, the engineered transaminase polypeptides can be used in a process to prepare a compound of formula A-2 in enantiomeric excess:
Figure BDA0002087915510000111
In these embodiments, the method comprises a step of contacting, under suitable reaction conditions, a compound represented by structural formula A-1:
Figure BDA0002087915510000112
with an engineered transaminase polypeptide disclosed herein.
In some embodiments of the above methods, the compound of formula a-2 is produced in at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater enantiomeric excess.
Specific embodiments of engineered transaminase polypeptides for use in the methods are further provided in the detailed description. An improved engineered transaminase polypeptide useful in the above methods can include a sequence selected from the group consisting of amino acid sequences corresponding to SEQ ID NOs: 4, 6, 8, 10 and 12, and can also include amino acid sequences having at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to any one of the reference amino acid sequences selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10 and SEQ ID NO: 12.
As described herein and exemplified in the examples, the present disclosure contemplates ranges of suitable reaction conditions that may be used in the methods herein, including but not limited to ranges of pH, temperature, buffer, solvent system, substrate loading, polypeptide loading, and reaction time. Additional suitable reaction conditions for performing a method of biocatalytically converting a substrate compound to a product compound using the engineered transaminase polypeptides described herein can be readily optimized by routine experimentation including, but not limited to, contacting the engineered transaminase polypeptides with the substrate compound under experimental reaction conditions of concentration, pH, temperature, solvent conditions, and detecting the product compound, e.g., using the methods described in the examples provided herein.
As described above, engineered polypeptides having transaminase activity for use in the methods of the present disclosure typically comprise an amino acid sequence having at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to any of the reference amino acid sequences selected from the group consisting of SEQ ID No. 4, SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, and SEQ ID No. 12 sequences.
The substrate compound in the reaction mixture may vary, taking into account, for example, the amount of the desired product compound, the effect of substrate concentration on enzyme activity, the stability of the enzyme under the reaction conditions, and the percent conversion of substrate to product. In some embodiments of the process, suitable reaction conditions include a loading of substrate A-1 of at least about 0.5g/L, at least about 1g/L, at least about 5g/L, at least about 10g/L, at least about 15g/L, at least about 20g/L, at least about 30g/L, at least about 50g/L, at least about 75g/L, at least about 100g/L, at least about 150g/L, at least about 200g/L, or even greater. While the values for substrate loading provided herein are based on the molecular weight of compound a-1, it is also contemplated that equivalent molar amounts of various hydrates and salts of compound a-1 can also be used in the process.
In the methods described herein, the engineered transaminase polypeptides catalyze the formation of chiral amine products from ketone substrates and amino donors. In some embodiments, the amino donor in the reaction conditions comprises any suitable compound selected from alanine, isopropylamine (also known as 2-aminopropane), phenylalanine, glutamine, leucine, and 3-aminobutyric acid, or any suitable chiral or achiral amine such as methylbenzylamine; the amino donor may also be used in the form of a salt (e.g., alanine hydrochloride, alanine acetate, isopropylamine hydrochloride, isopropylamine acetate, etc.). In some embodiments, the amino donor is isopropylamine. In some embodiments, suitable reaction conditions include an amino donor, particularly isopropylamine, present at a loading of at least about 1 times the molar loading of substrate A-1. In some embodiments, isopropylamine is present at a loading of about 0.1M to about 4.0M.
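The relationship between a substrate loading in g/L and the molar excess of amino donor can be sketched as follows. This is a minimal Python illustration; the function name is hypothetical, and the molar mass of about 406.3 g/mol used in the example is an assumption for the sitagliptin precursor ketone that is not stated in this disclosure.

```python
def amine_donor_equivalents(substrate_g_per_l, substrate_molar_mass, donor_molar):
    """Molar equivalents of amino donor relative to substrate.

    substrate_g_per_l: substrate loading in g/L
    substrate_molar_mass: molar mass of the substrate in g/mol (supplied by the user)
    donor_molar: amino donor concentration in mol/L
    """
    if substrate_g_per_l <= 0 or substrate_molar_mass <= 0 or donor_molar < 0:
        raise ValueError("loadings and molar mass must be positive")
    substrate_molar = substrate_g_per_l / substrate_molar_mass
    return donor_molar / substrate_molar


# Assumed molar mass of A-1 (about 406.3 g/mol); not a value from this document.
equivalents = amine_donor_equivalents(100.0, 406.3, 1.0)
```

Under these assumptions, 1 M isopropylamine with 100 g/L substrate corresponds to roughly 4 molar equivalents, which satisfies the "at least about 1 times" condition above.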
The transaminase catalytic reaction is reversible, and in some embodiments, the engineered transaminase polypeptides disclosed herein can also convert a chiral amine compound of formula a-2 to compound a-1.
In embodiments of the reaction, the reaction conditions may include a suitable pH. As described above, the desired pH or desired pH range can be maintained by using an acid or base, a suitable buffer, or a combination of buffering and addition of an acid or base. The pH of the reaction mixture may be controlled before and/or during the course of the reaction. In some embodiments, suitable reaction conditions include a solution pH of about 7 to about 11. In some embodiments, the reaction conditions include a solution pH of about 7, 7.5, 8, 8.5, 9, 9.5, 10, 10.5, or 11.
In embodiments of the methods herein, suitable temperatures may be used for the reaction conditions, taking into account, for example, the increase in reaction rate at higher temperatures and the activity of the enzyme over a sufficiently long duration of the reaction. Accordingly, in some embodiments, suitable reaction conditions include a temperature of from about 10 ℃ to about 60 ℃, from about 25 ℃ to about 50 ℃, from about 25 ℃ to about 40 ℃, or from about 25 ℃ to about 30 ℃. In some embodiments, suitable reaction temperatures include temperatures of about 25 ℃, 30 ℃, 35 ℃, 40 ℃, 45 ℃, 50 ℃, 55 ℃, or 60 ℃. In some embodiments, the temperature during the enzymatic reaction may be maintained at a certain temperature throughout the course of the reaction. In some embodiments, the temperature during the enzymatic reaction may be adjusted according to a temperature profile during the course of the reaction.
The process using the engineered transaminase is usually carried out in water or another solvent. Suitable solvents include aqueous buffer solutions, organic solvents, and/or co-solvent systems, which typically include an aqueous solvent and an organic solvent. The aqueous solution (water or aqueous cosolvent system) may be pH-buffered or unbuffered. In some embodiments, the methods of using the engineered transaminase polypeptides are typically performed in an aqueous cosolvent system comprising organic solvents (e.g., methanol, ethanol, propanol, isopropanol (IPA), dimethyl sulfoxide (DMSO), dimethylformamide (DMF), isopropyl acetate, ethyl acetate, butyl acetate, 1-octanol, heptane, octane, methyl tert-butyl ether (MTBE), toluene, etc.) and/or ionic liquids (e.g., 1-ethyl 4-methylimidazole tetrafluoroborate, 1-butyl-3-methylimidazole hexafluorophosphate, etc.). The organic solvent component of the aqueous cosolvent system may be miscible with the aqueous component to provide a single liquid phase, or may be partially miscible or immiscible with the aqueous component to provide two liquid phases. Carbon dioxide generated during the transamination reaction may cause the formation of foam, and an antifoaming agent may be added as appropriate. An exemplary aqueous cosolvent system comprises water and one or more organic solvents. Typically, the organic solvent component of the aqueous cosolvent system is selected so that it does not completely inactivate the transaminase. Suitable co-solvent systems can be readily identified by measuring the enzymatic activity of a particular engineered transaminase with a defined substrate of interest in a candidate solvent system, using an enzymatic activity assay such as that described herein.
In some embodiments of the method, suitable reaction conditions include an aqueous co-solvent comprising methanol at a concentration of about 1% to about 100% (v/v), about 1% to about 60% (v/v), about 2% to about 60% (v/v), about 5% to about 60% (v/v), about 10% to about 50% (v/v), or about 10% to about 40% (v/v). In some embodiments of the method, suitable reaction conditions include methanol at a concentration of at least about 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, or 40% (v/v).
Suitable reaction conditions may include combinations of reaction parameters that provide biocatalytic conversion of a substrate compound into its corresponding product compound. Accordingly, in some embodiments of the methods, the combination of reaction parameters comprises: (a) a loading of about 10 g/L to 200 g/L of substrate A-1; (b) an engineered polypeptide concentration of about 1 g/L to 50 g/L; (c) a loading of isopropylamine of about 0.1 M to 4.0 M; (d) a pH of about 7.0 to 11.0; and (e) a temperature of about 10 ℃ to 60 ℃.
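As an illustration only, the parameter combination (a) to (e) above can be expressed as a simple range check. The following is a minimal sketch; the dictionary keys, the function name and the example values are hypothetical and do not appear in the specification.

```python
# Ranges (a)-(e) as listed in the paragraph above.
# Key names are illustrative assumptions, not terms from the specification.
RANGES = {
    "substrate_g_per_l": (10.0, 200.0),   # (a) substrate A-1 loading
    "enzyme_g_per_l": (1.0, 50.0),        # (b) engineered polypeptide
    "isopropylamine_molar": (0.1, 4.0),   # (c) amine donor
    "ph": (7.0, 11.0),                    # (d)
    "temperature_c": (10.0, 60.0),        # (e)
}


def out_of_range(conditions):
    """Return the names of parameters lying outside the listed ranges."""
    return [
        name
        for name, (low, high) in RANGES.items()
        if not low <= conditions[name] <= high
    ]


# Hypothetical set of conditions chosen only to exercise the check.
example_conditions = {
    "substrate_g_per_l": 50.0,
    "enzyme_g_per_l": 5.0,
    "isopropylamine_molar": 1.5,
    "ph": 9.5,
    "temperature_c": 45.0,
}

print(out_of_range(example_conditions))
print(out_of_range({**example_conditions, "ph": 6.5}))
```

The check treats the "about" bounds as hard limits, which is a simplification; in practice the stated ranges are approximate.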
In some embodiments, the above methods comprise contacting ≥ 10 g/L of substrate A-1 with ≥ 5 g/L of the engineered transaminase polypeptide described herein in the presence of about 1 M to about 2 M isopropylamine, under reaction conditions of about 10% to about 40% methanol, a temperature of about 30 ℃ to about 50 ℃ and a pH of 7.0 to 10.0, wherein at least 70%, 80%, 90%, 95% or more of the substrate A-1 is converted to the chiral amine product A-2 within 24 hours, and the chiral amine product A-2 is produced in at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more enantiomeric excess. In some embodiments, the transaminase polypeptides capable of performing the above-described reactions include the polypeptides corresponding to SEQ ID NO: 4, 6, 8, 10, and 12.
Exemplary reaction conditions include those provided in Table 1 and Examples 8, 9, 10, 11, 14, and 15.
In performing the enzymatic reaction described herein, the engineered polypeptide may be added to the reaction mixture in the form of a partially purified or purified enzyme, a heat-treated enzyme solution, whole cells transformed with a gene encoding the enzyme, and/or cell extracts and/or lysates of such cells. Intact cells transformed with a gene encoding an engineered polypeptide, or cell extracts thereof, lysates thereof, and isolated enzymes can be used in a variety of different forms, including solid (e.g., lyophilized, spray-dried, etc.) or semi-solid (e.g., wet mash, etc.). The cell extract or cell lysate may be partially purified by precipitation (e.g., ammonium sulfate, polyethyleneimine, heat treatment, or the like) followed by desalting procedures (e.g., ultrafiltration, dialysis, and the like) prior to lyophilization. Any enzyme preparation can be stabilized by crosslinking or immobilization to a solid phase material (e.g., a resin) using known crosslinking agents such as, for example, glutaraldehyde.
In some embodiments of the enzyme-catalyzed reactions described herein, the reaction is carried out under suitable reaction conditions described herein, wherein the engineered polypeptide is immobilized to a solid support. Solid supports that can be used to immobilize the engineered polypeptides for the enzyme-catalyzed reaction include, but are not limited to, microspheres or resins comprising polymethacrylates with epoxy functional groups, polymethacrylates with amino epoxy functional groups, styrene/DVB copolymers with octadecyl functional groups, or polymethacrylates with octadecyl functional groups. Exemplary solid supports include, but are not limited to, chitosan beads, Eupergit C, and SEPABEADS (Mitsubishi), including the following types of SEPABEADS: EC-EP, EC-HFA/S, EXA252, EXE119, and EXE120.
In some embodiments, where the engineered polypeptide may be expressed in the form of a secreted polypeptide, media containing the secreted polypeptide may be used in the methods herein.
In some embodiments, the solid reactant (e.g., enzyme, salt, etc.) can be provided to the reaction in a variety of different forms, including powders (e.g., lyophilized, spray-dried, etc.), solutions, emulsions, suspensions, and the like. The reactants can be readily lyophilized or spray-dried using methods and equipment well known to those of ordinary skill in the art. For example, the protein solution may be frozen in small aliquots at -80 ℃ and then placed in a pre-cooled freeze-drying chamber, after which vacuum is applied.
In some embodiments, there are multiple options for the order or manner of adding the reactants. The reactants can be added to the solvent (e.g., a single-phase solvent, a biphasic aqueous co-solvent system, etc.) together at the same time; alternatively, some of the reactants may be added first, and the others may be added afterwards by continuous feeding or in batches at intervals.
The various features and embodiments of the present disclosure are illustrated in the following representative examples, which are intended to be illustrative rather than limiting.
Drawings
FIG. 1: Conversion of A-1 to A-2 catalyzed by an engineered transaminase polypeptide of the invention.
FIG. 2: SDS-PAGE electrophoresis results for the enzyme solution of SEQ ID No: 12 before and after heat treatment.
Detailed Description
The present invention is further illustrated by the following examples, which are not intended to limit the scope of the invention. Experimental procedures for which no specific conditions are given in the following examples are generally carried out under conventional conditions or under the conditions recommended by the manufacturer.
Example 1: Gene cloning and construction of expression vectors
The gene sequence of the wild-type transaminase derived from Aspergillus fumigatus can be retrieved from NCBI; the gene was synthesized by techniques common in the art and cloned into the expression vector pACYC-Duet-1 (Novagen). The recombinant expression plasmid was transformed into competent E. coli BL21(DE3) cells by heat shock at 42 ℃ for 90 seconds. The transformation mixture was spread on LB plates containing chloramphenicol and incubated inverted overnight at 37 ℃ to obtain recombinant transformants.
Example 2: construction of transaminase mutation library
The mutation libraries were constructed with commercial reagents, preferably the QuikChange kit (supplier: Agilent). The mutagenic primers were designed according to the kit instructions.
The PCR system was as follows: 10× buffer 2.5 μL, dNTP mix 1 μL, primer oligo mix 2 μL (5 μM), plasmid template 2.5 μL (50 ng/μL), high-fidelity polymerase 1 μL, ddH2O 16 μL.
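As a consistency check on the PCR system listed above, the component volumes can be summed and the final concentrations derived. The sketch below is illustrative arithmetic only; the variable names are assumptions, and the stock concentrations are those stated in the list.

```python
# Component volumes in microlitres, as listed for the PCR system above.
components_ul = {
    "10x buffer": 2.5,
    "dNTP mix": 1.0,
    "primer oligo mix (5 uM stock)": 2.0,
    "plasmid template (50 ng/uL stock)": 2.5,
    "high-fidelity polymerase": 1.0,
    "ddH2O": 16.0,
}

# Total reaction volume.
total_ul = sum(components_ul.values())

# Final buffer strength: a 10x stock diluted into the total volume.
buffer_fold = 10.0 * components_ul["10x buffer"] / total_ul

# Final primer concentration (uM) from the 5 uM stock.
primer_final_um = 5.0 * components_ul["primer oligo mix (5 uM stock)"] / total_ul

# Total template mass (ng) from the 50 ng/uL stock.
template_ng = 50.0 * components_ul["plasmid template (50 ng/uL stock)"]

print(f"total volume: {total_ul} uL")
print(f"buffer: {buffer_fold}x, primers: {primer_final_um} uM, template: {template_ng} ng")
```

The volumes add up to a 25 μL reaction in which the 10× buffer is diluted to 1×, which is consistent with the listed amounts.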
The PCR amplification program was: (1) pre-denaturation at 95 ℃ for 1 min; (2) denaturation at 95 ℃ for 1 min; (3) annealing at 55 ℃ for 1 min; (4) extension at 65 ℃ for 6 min; steps (2) to (4) were repeated 29 times; (5) final extension at 65 ℃ for 5 min, followed by cooling to 4 ℃. DpnI (from the kit) was added to the PCR product, and the mixture was digested at 37 ℃ for 2 hours. The digested product was transformed into E. coli BL21(DE3) electrocompetent cells, which were plated on LB plates containing chloramphenicol and incubated inverted at 37 ℃ overnight to obtain library colonies.
Example 3: expression of mutant enzyme library and preparation of enzyme solution for screening
Colonies of the mutant enzyme library were picked from the agar plate, inoculated into a 96-well plate of LB medium containing chloramphenicol, and cultured overnight on a shaker at 30 ℃. When the OD600 of the culture reached 2 to 3, 20 μL of the culture was inoculated into 96-well deep-well plates of TB medium containing chloramphenicol (400 μL of TB medium per well) and incubated at 30 ℃ on a shaker. When the OD600 of the culture reached 0.6 to 0.8, IPTG was added as an inducer to a final concentration of 1 mM, and the plates were placed on a shaker at 30 ℃ for overnight expression (18 to 20 h). After expression, the deep-well plates containing the culture were centrifuged and the supernatant was removed to obtain wet cells. The wet cells were resuspended in lysis buffer (1 g/L lysozyme, 0.5 g/L PMBS, 0.05 mM PLP in TEOA-HCl buffer, pH 9) and shaken for 1 hour to disrupt the cells, giving a lysate. The lysate was centrifuged and the supernatant was transferred to a new deep-well plate to obtain the enzyme solution for the screening reaction.
Example 4: directed evolution phase II screening reactions
The enzyme solution obtained in Example 3 was used directly for the screening reaction. An aliquot of the enzyme solution was mixed with 170 μL of the reaction mother liquor in a 96-well plate so that the final concentrations of the components in the reaction system were [substrate A-1 15 g/L, DMSO 30% (v/v), isopropylamine 1.0 M, PLP 0.1 mM, TEOA 0.1 M, pH 9.5], and the plate was shaken at 40 ℃ for 24 hours. After the reaction was complete, acetonitrile was added to inactivate the enzyme, the plate was centrifuged (4000 rpm, 30 min), and the supernatant was analyzed by HPLC; the conversion of A-1 to A-2 was calculated according to the method of Example 5.
Example 5: analytical method
The quantitative analytical method used to calculate the conversion of the transamination reaction was as follows: the quenched reaction samples were analyzed on an Agilent 1100 HPLC with a Phenomenex Luna 5 μm C8 analytical column (4.6 × 150 mm); mobile phase 10 mM ammonium acetate:acetonitrile = 50:50; flow rate 2 mL/min; column temperature 40 ℃; detection wavelength 268 nm. The retention time of the product A-2 was 1.4 min, and the retention time of the substrate A-1 was 1.9 min.
The quantitative analytical method used to calculate the ee value of the product A-2 was as follows: the samples were analyzed on an Agilent 1100 HPLC with a CHIRALPAK AD-H chiral column (4.6 × 250 mm); mobile phase ethanol:n-heptane:diethylamine:water = 60:40:0.1:0.1; flow rate 1 mL/min; column temperature 30 ℃; detection wavelength 224 nm. The retention time of the (R)-configuration compound (i.e., A-2) was 15.3 min; the retention time of the (S)-configuration compound (i.e., the enantiomer of A-2) was 12.2 min.
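The specification does not state the formulas used to turn peak areas into conversion and ee values. The sketch below shows one common way such values are computed; it assumes an area-percent conversion with equal detector response for A-1 and A-2 (in practice a response factor or calibration curve may be needed), and the peak areas used are hypothetical.

```python
def conversion_percent(area_product, area_substrate):
    """Area-percent conversion of A-1 to A-2.

    Assumes equal detector response for substrate and product,
    which is an assumption and not stated in the specification.
    """
    return 100.0 * area_product / (area_product + area_substrate)


def ee_percent(area_r, area_s):
    """Enantiomeric excess of the (R)-enantiomer from chiral-HPLC peak areas."""
    return 100.0 * (area_r - area_s) / (area_r + area_s)


# Hypothetical peak areas (arbitrary units), for illustration only.
print(conversion_percent(area_product=930.0, area_substrate=70.0))
print(ee_percent(area_r=998.5, area_s=1.5))
```

With these hypothetical areas the functions return a conversion of 93.0% and an ee of 99.7%; the values were chosen for illustration and are not measured data.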
Example 6: expression of engineered transaminase polypeptides
A single colony of E. coli BL21(DE3) harboring the plasmid expressing the engineered transaminase polypeptide of interest was inoculated into a 250 mL Erlenmeyer flask containing 50 mL of LB medium (containing 30 μg/mL chloramphenicol) and incubated overnight at 30 ℃ with shaking. When the OD600 of the culture reached 2, the culture was inoculated at an inoculum size of 5% (v/v) into a 1000 mL Erlenmeyer flask containing 250 mL of TB medium and cultured with shaking at 30 ℃. When the OD600 of the culture reached 0.6, IPTG was added to a final concentration of 1 mM to induce transaminase expression. After culturing for 20 hours, the culture was centrifuged (8000 rpm, 10 minutes) and the supernatant was discarded to obtain wet cells. The wet cells were used directly to prepare the enzyme solution, or were stored frozen at -20 ℃ until use.
The wet cells were resuspended in PBS buffer, disrupted by sonication in an ice bath, and centrifuged; the supernatant was collected to obtain an enzyme solution containing the engineered transaminase polypeptide.
The enzyme solution was freeze-dried in a lyophilizer to obtain enzyme powder.
Example 7: enzyme liquid pretreatment and screening reaction in directed evolution stage V
The enzyme solution of the mutation library was prepared as in Example 3, and the well plate containing the enzyme solution was sealed and heat-treated with shaking at 80 ℃ for 1 hour in a water-bath shaker. After the heat treatment, the enzyme solution was centrifuged at 4000 rpm for 30 minutes, and 40 μL of the supernatant was transferred from each well into a 96-well plate pre-filled with reaction mother liquor (160 μL/well), so that the final concentrations of the components in the reaction system were [substrate A-1 50 g/L, methanol 35% (v/v), acetone 1.5% (v/v), isopropylamine 2.0 M, PLP 0.25 mM, TEOA 0.1 M, pH 9.5]. The plate was shaken at 45 ℃ for 24 hours. After the reaction was complete, acetonitrile was added to inactivate the enzyme, the plate was centrifuged (4000 rpm, 30 min), and the supernatant was analyzed by HPLC; the conversion of A-1 to A-2 was calculated according to the method of Example 5.
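Because 40 μL of enzyme solution is added to 160 μL of reaction mother liquor, the mother liquor must be more concentrated than the stated final concentrations. The sketch below back-calculates the mother liquor composition; it assumes additive volumes and that the enzyme solution contributes none of the listed components (the lysis buffer of Example 3 does contain some PLP and TEOA, so this is an approximation). The variable names are illustrative.

```python
ENZYME_UL = 40.0          # heat-treated supernatant per well
MOTHER_LIQUOR_UL = 160.0  # reaction mother liquor per well

# Final concentrations stated for the screening reaction.
final = {
    "substrate A-1 (g/L)": 50.0,
    "methanol (% v/v)": 35.0,
    "acetone (% v/v)": 1.5,
    "isopropylamine (M)": 2.0,
    "PLP (mM)": 0.25,
    "TEOA (M)": 0.1,
}

# Dilution factor experienced by the mother liquor on mixing.
dilution = (ENZYME_UL + MOTHER_LIQUOR_UL) / MOTHER_LIQUOR_UL

# Required concentration of each component in the mother liquor.
mother_liquor = {name: value * dilution for name, value in final.items()}

for name, value in mother_liquor.items():
    print(f"{name}: {value:.4g}")
```

Under these assumptions the mother liquor is 1.25 times more concentrated than the final reaction mixture, for example 62.5 g/L substrate and 2.5 M isopropylamine.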
Example 8: reaction for catalyzing conversion of A-1 to A-2 by enzyme powder of SEQ ID No. 4
The enzyme powder of the engineered transaminase corresponding to SEQ ID No: 4 was prepared according to the method described in Example 6. A representative procedure for a 5 mL reaction volume is as follows. Enzyme powder and reaction reagents were charged into a reaction flask with a total volume of 30 mL so that the final concentrations of the components in the reaction system were [enzyme powder of SEQ ID No: 4 10 g/L, substrate A-1 5 g/L, DMSO 25% (v/v), isopropylamine 1.0 M, PLP 0.1 mM, TEOA 0.1 M, pH 9.0]. A magnetic stir bar was added, and the flask was placed on a magnetic stirrer set at 400 rpm and 25 ℃ to start the reaction. After 24 hours, the reaction was quenched by adding 5 mL of acetonitrile to the flask and the mixture was centrifuged (8000 rpm, 5 min). The supernatant was analyzed by HPLC according to the method of Example 5: the conversion was 12.4% and the ee value of the product A-2 was ≥ 99.7%.
Example 9: reaction for catalyzing conversion of A-1 to A-2 by enzyme powder of SEQ ID No. 6
The enzyme powder of the engineered transaminase corresponding to SEQ ID No: 6 was prepared according to the method described in Example 6. A representative procedure for a 5 mL reaction volume is as follows. Enzyme powder and reaction reagents were charged into a reaction flask with a total volume of 30 mL so that the final concentrations of the components in the reaction system were [5 g/L of enzyme powder of SEQ ID No: 6, 15 g/L of substrate A, 30% (v/v) of DMSO, 1.0 M isopropylamine, 0.1 mM PLP, 0.1 M TEOA, pH 9.5]. A magnetic stir bar was added, and the flask was placed on a magnetic stirrer set at 400 rpm and 30 ℃ to start the reaction. After 24 hours, the reaction was quenched by adding 5 mL of acetonitrile to the flask and the mixture was centrifuged (8000 rpm, 5 min). The supernatant was analyzed by HPLC according to the method of Example 5: the conversion was 43.0% and the ee value of the product A-2 was ≥ 99.7%.
Example 10: reaction for catalyzing conversion of A-1 to A-2 by enzyme powder of SEQ ID No. 8
The enzyme powder of the engineered transaminase corresponding to SEQ ID No: 8 was prepared according to the method described in Example 6. A representative procedure for a 5 mL reaction volume is as follows. Enzyme powder and reaction reagents were charged into a reaction flask with a total volume of 30 mL so that the final concentrations of the components in the reaction system were [enzyme powder of SEQ ID No: 8 5 g/L, substrate A-1 20 g/L, DMSO 45% (v/v), isopropylamine 1.0 M, PLP 0.1 mM, TEOA 0.1 M, pH 9.5]. A magnetic stir bar was added, and the flask was placed on a magnetic stirrer set at 400 rpm and 40 ℃ to start the reaction. After 24 hours, the reaction was quenched by adding 5 mL of acetonitrile to the flask and the mixture was centrifuged (8000 rpm, 5 min). The supernatant was analyzed by HPLC according to the method of Example 5: the conversion was 20.2% and the ee value of the product A-2 was ≥ 99.7%.
Example 11: reaction for catalyzing conversion of A-1 to A-2 by enzyme solution of SEQ ID No. 10
An enzyme solution containing the engineered transaminase corresponding to SEQ ID No: 10 was prepared according to the method described in Example 6. A representative procedure for a 5 mL reaction volume is as follows. Enzyme solution and reaction reagents were charged into a reaction flask with a total volume of 30 mL so that the final concentrations of the components in the reaction system were [enzyme solution of SEQ ID No: 10 30% (v/v), substrate A-1 50 g/L, methanol 30% (v/v), isopropylamine 1.0 M, PLP 0.1 mM, TEOA 0.1 M, pH 9.5]. A magnetic stir bar was added, and the flask was placed on a magnetic stirrer set at 400 rpm and 40 ℃ to start the reaction. After 24 hours, the reaction was quenched by adding 5 mL of acetonitrile to the flask and the mixture was centrifuged (8000 rpm, 5 min). The supernatant was diluted 25-fold and analyzed by HPLC according to the method of Example 5: the conversion was 44.9% and the ee value of the product A-2 was ≥ 99.7%.
The enzyme solution of SEQ ID No: 10 was also heat-treated at [80 ℃, 1 h] in the same manner as in Example 13, and the reaction was carried out with the heat-treated enzyme solution under the conditions [enzyme solution of SEQ ID No: 10 heat-treated at [80 ℃, 1 h] 30% (v/v), substrate A-1 50 g/L, methanol 30% (v/v), isopropylamine 1.0 M, PLP 0.1 mM, TEOA 0.1 M, pH 9.5]. After 24 hours of reaction, the conversion was 22.1% and the ee value of the product A-2 was ≥ 99.7%.
Example 12: high-density fermentation process for expressing engineered transaminase polypeptides
A single colony of E. coli BL21(DE3) carrying an engineered transaminase gene (SEQ ID No: 11) was inoculated into 50 mL of LB broth containing 30 μg/mL chloramphenicol. The cells were cultured overnight in a shaker at 30 ℃. When the OD600 of the culture reached 1.6 to 2.2, the culture was removed from the shaker and used as the seed culture for inoculating the fermenter.
A 5 L fermenter containing 2.0 L of growth medium was sterilized in a steam autoclave at 121 ℃ and inoculated with the seed culture described above. The temperature of the fermenter was maintained at 37 ℃ with circulating water in the jacket, the stirring speed was 800 rpm, and air was supplied to the fermenter to keep the dissolved oxygen level at or above 40% of saturation. The pH of the culture was maintained at 7.0 by the addition of ammonium hydroxide. Growth of the culture was sustained by feeding a solution containing 500 g/L of food-grade glucose (dextrose monohydrate), 12 g/L of ammonium chloride and 5 g/L of magnesium sulfate heptahydrate. Prior to induction of expression, the culture temperature was maintained at 30 ℃; expression of the engineered transaminase was then induced by adding isopropyl-β-D-thiogalactoside (IPTG) to a final concentration of 1 mM, and fermentation was continued for about 18 h. After completion of the fermentation, the wet cells were collected by centrifugation at 8000 rpm for 10 minutes at 4 ℃ in a Thermo Multifuge X3R centrifuge.
The wet cells were resuspended at 4 ℃ in 10 mM potassium phosphate buffer (pH 7.0) containing 250 μM PLP and centrifuged at 8000 rpm for 10 min at 4 ℃ in a Thermo Multifuge X3R centrifuge; the pellet was collected as washed wet cells. 20 g of the washed wet cells were resuspended in 100 mL of 10 mM potassium phosphate buffer (pH 7.0), and the suspension was disrupted by two passes through a pressure homogenizer to obtain a homogenized cell suspension. The homogenate was centrifuged at 4000 rpm for 30 min, and the supernatant was collected to obtain the enzyme solution containing the engineered transaminase polypeptide.
Example 13: heat treatment of enzyme solutions
Following the preparation described in Example 12, 25 mL of an enzyme solution of the transaminase polypeptide corresponding to SEQ ID No: 12 was obtained. The enzyme solution was placed in a 50 mL centrifuge tube, the tube was placed in a water bath at 80 ℃, and the solution was stirred so that it was heated evenly. After 1.0 hour, the enzyme solution was centrifuged (4000 rpm, 30 min) and the supernatant was collected to obtain the heat-treated enzyme solution. A sample of the heat-treated enzyme solution was analyzed by SDS-PAGE, and FIG. 2 shows the results: sample 1 is the enzyme solution of SEQ ID No: 12 before heat treatment, and sample 2 is the enzyme solution of SEQ ID No: 12 after heat treatment. The results in FIG. 2 show that after the [80 ℃, 1 h] heat treatment, the contaminating proteins in the enzyme solution had been completely removed, whereas the engineered transaminase corresponding to SEQ ID No: 12 remained in the enzyme solution.
The protein content of the heat-treated enzyme solution was measured by the Bradford method, which is well known in the art; the total protein content was 14.3 g/L.
Example 14: the heat-treated enzyme liquid of SEQ ID No. 12 catalyzes the reaction of converting A-1 into A-2
The enzyme solution of SEQ ID No: 12 prepared in Example 13 by heat treatment at [80 ℃, 1 h] was used directly to catalyze the conversion of A-1 to A-2. A representative procedure for a 5 mL reaction volume is as follows. Enzyme solution and reaction reagents were charged into a reaction flask with a total volume of 30 mL so that the final concentrations of the components in the reaction system were [enzyme solution of SEQ ID No: 12 heat-treated at [80 ℃, 1 h] (v/v), substrate A-1 50 g/L, methanol 35% (v/v), isopropylamine 1.5 M, PLP 1.0 mM, TEOA 0.1 M, pH 9.5]. A magnetic stir bar was added, and the flask was placed on a magnetic stirrer set at 400 rpm and 45 ℃ to start the reaction. After 24 hours, the reaction was quenched by adding 5 mL of acetonitrile to the flask and the mixture was centrifuged (8000 rpm, 5 min). The supernatant was diluted 25-fold and analyzed by HPLC according to the method of Example 5: the conversion was 93% and the ee value of the product A-2 was ≥ 99.7%.
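To compare the results of Examples 8, 10, 11 and 14 on a common basis, the amount of substrate converted per litre can be estimated as loading × conversion. The sketch below is illustrative only: it reads the substrate A-1 loadings of those examples as 5, 20, 50 and 50 g/L respectively (an interpretation of the text as printed), omits Example 9 because its substrate entry is worded differently, and uses variable names that are assumptions.

```python
# (example, polypeptide, substrate A-1 loading in g/L, conversion in %)
# Loadings are read from the example texts; see the lead-in for the assumption.
results = [
    ("Example 8", "SEQ ID No: 4", 5.0, 12.4),
    ("Example 10", "SEQ ID No: 8", 20.0, 20.2),
    ("Example 11", "SEQ ID No: 10", 50.0, 44.9),
    ("Example 14", "SEQ ID No: 12", 50.0, 93.0),
]


def converted_g_per_l(loading_g_per_l, conversion_percent):
    """Substrate converted per litre of reaction after 24 h."""
    return loading_g_per_l * conversion_percent / 100.0


converted = {
    example: converted_g_per_l(loading, conversion)
    for example, _, loading, conversion in results
}

for example, polypeptide, loading, conversion in results:
    print(f"{example} ({polypeptide}): {converted[example]:.2f} g/L converted")
```

The reaction conditions differ between the examples (solvent, temperature, enzyme form and loading), so the figures indicate a trend and are not a like-for-like comparison of the polypeptides.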
Example 15: reaction process for converting A-1 into A-2 under catalysis of engineered transaminase polypeptide
A substrate stock solution was prepared in advance: 10.0 g of the substrate was dissolved in 25 mL of methanol; the total volume of the stock solution after dissolution was about 33.0 mL. An isopropylamine mixed stock solution was prepared in advance: 1.49 g of triethanolamine was weighed out, purified water was added to 12.5 mL, 5.9 g of isopropylamine was added, the pH was adjusted to 9.5 with concentrated hydrochloric acid, and the total volume was made up to 42.0 mL with purified water. A 4.0 M isopropylamine solution was prepared in advance: 34.0 mL of isopropylamine was made up to 100.0 mL with purified water.
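The concentrations of the three stock solutions can be estimated from the stated masses and volumes. The following sketch is illustrative arithmetic; the molar masses of isopropylamine (59.11 g/mol) and triethanolamine (149.19 g/mol) and the approximate density of isopropylamine (0.688 g/mL near 20 ℃) are literature values not given in the specification, and the variable names are assumptions.

```python
# Literature constants (not taken from the specification).
MW_ISOPROPYLAMINE = 59.11         # g/mol
MW_TRIETHANOLAMINE = 149.19       # g/mol
DENSITY_ISOPROPYLAMINE = 0.688    # g/mL, approximate value near 20 C

# Substrate stock: 10.0 g in about 33.0 mL.
substrate_stock_g_per_l = 10.0 / 33.0 * 1000.0

# Isopropylamine mixed stock: 5.9 g isopropylamine and 1.49 g TEOA in 42.0 mL.
mixed_volume_l = 42.0 / 1000.0
ipa_mixed_molar = 5.9 / MW_ISOPROPYLAMINE / mixed_volume_l
teoa_mixed_molar = 1.49 / MW_TRIETHANOLAMINE / mixed_volume_l

# Nominal 4.0 M isopropylamine: 34.0 mL neat amine made up to 100.0 mL.
ipa_stock_molar = 34.0 * DENSITY_ISOPROPYLAMINE / MW_ISOPROPYLAMINE / 0.100

print(f"substrate stock: {substrate_stock_g_per_l:.0f} g/L")
print(f"mixed stock: {ipa_mixed_molar:.2f} M isopropylamine, {teoa_mixed_molar:.3f} M TEOA")
print(f"isopropylamine stock: {ipa_stock_molar:.2f} M")
```

With these constants, 34.0 mL of isopropylamine in 100.0 mL works out to roughly 3.96 M, in line with the nominal 4.0 M stated above.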
A 250 mL three-necked flask was placed in a water bath with the water temperature controlled at 45 ℃. 17.5 mL of the enzyme solution of SEQ ID No: 12 heat-treated at [80 ℃, 1 h], 106 mg of pyridoxal phosphate (PLP), and 7.5 mL of purified water were charged into the three-necked flask in sequence; the prepared isopropylamine mixed stock solution was then added slowly with stirring until the mixture was uniform, and the pH of the reaction system was adjusted to 9.5 with the 4.0 M isopropylamine solution. 4.0 mL of the substrate stock solution was added to the three-necked flask to start the reaction, and the remaining substrate stock solution was fed into the flask with a peristaltic pump until all of it had been added. During the reaction, the pH of the reaction system was monitored continuously and maintained at 9.5 ± 0.5 with the 4.0 M isopropylamine solution for 24 hours.
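The nominal composition of the reaction mixture follows from the charged volumes. The sketch below is an estimate under stated assumptions: volumes are treated as additive, the isopropylamine solution used for pH adjustment is ignored, and the PLP concentration is given for two possible molar masses (anhydrous, 247.14 g/mol, and monohydrate, 265.16 g/mol) because the specification does not say which form was weighed. Variable names are illustrative.

```python
# Volumes charged to the flask, in mL (from the procedure above).
enzyme_ml = 17.5
water_ml = 7.5
ipa_mixed_stock_ml = 42.0
substrate_stock_ml = 33.0   # contains 10.0 g substrate and 25 mL methanol

# Nominal total volume, ignoring pH-adjustment additions (assumption).
total_ml = enzyme_ml + water_ml + ipa_mixed_stock_ml + substrate_stock_ml

substrate_g_per_l = 10.0 / total_ml * 1000.0
methanol_percent = 25.0 / total_ml * 100.0
enzyme_percent = enzyme_ml / total_ml * 100.0

# PLP: 106 mg; the hydration state is not stated, so both cases are shown.
plp_mm_anhydrous = 106.0 / 247.14 / total_ml * 1000.0
plp_mm_monohydrate = 106.0 / 265.16 / total_ml * 1000.0

# Share of substrate present at the start (4.0 mL of the 33.0 mL stock).
initial_substrate_g = 10.0 * 4.0 / substrate_stock_ml

print(f"total volume: {total_ml} mL")
print(f"substrate: {substrate_g_per_l:.0f} g/L, methanol: {methanol_percent:.0f}% (v/v)")
print(f"enzyme solution: {enzyme_percent:.1f}% (v/v)")
print(f"PLP: {plp_mm_anhydrous:.2f} mM (anhydrous) or {plp_mm_monohydrate:.2f} mM (monohydrate)")
print(f"substrate charged at start: {initial_substrate_g:.2f} g")
```

Under these assumptions the charge corresponds to about 100 g/L of substrate in 25% (v/v) methanol, with roughly 1.2 g of the substrate present at the start and the remainder fed over the course of the reaction.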
After the reaction was complete, A-2 was extracted from the reaction system with dichloromethane and the solvent was evaporated to dryness to obtain about 8.5 g of crude A-2 with an ee value of ≥ 99.7%. 8.0 g of the crude A-2 was dissolved in a phosphoric acid solution, 0.3 g of sitagliptin phosphate seed crystals was added, and about 9.0 g of sitagliptin phosphate was obtained by recrystallization, with a purity of 100.0% and an ee value of ≥ 99.7%.
It should be understood that various changes or modifications can be made by those skilled in the art after reading the above disclosure of the present invention, and equivalents also fall within the scope of the appended claims of the present application.
Sequence listing
<110> Enzymaster (Ningbo) Bio-Engineering Co., Ltd.
<120> an engineered transaminase polypeptide for preparing sitagliptin
<130> EMP011
<160> 12
<170> SIPOSequenceListing 1.0
<210> 1
<211> 969
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 1
atggcttcga tggacaaagt cttctcaggt tactacgccc gtcaaaaact gctggaacgc 60
tcagataatc cgttctcaaa aggtattgcc tatgtcgaag gtaaactggt gctgccgagt 120
gatgcgcgca ttccgctgct ggacgaaggc tttatgcata gtgatctgac ctacgacgtt 180
atctccgtct gggacggccg tttctttcgc ctggatgacc acctgcagcg cattctggaa 240
tcatgcgata aaatgcgtct gaaatttccg ctggcactga gctctgtcaa aaatatcctg 300
gcagaaatgg tggctaaaag cggcattcgt gacgctttcg tcgaagtgat cgttacccgc 360
ggcctgacgg gtgttcgtgg ctctaaaccg gaagatctgt ataacaataa catttacctg 420
ctggtgctgc cgtatatctg ggttatggca ccggaaaatc agctgcatgg cggtgaagct 480
attatcaccc gtacggtgcg tcgcaccccg ccgggtgcct ttgatccgac gatcaaaaac 540
ctgcaatggg gtgacctgac caaaggcctg tttgaagcga tggatcgtgg tgccacctat 600
ccgttcctga cggatggcga caccaatctg acggaaggca gcggtttcaa tattgtcctg 660
gtgaaaaacg gtattatcta caccccggat cgtggtgttc tgcgcggcat tacgcgtaaa 720
tcagtgatcg atgttgcgcg cgccaactcg attgacatcc gtctggaagt ggttccggtg 780
gaacaagcgt accactccga tgaaattttc atgtgtacca cggccggcgg tattatgccg 840
atcaccctgc tggatggtca gccggttaac gacggtcaag tcggcccgat taccaagaaa 900
atttgggatg gctattggga aatgcactac aacccggctt attcgtttcc ggtggattat 960
ggcagcggt 969
<210> 2
<211> 323
<212> PRT
<213> Aspergillus fumigatus (Aspergillus fumigatus)
<400> 2
Met Ala Ser Met Asp Lys Val Phe Ser Gly Tyr Tyr Ala Arg Gln Lys
1 5 10 15
Leu Leu Glu Arg Ser Asp Asn Pro Phe Ser Lys Gly Ile Ala Tyr Val
20 25 30
Glu Gly Lys Leu Val Leu Pro Ser Asp Ala Arg Ile Pro Leu Leu Asp
35 40 45
Glu Gly Phe Met His Ser Asp Leu Thr Tyr Asp Val Ile Ser Val Trp
50 55 60
Asp Gly Arg Phe Phe Arg Leu Asp Asp His Leu Gln Arg Ile Leu Glu
65 70 75 80
Ser Cys Asp Lys Met Arg Leu Lys Phe Pro Leu Ala Leu Ser Ser Val
85 90 95
Lys Asn Ile Leu Ala Glu Met Val Ala Lys Ser Gly Ile Arg Asp Ala
100 105 110
Phe Val Glu Val Ile Val Thr Arg Gly Leu Thr Gly Val Arg Gly Ser
115 120 125
Lys Pro Glu Asp Leu Tyr Asn Asn Asn Ile Tyr Leu Leu Val Leu Pro
130 135 140
Tyr Ile Trp Val Met Ala Pro Glu Asn Gln Leu His Gly Gly Glu Ala
145 150 155 160
Ile Ile Thr Arg Thr Val Arg Arg Thr Pro Pro Gly Ala Phe Asp Pro
165 170 175
Thr Ile Lys Asn Leu Gln Trp Gly Asp Leu Thr Lys Gly Leu Phe Glu
180 185 190
Ala Met Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp Thr
195 200 205
Asn Leu Thr Glu Gly Ser Gly Phe Asn Ile Val Leu Val Lys Asn Gly
210 215 220
Ile Ile Tyr Thr Pro Asp Arg Gly Val Leu Arg Gly Ile Thr Arg Lys
225 230 235 240
Ser Val Ile Asp Val Ala Arg Ala Asn Ser Ile Asp Ile Arg Leu Glu
245 250 255
Val Val Pro Val Glu Gln Ala Tyr His Ser Asp Glu Ile Phe Met Cys
260 265 270
Thr Thr Ala Gly Gly Ile Met Pro Ile Thr Leu Leu Asp Gly Gln Pro
275 280 285
Val Asn Asp Gly Gln Val Gly Pro Ile Thr Lys Lys Ile Trp Asp Gly
290 295 300
Tyr Trp Glu Met His Tyr Asn Pro Ala Tyr Ser Phe Pro Val Asp Tyr
305 310 315 320
Gly Ser Gly
<210> 3
<211> 969
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 3
atggcctcta tggacaaagt cttttcggga tattatgcgc gccagaagct gcttgaacgg 60
agcgacaatc ctttctctaa gggcattgct tatgtggaag gaaagctcgt cttacctagt 120
gatgctagaa taccgctact cgacgaaggt ttcatgcaca gtgacctaac ctatgatggc 180
ttccacgttt gggatggtcg cttctttcga ttggacgatc atttgcaacg gattttggaa 240
agctgcgata agatgcggct caagttccca cttgcactga gctcagtgaa aaatattctg 300
gctgagatgg tcgccaagag tggtatccgg gatgcgatgg tgaaggttat cgtgacacgt 360
ggtctgacag gtgtacgtgg ttcgaagcct gaggatctgt ataataacaa catatacctg 420
cttgttcttc catacatttg gctgatggcg cctgagaacc agctccatgg tggcgaggct 480
atcattacaa ggacagtgcg acgaacaccc ccaggtgcat ttgatcctac tatcaaaaat 540
ctacagtggg gtgatttaat taagggaatg tttgaggcaa tggaccgtgg cgccacatac 600
ccatttctca ctgatggaga caccaacctt actgaaggac cgggtttcaa cattgttttg 660
gtgaagaacg gtattatcta tacccctgat cgaggtgtct tgcgagggat cacacgtaaa 720
agtgtgattg acgttgcccg agccaacagc atcgacatcc gccttgaggt cgtaccagtg 780
gagcaggctt atcactctga tgagatcttc atgtgcacaa ctggcggcgg cattatgcct 840
ataacattgc ttgatggtca acctgttaat gacggccagg ttggcccaat cacaaagaag 900
atatgggatg gctattggga gatgcactac aatccggcgt atagttttcc tgttgactat 960
ggcagtggc 969
<210> 4
<211> 323
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 4
Met Ala Ser Met Asp Lys Val Phe Ser Gly Tyr Tyr Ala Arg Gln Lys
1 5 10 15
Leu Leu Glu Arg Ser Asp Asn Pro Phe Ser Lys Gly Ile Ala Tyr Val
20 25 30
Glu Gly Lys Leu Val Leu Pro Ser Asp Ala Arg Ile Pro Leu Leu Asp
35 40 45
Glu Gly Phe Met His Ser Asp Leu Thr Tyr Asp Gly Phe His Val Trp
50 55 60
Asp Gly Arg Phe Phe Arg Leu Asp Asp His Leu Gln Arg Ile Leu Glu
65 70 75 80
Ser Cys Asp Lys Met Arg Leu Lys Phe Pro Leu Ala Leu Ser Ser Val
85 90 95
Lys Asn Ile Leu Ala Glu Met Val Ala Lys Ser Gly Ile Arg Asp Ala
100 105 110
Met Val Lys Val Ile Val Thr Arg Gly Leu Thr Gly Val Arg Gly Ser
115 120 125
Lys Pro Glu Asp Leu Tyr Asn Asn Asn Ile Tyr Leu Leu Val Leu Pro
130 135 140
Tyr Ile Trp Leu Met Ala Pro Glu Asn Gln Leu His Gly Gly Glu Ala
145 150 155 160
Ile Ile Thr Arg Thr Val Arg Arg Thr Pro Pro Gly Ala Phe Asp Pro
165 170 175
Thr Ile Lys Asn Leu Gln Trp Gly Asp Leu Ile Lys Gly Met Phe Glu
180 185 190
Ala Met Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp Thr
195 200 205
Asn Leu Thr Glu Gly Pro Gly Phe Asn Ile Val Leu Val Lys Asn Gly
210 215 220
Ile Ile Tyr Thr Pro Asp Arg Gly Val Leu Arg Gly Ile Thr Arg Lys
225 230 235 240
Ser Val Ile Asp Val Ala Arg Ala Asn Ser Ile Asp Ile Arg Leu Glu
245 250 255
Val Val Pro Val Glu Gln Ala Tyr His Ser Asp Glu Ile Phe Met Cys
260 265 270
Thr Thr Gly Gly Gly Ile Met Pro Ile Thr Leu Leu Asp Gly Gln Pro
275 280 285
Val Asn Asp Gly Gln Val Gly Pro Ile Thr Lys Lys Ile Trp Asp Gly
290 295 300
Tyr Trp Glu Met His Tyr Asn Pro Ala Tyr Ser Phe Pro Val Asp Tyr
305 310 315 320
Gly Ser Gly
<210> 5
<211> 969
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 5
atggcctcta tggacaaagt cttttcggga tattatgcgc gccagaagct gcttgaacgg 60
agcgacaatc ctttctctaa gggcattgct tatgtggaag gaaagctcgt cttacctagt 120
gatgctagaa taccgctatt ggacgaaggt ttcatgcaca gtgacctaac ctatgatggc 180
ttccacgttt gggatggtcg cttctttcga ttggacgatc atttgcaacg gctcttggaa 240
agctgcgata agatgcggct caagttccca cttgcactga gctcagtgaa aaatattctg 300
gctgagatgg tcgccaagag tggtatccgg gatgcgatgg tgaaggttat cgtgacacgt 360
ggtctgacag gtgtacaggg ttcgaagcct gaggatctgt ataataacaa catatacctg 420
cttgttttgc catacatttg gttgatggcg cctgagaagc agctccatgg tggcagcgct 480
atcattacaa ggacagtgcg acgaacaccc ccaggtgcat ttgatcctac tatcaaaaat 540
ctacagtggg gtgatttaat taagggaatg tttgaggcaa tggaccgtgg cgccacatac 600
ccatttctca ctgatggaga caccaacctt actgaaggac cgggtttcaa cattgttttg 660
gtgaagaacg gtattatcta tacccctgat cgaggtgtct tggaggggat cacacgtaaa 720
agtgtgattg acgttgcccg agccaacagc atcgacgtcc gccttgaggt cgtaccagtg 780
gaggcggctt atcacgcgga tgagatcttc atgtgcacaa ctggcggcgg cattatgcct 840
ataacattgc ttgatggtca acctgttaat gacggccagg ttggcccaat cacaaagaag 900
atatgggatg gctattggga gatgcactac aatccggcgt atagttttcc tgttgactat 960
ggcagtggc 969
<210> 6
<211> 323
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 6
Met Ala Ser Met Asp Lys Val Phe Ser Gly Tyr Tyr Ala Arg Gln Lys
1 5 10 15
Leu Leu Glu Arg Ser Asp Asn Pro Phe Ser Lys Gly Ile Ala Tyr Val
20 25 30
Glu Gly Lys Leu Val Leu Pro Ser Asp Ala Arg Ile Pro Leu Leu Asp
35 40 45
Glu Gly Phe Met His Ser Asp Leu Thr Tyr Asp Gly Phe His Val Trp
50 55 60
Asp Gly Arg Phe Phe Arg Leu Asp Asp His Leu Gln Arg Leu Leu Glu
65 70 75 80
Ser Cys Asp Lys Met Arg Leu Lys Phe Pro Leu Ala Leu Ser Ser Val
85 90 95
Lys Asn Ile Leu Ala Glu Met Val Ala Lys Ser Gly Ile Arg Asp Ala
100 105 110
Met Val Lys Val Ile Val Thr Arg Gly Leu Thr Gly Val Gln Gly Ser
115 120 125
Lys Pro Glu Asp Leu Tyr Asn Asn Asn Ile Tyr Leu Leu Val Leu Pro
130 135 140
Tyr Ile Trp Leu Met Ala Pro Glu Lys Gln Leu His Gly Gly Ser Ala
145 150 155 160
Ile Ile Thr Arg Thr Val Arg Arg Thr Pro Pro Gly Ala Phe Asp Pro
165 170 175
Thr Ile Lys Asn Leu Gln Trp Gly Asp Leu Ile Lys Gly Met Phe Glu
180 185 190
Ala Met Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp Thr
195 200 205
Asn Leu Thr Glu Gly Pro Gly Phe Asn Ile Val Leu Val Lys Asn Gly
210 215 220
Ile Ile Tyr Thr Pro Asp Arg Gly Val Leu Glu Gly Ile Thr Arg Lys
225 230 235 240
Ser Val Ile Asp Val Ala Arg Ala Asn Ser Ile Asp Val Arg Leu Glu
245 250 255
Val Val Pro Val Glu Ala Ala Tyr His Ala Asp Glu Ile Phe Met Cys
260 265 270
Thr Thr Gly Gly Gly Ile Met Pro Ile Thr Leu Leu Asp Gly Gln Pro
275 280 285
Val Asn Asp Gly Gln Val Gly Pro Ile Thr Lys Lys Ile Trp Asp Gly
290 295 300
Tyr Trp Glu Met His Tyr Asn Pro Ala Tyr Ser Phe Pro Val Asp Tyr
305 310 315 320
Gly Ser Gly
<210> 7
<211> 969
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 7
atggcaacca tggataaagt ttttgcaggt tattatgcac gtcagaaact gctggaacgt 60
agcgataatc cgtttagcaa aggtattgca tacgttgaag gtaaatttgt tctgccgagc 120
gaagcacgta ttccgctgct ggatgaaggt tttatgggta gcgatctgac ctatgatggt 180
tttcatgttt gggatggtcg tttttttcgt ctggatgatc atctgcagcg tctgctggaa 240
agctgtgata aaatgcgtct gaaatttccg ctggcactga gcagcgttaa aaaaattctg 300
gttgaaatgg ttgcaaaaag cggtattcgt gatgcaatgg gtaaaattat tgtgacccgt 360
ggtctgaccg gtgttcaggg tagcaaaccg gaagatctgt ataataataa tatctacctg 420
ctggttctgc cgtatatttg gctgatggca ccggaaaaac agcgtcatgg tggtagcgca 480
attattaccc gtaccgttcg tcgtaccccg ccgggtgcat ttgatccgac cattaaaaat 540
ctgcagtggg gtgatctgat tcgtggtatg tttgaagcaa aagatcgtgg tgcaacctat 600
ccgtttctga ccgatggtga tacccatctg accgaaggtc cgggttttaa tattgttctg 660
gttaaaaatg gcatcctgta taccccggat cgtggtgttc tggaaggtat tacccgtaaa 720
agcgttattg aagttgcacg tgcaaatagc attgatgttc gtctggaagt tgttccggtt 780
gaaaccgcat atcatgcaga tgaaattttt atgtgtacca ccggtggtgg tattatgccg 840
attaccctgc tggatggtaa accggttaat gatggtcagg ttggtccgat taccaaaaaa 900
atttgggatg gttattggga aatgcattat aatgcagcat atagctttcc ggttgattat 960
ggtagcgat 969
<210> 8
<211> 323
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 8
Met Ala Thr Met Asp Lys Val Phe Ala Gly Tyr Tyr Ala Arg Gln Lys
1 5 10 15
Leu Leu Glu Arg Ser Asp Asn Pro Phe Ser Lys Gly Ile Ala Tyr Val
20 25 30
Glu Gly Lys Phe Val Leu Pro Ser Glu Ala Arg Ile Pro Leu Leu Asp
35 40 45
Glu Gly Phe Met Gly Ser Asp Leu Thr Tyr Asp Gly Phe His Val Trp
50 55 60
Asp Gly Arg Phe Phe Arg Leu Asp Asp His Leu Gln Arg Leu Leu Glu
65 70 75 80
Ser Cys Asp Lys Met Arg Leu Lys Phe Pro Leu Ala Leu Ser Ser Val
85 90 95
Lys Lys Ile Leu Val Glu Met Val Ala Lys Ser Gly Ile Arg Asp Ala
100 105 110
Met Gly Lys Ile Ile Val Thr Arg Gly Leu Thr Gly Val Gln Gly Ser
115 120 125
Lys Pro Glu Asp Leu Tyr Asn Asn Asn Ile Tyr Leu Leu Val Leu Pro
130 135 140
Tyr Ile Trp Leu Met Ala Pro Glu Lys Gln Arg His Gly Gly Ser Ala
145 150 155 160
Ile Ile Thr Arg Thr Val Arg Arg Thr Pro Pro Gly Ala Phe Asp Pro
165 170 175
Thr Ile Lys Asn Leu Gln Trp Gly Asp Leu Ile Arg Gly Met Phe Glu
180 185 190
Ala Lys Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp Thr
195 200 205
His Leu Thr Glu Gly Pro Gly Phe Asn Ile Val Leu Val Lys Asn Gly
210 215 220
Ile Leu Tyr Thr Pro Asp Arg Gly Val Leu Glu Gly Ile Thr Arg Lys
225 230 235 240
Ser Val Ile Glu Val Ala Arg Ala Asn Ser Ile Asp Val Arg Leu Glu
245 250 255
Val Val Pro Val Glu Thr Ala Tyr His Ala Asp Glu Ile Phe Met Cys
260 265 270
Thr Thr Gly Gly Gly Ile Met Pro Ile Thr Leu Leu Asp Gly Lys Pro
275 280 285
Val Asn Asp Gly Gln Val Gly Pro Ile Thr Lys Lys Ile Trp Asp Gly
290 295 300
Tyr Trp Glu Met His Tyr Asn Ala Ala Tyr Ser Phe Pro Val Asp Tyr
305 310 315 320
Gly Ser Asp
<210> 9
<211> 969
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 9
atggcaacca tagataaagt ctttgcgggc tattacgcgc gtcaaaaact gctggaacgc 60
tctgataacc cgtttagcaa aggcatcgcg tacgttgaag gcaaatttgt cctgccgtac 120
gaagcacgta ttccgctgct ggacgaaggt tttatgggca gcgacctgac ctacgacggt 180
tttcacgttt gggacggtcg ctttttccgc ctggacgatc atctgcaacg tatcctggaa 240
tcctgcgaca aaatgcgcct gaaatttccg ctggcactga gcagcgttaa aaaaatcctg 300
attgaaatgg ttgcaaaaag cggtattcgc gacgcgatgg gcctgatcat tgttacccgc 360
ggtctgaccg gcgttcaggg tcgcaaagat gaggacctgt acaacaacaa catctacctg 420
ctggttctgc cgtatatctg gctgatggca ccggaagacc aacgtcacgg cggttctgcg 480
attattaccc gtaccgttcg tcgtaccccg ccgggcgcat ttgatccgac cattaaaaac 540
ctgcagtggg gcgatttcaa ccgcggtatg tttgaagcga aggatcgcgg cgcaacctat 600
ccgtttctga ccgacggcga tacccacctg actgaaggtc cgggttttaa cattgtcctg 660
gtcaaaaacg gcatcctgta taccccggat cgtggcgttc tggaaggtat tacccgcaaa 720
agcgttattg aagttgcgcg cgcaaattcc attgatgtcc gcctggaagt tgttccggtt 780
gaaaccgcat accacgcgga cgaaatcttt tgctgttcga ccggcggcgg tatttgtccg 840
atcaccctgc tggacggtaa accggttggt gacggtcagg taggcccgat taccaaaaag 900
atttgggacg gctactggga gatgcattat aacgcggcgt acagctttcc ggtagattat 960
ggtagcgat 969
<210> 10
<211> 323
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 10
Met Ala Thr Ile Asp Lys Val Phe Ala Gly Tyr Tyr Ala Arg Gln Lys
1 5 10 15
Leu Leu Glu Arg Ser Asp Asn Pro Phe Ser Lys Gly Ile Ala Tyr Val
20 25 30
Glu Gly Lys Phe Val Leu Pro Tyr Glu Ala Arg Ile Pro Leu Leu Asp
35 40 45
Glu Gly Phe Met Gly Ser Asp Leu Thr Tyr Asp Gly Phe His Val Trp
50 55 60
Asp Gly Arg Phe Phe Arg Leu Asp Asp His Leu Gln Arg Ile Leu Glu
65 70 75 80
Ser Cys Asp Lys Met Arg Leu Lys Phe Pro Leu Ala Leu Ser Ser Val
85 90 95
Lys Lys Ile Leu Ile Glu Met Val Ala Lys Ser Gly Ile Arg Asp Ala
100 105 110
Met Gly Leu Ile Ile Val Thr Arg Gly Leu Thr Gly Val Gln Gly Arg
115 120 125
Lys Asp Glu Asp Leu Tyr Asn Asn Asn Ile Tyr Leu Leu Val Leu Pro
130 135 140
Tyr Ile Trp Leu Met Ala Pro Glu Asp Gln Arg His Gly Gly Ser Ala
145 150 155 160
Ile Ile Thr Arg Thr Val Arg Arg Thr Pro Pro Gly Ala Phe Asp Pro
165 170 175
Thr Ile Lys Asn Leu Gln Trp Gly Asp Phe Asn Arg Gly Met Phe Glu
180 185 190
Ala Lys Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp Thr
195 200 205
His Leu Thr Glu Gly Pro Gly Phe Asn Ile Val Leu Val Lys Asn Gly
210 215 220
Ile Leu Tyr Thr Pro Asp Arg Gly Val Leu Glu Gly Ile Thr Arg Lys
225 230 235 240
Ser Val Ile Glu Val Ala Arg Ala Asn Ser Ile Asp Val Arg Leu Glu
245 250 255
Val Val Pro Val Glu Thr Ala Tyr His Ala Asp Glu Ile Phe Cys Cys
260 265 270
Ser Thr Gly Gly Gly Ile Cys Pro Ile Thr Leu Leu Asp Gly Lys Pro
275 280 285
Val Gly Asp Gly Gln Val Gly Pro Ile Thr Lys Lys Ile Trp Asp Gly
290 295 300
Tyr Trp Glu Met His Tyr Asn Ala Ala Tyr Ser Phe Pro Val Asp Tyr
305 310 315 320
Gly Ser Asp
<210> 11
<211> 969
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 11
atggcaacca tagataaagt ctttgcgggc tattacgcgc gtaaaaaact gctggaagcg 60
tctgataacc cgtttagcca aggcatcgcg tacgttgaag gcaaatttgt cctgccgtac 120
gaagcacgta ttccgctgct ggacgaaggt tttatgggca gcgacctgac ctacgacggt 180
tttcacgttt gggacggtcg ctttttccgc ctggacgatc atctgcatcg tatcctggaa 240
tcctgcgaca aaatgcgcct gaaatttccg ctgcccgtga gcagcgttcg taaaatcctg 300
attgaaatgg ttgcaaaaag cggtattcgc gacgcgatgg gcctgatcat tgttacccgc 360
ggtatgaccg gcgttcaggg tatgaaagat gaggacctgt acaacaacga aatctacctg 420
ctggttctgc cgtatatctg gctgatggca ccggaagacc aacgtcacgg cggtagcgcg 480
attattaccc gtaccgttcg tcgtaccccg ccgggcgcat ttgatccgac cattaaaaac 540
ctgcagtggg gcgatttcaa ccgcggtctg tttgaagcga aagatcgcgg cgcaacctat 600
ccgtttctga ccgacggcga tacccacctg actgaaggtc cgggttttaa cattgtcctg 660
gtcaaaaacg gcatcctgta taccccggat cgtggcgttc tggaaggtat tacccgcaaa 720
agcgttattg aagttgcgcg cgcaaattcc attgataccc gcctggaagt tgttccggtt 780
gaaacctgct accacgcgga cgaaatcttt tgctgttcga ccggcggcgg tatttgtccg 840
atcaccctgc tggacggtaa accggttggt gacggtcagg taggcccgat taccaaaaag 900
atttgggacg gctactggga gatgcattat aacgatgcgt acagctttcc ggtagattat 960
ggtagcgat 969
<210> 12
<211> 323
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 12
Met Ala Thr Ile Asp Lys Val Phe Ala Gly Tyr Tyr Ala Arg Lys Lys
1 5 10 15
Leu Leu Glu Ala Ser Asp Asn Pro Phe Ser Gln Gly Ile Ala Tyr Val
20 25 30
Glu Gly Lys Phe Val Leu Pro Tyr Glu Ala Arg Ile Pro Leu Leu Asp
35 40 45
Glu Gly Phe Met Gly Ser Asp Leu Thr Tyr Asp Gly Phe His Val Trp
50 55 60
Asp Gly Arg Phe Phe Arg Leu Asp Asp His Leu His Arg Ile Leu Glu
65 70 75 80
Ser Cys Asp Lys Met Arg Leu Lys Phe Pro Leu Pro Val Ser Ser Val
85 90 95
Arg Lys Ile Leu Ile Glu Met Val Ala Lys Ser Gly Ile Arg Asp Ala
100 105 110
Met Gly Leu Ile Ile Val Thr Arg Gly Met Thr Gly Val Gln Gly Met
115 120 125
Lys Asp Glu Asp Leu Tyr Asn Asn Glu Ile Tyr Leu Leu Val Leu Pro
130 135 140
Tyr Ile Trp Leu Met Ala Pro Glu Asp Gln Arg His Gly Gly Ser Ala
145 150 155 160
Ile Ile Thr Arg Thr Val Arg Arg Thr Pro Pro Gly Ala Phe Asp Pro
165 170 175
Thr Ile Lys Asn Leu Gln Trp Gly Asp Phe Asn Arg Gly Leu Phe Glu
180 185 190
Ala Lys Asp Arg Gly Ala Thr Tyr Pro Phe Leu Thr Asp Gly Asp Thr
195 200 205
His Leu Thr Glu Gly Pro Gly Phe Asn Ile Val Leu Val Lys Asn Gly
210 215 220
Ile Leu Tyr Thr Pro Asp Arg Gly Val Leu Glu Gly Ile Thr Arg Lys
225 230 235 240
Ser Val Ile Glu Val Ala Arg Ala Asn Ser Ile Asp Thr Arg Leu Glu
245 250 255
Val Val Pro Val Glu Thr Cys Tyr His Ala Asp Glu Ile Phe Cys Cys
260 265 270
Ser Thr Gly Gly Gly Ile Cys Pro Ile Thr Leu Leu Asp Gly Lys Pro
275 280 285
Val Gly Asp Gly Gln Val Gly Pro Ile Thr Lys Lys Ile Trp Asp Gly
290 295 300
Tyr Trp Glu Met His Tyr Asn Asp Ala Tyr Ser Phe Pro Val Asp Tyr
305 310 315 320
Gly Ser Asp
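
The listing pairs each odd-numbered DNA sequence with the even-numbered polypeptide it encodes. As a rough illustration only, the sketch below translates the first 60 nucleotides of SEQ ID NO: 11 with the standard genetic code and lists the substitutions between the first 16 residues of SEQ ID NO: 10 and SEQ ID NO: 12. The helper names (`translate`, `to_one_letter`, `substitutions`) and the choice of fragments are assumptions made for this example and are not part of the patent.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace allowed) codon by codon."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def to_one_letter(residues: str) -> str:
    """Convert space-separated three-letter residue codes to one-letter codes."""
    return "".join(THREE_TO_ONE[r] for r in residues.split())


def substitutions(reference: str, variant: str) -> list[str]:
    """List point substitutions as e.g. 'Q15K' (1-based positions)."""
    return [f"{a}{i}{b}"
            for i, (a, b) in enumerate(zip(reference, variant), start=1)
            if a != b]


# Fragments copied from the listing above (illustrative excerpts only).
SEQ11_NT_1_60 = "atggcaacca tagataaagt ctttgcgggc tattacgcgc gtaaaaaact gctggaagcg"
SEQ10_AA_1_16 = "Met Ala Thr Ile Asp Lys Val Phe Ala Gly Tyr Tyr Ala Arg Gln Lys"
SEQ12_AA_1_16 = "Met Ala Thr Ile Asp Lys Val Phe Ala Gly Tyr Tyr Ala Arg Lys Lys"

protein_fragment = translate(SEQ11_NT_1_60)
diffs = substitutions(to_one_letter(SEQ10_AA_1_16), to_one_letter(SEQ12_AA_1_16))
print(protein_fragment)  # MATIDKVFAGYYARKKLLEA
print(diffs)             # ['Q15K']
```

The translated fragment agrees with the first 20 residues of SEQ ID NO: 12, and the only difference within the compared 16-residue window is at position 15.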

Claims (16)

1. An engineered transaminase polypeptide capable of converting a compound of structural formula A-1, 3-carbonyl-1-[3-(trifluoromethyl)-5,6,7,8-tetrahydro-1,2,4-triazolo[4,3-a]pyrazin-7-yl]-4-(2,4,5-trifluorophenyl)butan-1-one, to a compound of structural formula A-2, (3R)-3-amino-1-[3-(trifluoromethyl)-5,6,7,8-tetrahydro-1,2,4-triazolo[4,3-a]pyrazin-7-yl]-4-(2,4,5-trifluorophenyl)butan-1-one, wherein the amino acid sequence of said transaminase polypeptide is selected from the group consisting of SEQ ID NOs: 4, 6, 8, 10, and 12.
2. The transaminase polypeptide of claim 1, wherein the suitable reaction conditions include 2 g/L to 100 g/L of substrate A-1, 0.5 M to 2.0 M isopropylamine, 50 μM to 5 mM PLP, pH 7.0 to 11.0, 10% to 40% (v/v) methanol, and 10 ℃ to 60 ℃.
3. A polypeptide immobilized on a solid material by chemical bonding or physical adsorption, said polypeptide being selected from the transaminase polypeptides of any one of claims 1-2.
4. A polynucleotide encoding the polypeptide of any one of claims 1-3.
5. The polynucleotide of claim 1, wherein the polynucleotide sequence is a polynucleotide corresponding to SEQ ID NO: 3, 5, 7, 9, or 11.
6. An expression vector comprising the polynucleotide of any one of claims 4-5.
7. The expression vector of claim 6, comprising a plasmid, cosmid, phage, or viral vector.
8. A host cell comprising the expression vector of any one of claims 6-7, wherein the host cell is e.
9. A method of preparing a transaminase polypeptide, comprising the steps of: culturing the host cell of claim 8, and obtaining the transaminase polypeptide from the culture.
10. A transaminase catalyst obtained by the method of claim 9, the catalyst being the host cells or the culture broth obtained from the culture and comprising the transaminase polypeptide, or a processed product thereof; wherein the processed product is an extract obtained from the transformant cells, an isolated product obtained by isolating or purifying the transaminase from the extract, or an immobilized product obtained by immobilizing the transformant cells, the extract, or the isolated product of the extract.
11. A process for preparing a compound of structural formula A-2, the process comprising a step of contacting a compound of structural formula A-1 with the engineered transaminase polypeptide of any one of claims 1-3 in the presence of an amino donor under reaction conditions suitable to convert the compound of formula A-1 to the compound of formula A-2.
12. The method of claim 11, wherein the chiral amine product A-2 is produced in at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater enantiomeric excess.
13. The process of any one of claims 11-12, wherein the reaction solvent comprises water, methanol, ethanol, propanol, isopropanol, isopropyl acetate, dimethyl sulfoxide (DMSO), or dimethylformamide (DMF).
14. The method of any one of claims 11-13, wherein the reaction conditions comprise a temperature of 10 ℃ to 60 ℃.
15. The method of any one of claims 11-14, wherein the reaction conditions comprise pH 7.0 to pH 11.0.
16. The method of any one of claims 11-15, wherein the A-1 substrate is present at a loading of 2 g/L to 100 g/L.
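
The numerical ranges recited in claims 2 and 14-16, and the enantiomeric excess recited in claim 12, can be illustrated with a short sketch. It checks a hypothetical set of reaction conditions against the recited ranges and computes enantiomeric excess using the usual definition, ee = 100 × (R − S) / (R + S). The dictionary keys, the function names, the example values, and the treatment of the range endpoints as inclusive are all assumptions made for this illustration; none of them come from the patent.

```python
# Ranges recited in claims 2 and 14-16, as (lower, upper) bounds.
# Treating both endpoints as inclusive is an assumption of this sketch.
CLAIMED_RANGES = {
    "substrate_g_per_L": (2.0, 100.0),
    "isopropylamine_M": (0.5, 2.0),
    "plp_mM": (0.05, 5.0),            # 50 uM to 5 mM PLP
    "pH": (7.0, 11.0),
    "methanol_pct_v_v": (10.0, 40.0),
    "temperature_C": (10.0, 60.0),
}


def out_of_range(conditions: dict[str, float]) -> list[str]:
    """Return the names of parameters that fall outside the recited ranges."""
    return [
        name
        for name, value in conditions.items()
        if not (CLAIMED_RANGES[name][0] <= value <= CLAIMED_RANGES[name][1])
    ]


def enantiomeric_excess(r_amount: float, s_amount: float) -> float:
    """Enantiomeric excess of the (R)-enantiomer, in percent."""
    return 100.0 * (r_amount - s_amount) / (r_amount + s_amount)


# Hypothetical conditions, chosen only to exercise the check.
example = {
    "substrate_g_per_L": 50.0,
    "isopropylamine_M": 1.0,
    "plp_mM": 1.0,
    "pH": 8.5,
    "methanol_pct_v_v": 20.0,
    "temperature_C": 45.0,
}
too_hot = dict(example, temperature_C=70.0)

print(out_of_range(example))           # []
print(out_of_range(too_hot))           # ['temperature_C']
print(enantiomeric_excess(99.5, 0.5))  # 99.0
```

A 99.5 : 0.5 ratio of (R)- to (S)-amine therefore corresponds to 99% enantiomeric excess, the highest threshold listed in claim 12.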
CN201910493876.7A 2019-06-07 2019-06-07 Engineered transaminase polypeptide for preparing sitagliptin Active CN112048485B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910493876.7A CN112048485B (en) 2019-06-07 2019-06-07 Engineered transaminase polypeptide for preparing sitagliptin

Publications (2)

Publication Number Publication Date
CN112048485A CN112048485A (en) 2020-12-08
CN112048485B true CN112048485B (en) 2022-09-27

Family

ID=73609114

Country Status (1)

Country Link
CN (1) CN112048485B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023169184A1 (en) * 2022-03-10 2023-09-14 Enzymaster (Ningbo) Bio-Engineering Co., Ltd. Biocatalyst and method for the synthesis of ubrogepant intermediates

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107653233A (en) * 2017-07-06 2018-02-02 Taizhou University Improved transaminase, gene encoding the same, and genetically engineered bacterium expressing the enzyme
CN109957554A (en) * 2017-12-26 2019-07-02 Enzymaster (Ningbo) Bio-Engineering Co., Ltd. Engineered transaminase polypeptides and uses thereof

Also Published As

Publication number Publication date
CN112048485A (en) 2020-12-08

Similar Documents

Publication Publication Date Title
CN109957554B (en) Engineered transaminase polypeptides and uses thereof
CN111321129B (en) Engineered ketoreductase polypeptides and uses thereof
AU2019302422B2 (en) Engineered phosphopentomutase variant enzymes
CN113061594B (en) Transaminase mutants, immobilized transaminases and use for preparing sitagliptin
US10889806B2 (en) Engineered pantothenate kinase variant enzymes
EP3820502A1 (en) Engineered purine nucleoside phosphorylase variant enzymes
CN111411094B (en) (R)-omega-transaminase mutant and application thereof
CN112048485B (en) Engineered transaminase polypeptide for preparing sitagliptin
CN111411095B (en) Recombinant (R)-omega-transaminase, mutant and application thereof
CN115466756A (en) Transaminase, immobilized transaminase and application of transaminase to preparation of sitagliptin
JP7306719B2 (en) Modified decarboxylase polypeptides and their use in the preparation of β-alanine
EP1496113B1 (en) Gluconate dehydratase
AU2020299358A1 (en) Engineered acetate kinase variant enzymes
WO2023169184A1 (en) Biocatalyst and method for the synthesis of ubrogepant intermediates
CN111793615B (en) Engineered polypeptides and their use in the synthesis of tyrosine or tyrosine derivatives
KR102016050B1 (en) Novel promoter and uses thereof
CN109943543B (en) Alcohol dehydrogenase mutant and preparation method and application thereof
US20240002817A1 (en) Engineered pantothenate kinase variant enzymes
CN112746066A (en) L-lysine decarboxylase mutant and application thereof
CN116515782A (en) Amine dehydrogenase mutant and application thereof in chiral amine alcohol compound synthesis
JP2001321180A (en) Heat-resistant aminotransferase and its use

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant