CN116209754A - Engineered biocatalysts and methods for synthesizing chiral amines - Google Patents

Engineered biocatalysts and methods for synthesizing chiral amines

Publication number
CN116209754A
Authority
CN
China
Prior art keywords
polypeptide
engineered
sequence
transaminase
seq
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180065368.4A
Other languages
Chinese (zh)
Inventor
桑托什·西瓦拉马克莱斯娜
埃里卡·贝穆德斯
南希塔·苏布兰马尼安
大卫·恩特韦斯特尔
斯蒂芬妮·玛丽·弗盖特
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Codexis Inc
Original Assignee
Codexis Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Codexis Inc
Publication of CN116209754A

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/1096 Transferases (2.) transferring nitrogenous groups (2.6)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P17/00 Preparation of heterocyclic carbon compounds with only O, N, S, Se or Te as ring hetero atoms
    • C12P17/18 Preparation of heterocyclic carbon compounds with only O, N, S, Se or Te as ring hetero atoms containing at least two hetero rings condensed among themselves or condensed with a common carbocyclic ring system, e.g. rifamycin
    • C12P17/182 Heterocyclic compounds containing nitrogen atoms as the only ring heteroatoms in the condensed system
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P41/00 Processes using enzymes or microorganisms to separate optical isomers from a racemic mixture
    • C12P41/006 Processes using enzymes or microorganisms to separate optical isomers from a racemic mixture by reactions involving C-N bonds, e.g. nitriles, amides, hydantoins, carbamates, lactames, transamination reactions, or keto group formation from racemic mixtures
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y206/00 Transferases transferring nitrogenous groups (2.6)
    • C12Y206/01 Transaminases (2.6.1)
    • C12Y206/01018 Beta-alanine-pyruvate transaminase (2.6.1.18)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P13/00 Preparation of nitrogen-containing organic compounds
    • C12P13/001 Amines; Imines

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Genetics & Genomics (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)
  • Enzymes And Modification Thereof (AREA)
  • Immobilizing And Processing Of Enzymes And Microorganisms (AREA)
  • Organic Low-Molecular-Weight Compounds And Preparation Thereof (AREA)

Abstract

The present disclosure provides engineered transaminase polypeptides for the production of amines, polynucleotides encoding the engineered transaminases, host cells capable of expressing the engineered transaminases, and methods of using the engineered transaminases to prepare compounds useful for the production of active agents.

Description

Engineered biocatalysts and methods for synthesizing chiral amines
The present application claims priority from U.S. provisional patent application serial No. 63/084,166, filed on September 28, 2020, which is hereby incorporated by reference in its entirety for all purposes.
1. Technical field
The present disclosure relates to transaminase biocatalysts and methods of using the biocatalysts for preparing chiral amines.
2. References to sequence listings, tables, or computer programs
The official copy of the sequence listing is submitted concurrently with the specification as an ASCII formatted text file via EFS-Web, with a file name of "CX2-212WO1_ST25.Txt", a creation date of September 16, 2021, and a size of 984 kilobytes. The sequence listing submitted via EFS-Web is part of the specification and is incorporated by reference herein in its entirety.
3. Background
Transaminases (E.C. 2.6.1) catalyze the transfer of an amino group, a pair of electrons, and a proton from a primary amine of an amino donor substrate to the carbonyl group of an amino acceptor molecule, as shown in Scheme 1.
[Figure: Scheme 1 reaction diagram]
Scheme 1
The amino acceptor compound (I), which is a precursor of the desired chiral amine product (III), is reacted with an amino donor compound (II). The transaminase catalyzes the transfer of the amino group of the amino donor (II) to the ketone group of the amino acceptor (I). The reaction yields the desired chiral amine product compound (III) and a new amino acceptor compound (IV) having a keto group as a by-product.
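The amino group transfer described above is commonly pictured as two half-reactions that pass through the pyridoxal phosphate cofactor, as the definition of pyridoxal phosphate later in this document also describes. The following is a minimal sketch of that bookkeeping, not a kinetic model. The class and method names are hypothetical, stereochemistry is ignored, and the amine/ketone pairs are ordinary chemistry facts chosen for illustration rather than data from this disclosure.

```python
# Illustrative amine -> ketone pairs (deamination products); stereochemistry ignored.
KETONE_OF_AMINE = {
    "isopropylamine": "acetone",
    "1-phenylethylamine": "acetophenone",
}
AMINE_OF_KETONE = {ketone: amine for amine, ketone in KETONE_OF_AMINE.items()}


class Transaminase:
    """Toy enzyme that alternates between its PLP and PMP cofactor forms."""

    def __init__(self):
        self.cofactor = "PLP"  # aldehyde form, ready to accept an amino group

    def accept_amine_from(self, amino_donor):
        """First half-reaction: donor + PLP -> ketone by-product + PMP."""
        if self.cofactor != "PLP":
            raise RuntimeError("cofactor must be PLP to accept an amino group")
        self.cofactor = "PMP"
        return KETONE_OF_AMINE[amino_donor]

    def donate_amine_to(self, amino_acceptor):
        """Second half-reaction: acceptor + PMP -> amine product + PLP."""
        if self.cofactor != "PMP":
            raise RuntimeError("cofactor must be PMP to donate an amino group")
        self.cofactor = "PLP"
        return AMINE_OF_KETONE[amino_acceptor]


enzyme = Transaminase()
byproduct = enzyme.accept_amine_from("isopropylamine")
print(byproduct, enzyme.cofactor)   # acetone PMP
product = enzyme.donate_amine_to("acetophenone")
print(product, enzyme.cofactor)     # 1-phenylethylamine PLP
```

In terms of Scheme 1, the donor passed to the first call plays the role of compound (II), the returned ketone is by-product (IV), the acceptor is compound (I), and the returned amine is product (III).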
Wild-type transaminases having the ability to catalyze the reaction of Scheme 1 have been isolated from a variety of microorganisms including, but not limited to, Alcaligenes denitrificans, Bordetella bronchiseptica, Bordetella parapertussis, Brucella melitensis, Burkholderia mallei, Burkholderia pseudomallei, Chromobacterium violaceum, Oceanicola granulosus HTCC2516, Oceanobacter sp. RED65, Oceanospirillum sp. MED92, Pseudomonas putida, Ralstonia solanacearum, Rhizobium meliloti, Rhizobium sp. (strain NGR234), Bacillus thuringiensis, Klebsiella pneumoniae, and Vibrio fluvialis (see, e.g., Shin et al., 2001, Biosci. Biotechnol. Biochem. 65:1782-1788). Several of these wild-type transaminase genes, as well as the encoded polypeptides, have been sequenced, including those of Ralstonia solanacearum (Genbank accession number YP_002257813.1, GI: 207739420), Burkholderia pseudomallei 1710b (Genbank accession number ABA47738.1, GI: 76578263), Bordetella petrii (Genbank accession number AM902716.1, GI: 163258032), and Vibrio fluvialis (Genbank accession number AEA39183.1, GI: 327207066). Two wild-type transaminases of classes EC 2.6.1.18 and EC 2.6.1.19 have been crystallized and structurally characterized (see, e.g., Yonaha et al., 1983, Agric. Biol. Chem. 47(10):2257-2265).
The wild-type transaminase from Vibrio fluvialis JS17 is an ω-amino acid:pyruvate transaminase (E.C. 2.6.1.18), which catalyzes the reaction of Scheme 2 using pyridoxal 5'-phosphate as a cofactor.
[Figure: Scheme 2 reaction diagram]
Scheme 2
It has also been reported that this wild-type transaminase from Vibrio fluvialis shows catalytic activity toward aliphatic amino donors that lack a carboxyl group.
Chiral amine compounds are often used in the pharmaceutical, agrochemical, and chemical industries as intermediates or synthons for the preparation of various pharmaceuticals, such as cephalosporin or pyrrolidine derivatives. Many of these industrial applications of chiral amine compounds involve the use of only one specific optically active form, e.g., only the (R) or (S) enantiomer is physiologically active. Transaminases have potential industrial uses for the stereoselective synthesis of optically pure chiral amine compounds, such as in the enantiomeric enrichment of amino acids (see, e.g., Shin et al., 2001, Biosci. Biotechnol. Biochem. 65:1782-1788; Iwasaki et al., 2003, Biotech. Lett. 25:1843-1846; Iwasaki et al., 2004, Appl. Microb. Biotech. 69:499-505; Yun et al., 2004, Appl. Environ. Microbiol. 70:2529-2534; and Hwang et al., 2004, Enzyme Microbiol. Technol. 34:429-426).
Other examples of the use of transaminases include the preparation of intermediates and precursors of pregabalin (e.g., WO 2008/127646); the enzymatic transamination of cyclopamine analogs (e.g., WO 2011/017551); the stereospecific synthesis and enantiomeric enrichment of β-amino acids (e.g., WO 2005/005633); the enantiomeric enrichment of amines (e.g., U.S. Pat. No. 4,950,606, U.S. Pat. No. 5,300,437, and U.S. Pat. No. 5,169,780); and the production of amino acids and derivatives (e.g., U.S. Pat. No. 5,316,943, U.S. Pat. No. 4,518,692, U.S. Pat. No. 4,826,766, U.S. Pat. No. 6,197,558, and U.S. Pat. No. 4,600,692).
However, the transaminases used to catalyze the reaction for the preparation of chiral amine compounds may have characteristics that are not suitable for commercial use, such as instability to industrially useful process conditions (e.g., solvents, temperature) and limited substrate recognition. Thus, there is a need for other types of transaminase biocatalysts that can be used in industrial processes for the preparation of chiral amine compounds in optically active form.
4. Summary of the invention
The present disclosure provides engineered polypeptides having transaminase activity, polynucleotides encoding the polypeptides, methods of making the polypeptides, and methods of using the polypeptides for biocatalytic conversion of a ketone substrate to an amine product. The polypeptides of the present disclosure having transaminase activity have been engineered with one or more residue differences compared to the previously engineered transaminase polypeptide (engineered transaminase polypeptide of amino-acid sequence SEQ ID NO: 4), with enhanced activity and thermostability relative to Vibrio fluvialis wild-type transaminase or relative to an engineered variant of wild-type transaminase. Amino acid residue differences are located at residue positions that affect various enzyme properties including, among other things, activity, stereoselectivity, stability, expression, product tolerance, and substrate tolerance.
Savolitinib, or 3-[(1S)-1-imidazo[1,2-a]pyridin-6-ylethyl]-5-(1-methylpyrazol-4-yl)triazolo[4,5-b]pyrazine (1), is a small molecule drug developed by Hutchison MediPharma Limited and AstraZeneca. It is a potent c-Met kinase inhibitor that is being tested in combination with osimertinib for the treatment of non-small cell lung cancer and of patients with advanced or metastatic papillary renal cell carcinoma (see section 5.3 below).
The current chemical synthesis process for producing compound (1) involves five steps, the first of which is the transamination of ketone substrate compound (2) to enantioselectively produce amine product compound (3) (WO 2020/053198). An engineered (S)-selective transaminase can be used for this conversion under industrial process conditions. While the selectivity is suitable, there is a need to improve enzyme activity so that higher substrate loadings can be accepted and industrial production optimized. The present disclosure provides engineered (S)-selective transaminases with improved activity and substrate tolerance.
In some embodiments, the disclosure provides an engineered transaminase comprising a polypeptide sequence that has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 4 and/or 6, or a functional fragment thereof, wherein the engineered transaminase comprises at least one substitution or set of substitutions in the polypeptide sequence, and wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 4 and/or 6. In some embodiments, the engineered transaminase comprises a polypeptide sequence that has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 4, wherein the engineered transaminase comprises at least one substitution or set of substitutions in the polypeptide sequence at one or more positions selected from the following: 13, 41/57/130/415/419, 41/113/415, 53/57, 88/89, 97/415, 148, 227, 260, 302, 355/415/419, 362, 417, and 443, and wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 4. In some further embodiments, the engineered transaminase comprises a polypeptide sequence that has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 6, wherein the engineered transaminase comprises at least one substitution or set of substitutions in the polypeptide sequence at one or more positions selected from the following: 13, 13/41/57/88/130/415/417, 13/41/57/89/97/417, 13/41/57/97/130/415/417, 13/41/57/97/130/415/417/443, 13/41/57/97/443, 13/41/57/130/417, 13/41/57/417, 13/41/88, 13/41/88/89, 13/41/88/89/97/415/443, 13/41/88/89/417, 13/41/88/97, 13/41/88/130/415/443, 13/41/88/443, 13/41/89/130/148/443, 13/41/89/417, 13/41/89/443, 13/41/97/130/417, 13/41/97/415, 13/41/97/415/417, 13/41/97/417, 13/41/97/417/443, 13/41/130/415/443, 13/41/415, 13/41/415/417, 13/41/415/443, 13/41/417, 13/41/417/443, 13/57/88/89/130/415/443, 13/57/88/97, 13/57/88/97/415/443, 13/57/88/130/415, 13/57/88/130/417/443, 13/57/88/415, 13/57/97/130/415/417/443, 13/57/97/417, 13/88/89/415/417, 13/88/89/415/417/443, 13/88/130/443, 13/88/415, 13/89/97/415/417, 13/89/97/417, 13/89/417, 13/97/148/415, 13/97/415, 13/97/415/417, 13/97/417, 13/130/415, 13/130/415/417, 13/130/417, 13/130/417/443, 13/415/417, 13/415/417/443, 13/415/443, 13/417/443, 13/443, 23/53/162/233/277/315/415/418/432, 23/53/315/417/418, 23/277/315/395/415/417/432, 23/277/395/417/418, 23/395/418, 23/418, 41/57/88, 41/57/88/415/443, 41/57/130/148/415/417, 41/57/130/443, 41/57/415/417, 41/88/89/97/130/415, 41/88/89/415/417, 41/88/97/130/417, 41/88/130/415/417, 41/88/443, 41/97/130/148/415/417/443, 41/97/417, 41/97/417/443, 41/130/415, 41/130/415/417/443, 41/130/415/443, 41/415/443, 41/417/443, 53/162, 53/162/395/417, 53/162/418/432, 53/233, 53/277/395, 53/277/395/417/418, 53/277/415/417, 57/88/97/130/415/443, 57/88/97/130/417, 57/88/97/417, 57/97/130/148/417/443, 57/417, 88, 88/89/130/417, 88/97/415/417/443, 88/130/417/443, 88/148/417/443, 88/415/417, 88/415/417/443, 88/417, 89/97/415/417, 89/97/417, 89/443, 97/130, 97/148/415, 97/415/417, 97/417, 130/415, 130/417, 130/443, 162/233/415/417, 162/395/415/417, 162/418, 233/315/415/417, 233/315/417, 277/395/415/418/432, 315, 315/415/418/432, 395/418, 415/417/418, 415/417/418/432, 415/417/443, 415/443, 417, and 443, wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID NO: 6.
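The position lists above use a compact notation in which commas separate alternatives and slashes join positions that are substituted together as a set. As a reading aid, the sketch below parses that notation into tuples of residue positions. The function names are hypothetical and the example string is a short excerpt of the SEQ ID NO: 4 list, not the full claim language.

```python
def parse_position_sets(text):
    """Split a list such as "13, 41/57/130, 443" into tuples of positions.

    Commas separate alternative entries; slashes join positions that are
    substituted together. Positions are returned as integers.
    """
    position_sets = []
    for entry in text.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray or trailing commas
        position_sets.append(tuple(int(p) for p in entry.split("/")))
    return position_sets


def positions_covered(position_sets):
    """Return the sorted unique residue positions that appear in any set."""
    return sorted({p for s in position_sets for p in s})


example = "13, 41/57/130/415/419, 41/113/415, 53/57, 88/89, 97/415, 148"
sets = parse_position_sets(example)
print(sets[1])                  # (41, 57, 130, 415, 419)
print(positions_covered(sets))  # [13, 41, 53, 57, 88, 89, 97, 113, 130, 148, 415, 419]
```

Keeping each set as a tuple preserves the grouping, which matters because a set such as 41/113/415 denotes three positions changed together rather than three independent alternatives.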
In some further embodiments, the engineered transaminase comprises a polypeptide sequence that has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identity to the sequence of at least one engineered transaminase variant listed in Table 5.1 and/or Table 6.1. In still further embodiments, the engineered transaminase is a variant engineered transaminase provided in Table 5.1 and/or Table 6.1. In some further embodiments, the engineered transaminase comprises a polypeptide sequence that is at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to the sequence of at least one engineered transaminase variant set forth in SEQ ID NO: 4 and/or 6. In some further embodiments, the engineered transaminase comprises the polypeptide sequence of SEQ ID NO: 4 and/or 6. In some further embodiments, the engineered transaminase comprises a polypeptide sequence that is at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to the sequence of at least one engineered transaminase variant set forth in the even-numbered sequences of SEQ ID NOS: 6-358. In yet other embodiments, the engineered transaminase comprises a polypeptide sequence set forth in the even-numbered sequences of SEQ ID NOS: 6-358. In some further embodiments, the engineered transaminase comprises at least one improved property compared to the wild-type Vibrio fluvialis transaminase or an engineered variant of the wild-type transaminase. In some further embodiments, the improved property of the engineered transaminase includes improved activity toward a substrate. In some further embodiments, the substrate comprises compound (2). In some further embodiments, the improved property of the engineered transaminase includes improved substrate tolerance. In still other embodiments, the improved property of the engineered transaminase includes improved thermostability.
In some further embodiments, the engineered transaminase is purified. The present disclosure also provides compositions comprising the engineered transaminases provided herein. In some embodiments, the composition comprises more than one engineered transaminase provided herein.
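The identity thresholds recited above (85% through 99%) can be made concrete with a small calculation. The sketch below assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and it counts identical columns over the full alignment length. That is one common convention, assumed here for illustration. This section does not state how identity is calculated, so the disclosure's own definition may differ. The sequences are invented toy strings, not SEQ ID NO: 4 or 6.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned (equal length, "-" for gaps). Columns
    where either sequence has a gap never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=85.0):
    """True if the pair reaches the given percent identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


reference = "MKTAYIAKQR"   # hypothetical 10-residue reference
variant = "MKTAYLAKQR"     # one substitution at position 6
print(percent_identity(reference, variant))                  # 90.0
print(meets_identity_threshold(reference, variant, 95.0))    # False
```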
The present disclosure also provides polynucleotide sequences encoding at least one of the engineered transaminases provided herein. In some embodiments, a polynucleotide sequence encodes at least one engineered transaminase and comprises at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 3 and/or 5, wherein the polynucleotide sequence of the engineered transaminase comprises at least one substitution at one or more positions. In some further embodiments, the polynucleotide sequence encodes at least one engineered transaminase, or functional fragment thereof, and comprises at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 3 and/or 5. In yet other embodiments, the polynucleotide sequence is operably linked to a control sequence. In still further embodiments, the polynucleotide sequence is codon optimized.
The present disclosure also provides expression vectors comprising at least one polynucleotide sequence encoding an engineered transaminase provided herein. In some embodiments, the expression vector comprises at least one polynucleotide sequence comprising at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 3 and/or 5, wherein the polynucleotide sequence of the engineered transaminase comprises at least one substitution at one or more positions. In some embodiments, the expression vector comprises a polynucleotide sequence encoding at least one engineered transaminase or functional fragment thereof, the polynucleotide sequence comprising at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 3 and/or 5.
The present disclosure also provides host cells comprising at least one expression vector provided herein. In some embodiments, the host cell comprises at least one polynucleotide sequence provided herein. In some embodiments, the host cell comprises at least one polynucleotide sequence comprising at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 3 and/or 5, wherein the polynucleotide sequence encoding the engineered transaminase comprises at least one substitution at one or more positions. In some embodiments, the host cell comprises a polynucleotide sequence encoding at least one engineered transaminase comprising at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to SEQ ID NO: 4 and/or 6. In some embodiments, at least one polynucleotide sequence encoding an engineered transaminase is present in at least one expression vector.
The present disclosure also provides methods of producing an engineered transaminase in a host cell, the methods comprising culturing a host cell provided herein under suitable conditions, thereby producing at least one engineered transaminase. In some embodiments, the method further comprises recovering at least one engineered transaminase from the culture and/or the host cell. In some further embodiments, the method further comprises the step of purifying the at least one engineered transaminase.
In some embodiments, the engineered polypeptide having transaminase activity is immobilized on a solid support, optionally wherein the solid support is selected from beads or resins comprising polymethacrylates with epoxy functionality, polymethacrylates with amino epoxy functionality, styrene/DVB copolymers with octadecyl functionality, or polymethacrylates.
In some embodiments, an engineered polypeptide having transaminase activity is capable of converting a substrate compound (2) to a product compound (3) under appropriate reaction conditions (see section 5.3 below). In some embodiments, the engineered polypeptide is capable of converting compound (2) to compound (3) with at least 1.2-fold, 2-fold, 5-fold, 10-fold, 20-fold, 25-fold, 50-fold, 75-fold, 100-fold greater activity than the activity of the reference sequence (e.g., SEQ ID NOS: 4 and/or 6) under appropriate reaction conditions. In some embodiments, the engineered polypeptide is capable of converting compound (2) to compound (3) with increased activity relative to a reference sequence (e.g., SEQ ID NO:4 and/or 6), wherein suitable reaction conditions include a loading of at least 50g/L of compound (2), about 5g/L of the engineered polypeptide, about 0.25g/L of PLP, about 1.8M isopropylamine, about pH 10, and about 50 ℃.
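The reaction conditions quoted above mix mass concentrations (g/L) with a molar concentration (M). The sketch below scales them to a chosen reaction volume and converts between the two unit systems. The function and key names are hypothetical. The molecular weights for pyridoxal 5'-phosphate and isopropylamine are standard reference values, not figures from this document, and compound (2) is left in grams because its molecular weight is not given here.

```python
PLP_MW = 247.14            # g/mol, pyridoxal 5'-phosphate (C8H10NO6P)
ISOPROPYLAMINE_MW = 59.11  # g/mol, isopropylamine (C3H9N)

# Approximate conditions as recited in the text.
CONDITIONS = {
    "compound_2_g_per_l": 50.0,
    "enzyme_g_per_l": 5.0,
    "plp_g_per_l": 0.25,
    "isopropylamine_mol_per_l": 1.8,
}


def batch_amounts(volume_l, conditions=CONDITIONS):
    """Return grams of each component needed for a reaction of volume_l liters."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {
        "compound_2_g": conditions["compound_2_g_per_l"] * volume_l,
        "enzyme_g": conditions["enzyme_g_per_l"] * volume_l,
        "plp_g": conditions["plp_g_per_l"] * volume_l,
        "isopropylamine_g": (
            conditions["isopropylamine_mol_per_l"] * ISOPROPYLAMINE_MW * volume_l
        ),
    }


def plp_millimolar(conditions=CONDITIONS):
    """Convert the PLP loading from g/L to mmol/L."""
    return conditions["plp_g_per_l"] / PLP_MW * 1000.0


amounts = batch_amounts(0.1)  # a 100 mL reaction
print(round(amounts["isopropylamine_g"], 2))  # 10.64
print(round(plp_millimolar(), 2))             # 1.01
```

The enzyme loading works out to one tenth of the substrate loading by mass under these conditions, which is why higher activity at high substrate loading is the property the disclosure emphasizes.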
Guidance regarding the selection of engineered transaminases, the preparation of biocatalysts, the selection of enzyme substrates, and parameters for performing the process are further described in the detailed description below.
5. Detailed description of the preferred embodiments
As used in this specification and the appended claims, the singular forms "a", "an", and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a polypeptide" includes more than one polypeptide.
Similarly, "comprise," "comprises," "comprising," "include," "includes," and "including" are interchangeable and are not intended to be limiting.
It is to be further understood that where descriptions of various embodiments use the term "comprising," those skilled in the art will understand that, in some specific instances, an embodiment can alternatively be described using the language "consisting essentially of" or "consisting of."
It is to be understood that both the foregoing general description, including the drawings, and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
The section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described.
5.1 Abbreviations
Abbreviations for genetically encoded amino acids are conventional and as follows:
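Amino acid       Three-letter   One-letter
Alanine          Ala            A
Arginine         Arg            R
Asparagine       Asn            N
Aspartate        Asp            D
Cysteine         Cys            C
Glutamate        Glu            E
Glutamine        Gln            Q
Glycine          Gly            G
Histidine        His            H
Isoleucine       Ile            I
Leucine          Leu            L
Lysine           Lys            K
Methionine       Met            M
Phenylalanine    Phe            F
Proline          Pro            P
Serine           Ser            S
Threonine        Thr            T
Tryptophan       Trp            W
Tyrosine         Tyr            Y
Valine           Val            V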
When three-letter abbreviations are used, the amino acid may be in either the L- or the D-configuration about the α-carbon (Cα), unless the abbreviation is specifically preceded by an "L" or a "D" or the configuration is clear from the context. For example, "Ala" designates alanine without specifying the configuration about the α-carbon, whereas "D-Ala" and "L-Ala" designate D-alanine and L-alanine, respectively. When one-letter abbreviations are used, uppercase letters designate amino acids in the L-configuration about the α-carbon, and lowercase letters designate amino acids in the D-configuration about the α-carbon. For example, "A" designates L-alanine and "a" designates D-alanine. When polypeptide sequences are presented as a string of one-letter or three-letter abbreviations (or mixtures thereof), the sequences are presented in the amino (N) to carboxyl (C) direction in accordance with common convention.
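The case convention for one-letter abbreviations can be illustrated with a short conversion routine. This is a minimal sketch with a hypothetical function name. It expands a one-letter string into three-letter codes with explicit L- or D- prefixes, reading left to right in the N-to-C direction. Leaving glycine without a prefix is a choice made here because glycine has no stereocenter at the α-carbon; the text above does not address that case.

```python
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "E": "Glu", "Q": "Gln", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def expand_one_letter(sequence):
    """Expand one-letter codes to three-letter codes with L-/D- prefixes.

    Uppercase letters map to the L-configuration and lowercase letters to
    the D-configuration. Glycine is achiral and gets no prefix.
    """
    residues = []
    for letter in sequence:
        three = ONE_TO_THREE.get(letter.upper())
        if three is None:
            raise ValueError(f"unknown residue code: {letter!r}")
        if three == "Gly":
            residues.append(three)
        else:
            prefix = "L-" if letter.isupper() else "D-"
            residues.append(prefix + three)
    return " ".join(residues)


print(expand_one_letter("AaG"))  # L-Ala D-Ala Gly
```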
Abbreviations for the genetically encoding nucleosides are conventional and are as follows: adenosine (A); guanosine (G); cytidine (C); thymidine (T); and uridine (U). Unless specifically stated, the abbreviated nucleosides may be either ribonucleosides or 2'-deoxyribonucleosides. The nucleosides may be specified as ribonucleosides or 2'-deoxyribonucleosides either individually or collectively. When nucleic acid sequences are presented as a string of one-letter abbreviations, the sequences are presented in the 5' to 3' direction in accordance with common convention, and the phosphates are not shown.
5.2 Definitions
Technical and scientific terms used in the description herein will have meanings commonly understood by one of ordinary skill in the art with reference to the present disclosure unless specifically defined otherwise. Accordingly, the following terms are intended to have the following meanings.
"Protein," "polypeptide," and "peptide" are used interchangeably herein to refer to a polymer of at least two amino acids covalently linked by an amide bond, regardless of length or post-translational modification (e.g., glycosylation, phosphorylation, lipidation, myristoylation, ubiquitination, etc.). Included within this definition are D-amino acids and L-amino acids and mixtures of D-amino acids and L-amino acids.
"Polynucleotide" or "nucleic acid" refers to two or more nucleotides that are covalently linked together. The polynucleotide may be composed entirely of ribonucleotides (i.e., RNA), entirely of 2'-deoxyribonucleotides (i.e., DNA), or of a mixture of ribonucleotides and 2'-deoxyribonucleotides. While the nucleosides will typically be linked together via standard phosphodiester linkages, polynucleotides may include one or more non-standard linkages. The polynucleotide may be single-stranded or double-stranded, or may include both single-stranded and double-stranded regions. Furthermore, while a polynucleotide will typically comprise naturally occurring encoding nucleobases (i.e., adenine, guanine, uracil, thymine, and cytosine), it may comprise one or more modified and/or synthetic nucleobases, such as, for example, inosine, xanthine, hypoxanthine, and the like. Preferably, such modified or synthetic nucleobases will be encoding nucleobases.
"Aminotransferase" and "transaminase" are used interchangeably herein to refer to a polypeptide having the enzymatic ability to transfer an amino group (NH2) from a primary amine to the carbonyl group (C=O) of an acceptor molecule. Transaminases as used herein include naturally occurring (wild-type) transaminases as well as non-naturally occurring engineered polypeptides generated by human manipulation.
"Amino acceptor," "amine acceptor," "keto substrate," "keto," and "ketone" are used interchangeably herein to refer to a carbonyl (keto or ketone) compound that accepts an amino group from a donor amine. In some embodiments, the amino acceptor is a molecule of the following general formula,
[Figure: general formula of the amino acceptor, a carbonyl compound bearing the substituents Rα and Rβ]
wherein each of Rα and Rβ, when taken independently, is an alkyl, cycloalkyl, heterocycloalkyl, aryl, or heteroaryl group, which may be unsubstituted or substituted with one or more enzymatically acceptable groups. Rα may be the same as or different from Rβ in structure or chirality. In some embodiments, Rα and Rβ, taken together, may form a ring that is unsubstituted, substituted, or fused to other rings. Amino acceptors include keto carboxylic acids and alkanones (ketones). Typical keto carboxylic acids are α-keto carboxylic acids such as glyoxylic acid, pyruvic acid, oxaloacetic acid, and the like, as well as salts of these acids. Amino acceptors also include substances that are converted to an amino acceptor by other enzymes or whole cell processes, such as fumaric acid (which can be converted to oxaloacetic acid), glucose (which can be converted to pyruvic acid), lactate, maleic acid, and the like. Amino acceptors that can be used include, by way of example and not limitation: 3,4-dihydronaphthalen-1(2H)-one, 1-phenylbutan-2-one, 3,3-dimethylbutan-2-one, octan-2-one, ethyl 3-oxobutyrate, 4-phenylbutan-2-one, 1-(4-bromophenyl)ethanone, 2-methylcyclohexanone, 7-methoxy-2-tetralone, 1-hydroxybutan-2-one, pyruvic acid, acetophenone, 3'-hydroxyacetophenone, 2-methoxy-5-fluoroacetophenone, levulinic acid, 1-phenylpropan-1-one, 1-(4-bromophenyl)propan-1-one, 1-(4-nitrophenyl)propan-1-one, 1-phenylpropan-2-one, 2-oxo-3-methylbutanoic acid, 1-(3-trifluoromethylphenyl)propan-1-one, hydroxyacetone, methoxyoxypropanone, 1-phenylbutan-1-one, 1-(2,5-dimethoxy-4-methylphenyl)butan-2-one, 1-(4-hydroxyphenyl)butan-2-one, acetyl-2-phenylpyruvate, 2-phenylpropanone, 2-ketoglutarate, and 2-ketosuccinic acid, including both (R) and (S) single isomers where possible.
An "amino donor" or "amine donor" refers to an amino compound that donates an amino group to an amino acceptor, thus becoming a carbonyl species. In some embodiments, the amino donor is a molecule of the general formula,
[Figure: general formula of the amino donor, a primary amine bearing the substituents Rε and Rδ]
wherein each of Rε and Rδ, when taken independently, is an alkyl, cycloalkyl, heterocycloalkyl, aryl, or heteroaryl group, which may be unsubstituted or substituted with one or more enzymatically non-inhibiting groups. Rε may be the same as or different from Rδ in structure or chirality. In some embodiments, Rε and Rδ, taken together, may form a ring that is unsubstituted, substituted, or fused to other rings. Typical amino donors that can be used include chiral and achiral amino acids, and chiral and achiral amines. Amino donors that can be used include, by way of example and not limitation: isopropylamine (also known as 2-aminopropane), α-phenylethylamine (also known as 1-phenylethylamine) and its enantiomers (S)-1-phenylethylamine and (R)-1-phenylethylamine, 2-amino-4-phenylbutane, glycine, L-glutamic acid, L-glutamate, monosodium glutamate, L-alanine, D,L-alanine, L-aspartic acid, L-lysine, D,L-ornithine, β-alanine, taurine, n-octylamine, cyclohexylamine, 1,4-butanediamine (also known as putrescine), 1,6-hexamethylenediamine, 6-aminocaproic acid, 4-aminobutyric acid, tyramine and benzylamine, 2-aminobutane, 2-amino-1-butanol, 1-amino-1-phenylethane, 1-amino-1-(2-methoxy-5-fluorophenyl)ethane, 1-amino-1-phenylpropane, 1-amino-1-(4-hydroxyphenyl)propane, 1-amino-1-(4-bromophenyl)propane, 1-amino-1-(4-nitrophenyl)propane, 1-amino-1-nitrophenyl)propane, 1-amino-2-phenylpropane, 2-amino-3-amino-2-phenylpropane, 1-phenyl-2-aminobutane, 1-(2,5-dimethoxy-4-methylphenyl)-2-aminobutane, 1-phenyl-3-aminobutane, 1-(4-hydroxyphenyl)-3-aminobutane, 1-amino-2-methylcyclopentane, 1-amino-3-methylcyclopentane, 1-amino-2-methylcyclohexane, 1-amino-1-(2-naphthyl)ethane, 3-methylcyclopentylamine, 2-ethylcyclopentylamine, 2-methylcyclohexylamine, 3-methylcyclohexylamine, 1-aminotetralin, 2-amino-5-methoxytetralin, and 1-aminoindan, including both (R) and (S) single isomers where possible and including all possible salts of the amines.
"Chiral amine" refers to amines of the general formula Rα-CH(NH2)-Rβ and is used herein in its broadest sense, including a wide variety of aliphatic and alicyclic compounds of different, and mixed, functional types, characterized by the presence of a primary amino group bound to a secondary carbon atom which, in addition to a hydrogen atom, carries either (i) a divalent group forming a chiral cyclic structure, or (ii) two substituents (other than hydrogen) that differ from each other in structure or chirality. Divalent groups forming a chiral cyclic structure include, for example, 2-methylbutane-1,4-diyl, pentane-1,4-diyl, hexane-1,5-diyl, and 2-methylpentane-1,5-diyl. The two different substituents on the secondary carbon atom (Rα and Rβ above) also can vary widely and include alkyl, aralkyl, aryl, halogen, hydroxy, lower alkyl, lower alkoxy, lower alkylthio, cycloalkyl, carboxyl, alkoxycarbonyl, carbamoyl, mono- and di-(lower alkyl)-substituted carbamoyl, trifluoromethyl, phenyl, nitro, amino, mono- and di-(lower alkyl)-substituted amino, alkylsulfonyl, arylsulfonyl, alkylcarboxamide, arylcarboxamide, and the like, as well as alkyl, aralkyl, or aryl groups substituted with the foregoing.
"Pyridoxal phosphate," "PLP," "pyridoxal 5'-phosphate," "PYP," and "P5P" are used interchangeably herein to refer to the compound that acts as a coenzyme in transaminase reactions. In some embodiments, pyridoxal phosphate is defined by the structure 1-(4'-formyl-3'-hydroxy-2'-methyl-5'-pyridinyl)methoxyphosphonic acid, CAS number [54-47-7]. Pyridoxal 5'-phosphate is produced in vivo by phosphorylation and oxidation of pyridoxine (also known as vitamin B6). In a transamination reaction using a transaminase, the amine group of the amino donor is transferred to the coenzyme to produce a ketone by-product, while pyridoxal 5'-phosphate is converted to pyridoxamine phosphate. Pyridoxal 5'-phosphate is regenerated by reaction with a different ketone compound (the amino acceptor). The transfer of the amine group from pyridoxamine phosphate to the amino acceptor produces an amine and regenerates the coenzyme. In some embodiments, pyridoxal 5'-phosphate may be replaced by other members of the vitamin B6 family, including pyridoxine (PN), pyridoxal (PL), pyridoxamine (PM), and their phosphorylated counterparts, pyridoxine phosphate (PNP) and pyridoxamine phosphate (PMP).
"coding sequence" refers to that portion of a nucleic acid (e.g., gene) that encodes the amino acid sequence of a protein.
"naturally occurring" or "wild type" refers to a form found in nature. For example, naturally occurring polypeptides or wild-type polypeptide or polynucleotide sequences are sequences found in organisms that can be isolated from natural sources and are not intentionally modified by human manipulation.
"recombinant" or "engineered" or "non-naturally occurring" when used in reference to, for example, a cell, nucleic acid or polypeptide, refers to the following materials or materials corresponding to the natural or native form of the materials: the material is altered in a manner that would not otherwise exist in nature, or is the same as it but is produced or obtained from synthetic materials and/or by manipulation using recombinant techniques. Non-limiting examples include, among others, recombinant cells expressing genes not found in the native (non-recombinant) form of the cell or expressing native genes that were otherwise expressed at different levels.
"percent sequence identity" and "percent homology" are used interchangeably herein to refer to a comparison between polynucleotides and between polypeptides, and are determined by comparing two optimally aligned sequences in a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may include additions or deletions (i.e., gaps) as compared to a reference sequence for optimal alignment of the two sequences. The percentages can be calculated as follows: determining the number of positions in the two sequences at which the same nucleobase or amino acid residue occurs to produce a number of matched positions, dividing the number of matched positions by the total number of positions in the comparison window, and multiplying the result by 100 to yield the percent sequence identity. Alternatively, the percentages may be calculated as follows: determining the number of positions in the two sequences at which the same nucleobase or amino acid residue occurs or which are aligned with a gap to produce a number of matched positions, dividing the number of matched positions by the total number of positions in the comparison window, and multiplying the result by 100 to yield the percentage of sequence identity. Those skilled in the art understand that there are many established algorithms that can be used to align two sequences. The optimal alignment of sequences for comparison can be performed, for example, by: local homology algorithms by Smith and Waterman,1981, adv. Appl. Math.2:482, homology alignment algorithms by Needleman and Wunsch,1970, J. Mol. Biol.48:443, similarity search methods by Pearson and Lipman,1988,Proc.Natl.Acad.Sci.USA 85:2444, computer implementation of these algorithms (GAP, BESTFIT, FASTA or TFASTA in the GCG Wisconsin software package) or visual inspection (see, generally, current Protocols in Molecular Biology, F.M. Ausubel et al, current Protocols, greene Publishing Associates, inc. 
and the company between John Wiley & Sons, inc. (journal in 1995)). Examples of algorithms suitable for determining percent sequence identity and percent sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al, 1990, J.mol. Biol.215:403-410, and Altschul et al, 1977,Nucleic Acids Res.3389-3402, respectively. Software for performing BLAST analysis is available to the public through the national center for biotechnology information website. The algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or meet a certain positive value of threshold score T when aligned with words of the same length in the database sequence. T is referred to as the neighborhood word score threshold (Altschul et al, supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits then extend in both directions along each sequence to the extent that the cumulative alignment score cannot be increased. For nucleotide sequences, cumulative scores were calculated using parameters M (reward score for matching residue pairs; always > 0) and N (penalty score for mismatched residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. The stop word hits the extension in each direction when: the cumulative alignment score decreases from its maximum reached value by an amount X; as one or more negative scoring residue alignments are accumulated, the cumulative score reaches 0 or less; or to the end of either sequence. The BLAST algorithm parameters W, T and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses the following as default values: word length (W) is 11, expected value (E) is 10, m=5, n= -4, and comparison of the two chains. 
For amino acid sequences, the BLASTP program uses as defaults a word length (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, 1992, Proc. Natl. Acad. Sci. USA 89:10915). An exemplary determination of sequence alignment and % sequence identity can employ the BESTFIT or GAP programs in the GCG Wisconsin software package (Accelrys, Madison, WI), using the default parameters provided.
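The two percentage calculations described above can be sketched as follows. This is a minimal illustration, not the implementation used by BLAST, GAP, or BESTFIT: it assumes the two sequences have already been aligned to equal length with "-" marking gaps, and the function name `percent_identity` and its `count_gaps_as_match` flag are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str,
                     count_gaps_as_match: bool = False) -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    Both strings must have the same length; '-' marks a gap.
    With count_gaps_as_match=True, a residue aligned with a gap also counts
    as a matched position (the alternative calculation in the text).
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")

    matched = 0
    for a, b in zip(aligned_a, aligned_b):
        if a != "-" and a == b:
            matched += 1  # identical residue in both sequences
        elif count_gaps_as_match and (a == "-") != (b == "-"):
            matched += 1  # residue aligned with a gap
    # Total number of positions in the comparison window is the alignment length.
    return 100.0 * matched / len(aligned_a)


if __name__ == "__main__":
    print(round(percent_identity("ACGT-A", "ACCTGA"), 2))
    print(round(percent_identity("ACGT-A", "ACCTGA", count_gaps_as_match=True), 2))
```

In this toy alignment, four of six positions carry identical residues, and one further position pairs a residue with a gap, so the two calculations give different percentages for the same window.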
"reference sequence" refers to a defined sequence that serves as the basis for sequence comparison. The reference sequence may be a subset of a larger sequence, e.g., a segment of a full-length gene or polypeptide sequence. Typically, the reference sequence is at least 20 nucleotides or amino acid residues in length, at least 25 residues in length, at least 50 residues in length, or the full length of the nucleic acid or polypeptide. Because two polynucleotides or polypeptides may each (1) include a sequence that is similar between the two sequences (i.e., a portion of the complete sequence), and (2) may also include a different (divegent) sequence between the two sequences, sequence comparisons between two (or more) polynucleotides or polypeptides are typically made by comparing the sequences of the two polynucleotides or polypeptides in a "comparison window" to identify and compare sequence similarity of local regions. In some embodiments, a "reference sequence" may be based on a primary amino acid sequence (primary amino acid sequence), where the reference sequence is a sequence that may have one or more changes in the primary sequence. For example, "a reference sequence based on SEQ ID NO:2 having alanine at the residue corresponding to X34" or X34A refers to a reference sequence in which the corresponding residue at X34 of SEQ ID NO:2, which is threonine, has been changed to alanine.
"comparison window" refers to a conceptual segment of at least about 20 contiguous nucleotide positions or amino acid residues, wherein a sequence can be compared to a reference sequence of at least 20 contiguous nucleotides or amino acids, and wherein the portion of the sequence in the comparison window can include 20% or less additions or deletions (i.e., gaps) as compared to the reference sequence for optimal alignment of the two sequences. The comparison window may be longer than 20 consecutive residues and optionally include windows of 30, 40, 50, 100 or longer.
"substantial identity" refers to a polynucleotide or polypeptide sequence that has at least 80% sequence identity, at least 85% identity, and 89% to 95% sequence identity, more typically at least 99% sequence identity, as compared to a reference sequence, across a comparison window of at least 20 residue positions, often across a comparison window of at least 30-50 residues, wherein the percent sequence identity is calculated by comparing the reference sequence to sequences comprising a total of 20% deletions or additions of the reference sequence within the comparison window. In particular embodiments applied to polypeptides, the term "substantial identity" means that two polypeptide sequences share at least 80% sequence identity, preferably at least 89% sequence identity, at least 95% sequence identity, or more (e.g., 99% sequence identity) when optimally aligned using default GAP weights, such as by the programs GAP or BESTFIT. Preferably, the different residue positions differ by conservative amino acid substitutions.
When used in the context of the numbering of a given amino acid or polynucleotide sequence, "corresponding to", "in reference to", or "relative to" refers to the numbering of the residues of a specified reference sequence when the given amino acid or polynucleotide sequence is compared to the reference sequence. In other words, the residue number or residue position of a given polymer is designated with respect to the reference sequence rather than by the actual numerical position of the residue within the given amino acid or polynucleotide sequence. For example, a given amino acid sequence, such as that of an engineered transaminase, can be aligned to a reference sequence by introducing gaps to optimize residue matches between the two sequences. In these cases, although gaps are present, the residues in the given amino acid or polynucleotide sequence are numbered with respect to the reference sequence to which it has been aligned.
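Numbering relative to a reference sequence can be illustrated with a short sketch. It assumes a pairwise alignment is already available as two equal-length strings with "-" for gaps; the function name `reference_numbering` and the toy sequences are hypothetical and are not taken from the disclosure.

```python
from typing import List, Optional, Tuple


def reference_numbering(aligned_ref: str,
                        aligned_query: str) -> List[Tuple[str, Optional[int]]]:
    """Assign each query residue the position number of the reference residue
    it is aligned with (1-based). Query residues aligned with a gap in the
    reference (insertions) have no reference position and get None.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")

    numbered = []
    ref_pos = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if ref_res != "-":
            ref_pos += 1  # advance only on real reference residues
        if query_res != "-":
            numbered.append((query_res, ref_pos if ref_res != "-" else None))
    return numbered


if __name__ == "__main__":
    # The query lacks the reference's second residue and carries one insertion.
    for residue, position in reference_numbering("MTA-K", "M-AGK"):
        print(residue, position)
```

Here the alanine is the second residue of the query but is numbered 3, because it aligns with position 3 of the reference.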
"amino acid difference" or "residue difference" refers to a change in an amino acid residue at one position in a polypeptide sequence relative to an amino acid residue at a corresponding position in a reference sequence. The position of an amino acid difference is generally referred to herein as "Xn", where n refers to the corresponding position in the reference sequence on which the residue difference is based. For example, "a residue difference at position X34 compared to SEQ ID NO. 2" refers to a change in the amino acid residue at a polypeptide position corresponding to position 34 of SEQ ID NO. 2. Thus, if the reference polypeptide of SEQ ID NO. 2 has a threonine at position 34, "residue difference at position X34 as compared to SEQ ID NO. 2" refers to an amino acid substitution of any residue other than threonine at the position of the polypeptide corresponding to position 34 of SEQ ID NO. 2. In most cases herein, a particular amino acid residue difference at one position is indicated as "XnY", where "Xn" is the corresponding position as specified above, and "Y" is a single letter identifier of the amino acid found in the engineered polypeptide (i.e., a different residue than in the reference polypeptide). In some embodiments, when more than one amino acid can occur at a given residue position, the optional amino acids can be listed in XnY/Z form, where Y and Z represent optional amino acid residues. In some examples (e.g., in tables 5.1 and 6.1), the present disclosure also provides for specific amino acid differences represented by the conventional symbol "AnB," where a is a single-letter identifier of a residue in the reference sequence, "n" is a number of a residue position in the reference sequence, and B is a single-letter identifier of a residue substitution in the sequence of the engineered polypeptide. 
Additionally, in some examples, a polypeptide of the disclosure may comprise one or more amino acid residue differences relative to a reference sequence, represented by a list of specific positions that are altered relative to the reference sequence.
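The "Xn", "XnY", "XnY/Z", and "AnB" notations described above can be parsed mechanically. The following is an illustrative sketch only: the `ResidueDifference` type and `parse_residue_difference` function are hypothetical names, and the parser assumes single-letter amino acid codes with "X" used solely as the position placeholder.

```python
import re
from typing import NamedTuple, Optional, Tuple


class ResidueDifference(NamedTuple):
    reference: Optional[str]   # residue in the reference sequence, None for "Xn..." forms
    position: int              # position n in the reference sequence
    variants: Tuple[str, ...]  # residue(s) found in the engineered polypeptide


# One leading letter, the position number, then optional slash-separated residues.
_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)?$")


def parse_residue_difference(label: str) -> ResidueDifference:
    """Parse labels such as 'X34', 'X34A', 'X34A/G', or 'T34A'."""
    match = _PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized residue difference: {label!r}")
    first, position, rest = match.groups()
    reference = None if first == "X" else first
    variants = tuple(rest.split("/")) if rest else ()
    return ResidueDifference(reference, int(position), variants)


if __name__ == "__main__":
    for text in ("X34", "X34A", "X34A/G", "T34A"):
        print(text, parse_residue_difference(text))
```

A label without a trailing residue ("X34") yields an empty variant tuple, which corresponds to "any residue other than the reference residue" in the definition above.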
"conservative amino acid substitution" refers to the substitution of a residue with a different residue having a similar side chain, and thus generally includes the substitution of an amino acid in a polypeptide with an amino acid in the same or similar amino acid definition category. For example, and without limitation, an amino acid having an aliphatic side chain may be substituted with another aliphatic amino acid such as alanine, valine, leucine, and isoleucine; amino acids having a hydroxyl side chain are substituted with another amino acid having a hydroxyl side chain such as serine and threonine; an amino acid having an aromatic side chain is substituted with another amino acid having an aromatic side chain such as phenylalanine, tyrosine, tryptophan, and histidine; amino acids having a basic side chain are substituted with another amino acid having a basic side chain such as lysine and arginine; an amino acid having an acidic side chain is substituted with another amino acid having an acidic side chain such as aspartic acid or glutamic acid; and the hydrophobic amino acid or the hydrophilic amino acid is replaced with another hydrophobic amino acid or hydrophilic amino acid, respectively. Exemplary conservative substitutions are provided in table 1 below:
TABLE 1
Figure BDA0004141823750000171
"non-conservative substitution" refers to the substitution of an amino acid in a polypeptide with an amino acid having significantly different side chain properties. Non-conservative substitutions may use amino acids between defined groups, rather than within, and affect: (a) the structure of the peptide backbone in the substitution region (e.g., proline for glycine), (b) charge or hydrophobicity, or (c) side chain volume. For example, but not limited to, exemplary non-conservative substitutions may be substitution of an acidic amino acid with a basic or aliphatic amino acid; substitution of aromatic amino acids with small amino acids; and replacing the hydrophilic amino acid with a hydrophobic amino acid.
"deletion" refers to modification of a polypeptide by removing one or more amino acids from a reference polypeptide. Deletions may include removal of 1 or more amino acids, 2 or more amino acids, 5 or more amino acids, 10 or more amino acids, 15 or more amino acids, or 20 or more amino acids, up to 10% of the total number of amino acids comprising the reference enzyme or up to 20% of the total number of amino acids, while retaining enzyme activity and/or retaining improved properties of the engineered transaminase. Deletions may involve internal and/or terminal portions of the polypeptide. In various embodiments, the deletions may include continuous segments or may be discontinuous.
"insertion" refers to modification of a polypeptide by adding one or more amino acids to a reference polypeptide. In some embodiments, the improved engineered transaminase comprises an insertion of one or more amino acids into a naturally occurring transaminase polypeptide, as well as an insertion of one or more amino acids into other improved transaminase polypeptides. The insertion may be in an internal portion of the polypeptide, or to the carboxyl or amino terminus. As used herein, an insertion includes fusion proteins as known in the art. An insert may be a contiguous segment of amino acids, or separated by one or more amino acids in a reference polypeptide.
"fragment" as used herein refers to a polypeptide having an amino-terminal and/or carboxy-terminal deletion, but wherein the remaining amino acid sequence is identical to the corresponding position in the sequence. Fragments may be at least 14 amino acids long, at least 20 amino acids long, at least 50 amino acids long or longer, and up to 70%, 80%, 90%, 95%, 98% and 99% of the full-length transaminase polypeptide, e.g., the reference engineered transaminase polypeptide of SEQ ID NO. 2.
An "isolated polypeptide" refers to a polypeptide that: the polypeptides are substantially separated from other contaminants with which they naturally accompany, such as proteins, lipids, and polynucleotides. The term includes polypeptides that have been removed or purified from their naturally occurring environment or expression system (e.g., in a host cell or in vitro synthesis). The improved transaminase may be present in the cell, in the cell culture medium or prepared in various forms, such as a lysate or isolated preparation. As such, in some embodiments, the improved transaminase can be an isolated polypeptide.
"substantially pure polypeptide" refers to a composition in which the polypeptide material is the predominant material present (i.e., it is more abundant on a molar or weight basis than any other individual macromolecular material in the composition), and is typically a substantially purified composition when the target material comprises at least about 50% by mole or weight of the macromolecular material present. Generally, a substantially pure transaminase composition will constitute about 60% or more, about 70% or more, about 80% or more, about 90% or more, about 95% or more, and about 98% or more by mole or weight% of all macromolecular species present in the composition. In some embodiments, the target substance is purified to substantial homogeneity (i.e., contaminant substances cannot be detected in the composition by conventional detection methods), wherein the composition consists essentially of a single macromolecular substance. Solvent species, small molecules (< 500 daltons), and elemental ion species are not considered macromolecular species. In some embodiments, the isolated improved transaminase polypeptide is a substantially pure polypeptide composition.
"stereoselectivity" refers to the preferential formation of one stereoisomer over another stereoisomer in a chemical or enzymatic reaction. The stereoselectivity may be partial, where one stereoisomer forms better than the other, or the stereoselectivity may be complete, where only one stereoisomer forms. When stereoisomers are enantiomers, the stereoselectivity is referred to as enantioselectivity, i.e., the fraction of one enantiomer (usually reported as a percentage) of the sum of the two enantiomers. Alternatively, it is generally reported in the art as an enantiomeric excess (e.e.), typically as a percentage, calculated therefrom according to the formula: [ major enantiomer-minor enantiomer ]/[ major enantiomer + minor enantiomer ]. Where stereoisomers are diastereomers, the stereoselectivity is referred to as diastereoselectivity, i.e., the fraction of one diastereomer (typically reported as a percentage) of a mixture of two diastereomers, typically alternatively reported as diastereomeric excess (d.e.). Enantiomeric excess and diastereomeric excess are types of stereoisomer excess.
"highly stereoselective" refers to chemical or enzymatic reactions capable of converting a substrate, such as compound (2), to its corresponding chiral amine product, such as compound (3), in a stereomeric excess of at least about 85%.
By "improved enzymatic property" is meant an improved transaminase polypeptide that exhibits any enzymatic property as compared to a reference transaminase. For the engineered transaminase polypeptides described herein, the comparison is generally made for a wild-type transaminase, although in some embodiments, the reference transaminase may be another engineered transaminase. Desirable improved enzyme properties include, but are not limited to, enzyme activity (which may be expressed in terms of percent conversion of substrate), thermostability, solvent stability, pH activity profile (profile), cofactor requirements, refractoriness to inhibitors (e.g., substrate or product inhibition), product or substrate tolerance and stereoselectivity (including enantioselectivity).
"enhanced enzymatic activity" refers to an improved property of an engineered transaminase polypeptide, which can be expressed as an enhanced specific activity (e.g., product produced/time/weight protein) or an increased percent conversion of substrate to product (e.g., a specified amount of transaminase used in a specified period of time, percent conversion of starting amount of substrate to product) as compared to a reference transaminase. Exemplary methods of determining enzyme activity are provided in the examples. Any property associated with enzyme activity may be affected, including classical enzyme property K m 、V max Or k cat The alteration may result in an increase in enzyme activity. The improvement in enzyme activity may be about 1.2-fold, up to 2-fold, 5-fold, 10-fold, 20-fold, 25-fold, 50-fold, 75-fold, 100-fold or more of the enzyme activity of the corresponding wild-type transaminase enzyme, or of an additional engineered transaminase from which the transaminase polypeptide is derived. The transaminase activity can be determined by any of the standard assays, such as by monitoring changes in spectrophotometric characteristics of the reactant or product. In some embodiments, the amount of product produced may be measured by High Performance Liquid Chromatography (HPLC) separation of bound UV absorbance or fluorescence detection after derivatization, such as with o-phthalaldehyde (OPA). Comparison of enzyme Activity Using defined enzyme preparations, assays defined under set conditions and one or moreMore defined substrates are performed as described in further detail herein. Typically, when comparing lysates, the number of cells and the amount of protein assayed are determined and the same expression system and the same host cell are used to minimize the variation in the amount of enzyme produced by the host cell and present in the lysate.
"conversion" refers to the enzymatic conversion of a substrate to the corresponding product. "percent conversion" refers to the percentage of substrate that is converted to product over a period of time under specified conditions. Thus, the "enzymatic activity" or "activity" of a transaminase polypeptide can be expressed as a "percent conversion" of a substrate to a product.
By "thermostable" is meant that the transaminase polypeptide retains a similar activity (e.g., greater than 60% to 80%) after exposure to an elevated temperature (e.g., 40-80 ℃) for a period of time (e.g., 0.5-24 hours) as compared to the wild-type enzyme.
"solvent stable" refers to a transaminase polypeptide that retains similar activity (more than, e.g., 60% to 80%) after exposure to different concentrations (e.g., 5% -99%) of solvent (ethanol, isopropanol, dimethyl sulfoxide (DMSO), tetrahydrofuran, 2-methyltetrahydrofuran, acetone, toluene, butyl acetate, methyl t-butyl ether, etc.) for a period of time (e.g., 0.5-24 hours) as compared to the wild-type enzyme.
"thermostable and solvent stable" refers to a thermostable and solvent stable transaminase polypeptide.
"stringent hybridization" is used herein to refer to conditions under which a nucleic acid hybrid is stable. As known to those skilled in the art, the stability of a hybrid is reflected in the melting temperature (T m ) Is a kind of medium. Generally, the stability of a hybrid varies with ionic strength, temperature, G/C content and the presence of chaotropic agents. T of Polynucleotide m The values may be calculated using known methods for predicting melting temperature (see, e.g., baldino et al Methods Enzymology 168:761-777; bolton et al 1962,Proc.Natl.Acad.Sci.USA 48:1390;Bresslauer et al 1986,Proc.Natl.Acad.Sci USA 83:8893-8897; freier et al 1986,Proc.Natl.Acad.Sci USA 83:9373-9377; kierzek et al Biochemistry 25:7840-784) 6, preparing a base material; rychlik et al 1990,Nucleic Acids Res 18:6409-6412 (Protect 1991,Nucleic Acids Res 19:698); sambrook et al, supra); suggs et al, 1981,In Developmental Biology Using Purified Genes (Brown et al, editorial), pages 683-693, academic Press; and Wetmur,1991,Crit Rev Biochem Mol Biol 26:227-259, all of which are incorporated herein by reference). In some embodiments, the polynucleotide encodes a polypeptide disclosed herein and hybridizes under defined conditions, such as moderately stringent or highly stringent conditions, to a complement of a sequence encoding an engineered transaminase of the present disclosure.
"hybridization stringency" refers to hybridization conditions, such as washing conditions, in nucleic acid hybridization. Typically, the hybridization reaction is performed under conditions of lower stringency, followed by a different but higher stringency wash. The term "moderately stringent hybridization" refers to conditions that allow the target DNA to bind to a complementary nucleic acid that is about 60% identical, preferably about 75% identical, about 85% identical to the target DNA and greater than about 90% identical to the target polynucleotide. Exemplary moderately stringent conditions are those equivalent to hybridization in 50% formamide, 5 XDenhart solution, 5 XSSPE, 0.2% SDS at 42℃followed by washing in 0.2 XSSPE, 0.2% SDS at 42 ℃. "high stringency hybridization" generally refers to the thermal melting temperature T as determined under solution conditions for a defined polynucleotide sequence m Differing by about 10 c or less. In some embodiments, high stringency conditions refer to conditions that allow hybridization of only those nucleic acid sequences that form stable hybrids in 0.018M NaCl at 65 ℃ (i.e., if the hybrids are unstable in 0.018M NaCl at 65 ℃, they are unstable under high stringency conditions as contemplated herein). High stringency conditions can be provided, for example, by hybridization at 42℃equivalent to that of 50% formamide, 5 XDenhart's solution, 5 XSSPE, 0.2% SDS, followed by washing at 65℃in 0.1 XSSPE and 0.1% SDS. Another high stringency condition is hybridization in 5 XSSC containing 0.1% (w: v) SDS at 65℃and washing in 0.1 XSSC containing 0.1% SDS at 65 ℃. Other high stringency hybridization conditions and moderate stringencySex conditions are described in the references cited above.
"heterologous" polynucleotide refers to any polynucleotide that is introduced into a host cell by experimental techniques, and includes polynucleotides that are removed from the host cell, subjected to laboratory procedures, and then reintroduced into the host cell.
"codon optimized" refers to the change of codons of a polynucleotide encoding a protein to those codons that are preferentially used in a particular organism such that the encoded protein is efficiently expressed in the organism of interest. Although the genetic code is degenerate, i.e., most amino acids are represented by several codons called "synonymous" or "synonymous" codons, it is well known that codon usage for a particular organism is non-random and biased for a particular codon triplet. This codon usage bias may be higher for a given gene, a gene of common function or ancestral origin, a highly expressed protein versus a low copy number protein, and the collectin coding region of the genome of the organism. In some embodiments, the polynucleotide encoding the transaminase may be codon optimized for optimal production from the host organism selected for expression.
"preferred, optimal, highly codon usage preferred codons" interchangeably refer to codons in a protein coding region that are used at a higher frequency than other codons encoding the same amino acid. Preferred codons may be determined based on the codon usage in a single gene, a group of genes of common function or origin, a highly expressed gene, the codon frequency in the agrin coding region of the whole organism, the codon frequency in the agrin coding region of the relevant organism, or a combination thereof. Codons whose frequency increases with the level of gene expression are generally the optimal codons for expression. Various methods for determining codon frequency (e.g., codon usage, relative synonymous codon usage) and codon preference in a particular organism, as well as the effective number of codons used in a gene, are known, including multivariate analysis, e.g., using cluster analysis or correlation analysis (see GCG CodonPreference, genetics Computer Group Wisconsin Package; codonW, john Peden, university of Nottingham; mclnerney, j.o,1998,Bioinformatics 14:372-73; stenico et al, 1994,Nucleic Acids Res.222437-46; wright, f.,1990, gene87:23-29). Codon usage tables are increasingly available to organisms (see, e.g., wada et al 1992,Nucleic Acids Res.20:2111-2118; nakamura et al 2000,Nucl.Acids Res.28:292;Duret et al, supra; henout and Danchin, "Escherichia coli and Salmonella,"1996, neidhardt et al, ASM Press, washington D.C., p.2047-2066). The data source used to obtain codon usage may depend on any available nucleotide sequence capable of encoding a protein. 
These datasets include nucleic acid sequences that are known to actually encode expressed proteins (e.g., complete protein coding sequence-CDS), expressed Sequence Tags (ESTS), or predicted coding regions of genomic sequences (see, e.g., mount, D., bioinformation: sequence and Genome Analysis, chapter 8, cold Spring Harbor Laboratory Press, cold Spring Harbor, N.Y.,2001;Uberbacher,E.C, 1996,Methods Enzymol.266:259-281; tiwari et al, 1997, comput. Appl. Biosci.13: 263-270).
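Codon frequency among synonymous codons can be tabulated directly from a coding sequence. The following is a minimal sketch and not one of the cited tools (CodonPreference, CodonW): it assumes an in-frame DNA coding sequence and the standard genetic code, and it reports, for each codon observed, its share among the observed codons for the same amino acid. The names `CODON_TABLE` and `relative_codon_frequencies` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product
from typing import Dict

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)}


def relative_codon_frequencies(cds: str) -> Dict[str, float]:
    """For each codon in the coding sequence, return its fraction among all
    observed codons that encode the same amino acid."""
    seq = cds.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq), 3))

    per_amino_acid = defaultdict(int)
    for codon, n in counts.items():
        per_amino_acid[CODON_TABLE[codon]] += n  # KeyError on non-ACGT codons

    return {codon: n / per_amino_acid[CODON_TABLE[codon]]
            for codon, n in counts.items()}


if __name__ == "__main__":
    # Toy sequence: four leucine codons, three CTG and one TTA.
    print(relative_codon_frequencies("CTGCTGCTGTTA"))
```

In practice the frequencies would be computed over a large set of coding sequences (for example, highly expressed genes of the host organism) rather than a single short sequence, and the most frequent codon for each amino acid would be a candidate preferred codon.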
The definition of "control sequences" herein includes all components necessary or advantageous for the expression of the polynucleotides and/or polypeptides of the present disclosure. Each control sequence may be native or foreign to the nucleic acid sequence encoding the polypeptide. Such control sequences include, but are not limited to, leader sequences, polyadenylation sequences, propeptide sequences, promoters, signal peptide sequences, and transcription terminators. At a minimum, the control sequences include a promoter and transcriptional and translational stop signals. The control sequences may be provided with linkers for the purpose of introducing specific restriction sites facilitating ligation of the control sequences with the coding region of the nucleic acid sequence encoding a polypeptide.
"operably connected" is defined herein as configured as follows: in such a configuration the control sequences are suitably placed (i.e., in functional relationship) at positions relative to the polynucleotide of interest such that the control sequences direct or regulate expression of the polynucleotide and/or polypeptide of interest.
"promoter sequence" refers to a nucleic acid sequence that is recognized by a host cell for expression of a polynucleotide of interest, such as a coding sequence. The promoter sequence comprises a transcription control sequence that mediates expression of the polynucleotide of interest. The promoter may be any nucleic acid sequence that exhibits transcriptional activity in the host cell of choice including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.
"suitable reaction conditions" refer to those conditions in the biocatalytic reaction solution (e.g., ranges of enzyme loading, substrate loading, cofactor loading, temperature, pH, buffers, co-solvents, etc.): under such conditions the transaminase polypeptides of the present disclosure are capable of converting a substrate compound to a product compound (e.g., converting compound (2) to compound (3)). Exemplary "suitable reaction conditions" are provided in the detailed description and are exemplified by the examples.
"loading" such as in "compound loading" or "enzyme loading" or "cofactor loading" refers to the concentration or amount of a component in the reaction mixture at the beginning of the reaction.
"substrate" in the context of biocatalyst-mediated methods refers to a compound or molecule that is acted upon by a biocatalyst. For example, an exemplary substrate for an engineered transaminase biocatalyst in the methods disclosed herein is compound (2).
"product" in the context of a biocatalyst-mediated process refers to a compound or molecule resulting from the action of a biocatalyst. For example, an exemplary product of the engineered transaminase biocatalyst in the methods disclosed herein is compound (3).
"heteroalkyl", "heteroalkenyl" and "heteroalkynyl" refer to alkyl, alkenyl and alkynyl groups as defined herein in which one or more carbon atoms are each independently replaced with the same or different heteroatoms or heteroatom groups. Heteroatoms and/or heteroatom groups that may be substituted for carbon atoms include, but are not limited to, -O-, -S-O-, -NR γ -、-PH-、-S(O)-、-S(O) 2 -、-S(O)NR γ -、-S(O) 2 NR γ -and the like, including combinations thereof, wherein each R γ Independently selected from the group consisting of hydrogen, alkyl, heteroalkyl, cycloalkyl, heterocycloalkyl, aryl, heteroaryl, and other suitable substituents.
"aryl" refers to an unsaturated aromatic carbocyclic group having 6 to 12 carbon atoms (inclusive) with a single ring (e.g., phenyl) or more than one fused ring (e.g., naphthyl or anthracenyl). Exemplary aryl groups include phenyl, pyridyl, naphthyl, and the like.
"arylalkyl" refers to an alkyl (alkyl) substituted with an aryl group, i.e., an "aryl-alkyl-" group, preferably having from 1 to 6 carbon atoms in the alkyl portion (inclusive) and from 6 to 12 carbon atoms in the aryl portion (inclusive). Such arylalkyl groups are exemplified by phenyl, naphthyl, and the like.
"arylalkenyl" refers to alkenyl substituted with aryl, i.e., an "aryl-alkenyl-" group, preferably having from 2 to 6 carbon atoms (inclusive) in the alkenyl moiety and from 6 to 12 carbon atoms (inclusive) in the aryl moiety.
"arylalkynyl" refers to an alkynyl group substituted with an aryl group, i.e., an "aryl-alkynyl-" group, preferably having from 2 to 6 carbon atoms (inclusive) in the alkynyl moiety and from 6 to 12 carbon atoms (inclusive) in the aryl moiety.
"cycloalkyl" refers to a cyclic alkyl group of from 3 to 12 carbon atoms (inclusive) having a single ring or multiple condensed rings, optionally substituted with from 1 to 3 alkyl groups. Exemplary cycloalkyl groups include, but are not limited to, single ring structures such as cyclopropyl, cyclobutyl, cyclopentyl, cyclooctyl, 1-methylcyclopropyl, 2-methylcyclopentyl, 2-methylcyclooctyl, and the like, or multi-ring structures including bridged ring systems such as adamantyl and the like.
"cycloalkylalkyl" refers to an alkyl group substituted with a cycloalkyl group, i.e., a "cycloalkyl-alkyl-" group, preferably having from 1 to 6 carbon atoms (inclusive) in the alkyl portion and from 3 to 12 carbon atoms (inclusive) in the cycloalkyl portion. Such cycloalkylalkyl groups are exemplified by cyclopropylmethyl, cyclohexylethyl, and the like.
"cycloalkylalkenyl" refers to alkenyl substituted with cycloalkyl, i.e. "cycloalkyl-alkenyl-" groups, preferably having from 2 to 6 carbon atoms (inclusive) in the alkenyl moiety and from 3 to 12 carbon atoms (inclusive) in the cycloalkyl moiety.
"cycloalkylalkynyl" refers to an alkynyl substituted with a cycloalkyl, i.e. "cycloalkyl-alkynyl" group, preferably having from 2 to 6 carbon atoms (inclusive) in the alkynyl moiety and from 3 to 12 carbon atoms (inclusive) in the cycloalkyl moiety.
"amino" means a radical-NH 2 . Substituted amino refers to the group-NHR η 、NR η R η And NR η R η R η Wherein each R is η Independently selected from substituted or unsubstituted alkyl, cycloalkyl, cycloheteroalkyl, alkoxy, aryl, heteroaryl, heteroarylalkyl, acyl, alkoxycarbonyl, sulfanyl (sulfanyl), sulfinyl, sulfonyl, and the like. Typical amino groups include, but are not limited to, dimethylamino, diethylamino, trimethylammonium, triethylammonium, methylsulfonylamino, furyl-oxy-sulfonamino, and the like.
"alkylamino" means-NHR ζ A group, wherein R is ζ Is an alkyl, N-oxide derivative or a protected derivative thereof, for example, methylamino, ethylamino, N-propylamino, i-propylamino, N-butylamino, i-butylamino, t-butylamino or methylamino-N-oxide, and the like.
"Arylamino" means-NHR λ Wherein R is λ Is an aryl group, which may be optionally substituted.
"Heteroarylamino" refers to the radical-NHR σ Wherein R is σ Is a heteroaryl group, which may be optionally substituted.
"aminoalkyl" refers to an alkyl group wherein one or more of the hydrogen atoms is replaced with an amino group, including substituted amino groups.
"oxo" means =o.
"oxy" refers to a divalent group-O-, which may have various substituents to form different oxy groups, including ethers and esters.
"alkoxy" OR "alkyloxy" are used interchangeably herein to refer to the group-OR ζ Wherein R is ζ Are alkyl groups, including optionally substituted alkyl groups as also defined herein.
"aryloxy" means-OR λ A group, wherein R is λ Is an aryl group, which may be optionally substituted.
"heteroaryloxy" means-OR σ Wherein R is σ Is a heteroaryl group, which may be optionally substituted.
"carboxy" refers to-COOH.
"carboxyalkyl" refers to an alkyl group substituted with a carboxyl group.
"carbonyl" refers to-C (O) -, which may have various substituents to form different carbonyl groups, including acids, acid halides, aldehydes, amides, esters, and ketones.
"alkylcarbonyl" means-C (O) R ζ Wherein R is ζ Is an alkyl group, which may be optionally substituted.
"arylcarbonyl" means-C (O) R λ Wherein R is λ Is an aryl group, which may be optionally substituted.
"heteroarylcarbonyl" means-C (O) R σ Wherein R is σ Is a heteroaryl group, which may be optionally substituted.
"Alkyloxycarbonyl" refers to-C (O) OR ζ Wherein R is ζ Is an alkyl group, which may be optionally substituted.
"Aryloxycarbonyl" means-C (O) OR λ Wherein R is λ Is an aryl group, which may be optionally substituted.
"heteroaryloxycarbonyl" refers to-C (O) OR σ Wherein R is σ Is a heteroaryl group, which may be optionally substituted.
"arylalkoxycarbonyl" means-C (O) OR ρ Wherein R is ρ Is an "aryl-alkyl-" group,which may be optionally substituted.
"Alkylcarbonyloxy" means-OC (O) -R ζ Wherein R is ζ Is an alkyl group, which may be optionally substituted.
"arylcarbonyloxy" means-OC (O) R λ Wherein R is λ Is an aryl group, which may be optionally substituted.
"Heteroarylalkyloxycarbonyl" refers to-C (O) OR ω Wherein R is ω Is a heteroarylalkyl group, which may be optionally substituted.
"Heteroarylcarbonyloxy" means-OC (O) R σ Wherein R is σ Is a heteroaryl group, which may be optionally substituted.
"aminocarbonyl" means-C (O) NH 2 . Substituted aminocarbonyl refers to-C (O) NR η R η Wherein the amino group NR η As defined herein.
"aminocarbonylalkyl" refers to an alkyl group substituted with an aminocarbonyl group.
"halogen" or "halo" refers to fluoro, chloro, bromo and iodo.
"haloalkyl" refers to an alkyl group substituted with one or more halogens. Thus, the term "haloalkyl" is intended to include monohaloalkyl, dihaloalkyl, trihaloalkyl, and the like, up to perhaloalkyl. For example, the expression "(C1-C2) haloalkyl" includes 1-fluoromethyl, difluoromethyl, trifluoromethyl, 1-fluoroethyl, 1-difluoroethyl, 1, 2-difluoroethyl, 1-trifluoroethyl, perfluoroethyl and the like.
"hydroxy" refers to-OH.
"hydroxyalkyl" refers to an alkyl group substituted with one or more hydroxyl groups.
"cyano" refers to-CN.
"nitro" means-NO 2
"thio" or "sulfanyl" refers to-SH. Substituted thio or sulfanyl means-S-R η Wherein R is η Is alkyl, aryl or other suitable substituent.
"Alkylthio" means-SR ζ Wherein R is ζ Is an alkyl group, which may be optionally substituted. Typical alkylthio groups include, but are not limited to, methylthio, ethylthio, n-propylthio, and the like.
"arylthio" means-SR λ Wherein R is λ Is an alkyl group, which may be optionally substituted. Typical arylthio groups include, but are not limited to, phenylthio, (4-tolyl) thio, pyridylthio, and the like.
"heteroarylthio" means-SR σ Wherein R is σ Is heteroaryl, which may be optionally substituted.
"Sulfonyl" means-SO 2 -. Substituted sulfonyl means-SO 2 -R η Wherein R is η Is alkyl, aryl or other suitable substituent.
"alkylsulfonyl" means-SO 2 -R ζ Wherein R is ζ Is an alkyl group, which may be optionally substituted. Typical alkylsulfonyl groups include, but are not limited to, methylsulfonyl, ethylsulfonyl, n-propylsulfonyl, and the like.
"arylsulfonyl" means-SO 2 -R λ Wherein R is λ Is aryl, which may be optionally substituted. Typical arylsulfonyl groups include, but are not limited to, phenylsulfonyl, (4-tolyl) sulfonyl, pyridylsulfonyl, and the like.
"heteroarylsulfonyl" means-SO 2 -R σ Wherein R is σ Is a heteroaryl group, which may be optionally substituted.
"sulfinyl" refers to-SO-. Substituted sulfinyl refers to-SO-R η Wherein R is η Is alkyl, aryl or other suitable substituent.
"Alkylsulfinyl" means-SO-R ζ Wherein R is ζ Is an alkyl group, which may be optionally substituted. Typical alkylsulfinyl groups include, but are not limited to, methylsulfinyl, ethylsulfinyl, n-propylsulfinyl, and the like.
"arylsulfinyl" means-SO-R λ Wherein R is λ Is aryl, which may be optionally substituted. Typical arylsulfinyl groups include, but are not limited to, phenylsulfinyl, (4-tolyl) sulfinyl, pyridylsulfinyl, and the like.
"heteroaryl sulfinyl" refers to-SO-R σ Wherein R is σ Is a heteroaryl group, which may be optionally substituted.
"Alkylsulfamoylalkyl" means "alkyl-NH-SO 2 - "alkyl substituted with a group".
"Arylsulfonylalkyl" means an "aryl-SO group" is used 2 - "alkyl substituted with a group".
"heteroarylsulfonylalkyl" means "heteroaryl-SO" used 2 - "alkyl substituted with a group".
"sulfamoyl" means-SO 2 NH 2 . Substituted aminosulfonyl means-SO 2 NR δ R δ Wherein the amino group-NR η R η As defined herein.
"heteroaryl" refers to an aromatic heterocyclic group having 1 to 10 carbon atoms (inclusive) and 1 to 4 heteroatoms (inclusive) selected from oxygen, nitrogen and sulfur within the ring. Such heteroaryl groups may have a single ring (e.g., pyridyl or furyl) or more than one fused ring (e.g., indolizinyl or benzothienyl).
"heteroarylalkyl" refers to an alkyl group substituted with a heteroaryl group, i.e., a "heteroaryl-alkyl-" group, preferably having 1 to 6 carbon atoms in the alkyl moiety (inclusive) and 5 to 12 ring atoms in the heteroaryl moiety (inclusive). Such heteroarylalkyl groups are exemplified by pyridylmethyl and the like.
"heteroarylalkenyl" refers to an alkenyl group substituted with a heteroaryl group, i.e., a "heteroaryl-alkenyl-" group, preferably having 2 to 6 carbon atoms in the alkenyl moiety (inclusive) and 5 to 12 ring atoms in the heteroaryl moiety (inclusive).
"heteroarylalkynyl" refers to an alkynyl substituted with a heteroaryl, i.e., a "heteroaryl-alkynyl" group, preferably having 2 to 6 carbon atoms in the alkynyl moiety (inclusive) and 5 to 12 ring atoms in the heteroaryl moiety (inclusive).
"heterocycle", "heterocyclic" and interchangeably "heterocycloalkylene" refer to a saturated or unsaturated group having a single ring or more than one fused ring, having from 2 to 10 carbon ring atoms (inclusive) and from 1 to 4 heteroatoms (inclusive) selected from nitrogen, sulfur or oxygen within the ring. Such heterocyclic groups may have a single ring (e.g., piperidinyl or tetrahydrofuranyl) or more than one fused ring (e.g., indolinyl, dihydrobenzofuran, or quinuclidinyl). Examples of heterocycles include, but are not limited to, furan, thiophene, thiazole, oxazole, pyrrole, imidazole, pyrazole, pyridine, pyrazine, pyrimidine, pyridazine, indolizine, isoindole, indole, indazole, purine, quinolizine, isoquinoline, quinoline, phthalazine (phtalazine), naphthyridine, quinoxaline, quinazoline, cinnoline, pteridine, carbazole (carbazole), carboline (carboline), phenanthridine (phenanthrine), acridine, phenanthroline (phenanthrine), isothiazole, phenazine (phenazine), isoxazole, phenoxazine (phenazine), phenothiazine (phenazine), imidazolidine, imidazoline (imidazoline), piperidine, piperazine, pyrrolidine, indoline, and the like.
"heterocyclylalkyl" refers to an alkyl group substituted with a heterocycloalkyl group, i.e., a "heterocycloalkyl-alkyl-" group, preferably having 1 to 6 carbon atoms in the alkyl portion (inclusive) and 3 to 12 ring atoms in the heterocycloalkyl portion (inclusive).
"heterocycloalkylalkenyl" refers to an alkenyl group substituted with a heterocycloalkylene, i.e., a "heterocycloalkylalkenyl-" group, preferably having 2 to 6 carbon atoms in the alkenyl portion (inclusive) and 3 to 12 ring atoms in the heterocycloalkylene portion (inclusive).
"heterocycloalkylalkynyl" refers to an alkynyl substituted with a heterocycloalkylyl, i.e., a "heterocycloalkylalkynyl-" group, preferably having 2 to 6 carbon atoms in the alkynyl moiety (inclusive) and 3 to 12 ring atoms in the heterocycloalkylyl moiety (inclusive).
"leaving group" generally refers to any atom or moiety that can be replaced by another atom or moiety in a chemical reaction. More specifically, a leaving group refers to an atom or moiety that is readily replaced and substituted by a nucleophile (e.g., an amine, thiol, alcohol, or cyanide). Such leaving groups are well known and include carboxylate salts, N-hydroxysuccinimide ("NHS"), N-hydroxybenzotriazole, halogen (fluorine, chlorine, bromine or iodine) and alkoxy groups. Non-limiting features and examples of leaving groups can be found, for example, in Organic Chemistry, second edition, francis Carey (1992), pages 328-331; introduction to Organic Chemistry, second edition, andrew Streitwieser and Clayton Heathcock (1981), pages 169-171; and Organic Chemistry, fifth edition, john McMurry, brooks/Cole Publishing (2000), pages 398 and 408; all of which are incorporated herein by reference in their entirety.
Unless otherwise indicated, the positions occupied by hydrogen in the foregoing groups may be further substituted with substituents such as, but not limited to, the following: hydroxy, oxo, nitro, methoxy, ethoxy, alkoxy, substituted alkoxy, trifluoromethoxy, haloalkoxy, fluoro, chloro, bromo, iodo, halogen, methyl, ethyl, propyl, butyl, alkyl, alkenyl, alkynyl, substituted alkyl, trifluoromethyl, haloalkyl, hydroxyalkyl, alkoxyalkyl, thio, alkylthio, acyl, carboxyl, alkoxycarbonyl, carboxamido, substituted carboxamido, alkylsulfonyl, alkylsulfinyl, alkylsulfonylamino, sulfonamido, substituted sulfonamido, cyano, amino, substituted amino, alkylamino, dialkylamino, aminoalkyl, acylamino, amidino, amidoximo, hydroxamoyl, phenyl, aryl, substituted aryl, aryloxy, arylalkyl, arylalkenyl, arylalkynyl, pyridyl, imidazolyl, heteroaryl, substituted heteroaryl, heteroaryloxy, heteroarylalkyl, heteroarylalkenyl, heteroarylalkynyl, cyclobutyl, cycloalkyl, heterocyclyl, and (heterocyclyl)alkyl; and preferred heteroatoms are oxygen, nitrogen and sulfur. It will be appreciated that where open valences exist on these substituents, they may be further substituted with alkyl, cycloalkyl, aryl, heteroaryl and/or heterocyclic groups; that where such open valences exist on carbon, they may be further substituted with halogen and with oxygen-, nitrogen- or sulfur-bonded substituents; and that where more than one such open valence exists, these groups may be joined to form a ring, either by direct formation of a bond or by formation of bonds to a new heteroatom (preferably oxygen, nitrogen or sulfur). It will also be appreciated that the above substitutions may be made provided that replacing hydrogen with the substituent does not introduce unacceptable instability to the molecules of the present disclosure and is otherwise chemically reasonable.
"optional" or "optionally" means that the subsequently described event or circumstance may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not. Those of ordinary skill in the art will understand that for any molecule described as comprising one or more optional substituents, only sterically practical and/or synthetically feasible compounds are intended to be included. "optionally substituted" means all subsequent modifiers in the term or series of chemical groups. For example, in the term "optionally substituted arylalkyl", the "alkyl" and "aryl" portions of the molecule may or may not be substituted, and for a series of "optionally substituted alkyl, cycloalkyl, aryl, and heteroaryl", the alkyl, cycloalkyl, aryl, and heteroaryl groups may or may not be substituted independently of each other.
"protecting group" refers to a group of atoms that, when attached to a reactive functional group in a molecule, masks, reduces, or prevents the reactivity of the functional group. Typically, the protecting groups may be selectively removed as desired during the course of the synthesis. Examples of protecting groups can be found in Wuts and Greene, "Greene's Protective Groups in Organic Synthesis," 4 th edition, wiley Interscience (2006), and Harrison et al, compendium of Synthetic Organic Methods, volumes 1-8, 1971-1996,John Wiley&Sons,NY. Functional groups that may have protecting groups include, but are not limited to, hydroxyl, amino, and carboxyl groups. Representative amino protecting groups include, but are not limited to, formyl, acetyl, trifluoroacetyl, benzyl, benzyloxycarbonyl ("CBZ"), t-butoxycarbonyl ("Boc"), trimethylsilyl ("TMS"), 2-trimethylsilyl-ethanesulfonyl ("SES"), trityl and substituted trityl groups, allyloxycarbonyl, 9-fluorenylmethoxycarbonyl ("FMOC"), nitro-veratroxycarbonyl ("NVOC"), and the like.
As used herein, "polyol" refers to a compound containing multiple hydroxyl groups. For polymers, polyols include polymers having hydroxyl functionality. Exemplary polymeric polyols include, for example and without limitation, polyethers and polyesters such as polyethylene glycol, polypropylene glycol, poly (tetramethylene) glycol, and polytetrahydrofuran.
5.3 Engineered Transaminase Polypeptides
The present disclosure provides engineered polypeptides having transaminase activity, polynucleotides encoding the polypeptides, and methods of using the polypeptides. Where the foregoing description refers to a polypeptide, it is to be understood that it also describes polynucleotides encoding the polypeptide.
Transaminases, also known as aminotransferases, catalyze the transfer of an amino group from a primary amine of an amino donor substrate to a carbonyl group (e.g., a keto or aldehyde group) of an amino acceptor molecule. Transaminases have been identified from a variety of microorganisms including, but not limited to, Alcaligenes denitrificans, Bordetella bronchiseptica, Bordetella parapertussis, Brucella melitensis, Burkholderia mallei, Chromobacterium violaceum, Oceanicola granulosus HTCC2516, Oceanobacter sp. RED65, Oceanospirillum sp. MED92, Pseudomonas putida, Ralstonia solanacearum, Rhizobium meliloti, Rhizobium sp. (strain NGR234), Bacillus thuringiensis, Klebsiella pneumoniae, and Vibrio fluvialis (see, e.g., Shin et al., 2001, Biosci. Biotechnol. Biochem. 65:1782-1788).
Transaminases can be used for chiral resolution of racemic amines, which exploits the ability of transaminases to react in a stereospecific manner, i.e., to preferentially convert one enantiomer to the corresponding ketone, thereby producing a mixture enriched in the other enantiomer (see, e.g., Koszelewski et al., 2009, Org. Lett. 11(21):4810-2). The stereoselectivity of transaminases in the conversion of a ketone to the corresponding amine also makes these enzymes useful in the asymmetric synthesis of optically pure amines from the corresponding ketone compounds (see, e.g., Höhne et al., "Biocatalytic Routes to Optically Active Amines," ChemCatChem 1(1):42-51; Zhu and Hua, 2009, Biotechnol. J. 4(10):1420-31).
Wild-type ω-transaminase ω-VfT from Vibrio fluvialis shows high enantioselectivity for the (S)-enantiomer of certain chiral amines and has substrate specificity for chiral aromatic amines (see, e.g., Shin and Kim, 2002, J. Org. Chem. 67:2848-2853). The high enantioselectivity of ω-VfT has been applied to the chiral resolution of amines (see, e.g., Yun et al., 2004, Biotechnol. Bioeng. 87:772-778; Shin and Kim, 1997, Biotechnol. Bioeng. 55:348-358; Höhne et al., 2008, Adv. Synth. Catal. 350:802-807). The ω-VfT transaminase has also been used in the asymmetric synthesis of optically pure amines from prochiral ketone substrates. However, the use of such transaminases in the asymmetric synthesis of chiral amines is limited by: the unfavorable equilibrium of the reverse reaction (see, e.g., Shin and Kim, 1999, Biotechnol. Bioeng. 65:206-211); inhibition by the chiral amine product (see, e.g., Shin et al., 2001, Biotechnol. Bioeng. 73:179-187; Yun and Kim, 2008, Biosci. Biotechnol. Biochem. 72(11):3030-3033); low activity on amino acceptors with large side chains such as aromatic groups (see, e.g., Shin and Kim, 2002, J. Org. Chem. 67:2848-2853); and low enzyme stability (see, e.g., Yun and Kim, supra).
Variant transaminases derived from the Vibrio fluvialis ω-VfT transaminase have been reported to have increased tolerance to aliphatic ketones (see, e.g., Yun et al., 2005, Appl. Environ. Microbiol. 71(8):4220-4224) and broadened amino donor substrate specificity (see, e.g., Cho et al., 2008, Biotechnol. Bioeng. 99(2):275-84). Patents US8,470,564, US9,029,106, US9,512,410, US9,944,909, US10,323,233 and US10,550,370 (each of which is hereby incorporated by reference herein) describe engineered transaminases derived from ω-VfT having improved properties for the synthesis of chiral amine compounds, including increased stability to temperature and/or organic solvents, and increased enzymatic activity toward structurally different amino acceptor molecules. Patents US8,852,900 and US8,932,838, each of which is hereby incorporated by reference herein, describe engineered transaminases derived from ω-VfT optimized for the enantioselective conversion of the substrate 3'-hydroxyacetophenone to the product (S)-3-(1-aminoethyl)-phenol.
Notably, the present disclosure identifies amino acid residue positions and corresponding amino acid residue substitutions in engineered transaminase polypeptides that can increase enzymatic activity, enantioselectivity, stability, substrate tolerance, and refractoriness to product inhibition.
The specific residue positions and substitutions in the engineered transaminase polypeptides of the present disclosure were identified by directed evolution using structure-based rational sequence library design, with screening for improved functional properties using activity assays based on the conversion of the prochiral ketone group of an exemplary amino acceptor substrate compound to its corresponding chiral amine product. Specifically, the conversion of the ketone of compound (2) to the corresponding chiral amine of compound (3) is shown in Scheme 4.
In some embodiments, the invention provides details of ATA enzymes suitable for the production of the intermediate (1S)-1-imidazo[1,2-a]pyridin-6-ylethylamine (3), which is used in further steps to produce the potent small molecule drug savolitinib (1).
Figure BDA0004141823750000341
Savolitinib, compound (1), or 3-[(1S)-1-imidazo[1,2-a]pyridin-6-ylethyl]-5-(1-methylpyrazol-4-yl)triazolo[4,5-b]pyrazine.
The current chemical synthesis method for producing compound (1) involves four steps, as shown in Scheme 3, rather than the seven steps of the previous process (WO 2020/053198). This new process not only eliminates the undesirable chiral resolution of the final compound (1) in the last step (which results in 50% waste of product), but also avoids several isolation and purification steps for intermediates in the original route. More importantly, it uses an engineered ATA enzyme to introduce the chiral center in the first step, producing the enantioenriched amine (3), whose stereocenter is carried through the rest of the synthesis.
Figure BDA0004141823750000342
Scheme 3
In some embodiments, the engineered transaminases of the present disclosure are evolved to further improve activity and substrate tolerance in the asymmetric, enantioselective conversion of the substrate ketone 1-imidazo[1,2-a]pyridin-6-ylethanone (2) to the product (1S)-1-imidazo[1,2-a]pyridin-6-ylethylamine (3), as shown in Scheme 4.
Figure BDA0004141823750000351
Scheme 4
An engineered transaminase polypeptide adapted to efficiently convert a ketone substrate compound to a chiral amine product compound has one or more residue differences from the amino acid sequence of a reference engineered transaminase polypeptide of SEQ ID NO. 4. The residue differences are associated with enhancement of enzyme properties including enzyme activity, enzyme stability, substrate tolerance and resistance to product amine inhibition.
The present disclosure provides engineered polypeptides having transaminase activity (also referred to herein as "engineered transaminase polypeptides") that can be used to selectively transaminate amino acceptor substrate compounds to produce chiral amine products, which in some embodiments can include compound (3). Thus, in one aspect, the present disclosure provides an engineered polypeptide having transaminase activity that is capable of converting substrate compound (2) to product compound (3), as shown in Scheme 4.
The engineered polypeptides of the present disclosure are non-naturally occurring transaminases engineered to have improved enzymatic properties (such as increased activity) compared to the wild-type transaminase polypeptide of Vibrio fluvialis JS17 (GenBank accession No. AEA39183.1, GI:327207066; SEQ ID NO:2) and also compared to the reference engineered transaminase polypeptide of SEQ ID NO:4, which was used as the starting backbone sequence for the directed evolution of the engineered polypeptides of the present disclosure. The reference engineered transaminase polypeptide of SEQ ID NO:4 has 26 amino acid differences relative to the wild-type transaminase of Vibrio fluvialis JS17 (SEQ ID NO:2).
The engineered transaminase polypeptides of the present disclosure are produced by directed evolution of SEQ ID No. 4 for efficient conversion of compound (2) to compound (3) under certain industrially relevant conditions, and have one or more residue differences compared to a reference engineered transaminase polypeptide. These residue differences are associated with improvements in various enzyme properties, particularly increased activity, increased stereoselectivity, increased stability, and tolerance to increased substrate and/or product concentrations (e.g., reduced product inhibition). Thus, in some embodiments, an engineered polypeptide having transaminase activity is capable of converting substrate compound (2) to compound (3) with an activity that is increased by at least about 1.2-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 100-fold or more relative to the activity of a reference polypeptide (e.g., SEQ ID NO:4 and/or 6) under suitable reaction conditions. In some embodiments, an engineered polypeptide having transaminase activity is capable of converting substrate compound (2) to compound (3) at a conversion percentage of at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, or at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% over a reaction time of about 48 hours, about 36 hours, about 24 hours, or even less under suitable reaction conditions. In some embodiments, an engineered polypeptide having transaminase activity is capable of converting compound (2) to compound (3) in a diastereomeric excess of at least 90%, 95%, 97%, 98%, 99% or greater under suitable reaction conditions.
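The three figures of merit named in the preceding paragraph (fold improvement in activity, percent conversion, and stereomeric excess) are simple ratios. The sketch below shows one conventional way to compute them, assuming the inputs are analytical quantities (for example, chromatographic peak areas or concentrations) measured under identical conditions. The function names and example numbers are hypothetical and are not taken from the disclosure; the excess formula applies equally to an enantiomeric or a diastereomeric pair.

```python
def percent_conversion(substrate_remaining, product_formed):
    """Percent of substrate converted, from amounts measured at the end point.

    Assumes substrate and product are quantified on the same molar basis.
    """
    total = substrate_remaining + product_formed
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * product_formed / total


def stereomeric_excess(major, minor):
    """Excess of the major stereoisomer over the minor one, in percent."""
    total = major + minor
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * (major - minor) / total


def fold_improvement(variant_activity, reference_activity):
    """Activity of a variant relative to a reference polypeptide."""
    if reference_activity <= 0:
        raise ValueError("reference activity must be positive")
    return variant_activity / reference_activity


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    print(percent_conversion(substrate_remaining=5.0, product_formed=95.0))
    print(stereomeric_excess(major=99.5, minor=0.5))
    print(fold_improvement(variant_activity=3.0, reference_activity=2.0))
```

Comparisons of this kind are only meaningful when the variant and the reference are assayed under the same reaction conditions and loadings.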
The present disclosure provides a number of exemplary engineered transaminase polypeptides comprising the amino acid sequences of even-numbered sequence identifiers SEQ ID NOS: 6-358. These exemplary engineered transaminase polypeptides comprise an amino acid sequence that contains one or more of the following residue differences relative to a reference sequence (e.g., SEQ ID NOs: 4 and/or 6) that are associated with improved properties of the engineered transaminase polypeptide that converts compound (2) to compound (3).
In some cases, exemplary engineered polypeptides have amino acid sequences that also include one or more residue differences compared to a reference sequence (e.g., SEQ ID NOS: 4 and/or 6).
In some embodiments, the engineered polypeptide comprises an amino acid sequence that is at least about 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to a reference sequence selected from SEQ ID NOs 4 and/or 6, wherein the polypeptide has transaminase activity and one or more improved properties as described herein, e.g., the ability to convert compound (2) to product compound (3) with increased activity, as compared to the reference sequence (e.g., the polypeptide of SEQ ID NOs 4 and/or 6). In some embodiments, the reference sequence is SEQ ID NO. 4. In some embodiments, the reference sequence is SEQ ID NO. 6.
In some embodiments, the engineered transaminase polypeptide comprises an amino acid sequence that has one or more amino acid residue differences compared to SEQ ID NOs: 4 and/or 6. In some embodiments, the disclosure provides engineered polypeptides having transaminase activity comprising an amino acid sequence that has at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to a reference sequence of SEQ ID NOs: 4 and/or 6 and at least one amino acid residue difference selected from those provided herein (see, e.g., Table 5.1 and/or Table 6.1).
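As a rough illustration of the sequence identity thresholds recited above, percent identity between two already-aligned sequences can be computed as the number of identical columns divided by the number of compared columns. The sketch below assumes that the alignment has been produced elsewhere, that gaps are written as "-", and that columns where both sequences have a gap are ignored. These conventions, the function name, and the toy sequences are assumptions for the example; the governing definition of percent identity is the one given in the definitions of this disclosure.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must have the same length; '-' marks a gap. A column counts
    as a match only when both residues are identical and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical 10-residue toy sequences differing at one position.
    print(percent_identity("MNKPQSWEAR", "MNKPQSWEAK"))
    # A single gap column lowers identity just as a mismatch does here.
    print(percent_identity("MNK-PQ", "MNKAPQ"))
```

Other conventions (for example, dividing by the length of the shorter sequence or of a comparison window) give different values, so the denominator should always be stated alongside the result.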
In some embodiments, the present disclosure provides an engineered transaminase polypeptide comprising an amino acid sequence that has one or more amino acid residue differences compared to SEQ ID NO: 4 selected from the following positions: 13, 41/57/130/415/419, 41/113/415, 53/57, 88/89, 97/415, 148, 227, 260, 302, 355/415/419, 362, 417 and 443, wherein the positions are numbered with reference to SEQ ID NO: 4. In some embodiments, the amino acid differences include substitutions 13A, 13E, 13G, 13K, 13S, 41V/57Y/130Y/415F/419D, 41V/113F/415F, 53M/57W, 88K, 88R/89L, 88V, 97A/415S, 148E, 148G, 227A, 227C, 260T, 302N, 355C/415S/419D, 362G, 417A, 417I, 417V, 443E, and 443M, wherein the positions are numbered with reference to SEQ ID NO: 4. In some further embodiments, the amino acid differences include substitutions T13A, T13E, T13G, T13K, T13S, I41V/F57Y/F130Y/R415F/Q419D, I41V/V113F/R415F, N53M/F57W, L88K, L88R/M89L, L88V, S97A/R415S, Q148E, Q148G, G227A, G227C, C260T, E302N, R355C/R415S/Q419D, H362G, L417A, L417I, L417V, K443E and K443M, wherein the positions are numbered with reference to SEQ ID NO: 4.
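The substitution notation used above (for example `I41V/F57Y`, meaning isoleucine at position 41 replaced by valine together with phenylalanine at position 57 replaced by tyrosine, or the shorter `13A` form without the reference residue) can be parsed mechanically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, positions are 1-based, and the short reference string is a toy example rather than SEQ ID NO: 4 or any other sequence from the sequence listing.

```python
import re

# Optional reference residue, 1-based position, replacement residue (e.g., "I41V" or "13A").
_SUBSTITUTION = re.compile(r"^([A-Z])?(\d+)([A-Z])$")


def parse_variant(variant: str):
    """Split a '/'-separated variant into (reference_residue, position, new_residue) tuples."""
    substitutions = []
    for token in variant.split("/"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        ref_residue, position, new_residue = match.groups()
        substitutions.append((ref_residue, int(position), new_residue))
    return substitutions


def apply_variant(reference: str, variant: str) -> str:
    """Return the reference sequence with every substitution in the variant applied."""
    residues = list(reference)
    for ref_residue, position, new_residue in parse_variant(variant):
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the reference sequence")
        # When the reference residue is stated, check it against the reference sequence.
        if ref_residue is not None and residues[position - 1] != ref_residue:
            raise ValueError(
                f"expected {ref_residue} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new_residue
    return "".join(residues)


# Toy, hypothetical reference of five residues.
print(parse_variant("I41V/F57Y"))
print(apply_variant("MTIFK", "T2A/I3V"))
```

Checking the stated reference residue against the reference sequence is a cheap way to catch numbering errors, since every position is defined relative to a specific reference sequence.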
In some embodiments, the present disclosure provides an engineered transaminase polypeptide comprising an amino acid sequence that has one or more amino acid residue differences compared to SEQ ID NO: 6 selected from the following positions: 13, 13/41/57/88/130/415/417, 13/41/57/89/97/417, 13/41/57/97/130/415/417, 13/41/57/97/130/415/417/443, 13/41/57/97/443, 13/41/57/130/417, 13/41/57/417, 13/41/88, 13/41/88/89, 13/41/88/89/97/415/443, 13/41/88/89/417, 13/41/88/97, 13/41/88/130/415/443, 13/41/88/443, 13/41/89/130/148/443, 13/41/89/417, 13/41/89/443, 13/41/97/130/417, 13/41/97/415, 13/41/97/415/417, 13/41/97/417, 13/41/97/417/443, 13/41/130/415/443, 13/41/415, 13/41/415/417, 13/41/415/443, 13/41/417, 13/41/417/443, 13/57/88/89/130/415/443, 13/57/88/97, 13/57/88/97/415/443, 13/57/88/130/415, 13/57/88/130/417/443, 13/57/88/415, 13/57/97/130/415/417/443, 13/57/97/417, 13/88/89/415/417, 13/88/89/415/417/443, 13/88/130/443, 13/88/415, 13/89/97/415/417, 13/89/97/417, 13/89/417, 13/97/148/415, 13/97/415, 13/97/415/417, 13/97/417, 13/130/415, 13/130/415/417, 13/130/417, 13/130/417/443, 13/415/417, 13/415/417/443, 13/415/443, 13/417/443, 13/443, 23/53/162/233/277/315/415/418/432, 23/53/315/417/418, 23/277/315/395/415/417/432, 23/277/395/417/418, 23/395/418, 23/418, 41/57/88, 41/57/88/415/443, 41/57/130/148/415/417, 41/57/130/443, 41/57/415/417, 41/88/89/97/130/415, 41/88/89/415/417, 41/88/97/130/417, 41/88/130/415/417, 41/88/443, 41/97/130/148/415/417/443, 41/97/417, 41/97/417/443, 41/130/415, 41/130/415/417/443, 41/130/415/443, 41/415/443, 41/417/443, 53/162, 53/162/395/417, 53/162/418/432, 53/233, 53/277/395, 53/277/395/417/418, 53/277/415/417, 57/88/97/130/415/443, 57/88/97/130/417, 57/88/97/417, 57/97/130/148/417/443, 57/417, 88, 88/89/130/417, 88/97/415/417/443, 88/130/417/443, 88/148/417/443, 88/415/417, 88/415/417/443, 88/417, 89/97/415/417, 89/97/417, 89/443, 97/130, 97/148/415, 97/415/417, 97/417, 130/415, 130/417, 130/443, 
162/233/415/417, 162/395/415/417, 162/418, 233/315/415/417, 233/315/417, 277/395/415/418/432, 315, 315/415/418/432, 395/418, 415/417/418, 415/417/418/432, 415/417/443, 415/443, 417 and 443, wherein the positions are numbered with reference to SEQ ID NO: 6. In some embodiments, the amino acid differences include substitutions 13A, 13A/41V/57Y/88R/130Y/415S/417V, 13A/41V/57Y/89L/97A/417V, 13A/41V/57Y/97A/130Y/415S/417I/443M, 13A/41V/57Y/130Y/417V, 13A/41V/57Y/417I, 13A/41V/88R/89L, 13A/41V/88R/443M, 13A/41V/89L/130G/148G/443M, 13A/41V/89L/443M, 13A/41V/97A/417I, 13A/41V/130Y/415S/443M, 13A/41V/415S/417I, 13A/41V/415S/417V, 13A/41V/417V/443M, 13A/57Y/88R/89L/130Y/415S/443M, 13A/57Y/88R/97A/415S/443M, 13A/57Y/88R/130Y/415S, 13A/57Y/88R/130Y/417V/443M, 13A/57Y/88R/415S, 13A/88R/89L/415S/417V, 13A/88R/130Y/443M, 13A/88R/415S, 13A/89L/417I, 13A/97A/148G/415S, 13A/97A/417V, 13A/130Y/417V, 13A/415S/417I/443M, 13A/415S/417V, 417A/415S/V, 13A/415S/443M, 13A/417I/443M, 13E/41V/57Y/97A/130Y/415S/417V, 13E/41V/57Y/97A/443M, 13E/41V/88R/89L/97A/415S/443M, 13E/41V/88R/89L/417V, 13E/41V/88R/97A, 13E/41V/88R/130Y/415S/443M, 13E/41V/89L/417V, 13E/41V/97A/130Y/417I, 13E/41V/97A/415S/417V, 13E/41V/97A/417I/443M, 13E/41V/415S/443M, 13E/41V/417V/41M, 417E/41V/443M, 13E/57Y/88R/97A/415S/443M, 13E/57Y/97A/130Y/415S/417V/443M, 13E/57Y/97A/417V, 13E/88R/89L/415S/417I, 13E/88R/89L/415S/417V/443M, 13E/89L/97A/415S/417V, 13E/89L/97A/417V, 13E/97A/415S/417V, 13E/130Y/415S/417I, 13E/130Y/417I/443M, 13E/130Y/417V, 13E/415S/443M, 13E/417I, 13E/417V/443M, 13E/443M, 23K/53C/162A/233I/277I/315G/415A/418D/432V, 23K/53C/315G/417V/418D, 23K/277I/315G/395D/415A/417G/432V, 23K/277I/395D/417V/418D, 23K/395D/418D, 23K/418D, 41V/57Y/88R/415S/417M, 41V/57Y/130Y/148G/415S/417I, 41V/57Y/415S/417I, 41V/88R/89L/97A/130Y/415S, 41V/88R/89L/415S/417I, 41V/88R/97A/130Y/417I, 41V/88Y/88S/417I, 41V/88R/130Y/415S/417I, 41V/88R/443M, 
41V/97A/130Y/148G/415S/417V/443M, 41V/97A/417I/443M, 41V/130Y/415S/417I/443M, 41V/130Y/415S/443M, 41V/415S/443M 41V/417I, 41V/417V/443M, 53C/162A/395D/417V, 53C/162A/418D/432V, 53C/233I, 53C/277I/395D/417V/418D, 53C/277I/415A/417G, 57Y/88R/97A/130Y/415S/443M, 57Y/88R/97A/130Y/417V, 57Y/88R/97A/417I, 57Y/97A/130Y/148G/417I/443M, 57Y/417V, 88R/89L/130Y/417I, 88R/97A/415S/417I/443M, 88R/130Y/417I/443M, 88R/148G/417V/443M, 88R/415S/417I/417M, 88R/417S/417M, 88L/97A/415S/417I, 89L/97A/417I, 97L/443M, 97A/130Y, 97A/148G/415S, 97A/415S/417S, 97A/417I 97A/415S/417V, 97A/417I, 130Y/415S, 130Y/417I, 130Y/443M, 162A/233I/415A/417V, 162A/395D/415A/417V, 162A/418D, 233I/315G/415A/417V, 233I/315G/417V, 277I/395D/418D/432V, 315G/415A/418D/432V, 395D/418D, 415A/417G/418D, 415A/417V/432V, 415S/417I/443M, 415S/417V/443M, 415S/443M, 417V and 443M, wherein said position is referred to SEQ ID NO: 6. In some further embodiments of the present invention, the amino acid differences include substitutions T13A, T A/I41V/F57Y/L88R/F130Y/R415S/L417V, T A/I41V/F57Y/M97A/L417V, T A/I41V/F417 35Y/S97A/F130Y/R415S/L417I/K443M, T A/I41V/F57Y/F130Y/L417 42 82A/I41V/F57Y/L417I, T A/I41V/L88R/M89/L, T A/I41V/L88R/K443L, T A/I41V/M89L/F130Y/Q148G/K443L, T A/I41V/M89L/K443L, T A/I41V/S97A/L417L, T A/I41V/F130Y/R415S/K443L, T A/I41V/R415S/L417L, T A/I41V/R415S' 41V/L88R/M89L, T A/I41V/L88R/K443L, T A/I41V/M89L/F130Y/Q148G/K443L, T A/I41V/M89L/K443L, T A/I41V/S97A/L417L, T A/I41V/F130Y/R415S/K443L, T A/I41V/R415S/L417L, T A/I41V/R415S-, T13A/R415S/L417V, T A/R415S/K443V, T A/L417I/K443V, T E/I41V/F57Y/S97A/F130Y/R415S/L417V, T E/I41V/F57Y/S97A/K443V, T E/I41V/L88R/M97A/R415S/K V, T E/I41V/L88R/M89L 417V, T E I41V/L88R/S97V, T E/I41V/L88R/F130Y/R415S/K443V, T E/I41V/M89L/L417V, T E/I41V/S97A/F130Y/L417V, T E/I41V/S97A/R415S/L417V, T E/I41V/S97A/L417I/K443V, T E/I41V/R415S/K V, T E/S V, T E/I41V/L417V, T E/F57Y/L88R/S97A/R415S/K443V, T E/F57Y/S97A/F130Y/R415S/L417V/K443V, T E/F57Y/S97A/L417V, T 
E/L88R/M89L/R415S/L417V, T E/L88R/M415L/R415S/L417V, T E/L88R/M89L/R415S/L417V, T E/L89L/R415S/L417V/K V, T E-M89L/S97A/R415S/L417V, T E/M89L/S97A/L417V, T E/S97A/R415S/L417V, T E/F130Y/R415S/L417V, T E/F130Y/L417I/K443V, T E/F130Y/L417V, T E/R415S/K443V, T E/L417V, T13E/L417V/K443 13E/K443 23K/N53C/G162A/T233I/T277I/E315G/R415A/G418D/A432K/N53C/E315G/L417V/G418K 23/T277I/E315G/G395D/R415A/L417G/A432K/T277I/G395D/L417V/G418K/G395D/G418K/G418 41V/F57Y/L88R/R415S/K443 41V/F57Y/F130Y/Q148G/R415S/L417 41V/F57Y/F130Y/K443 41V/F57Y/R415S/L417 41V/L88R/M89L/S97A/F130Y/R415 41V/L88R/M415S/L417 41V/L88R/S97A/F130Y the ratio of/L417V/L88R/F130Y/R415S/L417V/L88R/K443 41V/S97A/F130Y/Q148G/R415S/L417V/K443 41V/S97A/L417 41V/S97A/L417I/K443 41V/F130Y/R415 41V/F130Y/R415S/L417I/K443 41V/F130Y/K443 41V/R415S% K443 41V/L417V/K443 53C/G162A/G395D/L417C/G162A/G418D/A432C/T233C/T277I/G395C/G395D/L417V/G418C/T277I/R415A/L417 57Y/L88R/S97A/F130Y/R415S/K443M, F57Y/L88R/S97A/L130Y/L417 57Y/L88R/S97A/L417 57Y/S97A/F130Y/Q148G/L417I/K417 57Y/L417 88R/M88L/M/F130Y/L417R/S97A/R415S/L417I/K443 88R/F130Y/L417I/K443 88R/Q148G/L417V/K443 88R/R415S/L417V/K443 88R/L417/S97A/R415S/L417A/S97A/L417/K443 97A/F130A/Q148G/R415A 415/R415A/415A 415/L415/97A/F130A/F148G/R415A/97A/R415/L415S/L417 97A/R415S/L417 97A/L417 130Y/R415 130Y/L417 130Y/K443 162A/T233I/R415A/L417 162A/G395D/R415A/L417 162A/G418I/E315G/R415A/L417 233I/E315G/L417 277I/G395D/R415A/G418D/A432 315G 315A/G418D/A432 395D/G418A/L417G 418A/L417V/G418D/A432 415S/L417V/K415S/K443 417V and K443M, wherein the position is referenced to SEQ ID NO: 6.
In some embodiments, the engineered transaminase polypeptide shows increased activity in converting a substrate compound (e.g., compound (2)) to an amino product compound (e.g., compound (3)) in stereoisomeric excess over a defined period of time with the same amount of enzyme, as compared to the wild-type transaminase or the reference engineered transaminase of SEQ ID NO: 4. In some embodiments, under suitable reaction conditions, the engineered transaminase polypeptide has at least about 1.2-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 20-fold, 30-fold, 40-fold, or 50-fold or more the activity of the reference engineered polypeptide represented by SEQ ID NO: 4.
In some embodiments, the engineered transaminase polypeptide has increased stability to the temperature and/or solvents used in the conversion reaction compared to the wild-type or reference engineered enzyme. In some embodiments, the engineered transaminase polypeptide has at least 1.2-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, or more stability as compared to the reference polypeptide of SEQ ID NO. 4 under suitable reaction conditions.
In some embodiments, the engineered transaminase polypeptide has increased tolerance to the ketone substrate compound (2) as compared to the wild-type or reference engineered enzyme. In some embodiments, as described further below, the engineered transaminase polypeptide has at least 1.2-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, or more increased tolerance to substrate compound (2) as compared to the polypeptide represented by SEQ ID No. 4 under suitable reaction conditions.
In some embodiments, the engineered transaminase polypeptide has increased refractoriness or tolerance to inhibition by the product chiral amine of compound (3) as compared to the wild-type or reference engineered enzyme. In some embodiments, as described further below, the engineered transaminase polypeptide has at least 1.2-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold or more increased tolerance to inhibition by product compound (3) as compared to the polypeptide represented by SEQ ID NO. 4 under suitable reaction conditions.
In some embodiments, the engineered transaminase polypeptide is capable of converting compound (2) to compound (3) in a stereoisomeric excess of greater than 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5% or more (i.e., in excess over the stereoisomeric product compound having the opposite configuration at the chiral amine center) under suitable reaction conditions.
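The excess values above can be related to measured amounts of the two stereoisomers with the conventional enantiomeric-excess formula, ee = 100 × (major − minor) / (major + minor). The sketch below assumes that the two inputs are amounts or peak areas of the desired and undesired stereoisomers (for example, from a chiral chromatography assay); the function name and the example numbers are hypothetical.

```python
def stereoisomeric_excess(desired_amount: float, undesired_amount: float) -> float:
    """Percent excess of the desired stereoisomer over the undesired one.

    Assumption: both inputs are non-negative and in the same units
    (e.g., integrated peak areas or molar amounts).
    """
    if desired_amount < 0 or undesired_amount < 0:
        raise ValueError("amounts must be non-negative")
    total = desired_amount + undesired_amount
    if total == 0:
        raise ValueError("no product detected")
    return 100.0 * (desired_amount - undesired_amount) / total


# Hypothetical example: a 99.5 : 0.5 ratio of desired to undesired stereoisomer.
print(stereoisomeric_excess(99.5, 0.5))
```

A 99.5 : 0.5 ratio therefore corresponds to a 99% excess, which is why an excess threshold is stricter than the same number read as a simple purity percentage.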
In some embodiments, the engineered transaminase polypeptide is capable of converting substrate compound (2) to product compound (3) with increased tolerance to the presence of a substrate relative to the reference polypeptide of SEQ ID NO. 4 under suitable reaction conditions. Thus, in some embodiments, the engineered transaminase polypeptide is capable of converting a substrate compound (2) to a product compound (3) at a substrate loading concentration of at least about 1g/L, about 5g/L, about 10g/L, about 20g/L, about 30g/L, about 40g/L, about 50g/L, about 70g/L, about 100g/L, about 125g/L, about 150g/L, about 175g/L, or about 200g/L or more over a reaction time of about 72 hours or less, about 48 hours or less, about 36 hours or less, or about 24 hours or less at a conversion percentage of at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% under suitable reaction conditions.
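As a rough illustration of how the percent conversion and reaction time criteria above might be evaluated, the following sketch computes conversion from the remaining substrate and the product formed and compares it with a chosen target. It assumes a simple mass balance with no side products, and the function names, default thresholds, and example values are hypothetical rather than taken from the Examples.

```python
def percent_conversion(substrate_remaining_mol: float, product_formed_mol: float) -> float:
    """Percent of substrate converted, assuming substrate goes only to the product."""
    total = substrate_remaining_mol + product_formed_mol
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * product_formed_mol / total


def meets_process_target(conversion_pct: float, reaction_hours: float,
                         min_conversion_pct: float = 90.0,
                         max_hours: float = 24.0) -> bool:
    """True if the conversion target is reached within the allowed reaction time."""
    return conversion_pct >= min_conversion_pct and reaction_hours <= max_hours


# Hypothetical run: 5 mmol substrate left and 95 mmol product after 20 hours.
conversion = percent_conversion(0.005, 0.095)
print(conversion)
print(meets_process_target(conversion, reaction_hours=20.0))
```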
In some embodiments, the present disclosure also provides engineered transaminase polypeptides comprising fragments of any of the engineered polypeptides described herein that retain the functional activity and/or improved properties of the engineered transaminase. Thus, in some embodiments, the present disclosure provides polypeptide fragments having transaminase activity, such as converting compound (2) to compound (3) under suitable reaction conditions, wherein the fragments comprise at least about 80%, 90%, 95%, 96%, 97%, 98% or 99% of the full-length amino acid sequence of an engineered transaminase polypeptide of the present disclosure, such as an exemplary engineered transaminase polypeptide selected from the following: SEQ ID NO:4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356 and 358.
In some embodiments, an engineered transaminase polypeptide can have an amino acid sequence that includes a deletion relative to any of the engineered transaminase polypeptides described herein, such as the following exemplary engineered polypeptides: SEQ ID NO:4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356 and 358. Thus, for each and every embodiment of the engineered transaminase polypeptides of the disclosure, the amino acid sequence can comprise a deletion of one or more amino acids, 2 or more amino acids, 3 or more amino acids, 4 or more amino acids, 5 or more amino acids, 6 or more amino acids, 8 or more amino acids, 10 or more amino acids, 15 or more amino acids, or 20 or more amino acids, up to 10% of the total amino acids of the transaminase polypeptide, up to 20% of the total amino acids of the transaminase polypeptide, or up to 30% of the total amino acids of the transaminase polypeptide, wherein the relevant functional activities and/or improved properties of the engineered transaminase described herein are maintained. 
In some embodiments, the deletions may comprise 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-15, 1-20, 1-21, 1-22, 1-23, 1-24, 1-25, 1-30, 1-35, 1-40, 1-45, or 1-50 amino acid residues. In some embodiments, the number of deletions may be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, or 50 amino acid residues. In some embodiments, the deletion may comprise a deletion of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18, 20, 21, 22, 23, 24, or 25 amino acid residues.
In some embodiments, an engineered transaminase polypeptide herein can have an amino acid sequence that includes an insertion as compared to any of the engineered transaminase polypeptides described herein, such as the following exemplary engineered polypeptides: SEQ ID NO:4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356 and 358. Thus, for each and every embodiment of the transaminase polypeptides of the present disclosure, the insertion can comprise one or more amino acids, 2 or more amino acids, 3 or more amino acids, 4 or more amino acids, 5 or more amino acids, 6 or more amino acids, 8 or more amino acids, 10 or more amino acids, 15 or more amino acids, 20 or more amino acids, 30 or more amino acids, 40 or more amino acids, or 50 or more amino acids, wherein the relevant functional activity and/or improved properties of the engineered transaminases described herein are maintained. The insertion can be at the amino terminus, at the carboxy terminus, or in an internal portion of the transaminase polypeptide.
In some embodiments, the engineered transaminase polypeptides herein can have an amino acid sequence that includes a sequence selected from the following: SEQ ID NO:4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356 and 358, and optionally one or several (e.g., up to 3, 4, 5 or up to 10) amino acid residue deletions, insertions, and/or substitutions. In some embodiments, the amino acid sequence optionally has 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-15, 1-20, 1-21, 1-22, 1-23, 1-24, 1-25, 1-30, 1-35, 1-40, 1-45, or 1-50 amino acid residue deletions, insertions, and/or substitutions. In some embodiments, the amino acid sequence optionally has 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, or 50 amino acid residue deletions, insertions, and/or substitutions. In some embodiments, the amino acid sequence optionally has 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18, 20, 21, 22, 23, 24, or 25 amino acid residue deletions, insertions, and/or substitutions. In some embodiments, the substitutions can be conservative or non-conservative substitutions.
In some embodiments, the disclosure provides an engineered polypeptide having transaminase activity, the polypeptide comprising an amino acid sequence that is at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to a sequence selected from the following: SEQ ID NO:4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356 and 358, provided that the amino acid sequence is not identical to (i.e., excludes) the amino acid sequence of any of the exemplary engineered transaminase polypeptides disclosed in patents US8,470,564, US9,029,106, US9,512,410, US9,944,909, US10,323,233, US10,550,370, US8,852,900 and US8,932,838; Yun et al., 2005, Appl Environ Microbiol., 71(8):4220-4224; and Cho et al., 2008, Biotechnol Bioeng. 99(2):275-84 (all of which are incorporated herein by reference).
In the above embodiments, suitable reaction conditions for the engineered polypeptides may be those described in Tables 5.1 and 6.1, the Examples, and elsewhere herein. Suitable reaction conditions under which the engineered polypeptides exhibit the above-described improved properties in carrying out the conversion can be determined with respect to the concentrations or amounts of polypeptide, substrate, cofactor, buffer, and co-solvent, the pH, and/or conditions including temperature and reaction time, as described further below and in the Examples.
In some embodiments, the polypeptides of the present disclosure may be in the form of fusion polypeptides, in which the engineered polypeptide is fused to other polypeptides, such as, by way of example and not limitation, antibody tags (e.g., myc epitope), purification sequences (e.g., His tags for binding to metals), and cell localization signals (e.g., secretion signals). Thus, the engineered polypeptides described herein may be used with or without fusion to other polypeptides.
It is to be understood that the polypeptides described herein are not limited to genetically encoded amino acids. In addition to genetically encoded amino acids, the polypeptides described herein may comprise, in whole or in part, naturally occurring and/or synthetic non-encoded amino acids. Some common non-encoded amino acids that the polypeptides described herein may comprise include, but are not limited to: the D-stereoisomers of the genetically encoded amino acids; 2,3-diaminopropionic acid (Dpr); alpha-aminoisobutyric acid (Aib); epsilon-aminocaproic acid (Aha); delta-aminopentanoic acid (Ava); N-methylglycine or sarcosine (MeGly or Sar); ornithine (Orn); citrulline (Cit); t-butylalanine (Bua); t-butylglycine (Bug); N-methylisoleucine (MeIle); phenylglycine (Phg); cyclohexylalanine (Cha); norleucine (Nle); naphthylalanine (Nal); 2-chlorophenylalanine (Ocf); 3-chlorophenylalanine (Mcf); 4-chlorophenylalanine (Pcf); 2-fluorophenylalanine (Off); 3-fluorophenylalanine (Mff); 4-fluorophenylalanine (Pff); 2-bromophenylalanine (Obf); 3-bromophenylalanine (Mbf); 4-bromophenylalanine (Pbf); 2-methylphenylalanine (Omf); 3-methylphenylalanine (Mmf); 4-methylphenylalanine (Pmf); 2-nitrophenylalanine (Onf); 3-nitrophenylalanine (Mnf); 4-nitrophenylalanine (Pnf); 2-cyanophenylalanine (Ocf); 3-cyanophenylalanine (Mcf); 4-cyanophenylalanine (Pcf); 2-trifluoromethylphenylalanine (Otf); 3-trifluoromethylphenylalanine (Mtf); 4-trifluoromethylphenylalanine (Ptf); 4-aminophenylalanine (Paf); 4-iodophenylalanine (Pif); 4-aminomethylphenylalanine (Pamf); 2,4-dichlorophenylalanine (Opef); 3,4-dichlorophenylalanine (Mpcf); 2,4-difluorophenylalanine (Opff); 3,4-difluorophenylalanine (Mpff); pyridin-2-ylalanine (2pAla); pyridin-3-ylalanine (3pAla); pyridin-4-ylalanine (4pAla); naphthalen-1-ylalanine (1nAla); naphthalen-2-ylalanine (2nAla); thiazolylalanine (taAla); benzothiophenylalanine (btala); thienylalanine (ttala); furylalanine (fAla); homophenylalanine (hPhe); homotyrosine (hTyr); homotryptophan (hTrp); pentafluorophenylalanine (5ff); styrylalanine (sla); anthracenylalanine (aAla); 3,3-diphenylalanine (Dfa); 3-amino-5-phenylpentanoic acid (Afp); penicillamine (Pen); 1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid (Tic); beta-2-thienylalanine (Thi); methionine sulfoxide (Mso); N(w)-nitroarginine (nArg); homolysine (hLys); phosphonomethylphenylalanine (pmPhe); phosphoserine (pSer); phosphothreonine (pThr); homoaspartic acid (hAsp); homoglutamic acid (hGlu); 1-aminocyclopent-(2 or 3)-ene-4-carboxylic acid; pipecolic acid (PA); azetidine-3-carboxylic acid (ACA); 1-aminocyclopentane-3-carboxylic acid; allylglycine (aGly); propargylglycine (pgGly); homoalanine (hAla); norvaline (nVal); homoleucine (hLeu); homovaline (hVal); homoisoleucine (hIle); homoarginine (hArg); N-acetyllysine (AcLys); 2,4-diaminobutyric acid (Dbu); 2,3-diaminobutyric acid (Dab); N-methylvaline (MeVal); homocysteine (hCys); homoserine (hSer); hydroxyproline (Hyp) and homoproline (hPro). Additional non-encoded amino acids that the polypeptides described herein may comprise will be apparent to those skilled in the art (see, e.g., the various amino acids provided in Fasman, 1989, CRC Practical Handbook of Biochemistry and Molecular Biology, CRC Press, Boca Raton, FL, pp. 3-70 and the references cited therein, all of which are incorporated by reference). These amino acids may be in the L- or D-configuration.
Those skilled in the art will recognize that amino acids or residues bearing side chain protecting groups may also be included in the polypeptides described herein. Non-limiting examples of such protected amino acids (which in this case belong to the aromatic class) include, with the protecting groups listed in parentheses: Arg(tos), Cys(methylbenzyl), Cys(nitropyridyloxythio), Glu(delta-benzyl ester), Gln(xanthenyl), Asn(N-delta-xanthenyl), His(bom), His(benzyl), His(tos), Lys(fmoc), Lys(tos), Ser(O-benzyl), Thr(O-benzyl) and Tyr(O-benzyl).
Conformationally constrained non-encoded amino acids that the polypeptides described herein may comprise include, but are not limited to, N-methyl amino acids (L-configuration); 1-aminocyclopent-(2 or 3)-ene-4-carboxylic acid; pipecolic acid; azetidine-3-carboxylic acid; homoproline (hPro) and 1-aminocyclopentane-3-carboxylic acid.
In some embodiments, the engineered transaminase polypeptides can be provided on a solid support, such as a membrane, a resin, a solid carrier, or other solid phase material. The solid support may be made of organic polymers such as polystyrene, polyethylene, polypropylene, polyvinylfluoride, polyoxyethylene and polyacrylamide, and copolymers and grafts thereof. The solid support may also be inorganic, such as glass, silica, controlled pore glass (CPG), reversed phase silica, or a metal such as gold or platinum. The solid support may be configured in the form of beads, spheres, particles, granules, gels, membranes or surfaces. The surface may be planar, substantially planar or non-planar. The solid support may be porous or nonporous, and may have swelling or non-swelling characteristics. The solid support may be configured in the form of a well, recess or other receptacle, container, feature or location.
In some embodiments, an engineered polypeptide of the present disclosure having transaminase activity can be immobilized on a solid support such that the engineered polypeptide retains its improved activity, stereoselectivity, and/or other improved properties relative to the reference polypeptide of SEQ ID NO: 4. In such embodiments, the immobilized polypeptide can facilitate the biocatalytic conversion of a substrate compound, such as compound (2) or another suitable substrate, to product compound (3) or the corresponding product (e.g., as shown in Scheme 4 described herein), is readily retained after the reaction is complete (e.g., by retaining the beads on which the polypeptide is immobilized), and can then be reused or recycled in subsequent reactions. Such immobilized enzyme processes allow further improvements in efficiency and reductions in cost. Accordingly, it is also contemplated that any method using the engineered transaminase polypeptides of the present disclosure can be performed using the same engineered transaminase polypeptide bound or immobilized on a solid support.
Methods of enzyme immobilization are well known in the art. The engineered transaminase polypeptide can be bound non-covalently or covalently. Various methods for binding and immobilizing enzymes to solid supports (e.g., resins, membranes, beads, glass, etc.) are well known in the art and are described, for example, in: Yi et al., "Covalent immobilization of omega-transaminase from Vibrio fluvialis JS17 on chitosan beads," Process Biochemistry 42(5): 895-898 (May 2007); Martin et al., "Characterization of free and immobilized (S)-aminotransferase for acetophenone production," Applied Microbiology and Biotechnology 76(4): 843-851 (Sept. 2007); Koszelewski et al., "Immobilization of omega-transaminases by encapsulation in a sol-gel/celite matrix," Journal of Molecular Catalysis B: Enzymatic, 63: 39-44 (Apr. 2010); Truppo et al., "Development of an Improved Immobilized CAL-B for the Enzymatic Resolution of a Key Intermediate to Odanacatib," Organic Process Research & Development, published online: dx.doi.org/10.1021/op200157c; Hermanson, G.T., Bioconjugate Techniques, Second Edition, Academic Press (2008); Mateo et al., "Epoxy sepabeads: a novel epoxy support for stabilization of industrial enzymes via very intense multipoint covalent attachment," Biotechnology Progress (3): 629-34 (2002); and Bioconjugation Protocols: Strategies and Methods, in Methods in Molecular Biology, C.M. Niemeyer ed., Humana Press (2004); the disclosure of each of which is incorporated herein by reference. Solid supports useful for immobilizing the engineered transaminases of the present disclosure include, but are not limited to, beads or resins comprising polymethacrylates having epoxy functionality, polymethacrylates having amino epoxy functionality, styrene/DVB copolymers having octadecyl functionality, or polymethacrylates. 
Exemplary solid supports useful for immobilization of the engineered transaminases of the present disclosure include, but are not limited to, chitosan beads, Eupergit C, and SEPABEADs (Mitsubishi), including the following different types of SEPABEAD: EC-EP, EC-HFA/S, EXA252, EXE119 and EXE120.
In some embodiments, the engineered polypeptide may be in various forms, such as, for example, an isolated preparation, a substantially purified enzyme, whole cells transformed with a gene encoding the enzyme, and/or cell extracts and/or lysates of such cells. The enzyme may be lyophilized, spray-dried, precipitated, or in the form of a crude paste, as discussed further below.
In some embodiments, the polypeptides described herein may be provided in the form of a kit. The enzymes in the kit may be present individually or as more than one enzyme. The kit may further comprise reagents for performing an enzymatic reaction, substrates for evaluating the enzymatic activity, and reagents for detecting the product. The kit may also include a reagent dispenser and instructions for use of the kit.
In some embodiments, the polypeptides may be provided on a solid support in the form of an array in which the polypeptides are arranged at positionally distinct locations. The array may be used to test a variety of substrate compounds for conversion by the polypeptides. More than one support may be configured on an array at various locations that are addressable for automated delivery of reagents or by detection methods and/or instruments. Various methods for conjugation to substrates such as membranes, beads, glass, etc. are described in, among others, Hermanson, G.T., Bioconjugate Techniques, Second Edition, Academic Press (2008); and Bioconjugation Protocols: Strategies and Methods, in Methods in Molecular Biology, C.M. Niemeyer ed., Humana Press (2004); the disclosure of each of which is incorporated herein by reference.
In some embodiments, the kits of the present disclosure comprise an array comprising more than one different engineered transaminase polypeptides disclosed herein at different addressable locations, wherein the different polypeptides are different variants of a reference sequence, each variant having at least one different improved enzymatic property. Such arrays comprising more than one engineered polypeptide and methods of their use are described in US9,228,223.
5.4 Polynucleotides Encoding Engineered Polypeptides, Expression Vectors and Host Cells
In another aspect, the disclosure provides polynucleotides encoding the engineered transaminase polypeptides described herein. The polynucleotide may be operably linked to one or more heterologous regulatory sequences that control gene expression to produce a recombinant polynucleotide capable of expressing the polypeptide. Expression constructs comprising heterologous polynucleotides encoding the engineered transaminase may be introduced into suitable host cells to express the corresponding transaminase polypeptides.
As will be apparent to those skilled in the art, the availability of a protein sequence and knowledge of the codons corresponding to the various amino acids provide a description of all polynucleotides capable of encoding the subject polypeptide. The degeneracy of the genetic code, in which the same amino acids are encoded by alternative or synonymous codons, allows a very large number of nucleic acids to be produced, all of which encode an improved transaminase. Thus, knowing a particular amino acid sequence, one skilled in the art can prepare any number of different nucleic acids by simply modifying one or more codons of the sequence in a manner that does not alter the amino acid sequence of the protein. In this regard, the present disclosure specifically contemplates each and every possible variation of a polynucleotide encoding a polypeptide described herein that can be made by selecting combinations based on possible codon choices, and all such variations should be considered specifically disclosed for any polypeptide described herein, including the amino acid sequences presented in Table 5.1 and Table 6.1 and disclosed in the sequence listing incorporated by reference herein as SEQ ID NO: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356 and 358.
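The scale of this degeneracy can be illustrated with a short calculation. The following is a minimal sketch, assuming the standard genetic code and a hypothetical four-residue peptide that is not taken from the sequence listing; the function name is likewise an assumption made for illustration. It multiplies the number of synonymous codons available at each position to count the distinct coding sequences for a given amino acid sequence.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the 61 sense codons are covered in total).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(protein: str) -> int:
    """Return how many distinct coding sequences encode the given protein."""
    return prod(CODON_DEGENERACY[residue] for residue in protein.upper())


# Hypothetical peptide used only for illustration.
example_peptide = "MKLV"
print(count_encoding_sequences(example_peptide))  # 1 * 2 * 6 * 4 = 48
```

Because the count is a product over all positions, it grows roughly exponentially with length, which is why a full-length enzyme has an enormous number of possible encoding polynucleotides.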
In various embodiments, the codons are preferably selected to fit the host cell in which the protein is produced. For example, preferred codons used in bacteria are used for expression in bacteria; preferred codons used in yeast are used for expression in yeast; and preferred codons used in mammals are used for expression in mammalian cells. In some embodiments, not all codons need to be replaced to optimize the codon usage of the transaminase, because the native sequence will already include preferred codons and because the use of preferred codons may not be required for all amino acid residues. Thus, a codon-optimized polynucleotide encoding a transaminase may comprise preferred codons at about 40%, 50%, 60%, 70%, 80%, or more than 90% of the codon positions of the full-length coding region.
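The fraction of codon positions occupied by preferred codons can be checked directly on a coding sequence. The sketch below is illustrative only: the preferred-codon set and the fifteen-base sequence are hypothetical placeholders, not a real host codon-usage table or a sequence from this disclosure, and the function name is an assumption.

```python
# Hypothetical set of "preferred" codons for an unspecified host (illustration only).
PREFERRED_CODONS = {"ATG", "AAA", "CTG", "GTG", "GAA"}


def preferred_codon_fraction(coding_sequence: str, preferred: set[str]) -> float:
    """Return the fraction of codon positions that carry a preferred codon."""
    sequence = coding_sequence.upper()
    if not sequence or len(sequence) % 3 != 0:
        raise ValueError("coding sequence must be a non-empty multiple of 3 bases")
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    hits = sum(codon in preferred for codon in codons)
    return hits / len(codons)


# Hypothetical coding fragment: ATG AAA CTG GTT GAA (GTT is not in the preferred set).
example_sequence = "ATGAAACTGGTTGAA"
fraction = preferred_codon_fraction(example_sequence, PREFERRED_CODONS)
print(f"{fraction:.0%}")  # 4 of 5 codon positions, i.e. 80%
```

In practice the preferred set would be derived from codon-usage data for the chosen host, and the computed fraction compared against the thresholds recited above.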
In some embodiments, as described above, the polynucleotide encodes an engineered polypeptide having transaminase activity with the properties disclosed herein, such as the ability to convert substrate compound (2) to product compound (3), wherein the polypeptide comprises an amino acid sequence having at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identity to a reference sequence selected from SEQ ID NOs: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356 and 358, and one or more residue differences compared to the reference polypeptide. In some embodiments, the reference sequence is selected from SEQ ID NOs: 4 and/or 6. In some embodiments, the reference sequence is SEQ ID NO: 4. In some embodiments, the reference sequence is SEQ ID NO: 6.
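Percent identity and residue differences relative to a reference can be computed as follows. This is a simplified sketch that assumes the two sequences are already aligned without gaps and have equal length; the ten-residue sequences are hypothetical and are not SEQ ID NO: 4 or any variant from the sequence listing, and the function name is an assumption.

```python
def identity_and_differences(reference: str, variant: str) -> tuple[float, list[str]]:
    """Return percent identity and residue differences such as 'S6T'.

    Assumes an ungapped, position-by-position alignment of equal-length sequences.
    """
    if not reference or len(reference) != len(variant):
        raise ValueError("sequences must be non-empty and of equal length")
    differences = [
        f"{ref_residue}{position}{var_residue}"
        for position, (ref_residue, var_residue)
        in enumerate(zip(reference, variant), start=1)
        if ref_residue != var_residue
    ]
    matches = len(reference) - len(differences)
    return 100.0 * matches / len(reference), differences


# Hypothetical reference and variant sequences (illustration only).
reference_sequence = "MNKPQSWEAR"
variant_sequence = "MNKPQTWEAR"
percent_identity, residue_differences = identity_and_differences(
    reference_sequence, variant_sequence
)
print(percent_identity, residue_differences)  # 90.0 ['S6T']
```

Sequences of different lengths, or sequences containing insertions and deletions, would first require a proper alignment step, which this sketch deliberately omits.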
In some embodiments, the polynucleotide encoding the engineered transaminase comprises a polynucleotide sequence selected from the following: SEQ ID NO: 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355 and/or 357.
In some embodiments, the polynucleotide is capable of hybridizing under highly stringent conditions to a reference polynucleotide sequence selected from the group consisting of: SEQ ID NO: 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355 and/or 357, and encodes a polypeptide having transaminase activity with one or more improved properties described herein.
In some embodiments, the polynucleotide encodes a polypeptide described herein, but has at least about 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater sequence identity at the nucleotide level to a reference polynucleotide encoding an engineered transaminase. In some embodiments, the reference polynucleotide sequence is selected from the group consisting of SEQ ID NOs: 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355 and/or 357.
The isolated polynucleotide encoding any of the engineered transaminase polypeptides herein can be manipulated in a variety of ways to provide for expression of the polypeptides. In some embodiments, polynucleotides encoding polypeptides may be provided as expression vectors in which one or more control sequences are present to regulate expression of the polynucleotide and/or polypeptide. Depending on the expression vector, manipulation of the isolated polynucleotide prior to insertion into the vector may be desirable or necessary. Techniques for altering polynucleotides and nucleic acid sequences using recombinant DNA methods are well known in the art. Guidance is provided in: Sambrook et al., 2001, Molecular Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press; and Current Protocols in Molecular Biology, Ausubel, F. ed., Greene Pub. Associates, 1998, updates to 2006.
In some embodiments, the control sequences include, among others, a promoter, a leader sequence, a polyadenylation sequence, a propeptide sequence, a signal peptide sequence, and a transcription terminator. Suitable promoters may be selected based on the host cell used. For bacterial host cells, suitable promoters for directing transcription of the nucleic acid constructs of the present disclosure include those obtained from the E. coli lac operon, the Streptomyces coelicolor agarase gene (dagA), the Bacillus subtilis levansucrase gene (sacB), the Bacillus licheniformis alpha-amylase gene (amyL), the Bacillus stearothermophilus maltogenic amylase gene (amyM), the Bacillus amyloliquefaciens alpha-amylase gene (amyQ), the Bacillus licheniformis penicillinase gene (penP), the Bacillus subtilis xylA and xylB genes, and the prokaryotic beta-lactamase gene (Villa-Kamaroff et al., 1978, Proc. Natl Acad. Sci. USA 75: 3727-3731), as well as the tac promoter (DeBoer et al., 1983, Proc. Natl Acad. Sci. USA 80: 21-25). Exemplary promoters for filamentous fungal host cells include promoters obtained from the genes for Aspergillus oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase, Aspergillus niger neutral alpha-amylase, Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Rhizomucor miehei lipase, Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase, Aspergillus nidulans acetamidase, and Fusarium oxysporum trypsin-like protease (see, e.g., WO 96/00787), as well as the NA2-tpi promoter (a hybrid of the promoters from the genes for Aspergillus niger neutral alpha-amylase and Aspergillus oryzae triose phosphate isomerase), and mutant, truncated, and hybrid promoters thereof.
Exemplary yeast cell promoters can be derived from the genes for Saccharomyces cerevisiae enolase (ENO-1), Saccharomyces cerevisiae galactokinase (GAL1), Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH2/GAP), and Saccharomyces cerevisiae 3-phosphoglycerate kinase. Other useful promoters for yeast host cells are described by Romanos et al., 1992, Yeast 8: 423-488.
The control sequence may also be a suitable transcription terminator sequence, a sequence recognized by a host cell to terminate transcription. The terminator sequence is operably linked to the 3' terminus of the nucleic acid sequence encoding the polypeptide. Any terminator which is functional in the host cell of choice may be used in the present invention. For example, exemplary transcription terminators for filamentous fungal host cells may be obtained from the genes for Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Aspergillus niger alpha-glucosidase, and Fusarium oxysporum trypsin-like protease. Exemplary terminators for yeast host cells can be obtained from the genes for Saccharomyces cerevisiae enolase, Saccharomyces cerevisiae cytochrome C (CYC1), and Saccharomyces cerevisiae glyceraldehyde-3-phosphate dehydrogenase. Other useful terminators for yeast host cells are described by Romanos et al., 1992, supra.
The control sequence may also be a suitable leader sequence, which is an untranslated region of an mRNA important for translation by the host cell. The leader sequence is operably linked to the 5' terminus of the nucleic acid sequence encoding the polypeptide. Any leader sequence that is functional in the host cell of choice may be used. Exemplary leaders for filamentous fungal host cells are obtained from the genes for Aspergillus oryzae TAKA amylase and Aspergillus nidulans triose phosphate isomerase. Suitable leader sequences for yeast host cells are obtained from the genes for Saccharomyces cerevisiae enolase (ENO-1), Saccharomyces cerevisiae 3-phosphoglycerate kinase, Saccharomyces cerevisiae alpha-factor, and Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH2/GAP).
The control sequence may also be a polyadenylation sequence, a sequence operably linked to the 3' terminus of the nucleic acid sequence and which, when transcribed, is recognized by the host cell as a signal to add polyadenosine residues to transcribed mRNA. Any polyadenylation sequence which is functional in the host cell of choice may be used in the present invention. Exemplary polyadenylation sequences for filamentous fungal host cells may be derived from the genes for Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Fusarium oxysporum trypsin-like protease, and Aspergillus niger alpha-glucosidase. Useful polyadenylation sequences for yeast host cells are set forth in Guo and Sherman, 1995, Mol Cell Bio 15: 5983-5990.
The control sequence may also be a signal peptide coding region that encodes an amino acid sequence linked to the amino terminus of the polypeptide and directs the encoded polypeptide into the cell's secretory pathway. The 5' end of the coding sequence of the nucleic acid sequence may inherently contain a signal peptide coding region naturally linked in translation reading frame with the segment of the coding region encoding the secreted polypeptide. Alternatively, the 5' end of the coding sequence may comprise a signal peptide coding region foreign to the coding sequence. Any signal peptide coding region that directs the expressed polypeptide into the secretory pathway of a host cell of choice may be used for expression of the engineered polypeptide. Effective signal peptide coding regions for bacterial host cells are those obtained from the genes for Bacillus NCIB 11837 maltogenic amylase, Bacillus stearothermophilus alpha-amylase, Bacillus licheniformis subtilisin, Bacillus licheniformis beta-lactamase, Bacillus stearothermophilus neutral proteases (nprT, nprS, nprM), and Bacillus subtilis prsA. Additional signal peptides are described in Simonen and Palva, 1993, Microbiol Rev 57: 109-137. Effective signal peptide coding regions for filamentous fungal host cells may be those obtained from the genes for Aspergillus oryzae TAKA amylase, Aspergillus niger neutral amylase, Aspergillus niger glucoamylase, Rhizomucor miehei aspartic proteinase, Humicola insolens cellulase, and Humicola lanuginosa lipase. Useful signal peptides for yeast host cells can be derived from the genes for Saccharomyces cerevisiae alpha-factor and Saccharomyces cerevisiae invertase.
The control sequence may also be a propeptide coding region that codes for an amino acid sequence positioned at the amino terminus of a polypeptide. The resulting polypeptide is referred to as a proenzyme or propolypeptide (or a zymogen in some cases). The propolypeptide may be converted to the mature active polypeptide by catalytic or autocatalytic cleavage of the propeptide from the propolypeptide. The propeptide coding region may be obtained from the genes for Bacillus subtilis alkaline protease (aprE), Bacillus subtilis neutral protease (nprT), Saccharomyces cerevisiae alpha-factor, Rhizomucor miehei aspartic proteinase, and Myceliophthora thermophila laccase (WO 95/33836). When both the signal peptide and the propeptide region are present at the amino terminus of a polypeptide, the propeptide region is positioned next to the amino terminus of the polypeptide and the signal peptide region is positioned next to the amino terminus of the propeptide region.
It may also be desirable to add regulatory sequences that allow for the regulation of the expression of the polypeptide relative to the growth of the host cell. Examples of regulatory systems are those that cause the expression of a gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. In prokaryotic host cells, suitable regulatory sequences include the lac, tac and trp operator systems. In yeast host cells, suitable regulatory systems include, for example, the ADH2 system or the GAL1 system. In filamentous fungi, suitable regulatory sequences include the TAKA alpha-amylase promoter, the Aspergillus niger glucoamylase promoter, and the Aspergillus oryzae glucoamylase promoter.
In another aspect, the disclosure also relates to recombinant expression vectors, which include a polynucleotide encoding an engineered transaminase polypeptide, and one or more expression-regulatory regions such as promoters and terminators, origins of replication, and the like, depending on the type of host into which they are to be introduced. The various nucleic acid and control sequences described above may be linked together to produce a recombinant expression vector, which may include one or more convenient restriction sites to allow for insertion or substitution of the nucleic acid sequence encoding the polypeptide at such sites. Alternatively, the nucleic acid sequences of the present disclosure may be expressed by inserting the nucleic acid sequences or nucleic acid constructs comprising the sequences into an appropriate expression vector. In the production of the expression vector, the coding sequence is located in the vector such that the coding sequence is operably linked to appropriate control sequences for expression.
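The 5' to 3' arrangement of the control sequences described above can be summarized in a small data structure. This is an illustrative sketch only: the class name, field names and bracketed placeholder labels are assumptions made for readability, and a real vector would also carry elements such as an origin of replication and a selectable marker.

```python
from dataclasses import dataclass


@dataclass
class ExpressionCassette:
    """Control sequences and coding sequence, each held as a DNA string."""

    promoter: str
    coding_sequence: str
    terminator: str
    leader: str = ""          # untranslated leader, 5' of the coding region
    signal_peptide: str = ""  # encodes the amino-terminal secretion signal
    propeptide: str = ""      # lies between the signal peptide and the polypeptide

    def assemble(self) -> str:
        """Join the elements in 5' to 3' order."""
        parts = [
            self.promoter,
            self.leader,
            self.signal_peptide,
            self.propeptide,
            self.coding_sequence,
            self.terminator,
        ]
        return "".join(parts)


# Placeholder labels stand in for real DNA so that the order is easy to read.
cassette = ExpressionCassette(
    promoter="[promoter]",
    leader="[leader]",
    signal_peptide="[signal]",
    propeptide="[pro]",
    coding_sequence="[transaminase CDS]",
    terminator="[terminator]",
)
print(cassette.assemble())
```

Optional elements default to empty strings, reflecting that a given construct need not include a leader, signal peptide or propeptide region.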
The recombinant expression vector may be any vector (e.g., a plasmid or virus) that can be conveniently subjected to recombinant DNA procedures and can cause expression of the polynucleotide sequence. The choice of vector will generally depend on the compatibility of the vector with the host cell into which the vector is to be introduced. The vector may be a linear plasmid or a closed circular plasmid.
The expression vector may be an autonomously replicating vector, i.e., a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome. The vector may comprise any means for ensuring self-replication. Alternatively, the vector may be one that is integrated into the genome when introduced into the host cell and replicated together with the chromosome(s) into which it has been integrated. Furthermore, a single vector or plasmid, or two or more vectors or plasmids which together contain the total DNA to be introduced into the genome of the host cell, or a transposon may be used.
The expression vector preferably comprises one or more selectable markers which allow for easy selection of transformed cells. Selectable markers are genes the products of which provide biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like. Examples of bacterial selectable markers are the dal genes from Bacillus subtilis or Bacillus licheniformis, or markers conferring antibiotic resistance such as ampicillin, kanamycin, chloramphenicol (Example 1) or tetracycline resistance. Suitable markers for yeast host cells are ADE2, HIS3, LEU2, LYS2, MET3, TRP1 and URA3. Selectable markers for use in a filamentous fungal host cell include, but are not limited to, amdS (acetamidase), argB (ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), hph (hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5'-phosphate decarboxylase), sC (sulfate adenyltransferase), and trpC (anthranilate synthase), and equivalents thereof. Embodiments for use in Aspergillus cells include the amdS and pyrG genes of Aspergillus nidulans or Aspergillus oryzae and the bar gene of Streptomyces hygroscopicus.
In another aspect, the present disclosure provides a host cell comprising a polynucleotide encoding an engineered transaminase polypeptide of the present disclosure operably linked to one or more control sequences for expression of the transaminase in the host cell. Host cells for expressing the polypeptides encoded by the expression vectors of the invention are well known in the art and include, but are not limited to, bacterial cells such as E. coli, Vibrio fluvialis, Streptomyces and Salmonella typhimurium cells; fungal cells such as yeast cells (e.g., Saccharomyces cerevisiae or Pichia pastoris (ATCC Accession No. 201178)); insect cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS, BHK, 293 and Bowes melanoma cells; and plant cells. Exemplary host cells are E. coli W3110 (ΔfhuA) and BL21.
Accordingly, in another aspect, the present disclosure provides a method of preparing an engineered transaminase polypeptide, wherein the method can include culturing a host cell capable of expressing a polynucleotide encoding the engineered transaminase polypeptide under conditions suitable for expression of the polypeptide. The method may further comprise isolating or purifying the expressed transaminase polypeptide, as described herein.
Suitable media and growth conditions for such host cells are well known in the art. Polynucleotides for expressing the transaminase can be introduced into cells by various methods known in the art. Techniques include, among others, electroporation, biolistic methods, liposome-mediated transfection, calcium chloride transfection, and protoplast fusion.
For the embodiments herein, the engineered polypeptides and corresponding polynucleotides may be obtained using methods used by those of skill in the art. Parent polynucleotide sequences encoding the Vibrio fluvialis wild-type polypeptide are described in Shin et al., 2003, Appl. Microbiol. Biotechnol. 61(5-6): 463-471, and methods of producing engineered transaminase polypeptides having improved stability and substrate recognition properties are disclosed in US8,470,564, US9,029,106, US9,512,410, US9,944,909, US10,323,233, US10,550,370, US8,852,900 and US8,932,838, each of which is incorporated herein by reference.
Engineered transaminases having the properties disclosed herein can be obtained by subjecting a polynucleotide encoding a naturally occurring or engineered transaminase to mutagenesis and/or directed evolution methods, as discussed above. For example, mutagenesis and directed evolution methods can be readily applied to polynucleotides to generate libraries of variants that can be expressed, screened, and assayed. Mutagenesis and directed evolution methods are well known in the art (see, e.g., U.S. Pat. Nos. 5,605,793, 5,811,238, 5,830,721, 5,837,458, 6,117,679, 6,132,970, 6,165,793, 6,335,160, 6,395,547, 8,457,903, 8,504,498, 8,589,085, 8,762,066, 8,768,871, 9,593,326, and all related non-U.S. counterpart patents; Ling et al., Anal. Biochem., 254(2):157-78 [1997]; Dale et al., Meth. Mol. Biol., 57:369-74 [1996]; Smith, Ann. Rev. Genet., 19:423-462 [1985]; Botstein et al., Science, 229:1193-1201 [1985]; Carter, Biochem. J., 237:1-7 [1986]; Kramer et al., Cell, 38:879-887 [1984]; Wells et al., Gene, 34:315-323 [1985]; Minshull et al., Curr. Op. Chem. Biol., 3:284-290 [1999]; Christians et al., Nat. Biotechnol., 17:259-264 [1999]; Crameri et al., Nature, 391:288-291 [1998]; Crameri et al., Nat. Biotechnol., 15:436-438 [1997]; Zhang et al., Proc. Nat. Acad. Sci. U.S.A., 94:4504-4509 [1997]; Crameri et al., Nat. Biotechnol., 14:315-319 [1996]; Stemmer, Nature, 370:389-391 [1994]; Stemmer, Proc. Nat. Acad. Sci. USA, 91:10747-10751 [1994]; WO 95/22625, WO 97/0078, WO 97/35966, WO 98/27230, WO 00/42651, WO 01/75767 and WO 2009/152336, all of which are incorporated herein by reference).
The clones obtained after the mutagenesis treatment can be screened for engineered transaminases having the desired improved enzymatic properties. For example, where the desired improved enzyme property is thermostability, the enzyme preparation may be subjected to a defined temperature and the amount of enzyme activity remaining after the heat treatment may then be measured. Clones containing the transaminase-encoding polynucleotide are then isolated, sequenced to determine the nucleotide sequence changes (if any), and used to express the enzyme in the host cell. Measurement of enzyme activity from an expression library can be performed using standard biochemical techniques, such as HPLC analysis after derivatization of the product amine (e.g., derivatization with OPA).
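A thermostability screen of this kind reduces to comparing activity before and after the heat challenge. The following is a minimal sketch using hypothetical variant names and activity values in arbitrary units; none of the figures are results from this disclosure, and the function name is an assumption.

```python
def residual_activity_percent(activity_before: float, activity_after: float) -> float:
    """Return the activity remaining after heat treatment, as a percentage."""
    if activity_before <= 0:
        raise ValueError("activity before heat treatment must be positive")
    return 100.0 * activity_after / activity_before


# Hypothetical screening data: variant -> (activity before, activity after heating).
screen = {
    "variant_A": (10.0, 2.0),
    "variant_B": (8.0, 6.0),
    "variant_C": (10.0, 9.0),
}

# Rank variants from most to least thermostable by residual activity.
ranked = sorted(
    screen,
    key=lambda name: residual_activity_percent(*screen[name]),
    reverse=True,
)
for name in ranked:
    print(name, f"{residual_activity_percent(*screen[name]):.0f}%")
```

Ranking by the ratio rather than by absolute activity after heating separates stability from expression level or specific activity, which may be screened separately.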
When the sequence of the engineered polypeptide is known, the polynucleotide encoding the enzyme may be prepared by standard solid-phase methods according to known synthetic methods. In some embodiments, fragments of up to about 100 bases can be synthesized separately and then joined (e.g., by enzymatic or chemical ligation methods or polymerase-mediated methods) to form any desired contiguous sequence. For example, the polynucleotides and oligonucleotides disclosed herein may be prepared by chemical synthesis using, for example, the classical phosphoramidite method described in Beaucage et al., 1981, Tet. Lett. 22: 1859-69, or the method described in Matthes et al., 1984, EMBO J. 3: 801-05, e.g., as typically practiced in automated synthesis methods. According to the phosphoramidite method, oligonucleotides are synthesized, e.g., in an automated DNA synthesizer, purified, annealed, ligated and cloned in appropriate vectors.
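The fragment-based approach can be sketched as a simple partition of the target sequence into pieces no longer than about 100 bases. The example below is an assumption-laden illustration: the 250-base target is a hypothetical repeat, the function name is invented for this sketch, and real assembly schemes typically use overlapping or complementary oligonucleotides rather than the blunt, non-overlapping pieces shown here.

```python
def split_into_fragments(sequence: str, max_length: int = 100) -> list[str]:
    """Split a sequence into consecutive fragments of at most max_length bases."""
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    return [
        sequence[start:start + max_length]
        for start in range(0, len(sequence), max_length)
    ]


# Hypothetical 250-base target sequence (illustration only).
target = "ATGC" * 60 + "ATGCATGCAT"
fragments = split_into_fragments(target)
print([len(fragment) for fragment in fragments])  # [100, 100, 50]

# Joining the fragments in order restores the contiguous target sequence.
reassembled = "".join(fragments)
print(reassembled == target)  # True
```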
Thus, in some embodiments, a method for preparing an engineered transaminase polypeptide can include: (a) synthesizing a polynucleotide encoding a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356 and 358, and having one or more residue differences as compared to SEQ ID NO: 4.
In some embodiments of the methods, the amino acid sequence encoded by the polynucleotide may optionally have one or several (e.g., up to 3, 4, 5, or up to 10) amino acid residues deleted, inserted, and/or substituted. In some embodiments, the amino acid sequence optionally has 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-15, 1-20, 1-21, 1-22, 1-23, 1-24, 1-25, 1-30, 1-35, 1-40, 1-45, or 1-50 amino acid residue deletions, insertions, and/or substitutions. In some embodiments, the amino acid sequence optionally has 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, or 50 amino acid residues deleted, inserted, and/or substituted. In some embodiments, the amino acid sequence optionally has 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18, 20, 21, 22, 23, 24, or 25 amino acid residue deletions, insertions, and/or substitutions. In some embodiments, the substitution may be a conservative substitution or a non-conservative substitution.
The expressed engineered transaminase can be measured for desirable improved properties, e.g., activity, enantioselectivity, stability and product tolerance, in the conversion of compound (2) to compound (3) by any of the assay conditions described herein.
In some embodiments, any engineered transaminase expressed in a host cell can be recovered from the cell and/or culture medium using any one or more of the well known techniques for protein purification including, among others, lysozyme treatment, sonication, filtration, salting out, ultracentrifugation, and chromatography. Suitable solutions for lysing and efficient extraction of proteins from bacteria such as E. coli are provided in the examples and are also commercially available, e.g., CelLytic B™ from Sigma-Aldrich of St. Louis, MO.
Chromatographic techniques for isolating transaminase polypeptides include, among others, reverse-phase chromatography, high-performance liquid chromatography, ion-exchange chromatography, gel electrophoresis, and affinity chromatography. The conditions used to purify a particular enzyme will depend in part on factors such as net charge, hydrophobicity, hydrophilicity, molecular weight, molecular shape, and the like, and will be apparent to one skilled in the art.
In some embodiments, affinity techniques may be used to isolate improved transaminases. For affinity chromatography purification, any antibody that specifically binds to a transaminase polypeptide can be utilized. For antibody production, a variety of host animals including, but not limited to, rabbits, mice, rats, and the like, may be immunized by injection with a transaminase polypeptide or fragment thereof. The transaminase polypeptides or fragments can be attached to a suitable carrier, such as BSA, by side-chain functionality or by means of linkers attached to the side-chain functionality.
5.7 Methods of using engineered transaminase polypeptides
As described above, the engineered transaminase polypeptides of the present disclosure are evolved to efficiently convert the ketone of exemplary substrate compound (2) to the corresponding chiral amine of exemplary product compound (3) in stereomeric excess in the presence of an amino donor under suitable reaction conditions. The structural features of the engineered transaminase polypeptides also allow conversion of prochiral ketone substrate compounds other than compound (2) to their corresponding amine compounds in stereomeric excess. Accordingly, in another aspect, the present disclosure provides methods of performing a transamination reaction using an engineered transaminase polypeptide, wherein an amino group from an amino donor is transferred to an amino acceptor, e.g., a ketone substrate compound, to produce an amine compound. Generally, methods for performing a transamination reaction include contacting or incubating an engineered transaminase polypeptide of the disclosure with an amino acceptor (e.g., a ketone substrate compound) and an amino donor (e.g., isopropylamine) under reaction conditions suitable for converting the amino acceptor to an amine compound.
For the foregoing methods, any of the engineered transaminase polypeptides described herein can be used. For example and without limitation, in some embodiments, the methods can use an engineered polypeptide of the present disclosure having transaminase activity that comprises an amino acid sequence having sequence identity to a reference sequence selected from the SEQ ID NOs of the exemplary polypeptides listed in the following paragraph, and that has one or more residue differences as compared to SEQ ID NO: 4.
In some embodiments, an exemplary transaminase polypeptide capable of performing the methods herein can be a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356 and 358. Guidance regarding the selection and use of engineered transaminase polypeptides is provided in the descriptions herein, such as Tables 5.1 and 6.1 and the examples.
In some embodiments herein and in the examples, it is illustrated that various ranges of suitable reaction conditions may be used, including, but not limited to, the following ranges: amino donor, pH, temperature, buffer, solvent system, substrate loading, polypeptide loading, cofactor loading, pressure, and reaction time. In view of the guidance provided herein, additional suitable reaction conditions for performing the methods of biocatalytically converting a substrate compound into a product compound using the engineered transaminase polypeptides described herein can be readily optimized by routine experimentation, including, but not limited to, contacting the engineered transaminase polypeptide with the substrate compound under experimental reaction conditions of concentration, pH, temperature, solvent conditions, and detecting the product compound.
In some embodiments herein, the transaminase polypeptide uses an amino donor to form a product compound. In some embodiments, the amino donor in the reaction conditions may be selected from isopropylamine (also referred to herein as "IPM"), putrescine, L-lysine, α-phenethylamine, D-alanine, L-alanine, or D,L-ornithine. In some embodiments, the amino donor is selected from IPM, putrescine, L-lysine, and D- or L-alanine. In some embodiments, the amino donor is IPM. In some embodiments, suitable reaction conditions include an amino donor, particularly IPM, present at a concentration of at least about 0.1M to about 3M, 0.2M to about 2.5M, about 0.5M to about 2M, or about 1M to about 2M. In some embodiments, the amino donor is present at a concentration of about 0.1M, 0.2M, 0.3M, 0.4M, 0.5M, 0.6M, 0.7M, 0.8M, 1M, 1.5M, 2M, 2.5M, or 3M. Higher concentrations of amino donors such as IPM can be used to shift the equilibrium towards amine product formation.
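The effect of amino donor excess on the equilibrium can be illustrated with a minimal sketch. It assumes a simple 1:1 reaction (ketone + amino donor ⇌ amine + carbonyl by-product), ideal behavior, and no products present at the start. The function name and any equilibrium constant passed to it are hypothetical placeholders and are not values taken from this disclosure.

```python
import math

def equilibrium_conversion(ketone_m, donor_m, k_eq):
    """Fraction of ketone converted at equilibrium for
    ketone + donor <=> amine + by-product, starting with no products.

    ketone_m, donor_m: initial molar concentrations (assumed inputs)
    k_eq: equilibrium constant (assumed; not given in the disclosure)
    """
    s, d, k = ketone_m, donor_m, k_eq
    # K = x^2 / ((S - x)(D - x))  ->  (K - 1) x^2 - K (S + D) x + K S D = 0
    a = k - 1.0
    b = -k * (s + d)
    c = k * s * d
    if abs(a) < 1e-12:
        x = -c / b  # the quadratic collapses to a linear equation when K == 1
    else:
        # this root is the one lying between 0 and min(S, D)
        x = (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return x / s

if __name__ == "__main__":
    # Compare several donor concentrations at a fixed, hypothetical K.
    for donor in (0.5, 1.0, 2.0):
        print(donor, round(equilibrium_conversion(0.1, donor, 0.1), 3))
```

Under these assumptions, raising the donor concentration at fixed ketone concentration raises the predicted equilibrium conversion, which is the qualitative behavior described above.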
Suitable reaction conditions for using engineered transaminase polypeptides also typically include a cofactor. Cofactors useful for the transaminases herein include, but are not limited to, pyridoxal 5'-phosphate (also known as pyridoxal phosphate, PLP, P5P), pyridoxine (PN), pyridoxal (PL), pyridoxamine (PM), and their phosphorylated counterparts pyridoxine phosphate (PNP) and pyridoxamine phosphate (PMP). In some embodiments, the cofactor PLP is naturally present in the cell extract and does not need to be supplemented. In some embodiments of the method, suitable reaction conditions include exogenous cofactor added to the enzyme reaction mixture, for example, when a partially purified or purified transaminase is used. In some embodiments, suitable reaction conditions may include a cofactor selected from PLP, PN, PL, PM, PNP and PMP present at a concentration of about 0.1g/L to about 10g/L, about 0.2g/L to about 5g/L, or about 0.5g/L to about 2.5g/L. In some embodiments, the reaction conditions include a PLP concentration of about 0.1g/L or less, 0.2g/L or less, 0.5g/L or less, 1g/L or less, 2.5g/L or less, 5g/L or less, or 10g/L or less. In some embodiments, the cofactor may be added at the beginning of the reaction and/or additional cofactor may be added during the reaction.
The substrate compounds in the reaction mixture may vary in view of, for example, the amount of desired product compounds, the effect of substrate concentration on enzyme activity, the stability of the enzyme under reaction conditions, and the percent conversion of substrate to product. In some embodiments, suitable reaction conditions include a substrate compound loading of at least about 0.5g/L to about 200g/L, 1g/L to about 200g/L, about 5g/L to about 150g/L, about 10g/L to about 100g/L, about 20g/L to about 100g/L, or about 50g/L to about 100 g/L. In some embodiments, suitable reaction conditions include a substrate compound loading of at least about 0.5g/L, at least about 1g/L, at least about 5g/L, at least about 10g/L, at least about 15g/L, at least about 20g/L, at least about 30g/L, at least about 50g/L, at least about 75g/L, at least about 100g/L, at least about 150g/L, or at least about 200g/L, or even greater. While the values for substrate loading provided herein are based on the molecular weight of compound (2), it is also contemplated that equimolar amounts of the various hydrates and salts of compound (2) may also be used in the process.
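The equimolar adjustment for hydrates and salts mentioned above can be sketched as follows. The molecular weights used in the usage note are hypothetical placeholders, because the molecular weight of compound (2) is not stated in this section.

```python
def equimolar_loading(free_form_g_per_l, mw_free_form, mw_alternate_form):
    """Return the g/L of a salt or hydrate that supplies the same molar
    amount as a given g/L loading of the free compound.

    Both molecular weights (g/mol) are caller-supplied assumptions.
    """
    if mw_free_form <= 0 or mw_alternate_form <= 0:
        raise ValueError("molecular weights must be positive")
    moles_per_l = free_form_g_per_l / mw_free_form
    return moles_per_l * mw_alternate_form
```

For example, with a hypothetical free-form molecular weight of 250 g/mol and a hypothetical salt-form molecular weight of 286.5 g/mol, `equimolar_loading(50, 250, 286.5)` gives the salt loading that matches 50 g/L of the free form on a molar basis.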
In carrying out the reactions described herein, the engineered transaminase polypeptides can be added to the reaction mixture in the form of purified enzymes, intact cells transformed with genes encoding the enzymes, and/or cell extracts and/or lysates of such cells. Intact cells transformed with a gene encoding an engineered transaminase, or cell extracts thereof, lysates thereof, and isolated enzymes can be used in a variety of different forms, including solid (e.g., lyophilized, spray-dried, etc.) or semi-solid (e.g., crude paste). The cell extract or cell lysate may be partially purified by precipitation (ammonium sulfate, polyethylenimine, heat treatment, etc.), followed by a desalting procedure (e.g., ultrafiltration, dialysis, etc.), and then lyophilized. Any cell preparation may be stabilized by crosslinking using known crosslinking agents such as, for example, glutaraldehyde, or immobilization to a solid phase (e.g., eupergit C, etc.).
Genes encoding the engineered transaminase polypeptides can be transformed into host cells separately or together into the same host cell. For example, in some embodiments, one set of host cells may be transformed with a gene encoding one engineered transaminase polypeptide, and another set of host cells may be transformed with a gene encoding another engineered transaminase polypeptide. Both groups of transformed host cells may be used together in the reaction mixture in the form of whole cells, or in the form of a lysate or extract derived therefrom. In other embodiments, the host cell may be transformed with genes encoding a variety of engineered transaminase polypeptides. In some embodiments, the engineered polypeptide may be expressed in the form of a secreted polypeptide and a medium containing the secreted polypeptide may be used for the transaminase reaction.
The enhancement of the activity and/or stereoselectivity of the engineered transaminase polypeptides disclosed herein provides a method in which a higher percentage of conversion can be achieved at a lower concentration of the engineered polypeptide. In some embodiments of the method, suitable reaction conditions include an engineered polypeptide concentration of about 0.01g/L to about 50g/L, about 0.05g/L to about 50g/L, about 0.1g/L to about 40g/L, about 1g/L to about 40g/L, about 2g/L to about 40g/L, about 5g/L to about 30g/L, about 0.1g/L to about 10g/L, about 0.5g/L to about 10g/L, about 1g/L to about 10g/L, about 0.1g/L to about 5g/L, about 0.5g/L to about 5g/L, or about 0.1g/L to about 2 g/L. In some embodiments, the concentration of the transaminase polypeptide is about 0.01g/L, 0.05g/L, 0.1g/L, 0.2g/L, 0.5g/L, 1g/L, 2g/L, 5g/L, 10g/L, 15g/L, 20g/L, 25g/L, 30g/L, 35g/L, 40g/L, or 50g/L.
During the course of the transamination reaction, the pH of the reaction mixture may change. The pH of the reaction mixture may be maintained at or within a desired pH range. This can be accomplished by adding an acid or base before and/or during the course of the reaction. Alternatively, the pH may be controlled by using a buffer. Accordingly, in some embodiments, the reaction conditions include a buffer. Suitable buffers to maintain the desired pH range are well known in the art and include, for example and without limitation, borates, carbonates, phosphates, triethanolamine (TEA), and the like. In some embodiments, the buffer is a borate. In some embodiments of the process, suitable reaction conditions include a TEA buffer solution, wherein the TEA concentration is from about 0.01M to about 0.4M, 0.05M to about 0.4M, 0.1M to about 0.3M, or about 0.1M to about 0.2M. In some embodiments, the reaction conditions include a TEA concentration of about 0.01M, 0.02M, 0.03M, 0.04M, 0.05M, 0.07M, 0.1M, 0.12M, 0.14M, 0.16M, 0.18M, 0.2M, 0.3M, or 0.4M. In some embodiments, the reaction conditions include water as a suitable solvent, without a buffer.
In embodiments of the process, the reaction conditions may include a suitable pH. The desired pH or desired pH range may be maintained by the use of an acid or base, a suitable buffer, or a combination of buffering and addition of an acid or base. The pH of the reaction mixture may be controlled prior to and/or during the reaction process. In some embodiments, suitable reaction conditions include a pH of from about 6 to about 12, a pH of from about 6 to about 10, a pH of from about 6 to about 8, a pH of from about 7 to about 10, a pH of from about 7 to about 9, or a pH of the solution of from about 7 to about 8. In some embodiments, the reaction conditions include a solution pH of about 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10, 10.5, 11, 11.5, or 12.
In embodiments of the methods herein, a suitable temperature may be used for the reaction conditions, taking into consideration, for example, the increase in reaction rate at higher temperatures and the activity of the enzyme over the reaction period. For example, the engineered polypeptides of the present disclosure have increased stability relative to naturally occurring transaminase polypeptides, e.g., the wild-type polypeptide of SEQ ID NO: 2 or engineered variants of the wild-type polypeptide, which allows the engineered polypeptides to be used at higher temperatures with increased conversion rates and improved substrate solubility characteristics. Thus, in some embodiments, suitable reaction conditions include a temperature of about 10 ℃ to about 70 ℃, about 10 ℃ to about 65 ℃, about 15 ℃ to about 60 ℃, about 20 ℃ to about 55 ℃, about 30 ℃ to about 55 ℃, or about 40 ℃ to about 50 ℃. In some embodiments, suitable reaction conditions include a temperature of about 10 ℃, 15 ℃, 20 ℃, 25 ℃, 30 ℃, 35 ℃, 40 ℃, 45 ℃, 50 ℃, 55 ℃, 60 ℃, 65 ℃, or 70 ℃. In some embodiments, the temperature during the enzymatic reaction may be maintained at one temperature throughout the reaction or adjusted over a temperature profile during the reaction.
The process herein is typically carried out in a solvent. Suitable solvents include water, aqueous buffer solutions, organic solvents, polymeric solvents, and/or co-solvent systems, which typically include aqueous solvents, organic solvents, and/or polymeric solvents. The aqueous solvent (water or aqueous co-solvent system) may be pH-buffered or unbuffered. In some embodiments, the method is generally performed in an aqueous co-solvent system comprising organic solvents (e.g., ethanol, isopropyl alcohol (IPA), dimethyl sulfoxide (DMSO), ethyl acetate, butyl acetate, 1-octanol, heptane, octane, methyl tert-butyl ether (MTBE), toluene, etc.) or ionic or polar solvents (e.g., 1-ethyl-4-methylimidazolium tetrafluoroborate, 1-butyl-3-methylimidazolium hexafluorophosphate, glycerol, polyethylene glycol, etc.). In some embodiments, the co-solvent may be a polar solvent, such as a polyol, dimethyl sulfoxide (DMSO), or a lower alcohol. The non-aqueous co-solvent component of the aqueous co-solvent system may be miscible with the aqueous component to provide a single liquid phase, or may be partially miscible or immiscible with the aqueous component to provide two liquid phases. An exemplary aqueous co-solvent system may include water and one or more co-solvents selected from the group consisting of organic solvents, polar solvents, and polyol solvents. Typically, the co-solvent component of the aqueous co-solvent system is selected so that it does not inactivate the transaminase under the reaction conditions. Suitable co-solvent systems can be readily identified by measuring the enzymatic activity of a particular engineered transaminase with a defined substrate of interest in the candidate solvent system, using an enzymatic activity assay such as those described herein.
In some embodiments of the process, suitable reaction conditions include an aqueous co-solvent, wherein the co-solvent comprises about 1% to about 80% (v/v), about 1% to about 70% (v/v), about 2% to about 60% (v/v), about 5% to about 40% (v/v), 10% to about 30% (v/v), or about 10% to about 20% (v/v) DMSO. In some embodiments of the method, suitable reaction conditions include an aqueous co-solvent comprising at least about 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, or 80% (v/v) DMSO. In some embodiments of the method, suitable reaction conditions include an aqueous co-solvent comprising from about 15% (v/v) to about 45% (v/v), from about 20% (v/v) to about 30% (v/v) DMSO, and in some embodiments, the DMSO concentration is about 25% (v/v).
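As a small worked illustration of the % (v/v) figures above, the sketch below splits a target reaction volume into DMSO and aqueous portions. It assumes that volumes are additive, which ignores the volume contraction that occurs when DMSO and water are mixed; the function name is hypothetical.

```python
def cosolvent_volumes(total_ml, dmso_percent_vv):
    """Return (dmso_ml, aqueous_ml) for a target total volume and a
    DMSO content in % (v/v), assuming additive volumes."""
    if not 0.0 <= dmso_percent_vv <= 100.0:
        raise ValueError("percentage must be between 0 and 100")
    dmso_ml = total_ml * dmso_percent_vv / 100.0
    return dmso_ml, total_ml - dmso_ml
```

For a 100 mL reaction at about 25% (v/v) DMSO, `cosolvent_volumes(100, 25)` returns the DMSO volume and the aqueous volume to combine.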
In some embodiments of the process, suitable reaction conditions include an aqueous co-solvent, where the co-solvent may include a polymer polyol solvent. Examples of suitable polyol solvents include, for example and without limitation, polyethylene glycol methyl ether, diethylene glycol dimethyl ether, triethylene glycol dimethyl ether, and polypropylene glycol. In some embodiments, the aqueous co-solvent comprises polyethylene glycol, polyethylene glycols of different molecular weights being available. Particularly useful are lower molecular weight polyethylene glycols, such as PEG200 to PEG600. Accordingly, in some embodiments, the aqueous co-solvent may include PEG200 from about 1% to about 40% v/v, from about 2% to about 40% v/v, from about 5% to about 40% v/v, from 2% to about 30% v/v, from 5% to about 30% v/v, from 1 to about 20% v/v, from about 2% to about 20% v/v, from about 5% to about 20% v/v, from about 1% to about 10% v/v, from about 2% to about 10% v/v. In some embodiments, suitable reaction conditions include an aqueous co-solvent comprising about 1%, 2%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, or about 40% v/v PEG200.
The amount of reactant used in the transamination reaction will generally vary depending on the amount of product desired and the amount of transaminase substrate concomitantly used. One of ordinary skill in the art will readily understand how to vary these amounts to tailor them to the desired level of productivity and production scale.
In some embodiments, the order of addition of the reactants is not critical. The reactants may be added together simultaneously to the solvent (e.g., single phase solvent, biphasic aqueous co-solvent system, etc.), or alternatively, some of the reactants may be added separately, and some may be added together at different points in time. For example, cofactors, transaminases and transaminase substrates can be added first to the solvent.
Solid reactants (e.g., enzymes, salts, substrate compounds, etc.) can be provided to the reaction in a variety of different forms, including powders (e.g., lyophilized, spray dried, etc.), solutions, emulsions, suspensions, and the like. The reactants can be readily lyophilized or spray dried using methods and apparatus known to those of ordinary skill in the art. For example, the protein solution may be frozen in small aliquots at -80 ℃, then added to a pre-cooled lyophilization chamber, after which vacuum is applied.
For improved mixing efficiency when using an aqueous co-solvent system, the transaminase and cofactor may be added and mixed into the aqueous phase first. The organic phase may then be added and mixed, followed by the addition of the transaminase substrate. Alternatively, the transaminase substrate may be premixed in the organic phase prior to addition to the aqueous phase.
The transamination reaction is typically allowed to proceed until the additional conversion of the ketone substrate to the amine product does not change significantly over the reaction time, e.g., less than 10% of the substrate is converted, or less than 5% of the substrate is converted. In some embodiments, the reaction will be allowed to proceed until there is complete or near complete conversion of the substrate ketone to the product amine. Substrate to product conversion can be monitored by detecting the substrate and/or product using known methods. Suitable methods include gas chromatography, HPLC, and the like. The conversion yield of chiral amine product produced in the reaction mixture is typically greater than about 50%, alternatively greater than about 60%, alternatively greater than about 70%, alternatively greater than about 80%, alternatively greater than about 90%, and alternatively greater than about 97%. In some embodiments, the method for producing compound (3) using an engineered transaminase polypeptide under suitable reaction conditions results in a conversion of at least about 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more of a compound of a ketone substrate, e.g., compound (2), to an amine product compound, e.g., compound (3), in about 48 hours or less, in about 36 hours or less, in about 24 hours or less, or even less.
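The monitoring and endpoint logic described above can be sketched in a few lines. The sketch assumes that conversion is estimated from substrate and product peak areas (e.g., from HPLC) and that the ratio of detector response factors is known; the function names, the default response ratio of 1.0, and the plateau threshold are illustrative assumptions.

```python
def percent_conversion(substrate_area, product_area, response_ratio=1.0):
    """Estimate percent conversion from peak areas.

    response_ratio is the product response factor divided by the substrate
    response factor (assumed to be 1.0 unless calibrated).
    """
    product_equiv = product_area / response_ratio
    return 100.0 * product_equiv / (substrate_area + product_equiv)

def has_plateaued(conversions, threshold_points=5.0):
    """True when the last two time points differ by fewer than
    threshold_points percentage points of conversion."""
    if len(conversions) < 2:
        return False
    return (conversions[-1] - conversions[-2]) < threshold_points
```

A time course such as `[40.0, 80.0, 83.0]` would be flagged as plateaued at a 5-point threshold, whereas `[40.0, 80.0]` would not.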
In some embodiments of the process, suitable reaction conditions include a substrate loading of at least about 20g/L, 30g/L, 40g/L, 50g/L, 60g/L, 70g/L, 100g/L or more, and wherein the process results in a conversion of at least about 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more of the substrate compound to the product compound in about 48 hours or less, about 36 hours or less, or about 24 hours or less.
The engineered transaminase polypeptides of the disclosure, when used in a method for preparing chiral amine compound (3) under suitable reaction conditions, result in a stereomeric excess of the chiral amine of at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, or 99.9% e.e.
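For reference, the e.e. values quoted above follow the standard definition of enantiomeric excess, which can be computed from the amounts (or chiral-chromatography peak areas) of the two enantiomers. The sketch below assumes equal detector response for both enantiomers; the function name is hypothetical.

```python
def enantiomeric_excess(major, minor):
    """Percent e.e. = 100 * (major - minor) / (major + minor)."""
    total = major + minor
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * (major - minor) / total
```

For instance, a 99.5 : 0.5 ratio of enantiomers corresponds to `enantiomeric_excess(99.5, 0.5)`, i.e., 99% e.e.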
In further embodiments of the method, suitable reaction conditions may include an initial substrate loading in the reaction solution, which is then contacted with the polypeptide. The reaction solution is then further supplemented with additional substrate compound as a continuous addition at a rate of at least about 1g/L/h, at least about 2g/L/h, at least about 4g/L/h, at least about 6g/L/h, or higher. Thus, according to these suitable reaction conditions, the polypeptide is added to a solution having an initial substrate loading of at least about 20g/L, 30g/L, or 40g/L. After addition of the polypeptide, additional substrate is continuously added to the solution at a rate of about 2g/L/h, 4g/L/h, or 6g/L/h until a final substrate loading of at least about 30g/L, 40g/L, 50g/L, 60g/L, 70g/L, 100g/L, 150g/L, 200g/L, or more is reached. Accordingly, in some embodiments of the method, suitable reaction conditions include adding the polypeptide to a solution having an initial substrate loading of at least about 20g/L, 30g/L, or 40g/L, followed by adding additional substrate to the solution at a rate of about 2g/L/h, 4g/L/h, or 6g/L/h until a final substrate loading of at least about 30g/L, 40g/L, 50g/L, 60g/L, 70g/L, 100g/L, or more is reached. Such substrate-supplemented reaction conditions allow higher substrate loadings to be achieved while maintaining high conversion of at least about 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more of the ketone substrate to amine product. In some embodiments of the method, the additional substrate is added as a solution containing isopropylamine or isopropylamine acetate at a concentration of at least about 0.5M, at least about 1.0M, at least about 2.5M, at least about 5.0M, at least about 7.5M, or at least about 10.0M.
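The substrate-feeding scheme above implies a simple relationship between the initial loading, the feed rate, and the final cumulative loading. The sketch below computes the feeding time, assuming a constant feed rate expressed in g/L/h and neglecting dilution by the feed solution; the function name is hypothetical.

```python
def feed_time_hours(initial_g_per_l, final_g_per_l, rate_g_per_l_h):
    """Hours of continuous feeding needed to raise the cumulative substrate
    loading from the initial to the final value at a constant rate."""
    if rate_g_per_l_h <= 0:
        raise ValueError("feed rate must be positive")
    if final_g_per_l < initial_g_per_l:
        raise ValueError("final loading must not be below initial loading")
    return (final_g_per_l - initial_g_per_l) / rate_g_per_l_h
```

For example, starting at 20 g/L and feeding at 4 g/L/h to a final loading of 100 g/L corresponds to `feed_time_hours(20, 100, 4)`, i.e., 20 hours of feeding.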
In some embodiments of the process, the transamination reaction can include the following suitable reaction conditions: (a) a substrate loading of about 5g/L to 200 g/L; (b) about 0.1g/L to 50g/L of an engineered transaminase polypeptide; (c) about 0.1M to 4M Isopropylamine (IPM); (d) About 0.1g/L to about 10g/L pyridoxal phosphate (PLP) cofactor; (e) a pH of about 6 to 11; and (f) a temperature of about 30 ℃ to 60 ℃.
In some embodiments of the process, the transamination reaction can include the following suitable reaction conditions: (a) a substrate loading of about 10g/L to 150g/L; (b) about 0.5g/L to 20g/L of an engineered transaminase polypeptide; (c) about 0.1M to 3M isopropylamine (IPM); (d) about 0.1g/L to 1.0g/L pyridoxal phosphate (PLP) cofactor; (e) about 0.05M to 0.1M carbonate or borate buffer; (f) about 1% to about 45% DMSO; (g) a pH of about 7.5 to 11; and (h) a temperature of about 30 ℃ to 55 ℃.
In some embodiments of the process, the transamination reaction can include the following suitable reaction conditions: (a) a substrate loading of about 20g/L to 100g/L; (b) about 1g/L to 5g/L of an engineered transaminase polypeptide; (c) about 0.5M to 2.5M isopropylamine (IPM); (d) about 0.2g/L to 2g/L pyridoxal phosphate (PLP) cofactor; (e) about 0.1M borate buffer; (f) about 20% DMSO; (g) a pH of about 10; and (h) a temperature of about 45 ℃ to 60 ℃.
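A planned experiment can be checked against a condition set of this kind with a small range table. The sketch below encodes the broadest of the three sets above (substrate 5 to 200 g/L, enzyme 0.1 to 50 g/L, IPM 0.1 to 4 M, PLP 0.1 to 10 g/L, pH 6 to 11, 30 to 60 ℃). The dictionary keys are hypothetical names, and treating the "about" bounds as hard limits is a simplifying assumption.

```python
# Ranges taken from the first condition set; key names are illustrative only.
BROAD_RANGES = {
    "substrate_g_per_l": (5.0, 200.0),
    "enzyme_g_per_l": (0.1, 50.0),
    "ipm_molar": (0.1, 4.0),
    "plp_g_per_l": (0.1, 10.0),
    "ph": (6.0, 11.0),
    "temperature_c": (30.0, 60.0),
}

def out_of_range(conditions, ranges=BROAD_RANGES):
    """Return the names of parameters whose values fall outside their range."""
    return [
        name
        for name, (low, high) in ranges.items()
        if not low <= conditions[name] <= high
    ]

if __name__ == "__main__":
    planned = {
        "substrate_g_per_l": 50.0,
        "enzyme_g_per_l": 2.0,
        "ipm_molar": 1.0,
        "plp_g_per_l": 1.0,
        "ph": 10.0,
        "temperature_c": 50.0,
    }
    print(out_of_range(planned))
```

The narrower condition sets can be checked the same way by passing a different `ranges` dictionary.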
In some embodiments, additional reaction components or additional techniques are performed to supplement the reaction conditions. These may include taking measures to stabilize or prevent inactivation of the enzyme, reduce product inhibition, and/or shift the reaction equilibrium towards product amine formation.
Accordingly, in some embodiments of the process for preparing amines, such as chiral amines, additional amounts of the amino acceptor can be added (up to saturation) and/or the amino acceptor (ketone) formed can be continuously removed from the reaction mixture. For example, a solvent bridge or a two-phase co-solvent system may be used to move the amine product into an extraction solution, thereby reducing inhibition by the amine product and also shifting the equilibrium towards product formation (see, e.g., Yun and Kim, 2008, Biosci. Biotechnol. Biochem. 72(11):3030-3033).
In some embodiments of the method, suitable reaction conditions include the presence of the reduced cofactor nicotinamide adenine dinucleotide (NADH), which can act to limit inactivation of the transaminase (see, e.g., van Ophem et al., 1998, Biochemistry 37(9):2879-88). In such embodiments where NADH is present, a cofactor regeneration system, such as glucose dehydrogenase (GDH) and glucose, or formate dehydrogenase and formate, can be used to regenerate NADH in the reaction medium.
In some embodiments, the method may further comprise removing the carbonyl by-product formed from the amino group donor when the amino group is transferred to the amino group acceptor. Such in situ removal may reduce the rate of the reverse reaction such that the forward reaction dominates and more substrate is then converted to product. Removal of the carbonyl by-product can be performed in a number of ways. When the amino group donor is an amino acid such as alanine, the carbonyl by-product, a keto acid, can be removed by reaction with a peroxide (see, e.g., US 2008/0213845, incorporated herein by reference). Peroxides that may be used include, among others, hydrogen peroxide; peroxy acids (peracids) such as peracetic acid (CH3CO3H), trifluoroperacetic acid, and m-chloroperoxybenzoic acid; organic peroxides such as t-butyl peroxide ((CH3)3COOH); or other selective oxidants such as tetrapropylammonium perruthenate, MnO2, KMnO4, ruthenium tetroxide, and related compounds. Alternatively, removal of pyruvate can be accomplished by reducing it to lactate using lactate dehydrogenase to shift the equilibrium to the product amine (see, e.g., Koszelewski et al., 2008, Adv. Syn. Catal. 350:2761-2766). The removal of pyruvate can also be accomplished via its decarboxylation by using pyruvate decarboxylase (see, e.g., Höhne et al., 2008, ChemBioChem 9:363-365) or acetolactate synthase (see, e.g., Yun and Kim, supra).
Alternatively, in embodiments in which an amino acid is used as the amino group donor, the keto acid carbonyl by-product can be recycled to the amino acid by reacting with ammonia and NADH in the presence of an amine donor such as ammonia, using an appropriate dehydrogenase, e.g., an amino acid dehydrogenase, to replenish the amino group donor.
In some embodiments, when the carbonyl by-product produced by the selected amino donor has a vapor pressure higher than that of water (e.g., a low-boiling by-product such as a volatile organic carbonyl compound), the carbonyl by-product can be removed by sparging the reaction solution with a non-reactive gas, or by reducing the reaction pressure by applying a vacuum, and removing the carbonyl by-product present in the gas phase. A non-reactive gas is any gas that does not react with the reaction components. Various non-reactive gases include nitrogen and noble gases (e.g., inert gases). In some embodiments, the non-reactive gas is nitrogen. In some embodiments, the amino donor used in the method is isopropylamine (IPM), which forms the carbonyl by-product acetone upon transfer of the amino group to the amino acceptor. Acetone may be removed by sparging the reaction solution with nitrogen or applying a vacuum, and removing the acetone from the gas phase with an acetone trap, such as a condenser or other cold trap. Alternatively, acetone may be removed by reduction to isopropanol using a ketoreductase.
In some embodiments of the process for preparing chiral amines, nitrogen sweep (nitrogen sweep) is used to remove acetone to increase the conversion and yield of chiral amines under industrial process conditions.
In some embodiments of the above methods in which carbonyl byproducts are removed, the corresponding amino group donor may be added during the transamination reaction to replenish the amino donor and/or to maintain the pH of the reaction. Supplementing the amino group donor also shifts the equilibrium towards product formation, increasing the conversion of substrate to product. Thus, in some embodiments in which the amino group donor is isopropylamine and the acetone product is removed in situ, isopropylamine may be added to the solution to replenish the amino group donor lost during acetone removal and maintain the pH of the reaction.
In further embodiments, any of the above-described methods for converting a substrate compound to a product compound may further comprise one or more steps selected from the group consisting of: extraction, separation, purification and crystallization of the product compounds. Methods, techniques and protocols for extracting, isolating, purifying and/or crystallizing product amines from biocatalytic reaction mixtures produced by the above disclosed methods are known to one of ordinary skill and/or can be obtained by routine experimentation. Furthermore, illustrative methods are provided in the examples below.
Various features and embodiments of the present disclosure are illustrated in the following representative examples, which are intended to be illustrative and not limiting.
6. Examples
The following examples, including experiments and results obtained, are provided for illustrative purposes only and should not be construed as limiting the invention. Indeed, many of the reagents and equipment described below have a variety of suitable sources. It is not intended that the invention be limited to any particular source for any reagent or equipment item.
In the experimental disclosure below, the following abbreviations apply: M (molar); mM (millimolar), uM and μM (micromolar); nM (nanomolar); mol (moles); gm and g (grams); mg (milligrams); ug and μg (micrograms); l and L (liters); ml and mL (milliliters); cm (centimeters); mm (millimeters); um and μm (micrometers); sec (seconds); min(s) (minute(s)); h(s) and hr(s) (hour(s)); U (units); MW (molecular weight); rpm (revolutions per minute); psi and PSI (pounds per square inch); ℃ (degrees Celsius); rt and RT (room temperature); RH (relative humidity); CV (coefficient of variation); cam and CAM (chloramphenicol); PMBS (polymyxin B sulfate); IPTG (isopropyl β-D-1-thiogalactopyranoside); LB (Luria broth); TB (Terrific Broth); SFP (shake flask powder); CDS (coding sequence); DNA (deoxyribonucleic acid); RNA (ribonucleic acid); nt (nucleotide; polynucleotide); aa (amino acid; polypeptide); E. coli W3110 (a commonly used laboratory E. coli strain available from the Coli Genetic Stock Center [CGSC], New Haven, CT); HTP (high throughput); HPLC (high pressure liquid chromatography); HPLC-UV (HPLC with ultraviolet-visible detection); 1H NMR (proton nuclear magnetic resonance spectroscopy); FIOPC (fold improvement over positive control); Sigma and Sigma-Aldrich (Sigma-Aldrich, St. Louis, MO); Difco (Difco Laboratories, BD Diagnostic Systems, Detroit, MI); Microfluidics (Microfluidics, Westwood, MA); Life Technologies (Life Technologies, a part of Fisher Scientific, Waltham, MA); Amresco (Amresco, LLC, Solon, OH); Carbosynth (Carbosynth, Ltd., Berkshire, UK); Varian (Varian Medical Systems, Palo Alto, CA); Agilent (Agilent Technologies, Inc., Santa Clara, CA); Infors (Infors USA Inc., Annapolis Junction, MD); and Thermotron (Thermotron, Inc., Holland, MI).
Example 1: E. coli expression host containing recombinant aminotransferase gene
The initial transaminase used to produce the variants of the invention was SEQ ID NO. 4, which was cloned into the expression vector pCK110900 (see FIG. 3 of U.S. Patent Application Publication No. 2006/0195947), operably linked to the lac promoter under the control of the lac repressor. The expression vector also contains the p15A origin of replication and a chloramphenicol (CAM) resistance gene. The resulting plasmid was transformed into E. coli W3110 using standard methods known in the art. Transformants were isolated by subjecting the cells to chloramphenicol selection, as known in the art (see, e.g., U.S. Patent No. 8,383,346 and WO 2010/144103).
Example 2: HTP production of wet cell pellets containing transaminase
E. coli cells from a monoclonal colony containing the gene encoding the recombinant aminotransferase were inoculated into 180 μL of LB containing 1% glucose and 30 μg/mL CAM in the wells of a 96-well shallow-well microtiter plate. The plate was sealed with an O2-permeable seal, and the cultures were grown overnight at 30 °C, 200 rpm, and 85% humidity. Then, 10 μL of each cell culture was transferred to a well of a 96-well deep-well plate containing 390 μL of TB and 30 μg/mL CAM. The deep-well plate was sealed with an O2-permeable seal and incubated at 30 °C, 250 rpm, and 85% humidity until the OD600 reached 0.6 to 0.8. The cell cultures were then induced with IPTG at a final concentration of 1 mM and incubated overnight under the same conditions. The cells were then pelleted by centrifugation at 4,000 rpm for 10 minutes. The supernatant was discarded, and the pellets were frozen at -80 °C prior to lysis.
Example 3: HTP production of cell lysates containing aminotransferase
First, 400 μL of lysis buffer containing 100 mM triethanolamine buffer (pH 7.5), 0.5 mg/mL PLP, 1 mg/mL lysozyme, and 0.5 mg/mL PMBS was added to the cell paste in each well, produced as described in Example 2. The cells were lysed at room temperature for 2 hours with shaking on a bench-top shaker. The plate was then centrifuged at 4,000 rpm for 10 min at 4 °C. The clarified supernatant was used in biocatalytic reactions to determine activity levels.
Example 4: preparation of lyophilized lysate from Shake Flask (SF) cultures
Selected HTP cultures grown as described above were plated onto LB agar plates containing 1% glucose and 30 μg/mL CAM and grown overnight at 37 °C. A single colony from each culture was transferred to 6 mL of LB containing 1% glucose and 30 μg/mL CAM. The cultures were grown for 18 h at 30 °C and 250 rpm and subcultured at approximately 1:50 into 250 mL of TB containing 30 μg/mL CAM, to a final OD600 of 0.05. The cultures were grown at 30 °C and 250 rpm for approximately 195 minutes to an OD600 between 0.6 and 0.8 and induced with 1 mM IPTG. The cultures were then grown for 20 h at 30 °C and 250 rpm and centrifuged at 4,000 rpm for 20 minutes. The supernatant was discarded, and the pellets were resuspended in 30 mL of 10 mM triethanolamine buffer, pH 7.5. The cells were pelleted (4,000 rpm, 20 min) and frozen at -80 °C for 120 minutes. The frozen pellets were resuspended in 30 mL of 10 mM triethanolamine buffer, pH 7.5, and lysed at 18,000 psi using a Microfluidizer system (Microfluidics). The lysates were pelleted (10,000 rpm, 60 min), and the supernatants were frozen and lyophilized to produce shake flask (SF) enzymes.
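The subculture step above can be checked with simple C1·V1 = C2·V2 arithmetic. The following is a minimal sketch, assuming that "1:50" means one part starter culture in fifty parts final volume; the function names are illustrative and are not part of the disclosure.

```python
# Sketch of the 1:50 subculture arithmetic (C1*V1 = C2*V2).
# Assumption: "1:50" means 1 part starter culture in 50 parts final volume.


def inoculum_volume_ml(final_volume_ml: float, dilution: float) -> float:
    """Starter-culture volume needed for a 1:dilution subculture."""
    if dilution <= 0:
        raise ValueError("dilution must be positive")
    return final_volume_ml / dilution


def implied_starter_od(target_od: float, dilution: float) -> float:
    """Starter OD600 that would give target_od after a 1:dilution subculture."""
    return target_od * dilution


# Values taken from the text: 250 mL of TB, about 1:50, final OD600 of 0.05.
inoculum_ml = inoculum_volume_ml(final_volume_ml=250.0, dilution=50.0)
starter_od600 = implied_starter_od(target_od=0.05, dilution=50.0)

print(f"Inoculum volume: {inoculum_ml:.1f} mL")
print(f"Implied starter OD600: {starter_od600:.1f}")
```

Under these assumptions, roughly 5 mL of the 6 mL overnight culture would be used, which implies an overnight OD600 of about 2.5.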
Example 5: Improvements in transaminase activity relative to SEQ ID NO. 4 for the production of compound (3)
SEQ ID NO. 4 was selected as the parent enzyme after screening variants for amination of the ketone substrate. The activity relative to SEQ ID NO. 4 (FIOP conversion) was calculated as the percent conversion to product obtained with the variant divided by the percent conversion obtained with SEQ ID NO. 4, and is shown in Table 5.1. Percent conversion was calculated by dividing the area of the product peak observed by HPLC-UV analysis by the sum of the areas of the substrate and product peaks (Example 7), and enantioselectivity was confirmed using normal-phase HPLC-UV analysis (Example 8).
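The two calculations described above (percent conversion from peak areas, and fold improvement relative to the parent) can be sketched as follows. The peak areas are hypothetical values chosen for illustration, and the sketch follows the text in using raw peak areas without response-factor correction.

```python
# Sketch of the percent-conversion and fold-improvement (FIOP) calculations.
# All peak areas below are hypothetical; they are not data from the disclosure.


def percent_conversion(product_area: float, substrate_area: float) -> float:
    """Product peak area divided by the sum of substrate and product areas, in %."""
    total = product_area + substrate_area
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * product_area / total


def fold_improvement(variant_conversion: float, parent_conversion: float) -> float:
    """Variant percent conversion relative to parent percent conversion."""
    if parent_conversion <= 0:
        raise ValueError("parent conversion must be positive")
    return variant_conversion / parent_conversion


# Hypothetical HPLC-UV peak areas (arbitrary units).
parent_conv = percent_conversion(product_area=200.0, substrate_area=1800.0)
variant_conv = percent_conversion(product_area=300.0, substrate_area=1700.0)
fiop = fold_improvement(variant_conv, parent_conv)

print(f"Parent conversion:  {parent_conv:.1f}%")
print(f"Variant conversion: {variant_conv:.1f}%")
print(f"FIOP: {fiop:.2f}")
```

With these illustrative areas, the parent gives 10% conversion, the variant 15%, and the fold improvement is 1.5.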
The engineered polynucleotide (SEQ ID NO. 3) encoding the polypeptide of SEQ ID NO. 4 having transaminase activity was used to generate the engineered polypeptides of Table 5.1. These polypeptides exhibited improved transaminase activity under the desired conditions (e.g., conversion of the ketone substrate compound (2), 1-imidazo[1,2-a]pyridin-6-ylethanone, to the product compound (3), (1S)-1-imidazo[1,2-a]pyridin-6-ylethylamine) as compared to the starting polypeptide. The engineered polypeptides, having the amino acid sequences of the even-numbered sequence identifiers, were generated from the "backbone" amino acid sequence of SEQ ID NO. 4 as described below and identified using the HTP assay described below and the analytical method described in Table 7.1.
Directed evolution began with the polynucleotide set forth in SEQ ID NO. 3. Libraries of engineered polypeptides were generated using various well-known techniques (e.g., saturation mutagenesis, recombination of previously identified beneficial amino acid differences) and screened using the HTP assay described below and the analytical methods described in Table 7.1.
Enzyme assays were performed in 96-well deep-well (2 mL) plates in a total volume of 100 μL/well. The reactions were performed using 15 μL of HTP lysate, 30 g/L ketone substrate, 0.26 g/L PLP, 1 M IPM, 20% (v/v) DMSO, and 100 mM carbonate buffer, pH 10.0. The reactions were set up by adding: 1) 20 μL of 150 g/L ketone substrate in DMSO; 2) 65 μL of a master mix solution containing 1.6 M IPM (pH 10.0), 160 mM carbonate buffer (pH 10.0), and 0.4 g/L PLP; and 3) 15 μL of HTP lysate. The reaction plates were heat-sealed at 180 °C for 2 s and then shaken at 600 rpm for 20 hours at 50 °C.
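The stated final concentrations follow from the stock concentrations and the volumes added to the 100 μL reaction. The sketch below reproduces that arithmetic; it assumes that the 160 mM master-mix component is the carbonate buffer, and the variable names are illustrative.

```python
# Sketch: final concentrations in the 100 uL HTP reaction, computed from
# stock concentration x volume added / total volume.
# Assumption: the 160 mM master-mix component is the carbonate buffer.

TOTAL_UL = 100.0

# (component, stock concentration, volume added in uL)
additions = [
    ("ketone substrate (g/L)", 150.0, 20.0),  # stock in DMSO
    ("IPM (M)", 1.6, 65.0),                   # master mix
    ("carbonate buffer (mM)", 160.0, 65.0),   # master mix
    ("PLP (g/L)", 0.4, 65.0),                 # master mix
]


def final_concentration(stock: float, volume_ul: float,
                        total_ul: float = TOTAL_UL) -> float:
    """Concentration after diluting volume_ul of stock into total_ul."""
    return stock * volume_ul / total_ul


final = {name: final_concentration(stock, vol) for name, stock, vol in additions}

# The substrate stock is in DMSO, so 20 uL of stock sets the DMSO content.
dmso_percent = 100.0 * 20.0 / TOTAL_UL

for name, value in final.items():
    print(f"{name}: {value:.2f}")
print(f"DMSO (% v/v): {dmso_percent:.0f}")
```

The computed values (30 g/L substrate, about 1.04 M IPM, about 104 mM buffer, 0.26 g/L PLP, 20% DMSO) agree with the nominal reaction conditions given in the text after rounding.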
After 20 hours of incubation, 300 μL of acetonitrile was added to each well, and the plates were resealed, shaken at room temperature for 10 minutes, and then centrifuged at 4 °C for 10 minutes. In a new plate, 20 μL of each sample was further diluted by adding 180 μL of a 1:1 water:acetonitrile mixture. The plates were sealed and mixed for 1 min at room temperature prior to achiral HPLC analysis as described in Example 7.
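The quench and dilution steps above determine how dilute the sample is at the point of HPLC analysis. The following is a minimal sketch of that calculation, assuming additive volumes and the nominal 30 g/L substrate loading; it is an illustration, not a stated part of the method.

```python
# Sketch: overall dilution of the reaction mixture before achiral HPLC analysis.
# Assumptions: volumes are additive; 30 g/L is the nominal substrate loading.


def dilution_factor(sample_ul: float, diluent_ul: float) -> float:
    """Fold dilution when diluent_ul is added to sample_ul."""
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    return (sample_ul + diluent_ul) / sample_ul


quench_fold = dilution_factor(sample_ul=100.0, diluent_ul=300.0)  # acetonitrile quench
second_fold = dilution_factor(sample_ul=20.0, diluent_ul=180.0)   # 1:1 water:acetonitrile
overall_fold = quench_fold * second_fold

# Combined substrate + product concentration in the analyzed sample.
analyte_g_per_l = 30.0 / overall_fold

print(f"Quench dilution:  {quench_fold:.0f}x")
print(f"Second dilution:  {second_fold:.0f}x")
print(f"Overall dilution: {overall_fold:.0f}x")
print(f"Substrate + product at analysis: {analyte_g_per_l:.2f} g/L")
```

Under these assumptions the overall dilution is 40-fold, so the combined substrate and product concentration at analysis is about 0.75 g/L.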
For chiral analysis, 100 μL of the acetonitrile-quenched sample from each selected hit identified in the primary analysis was transferred to an Eppendorf tube, and the solvent was evaporated in a SpeedVac for 30 to 60 min. The resulting residue was reconstituted in 100 μL of isopropanol containing 0.5% diethylamine (DEA), and the samples were analyzed using the normal-phase HPLC method described in Example 8.
Hit variants were grown in 250-mL shake flasks, and shake flask powders were generated. SFP activity was evaluated using 0.9-7.5 g/L SF powder, 30 g/L ketone substrate, 0.26 g/L PLP, 1 M IPM, 20% (v/v) DMSO, and 100 mM carbonate buffer, pH 10.0. The reactions were set up using a procedure similar to that described above.
[Table 5.1: provided as an image in the original publication; not reproduced here]
Example 6: Improvements in transaminase activity relative to SEQ ID NO. 6 for the production of compound (3)
SEQ ID NO. 6 was selected as the parent enzyme for the next round of evolution. The activity relative to SEQ ID NO. 6 (FIOP conversion) was calculated as the percent conversion to product obtained with the variant divided by the percent conversion obtained with SEQ ID NO. 6, and is shown in Table 6.1. Percent conversion was calculated by dividing the area of the product peak observed by HPLC-UV analysis by the sum of the areas of the substrate and product peaks (Example 7), and enantioselectivity was confirmed using normal-phase HPLC-UV analysis (Example 8).
The engineered polynucleotide (SEQ ID NO. 5) encoding the polypeptide of SEQ ID NO. 6 having transaminase activity was used to generate the engineered polypeptides of Table 6.1. These polypeptides exhibited improved transaminase activity under the desired conditions (e.g., conversion of the ketone substrate compound (2), 1-imidazo[1,2-a]pyridin-6-ylethanone, to the product compound (3), (1S)-1-imidazo[1,2-a]pyridin-6-ylethylamine) as compared to the starting polypeptide. The engineered polypeptides, having the amino acid sequences of the even-numbered sequence identifiers, were generated from the "backbone" amino acid sequence of SEQ ID NO. 6 as described below and identified using the HTP assay described below and the analytical method described in Table 7.1.
Directed evolution began with the polynucleotide set forth in SEQ ID NO. 5. Libraries of engineered polypeptides were generated using various well-known techniques (e.g., saturation mutagenesis, recombination of previously identified beneficial amino acid differences) and screened using the HTP assay described below and the analytical methods described in Table 7.1.
Enzyme assays were performed in 96-well deep-well (2 mL) plates in a total volume of 100 μL/well. The reactions were performed using 20 μL of 10×-diluted HTP lysate, 35 g/L ketone substrate, 0.24 g/L PLP, 1 M IPM, 20% (v/v) DMSO, and 100 mM borate buffer, pH 10.0. The reactions were set up by adding: 1) 20 μL of 175 g/L ketone substrate in DMSO; 2) 60 μL of a master mix solution containing 1.65 M IPM (pH 10.0), 170 mM borate buffer (pH 10.0), and 0.4 g/L PLP; and 3) 20 μL of HTP lysate. The reaction plates were heat-sealed at 180 °C for 2 s and then shaken at 600 rpm for 20 hours at 50 °C.
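Because the lysate in this round is diluted 10-fold before use, the enzyme loading per well is lower than in Example 5 even though a larger volume is added. The sketch below expresses both loadings as undiluted-lysate equivalents; it assumes that lysates in both examples were prepared identically (per Example 3), and the function name is illustrative.

```python
# Sketch: lysate loading per well expressed as undiluted-lysate equivalents.
# Assumption: lysates in Examples 5 and 6 were prepared the same way (Example 3).


def neat_lysate_equivalent_ul(volume_ul: float, fold_dilution: float) -> float:
    """Volume of undiluted lysate contained in volume_ul of diluted lysate."""
    if fold_dilution <= 0:
        raise ValueError("fold_dilution must be positive")
    return volume_ul / fold_dilution


example5_ul = neat_lysate_equivalent_ul(volume_ul=15.0, fold_dilution=1.0)
example6_ul = neat_lysate_equivalent_ul(volume_ul=20.0, fold_dilution=10.0)
loading_reduction = example5_ul / example6_ul

print(f"Example 5 loading: {example5_ul:.1f} uL undiluted lysate per well")
print(f"Example 6 loading: {example6_ul:.1f} uL undiluted lysate per well")
print(f"Reduction in lysate loading: {loading_reduction:.1f}-fold")
```

Under this assumption, the Example 6 screen uses about 7.5-fold less lysate per well than Example 5 while running at a higher substrate loading (35 g/L versus 30 g/L).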
After 20 hours of incubation, 300 μL of acetonitrile was added to each well, and the plates were resealed, shaken at room temperature for 10 minutes, and then centrifuged at 4 °C for 10 minutes. In a new plate, 20 μL of each sample was further diluted by adding 180 μL of a 1:1 water:acetonitrile mixture. The plates were sealed and mixed for 2 min at room temperature prior to achiral HPLC analysis as described in Example 7.
For chiral analysis, 100 μL of the acetonitrile-quenched sample from each selected hit identified in the primary analysis was transferred to an Eppendorf tube, and the solvent was evaporated in a SpeedVac for 30 to 60 min. The resulting residue was reconstituted in 100 μL of isopropanol containing 0.5% diethylamine (DEA), and the samples were analyzed using the normal-phase HPLC method described in Example 8.
Hit variants were grown in 250-mL shake flasks, and shake flask powders were generated. SFP activity was evaluated using 0.9-7.5 g/L SF powder, 35 g/L ketone substrate, 0.26 g/L PLP, 1 M IPM, 20% (v/v) DMSO, and 100 mM borate buffer, pH 10.0. The reactions were set up using a procedure similar to that described above.
[Table 6.1: provided as an image in the original publication; not reproduced here]
Example 7: Detection of the product 1-imidazo[1,2-a]pyridin-6-ylethylamine by HPLC-UV analysis
The data described in example 5 and example 6 were collected using the analytical methods provided in table 7.1. The methods provided herein can be used to analyze variants produced using the present invention. However, the present invention is not intended to be limited to the methods described herein, as there are other suitable methods known in the art for analyzing the variants provided herein and/or produced using the methods provided herein.
[Table 7.1: provided as an image in the original publication; not reproduced here]
Example 8: Detection of (1S)-1-imidazo[1,2-a]pyridin-6-ylethylamine by normal-phase HPLC-UV analysis
The data described in example 5 and example 6 were collected using the analytical methods provided in table 7.1, and chiral identities of hit variants were verified using the analytical methods provided in table 8.1 herein. The methods provided herein can be used to isolate and identify product isomers produced using the present invention. However, the present invention is not intended to be limited to the methods described herein, as there are other suitable methods known in the art for analyzing the variants provided herein and/or produced using the methods provided herein.
[Table 8.1: provided as an image in the original publication; not reproduced here]
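One common way to express the outcome of a chiral HPLC analysis such as that of Example 8 is enantiomeric excess (ee), computed from the peak areas of the two enantiomers. The disclosure does not state how enantioselectivity was quantified, so the sketch below is an assumption, and the peak areas are hypothetical.

```python
# Sketch: enantiomeric excess (ee) from chiral HPLC peak areas.
# Assumption: ee is the metric of interest; the peak areas are hypothetical.


def enantiomeric_excess(area_s: float, area_r: float) -> float:
    """Percent ee of the (S)-enantiomer: 100 * (S - R) / (S + R)."""
    total = area_s + area_r
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * (area_s - area_r) / total


# Hypothetical peak areas for the (S)- and (R)-amine (arbitrary units).
ee_percent = enantiomeric_excess(area_s=995.0, area_r=5.0)

print(f"ee of (S)-amine: {ee_percent:.1f}%")
```

A negative result from this function would indicate an excess of the (R)-enantiomer.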
All publications, patents, patent applications, and other documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication, patent application, or other document was individually indicated to be incorporated by reference for all purposes.
While various specific embodiments have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.

Claims (17)

1. An engineered transaminase comprising a polypeptide sequence that has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% sequence identity to SEQ ID No. 4 and/or 6, or a functional fragment thereof, wherein the engineered transaminase comprises at least one substitution or set of substitutions in the polypeptide sequence, and wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID No. 4 and/or 6.
2. The engineered transaminase enzyme of claim 1, wherein the polypeptide sequence has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% sequence identity to SEQ ID No. 4, wherein the engineered transaminase comprises at least one substitution or set of substitutions in the polypeptide sequence at one or more positions selected from the following: 227, 41/227/417/443, 13, 41/57/130/415/419, 41/113/415, 53/57, 88/89, 97/415, 148, 260, 302, 355/415/419, 362, 417 and 443, and wherein the amino acid positions of said polypeptide sequence are numbered with reference to SEQ ID NO. 4.
3. The engineered transaminase enzyme of claim 1, wherein the polypeptide sequence has at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% sequence identity to SEQ ID No. 6, wherein the engineered transaminase comprises at least one substitution or set of substitutions in the polypeptide sequence at one or more positions selected from the following: 13, 13/41/57/88/130/415/417, 13/41/57/89/97/417, 13/41/57/97/130/415/417, 13/41/57/97/130/415/417/443, 13/41/57/97/443, 13/41/57/130/417, 13/41/57/417, 13/41/88, 13/41/88/89, 13/41/88/89/97/415/443, 13/41/88/89/417, 13/41/88/97, 13/41/88/130/415/443, 13/41/88/443, 13/41/89/130/148/443, 13/41/89/417, 13/41/89/443, 13/41/97/130/417, 13/41/97/415, 13/41/97/415/417, 13/41/97/417, 13/41/97/417/443, 13/41/130/415/443, 13/41/415, 13/41/415/417, 13/41/415/443, 13/41/417, 13/41/417/443, 13/57/88/89/130/415/443, 13/57/88/97, 13/57/88/97/415/443, 13/57/88/130/415, 13/57/88/130/417/443, 13/57/88/415, 13/57/97/130/415/417/443, 13/57/97/417, 13/88/89/415/417, 13/88/89/415/417/443, 13/88/130/443, 13/88/415, 13/89/97/415/417, 13/89/97/417, 13/89/417, 13/97/148/415, 13/97/415, 13/97/415/417, 13/97/417, 13/130/415, 13/130/415/417, 13/130/417, 13/130/417/443, 13/415/417, 13/415/417/443, 13/415/443, 13/417/443, 13/443, 23/53/162/233/277/315/415/418/432, 23/53/315/417/418, 23/277/315/395/415/417/432, 23/277/395/417/418, 23/395/418, 23/418, 41/57/88, 41/57/88/415/443, 41/57/130/148/415/417, 41/57/130/443, 41/57/415/417, 41/88/89/97/130/415, 41/88/89/415/417, 41/88/97/130/417, 41/88/130/415/417, 41/88/443, 41/97/130/148/415/417/443, 41/97/417, 41/97/417/443, 41/130/415, 41/130/415/417/443, 41/130/415/443, 41/415/443, 41/417/443, 53/162, 53/162/395/417, 53/162/418/432, 53/233, 53/277/395, 53/277/395/417/418, 53/277/415/417, 57/88/97/130/415/443, 57/88/97/130/417, 57/88/97/417, 57/97/130/148/417/443, 57/417, 88, 88/89/130/417, 88/97/415/417/443, 88/130/417/443, 88/148/417/443, 88/415/417, 88/415/417/443, 88/417, 89/97/415/417, 89/97/417, 89/443, 97/130, 97/148/415, 97/415/417, 97/417, 130/415, 130/417, 130/443, 162/233/415/417, 162/395/415/417, 162/418, 233/315/415/417, 233/315/417, 277/395/415/418/432, 315, 315/415/418/432, 395/418, 415/417/418, 415/417/418/432, 415/417/443, 415/443, 417 and 443, and wherein the amino acid positions of the polypeptide sequence are numbered with reference to SEQ ID No. 6.
4. The engineered polypeptide according to claim 2, wherein the residue difference at residue positions 13, 41/57/130/415/419, 41/113/415, 53/57, 88/89, 97/415, 148, 227, 260, 302, 355/415/419, 362, 417 and 443 is selected from 13A, 13E, 13G, 13K, 13S, 41V/57Y/130Y/415F/419D, 41V/113F/415F, 53M/57W, 88K, 88R/89L, 88V, 97A/415S, 148E, 148G, 227A, 227C, 260T, 302N, 355C/415S/419D, 362G, 417A, 417I, 417V, 443E and 443M.
5. The engineered polypeptide according to claim 1, wherein said amino acid sequence further comprises a residue difference or combination of residue differences, relative to SEQ ID No. 4, selected from the group consisting of:
T13A;
T13E;
T13G;
T13K;
T13S;
I41V, F57Y, F130Y, R415F and Q419D;
I41V, V113F and R415F;
N53M and F57W;
L88K;
L88R;
L88R and M89L;
L88V;
S97A and R415S;
Q148E;
Q148G;
G227A;
G227C;
C260T;
E302N;
R355C, R415S and Q419D;
H362G;
L417A;
L417I;
L417V;
K443E; and
K443M.
6. The engineered polypeptide according to claim 1, wherein said transaminase has an activity that is increased by at least 1.2-fold in converting compound (2) to compound (3) compared to the polypeptide of SEQ ID No. 4.
7. The engineered polypeptide according to claim 1, wherein said transaminase has an increased enantioselectivity in converting compound (2) to compound (3) compared to the polypeptide of SEQ ID No. 4.
8. The engineered polypeptide according to claim 1, wherein said amino acid sequence comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356 and 358.
9. The engineered polypeptide according to any one of claims 1 to 8, wherein said polypeptide is immobilized on a solid support.
10. The engineered polypeptide according to claim 9, wherein the solid support is a bead or a resin comprising a polymethacrylate having an epoxy functional group, a polymethacrylate having an amino epoxy functional group, a styrene/DVB copolymer having an octadecyl functional group, or a polymethacrylate.
11. A polynucleotide encoding the engineered transaminase polypeptide of any one of claims 1 to 10.
12. A polynucleotide encoding the engineered transaminase polypeptide of claim 1, comprising a nucleotide sequence selected from SEQ ID NOs: 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355 and 357.
13. An expression vector comprising the polynucleotide of claim 11 or 12.
14. The expression vector of claim 13, comprising a control sequence.
15. A host cell comprising the polynucleotide of claim 11 or 12 or the expression vector of claim 13 or 14.
16. A method of making the engineered polypeptide of any one of claims 1 to 10, the method comprising culturing the host cell of claim 15 under conditions suitable for expression of the polypeptide.
17. The method of claim 16, further comprising isolating the engineered polypeptide.
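The claims above describe variants by residue differences written in the form "T13A" (parent residue, position numbered with reference to the parent sequence, new residue) and by percent sequence identity to the parent. The following is a minimal sketch of how such labels relate to a sequence and to an identity value. The 20-residue parent is a toy sequence, not SEQ ID NO. 4; the label "K2R" is hypothetical; and the identity function assumes equal-length, ungapped sequences rather than a true alignment.

```python
import re

# Sketch: applying substitution labels such as "T13A" to a parent sequence and
# computing percent identity between parent and variant.
# Assumptions: toy 20-residue parent (NOT SEQ ID NO. 4); substitutions only,
# so sequences have equal length and no alignment step is needed.

SUBSTITUTION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label like 'T13A' into (parent residue, 1-based position, new residue)."""
    match = SUBSTITUTION_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(parent: str, labels: list[str]) -> str:
    """Return the variant sequence after checking each label against the parent."""
    residues = list(parent)
    for label in labels:
        wild_type, position, new = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if parent[position - 1] != wild_type:
            raise ValueError(f"{label}: parent has {parent[position - 1]} at {position}")
        residues[position - 1] = new
    return "".join(residues)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


toy_parent = "MKTLVAGHSL" * 2  # hypothetical 20-residue sequence
variant = apply_substitutions(toy_parent, ["K2R", "T13A"])
identity = percent_identity(toy_parent, variant)

print(variant)
print(f"Identity to parent: {identity:.1f}%")
```

Two substitutions in this 20-residue toy sequence give 90% identity. For sequences of different lengths, or for variants containing insertions or deletions, a pairwise alignment would be needed before identity could be computed.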
CN202180065368.4A 2020-09-28 2021-09-17 Engineered biocatalysts and methods for synthesizing chiral amines Pending CN116209754A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US202063084166P 2020-09-28 2020-09-28
US63/084,166 2020-09-28
PCT/US2021/050944 WO2022066534A1 (en) 2020-09-28 2021-09-17 Engineered biocatalysts and methods for synthesizing chiral amines

Publications (1)

Publication Number Publication Date
CN116209754A true CN116209754A (en) 2023-06-02

Family

ID=80846847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180065368.4A Pending CN116209754A (en) 2020-09-28 2021-09-17 Engineered biocatalysts and methods for synthesizing chiral amines

Country Status (7)

Country Link
US (1) US20240026401A1 (en)
EP (1) EP4217502A1 (en)
JP (1) JP2023543990A (en)
CN (1) CN116209754A (en)
CA (1) CA3193755A1 (en)
IL (1) IL301146A (en)
WO (1) WO2022066534A1 (en)

Also Published As

Publication number Publication date
CA3193755A1 (en) 2022-03-31
US20240026401A1 (en) 2024-01-25
IL301146A (en) 2023-05-01
EP4217502A1 (en) 2023-08-02
JP2023543990A (en) 2023-10-19
WO2022066534A1 (en) 2022-03-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination