HK1237370B

HK1237370B - Microbial ergothioneine biosynthesis

Info

Publication number: HK1237370B
Application number: HK17111346.9A
Authority: HK
Inventors: Jixiang Han; Hui Chen; Xiaodan Yu
Original assignee: Conagen Inc.
Priority date: 2014-04-29
Filing date: 2015-04-28
Publication date: 2022-01-21

Description

Microbial ergothioneine biosynthesis

Statement in Support for Filing a Sequence Listing

A paper copy of the Sequence Listing and a computer readable form of the Sequence Listing, comprising a file named "32559-12_ST25.txt" that is 19,443 bytes in size (as measured in MICROSOFT WINDOWS® EXPLORER), are provided herein and are incorporated herein by reference. The Sequence Listing consists of SEQ ID NOs: 1-16.

Background of the Disclosure

The present disclosure relates generally to methods for ergothioneine biosynthesis. More particularly, the present disclosure relates to methods for microbial ergothioneine biosynthesis.

Ergothioneine (ET) is a histidine betaine derivative with a thiol group attached to the C2 atom of the imidazole ring. As a thione tautomer, ET is a very stable antioxidant with unique properties. Unlike glutathione and ascorbic acid, ET can scavenge oxidizing species that are not free radicals. ET is a natural compound produced in actinomycetes such as Mycobacterium smegmatis and in filamentous fungi such as Neurospora crassa. Other bacterial species, such as Bacillus subtilis, Escherichia coli, Proteus vulgaris, and Streptococcus, as well as fungi belonging to the Ascomycetes and Deuteromycetes, cannot produce ergothioneine. Animals and plants also cannot produce ergothioneine and must obtain it from dietary sources or, in the case of plants, from their environment.

Although the function of ET in microbial cells is not well understood, it is believed to be critical in human physiology. Humans absorb ET from dietary sources, and ET accumulates in specific tissues and cells such as the liver, kidneys, central nervous system, and red blood cells. A specific cation transporter (OCTN1) has been shown to have a high affinity for ET in humans, and both overactivity and deficiency of this transporter have negative effects on human cells.

The biosynthesis of ET has been detected in certain mycobacterial fungi; however, the exact metabolic pathway has not been completed or has been only partially confirmed. Seebeck reconstituted mycobacterial ergothioneine biosynthesis in vitro, using Escherichia coli to express, respectively, a formylglycine-generating enzyme-like protein (EgtB), a glutamine amidotransferase (EgtC), a histidine methyltransferase (EgtD), and, in place of the pyridoxal 5-phosphate binding protein (EgtE), an unrelated β-lyase from Erwinia tasmaniensis (because recombinant production of soluble EgtE protein failed) (see J. Am. Chem. Soc. 2010, 132:6632-6633).

To date, only three genes, encoding EgtB, EgtC, and EgtD, have been identified for the in vitro production of ergothioneine. The putative gene for EgtE has not yet been characterized in vitro or in vivo. Despite various biotransformation attempts, no microbial production that uses these genes to engineer the mycobacterial ergothioneine pathway in E. coli has been reported so far. Moreover, although various fungal and mycobacterial sources are available for ergothioneine extraction, the yields are too low to be commercially viable for the industrial production of ergothioneine. Accordingly, there exists a need for methods of producing ergothioneine.

Summary of the Disclosure

The present disclosure relates generally to engineered host cells and methods for producing ergothioneine. More particularly, the present disclosure relates to engineered host cells and to methods for microbial ergothioneine biosynthesis using the engineered host cells.

In one aspect, the present disclosure relates to a transformed host cell for producing ergothioneine, comprising a nucleic acid sequence encoding EgtB, a nucleic acid sequence encoding EgtC, a nucleic acid sequence encoding EgtD, and a nucleic acid sequence encoding EgtE.

In another aspect, the present disclosure relates to a method for producing ergothioneine. The method includes culturing a host cell, wherein the host cell is transformed with a nucleic acid sequence encoding EgtB, a nucleic acid sequence encoding EgtC, a nucleic acid sequence encoding EgtD, and a nucleic acid sequence encoding EgtE; inducing the host cell to express the nucleic acid sequence encoding EgtB, the nucleic acid sequence encoding EgtC, the nucleic acid sequence encoding EgtD, and the nucleic acid sequence encoding EgtE; and collecting the ergothioneine.

In another aspect, the present disclosure relates to an expression vector for producing ergothioneine, comprising a nucleic acid sequence encoding an amino acid sequence selected from the group consisting of EgtB, EgtC, EgtD, and EgtE.

Brief Description of the Drawings

The disclosure will be better understood, and features, aspects, and advantages other than those set forth above will become apparent, when consideration is given to the following detailed description. Such detailed description makes reference to the following drawings, in which:

FIG. 1A is a map of the vector containing the EgtD and EgtB genes, as discussed in Example 1.

FIG. 1B is a map of the vector containing the EgtC and EgtE genes, as discussed in Example 1.

FIG. 2 is a graph illustrating that ET is produced only in the strain containing all four genes, as discussed in Example 2. EI, empty vector cells induced with IPTG; SI, strain containing the four genes induced with IPTG; Ck+, sample to which 20 mg/L ergothioneine was added.

FIGS. 3A and 3B are graphs showing the HPLC retention time and UV spectrum of a 100 mg/L ergothioneine standard, as discussed in Example 2.

FIGS. 4A and 4B are graphs showing the HPLC retention time and UV spectrum of ergothioneine produced in E. coli transformed with nucleic acid sequences encoding EgtB, EgtC, EgtD, and EgtE, as discussed in Example 2.

FIG. 5 is a graph showing the time course of ergothioneine production in engineered E. coli cells and in empty vector control cells, as discussed in Example 3. EI, empty vector control induced with IPTG; SI, strain containing EgtB, EgtC, EgtD, and EgtE induced with IPTG.

FIG. 6 is a graph showing the transformed E. coli strain fed with various substrates and cofactors. None, no substrate or cofactor added; His, histidine; Met, methionine; Cys, cysteine; Fe, iron (Fe⁺⁺).

While the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and are described in detail herein below. It should be understood, however, that the description of specific embodiments is not intended to limit the disclosure; rather, the disclosure is intended to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the disclosure as defined by the appended claims.

Detailed Description

The term "complementary" is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenosine is complementary to thymine, and cytosine is complementary to guanine. Accordingly, the present technology also includes isolated nucleic acid fragments that are complementary to the complete sequences reported in the accompanying Sequence Listing, as well as to substantially similar nucleic acid sequences.
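As an illustration of the base-pairing rule just described, the following minimal Python sketch derives the complementary strand of a DNA sequence. The function names `reverse_complement` and `are_complementary` are hypothetical helpers introduced here for illustration only; they are not part of the disclosure, and the sketch handles only the four unambiguous DNA bases.

```python
# Watson-Crick pairing for DNA: A pairs with T, and C pairs with G.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the strand that would hybridize to `seq`, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if the two strands (both written 5'->3') are fully complementary."""
    return reverse_complement(strand_a) == strand_b.upper()


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(are_complementary("ATGC", "GCAT"))
```

Both strands are written 5' to 3', which is why the complement is also reversed.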

The terms "nucleic acid" and "nucleotide" are used according to their respective ordinary and customary meanings as understood by a person of ordinary skill in the art, and are used without limitation to refer to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogs of natural nucleotides that have binding properties similar to those of the reference nucleic acid and that are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified or degenerate variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated.

The term "isolated" is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art and, when used in the context of an isolated nucleic acid or an isolated polypeptide, is used without limitation to refer to a nucleic acid or polypeptide that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated nucleic acid or polypeptide can exist in a purified form or can exist in a non-native environment such as, for example, a transgenic host cell.

The terms "incubating" and "incubation" as used herein refer to a process of mixing two or more chemical or biological entities (such as a chemical compound and an enzyme) and allowing them to interact under conditions favorable for producing a steviol glycoside composition.

The term "degenerate variant" refers to a nucleic acid sequence having a residue sequence that differs from a reference nucleic acid sequence by one or more degenerate codon substitutions. Degenerate codon substitutions can be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues. A nucleic acid sequence and all of its degenerate variants will express the same amino acid or polypeptide.
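The defining property of a degenerate variant, namely that it differs at the nucleotide level yet encodes the same polypeptide, can be checked with a short Python sketch. The sketch assumes the standard genetic code, and the helper names `translate` and `is_degenerate_variant` are hypothetical; mixed-base and deoxyinosine positions are not modeled.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (first base varies slowest); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip(("".join(codon) for codon in product(BASES, repeat=3)), AMINO_ACIDS)
)


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon (trailing bases are ignored)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(reference: str, candidate: str) -> bool:
    """True if the candidate differs in nucleotides but encodes the same peptide."""
    same_length = len(reference) == len(candidate)
    differs = reference.upper() != candidate.upper()
    return same_length and differs and translate(reference) == translate(candidate)


if __name__ == "__main__":
    ref = "ATGCATTGT"   # Met-His-Cys
    var = "ATGCACTGC"   # third-position changes only
    print(translate(ref), translate(var), is_degenerate_variant(ref, var))
```

In this example the candidate changes only the third position of the second and third codons, so both sequences translate to the same three-residue peptide.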

The terms "polypeptide," "protein," and "peptide" are used according to their respective ordinary and customary meanings as understood by a person of ordinary skill in the art; the three terms are sometimes used interchangeably, and are used without limitation to refer to a polymer of amino acids or amino acid analogs, regardless of its size or function. Although "protein" is often used in reference to relatively large polypeptides, and "peptide" is often used in reference to small polypeptides, usage of these terms in the art overlaps and varies. The term "polypeptide" as used herein refers to peptides, polypeptides, and proteins, unless otherwise noted. The terms "protein," "polypeptide," and "peptide" are used interchangeably herein when referring to a polynucleotide product. Thus, exemplary polypeptides include polynucleotide products, naturally occurring proteins, homologs, orthologs, paralogs, fragments, and other equivalents, variants, and analogs of the foregoing.

The terms "polypeptide fragment" and "fragment," when used in reference to a reference polypeptide, are used according to their ordinary and customary meanings to a person of ordinary skill in the art, and are used without limitation to refer to a polypeptide in which amino acid residues are deleted as compared to the reference polypeptide itself, but in which the remaining amino acid sequence is usually identical to the corresponding positions in the reference polypeptide. Such deletions can occur at the amino terminus or the carboxy terminus of the reference polypeptide, or at both.

The term "functional fragment" of a polypeptide or protein refers to a peptide fragment that is a portion of the full-length polypeptide or protein, and that has substantially the same biological activity, or carries out substantially the same function, as the full-length polypeptide or protein (e.g., carries out the same enzymatic reaction).

The terms "variant polypeptide," "modified amino acid sequence," and "modified polypeptide," which are used interchangeably, refer to an amino acid sequence that differs from a reference polypeptide by one or more amino acids (e.g., by one or more amino acid substitutions, deletions, and/or additions). In one aspect, a variant is a "functional variant" that retains some or all of the ability of the reference polypeptide.

The term "functional variant" further includes conservatively substituted variants. The term "conservatively substituted variant" refers to a peptide having an amino acid sequence that differs from a reference peptide by one or more conservative amino acid substitutions and that maintains some or all of the activity of the reference peptide. A "conservative amino acid substitution" is a substitution of an amino acid residue with a functionally similar residue. Examples of conservative substitutions include the substitution of one non-polar (hydrophobic) residue, such as isoleucine, valine, leucine, or methionine, for another; the substitution of one charged or polar (hydrophilic) residue for another, such as between arginine and lysine, between glutamine and asparagine, or between threonine and serine; the substitution of one basic residue, such as lysine or arginine, for another; the substitution of one acidic residue, such as aspartic acid or glutamic acid, for another; or the substitution of one aromatic residue, such as phenylalanine, tyrosine, or tryptophan, for another. Such substitutions are expected to have little or no effect on the apparent molecular weight or isoelectric point of the protein or polypeptide. The phrase "conservatively substituted variant" also includes peptides in which a residue is replaced with a chemically derivatized residue, provided that the resulting peptide maintains some or all of the activity of the reference peptide as described herein.
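The example substitution classes listed above can be encoded as residue groups, as in the following Python sketch. The grouping is an assumption made for illustration: it contains only the residues the text names as examples (one-letter codes), not an exhaustive or authoritative classification, and the names `EXAMPLE_GROUPS`, `is_conservative`, and `classify_substitutions` are hypothetical. The sketch does not assess whether activity is retained, which the definition also requires.

```python
# Residue groups taken from the examples in the text (one-letter codes).
EXAMPLE_GROUPS = (
    frozenset("IVLM"),  # non-polar (hydrophobic): Ile, Val, Leu, Met
    frozenset("KR"),    # basic: Lys, Arg
    frozenset("QN"),    # polar: Gln, Asn
    frozenset("ST"),    # polar: Ser, Thr
    frozenset("DE"),    # acidic: Asp, Glu
    frozenset("FYW"),   # aromatic: Phe, Tyr, Trp
)


def is_conservative(ref: str, alt: str) -> bool:
    """True if ref -> alt swaps two different residues within one example group."""
    return ref != alt and any(ref in g and alt in g for g in EXAMPLE_GROUPS)


def classify_substitutions(reference: str, variant: str):
    """List (1-based position, ref, alt, is_conservative) for each substituted residue."""
    if len(reference) != len(variant):
        raise ValueError("this sketch compares equal-length peptides only")
    return [
        (pos, ref, alt, is_conservative(ref, alt))
        for pos, (ref, alt) in enumerate(zip(reference, variant), start=1)
        if ref != alt
    ]


if __name__ == "__main__":
    for row in classify_substitutions("MKDF", "LRKF"):
        print(row)
```

Here Met to Leu and Lys to Arg fall within a single example group, whereas Asp to Lys replaces an acidic residue with a basic one and is therefore not flagged as conservative.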

The term "variant," in connection with the polypeptides of the present technology, further includes a functionally active polypeptide having an amino acid sequence that is at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and even 100% identical to the amino acid sequence of a reference polypeptide.

The term "homologous," in all its grammatical forms and spelling variations, refers to the relationship between polynucleotides or polypeptides that possess a "common evolutionary origin," including polynucleotides or polypeptides from superfamilies and homologous polynucleotides or proteins from different species (Reeck et al., Cell 50:667, 1987). Such polynucleotides or polypeptides have sequence homology, as reflected by their sequence similarity, whether in terms of percent identity or in terms of the presence of specific amino acids or motifs at conserved positions. For example, two homologous polypeptides can have amino acid sequences that are at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and even 100% identical.

"Percent (%) amino acid sequence identity," with respect to the variant polypeptide sequences of the present technology, refers to the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues of a reference polypeptide (such as, for example, SEQ ID NO: 6), after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity.

Alignment for the purpose of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for example, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2, or Megalign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared. For example, the % amino acid sequence identity may be determined using the sequence comparison program NCBI-BLAST2. The NCBI-BLAST2 sequence comparison program may be downloaded from ncbi.nlm.nih.gov. NCBI-BLAST2 uses several search parameters, all of which are set to default values, including, for example: unmask = yes; strand = all; expected occurrences = 10; minimum low complexity length = 15/5; multi-pass e-value = 0.01; constant for multi-pass = 25; dropoff for final gapped alignment = 25; and scoring matrix = BLOSUM62.
Where NCBI-BLAST2 is used for amino acid sequence comparison, the % amino acid sequence identity of a given amino acid sequence A to, with, or against a given amino acid sequence B (which can alternatively be expressed as a given amino acid sequence A having or comprising a particular % amino acid sequence identity to, with, or against a given amino acid sequence B) is calculated as follows: the fraction X/Y multiplied by 100, where X is the number of amino acid residues scored as identical matches by the sequence alignment program NCBI-BLAST2 in this program's alignment of A and B, and where Y is the total number of amino acid residues in B. It will be understood that when the length of amino acid sequence A is not equal to the length of amino acid sequence B, the % amino acid sequence identity of A to B will not equal the % amino acid sequence identity of B to A.
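The X/Y calculation described above, including its dependence on which sequence is treated as B, can be sketched in Python as follows. The sketch assumes that an alignment has already been produced (for example, by NCBI-BLAST2) and is supplied as two equal-length strings in which "-" marks a gap; the function name `percent_identity` and the example sequences are hypothetical, and the code does not perform the alignment itself.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """% identity of A to B: 100 * X / Y.

    X = aligned positions where A and B carry the same residue.
    Y = total number of residues in B (gap characters excluded).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    x = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    y = sum(1 for b in aligned_b if b != "-")
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


if __name__ == "__main__":
    # Hypothetical alignment: A has 5 residues, B has 8.
    a = "MKTAY---"
    b = "MKTAYIAK"
    print(percent_identity(a, b))  # A to B: Y is the length of B
    print(percent_identity(b, a))  # B to A: Y is the length of A
```

Because Y is always the residue count of the second argument, swapping the arguments changes the result whenever the two sequences differ in length, which is the asymmetry noted in the text.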

In this sense, techniques for determining amino acid sequence "similarity" are well known in the art. In general, "similarity" refers to the exact amino acid to amino acid comparison of two or more polypeptides at the appropriate positions, where the amino acids are identical or possess similar chemical and/or physical properties, such as charge or hydrophobicity. A so-called "percent similarity" can then be determined between the compared polypeptide sequences. Techniques for determining nucleic acid and amino acid sequence identity are also well known in the art and include determining the nucleotide sequence of the mRNA for the gene (usually via a cDNA intermediate), determining the amino acid sequence encoded therein, and comparing it to a second amino acid sequence. In general, "identity" refers to an exact nucleotide to nucleotide or amino acid to amino acid correspondence of two polynucleotide or polypeptide sequences, respectively. Two or more polynucleotide sequences can be compared by determining their "percent identity," as can two or more amino acid sequences. The programs available in the Wisconsin Sequence Analysis Package, Version 8 (available from Genetics Computer Group, Madison, Wis.), for example, the GAP program, are capable of calculating both the identity between two polynucleotides and the identity and similarity between two polypeptide sequences, respectively. Other programs for calculating identity or similarity between sequences are known to those skilled in the art.

An amino acid position "corresponding to" a reference position refers to a position that aligns with the reference sequence, as identified by aligning the amino acid sequences. Such alignments can be done by hand or by using well-known sequence alignment programs such as ClustalW2, Blast 2, and the like.

Unless specified otherwise, the percent identity of two polypeptide or polynucleotide sequences refers to the percentage of identical amino acid residues or nucleotides across the entire length of the shorter of the two sequences.

"Coding sequence" is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to a DNA sequence that encodes a specific amino acid sequence.

"Suitable regulatory sequences" is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to nucleotide sequences that are located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and that influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, and polyadenylation recognition sequences.

"Promoter" is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3' to a promoter sequence. Promoters may be derived in their entirety from a native gene, may be composed of different elements derived from different promoters found in nature, or may even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different cell types, at different stages of development, or in response to different environmental conditions. Promoters that cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters." It is further recognized that, since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.

The term "operably linked" refers to the association of nucleic acid sequences on a single nucleic acid fragment such that the function of one is affected by the other. For example, a promoter is operably linked to a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.

The term "expression" as used herein is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the present technology. "Overexpression" refers to the production of a gene product in a transgenic or recombinant organism that exceeds the level of production in a normal or non-transformed organism.

"Transformation" is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to the transfer of a polynucleotide into a target cell. The transferred polynucleotide can be incorporated into the genome or chromosomal DNA of the target cell, resulting in genetically stable inheritance, or it can replicate independently of the host chromosome. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic," "recombinant," or "transformed" organisms.

The terms "transformed," "transgenic," and "recombinant," when used herein in connection with host cells, are used according to their ordinary and customary meanings as understood by a person of ordinary skill in the art, and are used without limitation to refer to a cell of a host organism, such as a plant or microbial cell, into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule can be stably integrated into the genome of the host cell, or it can be present as an extrachromosomal molecule. Such an extrachromosomal molecule can be auto-replicating. Transformed cells, tissues, or subjects are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof.

The terms "recombinant," "heterologous," and "exogenous," when used herein in connection with polynucleotides, are used according to their ordinary and customary meanings as understood by a person of ordinary skill in the art, and are used without limitation to refer to a polynucleotide (e.g., a DNA sequence or a gene) that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form. Thus, a heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has been modified through, for example, the use of site-directed mutagenesis or other recombinant techniques. The terms also include non-naturally occurring multiple copies of a naturally occurring DNA sequence. Thus, the terms refer to a DNA segment that is foreign or heterologous to the cell, or that is homologous to the cell but is in a position or form within the host cell in which the element is not ordinarily found.

Similarly, when used herein in conjunction with a polypeptide or amino acid sequence, the terms "recombinant," "heterologous," and "exogenous" mean a polypeptide or amino acid sequence that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form. Thus, a recombinant DNA segment can be expressed in a host cell to produce a recombinant polypeptide.

The terms "plasmid," "vector," and "cassette" are used according to their ordinary and customary meanings as understood by a person of ordinary skill in the art, and are used without limitation to refer to an extrachromosomal element that often carries genes that are not part of the central metabolism of the cell and is usually in the form of a circular double-stranded DNA molecule. Such elements can be autonomously replicating sequences, genome-integrating sequences, phage, or nucleotide sequences, linear or circular, of single- or double-stranded DNA or RNA derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construct capable of introducing a promoter fragment and the DNA sequence for a selected gene product, along with an appropriate 3' untranslated sequence, into a cell. A "transformation cassette" refers to a specific vector that contains a foreign gene and, in addition to the foreign gene, has elements that facilitate transformation of a particular host cell. An "expression cassette" refers to a specific vector that contains a foreign gene and, in addition to the foreign gene, has elements that allow for enhanced expression of that gene in a foreign host.

Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described, for example, in Sambrook, J., Fritsch, E. F., and Maniatis, T., Molecular Cloning: A Laboratory Manual, 2nd ed.; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y., 1989 (hereinafter "Maniatis"); Silhavy, T. J., Berman, M. L., and Enquist, L. W., Experiments with Gene Fusions; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y., 1984; and Ausubel, F. M. et al., Current Protocols in Molecular Biology, published by Greene Publishing and Wiley-Interscience, 1987; each of which is hereby incorporated by reference in its entirety to the extent it is consistent herewith.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure, the preferred materials and methods are described below.

In accordance with the present disclosure, methods for producing ergothioneine, and host cells having genes encoding EgtB, EgtC, EgtD, and EgtE that can be used to produce ergothioneine, have been developed. Surprisingly and unexpectedly, the ergothioneine production pathway has been reproduced in an in vitro microbial production system.

Engineered host cells for the production of ergothioneine

In one aspect, the present disclosure relates to an engineered host cell. The engineered host cell comprises a nucleic acid sequence encoding EgtB, a nucleic acid sequence encoding EgtC, a nucleic acid sequence encoding EgtD, and a nucleic acid sequence encoding EgtE.

EgtB (or iron(II)-dependent oxidoreductase EgtB) catalyzes the oxidative sulfurization of hercynine (N-α,N-α,N-α-trimethyl-L-histidine) via the addition of oxygen and γ-glutamyl-cysteine to the hercynine.

A suitable EgtB can be, for example, a Mycobacterium EgtB. A particularly suitable EgtB can be, for example, an EgtB nucleic acid sequence encoding an amino acid sequence that is at least 95% identical to the amino acid sequence provided in SEQ ID NO: 2. In another aspect, a particularly suitable EgtB can be, for example, an EgtB nucleic acid sequence encoding an amino acid sequence that is at least 96% identical to the amino acid sequence provided in SEQ ID NO: 2. In another aspect, a particularly suitable EgtB can be, for example, an EgtB nucleic acid sequence encoding an amino acid sequence that is at least 97% identical to the amino acid sequence provided in SEQ ID NO: 2. In another aspect, a particularly suitable EgtB can be, for example, an EgtB nucleic acid sequence encoding an amino acid sequence that is at least 98% identical to the amino acid sequence provided in SEQ ID NO: 2. In another aspect, a particularly suitable EgtB can be, for example, an EgtB nucleic acid sequence encoding an amino acid sequence that is at least 99% identical to the amino acid sequence provided in SEQ ID NO: 2. In another aspect, a particularly suitable EgtB can be, for example, an EgtB nucleic acid sequence encoding an amino acid sequence that is 100% identical to the amino acid sequence provided in SEQ ID NO: 2.

EgtC (or amidohydrolase EgtC) catalyzes the hydrolysis of the γ-glutamyl amide bond of N-(γ-glutamyl)-[N(α),N(α),N(α)-trimethyl-L-histidyl]-cysteine sulfoxide to produce hercynylcysteine sulfoxide.

A suitable EgtC can be, for example, a Mycobacterium EgtC. A particularly suitable EgtC can be, for example, an EgtC nucleic acid sequence encoding an amino acid sequence that is at least 95% identical to the amino acid sequence provided in SEQ ID NO: 4. In another aspect, a particularly suitable EgtC can be, for example, an EgtC nucleic acid sequence encoding an amino acid sequence that is at least 96% identical to the amino acid sequence provided in SEQ ID NO: 4. In another aspect, a particularly suitable EgtC can be, for example, an EgtC nucleic acid sequence encoding an amino acid sequence that is at least 97% identical to the amino acid sequence provided in SEQ ID NO: 4. In another aspect, a particularly suitable EgtC can be, for example, an EgtC nucleic acid sequence encoding an amino acid sequence that is at least 98% identical to the amino acid sequence provided in SEQ ID NO: 4. In another aspect, a particularly suitable EgtC can be, for example, an EgtC nucleic acid sequence encoding an amino acid sequence that is at least 99% identical to the amino acid sequence provided in SEQ ID NO: 4. In another aspect, a particularly suitable EgtC can be, for example, an EgtC nucleic acid sequence encoding an amino acid sequence that is 100% identical to the amino acid sequence provided in SEQ ID NO: 4.

EgtD (or histidine-specific methyltransferase EgtD) catalyzes the methylation of histidine to form N-α,N-α,N-α-trimethyl-L-histidine (also known as hercynine). Histidine and α-N,N-dimethylhistidine are the preferred substrates.

A suitable EgtD can be, for example, a Mycobacterium EgtD. A particularly suitable EgtD can be, for example, an EgtD nucleic acid sequence encoding an amino acid sequence that is at least 95% identical to the amino acid sequence provided in SEQ ID NO: 6. In another aspect, a particularly suitable EgtD can be, for example, an EgtD nucleic acid sequence encoding an amino acid sequence that is at least 96% identical to the amino acid sequence provided in SEQ ID NO: 6. In another aspect, a particularly suitable EgtD can be, for example, an EgtD nucleic acid sequence encoding an amino acid sequence that is at least 97% identical to the amino acid sequence provided in SEQ ID NO: 6. In another aspect, a particularly suitable EgtD can be, for example, an EgtD nucleic acid sequence encoding an amino acid sequence that is at least 98% identical to the amino acid sequence provided in SEQ ID NO: 6. In another aspect, a particularly suitable EgtD can be, for example, an EgtD nucleic acid sequence encoding an amino acid sequence that is at least 99% identical to the amino acid sequence provided in SEQ ID NO: 6. In another aspect, a particularly suitable EgtD can be, for example, an EgtD nucleic acid sequence encoding an amino acid sequence that is 100% identical to the amino acid sequence provided in SEQ ID NO: 6.

EgtE (or pyridoxal phosphate-dependent protein EgtE) is believed to catalyze the removal of pyruvate, ammonia, and oxygen to produce ergothioneine.

A suitable EgtE can be, for example, a Mycobacterium EgtE. A particularly suitable EgtE can be, for example, an EgtE nucleic acid sequence encoding an amino acid sequence that is at least 95% identical to the amino acid sequence provided in SEQ ID NO: 8. In another aspect, a particularly suitable EgtE can be, for example, an EgtE nucleic acid sequence encoding an amino acid sequence that is at least 96% identical to the amino acid sequence provided in SEQ ID NO: 8. In another aspect, a particularly suitable EgtE can be, for example, an EgtE nucleic acid sequence encoding an amino acid sequence that is at least 97% identical to the amino acid sequence provided in SEQ ID NO: 8. In another aspect, a particularly suitable EgtE can be, for example, an EgtE nucleic acid sequence encoding an amino acid sequence that is at least 98% identical to the amino acid sequence provided in SEQ ID NO: 8. In another aspect, a particularly suitable EgtE can be, for example, an EgtE nucleic acid sequence encoding an amino acid sequence that is at least 99% identical to the amino acid sequence provided in SEQ ID NO: 8. In another aspect, a particularly suitable EgtE can be, for example, an EgtE nucleic acid sequence encoding an amino acid sequence that is 100% identical to the amino acid sequence provided in SEQ ID NO: 8.
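The percent-identity thresholds recited above for EgtB, EgtC, EgtD, and EgtE can be illustrated with a minimal sketch. The disclosure does not state which alignment tool or identity convention is intended, so the following assumes two amino acid sequences that have already been aligned (gaps written as "-") and counts identical residues over the aligned columns. The function names and the single-substitution variant are hypothetical; the 20-residue reference is the N-terminal start of SEQ ID NO: 2.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumption: the inputs are pre-aligned and of equal length, with '-' for
    gaps. Columns where both sequences have a gap are ignored; a gap opposite
    a residue counts as a mismatch. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# First 20 residues of SEQ ID NO: 2 (one-letter code).
REFERENCE = "MIARETLADELALARERTLR"
# Hypothetical variant with a single substitution (A3G), for illustration only.
VARIANT = "MIGRETLADELALARERTLR"

if __name__ == "__main__":
    print(percent_identity(REFERENCE, VARIANT))
    print(meets_threshold(REFERENCE, VARIANT, 95.0))
```

In practice the comparison would be made over the full-length sequences after alignment with an established tool; this fragment only shows the arithmetic behind a threshold such as "at least 95% identical".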

Suitable host cells can be, for example, bacterial cells and yeast cells. A suitable bacterial cell can be, for example, Escherichia coli.

Suitable yeast cells can be, for example, Saccharomyces and Pichia. A particularly suitable Saccharomyces can be, for example, Saccharomyces cerevisiae. A particularly suitable Pichia can be, for example, Pichia pastoris.

The nucleic acid sequences encoding EgtB, EgtC, EgtD, and EgtE are cloned into an expression vector under the control of a promoter known to those skilled in the art. Suitable promoters can be, for example, constitutively active promoters and inducible promoters known to those skilled in the art. Suitable inducible promoters are known to those skilled in the art and can be, for example, promoters that respond to a chemical inducer, to nutrient addition, to nutrient depletion, or to a shift in a physical or physicochemical factor such as, for example, pH or temperature. Suitable chemically inducible promoters can be, for example, isopropyl β-D-1-thiogalactopyranoside (IPTG)-inducible promoters and antibiotic-inducible promoters known to those skilled in the art. A particularly suitable chemically inducible promoter can be, for example, an IPTG-inducible promoter. Other suitable inducible promoters can be, for example, temperature-inducible promoters known to those skilled in the art, such as the pL and pR λ phage promoters.

Particularly suitable expression vectors are shown in Figures 1A and 1B. Other suitable expression vectors are known to those skilled in the art and can be, for example, pET vectors, pCDF vectors, pRSF vectors, and Duet vectors.

Method for producing ergothioneine

The method can further include adding a substrate to the culture. A suitable amount of substrate can be, for example, about 1 mM to about 20 mM. Particularly suitable substrates can be, for example, histidine, methionine, cysteine, γ-glutamylcysteine, and combinations thereof.

In another embodiment, the method can include adding a cofactor to the culture. A suitable amount of cofactor can be, for example, about 0.05 mM to about 0.4 mM. A particularly suitable cofactor can be, for example, iron(II) (Fe⁺⁺).

In one embodiment, the host cell can produce about 10 mg to about 30 mg of ergothioneine per liter.

The present disclosure will be more fully understood upon consideration of the following non-limiting examples.

Examples

Example 1

In this example, the nucleic acid sequences of EgtB, EgtC, EgtD, and EgtE were cloned into E. coli.

Specifically, the following sequences were obtained from GenBank (Accession No. NC_008596): EgtB: MSMEG_6249 (SEQ ID NO: 1); EgtC: MSMEG_6248 (SEQ ID NO: 3); EgtD: MSMEG_6247 (SEQ ID NO: 5); and EgtE: MSMEG_6246 (SEQ ID NO: 7). The genes were introduced into vectors under the control of an IPTG-inducible promoter.
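As a consistency check on the sequence pairing above (a nucleotide sequence in SEQ ID NO: 1 and its encoded protein in SEQ ID NO: 2), the following minimal sketch translates the first 60 nucleotides of SEQ ID NO: 1 as given in the sequence listing and compares the result with the start of SEQ ID NO: 2. It assumes the standard genetic code; the function names are illustrative and not part of the disclosure. It also checks that the listed lengths agree, since an open reading frame with a stop codon should contain three nucleotides per residue plus three for the stop.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; whitespace and case are ignored."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def orf_length_consistent(n_nucleotides: int, n_residues: int) -> bool:
    """An ORF with a stop codon has 3 * (residues + 1) nucleotides."""
    return n_nucleotides == 3 * (n_residues + 1)


# First 60 nt of SEQ ID NO: 1 and first 20 residues of SEQ ID NO: 2.
SEQ1_START = "atgatcgcac gcgagacact ggccgacgag ctggccctgg cccgcgaacg cacgttgcgg"
SEQ2_START = "MIARETLADELALARERTLR"

if __name__ == "__main__":
    print(translate(SEQ1_START) == SEQ2_START)
    print(orf_length_consistent(1287, 428))  # lengths from <211> of SEQ ID NOs: 1 and 2
```

The same check could be applied to the other nucleotide and protein pairs once their full sequences are loaded.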

To construct the ET pathway in E. coli, the EgtB, C, D, and E nucleic acid sequences were PCR-amplified from Mycobacterium smegmatis genomic DNA using the primer pairs summarized in Table 1. All 5'-primers used for cloning included EcoRI and BglI restriction sites and a ribosome binding site (RBS), and all 3'-primers included BamHI-XhoI sites. The EgtD and EgtB sequences were cloned into the pConB7A vector (Figure 1A), and the EgtC and EgtE sequences were cloned into the pConA5K vector (Figure 1B). No sequence errors were identified in the cloned sequences. Empty vectors were prepared in the same manner. The constructs were then co-transformed into E. coli strain BL21(DE3).

Table 1. Primers used for gene cloning.

Example 2

In this example, ergothioneine was produced in an engineered microbial system.

Specifically, E. coli was transformed with the pConB7A and pConA5K vectors encoding EgtB, EgtC, EgtD, and EgtE as described in Example 1. To co-express the four genes (EgtB, C, D, and E) in the E. coli system, transformants were grown in LB medium containing 100 mg/L ampicillin and 50 mg/L kanamycin at 37°C until an OD₆₀₀ of ~0.6 was reached. Expression was induced by adding 0.2-0.5 mM isopropyl β-D-1-thiogalactopyranoside (IPTG), and the cultures were grown for a further 16-24 hours at 30°C or 37°C. Cells were harvested by centrifugation, and the supernatant and cell pellet were collected separately. The supernatant was centrifuged at 16,000 × g for 5 minutes for HPLC analysis. The pellet was resuspended in 1 ml of 50% methanol and sonicated for 1 minute (3 × 20 seconds). After centrifugation at 16,000 × g for 5 minutes, 5 μl of sample was analyzed by HPLC as described below. E. coli transformed with the empty vectors was treated in the same manner and analyzed by HPLC. A sample from the IPTG-induced E. coli containing the EgtB, EgtC, EgtD, and EgtE genes was spiked with 20 mg/L ergothioneine and analyzed by HPLC.

Samples were analyzed using a Dionex UPLC Ultimate 3000 (Sunnyvale, CA). Compounds were separated on an Atlantis HILIC silica column (particle size 3.0 μm, diameter × length = 2.1 × 100 mm; Waters) and detected at 264 nm. The mobile phase consisted of 0.1% formic acid in water (A) and 0.1% formic acid in acetonitrile (B). The gradient program was 95% B at 1 minute, 40% B at 8 minutes, and 95% B at 8.1 minutes, stopping at 11 minutes. The flow rate was 0.6 ml/min, and the injection volume was 5 μl.
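The gradient program above can be read as a set of time points joined by linear ramps. The following minimal sketch encodes it that way. Two points are assumptions not stated in the text: that the composition is held at 95% B from injection until 1 minute, and that transitions between listed points are linear. The names are illustrative.

```python
from bisect import bisect_right

# (time in minutes, % mobile phase B). The 0.0 min entry is an assumed
# initial hold; the remaining points are the ones given in the text.
GRADIENT = [(0.0, 95.0), (1.0, 95.0), (8.0, 40.0), (8.1, 95.0), (11.0, 95.0)]
FLOW_ML_PER_MIN = 0.6


def percent_b(t_min: float) -> float:
    """Percent B at time t_min, assuming linear ramps between program points."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-11 minute program")
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:  # exactly at the end of the run
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def total_solvent_ml() -> float:
    """Total mobile phase delivered over one run at the stated flow rate."""
    return FLOW_ML_PER_MIN * (GRADIENT[-1][0] - GRADIENT[0][0])


if __name__ == "__main__":
    for t in (0.5, 4.5, 8.0, 9.0):
        print(t, percent_b(t))
    print(total_solvent_ml())
```

Under these assumptions the composition falls from 95% B to 40% B over 7 minutes, then returns to the starting condition for re-equilibration before the run ends at 11 minutes.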

As shown in Figure 2, ET surprisingly accumulated only in the IPTG-induced E. coli strain containing the EgtB, EgtC, EgtD, and EgtE sequences ("SI"), demonstrating the biosynthesis of ET in engineered E. coli. In contrast, the IPTG-induced E. coli containing the empty vectors ("EI") did not produce any ET. In the ET-spiked sample ("Ck+"), the ET peak from the IPTG-induced E. coli strain containing EgtB, EgtC, EgtD, and EgtE overlapped with the added ergothioneine and showed an increased level that accounts for the added ET.

Figures 3A and 3B show the HPLC analysis of a 100 mg/L ergothioneine standard. As shown in Figure 4A, the retention time of ET from the E. coli strain containing EgtB, EgtC, EgtD, and EgtE overlapped with that of the ergothioneine standard (see Figure 3A). In addition to the retention time, the UV spectrum of the ET peak (see Figure 4B) also matched that of the ergothioneine standard (see Figure 3B). These results indicate that the peak from the engineered E. coli strain expressing EgtB, EgtC, EgtD, and EgtE corresponds to ET.

Example 3

In this example, a time course of ergothioneine production in an engineered microbial system was conducted.

Specifically, E. coli was transformed with the vectors containing the genes for EgtB, EgtC, EgtD, and EgtE as described in Example 1. Control E. coli cells included cells carrying the empty vectors (no Egt genes) and an uninduced strain that contained the Egt vectors but was not induced. The cells were grown at 30°C or 37°C as described in Example 2. Samples were taken at different time points from 0 hours to 20 hours. After sonication for 1 minute (3 × 20 seconds), the samples were centrifuged at 16,000 × g for 5 minutes, and 5 μl of sample was analyzed by HPLC as discussed in Example 2.

As shown in Figure 5, HPLC analysis revealed that the cells began to produce ET by about 1 hour after IPTG induction. The fastest increase in ET production was observed from about 3 hours to about 10 hours after IPTG induction. ET production slowed after 10 hours but continued until at least 20 hours. Meanwhile, no ET was detected in the empty-vector control at any point in the time course. These results further indicate that ET is produced exclusively in the E. coli strain engineered to express EgtB, EgtC, EgtD, and EgtE.

Example 4

In this example, feeding experiments were conducted to determine the effect on ergothioneine production in an engineered microbial system.

Without being bound by theory, it is believed that ET is synthesized from amino acids such as histidine (His), methionine (Met), and cysteine (Cys). The imidazole ring of ET is supplied by His, which is then methylated to produce histidine betaine. Met is a building block of S-adenosylmethionine (SAM), which serves as the methyl donor. The sulfur atom is introduced from Cys.

To determine the effect on ergothioneine production in the engineered E. coli, several substrates and cofactors such as Fe⁺⁺ were fed to the transgenic E. coli cells through the culture medium. Three hours after induction, 2 mM His, 4 mM Met, 4 mM Cys, and 0.2 mM Fe⁺⁺ were added to the medium, and the cells were cultured for a further 16, 24, and 42 hours. Control E. coli cultures (carrying the empty vectors) were fed the same substrates or cofactor. Samples were analyzed by HPLC as described in Example 2.
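For bench planning, the millimolar feed concentrations above can be converted to masses. The following minimal sketch does this for the three amino acids, assuming they are supplied as the free L-amino acids (approximate molar masses: His 155.15, Met 149.21, Cys 121.16 g/mol). Salt or hydrate forms, which the text does not specify, would need different molar masses, and the iron source is not stated, so Fe⁺⁺ is left out. The culture volume in the usage line is a hypothetical value.

```python
# Approximate molar masses (g/mol) of the free L-amino acids.
# Assumption: free-base forms; hydrochlorides or hydrates would differ.
MOLAR_MASS_G_PER_MOL = {"His": 155.15, "Met": 149.21, "Cys": 121.16}

# Final concentrations (mM) from the feeding experiment.
FEED_MM = {"His": 2.0, "Met": 4.0, "Cys": 4.0}


def mass_to_add_mg(substrate: str, culture_volume_l: float) -> float:
    """Mass in mg needed to reach the feed concentration in the given volume.

    mM x g/mol gives mg/L, which is then scaled by the culture volume.
    """
    if culture_volume_l <= 0:
        raise ValueError("culture volume must be positive")
    return FEED_MM[substrate] * MOLAR_MASS_G_PER_MOL[substrate] * culture_volume_l


if __name__ == "__main__":
    volume_l = 0.05  # hypothetical 50 ml culture
    for name in FEED_MM:
        print(name, round(mass_to_add_mg(name, volume_l), 2), "mg")
```

The calculation ignores the small volume change caused by adding a concentrated stock solution.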

As shown in Figure 6, the feeding experiments revealed that the addition of Cys increased ET production by 17.3-44.4% across the three time points. This result indicates that Cys and its derivative γ-glutamylcysteine play an important role in the biosynthesis of ET. The control cultures did not produce any ET.

Example 5

In this example, ergothioneine will be produced in an engineered Saccharomyces cerevisiae system.

To produce ET in Saccharomyces cerevisiae, the EgtB, C, D, and E genes are cloned into commercially available pESC vectors, such as pESC-His and pESC-Leu (Agilent Technologies). These vectors contain the GAL1 and GAL10 yeast promoters in opposing orientations, which allows two genes to be introduced into a yeast strain, each under the control of a repressible promoter. The two resulting constructs are then co-transformed into Saccharomyces cerevisiae. To co-express the four genes (EgtB, C, D, and E) in yeast, the transformants are grown in medium lacking the two amino acids histidine and leucine until an OD₆₀₀ of ~0.4 is reached. Expression is induced by adding 2% galactose, and the cultures are grown for a further 24-48 hours at 28°C or 30°C. The cells are harvested by centrifugation, and the supernatant and cell pellet are collected separately. The supernatant is centrifuged at 12,000 × g for 5 minutes and analyzed by HPLC. The pellet is resuspended in 1 ml of 50% methanol and sonicated for 1 minute (3 × 20 seconds). After centrifugation at 12,000 × g for 5 minutes, 5 μl of sample is injected into the HPLC. Yeast carrying the empty vectors is transformed and analyzed in the same manner. The above constructs can ultimately be integrated into the yeast genome and expressed under the control of a constitutive promoter such as the GPD promoter or the GAP promoter.

Example 6

In this example, ergothioneine will be produced in an engineered Pichia pastoris system.

To produce ET in Pichia pastoris, the EgtB, C, D, and E genes are cloned into the commercially available pPICZ or pGAPZ vectors (Invitrogen, Life Technologies). The pPICZ vector contains the methanol-regulated AOX1 promoter, whereas the pGAPZ vector has the constitutive glyceraldehyde-3-phosphate dehydrogenase (GAP) promoter. Co-expression of the four genes (EgtB, C, D, and E) in the pPICZ vector will be induced with 0.5-5% methanol. ET production will be analyzed by HPLC using the same method described above.

In view of the above, it will be seen that the several advantages of the present disclosure are achieved and other advantageous results are obtained. As various changes could be made in the above methods and systems without departing from the scope of the present disclosure, it is intended that all matter contained in the above description and shown in the accompanying drawings shall be interpreted as illustrative and not in a limiting sense.

When introducing elements of the present disclosure or the various versions, embodiments, or aspects thereof, the articles "a," "an," "the," and "said" are intended to mean that there are one or more of the elements. The terms "comprising," "including," and "having" are intended to be inclusive and mean that there may be additional elements other than the listed elements.

Sequence Listing

<110> Conagen Inc.

Han, Jixiang

Chen, Hui

Yu, Oliver

<120> Microbial ergothioneine biosynthesis

<130> C1497.70015

<140> Not yet specified

<141> Concurrently herewith

<150> PCT/US2015/027977

<151> 2015-04-28

<160> 16

<170> PatentIn version 3.5

<210> 1

<211> 1287

<212> DNA

<213> Artificial sequence

<220>

<223> Synthetic

<400> 1

atgatcgcac gcgagacact ggccgacgag ctggccctgg cccgcgaacg cacgttgcgg 60

ctcgtggagt tcgacgacgc ggaactgcat cgccagtaca acccgctgat gagcccgctc 120

gtgtgggacc tcgcgcacat cgggcagcag gaagaactgt ggctgctgcg cgacggcaac 180

cccgaccgcc ccggcatgct cgcacccgag gtggaccggc tttacgacgc gttcgagcac 240

tcacgcgcca gccgggtcaa cctcccgttg ctgccgcctt cggatgcgcg cgcctactgc 300

gcgacggtgc gggccaaggc gctcgacacc ctcgacacgc tgcccgagga cgatccgggc 360

ttccggttcg cgctggtgat cagccacgag aaccagcacg acgagaccat gctgcaggca 420

ctcaacctgc gcgagggccc acccctgctc gacaccggaa ttcccctgcc cgcgggcagg 480

ccaggcgtgg caggcacgtc ggtgctggtg ccgggcggcc cgttcgtgct cggggtcgac 540

gcgctgaccg aaccgcactc actggacaac gaacggcccg cccacgtcgt ggacatcccg 600

tcgttccgga tcggccgcgt gccggtcacc aacgccgaat ggcgcgagtt catcgacgac 660

ggtggctacg accaaccgcg ctggtggtcg ccacgcggct gggcgcaccg ccaggaggcg 720

ggcctggtgg ccccgcagtt ctggaacccc gacggcaccc gcacccggtt cgggcacatc 780

gaggagatcc cgggtgacga acccgtgcag cacgtgacgt tcttcgaagc cgaggcctac 840

gcggcgtggg ccggtgctcg gttgcccacc gagatcgaat gggagaaggc ctgcgcgtgg 900

gatccggtcg ccggtgctcg gcgccggttc ccctggggct cagcacaacc cagcgcggcg 960

ctggccaacc tcggcggtga cgcacgccgc ccggcgccgg tcggggccta cccggcgggg 1020

gcgtcggcct atggcgccga gcagatgctg ggcgacgtgt gggagtggac ctcctcgccg 1080

ctgcggccgt ggcccggttt cacgccgatg atctacgagc gctacagcac gccgttcttc 1140

gagggcacca catccggtga ctaccgcgtg ctgcgcggcg ggtcatgggc cgttgcaccg 1200

ggaatcctgc ggcccagctt ccgcaactgg gaccacccga tccggcggca gatattctcg 1260

ggtgtccgcc tggcctggga cgtctga 1287

<210> 2<210> 2

<211> 428<211> 428

<212> PRT<212> PRT

<213> 人工序列<213> Artificial sequence

<220><220>

<223> 合成的<223> Synthetic

<400> 2<400> 2

Met Ile Ala Arg Glu Thr Leu Ala Asp Glu Leu Ala Leu Ala Arg Glu
1 5 10 15
Arg Thr Leu Arg Leu Val Glu Phe Asp Asp Ala Glu Leu His Arg Gln
20 25 30
Tyr Asn Pro Leu Met Ser Pro Leu Val Trp Asp Leu Ala His Ile Gly
35 40 45
Gln Gln Glu Glu Leu Trp Leu Leu Arg Asp Gly Asn Pro Asp Arg Pro
50 55 60
Gly Met Leu Ala Pro Glu Val Asp Arg Leu Tyr Asp Ala Phe Glu His
65 70 75 80
Ser Arg Ala Ser Arg Val Asn Leu Pro Leu Leu Pro Pro Ser Asp Ala
85 90 95
Arg Ala Tyr Cys Ala Thr Val Arg Ala Lys Ala Leu Asp Thr Leu Asp
100 105 110
Thr Leu Pro Glu Asp Asp Pro Gly Phe Arg Phe Ala Leu Val Ile Ser
115 120 125
His Glu Asn Gln His Asp Glu Thr Met Leu Gln Ala Leu Asn Leu Arg
130 135 140
Glu Gly Pro Pro Leu Leu Asp Thr Gly Ile Pro Leu Pro Ala Gly Arg
145 150 155 160
Pro Gly Val Ala Gly Thr Ser Val Leu Val Pro Gly Gly Pro Phe Val
165 170 175
Leu Gly Val Asp Ala Leu Thr Glu Pro His Ser Leu Asp Asn Glu Arg
180 185 190
Pro Ala His Val Val Asp Ile Pro Ser Phe Arg Ile Gly Arg Val Pro
195 200 205
Val Thr Asn Ala Glu Trp Arg Glu Phe Ile Asp Asp Gly Gly Tyr Asp
210 215 220
Gln Pro Arg Trp Trp Ser Pro Arg Gly Trp Ala His Arg Gln Glu Ala
225 230 235 240
Gly Leu Val Ala Pro Gln Phe Trp Asn Pro Asp Gly Thr Arg Thr Arg
245 250 255
Phe Gly His Ile Glu Glu Ile Pro Gly Asp Glu Pro Val Gln His Val
260 265 270
Thr Phe Phe Glu Ala Glu Ala Tyr Ala Ala Trp Ala Gly Ala Arg Leu
275 280 285
Pro Thr Glu Ile Glu Trp Glu Lys Ala Cys Ala Trp Asp Pro Val Ala
290 295 300
Gly Ala Arg Arg Arg Phe Pro Trp Gly Ser Ala Gln Pro Ser Ala Ala
305 310 315 320
Leu Ala Asn Leu Gly Gly Asp Ala Arg Arg Pro Ala Pro Val Gly Ala
325 330 335
Tyr Pro Ala Gly Ala Ser Ala Tyr Gly Ala Glu Gln Met Leu Gly Asp
340 345 350
Val Trp Glu Trp Thr Ser Ser Pro Leu Arg Pro Trp Pro Gly Phe Thr
355 360 365
Pro Met Ile Tyr Glu Arg Tyr Ser Thr Pro Phe Phe Glu Gly Thr Thr
370 375 380
Ser Gly Asp Tyr Arg Val Leu Arg Gly Gly Ser Trp Ala Val Ala Pro
385 390 395 400
Gly Ile Leu Arg Pro Ser Phe Arg Asn Trp Asp His Pro Ile Arg Arg
405 410 415
Gln Ile Phe Ser Gly Val Arg Leu Ala Trp Asp Val
420 425

<210> 3
<211> 684
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 3

atgtgccggc atgtggcgtg gctgggcgcg ccgcggtcgt tggccgacct ggtgctcgac 60
ccgccgcagg gactgctggt gcagtcctac gcaccgcgac gacagaagca cggtctgatg 120
aacgccgacg gttggggcgc agggtttttc gacgacgagg gagtggcccg ccgctggcgc 180
agcgacaaac cgctgtgggg tgatgcgtcg ttcgcgtcgg tggcacccgc actacgcagt 240
cgttgcgtgc tggccgcggt gcgctcggcc accatcggca tgcccatcga accgtcggcg 300
tcggcgccgt tcagcgacgg gcagtggctg ctgtcgcaca acggcctggt cgaccgcggg 360
gtgctcccgt tgaccggtgc cgccgagtcc acggtggaca gcgcgatcgt cgcggcgctc 420
atcttctccc gtggcctcga cgcgctcggc gccaccatcg ccgaggtcgg cgaactcgac 480
ccgaacgcgc ggttgaacat cctggccgcc aacggttccc ggctgctcgc caccacctgg 540
ggggacacgc tgtcggtcct gcaccgcccc gacggcgtcg tcctcgcgag cgaaccctac 600
gacgacgatc ccggctggtc ggacatcccg gaccggcacc tcgtcgacgt ccgcgacgcc 660
cacgtcgtcg tgacacccct gtga 684

<210> 4
<211> 227
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic
<400> 4

Met Cys Arg His Val Ala Trp Leu Gly Ala Pro Arg Ser Leu Ala Asp
1 5 10 15
Leu Val Leu Asp Pro Pro Gln Gly Leu Leu Val Gln Ser Tyr Ala Pro
20 25 30
Arg Arg Gln Lys His Gly Leu Met Asn Ala Asp Gly Trp Gly Ala Gly
35 40 45
Phe Phe Asp Asp Glu Gly Val Ala Arg Arg Trp Arg Ser Asp Lys Pro
50 55 60
Leu Trp Gly Asp Ala Ser Phe Ala Ser Val Ala Pro Ala Leu Arg Ser
65 70 75 80
Arg Cys Val Leu Ala Ala Val Arg Ser Ala Thr Ile Gly Met Pro Ile
85 90 95
Glu Pro Ser Ala Ser Ala Pro Phe Ser Asp Gly Gln Trp Leu Leu Ser
100 105 110
His Asn Gly Leu Val Asp Arg Gly Val Leu Pro Leu Thr Gly Ala Ala
115 120 125
Glu Ser Thr Val Asp Ser Ala Ile Val Ala Ala Leu Ile Phe Ser Arg
130 135 140
Gly Leu Asp Ala Leu Gly Ala Thr Ile Ala Glu Val Gly Glu Leu Asp
145 150 155 160
Pro Asn Ala Arg Leu Asn Ile Leu Ala Ala Asn Gly Ser Arg Leu Leu
165 170 175
Ala Thr Thr Trp Gly Asp Thr Leu Ser Val Leu His Arg Pro Asp Gly
180 185 190
Val Val Leu Ala Ser Glu Pro Tyr Asp Asp Asp Pro Gly Trp Ser Asp
195 200 205
Ile Pro Asp Arg His Leu Val Asp Val Arg Asp Ala His Val Val Val
210 215 220
Thr Pro Leu
225
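
As a consistency check on the listing, the first 60 nucleotides of SEQ ID NO:3 can be translated and compared with residues 1-20 of SEQ ID NO:4, and each declared DNA length can be compared with its protein length plus one stop codon. The Python sketch below is illustrative only and is not part of the filed sequence listing. It assumes the standard genetic code, and the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are hypothetical helpers introduced here.

```python
# Illustrative check only; not part of the filed sequence listing.
# Assumes the standard genetic code. All helper names are hypothetical.
from itertools import product

BASES = "tcag"
# Standard genetic code in t/c/a/g order, third base varying fastest.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> list[str]:
    """Translate DNA (whitespace ignored) into three-letter residues,
    stopping at the first stop codon."""
    dna = "".join(dna.split()).lower()
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return residues


# First line of SEQ ID NO:3 and residues 1-20 of SEQ ID NO:4, as listed.
SEQ3_FIRST_60 = (
    "atgtgccggc atgtggcgtg gctgggcgcg ccgcggtcgt tggccgacct ggtgctcgac"
)
SEQ4_FIRST_20 = (
    "Met Cys Arg His Val Ala Trp Leu Gly Ala Pro Arg Ser Leu Ala Asp "
    "Leu Val Leu Asp"
).split()

translation_matches = translate(SEQ3_FIRST_60) == SEQ4_FIRST_20

# Declared <211> lengths: (DNA SEQ ID, protein SEQ ID) -> (nt, aa).
DECLARED_LENGTHS = {
    (1, 2): (1287, 428),
    (3, 4): (684, 227),
    (5, 6): (966, 321),
    (7, 8): (1113, 370),
}
# Each coding sequence should be 3 nt per residue plus one stop codon.
lengths_consistent = all(
    nt == 3 * (aa + 1) for nt, aa in DECLARED_LENGTHS.values()
)

print(translation_matches, lengths_consistent)
```

The same `translate` helper could be applied to the full text of SEQ ID NOs: 1, 3, 5, and 7 to compare each against SEQ ID NOs: 2, 4, 6, and 8 in their entirety.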

<210> 5
<211> 966
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 5

atgacgctct cactggccaa ctacctggca gccgactcgg ccgccgaagc actgcgccgt 60
gacgtccgcg cgggcctcac cgcggcaccg aagagtctgc cgcccaagtg gttctacgac 120
gccgtcggca gtgatctgtt cgaccagatc acccggctcc ccgagtatta ccccacccgc 180
accgaggcgc agatcctgcg gacccggtcg gcggagatca tcgcggccgc gggtgccgac 240
accctggtgg aactgggcag tggtacgtcg gagaaaaccc gcatgctgct cgacgccatg 300
cgcgacgccg agttgctgcg ccgcttcatc ccgttcgacg tcgacgcggg cgtgctgcgc 360
tcggccgggg cggcaatcgg cgcggagtac cccggtatcg agatcgacgc ggtatgtggc 420
gatttcgagg aacatctggg caagatcccg catgtcggac ggcggctcgt ggtgttcctg 480
gggtcgacca tcggcaacct gacacccgcg ccccgcgcgg agttcctcag tactctcgcg 540
gacacgctgc agccgggcga cagcctgctg ctgggcaccg atctggtgaa ggacaccggc 600
cggttggtgc gcgcgtacga cgacgcggcc ggcgtcaccg cggcgttcaa ccgcaacgtg 660
ctggccgtgg tgaaccgcga actgtccgcc gatttcgacc tcgacgcgtt cgagcatgtc 720
gcgaagtgga actccgacga ggaacgcatc gagatgtggt tgcgtgcccg caccgcacag 780
catgtccgcg tcgcggcact ggacctggag gtcgacttcg ccgcgggtga ggagatgctc 840
accgaggtgt cctgcaagtt ccgtcccgag aacgtcgtcg ccgagctggc ggaagccggt 900
ctgcggcaga cgcattggtg gaccgatccg gccggggatt tcgggttgtc gctggcggtg 960
cggtga 966

<210> 6
<211> 321
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic
<400> 6

Met Thr Leu Ser Leu Ala Asn Tyr Leu Ala Ala Asp Ser Ala Ala Glu
1 5 10 15
Ala Leu Arg Arg Asp Val Arg Ala Gly Leu Thr Ala Ala Pro Lys Ser
20 25 30
Leu Pro Pro Lys Trp Phe Tyr Asp Ala Val Gly Ser Asp Leu Phe Asp
35 40 45
Gln Ile Thr Arg Leu Pro Glu Tyr Tyr Pro Thr Arg Thr Glu Ala Gln
50 55 60
Ile Leu Arg Thr Arg Ser Ala Glu Ile Ile Ala Ala Ala Gly Ala Asp
65 70 75 80
Thr Leu Val Glu Leu Gly Ser Gly Thr Ser Glu Lys Thr Arg Met Leu
85 90 95
Leu Asp Ala Met Arg Asp Ala Glu Leu Leu Arg Arg Phe Ile Pro Phe
100 105 110
Asp Val Asp Ala Gly Val Leu Arg Ser Ala Gly Ala Ala Ile Gly Ala
115 120 125
Glu Tyr Pro Gly Ile Glu Ile Asp Ala Val Cys Gly Asp Phe Glu Glu
130 135 140
His Leu Gly Lys Ile Pro His Val Gly Arg Arg Leu Val Val Phe Leu
145 150 155 160
Gly Ser Thr Ile Gly Asn Leu Thr Pro Ala Pro Arg Ala Glu Phe Leu
165 170 175
Ser Thr Leu Ala Asp Thr Leu Gln Pro Gly Asp Ser Leu Leu Leu Gly
180 185 190
Thr Asp Leu Val Lys Asp Thr Gly Arg Leu Val Arg Ala Tyr Asp Asp
195 200 205
Ala Ala Gly Val Thr Ala Ala Phe Asn Arg Asn Val Leu Ala Val Val
210 215 220
Asn Arg Glu Leu Ser Ala Asp Phe Asp Leu Asp Ala Phe Glu His Val
225 230 235 240
Ala Lys Trp Asn Ser Asp Glu Glu Arg Ile Glu Met Trp Leu Arg Ala
245 250 255
Arg Thr Ala Gln His Val Arg Val Ala Ala Leu Asp Leu Glu Val Asp
260 265 270
Phe Ala Ala Gly Glu Glu Met Leu Thr Glu Val Ser Cys Lys Phe Arg
275 280 285
Pro Glu Asn Val Val Ala Glu Leu Ala Glu Ala Gly Leu Arg Gln Thr
290 295 300
His Trp Trp Thr Asp Pro Ala Gly Asp Phe Gly Leu Ser Leu Ala Val
305 310 315 320
Arg

<210> 7
<211> 1113
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 7

atgctcgcgc agcagtggcg tgacgcccgt cccaaggttg ccgggttgca cctggacagc 60
ggggcatgtt cgcggcagag cttcgcggtg atcgacgcga ccaccgcaca cgcacgccac 120
gaggccgagg tgggtggtta tgtggcggcc gaggctgcga cgccggcgct cgacgccggg 180
cgggccgcgg tcgcgtcgct catcggtttt gcggcgtcgg acgtggtgta caccagcgga 240
tccaaccacg ccatcgacct gttgctgtcg agctggccgg ggaagcgcac gctggcctgc 300
ctgcccggcg agtacgggcc gaatctgtct gccatggcgg ccaacggttt ccaggtgcgt 360
gcgctaccgg tcgacgacga cgggcgggtg ctggtcgacg aggcgtcgca cgaactgtcg 420
gcccatcccg tcgcgctcgt acacctcacc gcattggcaa gccatcgcgg gatcgcgcaa 480
cccgcggcag aactcgtcga ggcctgccac aatgcgggga tccccgtggt gatcgacgcc 540
gcgcaggcgc tggggcatct ggactgcaat gtcggggccg acgcggtgta ctcatcgtcg 600
cgcaagtggc tcgccggccc gcgtggtgtc ggggtgctcg cggtgcggcc cgaactcgcc 660
gagcgtctgc aaccgcggat ccccccgtcc gactggccaa ttccgatgag cgtcttggag 720
aagctcgaac taggtgagca caacgcggcg gcgcgtgtgg gattctccgt cgcggttggt 780
gagcatctcg cagcagggcc cacggcggtg cgcgaacgac tcgccgaggt ggggcgtctc 840
tctcggcagg tgctggcaga ggtcgacggg tggcgcgtcg tcgaacccgt cgaccaaccc 900
accgcgatca ccacccttga gtccaccgat ggtgccgatc ccgcgtcggt gcgctcgtgg 960
ctgatcgcgg agcgtggcat cgtgaccacc gcgtgtgaac tcgcgcgggc accgttcgag 1020
atgcgcacgc cggtgctgcg aatctcgccg cacgtcgacg tgacggtcga cgaactggag 1080
cagttcgccg cagcgttgcg tgaggcgccc tga 1113

<210> 8
<211> 370
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic
<400> 8

Met Leu Ala Gln Gln Trp Arg Asp Ala Arg Pro Lys Val Ala Gly Leu
1 5 10 15
His Leu Asp Ser Gly Ala Cys Ser Arg Gln Ser Phe Ala Val Ile Asp
20 25 30
Ala Thr Thr Ala His Ala Arg His Glu Ala Glu Val Gly Gly Tyr Val
35 40 45
Ala Ala Glu Ala Ala Thr Pro Ala Leu Asp Ala Gly Arg Ala Ala Val
50 55 60
Ala Ser Leu Ile Gly Phe Ala Ala Ser Asp Val Val Tyr Thr Ser Gly
65 70 75 80
Ser Asn His Ala Ile Asp Leu Leu Leu Ser Ser Trp Pro Gly Lys Arg
85 90 95
Thr Leu Ala Cys Leu Pro Gly Glu Tyr Gly Pro Asn Leu Ser Ala Met
100 105 110
Ala Ala Asn Gly Phe Gln Val Arg Ala Leu Pro Val Asp Asp Asp Gly
115 120 125
Arg Val Leu Val Asp Glu Ala Ser His Glu Leu Ser Ala His Pro Val
130 135 140
Ala Leu Val His Leu Thr Ala Leu Ala Ser His Arg Gly Ile Ala Gln
145 150 155 160
Pro Ala Ala Glu Leu Val Glu Ala Cys His Asn Ala Gly Ile Pro Val
165 170 175
Val Ile Asp Ala Ala Gln Ala Leu Gly His Leu Asp Cys Asn Val Gly
180 185 190
Ala Asp Ala Val Tyr Ser Ser Ser Arg Lys Trp Leu Ala Gly Pro Arg
195 200 205
Gly Val Gly Val Leu Ala Val Arg Pro Glu Leu Ala Glu Arg Leu Gln
210 215 220
Pro Arg Ile Pro Pro Ser Asp Trp Pro Ile Pro Met Ser Val Leu Glu
225 230 235 240
Lys Leu Glu Leu Gly Glu His Asn Ala Ala Ala Arg Val Gly Phe Ser
245 250 255
Val Ala Val Gly Glu His Leu Ala Ala Gly Pro Thr Ala Val Arg Glu
260 265 270
Arg Leu Ala Glu Val Gly Arg Leu Ser Arg Gln Val Leu Ala Glu Val
275 280 285
Asp Gly Trp Arg Val Val Glu Pro Val Asp Gln Pro Thr Ala Ile Thr
290 295 300
Thr Leu Glu Ser Thr Asp Gly Ala Asp Pro Ala Ser Val Arg Ser Trp
305 310 315 320
Leu Ile Ala Glu Arg Gly Ile Val Thr Thr Ala Cys Glu Leu Ala Arg
325 330 335
Ala Pro Phe Glu Met Arg Thr Pro Val Leu Arg Ile Ser Pro His Val
340 345 350
Asp Val Thr Val Asp Glu Leu Glu Gln Phe Ala Ala Ala Leu Arg Glu
355 360 365
Ala Pro
370

<210> 9
<211> 49
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 9

agaattcaaa agatctaaag gaggccatcc atgatcgcac gcgagacac 49

<210> 10
<211> 53
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 10

actcgagttt ggatcctcag acgtcccagg ccaggcggac acccgagaat atc 53

<210> 11
<211> 50
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 11

agaattcaaa agatctaaag gaggccatcc atgtgccggc atgtggcgtg 50

<210> 12
<211> 34
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 12

actcgagttt ggatcctcac aggggtgtca cgac 34

<210> 13
<211> 51
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 13

agaattcaaa agatctaaag gaggccatcc atgacgctct cactggccaa c 51

<210> 14
<211> 35
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 14

actcgagttt ggatcctcac cgcaccgcca gcgac 35

<210> 15
<211> 47
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 15

agaattcaaa agatctaaag gaggccatcc atgctcgcgc agcagtg 47

<210> 16
<211> 35
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 16

actcgagttt ggatcctcag ggcgcctcac gcaac 35
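
The listing itself does not state how SEQ ID NOs: 9-16 are used. If they are read as forward and reverse amplification primers for SEQ ID NOs: 1, 3, 5, and 7 (an assumption made here for illustration), then the 3' end of each forward primer should match the start of its coding sequence, and the 3' end of each reverse primer should be the reverse complement of the end of that coding sequence. The Python sketch below checks this for SEQ ID NOs: 11 and 12 against the first and last lines of SEQ ID NO:3. The function names are hypothetical and the sketch is not part of the filed listing.

```python
# Illustrative check only; not part of the filed sequence listing.
# Assumption: SEQ ID NOs: 11 and 12 are a forward/reverse primer pair
# for SEQ ID NO:3. All helper names are hypothetical.

COMPLEMENT = str.maketrans("acgt", "tgca")


def clean(seq: str) -> str:
    """Remove the grouping spaces used in the listing and lowercase."""
    return "".join(seq.split()).lower()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def forward_overlap(primer: str, gene_start: str) -> int:
    """Length of the longest primer 3' suffix equal to a prefix of the gene."""
    for n in range(min(len(primer), len(gene_start)), 0, -1):
        if primer[-n:] == gene_start[:n]:
            return n
    return 0


def reverse_overlap(primer: str, gene_end: str) -> int:
    """Length of the longest gene 3' suffix matched by the reverse
    complement of the primer's 3' end."""
    rc = reverse_complement(primer)
    for n in range(min(len(rc), len(gene_end)), 0, -1):
        if rc[:n] == gene_end[-n:]:
            return n
    return 0


# Fragments copied from the listing above.
SEQ3_START = clean("atgtgccggc atgtggcgtg")        # first 20 nt of SEQ ID NO:3
SEQ3_END = clean("cacgtcgtcg tgacacccct gtga")     # last 24 nt of SEQ ID NO:3
SEQ11 = clean("agaattcaaa agatctaaag gaggccatcc atgtgccggc atgtggcgtg")
SEQ12 = clean("actcgagttt ggatcctcac aggggtgtca cgac")

print(forward_overlap(SEQ11, SEQ3_START), reverse_overlap(SEQ12, SEQ3_END))
```

For these fragments the sketch reports a 20-nucleotide match for SEQ ID NO:11 and an 18-nucleotide match for SEQ ID NO:12. The remaining 5' bases of each primer do not match the coding sequence. The forward overlap is capped by the 20-nucleotide fragment supplied here, so checking against the full SEQ ID NO:3 would be needed to confirm the exact annealing length.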

Claims

1. An engineered host cell for the production of ergothioneine, wherein the engineered host cell is transformed with a nucleic acid sequence encoding the amino acid sequence of SEQ ID NO:2, the amino acid sequence of SEQ ID NO:4, the amino acid sequence of SEQ ID NO:6, and the amino acid sequence of SEQ ID NO:8, wherein the host cell is Escherichia coli, Saccharomyces cerevisiae, or Pichia pastoris.

2. The engineered host cell of claim 1, wherein the host cell is an Escherichia coli cell.

3. The engineered host cell of claim 1, wherein the host cell is selected from Saccharomyces cerevisiae cells and Pichia pastoris cells.

4. A method for producing ergothioneine, the method comprising:

culturing the host cell according to any one of claims 1-3;

inducing the host cell to express nucleic acid sequences encoding EgtB, EgtC, EgtD, and EgtE; and

collecting the ergothioneine.

5. The method of claim 4, wherein during the culture of host cells, a substrate selected from histidine, methionine, cysteine, γ-glutamylcysteine, and combinations thereof is added to the culture.

6. The method of claim 4, wherein iron(II) is added to the culture during the culturing of the host cell.

7. The method of claim 4, wherein the host cell is an Escherichia coli cell.

8. The method of claim 4, wherein the host cell is selected from Saccharomyces cerevisiae cells and Pichia pastoris cells.

9. An expression vector for the production of ergothioneine, comprising a nucleic acid sequence encoding an amino acid sequence selected from EgtB, EgtC, EgtD, and EgtE, wherein the nucleic acid sequence encodes an amino acid sequence selected from the following amino acid sequences: the amino acid sequence of SEQ ID NO:2; the amino acid sequence of SEQ ID NO:4; the amino acid sequence of SEQ ID NO:6; and the amino acid sequence of SEQ ID NO:8.