WO2019136585A1

WO2019136585A1 - Regulation of csr system for production of lysine and lysine-derived products

Info

Publication number: WO2019136585A1
Application number: PCT/CN2018/071859
Authority: WO
Inventors: Howard Chou; Ling Chen; Yunfeng LEI; Xiucai LIU
Original assignee: Cathay R & D Center Co., Ltd.; Cibt America Inc.
Priority date: 2018-01-09
Filing date: 2018-01-09
Publication date: 2019-07-18
Also published as: CN111770993A; CN111770993B; EP3737752A1; EP3737752A4; US11485986B2; US20210054423A1

Abstract

The disclosure relates to microorganisms genetically modified to overexpress biofilm dispersal related polypeptides to enhance the production of lysine and lysine derivatives by the microorganism, methods of generating such microorganisms, and methods of producing lysine and lysine derivatives using the genetically modified microorganisms.

Description

[Title established by the ISA under Rule 37.2] REGULATION OF CSR SYSTEM FOR PRODUCTION OF LYSINE AND LYSINE-DERIVED PRODUCTS

BACKGROUND OF THE INVENTION

The ability of a molecule to move into and out of a cell can have a significant effect on the intracellular concentration of the molecule. For example, if the molecule is a nutrient, then slowing the movement of the molecule into the cell would inhibit growth (Herbert, D. & H.L. Kornberg, Biochem. J. 156(2): 477-480, 1976). If the molecule is a toxin, then slowing the movement of the molecule out of the cell would inhibit growth. If the molecule is a substrate in a reaction, then slowing the movement of the molecule into the cell would slow down the rate of the reaction. If the molecule is an intermediate in a series of reactions, then slowing the movement of the molecule out of the cell and allowing it to accumulate inside the cell could lead to feedback inhibition (Kikuchi et al., FEMS Microbiology Letters 173: 211-215, 1999; Ogawa-Miyata et al., Biosci. Biotechnol. Biochem. 65: 1149-1154, 2001).

Previously, it was discovered that a phosphodiesterase protein that increases biofilm dispersal by reducing the intracellular concentration of bis-(3’-5’)-cyclic diguanosine-monophosphate (c-di-GMP) affects the production of an amino acid, e.g., lysine, and its derived products, such as cadaverine (PCT/CN2016/095281). Although various genes have been shown to hydrolyze c-di-GMP and increase biofilm dispersal activity (e.g., bdcA or yahA from E. coli; rapA, fleN, rocR, or bifA from P. aeruginosa; vieA or mbaA from V. cholerae; and rmdAB from S. coelicolor), any effects of increasing biofilm dispersal activity by reducing intracellular c-di-GMP concentrations on the production of amino acids or their derivatives were unknown. PCT/CN2016/095281 demonstrates that a genetically modified microorganism in which a biofilm dispersal polypeptide is overexpressed, relative to a counterpart microorganism of the same strain that does not comprise the genetic modification, showed increased production of lysine or a lysine-derived compound such as cadaverine.

Another set of genes that affects biofilm formation has also been identified. The carbon storage regulator (Csr) CsrA is a global regulatory protein that has been shown to repress biofilm formation and increase biofilm dispersal in Escherichia coli (Jackson et al., J. Bacteriol. 184: 290-301, 2002). It was shown that disruption of the csrA gene increased biofilm formation, and overexpression of csrA from a plasmid inhibited biofilm formation in E. coli. It has also been shown that CsrA is an mRNA binding protein that is part of the regulatory pathway affecting glycogen biosynthesis, catabolism, and gluconeogenesis (Romeo et al., J. Bacteriol. 175: 4744-4755, 1993). CsrA activity is inhibited by the binding of the sRNA CsrB, a non-coding RNA that consists of 18 imperfect repeats (5’-CAGGA(U,C,A)G-3’) that form hairpin structures (Romeo et al., Mol. Microbiol. 29: 1321-1330, 1998). Therefore, part of the mechanism of inhibition is that 18 CsrA proteins bind to one CsrB molecule at the hairpin structures.

There are two additional proteins/sRNAs that are part of the Csr system. While CsrA regulates mRNA stability and plays a role at both the transcriptional and post-transcriptional levels, the CsrD protein acts as a signaling protein that leads to the positive transcriptional regulation of genes affected by the Csr regulatory system (Esquerre et al., Scientific Reports 6: 25057, 2016). The Csr system is additionally regulated by the sRNA CsrC, which can also inhibit the activity of CsrA (Weilbacher et al., Mol. Microbiol. 48: 657-670, 2003). It has been shown that both CsrB and CsrC are also upregulated during nutrient-poor conditions (Jonas et al., FEMS Microbiol. Lett. 297: 80-86).

The Csr system has previously been manipulated in order to increase the production of amino acids. For example, it was shown that increasing CsrA production by reducing the amount of CsrB can increase threonine production (WO 2003/046184). It was subsequently shown (EP 2050816 and US 2009/0258399) that deletion of csrB and csrC can increase amino acid production, specifically, that of arginine. EP 2055771 indicates that attenuation of csrB can increase amino acid production, and US Patent No. 8759042 indicates that deletion of csrC can increase arginine production. Interestingly, it has also been shown that increased expression of csrB can increase phenylalanine production (Yakandawala et al., Appl. Microbiol. Biotechnol. 78: 283-291, 2008) and tryptophan production (Lu et al., Chin. J. Appl. Environ. Biol. 21: 647-651, 2015). Thus, these publications demonstrate that increasing or decreasing CsrB or CsrC can either increase or decrease the production of an amino acid. Accordingly, one of skill would not be able to determine whether a specific amino acid different from those previously evaluated would be decreased or increased when CsrB or CsrC levels are manipulated in a cell.

BRIEF SUMMARY OF ASPECTS OF THE DISCLOSURE

The present disclosure is based, in part, on the surprising discovery that increasing the production of CsrA in E. coli does not increase lysine production; but instead, reduces lysine production; and that cells overexpressing either csrB or csrC showed increased lysine production.

Thus, in one aspect, provided herein is a genetically modified host cell comprising an exogenous nucleic acid encoding a CsrB sRNA or a CsrC sRNA, wherein the host cell overexpresses CsrB or CsrC relative to a counterpart host cell that has not been modified to express the exogenous nucleic acid; and has at least one additional genetic modification to increase production of lysine or a lysine derivative compared to a wildtype host cell. In some embodiments, the amino acid derivative is cadaverine. In some embodiments, the CsrB sRNA comprises a nucleotide sequence having at least 85% identity, or at least 90% identity, or at least 95% identity, to SEQ ID NO: 16. In some embodiments, the CsrC sRNA comprises a nucleotide sequence having at least 85% identity, or at least 90% identity, or at least 95% identity, to SEQ ID NO: 17. In particular embodiments, the CsrB sRNA comprises the nucleic acid sequence of SEQ ID NO: 16. In other embodiments, the CsrC sRNA comprises the nucleic acid sequence of SEQ ID NO: 17. In some embodiments, the CsrB or CsrC sRNA is heterologous to the host cell. In some embodiments, the exogenous nucleic acid encoding the CsrB or CsrC sRNA is encoded by an expression vector introduced into the cell, wherein the expression vector comprises the exogenous nucleic acid operably linked to a promoter. In other embodiments, the exogenous nucleic acid is integrated into the host chromosome. In some embodiments, the host cell overexpresses a lysine decarboxylase. In further embodiments, the host cell overexpresses one or more lysine biosynthesis polypeptides, such as an aspartate kinase, a dihydrodipicolinate synthase, a diaminopimelate decarboxylase, an aspartate semialdehyde dehydrogenase, a dihydrodipicolinate reductase, or an aspartate transaminase.
In particular embodiments, the aspartate kinase, dihydrodipicolinate synthase, diaminopimelate decarboxylase, aspartate semialdehyde dehydrogenase, dihydrodipicolinate reductase, or aspartate transaminase is a LysC, DapA, LysA, Asd, DapB, or AspC polypeptide. In certain embodiments, the host cell overexpresses a CadA, LysC, DapA, LysA, Asd, DapB, and AspC polypeptide. In some embodiments, the host cell is of the genus Escherichia, Hafnia, or Corynebacterium. In particular embodiments, the host cell is Escherichia coli, Hafnia alvei, or Corynebacterium glutamicum.

In an additional aspect, provided herein is a method of producing lysine or a lysine derivative, e.g., cadaverine, the method comprising culturing a host cell as described herein, e.g., in the preceding paragraph under conditions in which the CsrB sRNA or CsrC sRNA is overexpressed.

In a further aspect, provided herein is a method of engineering a host cell to increase production of lysine or a lysine derivative, the method comprising introducing an exogenous nucleic acid encoding a CsrB sRNA or CsrC sRNA into the host cell, wherein the host cell has at least one additional genetic modification to increase production of lysine or a lysine derivative compared to a wildtype host cell; culturing the host cell under conditions in which the CsrB or CsrC sRNA is expressed, and selecting a host cell that exhibits increased production of lysine or a lysine derivative relative to a counterpart control host cell that has not been modified to express the exogenous nucleic acid. In some embodiments, the amino acid derivative is cadaverine. In some embodiments, the CsrB sRNA comprises a nucleotide sequence having at least 85% identity, or at least 90% identity, or at least 95% identity, to SEQ ID NO: 16. In some embodiments, the CsrC sRNA comprises a nucleotide sequence having at least 85% identity, or at least 90% identity, or at least 95% identity, to SEQ ID NO: 17. In further embodiments, the CsrB sRNA comprises the nucleic acid sequence of SEQ ID NO: 16. In other embodiments, the CsrC sRNA comprises the nucleic acid sequence of SEQ ID NO: 17. In some embodiments, the CsrB sRNA or CsrC sRNA is heterologous to the host cell. In some embodiments, the exogenous nucleic acid encoding the CsrB or CsrC sRNA is encoded by an expression vector introduced into the cell, wherein the expression vector comprises the exogenous nucleic acid operably linked to a promoter. In other embodiments, the exogenous nucleic acid is integrated into the host chromosome. In some embodiments, the host cell overexpresses a lysine decarboxylase.
In some embodiments, the host cell overexpresses one or more lysine biosynthesis polypeptides, such as an aspartate kinase, a dihydrodipicolinate synthase, a diaminopimelate decarboxylase, an aspartate semialdehyde dehydrogenase, a dihydrodipicolinate reductase, or an aspartate transaminase. In some embodiments, the aspartate kinase, dihydrodipicolinate synthase, diaminopimelate decarboxylase, aspartate semialdehyde dehydrogenase, dihydrodipicolinate reductase, or aspartate transaminase is a LysC, DapA, LysA, Asd, DapB, or AspC polypeptide. In some embodiments, the host cell overexpresses a CadA, LysC, DapA, LysA, Asd, DapB, and AspC polypeptide. In some embodiments, the host cell is of the genus Escherichia, Hafnia, or Corynebacterium. In certain embodiments, the host cell is Escherichia coli, Hafnia alvei, or Corynebacterium glutamicum.

DETAILED DESCRIPTION OF THE INVENTION

Terminology

As used herein, “CsrB” refers to a small regulatory RNA (sRNA) that comprises imperfect repeats that form hairpin structures and binds CsrA. “CsrB” includes E. coli CsrB and homologs of CsrB from other bacteria, such as members of the Enterobacteriaceae family. E. coli CsrB is about 360 nucleotides in length and contains 18 imperfect repeats (5’-CAGGA(U,C,A)G-3’) (Romeo et al., Mol. Microbiol. 29: 1321-1330, 1998). CsrA binds to CsrB at the hairpin structures such that one molecule of CsrA binds to each hairpin structure. Thus, CsrB sequesters CsrA and reduces CsrA activity. CsrA/CsrB sequences have been reported in other Enterobacteriaceae, such as Salmonella, Shigella, and Yersinia. In Salmonella, CsrB has been shown to have 16 predicted stem-loops, each carrying the consensus sequence GWGGRHG (Altier et al., Mol. Microbiol. 35: 635-646, 2000), where “W” is A or U; “R” is A or G; and “H” is A, C, or U. An illustrative E. coli CsrB DNA sequence is provided in SEQ ID NO: 16. Additional CsrB sequences include those encoded by chromosomal region CP015574.1 of a Salmonella enterica subsp; chromosomal region CP024470.1 of Shigella flexneri; chromosomal region CP023645.1 of Shigella sonnei; and chromosomal region LT556085.1 of Citrobacter sp.
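The repeat motifs quoted above lend themselves to a simple sequence scan. The following is a minimal sketch, not part of the disclosure, that expands the degenerate codes named in the text (W, R, H) into a regular expression and reports where each motif occurs. The function names and the toy sequence are hypothetical; a real analysis would run on SEQ ID NO: 16 or a homolog.

```python
import re
from typing import List

# Degenerate codes as defined in the text: W = A or U; R = A or G; H = A, C, or U.
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "W": "[AU]", "R": "[AG]", "H": "[ACU]",
}

def motif_to_regex(motif: str):
    """Compile a degenerate RNA motif into a regex that also finds overlapping hits."""
    body = "".join(IUPAC_RNA[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")

def find_motif(sequence: str, motif: str) -> List[int]:
    """Return 0-based start positions of every motif occurrence.

    DNA input is accepted: T is read as U before scanning.
    """
    rna = sequence.upper().replace("T", "U")
    return [m.start() for m in motif_to_regex(motif).finditer(rna)]

ECOLI_REPEAT = "CAGGAHG"        # 5'-CAGGA(U,C,A)G-3'
SALMONELLA_REPEAT = "GWGGRHG"   # consensus quoted for Salmonella CsrB

# Hypothetical toy sequence, for illustration only.
toy = "AACAGGAUGCCCAGGACGTTGAGGAAG"
print(find_motif(toy, ECOLI_REPEAT))
print(find_motif(toy, SALMONELLA_REPEAT))
```

Counting the hits in a candidate homolog gives a quick, rough indication of how many CsrA-binding repeats it may carry; it does not by itself establish hairpin formation or CsrA binding.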

As used herein, “CsrC” refers to an sRNA that comprises imperfect repeats similar to those contained in CsrB that form hairpin structures and binds CsrA, thus sequestering CsrA. The term includes E. coli CsrC and homologs of CsrC from other bacteria, such as members of the Enterobacteriaceae family. E. coli CsrC is about 245 nucleotides in length and contains 9 such repeats (Weilbacher et al., Mol. Microbiol. 48: 657-670, 2003). In Salmonella, CsrC has been shown to have 8 predicted stem-loop structures (Fortune et al., Infect. Immun. 74: 1331-1339, 2006). An illustrative E. coli CsrC DNA sequence is provided in SEQ ID NO: 17. Additional CsrC sequences include those encoded by chromosomal region CP023645.1 of Shigella sonnei; chromosomal region CP024470.1 of Shigella flexneri; CP023504.1 of Citrobacter werkmanii; and chromosomal region CP018661.1 of Salmonella enterica subsp.

The terms "increased expression" and "overexpression” of a CsrB or CsrC sRNA are used interchangeably herein to refer to an increase in the amount of CsrB or CsrC sRNA in a genetically modified cell, e.g., a cell into which an expression construct encoding a CsrB or CsrC sRNA has been introduced, compared to the amount of CsrB or CsrC sRNA in a counterpart cell that does not have the genetic modification, i.e., a cell of the same strain without the modification. An increased level of expression for purposes of this application is typically at least 5%, or at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, or greater, compared to the counterpart unmodified cell. The unmodified cell need not express the CsrB or CsrC sRNA. Thus, the term “overexpression” also includes embodiments in which a CsrB or CsrC sRNA is expressed in a host cell that does not natively express the CsrB or CsrC sRNA. Increased expression of a CsrB or CsrC sRNA can be assessed by any number of assays, including, but not limited to, measuring the level of RNA and/or the level of CsrB or CsrC sRNA activity, e.g., by measuring CsrA binding activity directly or by assessing an activity modulated by CsrB or CsrC sRNA.
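The percentage thresholds in this definition amount to a simple relative-increase calculation. The sketch below is illustrative only: the function names are hypothetical, the inputs stand for any sRNA abundance measurement in arbitrary but matching units, and the handling of a counterpart cell that expresses no CsrB or CsrC at all (treated as an unbounded increase) is an assumption made to mirror the statement that the unmodified cell need not express the sRNA.

```python
def percent_increase(modified: float, unmodified: float) -> float:
    """Relative increase of the modified cell's sRNA level over the counterpart cell."""
    if modified < 0 or unmodified < 0:
        raise ValueError("expression levels must be non-negative")
    if unmodified == 0:
        # Counterpart cell does not express the sRNA: any expression counts.
        return float("inf") if modified > 0 else 0.0
    return 100.0 * (modified - unmodified) / unmodified

def is_overexpressed(modified: float, unmodified: float,
                     threshold_pct: float = 5.0) -> bool:
    """True when the increase meets the chosen threshold (5% is the lowest one listed)."""
    return percent_increase(modified, unmodified) >= threshold_pct

print(percent_increase(150.0, 100.0))
print(is_overexpressed(105.0, 100.0))
print(is_overexpressed(2.0, 0.0))
```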

The term “enhanced” in the context of the production of lysine, or a lysine derivative such as cadaverine, as used herein refers to an increase in the production of the amino acid, e.g., lysine, or the derivative, by a genetically modified host cell in comparison to a control counterpart cell, such as a cell of the wildtype strain or a cell of the same strain that does not have the genetic modification to increase production of lysine or the lysine derivative. Production of the amino acid or its derivative is typically enhanced by at least 5%, or at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, or greater compared to the control cell.

The terms "numbered with reference to" , or "corresponding to, " or “determined with reference to” when used in the context of the numbering of a given amino acid or polynucleotide sequence, refers to the numbering of the residues of a specified reference sequence when the given amino acid or polynucleotide sequence is compared to the reference sequence. For example, a nucleotide in a CsrB or CsrC sRNA sequence variant “corresponds to” a nucleotide position in SEQ ID NO: 16 when the residue aligns with the nucleotide in a comparison of SEQ ID NO: 16 and variant in a maximal alignment.
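As an illustration of how a position in a variant can be "numbered with reference to" a reference sequence, the sketch below walks a pairwise alignment column by column and records which variant position aligns with which reference position. It is a minimal example under stated assumptions: the alignment is supplied already gapped with "-" characters, the function name is hypothetical, and the short sequences are invented rather than taken from SEQ ID NO: 16.

```python
from typing import Dict

def map_to_reference(aligned_ref: str, aligned_var: str) -> Dict[int, int]:
    """Map 1-based variant positions to the 1-based reference positions they align with.

    Variant residues that fall opposite a gap in the reference have no
    corresponding reference position and are omitted from the mapping.
    """
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    mapping: Dict[int, int] = {}
    ref_pos = var_pos = 0
    for ref_base, var_base in zip(aligned_ref, aligned_var):
        if ref_base != "-":
            ref_pos += 1
        if var_base != "-":
            var_pos += 1
        if ref_base != "-" and var_base != "-":
            mapping[var_pos] = ref_pos
    return mapping

# Hypothetical gapped alignment (reference on top, variant below).
print(map_to_reference("ACG-UCA", "A-GGUCA"))
```

In this toy alignment the second variant nucleotide corresponds to reference position 3, and the third variant nucleotide (an insertion) corresponds to no reference position.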

The terms "polynucleotide" and "nucleic acid" as used herein in the context of expression vectors and a sequence encoding a CsrB or CsrC sRNA are used interchangeably and refer to a single or double-stranded polymer of deoxyribonucleotide or ribonucleotide bases read from the 5' to the 3' end. A nucleic acid as used in the present invention will generally contain phosphodiester bonds, although in some cases, nucleic acid analogs may be used that may have alternate backbones, comprising, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); positive backbones; non-ionic backbones, and non-ribose backbones. Nucleic acids or polynucleotides may also include modified nucleotides that permit correct read-through by a polymerase. "Polynucleotide sequence" or "nucleic acid sequence" includes both the sense and antisense strands of a nucleic acid as either individual single strands or in a duplex. As will be appreciated by those in the art, the depiction of a single strand also defines the sequence of the complementary strand; thus the sequences described herein also provide the complement of the sequence. Unless otherwise indicated, a particular nucleic acid sequence, e.g., that encodes a polypeptide, also implicitly encompasses variants such as degenerate codon substitutions and complementary sequences, as well as the sequence explicitly indicated. The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribonucleotides, and combinations of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, etc. Nucleic acid sequences are presented in the 5’ to 3’ direction unless otherwise specified.
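The statement that a depicted single strand also defines its complementary strand can be made concrete with a short sketch. The helper below is a hypothetical illustration that handles only the four standard DNA bases; it returns the complementary strand written, like all sequences here, in the 5' to 3' direction.

```python
# Translation table for the four standard DNA bases (upper and lower case).
DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(dna: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return dna.translate(DNA_COMPLEMENT)[::-1]

print(reverse_complement("ATGCC"))
```

Applying the function twice returns the original strand, which is a convenient sanity check when handling both strands of a construct.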

The term "substantially identical" used in the context of two nucleic acids or polypeptides, refers to a sequence that has at least 60%, 65%, or 70% sequence identity with a reference sequence. Percent identity can be any integer from 60% to 100%. Some embodiments include at least: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%, compared to a reference sequence using the programs described herein; preferably BLAST using standard default parameters, as described below.

Two nucleic acid sequences or polypeptide sequences are said to be "identical" if the sequence of nucleotides or amino acid residues, respectively, in the two sequences is the same when aligned for maximum correspondence as described below. The terms "identical" or percent "identity, " in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence over a comparison window, as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. When percentage of sequence identity is used in reference to proteins or peptides, it is recognized that residue positions that are not identical often differ by conservative amino acid substitutions, where amino acids residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well known to those of skill in the art. Typically, this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1.
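The partial-credit scoring described above can be sketched as follows. This is an illustration under stated assumptions, not a method prescribed by the disclosure: the grouping of amino acids into "similar" classes, the 0.5 score for a conservative substitution, and the function names are all hypothetical choices, and the input is assumed to be already aligned (gaps written as "-").

```python
from typing import Tuple

# Hypothetical grouping of amino acids with similar chemical properties.
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FWY"),   # aromatic
    set("KRH"),   # positively charged
    set("DE"),    # negatively charged
    set("ST"),    # small hydroxyl
    set("NQ"),    # amide
]

def is_conservative(x: str, y: str) -> bool:
    """True when two different residues fall in the same property group."""
    return any(x in group and y in group for group in CONSERVATIVE_GROUPS)

def identity_scores(aligned_a: str, aligned_b: str,
                    conservative_score: float = 0.5) -> Tuple[float, float]:
    """Return (strict percent identity, adjusted percent identity).

    Identical residues score 1, non-conservative substitutions and gap
    columns score 0, and conservative substitutions score a value in between.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    strict = adjusted = 0.0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap column contributes nothing
        if x == y:
            strict += 1.0
            adjusted += 1.0
        elif is_conservative(x, y):
            adjusted += conservative_score
    n = len(aligned_a)
    return 100.0 * strict / n, 100.0 * adjusted / n

print(identity_scores("MKLDE", "MRLDQ"))
```

In the toy pair, the K-to-R change earns partial credit while the E-to-Q change does not, so the adjusted value is higher than the strict one.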

For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.

An algorithm that may be used to determine whether a CsrB or CsrC sRNA has sequence identity to SEQ ID NO: 16 or 17, or another polynucleotide reference sequence, is the BLAST algorithm, which is described in Altschul et al., 1990, J. Mol. Biol. 215: 403-410, which is incorporated herein by reference. Software for performing BLAST and BLAST2 analyses is publicly available through the National Center for Biotechnology Information (on the worldwide web at ncbi.nlm.nih.gov/).

A "comparison window," as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150, in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2: 482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48: 443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85: 2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and visual inspection.
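For readers unfamiliar with the alignment step, the following is a minimal sketch of a Needleman-Wunsch style global alignment followed by a percent identity calculation over the aligned columns. It is illustrative only: the scoring values (match +1, mismatch -1, linear gap -1) and the function names are assumptions chosen for brevity, and they are not the parameters of GAP, BESTFIT, or BLAST.

```python
from typing import Tuple

def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> Tuple[str, str]:
    """Globally align two sequences and return them with '-' inserted for gaps."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical, non-gap columns as a percentage of all alignment columns."""
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)

aln_a, aln_b = needleman_wunsch("ACGT", "ACT")
print(aln_a, aln_b, percent_identity(aln_a, aln_b))
```

The denominator used for percent identity (all alignment columns here) is itself a convention; tools differ on this point, so values from this sketch should not be expected to match BLAST output exactly.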

The term "promoter," as used herein, refers to a polynucleotide sequence capable of driving transcription of a DNA sequence in a cell. Thus, promoters used in the polynucleotide constructs of the invention include cis- and trans-acting transcriptional control elements and regulatory sequences that are involved in regulating or modulating the timing and/or rate of transcription of a gene. For example, a promoter can be a cis-acting transcriptional control element, including an enhancer, a repressor binding sequence and the like. These cis-acting sequences typically interact with proteins or other biomolecules to carry out (turn on/off, regulate, modulate, etc.) gene transcription. Most often the core promoter sequences lie within 1-2 kb of the translation start site, more often within 1 kb and often within 500 bp or 200 bp or fewer, of the translation start site. By convention, promoter sequences are usually provided as the sequence on the coding strand of the gene it controls. In the context of this application, a promoter is typically referred to by the name of the gene for which it naturally regulates expression. A promoter used in an expression construct of the invention is thus referred to by the name of the gene. Reference to a promoter by name includes a wild type, native promoter as well as variants of the promoter that retain the ability to induce expression. Reference to a promoter by name is not restricted to a particular species, but also encompasses a promoter from a corresponding gene in other species.

A "constitutive promoter" in the context of this invention refers to a promoter that is capable of initiating transcription under most conditions in a cell, e.g., in the absence of an inducing molecule. An “inducible promoter” initiates transcription in the presence of an inducer molecule.

As used herein, a polynucleotide is "heterologous" to an organism or a second polynucleotide sequence if it originates from a foreign species, or, if from the same species, is modified from its original form. For example, when a polynucleotide encoding a CsrB or CsrC sRNA sequence is said to be operably linked to a heterologous promoter, it means that the polynucleotide coding sequence encoding the CsrB or CsrC is derived from one species whereas the promoter sequence is derived from another, different species; or, if both are derived from the same species, the coding sequence is not naturally associated with the promoter (e.g., is a genetically engineered coding sequence, e.g., from a different gene in the same species, or an allele from a different species). Similarly, a CsrB or CsrC sRNA, or a polynucleotide encoding a CsrB or CsrC sRNA, is “heterologous” to a host cell if the native wildtype host cell does not produce the CsrB or CsrC sRNA; and a CsrB sRNA or CsrC sRNA variant is “heterologous” to a host cell if the nucleotide sequence differs from a CsrB or CsrC polynucleotide sequence that is native to the host cell.

The term "exogenous" as used herein refers generally to a polynucleotide sequence or polypeptide that is introduced into a host cell by molecular biological techniques to produce a recombinant cell. Examples of "exogenous" polynucleotides include vectors, plasmids, and/or man-made nucleic acid constructs encoding a desired protein. An “exogenous” polypeptide or polynucleotide expressed in the host cell may occur naturally in the wildtype host cell or may be heterologous to the host cell. The term also encompasses progeny of the original host cell that has been engineered to express the exogenous polynucleotide or polypeptide sequence, i.e., a host cell that expresses an “exogenous” polynucleotide may be the original genetically modified host cell or a progeny cell that comprises the genetic modification.

The term "endogenous" refers to naturally-occurring polynucleotide sequences or polypeptides that may be found in a given wild-type cell or organism. In this regard, it is also noted that even though an organism may comprise an endogenous copy of a given polynucleotide sequence or gene, the introduction of an expression construct or vector encoding that sequence, such as to over-express or otherwise regulate the expression of the encoded protein, represents an "exogenous" copy of that gene or polynucleotide sequence. Any of the pathways, genes, RNAs, or enzymes described herein may utilize or rely on an "endogenous" sequence, which may be provided as one or more "exogenous" polynucleotide sequences, or both.

"Recombinant nucleic acid" or "recombinant polynucleotide" as used herein refers to a genetically engineered polymer of nucleic acids wherein at least one of the following is true: (a) the sequence of nucleic acids is foreign to (i.e., not naturally found in) a given host cell; (b) the sequence may be naturally found in a given host cell, but in an unnatural (e.g., greater than expected) amount; or (c) the sequence of nucleic acids comprises two or more subsequences that are not found in the same relationship to each other in nature. For example, regarding instance (c) , a recombinant nucleic acid sequence will have two or more sequences from unrelated genes arranged to make a new functional nucleic acid. A “recombinant” nucleic acid refers to the original polynucleotide that is manipulated as well as copies of the polynucleotide.

The term "operably linked" refers to a functional relationship between two or more polynucleotide (e.g., DNA) segments. Typically, it refers to the functional relationship of a transcriptional regulatory sequence to a transcribed sequence. For example, a promoter or enhancer sequence is operably linked to a DNA or RNA sequence if it stimulates or modulates the transcription of the DNA or RNA sequence in an appropriate host cell or other expression system. Generally, promoter transcriptional regulatory sequences that are operably linked to a transcribed sequence are physically contiguous to the transcribed sequence, i.e., they are cis-acting. However, some transcriptional regulatory sequences, such as enhancers, need not be physically contiguous or located in close proximity to the coding sequences whose transcription they enhance.

The term "expression cassette" or “DNA construct” or “expression construct” refers to a nucleic acid construct that, when introduced into a host cell, results in transcription and/or translation of an RNA or polypeptide, respectively. In the case of expression of transgenes, one of skill will recognize that the inserted polynucleotide sequence need not be identical, but may be only substantially identical to a sequence of the gene from which it was derived. As explained herein, these substantially identical variants are specifically covered by reference to a specific nucleic acid sequence. One example of an expression cassette is a polynucleotide construct that comprises a polynucleotide sequence encoding a polypeptide for use in the invention operably linked to a promoter, e.g., its native promoter, where the expression cassette is introduced into a heterologous microorganism. In some embodiments, an expression cassette comprises a polynucleotide sequence encoding a polypeptide of the invention where the polynucleotide is targeted to a position in the genome of a microorganism such that expression of the polynucleotide sequence is driven by a promoter that is present in the microorganism.

The term "host cell" as used in the context of this invention refers to a microorganism and includes an individual cell or cell culture that can be or has been a recipient of any recombinant vector(s) or isolated polynucleotide(s) of the invention. Host cells include progeny of a single host cell, and the progeny may not necessarily be completely identical (in morphology or in total DNA complement) to the original parent cell due to natural, accidental, or deliberate mutation and/or change. A host cell includes cells into which a recombinant vector or a polynucleotide of the invention has been introduced, including by transformation, transfection, and the like.

The term "isolated" refers to a material that is substantially or essentially free from components that normally accompany it in its native state. For example, an "isolated polynucleotide," as used herein, may refer to a polynucleotide that has been isolated from the sequences that flank it in its naturally-occurring or genomic state, e.g., a DNA fragment that has been removed from the sequences that are normally adjacent to the fragment, such as by cloning into a vector. A polynucleotide is considered to be isolated if, for example, it is cloned into a vector that is not a part of the natural environment, or if it is artificially introduced in the genome of a cell in a manner that differs from its naturally-occurring state.

Aspects of the disclosure

The present disclosure is based, in part, on the discovery that increased expression of a CsrB or CsrC sRNA in a microorganism, such as a gram-negative bacterium, enhances lysine production and/or production of a derivative of lysine, such as cadaverine.

A host cell that is engineered in accordance with the invention to overexpress a CsrB or CsrC sRNA also overexpresses at least one enzyme involved in the synthesis of an amino acid or amino acid derivative, such as a lysine decarboxylase polypeptide; and/or an additional polypeptide that is involved in amino acid biosynthesis. Lysine decarboxylase and lysine biosynthesis polypeptides and nucleic acid sequences are available in the art.

The invention employs various routine recombinant nucleic acid techniques. Generally, the nomenclature and the laboratory procedures in recombinant DNA technology described below are commonly employed in the art. Many manuals that provide direction for performing recombinant DNA manipulations are available, e.g., Sambrook & Russell, Molecular Cloning, A Laboratory Manual (3rd Ed, 2001); and Current Protocols in Molecular Biology (Ausubel et al., John Wiley and Sons, New York, 2009-2017).

Polynucleotides encoding CsrB or CsrC sRNAs

Various polynucleotides have been shown to encode CsrB and CsrC sRNAs that bind to CsrA and reduce CsrA function. Polynucleotides that encode CsrB and CsrC sRNAs suitable for overexpressing in a host cell to increase production of lysine, or a derivative of lysine, include E. coli CsrB and CsrC polynucleotide sequences illustrated by SEQ ID NO: 16 and SEQ ID NO: 17, respectively.

In some embodiments, a host cell is genetically modified to overexpress a CsrB polynucleotide having at least 60%, or at least 70%, 75%, 80%, 85%, or at least 90% identity to SEQ ID NO: 16. Unless indicated otherwise, “SEQ ID NO: 16” refers to the DNA sequence shown in the listing of illustrative sequences and to its RNA counterpart in which uracil bases replace the thymine bases. Thus, when a CsrB RNA sequence is compared to SEQ ID NO: 16 for determining percent identity, it is understood that SEQ ID NO: 16 would contain “U” instead of “T”. CsrB polynucleotide variants of SEQ ID NO: 16 retain the ability to bind CsrA. In some embodiments, a CsrB polynucleotide has at least 85% identity; or at least 90%, or at least 95% identity to SEQ ID NO: 16. In some embodiments, the polynucleotide comprises the sequence of SEQ ID NO: 16. Additional illustrative CsrB sequences include those set forth in SEQ ID NOS: 18, 19, 20, 21, and 22 from Shigella, Triticum, Citrobacter, and Salmonella, which have 99% sequence identity (Shigella and Triticum), 90% sequence identity (Citrobacter) and 87% sequence identity (Salmonella) to SEQ ID NO: 16.

In some embodiments, a host cell is genetically modified to overexpress a CsrC polynucleotide having at least 60%, or at least 70%, 75%, 80%, 85%, or at least 90% identity to SEQ ID NO: 17. Unless indicated otherwise, “SEQ ID NO: 17” refers to the DNA sequence shown in the listing of illustrative sequences and to its RNA counterpart in which uracil bases replace the thymine bases. Thus, when a CsrC RNA sequence is compared to SEQ ID NO: 17 for determining percent identity, it is understood that SEQ ID NO: 17 would contain “U” instead of “T”. CsrC polynucleotide variants of SEQ ID NO: 17 retain the ability to bind CsrA. In some embodiments, a CsrC polynucleotide has at least 85% identity; or at least 90%, or at least 95% identity to SEQ ID NO: 17. In some embodiments, the polynucleotide comprises the sequence of SEQ ID NO: 17. Additional illustrative CsrC sequences include SEQ ID NOS: 23, 24, 25, and 26 from Shigella sonnei, Shigella flexneri, Citrobacter werkmanii, and Salmonella enterica, which have 100%, 99%, 89%, and 88% sequence identity to SEQ ID NO: 17, respectively.
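The T-to-U convention described above for comparing an sRNA with SEQ ID NO: 16 or SEQ ID NO: 17 can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the toy sequences, and the choice to count identical columns over the full length of a pre-aligned pair (with gaps counted as mismatches) are assumptions made here and are not taken from the disclosure.

```python
def dna_to_rna(seq: str) -> str:
    """Return the RNA counterpart of a DNA sequence (T replaced by U)."""
    return seq.upper().replace("T", "U")


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical columns in two pre-aligned, equal-length sequences.

    Gap characters ("-") are treated as mismatches (an assumption).
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and aligned to equal length")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy sequences only; these are NOT SEQ ID NO: 16 or SEQ ID NO: 17.
reference_dna = "GATTACAGGATGT"
candidate_rna = "GAUUACAGGAUGU"
variant_rna = "GAUCACAGGAUGU"  # one substitution relative to the reference

reference_rna = dna_to_rna(reference_dna)
identity_same = percent_identity(reference_rna, candidate_rna)
identity_variant = percent_identity(reference_rna, variant_rna)
print(f"{identity_same:.1f}% and {identity_variant:.1f}%")
```

In practice the two sequences would first be aligned with a dedicated alignment tool; the sketch covers only the base substitution and the counting step.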

In some embodiments, the CsrB or CsrC sRNA comprises at least 8, or at least 9, 10, 11, 12, 13, 14, 15, 16, 17, or 18; and typically fewer than 20 hairpins that bind CsrA. In some embodiments, the hairpin structure comprises the sequence GWGGRHG, in which “W” is A or U; R is A or G; and H is A, C, or U. In some embodiments, the hairpin structure comprises the sequence CAGGA (U, C, A) G.

In some embodiments, a host cell is genetically modified to overexpress a CsrB or CsrC sRNA from Salmonella, Citrobacter, or Shigella.

Activity of a wild-type or variant CsrB or CsrC sRNA can be assessed using any number of assays, including assays that evaluate the production of lysine or a lysine-derived compound. In some embodiments, lysine production or cadaverine production is measured. Illustrative assays are provided in the examples section. In some embodiments, cadaverine production is measured in E. coli modified to co-express LysC, DapA, LysA, Asd, DapB, AspC, and CadA and the CsrB or CsrC sRNA. The following is an illustrative assay that is used to assess production of lysine and/or cadaverine. E. coli are modified to express LysC, DapA, LysA, Asd, DapB, AspC, and CadA and the CsrB or CsrC sRNA. The genes may be individually introduced into E. coli, or introduced in one or more operons. For example, LysC, DapA, LysA, Asd, DapB, and AspC may be encoded by a synthetic operon present in one plasmid and CadA and a candidate variant may be encoded by a separate plasmid. Each plasmid has a unique antibiotic-resistance selectable marker. Antibiotic-resistant colonies are selected and cultured. For example, cultures are grown overnight at 37℃ in 3 mL of medium containing 4% glucose, 0.1% KH₂PO₄, 0.1% MgSO₄, 1.6% (NH₄)₂SO₄, 0.001% FeSO₄, 0.001% MnSO₄, 0.2% yeast extract, 0.05% L-methionine, 0.01% L-threonine, 0.005% L-isoleucine, and appropriate antibiotics for selection. The following day, each culture is inoculated into 50 mL of fresh medium with 30 g/L of glucose, 0.7% Ca(HCO₃)₂, antibiotic(s), and grown for 72 hours at 37℃, at which point the concentration of lysine is determined. Lysine or cadaverine can be quantified using NMR. Yield can be calculated by dividing the molar amount of lysine or cadaverine produced by the molar amount of glucose added. A CsrB or CsrC sRNA useful in this invention increases the yield of lysine or cadaverine. Alternatively, colonies are evaluated for increased production of another lysine derivative.
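The yield calculation described above (moles of lysine or cadaverine produced divided by moles of glucose added) can be sketched as follows. The molar masses are standard values; the function name and the example concentrations are hypothetical and are used only to show the arithmetic. Because both concentrations refer to the same culture volume, the volume cancels and the ratio can be computed from g/L values.

```python
# Standard molar masses in g/mol.
MOLAR_MASS_G_PER_MOL = {
    "glucose": 180.16,
    "lysine": 146.19,
    "cadaverine": 102.18,
}


def molar_yield(product: str, product_g_per_l: float, glucose_g_per_l: float) -> float:
    """Moles of product formed per mole of glucose added."""
    if glucose_g_per_l <= 0:
        raise ValueError("glucose concentration must be positive")
    product_mol_per_l = product_g_per_l / MOLAR_MASS_G_PER_MOL[product]
    glucose_mol_per_l = glucose_g_per_l / MOLAR_MASS_G_PER_MOL["glucose"]
    return product_mol_per_l / glucose_mol_per_l


# Hypothetical values for illustration: 7.5 g/L lysine from 30 g/L glucose.
example_yield = molar_yield("lysine", 7.5, 30.0)
print(f"{example_yield:.3f} mol lysine per mol glucose")
```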

In some embodiments, a CsrB or CsrC sRNA increases lysine or cadaverine production by at least 10%, at least 20%, at least 30%, at least 40%, at least 50% or greater, when expressed in a host cell compared to a counterpart host cell of the same strain that comprises the same genetic modifications other than the modification to overexpress the CsrB or CsrC sRNA. In some embodiments, CsrB or CsrC sRNA increases lysine or cadaverine production by at least 10%, at least 20%, at least 30%, at least 40%, at least 50% or greater, when expressed in a host cell that is modified to overexpress a lysine decarboxylase, an aspartate kinase, a dihydrodipicolinate synthase, a diaminopimelate decarboxylase, an aspartate semialdehyde dehydrogenase, a dihydrodipicolinate reductase, and an aspartate transaminase; compared to a counterpart host cell of the same strain that comprises the modification to overexpress the lysine decarboxylase, the aspartate kinase, the dihydrodipicolinate synthase, the diaminopimelate decarboxylase, the aspartate semialdehyde dehydrogenase, the dihydrodipicolinate reductase, and the aspartate transaminase, but does not overexpress the CsrB or CsrC sRNA.

In some embodiments, activity of a CsrB or CsrC sRNA can be assessed by determining the ability of the RNA to bind CsrA, e.g., in a quantitative mobility shift assay.

Isolation or generation of CsrB or CsrC sequences to incorporate into expression cassettes for overexpression in a host cell can be accomplished by a number of techniques. Such techniques will be discussed in the context of CsrB or CsrC polynucleotide sequences. However, one of skill understands that the same techniques can be used to isolate and express other desired genes. In some embodiments, oligonucleotide probes based on the sequences disclosed here can be used to identify the desired polynucleotide in a cDNA or genomic DNA library from a desired bacterial species. Probes may be used to hybridize with genomic DNA to isolate homologous genes in the same or different species.

In typical embodiments, the nucleic acids of interest are amplified from nucleic acid samples using routine amplification techniques. For instance, PCR may be used to amplify the sequences directly from genomic DNA. PCR and other in vitro amplification methods may also be useful, for example, to clone nucleic acid sequences that code for proteins to be expressed, to make nucleic acids to use as probes for detecting the presence of the desired mRNA in samples, for nucleic acid sequencing, or for other purposes.

Appropriate primers and probes for generating CsrB and CsrC polynucleotides in bacteria can be determined from comparisons of the sequences provided herein. For a general overview of PCR see PCR Protocols: A Guide to Methods and Applications (Innis, M., Gelfand, D., Sninsky, J. and White, T., eds.), Academic Press, San Diego (1990). Illustrative primer sequences are shown in the Table of Primers in the Examples section.

Nucleic acid sequences encoding a CsrB or CsrC sRNA that confers increased production of lysine, or a lysine-derived product, e.g., cadaverine, to a host cell, may additionally be codon-optimized for expression in a desired host cell. Methods and databases that can be employed are known in the art. For example, preferred codons may be determined in relation to codon usage in a single gene, a set of genes of common function or origin, highly expressed genes, the codon frequency in the aggregate protein coding regions of the whole organism, codon frequency in the aggregate protein coding regions of related organisms, or combinations thereof. See, e.g., Henaut and Danchin in "Escherichia coli and Salmonella," Neidhardt, et al. Eds., ASM Press, Washington D.C. (1996), pp. 2047-2066; Nucleic Acids Res. 20: 2111-2118; Nakamura et al., 2000, Nucl. Acids Res. 28: 292.

Preparation of recombinant vectors

Recombinant vectors for expression of a CsrB or CsrC sRNA can be prepared using methods well known in the art. For example, a DNA sequence encoding a CsrB or CsrC sRNA (described in further detail below) , can be combined with transcriptional and other regulatory sequences which will direct the transcription of the sequence from the gene in the intended cells, e.g., bacterial cells such as E. coli. In some embodiments, an expression vector that comprises an expression cassette that comprises the polynucleotide encoding a CsrB or CsrC sRNA further comprises a promoter operably linked to the CsrB or CsrC polynucleotide. In other embodiments, a promoter and/or other regulatory elements that direct transcription of the CsrB or CsrC polynucleotide are endogenous to the host cell and an expression cassette comprising the CsrB or CsrC polynucleotide is introduced, e.g., by homologous recombination, such that the exogenous CsrB or CsrC polynucleotide is operably linked to an endogenous promoter and expression is driven by the endogenous promoter.

As noted above, expression of a CsrB or CsrC polynucleotide can be controlled by a number of regulatory sequences including promoters, which may be either constitutive or inducible; and, optionally, repressor sequences, if desired. Examples of suitable promoters, especially in a bacterial host cell, are the promoters obtained from the E. coli lac operon and other promoters derived from genes involved in the metabolism of other sugars, e.g., galactose and maltose. Additional examples include promoters such as the trp promoter, the bla promoter, the bacteriophage lambda PL promoter, and the T5 promoter. In addition, synthetic promoters, such as the tac promoter (U.S. Pat. No. 4,551,433), can be used. Further examples of promoters include those of the Streptomyces coelicolor agarase gene (dagA), Bacillus subtilis levansucrase gene (sacB), Bacillus licheniformis alpha-amylase gene (amyL), Bacillus stearothermophilus maltogenic amylase gene (amyM), Bacillus amyloliquefaciens alpha-amylase gene (amyQ), Bacillus licheniformis penicillinase gene (penP), and Bacillus subtilis xylA and xylB genes. Suitable promoters are also described in Ausubel and Sambrook & Russell, both supra. Additional promoters include promoters described by Jensen & Hammer, Appl. Environ. Microbiol. 64: 82, 1998; Shimada, et al., J. Bacteriol. 186: 7112, 2004; and Miksch et al., Appl. Microbiol. Biotechnol. 69: 312, 2005.

In some embodiments, a promoter that influences expression of a native CsrB or CsrC polynucleotide may be modified to increase expression. For example, an endogenous CsrB or CsrC promoter may be replaced by a promoter that provides for increased expression compared to the native promoter.

An expression vector may also comprise additional sequences that influence expression of a polynucleotide encoding the CsrB or CsrC sRNA. Such sequences include enhancer sequences, a ribosome binding site, or other sequences such as transcription termination sequences, and the like.

A vector expressing a nucleic acid encoding a CsrB or CsrC sRNA may be an autonomously replicating vector, i.e., a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a mini-chromosome, or an artificial chromosome. The vector may contain any means for assuring self-replication. Alternatively, the vector may be one which, when introduced into the host, is integrated into the genome and replicated together with the chromosome (s) into which it has been integrated. Thus, an expression vector may additionally contain an element (s) that permits integration of the vector into the host's genome.

An expression vector of the invention preferably contains one or more selectable markers which permit easy selection of transformed hosts. For example, an expression vector may comprise a gene that confers antibiotic resistance (e.g., ampicillin, kanamycin, chloramphenicol or tetracycline resistance) to the recombinant host organism, e.g., a bacterial cell such as E. coli.

Although any suitable expression vector may be used to incorporate the desired sequences, readily available bacterial expression vectors include, without limitation: plasmids such as pSC101, pBR322, pBBR1MCS-3, pUR, pET, pEX, pMR100, pCR4, pBAD24, p15a, pACYC, pUC, e.g., pUC18 or pUC19, or plasmids derived from these plasmids; and bacteriophages, such as M13 phage and λ phage. One of ordinary skill in the art, however, can readily determine through routine experimentation whether any particular expression vector is suited for any given host cell. For example, the expression vector can be introduced into the host cell, which is then monitored for viability and expression of the sequences contained in the vector.

Expression vectors of the invention may be introduced into the host cell using any number of well-known methods, including calcium chloride-based methods, electroporation, or any other method known in the art.

Host Cells

The present invention provides for a genetically modified host cell that is engineered to overexpress a CsrB or CsrC sRNA. Such a host cell may comprise a nucleic acid encoding a heterologous CsrB or CsrC sRNA, including any non-naturally occurring CsrB or CsrC sRNA variant; or may be genetically modified to overexpress a native CsrB or CsrC sRNA relative to a wildtype host cell.

A genetically modified host strain of the present invention typically comprises at least one additional genetic modification to enhance production of lysine or a lysine derivative relative to a control strain that does not have the one additional genetic modification, e.g., a wildtype strain or a cell of the same strain without the one additional genetic modification. An “additional genetic modification to enhance production of lysine or lysine derivative” can be any genetic modification. In some embodiments, the genetic modification is the introduction of a polynucleotide that expresses an enzyme involved in the synthesis of lysine or a derivative such as cadaverine. In some embodiments, the host cell comprises multiple modifications to increase production, relative to a wildtype host cell, of lysine or a lysine derivative, e.g., cadaverine.

In some aspects, genetic modification of a host cell to overexpress CsrB or CsrC sRNA is performed in conjunction with modifying the host cell to overexpress a lysine decarboxylase polypeptide and/or one or more lysine biosynthesis polypeptides.

A lysine decarboxylase refers to an enzyme that converts L-lysine into cadaverine. The enzyme is classified as E.C. 4.1.1.18. Lysine decarboxylase polypeptides are well characterized enzymes, the structures of which are well known in the art (see, e.g., Kanjee, et al., EMBO J. 30: 931-944, 2011; and a review by Lemonnier & Lane, Microbiology 144: 751-760, 1998; and references described therein). Illustrative lysine decarboxylase sequences are CadA homologs from Klebsiella sp., WP_012968785.1; Enterobacter aerogenes, YP_004592843.1; Salmonella enterica, WP_020936842.1; Serratia sp., WP_033635725.1; and Raoultella ornithinolytica, YP_007874766.1; and LdcC homologs from Shigella sp., WP_001020968.1; Citrobacter sp., WP_016151770.1; and Salmonella enterica, WP_001021062.1. As used herein, a lysine decarboxylase includes variants of native lysine decarboxylase enzymes that have lysine decarboxylase enzymatic activity. Additional lysine decarboxylase enzymes are described in PCT/CN2014/080873 and PCT/CN2015/072978.

In some embodiments, a host cell may be genetically modified to express one or more polypeptides that affect lysine biosynthesis. Examples of lysine biosynthesis polypeptides include the E. coli genes SucA, Ppc, AspC, LysC, Asd, DapA, DapB, DapD, ArgD, DapE, DapF, LysA, Ddh, PntAB, CyoABE, GadAB, YbjE, GdhA, GltA, SucC, GadC, AcnB, PflB, ThrA, AceA, AceB, GltB, AceE, SdhA, MurE, SpeE, SpeG, PuuA, PuuP, and YgjG, or the corresponding genes from other organisms. Such genes are known in the art (see, e.g., Shah et al., J. Med. Sci. 2: 152-157, 2002; Anastassiadia, S. Recent Patents on Biotechnol. 1: 11-24, 2007) . See, also, Kind, et al., Appl. Microbiol. Biotechnol. 91: 1287-1296, 2011 for a review of genes involved in cadaverine production. Illustrative genes encoding lysine biosynthesis polypeptides are provided below.

In some embodiments, a host cell is genetically modified to express a lysine decarboxylase, an aspartate kinase, a dihydrodipicolinate synthase, a diaminopimelate decarboxylase, an aspartate semialdehyde dehydrogenase, a dihydrodipicolinate reductase, and an aspartate transaminase. Additional modifications may also be incorporated into the host cell.

In some embodiments, a host cell may be genetically modified to attenuate or reduce the expression of one or more polypeptides that affect lysine biosynthesis. Examples of such polypeptides include the E. coli genes Pck, Pgi, DeaD, CitE, MenE, PoxB, AceA, AceB, AceE, RpoC, and ThrA, or the corresponding genes from other organisms. Such genes are known in the art (see, e.g., Shah et al., J. Med. Sci. 2: 152-157, 2002; Anastassiadia, S. Recent Patents on Biotechnol. 1: 11-24, 2007) . See, also, Kind, et al., Appl. Microbiol. Biotechnol. 91: 1287-1296, 2011 for a review of genes attenuated to increase cadaverine production. Illustrative genes encoding polypeptides whose attenuation increases lysine biosynthesis are provided below.

Nucleic acids encoding a lysine decarboxylase or a lysine biosynthesis polypeptide may be introduced into the host cell along with a polynucleotide encoding the CsrB or CsrC sRNA, e.g., encoded on a single expression vector, or introduced in multiple expression vectors at the same time. Alternatively, the host cell may be genetically modified to overexpress lysine decarboxylase or one or more lysine biosynthesis polypeptides before or after the host cell is genetically modified to overexpress the CsrB or CsrC sRNA.

In alternative embodiments, a host cell that overexpresses a naturally occurring CsrB or CsrC sRNA can be obtained by other techniques, e.g., by mutagenizing cells, e.g., E. coli cells, and screening cells to identify those that express a CsrB or CsrC sRNA, at a higher level compared to the cell prior to mutagenesis.

A host cell engineered to overexpress a CsrB or CsrC sRNA as described herein is a bacterial host cell. In typical embodiments, the bacterial host cell is a Gram-negative bacterial host cell. In some embodiments of the invention, the bacterium is an enteric bacterium. In some embodiments of the invention, the bacterium is a species of the genus Corynebacterium, Escherichia, Pseudomonas, Zymomonas, Shewanella, Salmonella, Shigella, Enterobacter, Citrobacter, Cronobacter, Erwinia, Serratia, Proteus, Hafnia, Yersinia, Morganella, Edwardsiella, or Klebsiella. In some embodiments, the host cells are members of the genus Escherichia, Hafnia, or Corynebacterium. In some embodiments, the host cell is an Escherichia coli, Hafnia alvei, or Corynebacterium glutamicum host cell.

In some embodiments, the host cell is a gram-positive bacterial host cell, such as a Bacillus sp., e.g., Bacillus subtilis or Bacillus licheniformis; or another Bacillus sp. such as B. alcalophilus, B. aminovorans, B. amyloliquefaciens, B. caldolyticus, B. circulans, B. stearothermophilus, B. thermoglucosidasius, B. thuringiensis or B. vulgatis.

Host cells modified in accordance with the invention can be screened for increased production of lysine or a lysine derivative, such as cadaverine, as described herein.

Methods of producing lysine or a lysine derivative

A host cell genetically modified to overexpress CsrB or CsrC sRNA can be employed to produce lysine or a derivative of lysine. In some embodiments, the host cell produces cadaverine. To produce lysine or the lysine derivative, a host cell genetically modified to overexpress CsrB or CsrC sRNA as described herein can be cultured under conditions suitable to allow expression of the CsrB or CsrC sRNA and expression of enzymes that are used to produce lysine or the lysine derivative. A host cell modified in accordance with the invention provides a higher yield of lysine or lysine derivatives relative to a non-modified counterpart host cell that expresses a CsrB or CsrC sRNA at native levels.

Host cells may be cultured using well known techniques (see, e.g., the illustrative conditions provided in the examples section) .

The lysine or lysine derivative can then be separated and purified using known techniques. Lysine or lysine derivatives, e.g., cadaverine, produced in accordance with the invention may then be used in any known process, e.g., to produce a polyamide.

In some embodiments, lysine may be converted to caprolactam using chemical catalysts or by using enzymes and chemical catalysts.

The present invention will be described in greater detail by way of specific examples. The following examples are offered for illustrative purposes, and are not intended to limit the invention in any manner. Those of skill in the art will readily recognize a variety of noncritical parameters, which can be changed or modified to yield essentially the same results.

EXAMPLES

The following examples are offered to illustrate, but not to limit the claimed invention.

Example 1: Construction of plasmid vectors that encode CadA.

Wild-type E. coli cadA (SEQ ID NO: 1), which encodes the lysine decarboxylase CadA (SEQ ID NO: 2), was amplified from the E. coli MG1655 K12 genomic DNA using the PCR primers cadA-F and cadA-R, digested using the restriction enzymes SacI and BamHI, and ligated into pSTV28 to generate the plasmid pCIB39. The 5’ sequence upstream of the cadA gene was optimized using the PCR primers cadA-F2 and cadA-R2 to create pCIB40. The SacI restriction site was added back to pCIB40 using the SacI-F and SacI-R primers to create pCIB41.

Example 2: Construction of a plasmid vector expressing genes encoding CsrA or CsrD.

The E. coli gene csrA (SEQ ID NO: 3), which encodes a carbon storage regulator, CsrA (SEQ ID NO: 4), was amplified from the E. coli MG1655 K12 genomic DNA using the PCR primers csrA-F and csrA-R, digested with the restriction enzymes SacI and BamHI, and ligated into the pCIB41 plasmid vector, also digested with SacI and BamHI, to create pCIB49. Similarly, csrD (SEQ ID NO: 5), which encodes CsrD (SEQ ID NO: 6), was amplified from the E. coli MG1655 K12 genomic DNA using the PCR primers csrD-F and csrD-R. The BamHI restriction site was removed using sewing PCR with the primers rmvBamHI-F and rmvBamHI-R, and the product was digested with the restriction enzymes SacI and BamHI and ligated into the pCIB41 plasmid vector, also digested with SacI and BamHI, to create pCIB50.
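Before a SacI/BamHI cloning step such as the one above, an insert can be checked for recognition sites of the enzymes to be used, since an internal site would be cut during digestion. The following Python sketch scans a sequence for the standard SacI (GAGCTC) and BamHI (GGATCC) recognition sequences. The function name and the toy sequence are hypothetical; the toy sequence is not csrA or csrD. Both sites are palindromic, so scanning one strand is sufficient.

```python
from typing import Dict, List

# Standard recognition sequences of the two enzymes used for cloning.
RECOGNITION_SITES = {"SacI": "GAGCTC", "BamHI": "GGATCC"}


def find_restriction_sites(seq: str) -> Dict[str, List[int]]:
    """Return 0-based start positions of each recognition site in a DNA sequence."""
    dna = seq.upper()
    hits: Dict[str, List[int]] = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = dna.find(site)
        while start != -1:
            positions.append(start)
            start = dna.find(site, start + 1)
        hits[enzyme] = positions
    return hits


# Toy insert for illustration only.
toy_insert = "ATGGGATCCAAAGAGCTCTT"
sites = find_restriction_sites(toy_insert)
print(sites)
```

An insert with a hit for either enzyme would need the site removed (for example by introducing a silent change) or a different enzyme pair before digestion.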

Example 3: Construction of plasmid vectors co-expressing Synthetic Operon I that contains three proteins (LysC, DapA, LysA) from the lysine biosynthetic pathway.

Three genes from E. coli, lysC, dapA, and lysA, encode proteins involved in the E. coli lysine biosynthetic pathway: aspartate kinase (LysC or AKIII, encoded by lysC) , dihydrodipicolinate synthase (DapA or DHDPS, encoded by dapA) , and diaminopimelate decarboxylase (LysA, encoded by lysA) . The three genes were cloned into a plasmid vector and the three proteins, LysC (SEQ ID NO: 7) , DapA (SEQ ID NO: 8) , and LysA (SEQ ID NO: 9) were overexpressed in E. coli. The gene lysC was amplified from the E. coli MG1655 K12 genomic DNA using the primers lysC-F and lysC-R, and the amplified fragment was digested using SacI and BamHI, and ligated into pUC18 to create pCIB7. The gene dapA was amplified from the E. coli MG1655 K12 genomic DNA using the primers dapA-F and dapA-R, and the amplified fragment was digested using BamHI and XbaI, and ligated into pCIB7 to create pCIB8. The gene lysA was amplified from the E. coli MG1655 K12 genomic DNA using the primers lysA-F and lysA-R, and the amplified fragment was digested using XbaI and SalI, and ligated into pCIB8 to create pCIB9. The three-gene operon was amplified from pCIB9 using the primers lysC-F and lysA-R. The amplified product was digested using SacI and SalI, and the digested fragment was ligated into pCIB10 to create pCIB32.

Example 4: Construction of plasmid vectors co-expressing various aspartokinases. Various aspartokinases were expressed in order to increase lysine production.

Two pairs of mutations were chosen that enabled the E. coli LysC to have an increased feedback resistance to lysine. The gene encoding the first mutant, LysC-1 (M318I, G323D) (SEQ ID NO: 10), was constructed using the primers 318-F, 318-R, 323-F, and 323-R. The gene encoding LysC-1 (M318I, G323D) was cloned into pCIB32 and replaced the wild-type E. coli aspartokinase, LysC, to create the plasmid pCIB43. The aspartokinase from Streptomyces strains that are capable of producing polylysine was previously suggested, but not proven, to be more feedback resistant to lysine compared to E. coli aspartokinase. As such, the aspartokinase gene from Streptomyces lividans was codon optimized, synthesized, and cloned in place of wild-type lysC in pCIB32 in order to create the plasmid pCIB55 using the primers SlysC-F and SlysC-R. The resulting aspartokinase protein that was expressed was named S-LysC (SEQ ID NO: 11).

Example 5: Construction of plasmid vectors co-expressing Synthetic Operon II that contains three proteins (Asd, DapB, DapD, AspC) from the lysine biosynthetic pathway.

Next, the expression of four additional genes, asd, dapB, dapD, and aspC, which are involved in the lysine biosynthetic pathway of E. coli, was enhanced. These genes encode the following enzymes: aspartate semialdehyde dehydrogenase (Asd (SEQ ID NO: 12) , encoded by asd) , dihydrodipicolinate reductase (DapB or DHDPR (SEQ ID NO: 13) , encoded by dapB) , tetrahydrodipicolinate succinylase (DapD (SEQ ID NO: 14) , encoded by dapD) , and aspartate transaminase (AspC (SEQ ID NO: 15) , encoded by aspC) . The gene asd was amplified from the E. coli MG1655 K12 genomic DNA using the primers asd-F and asd-R, and the amplified fragment was digested using SacI and BamHI, and ligated into pUC18 to create pCIB12. The gene dapB was amplified from the E. coli MG1655 K12 genomic DNA using the primers dapB-F and dapB-R, and the amplified fragment was digested using BamHI and XbaI, and ligated into pCIB12 to create pCIB13. The gene dapD was amplified from the E. coli MG1655 K12 genomic DNA using the primers dapD-F and dapD-R, and the amplified fragment was digested using XbaI and SalI, and ligated into pCIB13 to create pCIB14. Similarly, the gene aspC was amplified from the E. coli MG1655 K12 genomic DNA using the primers aspC-F and aspC-R, and the amplified fragment was digested using XbaI and SalI, and ligated into pCIB13 to create pCIB31.

Example 6: Construction of plasmid vectors co-expressing Synthetic Operons I and II that contain proteins from the lysine biosynthetic pathway.

Synthetic Operon I was further optimized using primers lysC-rbs2-F and lysC-rbs2-R to modify pCIB43 and create the plasmid pCIB378. Synthetic Operon II was further optimized using the primers asd-rbs2-F and asd-rbs2-R to modify pCIB31 and create the plasmid pCIB380. pCIB380 was further modified using the primers SacI-F2, SacI-R2, ApaI-F, and ApaI-R in order to add the restriction enzyme sites for ApaI and SacI to pCIB380 in order to create the plasmid pCIB393. The two synthetic operons, Synthetic Operon I and Synthetic Operon II, consisting of the genes lysC, dapA, lysA, asd, dapB, and aspC were combined into a single vector. The operon from pCIB378 consisting of the genes lysC, dapA, and lysA was amplified using the primers LAL2-SacI-F and LAL2-ApaI-R, digested using the restriction enzymes SacI and ApaI, and ligated into pCIB393 in order to create the plasmid pCIB394.
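The plasmid constructions of Examples 3 to 6 can be summarized as a small lookup table that maps each plasmid to the backbone it was built from and the change that was made. The Python sketch below is only a bookkeeping aid for following the text; the data structure and function name are hypothetical, and the entries restate what the examples describe.

```python
from typing import Dict, List, Tuple

# plasmid -> (backbone it was built from, what was inserted or changed)
CONSTRUCTS: Dict[str, Tuple[str, str]] = {
    "pCIB7": ("pUC18", "lysC"),
    "pCIB8": ("pCIB7", "dapA"),
    "pCIB9": ("pCIB8", "lysA"),
    "pCIB32": ("pCIB10", "lysC-dapA-lysA operon amplified from pCIB9"),
    "pCIB43": ("pCIB32", "wild-type lysC replaced by LysC-1 (M318I, G323D) gene"),
    "pCIB378": ("pCIB43", "modified with primers lysC-rbs2-F and lysC-rbs2-R"),
    "pCIB12": ("pUC18", "asd"),
    "pCIB13": ("pCIB12", "dapB"),
    "pCIB14": ("pCIB13", "dapD"),
    "pCIB31": ("pCIB13", "aspC"),
    "pCIB380": ("pCIB31", "modified with primers asd-rbs2-F and asd-rbs2-R"),
    "pCIB393": ("pCIB380", "ApaI and SacI sites added"),
    "pCIB394": ("pCIB393", "lysC-dapA-lysA operon amplified from pCIB378"),
}


def backbone_lineage(plasmid: str) -> List[str]:
    """Follow the backbone of a plasmid back to a vector with no recorded parent."""
    chain = [plasmid]
    while chain[-1] in CONSTRUCTS:
        chain.append(CONSTRUCTS[chain[-1]][0])
    return chain


lineage_394 = backbone_lineage("pCIB394")
print(" <- ".join(lineage_394))
```

The table tracks only the backbone of each construct; the operon inserted into pCIB393 has its own history through pCIB378, which can be traced with a second call.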

Example 7: Mutation of E. coli using atmospheric and room temperature plasma.

E. coli MG1655 K12 was mutagenized using the atmospheric and room temperature plasma method (ARTP) (Zhang et al., Appl. Microbiol. Biotechnol. 98: 5387-5396, 2014). The ARTP II-S instrument was purchased from WuXi TMAXTREE Biotechnology Co., Ltd. An overnight culture of cells was inoculated into fresh LB medium and grown to an OD of 0.6, after which the cells were treated with plasma for 50, 70, and 90 sec. Each of the three samples was washed and diluted with CGXII media (Keilhauer et al., J. Bacteriol. 175: 5595-5603, 1993).

Example 8: Selection of mutated E. coli that show resistance to S-(β-aminoethyl)-L-cysteine.

Each sample of mutagenized E. coli MG1655 K12 was plated on LB agar plates containing 0, 100, 200, 400, 600, 1000, or 2000 mg/L of S-(β-aminoethyl)-L-cysteine (AEC). The cells were grown at 37℃ for 2 days, after which the number of colonies growing on each plate was counted. After multiple rounds of ARTP mutagenesis and screening on LB agar plates containing AEC, it was possible to obtain colonies that were able to grow on plates containing 2000 mg/L AEC.

Example 9: Production of lysine by mutant E. coli created using ARTP.

Colonies able to grow on 2000 mg/L AEC were assayed for their lysine production ability. Each colony was grown overnight at 37℃ in 3 mL of medium containing 4% glucose, 0.1% KH₂PO₄, 0.1% MgSO₄, 1.6% (NH₄)₂SO₄, 0.001% FeSO₄, 0.001% MnSO₄, 0.2% yeast extract, 0.05% L-methionine, 0.01% L-threonine, and 0.005% L-isoleucine. The following day, each culture was inoculated into 100 mL of fresh medium containing 30 g/L of glucose and 0.7% Ca(HCO₃)₂. The culture was grown for 72 hours at 37℃, at which point the concentration of lysine in each culture was determined (Table 1).
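The medium composition above is given in percent, while the glucose in the production culture is given in g/L. The Python sketch below converts the listed percentages to g/L and to the mass needed for a 3 mL culture. It assumes that each percentage is weight per volume (1% = 10 g/L), which the text does not state explicitly; the dictionary and function names are hypothetical.

```python
# Overnight medium as listed in the text, in percent (assumed to be w/v).
MEDIUM_PERCENT = {
    "glucose": 4.0,
    "KH2PO4": 0.1,
    "MgSO4": 0.1,
    "(NH4)2SO4": 1.6,
    "FeSO4": 0.001,
    "MnSO4": 0.001,
    "yeast extract": 0.2,
    "L-methionine": 0.05,
    "L-threonine": 0.01,
    "L-isoleucine": 0.005,
}


def percent_wv_to_g_per_l(percent: float) -> float:
    """1% (w/v) is 1 g per 100 mL, i.e. 10 g/L."""
    return percent * 10.0


def mass_mg(g_per_l: float, volume_ml: float) -> float:
    """Mass in mg for a given volume: (g/L) x (mL) = mg."""
    return g_per_l * volume_ml


grams_per_litre = {
    name: round(percent_wv_to_g_per_l(pct), 4) for name, pct in MEDIUM_PERCENT.items()
}
glucose_mg_per_3ml = mass_mg(grams_per_litre["glucose"], 3.0)
print(grams_per_litre["glucose"], glucose_mg_per_3ml)
```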

Table 1. Production of lysine by E. coli strains mutated using ARTP.

As shown in Table 1, all of the mutants selected from the LB plates containing AEC demonstrated the ability to overproduce lysine. Specifically, mutants M3, M11, and M15 were able to produce > 2.0 g/L.

Example 10: Production of lysine from mutant E. coli over-expressing Synthetic Operons I and II and CsrA or CsrD.

E. coli mutant M11 was transformed with one of the following combinations of plasmids: pCIB394 and pSTV28, pCIB394 and pCIB49, or pCIB394 and pCIB50. Three single colonies from each transformation were grown overnight at 37℃ in 3 mL of medium containing 4% glucose, 0.1% KH₂PO₄, 0.1% MgSO₄, 1.6% (NH₄)₂SO₄, 0.001% FeSO₄, 0.001% MnSO₄, 0.2% yeast extract, 0.05% L-methionine, 0.01% L-threonine, 0.005% L-isoleucine, ampicillin (100 μg/mL), and chloramphenicol (20 μg/mL). The following day, each culture was inoculated into 100 mL of fresh medium with 30 g/L of glucose, 0.7% Ca(HCO₃)₂, ampicillin (100 μg/mL), and chloramphenicol (20 μg/mL). The culture was grown for 72 hours at 37℃, at which point the concentration of lysine in each culture was determined (Table 2).

Table 2. Production of lysine by mutant E. coli M11 containing Synthetic Operons I and II, and CsrA or CsrD.

As shown in Table 2, the overproduction of CsrA or CsrD did not lead to an increase in lysine production. Surprisingly, the increased expression of csrA led to a decrease in lysine production.

Example 11: Construction of plasmid vectors expressing the sRNAs CsrB or CsrC.

The E. coli csrB gene (SEQ ID NO: 16), which encodes CsrB, was amplified from E. coli MG1655 K12 genomic DNA using the PCR primers csrB-F and csrB-R, digested with the restriction enzymes SacI and BamHI, and ligated into the pCIB41 plasmid vector, also digested with SacI and BamHI, to create pCIB51. Similarly, csrC (SEQ ID NO: 17) was amplified from E. coli MG1655 K12 genomic DNA using the PCR primers csrC-F and csrC-R, digested with the restriction enzymes SacI and BamHI, and ligated into the pCIB41 plasmid vector, also digested with SacI and BamHI, to create pCIB52.
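A cloning step of this kind works cleanly only if the enzymes cut at the primer-introduced ends and not inside the amplified fragment. The following sketch shows one way such a check might be done in silico. The recognition sequences for SacI (GAGCTC), BamHI (GGATCC), and SphI (GCATGC) are the standard ones. The function names, the `flank` parameter, and the toy sequence are hypothetical and do not come from the sequence listing.

```python
# Standard recognition sequences of the enzymes named in the Examples.
# All three sites are palindromic, so scanning one strand is sufficient.
SITES = {"SacI": "GAGCTC", "BamHI": "GGATCC", "SphI": "GCATGC"}


def find_sites(seq, enzymes):
    """Return {enzyme: [0-based start positions]} of each recognition site in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme in enzymes:
        site = SITES[enzyme]
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        hits[enzyme] = positions
    return hits


def internal_sites(amplicon, enzymes, flank=12):
    """Keep only sites lying wholly outside the first and last `flank` bases.

    ASSUMPTION: primer-introduced sites sit within `flank` bases of either end.
    """
    upper = len(amplicon) - flank
    return {
        enzyme: [p for p in pos if flank <= p and p + len(SITES[enzyme]) <= upper]
        for enzyme, pos in find_sites(amplicon, enzymes).items()
    }


if __name__ == "__main__":
    # Made-up 42-base sequence for illustration only; it is NOT csrB or csrC.
    toy = "TTGAGCTCAAAA" + "ACGTACGTGGATCCACGT" + "AAAAGGATCCTT"
    print(find_sites(toy, ["SacI", "BamHI"]))
    print(internal_sites(toy, ["SacI", "BamHI"]))
```

An empty list for every enzyme from `internal_sites` would indicate that digestion should leave the insert intact; any reported position flags a fragment that the chosen enzyme pair would cut internally.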

Example 12: Production of lysine from E. coli over-expressing Synthetic Operons I and II and CsrB or CsrC.

E. coli mutant M11 was transformed with one of the following combinations of plasmids: pCIB394 and pSTV28, pCIB394 and pCIB51, or pCIB394 and pCIB52. Three single colonies from each transformation were grown overnight at 37℃ in 3 mL of medium containing 4% glucose, 0.1% KH₂PO₄, 0.1% MgSO₄, 1.6% (NH₄)₂SO₄, 0.001% FeSO₄, 0.001% MnSO₄, 0.2% yeast extract, 0.05% L-methionine, 0.01% L-threonine, 0.005% L-isoleucine, ampicillin (100 μg/mL), and chloramphenicol (20 μg/mL). The following day, each culture was inoculated into 100 mL of fresh medium with 30 g/L of glucose, 0.7% Ca(HCO₃)₂, ampicillin (100 μg/mL), and chloramphenicol (20 μg/mL). The cultures were grown for 72 hours at 37℃, at which point the concentration of lysine in each culture was determined (Table 3).

Table 3. Production of lysine by mutant E. coli M11 containing Synthetic Operons I and II, and CsrB or CsrC.

As shown in Table 3, the overproduction of CsrB or CsrC led to an increase in lysine production compared to the control (7.5 or 7.6 g/L compared to 6.8 g/L).
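To put the quoted titers in relative terms, the following sketch computes the fractional change over the control. Only the three values quoted in the sentence above are used. The text does not say explicitly which of 7.5 and 7.6 g/L belongs to CsrB and which to CsrC, so the code does not label them, and the function name is illustrative.

```python
def relative_increase(control, test):
    """Fractional change of `test` relative to `control` (0.10 means +10%)."""
    if control <= 0:
        raise ValueError("control titer must be positive")
    return (test - control) / control


# Values quoted in the text for Table 3 (g/L lysine).
CONTROL_G_PER_L = 6.8
OVEREXPRESSION_G_PER_L = (7.5, 7.6)  # CsrB or CsrC; assignment not stated in the sentence

if __name__ == "__main__":
    for titer in OVEREXPRESSION_G_PER_L:
        change = relative_increase(CONTROL_G_PER_L, titer)
        print(f"{titer} g/L vs {CONTROL_G_PER_L} g/L: {change:+.1%}")
```

On these figures the increase is roughly 10% to 12% over the control. Because no replicate spread is quoted, this calculation says nothing about statistical significance.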

Example 13: Construction of plasmid vectors encoding a lysine decarboxylase and CsrB or CsrC.

The csrB sRNA gene on pCIB51 was amplified using the primers csrB-F2 and csrB-R2, and the amplified fragment was digested with the restriction enzymes BamHI and SphI and ligated into pCIB41 to form the plasmid pCIB104. Similarly, the csrC sRNA gene was amplified using the primers csrC-F2 and csrC-R2, and the amplified fragment was digested with the restriction enzymes BamHI and SphI and ligated into pCIB41 to form the plasmid pCIB105.

Example 14: Production of lysine from E. coli co-overexpressing genes that encode a lysine decarboxylase, CsrB or CsrC, and lysine Synthetic Operons I and II.

E. coli MG1655 K12 was transformed with one of the following combinations of plasmids: pCIB394 and pSTV28, pCIB394 and pCIB41, pCIB394 and pCIB104, or pCIB394 and pCIB105. Three single colonies from each transformation were grown overnight at 37℃ in 3 mL of medium containing 4% glucose, 0.1% KH₂PO₄, 0.1% MgSO₄, 1.6% (NH₄)₂SO₄, 0.001% FeSO₄, 0.001% MnSO₄, 0.2% yeast extract, 0.05% L-methionine, 0.01% L-threonine, 0.005% L-isoleucine, ampicillin (100 μg/mL), and chloramphenicol (20 μg/mL). The following day, each culture was inoculated into 100 mL of fresh medium with 30 g/L of glucose, 0.7% Ca(HCO₃)₂, ampicillin (100 μg/mL), and chloramphenicol (20 μg/mL). The cultures were grown for 72 hours at 37℃, at which point the concentrations of lysine and cadaverine in each culture were determined (Table 4).

Table 4. Production of lysine and cadaverine by E. coli strains that contain the lysine Synthetic Operons I and II and overproduce a lysine decarboxylase and the sRNA CsrB or CsrC.

As shown in Table 4, overproduction of CadA led to the production of cadaverine. Furthermore, the overproduction of the sRNA CsrB or CsrC further increased cadaverine production from 3.2 g/L to 3.8 g/L for CsrB and CsrC.
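Cadaverine and lysine titers are not directly comparable by mass, because decarboxylation of one mole of lysine yields one mole of cadaverine and releases one mole of CO₂. The following sketch converts a cadaverine titer into the mass of lysine consumed to form it, using standard molar masses. It assumes that titers refer to the free bases rather than salts, which the text does not specify, and the function name is illustrative.

```python
# Standard molar masses (g/mol).
# ASSUMPTION: titers are expressed as free base, not as hydrochloride salts.
LYSINE_G_PER_MOL = 146.19      # L-lysine, C6H14N2O2
CADAVERINE_G_PER_MOL = 102.18  # cadaverine (1,5-diaminopentane), C5H14N2


def cadaverine_to_lysine_equivalent(cadaverine_g_per_l):
    """Lysine (g/L) consumed to form the given cadaverine titer at 1:1 molar stoichiometry."""
    return cadaverine_g_per_l / CADAVERINE_G_PER_MOL * LYSINE_G_PER_MOL


if __name__ == "__main__":
    # Cadaverine titers quoted in the text for Table 4.
    for titer in (3.2, 3.8):
        equivalent = cadaverine_to_lysine_equivalent(titer)
        print(f"{titer} g/L cadaverine ~ {equivalent:.2f} g/L lysine equivalent")
```

On this basis, the quoted increase from 3.2 g/L to 3.8 g/L of cadaverine corresponds to roughly 4.6 g/L and 5.4 g/L of lysine converted, respectively.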

Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, one of skill in the art will appreciate that certain changes and modifications may be practiced within the scope of the appended claims. All publications, patents, accession numbers, and patent applications cited herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited. Where a conflict exists between the instant application and a reference provided herein, the instant application shall dominate.

Table of plasmids used in Examples

Table of primer sequences used in Examples.

Illustrative sequences

SEQ ID NO: 1 Escherichia coli cadA nucleic acid sequence

SEQ ID NO: 2 CadA polypeptide sequence

SEQ ID NO: 3 E. coli csrA nucleic acid sequence

SEQ ID NO: 4 CsrA polypeptide sequence

SEQ ID NO: 5 E. coli csrD nucleic acid sequence

SEQ ID NO: 6 CsrD polypeptide sequence

SEQ ID NO: 7 LysC polypeptide sequence

SEQ ID NO: 8 DapA polypeptide sequence

SEQ ID NO: 9 LysA polypeptide sequence

SEQ ID NO: 10 LysC-1 M318I, G323D polypeptide sequence

SEQ ID NO: 11 S-LysC polypeptide sequence

SEQ ID NO: 12 Asd polypeptide sequence

SEQ ID NO: 13 DapB polypeptide sequence

SEQ ID NO: 14 DapD polypeptide sequence

SEQ ID NO: 15 AspC polypeptide sequence

SEQ ID NO: 16 E. coli csrB nucleic acid sequence

SEQ ID NO: 17 E. coli csrC nucleic acid sequence

SEQ ID NO: 18 Shigella flexneri CsrB nucleic acid sequence (CP024470.1)

SEQ ID NO: 19 Triticum aestivum CsrB nucleic acid sequence (AK447219.1)

SEQ ID NO: 20 Shigella sonnei CsrB nucleic acid sequence (CP023645.1)

SEQ ID NO: 21 Citrobacter sp. CsrB nucleic acid sequence (LT556085.1)

SEQ ID NO: 22 Salmonella enterica subsp. CsrB nucleic acid sequence (CP015574.1)

SEQ ID NO: 23 Shigella sonnei CsrC nucleic acid sequence (CP023645.1)

SEQ ID NO: 24 Shigella flexneri CsrC nucleic acid sequence (CP024470.1)

SEQ ID NO: 25 Citrobacter werkmanii CsrC nucleic acid sequence (CP023504.1)

SEQ ID NO: 26 Salmonella enterica subsp. CsrC nucleic acid sequence (CP018661.1)

Claims

A genetically modified host cell comprising an exogenous nucleic acid encoding a CsrB sRNA or a CsrC sRNA, wherein the host cell overexpresses the CsrB sRNA or CsrC sRNA relative to a counterpart host cell that has not been modified to express the exogenous nucleic acid; and has at least one additional genetic modification to increase production of lysine or a lysine derivative compared to a wildtype host cell.
The genetically modified host cell of claim 1, wherein the amino acid derivative is cadaverine.
The genetically modified host cell of claim 1 or 2, wherein the CsrB sRNA comprises a nucleotide sequence having at least 85% identity to SEQ ID NO: 16 and the CsrC sRNA comprises a nucleotide sequence having at least 85% identity to SEQ ID NO: 17.
The genetically modified host cell of claim 1 or 2, wherein the CsrB sRNA comprises the nucleic acid sequence of SEQ ID NO: 16 and the CsrC sRNA comprises the nucleic acid sequence of SEQ ID NO: 17.
The genetically modified host cell of any one of claims 1 to 4, wherein the CsrB or CsrC is heterologous to the host cell.
The genetically modified host cell of any one of claims 1 to 5, wherein the exogenous nucleic acid encoding the CsrB or CsrC is encoded by an expression vector introduced into the cell, wherein the expression vector comprises the exogenous nucleic acid operably linked to a promoter.
The genetically modified host cell of any one of claims 1 to 5, wherein the exogenous nucleic acid is integrated into the host chromosome.
The genetically modified host cell of any one of claims 1 to 7, wherein the host cell overexpresses a lysine decarboxylase.
The genetically modified host cell of any one of claims 1 to 8, wherein the host cell overexpresses one or more lysine biosynthesis polypeptides.
The genetically modified host cell of claim 9, wherein the one or more lysine biosynthesis polypeptide is an aspartate kinase, a dihydrodipicolinate synthase, a diaminopimelate decarboxylase, an aspartate semialdehyde dehydrogenase, a dihydropicolinate reductase, or an aspartate transaminase.
The genetically modified host cell of claim 10, wherein the aspartate kinase, dihydrodipicolinate synthase, diaminopimelate decarboxylase, aspartate semialdehyde dehydrogenase, dihydropicolinate reductase, or aspartate transaminase is a LysC, DapA, LysA, Asd, DapB, or AspC polypeptide.
The genetically modified host cell of any one of claims 1 to 7, wherein the host cell overexpresses CadA, LysC, DapA, LysA, Asd, DapB, and AspC polypeptide.
The genetically modified host cell of any one of claims 1 to 12, wherein the host cell is of the genus Escherichia, Hafnia, or Corynebacterium.
The genetically modified host cell of claim 13, wherein the host cell is Escherichia coli, Hafnia alvei, or Corynebacterium glutamicum.
The genetically modified host cell of claim 14, wherein the host cell is Escherichia coli.
A method of producing lysine or a lysine derivative, the method comprising culturing a host cell of any one of claims 1 to 15 under conditions in which the CsrB sRNA or CsrC sRNA is overexpressed.
A method of engineering a host cell to increase production of lysine or a lysine derivative, the method comprising introducing an exogenous nucleic acid encoding a CsrB or CsrC sRNA into the host cell, wherein the host cell has at least one additional genetic modification to increase production of lysine or a lysine derivative compared to a wildtype host cell;

culturing the host cell under conditions in which the CsrB or CsrC sRNA is expressed, and

selecting a host cell that exhibits increased production of lysine or a lysine derivative relative to a counterpart control host cell that has not been modified to express the exogenous nucleic acid.
The method of claim 17, wherein the amino acid derivative is cadaverine.
The method of claim 17 or 18, wherein the CsrB or CsrC sRNA is heterologous to the host cell.
The method of any one of claims 17 to 19, wherein the exogenous nucleic acid encoding the CsrB or CsrC sRNA is encoded by an expression vector introduced into the cell, wherein the expression vector comprises the exogenous nucleic acid operably linked to a promoter.
The method of any one of claims 17 to 19, wherein the exogenous nucleic acid is integrated into the host chromosome.
The method of any one of claims 17 to 21, wherein the CsrB sRNA comprises the nucleic acid sequence of SEQ ID NO: 16 and the CsrC sRNA comprises the nucleic acid sequence of SEQ ID NO: 17.
The method of any one of claims 17 to 22, wherein the host cell overexpresses a lysine decarboxylase.
The method of any one of claims 17 to 23, wherein the host cell overexpresses one or more lysine biosynthesis polypeptides.
The method of claim 24, wherein the lysine biosynthesis polypeptide is an aspartate kinase, a dihydrodipicolinate synthase, a diaminopimelate decarboxylase, an aspartate semialdehyde dehydrogenase, a dihydropicolinate reductase, or an aspartate transaminase.
The method of claim 25, wherein the aspartate kinase, dihydrodipicolinate synthase, diaminopimelate decarboxylase, aspartate semialdehyde dehydrogenase, dihydropicolinate reductase, or aspartate transaminase is a LysC, DapA, LysA, Asd, DapB, or AspC polypeptide.
The method of any one of claims 17 to 22, wherein the host cell overexpresses a CadA, LysC, DapA, LysA, Asd, DapB, and AspC polypeptide.
The method of any one of claims 17 to 27, wherein the host cell is of the genus Escherichia, Hafnia, or Corynebacterium.
The method of claim 28, wherein the host cell is Escherichia coli, Hafnia alvei, or Corynebacterium glutamicum.
The method of claim 29, wherein the host cell is Escherichia coli.