WO2007118836A1

WO2007118836A1 - Bi-directional selection markers with improved activity

Info

Publication number: WO2007118836A1
Application number: PCT/EP2007/053549
Authority: WO
Inventors: Marco Alexander Van Den Berg; Roelof Ary Lans Bovenberg
Original assignee: Dsm Ip Assets B.V.
Priority date: 2006-04-13
Filing date: 2007-04-12
Publication date: 2007-10-25
Also published as: US7951568B2; EP2007893A1; US20090246826A1

Abstract

The present invention discloses a polypeptide selected from the group consisting of: a polypeptide having an amino acid sequence according to SEQ ID NO 3, a polypeptide having an amino acid sequence according to SEQ ID NO 6, a polypeptide having an amino acid sequence that is substantially homologous to the sequence of SEQ ID NO 3 and a polypeptide having an amino acid sequence that is substantially homologous to the sequence of SEQ ID NO 6, the polypeptide displaying acetamidase activity and providing a reverse selection on fluoroacetamide with an efficiency of at least 95%, preferably at least 96%, more preferably at least 97%, more preferably at least 98%, more preferably at least 99%, most preferably 100%. The gene encoding the polypeptide of the invention is used as an efficient bi-directional selection marker in the construction of selection marker free strains, in particular for processes for the production of a compound of interest.

Description

BI-DIRECTIONAL SELECTION MARKERS WITH IMPROVED ACTIVITY

Field of the invention

The present invention relates to new amdS genes for use in an improved method for the transformation of filamentous fungi for obtaining selection marker gene free recombinant strains.

Background of the invention

In the field of microbial production of compounds of interest, there is in general a growing desire to use recombinant microorganisms containing as little foreign DNA as possible. Ideally the transformed microorganism would contain only the desired modifications at the gene(s)-of-interest and as few remnants as possible of other DNA fragments used during the transformation experiment or during cloning.

Patent applications EP 635,574 and WO 9706261 disclose genes encoding phenotypically selectable markers, which can be removed once the gene(s)-of-interest are stably maintained in the organism. The examples in these patent applications are genes encoding acetamidases, which upon expression in a transformed cell enable the cell to grow on media with acetamide as the sole carbon and/or nitrogen source. This selectable marker has several advantages: it is a dominant but non-antibiotic marker, and it can be used even in fungi with an endogenous amdS gene, like Aspergillus nidulans (Tilburn et al., 1983, Gene 26: 205-221). Moreover, this selectable marker is a so-called bi-directional marker, meaning that besides the positive selection for its presence (forward selection), a reverse selection for the absence of the gene can also be applied, using fluoroacetamide (see Figure 1). This has been successfully applied in species from the genera Aspergillus, Penicillium, Saccharomyces and Trichoderma.

The amdS gene of Aspergillus nidulans is successfully used as a selection marker for different fungal species, for instance in Aspergillus niger and Penicillium chrysogenum. In addition, homologous amdS genes were described, obtained from Aspergillus niger for transformation of Aspergillus niger and obtained from Penicillium chrysogenum for transformation of Penicillium chrysogenum (WO 9706261). However, these amdS genes still have a major drawback. The forward selection, i.e. selection for the presence of the selectable marker gene, works, although the present invention will show that this is not the case for one of the genes. The problems become apparent when the reverse selection is applied, i.e. selection for the absence of the selectable marker gene. The most widely used (acet)amidase expression cassette, PgpdA-amdS from Aspergillus nidulans, can give rise to isolation of false negatives, i.e. isolates which suggest by phenotype that they are devoid of the selection marker as a result of the negative selection protocol, but actually have the selection marker gene or fragments thereof stably maintained in the genome. For some reason they escape the selection pressure. This can be a burden in strain improvement programs where repeated transformations have to be performed, especially in those cases where a high throughput is required. Therefore, a bi-directional selection marker gene that functions 100% correctly in both directions, i.e. the forward and the reverse selection, is not available but is highly desirable.

The present invention surprisingly shows that the reverse selection, i.e. the deletion of an amidase marker gene from microbial strains, works with 100% efficiency when using the amidase encoding genes according to the invention. As a result, sequential modification of industrial production strains is feasible with high efficiency and throughput.

Summary of the invention

The present invention provides a polypeptide displaying acetamidase activity and providing a reverse selection on fluoroacetamide. Furthermore, the present invention provides a polynucleotide encoding the above polypeptide and the use of said polynucleotide.

Detailed description of the invention

In a first aspect, the present invention provides a polypeptide selected from the group consisting of a polypeptide having an amino acid sequence according to SEQ ID NO 3, a polypeptide having an amino acid sequence according to SEQ ID NO 6, a polypeptide having an amino acid sequence that is substantially homologous to the sequence of SEQ ID NO 3 and a polypeptide having an amino acid sequence that is substantially homologous to the sequence of SEQ ID NO 6, the polypeptide displaying acetamidase activity and providing a reverse selection on fluoroacetamide with an efficiency of at least 95%, preferably at least 96%, more preferably at least 97%, more preferably at least 98%, more preferably at least 99%, most preferably 100%.

The present invention further provides in a second aspect a polynucleotide encoding a polypeptide of the first aspect. In particular, a specific DNA sequence is provided encoding the polypeptide of SEQ ID NO 3, i.e. SEQ ID NO 1 or 2, or encoding the polypeptide of SEQ ID NO 6, i.e. SEQ ID NO 4 or 5. SEQ ID NO 7 is a variant coding sequence derived from the genomic DNA of SEQ ID NO 4, due to an event of alternative splicing, and SEQ ID NO 8 is the protein encoded by the alternative coding sequence. At the protein level both polypeptides (i.e. SEQ ID NO 6 and SEQ ID NO 8) are identical up to amino acid 365.

As part of the present invention it is demonstrated that the fluoroacetamide screening using the currently available acetamidase genes is not functioning properly; i.e. screening gives rise to false negative mutants. The amidase genes described by the present invention do not have this problem. The activity of the novel polypeptides encoded by these genes can therefore be characterized as follows:

They enable a forward selection, i.e. provide growth on acetamide as the sole nitrogen source,

They enable an efficient reverse selection, i.e. provide resistance to fluoroacetamide with an efficiency of at least 95%, as mentioned above.

In the context of the invention, an efficiency of at least 95% means that at least 95% of the strains resulting from the reverse selection on fluoroacetamide have the amidase gene deleted from the genome. This is typically observed upon further analyzing isolated colonies obtained after selection on standard fluoroacetamide plates containing 32 mM fluoroacetamide, 5 mM urea (as N-source) and 1.1% glucose (see WO 9706261).
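
The efficiency definition above can be illustrated with a short calculation. The following Python sketch is not part of the original disclosure; the function names and the colony counts are hypothetical and serve only to show how the 95% criterion could be checked once the isolated colonies have been analyzed for the presence of the amidase gene.

```python
def reverse_selection_efficiency(marker_free: int, total: int) -> float:
    """Percentage of analyzed colonies in which the amidase gene is deleted.

    marker_free: colonies confirmed (e.g. by PCR) to lack the amidase gene
    total:       all colonies analyzed after fluoroacetamide selection
    """
    if total <= 0:
        raise ValueError("total must be a positive number of colonies")
    if not 0 <= marker_free <= total:
        raise ValueError("marker_free must lie between 0 and total")
    return 100.0 * marker_free / total


def meets_threshold(marker_free: int, total: int, threshold: float = 95.0) -> bool:
    """True if the efficiency reaches the stated minimum (95% by default)."""
    return reverse_selection_efficiency(marker_free, total) >= threshold


# Hypothetical counts, for illustration only.
print(reverse_selection_efficiency(19, 20))  # 95.0
print(meets_threshold(18, 20))               # False (90%)
```

Colonies that still carry the marker gene or fragments thereof count against the efficiency, which corresponds to the false negatives discussed in the background section.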

A polypeptide having an amino acid sequence that is "substantially homologous" to the sequence of SEQ ID NO 3 and/or 6 is defined as a polypeptide having an amino acid sequence possessing a degree of identity to the specified amino acid sequence of at least 75%, preferably at least 80%, more preferably at least 85%, still more preferably at least 90%, still more preferably at least 95%, still more preferably at least 96%, still more preferably at least 97%, still more preferably at least 98% and most preferably at least 99%, the substantially homologous peptide displaying acetamidase activity and providing a reverse selection on fluoroacetamide with an efficiency of at least 95%. A substantially homologous polypeptide may encompass polymorphisms that may exist in cells from different populations or within a population due to natural allelic or intra-strain variation. A substantially homologous polypeptide may further be derived from a fungus other than the fungus from which the specified amino acid and/or DNA sequence originates, or may be encoded by an artificially designed and synthesized DNA sequence. DNA sequences related to the specified DNA sequences and obtained by degeneracy of the genetic code are also part of the invention. Homologues may also encompass biologically active fragments of the full-length sequence, for example the polypeptide with an amino acid sequence of SEQ ID NO 8.

For the purpose of the present invention, the degree of identity between two amino acid sequences refers to the percentage of amino acids that are identical between the two sequences. The degree of identity is determined using the BLAST algorithm, which is described in Altschul, et al., J. Mol. Biol. 215: 403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLAST program uses as defaults a word length (W) of 11, the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89: 10915 (1989)), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
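
As an illustration of the notion of "degree of identity", the sketch below computes the percentage of identical positions between two sequences that have already been aligned. It is a simplified stand-in and not the BLAST algorithm: the alignment itself is assumed to be given (with "-" marking gaps), the denominator is assumed to be the number of alignment columns that are not gaps in both sequences, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues between two pre-aligned sequences.

    Both strings must have the same length; '-' denotes a gap.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


# Toy sequences, for illustration only.
print(percent_identity("MKTA", "MKSA"))  # 75.0
print(percent_identity("MK-A", "MKTA"))  # 75.0 (the gap counts as a mismatch)
```

Other conventions for the denominator exist (for example the length of the shorter sequence), so a value obtained this way may differ slightly from the identity reported by BLAST.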

Substantially homologous polypeptides may contain only conservative substitutions of one or more amino acids of the specified amino acid sequences or substitutions, insertions or deletions of non-essential amino acids. Accordingly, a non-essential amino acid is a residue that can be altered in one of these sequences without substantially altering the biological function. For example, guidance concerning how to make phenotypically silent amino acid substitutions is provided in Bowie, J. U. et al., Science 247:1306-1310 (1990) wherein the authors indicate that there are two main approaches for studying the tolerance of an amino acid sequence to change. The first method relies on the process of evolution, in which mutations are either accepted or rejected by natural selection. The second approach uses genetic engineering to introduce amino acid changes at specific positions of a cloned gene and selects or screens to identify sequences that maintain functionality. As the authors state, these studies have revealed that proteins are surprisingly tolerant of amino acid substitutions. The authors further indicate which changes are likely to be permissive at a certain position of the protein. For example, most buried amino acid residues require non-polar side chains, whereas few features of surface side chains are generally conserved. Other such phenotypically silent substitutions are described in Bowie et al., and the references cited therein.

The term "conservative substitution" is intended to mean a substitution in which the amino acid residue is replaced with an amino acid residue having a similar side chain. These families are known in the art and include amino acids with basic side chains (e.g. lysine, arginine and histidine), acidic side chains (e.g. aspartic acid, glutamic acid), uncharged polar side chains (e.g. glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), non-polar side chains (e.g. alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), β-branched side chains (e.g. threonine, valine, isoleucine) and aromatic side chains (e.g. tyrosine, phenylalanine, tryptophan, histidine).
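
The side chain families listed above can be written down as a simple lookup. The following Python sketch is illustrative only: the dictionary reproduces the families exactly as listed in the preceding paragraph using one-letter amino acid codes, while the function name and the rule that a substitution is "conservative" when both residues share at least one family are assumptions made for the example.

```python
# Side chain families as listed in the text (one-letter amino acid codes).
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),  # Gly, Asn, Gln, Ser, Thr, Tyr, Cys
    "non-polar": set("AVLIPFMW"),       # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "beta-branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # Tyr, Phe, Trp, His
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues differ and share at least one side chain family."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in family and replacement in family
        for family in SIDE_CHAIN_FAMILIES.values()
    )


print(is_conservative("K", "R"))  # True  (both basic)
print(is_conservative("T", "V"))  # True  (both beta-branched)
print(is_conservative("D", "K"))  # False (acidic versus basic)
```

Because some residues belong to more than one family (for example histidine is both basic and aromatic), a residue can have conservative replacements in several families.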

In one embodiment of the invention, a deteriorated polypeptide of the invention is provided. A "deteriorated polypeptide" of the invention is an acetamidase protein wherein at least the acetamidase activity in the forward selection is decreased. The standard acetamide forward selection provides a standard assay to measure this activity, and thus deteriorated proteins may easily be selected. Deteriorated polypeptides of the invention thus are characterized by the fact that they show less activity in the forward selection (i.e. on acetamide), but they still retain at least 95% efficiency in the reverse selection (i.e. on fluoroacetamide). Such deteriorated enzymes are particularly useful if one wants to screen for increased copy numbers after transformation. Introduction of a gene-of-interest (GOI) in filamentous fungi goes via co-transformation; i.e. a mixture of DNA fragments is incubated with the competent cells. This mixture contains at least a gene-fragment encoding a selectable marker (in this case acetamidase). Furthermore, it contains at least one GOI, but this might be extended to several GOI, or even to hundreds of GOI. Transformants are first selected by means of the integration of the selection marker, and are subsequently screened by molecular techniques (e.g. colony PCR or Southern blotting) for the presence of one or more GOI. Depending on the strain, DNA mixture and transformation conditions, 1-50%, or even more than 50%, of the acetamidase positive transformants also contain the GOI. If a high gene copy number of the GOI is desired, one can apply these deteriorated polypeptides of the invention to screen and select for strains with high gene copy numbers of the GOI. As additional copies of a deteriorated amidase will lead to more enzyme molecules and thus more enzyme activity, additional gene copies will lead to better growing isolates that can be selected from a primary transformant. In many of these cases the GOI will also be present in multiple copies. These isolates can thus be screened for additional copies of the GOI too.

Deteriorated polypeptides of the invention may be obtained by randomly introducing mutations along all or part of the amidase coding sequence, such as by saturation mutagenesis, and the resulting mutants can be expressed recombinantly and screened for biological activity. They can also be isolated via error-prone PCR or can be produced synthetically.

Alternatively, high copy numbers of a GOI may be obtained using the non-deteriorated amidases as described in SEQ ID NO 3 or 6 or 8, and decreasing the concentration of acetamide or using an N-source having a low bioavailability to the transformant, like acrylamide in agar plates, and then selecting the best growing colonies as the ones with additional gene copies. These isolates can then be screened for additional copies of the GOI too.

The polynucleotide or nucleic acid sequence of the invention may be an isolated polynucleotide of genomic, cDNA, RNA, semi-synthetic, synthetic origin, or any combinations thereof. The term "isolated polynucleotide or nucleic acid sequence" as used herein refers to a polynucleotide or nucleic acid sequence which is essentially free of other nucleic acid sequences, e.g., at least about 20% pure, preferably at least about 40% pure, more preferably at least about 60% pure, even more preferably at least about 80% pure, most preferably at least about 90% pure as determined by agarose electrophoresis. For example, an isolated nucleic acid sequence can be obtained by standard cloning procedures used in genetic engineering to relocate the nucleic acid sequence from its natural location to a different site where it will be reproduced.

The polypeptides according to the invention and the encoding nucleic acid sequences may be obtained from any eukaryotic cell, preferably from a fungus, more preferably from a filamentous fungus. "Filamentous fungi" include all filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et al., 1995, supra). Filamentous fungal strains include, but are not limited to, strains of Acremonium, Aspergillus, Aureobasidium, Chrysosporium, Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Piromyces, Schizophyllum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, and Trichoderma.

In a more preferred embodiment, the nucleic acid sequence encoding a polypeptide of the present invention is obtained from a strain of Penicillium chrysogenum.

DNA sequences of the invention may be obtained by hybridization. Nucleic acid molecules corresponding to variants (e.g. natural allelic variants) and homologues of the DNA of the invention can be isolated based on their homology to the nucleic acids disclosed herein using these nucleic acids or a suitable fragment thereof, as a hybridization probe according to standard hybridization techniques, preferably under highly stringent hybridization conditions. Alternatively, one could apply in silico screening through the available genome databases.

"Stringency" of hybridization reactions is readily determinable by one of ordinary skill in the art. For additional details and explanation of stringency of hybridization reactions, see Ausubel et al, Current Protocols in Molecular Biology, Wiley lnterscience Publishers, (1995).

The nucleic acid sequence may be isolated by e.g. screening a genomic or cDNA library of the microorganism in question. Once a nucleic acid sequence encoding a polypeptide having an activity according to the invention has been detected with e.g. a probe derived from SEQ ID NO 2 or 5, the sequence may be isolated or cloned by utilizing techniques which are known to those of ordinary skill in the art (see, e.g., J. Sambrook, E. F. Fritsch, and T. Maniatis, 1989, Molecular Cloning, A Laboratory Manual, 2d edition, Cold Spring Harbor, New York).

The cloning of the nucleic acid sequences of the present invention from such (genomic) DNA can also be effected, e.g. by using methods based on polymerase chain reaction (PCR) or antibody screening of expression libraries to detect cloned DNA fragments with shared structural features (see, e.g., Innis et al., 1990, PCR: A Guide to Methods and Application, Academic Press, New York).

The sequence information as provided herein should not be so narrowly construed as to require inclusion of erroneously identified bases. The specific sequences disclosed herein can be readily used to isolate the complete gene from filamentous fungi, in particular Penicillium chrysogenum, which in turn can easily be subjected to further sequence analyses thereby identifying sequencing errors. Unless otherwise indicated, all nucleotide sequences determined by sequencing a DNA molecule herein were determined using an automated DNA sequencer and all amino acid sequences of polypeptides encoded by DNA molecules determined herein were predicted by translation of a DNA sequence determined as above. Therefore, as is known in the art for any DNA sequence determined by this automated approach, any nucleotide sequence determined herein may contain some errors. Nucleotide sequences determined by automation are typically at least about 90% identical, more typically at least about 95% to at least about 99.9% identical to the actual nucleotide sequence of the sequenced DNA molecule. The actual sequence can be more precisely determined by other approaches including manual DNA sequencing methods well known in the art. As is also known in the art, a single insertion or deletion in a determined nucleotide sequence compared to the actual sequence will cause a frame shift in translation of the nucleotide sequence such that the predicted amino acid sequence encoded by a determined nucleotide sequence will be completely different from the amino acid sequence actually encoded by the sequenced DNA molecule, beginning at the point of such an insertion or deletion. The person skilled in the art is capable of identifying such erroneously identified bases and knows how to correct for such errors.
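
The effect of a single erroneous insertion on the predicted amino acid sequence can be made concrete with a small example. The Python sketch below is illustrative only: it builds the standard genetic code table, translates a short made-up DNA fragment (not taken from any SEQ ID NO), and then translates the same fragment with one extra base inserted. The function name and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate all complete codons of a DNA string; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing incomplete codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Made-up example fragment and the same fragment with one inserted base.
reference = "ATGGCTAAAGGTCTG"
with_insertion = reference[:3] + "C" + reference[3:]

print(translate(reference))       # MAKGL
print(translate(with_insertion))  # MR*RS
```

From the point of the insertion onward every codon is read in a different frame, so the predicted protein diverges completely and, as in this example, may even acquire a premature stop codon.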

In a third aspect, the present invention discloses a strain comprising the polynucleotide mentioned above in the second aspect. Said strain may be any eukaryotic cell, preferably a fungus, more preferably a filamentous fungus. Filamentous fungal strains include, but are not limited to, strains of Acremonium, Aspergillus, Aureobasidium, Chrysosporium, Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Piromyces, Schizophyllum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, and Trichoderma.

In a fourth aspect, the present invention discloses the use of a polynucleotide of the second aspect as a selection marker for selecting transformed host strains. The advantage of the selection marker of the present invention is that it can be easily deleted from the transformed host organism. The deletion of the selection marker is based on dominant selection.

The choice of a host cell in the methods of the present invention will to a large extent depend upon the source of the nucleic acid sequence (gene) of interest encoding a polypeptide. Preferably, the host cell is a eukaryotic cell, more preferably a fungus, most preferably a filamentous fungus. In a preferred embodiment, the filamentous fungal host cell is a cell of a species cited as species from which the polynucleotide of the second aspect may be obtained, examples of which are, but are not limited to, Aspergillus species (e.g. Aspergillus awamori, Aspergillus fumigatus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae or Aspergillus sojae), Chrysosporium species (e.g. Chrysosporium lucknowense) or Penicillium species (e.g. Penicillium chrysogenum). The host cell may be a wild type filamentous fungus host cell or a variant, a mutant or a genetically modified filamentous fungus host cell. Such modified filamentous fungal host cells include e.g. host cells with reduced protease levels, such as the protease deficient strain Aspergillus oryzae JaL 125 (described in WO 97/35956 or EP 429490); the tripeptidyl-aminopeptidases-deficient Aspergillus niger strain as disclosed in WO 96/14404, or host cells with reduced production of the protease transcriptional activator (prtT; as described in WO 01/68864 and US 2004/0191864); host strains like the Aspergillus oryzae BECh2, wherein three TAKA amylase genes, two protease genes, as well as the ability to form the metabolites cyclopiazonic acid and kojic acid have been inactivated (BECh2 is described in WO 00/39322); filamentous fungal host cells comprising an elevated unfolded protein response (UPR) compared to the wild type cell to enhance production abilities of a polypeptide of interest (described in US 2004/0186070, US 2001/0034045, WO 01/72783 and WO 2005/123763); host cells with an oxalate deficient phenotype (described in WO 2004/070022); host cells with a reduced expression of an abundant endogenous polypeptide such as a glucoamylase, neutral alpha-amylase A, neutral alpha-amylase B, alpha-1,6-transglucosidase, proteases, cellobiohydrolase and/or oxalic acid hydrolase (as may be obtained by genetic modification according to the techniques described in US 2004/0191864); host cells with an increased efficiency of homologous recombination (having deficient hdfA or hdfB gene as described in WO 2005/095624); Penicillium host cells producing adipyl-7-aminodesacetoxycephalosporanic acid and derivatives thereof; and host cells having any possible combination of these modifications.

Nucleic acid constructs, e.g. expression constructs, may contain a gene of interest and the polynucleotide of the invention (selection marker), each operably linked to one or more control sequences, which direct the expression of the encoded polypeptide in a suitable expression host. The nucleic acid constructs may be on one DNA fragment, or, preferably, on separate fragments. Expression will be understood to include any step involved in the production of the polypeptide and may include transcription, post-transcriptional modification, translation, post-translational modification, and secretion. The term nucleic acid construct is synonymous with the term expression vector or cassette when the nucleic acid construct contains all the control sequences required for expression of a coding sequence in a particular host organism. The term "control sequences" is defined herein to include all components, which are necessary or advantageous for the expression of a polypeptide. Each control sequence may be native or foreign to the nucleic acid sequence encoding the polypeptide. Such control sequences may include, but are not limited to, a promoter, a leader, optimal translation initiation sequences (as described in Kozak, 1991, J. Biol. Chem. 266:19867-19870 and WO 2006/077258), a secretion signal sequence, a pro-peptide sequence, a polyadenylation sequence, a transcription terminator. At a minimum, the control sequences include a promoter, and transcriptional and translational stop signals. The term "operably linked" is defined herein as a configuration in which a control sequence is appropriately placed at a position relative to the coding sequence of the DNA sequence such that the control sequence directs the production of a polypeptide.

The control sequence may include an appropriate promoter sequence containing transcriptional control sequences. The promoter may be any nucleic acid sequence, which shows transcription regulatory activity in the cell including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides. The promoter may be either homologous or heterologous to the cell or to the polypeptide.

Preferred promoters for filamentous fungal cells are known in the art and can be, for example, the glucose-6-phosphate dehydrogenase gpdA promoters, protease promoters such as pepA, pepB, pepC, the glucoamylase glaA promoters, amylase amyA, amyB promoters, the catalase catR or catA promoters, glucose oxidase goxC promoter, beta-galactosidase lacA promoter, alpha-glucosidase aglA promoter, translation elongation factor tefA promoter, xylanase promoters such as xlnA, xlnB, xlnC, xlnD, cellulase promoters such as eglA, eglB, cbhA, promoters of transcriptional regulators such as areA, creA, xlnR, pacC, prtT, etc. or any other, and can be found among others at the NCBI website (http://www.ncbi.nlm.nih.gov/entrez/).

In a preferred embodiment, the promoter may be derived from a gene which is highly expressed (defined herein as an mRNA concentration of at least 0.5% (w/w) of the total cellular mRNA). In another preferred embodiment, the promoter may be derived from a gene which is medium expressed (defined herein as an mRNA concentration of at least 0.01% and up to 0.5% (w/w) of the total cellular mRNA). In another preferred embodiment, the promoter may be derived from a gene which is low expressed (defined herein as an mRNA concentration lower than 0.01% (w/w) of the total cellular mRNA).
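
The three expression classes defined above can be expressed as a simple classification rule. The Python sketch below is illustrative only; the function name is hypothetical, and the assignment of the boundary value of exactly 0.5% to the "high" class is an assumption that follows from the "at least 0.5%" wording.

```python
def classify_expression(mrna_percent: float) -> str:
    """Classify a gene by its mRNA share, in % (w/w) of total cellular mRNA."""
    if mrna_percent < 0:
        raise ValueError("mRNA percentage cannot be negative")
    if mrna_percent >= 0.5:
        return "high"    # at least 0.5%
    if mrna_percent >= 0.01:
        return "medium"  # at least 0.01% and below 0.5%
    return "low"         # lower than 0.01%


# Hypothetical mRNA percentages, for illustration only.
for value in (1.2, 0.5, 0.05, 0.01, 0.002):
    print(value, classify_expression(value))
```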

In an even more preferred embodiment, microarray data is used to select genes, and thus promoters of those genes, that have a certain transcriptional level and regulation. In this way one can adapt the gene expression cassettes optimally to the conditions they should function in.

Alternatively, one could clone random DNA fragments in front of the polynucleotides of this invention. Using the acetamide plate assay one can easily screen for active promoters, as these should facilitate growth on acetamide as the sole nitrogen source. These DNA fragments can be derived from many sources, e.g. from different species, by PCR amplification, by chemical synthesis and the like.

The control sequence may also include a suitable transcription terminator sequence, a sequence recognized by a filamentous fungal cell to terminate transcription. The terminator sequence is operably linked to the 3' terminus of the nucleic acid sequence encoding the polypeptide. Any terminator, which is functional in the cell, may be used in the present invention. Preferred terminators for filamentous fungal cells are obtained from the genes encoding Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Aspergillus niger alpha-glucosidase, trpC gene and Fusarium oxysporum trypsin-like protease.

The control sequence may also include a suitable leader sequence, a non-translated region of a mRNA, which is important for translation by the filamentous fungal cell. The leader sequence is operably linked to the 5' terminus of the nucleic acid sequence encoding the polypeptide. Any leader sequence, which is functional in the cell, may be used in the present invention. Preferred leaders for filamentous fungal cells are obtained from the genes encoding Aspergillus oryzae TAKA amylase and Aspergillus nidulans triose phosphate isomerase and Aspergillus niger glaA.

The control sequence may also include a polyadenylation sequence, a sequence which is operably linked to the 3' terminus of the nucleic acid sequence and which, when transcribed, is recognized by the filamentous fungal cell as a signal to add polyadenosine residues to transcribed mRNA. Any polyadenylation sequence, which is functional in the cell, may be used in the present invention. Preferred polyadenylation sequences for filamentous fungal cells are obtained from the genes encoding Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Fusarium oxysporum trypsin-like protease and Aspergillus niger alpha-glucosidase.

For a polypeptide to be secreted, the control sequence may also include a signal peptide-encoding region, which codes for an amino acid sequence linked to the amino terminus of the polypeptide, which can direct the encoded polypeptide into the cell's secretory pathway. The 5' end of the coding sequence of the nucleic acid sequence may inherently contain a signal peptide-coding region naturally linked in translation reading frame with the segment of the coding region, which encodes the secreted polypeptide. Alternatively, the 5' end of the coding sequence may contain a signal peptide-coding region, which is foreign to the coding sequence. The foreign signal peptide-coding region may be required where the coding sequence does not normally contain a signal peptide-coding region. Alternatively, the foreign signal peptide-coding region may simply replace the natural signal peptide-coding region in order to obtain enhanced secretion of the polypeptide.

The nucleic acid construct may be an expression vector. The expression vector may be any vector (e.g. a plasmid or virus), which can be conveniently subjected to recombinant DNA procedures and can bring about the expression of the nucleic acid sequence encoding the polypeptide. The choice of the vector will typically depend on the compatibility of the vector with the cell into which the vector is to be introduced. The vectors may be linear or closed circular plasmids.

The vector may be an autonomously replicating vector, i.e. a vector, which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g. a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome. An autonomously maintained cloning vector for a filamentous fungus may comprise the AMA1-sequence (see e.g. Aleksenko and Clutterbuck (1997), Fungal Genet. Biol. 21: 373-397).

Alternatively, the vector may be one which, when introduced into the cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated. The integrative cloning vector may integrate at random or at a predetermined target locus in the chromosomes of the host cell. In a preferred embodiment of the invention, the integrative cloning vector comprises a DNA fragment, which is homologous to a DNA sequence in a predetermined target locus in the genome of the host cell for targeting the integration of the cloning vector to this predetermined locus. In order to promote targeted integration, the cloning vector is preferably linearized prior to transformation of the host cell. Linearization is preferably performed such that at least one but preferably either end of the cloning vector is flanked by sequences homologous to the target locus. The length of the homologous sequences flanking the target locus is preferably at least 0.1 kb, more preferably at least 0.2 kb, even more preferably at least 0.5 kb, still more preferably at least 1 kb, most preferably at least 2 kb.

The vector system may be a single vector or plasmid or two or more vectors or plasmids, which together contain the total DNA to be introduced into the genome of the host cell.

The DNA constructs may be used on an episomal vector. However in the present invention the constructs are preferably integrated in the genome of the host strain.

The obtained host cell may be used for producing a compound of interest, for example a primary or secondary metabolite or a polypeptide. A gene of interest may thus encompass a gene encoding a polypeptide involved in a metabolic pathway or may be a gene encoding a polypeptide of interest to be produced.

The present description further uses the term 'introduction' of one or more gene(s)-of-interest (abbreviated as GOI) or fragments thereof. This term means an insertion, duplication, deletion, or substitution of a GOI. In general, all these alterations can be performed using the polynucleotide of the present invention as a bi-directional selection marker, applying a forward and reverse selection protocol.

The amidase genes disclosed by the present invention can be used not only in forward selection on acetamide, but also perform very effectively in fluoroacetamide reverse selection procedures to obtain selection marker gene free strains. Because the final recombinant strain does not contain the selection marker, the procedure of the present invention can be repeated, so that several of the alterations suggested above can be combined in one recombinant strain. Surprisingly, it was found that the deletion of an amidase marker gene according to the invention from microbial strains works with 100% efficiency.

The efficiency of deleting the amidase gene(s) through selection on fluoroacetamide can be increased by flanking the amidase gene by DNA repeats, enabling efficient recombination and subsequent loss of the amidase gene and one of the direct repeats. Alternatively, the amidase genes can be flanked by so-called lox or frt sequences, which upon addition of recombinase enzymes (cre or flp recombinase, respectively) also enable efficient recombination and subsequent loss of the amidase gene and one of the direct repeats. Alternatively, other features can be used to remove the selection marker genes, e.g. restriction enzyme recognition sites.
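The outcome of recombination between flanking direct repeats can be pictured with a simple string model. The sketch below is illustrative only: the function name `excise_by_direct_repeats` and all sequences are hypothetical placeholders, and the model ignores the biology of the recombination event itself, showing only that the marker and one repeat copy are lost while one repeat copy remains.

```python
def excise_by_direct_repeats(locus: str, repeat: str, marker: str) -> str:
    """Model loop-out recombination between two direct repeats.

    The cassette repeat-marker-repeat collapses to a single repeat,
    i.e. the marker and one copy of the repeat are lost.
    """
    cassette = repeat + marker + repeat
    if cassette not in locus:
        raise ValueError("marker is not flanked by the given direct repeats")
    return locus.replace(cassette, repeat, 1)


if __name__ == "__main__":
    # Placeholder sequences, chosen only for readability.
    upstream, downstream = "AAAA", "TTTT"
    repeat = "GATTACA"
    marker = "ATGCCCGGGTAA"
    before = upstream + repeat + marker + repeat + downstream
    after = excise_by_direct_repeats(before, repeat, marker)
    print(before)
    print(after)
```

In this toy model the locus after excision retains the flanking sequences and a single repeat copy.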

The invention provides a method for obtaining selection marker gene free recombinant strains comprising the following steps:

(i) transforming a host cell of interest with a polynucleotide comprising a gene of interest (GOI) and/or a polynucleotide comprising a DNA sequence affecting expression of a GOI and with a polynucleotide comprising the selection marker gene according to the invention,

(ii) selecting clones of transformed cells for their capacity to grow on acetamide as the sole nitrogen and/or carbon source,

(iii) effectuating deletion of the selection marker gene from transfected clones by reverse selection on fluoroacetamide.
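The three steps above, and the fact that they can be repeated because the final strain is marker free, can be sketched as a toy model. This is an illustrative abstraction under stated assumptions, not the laboratory procedure: the `Strain` class, the function names and the gene labels `goiA`/`goiB` are hypothetical, growth is reduced to a yes/no property of marker presence, and marker loss is assumed to occur in part of the population during step (iii).

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Strain:
    gois: Tuple[str, ...] = ()
    has_marker: bool = False

    def grows_on_acetamide(self) -> bool:
        # Forward selection: only cells expressing the amidase marker grow.
        return self.has_marker

    def grows_on_fluoroacetamide(self) -> bool:
        # Reverse selection: only cells that lost the marker grow.
        return not self.has_marker


def transform(host: Strain, goi: str) -> Strain:
    """Step (i): introduce a GOI together with the selection marker."""
    return replace(host, gois=host.gois + (goi,), has_marker=True)


def forward_select(clones: List[Strain]) -> List[Strain]:
    """Step (ii): keep clones able to grow on acetamide."""
    return [c for c in clones if c.grows_on_acetamide()]


def reverse_select(clones: List[Strain]) -> List[Strain]:
    """Step (iii): assume some cells lose the marker; keep those that grow."""
    population = clones + [replace(c, has_marker=False) for c in clones]
    return [c for c in population if c.grows_on_fluoroacetamide()]


def one_round(host: Strain, goi: str) -> Strain:
    # The untransformed host is plated too and is lost in step (ii).
    selected = forward_select([host, transform(host, goi)])
    return reverse_select(selected)[0]


if __name__ == "__main__":
    strain = Strain()
    for goi in ("goiA", "goiB"):  # two sequential rounds with the same marker
        strain = one_round(strain, goi)
    print(strain)
```

After two rounds the modelled strain carries both genes of interest and no marker, which is why the same marker can be reused in each round.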

The present invention further shows that this selection marker gene can be deleted from the chromosomes of the transformed organisms without leaving a trace of DNA used for cloning.

If site-specific (or better locus specific) integration is desired, the sequences used for integration as mentioned under (i) may be surrounded by endogenous DNA fragments homologous to DNA sequences in the host genome. If such a sequence is not present the DNA nevertheless may integrate into the genome. This does not influence the possibility of deletion of the selection marker gene. Alternatively, the sequences of (i) may be surrounded by recognition sites for restriction or recombination enzymes to facilitate efficient integration in step (i) and/or efficient deletion in step (iii).

The present invention further discloses a method for producing a compound of interest comprising:

(i) transforming a host cell of interest with a polynucleotide comprising a gene of interest and/or with a polynucleotide comprising a DNA sequence affecting expression of a gene of interest, and with the polynucleotide comprising the selection marker gene according to the invention,

(ii) selecting clones of transformed cells for their capacity to grow on acetamide as the sole nitrogen and/or carbon source,

(iii) effectuating deletion of the selection marker gene from transformed clones by reverse selection on fluoroacetamide,

(iv) using a reverse selected clone for the production of the compound of interest. Alternatively, step (iii) can be omitted and the acetamidase containing clone can be used for production of the compound of interest.

The dominant selection and reverse selection method described above can be employed in the development of production strains in various ways:

To introduce a new GOI

The vector used for integration of the amidase gene also contains a gene of interest. The invention thus further enables the introduction of desired foreign or homologous genes or DNA elements in the host organisms of choice using the amidase gene as a selection marker gene. Subsequently, the amidase gene is deleted. Preferably the amidase and the desired genes or DNA elements are introduced site-specifically, whereafter the amidase gene is deleted.

To introduce multiple GOI sequentially (at predetermined loci)

Specifically, the invention discloses selection marker free organisms containing (site-specifically) introduced genes, which can be used for a new round of transformation. The invention is used for repeated introduction of multiple copies of various genes or a DNA element at a predetermined genomic locus.

To modify transcription levels of a GOI

The method as disclosed by the invention can be used to transform a host cell with a DNA sequence of interest affecting expression of a GOI. For instance, the promoter of a GOI may be mutated or exchanged for a different promoter, thereby altering the regulation and thus the expression level of the GOI. Alternatively, one can modify the transcript level of a GOI by introducing an expression construct via co-transformation (see above) which mediates RNA inhibition.

To delete a GOI

The disclosed method can also be used to remove a complete gene or a certain part of a gene. This is of interest when certain genes code for proteins that negatively influence production levels of desired proteins or metabolites; such genes can be removed, again without leaving a marker gene in the genome. Examples are proteases in enzyme production, transcription regulators and competing pathways in metabolite production processes and the like.

To amplify the copy numbers of GOI

The disclosed invention can be used to introduce additional copies of certain genes. This is particularly useful when altering the regulation of these GOI via exchange of the homologous promoter for a strong promoter is not an option.

To introduce site specific mutations in GOI, which alter the kinetics of the protein(s) produced

In a fifth aspect, the application of the polypeptides of the present invention can be improved by deleting one or more of the endogenous amidase encoding genes from the genome of the host strain. Although these might be non-functional or non-transcribed genes, deleting them from the genome prevents them from mutating into active amidases and interfering with the polypeptides of the present invention.

Legend to the figures

Figure 1 is a representation of the forward and reverse amidase reactions.

Figure 2 is a representation of plasmid pHELY-A1, the expression vector for Aspergillus nidulans amdS.

Figure 3 is a representation of plasmid pGATWAn, a Penicillium chrysogenum expression vector.

Figure 4 is a representation of plasmid pGPcamdScA, the expression vector for Penicillium chrysogenum amdS (sequence described in EP0758020A2).

Figure 5 is a representation of plasmid pGPcamdAcA, the expression vector for Penicillium chrysogenum amdA.

Figure 6 is a representation of plasmid pGPcamdBcA, the expression vector for Penicillium chrysogenum amdB.

Figure 7 shows the acetamide (left plate) versus fluoroacetamide (right plate) plate selection. A=transformant with Penicillium chrysogenum amidase amdA gene; B=transformant with Penicillium chrysogenum amidase amdB gene; C=transformant with Aspergillus nidulans amdS gene.

Figure 8 shows the rtPCR on amdB transformants. Panel A is a schematic representation of the 3' part of the gene and the oligonucleotides used. Panel B is the gel-electrophoresis of the amplified fragments (L=kb ladder; 1=transformant 1; 2=transformant 2; 3=transformant 3; 4=transformant 4; 5=control, untransformed strain; a=rtPCR with SEQ ID NO 15 and 16; b=rtPCR with SEQ ID NO 15 and 17).

EXAMPLES

General Methods

In the examples, standard molecular techniques have been applied as described in the literature (Sambrook et al., 1989, "Molecular cloning: a laboratory manual", CSHL Press, Cold Spring Harbor, NY), unless stated otherwise.

Example 1

Wild type Penicillium chrysogenum cells do not use acetamide as sole N-source

Penicillium chrysogenum strain Wisconsin 54-1255 (ATCC 28089) was tested for growth on acetamide plates, prepared as described previously (Cantwell CA, Beckmann RJ, Dotzlaf JE, Fisher DL, Skatrud PL, Yeh WK, Queener SW (1990) Curr Genet. 17:213-21), and several growth stages of cells were plated out: spores, mycelium and protoplasts. None of these gave significant growth on acetamide plates, demonstrating that wild type Penicillium chrysogenum is not able to use and grow on acetamide.

Example 2

Transformation of Penicillium chrysogenum with Aspergillus nidulans amdS and subsequent reverse selection

Techniques involved in the transfer of DNA to protoplasts of Penicillium chrysogenum are well known in the art and are described in many references, including Finkelstein and Ball (eds.), Biotechnology of filamentous fungi, technology and products, Butterworth-Heinemann (1992); Bennett and Lasure (eds.), More Gene Manipulations in fungi, Academic Press (1991); Turner, in: Pühler (ed), Biotechnology, 2nd completely revised edition, VCH (1992). The Ca-PEG mediated protoplast transformation is used as described in EP 635,574. pHELY-A1 (described in WO 04106347) was used as expression construct for testing the Aspergillus nidulans amdS gene. Two μg of vector was transformed to Penicillium chrysogenum. Transformants were selected on media with acetamide as the sole nitrogen source. To ensure that stable transformants were obtained, first round positives were colony purified on fresh acetamide plates and subsequently transferred to non-selective, rich media (YEPD) to induce sporulation. Afterwards all colonies were again tested on acetamide media. Spores of stable amdS transformants were used to compare growth on acetamide and fluoroacetamide media. For this a spore solution was made in 0.9 mM NaCl and several dilutions were spotted on both media. Although the transformants were acetamide positive, and thus amdS positive, in all dilutions, there was also growth of the undiluted suspension on fluoroacetamide plates (see Figure 7), demonstrating that the reverse selection using the Aspergillus nidulans amdS is not tight.

Example 3

A known Penicillium chrysogenum amdS is not a functional acetamidase

An alleged Penicillium chrysogenum amdS gene (WO 9706261) was PCR amplified from Penicillium chrysogenum DNA using Herculase® Hotstart DNA Polymerase (Stratagene). The oligonucleotides of SEQ ID NO 9 and 10 were used to amplify the ORF. The amplified gene was cloned in pCR®-Blunt II-TOPO® (Invitrogen) and sequence verified before further processing. After digestion with AseI and SbfI, the ORF fragment was cloned in pGATWAn (see Figure 3) digested with NdeI and NsiI, resulting in the PcamdS expression vector pGPcamdScA (Figure 4). Vector pGPcamdScA (2 μg) was transformed to Penicillium chrysogenum and plated out on media with acetamide as the sole carbon source. No colonies were obtained, demonstrating that the Penicillium chrysogenum amdS gene is not functional. Although WO 9706261 discloses that this gene sequence, which has low homology to the Aspergillus nidulans amdS gene sequence, is present in the genome of Penicillium chrysogenum, we have demonstrated that it is not a functional equivalent of the Aspergillus nidulans amdS gene and does not enable Penicillium chrysogenum to grow on acetamide as the sole N-source.

Example 4

Amidases amdA and amdB of Penicillium chrysogenum as efficient bi-directional selection marker genes

Cloning of genes

Amidase encoding open reading frames amdA and amdB were identified by inspection of the Penicillium chrysogenum genome sequence of the Wisconsin 54-1255 strain by those skilled in the art. They were PCR amplified from Penicillium chrysogenum DNA using Herculase® Hotstart DNA Polymerase (Stratagene). SEQ ID NO 11 and 12 were used to amplify the ORF encoding SEQ ID NO 3; SEQ ID NO 13 and 14 were used to amplify the ORF encoding SEQ ID NO 6. The amplified amidase genes were cloned in pCR®-Blunt II-TOPO® (Invitrogen) and sequence verified before further processing. After digestion with AseI and SbfI, fragments were cloned in pGATWAn digested with NdeI and NsiI, resulting in two new expression vectors pGPcamdAcA and pGPcamdBcA, respectively (see Figures 5 and 6).
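The cloning step relies on fragments cut with one enzyme pair being ligated into a vector cut with a different pair. The following minimal sketch shows why this can work, assuming the enzyme pairs are AseI/SbfI for the insert and NdeI/NsiI for the vector. The recognition sites and cut positions in the table are the commonly documented ones for these enzymes and are not taken from the patent text; the function names `overhang` and `compatible` are hypothetical.

```python
# Palindromic recognition sites (5'->3') and top-strand cut positions.
# Values reflect commonly documented enzyme specificities (an assumption
# relative to the patent, which does not list them).
ENZYMES = {
    "AseI": ("ATTAAT", 2),    # AT^TAAT
    "NdeI": ("CATATG", 2),    # CA^TATG
    "SbfI": ("CCTGCAGG", 6),  # CCTGCA^GG
    "NsiI": ("ATGCAT", 5),    # ATGCA^T
}


def overhang(enzyme: str) -> tuple:
    """Return (overhang type, overhang sequence) for a palindromic cutter."""
    site, top_cut = ENZYMES[enzyme]
    bottom_cut = len(site) - top_cut  # follows from the site's symmetry
    if top_cut == bottom_cut:
        return ("blunt", "")
    low, high = sorted((top_cut, bottom_cut))
    kind = "5'" if top_cut < bottom_cut else "3'"
    return (kind, site[low:high])


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Two ends can anneal if overhang type and sequence are identical."""
    return overhang(enzyme_a) == overhang(enzyme_b)


if __name__ == "__main__":
    for insert_cut, vector_cut in (("AseI", "NdeI"), ("SbfI", "NsiI")):
        print(insert_cut, overhang(insert_cut),
              vector_cut, overhang(vector_cut),
              compatible(insert_cut, vector_cut))
```

Under these assumptions, each insert end matches exactly one vector end, which would also fix the orientation of the ORF in the expression vector.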

Acetamide selection

Two μg of each vector was transformed to Penicillium chrysogenum. Transformants were selected on media with acetamide as the sole nitrogen source. To make sure that stable transformants were obtained, first round positives were colony purified on fresh acetamide plates and subsequently transferred to non-selective, rich media (YEPD) to induce sporulation. Afterwards all colonies were again tested on acetamide media. As stable transformants were obtained, we concluded that both amdA and amdB could function as an acetamidase.

Fluoroacetamide selection

Spores of the stable amidase transformants were used to compare growth on acetamide and fluoroacetamide media with the Aspergillus nidulans amdS (see Example 2). For this a spore solution was made in physiological salt (0.9 mM NaCl) and several dilutions were spotted on both media. As demonstrated in Figure 7, both Penicillium chrysogenum amidase genes support growth on acetamide in all dilutions and show no growth in any dilution on fluoroacetamide plates, demonstrating that the reverse selection using the Penicillium chrysogenum amdA and amdB genes is very tight.
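The criterion applied to the spot assays in Examples 2 and 4 can be written down as a simple rule: a marker is bi-directional and tight when spots grow on acetamide in every dilution and in no dilution on fluoroacetamide. The sketch below is illustrative only; the function name `classify_marker`, the category labels, the number of dilutions and the growth values are hypothetical and merely mimic the qualitative outcome described for Figure 7.

```python
from typing import Sequence


def classify_marker(acetamide: Sequence[bool],
                    fluoroacetamide: Sequence[bool]) -> str:
    """Classify a marker from per-dilution growth (True = growth observed).

    Index 0 is the undiluted spore suspension; later indices are dilutions.
    """
    forward_ok = all(acetamide)
    reverse_tight = not any(fluoroacetamide)
    if forward_ok and reverse_tight:
        return "bi-directional, tight reverse selection"
    if forward_ok:
        return "forward selection only, reverse selection not tight"
    return "no reliable forward selection"


if __name__ == "__main__":
    # Hypothetical three-spot series per plate, for illustration only.
    assays = {
        "P. chrysogenum amdA": ([True, True, True], [False, False, False]),
        "P. chrysogenum amdB": ([True, True, True], [False, False, False]),
        "A. nidulans amdS": ([True, True, True], [True, False, False]),
    }
    for name, (on_acetamide, on_fluoroacetamide) in assays.items():
        print(name, "->", classify_marker(on_acetamide, on_fluoroacetamide))
```

With these illustrative inputs, growth of only the undiluted Aspergillus nidulans amdS spot on fluoroacetamide is enough to classify that marker as not tight.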

Alternative splicing of Penicillium chrysogenum amdB

The ORF of amdB was amplified from cDNA and after sequencing found to retain one of the predicted introns. Still, this clone was able to give acetamide positive transformants and therefore would be functionally active as an acetamidase. To assess which form of the mRNA was active in the transformants, an rtPCR was performed on several transformants. For this purpose, cells from acetamide plates (inducing conditions) were resuspended in water and RNA was isolated using the StrataPrep® Total RNA Microprep kit (Stratagene). rtPCR was performed with the Superscript™ III One-Step RT-PCR System (Invitrogen). The results indicate that in all transformants both mRNA variants are present (see Figure 8).

Claims

1. A polypeptide selected from the group consisting of a polypeptide having an amino acid sequence according to SEQ ID NO 3, a polypeptide having an amino acid sequence according to SEQ ID NO 6, a polypeptide having an amino acid that is substantially homologous to the sequence of SEQ ID NO 3 and a polypeptide having an amino acid that is substantially homologous to the sequence of SEQ ID NO 6, the polypeptide displaying acetamidase activity and providing a reverse selection on fluoroacetamide with an efficiency of at least 95%, preferably at least 96%, more preferably at least 97%, more preferably at least 98%, more preferably at least 99%, most preferably 100%.

2. A polynucleotide comprising a DNA sequence encoding the polypeptide of claim 1.

3. The polynucleotide of claim 2 that is SEQ ID NO 1, 2, 4 or 5.

4. Strain comprising the polynucleotide of claim 2 or 3 wherein said strain is not Penicillium chrysogenum.

5. Use of the polynucleotide of claim 2 or 3 as a bi-directional selection marker.

6. A method for the preparation of a strain expressing a gene of interest comprising:

(i) transforming a host cell of interest with a gene of interest and/or with a DNA sequence affecting expression of a gene of interest, and with the polynucleotide of claim 2 or 3,

(ii) selecting clones of transformed cells for their capacity to grow on acetamide as the sole nitrogen and/or carbon source.

7. Method according to claim 6 further comprising deletion of the polynucleotide of claims 2 or 3 from transformed clones by reverse selection on fluoroacetamide.

8. A method for producing a compound of interest comprising: (i) transforming a host cell of interest with a gene of interest and/or with a DNA sequence affecting expression of a gene of interest, and with the polynucleotide of claims 2 or 3,

9. Method according to claim 8 further comprising deletion of the polynucleotide of claims 2 or 3 from transformed clones by reverse selection on fluoroacetamide.

10. A method according to any one of claims 5-8 wherein said host cell of interest is a fungal cell.