WO2024170889A1 - Yeast breeding process for strain improvement, involving pep4 mutants - Google Patents

Yeast breeding process for strain improvement, involving pep4 mutants

Publication number
WO2024170889A1
WO2024170889A1 PCT/GB2024/050386 GB2024050386W WO2024170889A1 WO 2024170889 A1 WO2024170889 A1 WO 2024170889A1 GB 2024050386 W GB2024050386 W GB 2024050386W WO 2024170889 A1 WO2024170889 A1 WO 2024170889A1
Authority
WO
WIPO (PCT)
Prior art keywords
pep4
yeast
progeny
strains
plasmid
Application number
PCT/GB2024/050386
Other languages
French (fr)
Inventor
Christopher Finnis
Edward LOUIS
Christian GUDE
Andrei PARKER
Charlotte SMITH
Sarah HEWITT
Emanuele KENDRICK
Cindy VALLIERES
Yue Hu
Original Assignee
Phenotypeca Limited
Application filed by Phenotypeca Limited
Publication of WO2024170889A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N1/00 Microorganisms, e.g. protozoa; Compositions thereof; Processes of propagating, maintaining or preserving microorganisms or compositions thereof; Processes of preparing or isolating a composition containing a microorganism; Culture media therefor
    • C12N1/14 Fungi; Culture media therefor
    • C12N1/16 Yeasts; Culture media therefor
    • C12N1/18 Baker's yeast; Brewer's yeast
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N1/00 Microorganisms, e.g. protozoa; Compositions thereof; Processes of propagating, maintaining or preserving microorganisms or compositions thereof; Processes of preparing or isolating a composition containing a microorganism; Culture media therefor
    • C12N1/14 Fungi; Culture media therefor
    • C12N1/16 Yeasts; Culture media therefor
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1079 Screening libraries by altering the phenotype or phenotypic trait of the host
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/80 Vectors or expression systems specially adapted for eukaryotic hosts for fungi
    • C12N15/81 Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/48 Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C12N9/58 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from fungi
    • C12N9/60 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from fungi from yeast
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y304/00 Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/23 Aspartic endopeptidases (3.4.23)
    • C12Y304/23025 Saccharopepsin (3.4.23.25), i.e. yeast proteinase A
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures
    • C40B40/02 Libraries contained in or displayed by microorganisms, e.g. bacteria or animal cells; Libraries contained in or displayed by vectors, e.g. plasmids; Libraries containing only microorganisms or vectors

Definitions

  • the invention relates to breeding processes, and in particular, to a method of breeding yeast, such as Saccharomyces cerevisiae, exhibiting an improved phenotype for a biomanufacturing process.
  • the invention is especially concerned with a method of breeding yeast exhibiting improved recombinant protein production.
  • the invention also extends to yeast strains exhibiting an improved phenotype for a biomanufacturing process, obtained by the method of breeding according to the invention.
  • microorganisms have been used to produce recombinant proteins.
  • the cost of producing high-quality, biologically compatible, and commercially relevant quantities is often prohibitive.
  • the baker's yeast Saccharomyces cerevisiae
  • Saccharomyces cerevisiae is well known for producing high-quality, correctly folded heterologous proteins, often at lower costs and at higher yields than mammalian cells.
  • Alternative yeast such as Komagataella species, K. phaffii, K. pastoris, and K. pseudopastoris, including industrial strains also known as Pichia pastoris, are also used to make recombinant products, including biopharmaceutical proteins, at competitive costs of goods.
  • Pichia pastoris is often associated with a requirement for methanol and oxygen during large-scale manufacturing and generally uses non-exchangeable plasmid systems requiring selection with toxic zeocin, and may cause post-translational quality issues for biopharmaceuticals.
  • Alternative microbial production hosts, including prokaryotic/bacterial systems like E. coli, lack eukaryotic cellular machinery, e.g. for protein folding and secretion, frequently resulting in insoluble inclusion bodies and endotoxin contamination issues affecting downstream purification.
  • Previous work to improve product yields from S. cerevisiae has succeeded in developing manufacturing processes for a range of biopharmaceuticals, such as insulins, vaccines (e.g. virus-like particles), albumin and albumin fusion proteins.
  • the production hosts are often sub-optimal, and the costs of goods limit access to biopharmaceutical products for those who need them, thereby hindering the eradication of treatable human diseases.
  • the development of improved production yeast, especially for S. cerevisiae, which has a long, safe history of human use and is most commonly used for the manufacture of yeast-derived biologics approved by the FDA, would be advantageous. Improving product yields from S. cerevisiae would be especially beneficial.
  • each different recombinant protein product presents different challenges for production strain optimisation and generates a different burden on host cell metabolism. Therefore, the changes made to a single yeast strain improved for one product are unlikely to be optimal and may even be detrimental to the production of a different product. Consequently, bespoke production strain improvement is frequently required for each new product, which can also be sub-optimal, slow, expensive and labour-intensive.
  • each recombinant product might require an optimal combination of chaperone proteins for maximal secretion of the correctly folded protein. This might require multiple chaperones to be overexpressed, each requiring their expression level to be fine-tuned, thereby requiring the generation of multiple different strains and then slow, expensive testing in fermenters.
  • proteases by S. cerevisiae is complex, with each recombinant protein product usually being affected differently by the nearly 200 diverse peptidases reportedly encoded in its genome. Consequently, it can be challenging to identify which protease(s) is/are degrading a particular protein product. This is especially true when the recombinant protein is a substrate for multiple S. cerevisiae proteases, because even if the recombinant protein is expressed in a strain disrupted for one of the proteases which degrade it, it will still be degraded by the other(s), so it is difficult to know if an improvement has been made.
  • Proteinase A is a vacuolar aspartyl protease required for post-translational precursor maturation of vacuolar proteinases.
  • This protease is important for protein turnover after oxidative damage, and plays a protective role in acetic acid induced apoptosis. It is synthesised as a zymogen, self-activates and is targeted to the vacuole via the Vps10p-dependent endosomal vacuolar protein sorting pathway.
  • the Saccharomyces Genome Database (https://www.yeastgenome.org/) teaches that "Loss of Pep4 protein (Pep4p) activity leads to a shortened lifespan, and the PEP4 gene is essential under conditions of nutrient starvation, such as those used for spore formation. Homozygous diploid pep4 mutants are defective in sporulation." Accordingly, the inventors did not know whether breeding itself would be practical for sufficiently reducing proteolysis compared to PEP4 gene disruption, for cost-effective strain development and recombinant protein production. While it could have been anticipated that progeny could have some variation in the levels of proteinase A and other proteases even without PEP4 gene disruption (e.g.
  • the inventors designed a breeding method using two parents containing the pep4 gene disrupted with a dominant marker (KanMX) and two parents with a functional PEP4 gene. This allowed enough of the Pep4 protein to be present during breeding to allow sporulation. Selection for the KanMX gene (with G418) after multigenerational breeding then allows for the selection of different populations of haploids or diploids that contain the pep4 disruption and so cannot express functional proteinase A, which would otherwise carry out proteolysis. Surprisingly, the inventors identified that using their method, it was possible to achieve multigenerational breeding using at least one parent containing a disrupted pep4 gene.
  • a method of breeding to generate yeast exhibiting an improved phenotype for a biomanufacturing process comprising: i) breeding at least two parental yeast strains, wherein at least one parental yeast strain comprises a functionally deleted PEP4 gene, or a homologue, orthologue or paralogue thereof; and ii) selecting for progeny exhibiting an improved phenotype for a biomanufacturing process.
  • the inventors have demonstrated how yeast breeding and selection can be used in combination with traditional strain engineering, for example to increase recombinant protein yield by increased cellular productivity, or decrease losses by proteolysis, thereby providing improved production strains for a particular recombinant protein.
  • This overcomes a significant problem where the engineering important for recombinant protein production directly impacts the processes of mating, sporulation and germination, which are essential for breeding, without introducing any undesirable genome engineering that might destabilise the final production strains.
  • the method comprises at least three parental yeast strains, more preferably, at least four parental yeast strains.
  • each parental yeast strain is a haploid strain.
  • Genetically diverse parental strains are preferred for multigenerational breeding strategies to generate large populations (e.g. of 10^8 to 10^9) of genetically diverse haploid or diploid progeny suitable for screening to identify strains improved for recombinant protein production and performing quantitative trait loci (QTL) analysis to identify the genes, alleles and SNPs responsible for the improvements.
  • QTL quantitative trait loci
  • the method comprises breeding genetically diverse parental yeast strains.
  • the resultant progeny are genetically diverse.
  • the parental strains are not common laboratory strains.
  • a "common laboratory strain cell" will be well-known to the skilled person and may include those defined in Louis, E.J., 2016 [1].
  • a common laboratory strain may include yeast strains listed on the Saccharomyces Genome Database (SGD).
  • a common laboratory strain may include one of the following yeast strains: S288C (Reference Genome: GenBank GCF_000146045.2); W303 (GenBank: JRIU00000000.1); CEN.PK; JRY188; AH22; S150-2B; and CB11/63.
  • a common laboratory strain is not a natural strain, and therefore, may contain its own non-naturally occurring combination of alleles.
  • Heterozygous functionally deleted PEP4 diploids are not guaranteed to sporulate in all cases of genetically diverse strains, including heterozygous functionally deleted PEP4 diploids produced from parental strains preferred for breeding or heterozygous functionally deleted PEP4 diploids generated from some combinations of parental strains preferred for breeding.
  • the diploids produced may not contain sufficient proteinase A to sporulate at all, or to sporulate efficiently, or other factors present in these diverse strains may negatively impact sporulation. This may prevent multigenerational breeding even with at least one parental yeast strain comprising a functionally deleted PEP4 gene. This is because the diploids generated in the first rounds of breeding may not produce spores at all, so breeding cannot continue any further.
  • the efficiency of sporulation may be so low that either the genetic diversity present in the parents is inadequately represented in the progeny population or the numbers of spores produced are insufficient due to inefficient sporulation.
  • Sporulation efficiency is preferably high, with most diploids producing four spores (in a tetrad). In one embodiment, therefore, sporulation efficiency provides spores from at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90% and preferably 100% of the diploids produced by mating.
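The sporulation-efficiency embodiment above can be made concrete with a small calculation. The following is a minimal sketch, assuming efficiency is scored as the percentage of diploids that yield spores; the function names and the example counts are hypothetical and do not come from the patent.

```python
# Thresholds listed in the embodiment (percent of diploids providing spores).
THRESHOLDS = (10, 20, 30, 40, 50, 60, 70, 80, 90, 100)


def sporulation_efficiency(diploids_scored: int, diploids_with_spores: int) -> float:
    """Return the percentage of scored diploids that produced spores."""
    if diploids_scored <= 0:
        raise ValueError("diploids_scored must be positive")
    if not 0 <= diploids_with_spores <= diploids_scored:
        raise ValueError("diploids_with_spores must lie between 0 and diploids_scored")
    return 100.0 * diploids_with_spores / diploids_scored


def highest_threshold_met(efficiency_percent: float):
    """Return the highest listed threshold reached, or None if below 10%."""
    met = [t for t in THRESHOLDS if efficiency_percent >= t]
    return max(met) if met else None


# Hypothetical count: 150 of 200 scored diploids formed spores.
efficiency = sporulation_efficiency(200, 150)
print(efficiency, highest_threshold_met(efficiency))
```

In this illustrative case the cross would satisfy the "at least 70%" embodiment but not the "at least 80%" one.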
  • the method comprises breeding one parental yeast strain comprising a functionally deleted PEP4 gene, with one parental yeast strain comprising a functional PEP4 gene.
  • the two parental strains are haploids.
  • the progeny are diploid.
  • the method comprises breeding two parental yeast strains comprising a functionally deleted PEP4 gene, with two parental yeast strains comprising a functional PEP4 gene.
  • the four parental strains are haploids.
  • the progeny are diploid.
  • At least one parental strain comprises a selectable marker, which preferably functionally deletes, or is inserted into, each PEP4 gene.
  • the method comprises isolating homozygous functionally deleted PEP4 diploid progeny by selecting for the selectable marker.
  • the method comprises germinating the progeny and isolating spores comprising a functionally deleted PEP4 gene by selecting for the selectable marker.
  • the method further comprises mating the spores to form homozygous functionally deleted PEP4 diploid progeny.
  • the method comprises germinating and mating functionally deleted PEP4 haploid progeny with functional PEP4 haploid progeny.
  • the method comprises isolating homozygous functionally deleted PEP4 diploid progeny and heterozygous functionally deleted PEP4 diploid progeny by selecting for the selectable marker.
  • the method comprises breeding at least one parental yeast strain comprising a functionally deleted PEP4 gene, with at least one parental yeast strain comprising a functional PEP4 gene.
  • this results in the production of heterozygous functionally deleted PEP4 diploid progeny and/or homozygous functionally deleted PEP4 diploid progeny.
  • the selectable marker may be an auxotrophic marker.
  • the selectable marker may be selected from a group of markers consisting of: LEU2, TRP1, HIS3, HIS4, URA3, URA5, SFA1, ADE2, MET15, LYS5, LYS2, ILV2, FBA1, PSE1, PDI1 and PGK1.
  • markers are auxotrophic markers, such as LEU2, where host cells containing the deleted auxotrophic marker can also be grown in the presence of a media supplement, e.g. leucine, when required.
  • Dominant selectable markers typically used in yeast include KanMX (for G418 selection, also known as kanMX), HygMX (for hygromycin B selection, also known as hygMX), NatMX (for nourseothricin selection, also known as natMX), PatMX (for bialaphos selection, also known as patMX), AUR1 (AUR1-C, for selection with Aureobasidin A / LY295337) and also amdSYM (for fluoroacetamide selection), as described by Goldstein et al., 1999 [3], Heidler and Radding 1995 [4] and Solis-Escalante et al. 2013 [5].
  • KanMX for G418 selection, also known as kanMX
  • HygMX for hygromycin B selection, also known as hygMX
  • NatMX for nourseothricin selection
  • PatMX for bialaphos selection
  • the selectable marker may be a dominant selectable marker.
  • the dominant selectable marker may be selected from the group consisting of: KanMX, HygMX, NatMX, AUR1-C, PatMX and amdSYM.
  • the dominant selectable marker is KanMX.
  • Yeast strains comprising a dominant selectable marker will be resistant to a selection agent, such as G418 (Geneticin) when the marker is KanMX.
  • the method comprises breeding the yeast in the presence of a suitable selection agent which is selected based on the selectable marker.
  • the dominant selectable marker is KanMX, and is selected for using the selectable agent G418 (Geneticin).
  • G418 is an aminoglycoside antibiotic similar in structure to gentamicin B1, which blocks polypeptide synthesis by inhibiting the elongation step.
  • the method comprises breeding the yeast in the presence of G418 or gentamicin B1.
  • the method comprises allowing germination of genetically diverse progeny in the presence of G418 to select for pep4::KanMX spores only.
  • the spores are then preferably mated to form homozygous functionally deleted PEP4 diploids (pep4:pep4 diploids).
  • a mixed population of heterozygous functionally deleted PEP4 diploid progeny (pep4:PEP4) and homozygous functionally deleted PEP4 diploid progeny (pep4:pep4) (i.e. "mixed diploids", MD) may be prepared by allowing germination and mating of pep4::KanMX and PEP4 haploids. This is followed by selection against functional PEP4 homozygous diploids (PEP4:PEP4) with G418. Therefore, multigenerational breeding can be achieved to produce genetically diverse libraries with a range of proteinase A genotypes, including an absence of the proteinase A gene, despite this protease being essential for the breeding process.
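The expected make-up of such a "mixed diploid" population can be estimated with simple genetics arithmetic. The sketch below is illustrative only: it assumes random pairing between haploids, ignores mating type and any fitness differences, and treats G418 as removing every PEP4:PEP4 diploid. None of these simplifications is stated in the source, and the function name is hypothetical.

```python
from fractions import Fraction


def mixed_diploid_fractions(pep4_haploid_fraction):
    """Expected genotype fractions among G418-resistant diploids.

    pep4_haploid_fraction: share of the haploid pool carrying pep4::KanMX.
    Assumes random pairing of haploids (a simplification, not from the source).
    """
    f = Fraction(pep4_haploid_fraction)
    if not 0 < f <= 1:
        raise ValueError("fraction must be in (0, 1]")
    homozygous_deleted = f * f          # pep4:pep4
    heterozygous = 2 * f * (1 - f)      # pep4:PEP4
    # PEP4:PEP4 diploids, (1 - f)^2 of the total, carry no KanMX and are lost on G418.
    survivors = homozygous_deleted + heterozygous
    return {
        "pep4:pep4": homozygous_deleted / survivors,
        "pep4:PEP4": heterozygous / survivors,
    }


# Equal numbers of pep4::KanMX and PEP4 haploids in the mating pool.
print(mixed_diploid_fractions(Fraction(1, 2)))
```

Under these assumptions, an equal mix of the two haploid types gives a surviving population that is one third pep4:pep4 and two thirds pep4:PEP4.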
  • the functionally deleted PEP4 gene may be a disrupted PEP4 gene.
  • the PEP4 gene may be disrupted by replacing it with a nonfunctional copy, or introducing a mutation into the PEP4 gene.
  • a functionally deleted PEP4 gene may not comprise a disrupted gene, but will instead be disrupted at the protein level, for example using an inhibitor.
  • the PEP4 gene may be functionally disrupted using gene silencing, such as RNA interference (RNAi) or small nuclear RNA (snRNA), as described in Williams T. C. et al., Microb Cell Fact 14, 43 (2015).
  • RNAi RNA interference
  • snRNA small nuclear RNA
  • the improved phenotype for a biomanufacturing process may be selected from the following phenotypes: increased protein product yield, reduced cell lysis, modified shear resistance, modified sedimentation, improved product harvesting, and altered host cell protein profile.
  • the improved phenotype for a biomanufacturing process may be increased plasmid, genomic or phenotypic stability, preferably over multiple generations during continuous culture.
  • the improved phenotype for a biomanufacturing process may be improved growth phenotypes, including modified media requirements, modified growth temperature or temperature range, and modified growth pH or pH range.
  • the improved phenotype for a biomanufacturing process may be altered post-translational modification, including reduced proteolysis or modified glycosylation.
  • the improved phenotype for a biomanufacturing process is improved production of biologicals, biologics, therapeutic proteins, vaccines, recombinant proteins, and fragments, conjugates or fusions thereof.
  • the improved phenotype for a biomanufacturing process is improved recombinant protein production.
  • a yeast cell exhibiting "improved recombinant protein production" may also be defined as a yeast cell exhibiting increased recombinant protein production.
  • improved recombinant protein production comprises improved secretion.
  • the yeast cell obtained by the method according to the invention may exhibit an increase in recombinant protein production or secretion compared to a yeast cell that has not been obtained using the claimed method.
  • the yeast cell obtained by the method according to the invention may exhibit an increase in recombinant protein production or secretion compared to a yeast cell that has not been obtained using the claimed method, of at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%.
  • the yeast cell may exhibit an increase in recombinant protein production or secretion compared to a yeast cell that has not been obtained using the claimed method, of at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold or at least 10-fold.
  • the yeast cell may exhibit an increase in recombinant protein production or secretion compared to a yeast cell that has not been obtained using the claimed method, with total soluble intracellular protein or culture supernatant yields of at least 1 g/L, at least 2 g/L, at least 3 g/L, at least 4 g/L, at least 5 g/L, at least 6 g/L, at least 7 g/L, at least 8 g/L, at least 9 g/L, at least 10 g/L, at least 20 g/L, at least 30 g/L, at least 40 g/L, at least 50 g/L, at least 60 g/L, at least 70 g/L, at least 80 g/L, at least 90 g/L, at least 100 g/L, at least 200 g/L, or at least 300 g/L.
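The percentage and fold thresholds above describe the same comparison on two scales. As a minimal sketch (the function names and the titres are hypothetical, not values reported in the patent), the two can be computed from a baseline yield and an improved yield in the same units:

```python
def percent_increase(baseline: float, improved: float) -> float:
    """Percentage increase of `improved` over `baseline` (same units, e.g. g/L)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (improved - baseline) / baseline


def fold_change(baseline: float, improved: float) -> float:
    """Ratio of improved to baseline yield; 2.0 means a 2-fold yield."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return improved / baseline


# Hypothetical titres: reference strain 2.0 g/L, bred strain 5.0 g/L.
print(percent_increase(2.0, 5.0), fold_change(2.0, 5.0))
```

Note that on this convention an increase of 100% corresponds to a 2-fold yield, which is where the percentage list ends and the fold list begins.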
  • the method further comprises the step of performing QTL analysis to identify the causal genetics associated with the improved phenotype for a bioman
  • the method comprises performing at least
  • the method comprises performing at least 7, at least 8, at least 9, at least 10, at least 11 or at least 12 generations of breeding. Most preferably, the method comprises performing at least 13, at least 14, at least 15, at least 16, at least 17 or at least 18 generations of breeding.
  • the method comprises performing multigenerational breeding. Multigenerational breeding may be defined as breeding comprising multiple rounds of mating, sporulation and germination, for example to maximise meiotic events and increase the genetic diversity of the progeny compared to the parents.
  • the method comprises breeding to obtain at least 10^4 progeny spores, at least 10^5 progeny spores, at least 10^6 progeny spores, or at least 10^7 progeny spores. More preferably, the method comprises breeding to obtain at least 10^8 progeny spores, or at least 10^9 progeny spores.
  • the methods of the invention comprise breeding yeast exhibiting an improved phenotype for a biomanufacturing process, preferably improved recombinant protein production.
  • the method further comprises transforming at least one parental yeast strain with an expression vector encoding at least one recombinant protein.
  • the method may comprise transforming the yeast progeny with an expression vector encoding at least one recombinant protein.
  • the expression vector is a whole-2-micron family plasmid.
  • a whole-2-micron family plasmid is an expression vector comprising at least 50%, at least 60%, at least 70%, at least 80%, preferably at least 90%, or most preferably 100% of the sequence of a natural yeast 2-micron plasmid.
  • the expression vector may be a stable partial-2-micron plasmid.
  • a stable partial-2-micron plasmid is an expression vector comprising the 2-micron origin of replication which is dependent on a whole-2-micron family plasmid, a whole-2-micron expression plasmid or functions provided from these plasmids for stable replication and maintenance.
  • the expression vector may be an integrative plasmid (e.g. a plasmid that integrates into the yeast genome).
  • the expression vector may be a centromeric plasmid (e.g. containing a centromeric sequence and/or an autonomous replicating sequence).
  • the expression vector may be an artificial chromosome (e.g.
  • a yeast artificial chromosome or YAC a yeast artificial chromosome
  • the inventors have overcome this problem by using a whole-2-micron-family plasmid instead of the partial-2-micron plasmid, because the whole-2-micron-family plasmid is amplified to a relatively constant copy number in each different strain, i.e. the copy number for a particular plasmid will be approximately the same in transformants of the same strain.
  • At least one parental yeast strain does not comprise a partial-2-micron family plasmid. More preferably, the at least two parental yeast strains do not comprise a partial-2-micron family plasmid.
  • the method comprises transforming at least one parental yeast strain with a whole-2-micron family plasmid.
  • the whole-2-micron plasmid of Saccharomyces cerevisiae is a small circular, multicopy DNA element that resides in the yeast nucleus at a copy number of about 40-60 per haploid cell.
  • Examples of whole-2-micron family plasmids include Scp1, Scp2 and Scp3 in S.
  • plasmid pSM1 from Zygosaccharomyces fermentati
  • plasmid pKD1 from Kluyveromyces drosophilarum
  • an un-named plasmid from Pichia membranaefaciens.
  • whole-2-micron family plasmids share a series of common features in that they possess two inverted repeats on opposite sides of the plasmid, have a similar size around 6-kbp (range 4757 to 6615-bp), at least three open reading frames, one of which encodes for a site specific recombinase (such as FLP in 2μm) and an autonomously replicating sequence (ARS), also known as an origin of replication (ori), located close to the end of one of the inverted repeats.
  • site specific recombinase such as FLP in 2μm
  • ARS autonomously replicating sequence
  • ori origin of replication
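The shared features of natural whole-2-micron family plasmids listed above can be captured as a simple checklist. This is only an illustrative screening heuristic, not a definition from the patent: the class, field and function names are hypothetical, the example values are made up, and an engineered expression plasmid carrying an insert would be expected to exceed the natural size range.

```python
from dataclasses import dataclass


@dataclass
class PlasmidFeatures:
    size_bp: int
    inverted_repeats: int
    open_reading_frames: int
    has_site_specific_recombinase: bool  # e.g. FLP in the 2-micron plasmid
    has_ars: bool                        # autonomously replicating sequence (ori)


def has_family_features(p: PlasmidFeatures) -> bool:
    """True if a plasmid shows the common features described in the text."""
    return (
        4757 <= p.size_bp <= 6615          # size range quoted for the family
        and p.inverted_repeats == 2        # two inverted repeats
        and p.open_reading_frames >= 3     # at least three ORFs
        and p.has_site_specific_recombinase
        and p.has_ars
    )


# Made-up annotation of a natural plasmid of about 6.3 kbp.
candidate = PlasmidFeatures(6300, 2, 4, True, True)
print(has_family_features(candidate))
```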
  • the method comprises transforming at least one parental yeast strain with an engineered whole-2-micron family plasmid.
  • an engineered whole-2-micron family plasmid is a whole-2-micron family plasmid which has been engineered for recombinant protein production (a whole-2-micron expression plasmid), preferably where recombinant protein production is inactive, repressed or uninduced during breeding.
  • the expression plasmid is introduced into one or more of the parent strains at the start of the multigenerational breeding, while it will be present in the final population, e.g. >10^8 progeny, its presence is likely to be counter-selected against the optimal recombinant protein production strains generated by breeding. This is because the presence of the plasmid and the metabolic burden on the strain producing recombinant proteins, even at a low level, often allows poor expressing strains to outgrow the high expressing strains during the many generations of growth required.
  • inducible yeast promoter systems which can be used to switch recombinant protein expression on and off, these are generally not suitable for large-scale biopharmaceutical manufacture, either due to cost or undesirable properties of the inducer within the process stream, e.g. antibiotics.
  • Regulated S. cerevisiae promoters allow the control of timing and gene expression levels, achievable through manipulation of the growth medium by the addition of metabolites or ions.
  • the galactose (GAL1-10) promoters allow regulation of gene expression through the carbon source, galactose for induction, and glucose for repression.
  • the phosphate (PHO5) and the copper (CUP1) promoters are inducible by phosphate and copper, respectively.
  • regulation of these promoters often interferes with the cellular metabolism, due to the changes in growth media, and in many cases does not completely shut off transcription. This can be overcome by using the tetracycline (Tet-On/Off) promoters, which are either inducible or repressible.
  • Tet-Off activates expression in the absence of doxycycline
  • Tet-On activates in the presence of doxycycline. While the tetracycline derivative, doxycycline, does not interfere with the yeast cellular metabolism it is undesirable for biopharmaceutical manufacturing. It also requires a specific strain background or additional manipulations of the strains in use.
  • the inventors provided a solution by transforming at least one parental strain with an expression plasmid containing a whole-2-micron-based plasmid derived from one of the parental strains containing a repressible promoter from the S. cerevisiae MET17 gene (also known as YLR303W, MET15 or MET25) and growing the breeding population in the continual presence of methionine at a level sufficient to repress expression of the recombinant product (below approximately 0.05 mM methionine).
  • the method comprises growing the yeast in a media comprising less than 0.05 mM methionine, less than 0.04 mM methionine, less than 0.03 mM methionine, less than 0.02 mM methionine, or less than 0.01 mM methionine.
  • the method comprises growing the yeast in a media which does not comprise methionine.
  • the repressible MET17 promoter can be switched on for the production of the recombinant protein product.
  • the whole-2-micron family plasmid comprises a repressible promoter.
  • the repressible promoter is the MET17 promoter.
  • the method comprises breeding the yeast in the presence of methionine.
  • the level of methionine in the culture media needs to be carefully controlled to ensure transcription is off when needed, because the yeast can metabolise methionine until its concentration eventually falls to a level, below which, transcription is activated.
  • adding methionine to the culture media at 20 mM or above is sufficient to repress the MET17 promoter for several days.
  • the method comprises breeding the yeast in the presence of at least 0.05 mM methionine, at least 0.1 mM methionine, at least 0.5 mM methionine, or at least 1 mM methionine.
  • the method comprises breeding the yeast in the presence of at least 5 mM methionine, at least
  • the repressible MET17 promoter can be switched off during breeding.
  • an inducer is not required for expression during industrial production campaigns.
  • the repressible MET17 promoter provides additional options for optimising fermentation and production of proteins which are deleterious or toxic to the production host, e.g. by separating the growth phase from the production phase. Furthermore, this can be achieved using methionine which is both cost-effective and safe for biopharmaceutical manufacturing.
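  • As an illustration only, the methionine switch described above can be sketched in Python. The function name and the treatment of approximately 0.05 mM as a sharp cut-off are assumptions made for this sketch; the text gives the threshold as approximate, and the concentration falls over time as the yeast metabolise methionine.

```python
# Hypothetical helper: classifies the expected MET17 promoter state from the
# methionine concentration, using the approximate 0.05 mM figure in the text.
DEREPRESSION_THRESHOLD_MM = 0.05  # assumption: treated as a sharp cut-off


def met17_promoter_state(methionine_mm: float) -> str:
    """Return 'repressed' (breeding) or 'active' (production)."""
    if methionine_mm < 0:
        raise ValueError("concentration cannot be negative")
    if methionine_mm >= DEREPRESSION_THRESHOLD_MM:
        return "repressed"
    return "active"


if __name__ == "__main__":
    for conc in (20.0, 1.0, 0.05, 0.01, 0.0):
        print(f"{conc} mM methionine -> {met17_promoter_state(conc)}")
```

  • A culture dosed at 20 mM would therefore be classed as repressed, and media lacking methionine as active, matching the breeding and production phases described above.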
  • the method comprises obtaining yeast progeny (preferably between 10⁸ and 10⁹ yeast progeny), which do not comprise a whole-2-micron expression plasmid (i.e. cir° progeny).
  • yeast progeny are then transformed with a whole-2-micron expression plasmid to obtain transformed yeast progeny.
  • greater than 10³, or between 10⁴ and 10⁶ transformed yeast progeny are obtained.
  • the method preferably further comprises back-crossing the transformed yeast progeny with the yeast progeny that do not comprise a whole-2-micron expression plasmid (i.e. the cir° progeny).
  • the yeast progeny are then bred for at least two generations, at least three generations, at least four generations, or at least five generations.
  • the method comprises selecting an individual strain which does not comprise a whole-2-micron expression plasmid (i.e. cir° progeny), from the yeast progeny.
  • the method may comprise selecting at least two individual strains, at least three individual strains, at least four individual strains, or at least five individual strains which do not comprise a whole-2-micron expression plasmid (i.e. cir° progeny), from the yeast progeny.
  • the selected yeast progeny are then transformed with a whole-2-micron expression plasmid to obtain transformed yeast progeny.
  • the method preferably further comprises back-crossing the transformed yeast progeny with the yeast progeny that do not comprise a whole-2-micron expression plasmid (i.e. the cir° progeny).
  • the yeast progeny are then bred for at least two generations, at least three generations, at least four generations, or at least five generations. It is desirable to have haploid progeny to facilitate QTL analysis and follow-on genetic improvements. Accordingly, in one preferred embodiment, the method comprises selecting for haploid progeny exhibiting an improved phenotype for a biomanufacturing process.
  • the population of yeast cells obtained by the method of the invention will contain both mating types (MAT-a and MAT-alpha). Once germinated, haploids of the opposite mating-type tend to mate to form diploids, unless physically separated. This is undesirable if haploid screening of large populations (e.g. 10⁸-10⁹) is required to identify a final haploid production strain. If grown together as a population with both mating types present, mating can interfere with screening processes needed to detect preferred strains with improved phenotypes.
  • the inventors achieved this by germinating spores derived from a population comprising heterozygous functionally deleted PEP4 diploids, e.g. pep4/PEP4 heterozygous diploids, in the presence of an excess of a strain with one mating type and allowing for simple selective removal of diploids, e.g. by flow cytometry. For example, germination may be performed in the presence of an excess of a single mating-type strain containing at least one marker for subsequent removal of diploids arising from it.
  • the method comprises germinating a population of spores derived from heterozygous functionally deleted PEP4 diploids in the presence of a yeast strain of one mating type (i.e. MAT-a or MAT-alpha).
  • the population of heterozygous functionally deleted PEP4 diploids are germinated in the presence of an excess of a yeast strain of one mating type (MAT-a or MAT-alpha).
  • an excess of one mating type comprises at least 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or preferably 10-fold or 20-fold as many cells as the number of germinated spores.
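  • The fold-excess figures above are simple multiples of the number of germinated spores. A minimal sketch of that arithmetic follows; the function name and the example spore count are hypothetical.

```python
def excess_cells_needed(germinated_spores: int, fold_excess: int = 10) -> int:
    """Number of single-mating-type cells for a given fold excess.

    The text lists 2-fold up to 20-fold excess, so values below 2 are rejected.
    """
    if germinated_spores < 0:
        raise ValueError("spore count cannot be negative")
    if fold_excess < 2:
        raise ValueError("an excess is taken here to mean at least 2-fold")
    return germinated_spores * fold_excess


if __name__ == "__main__":
    # Hypothetical example: 10^8 germinated spores with a 10-fold excess.
    print(excess_cells_needed(10**8, 10))
```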
  • the excess yeast strain produces a fluorogenic protein.
  • the method further comprises selecting for progeny (e.g. haploid or diploid progeny) exhibiting an improved phenotype for a biomanufacturing process using flow cytometry or growth on a selective media.
  • Markers may include, but are not limited to, auxotrophic markers or fluorogenic markers, e.g. fluorescent protein expression. These markers are suitable for the selection of large numbers of individual progeny of different types (e.g. >10⁴ haploids of a single mating type) by growth on an appropriate media or by flow cytometry.
  • the method comprises allowing recombinant protein production to take place for at least 2 hours, at least 4 hours, at least 6 hours, at least 8 hours, at least 12 hours, at least 16 hours, at least 20 hours, at least 24 hours, at least 48 hours, at least 72 hours, or at least 96 hours prior to selecting for progeny (e.g. haploid or diploid progeny).
  • the method comprises allowing recombinant protein production to take place for between 12 and 24 hours, prior to selecting for progeny (such as haploid or diploid progeny), exhibiting an improved phenotype for a biomanufacturing process.
  • haploid libraries enriched for a single mating type were used in subsequent screening processes, e.g., flow cytometry and/or microtitre plate culture, to identify strains improved for recombinant protein production, e.g., improved recombinant protein secretion or intracellular protein production.
  • Haploid individuals can also be obtained by transformation of pep4 pep4 diploids exhibiting improved phenotypes for a biomanufacturing process, with a plasmid to temporarily complement the pep4-disruption to allow sporulation, e.g. a genetically stable CEN-vector expressing PEP4.
  • the method comprises transforming a homozygous functionally deleted PEP4 diploid, exhibiting an improved phenotype for a biomanufacturing process, with a plasmid that complements the functional deletion of PEP4.
  • the plasmid is a genetically stable CEN-vector expressing PEP4. Spores can then be obtained by standard methods, e.g. tetrad analysis.
  • the method comprises using tetrad analysis to obtain spores.
  • Haploid progeny can subsequently be cured of the complementing PEP4 plasmid before characterisation of haploids that contain the desired combination of alleles from the parents and also exhibit improved phenotypes for a biomanufacturing process.
  • the method preferably comprises curing the haploid progeny of the complementing PEP4 plasmid, and selecting for haploid progeny exhibiting an improved phenotype for a biomanufacturing process.
  • Curing may be achieved by a variety of methods, including counter-selecting against selectable markers, such as URA3 with 5-Fluoroorotic acid (5FOA) or LYS2 with alpha-aminoadipate, which may be combined with multiple generations of cell division without selection for the CEN-vector.
  • the method comprises selecting for diploid progeny exhibiting an improved phenotype for a biomanufacturing process.
  • the method comprises selecting for homozygous functionally deleted PEP4 diploid progeny, or selecting for a mixed population of heterozygous functionally deleted PEP4 diploid progeny and homozygous functionally deleted PEP4 diploid progeny.
  • the yeast is Pichia pastoris (Komagataella species, such as
  • the yeast is a Saccharomyces species yeast, such as Saccharomyces cerevisiae. Most preferably, the yeast is Saccharomyces cerevisiae.
  • At least one parental yeast strain comprises a functionally deleted UBC4 gene, or a homologue, orthologue or paralogue thereof.
  • UBC4 encodes a ubiquitin conjugating enzyme that links ubiquitin (Ubi4p) to lysine residues of target proteins.
  • At least one parental yeast strain comprises a functionally deleted YPS1 gene, or a homologue, orthologue or paralogue thereof.
  • YPS1 also known as YAP3 with systematic name YLR120C
  • GPI glycosylphosphatidylinositol
  • Additional engineering to improve recombinant protein production may be desirable in pep4 and yps1 disrupted strains. Examples include disruption of genes to increase recombinant protein production, such as UBC4, MOT2 and GHS1, overexpression of genes for chaperones, such as PDI1 and ERO1, or manipulation of the unfolded protein response, e.g. by HAC1i overexpression.
  • At least one parental yeast strain comprises a functionally deleted MOT2 and/or GHS1 gene, or a homologue, orthologue or paralogue thereof.
  • at least one parental yeast strain comprises an over-expression of genes encoding chaperones, for example, overexpression of PDI1 and ERO1.
  • at least one parental yeast strain comprises an over-expression of HAC1i.
  • the at least two parental yeast strains comprise a selectable or auxotrophic marker.
  • the at least two parental yeast strains comprise a ura3 or a lys2 marker.
  • the selectable or auxotrophic markers are located at the same genomic locus in the at least two parental yeast strains, e.g. at the LYS2-locus.
  • the at least two parental yeast strains are genetically diverse.
  • genetically diverse strains are defined as strains that are at least 0.001%, at least 0.005%, at least 0.01%, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.06%, at least 0.07%, at least 0.08%, or at least 0.09% different by whole genome comparisons performed by Neighbour-joining based on SNP differences.
  • the strains are at least 0.1%, at least 0.2%, at least 0.3%, at least 0.4%, at least 0.5%, at least 0.6%, at least 0.7%, at least 0.8%, at least 0.9 %, or preferably at least 1%, or more preferably at least 2%, or at least 5% different by whole genome comparisons performed by Neighbour-joining based on SNP differences.
  • Strains are preferably compared to the yeast reference strain S288C.
  • strains may be compared to each other, for example, the parental strains used in a cross are compared by whole genome comparisons performed by Neighbour-joining based on SNP differences.
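  • The percentage thresholds above can be illustrated with a short calculation. The text does not state how the percentage is derived from the SNP data, so the simple proportion used here (SNP differences divided by the number of sites compared) is an assumption, as are the function names and example numbers.

```python
def percent_snp_difference(snp_differences: int, sites_compared: int) -> float:
    """Percentage of compared sites that differ (assumed simple proportion)."""
    if sites_compared <= 0:
        raise ValueError("sites_compared must be positive")
    if not 0 <= snp_differences <= sites_compared:
        raise ValueError("snp_differences must lie between 0 and sites_compared")
    return 100.0 * snp_differences / sites_compared


def is_genetically_diverse(percent_difference: float, threshold: float = 0.1) -> bool:
    """Compare against one of the thresholds listed in the text (default 0.1%)."""
    return percent_difference >= threshold


if __name__ == "__main__":
    # Hypothetical example: 12,000 SNPs across 12,000,000 compared sites.
    pct = percent_snp_difference(12_000, 12_000_000)
    print(pct, is_genetically_diverse(pct))
```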
  • breeding of yeast refers to the process of mating, followed by the generation of progeny. Sporulation is required for the generation of progeny, and occurs when yeast exit the mitotic cell cycle and enter into meiosis, leading to spore formation.
  • homologous genes will be well understood by the skilled person to mean a gene or genetic region that is similar in sequence, structure or evolutionary origin to a gene or genetic region in another species or organism. For example, homologous genes may be derived from a single common ancestral gene present in the common ancestor of different organisms. Homologous genes will encode proteins with the same or similar function in different species, and may also be referred to as an "orthologue".
  • orthologue will be well understood by the skilled person to mean one of two or more homologous gene sequences found in different species.
  • a yeast strain exhibiting an improved phenotype for a biomanufacturing process, wherein the yeast strain is obtained by the method according to the first aspect.
  • the yeast strain is an engineered yeast strain.
  • the yeast is Pichia pastoris Komagataella species, such as K. phaffii, K. pastoris, and K.
  • yeast is a Saccharomyces species yeast, such as Saccharomyces cerevisiae. Most preferably, the yeast is Saccharomyces cerevisiae.
  • the yeast strain comprises a functionally deleted protease gene. More preferably, the yeast strain comprises a functionally deleted PEP4 gene, or a homologue or paralogue thereof.
  • a yeast library obtained by the method according to the first aspect.
  • the yeast library is genetically diverse.
  • the yeast library comprises an engineered whole-2 micron plasmid.
  • the engineered whole-2 micron plasmid is engineered for recombinant protein production, where recombinant protein production is inactive, repressed or uninduced during breeding.
  • a product produced by the yeast strain according to the second aspect or the yeast library according to the third aspect is provided.
  • the product is recombinant.
  • the product may be a peptide, polypeptide or protein.
  • the product may comprise at least 10, 20, 30, 40 or 50 amino acids.
  • the product comprises at least 100, 200, 300, 400 or 500 amino acids. More preferably, the product comprises at least 1000, 2000, 3000, 4000 or 5000 amino acids.
  • the product is purified.
  • the product is at least 95%, 96%, 97%, 98% or 99% pure.
  • the term "recombinant protein" may be any protein not naturally produced by the expression host, including fusion proteins, tagged proteins, muteins, analogues, derivatives, domains, precursors and fragments of any protein or polypeptide, including, but not limited to, the following proteins (or other polypeptides) of interest.
  • Proteins (or other polypeptides) of interest include albumin, transferrin, lactoferrin, immunoglobulin (such as an Fab fragment or single-chain antibody, including, ScFvs, VHHs and VNARs), (haemo)globin, leghaemoglobin, myoglobin, blood clotting factors (such as factors II, VII, VIII, IX), von Willebrand's factor, tick anticoagulant peptide, endostatin, angiostatin, ice-structuring proteins, hydrophobins, interferons, interleukins, alpha-1-antitrypsin, insulin, GLP-1, glucagon, calcitonin, cell surface receptors, fibronectin, prourokinase, (pre-pro)-chymosin, antigens for vaccines (including virus-like particles), t-PA, urokinase, prourokinase, hirudin, tumour necrosis factor, G-CSF,
  • the protein may be a viral, microbial, fungal, plant or animal protein, for example, a mammalian protein. In one embodiment, it is a human protein.
  • the recombinant protein may be a protein endogenous to the host, such as an enzyme, for which production has been improved by strain engineering in a recombinant eukaryotic cell.
  • FIG. 1 shows how a mixed population of heterozygous pep4 PEP4 and homozygous pep4 pep4 diploids ("mixed diploids", MD) was prepared by allowing germination and mating of pep4::KanMX and PEP4 haploids followed by selection against PEP4:PEP4 homozygous diploids with G418.
  • This figure also shows how a population of pep4::pep4 homozygous diploids ("pure diploids", PD) can be produced by selecting only pep4::KanMX haploids using G418 and allowing them to mate.
  • Figure 2 shows specific mCherry activities for progeny yeast strains grown at 30°C. The average across four replicates is displayed (error bars show 1 SD from 4 replicates). Significant diversity in the mCherry phenotype is observed, with more than a 10-fold difference observed between the highest and lowest producers.
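  • The summary statistics named for Figure 2 (mean of four replicates, 1 SD error bars, fold difference between highest and lowest producers) can be sketched as follows. The text does not say whether the sample or the population standard deviation was used; the sample SD is assumed here, and the function names and readings are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def summarise_replicates(values: List[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) for one strain's replicates."""
    if len(values) < 2:
        raise ValueError("need at least two replicates for a standard deviation")
    return mean(values), stdev(values)


def fold_range(strain_means: Dict[str, float]) -> float:
    """Ratio of the highest to the lowest mean specific activity."""
    lowest, highest = min(strain_means.values()), max(strain_means.values())
    if lowest <= 0:
        raise ValueError("lowest mean must be positive to compute a ratio")
    return highest / lowest


if __name__ == "__main__":
    # Hypothetical specific-activity readings, four replicates per strain.
    data = {"strain_A": [1.0, 1.2, 0.9, 1.1], "strain_B": [11.5, 12.4, 12.0, 12.1]}
    means = {name: summarise_replicates(reps)[0] for name, reps in data.items()}
    print(means, fold_range(means))
```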
  • Figure 3 shows plots of mCherry against amylase specific activities for yeast strains grown in four microtitre plates at 30°C. While the mCherry levels range widely from low to high, the amylase levels tend not to fall so far towards zero, indicating that proteolysis has affected the mCherry and amylase domains differently.
  • Figure 4 shows typical populations selected by flow cytometry where low and non-producers of recombinant protein (based on mCherry signal) can be avoided (fraction 1) and populations with increasing intracellular recombinant protein production (fractions 2 to 6) can be selected for analysis.
  • For RBD3-mCherry expression (from construct pEV25) in a pure diploid (pep4::pep4 homozygotes) population after at least four generations of breeding, individual cells were grown in 48-well microtitre plates. Cells from fraction 1 were cream coloured (indicating yeast with no/low mCherry production), whereas cells from fractions 2 to 6 were increasingly pink/red in visual appearance.
  • Figure 5 shows increased recombinant RBD4-mCherry protein production in progeny selected from a genetically diverse yeast library lacking proteinase A expression. This led to the selection of strains DH01, DH02 and DH03 with significantly increased supernatant mCherry levels compared to the parental strains.
  • Figure 5 shows reducing SDS-PAGE analysis of culture supernatant from a strain improved for RBD4-mCherry protein production by multigenerational breeding.
  • Figure 6 shows enrichment of high mCherry producers after multiple rounds of flow cytometry. Cells were grown overnight to mid-log phase before cell sorting. Typically, the top 2% of the population for mCherry fluorescence was collected in the first passage and then grown overnight at least once. Further flow cytometry passages were optionally performed before the top 2% was sorted into 96-well microtitre plates (1 cell/well) containing BMMD+CSM-Leu+Met.
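  • The "top 2%" gate can be illustrated on a list of per-cell fluorescence values. This is a sketch of the selection logic only, not of any cytometer software; the function name and the synthetic intensities are hypothetical.

```python
from typing import List


def top_fraction_gate(intensities: List[float], fraction: float = 0.02) -> List[float]:
    """Return the brightest `fraction` of events, brightest first."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(intensities, reverse=True)
    # Keep at least one event so that small populations still yield a cell.
    n_keep = max(1, round(len(ranked) * fraction))
    return ranked[:n_keep]


if __name__ == "__main__":
    # Hypothetical population of 100 events with intensities 0..99.
    population = [float(i) for i in range(100)]
    print(top_fraction_gate(population))
```

  • Repeating the gate on the regrown population corresponds to the further passages described for Figure 6.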
  • Figure 7 shows yield improvement by breeding for commercially relevant VLP proteins, HPV16(L1)-mCherry and mCherry-HBsAg (with and without an N-terminal 6His-tag containing methionine as the first amino acid), from pEV51, pEV52 and pEV56 expression constructs, respectively.
  • a Cpl2 cir° PD library was transformed with the expression plasmids, and expression studies showed that yeast genome diversity gave variable expression levels for all VLP-mCherry fusions with different average expression levels of each VLP-fusion.
  • Back-crossing followed by additional rounds of breeding were then performed to increase the diversity of plasmid containing cells. Individuals with enhanced expression were selected for all VLPs.
  • FIG. 7 shows the average of the diverse population.
  • Figure 8 shows the supernatant mCherry signal corrected for growth (whole culture OD600) for the parental control strains, the randomly picked colonies and the individuals enriched for improved recombinant protein secretion by flow cytometry. Breeding has generated individuals with significantly improved product yields, which can be further enriched by flow cytometry. Secretion of the recombinant VHH-mCherry fusion protein has been improved at least 10-20 fold (12-50 fold compared to parental strains) by the breeding and selection process for strains with protease disruptions. The key describes strains/populations plotted in the same order from the left to the right.
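  • A minimal sketch of the growth correction and fold-improvement arithmetic referred to for Figure 8 follows. It assumes that "corrected for growth" means dividing the supernatant signal by the whole-culture OD600, which the text implies but does not spell out; the function names and example readings are hypothetical.

```python
def growth_corrected_signal(supernatant_signal: float, od600: float) -> float:
    """Supernatant fluorescence divided by whole-culture OD600 (assumed correction)."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return supernatant_signal / od600


def fold_improvement(strain_value: float, parental_value: float) -> float:
    """Ratio of a progeny strain's corrected signal to the parental value."""
    if parental_value <= 0:
        raise ValueError("parental value must be positive")
    return strain_value / parental_value


if __name__ == "__main__":
    # Hypothetical readings for one progeny strain and one parental strain.
    progeny = growth_corrected_signal(500.0, 10.0)
    parent = growth_corrected_signal(20.0, 8.0)
    print(progeny, parent, fold_improvement(progeny, parent))
```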
  • Figure 9 shows yield improvement from breeding, where parental strains tend to secrete recombinant albumin at significantly lower levels than the majority of the diverse population evaluated.
  • Figure 10 shows different embodiments of expression constructs used in the claimed invention.
  • Figure 10a is the construct pHR1K, which encodes the LEU2 selectable marker and pUC57-Kan integrated at the SnaBI-site downstream of the 2-micron D gene.
  • Figure 10b is the construct pHR2A, which was constructed in a 3-way ligation of the 2.6 kb pHR1K NotI-HpaI fragment, the 1.2 kb pHR1K NotI-AclI fragment and the 2.5 kb pUC57-Amp Eco J-NarI fragment.
  • Figure 10c shows the construct pEV7 for amylase-mCherry secretion, with Figure 10d showing a map of the final expression plasmid pHR1K-pEV7 in yeast after homologous recombination with pHR1K, illustrating how each expression construct of the invention is used to make whole-2-micron expression plasmids.
  • Figure 10e shows plasmid pEV25 containing the expression construct for secretion of RBD3-mCherry.
  • Figure 10f shows pEV26 containing the expression construct for secretion of RBD4-mCherry.
  • Figure 10g shows the locations of RBD3 and RBD4 within the SARS-CoV-2 Spike protein in relation to N-linked glycosylation sites.
  • Figure 10h shows pEV51 containing the expression construct for intracellular production of HPV(L1)-mCherry.
  • Figure 10i shows pEV52 containing the expression construct for intracellular production of mCherry-GS1-HBsAg.
  • Figure 10j shows pEV56 containing the expression construct for intracellular production of M6His-mCherry-GS1-HBsAg.
  • Figure 10k shows pEV95 containing the expression construct for secretion of VHH-GS-mCherry.
  • Figure 10l shows pEV60 containing the expression construct for intracellular production of Green Fluorescent Protein, ymUkG1.
  • Figure 10m shows pEV3 containing the expression construct for the secretion of rHA-mCherry, where rHA is recombinant human albumin.
  • Figure 10n shows pEV203 containing the expression construct for secretion of untagged rHA.
  • Figure 10o shows pEV64 containing the expression construct for secretion of a 14 kDa recombinant protein fused to mCherry.
  • Figure 11 shows the plasmids used in this invention for expression of recombinant proteins, with a description of the expression cassettes they contain (including any detection tags and linkers) and whether they are designed for intracellular or secreted production.
  • Figure 12 shows a comparison of one of the best two strains for amylase-mCherry secretion (2-A2) with one of the worst two strains for amylase-mCherry secretion (1-C12), for expression of multiple different recombinant proteins.
  • the inventors set out to design a yeast breeding and selection method, to increase recombinant protein yield by increased cellular productivity, or decrease losses by proteolysis, thereby providing improved production strains for a particular recombinant protein.
  • the inventors used two parents containing the pep4-gene disrupted with a dominant marker (KanMX) and two parents with a functional PEP4 gene.
  • Saccharomyces cerevisiae were collected from around the world in compliance with the Nagoya Protocol. These strains were genetically modified to enable an advanced multigenerational breeding strategy, including engineering auxotrophic markers. Additional engineering and experimental testing were performed on the genomes of specific strains, e.g., to reduce protease production in selected parents by pep4 gene disruption, while still allowing breeding to occur.
  • a method was developed to allow the biological processes of mating, sporulation and germination, which are essential for the multigenerational breeding programme, to occur despite pep4 gene disruption being present in the genetically diverse population. This ultimately allows pep4-deleted haploid or diploid strains to be screened for improved recombinant protein production. It was subsequently necessary to select for pep4-disrupted haploid or diploid progeny after multigenerational breeding, where at least one parent had contained the pep4 gene disruption.
  • Example 1 Plasmid Construction and Yeast Transformation
  • Plasmid pHR1K (Figure 10a) was constructed from two PCR fragments encoding the entire 2-micron plasmid from strain YLF185 (each containing one of the inverted repeat regions), which were cloned into the EcoRV site of pUC57-Kan.
  • Regions of primer sequence binding are marked on the pHR1K map as PCR1 to PCR4.
  • Amplification was performed using a Q5 HiFi PCR kit (NEB) and/or restriction enzyme digestion to generate four fragments: a) pUC57-Kan (from GeneWiz) cut with EcoRV (buffer 3.1); alternatively, uncut plasmid can be used as PCR template DNA.
  • b) the first 2-micron fragment. PCR primers contain 20-30 bp homology to pUC57-Kan and the second 2-micron fragment, respectively. This fragment is around 3.1 kb in length.
  • c) the second 2-micron fragment. PCR primers contain 20-30 bp homology with the first 2-micron fragment and the LEU2 fragment. This fragment is around 3.4 kb in length.
  • d) the LEU2 fragment flanked by PacI-sites. PCR primers contain 20-30 bp homology with the second 2-micron fragment (encoding REP1 and D) and the pUC57-Kan fragment. This fragment is around 2.1 kb in length and can be amplified from S288C genomic DNA. PCR and DNA cloning methods are well known to those skilled in the art of molecular biology.
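  • The primer design principle described above (a 20-30 bp tail homologous to the neighbouring fragment, followed by the sequence that anneals to the template) can be sketched as a string operation. This is an illustration only: the function name and the sequences are hypothetical and are not the primers used by the inventors.

```python
def primer_with_homology(neighbour_end: str, annealing: str,
                         min_homology: int = 20, max_homology: int = 30) -> str:
    """Build a 5'->3' primer: homology tail for the neighbour, then annealing region."""
    tail, core = neighbour_end.upper(), annealing.upper()
    if set(tail + core) - set("ACGT"):
        raise ValueError("sequences may contain only A, C, G and T")
    if not min_homology <= len(tail) <= max_homology:
        raise ValueError("homology tail must be 20-30 bp, as stated in the text")
    if not core:
        raise ValueError("annealing region must not be empty")
    return tail + core


if __name__ == "__main__":
    # Hypothetical 25 nt tail and 18 nt annealing region.
    tail = "GATTACAGATTACAGATTACAGATT"
    core = "ATGTCTGCCCCTAAGAAG"
    print(primer_with_homology(tail, core))
```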
  • Co-transformation of a cir° yeast strain, such as YLF185 cir°, with all four fragments followed by selection of leucine prototrophs, extraction of total DNA, transformation of E. coli with kanamycin selection and then extraction and sequencing of plasmid DNA for alignment to the expected pHR1K sequence may be performed.
  • seamless cloning can be performed in vitro (e.g. using a NEBuilder® HiFi DNA assembly kit) before the yeast and/or E. coli transformations.
  • pHR1K plasmid DNA can be isolated from E. coli, and its identity confirmed by diagnostic restriction enzyme digests and/or DNA sequencing.
  • This whole-2-micron family plasmid contained the LEU2 selectable marker and pUC57-Kan integrated at the SnaBI-site downstream of the 2-micron D gene (see GenBank: J01347.1 sequence for SnaBI site location). SbfI and NotI sites were introduced on PCR primers for the directional insertion of expression constructs downstream of the LEU2 selectable marker. See the pHR1K map for details (Figure 10a). Standard methods were used for E. coli transformation and plasmid preparation. pHR2A (see Figure 10b) was constructed in a 3-way ligation of the 2.6 kb pHR1K NotI-HpaI fragment, the 1.2 kb pHR1K NotI-AclI fragment and the 2.5 kb pUC57-
  • DNA fragments were synthesised (GeneWiz/Azenta) to make expression constructs for different recombinant proteins, which were subsequently cloned directionally into the SbfI and NotI sites of pHR2A.
  • proteins were typically expressed with and without the coding sequence for a fluorophore, e.g. mCherry from Anaplasma marginale (UniProt X5DSL3), genetically fused to the protein of interest. Genetic fusion to the C-terminus of the polypeptide of interest was performed where it was desirable for the polypeptide of interest to be synthesised before the mCherry domain, thereby ensuring that an mCherry signal detected in the cell was associated with synthesis of the entire fusion protein. Polypeptide coding regions were designed using codons selected for rapid mRNA translation in S. cerevisiae as described by Chu et al., 2014⁸.
  • expression constructs used the repressible MET17 promoter (MET17p).
  • the repressible MET17 promoter can be switched off during breeding (by adding methionine to the culture media to repress expression, e.g. at 20 mM concentration) and switched on for the production of recombinant protein, e.g. with easily detectable mCherry, in selected strains by growth in media lacking methionine (at methionine concentrations below approximately 0.05 mM). Strains were typically grown on BMMD media, as described by Evans et al.
  • S. cerevisiae strains, e.g. Q427 (SA lineage) used to introduce the amylase-mCherry expression plasmid into the genetically diverse libraries, or the libraries themselves, were transformed to leucine prototrophy using a lithium acetate method (Sigma Aldrich Yeast Transformation Kit YEAST1). This was achieved by co-transformation of the yeast strains or libraries with DNA from both the whole-2-micron plasmid, e.g. pHR1K, and with DNA containing the expression construct plus flanking DNA homologous to the whole-2-micron DNA fragment, e.g. from pEV7 (described below). This allows in vivo assembly of the final expression plasmid, e.g. pHR1K-pEV7, by homologous recombination, which is shown in Figure 10d.
  • Prototrophic transformants were selected on media without leucine, and cryopreserved stocks were prepared with 25% glycerol after plating a single transformant to provide single colonies.
  • the expression plasmid, e.g. pHR1K-pEV7 shown in Figure 10d, was subsequently transferred to all progeny during multigenerational breeding.
  • This yeast cell transformation approach typically used approximately 100 ng each of gel purified DNA from the plasmid containing the whole-2-micron DNA, e.g. 7.3 kb pHR1K BstXI-NotI fragment, and the plasmid containing the expression construct, e.g. pEV7 digested with SwaI+Acc65I. The DNA from the whole-2-micron plasmid and the plasmid containing the expression construct and homologous flanking DNA were co-transformed into yeast, e.g. into Q427 cir° to give strain Q427 [pHR1K-pEV7], with homologous recombination (gap-repair) of the plasmid DNA fragments acting to assemble the final expression plasmid, e.g. pHR1K-pEV7. Gap-repair transformation and other yeast methods are described in Andersen et al. 2012¹⁰, Structure-based mutagenesis reveals the albumin-binding site of the neonatal Fc receptor, and Finnis et al.
  • Plasmids comprising expression constructs used in this invention include pEV7 (amylase-mCherry expression; Figure 10c), which contains the expression construct MET17p-SUC2pre-AA-mCherry-ADH1t. This was used with pHR1K DNA to transform yeast according to the method described above for in vivo construction of the final whole-2-micron expression plasmid, e.g. pHR1K-pEV7 for amylase-mCherry expression (Figure 10d). Equivalent methods were used for in vivo construction of the final whole-2-micron expression plasmids for other recombinant proteins using the plasmids described below, which contain the expression constructs for each different recombinant protein.
  • it was preferred in this case for the recombinant product to lack N-linked glycosylation, which might affect protein activities independently of product yield.
  • Options include the expression of an alpha-amylase lacking potential N-linked glycosylation motifs (-N-X-S/T-) or expressing an alpha-amylase analogue modified to remove existing N-linked glycosylation motifs, e.g. Aspergillus oryzae alpha-amylase (UniProt P0C1B3) with serine 199 in the mature protein changed to an alanine residue.
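  • Checking a protein sequence for the -N-X-S/T- motif mentioned above is a simple pattern search. The sketch below excludes proline at the X position, which is the common convention for this sequon but is an assumption here because the text writes only -N-X-S/T-. The function name and the short test peptides are hypothetical.

```python
import re
from typing import List

# Lookahead so that overlapping motifs (e.g. "NNSS") are all reported.
# Assumption: X is any residue except proline.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_glycosylation_motifs(protein: str) -> List[int]:
    """Return 1-based positions of the asparagine in each N-X-S/T motif."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical peptide: changing the serine of a motif to alanine removes it,
    # analogous to the serine-to-alanine change described in the text.
    print(find_n_glycosylation_motifs("AANGSAA"))
    print(find_n_glycosylation_motifs("AANGAAA"))
```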
  • the coding sequence for the fluorophore was genetically fused to the alpha-amylase C-terminal coding sequence to facilitate screening and selection of final yeast strains and to provide an additional phenotype for evaluating productivity and differential proteolysis.
  • fluorescent proteins such as yEGFP, ymUkG1, mOrange and mNeonGreen and others, such as those described by Thorn et al. 2017¹², could be used. Codons were selected for rapid mRNA translation in S. cerevisiae as described by Chu et al. 2014⁸.
  • pEV7 contains an expression construct encoding a repressible MET17 promoter, which drives expression of the alpha-amylase (AA) mCherry fusion protein without any N-linked glycosylation sites.
  • the repressible MET17 promoter can be switched off during breeding (by adding methionine to the culture media at 20 mM or above). Secretion was directed by the secretory leader sequence from the S. cerevisiae SUC2 invertase gene.
  • This pre-leader sequence is removed by signal peptidase during translocation into the endoplasmic reticulum to give the mature amylase-mCherry protein, which is then secreted into the culture media.
  • pEV25 (RBD3-mCherry): Contains the expression construct MET17p-SUC2pre-RBD3-mCherry-ADH1t for RBD3-mCherry secretion, which is the SARS-CoV-2 Receptor Binding Domain (RBD) fragment (residues 344-604) with mCherry genetically fused to the C-terminus; Figure 10e.
  • pEV26 (RBD4-mCherry): Contains the expression construct MET17p-SUC2pre-RBD4-mCherry-ADH1t for RBD4-mCherry secretion, which is the SARS-CoV-2 Receptor Binding Motif (RBM) fragment (residues 438-505) with mCherry genetically fused to the C-terminus; Figure 10f.
  • RBD3 and RBD4 protein sequences are devoid of N-linked glycosylation sites.
  • the locations of the RBD3 and RBD4 (also known as RBM) fragments within the SARS-CoV-2 coding sequence are shown in Figure 10g, where N-linked glycosylation sites are marked as vertical bars.
  • pEV51 to pEV58 (VLP proteins): Plasmids pEV51 to pEV58 for the intracellular production of VLP proteins are described below.
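  • The residue ranges quoted for the two fragments (344-604 and 438-505) can be checked with a short calculation. The sketch assumes the ranges are inclusive at both ends, which is the usual convention but is not stated in the text; the names are illustrative.

```python
from typing import Tuple

Range = Tuple[int, int]  # (first residue, last residue), assumed inclusive

RBD3: Range = (344, 604)  # residue range given in the text for RBD3
RBD4: Range = (438, 505)  # residue range given in the text for RBD4 (RBM)


def fragment_length(fragment: Range) -> int:
    """Number of residues in an inclusive range."""
    first, last = fragment
    if last < first:
        raise ValueError("last residue must not precede first residue")
    return last - first + 1


def contains(outer: Range, inner: Range) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


if __name__ == "__main__":
    print(fragment_length(RBD3), fragment_length(RBD4), contains(RBD3, RBD4))
```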
  • GS1 is a linker with sequence GSGGSGGSGPVTN (SEQ ID No: 9), and GS2 is a linker with sequence GGSGS (SEQ ID No: 10).
  • HPV16(L1) encodes Human papillomavirus type 16 major capsid protein L1 (UniProt: P03101 VL1_HPV16).
  • HBsAg encodes Hepatitis B virus S protein (GenBank: AIJ50189.1).
  • AP205 encodes Acinetobacter phage AP205 coat protein (UniProt: Q9AZ42 Q9AZ42_9VIRU).
  • 6His encodes the hexahistidine-tag, preceded by a methionine residue in M6His.
  • mCh encodes mCherry protein.
  • Plasmid maps for pEV51 (Figure 10h), pEV52 (Figure 10i), and pEV56 (Figure 10j) are shown as examples.
  • Plasmid pEV60 (GFP-ymUkG1): Plasmid pEV60 (Figure 10l) contains the expression construct TEF2p-ymUkG1-ARO4t for intracellular production of GFP (ymUkG1), where TEF2p and ARO4t are the Saccharomyces cerevisiae TEF2 promoter and ARO4 terminator, respectively.
  • pEV3 (rHA-mCherry): Plasmid pEV3 (Figure 10m) contains the expression construct MET17p-SUC2pre-rHA-mCherry-ADH1t for secretion of recombinant human albumin with mCherry fused to its C-terminus.
  • Plasmid pEV203 (rHA): Plasmid pEV203 (Figure 10n) contains the expression construct PRB1p-SUC2pre-rHA-ADH1t for the secretion of recombinant human albumin, driven by the Saccharomyces cerevisiae proteinase B promoter (PRB1p).
  • pEV64 (RP-mCherry): Plasmid pEV64 (Figure 10o) contains an expression construct MET17p-SUC2pre-RP-mCh-ADH1t for the secretion of a 14 kDa recombinant protein (RP) fused to mCherry.
  • Expression cassettes for additional recombinant proteins were designed similarly to pEV7, as SbfI-NotI fragments, which were cloned in place of the amylase-mCherry expression cassette for transcription in the same direction as the LEU2 gene in the final whole-2-micron vectors.
  • The plasmids for expression of the additional recombinant proteins are described below and in Figure 11, with a description of the expression cassettes they contain (including any detection tags and linkers) and whether they are designed for intracellular or secreted production.
  • pEV1 contains an expression cassette for intracellular expression of mCherry.
  • The mCherry coding sequence is essentially the same in all constructs of this invention.
  • This expression cassette contains the MET17 promoter and ADH1 terminator described above for pEV7.
  • pEV51 & pEV52 contain intracellular expression cassettes for the virus-like particle proteins HPV16(L1)-mCherry and mCherry-GS1-HBsAg, respectively, where GS1 is a linker with sequence GSGGSGGSGPVTN (SEQ ID No: 9).
  • HPV16(L1) encodes human papillomavirus type 16 major capsid protein L1 (UniProt: P03101 VL1_HPV16).
  • HBsAg encodes Hepatitis B virus S protein (GenBank: AIJ50189.1).
  • pEV3 contains an expression cassette for the secretion of rHA-mCherry, where rHA encodes recombinant human albumin with the mature albumin sequence from UniProt P02768.
  • This expression cassette contains the MET17 promoter, SUC2 leader and ADH1 terminator described above for pEV7.
  • pEV388 contains an expression cassette for the secretion of rHA (without an mCherry tag) with transcription from the Saccharomyces cerevisiae proteinase B promoter (PRB1p). Secretion is directed by the modified fusion leader sequence (mFL).
  • pEV275 contains an expression cassette for a VHH domain antibody for prostate-specific membrane antigen, PSMA (Chatalic et al., 2015 20 ). This expression cassette contains the MET17 promoter, SUC2 leader and ADH1 terminator described above for pEV7.
  • pEV299 contains the same expression cassette as pEV275, except with transcription from the Saccharomyces cerevisiae proteinase B promoter (PRB1p).
  • pEV298 contains the same expression cassette as pEV299, except for secretion of a VHH-mCherry fusion protein.
  • pEV395 contains an expression cassette for secretion of a GLP1 (9-37) analogue precursor-GS-HiBit fusion protein with amino acid sequence EGTFTSDVSSYLEGQAAKEFIAWLVRGRGGGGGSGGGGSVSGWRLFKKIS (SEQ ID No: 10).
  • Equivalent yeast transformations were performed as required for the introduction of additional expression plasmids with homologous recombination (gap-repair) between DNA fragments comprising the whole-2-micron plasmid sequence and fragments comprising the expression construct with homologous flanking sequences from the different pEV-plasmids described above.
  • Example 2 Multigenerational breeding to produce genetically diverse libraries with different proteinase A genotypes.
  • The inventors generated C12 cir° and Cp12 cir° libraries with >10^8 progeny.
  • Parental strains Q416 and Q413 were replaced with Q410 and elOS599, respectively.
  • Spore segregants were isolated from the diploids listed in Louvel et al. 2014 15 .
  • Q410 was one of these, and was cured of its native 2-micron.
  • Q416 was derived from Q410 by disrupting pep4.
  • elOS599 was also one of the spore segregants, and Q413 was derived from it in the same way.
  • Q426 and Q417 were spore segregants from crosses of diploid spores described in Louvel et al. 2014 15 and the lys2::URA3 strains from Cubillos et al. 2013 17 (which had ura3::KanMX instead of ura3Δ).
  • Cp12 populations were generated from Q416, Q413, Q426 and Q427 as described below, in the presence of methionine.
  • C12 cir° diploid libraries contained entirely PEP4::PEP4 homozygous diploids, as the parental strains did not contain any pep4 KanMX gene disruptions.
  • Cp12 cir° diploid libraries comprise approximately 25% PEP4::PEP4 homozygotes, 50% PEP4::pep4 heterozygotes, and 25% pep4::pep4 homozygotes.
  • Selection from Cp12 cir° libraries or Cp12 libraries containing whole-2-micron expression plasmids was performed using G418.
  • G418 is an aminoglycoside antibiotic similar in structure to gentamicin B1, which blocks polypeptide synthesis by inhibiting the elongation step. Strains containing the KanMX gene are resistant to G418.
  • A homozygous pep4::pep4 diploid population ("pure diploids", PD) was prepared by allowing germination of genetically diverse Cp12 progeny in the presence of G418 to select for pep4::KanMX spores only, followed by mating to form pep4::pep4 diploids.
  • A mixed population of heterozygous PEP4::pep4 and homozygous pep4::pep4 diploids ("mixed diploids", MD) was prepared by allowing germination and mating of pep4::KanMX and PEP4 haploids, followed by selection against PEP4::PEP4 homozygous diploids with G418, as shown in Figure 1. Therefore, multigenerational libraries can be produced with a range of proteinase A genotypes, including an absence of the proteinase A gene, despite this protease being essential for an efficient breeding process.
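  • As an illustration only, the expected genotype fractions in the library before and after G418 selection can be reproduced with a short Python sketch. It assumes random mating of spores, equal proportions of pep4::KanMX and PEP4 spores, and that the PEP4 locus segregates independently of mating type; none of these assumptions, nor the function names, come from the specification.

```python
from fractions import Fraction
from itertools import product


def diploid_genotype_fractions(pep4_spore_fraction=Fraction(1, 2)):
    """Expected diploid genotype fractions from random mating of spores.

    Assumption: spores pair at random with respect to the PEP4 locus.
    """
    allele_freq = {
        "pep4": pep4_spore_fraction,      # KanMX-marked disruption
        "PEP4": 1 - pep4_spore_fraction,  # wild-type allele
    }
    fractions = {}
    for a, b in product(allele_freq, repeat=2):
        genotype = "::".join(sorted((a, b)))  # "PEP4::pep4" for heterozygotes
        fractions[genotype] = fractions.get(genotype, 0) + allele_freq[a] * allele_freq[b]
    return fractions


def select_on_g418(fractions):
    """Remove diploids carrying no KanMX-marked pep4 allele, then renormalise."""
    survivors = {g: f for g, f in fractions.items() if "pep4" in g}
    total = sum(survivors.values())
    return {g: f / total for g, f in survivors.items()}


library = diploid_genotype_fractions()
mixed_diploids = select_on_g418(library)
for genotype, fraction in sorted(mixed_diploids.items()):
    print(genotype, fraction)
```

  • Under these assumptions the unselected library is 1/4 PEP4::PEP4, 1/2 PEP4::pep4 and 1/4 pep4::pep4, and the G418-selected mixed diploid population is two-thirds heterozygous and one-third homozygous for the disruption.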
  • Example 3 Screening multigenerational libraries with different proteinase A genotypes.
  • A twelve generation mixed diploid library (Cp12 [pHRIK-pEV7]) containing an amylase-mCherry expression plasmid was screened by flow cytometry, and populations were isolated by flow cytometry detection of intracellular mCherry signal for analysis of amylase-mCherry secretion.
  • Individual strains were grown in microtitre plate culture and supernatant removed for detection of mCherry in a plate reader. No significant mCherry signal was detected in the culture supernatant despite numerous progeny strains being analysed. This indicated that a sufficient reduction in proteolysis for full-length recombinant protein production might not be easily achieved by breeding alone, e.g. for practical screening experiments using up to 10^3 individuals.
  • pep4 gene disruption is generally desirable for recombinant protein production.
  • The inventors therefore decided to perform equivalent screening with a "pure diploid" library comprising pep4::pep4 homozygotes.
  • mCherry was detectable in the culture supernatants, indicating that it was beneficial to screen genetically diverse libraries lacking proteinase A expression.
  • The pep4 disruption is advantageous for selecting progeny from yeast breeding which have improved recombinant protein secretion. Indeed, it is difficult to detect any secreted products without using strains with pep4 disruption.
  • For screening individual genetically diverse progeny produced by multigenerational breeding it is possible to collect populations of cells by flow cytometry, e.g. with different intracellular mCherry activities.
  • Individuals can be isolated by serial dilution and plating to single cells on agar plates containing appropriate culture media. Individual colonies can then be picked into microtitre plates for expression analysis, e.g. intracellular expression or secretion. Secreted protein can be detected by centrifugation to sediment cells and removing supernatant to a fresh microtitre plate for product detection, e.g. in a plate reader.
  • Cells were cultured in 100 µL media in 96-well microtitre plates or 500 µL media in 48-well microtitre plates, with shaking incubation at 30°C, 200-280 rpm/2.5 cm orbit with humidity control.
  • Individual cells can be sorted directly into (or onto) culture media in microtitre plates with one cell per well. In this case it is preferable to sort the cells directly into media containing methionine to repress recombinant protein expression during growth of the single cell to a stationary phase culture (30°C, 280 rpm with humidity control). Typically, 50-100% of single cells can be routinely cultivated by these methods for the production of cryopreserved stocks and/or subsequent analysis. Therefore, genetically diverse individual strains from breeding can be evaluated for improved bioprocessing traits, such as improved levels of recombinant protein production or secretion. Due to the multigenerational breeding, phenotyping combined with genome sequencing provides data suitable for QTL analysis.
  • Figure 2 shows the mCherry activities from 41 strains selected for growth at 30°C suitable for QTL analysis.
  • The supernatant mCherry levels were corrected for growth differences (OD620) for the strains used in the QTL analysis.
  • Significant diversity in the mCherry phenotype is observed, with more than a 10-fold difference between the highest and lowest producers.
  • Specifically, a 10.8-fold difference in supernatant mCherry levels was obtained between the highest producer (2A2) and the lowest producer (3D9).
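  • The growth correction and fold-difference comparison described above can be sketched as follows. This is a minimal illustration that assumes the correction is a simple ratio of supernatant fluorescence to culture optical density; the strain names and readings are invented for the example and are not data from the specification.

```python
def growth_corrected(signal, optical_density):
    """Fluorescence per unit of culture density (assumed simple ratio)."""
    if optical_density <= 0:
        raise ValueError("optical density must be positive")
    return signal / optical_density


def fold_difference(corrected_levels):
    """Return (highest strain, lowest strain, fold difference between them)."""
    highest = max(corrected_levels, key=corrected_levels.get)
    lowest = min(corrected_levels, key=corrected_levels.get)
    return highest, lowest, corrected_levels[highest] / corrected_levels[lowest]


# Hypothetical plate-reader values: (supernatant mCherry signal, OD)
readings = {
    "strain_A": (5400.0, 2.0),
    "strain_B": (500.0, 2.0),
    "strain_C": (1800.0, 1.5),
}
corrected = {name: growth_corrected(sig, od) for name, (sig, od) in readings.items()}
best, worst, fold = fold_difference(corrected)
print(f"{best} vs {worst}: {fold:.1f}-fold")
```

  • Normalising before ranking matters because a strain that simply grows to a higher density would otherwise appear to be a better producer.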
  • Figure 3 shows plots of mCherry against amylase specific activities for each strain grown at 30°C, with strains grown in four MTPs.
  • QTL analysis identified 16 genomic regions comprising approximately 3.3% of the total Saccharomyces cerevisiae genome containing genes and alleles responsible for the differential expression of the recombinant amylase-mCherry protein. Successive statistical and bioinformatic filters and rational selection methods were used to shortlist sequences, e.g. QTLs, genes and QTNs, responsible for the increased recombinant protein production.
  • Example 4 Enrichment of strains with high yield phenotypes by flow cytometry.
  • FACS Fluorescence Activated Cell Sorting
  • mCherry recombinant fusion proteins containing fluorophore domains
  • FACS was used to select populations enriched for SARS-CoV-2 RBD3-mCherry productivity and secretion.
  • Figure 4 shows typical populations selected by flow cytometry where low and non-producers of recombinant protein (based on mCherry signal) can be avoided (fraction 1) and populations with increasing intracellular recombinant protein production (fractions 2 to 6) can be selected for analysis.
  • RBD3-mCherry expression from construct pEV25
  • :pep4 homozygotes population after at least four generations of breeding, individual cells were grown in 48-well microtitre plates.
  • Screening multigenerational libraries lacking proteinase A is highly beneficial for selecting strains with increased secretion of recombinant proteins with different protease sensitivities.
  • pep4 disruption alone is not sufficient to control proteolysis for all recombinant proteins in all progeny strains.
  • Strains such as DH01, DH02 and DH03, isolated for the secretion of protease sensitive products, can be cured of the expression plasmid used during screening and retransformed with an expression plasmid for a different recombinant product.
  • Haploids can be obtained by sporulation of strains such as DH01, DH02 and DH03 after transformation with a PEP4-complementation plasmid, and can subsequently be cured of all plasmids for use in producing new recombinant products.
  • Multiple rounds of flow cytometry can be used to increase the enrichment of strains for improved recombinant protein production. This is especially useful for screening extremely large genetically diverse libraries (e.g. 10^8-10^9 progeny) for individuals with optimal, or close to optimal, phenotypes which occur less frequently in the population after breeding.
  • Example 5 Rapid production of large genetically diverse libraries by transformation of multigenerational cir° libraries and back-crossing.
  • For producing large populations of genetically diverse libraries with high levels of mitotic recombination suitable for QTL analysis, it is desirable not to perform time consuming multigenerational breeding (6-18 generations) for each product being investigated. Consequently, alternative strategies were developed using cir° libraries produced by multigenerational breeding, typically in methods combining library transformation followed by back-crosses and further breeding. For example, improving intracellular production of recombinant VLP (virus-like particle) proteins was demonstrated by Cp12 cir° library transformation without and with back-crossing. Three viral proteins were expressed with different mCherry tags.
  • HPV16 major capsid protein L1, HPV16(L1): non-enveloped, C-terminal antigen/tag.
  • Hepatitis B surface antigen, HBsAg: lipid enveloped, N-terminal antigen/tag.
  • AP205 coat protein: non-enveloped, C- or N-terminal antigen/tag.
  • All expression vectors were made for intracellular expression of VLP-mCherry fusions using the repressible S. cerevisiae MET17 promoter.
  • The S. cerevisiae ADH1 terminator was used for all constructs and plasmids were assembled by homologous recombination in vivo to generate stable high-copy-number whole-2-micron expression vectors.
  • An mCherry-tag was located at the N- or C-terminus to facilitate high-throughput selection of improved strains.
  • The tag was designed to allow particle formation with the mCherry domain decorating the outside of the VLP. Linkers/spacers were included where needed to facilitate VLP formation. Constructs with His-tags were also investigated.
  • Table 3 Expression constructs for VLP monomers
  • Initially, a Cp12 cir° PD library was transformed with pEV51-54. Expression studies clearly showed that yeast genome diversity gave variable expression levels for all VLP-mCherry fusions, with different average expression levels of each VLP-fusion. AP205-mCherry fusions showed significantly higher expression levels than the HPV16(L1) and HBsAg fusions. Evidence consistent with macromolecular particle formation was observed for the high expressing cultures. Surprisingly, yields of the enveloped HBsAg appeared easier to enhance than HPV16(L1). However, the full genetic diversity in the 12th generation library cannot easily be screened by transformation alone, due to the limited numbers of transformants generated.
  • A Cp12 cir° mixed diploid library was first transformed to obtain >10^3 transformants. The entire population of transformants was subsequently back-crossed with the Cp12 cir° mixed diploid library and bred for additional generations. This method created genetically diverse libraries with >10^8 progeny, without having to introduce each expression plasmid into a parental strain followed by lengthy multigenerational breeding (e.g. at least 6-12 generations for each expression plasmid).
  • The Cp12 cir° mixed diploid library was initially transformed to leucine prototrophy with expression constructs for the mCherry-tagged VLP proteins.
  • At least 10^3 transformants were pooled from each transformation and crossed separately with the Cp12 cir° mixed diploid library at approximately a 1:1 ratio. Diploids were sporulated and treated with zymolyase and ether to kill vegetative cells. Spores were germinated, allowed to mate, and diploids subsequently selected on G418 to create a new population of mixed diploids (Cp13 MD) for additional breeding cycles. After three rounds of breeding post-transformation, with continuous methionine repression of the MET17 promoter throughout, the Cp15 MD libraries were screened by flow cytometry and expression analysis.
  • Microtitre plates containing BMMD+CSM-Leu-Met media were inoculated from stock plates and grown for 3-5 days before evaluating expression based on the whole culture mCherry fluorescence. Stocks resulting in the highest mCherry levels were grown in shake flask culture for subsequent analysis.
  • Example 6 Selection of genetically diverse haploid progeny enriched for a single mating-type
  • For haploid strains, which are preferred for genome sequencing and QTL analysis, it is possible to prepare genetically diverse haploid libraries enriched for a single mating-type before screening.
  • Yeast strain Q427 (SA lineage) was transformed to leucine prototrophy with pHRIK and pEV60 for homologous recombination in vivo and cryopreserved glycerol stocks prepared (called Q427 [pHRIK-pEV60] with approximately 4 x 10^8 cells/mL). These cells constitutively express GFP (ymUkG1).
  • Q427 was similarly transformed with pHRIK and pEV3 for homologous recombination in vivo and used as a parental strain to generate a Cp12 mixed diploid library (as described above), called Cp12 [pHRIK-pEV3].
  • A spore preparation was made from Cp12 [pHRIK-pEV3] using the same zymolyase/ether method used during breeding, from which a cryopreserved glycerol stock was prepared (Cp12 [pHRIK-pEV3] spore stock with approximately 10^6 spores/mL). Once germinated, cells from this diverse population will produce and secrete albumin-mCherry at different levels when grown in the absence of methionine.
  • The genetically diverse Cp12 [pHRIK-pEV3] spores were then germinated in an excess of the strain Q427 [pHRIK-pEV60] on solid agar plates (e.g. YPD media with 1% yeast extract, 2% peptone, 2% glucose, and agar added at 2%), whereby the MATalpha progeny from the Cp12 library can mate with the MATa Q427 [pHRIK-pEV60] to produce diploids capable of expressing mCherry-tagged albumin and GFP (ymUkG1).
  • Samples were prepared for flow cytometry with a Beckman Coulter Astrios EQ cell sorter. Cells were harvested from the plates with 1 mL of BMMD+CSM-Leu-Met+100 µg/mL ampicillin. 500 µL of each cell suspension was transferred to a 1.5 mL tube and 500 µL of the same medium was added.
  • The cultured haploid Q427 [pHRIK-pEV60] and germinated Cp12 [pHRIK-pEV3] spore samples were used with a sample of ungerminated Cp12 [pHRIK-pEV3] spores to set the flow cytometry gates to enable single haploid vegetative cells with only mCherry fluorescence to be enriched and sorted into microtitre plates containing BMMD+CSM-Leu+Met for cell recovery and subsequent preparation of glycerol stocks.
  • Cell recovery was approximately 99%, with mCherry detected in approximately 98% of wells and GFP in approximately 2% of wells.
  • The majority of cells were identified as auxotrophic, being unable to grow in minimal media lacking uracil and lysine.
  • Cp12 MD [pHRIK-pEV64] transformant colonies were pooled, grown and back-crossed with the Cp12 cir° MD library ( ⁇ 10^9 cells) as described above. For this, the strains were mixed, sporulated (SPM plates), germinated and mated (YPD plates), and diploids selected (YNB with G418).
  • The Cp13 MD [pHRIK-pEV64] diploids were selected using G418 and sporulated with the Cp12 cir° MD library ( ⁇ 10^9 cells) for an additional round of mating with back-crossing to prepare Cp14 MD [pHRIK-pEV64] diploids.
  • The Cp14 MD [pHRIK-pEV64] diploids were grown overnight without methionine (BMMD+CSM-Leu-Met) before selecting the top 2% (>10^5 individuals) by flow cytometry based on mCherry signal, which were recovered in BMMD+CSM-Leu+Met (estimated 57% recovery rate).
  • This population was similarly subjected to an additional flow cytometry passage with the top 2% collected (estimated 99% recovery rate), from which spores were prepared for germination in the presence of an excess of Q427 [pHRIK-pEV60] as described above, with the equivalent controls prepared for setting gates for cell sorting in a Beckman Coulter Astrios EQ flow cytometer.
  • Gates were set to sort haploid-sized cells expressing mCherry only into single wells of 18 x 96-well microtitre plates (1 cell/well) containing 0.1 mL BMMD+CSM-Leu+Met per well. Cell recovery was 57%, from which 96-well microtitre plates containing 0.1 mL BMMD+CSM-Leu-Met were inoculated and grown for 3-4 days (30°C, 280 rpm with humidity control) for recombinant protein expression analysis, with cells sedimented by centrifugation and supernatant mCherry levels determined, e.g. using a TECAN Spark plate reader.
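  • The "top 2%" selection by mCherry signal used in the passages above can be illustrated with a small Python sketch. It is not a description of the sorter's software: it simply assumes one fluorescence value per event and keeps the brightest fraction, and the function names and example values are hypothetical.

```python
import math


def top_fraction_threshold(signals, fraction=0.02):
    """Signal cut-off that keeps roughly the brightest `fraction` of events."""
    if not signals:
        raise ValueError("no events supplied")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    keep = max(1, math.ceil(len(signals) * fraction))
    return sorted(signals, reverse=True)[keep - 1]


def gate_top_fraction(signals, fraction=0.02):
    """Return the events at or above the cut-off (ties are all retained)."""
    cutoff = top_fraction_threshold(signals, fraction)
    return [s for s in signals if s >= cutoff]


# Hypothetical mCherry intensities for 1,000 events
events = list(range(1, 1001))
selected = gate_top_fraction(events, fraction=0.02)
print(len(selected), min(selected))
```

  • Repeating the gate on the recovered population, as in the second flow cytometry passage, compounds the enrichment because each pass re-ranks only the survivors of the previous one.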
  • Parental strains were produced from derivatives of YLF185, YLF187, YLF190, and YLF191 using standard genetic techniques. All derived parental strains contained disruption of the PEP4 protease gene and an additional protease disruption in the YPS1 protease gene.
  • YPS1 (also known as YAP3 with systematic name YLR120C) is an aspartic protease and hyperglycosylated member of the yapsin family of proteases, attached to the plasma membrane via a glycosylphosphatidylinositol (GPI) anchor. It is involved with other yapsins in the cell wall integrity response and has a role in KEX2-independent processing of the alpha factor precursor. It can be beneficial to use yps1 gene disrupted strains for recombinant protein production. Additional engineering to improve recombinant protein production may be desirable in these pep4 and yps1 disrupted strains.
  • Disruption of genes to increase recombinant protein production, such as UBC4, MOT2 and GHS1, or overexpression of genes for chaperones, such as PDI1 and ERO1, or manipulation of the unfolded protein response, e.g. by HAC1i overexpression, may be desirable.
  • Example 8 Auxotrophic selection of pure haploid libraries.
  • Enrichment of genetically diverse libraries of haploid cells was performed after multigenerational breeding using auxotrophic selection. All parental strains were engineered to contain a selectable marker under control of the HO-promoter at the disrupted HO-locus within strains. HO gene expression is exclusive to haploid cells and mating type switching is prevented by HO-disruption.
  • Germination of spores and continued growth in media lacking the essential supplement required in the absence of the marker gene expression was used to maintain haploid progeny. Selection of individuals for further analysis, by FACS (Fluorescence Activated Cell Sorting) or as colonies on agar plates, can be used to physically separate strains and prevent mating. Individual separated strains can be further screened for improved traits for recombinant protein production. Typically, 100% of progeny will be haploid.
  • Example 9 Selection of haploids containing multiple protease disruptions improved for recombinant protein yield.
  • Multigenerational breeding was performed for pep4, yps1 disrupted cir° parents with pep4 complementation from a CEN-vector containing the functional PEP4 gene.
  • Colonies were also pooled for selection of individuals with improved VHH-mCherry production. Individual cells with the top 5% mCherry signal were sorted into individual wells of 50 x 96-well microtitre plates containing 0.1 mL BMMD+CSM-Leu+Met, which were grown to stationary phase and cryopreserved stocks prepared. Approximately 1,300 individuals were inoculated into 48-well microtitre plates for expression analysis as described above. Parental control strains were similarly grown in 48-well microtitre plates for comparison (16-24 replicates of each parent).
  • Figure 8 shows the supernatant mCherry signal corrected for growth (whole culture OD600) for the parental control strains, the randomly picked colonies and the individuals enriched for improved recombinant protein secretion by flow cytometry. Breeding has clearly generated individuals with significantly improved product yields, which can be further enriched by flow cytometry. Secretion of the recombinant VHH-mCherry fusion protein has been improved at least 10-20 fold (12-50 fold compared to parental strains) by the breeding and selection process for strains with protease disruptions.
  • A cir° library was prepared following multigenerational breeding (>6 generations) with pep4 and yps1 disrupted parents and haploid progeny cured of the PEP4 complementing CEN-vector.
  • Approximately 3000 haploids (maintained by growth in media requiring HO-promoter driven marker expression) were pooled and transformed with plasmid pEV203 for constitutive expression of recombinant human albumin (rHA, untagged) with around 15,000 transformants generated.
  • Approximately 1000 transformants were picked for expression analysis in 48-well microtitre plates with constitutive albumin expression for final product titre determination by an HTRF assay (cisbio HSA kit) following the manufacturer's instructions.
  • Figure 9 shows the diversity for rHA in a sub-population of 192 individuals tested, which contained individuals significantly improved compared to the parental strains with equivalent engineering.
  • Example 10 QTL analysis with strains isolated from multigenerational libraries with different proteinase A genotypes.
  • Quantitative Trait Locus (QTL) analysis is a statistical method that links phenotypic and genotypic data to explain the genetic basis of variation in complex traits (Miles et al. 2008 18 ). This method was performed on a range of S. cerevisiae strains selected with a range of productivities for the recombinant protein amylase-mCherry, described in Example 2. This analysis identifies regions of the genome containing alleles and SNPs (single nucleotide polymorphisms), of which some are also QTNs (quantitative trait nucleotides) contributing to a phenotype, e.g. increased levels of the recombinant protein amylase-mCherry. The QTL analysis identifies "regions" (also called "intervals" or "loci") in the genome associated with improving the phenotype analysed.
  • Short reads were first assessed for sequencing quality using fastqc, before each read was aligned against the reference genome of S288C (R64-2-1) using bwa. Alignments were indexed and sorted using samtools, and duplicate reads marked and removed using picard tools. Variants were then called using freebayes, with the minimum mapping quality and minimum base quality set to 20 and the ploidy set to diploid (--min-mapping-quality 20 --min-base-quality 20 -p 2). This output was then subjected to a set of filters to obtain SNP sites for use as genetic markers for the samples.
  • The following filters were applied:
  • a. The variant calling quality is more than 20;
  • b. The variant is observed in 100% of the samples in the calling set;
  • c. The allele frequency REF/(REF + ALT) is in (0.1, 0.9); and
  • d. Calling positions are the known bi-allelic variant sites for SGRP founders.
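  • The four marker filters listed above can be expressed as a single predicate. The following Python sketch is illustrative only: the record layout is a hypothetical per-site summary rather than a VCF parser, the allele-frequency interval is read as the open interval from 0.1 to 0.9, and the example sites are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical summary of one called variant site."""
    chrom: str
    pos: int
    qual: float          # variant calling quality
    ref_obs: int         # reference-allele observations
    alt_obs: int         # alternate-allele observations
    samples_called: int  # number of samples in which the site was observed


def passes_filters(site, n_samples, founder_sites,
                   min_qual=20.0, low=0.1, high=0.9):
    """Apply filters a-d; `founder_sites` is a set of (chrom, pos) tuples."""
    total = site.ref_obs + site.alt_obs
    if total == 0:
        return False
    ref_fraction = site.ref_obs / total
    return (
        site.qual > min_qual                          # a. quality above 20
        and site.samples_called == n_samples          # b. seen in every sample
        and low < ref_fraction < high                 # c. REF/(REF + ALT) in range
        and (site.chrom, site.pos) in founder_sites   # d. known founder site
    )


def select_markers(sites, n_samples, founder_sites):
    """Keep only the sites that satisfy all four filters."""
    return [s for s in sites if passes_filters(s, n_samples, founder_sites)]


founders = {("chrI", 100), ("chrI", 200), ("chrII", 50)}
sites = [
    Site("chrI", 100, 60.0, 20, 20, 4),   # passes all filters
    Site("chrI", 200, 15.0, 20, 20, 4),   # fails a: low quality
    Site("chrII", 50, 60.0, 39, 1, 4),    # fails c: REF fraction 0.975
    Site("chrII", 75, 60.0, 20, 20, 4),   # fails d: not a founder site
    Site("chrI", 100, 60.0, 20, 20, 3),   # fails b: missing in one sample
]
markers = select_markers(sites, n_samples=4, founder_sites=founders)
print([(m.chrom, m.pos) for m in markers])
```

  • Restricting markers to sites already known to segregate among the founders (filter d) guards against treating sequencing or alignment artefacts as informative SNPs.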
  • The fold increase was between approximately 1.2 and 4.2, with one protein giving approximately equal productivity between the two strains.
  • HPV16(L1)-mCherry production is very different from amylase-mCherry production, so it is not unexpected that alleles beneficial to amylase-mCherry production that are present in strain 2-A2 would not improve HPV16(L1) production as significantly as for the other recombinant proteins: HPV16(L1)-mCherry was expressed intracellularly for accumulation and VLP formation in the nucleus, whereas amylase-mCherry was expressed for secretion into the extracellular media.
  • The alleles in 2-A2 were beneficial for improved recombinant protein production, e.g. through increased productivity and/or reduced proteolysis.
  • The recombinant proteins expressed have a diverse range of sizes, folding, domain structures and other physicochemical properties, indicating that strain 2-A2 is also generally improved to produce many other recombinant proteins of interest.
  • Multiple alleles beneficial for recombinant protein production and/or reduced proteolysis originating from the different parental strains have been combined in strain 2-A2. While this combination of alleles was originally selected for improved production of amylase-mCherry, clearly, many of these alleles and other combinations of these alleles and the SNPs within them are also beneficial for the production of multiple other recombinant proteins.
  • Strains 2-A2 and 1-C12 were cured of the whole-2-micron expression vector for amylase-mCherry (pHRIK-pEV7) by the method described above and retransformed for expression from whole-2-micron plasmids equivalent to pHRIK of multiple different recombinant proteins comprising the expression cassettes described in Figure 11. All final whole-2-micron expression plasmids contain a LEU2 gene for leucine selection. Transformants were isolated as colonies on synthetic drop-out agar lacking leucine, e.g. BMMD+(CSM-Leu+Met). Three transformants were selected for each strain/plasmid combination for expression studies.
  • Controls were strains 2-A2 and 1-C12 secreting amylase-mCherry from the pEV7 expression cassette in pHRIK-pEV7.
  • Inoculum cultures for three transformants of 2-A2 and 1-C12 for each plasmid (and triplicate pHRIK-pEV7 controls) were started by picking cells from patches on solid media and transferring to 500 µL liquid cultures in clear, 48-well microtiter plates.
  • Buffered synthetic drop-out media with 2%(w/v) dextrose, lacking leucine and containing 3g/L methionine was used to maintain plasmids and to repress expression from constructs utilising the MET17 promoter.
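  • As a quick consistency check, 3 g/L methionine corresponds to roughly 20 mM, which matches the repressing concentration given for the MET17 promoter in the description of pEV7. The short Python sketch below shows the conversion; the molar mass of methionine (149.21 g/mol) is supplied here as an assumption and is not stated in the specification.

```python
METHIONINE_MOLAR_MASS_G_PER_MOL = 149.21  # assumed value, not from the specification


def grams_per_litre_to_millimolar(grams_per_litre, molar_mass_g_per_mol):
    """Convert a mass concentration (g/L) to a molar concentration (mM)."""
    if molar_mass_g_per_mol <= 0:
        raise ValueError("molar mass must be positive")
    return grams_per_litre / molar_mass_g_per_mol * 1000.0


methionine_mM = grams_per_litre_to_millimolar(3.0, METHIONINE_MOLAR_MASS_G_PER_MOL)
print(f"3 g/L methionine is about {methionine_mM:.1f} mM")
```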
  • Inoculum cultures were incubated at 30°C for 2 days, after which 20 µL of each inoculum culture was passaged into 500 µL synthetic dropout media with 2% (w/v) dextrose, lacking leucine, in new 48-well microtiter plates, e.g. BMMD+(CSM-Leu-Met), in triplicate, to inoculate expression cultures.
  • Expression cultures were incubated in shaking humidity chambers at 30°C over 4 days before harvesting. Upon harvest, culture OD was measured in wells using a TECAN Spark plate reader (Tecan, Switzerland). Culture supernatants were isolated by centrifugation at 1800 RCF and analysed for secreted product, where applicable.
  • mCherry fluorescence was measured directly from the expression cultures in 48-well clear microtiter plates upon harvest at λex 540 nm; λem 614 nm, gain 100, on a TECAN Spark plate reader (Tecan, Switzerland).
  • Recombinant human albumin (rHA) titres were quantified using the Albumin Blue Fluorescence Assay Kit (Active Motif, Belgium) with a modified protocol for high-throughput detection in 384 plates. Briefly, 12.5 µL culture supernatant was transferred to wells in a black, clear bottomed, non-treated 96 well assay plate. 75 µL assay reagent comprising 1 µL Albumin Blue dye and 74 µL Buffer A from the kit was added to each well and mixed by pipetting. Plates were incubated for 5 minutes at room temperature, then fluorescence at λex 560 nm; λem 620 nm was measured on a TECAN Spark plate reader. Fluorescence signals for each sample were averaged from three technical replicates and converted to relative levels using a standard curve of rHA prepared in expression medium.
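  • The conversion of averaged fluorescence readings to relative albumin levels via a standard curve can be sketched as follows. This is a minimal illustration that assumes a linear fluorescence response over the range of the standards, fitted by ordinary least squares; the specification does not state the curve model, and all values and function names below are invented for the example.

```python
def fit_standard_curve(concentrations, signals):
    """Least-squares line through the standards: signal = slope * conc + intercept."""
    if len(concentrations) != len(signals) or len(concentrations) < 2:
        raise ValueError("need at least two paired standards")
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    if sxx == 0:
        raise ValueError("standards must span more than one concentration")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concentrations, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def signal_to_level(replicate_signals, slope, intercept):
    """Average the technical replicates, then invert the standard curve."""
    mean_signal = sum(replicate_signals) / len(replicate_signals)
    return (mean_signal - intercept) / slope


# Hypothetical standards (arbitrary concentration units) and readings
standard_conc = [0.0, 50.0, 100.0, 200.0]
standard_signal = [100.0, 600.0, 1100.0, 2100.0]
slope, intercept = fit_standard_curve(standard_conc, standard_signal)

sample_replicates = [590.0, 600.0, 610.0]  # three technical replicates
level = signal_to_level(sample_replicates, slope, intercept)
print(f"estimated level: {level:.1f}")
```

  • Preparing the standards in expression medium, as described, keeps the intercept representative of the background fluorescence of the culture supernatant.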
  • the inventors have identified that proteinase A gene PEP4 disruption is important for reducing the overall protease levels in final production strains genetically diverse for protease metabolism, in addition to any protease reduction caused by breeding and selection. It was unknown prior to this invention whether breeding alone was sufficient to allow strains to be selected with adequate control of proteolysis for recombinant protein production.
  • the inventors designed a genetically stable expression vector, which allowed recombinant protein production to be repressed during the breeding so that the expression plasmid was present in all progeny (up to a billion progeny) without creating a significant burden of metabolism from recombinant protein production, which might otherwise lead to selection against high productivity strains, which are a valuable component of the final strain libraries.
  • methods were developed to allow the selection of large populations of genetically diverse diploid progeny or haploid progeny, including populations of haploid progeny enriched for one mating type only (i.e. predominantly containing only one of either of the two possible mating types) for the selection of individuals with improved phenotypes for bioprocess improvement. Additional methods were developed for the auxotrophic selection of haploid populations to use when screening for individuals with improved phenotypes for bioprocess improvement.
  • the method of breeding according to the invention provides a variety of innovative and flexible options allowing for the generation of diverse yeast populations suitable for high throughput screening and QTL analysis to improve multiple bioprocessing traits, especially for industrial recombinant protein production.
  • Saccharomyces cerevisiae is "the model eukaryote" with homology to other eukaryotic production hosts valuable for manufacturing biopharmaceuticals and other biologicals
  • these methods enable the discovery of underlying biology using methods (such as QTL analysis) which are relevant not only to this yeast but also to other eukaryotes.
  • Genes and proteins involved in improving one eukaryotic host for the manufacture of biologicals are likely to have homologues of value in other eukaryotes. This is especially relevant to strain improvement of Pichia, filamentous fungi and mammalian cell lines such as Chinese hamster ovary (CHO) cells used for biologics production, where the breeding methods used in this invention are impossible, more difficult or less well developed.
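
The conversion of averaged Albumin Blue fluorescence readings to relative rHA levels, described in the assay protocol above, can be sketched as follows. This is a minimal illustration only: the linear form of the standard curve, the function names `fit_line` and `relative_rha`, and all numeric values are assumptions made for the example and are not taken from the source.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def relative_rha(replicates, slope, intercept):
    """Average the technical replicates, then invert the standard curve."""
    return (mean(replicates) - intercept) / slope


# Hypothetical rHA standards (arbitrary units) and their fluorescence signals.
standard_levels = [0.0, 25.0, 50.0, 100.0]
standard_signal = [100.0, 600.0, 1100.0, 2100.0]
slope, intercept = fit_line(standard_levels, standard_signal)

# Hypothetical triplicate readings for one culture supernatant.
sample_replicates = [1080.0, 1100.0, 1120.0]
print(relative_rha(sample_replicates, slope, intercept))
```

A real standard curve may not be linear over the whole range, in which case a different curve model would be needed; the averaging and inversion steps stay the same.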

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Mycology (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biochemistry (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Medicinal Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Botany (AREA)
  • Tropical Medicine & Parasitology (AREA)
  • Virology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The invention relates to breeding processes, and in particular, to a method of breeding yeast, such as Saccharomyces cerevisiae, exhibiting an improved phenotype for a biomanufacturing process. The invention is especially concerned with a method of breeding yeast exhibiting improved recombinant protein production. The invention also extends to yeast strains exhibiting an improved phenotype for a biomanufacturing process, obtained by the method of breeding according to the invention.

Description

YEAST BREEDING PROCESS FOR STRAIN IMPROVEMENT, INVOLVING PEP4 MUTANTS
The invention relates to breeding processes, and in particular, to a method of breeding yeast, such as Saccharomyces cerevisiae, exhibiting an improved phenotype for a biomanufacturing process. The invention is especially concerned with a method of breeding yeast exhibiting improved recombinant protein production. The invention also extends to yeast strains exhibiting an improved phenotype for a biomanufacturing process, obtained by the method of breeding according to the invention. For over 40 years, microorganisms have been used to produce recombinant proteins. However, the cost of producing high-quality, biologically compatible, and commercially relevant quantities is often prohibitive. The baker's yeast, Saccharomyces cerevisiae, is well known for producing high-quality, correctly folded heterologous proteins, often at lower costs and at higher yields than mammalian cells. Alternative yeast, such as Komagataella species, K. phaffii, K. pastoris, and K. pseudopastoris, including industrial strains also known as Pichia pastoris, are also used to make recombinant products, including biopharmaceutical proteins, at competitive costs of goods. However, Pichia pastoris is often associated with a requirement for methanol and oxygen during large-scale manufacturing and generally uses non-exchangeable plasmid systems requiring selection with toxic zeocin, and may cause post-translational quality issues for biopharmaceuticals. While these disadvantages are not typically associated with Saccharomyces cerevisiae, this yeast tends to have lower productivity. Therefore, it would be advantageous to increase the yields of recombinant proteins from S. cerevisiae, which will act to reduce the costs of goods manufactured at a large scale, whilst retaining all the positive features of this expression host.
Alternative eukaryotic production hosts, such as mammalian cells, tend to be significantly more expensive for manufacturing biopharmaceuticals than yeasts. The media and growth requirements are also more expensive for mammalian cells, and processes are not free from animal-derived components.
Alternative microbial production hosts, including prokaryotic/bacterial systems like E. coli, lack eukaryotic cellular machinery, e.g. for protein folding and secretion, frequently resulting in insoluble inclusion bodies and endotoxin contamination issues affecting downstream purification.
Previous work to improve product yields from S. cerevisiae has succeeded in developing manufacturing processes for a range of biopharmaceuticals, such as insulins, vaccines (e.g. virus-like particles), albumin and albumin fusion proteins. However, the production hosts are often sub-optimal, and the costs of goods limit access to biopharmaceutical products for those who need them, thereby hindering the eradication of treatable human diseases. The development of improved production yeast, especially for S. cerevisiae, which has a long, safe history of human use and is most commonly used for the manufacture of yeast-derived biologics approved by the FDA, would be advantageous. Improving product yields from S. cerevisiae would be especially beneficial.
Existing methods for improving product yields include optimising the expression construct, e.g. the promoter, leader sequence (for secretory products), the coding sequence, and terminator sequences, and also the copy number and stability of the expression construct. However, improvements to the yeast genome can also substantially improve productivity, product quality and other valuable bioprocess phenotypes. Nevertheless, the existing methods for improving the genome of the production yeast are sub-optimal, slow, expensive and labour-intensive. These methods have also focused on engineering the genome of a single strain or a family of closely related strains derived from commonly available laboratory strains, e.g. CEN.PK or relatives of W303. Therefore, the optimal starting strains were not used. Furthermore, the underlying biology is complex and not fully understood, especially the bottlenecks and limitations in metabolism affecting product yields and quality. Each protein product places different requirements on the host, and many different limitations or bottlenecks can be encountered during production strain development. A production host improved for one product is seldom optimal for all products.
Furthermore, methods for improvement have so far explored only a relatively small number of genetic changes. For example, random mutagenesis approaches tend to make limited types of genetic changes and frequently result in loss of function. There is also a risk of introducing unwanted mutations. Unless the beneficial mutation is then identified and re-engineered into a clean genetic background of the progenitor strain, there is a tendency for the accumulation of these undesirable mutations. Therefore, multiple rounds of random mutagenesis tend to be limited in the scope of improvements possible and can generate adverse effects. On the other hand, rational genome engineering is unlikely to identify all options for strain improvement, some of which are not obvious targets for improvement, e.g. UBC4, MOT2, GHS1. The engineering of the yeast secretion system is also complicated by the involvement of many cross-reacting factors. The tight interdependence of each of these factors makes genetic modification difficult. Many attempts in strain engineering also fail to be transferable from the laboratory to an industrial process.
Each different recombinant protein product presents different challenges for production strain optimisation and generates a different burden on host cell metabolism. Therefore, the changes made to a single yeast strain improved for one product are unlikely to be optimal and may even be detrimental to the production of a different product. Consequently, bespoke production strain improvement is frequently required for each new product, which can also be sub-optimal, slow, expensive and labour-intensive. For example, each recombinant product might require an optimal combination of chaperone proteins for maximal secretion of the correctly folded protein. This might require multiple chaperones to be overexpressed, each requiring their expression level to be fine-tuned, thereby requiring the generation of multiple different strains and then slow, expensive testing in fermenters.
Similarly, a particular problem exists in controlling the proteolysis of the final recombinant product. Undesirable proteolysis may occur at any point after the polypeptide chain has been synthesised by the ribosome. This may be at any stage in the secretory pathway within the cell, in the extracellular media before harvesting, or even during downstream processing and purification, e.g. where a particular protease is brought into contact with the product protein under conditions allowing for proteolysis. Furthermore, removing a "product-related impurity" such as a proteolytic fragment of the desired protein from its active full-length form can be difficult because the proteolytic fragment often retains some physiochemical properties exploited during downstream processing to efficiently obtain the desired full-length product. Therefore, even a small amount of proteolysis affecting the final product can have significant cost implications in its removal, not to mention that some of the product has been lost due to proteolytic degradation reducing overall yields. Therefore, it is common practice to disrupt non-essential protease genes in production strains which are known to be responsible for the proteolytic degradation of desired products. However, this can result in undesirable phenotypes, such as poor growth, of the final production strain. It can also limit the selection of secretory leader sequences for secreted products, which may rely on a range of proteases for the correct and efficient removal and processing of pre- and pro-leader sequences.
Furthermore, the expression of proteases by S. cerevisiae is complex, with each recombinant protein product usually being affected differently by the nearly 200 diverse peptidases reportedly encoded in its genome. Consequently, it can be challenging to identify which protease(s) is/are degrading a particular protein product. This is especially true when the recombinant protein is a substrate for multiple S. cerevisiae proteases, because even if the recombinant protein is expressed in a strain disrupted for one of the proteases which degrade it, it will still be degraded by the other(s), so it is difficult to know if an improvement has been made. Additionally, a limit is quickly reached for the total number of protease genes which can be disrupted in any S. cerevisiae production host before it becomes unsuitable for industrial manufacturing due to adverse phenotypes introduced. Therefore, the strategy of identifying and disrupting protease to solve this problem is both laborious, error-prone and extremely limited. Furthermore, it is impractical to have strains disrupted for all combinations of S. cerevisiae proteases. This means that the manufacture of desirable biopharmaceutical products may not be commercially viable, despite the other obvious benefits of this well-established yeast, and medical innovations may fail to reach the market. Controlling proteolysis without or with fewer difficulties in identifying the culprit proteases or the adverse effect of multiple gene disruptions to control them would be highly advantageous.
Approaches that explore greater genetic diversity and make selecting strains improved for each recombinant product easier, instead of relying solely on traditional mutagenesis and strain engineering methods, would be beneficial. It would also be advantageous to have faster, less expensive and less labour-intensive methods for strain improvement overall. Improved strains for recombinant protein manufacture, which are genetically diverse compared to common laboratory strains, would also be valuable for investigating the manufacture of products that are difficult to express in existing systems. The different genetic backgrounds will affect each recombinant product differently so that when a single strain or family of closely related strains fails to manufacture a product to the required standard, e.g. of yield, quality or cost, a genetically different strain may succeed. Furthermore, it would be beneficial to be able to identify the key regions of the yeast genome responsible for improvements in recombinant protein production, whether for intracellular products or secreted products. This would allow for further improvements, e.g. by rational engineering, in addition to those introduced by breeding or other methods which increase genetic diversity.
The inventors turned their attention to proteinase A, which has been disrupted in S. cerevisiae strains for recombinant protein production by disruption of the PEP4 gene (also known as PRA1 or YPL154C). Proteinase A is a vacuolar aspartyl protease required for post-translational precursor maturation of vacuolar proteinases. This protease is important for protein turnover after oxidative damage, and plays a protective role in acetic acid induced apoptosis. It is synthesised as a zymogen, self-activates and is targeted to the vacuole via the Vps10p-dependent endosomal vacuolar protein sorting pathway.
The Saccharomyces genome database (SGD), https://www.yeastgenome.org/, teaches that "Loss of Pep4 protein (Pep4p) activity leads to a shortened lifespan, and the PEP4 gene is essential under conditions of nutrient starvation, such as those used for spore formation. Homozygous diploid pep4 mutants are defective in sporulation." Accordingly, the inventors did not know whether breeding itself would be practical for sufficiently reducing proteolysis compared to PEP4 gene disruption, for cost-effective strain development and recombinant protein production. While it could have been anticipated that progeny could have some variation in the levels of proteinase A and other proteases even without PEP4 gene disruption (e.g. through differential gene expression), the inventors did not know if this would be sufficient to allow detectable levels of recombinant protein products in the culture supernatant, e.g. while still screening a reasonable number of progeny in practical cost-effective workflows. As such, it was not at all predictable what would happen. However, when the inventors investigated this, they found that pep4 gene disruption was needed to observe secreted recombinant proteins for simple detection in the culture supernatant. Additionally, because proteinase A plays such an important role in yeast metabolism and especially in sporulation, it was not obvious that efficient multigenerational breeding would ever be possible in a population containing disrupted pep4 genes, where the wild-type gene dosage is reduced and proteinase A protein levels are also expected to be reduced.
The inventors designed a breeding method using two parents containing the pep4 gene disrupted with a dominant marker (KanMX) and two parents with a functional PEP4 gene. This allowed enough of the Pep4 protein to be present during breeding to allow sporulation. Selection for the KanMX gene (with G418) after multigenerational breeding then allows for the selection of different populations of haploids or diploids that contain the pep4 disruption and so cannot express functional proteinase A, which would otherwise carry out proteolysis. Surprisingly, the inventors identified that using their method, it was possible to achieve multigenerational breeding using at least one parent containing a disrupted pep4 gene. This approach is advantageous over alternative methods which may contribute to genome instability, an undesirable trait in production strains generally, and especially undesirable during continuous culture.
Accordingly, in a first aspect of the invention, there is provided a method of breeding to generate yeast exhibiting an improved phenotype for a biomanufacturing process, the method comprising: i) breeding at least two parental yeast strains, wherein at least one parental yeast strain comprises a functionally deleted PEP4 gene, or a homologue, orthologue or paralogue thereof; and ii) selecting for progeny exhibiting an improved phenotype for a biomanufacturing process.
Advantageously, as described in the examples, the inventors have demonstrated how yeast breeding and selection can be used in combination with traditional strain engineering, for example to increase recombinant protein yield by increased cellular productivity, or decrease losses by proteolysis, thereby providing improved production strains for a particular recombinant protein. This overcomes a significant problem where the engineering important for recombinant protein production directly impacts the processes of mating, sporulation and germination, which are essential for breeding, without introducing any undesirable genome engineering that might destabilise the final production strains. Preferably, the method comprises at least three parental yeast strains, more preferably, at least four parental yeast strains. Preferably, each parental yeast strain is a haploid strain.
Genetically diverse parental strains are preferred for multigenerational breeding strategies to generate large populations (e.g. of 10⁸-10⁹) of genetically diverse haploid or diploid progeny suitable for screening to identify strains improved for recombinant protein production and performing quantitative trait loci (QTL) analysis to identify the genes, alleles and SNPs responsible for the improvements.
Accordingly, preferably the method comprises breeding genetically diverse parental yeast strains. Preferably, therefore, the resultant progeny are genetically diverse. Preferably, the parental strains are not common laboratory strains. A "common laboratory strain cell" will be well-known to the skilled person and may include those defined in Louis, E.J., 2016¹. A common laboratory strain may include yeast strains listed on the Saccharomyces Genome Database (SGD). A common laboratory strain may include one of the following yeast strains: S288C (Reference Genome: GenBank GCF_000146045.2); W303 (GenBank: JRIU00000000.1); CEN.PK; JRY188; AH22; S150-2B; and CB11/63. A common laboratory strain is not a natural strain, and therefore, may contain its own non-naturally occurring combination of alleles.
Heterozygous functionally deleted PEP4 diploids are not guaranteed to sporulate in all cases of genetically diverse strains, including heterozygous functionally deleted PEP4 diploids produced from parental strains preferred for breeding or heterozygous functionally deleted PEP4 diploids generated from some combinations of parental strains preferred for breeding. The diploids produced may not contain sufficient proteinase A to sporulate at all, or to sporulate efficiently, or other factors present in these diverse strains may negatively impact sporulation. This may prevent multigenerational breeding even with at least one parental yeast strain comprising a functionally deleted PEP4 gene. This is because the diploids generated in the first rounds of breeding may not produce spores at all, so breeding cannot continue any further. Alternatively, the efficiency of sporulation may be so low that either the genetic diversity present in the parents is inadequately represented in the progeny population or the numbers of spores produced are insufficient due to inefficient sporulation. Sporulation efficiency is preferably high, with most diploids producing four spores (in a tetrad). In one embodiment, therefore, sporulation efficiency provides spores from at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90% and preferably 100% of the diploids produced by mating.
It was not at all predictable what would happen or how breeding could be achieved with genetically diverse strains comprising a functionally deleted PEP4 gene. However, when the inventors investigated this, they found ways to achieve multigenerational breeding with at least one parent comprising a functionally deleted PEP4 gene and produce large populations (e.g. of 10⁸-10⁹) of genetically diverse haploid or diploid progeny with functionally deleted PEP4 genes. Furthermore, these populations were useful for both selecting strains improved for recombinant protein production and for QTL analysis to identify the genes, alleles and SNPs responsible for the improvements.
Preferably, the method comprises breeding one parental yeast strain comprising a functionally deleted PEP4 gene, with one parental yeast strain comprising a functional PEP4 gene. Preferably, in this embodiment, the two parental strains are haploids. Preferably, the progeny are diploid.
In another preferred embodiment, the method comprises breeding two parental yeast strains comprising a functionally deleted PEP4 gene, with two parental yeast strains comprising a functional PEP4 gene. Preferably, in this embodiment, the four parental strains are haploids. Preferably, the progeny are diploid.
To obtain progeny comprising the functionally deleted PEP4 gene, selection with any suitable selectable marker may be used. Accordingly, in one embodiment, at least one parental strain comprises a selectable marker, which preferably functionally deletes, or is inserted into, each PEP4 gene. Preferably, the method comprises isolating homozygous functionally deleted PEP4 diploid progeny by selecting for the selectable marker. In one embodiment, the method comprises germinating the progeny and isolating spores comprising a functionally deleted PEP4 gene by selecting for the selectable marker. Preferably, the method further comprises mating the spores to form homozygous functionally deleted PEP4 diploid progeny. As illustrated in Figure 1, this results in the production of a "pure diploid" population, i.e. 100% homozygous functionally deleted PEP4 diploid progeny. Alternatively, also as illustrated in Figure 1, a "mixed diploid" population, i.e. 33% homozygous functionally deleted PEP4 diploid progeny and 67% heterozygous functionally deleted PEP4 diploid progeny, may be prepared. Accordingly, in another embodiment, the method comprises germinating and mating functionally deleted PEP4 haploid progeny with functional PEP4 haploid progeny. Preferably, the method comprises isolating homozygous functionally deleted PEP4 diploid progeny and heterozygous functionally deleted PEP4 diploid progeny by selecting for the selectable marker.
In yet another preferred embodiment, the method comprises breeding at least one parental yeast strain comprising a functionally deleted PEP4 gene, with at least one parental yeast strain comprising a functional PEP4 gene. Preferably, this results in the production of heterozygous functionally deleted PEP4 diploid progeny and/or homozygous functionally deleted PEP4 diploid progeny.
The selectable marker may be an auxotrophic marker. For example, the selectable marker may be selected from a group of markers consisting of: LEU2, TRP1, HIS3, HIS4, URA3, URA5, SFA1, ADE2, MET15, LYS5, LYS2, ILV2, FBA1, PSE1, PDI1 and PGK1. Those skilled in the art will appreciate that any gene whose chromosomal deletion or inactivation results in an inviable host, so called essential genes, can be used as a selective marker if a functional gene is provided on the plasmid, as demonstrated for PGK1 in a pgk1 yeast strain by Piper and Curran, 1990². Typically, however, markers are auxotrophic markers, such as LEU2, where host cells containing the deleted auxotrophic marker can also be grown in the presence of a media supplement, e.g. leucine, when required.
Dominant selectable markers typically used in yeast include KanMX (for G418 selection, also known as kanMX), HygMX (for hygromycin B selection, also known as hygMX), NatMX (for nourseothricin selection, also known as natMX), PatMX (for bialaphos selection, also known as patMX), AUR1 (AUR1-C, for selection with Aureobasidin A / LY295337) and also amdSYM (for fluoroacetamide selection), as described by Goldstein et al., 1999³, Heidler and Radding 1995⁴ and Solis-Escalante et al. 2013⁵. Accordingly, in one embodiment, the selectable marker may be a dominant selectable marker. The dominant selectable marker may be selected from the group consisting of: KanMX, HygMX, NatMX, AUR1-C, PatMX and amdSYM.
Most preferably, the dominant selectable marker is KanMX.
Yeast strains comprising a dominant selectable marker will be resistant to a selection agent, such as G418 (Geneticin) when the marker is KanMX. Preferably, therefore, the method comprises breeding the yeast in the presence of a suitable selection agent which is selected based on the selectable marker.
In a preferred embodiment, the dominant selectable marker is KanMX, and is selected for using the selectable agent G418 (Geneticin). G418 is an aminoglycoside antibiotic similar in structure to gentamicin B1, which blocks polypeptide synthesis by inhibiting the elongation step. Preferably, therefore, the method comprises breeding the yeast in the presence of G418 or gentamicin B1.
Accordingly, in a preferred embodiment, in order to obtain a homozygous functionally deleted PEP4 population (pep4::pep4 diploid population, i.e. "pure diploids", PD), the method comprises allowing germination of genetically diverse progeny in the presence of G418 to select for pep4::KanMX spores only. The spores are then preferably mated to form homozygous functionally deleted PEP4 diploids (pep4::pep4 diploids).
Alternatively, a mixed population of heterozygous functionally deleted PEP4 diploid progeny (pep4::PEP4) and homozygous functionally deleted PEP4 diploid progeny (pep4::pep4) (i.e. "mixed diploids", MD) may be prepared by allowing germination and mating of pep4::KanMX and PEP4 haploids. This is followed by selection against functional PEP4 homozygous diploids (PEP4::PEP4) with G418. Therefore, multigenerational breeding can be achieved to produce genetically diverse libraries with a range of proteinase A genotypes, including an absence of proteinase A gene, despite this protease being essential for the breeding process.
In one embodiment, the functionally deleted PEP4 gene may be a disrupted PEP4 gene. For example, the PEP4 gene may be disrupted by replacing it with a nonfunctional copy, or introducing a mutation into the PEP4 gene. Alternatively, a functionally deleted PEP4 gene may not comprise a disrupted gene, but will instead be disrupted at the protein level, for example using an inhibitor. Alternatively, the PEP4 gene may be functionally disrupted using gene silencing, such as RNA interference (RNAi) or small nuclear RNA (snRNA), as described in Williams T. C. et al., Microb Cell Fact 14, 43 (2015).
In a preferred embodiment, the improved phenotype for a biomanufacturing process may be selected from the following phenotypes: increased protein product yield, reduced cell lysis, modified shear resistance, modified sedimentation, improved product harvesting, and altered host cell protein profile. In another preferred embodiment, the improved phenotype for a biomanufacturing process may be increased plasmid, genomic or phenotypic stability, preferably over multiple generations during continuous culture. In another preferred embodiment, the improved phenotype for a biomanufacturing process may be improved growth phenotypes including, modified media requirements, modified growth temperature or temperature range, and modified growth pH or pH range. In another preferred embodiment, the improved phenotype for a biomanufacturing process may be altered post-translational modification, including reduced proteolysis or modified glycosylation.
Preferably, the improved phenotype for a biomanufacturing process is improved production of biologicals, biologics, therapeutic proteins, vaccines, recombinant proteins, and fragments, conjugates or fusions thereof. Most preferably, the improved phenotype for a biomanufacturing process is improved recombinant protein production. A yeast cell exhibiting "improved recombinant protein production" may also be defined as a yeast cell exhibiting increased recombinant protein production. Preferably, improved recombinant protein production comprises improved secretion. In one embodiment, the yeast cell obtained by the method according to the invention may exhibit an increase in recombinant protein production or secretion compared to a yeast cell that has not been obtained using the claimed method.
For example, in a preferred embodiment, the yeast cell obtained by the method according to the invention may exhibit an increase in recombinant protein production or secretion compared to a yeast cell that has not been obtained using the claimed method, of at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. Alternatively, the yeast cell may exhibit an increase in recombinant protein production or secretion compared to a yeast cell that has not been obtained using the claimed method, of at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold or at least 10-fold. Alternatively, the yeast cell may exhibit an increase in recombinant protein production or secretion compared to a yeast cell that has not been obtained using the claimed method, with total soluble intracellular protein or culture supernatant yields of at least 1 g/L, at least 2 g/L, at least 3 g/L, at least 4 g/L, at least 5 g/L, at least 6 g/L, at least 7 g/L, at least 8 g/L, at least 9 g/L, at least 10 g/L, at least 20g/L, at least 30 g/L, at least 40 g/L, at least 50 g/L, at least 60 g/L, at least 70 g/L, at least 80 g/L, at least 90 g/L, at least 100 g/L, at least 200g/L, or at least 300g/L. In one embodiment, the method further comprises the step of performing QTL analysis to identify the causal genetics associated with the improved phenotype for a biomanufacturing process (preferably improved recombinant protein production).
It is advantageous if the regions (intervals, loci) identified in Quantitative Trait Locus (QTL) analysis are shorter and precisely defined to contain only the genes and Quantitative Trait Nucleotides (QTNs) responsible for the phenotypic improvement being analysed, thereby significantly reducing the laborious follow-on work required to distinguish causal QTNs from other SNPs. To perform QTL analysis resulting in shorter QTLs, it is beneficial to perform multiple rounds of breeding, where meiosis increases the number of cross-over events within the chromosomes (chromosomal crossover) during synapsis, compared to the parental genotypes. Maximising the number of meiotic cross-over events is an important factor contributing to improved QTL results. Accordingly, in a preferred embodiment, the method comprises performing at least 2, at least 3, at least 4, at least 5, or at least 6 generations of breeding. More preferably, the method comprises performing at least 7, at least 8, at least 9, at least 10, at least 11 or at least 12 generations of breeding. Most preferably, the method comprises performing at least 13, at least 14, at least 15, at least 16, at least 17 or at least 18 generations of breeding. Preferably, the method comprises performing multigenerational breeding. Multigenerational breeding may be defined as breeding comprising multiple rounds of mating, sporulation and germination, for example to maximise meiotic events and increase the genetic diversity of the progeny compared to the parents.
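The benefit of additional generations can be illustrated with a deliberately simplified model, which is an assumption of this sketch and not a method described in the specification. If two loci recombine with fraction r per meiosis, and the lineage is assumed to be heterozygous at both loci in every generation, the probability that they are never separated after g meioses is (1 - r)^g. The function name and the value of r below are hypothetical.

```python
def linkage_retained(r: float, generations: int) -> float:
    """Probability that two loci stay linked over `generations` meioses.

    Simplifying assumptions (not from the source text): meioses are
    independent, the recombination fraction `r` is constant, and the
    lineage is heterozygous at both loci in every generation.
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return (1.0 - r) ** generations


# Hypothetical recombination fraction between a causal QTN and a nearby SNP.
r = 0.05
for g in (1, 6, 12, 18):
    print(g, round(linkage_retained(r, g), 3))
```

The retained-linkage probability falls with every generation, which is why more rounds of breeding tend to give shorter QTL intervals.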
The inventors discovered that it was not always sufficient during breeding for sporulation, germination and mating to occur at a partial level compared to the wild-type population with undisrupted PEP4. These processes must be highly efficient in order to obtain sufficient levels of meiosis in each round of breeding and enough, e.g. >10⁴ and preferably up to 10⁹, progeny spores in the final population.
Accordingly, in a preferred embodiment, the method comprises breeding to obtain at least 10⁴ progeny spores, at least 10⁵ progeny spores, at least 10⁶ progeny spores, or at least 10⁷ progeny spores. More preferably, the method comprises breeding to obtain at least 10⁸ progeny spores, or at least 10⁹ progeny spores.
The methods of the invention comprise breeding yeast exhibiting an improved phenotype for a biomanufacturing process, preferably improved recombinant protein production. Preferably, therefore, the method further comprises transforming at least one parental yeast strain with an expression vector encoding at least one recombinant protein. Alternatively, the method may comprise transforming the yeast progeny with an expression vector encoding at least one recombinant protein. In a preferred embodiment, the expression vector is a whole-2-micron family plasmid. A whole-2-micron family plasmid is an expression vector comprising at least 50%, at least 60%, at least 70%, at least 80%, preferably at least 90%, or most preferably 100% of the sequence of a natural yeast 2-micron plasmid. Alternatively, the expression vector may be a stable partial-2-micron plasmid. A stable partial-2-micron plasmid is an expression vector comprising the 2-micron origin of replication which is dependent on a whole-2-micron family plasmid, a whole-2-micron expression plasmid or functions provided from these plasmids for stable replication and maintenance. Alternatively, the expression vector may be an integrative plasmid (e.g. a plasmid that integrates into the yeast genome). Alternatively, the expression vector may be a centromeric plasmid (e.g. containing a centromeric sequence and/or an autonomous replicating sequence). Alternatively, the expression vector may be an artificial chromosome (e.g. a yeast artificial chromosome or YAC). It is advantageous, when screening for genomic traits in genetically diverse populations generated by multigenerational breeding, if the differences in product yield are due to genomic traits alone. A problem exists, however, if plasmids are used which have variable copy number, which itself will lead to variable product yield even in an isogenic population.
This problem can be severe if commonly used partial-2-micron plasmids are used, which rely on the native 2-micron plasmid for their maintenance and will have variable copy number in isogenic strains. The inventors have overcome this problem by using a whole-2-micron-family plasmid instead of the partial-2-micron plasmid, because the whole-2-micron-family plasmid is amplified to a relatively constant copy number in each different strain, i.e. the copy number for a particular plasmid will be approximately the same in transformants of the same strain.
Accordingly, in a preferred embodiment, at least one parental yeast strain does not comprise a partial-2-micron family plasmid. More preferably, the at least two parental yeast strains do not comprise a partial-2-micron family plasmid.
Preferably, the method comprises transforming at least one parental yeast strain with a whole-2-micron family plasmid. The whole-2-micron plasmid of Saccharomyces cerevisiae is a small circular, multicopy DNA element that resides in the yeast nucleus at a copy number of about 40-60 per haploid cell. Examples of whole-2-micron family plasmids include Scp1, Scp2 and Scp3 in S. cerevisiae, pSR1, pSB3 and pSB4 from Zygosaccharomyces rouxii, pSB1 and pSB2 from Zygosaccharomyces bailii, plasmid pSM1 from Zygosaccharomyces fermentati, plasmid pKD1 from Kluyveromyces drosophilarum, and an un-named plasmid from Pichia membranaefaciens. These whole-2-micron family plasmids share a series of common features in that they possess two inverted repeats on opposite sides of the plasmid, have a similar size of around 6 kbp (range 4757 to 6615 bp), and have at least three open reading frames, one of which encodes a site-specific recombinase (such as FLP in 2μm), and an autonomously replicating sequence (ARS), also known as an origin of replication (ori), located close to the end of one of the inverted repeats. Examples of whole-2-micron family plasmids are provided in Sleep et al. 2005⁶ (WO2005061719A1), the contents of which are incorporated herein in their entirety. Additionally, examples of multiple types of 2-micron plasmid in Saccharomyces cerevisiae are described in Strope et al. 2015⁷. More preferably, the method comprises transforming at least one parental yeast strain with an engineered whole-2-micron family plasmid. Preferably, an engineered whole-2-micron family plasmid is a whole-2-micron family plasmid which has been engineered for recombinant protein production (a whole-2-micron expression plasmid), preferably where recombinant protein production is inactive, repressed or uninduced during breeding.
If the expression plasmid is introduced into one or more of the parent strains at the start of the multigenerational breeding, it will be present in the final population, e.g. >10⁸ progeny, but its presence is likely to select against the optimal recombinant protein production strains generated by breeding. This is because the presence of the plasmid and the metabolic burden on the strain producing recombinant proteins, even at a low level, often allows poor expressing strains to outgrow the high expressing strains during the many generations of growth required.
While there are several inducible yeast promoter systems which can be used to switch recombinant protein expression on and off, these are generally not suitable for large-scale biopharmaceutical manufacture, either due to cost or undesirable properties of the inducer within the process stream, e.g. antibiotics.
Regulated S. cerevisiae promoters allow the control of timing and gene expression levels, achievable through manipulation of the growth medium by the addition of metabolites or ions. The galactose (GAL1-10) promoters allow regulation of gene expression through the carbon source, galactose for induction, and glucose for repression. The phosphate (PHO5) and the copper (CUP1) promoters are inducible by phosphate and copper, respectively. However, regulation of these promoters often interferes with the cellular metabolism, due to the changes in growth media, and in many cases does not completely shut off transcription. This can be overcome by using the tetracycline (Tet-On/Off) promoters, which are either inducible or repressible. Gene expression is regulated by binding the Tet-Off or Tet-On proteins to an element located within an inducible promoter to give different responses to doxycycline, a tetracycline derivative; Tet-Off activates expression in the absence of doxycycline, whereas Tet-On activates in the presence of doxycycline. While the tetracycline derivative, doxycycline, does not interfere with the yeast cellular metabolism, it is undesirable for biopharmaceutical manufacturing. It also requires a specific strain background or additional manipulations of the strains in use. To overcome this problem, the inventors provided a solution by transforming at least one parental strain with an expression plasmid containing a whole-2-micron-based plasmid derived from one of the parental strains containing a repressible promoter from the S. cerevisiae MET17 gene (also known as YLR303W, MET15 or MET25) and growing the breeding population in the continual presence of methionine at a level sufficient to repress expression of the recombinant product (below approximately 0.05 mM methionine).
Accordingly, in a preferred embodiment, the method comprises growing the yeast in a media comprising less than 0.05 mM methionine, less than 0.04 mM methionine, less than 0.03 mM methionine, less than 0.02 mM methionine, or less than 0.01 mM methionine. Alternatively, the method comprises growing the yeast in a media which does not comprise methionine. Advantageously, therefore, the repressible MET17 promoter can be switched on for the production of the recombinant protein product.
Accordingly, in one embodiment, the whole-2-micron family plasmid comprises a repressible promoter. Preferably, the repressible promoter is the MET17 promoter. In this embodiment, preferably the method comprises breeding the yeast in the presence of methionine.
The level of methionine in the culture media needs to be carefully controlled to ensure transcription is off when needed, because the yeast can metabolise methionine until its concentration eventually falls to a level below which transcription is activated. Typically, adding methionine to the culture media at 20 mM or above is sufficient to repress the MET17 promoter for several days. The MET17 promoter is repressed at methionine concentrations above approximately 0.05 mM. Accordingly, in one embodiment, the method comprises breeding the yeast in the presence of at least 0.05 mM methionine, at least 0.1 mM methionine, at least 0.5 mM methionine, or at least 1 mM methionine. Alternatively, the method comprises breeding the yeast in the presence of at least 5 mM methionine, at least 10 mM methionine, at least 15 mM methionine, or at least 20 mM methionine. Advantageously, therefore, the repressible MET17 promoter can be switched off during breeding. By using a repressible promoter, an inducer is not required for expression during industrial production campaigns. Furthermore, the repressible MET17 promoter provides additional options for optimising fermentation and production of proteins which are deleterious or toxic to the production host, e.g. by separating the growth phase from the production phase. Furthermore, this can be achieved using methionine, which is both cost-effective and safe for biopharmaceutical manufacturing.
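The threshold logic described above can be sketched as follows. Only the approximate 0.05 mM repression threshold comes from the text; the constant consumption rate, its value, and the function names are hypothetical assumptions made for illustration, since real methionine uptake will vary with cell density and growth phase.

```python
# Approximate MET17 repression threshold stated in the text (mM).
REPRESSION_THRESHOLD_MM = 0.05


def is_repressed(methionine_mM: float) -> bool:
    """True if the methionine level is at or above the repression threshold."""
    return methionine_mM >= REPRESSION_THRESHOLD_MM


def hours_until_derepression(initial_mM: float, consumption_mM_per_h: float) -> float:
    """Time until methionine falls to the threshold.

    Assumes a constant (zero-order) consumption rate, which is a
    simplification introduced for this sketch only.
    """
    if consumption_mM_per_h <= 0:
        raise ValueError("consumption rate must be positive")
    if initial_mM <= REPRESSION_THRESHOLD_MM:
        return 0.0
    return (initial_mM - REPRESSION_THRESHOLD_MM) / consumption_mM_per_h


# Hypothetical example: 20 mM methionine consumed at 0.1 mM per hour.
print(is_repressed(20.0))
print(hours_until_derepression(20.0, 0.1))
```

A calculation of this kind could be used to decide when methionine needs to be replenished during breeding so that expression remains repressed.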
This additionally requires that all the parental strains lack the native 2-micron plasmid, either naturally or after curing. To facilitate stable plasmid maintenance, it is preferable to use a whole-2-micron-family plasmid sequence which is at least as stable as one naturally present and stably maintained in a parental strain. This can be isolated from one of the parental strains. It is still problematic to perform multigenerational breeding for every expression plasmid of interest, as this process is slow and laborious. Therefore, the inventors developed an additional improvement to the breeding process of this invention by transforming a cir° population of 10⁸ to 10⁹ progeny, e.g. after 12 generations of breeding, with a whole-2-micron expression plasmid to obtain 10⁴ to 10⁶ transformed progeny, then back-crossing them to the 10⁸ to 10⁹ cir° progeny, and breeding for one generation, preferably two generations, or more preferably 3, 4, or 5 additional generations, with repression of recombinant protein expression, before screening the entire population without repression of recombinant protein expression.
Accordingly, in one embodiment, the method comprises obtaining yeast progeny (preferably between 10⁸ and 10⁹ yeast progeny), which do not comprise a whole-2-micron expression plasmid (i.e. cir° progeny). Preferably, the yeast progeny are then transformed with a whole-2-micron expression plasmid to obtain transformed yeast progeny. Preferably, greater than 10³, or between 10⁴ and 10⁶ transformed yeast progeny are obtained. The method preferably further comprises back-crossing the transformed yeast progeny with the yeast progeny that do not comprise a whole-2-micron expression plasmid (i.e. the cir° progeny). Preferably, the yeast progeny are then bred for at least two generations, at least three generations, at least four generations, or at least five generations.
Alternatively, an individual strain or a selection of a few strains, transformed with the desired expression plasmid, can be back-crossed to the cir° population of 10⁸ to 10⁹ progeny from multigenerational breeding, followed by additional breeding to replicate the plasmid throughout the population. Accordingly, in another embodiment, the method comprises selecting an individual strain which does not comprise a whole-2-micron expression plasmid (i.e. cir° progeny), from the yeast progeny. Alternatively, the method may comprise selecting at least two individual strains, at least three individual strains, at least four individual strains, or at least five individual strains which do not comprise a whole-2-micron expression plasmid (i.e. cir° progeny), from the yeast progeny.
Preferably, the selected yeast progeny are then transformed with a whole-2-micron expression plasmid to obtain transformed yeast progeny. The method preferably further comprises back-crossing the transformed yeast progeny with the yeast progeny that do not comprise a whole-2-micron expression plasmid (i.e. the cir° progeny). Preferably, the yeast progeny are then bred for at least two generations, at least three generations, at least four generations, or at least five generations. It is desirable to have haploid progeny to facilitate QTL analysis and follow-on genetic improvements. Accordingly, in one preferred embodiment, the method comprises selecting for haploid progeny exhibiting an improved phenotype for a biomanufacturing process. The population of yeast cells obtained by the method of the invention will contain both mating types (MAT-a and MAT-alpha). Once germinated, haploids of the opposite mating-type tend to mate to form diploids, unless physically separated. This is undesirable if haploid screening of large populations (e.g. 10⁸ to 10⁹) is required to identify a final haploid production strain. If grown together as a population with both mating types present, mating can interfere with screening processes needed to detect preferred strains with improved phenotypes.
Consequently, it is desirable to obtain a stable haploid population of one mating type only, before allowing expression of the recombinant protein for screening purposes.
The inventors achieved this by germinating spores derived from a population comprising heterozygous functionally deleted PEP4 diploids, e.g. pep4/PEP4 heterozygous diploids, in the presence of an excess of a strain with one mating type and allowing for simple selective removal of diploids, e.g. by flow cytometry. For example, germination may be performed in the presence of an excess of a single mating-type strain containing at least one marker for subsequent removal of diploids arising from it. Accordingly, in one embodiment, the method comprises germinating a population of spores derived from heterozygous functionally deleted PEP4 diploids in the presence of a yeast strain of one mating type (i.e. MAT-a or MAT-alpha). Preferably, the population of spores derived from heterozygous functionally deleted PEP4 diploids is germinated in the presence of an excess of a yeast strain of one mating type (MAT-a or MAT-alpha).
Preferably, an excess of one mating type comprises at least 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or preferably 10-fold or 20-fold as many cells as the number of germinated spores. Preferably, the excess yeast strain produces a fluorogenic protein.
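The effect of the fold-excess can be illustrated with a simple random-encounter model, which is an assumption of this sketch and is not described in the specification. It supposes that the germinated spores are half MAT-a and half MAT-alpha, that the excess strain is present at k times the total number of germinated spores, and that a spore of the opposite mating type picks a compatible partner at random. The function names and example numbers are hypothetical.

```python
def excess_cells_needed(germinated_spores: int, fold_excess: int) -> int:
    """Number of single-mating-type cells required for a given fold-excess."""
    if germinated_spores < 0 or fold_excess < 1:
        raise ValueError("need non-negative spores and a fold-excess of at least 1")
    return germinated_spores * fold_excess


def fraction_captured(fold_excess: float) -> float:
    """Fraction of opposite-type spores expected to mate with the excess strain.

    Random-encounter assumption: compatible partners are the excess
    strain (k * N cells) and the same-type half of the spores (N / 2),
    giving k / (k + 0.5) = 2k / (2k + 1).
    """
    if fold_excess <= 0:
        raise ValueError("fold-excess must be positive")
    return 2.0 * fold_excess / (2.0 * fold_excess + 1.0)


# Hypothetical example: 10**6 germinated spores with a 10-fold excess.
print(excess_cells_needed(10**6, 10))
print(round(fraction_captured(10), 3))
```

Under this assumed model, a larger excess leaves fewer spore-to-spore diploids, which is consistent with the preference for a 10-fold or 20-fold excess.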
Preferably, the method further comprises selecting for progeny (e.g. haploid or diploid progeny) exhibiting an improved phenotype for a biomanufacturing process using flow cytometry or growth on a selective media. Markers may include, but are not limited to, auxotrophic markers or fluorogenic markers, e.g. fluorescent protein expression. These markers are suitable for the selection of large numbers of individual progeny of different types (e.g. >10⁴ of haploids of a single mating type) by growth on an appropriate media or by flow cytometry.
A period of growth is required after spore germination where recombinant protein production is permitted before flow cytometry to ensure different levels of recombinant protein production are detectable in the genetically diverse progeny being selected during flow cytometry, which corresponds to their genotypic differences. Accordingly, in a preferred embodiment, the method comprises allowing recombinant protein production to take place for at least 2 hours, at least 4 hours, at least 6 hours, at least 8 hours, at least 12 hours, at least 16 hours, at least 20 hours, at least 24 hours, at least 48 hours, at least 72 hours, or at least 96 hours prior to selecting for progeny (e.g. haploid or diploid progeny). Preferably, the method comprises allowing recombinant protein production to take place for between 12 and 24 hours, prior to selecting for progeny (such as haploid or diploid progeny), exhibiting an improved phenotype for a biomanufacturing process.
These genetically diverse haploid libraries enriched for a single mating type were used in subsequent screening processes, e.g., flow cytometry and/or microtitre plate culture, to identify strains improved for recombinant protein production, e.g., improved recombinant protein secretion or intracellular protein production. Haploid individuals can also be obtained by transformation of pep4/pep4 diploids exhibiting improved phenotypes for a biomanufacturing process, with a plasmid to temporarily complement the pep4-disruption to allow sporulation, e.g. a genetically stable CEN-vector expressing PEP4.
Accordingly, in another embodiment, the method comprises transforming a homozygous functionally deleted PEP4 diploid, exhibiting an improved phenotype for a biomanufacturing process, with a plasmid that complements the functional deletion of PEP4. Preferably, the plasmid is a genetically stable CEN-vector expressing PEP4. Spores can then be obtained by standard methods, e.g. tetrad analysis. Preferably, therefore, the method comprises using tetrad analysis to obtain spores. Haploid progeny can subsequently be cured of the complementing PEP4 plasmid before characterisation of haploids containing the desired combination of alleles from the parents, which also exhibit improved phenotypes for a biomanufacturing process.
Accordingly, the method preferably comprises curing the haploid progeny of the complementing PEP4 plasmid, and selecting for haploid progeny exhibiting an improved phenotype for a biomanufacturing process. Curing may be achieved by a variety of methods, including counter-selecting against selectable markers, such as URA3 with 5-fluoroorotic acid (5FOA) or LYS2 with alpha-aminoadipate, which may be combined with multiple generations of cell division without selection for the CEN-vector.
In some cases, it is desirable to select for diploids as the final production strain. Accordingly, in another preferred embodiment, the method comprises selecting for diploid progeny exhibiting an improved phenotype for a biomanufacturing process.
Preferably, the method comprises selecting for homozygous functionally deleted PEP4 diploid progeny, or selecting for a mixed population of heterozygous functionally deleted PEP4 diploid progeny and homozygous functionally deleted PEP4 diploid progeny.
In one embodiment, the yeast is Pichia pastoris (Komagataella species, such as K. phaffii, K. pastoris, and K. pseudopastoris) or Hansenula polymorpha (also known as Ogataea polymorpha), Kluyveromyces lactis, Yarrowia species such as Yarrowia lipolytica, or Schizosaccharomyces pombe. Alternatively, in a preferred embodiment, the yeast is a Saccharomyces species yeast, such as Saccharomyces cerevisiae. Most preferably, the yeast is Saccharomyces cerevisiae.
In one embodiment, at least one parental yeast strain comprises a functionally deleted UBC4 gene, or a homologue, orthologue or paralogue thereof. UBC4 encodes a ubiquitin conjugating enzyme that links ubiquitin (Ubi4p) to lysine residues of target proteins.
In another embodiment, at least one parental yeast strain comprises a functionally deleted YPS1 gene, or a homologue, orthologue or paralogue thereof. YPS1 (also known as YAP3, with systematic name YLR120C) is an aspartic protease and a hyperglycosylated member of the yapsin family of proteases, attached to the plasma membrane via a glycosylphosphatidylinositol (GPI) anchor. It is involved with other yapsins in the cell wall integrity response and has a role in KEX2-independent processing of the alpha factor precursor. It can be beneficial to use yps1 gene disrupted strains for recombinant protein production.
Additional engineering to improve recombinant protein production may be desirable in pep4 and yps1 disrupted strains. Examples include disruption of genes to increase recombinant protein production, such as UBC4, MOT2 and GHS1, overexpression of genes for chaperones, such as PDI1 and ERO1, or manipulation of the unfolded protein response, e.g. by HAC1i overexpression.
Accordingly, in one embodiment, at least one parental yeast strain comprises a functionally deleted MOT2 and/or GHS1 gene, or a homologue, orthologue or paralogue thereof. In another embodiment, at least one parental yeast strain comprises an over-expression of genes encoding chaperones, for example, overexpression of PDI1 and ERO1. In another embodiment, at least one parental yeast strain comprises an over-expression of HAC1i.
In one embodiment, the at least two parental yeast strains comprise a selectable or auxotrophic marker. For example, in one embodiment, the at least two parental yeast strains comprise a ura3 or a lys2 marker. Preferably, the selectable or auxotrophic markers are located at the same genomic loci in the at least two parental yeast strains, e.g. at the LYS2-locus. Preferably, the at least two parental yeast strains are genetically diverse. Genetically diverse strains are defined as strains being at least 0.001%, at least 0.005%, at least 0.01%, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.06%, at least 0.07%, at least 0.08%, or at least 0.09% different by whole genome comparisons performed by Neighbour-joining based on SNP differences. Preferably, the strains are at least 0.1%, at least 0.2%, at least 0.3%, at least 0.4%, at least 0.5%, at least 0.6%, at least 0.7%, at least 0.8%, at least 0.9%, or preferably at least 1%, or more preferably at least 2%, or at least 5% different by whole genome comparisons performed by Neighbour-joining based on SNP differences. Strains are preferably compared to the yeast reference strain S288C. Alternatively, strains may be compared to each other, for example, the parental strains used in a cross are compared by whole genome comparisons performed by Neighbour-joining based on SNP differences. As used herein, the "breeding" of yeast refers to the process of mating, followed by the generation of progeny. Sporulation is required for the generation of progeny, and occurs when yeast exit the mitotic cell cycle and enter into meiosis, leading to spore formation. The term "homologue" will be well understood by the skilled person to mean a gene or genetic region that is similar in sequence, structure or evolutionary origin to a gene or genetic region in another species or organism.
For example, homologous genes may be derived from a single common ancestral gene present in the common ancestor of different organisms. Homologous genes will encode proteins with the same or similar function in different species, and may also be referred to as an "orthologue".
The term "orthologue" will be well understood by the skilled person to mean one of two or more homologous gene sequences found in different species.
The term "paralogue" will be well understood by the skilled person to mean a gene which has evolved by a gene duplication event within a genome. For example, gene duplication within a single species may involve one copy of the gene receiving a mutation that gives rise to a new gene. Paralogous genes code for a protein with similar, but not necessarily identical functions.
In a second aspect of the invention, there is provided a yeast strain exhibiting an improved phenotype for a biomanufacturing process, wherein the yeast strain is obtained by the method according to the first aspect. Preferably, the yeast strain is an engineered yeast strain. In one embodiment, the yeast is Pichia pastoris (Komagataella species, such as K. phaffii, K. pastoris, and K. pseudopastoris) or Hansenula polymorpha (also known as Ogataea polymorpha), Kluyveromyces lactis, Yarrowia species such as Yarrowia lipolytica, or Schizosaccharomyces pombe. Alternatively, in a preferred embodiment, the yeast is a Saccharomyces species yeast, such as Saccharomyces cerevisiae. Most preferably, the yeast is Saccharomyces cerevisiae.
Preferably, the yeast strain comprises a functionally deleted protease gene. More preferably, the yeast strain comprises a functionally deleted PEP4 gene, or a homologue or paralogue thereof.
In a third aspect of the invention, there is provided a yeast library obtained by the method according to the first aspect. Preferably, the yeast library is genetically diverse. Preferably, the yeast library comprises an engineered whole-2-micron plasmid. Preferably, the engineered whole-2-micron plasmid is engineered for recombinant protein production, where recombinant protein production is inactive, repressed or uninduced during breeding.
In a fourth aspect, there is provided a product produced by the yeast strain according to the second aspect or the yeast library according to the third aspect.
Preferably, the product is recombinant. The product may be a peptide, polypeptide or protein. The product may comprise at least 10, 20, 30, 40 or 50 amino acids. Preferably, the product comprises at least 100, 200, 300, 400 or 500 amino acids. More preferably, the product comprises at least 1000, 2000, 3000, 4000 or 5000 amino acids.
Preferably, the product is purified. Preferably, the product is at least 95%, 96%, 97%, 98% or 99% pure. In one embodiment, the term "recombinant protein" may be any protein not naturally produced by the expression host, including fusion proteins, tagged proteins, muteins, analogues, derivatives, domains, precursors and fragments of any protein or polypeptide, including, but not limited to, the following proteins (or other polypeptides) of interest. Proteins (or other polypeptides) of interest include albumin, transferrin, lactoferrin, immunoglobulin (such as an Fab fragment or single-chain antibody, including ScFvs, VHHs and VNARs), (haemo)globin, leghaemoglobin, myoglobin, blood clotting factors (such as factors II, VII, VIII, IX), von Willebrand's factor, tick anticoagulant peptide, endostatin, angiostatin, ice-structuring proteins, hydrophobins, interferons, interleukins, alpha-1-antitrypsin, insulin, GLP-1, glucagon, calcitonin, cell surface receptors, fibronectin, prourokinase, (pre-pro)-chymosin, antigens for vaccines (including virus-like particles), t-PA, urokinase, prourokinase, hirudin, tumour necrosis factor, G-CSF, GM-CSF, Kunitz domain proteins, CNTF, growth hormone, transforming growth factors, fibroblast growth factors, nerve growth factors, serum cholinesterase, aprotinin, amyloid precursor protein, inter-alpha trypsin inhibitor, antithrombin III, apolipoproteins, bone morphogenic proteins, MIC-1, leptin, erythropoietin (EPO), thrombopoietin (TPO), parathyroid hormone, platelet-derived endothelial cell growth factor, platelet-derived growth factor, Protein C, Protein S, keratins, collagens, antimicrobial peptides, defensins, chymosins, casein, amylases, and enzymes generally, such as glucose oxidase and superoxide dismutase. The protein may be a viral, microbial, fungal, plant or animal protein, for example, a mammalian protein. In one embodiment, it is a human protein.
The recombinant protein may be a protein endogenous to the host, such as an enzyme, for which production has been improved by strain engineering in a recombinant eukaryotic cell.
All of the features described herein (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined with any of the above aspects in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.
For a better understanding of the invention, and to show how embodiments of the same may be carried into effect, reference will now be made, by way of example, to the accompanying Figures, in which:
Figure 1 shows how a mixed population of heterozygous pep4/PEP4 and homozygous pep4/pep4 diploids ("mixed diploids", MD) was prepared by allowing germination and mating of pep4::KanMX and PEP4 haploids followed by selection against PEP4:PEP4 homozygous diploids with G418. This figure also shows how a population of pep4::pep4 homozygous diploids ("pure diploids", PD) can be produced by selecting only pep4::KanMX haploids using G418 and allowing them to mate.
Figure 2 shows specific mCherry activities for progeny yeast strains grown at 30°C. The average across four replicates is displayed (error bars show 1 SD from 4 replicates). Significant diversity in the mCherry phenotype is observed, with more than a 10-fold difference observed between the highest and lowest producers.
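The replicate summary described for Figure 2 can be sketched as below. The strain names and values are hypothetical and are not data from the specification, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def summarise_replicates(values):
    """Return (mean, sample standard deviation) for replicate measurements."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for an SD")
    return mean(values), stdev(values)


# Hypothetical specific mCherry activities, four replicates per strain.
replicates = {
    "strain_low": [10.0, 12.0, 11.0, 9.0],
    "strain_high": [130.0, 120.0, 125.0, 125.0],
}

summary = {name: summarise_replicates(vals) for name, vals in replicates.items()}
means = [m for m, _ in summary.values()]

# Fold difference between the highest and lowest mean producers.
fold_range = max(means) / min(means)
print(summary)
print(fold_range)
```

A fold range above 10 in such a summary would correspond to the more than 10-fold difference between the highest and lowest producers noted for Figure 2.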
Figure 3 shows plots of mCherry against amylase specific activities for yeast strains grown in four microtitre plates at 30°C. While the mCherry levels range widely from low to high, the amylase levels tend not to fall so far towards zero, indicating that proteolysis has affected the mCherry and amylase domains differently.
Figure 4 shows typical populations selected by flow cytometry where low and non-producers of recombinant protein (based on mCherry signal) can be avoided (fraction 1) and populations with increasing intracellular recombinant protein production (fractions 2 to 6) can be selected for analysis. For RBD3-mCherry expression (from construct pEV25) in a pure diploid (pep4::pep4 homozygotes) population after at least four generations of breeding, individual cells were grown in 48-well microtitre plates. Cells from fraction 1 were cream coloured (indicating yeast with no/low mCherry production), whereas cells from fractions 2 to 6 were increasingly pink/red in visual appearance.
Figure 5 (left) shows increased recombinant RBD4-mCherry protein production in progeny selected from a genetically diverse yeast library lacking proteinase A expression. This led to the selection of strains DH01, DH02 and DH03 with significantly increased supernatant mCherry levels compared to the parental strains. Figure 5 (right) shows reducing SDS-PAGE analysis of culture supernatant from a strain improved for RBD4-mCherry protein production by multigenerational breeding.
Figure 6 shows enrichment of high mCherry producers after multiple rounds of flow cytometry. Cells were grown overnight to mid-log phase before cell sorting. Typically, the top 2% of the population for mCherry fluorescence was collected in the first passage then grown overnight at least once before optionally performing further flow cytometry passages, leading to sorting the top 2% into 96-well microtitre plates (1 cell/well) containing BMMD+CSM-Leu+Met.
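The "top 2%" gate described for Figure 6 can be illustrated with a short sketch. This is only an illustration of selecting the brightest fraction of a list of fluorescence values; the function name and the example data are hypothetical, and it does not represent the gating software used on a flow cytometer.

```python
import math


def top_fraction_gate(signals, fraction=0.02):
    """Return (threshold, selected events) for the brightest `fraction`.

    Events tied with the threshold value are all kept, so slightly more
    than `fraction` of the events may be returned when ties occur.
    """
    if not signals:
        raise ValueError("no events to gate")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # round() guards against floating-point noise before taking the ceiling.
    n_keep = max(1, math.ceil(round(len(signals) * fraction, 9)))
    ranked = sorted(signals, reverse=True)
    threshold = ranked[n_keep - 1]
    selected = [s for s in signals if s >= threshold]
    return threshold, selected


# Hypothetical mCherry fluorescence values for 100 events.
events = list(range(1, 101))
threshold, selected = top_fraction_gate(events, fraction=0.02)
print(threshold, selected)
```

Repeating such a gate on the regrown population would mirror the successive enrichment passages described for Figure 6.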
Figure 7 shows yield improvement by breeding for commercially relevant VLP proteins, HPV16(L1)-mCherry and mCherry-HBsAg (with and without an N-terminal 6His-tag containing methionine as the first amino acid), from pEV51, pEV52 and pEV56 expression constructs, respectively. Initially, a Cp12 cir° PD library was transformed with the expression plasmids, and expression studies showed that yeast genome diversity gave variable expression levels for all VLP-mCherry fusions, with a different average expression level for each VLP-fusion. Back-crossing followed by additional rounds of breeding was then performed to increase the diversity of plasmid-containing cells. Individuals with enhanced expression were selected for all VLPs. Yield improvements are shown in Figure 7 against the average of the diverse population.
Figure 8 shows the supernatant mCherry signal corrected for growth (whole culture OD600) for the parental control strains, the randomly picked colonies and the individuals enriched for improved recombinant protein secretion by flow cytometry. Breeding has generated individuals with significantly improved product yields, which can be further enriched by flow cytometry. Secretion of the recombinant VHH-mCherry fusion protein has been improved at least 10-20 fold (12-50 fold compared to parental strains) by the breeding and selection process for strains with protease disruptions. The key describes strains/populations plotted in the same order from left to right.
Figure 9 shows yield improvement from breeding, where parental strains tend to secrete recombinant albumin at significantly lower levels than the majority of the diverse population evaluated.
Figure 10 shows different embodiments of expression constructs used in the claimed invention. Figure 10a is the construct pHR1K, which encodes the LEU2 selectable marker and pUC57-Kan integrated at the SnaBI-site downstream of the 2-micron D gene. Figure 10b is the construct pHR2A, which was constructed in a 3-way ligation of the 2.6 kb pHR1K NotI-HpaI fragment, the 1.2 kb pHR1K NotI-AclI fragment and the 2.5 kb pUC57-Amp Eco J-NarI fragment. Figure 10c shows the construct pEV7 for amylase-mCherry secretion, with Figure 10d showing a map of the final expression plasmid pHR1K-pEV7 in yeast after homologous recombination with pHR1K, illustrating how each expression construct of the invention is used to make whole-2-micron expression plasmids. Figure 10e shows plasmid pEV25 containing the expression construct for secretion of RBD3-mCherry, and Figure 10f shows pEV26 containing the expression construct for secretion of RBD4-mCherry, with Figure 10g showing the locations of RBD3 and RBD4 within the SARS-CoV-2 Spike protein in relation to N-linked glycosylation sites. Figure 10h shows pEV51 containing the expression construct for intracellular production of HPV(L1)-mCherry. Figure 10i shows pEV52 containing the expression construct for intracellular production of mCherry-GS1-HBsAg, and Figure 10j shows pEV56 containing the expression construct for intracellular production of M6His-mCherry-GS1-HBsAg. Figure 10k shows pEV95 containing the expression construct for secretion of VHH-GS-mCherry. Figure 10l shows pEV60 containing the expression construct for intracellular production of Green Fluorescent Protein, ymUkG1. Figure 10m shows pEV3 containing the expression construct for the secretion of rHA-mCherry, where rHA is recombinant human albumin. Figure 10n shows pEV203 containing the expression construct for secretion of untagged rHA. Figure 10o shows pEV64 containing the expression construct for secretion of a 14 kDa recombinant protein fused to mCherry.
Figure 11 shows the plasmids used in this invention for expression of recombinant proteins, with a description of the expression cassettes they contain (including any detection tags and linkers) and whether they are designed for intracellular or secreted production.
Figure 12 shows a comparison of one of the best two strains for amylase-mCherry secretion (2-A2) with one of the worst two strains for amylase-mCherry secretion (1-C12), for expression of multiple different recombinant proteins.
Examples
The inventors set out to design a yeast breeding and selection method to increase recombinant protein yield by increasing cellular productivity or decreasing losses by proteolysis, thereby providing improved production strains for a particular recombinant protein. To do this, the inventors used two parents containing the pep4 gene disrupted with a dominant marker (KanMX) and two parents with a functional PEP4 gene.
To increase the genetic diversity available for biopharmaceutical production strain improvements, four genetically diverse yeast strains of baker's yeast,
Saccharomyces cerevisiae, were collected from around the world in compliance with the Nagoya Protocol. These strains were genetically modified to enable an advanced multigenerational breeding strategy, including engineering auxotrophic markers. Additional engineering and experimental testing were performed on the genomes of specific strains, e.g., to reduce protease production in selected parents by pep4 gene disruption, while still allowing breeding to occur. A method was developed to allow the biological processes of mating, sporulation and germination, which are essential for the multigenerational breeding programme, to occur despite pep4 gene disruption being present in the genetically diverse population, which ultimately allows pep4-deleted haploid or diploid strains to be screened for improved recombinant protein production. It was subsequently necessary to select for pep4-disrupted haploid or diploid progeny after multigenerational breeding, where at least one parent had contained the pep4 gene disruption.
Example 1: Plasmid Construction and Yeast Transformation
Whole-2-micron Plasmids
Plasmid pHR1K (Figure 10a) was constructed from two PCR fragments encoding the entire 2-micron plasmid from strain YLF185 (each containing one of the inverted repeat regions), which were cloned into the EcoRV site of pUC57-Kan (GeneWiz/Azenta) with a 2.1 kb PCR fragment encoding the S. cerevisiae S288C LEU2 gene (flanked by two PacI-sites introduced on the PCR primers), using NEBuilder® HiFi DNA Assembly Cloning Kit. Oligonucleotide PCR primers used to construct pHR1K are provided below, with F (forward) and R (reverse) primers binding to the regions PCR1-4 shown in the pHR1K map corresponding to the primer names (Figure 10a).
Table 1: PCR primers
[Table 1 is provided as an image (imgf000031_0001) in the original publication.]
Regions of primer sequence binding are marked on the pHR1K map as PCR1 to PCR4. Amplification was performed using a Q5 HiFi PCR kit (NEB) and/or restriction enzyme digestion to generate:
a) pUC57-Kan (from GeneWiz) cut with EcoRV (buffer 3.1), or uncut plasmid can be used as PCR template DNA.
b) The 2-micron fragment from the SnaBI-site near STB to just before the second inverted repeat, which contains the first inverted repeat, FLP and REP2 of the 2-micron B-form. PCR primers contain 20-30 bp homology to pUC57-Kan and the second 2-micron fragment, respectively. This fragment is around 3.1 kb in length.
c) The 2-micron fragment containing the second inverted repeat, REP1 and D. PCR primers contain 20-30 bp homology with the first 2-micron fragment and the LEU2 fragment. This fragment is around 3.4 kb in length.
d) The LEU2 fragment flanked by PacI-sites. PCR primers contain 20-30 bp homology with the second 2-micron fragment (encoding REP1 and D) and the pUC57-Kan fragment. This fragment is around 2.1 kb in length and can be amplified from S288C genomic DNA.
PCR and DNA cloning methods are well known to those skilled in the art of molecular biology. Co-transformation of a cir° yeast strain, such as YLF185 cir°, with all four fragments followed by selection of leucine prototrophs, extraction of total DNA, transformation of E. coli with kanamycin selection and then extraction and sequencing of plasmid DNA for alignment to the expected pHR1K sequence may be performed. Alternatively, seamless cloning can be performed in vitro (e.g. using a NEBuilder® HiFi DNA assembly kit) before the yeast and/or E. coli transformations. pHR1K plasmid DNA can be isolated from E. coli, and its identity confirmed by diagnostic restriction enzyme digests and/or DNA sequencing.
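The homology requirement in fragments a) to d) can be checked in silico before ordering primers. The following is a minimal sketch, not the inventors' procedure: it assumes the four fragments are supplied as plain sequence strings in assembly order (pUC57-Kan, first 2-micron fragment, second 2-micron fragment, LEU2) and simply measures how many terminal bases each fragment shares with the next one around the circle. The function names and the 30 bp search cap are illustrative assumptions.

```python
def terminal_overlap(left: str, right: str, max_len: int = 30) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    longest = min(max_len, len(left), len(right))
    for k in range(longest, 0, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def check_circular_assembly(fragments: dict[str, str]) -> dict[tuple[str, str], int]:
    """Report the terminal overlap for every adjacent pair in a circular assembly.

    `fragments` maps a fragment name to its sequence, listed in assembly order;
    the last fragment is compared against the first to close the circle.
    """
    names = list(fragments)
    report = {}
    for i, name in enumerate(names):
        following = names[(i + 1) % len(names)]
        report[(name, following)] = terminal_overlap(fragments[name], fragments[following])
    return report
```

Any junction reporting fewer than 20 shared bases would fall outside the 20-30 bp homology range described above and would warrant redesigning the corresponding primer.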
This whole-2-micron family plasmid contained the LEU2 selectable marker and pUC57-Kan integrated at the SnaBI-site downstream of the 2-micron D gene (see GenBank: J01347.1 sequence for SnaBI site location). SbfI and NotI sites were introduced on PCR primers for the directional insertion of expression constructs downstream of the LEU2 selectable marker. See the pHR1K map for details (Figure 10a). Standard methods were used for E. coli transformation and plasmid preparation.
pHR2A (see Figure 10b) was constructed in a 3-way ligation of the 2.6 kb pHR1K NotI-HpaI fragment, the 1.2 kb pHR1K NotI-AclI fragment and the 2.5 kb pUC57-Amp Eco -NarI fragment (also digested with NEB Antarctic phosphatase). Ligation used NEB high concentration T4 ligase with transformation into NEB10β competent cells for growth on LB plates containing ampicillin.
Expression Plasmids
DNA fragments were synthesised (GeneWiz/Azenta) to make expression constructs for different recombinant proteins, which were subsequently cloned directionally into the SbfI and NotI sites of pHR2A. To facilitate screening yeast strains for factors affecting recombinant protein production, proteins were typically expressed with and without the coding sequence for a fluorophore, e.g. mCherry from Anaplasma marginale (UniProt X5DSL3), genetically fused to the protein of interest. Genetic fusion to the C-terminus of the polypeptide of interest was performed where it was desirable for the polypeptide of interest to be synthesised before the mCherry domain, thereby ensuring that an mCherry signal detected in the cell was associated with synthesis of the entire fusion protein. Polypeptide coding regions were designed using codons selected for rapid mRNA translation in S. cerevisiae as described by Chu et al., 2014⁸.
For repression of recombinant protein production, e.g. during breeding, expression constructs used the repressible MET17 promoter (MET17p). The repressible MET17 promoter can be switched off during breeding (by adding methionine to the culture media to repress expression, e.g. at 20 mM concentration) and switched on for the production of recombinant protein, e.g. with easily detectable mCherry, in selected strains by growth in media lacking methionine (at methionine concentrations below approximately 0.05 mM). Strains were typically grown on BMMD media, as described by Evans et al. 2010⁹, with appropriate supplements, e.g., CSM-Leu (Formedium Ltd), allowing for plasmid selection. For expression studies, strains were typically cultured in BMMD+(CSM-Leu-Met). Expression plasmids were constructed for intracellular expression or secretion of recombinant proteins. Secretion was typically directed by the secretory leader sequence from the S. cerevisiae SUC2 (invertase) gene. This pre-leader sequence (SUC2pre) is removed by signal peptidase during translocation into the endoplasmic reticulum to give the mature protein, which is then secreted into the culture media. The S. cerevisiae ADH1 terminator (ADH1t) was used for transcriptional termination.
Yeast Strain Transformation
S. cerevisiae strains, e.g. Q427 (SA lineage) used to introduce the amylase-mCherry expression plasmid into the genetically diverse libraries, or the libraries themselves, were transformed to leucine prototrophy using a lithium acetate method (Sigma Aldrich Yeast Transformation Kit YEAST1). This was achieved by co-transformation of the yeast strains or libraries with DNA from both the whole-2-micron plasmid, e.g. pHR1K, and with DNA containing the expression construct plus flanking DNA homologous to the whole-2-micron DNA fragment, e.g. from pEV7 (described below). This allows in vivo assembly of the final expression plasmid, e.g. pHR1K-pEV7, by homologous recombination, which is shown in Figure 10d. Prototrophic transformants were selected on media without leucine, and cryopreserved stocks were prepared with 25% glycerol after plating a single transformant to provide single colonies. The expression plasmid, e.g. pHR1K-pEV7 shown in Figure 10d, was subsequently transferred to all progeny during multigeneration breeding.
This yeast cell transformation approach typically used approximately 100 ng each of gel-purified DNA from the plasmid containing the whole-2-micron DNA, e.g. the 7.3 kb pHR1K BstXI-NotI fragment, and the plasmid containing the expression construct, e.g. pEV7 digested with SwaI+Acc65I. The DNA from the whole-2-micron plasmid and the plasmid containing the expression construct and homologous flanking DNA were co-transformed into yeast, e.g. into Q427 cir° to give strain Q427 [pHR1K-pEV7], with homologous recombination (gap-repair) of the plasmid DNA fragments acting to assemble the final expression plasmid, e.g. pHR1K-pEV7. Gap-repair transformation and other yeast methods are described in Andersen et al. 2012¹⁰ (Structure-based mutagenesis reveals the albumin-binding site of the neonatal Fc receptor) and Finnis et al. 2018¹¹ (WO2018234349A1). A similar approach was used to construct other expression plasmids in this invention, e.g. using other pEV-plasmids described below. In cases where the expression cassette itself contained a restriction endonuclease site desirable for excising the whole expression construct with flanking DNA, alternative restriction sites may be used, e.g. BstEII instead of Acc65I.
Plasmids comprising expression constructs used in this invention include:
pEV7 (amylase-mCherry expression): Figure 10c. Contains the expression construct MET17p-SUC2pre-AA-mCherry-ADH1t, which was used with pHR1K DNA to transform yeast according to the method described above for in vivo construction of the final whole-2-micron expression plasmid, e.g. pHR1K-pEV7 for amylase-mCherry expression (Figure 10d). Equivalent methods were used for in vivo construction of the final whole-2-micron expression plasmids for other recombinant proteins using the plasmids described below, which contain the expression constructs for each different recombinant protein.
It was preferred in this case for the recombinant product to lack N-linked glycosylation, which might affect protein activities independently of product yield. Options include the expression of an alpha-amylase lacking potential N-linked glycosylation motifs (-N-X-S/T-) or expressing an alpha-amylase analogue modified to remove existing N-linked glycosylation motifs, e.g. Aspergillus oryzae alpha-amylase (UniProt P0C1B3) with serine 199 in the mature protein changed to an alanine residue.
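Screening a candidate sequence for -N-X-S/T- motifs is easy to automate. The sketch below is illustrative only: the function names are hypothetical, the toy sequence in the usage note is not from this document, and the exclusion of proline at the X position is a widely used convention that the text above does not state. The second helper mimics the serine-to-alanine style of change mentioned for the amylase analogue.

```python
import re

# N-X-S/T motif. Excluding proline at X is a common convention and an
# assumption here; the lookahead lets overlapping motifs be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T motif."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


def remove_sequon(protein: str, asn_position: int, replacement: str = "A") -> str:
    """Replace the S/T two residues after the Asn at `asn_position` (1-based)."""
    index = asn_position + 1  # 0-based index of the S/T residue
    if protein[index].upper() not in "ST":
        raise ValueError("no S/T residue at the expected motif position")
    return protein[:index] + replacement + protein[index + 1:]
```

For a toy sequence such as `MKNGSAANPTLLNVTQ`, `find_sequons` reports the Asn residues at positions 3 and 13 and skips the `NPT` stretch; applying `remove_sequon` at position 3 leaves only the motif at position 13.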
The coding sequence for the fluorophore, mCherry from Anaplasma marginale (UniProt X5DSL3), was genetically fused to the alpha-amylase C-terminal coding sequence to facilitate screening and selection of final yeast strains and to provide an additional phenotype for evaluating productivity and differential proteolysis. Alternatively, fluorescent proteins such as yEGFP, ymUkG1, mOrange and mNeonGreen, and others such as those described by Thorn et al. 2017¹², could be used. Codons were selected for rapid mRNA translation in S. cerevisiae as described by Chu et al. 2014⁸. pEV7 contains an expression construct comprising a repressible MET17 promoter, which drives expression of the alpha-amylase (AA) mCherry fusion protein without any N-linked glycosylation sites. The repressible MET17 promoter can be switched off during breeding (by adding methionine to the culture media at 20 mM or above). Secretion was directed by the secretory leader sequence from the S. cerevisiae SUC2 (invertase) gene. This pre-leader sequence is removed by signal peptidase during translocation into the endoplasmic reticulum to give the mature amylase-mCherry protein, which is then secreted into the culture media. The S. cerevisiae ADH1 terminator (ADH1t) was used for transcriptional termination.
pEV25 (RBD3-mCherry): Contains the expression construct MET17p-SUC2pre-RBD3-mCherry-ADH1t for RBD3-mCherry secretion, which is the SARS-CoV-2 Receptor Binding Domain (RBD) fragment (residues 344-604) with mCherry genetically fused to the C-terminus; Figure 10e.
pEV26 (RBD4-mCherry): Contains the expression construct MET17p-SUC2pre-RBD4-mCherry-ADH1t for RBD4-mCherry secretion, which is the SARS-CoV-2 Receptor Binding Motif (RBM) fragment (residues 438-505) with mCherry genetically fused to the C-terminus; Figure 10f.
RBD3 and RBD4 protein sequences are devoid of N-linked glycosylation sites. The locations of the RBD3 and RBD4 (also known as RBM) fragments within the SARS-CoV-2 coding sequence are shown in Figure 10g, where N-linked glycosylation sites are marked as vertical bars.
pEV51 to pEV58 (VLP proteins): Plasmids pEV51 to pEV58 for the intracellular production of VLP proteins are described below.
[The table describing plasmids pEV51 to pEV58 is provided as an image (imgf000036_0001) in the original publication.]
In plasmids pEV51-pEV58, GS1 is a linker with sequence GSGGSGGSGPVTN (SEQ ID No: 9), and GS2 is a linker with sequence GGSGS (SEQ ID No: 10). HPV16(L1) encodes Human papillomavirus type 16 major capsid protein L1 (UniProt: P03101 VL1_HPV16). HBsAg encodes Hepatitis B virus S protein (GenBank: AIJ50189.1). AP205 encodes Acinetobacter phage AP205 coat protein (Q9AZ42 Q9AZ42_9VIRU). 6His encodes the hexahistidine-tag, preceded by a methionine residue in M6His. mCh encodes mCherry protein.
Plasmid maps for pEV51 (Figure 10h), pEV52 (Figure 10i), and pEV56 (Figure 10j) are shown as examples.
pEV95 (VHH-mCherry): Plasmid pEV95 (Figure 10k) contains the expression construct MET17p-SUC2pre-VHH-mCh-ADH1t for secretion of VHH-mCherry, where GS is a linker with sequence GGGGSGGGGS (SEQ ID No: 11).
pEV60 (GFP-ymUkG1): Plasmid pEV60 (Figure 10l) contains the expression construct TEF2p-ymUkG1-ARO4t for intracellular production of GFP (ymUkG1), where TEF2p and ARO4t are the Saccharomyces cerevisiae TEF2 promoter and ARO4 terminator, respectively.
pEV3 (rHA-mCherry): Plasmid pEV3 (Figure 10m) contains the expression construct MET17p-SUC2pre-rHA-mCherry-ADH1t for secretion of recombinant human albumin with mCherry fused to its C-terminus.
pEV203 (rHA): Plasmid pEV203 (Figure 10n) contains the expression construct PRB1p-SUC2pre-rHA-ADH1t for the secretion of recombinant human albumin, driven by the Saccharomyces cerevisiae proteinase B promoter (PRB1p).
pEV64 (RP-mCherry): Plasmid pEV64 (Figure 10o) contains an expression construct MET17p-SUC2pre-RP-mCh-ADH1t for the secretion of a 14 kDa recombinant protein (RP) fused to mCherry.
Expression cassettes for additional recombinant proteins were designed similarly to pEV7, as SbfI-NotI fragments, which were cloned in place of the amylase-mCherry expression cassette for transcription in the same direction as the LEU2 gene in the final whole-2-micron vectors. The plasmids for expression of the additional recombinant proteins are described below and in Figure 11, with a description of the expression cassettes they contain (including any detection tags and linkers) and whether they are designed for intracellular or secreted production.
pEV1 contains an expression cassette for intracellular expression of mCherry. The mCherry coding sequence is essentially the same in all constructs of this invention. This expression cassette contains the MET17 promoter and ADH1 terminator described above for pEV7.
pEV51 & pEV52 contain intracellular expression cassettes for the virus-like particle proteins HPV16(L1)-mCherry and mCherry-GS1-HBsAg, respectively, where GS1 is a linker with sequence GSGGSGGSGPVTN (SEQ ID No: 9). HPV16(L1) encodes human papillomavirus type 16 major capsid protein L1 (UniProt: P03101 VL1_HPV16). HBsAg encodes Hepatitis B virus S protein (GenBank: AIJ50189.1). These expression cassettes contain the MET17 promoter and ADH1 terminator described above for pEV7.
pEV3 contains an expression cassette for the secretion of rHA-mCherry, where rHA encodes recombinant human albumin with the mature albumin sequence from UniProt P02768. This expression cassette contains the MET17 promoter, SUC2 leader and ADH1 terminator described above for pEV7.
pEV388 contains an expression cassette for the secretion of rHA (without an mCherry tag) with transcription from the Saccharomyces cerevisiae proteinase B promoter (PRB1p). Secretion is directed by the modified fusion leader sequence (mFL). The DNA coding sequence for mFL-rHA in pEV388 is the same as the open reading frame for mFL-rHA in SEQ ID 19 of WO2004009819.
pEV275 contains an expression cassette for a VHH domain antibody for prostate-specific membrane antigen, PSMA (Chatalic et al., 2015²⁰). This expression cassette contains the MET17 promoter, SUC2 leader and ADH1 terminator described above for pEV7.
pEV299 contains the same expression cassette as pEV275, except with transcription from the Saccharomyces cerevisiae proteinase B promoter (PRB1p).
pEV298 contains the same expression cassette as pEV299, except for secretion of a VHH-mCherry fusion protein.
pEV395 contains an expression cassette for secretion of a GLP1 (9-37) analogue precursor-GS-HiBit fusion protein with amino acid sequence
EGTFTSDVSSYLEGQAAKEFIAWLVRGRGGGGGSGGGGSVSGWRLFKKIS (SEQ ID No: 10) with a Saccharomyces cerevisiae Mating Factor-alpha-derived leader sequence and Saccharomyces cerevisiae BCY1-derived terminator.
Equivalent yeast transformations were performed as required for the introduction of additional expression plasmids with homologous recombination (gap-repair) between DNA fragments comprising the whole-2-micron plasmid sequence and fragments comprising the expression construct with homologous flanking sequences from the different pEV-plasmids described above.
Example 2: Multigenerational breeding to produce genetically diverse libraries with different proteinase A genotypes.
The inventors generated C12 cir° and Cp12 cir° libraries with >10⁸ progeny. For C12 libraries, parental strains Q416 and Q413 were replaced with Q410 and elOS599, respectively.
Table 2: C12 and Cp12 yeast strain origin and genotype
[Table 2 is provided as images (imgf000038_0001 and imgf000039_0001) in the original publication.]
Strain Engineering, Breeding and Selection
The original strains described by Liti et al. 2009¹³, from the Saccharomyces Genome Resequencing Project (SGRP), were made genetically tractable by Cubillos et al. 2009¹⁴ and Louvel et al. 2014¹⁵ using standard methods of genome engineering. Representatives of the Wine/European (WE), West African (WA), North American (NA) and Sake (SA) clean lineages were selected, with derivatives YLF185, YLF187, YLF190, and YLF191 described in Louvel et al. 2014¹⁵ being further modified in this study to create strains Q416, Q413, Q426 and Q427, respectively, with genotypes shown in Table 2. Plasmid curing is described by Rose and Broach 1990¹⁶.
For the strain engineering, spore segregants were isolated from the diploids listed in Louvel et al. 2014¹⁵. Q410 was one of these segregants, which was cured of its native 2-micron plasmid. Q416 was derived from Q410 by disrupting pep4. elOS599 was also one of the spore segregants, and Q413 was derived from it in the same way. Q426 and Q417 were spore segregants from crosses of diploid spores described in Louvel et al. 2014¹⁵ and the lys2::URA3 strains from Cubillos et al. 2013¹⁷ (which had ura3::KanMX instead of ura3Δ).
Breeding
Breeding methods are described in Cubillos et al. 2013¹⁷. Breeding was performed for 12 generations before strain selection. A C12 population was bred from the four PEP4 parents (Q410, elOS599, Q426 and Q427) as described in Cubillos et al. 2013¹⁷, with two pairwise crosses first and then a mixed 4-way cross, with selection for Ura+ Lys+ in between cycles. Cp12 populations were generated from Q416, Q413, Q426 and Q427 as described below in the presence of methionine. C12 cir° diploid libraries contained entirely PEP4::PEP4 homozygous diploids, as the parental strains did not contain any pep4 KanMX gene disruptions.
Cp12 cir° diploid libraries comprise approximately 25% PEP4::PEP4 homozygotes, 50% PEP4::pep4 heterozygotes, and 25% pep4::pep4 homozygotes. Selection from Cp12 cir° libraries or Cp12 libraries containing whole-2-micron expression plasmids was performed using G418. G418 is an aminoglycoside antibiotic similar in structure to gentamicin B1, which blocks polypeptide synthesis by inhibiting the elongation step. Strains containing the KanMX gene are resistant to G418.
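The approximate 25%/50%/25% split is what random mating predicts when half of the haploids carry the KanMX-disrupted pep4 allele. The following sketch works through that arithmetic and shows how the proportions shift if G418 is applied to the diploids. It is an idealised illustration only: the function names are hypothetical, and it assumes random mating, an allele frequency of exactly one half, no fitness differences between genotypes, and complete killing of cells lacking KanMX.

```python
from fractions import Fraction


def diploid_genotypes(pep4_frequency: Fraction) -> dict[str, Fraction]:
    """Expected diploid genotype fractions under random mating of haploids."""
    p = pep4_frequency          # haploids carrying pep4::KanMX
    q = 1 - pep4_frequency      # haploids carrying functional PEP4
    return {
        "PEP4::PEP4": q * q,
        "PEP4::pep4": 2 * p * q,
        "pep4::pep4": p * p,
    }


def after_g418(genotypes: dict[str, Fraction]) -> dict[str, Fraction]:
    """Drop genotypes without a KanMX-marked pep4 allele and renormalise."""
    survivors = {g: f for g, f in genotypes.items() if "pep4" in g}
    total = sum(survivors.values())
    return {g: f / total for g, f in survivors.items()}
```

With `pep4_frequency=Fraction(1, 2)` the first function returns 1/4, 1/2 and 1/4, matching the proportions quoted above; after the idealised G418 step the surviving diploids would be two thirds heterozygous and one third pep4::pep4 homozygous.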
A homozygous pep4::pep4 diploid population ("pure diploids", PD) was prepared by allowing germination of genetically diverse Cp12 progeny in the presence of G418 to select for pep4::KanMX spores only, followed by mating to form pep4::pep4 diploids.
Alternatively, a mixed population of heterozygous PEP4::pep4 and homozygous pep4::pep4 diploids ("mixed diploids", MD) was prepared by allowing germination and mating of pep4::KanMX and PEP4 haploids followed by selection against PEP4::PEP4 homozygous diploids with G418, as shown in Figure 1. Therefore, multigenerational libraries can be produced with a range of proteinase A genotypes, including an absence of the proteinase A gene, despite this protease being essential for an efficient breeding process.
Example 3: Screening multigenerational libraries with different proteinase A genotypes.
Initially, a twelve-generation mixed diploid library (Cp12 [pHR1K-pEV7]) containing an amylase-mCherry expression plasmid was screened by flow cytometry, and populations were isolated on the basis of intracellular mCherry signal for analysis of amylase-mCherry secretion. Individual strains were grown in microtitre plate culture and supernatant was removed for detection of mCherry in a plate reader. No significant mCherry signal was detected in the culture supernatant despite numerous progeny strains being analysed. This indicated that a sufficient reduction in proteolysis for full-length recombinant protein production might not be easily achieved by breeding alone, e.g. for practical screening experiments using up to 10³ individuals.
Because pep4 gene disruption is generally desirable for recombinant protein production, the inventors decided to perform equivalent screening with a "pure diploid" library comprising pep4::pep4 homozygotes. In this case, mCherry was detectable in the culture supernatants, indicating that it was beneficial to screen genetically diverse libraries lacking proteinase A expression. This demonstrated that the pep4 disruption is advantageous for selecting progeny from yeast breeding which have improved recombinant protein secretion. Indeed, it is difficult to detect any secreted products without using strains with pep4 disruption.
For screening individual genetically diverse progeny produced by multigenerational breeding, it is possible to collect populations of cells by flow cytometry, e.g. with different intracellular mCherry activities. For sorted populations of cells, individuals can be isolated by serial dilution and plating to single cells on agar plates containing appropriate culture media. Individual colonies can then be picked into microtitre plates for expression analysis, e.g. intracellular expression or secretion. Secreted protein can be detected by centrifugation to sediment cells and removing supernatant to a fresh microtitre plate for product detection, e.g. in a plate reader. Typically, cells were cultured in 100 µL media in 96-well microtitre plates or 500 µL media in 48-well microtitre plates, with shaking incubation at 30°C, 200-280 rpm/2.5 cm orbit with humidity control. For the removal of supernatant, it is important to minimise the carry-over of any cells with intracellular mCherry, which will otherwise interfere with the supernatant mCherry measurement. This can be achieved by robotics systems and/or multiple passages of supernatant after multiple centrifugation steps to minimise cell carry-over.
As an alternative to growing single colonies from individual cells in a population collected during flow cytometry, individual cells can be sorted directly into (or onto) culture media in microtitre plates with one cell per well. In this case it is preferable to sort the cells directly into media containing methionine to repress recombinant protein expression during growth of the single cell to a stationary phase culture (30°C, 280 rpm with humidity control). Typically, 50-100% of single cells can be routinely cultivated by these methods for the production of cryopreserved stocks and/or subsequent analysis. Therefore, genetically diverse individual strains from breeding can be evaluated for improved bioprocessing traits, such as improved levels of recombinant protein production or secretion. Due to the multigenerational breeding, phenotyping combined with genome sequencing provides data suitable for QTL analysis.
Figure 2 shows the mCherry activities from 41 strains selected for growth at 30°C suitable for QTL analysis. The supernatant mCherry levels were corrected for growth differences (OD620) for the strains used in the QTL analysis. Significant diversity in the mCherry phenotype is observed, with more than a 10-fold difference observed between the highest and lowest producers. A 10.8-fold difference in mCherry levels in the supernatant for the highest producer (2A2) was obtained compared to the lowest producer (3D9). Figure 3 shows plots of mCherry against amylase specific activities for each strain grown at 30°C, with strains grown in four MTPs. While the mCherry levels range widely from low to high, the amylase levels tend not to fall so far towards zero. This indicates that proteolysis has affected the mCherry and amylase domains differently. The mCherry domain appears more protease sensitive than the amylase domain.
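The growth correction and fold-difference comparison described for Figure 2 amount to a small calculation, sketched below. This is an illustration under stated assumptions rather than the inventors' analysis: the function names are hypothetical, each replicate is assumed to be a (supernatant mCherry signal, OD620) pair, and any background subtraction is assumed to have been done beforehand.

```python
from statistics import mean, stdev


def growth_corrected(signal: float, od: float) -> float:
    """Supernatant mCherry signal divided by the culture optical density."""
    if od <= 0:
        raise ValueError("optical density must be positive")
    return signal / od


def summarise(replicates: dict[str, list[tuple[float, float]]]) -> dict[str, tuple[float, float]]:
    """Map each strain to (mean, SD) of its growth-corrected replicate values."""
    summary = {}
    for strain, wells in replicates.items():
        values = [growth_corrected(signal, od) for signal, od in wells]
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[strain] = (mean(values), spread)
    return summary


def fold_range(summary: dict[str, tuple[float, float]]) -> float:
    """Ratio of the highest to the lowest mean growth-corrected signal."""
    means = [strain_mean for strain_mean, _ in summary.values()]
    return max(means) / min(means)
```

Passing the four replicate wells per strain through `summarise` gives the values and error bars of the kind plotted in Figure 2, and `fold_range` gives the highest-to-lowest ratio reported in the text.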
QTL analysis identified 16 genomic regions comprising approximately 3.3% of the total Saccharomyces cerevisiae genome containing genes and alleles responsible for the differential expression of the recombinant amylase-mCherry protein. Successive statistical and bioinformatic filters and rational selection methods were used to shortlist sequences, e.g. QTLs, genes and QTNs, responsible for the increased recombinant protein production.
Example 4: Enrichment of strains with high yield phenotypes by flow cytometry.
To screen large populations of genetically diverse progeny from multigenerational breeding, it is possible to use FACS (Fluorescence Activated Cell Sorting) combined with expression of recombinant fusion proteins containing fluorophore domains, e.g. mCherry, to enrich the population and/or select individuals improved for productivity. This is possible for both intracellular and secreted proteins.
FACS was used to obtain populations enriched for SARS-CoV-2 RBD3-mCherry productivity and secretion. Figure 4 shows typical populations selected by flow cytometry where low and non-producers of recombinant protein (based on mCherry signal) can be avoided (fraction 1) and populations with increasing intracellular recombinant protein production (fractions 2 to 6) can be selected for analysis. For RBD3-mCherry expression (from construct pEV25) in a pure diploid (pep4::pep4 homozygotes) population after at least four generations of breeding, individual cells were grown in 48-well microtitre plates. Cells from fraction 1 were cream coloured (indicating yeast with no/low mCherry production), whereas cells from fractions 2 to 6 were increasingly pink/red in visual appearance. Similarly, screening culture supernatants from an equivalent library producing SARS-CoV-2 RBD4-mCherry fusion protein (pEV26 expression construct) led to the selection of strains DH01, DH02 and DH03 (see Figure 5) with significantly increased supernatant mCherry levels compared to the parental strains. Reducing SDS-PAGE analysis of supernatant indicated the presence of full-length RBD4-mCherry, plus a proteolytic degradation product. Because RBM is a fragment of the full-length Spike protein without full disulphide bond stabilisation, fused to an mCherry domain which lacks disulphide bonds, it is expected to be especially sensitive to proteolytic degradation.
Therefore, screening multigenerational libraries lacking proteinase A is highly beneficial for selecting strains with increased secretion of recombinant proteins with different protease sensitivities. However, pep4 disruption alone is not sufficient to control proteolysis for all recombinant proteins in all progeny strains. Strains such as DH01, DH02 and DH03, isolated for the secretion of protease-sensitive products, can be cured of the expression plasmid used during screening and retransformed with an expression plasmid for a different recombinant product. Furthermore, haploids can be obtained by sporulation of strains such as DH01, DH02 and DH03 after transformation with a PEP4-complementation plasmid, and can subsequently be cured of all plasmids for use in producing new recombinant products.
Furthermore, multiple rounds of flow cytometry can be used to increase the enrichment of strains for improved recombinant protein production. This is especially useful for screening extremely large genetically diverse libraries (e.g. 10^8-10^9 progeny) for individuals with optimal, or close to optimal, phenotypes which occur less frequently in the population after breeding.
Example 5: Rapid production of large genetically diverse libraries by transformation of multigenerational cir° libraries and back-crossing. For producing large populations of genetically diverse libraries with high levels of mitotic recombination suitable for QTL analysis, it is desirable not to perform time-consuming multigenerational breeding (6-18 generations) for each product being investigated. Consequently, alternative strategies were developed using cir° libraries produced by multigenerational breeding, typically in methods combining library transformation followed by back-crosses and further breeding. For example, improving intracellular production of recombinant VLP (virus-like particle) proteins was demonstrated by Cpl2 cir° library transformation without and with back-crossing. Three viral proteins were expressed with different mCherry tags.
• HPV16 major capsid protein L1 (HPV16(L1) - non-enveloped, C-terminal antigen/tag).
• Hepatitis B surface antigen (HBsAg - lipid enveloped, N-terminal antigen/tag).
• AP205 coat protein (non-enveloped, C- or N-terminal antigen/tag).
All expression vectors were made for intracellular expression of VLP-mCherry fusions using the repressible S. cerevisiae MET17 promoter. The S. cerevisiae ADH1 terminator was used for all constructs and plasmids were assembled by homologous recombination in vivo to generate stable high-copy-number whole-2-micron expression vectors. An mCherry-tag was located at the N- or C-terminus to facilitate high-throughput selection of improved strains. The tag was designed to allow particle formation with the mCherry domain decorating the outside of the VLP. Linkers/spacers were included where needed to facilitate VLP formation. Constructs with His-tags were also investigated.
Table 3: Expression constructs for VLP monomers (table provided as image imgf000044_0001 in the original publication)
Initially, a Cpl2 cir° PD library was transformed with pEV51-54. Expression studies clearly showed that yeast genome diversity gave variable expression levels for all VLP-mCherry fusions, with different average expression levels for each VLP-fusion. AP205-mCherry fusions showed significantly higher expression levels than the HPV16(L1) and HBsAg fusions. Evidence consistent with macromolecular particle formation was observed for the high expressing cultures. Surprisingly, yields of the enveloped HBsAg appeared easier to enhance than HPV16(L1). However, the full genetic diversity in the 12th generation library cannot easily be screened by transformation alone, due to the limited numbers of transformants generated.
Consequently, back-crossing followed by additional rounds of breeding was performed to increase the diversity of plasmid-containing cells. Populations were enriched by multiple passages of flow cytometry for cells with the highest intracellular mCherry signals, before 10^3-10^5 single cells were grown for evaluation in microtitre plates (MTPs). mCherry signals were evaluated in different MTP formats and shake flasks. Individuals with enhanced expression were selected for all VLPs. Yield improvements are shown in Figure 7 for the pEV51, pEV52 and pEV56 constructs, compared to the average of the diverse population for each expression construct.
Back-crossing and breeding methods
For high-throughput screening of large populations (e.g. >10^6 progeny) of genetically diverse strains by flow cytometry, a Cpl2 cir° mixed diploid library was first transformed to obtain >10^3 transformants. The entire population of transformants was subsequently back-crossed with the Cpl2 cir° mixed diploid library and bred for additional generations. This method created genetically diverse libraries with >10^8 progeny, without having to introduce each expression plasmid into a parental strain followed by lengthy multigenerational breeding (e.g. at least 6-12 generations for each expression plasmid). The Cpl2 cir° mixed diploid library was initially transformed to leucine prototrophy with expression constructs for the mCherry-tagged VLP proteins. At least 10^3 transformants were pooled from each transformation and crossed separately with the Cpl2 cir° mixed diploid library at approximately a 1:1 ratio. Diploids were sporulated and treated with zymolyase and ether to kill vegetative cells. Spores were germinated, allowed to mate, and diploids subsequently selected on G418 to create a new population of mixed diploids (Cpl3 MD) for additional breeding cycles. After three rounds of breeding post-transformation, with continuous methionine repression of the MET17 promoter throughout, the Cpl5 MD libraries were screened by flow cytometry and expression analysis.
Increasing enrichment of intracellular mCherry signal by flow cytometry
Cells were grown overnight to mid-log phase in 50 mL shake flasks (30°C, 280 rpm) containing 10 mL BMMD+CSM-Leu-Met media before cell sorting in a Beckman Coulter Astrios EQ flow cytometer (sterile sorting, 70,000 cells/second). Typically, the top 2% of the population for mCherry fluorescence was collected in the first passage then grown overnight, and optionally passaged at least once more, before sorting the top 2% into 96-well microtitre plates (1 cell/well) containing BMMD+CSM-Leu+Met. Individual cells were grown (30°C, 280 rpm with humidity control) and glycerol stocks (25% glycerol final concentration) prepared for cryopreservation and subsequent screening. Figure 6 shows enrichment of cell populations with increasing intracellular mCherry levels with subsequent rounds of flow cytometry.
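The repeated top-2% gating described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names, the Gaussian signal distribution and the regrowth model (each regrown cell inherits a sorted cell's signal plus noise) are hypothetical and are not taken from the sorting protocol itself.

```python
import random


def top_fraction_gate(signals, fraction=0.02):
    """Return the brightest `fraction` of cells (at least one cell)."""
    if not signals:
        return []
    keep = max(1, int(len(signals) * fraction))
    return sorted(signals, reverse=True)[:keep]


def regrow(sorted_cells, population_size, spread, rng):
    """Hypothetical overnight regrowth: each new cell copies a sorted
    cell's signal and adds zero-mean Gaussian noise."""
    return [rng.choice(sorted_cells) + rng.gauss(0.0, spread)
            for _ in range(population_size)]


def enrich(population, passages=2, fraction=0.02, spread=0.1, seed=0):
    """Apply successive gate-and-regrow passages.

    Returns the final population and the mean signal before sorting
    and after each passage.
    """
    rng = random.Random(seed)
    size = len(population)
    means = [sum(population) / size]
    for _ in range(passages):
        gated = top_fraction_gate(population, fraction)
        population = regrow(gated, size, spread, rng)
        means.append(sum(population) / size)
    return population, means


if __name__ == "__main__":
    rng = random.Random(1)
    # Illustrative starting library: arbitrary fluorescence units.
    start = [rng.gauss(1.0, 0.5) for _ in range(5000)]
    _, means = enrich(start, passages=2)
    print([round(m, 2) for m in means])
```

In this toy model the population mean rises with each passage, which mirrors the qualitative trend reported for Figure 6; real sorting outcomes also depend on cell recovery and on how heritable the signal is.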
Microtitre plates containing BMMD+CSM-Leu-Met media were inoculated from stock plates and grown for 3-5 days before evaluating expression based on the whole culture mCherry fluorescence. Stocks resulting in the highest mCherry levels were grown in shake flask culture for subsequent analysis.
Preliminary data from cell lysates generated using a Covaris Focused-ultrasonicator were even more encouraging, e.g. for HBsAg-based proteins.
Example 6: Selection of genetically diverse haploid progeny enriched for a single mating-type
For the direct selection of haploid strains, which are preferred for genome sequencing and QTL analysis, it is possible to prepare genetically diverse haploid libraries enriched for a single mating-type before screening.
Yeast strain Q427 (SA lineage) was transformed to leucine prototrophy with pHRIK and pEV60 for homologous recombination in vivo and cryopreserved glycerol stocks prepared (called Q427 [pHR!K-pEV60] with approximately 4 x 10^8 cells/mL). These cells constitutively express GFP (ymllkGl).
Q427 was similarly transformed with pHRIK and pEV3 for homologous recombination in vivo and used as a parental strain to generate a Cpl2 mixed diploid library (as described above), called Cpl2 [pHRlK-pEV3]. A spore preparation was made from Cpl2 [pHRlK-pEV3] using the same zymolyase/ether method used during breeding, from which a cryopreserved glycerol stock was prepared (Cpl2 [pHR!K-pEV3] spore stock with approximately 10^6 spores/mL). Once germinated, cells from this diverse population will produce and secrete albumin-mCherry at different levels when grown in the absence of methionine.
The genetically diverse Cpl2 [pHR!K-pEV3] spores were then germinated in an excess of the strain Q427 [pHR!K-pEV60] on solid agar plates (e.g. YPD media with 1% yeast extract, 2% peptone, 2% glucose, and agar added at 2%), whereby the MATalpha progeny from the Cpl2 library can mate with the MATa Q427 [pHRlK-pEV60] to produce diploids capable of expressing mCherry-tagged albumin and GFP (ymllkGl). Isolation of haploid vegetative cells based on size and expression of mCherry only was performed by FACS, thereby removing most ungerminated spores and cells expressing GFP (ymllkGl) and enriching for genetically diverse MATa Cpl2 [pHRlK-pEV3] haploid progeny.
The following samples were plated onto BMMD+CSM-Leu-Met agar plates and grown for 4 days at 30°C:
1) 250 µL Q427 [pHRlK-pEV60] alone.
2) 125 µL Cpl2 [pHRlK-pEV3] spores alone.
3) 250 µL Q427 [pHRlK-pEV60] x 125 µL Cpl2 [pHRlK-pEV3] spores.
4) 250 µL Q427 [pHRlK-pEV60] x 250 µL Cpl2 [pHRlK-pEV3] spores.
5) 250 µL Q427 [pHRlK-pEV60] x 500 µL Cpl2 [pHRlK-pEV3] spores.
Samples were prepared for flow cytometry with a Beckman Coulter Astrios EQ cell sorter. Cells were harvested from the plates with 1 mL of BMMD+CSM-Leu-Met + 100 µg/mL ampicillin. 500 µL of each cell suspension was transferred to a 1.5 mL tube and 500 µL of the same medium was added. The cultured haploid Q427 [pHRlK-pEV60] and germinated Cpl2 [pHRlK-pEV3] spore samples were used with a sample of ungerminated Cpl2 [pHRlK-pEV3] spores to set the flow cytometry gates, so that single haploid vegetative cells with only mCherry fluorescence could be enriched and sorted into microtitre plates containing BMMD+CSM-Leu+Met for cell recovery and subsequent preparation of glycerol stocks. In the 25 x 96-well microtitre plates sorted, cell recovery was approximately 99%, with mCherry detected in approximately 98% of wells and GFP in approximately 2% of wells. The majority of cells were identified as auxotrophic, unable to grow in minimal media lacking uracil and lysine.
Similarly, genetically diverse libraries were made containing plasmid pHRlK-pEV64. A Cpl2 cir° MD library was transformed with pHRIK and pEV64 for homologous recombination in vivo and transformants selected for leucine prototrophy.
Approximately 10^3 Cpl2 MD [pHRlK-pEV64] transformant colonies were pooled, grown and back-crossed with the Cpl2 cir° MD library (~10^9 cells) as described above. For this, the strains were mixed, sporulated (SPM plates), germinated and mated (YPD plates), and diploids selected (YNB with G418). The Cpl3 MD [pHRlK-pEV64] diploids were selected using G418 and sporulated with the Cpl2 cir° MD library (~10^9 cells) for an additional round of mating with back-crossing to prepare Cpl4 MD [pHRlK-pEV64] diploids. The Cpl4 MD [pHRlK-pEV64] diploids were grown overnight without methionine (BMMD+CSM-Leu-Met) before selecting the top 2% (>10^5 individuals) by flow cytometry based on mCherry signal, which were recovered in BMMD+CSM-Leu+Met (estimated 57% recovery rate). This population was similarly subjected to an additional flow cytometry passage with the top 2% collected (estimated 99% recovery rate), from which spores were prepared for germination in the presence of an excess of Q427 [pHRlK-pEV60] as described above, with the equivalent controls prepared for setting gates for cell sorting in a Beckman Coulter Astrios EQ flow cytometer. Gates were set to sort haploid-sized cells expressing mCherry only into single wells of 18 x 96-well microtitre plates (1 cell/well) containing 0.1 mL BMMD+CSM-Leu+Met per well. Cell recovery was 57%, from which 96-well microtitre plates containing 0.1 mL BMMD+CSM-Leu-Met were inoculated and grown for 3-4 days (30°C, 280 rpm with humidity control) for recombinant protein expression analysis, with cells sedimented by centrifugation and supernatant mCherry levels determined, e.g. using a TECAN Spark plate reader.
Cells selected for high mCherry levels were analysed for ploidy. Each candidate strain was mated separately with strains PT2369 (MATalpha, ura2-1, tyr1-1) and PT2370 (MATa, ura2-1, tyr1-1), neither of which will grow in BMMD+leucine media, but both of which will complement the ura3 or lys2 mutations in haploid progeny selected from the genetically diverse library. Only one of either PT2369 or PT2370 will mate with a haploid cell, which must be of the opposite mating type, to give a diploid capable of growing in BMMD+leucine media. Diploids with genotype ura3/ura3, LYS2/lys2::URA3 will grow alone in BMMD+leucine media (without PT2369 or PT2370), whereas haploids (ura3, LYS2 or ura3, lys2::URA3) will not. 10 out of 19 strains tested were haploid.
Example 7: Production of multigenerational libraries with multiple protease gene disruptions.
Parental strains were produced from derivatives of YLF185, YLF187, YLF190, and YLF191 using standard genetic techniques. All derived parental strains contained disruption of the PEP4 protease gene and an additional protease disruption in the YPS1 protease gene.
YPS1 (also known as YAP3 with systematic name YLR120C) is an aspartic protease and hyperglycosylated member of the yapsin family of proteases, attached to the plasma membrane via a glycosylphosphatidylinositol (GPI) anchor. It is involved with other yapsins in the cell wall integrity response and has a role in KEX2-independent processing of the alpha factor precursor. It can be beneficial to use yps1 gene disrupted strains for recombinant protein production. Additional engineering to improve recombinant protein production may be desirable in these pep4 and yps1 disrupted strains. For example, disruption of genes to increase recombinant protein production, such as UBC4, MOT2 and GSH1, or overexpression of genes for chaperones, such as PDI1 and ERO1, or manipulation of the unfolded protein response, e.g. by HAC1i overexpression, may be desirable.
Parental strains were transformed with a low copy number CEN-vector with a URA3 selectable marker containing the PEP4 gene to complement the pep4 genotype during breeding. Multigenerational breeding was performed with cir° strains which were then cured of the CEN-plasmid by growth on non-selective media and counterselection with 5-fluoroorotic acid (5-FOA).
Example 8: Auxotrophic selection of pure haploid libraries. Enrichment of genetically diverse libraries of haploid cells was performed after multigenerational breeding using auxotrophic selection. All parental strains were engineered to contain a selectable marker under control of the HO-promoter at the disrupted HO-locus within strains. HO gene expression is exclusive to haploid cells and mating type switching is prevented by HO-disruption.
Germination of spores and continued growth in media lacking the essential supplement required in the absence of marker gene expression was used to maintain haploid progeny. Individuals can be selected for further analysis by FACS (Fluorescence Activated Cell Sorting) or as colonies on agar plates, either of which physically separates strains and prevents mating. Individual separated strains can be further screened for improved traits for recombinant protein production. Typically, 100% of progeny will be haploid.
Example 9: Selection of haploids containing multiple protease disruptions improved for recombinant protein yield.
Multigenerational breeding was performed for pep4, yps1 disrupted cir° parents with pep4 complementation from a CEN-vector containing the functional PEP4 gene.
Genetically diverse cir° progeny from breeding were transformed with plasmid pEV95 for methionine repressible expression of an mCherry-tagged VHH antibody fragment, and bred for an additional two generations. Approximately 1200-6000 5-FOA-resistant haploid colonies were isolated, which were cured of the PEP4 complementing CEN-vector. 96 colonies were randomly picked for expression analysis in 48-well microtitre plates. Initial 3-day inoculation cultures in BMMD+CSM-Leu+Met (without recombinant protein production) were followed by 5 days of growth in BMMD+CSM-Leu-Met at 30°C, 280 rpm/2.5 cm orbital shaking with humidity control, to generate culture supernatant for mCherry signal detection (TECAN Spark plate reader).
Colonies were also pooled for selection of individuals with improved VHH-mCherry production. Individual cells with the top 5% mCherry signal were sorted into individual wells of 50 x 96-well microtitre plates containing 0.1 mL BMMD+CSM-Leu+Met, which were grown to stationary phase and cryopreserved stocks prepared. Approximately 1,300 individuals were inoculated into 48-well microtitre plates for expression analysis as described above. Parental control strains were similarly grown in 48-well microtitre plates for comparison (16-24 replicates of each parent).
Figure 8 shows the supernatant mCherry signal corrected for growth (whole culture OD600) for the parental control strains, the randomly picked colonies and the individuals enriched for improved recombinant protein secretion by flow cytometry. Breeding has clearly generated individuals with significantly improved product yields, which can be further enriched by flow cytometry. Secretion of the recombinant VHH-mCherry fusion protein has been improved at least 10-20 fold (12-50 fold compared to parental strains) by the breeding and selection process for strains with protease disruptions.
Similarly, a cir° library was prepared following multigenerational breeding (> 6 generations) with pep4 and yps1 disrupted parents and haploid progeny cured of the PEP4 complementing CEN-vector. Approximately 3000 haploids (maintained by growth in media requiring HO-promoter driven marker expression) were pooled and transformed with plasmid pEV203 for constitutive expression of recombinant human albumin (rHA, untagged) with around 15,000 transformants generated. Approximately 1000 transformants were picked for expression analysis in 48-well microtitre plates with constitutive albumin expression for final product titre determination by an HTRF assay (cisbio HSA kit) following the manufacturer's instructions. Figure 9 shows the diversity for rHA in a sub-population of 192 individuals tested, which contained individuals significantly improved compared to the parental strains with equivalent engineering.
Example 10: QTL analysis with strains isolated from multigenerational libraries with different proteinase A genotypes.
Quantitative Trait Locus (QTL) analysis is a statistical method that links phenotypic and genotypic data to explain the genetic basis of variation in complex traits (Miles et al. 2008; reference 18). This method was performed on S. cerevisiae strains selected to cover a range of productivities for the recombinant protein amylase-mCherry, described in Example 2. This analysis identifies regions of the genome containing alleles and SNPs (single nucleotide polymorphisms), of which some are also QTNs (quantitative trait nucleotides) contributing to a phenotype, e.g. increased levels of the recombinant protein amylase-mCherry. The QTL analysis identifies "regions" (also called "intervals" or "loci") in the genome associated with improving the phenotype analysed.
16 genomic regions were identified using QTL analysis, which contain genes, and thereby alleles, responsible for the differential expression of the recombinant amylase-mCherry protein.
Short reads were first assessed for sequencing quality using fastqc, before each read was aligned against the S288C reference genome (R64-2-1) using bwa. Alignments were sorted and indexed using samtools, and duplicate reads were marked and removed using picard tools. Variants were then called using freebayes, with the minimum mapping quality and minimum base quality set to 20 and the ploidy set to diploid (--min-mapping-quality 20 --min-base-quality 20 -p 2). This output was then subjected to a set of filters to obtain SNP sites for use as genetic markers for the samples.
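The read-processing steps above could be chained roughly as follows. This is a sketch only: the freebayes parameters are those quoted in the text, but the file names, the use of paired-end reads, the `bwa mem` subcommand, the `picard` wrapper script and the `REMOVE_DUPLICATES=true` setting are assumptions, and the reference is assumed to have been indexed beforehand with `bwa index`. The function only assembles the shell commands and does not execute them.

```python
def build_pipeline(sample, reads_1, reads_2, reference="S288C_R64-2-1.fa"):
    """Return the shell commands for one sample, in execution order.

    All file names are hypothetical placeholders.
    """
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    metrics = f"{sample}.dup_metrics.txt"
    vcf = f"{sample}.vcf"
    return [
        # 1. Read quality assessment
        f"fastqc {reads_1} {reads_2}",
        # 2. Alignment against the S288C reference (assumed bwa mem, paired-end)
        f"bwa mem {reference} {reads_1} {reads_2} > {sam}",
        # 3. Sort and index the alignments
        f"samtools sort -o {sorted_bam} {sam}",
        f"samtools index {sorted_bam}",
        # 4. Mark and remove duplicate reads
        f"picard MarkDuplicates I={sorted_bam} O={dedup_bam} "
        f"M={metrics} REMOVE_DUPLICATES=true",
        f"samtools index {dedup_bam}",
        # 5. Variant calling with the parameters quoted in the text
        f"freebayes -f {reference} --min-mapping-quality 20 "
        f"--min-base-quality 20 -p 2 {dedup_bam} > {vcf}",
    ]


if __name__ == "__main__":
    for command in build_pipeline("strain_01", "s01_R1.fastq.gz", "s01_R2.fastq.gz"):
        print(command)
```

Keeping the commands as plain strings makes it easy to review them before handing them to a shell or a workflow manager.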
The following filters were applied:
a. The variant calling quality is more than 20;
b. The variant is observed in 100% of the samples in the calling set;
c. The allele frequency, REF/(REF + ALT), lies in the interval (0.1, 0.9); and
d. Calling positions are the known bi-allelic variant sites for the SGRP founders.
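The four filters can be expressed as a single predicate over a variant record. The sketch below is illustrative: the record layout (a dictionary with `qual`, `n_called`, `ref_obs`, `alt_obs`, `chrom` and `pos` keys) is hypothetical, and the upper allele-frequency bound of 0.9 is an assumed reading of the interval given in the text.

```python
def passes_filters(variant, n_samples, known_sites,
                   min_qual=20.0, af_range=(0.1, 0.9)):
    """Apply filters a-d to one variant record (hypothetical dict layout)."""
    # a. Variant calling quality must exceed the threshold.
    if variant["qual"] <= min_qual:
        return False
    # b. The variant must be observed in every sample of the calling set.
    if variant["n_called"] != n_samples:
        return False
    # c. REF / (REF + ALT) must lie strictly inside the allowed interval.
    depth = variant["ref_obs"] + variant["alt_obs"]
    if depth == 0:
        return False
    ref_fraction = variant["ref_obs"] / depth
    if not (af_range[0] < ref_fraction < af_range[1]):
        return False
    # d. The position must be a known bi-allelic founder site.
    return (variant["chrom"], variant["pos"]) in known_sites


def select_markers(variants, n_samples, known_sites):
    """Keep only the variants that pass all four filters."""
    return [v for v in variants if passes_filters(v, n_samples, known_sites)]


if __name__ == "__main__":
    # Made-up records purely to exercise the filters.
    known = {("chrI", 1000), ("chrII", 2500)}
    calls = [
        {"chrom": "chrI", "pos": 1000, "qual": 55.0, "n_called": 4,
         "ref_obs": 40, "alt_obs": 60},
        {"chrom": "chrII", "pos": 2500, "qual": 15.0, "n_called": 4,
         "ref_obs": 50, "alt_obs": 50},
    ]
    print([(v["chrom"], v["pos"]) for v in select_markers(calls, 4, known)])
```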
The reproducibility of strain measurements across each plate was assessed using R. QTL (Quantitative Trait Loci) analysis was applied to find the association between genotypes and phenotype measurements for each plate as well as for the averaged records. Specific activity values (mCherry fluorescence / OD620) from each culture (obtained using the methods described above) were used as phenotype inputs for the QTL analysis. A LOD (Logarithm Of the Odds) score was calculated for each locus. Candidate QTL intervals were selected if 1) the LOD score was > 3 in each separate replicate plate analysis and 2) the maximum LOD score was > 5 when all replicate plates were considered together. The intervals are summarised by the local maximum and a 1.5 LOD drop. In addition, 5k and 10k flanking regions are also summarised for each of the selected peak markers. Variants in QTLs resulting in nonsynonymous mutations were further annotated. To narrow down and identify candidate causative genes, only sites whose variants appear in the top 2 performing strains, with the alternative alleles present in the bottom 2 performing strains, were considered. Additionally, subsets of these genes with nonsynonymous mutations were selected based on gene function and position within each interval.
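The selection thresholds and the 1.5 LOD drop interval can be sketched as below. This is one possible reading, not the analysis code used: the per-plate threshold is checked at the marker where the combined LOD peaks, the interval is taken as the contiguous run of markers within 1.5 LOD of the peak, and all names and numbers are hypothetical.

```python
def candidate_peak(replicate_lods, combined_lod,
                   per_plate_min=3.0, combined_min=5.0):
    """Return the index of the peak marker if it meets both thresholds,
    otherwise None."""
    peak = max(range(len(combined_lod)), key=combined_lod.__getitem__)
    # Criterion 2: maximum LOD across all plates combined must exceed 5.
    if combined_lod[peak] <= combined_min:
        return None
    # Criterion 1: every replicate plate must exceed 3 at that marker.
    if any(plate[peak] <= per_plate_min for plate in replicate_lods):
        return None
    return peak


def lod_drop_interval(positions, lod, peak_index, drop=1.5):
    """Extend left and right from the peak while LOD stays within
    `drop` of the peak value; return (start, end) positions."""
    threshold = lod[peak_index] - drop
    left = peak_index
    while left > 0 and lod[left - 1] >= threshold:
        left -= 1
    right = peak_index
    while right < len(lod) - 1 and lod[right + 1] >= threshold:
        right += 1
    return positions[left], positions[right]


if __name__ == "__main__":
    # Made-up marker positions (bp) and LOD profiles for illustration.
    positions = [100, 200, 300, 400, 500]
    combined = [1.0, 4.0, 6.0, 5.0, 2.0]
    plates = [[0.5, 3.0, 3.5, 3.2, 1.0], [0.8, 2.9, 4.1, 3.6, 1.5]]
    peak = candidate_peak(plates, combined)
    if peak is not None:
        print(lod_drop_interval(positions, combined, peak))
```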
Example 11 - Improved Production of Multiple Protein Types
A comparison of one of the best two strains for amylase-mCherry secretion (2-A2) with one of the worst two strains for amylase-mCherry secretion (1-C12) was performed for multiple different recombinant proteins. Nine additional recombinant proteins were expressed, which were diverse in structure, size, and other physicochemical properties (Figure 11). In all cases except one, the best strain for amylase-mCherry production gave higher production for the other recombinant proteins (Figure 12). For the amylase-mCherry control, 2-A2 gave approximately 8.6 times more amylase-mCherry based on mCherry fluorescence than 1-C12. For the other proteins, the fold increase was between approximately 1.2 and 4.2, with one protein, HPV16(L1)-mCherry, giving approximately equal productivity between the two strains. In this case, HPV16(L1)-mCherry production is very different to amylase-mCherry production, so it is not unexpected that alleles beneficial to amylase-mCherry production that are present in strain 2-A2 would not improve HPV16(L1) production as significantly as for the other recombinant proteins: the HPV16(L1)-mCherry was expressed intracellularly for accumulation and VLP formation in the nucleus, whereas amylase-mCherry was expressed for secretion into the extracellular media. For all the other proteins, which had a range of detection tags and assay methods and were either secreted or expressed intracellularly for cytosolic accumulation, the alleles in 2-A2 were beneficial for improved recombinant protein production, e.g. through increased productivity and/or reduced proteolysis. The recombinant proteins expressed have a diverse range of sizes, folding, domain structures and other physicochemical properties, indicating that strain 2-A2 is also generally improved to produce many other recombinant proteins of interest. Multiple alleles beneficial for recombinant protein production and/or reduced proteolysis originating from the different parental strains have been combined in strain 2-A2.
While this combination of alleles was originally selected for improved production of amylase-mCherry, clearly, many of these alleles and other combinations of these alleles and the SNPs within them are also beneficial for the production of multiple other recombinant proteins.
For expression of multiple different types of recombinant protein, strains 2-A2 and 1-C12 were cured of the whole-2-micron expression vector for amylase-mCherry (pHR!K-pEV7) by the method described above and retransformed for expression from whole-2-micron plasmids equivalent to pHRIK of multiple different recombinant proteins comprising the expression cassettes described in Figure 11. All final whole-2-micron expression plasmids contain a LEU2 gene for leucine selection. Transformants were isolated as colonies on synthetic drop-out agar lacking leucine, e.g. BMMD+(CSM-Leu+Met). Three transformants were selected for each strain/plasmid combination for expression studies. Controls were strains 2-A2 and 1-C12 secreting amylase-mCherry from the pEV7 expression cassette in pHRlK-pEV7. Inoculum cultures for three transformants of 2-A2 and 1-C12 for each plasmid (and triplicate pHRlK-pEV7 controls) were started by picking cells from patches on solid media and transferring to 500 µL liquid cultures in clear, 48-well microtiter plates. Buffered synthetic drop-out media with 2% (w/v) dextrose, lacking leucine and containing 3 g/L methionine, was used to maintain plasmids and to repress expression from constructs utilising the MET17 promoter. Inoculum cultures were incubated at 30°C for 2 days, after which 20 µL of each inoculum culture was passaged into 500 µL synthetic drop-out media with 2% (w/v) dextrose, lacking leucine, in new 48-well microtiter plates, e.g. BMMD+(CSM-Leu-Met), in triplicate, to inoculate expression cultures. Expression cultures were incubated in shaking humidity chambers at 30°C over 4 days before harvesting. Upon harvest, culture OD was measured in wells using a TECAN Spark plate reader (Tecan, Switzerland). Culture supernatants were isolated by centrifugation at 1800 RCF and analysed for secreted product, where applicable.
For strains transformed with pEV1, pEV51 and pEV52 for the expression of intracellular mCherry and mCherry-tagged recombinant proteins, mCherry fluorescence was measured directly from the expression cultures in 48-well clear microtiter plates upon harvest at λex 540 nm; λem 614 nm, gain 100 on a TECAN Spark plate reader (TECAN, Switzerland).
For strains transformed with pEV3, pEV7 and pEV298 expression constructs for secreted expression of mCherry-tagged recombinant proteins, 200 µL culture supernatant was isolated as described previously and transferred to new, clear, 48-well microtiter plates. mCherry fluorescence of supernatants was measured at λex 540 nm; λem 614 nm, gain 100 on a TECAN Spark plate reader. OD was also measured to check for any accidental transfer of the cell pellet.
For strains transformed with pEV388 for the secreted expression of recombinant human albumin (rHA), titres were quantified using the Albumin Blue Fluorescence Assay Kit (Active Motif, Belgium) with a modified protocol for high-throughput detection in 384-well plates. Briefly, 12.5 µL culture supernatant was transferred to wells in a black, clear bottomed, non-treated 96-well assay plate. 75 µL assay reagent comprising 1 µL Albumin Blue dye and 74 µL Buffer A from the kit was added to each well and mixed by pipetting. Plates were incubated for 5 minutes at room temperature, then fluorescence at λex 560 nm; λem 620 nm was measured on a TECAN Spark plate reader. Fluorescence signals for each sample were averaged from three technical replicates and converted to relative levels using a standard curve of rHA prepared in expression medium.
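Converting averaged replicate signals to relative levels through a standard curve can be sketched as follows. The straight-line fit is an assumption (the shape of the curve is not stated here), and the standard concentrations and fluorescence values are made up for illustration.

```python
from statistics import mean


def fit_standard_curve(concentrations, signals):
    """Least-squares straight line: signal = slope * concentration + intercept.

    A linear response over the standards is assumed.
    """
    mean_x = mean(concentrations)
    mean_y = mean(signals)
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def signal_to_level(replicate_signals, slope, intercept):
    """Average the technical replicates, then invert the standard curve."""
    return (mean(replicate_signals) - intercept) / slope


if __name__ == "__main__":
    # Hypothetical rHA standards (arbitrary concentration units) and readings.
    slope, intercept = fit_standard_curve([0, 10, 20, 40], [5, 25, 45, 85])
    print(signal_to_level([44, 45, 46], slope, intercept))
```

Samples whose averaged signal falls outside the range of the standards would normally be diluted and re-measured rather than extrapolated.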
For strains transformed with pEV395 for the secreted expression of HiBit-tagged GLP1 analogue precursor, titres were quantified using the Hi-Bit (HiBit) extracellular detection kit (Promega, US), according to the manufacturer's instructions. A standard curve was made from a HiBit-tagged control protein (Promega, US) of known concentrations prepared in expression medium. Supernatant samples were diluted 1/10^4 to generate a signal within the linear range of the standard curve. Reactions were conducted in 20 µL final volumes (10 µL sample, 10 µL assay mix) in white, 384-well low-volume assay plates. Luminescence was measured on a BMG FLUOstar Omega plate reader (BMG Labtech, Germany). Luminescence signals for each sample were averaged from 3 technical replicates and converted to relative levels using the HiBit control protein standard curve.
For strains transformed with pEV299 and pEV275 for the secreted expression of untagged VHH, titres were quantified by SDS PAGE. Supernatant samples were run on NuPAGE 4-12% Bis-Tris gels (Thermo Fisher Scientific, US) according to the manufacturer's instructions, alongside 3 prepared samples of a purified VHH standard at known concentrations. Gels were Coomassie stained and imaged on an Amersham ImageQuant 800 (Cytiva, US), and densitometry analysis of bands corresponding to the VHH samples was conducted using ImageQuantTL (Cytiva, US). Band intensity values were converted to estimated relative levels using values obtained from the VHH reference standard.
All data are from triplicates corrected for growth/biomass (OD600 or OD620).
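The growth correction and strain comparison amount to simple arithmetic, sketched below. The function names and all numbers are hypothetical and are not measured values from this work.

```python
from statistics import mean


def growth_corrected(signals, ods):
    """Mean of per-culture signal divided by culture OD (e.g. OD600 or OD620)."""
    if len(signals) != len(ods):
        raise ValueError("one OD reading is needed per signal")
    return mean(signal / od for signal, od in zip(signals, ods))


def fold_change(improved, reference):
    """Ratio of growth-corrected levels between two strains."""
    return improved / reference


if __name__ == "__main__":
    # Made-up triplicates: (signal, OD) per culture for two strains.
    strain_a = growth_corrected([120.0, 126.0, 114.0], [4.0, 4.2, 3.8])
    strain_b = growth_corrected([40.0, 42.0, 38.0], [4.0, 4.2, 3.8])
    print(fold_change(strain_a, strain_b))
```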
Accordingly, these results demonstrate that the breeding method according to the claimed invention, using at least one parental yeast strain comprising a functionally deleted PEP4 gene, is able to generate individuals with significantly improved product yields for multiple different protein types, including secreted and intracellular proteins.
Conclusions
The inventors have identified that proteinase A gene PEP4 disruption is important for reducing the overall protease levels in final production strains genetically diverse for protease metabolism, in addition to any protease reduction caused by breeding and selection. It was unknown prior to this invention whether breeding alone was sufficient to allow strains to be selected with adequate control of proteolysis for recombinant protein production. The inventors designed a genetically stable expression vector, which allowed recombinant protein production to be repressed during the breeding so that the expression plasmid was present in all progeny (up to a billion progeny) without creating a significant burden of metabolism from recombinant protein production, which might otherwise lead to selection against high productivity strains, which are a valuable component of the final strain libraries.
Additionally, methods were developed to allow the selection of large populations of genetically diverse diploid progeny or haploid progeny, including populations of haploid progeny enriched for one mating type only (i.e. predominantly containing only one of either of the two possible mating types) for the selection of individuals with improved phenotypes for bioprocess improvement. Additional methods were developed for the auxotrophic selection of haploid populations to use when screening for individuals with improved phenotypes for bioprocess improvement.
Advantageously, therefore, the method of breeding according to the invention provides a variety of innovative and flexible options allowing for the generation of diverse yeast populations suitable for high throughput screening and QTL analysis to improve multiple bioprocessing traits, especially for industrial recombinant protein production. Because Saccharomyces cerevisiae is "the model eukaryote" with homology to other eukaryotic production hosts valuable for manufacturing biopharmaceuticals and other biologicals, these methods enable the discovery of underlying biology using methods (such as QTL analysis) which are relevant not only to this yeast but also to other eukaryotes. Genes and proteins involved in improving one eukaryotic host for the manufacture of biologicals are likely to have homologues of value in other eukaryotes. This is especially relevant to strain improvement in Pichia, filamentous fungi and mammalian cell lines such as Chinese hamster ovary (CHO) cells used for biologics production, where the breeding methods used in this invention are impossible, more difficult or less well developed.
References
1. Louis, E. J. Historical Evolution of Laboratory Strains of Saccharomyces cerevisiae. Cold Spring Harb. Protoc. 2016, (2016).
2. Piper, P. W. & Curran, B. P. When a glycolytic gene on a yeast 2 mu ORI-STB plasmid is made essential for growth its expression level is a major determinant of plasmid copy number. Curr. Genet. 17, 119-123 (1990).
3. Goldstein, A. L. & McCusker, J. H. Three new dominant drug resistance cassettes for gene disruption in Saccharomyces cerevisiae. Yeast 15, 1541-1553 (1999).
4. Heidler, S. A. & Radding, J. A. The AUR1 gene in Saccharomyces cerevisiae encodes dominant resistance to the antifungal agent aureobasidin A (LY295337). Antimicrob. Agents Chemother. 39, 2765-2769 (1995).
5. Solis-Escalante, D. et al. amdSYM, a new dominant recyclable marker cassette for Saccharomyces cerevisiae. FEMS Yeast Res. 13, 126-139 (2013).
6. Sleep, D. & Finnis, C. 2-MICRON FAMILY PLASMID AND USE THEREOF, WO 2005/061719 A1. (2005).
7. Strope, P. K. et al. 2µ plasmid in Saccharomyces species and in Saccharomyces cerevisiae. FEMS Yeast Res. 15, (2015).
8. Chu, D. et al. Translation elongation can control translation initiation on eukaryotic mRNAs. EMBO J. 33, 21-34 (2014).
9. Evans, L. et al. The production, characterisation and enhanced pharmacokinetics of scFv-albumin fusions expressed in Saccharomyces cerevisiae. Protein Expr. Purif. 73, 113-124 (2010).
10. Andersen, J. T. et al. Structure-based mutagenesis reveals the albumin-binding site of the neonatal Fc receptor. Nat. Commun. 3, 610 (2012).
11. Finnis, C., Nordeide, P. & McLaughlan, J. IMPROVED PROTEIN EXPRESSION STRAINS, WO 2018/234349 A1. (2018).
12. Thorn, K. Genetically encoded fluorescent tags. Mol. Biol. Cell 28, 848-857 (2017).
13. Liti, G. et al. Population genomics of domestic and wild yeasts. Nature 458, 337-341 (2009).
14. Cubillos, F. A., Louis, E. J. & Liti, G. Generation of a large set of genetically tractable haploid and diploid Saccharomyces strains. FEMS Yeast Res. 9, 1217-1225 (2009).
15. Louvel, H., Gillet-Markowska, A., Liti, G. & Fischer, G. A set of genetically diverged Saccharomyces cerevisiae strains with markerless deletions of multiple auxotrophic genes. Yeast 31, 91-101 (2014).
16. Rose, A. B. & Broach, J. R. Propagation and expression of cloned genes in yeast: 2-microns circle-based vectors. Methods Enzymol. 185, 234-279 (1990).
17. Cubillos, F. A. et al. High-resolution mapping of complex traits with a four-parent advanced intercross yeast population. Genetics 195, 1141-1155 (2013).
18. Miles, C. & Wayne, M. Quantitative Trait Locus (QTL) Analysis. Nat. Educ. 1, 208 (2008).

Claims
1. A method of breeding to generate yeast exhibiting an improved phenotype for a biomanufacturing process, the method comprising: i) breeding at least two parental yeast strains, wherein at least one parental yeast strain comprises a functionally deleted PEP4 gene, or a homologue, orthologue or paralogue thereof; and ii) selecting for progeny exhibiting an improved phenotype for a biomanufacturing process.
2. The method according to claim 1, wherein the method comprises at least three parental yeast strains or at least four parental yeast strains.
3. The method according to either claim 1 or 2, wherein the method comprises breeding genetically diverse parental yeast strains, preferably wherein the parental strains are not common laboratory strains.
4. The method according to any preceding claim, wherein the method comprises breeding one parental yeast strain comprising a functionally deleted PEP4 gene, with one parental yeast strain comprising a functional PEP4 gene.
5. The method according to any preceding claim, wherein the method comprises breeding two parental yeast strains comprising a functionally deleted PEP4 gene, with two parental yeast strains comprising a functional PEP4 gene.
6. The method according to any preceding claim, wherein at least one parental strain comprises a selectable marker, which functionally deletes, or is inserted into, the PEP4 gene.
7. The method according to claim 6, wherein the method comprises isolating homozygous functionally deleted PEP4 diploid progeny by selecting for the selectable marker.
8. The method according to either claim 6 or claim 7, wherein the method comprises germinating the progeny and isolating spores comprising a functionally deleted PEP4 gene by selecting for the selectable marker.
9. The method according to any one of claims 6-8, wherein the method further comprises mating the spores to form homozygous functionally deleted PEP4 diploid progeny.
10. The method according to any one of claims 6-9, wherein the method comprises germinating and mating functionally deleted PEP4 haploid progeny with functional PEP4 haploid progeny.
11. The method according to any one of claims 6-10, wherein the method comprises isolating homozygous functionally deleted PEP4 diploid progeny and heterozygous functionally deleted PEP4 diploid progeny by selecting for the selectable marker.
12. The method according to any preceding claim, wherein the method comprises breeding at least one parental yeast strain comprising a functionally deleted PEP4 gene, with at least one parental yeast strain comprising a functional PEP4 gene.
13. The method according to any one of claims 6-12, wherein the selectable marker is an auxotrophic marker, optionally wherein the selectable marker is selected from a group of markers consisting of: LEU2, TRP1, HIS3, HIS4, URA3, URA5, SFA1, ADE2, MET15, LYS5, LYS2, ILV2, FBA1, PSE1, PDI1 and PGK1.
14. The method according to any one of claims 6-13, wherein the selectable marker is a dominant selectable marker, optionally wherein the dominant selectable marker is selected from the group consisting of: KanMX, HygMX, NatMX, PatMX, AUR1-C and amdSYM.
15. The method according to any one of claims 6-14, wherein the method comprises breeding the yeast in the presence of a suitable selection agent which is selected based on the selectable marker.
16. The method according to any preceding claim, wherein in order to obtain a homozygous functionally deleted PEP4 population, the method comprises allowing germination of genetically diverse progeny in the presence of G418 to select for pep4::KanMX spores only, optionally wherein the spores are then mated to form homozygous functionally deleted PEP4 diploids (pep4::pep4 diploids), and/or wherein a mixed population of heterozygous functionally deleted PEP4 diploid progeny (pep4 PEP4) and homozygous functionally deleted PEP4 diploid progeny (pep4 pep4) is prepared by allowing germination and mating of pep4::KanMX and PEP4 haploids, preferably followed by selection against functional PEP4 homozygous diploids (PEP4:PEP4) with G418.
17. The method according to any preceding claim, wherein the improved phenotype for a biomanufacturing process is selected from the following phenotypes: increased protein product yield, reduced cell lysis, modified shear resistance, modified sedimentation, improved product harvesting, altered host cell protein profile, increased plasmid, genomic or phenotypic stability, improved growth phenotypes including modified media requirements, modified growth temperature or temperature range, and modified growth pH or pH range, and altered post-translational modification, including reduced proteolysis or modified glycosylation.
18. The method according to any preceding claim, wherein the improved phenotype for a biomanufacturing process is improved recombinant protein production.
19. The method according to any preceding claim, further comprising performing quantitative trait loci (QTL) analysis to identify the causal genetics associated with the improved phenotype for a biomanufacturing process.
20. The method according to any preceding claim, wherein the method comprises performing at least 2, at least 3, at least 4, at least 5, or at least 6 generations of breeding, or wherein the method comprises performing at least 7, at least 8, at least 9, at least 10, at least 11 or at least 12 generations of breeding.
21. The method according to any preceding claim, wherein the method comprises breeding to obtain at least 10⁴ progeny spores, at least 10⁵ progeny spores, at least 10⁶ progeny spores, at least 10⁷ progeny spores, at least 10⁸ progeny spores, or at least 10⁹ progeny spores.
22. The method according to any preceding claim, wherein the method further comprises transforming at least one parental yeast strain with an expression vector encoding at least one recombinant protein.
23. The method according to claim 22, wherein the expression vector is:
(i) a whole-2-micron family plasmid;
(ii) a stable partial-2-micron plasmid;
(iii) an integrative plasmid;
(iv) a centromeric plasmid; or
(v) an artificial chromosome.
24. The method according to either claim 22 or 23, wherein the expression vector is an engineered whole-2-micron family plasmid, optionally wherein the engineered whole-2-micron family plasmid is a whole-2-micron family plasmid which has been engineered for recombinant protein production.
25. The method according to claim 24, wherein the whole-2-micron family plasmid comprises a repressible promoter, optionally wherein the repressible promoter is the MET17 promoter.
26. The method according to claim 25, wherein the method comprises growing the yeast in a medium comprising less than 0.05 mM methionine, less than 0.04 mM methionine, less than 0.03 mM methionine, less than 0.02 mM methionine, or less than 0.01 mM methionine.
27. The method according to either claim 25 or claim 26, wherein the method comprises breeding the yeast in the presence of at least 0.05 mM methionine, at least 0.1 mM methionine, at least 0.5 mM methionine, at least 1 mM methionine, at least 5 mM methionine, at least 10 mM methionine, at least 15 mM methionine, or at least 20 mM methionine.
28. The method according to any preceding claim, wherein the method comprises:
(i) obtaining yeast progeny that do not comprise a whole-2-micron expression plasmid;
(ii) transforming the yeast progeny with a whole-2-micron expression plasmid to obtain transformed yeast progeny, preferably wherein greater than 10³, or between 10⁴ and 10⁶ transformed yeast progeny are obtained; and
(iii) back-crossing the transformed yeast progeny with the yeast progeny that do not comprise a whole-2-micron expression plasmid.
29. The method according to any one of claims 1 to 28, wherein the method comprises:
(i) selecting, from the yeast progeny, an individual strain that does not comprise a whole-2-micron expression plasmid, or selecting at least two individual strains, at least three individual strains, at least four individual strains, or at least five individual strains which do not comprise a whole-2-micron expression plasmid;
(ii) transforming the selected progeny with a whole-2-micron expression plasmid to obtain transformed yeast progeny; and
(iii) back-crossing the transformed yeast progeny with the yeast progeny that do not comprise a whole-2-micron expression plasmid.
30. The method according to either claim 28 or claim 29, wherein the yeast progeny are bred for at least two generations, at least three generations, at least four generations, or at least five generations.
31. The method according to any preceding claim, wherein the method comprises selecting for haploid or diploid progeny exhibiting an improved phenotype for a biomanufacturing process.
32. The method according to any preceding claim, wherein the method comprises germinating a population of spores derived from heterozygous functionally deleted PEP4 diploids in the presence of a yeast strain of one mating type.
33. The method according to claim 32, wherein the population of heterozygous functionally deleted PEP4 diploids are germinated in the presence of an excess of a yeast strain of one mating type, optionally wherein the excess of one mating type comprises at least 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or preferably 10-fold or 20-fold as many cells as the number of germinated spores.
34. The method according to any preceding claim, wherein the method further comprises selecting for progeny exhibiting an improved phenotype for a biomanufacturing process, using flow cytometry or growth on a selective medium.
35. The method according to any preceding claim, wherein the method comprises transforming a homozygous functionally deleted PEP4 diploid, exhibiting an improved phenotype for a biomanufacturing process, with a plasmid that complements the functional deletion of PEP4, optionally wherein the plasmid is a genetically stable CEN-vector expressing PEP4.
36. The method according to any preceding claim, wherein the yeast is Pichia pastoris, Hansenula polymorpha, Kluyveromyces lactis, a Yarrowia species or a Saccharomyces species yeast.
37. The method according to any preceding claim, wherein the yeast is Saccharomyces cerevisiae.
38. The method according to any preceding claim, wherein at least one parental yeast strain comprises a functionally deleted UBC4 gene, or a homologue, orthologue or paralogue thereof, and/or wherein at least one parental yeast strain comprises a functionally deleted YPS1 gene, or a homologue, orthologue or paralogue thereof.
39. The method according to any preceding claim, wherein at least one parental yeast strain comprises:
(i) a functionally deleted MOT2 and/or GSH1 gene, or a homologue, orthologue or paralogue thereof;
(ii) an over-expression of a gene encoding a chaperone; and/or
(iii) an over-expression of HAC1i.
40. The method according to any preceding claim, wherein the at least two parental yeast strains are genetically diverse, optionally wherein the strains are at least 0.001%, at least 0.005%, at least 0.01%, at least 0.02%, at least 0.03%, at least 0.04%, at least 0.05%, at least 0.06%, at least 0.07%, at least 0.08%, or at least 0.09% different by whole genome comparisons.
41. A yeast strain exhibiting an improved phenotype for a biomanufacturing process, wherein the yeast strain is obtained by the method according to any one of claims 1 to 40.
42. A yeast library obtained by the method according to any one of claims 1 to 40.
43. A product produced by the yeast strain according to claim 41 or the yeast library according to claim 42.
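
By way of illustration only, and forming no part of the claims, the marker-based selection described in claim 16 can be sketched as a Mendelian calculation in Python. The sketch assumes that the pep4::KanMX and PEP4 alleles segregate 1:1 from a heterozygous diploid, that haploids mate at random with respect to the PEP4 locus, and that a single copy of KanMX is sufficient for growth on G418. The function names and the genotype labels are hypothetical.

```python
from fractions import Fraction
from itertools import product


def diploid_frequencies(haploid_freqs):
    """Genotype frequencies from random pairwise mating of a haploid pool."""
    freqs = {}
    for (a, fa), (b, fb) in product(haploid_freqs.items(), repeat=2):
        genotype = "/".join(sorted((a, b)))  # order-independent genotype label
        freqs[genotype] = freqs.get(genotype, Fraction(0)) + fa * fb
    return freqs


def select_g418(freqs, marker_allele="pep4"):
    """Keep genotypes carrying at least one marked allele and renormalise."""
    survivors = {g: f for g, f in freqs.items() if marker_allele in g.split("/")}
    total = sum(survivors.values())
    return {g: f / total for g, f in survivors.items()}


# Spores from a heterozygous diploid: assumed 1:1 segregation of the two alleles.
spores = {"pep4": Fraction(1, 2), "PEP4": Fraction(1, 2)}

# Route 1: germinate on G418, so only pep4::KanMX spores grow, then mate them.
g418_spores = {"pep4": Fraction(1)}
homozygous_population = diploid_frequencies(g418_spores)

# Route 2: germinate and mate without selection, then apply G418 to the diploids.
mixed_before = diploid_frequencies(spores)
mixed_after = select_g418(mixed_before)

print(homozygous_population)
print(mixed_before)
print(mixed_after)
```

Under these assumptions the first route yields only homozygous pep4 diploids, while the second yields heterozygous and homozygous pep4 diploids in a 2:1 ratio once the PEP4 homozygotes have been removed by G418. Real populations may deviate from these proportions, for example through differences in spore viability or mating efficiency.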
PCT/GB2024/050386 2023-02-13 2024-02-13 Yeast breeding process for strain improvement, involving pep4 mutants WO2024170889A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GBGB2301988.8A GB202301988D0 (en) 2023-02-13 2023-02-13 Breeding process
GB2301988.8 2023-02-13

Publications (1)

Publication Number Publication Date
WO2024170889A1 (en) 2024-08-22

Family

ID=85704490

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2024/050386 WO2024170889A1 (en) 2023-02-13 2024-02-13 Yeast breeding process for strain improvement, involving pep4 mutants

Country Status (2)

Country Link
GB (1) GB202301988D0 (en)
WO (1) WO2024170889A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0909312B1 (en) * 1996-07-05 2002-12-04 Novo Nordisk A/S Method for the production of polypeptides
US6566122B1 (en) * 2000-06-16 2003-05-20 Academia Sinica Super-secreting saccharomyces cerevisiae strains
WO2004009819A2 (en) 2002-07-23 2004-01-29 Delta Biotechnology Limited Polypeptides with a signal sequence comprising an fivsi motif and polynucleotides encoding therefor
WO2005061719A1 (en) 2003-12-23 2005-07-07 Delta Biotechnology Limited 2-micron family plasmid and use thereof
US20160017343A1 (en) * 2013-03-06 2016-01-21 Glaxosmithkline Llc Host cells and methods of use
US20180022785A1 (en) * 2015-12-22 2018-01-25 Albumedix A/S Protein expression strains
WO2018234349A1 (en) 2017-06-20 2018-12-27 Albumedix Ltd Improved protein expression strains

Non-Patent Citations (21)

* Cited by examiner, † Cited by third party
Title
"GenBank", Database accession no. GCF_000146045.2
"Methods in Enzymology", vol. 463, 1 January 2009, ELSEVIER, ACADEMIC PRESS, NL, ISBN: 978-0-12-805382-9, article JAMES M. CREGG ET AL: "Chapter 13 Expression in the Yeast Pichia pastoris", pages: 169 - 189, XP055312768, DOI: 10.1016/S0076-6879(09)63013-5 *
"UniProt", Database accession no. P03101 VL1_HPV16
ALVAREZ P ET AL: "A new system for the release of heterologous proteins from yeast based on mutant strains deficient in cell integrity", JOURNAL OF BIOTECHNOLOGY, ELSEVIER, AMSTERDAM NL, vol. 38, no. 1, 30 November 1994 (1994-11-30), pages 81 - 88, XP023705270, ISSN: 0168-1656, [retrieved on 19941130], DOI: 10.1016/0168-1656(94)90149-X *
AMMERER G ET AL: "PEP4 gene of Saccharomyces cerevisiae encodes proteinase A, a vacuolar enzyme required for processing of vacuolar precursors", MOLECULAR AND CELLULAR BIOLOGY, AMERICAN SOCIETY FOR PHARMACOLOGY AND EXPERIMENTAL THERAPEUTICS, US, vol. 6, no. 7, 1 January 1986 (1986-01-01), pages 2490 - 2499, XP003010544, ISSN: 0270-7306 *
ANDERSEN, J. T. ET AL.: "Structure-based mutagenesis reveals the albumin-binding site of the neonatal Fc receptor", NAT. COMMUN., vol. 3, 2012, pages 610
CHU, D. ET AL.: "Translation elongation can control translation initiation on eukaryotic mRNAs", EMBO J., vol. 33, 2014, pages 21 - 34
CUBILLOS, F. A. ET AL.: "High-resolution mapping of complex traits with a four-parent advanced intercross yeast population", GENETICS, vol. 195, 2013, pages 1141 - 1155
CUBILLOS, F. A.; LOUIS, E. J.; LITI, G.: "Generation of a large set of genetically tractable haploid and diploid Saccharomyces strains", FEMS YEAST RES, vol. 9, 2009, pages 1217 - 1225, XP055131695, DOI: 10.1111/j.1567-1364.2009.00583.x
EVANS, L. ET AL.: "The production, characterisation and enhanced pharmacokinetics of scFv-albumin fusions expressed in Saccharomyces cerevisiae", PROTEIN EXPR. PURIF., vol. 73, 2010, pages 113 - 124, XP027144407, DOI: 10.1016/j.pep.2010.05.009
GOLDSTEIN, A. L.; MCCUSKER, J. H.: "Three new dominant drug resistance cassettes for gene disruption in Saccharomyces cerevisiae", YEAST, vol. 15, 1999, pages 1541 - 1553
HEIDLER, S. A.; RADDING, J. A.: "The AUR1 gene in Saccharomyces cerevisiae encodes dominant resistance to the antifungal agent aureobasidin A (LY295337)", ANTIMICROB. AGENTS CHEMOTHER, vol. 39, 1995, pages 2765 - 2769
LITI, G. ET AL.: "Population genomics of domestic and wild yeasts", NATURE, vol. 458, 2009, pages 337 - 341
LOUVEL, H.; GILLET-MARKOWSKA, A.; LITI, G.; FISCHER, G.: "A set of genetically diverged Saccharomyces cerevisiae strains with markerless deletions of multiple auxotrophic genes", YEAST, vol. 31, 2014, pages 91 - 101
MILES, C.; WAYNE, M.: "Quantitative Trait Locus (QTL) Analysis", NAT. EDUC, vol. 1, 2008, pages 208
PIPER, P. W.; CURRAN, B. P.: "When a glycolytic gene on a yeast 2 mu ORI-STB plasmid is made essential for growth its expression level is a major determinant of plasmid copy number", CURR. GENET, vol. 17, 1990, pages 119 - 123
ROSE, A. B.; BROACH, J. R.: "Propagation and expression of cloned genes in yeast: 2-microns circle-based vectors", METHODS ENZYMOL., vol. 185, 1990, pages 234 - 279
SOLIS-ESCALANTE, D ET AL.: "amdSYM, a new dominant recyclable marker cassette for Saccharomyces cerevisiae", FEMS YEAST RES, vol. 13, 2013, pages 126 - 139, XP055806708, DOI: 10.1111/1567-1364.12024
STROPE, P. K. ET AL.: "2µ plasmid in Saccharomyces species and in Saccharomyces cerevisiae", FEMS YEAST RES, vol. 15, 2015
THORN, K: "Genetically encoded fluorescent tags", MOL. BIOL. CELL, vol. 28, 2017, pages 848 - 857, XP093048748, DOI: 10.1091/mbc.e16-07-0504
WILLIAMS T. C. ET AL., MICROB CELL FACT, vol. 14, 2015, pages 43

Also Published As

Publication number Publication date
GB202301988D0 (en) 2023-03-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24707893

Country of ref document: EP

Kind code of ref document: A1