EP3251039A1

EP3251039A1 - Computer-implemented method for creating a fermentation model

Info

Publication number: EP3251039A1
Application number: EP16701791.2A
Authority: EP
Inventors: Tobias NEYMANN; Lukas Hebing; Sebastian Engell
Original assignee: Bayer AG
Current assignee: Bayer AG
Priority date: 2015-01-29
Filing date: 2016-01-28
Publication date: 2017-12-06
Also published as: SG10202006972VA; KR20170109629A; US10872680B2; AU2016212059B2; SG11201706166PA; US10296708B2; CN107408161A; EA201791659A1; IL253584A0; TWI690813B; CN107408161B; CA2975012A1; WO2016120361A1; HK1247340A1; BR112017016198A2; TW201643744A; CA2975012C; AU2016212059A1; EA035276B1; EP3051449A1

Abstract

The invention relates to a computer-implemented method for creating a model of a bioreaction, in particular a fermentation or whole-cell catalysis, using an organism, on the basis of measurement data.

Description

Computer-implemented method for creating a fermentation model

The invention relates to a computer-implemented method for creating a model of a bioreaction, in particular fermentation or whole-cell catalysis, using an organism.

Organisms in the sense of the application are cultures of plant or animal cells, such as mammalian cells, yeasts, bacteria, algae, etc., which are used in bioreactions.

Sensor-based monitoring of a fermentation process and analysis of samples from the process, e.g. using the BaychroMAT® quality-by-design analytics automation platform from Bayer Technology Services GmbH, provide various information about the state of the process in the bioreactor in real time. Typically, cell counts, cell vitality, concentrations of substrates such as carbon sources (e.g. glucose), amino acids or O2, products and by-products (e.g. lactate or CO2), process parameters such as temperature and/or pH, or product characteristics are determined. These data can be supplemented by calculated data and/or extrapolations, e.g. from the prior art. Together, these data form the measurement data or the process knowledge in the sense of the application.

In the context of the application, background knowledge of the organism means knowledge about the biochemical reactions of the organism (specific and unspecific reactions), in particular the cell-internal reactions, or macro reactions describing the organism-specific stoichiometric networks (also called SN or metabolic networks), which consist of substrates, metabolites (also called nodes of the metabolic network) and products as well as the biochemical reactions between them. These biochemical reactions are defined by their:

(a) stoichiometry,

(b) reversibility (under biological conditions),

(c) integration into a stoichiometric network.

So far, the measurement data have mainly been used for qualitative monitoring of the process. The following section presents a selection of technical problems whose solution requires dynamic process models. One technical use of process knowledge in the sense of the application is model-based state estimation of a process in a bioreactor. Methods such as the extended Kalman filter allow continuous estimation of process variables for which only discontinuous measurements are available [Welch G, Bishop G. 1995. Chapel Hill, NC, USA: University of North Carolina at Chapel Hill]. The course of non-measurable quantities can also be calculated from other measurements, provided that the underlying process is correctly described by a process model.

Another application is model-based, optimal process control. Here, a dynamic process model is used in a model-based, predictive closed-loop control system to optimize process control with respect to product quantity, product characteristics, formation of by-products or other target variables. This is described, for example, by Frahm et al. for a hybridoma cell culture [Frahm B, Lane P, Atzert H, Munack A, Hoffmann M, Hass VC, Pörtner R. 2002. Adaptive, Model-Based Control by the Open-Loop-Feedback-Optimal (OLFO) Controller for the Effective Fed-Batch Cultivation of Hybridoma Cells. Biotechnol. Prog. 18 (5): 1095-1103].

For both of these technical applications, it is important that the process model has the lowest possible complexity, i.e. a limited number of state variables and/or equations, while still reproducing the process with good accuracy.

In addition to the applications for process control mentioned above, dynamic process models can also be used during process development to plan experiments with optimal information gain. This approach is called model-based experimental design [Franceschini G, Macchietto S. 2008. Model-based design of experiments for parameter precision: State of the art. Chemical Engineering Science 63 (19): 4846-4872]. In addition to the model complexity requirements described above, this technical application requires that a dynamic process model already exists during the development phase. It should be possible to generate this model as quickly as possible from existing process knowledge in order to minimize the time required for process development.

There was therefore a need to provide a method that allows the creation of a dynamic process model using background knowledge and process knowledge. To be able to use this model, for example for state estimation, optimal process control or model-based experimental design, the complexity of the model has to be low. Dependencies, i.e. influences of the process variables or the process state on the process behavior, should be quantified with sufficient accuracy within the Design Space. The Design Space is the area in which process knowledge exists. All available information about the process state should be used for this purpose. The model-based description of product features should be integrated into the model as needed. The method should be applicable to the bioreactions listed above and should significantly reduce the development time of such dynamic models. With previous approaches, developing a dynamic process model takes months to years. Experience shows that the present approach reduces the development time to a few weeks.

Typical product features within the meaning of the application are, for example, glycosylation patterns of proteins or protein integrity, without being limited thereto. Dynamic models used in the above context do not yet have this property. The present approach enables a simple model-based integration of product features.

The model-based process control of fermentations is described by Frahm et al. using the example of a hybridoma cell culture (Frahm B, Lane P, Atzert H, Munack A, Hoffmann M, Hass VC, Pörtner R. 2002. Adaptive, Model-Based Control by the Open-Loop-Feedback-Optimal (OLFO) Controller for the Effective Fed-Batch Cultivation of Hybridoma Cells. Biotechnol. Prog. 18 (5): 1095-1103). Basic process variables are controlled in a model-based manner here. Product features are not integrated. The mathematical model of the cell was designed for this specific process and can be transferred to processes with other organisms, or with the same or other strains of the same organism, only with great effort. Background knowledge in the form of cell-internal reactions is not explicitly considered in the model. Integrating further measured variables into the model, and thus making complete use of the information about the process state, is possible only with great effort. The approach thus represents an individual solution that is neither transferable to other processes nor allows full use of the data obtained. Because of the expected development time of the model and the effort required to transfer the solution to other processes with the same or with other organisms, this method does not solve the above-mentioned technical problem.

Further modeling, which also includes product features such as glycosylation, can be found in the publications by Kontoravdi et al. The model that describes the main metabolism does not include any background knowledge in the form of cellular reactions and cannot be transferred to other processes with the same or different organisms. Further parameters cannot be integrated into the model [Kontoravdi C, Asprey SP, Pistikopoulos EN, Mantalaris A. 2007. Development of a dynamic model of monoclonal antibody production and glycosylation for product quality monitoring. Computers & Chemical Engineering 31 (5-6): 392-400]. This method also does not allow full use of process state information, requires a long development time for the model, and is not transferable to other organisms or strains. It therefore does not solve the technical problem.

The models of glycosylation involving nucleotide sugar metabolism by Jedrzejewski et al. and Jimenez del Val et al. include background knowledge in the form of balance equations of internal metabolic intermediates [Jedrzejewski PM, del Val IJ, Constantinou A, Dell A, Haslam SM, Polizzi KM, Kontoravdi C. 2014. Towards Controlling the Glycoform: A Model Framework Linking Extracellular Metabolites to Antibody Glycosylation. International Journal of Molecular Sciences 15 (3): 4492-4522; Jimenez del Val I, Nagy JM, Kontoravdi C. 2011. A dynamic mathematical model for monoclonal antibody N-linked glycosylation and nucleotide sugar donor transport within a maturing Golgi apparatus. Biotechnology Progress 27 (6): 1730-1743]. When this model is used for process control, however, the complexity of the entire model and the lack of observability of cell-internal metabolic intermediates are disadvantageous. In addition, the main metabolism model does not allow transfer to other processes or full use of process state information. This method does not solve the technical problem.

Flexible model generation for bioprocesses is described by Leifheit et al. [Leifheit J, Heine T, Kawohl M, King R. 2007. Computer-aided semi-automatic modeling of biotechnological processes (Semi-Automatic Modeling of Biotechnical Processes). at - Automatisierungstechnik 55 (5)]. The model is generated with the help of process knowledge, but without background knowledge. The procedure can be used for different processes with the same or different organisms. It is based on macro reactions that the user specifies. Their exact stoichiometries are determined in the course of the method. The method is described for a small number of state or measured variables. Integrating further state or measured variables would significantly increase the complexity of the method. With a comprehensive data basis, such as that provided by the BaychroMAT® platform, the method would become unworkable. The method does not allow the integration of product features. It therefore does not solve the above-mentioned technical problem.

The use of background knowledge in the form of macro reactions obtained as elementary modes (EM) from the known metabolic (stoichiometric) networks of an organism is described by Provost [Provost A. 2006. Metabolic design of dynamic bioreaction models. Faculté des Sciences Appliquées, Université Catholique de Louvain, Louvain-la-Neuve, p. 81 ff., p. 107 ff., p. 118 ff.]. This method can be used for various organisms or strains of the same organism. The macro reactions for the process model are selected using process knowledge. However, process sections are defined, and for each section a predefined number of macro reactions is selected separately at random. The method therefore provides only one of many possible combinations of elementary modes. The number of macro reactions, and thus the model complexity, is fixed and cannot be changed. The method gives a separate model for each process section.

The kinetics of the individual macro reactions are selected taking into account the stoichiometry of the selected macro reactions. However, the parameters of the kinetics (model parameters) are not adapted to the process data. Instead, the changes in the process data are mapped by using the separate process section models. Although the random selection of reactions can also be based on a comprehensive data basis, the described approach to selecting the kinetics, and the kinetics selected in this way, cannot reflect the course of the process or the behavior of the organism in the process. The use of multiple process section models also unnecessarily increases the complexity of the process model. The dependencies, i.e. influences of process variables or the process state on the process behavior, are not quantified with this method. Product features are not integrated either. This method therefore does not solve the above technical problem.

There was therefore a need for a method that quickly and efficiently provides a model based on process knowledge and measurement data, that enables the optimization of the product turnover and of the critical product features while taking background knowledge into account, and that does not have the above-mentioned disadvantages.

The object has been achieved by a method for creating a model of a bioreaction with an organism in a bioreactor as described below. The subject of the application is a computer-implemented method for creating a model of a bioreaction, in particular fermentation or whole-cell catalysis, with an organism, comprising the following steps:

a. Selected metabolic pathways of the organism, together with their stoichiometry and reversibility properties, are entered into the method as background knowledge. In other words, one or more metabolic networks of the organism are entered into the method. Elementary modes (EMs) are calculated from this input.

b. The EMs are summarized in a matrix K, where the EMs combine the metabolic pathways from a) into macro reactions. This matrix K contains the stoichiometry and the reversibility properties of all possible macro reactions from the background knowledge.

c. The measurement data (also called process knowledge) for bioreaction with the organism are entered.

d. Using an interpolation method, the rates specific to the organism (excretion and uptake rates of one or more input variables and output variables of the entered metabolic pathways) are calculated on the basis of the measurement data entered in c). Preferably, growth rates of the organism are also calculated, particularly preferably death rates as well.

e. Relevant macro reactions are selected in the form of a subset of the elementary modes from b):

i. Data-independent and/or data-dependent prereduction of the number of EMs from b).

ii. Selection of the subset from the prereduction from e) i. using the measurement data from c) and/or one or more rates from d), preferably using the measurement data from c), by means of an algorithm according to a mathematical quality criterion, and summary of the subset in a matrix L.

iii. Optionally, the subsets are displayed graphically.

f. Using an interpolation method, the reaction rates r(t) of the macro reactions of the subset are calculated on the basis of the measurement data entered in c) and/or the rates from d).

g. Kinetics of the macro reactions of the subset from e) ii. are designed with the following intermediate steps; this defines the model parameters.

i. Generic kinetics are designed from the stoichiometry of the macro reactions.

ii. Variables influencing the macro reactions are determined from the reaction rates from f).

iii. The generic kinetics from g) i. are extended by terms that quantify the influencing variables determined in g) ii.

h. Optionally, for the kinetics from g), a first adaptation of the model parameter values to the calculated reaction rates from f) is carried out separately for each macro reaction, and the quality of the adaptation is checked.

i. Optionally, steps g) and h) are repeated until a predefined quality of adaptation is achieved.

j. The model parameter values are adapted to the measurement data from c).

k. The matrix L, the kinetics from g) and the model parameter values from j) form the model and are output and / or transferred to a process control or process development module.

Typically, the process control module communicates on-line with a process control system commonly used to control the bioreactor.

Typically, process development modules are used to off-line optimize the process or design experiments.

The bioreaction modeling according to the invention is essentially based on the assumption of representative macro reactions which simplify internal metabolic processes. The selection of reactions requires both biochemical background knowledge and process knowledge.

In the first step of the method, the reactions of the metabolic network, their stoichiometry and their reversibility properties are entered by the user via a user interface or, ideally, automatically by selecting an organism and its stored metabolic pathways from a database module in which the background information on the organism is stored. The metabolic network (also called stoichiometric network in the prior art) and the properties of its individual reactions represent the background knowledge of the organism. The metabolic network preferably contains reactions from metabolic pathways that are important for the organism, for example reactions of glycolysis. Particularly preferably, the selection contains external reactions. An external reaction in the sense of the application contains at least one component outside the cell, typically at least one input variable and/or at least one output variable (product, by-product, etc.). More preferably, the metabolic network contains reactions that describe cell growth, e.g. in the form of a simplified reaction of internal metabolites to external biomass. Figure 5 and Table 1 in the example describe an applicable metabolic network, without limitation thereto.

Then, the elementary modes of the organism are calculated from the entered metabolic pathways, which are combined in one or more stoichiometric networks. Each elementary mode is a linear combination of reactions from the metabolic pathways, i.e. of internal and external reactions of the metabolic network, that satisfies both the steady-state condition for internal metabolites and the reversibility or irreversibility of the reactions. In linear combinations of reactions that satisfy the steady-state condition for internal metabolites, no internal metabolites can accumulate.
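The two conditions named above can be checked numerically for a candidate flux vector. The following is a minimal sketch, assuming a hypothetical toy network with one internal metabolite B and three reactions (R1: A_ext → B, R2: B → C_ext, R3: B ⇌ D_ext); the function name, the network and the tolerance are illustrative and not taken from the application. The sketch only tests the steady-state and reversibility conditions; it does not test whether a vector is elementary (non-decomposable), which is what dedicated tools such as METATOOL determine.

```python
import numpy as np

def satisfies_mode_conditions(n_int, reversible, flux, tol=1e-9):
    """Check the steady-state and reversibility conditions for one flux vector.

    n_int      : internal stoichiometric matrix (internal metabolites x reactions)
    reversible : one boolean per reaction, True if the reaction is reversible
    flux       : candidate vector of reaction rates (one entry per reaction)
    """
    n_int = np.asarray(n_int, dtype=float)
    flux = np.asarray(flux, dtype=float)
    reversible = np.asarray(reversible, dtype=bool)
    # Steady state: no internal metabolite accumulates, N_int @ flux = 0.
    steady = np.all(np.abs(n_int @ flux) <= tol)
    # Irreversible reactions must not run backwards.
    direction_ok = np.all(flux[~reversible] >= -tol)
    return bool(steady and direction_ok)

# Hypothetical toy network: B is produced by R1 and consumed by R2 and R3.
n_int = [[1, -1, -1]]
reversible = [False, False, True]

print(satisfies_mode_conditions(n_int, reversible, [1, 1, 0]))   # A_ext -> C_ext
print(satisfies_mode_conditions(n_int, reversible, [0, 1, -1]))  # D_ext -> C_ext
print(satisfies_mode_conditions(n_int, reversible, [1, -1, 2]))  # R2 backwards
```

The third vector balances B but runs the irreversible reaction R2 backwards, so it is rejected.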

An internal reaction within the meaning of the application takes place exclusively within the cell.

By externalizing an internal component, i.e. by classifying an actually internal component as an input or output variable, it is possible to model the internal reaction associated with the externalized component as an external reaction and thus to bypass the steady-state condition for internal metabolites in this case. A macro reaction in the sense of the application summarizes all reactions that lead from one or more input variables to one or more output variables. Each elementary mode thus describes a macro reaction. In contrast to the method of Leifheit et al., the macro reactions are determined on the basis of the entered background knowledge.

The elementary modes (EMs) are combined in a matrix E, preferably in a module for matrix construction, which is configured with a corresponding algorithm. Known algorithms can be used to calculate the elementary modes matrix. METATOOL is mentioned as an example, without limitation thereto [Pfeiffer T, Montero F, Schuster S. 1999. METATOOL: for studying metabolic networks. Bioinformatics 15 (3): 251-257].

METATOOL generates a first matrix E describing the entered internal and external reactions.

In step b), a matrix K of possible macro reactions is generated from the matrix E with the aid of the (external) stoichiometric matrix N_p:

$$K = N_p \cdot E \qquad \text{(formula 1)}$$

The transformation of matrix E into K is known from Provost [Provost A. 2006. Metabolic design of dynamic bioreaction models. Faculté des Sciences Appliquées, Université Catholique de Louvain, Louvain-la-Neuve, p. 81].

The column vectors of the matrix K describe the macro reactions. The row vectors describe the components of the macro reactions (input and output variables). In the matrix K, the stoichiometry of the macro reactions is entered.
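As an illustration of the product of the external stoichiometric matrix and the elementary modes matrix, the following sketch builds K for a hypothetical toy network (R1: A_ext → B, R2: B → C_ext, R3: B ⇌ D_ext, with B as the only internal metabolite). The network, the three modes and the function name are assumptions chosen for the example and are not taken from the application.

```python
import numpy as np

def macro_reaction_matrix(n_p, e):
    """Return K = N_p @ E (external components x macro reactions)."""
    return np.asarray(n_p) @ np.asarray(e)

# Rows: external components A_ext, C_ext, D_ext; columns: reactions R1, R2, R3.
n_p = [[-1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]]

# Rows: reactions R1, R2, R3; columns: three modes of the toy network.
e = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, -1]]

k = macro_reaction_matrix(n_p, e)
print(k)
# Column 0 reads A_ext -> C_ext, column 1 reads A_ext -> D_ext,
# column 2 reads D_ext -> C_ext; the internal metabolite B no longer appears.
```

Each column of the result is one macro reaction and each row one external component, matching the description of the matrix K above.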

Any reaction rate possible in terms of the metabolic network can be represented as a positive linear combination of these macro reactions.

The use of EMs as the basis of a process model is known in the art (e.g. from Provost A. 2006. Metabolic design of dynamic bioreaction models. Faculté des Sciences Appliquées, Université Catholique de Louvain, Louvain-la-Neuve, p. 87, p. 118 ff., and Gao J, et al. 2007. Dynamic metabolic modeling for a MAB bioprocess. Biotechnology Progress 23 (1): 168-181).

In a further step c), the available measurement data (process knowledge) for the bioreaction with the organism are entered. Typically, cell count, cell vitality, concentrations of substrates such as carbon sources (e.g. glucose), amino acids or O2, products and by-products (e.g. lactate or CO2), process parameters such as temperature and/or pH, or product characteristics are determined. This input can be made manually by the user or automatically, for example by selection from a database module for storing measurement data and transfer of the selected data to a data analysis module that is connected to the database module.

From these measurement data, the cell-specific excretion and uptake rates of substrates and (by-)products, together called specific rates q(t), and optionally the growth and death rates of the organism (μ(t), μ_d(t)) are calculated in step d). A prerequisite for the calculation is the interpolation of the viable cell count, the total cell count and the media concentrations with the help of an interpolation method. From these interpolations, the temporal changes of the measured variables can be determined. The calculated rates q(t), μ(t) and μ_d(t) give information about the observed dynamic behavior of the organism over time.

One or more different interpolation methods can be used in combination to calculate the above rates. As an example, Leifheit et al. describe the determination of the temporal changes of measured variables, e.g. the total cell count, the viable cell count or other media concentrations, from spline-interpolated measurement data [Leifheit J, Heine T, Kawohl M, King R. 2007. Computer-aided semi-automatic modeling of biotechnological processes (Semi-Automatic Modeling of Biotechnical Processes). at - Automatisierungstechnik 55 (5): 211-218]. This method is hereby incorporated into the application by reference.

The above rates q(t), μ(t) and μ_d(t) are calculated from these temporal changes.

For example, the growth rate of the organism μ(t) can be calculated from spline-interpolated values of the total cell count X_t(t) and the viable cell count X_v(t), together with the temporal change of the total cell count dX_t/dt(t) calculated from them, using formula 2:

$$\frac{dX_t}{dt}(t) = -D(t) \cdot X_t(t) + \mu(t) \cdot X_v(t) \qquad \text{(formula 2)}$$

where D(t) is the dilution rate.

With a known course of μ(t), the death rate μ_d(t) can be calculated from the course of X_v(t) and the course of the temporal change of the viable cell count dX_v/dt(t) using formula 3:

$$\frac{dX_v}{dt}(t) = -D(t) \cdot X_v(t) + \bigl(\mu(t) - \mu_d(t)\bigr) \cdot X_v(t) \qquad \text{(formula 3)}$$

The specific rate q_i(t) of another component i can be determined from spline-interpolated values of the viable cell count X_v(t) and of the concentration of the component C_i(t), together with the temporal change dC_i/dt(t) obtained from the spline-interpolated values of C_i(t), using formula 4:

$$\frac{dC_i}{dt}(t) = D(t) \cdot \bigl(C_{i,in}(t) - C_i(t)\bigr) + q_i(t) \cdot X_v(t) \qquad \text{(formula 4)}$$
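The balance equations for the total cell count, the viable cell count and a component concentration can each be solved for the unknown rate once the time derivatives are available from a spline. The following is a minimal sketch, assuming SciPy's `CubicSpline` as the interpolation method (the application does not prescribe a particular spline) and noise-free data; the function and argument names are hypothetical. Real measurement data would normally call for a smoothing spline rather than an interpolating one.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def specific_rates(t, x_total, x_viable, conc, dilution, conc_feed):
    """Return mu(t), mu_d(t) and q_i(t) at the sample times t.

    All arguments except t may be arrays of the same length as t or scalars.
    """
    t = np.asarray(t, dtype=float)
    x_total = np.asarray(x_total, dtype=float)
    x_viable = np.asarray(x_viable, dtype=float)
    conc = np.asarray(conc, dtype=float)
    # First derivatives of the interpolating splines at the sample times.
    dxt = CubicSpline(t, x_total)(t, 1)
    dxv = CubicSpline(t, x_viable)(t, 1)
    dc = CubicSpline(t, conc)(t, 1)
    # Total cell balance solved for the growth rate.
    mu = (dxt + dilution * x_total) / x_viable
    # Viable cell balance solved for the death rate.
    mu_d = mu - (dxv + dilution * x_viable) / x_viable
    # Component balance solved for the specific rate.
    q = (dc - dilution * (conc_feed - conc)) / x_viable
    return mu, mu_d, q

# Synthetic batch run (D = 0) generated with mu = 0.03, mu_d = 0.01, q = -0.5.
t = np.linspace(0.0, 50.0, 11)
x_v = np.exp(0.02 * t)
x_t = 1.0 + 1.5 * (np.exp(0.02 * t) - 1.0)
c = 100.0 - 25.0 * (np.exp(0.02 * t) - 1.0)
mu, mu_d, q = specific_rates(t, x_t, x_v, c, dilution=0.0, conc_feed=0.0)
print(mu.round(3), mu_d.round(3), q.round(2))
```

The synthetic data are constructed so that the three rates are constant, which makes it easy to see whether the spline derivatives are accurate enough for the chosen sampling interval.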

In a preferred embodiment of the method, the measurement data from step c) are prepared before the first interpolation as follows: in order to take into account all concentration changes not caused by the cells and to obtain a continuous course of the concentration changes from the measurement data, the measurement data are shifted (called shifting in the application). The amount ΔC_i(t) by which the concentration measurement is shifted can be calculated according to formula 5:

$$\Delta C_i(t) = \int_0^t D(\tau) \cdot \bigl(C_{i,in}(\tau) - C_i(\tau)\bigr)\, d\tau \qquad \text{(formula 5)}$$

where D(τ) is the dilution rate and C_{i,in}(τ) is the concentration of component i in the feed. The shifted concentration curve C_{i,s}(t) then results according to formula 6:

$$C_{i,s}(t) = C_i(t) - \Delta C_i(t) \qquad \text{(formula 6)}$$

The differential equation that describes the course of the shifted concentration C_{i,s}(t) thus results from formulas 4 and 6:

$$\frac{dC_{i,s}}{dt}(t) = \frac{d\bigl(C_i(t) - \Delta C_i(t)\bigr)}{dt} = q_i(t) \cdot X_v(t) \qquad \text{(formula 7)}$$

This treatment (shifting) of the measured data prevents a sudden change in the calculated specific rates when a feed is switched on or off (feed peak), in particular in a fed-batch process.
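The shifting of a concentration curve can be sketched as a cumulative integral over the sample times. The example below is a minimal sketch under stated assumptions: ΔC_i(t) is taken as the accumulated concentration change caused by dilution and feeding, so that the shifted curve C_i(t) − ΔC_i(t) changes only through the cells, and the integral is approximated with the trapezoidal rule. The function name and the numbers are hypothetical.

```python
import numpy as np

def shift_concentration(t, conc, dilution, conc_feed):
    """Return (delta_c, conc_shifted) at the sample times t.

    delta_c is the trapezoidal approximation of the integral of
    D * (C_in - C) from t[0] to each sample time.
    """
    t = np.asarray(t, dtype=float)
    conc = np.asarray(conc, dtype=float)
    dilution = np.asarray(dilution, dtype=float)
    conc_feed = np.asarray(conc_feed, dtype=float)
    integrand = dilution * (conc_feed - conc)
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t)
    delta_c = np.concatenate(([0.0], np.cumsum(steps)))
    return delta_c, conc - delta_c

# Hypothetical fed-batch data: the feed is switched on at t = 2 and the
# measured concentration stays constant because uptake balances the feed.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
conc = [2.0, 2.0, 2.0, 2.0, 2.0]
dilution = [0.0, 0.0, 0.1, 0.1, 0.1]
delta_c, conc_shifted = shift_concentration(t, conc, dilution, conc_feed=10.0)
print(delta_c)       # accumulated contribution of the feed
print(conc_shifted)  # falls once the feed starts, revealing the uptake
```

Although the measured concentration is flat, the shifted curve decreases after the feed starts, which is the cell-caused change that the subsequent rate calculation needs.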

Figure 1 shows the processing / shifting of the measured data in the sense of this application.

In a particular embodiment of the method, the processed data are then used to calculate the temporal change of the total cell count, dX_t/dt|_s(t), by the method of Leifheit et al. This is approximated with a spline interpolation according to differential equation 8:

$$\left.\frac{dX_t}{dt}\right|_s(t) = \mu(t) \cdot X_v(t) \qquad \text{(formula 8)}$$

Particularly preferably, lysis is included in the calculation by means of a lysis factor K_t (formula 9). This factor can, for example, be assumed to be constant over the course of the process.

$$\left.\frac{dX_t}{dt}\right|_s(t) = \mu(t) \cdot X_v(t) - K_t \cdot \bigl(X_t(t) - X_v(t)\bigr) \qquad \text{(formula 9)}$$

A decrease in the shifted total cell count X_{t,s}(t) can thus be explained by lysis, so that negative values for the growth rate μ(t) are avoided.
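The effect of the lysis term can be seen by solving the balance of the shifted total cell count for μ(t) at a single time point. The following is a minimal sketch; the function name, the argument names and the numerical values are hypothetical and serve only to illustrate the sign change.

```python
def growth_rate_with_lysis(dxt_dt_shifted, x_total, x_viable, k_lysis):
    """Solve dX_t/dt|_s = mu * X_v - K * (X_t - X_v) for mu at one time point."""
    return (dxt_dt_shifted + k_lysis * (x_total - x_viable)) / x_viable

# Hypothetical late-phase values: the total cell count is already falling.
dxt_dt_shifted = -0.5   # change of the shifted total cell count
x_total = 10.0          # total cell count
x_viable = 8.0          # viable cell count

mu_without_lysis = growth_rate_with_lysis(dxt_dt_shifted, x_total, x_viable, 0.0)
mu_with_lysis = growth_rate_with_lysis(dxt_dt_shifted, x_total, x_viable, 0.5)
print(mu_without_lysis)  # -0.0625
print(mu_with_lysis)     # 0.0625
```

Without the lysis term, the falling total cell count forces a negative growth rate; with an assumed lysis factor of 0.5 the same data give a positive one.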

Preferably, the processed data are also used for the calculation of the death rate μ_d(t).

The possible combinations of specific rates q(t) are preferably displayed in a flux map diagram, as exemplified in Figure 2. This representation provides a good overview of the calculated specific rates q(t). The contour lines indicate which areas are physiologically important.

If the specific rates q(t) and optionally the other rates (μ(t) and μ_d(t)) have different orders of magnitude and units, they are usually combined, by means of simplifications, into a specific rate vector q(t) with uniform units. For example, if the specific rate of a macromolecule is measured in grams [g], it has the unit [g/(cell · h)]. When the composition of this macromolecule is estimated, e.g. on the basis of its C-mol content, its specific rate can be converted from [g] to [C-mol], so that the specific rate has the unit [C-mol/(cell · h)].
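The unit conversion amounts to dividing the mass-based rate by the mass of the macromolecule per C-mol. The following is a minimal sketch; the function name and the value of 25 g per C-mol are assumptions chosen for illustration, since the application does not give a composition.

```python
def rate_g_to_cmol(rate_g_per_cell_h, grams_per_cmol):
    """Convert a specific rate from g/(cell*h) to C-mol/(cell*h)."""
    if grams_per_cmol <= 0:
        raise ValueError("grams_per_cmol must be positive")
    return rate_g_per_cell_h / grams_per_cmol

# Hypothetical macromolecule with an estimated mass of 25 g per C-mol.
rate_g = 5.0e-12  # g/(cell*h)
rate_cmol = rate_g_to_cmol(rate_g, grams_per_cmol=25.0)
print(rate_cmol)  # C-mol/(cell*h)
```

After the conversion, the rate of the macromolecule can be placed in the same vector as the rates of components that are already given on a molar basis.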

The specific rates q (t) form one of the bases for the further step e) of the method, namely the selection of the relevant macro reactions.

In step e), a subset (L) of the EMs is selected on the basis of the data, such that the specific rates q(t) from d) and/or the measurement data from c) are mapped well according to a mathematical quality criterion. The number of EMs in the subset (L) should be as small as possible in order to keep the complexity of the process model as low as possible. The subset L should nevertheless ensure a good description of the process knowledge.

The selection of EMs reduces the solution space compared to the original set of EMs (K), but the reduced space still contains the determined physiologically important area of the cells.

Figure 3 shows a representation of the solution space, where the original set of EMs (K) is reduced to a subset (L).

For step e), the calculated specific rates q (t) and the measurement data from c) are usually transferred to a module for selecting the relevant macro reactions, which is configured with corresponding algorithms.

In step e) i), a data-independent and / or a data-dependent prereduction of the matrix K takes place in any order:

The data-independent prereduction is preferably carried out by a geometric reduction. In this case, the cosine similarities to all other modes are calculated for a randomly selected EM, and the most similar EM is removed from the set. This procedure is repeated until a predefined number of EMs has been reached. The desired number is usually defined in advance. The volume of the solution space can be used as a control variable. Surprisingly, it was found that the number of macro reactions can be reduced significantly while maintaining 90 to 98%, preferably 92 to 95%, of the spanned volume compared to the original volume.
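The geometric reduction described above can be sketched in a few lines. The following is a minimal sketch, assuming that the macro reactions are the columns of K and that no column is all zeros; the function name, the fixed random seed and the example matrix are hypothetical. The volume of the solution space, which the text names as a control variable, is not computed here.

```python
import numpy as np

def geometric_prereduction(k, n_keep, seed=0):
    """Return the column indices of k that remain after the reduction.

    In each pass a random remaining column is chosen as reference and the
    remaining column with the highest cosine similarity to it is removed.
    """
    if n_keep < 1:
        raise ValueError("n_keep must be at least 1")
    k = np.asarray(k, dtype=float)
    rng = np.random.default_rng(seed)
    unit = k / np.linalg.norm(k, axis=0)  # normalise columns to unit length
    keep = list(range(k.shape[1]))
    while len(keep) > n_keep:
        ref = keep[rng.integers(len(keep))]
        others = [j for j in keep if j != ref]
        similarities = unit[:, others].T @ unit[:, ref]
        keep.remove(others[int(np.argmax(similarities))])
    return keep

# Hypothetical K with two pairs of nearly parallel macro reactions.
k = np.array([[1.0, 0.99, 0.0, 0.1],
              [0.0, 0.10, 1.0, 0.99]])
kept = geometric_prereduction(k, n_keep=2)
print(kept)
print(k[:, kept])
```

For this example, one column from each nearly parallel pair survives regardless of the random choices, so the two remaining macro reactions still span almost the same cone as the original four.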

The data-dependent prereduction can be carried out by comparing the yield coefficients of the EMs (Y^EM) with the yield coefficients calculated from the specific rates q(t) from d) (Y^m). The yield coefficient of the k-th EM (Y^{EM,k}_{i,j}) is determined according to formula 10 by dividing the corresponding stoichiometric coefficients of the external metabolites i and j. For the k-th EM, these are the matrix entries K_{i,k} and K_{j,k}.

$$Y^{EM,k}_{i,j} = \frac{K_{i,k}}{K_{j,k}} \qquad \text{(formula 10)}$$

If the stoichiometric coefficient K_{j,k} = 0, the yield coefficient cannot be determined. The yield coefficient Y^m_{i,j}(t) gives the ratio of two measured, or calculated in d), cell-specific rates (q_i(t), q_j(t)) to each other according to formula 11:

$$Y^{m}_{i,j}(t) = \frac{q_i(t)}{q_j(t)} \qquad \text{(formula 11)}$$

From the yield coefficients Y^m, an upper and a lower limit can be determined for each possible combination of two external components i and j. For example, the smallest yield coefficient Y^m_{i,j}(t) of two external metabolites i and j can be used as the lower limit and the largest value of Y^m_{i,j}(t) as the upper limit, but other limits are possible. EMs whose yield coefficients Y^EM are above the upper limit or below the lower limit are removed from the matrix K. If the yield coefficient Y^EM of an EM cannot be determined, this EM remains in the matrix K. The method according to the invention "linear estimation of reaction rates of selected macro reactions", used with NNLS and described below, can also be used for the data-dependent prereduction. The advantage of using the process data in the data-dependent prereduction is that the reduction is process-related and thus more effective and focused.
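The yield-based filter can be sketched for a single pair of external components. The following is a minimal sketch, assuming the smallest and largest measured yield as limits (the text allows other limits) and non-zero rates q_j(t); the function name and the glucose/lactate numbers are hypothetical.

```python
import numpy as np

def yield_prereduction(k, q, i, j):
    """Return the indices of the columns of k kept for the component pair (i, j).

    k : matrix of macro reactions (external components x EMs)
    q : specific rates (external components x time points), q[j] non-zero
    """
    k = np.asarray(k, dtype=float)
    q = np.asarray(q, dtype=float)
    y_measured = q[i] / q[j]  # one measured yield per time point
    lower, upper = y_measured.min(), y_measured.max()
    keep = []
    for col in range(k.shape[1]):
        if k[j, col] == 0:
            keep.append(col)  # yield not defined: the EM stays in K
            continue
        y_em = k[i, col] / k[j, col]
        if lower <= y_em <= upper:
            keep.append(col)
    return keep

# Hypothetical example: row 0 is glucose, row 1 is lactate.
k = [[-1.0, -1.0, -1.0, 0.0],
     [2.0, 0.5, 1.0, 1.0]]
q = [[-1.0, -1.0, -1.0],   # glucose uptake at three time points
     [1.2, 0.9, 0.6]]      # lactate excretion at three time points
print(yield_prereduction(k, q, i=1, j=0))  # [2, 3]
```

The measured lactate-per-glucose yields lie between −1.2 and −0.6, so the modes with yields −2 and −0.5 are removed, the mode with yield −1 is kept, and the mode without glucose is kept because its yield is undefined.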

In step e) ii., a subset of macro reactions is selected with an algorithm. The selection requires a quality criterion that quantifies how well the specific rates q(t) from d) and/or the measurement data from c) are mapped with a subset (L), and an algorithm for selecting the subset.

As a quality criterion for the mapping of the calculated specific rates q(t) with a subset L, the sum of the squared residuals of the specific rates (SSR_q) according to formula 12 can be used, following Soons et al. [Soons Z, Ferreira EC, Rocha I. 2010. Selection of Elementary Modes for Bioprocess Control. 11th International Symposium on Computer Applications in Biotechnology, Leuven, Belgium, July 7-9, 2010, 156-161].

$$SSR_q = \sum_{i=1}^{n} \bigl\lVert q(t_i) - L \cdot r(t_i) \bigr\rVert^2 \qquad \text{(formula 12)}$$

where n is the number of time points considered. The value of SSR_q should be as small as possible.

For the minimization of SSR_q, the vector r(t_i) has to be determined beforehand for each time point considered with the aid of a non-negative least-squares algorithm such that the following applies:

$$\min_{r(t_i)} \bigl\lVert q(t_i) - L \cdot r(t_i) \bigr\rVert^2 \qquad \text{(formula 13)}$$

with the additional boundary condition:

$$r(t_i) \geq 0 \qquad \text{(formula 14)}$$
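The non-negative least-squares step and the resulting quality criterion can be sketched with SciPy's `nnls` solver. This is a minimal sketch under stated assumptions: the specific rates are stored column-wise per time point, the subset matrix has one column per selected macro reaction, and the function name and example numbers are hypothetical rather than taken from the application.

```python
import numpy as np
from scipy.optimize import nnls

def fit_rates_nnls(l_matrix, q):
    """Return (r, ssr_q) for a subset matrix and specific rates.

    l_matrix : external components x selected macro reactions
    q        : external components x time points
    r        : selected macro reactions x time points, all entries >= 0
    ssr_q    : sum of squared residuals over all time points
    """
    l_matrix = np.asarray(l_matrix, dtype=float)
    q = np.asarray(q, dtype=float)
    r = np.zeros((l_matrix.shape[1], q.shape[1]))
    ssr_q = 0.0
    for idx in range(q.shape[1]):
        # nnls returns the non-negative solution and the residual 2-norm.
        r[:, idx], residual_norm = nnls(l_matrix, q[:, idx])
        ssr_q += residual_norm ** 2
    return r, ssr_q

# Hypothetical subset with two macro reactions and three external components.
l_matrix = np.array([[-1.0, -1.0],
                     [2.0, 0.0],
                     [0.0, 1.0]])
r_true = np.array([[1.0, 0.5, 0.2],
                   [0.0, 0.3, 0.4]])
q = l_matrix @ r_true  # rates that the subset can reproduce exactly

r_fit, ssr_q = fit_rates_nnls(l_matrix, q)
print(r_fit.round(6))
print(ssr_q)
```

Because each time point is fitted independently, the cost grows only linearly with the number of time points, which is consistent with the remark that the calculation remains feasible for large subsets.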

The advantage of this method is that the calculations according to formulas 12 to 14 can also be carried out for very large subsets with many EMs. A significant disadvantage is that the calculated specific rates q(t) are required for this calculation. Since these are obtained from interpolated measured values, their true values are subject to great uncertainty. Measurement inaccuracies may, under certain circumstances, have a strong effect on the calculated specific rates q(t). The quality criterion SSR_q can therefore only be determined under great uncertainty. In addition to the information about the quality of the mapping, this method also yields an estimated course of the reaction rates r(t) of the subset L as a result of the minimization according to formulas 13 and 14.

Leighty et al. describe another method, in which the measured values (concentration measurements) are approximated directly by a linear estimate of volumetric reaction rates over time. By solving a linear optimization problem with a linear least-squares solver, the course of the reactions can be estimated quickly, assuming that it proceeds linearly between interpolation nodes [Leighty, R. W., & Antoniewicz, M. R. (2011). Dynamic metabolic flux analysis (DMFA): a framework for determining fluxes at metabolic non-steady state. Metabolic Engineering, 13(6), 745-755]. As published, this method applies only to reversible macro reactions (such as the "free fluxes" in the source), and dilution effects (i.e. concentration changes that are not caused by the cells) cannot be taken into account. It is also not applicable if the dimensions of the macro reactions and the measured values do not agree, for example if cell growth in the form of external biomass is part of the macro reactions and measurements are known only for the cell dry matter. The published method is therefore not suitable for irreversible macro reactions and fed-batch processes. Using the concept of Leighty et al., which is hereby incorporated by reference into this application, with the data prepared (shifted) according to the invention, this method can now also be applied to fed-batch processes. In addition, by adding a lower bound for the reaction rates of the macro reactions as a constraint to the linear optimization problem, the method can also be used for irreversible reactions, such as the elementary modes. If the dimensions of the macro reactions and the measured values do not match, the dimension of the measured values can be adapted to the macro reactions via suitable correlations. This combination of the linear estimation according to Leighty et al. with these extensions is hereafter referred to as "linear estimation of reaction rates of selected macro reactions".

It is thus possible to check whether the selected macro reactions of a subset L of the original EM set K can adequately represent the measured data. The final sum of the squared residuals SSR_C according to formula 15, calculated here between the shifted concentrations C_s(t) determined by the method and the measured shifted concentrations, indicates how well the measurement data can be mapped with the subset.

(Formula 15)

The smaller the value of SSR_C, the better the subset L. This method is preferred over the method of Soons et al., especially for the modeling of fed-batch processes, since the quality of a subset can be checked quickly and without the error-prone prior determination of specific rates. Because the estimated reaction rates are assumed to be linear between interpolation nodes, measurement deviations have very little effect on the estimation of the reaction rates. The disadvantage of this method is that the size of the subset L to be examined is limited by the solution of the linear optimization problem. The maximum number of reactions in the subset is equal to the number of measurements available divided by the number of nodes.

In addition to the information about the quality of the mapping, an estimated course of the reaction rates of the subset r (t) is also obtained with this method.

In a preferred embodiment of the "linear estimation of reaction rates of selected macro reactions" according to the invention, the non-negative least-squares solver (NNLS) of Lawson et al. is used instead of a linear least-squares solver to solve the linear optimization problem [Lawson, C. L. and R. J. Hanson, Solving Least Squares Problems, Prentice-Hall, 1974, Chapter 23, p. 161]. This makes it possible to check the quality of larger subsets with the method. The maximum number of macro reactions can then also be significantly greater than the number of existing measurements divided by the number of nodes. This combination of the "linear estimation of reaction rates of selected macro reactions" with the use of the non-negative least-squares solver is hereinafter referred to as "linear estimation of reaction rates of selected macro reactions with NNLS".

The method according to the invention of the "linear estimation of reaction rates of selected macro reactions with NNLS" can additionally be used as a further data-dependent method for pre-reduction of the EMs in step e) i). Here, a very large set K of macro reactions can be used. The method yields on the one hand the value of SSR_C and on the other the course of the reaction rates r(t). EMs with small values of the corresponding rate r(t) are removed from the matrix K. This procedure is repeated until a predefined number of EMs is reached or the value of SSR_C exceeds a specified limit.

Algorithms for selecting the subset are known, e.g., from Provost and from Soons et al. [Provost A. 2006. Metabolic design of dynamic bioreaction models. Faculté des Sciences Appliquées, Université Catholique de Louvain, Louvain-la-Neuve; Soons, Z. I. T. A., Ferreira, E. C., Rocha, I. (2010). Selection of Elementary Modes for Bioprocess Control. 11th International Symposium on Computer Applications in Biotechnology, Leuven, Belgium, July 7-9, 2010, 156-161].

Soons et al. describe the formation of an EM subset in a two-stage optimization process. For various randomly selected sets of EMs, the values for SSR_q are minimized as described above. The set with the smallest minimized SSR_q value is selected. However, with a large number of EMs, the random selection is ineffective, as the number of possible combinations grows very strongly. For example, in selecting 10 reactions from a set of 20,000 EMs, there are more than 2.8 x 10^36 combinations. The probability that the optimal combination is found here is very low. Because it uses the quality criterion SSR_q, this method is susceptible to measurement uncertainties and measurement deviations.
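The number of combinations quoted above can be checked with a short calculation. The sketch below simply evaluates the binomial coefficient for choosing 10 of 20,000 EMs; the variable name is illustrative.

```python
import math

# Number of ways to choose 10 macro reactions out of 20,000 EMs.
n_combinations = math.comb(20000, 10)
print(f"{n_combinations:.3e}")
```

The result is slightly above 2.8 x 10^36, which is why an exhaustive or purely random search over subsets is impractical.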

Provost describes an alternative algorithm in which, for different specific values q(t_i), i = 1, ..., n, all possible positive linear combinations of elementary modes are determined for which SSR_q(t = t_i) = 0. Of these many possible combinations, one combination is then selected randomly. This method uses only one vector q(t_i) at a time and not the entire course over time. A selection of EMs for the whole process is therefore not possible. Although the randomly selected combination can represent the vector q(t_i), it is not determined to what extent the remainder of the process can be represented with it. Another disadvantage of this method is that a vector q(t_i) that is not in the solution space of all EMs cannot be used. An approximate solution cannot be determined. This is a major drawback, especially with uncertain measurements and specific rates. Because it uses the quality criterion SSR_q, this method is also susceptible to measurement uncertainties and measurement deviations.

In a further and preferred embodiment of the method, therefore, an evolutionary algorithm, in particular a genetic algorithm, is used to select the relevant macro reactions, i.e. to select the EM subset L. Such an algorithm is known, e.g., from Baker et al. [Syed Murtuza Baker, Kai Schallau, Björn H. Junker. 2010. Comparison of different algorithms for simultaneous estimation of multiple parameters in kinetic metabolic models. J. Integrative Bioinformatics]. In particular, a genetic algorithm can be used in whose objective function the respective value SSR_C is calculated for various combinations of EMs with the method "linear estimation of reaction rates of selected macro reactions". Alternatively, a random ... After completion of step ii), the matrix L contains the necessary macro reactions (step iii).
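How a genetic algorithm can search for a subset of fixed size may be sketched as follows. This is a deliberately small illustration under stated assumptions and not the algorithm of Baker et al. or of the invention: the function `select_subset_ga`, its parameters, the crossover and mutation rules and the toy objective are all hypothetical. In the method described above, the objective would be the value SSR_C returned by the linear estimation for the candidate subset.

```python
import random


def select_subset_ga(n_total, subset_size, objective, generations=50,
                     population_size=20, mutation_rate=0.2, seed=0):
    """Search for a subset of EM indices with a small objective value."""
    rng = random.Random(seed)

    def random_subset():
        return tuple(sorted(rng.sample(range(n_total), subset_size)))

    def crossover(a, b):
        # The child draws its members from the union of both parents.
        pool = sorted(set(a) | set(b))
        return tuple(sorted(rng.sample(pool, subset_size)))

    def mutate(subset):
        # Occasionally swap one member for an index not yet in the subset.
        members = list(subset)
        outside = [k for k in range(n_total) if k not in subset]
        if outside and rng.random() < mutation_rate:
            members[rng.randrange(subset_size)] = rng.choice(outside)
        return tuple(sorted(members))

    population = [random_subset() for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=objective)
        parents = population[: population_size // 2]  # best half survives
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                    for _ in range(population_size - len(parents))]
        population = parents + children
    return min(population, key=objective)


# Hypothetical toy objective standing in for SSR_C: distance to a known set.
target = {2, 5, 7}


def toy_objective(subset):
    return len(target.symmetric_difference(subset))


best = select_subset_ga(n_total=12, subset_size=3, objective=toy_objective)
```

Keeping the best half of each generation means the best objective value found never gets worse from one generation to the next. Repeating the run for several subset sizes gives the trade-off between model size and goodness of fit.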

In an optional step iii), the validity of the EM subset L is checked graphically. Here, the flux map from step d) can be used as a projection of the EM subset L. Figure 4 shows the Flux Map with the projection of a subset of six EMs. If the EM subset L is valid, the measurement data remains within the EM subset L. This representation allows a quick graphical check of the validity of the selection.

In a further step f), the reaction rates of the macro reactions of the subset L are calculated with the specific rates q(t) from d) and/or the measured data from c). The calculation of r(t) can be based on the specific rates q(t) as described in e) according to Soons et al. [Soons, Z. I. T. A., Ferreira, E. C., Rocha, I. (2010). Selection of Elementary Modes for Bioprocess Control. 11th International Symposium on Computer Applications in Biotechnology, Leuven, Belgium, July 7-9, 2010, 156-161]. Preferably, the calculation of r(t) takes place on the basis of the measurement data from c) with the "linear estimation of reaction rates of selected macro reactions" according to the invention.

In step g) of the method, the kinetics of the macro reactions are designed. The determined kinetics should quantify the dynamic influences of the process state on the respective reaction rates r _k :

$$r_k = f(C, pH, T, \ldots)$$ (formula 16)

The kinetics result in the model parameters to be determined.

The generic kinetics are designed in step g) i. from the stoichiometry of the macro reactions. For the substrates of a macro reaction, a limitation of the Monod type is assumed, by which the maximum reaction rate is multiplied:

(Formula 17)

wherein the Monod constants K_{m,k,i} and the Hill coefficients n_i represent the parameters of the equation, whose first values are entered manually. Usually, the Monod constants K_{m,k,i} are set to one tenth of the respective maximum measured concentrations and the Hill coefficients n_i to the value 1. The determination of generic kinetics from the reaction stoichiometries is described by Provost and by Gao et al. [Provost A. 2006. Metabolic design of dynamic bioreaction models. Faculté des Sciences Appliquées, Université Catholique de Louvain, Louvain-la-Neuve, p. 126; Gao, J., Gorenflo, V. M., Scharer, J. M., & Budman, H. M. (2007). Dynamic metabolic modeling for a MAB bioprocess. Biotechnology Progress, 23(1), 168-181]. These methods are hereby incorporated into the application by reference. In these methods, substrate limitations of the Monod type are used for the respective substrates of a reaction. Although neither Provost nor Gao et al. describe this, inhibition by toxic products can also be derived from the reaction stoichiometry using this method.
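A generic kinetic of this kind can be sketched in a few lines. The sketch assumes a Hill-type form C^n / (K_m^n + C^n) for each substrate limitation, which reduces to the classical Monod term for n = 1; the exact form of formula 17 is not reproduced here, so this form is an assumption. The function names and the concentrations are hypothetical. The first parameter values follow the rule given in the text (K_m equal to one tenth of the maximum measured concentration, Hill coefficient 1).

```python
def monod_hill_term(c, k_m, n=1.0):
    """Substrate limitation between 0 and 1; n = 1 gives the classical Monod term."""
    return c ** n / (k_m ** n + c ** n)


def generic_rate(r_max, concentrations, k_m, hill):
    """Maximum rate multiplied by one limitation term per substrate."""
    rate = r_max
    for name, c in concentrations.items():
        rate *= monod_hill_term(c, k_m[name], hill.get(name, 1.0))
    return rate


def initial_parameters(max_measured):
    """First values: K_m = maximum measured concentration / 10, Hill coefficient = 1."""
    k_m = {name: c_max / 10.0 for name, c_max in max_measured.items()}
    hill = {name: 1.0 for name in max_measured}
    return k_m, hill


# Hypothetical maximum measured concentrations of two substrates.
k_m0, hill0 = initial_parameters({"Glc": 20.0, "Gln": 4.0})
r = generic_rate(r_max=1.0, concentrations={"Glc": 2.0, "Gln": 0.4},
                 k_m=k_m0, hill=hill0)
```

With both concentrations equal to their Monod constants, each term is 0.5 and the rate is a quarter of the maximum rate.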

In step g) ii., the variables influencing the reaction rates r(t) determined in f) are identified. All variables that describe the process state are considered, i.e. also bioreaction conditions such as the pH, the reactor temperature or partial pressures, which cannot be derived from the stoichiometry of the macro reaction. The influencing variables can be determined manually or, for example, using a statistical method such as partial least squares. For this purpose, the correlation between the process state (which is summarized in a matrix) and the reaction rates r(t) from f) is determined. In a step g) iii., the influences determined in g) ii. are then quantified and the kinetics from i. are extended by corresponding terms. An influence of a variable of the process state on a reaction rate found in g) ii. can then be represented by a term I. The term I is any function that returns a value between 0 and 1 depending on the process state. The generic kinetics of the reaction established in g) i. are then multiplied by this term. For example, a negative correlation found between the concentration C_i of a component i and the reaction k indicates an influence of the concentration C_i on the reaction rate r_k. This can be represented, for example, by inhibition kinetics according to Haldane:

$$I(t) = \frac{K_{I,k,i}}{K_{I,k,i} + C_i(t)}$$ (formula 18)

wherein K_{I,k,i} denotes the inhibition constant and represents another model parameter, whose first value is entered manually and is usually set to one tenth of the respective maximum measured concentration.
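The effect of such an inhibition term can be illustrated with a short sketch. It assumes the term has the form K_I / (K_I + C), which lies between 0 and 1 and decreases as the inhibitor concentration rises. The function names and all numbers are hypothetical.

```python
def inhibition_term(c, k_i):
    """Value between 0 and 1 that falls as the inhibitor concentration c rises."""
    return k_i / (k_i + c)


def inhibited_rate(generic_rate_value, inhibitor_concentrations, k_i):
    """Multiply an already computed generic rate by one term per inhibitor."""
    rate = generic_rate_value
    for name, c in inhibitor_concentrations.items():
        rate *= inhibition_term(c, k_i[name])
    return rate


# First value of K_I: one tenth of a hypothetical maximum lactate concentration of 30.
k_i0 = {"Lac": 30.0 / 10.0}
rate = inhibited_rate(0.8, {"Lac": 9.0}, k_i0)
```

Here the lactate term equals 3 / (3 + 9) = 0.25, so the generic rate of 0.8 is reduced to 0.2.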

In an optional step h), the model parameter values p of the kinetics are adapted to the reaction rates r(t) of the macro reactions determined in f):

$$\min_{p_k} \sum_{i=1}^{N_t} \left( \tilde{r}_k(p_k, t_i) - r_k(t_i) \right)^2$$ (formula 19)

This is referred to below as model parameter estimation. A numerical solution of one or more differential equations according to the formulas 2 to 4 in this step can be dispensed with; the model parameter values can be adjusted in independent groups with usually 3 to 10 parameters separately for each macro reaction k. The adaptation is done by a common method such as the Gauss-Newton method [Bates DM, Watts DG. 1988. Nonlinear regression analysis and its applications. New York: Wiley. xiv, 365.].

This model parameter value estimation, which is separate for each macro reaction, is particularly advantageous for steps i) and j), since it can be carried out quickly and also provides improved starting values for adapting the model parameter values to measurement data from c) in step j).

The goodness of fit is calculated, for example, with the sum of the squared residuals SSR_r according to formula 20:

$$SSR_r = \sum_{i=1}^{N_t} \left( \tilde{r}_k(t_i) - r_k(t_i) \right)^2$$ (formula 20)

The smaller the value for SSR_r, the better the fit. Alternatively, the quality of fit is checked by a graphical comparison of r̃_k and r_k.
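The separate parameter estimation for one macro reaction can be sketched as follows. The example fits a single Monod kinetic r = r_max · C / (K_m + C) to given reaction rates with a Gauss-Newton iteration and reports SSR_r. It is a minimal sketch under stated assumptions: the step halving is an added safeguard and not part of the description, the data are synthetic, and the start value for r_max (the largest observed rate) is an assumption; only the start value of K_m (one tenth of the maximum concentration) follows the text.

```python
def monod(c, r_max, k_m):
    return r_max * c / (k_m + c)


def ssr_r(params, conc, rates):
    """Sum of squared residuals between modelled and estimated rates."""
    r_max, k_m = params
    return sum((monod(c, r_max, k_m) - r) ** 2 for c, r in zip(conc, rates))


def fit_gauss_newton(conc, rates, start, iterations=50):
    r_max, k_m = start
    for _ in range(iterations):
        # Build the 2x2 normal equations (J^T J) d = J^T e, with e = data - model.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for c, r in zip(conc, rates):
            j1 = c / (k_m + c)                 # derivative with respect to r_max
            j2 = -r_max * c / (k_m + c) ** 2   # derivative with respect to K_m
            e = r - monod(c, r_max, k_m)
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            g1 += j1 * e
            g2 += j2 * e
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-300:
            break
        d1 = (a22 * g1 - a12 * g2) / det
        d2 = (a11 * g2 - a12 * g1) / det
        # Step halving keeps K_m positive and never lets SSR_r increase.
        step, current = 1.0, ssr_r((r_max, k_m), conc, rates)
        while step > 1e-8:
            trial = (r_max + step * d1, k_m + step * d2)
            if trial[1] > 0 and ssr_r(trial, conc, rates) <= current:
                r_max, k_m = trial
                break
            step /= 2.0
    return r_max, k_m


# Synthetic, noise-free data generated with r_max = 2.0 and K_m = 2.5.
conc = [0.5, 1.0, 2.0, 5.0, 10.0]
rates = [monod(c, 2.0, 2.5) for c in conc]
start = (max(rates), max(conc) / 10.0)
fitted = fit_gauss_newton(conc, rates, start)
```

Because only algebraic expressions are evaluated and no differential equation is solved, such a fit is fast and can be repeated for each macro reaction independently.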

In an optional step i), the kinetics of the macro reactions selected in g) are checked for their quality of fit. The basis is the value SSR _r calculated in step h), which quantifies the quality of fit of the model parameter value estimation. With an unsatisfactory quality of fit, steps g) and h) can be repeated until a predefined quality of fit is achieved.

In a further step j), the parameter values of the kinetics from g) can be adapted to the measured data from c) according to a method customary for such adaptations. The starting values from step h) are preferably used for this adaptation. The model parameter value adjustment takes place with the inclusion of the above-mentioned differential equations (formulas 2 to 4), e.g. using the Gauss-Newton method [Bates DM, Watts DG. 1988. Nonlinear regression analysis and its applications. New York: Wiley. xiv, 365.] or using a multiple shooting algorithm [Peifer M, Timmer J. 2007. Parameter estimation in ordinary differential equations for biochemical processes using the method of multiple shooting. Systems Biology, IET 1 (2): 78-88.].

Preferably, product features can be integrated into the model. Most preferably, this is done for product characteristics that depend on the concentration of by-products or intermediates. Concentrations of by-products that are external components of the metabolic network entered in a) are already integrated into the model and can be calculated. If necessary, however, other by-products or intermediates may be grouped together in one or more separate metabolic networks. This is advantageous if the expected excretion or uptake rates are of different orders of magnitude or if certain metabolic processes are to be considered in different degrees of detail. As an alternative to an integrated model, steps a) to j) can be used to generate a separate model for the calculation of the product features, which also describes the course of the external components of the separate metabolic network over the process with a set of macro reactions with their own kinetics. By-products or intermediates that are not outside the organism but whose intracellular accumulation affects one or more product characteristics may be externalized, i.e. classified as external components, in steps a) and b) in the calculation of the EMs and the formulation of the macro reactions. Product features that depend on intracellular or extracellular concentrations may then be included through the additional integration of quantitative or qualitative relationships between concentrations and product characteristics.

Further objects are a computer program or software for carrying out the method according to the invention.

The model provided by the method according to the invention can be used for process control or planning of the process control as well as investigation of the process in the reactor.

A particular embodiment of the method according to the invention is described by way of example, without being limited thereto. Using this method, a model of a fermentation with hybridoma cells was also prepared by way of example and its validity was tested as described.

Example: Modeling a hybridoma cell culture

1 step a)

Background information in the form of a metabolic network has been published by Niu et al. (Metabolic pathway analysis and reduction for mammalian cell cultures - Towards macroscopic modeling. Chemical Engineering Science (2013) 102, pp. 461-473, DOI: 10.1016/j.ces.2013.07.034). The metabolic network of an animal cell described there contains 35 reactions that link 37 internal and external metabolites (see Figure 5 and Table 1).

Table 1: Reactions of the metabolic network according to Niu et al. (Metabolic pathway analysis and reduction for mammalian cell cultures - Towards macroscopic modeling. Chemical Engineering Science (2013) 102, pp. 461-473, DOI: 10.1016/j.ces.2013.07.034)

1 glucose -> 1 G6P

1 G6P + 2 NAD -> 2 pyruvate

1 pyruvate -> 1 lactate + 1 NAD

1 pyruvate -> 1 pyruvate_m

1 NADm + 1 pyruvate_m -> 1 acetyl CoA_m

1 acetyl CoA_m + 1 NADm + 1 oxaloacetate_m -> 1 α-ketoglutarate_m

1 α-ketoglutarate_m + 1 NADm -> 1 succinyl CoA_m

1 FADm + 1 succinyl CoA_m -> 1 fumarate

1 fumarate -> 1 malate_m

1 malate_m + 1 NADm -> 1 oxaloacetate_m

1 glutamine -> 1 glutamate + 1 NH3

1 glutamate + 1 NADm -> 1 α-ketoglutarate_m + 1 NH3

1 malate_m -> 1 malate

1 malate + 1 NAD -> 1 pyruvate

1 glutamate + 1 pyruvate -> 1 α-ketoglutarate_m + 1 alanine

1 glutamate + 1 oxaloacetate_m -> 1 α-ketoglutarate_m + 1 aspartate

1 arginine + 2 NADm -> 1 glutamate + 3 NH3

1 asparagine -> 1 aspartate + 1 NH3

2 glycine + 1 NADm -> 1 NH3

1 histidine + 1 NADm -> 1 glutamate + 2 NH3

1 isoleucine + 2 NADm -> 1 acetyl CoA_m + 1 NH3 + 1 succinyl CoA_m

1 leucine + 3 NADm -> 3 acetyl CoA_m

1 lysine + 6 NADm -> 2 acetyl CoA_m

1 methionine + 4 NADm -> 1 NH3 + 1 succinyl CoA_m

1 NADm + 1 phenylalanine -> 1 tyrosine

1 serine -> 1 NH3 + 1 pyruvate

1 NADm + 1 threonine -> 1 NH3 + 1 succinyl CoA_m

19 NADm + 1 tryptophan -> 3 acetyl CoA_m

5 NADm + 1 tyrosine -> 2 acetyl CoA_m + 1 fumarate

5 NADm + 1 valine -> 1 NH3 + 1 succinyl CoA_m

1 NADm -> 1 NAD

0.5 oxygen (O2) -> 1 NADm

1 NADm -> 1 FADm

0.0156 alanine + 0.0082 arginine + 0.0287 aspartate + 0.0167 G6P + 0.0245 glutamine + 0.0039 glutamate + 0.0196 glycine + 0.0038 histidine + 0.0099 isoleucine + 0.0156 leucine + 0.0119 lysine + 0.0039 methionine + 0.0065 phenylalanine + 0.016 serine + 0.0094 threonine + 0.0047 tyrosine + 0.0113 valine -> 1 X (biomass) + 0.0981 NAD

0.01101 alanine + 0.005033 arginine + 0.007235 asparagine + 0.0081787 aspartate + 0.010381 glutamine + 0.010695 glutamate + 0.01447 glycine + 0.0034602 histidine + 0.005033 isoleucine + 0.014155 leucine + 0.01447 lysine + 0.0028311 methionine + 0.007235 phenylalanine + 0.026738 serine + 0.016043 threonine + 0.0084932 tyrosine + 0.018874 valine -> 1 IgG (antibody)

In the publication, the reversibility of the reactions is not explicitly stated. Instead, the metabolic flux analysis data from the same publication was evaluated and used to identify the irreversible reactions.

With the stoichiometric matrix N, which contains the stoichiometry, i.e. the stoichiometric coefficients, of the internal metabolites, and the information on the reversibility of the reactions, all elementary modes (EMs) of the network were calculated using METATOOL 5.1 (Pfeiffer et al., METATOOL: for studying metabolic networks, Bioinformatics 1999, 15(3), pp. 251-257). The number of EMs here is over 300,000.

2 step b)

The matrix E with the calculated EMs was obtained in step a). Analogous to the matrix N, the matrix N_p contains the stoichiometry, i.e. the stoichiometric coefficients, of the external metabolites. Possible macro reactions of the stoichiometric network were summarized in the matrix K with formula 21:

$$K = N_p \cdot E$$ (Formula 21)
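The matrix product of formula 21 can be illustrated on a very small example. The network below is purely hypothetical (two external metabolites, three reactions, two elementary modes) and the helper `matmul` is only a plain-Python stand-in for a matrix library.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Rows of N_p: external metabolites; columns: reactions (hypothetical values).
N_p = [[-1, 0, 0],
       [0, 0, 1]]
# Rows of E: reactions; columns: elementary modes (hypothetical values).
E = [[1, 2],
     [1, 2],
     [1, 0]]
K = matmul(N_p, E)
```

Each column of K is then one macro reaction that states the net consumption (negative) or formation (positive) of the external metabolites for one elementary mode.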

3 step c)

The measurement data of the process were taken from Baughman et al., who report various measurements of a fermentation of hybridoma cells over the course of a batch process (see Figure 6) [Baughman et al., Computers & Chemical Engineering (2010) 34 (2), pp. 210-222]. The measurement data were entered into the method.

4 step d)

Using the spline-interpolated measurements from c) (C^m), the growth and death rates as well as the specific uptake and excretion rates were calculated (see Figure 7). Lysis was included with a predefined lysis factor K_l = 0.1, which was entered into the method and was constant over the process period. A shift of the measured data was not necessary, since these are data of a batch process without further additions. Accordingly, the data show a rising trend because all concentration changes are caused by the cells and not by additions.

Additional information was used to calculate the rates q. For example, with the help of [...] and the total number of cells, an average C-mol content of the cells of f_{C-mol,X} = 18.41 could be calculated. The C-mol-related growth rate could then be calculated with formula 22:

$$\mu \left[\frac{\text{C-mol}}{\text{h} \cdot 10^{9}\ \text{cells}}\right] = \mu \left[\frac{1}{\text{h}}\right] \cdot f_{C\text{-}mol,X} \left[\frac{\text{C-mol}}{10^{9}\ \text{cells}}\right]$$ (formula 22)

Analogously, the C-mol-related formation rate of the antibody can be estimated. For this purpose, the molar composition of the antibody was estimated as CH_{1.58}O_{0.31}N_{0.27}S_{0.004}, with a formal molar mass of M_{mAb,C-mol} = 22.45 g/C-mol. Here it is assumed that the molar composition corresponds to the average molar composition of proteins as described by Villadsen et al. [Bioreaction engineering principles (2011), Chapter 3, Elemental and Redox Balances, p. 73, Springer Verlag, ISBN: 978-1-4419-9687-9]. The molar mass of the whole antibody was estimated to be M_mAb = 150,000 g/mol. The rate of formation of the antibody then resulted from formula 23:

(formula 23: conversion of q_mAb from 10^-4 mol/(h · 10^9 cells) to C-mol/(h · 10^9 cells) using M_mAb and M_{mAb,C-mol})

The time course of the rates q (t) could then be used to select the macro reactions.
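The formal molar mass quoted for the antibody can be cross-checked from the elemental composition CH1.58 O0.31 N0.27 S0.004. The sketch below uses standard atomic masses, which are an assumption about how the value was obtained; the small difference from 22.45 is consistent with rounded atomic masses. The last line only shows the ratio of the two molar masses given in the text and does not reproduce formula 23.

```python
# Standard atomic masses in g/mol (assumed; rounded values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "N": 14.007, "S": 32.06}


def formal_molar_mass(composition):
    """Molar mass per C-mol for an elemental composition normalised to one carbon."""
    return sum(ATOMIC_MASS[element] * n for element, n in composition.items())


antibody = {"C": 1.0, "H": 1.58, "O": 0.31, "N": 0.27, "S": 0.004}
m_c_mol = formal_molar_mass(antibody)   # g per C-mol, close to the stated 22.45

# Ratio of the two molar masses from the text: C-mol contained in one mol of antibody.
c_mol_per_mol = 150_000 / 22.45
```

One mol of antibody thus corresponds to roughly 6,700 C-mol, which is the factor that links a molar to a C-mol-related formation rate.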

5 step e)

In step e), an EM subset of macro reactions was generated, with which the data record was reproduced in the best possible way. For this purpose, the matrix K from step b) was needed. Since the number of more than 300,000 macro-reactions would have resulted in too many possible combinations, a data-dependent pre-reduction was performed first:

For this purpose, the rates q(t) determined in step d) were used to calculate the yield coefficients Y^m_{i,j}(t) for all combinations of two external metabolites. The lower limit of a yield coefficient Y_{i,j} was chosen such that 99% of the determined yield coefficients Y^m_{i,j}(t) are above this value. The upper limit was chosen such that 99% of the determined yield coefficients Y^m_{i,j}(t) are below this value. By way of example, some determined limits and the proportion of EMs whose yield coefficients Y_{i,j} lie within these limits are given in Table 2. Overall, the number of EMs could be reduced to about 3000.

Table 2: External metabolites, their maximum and minimum yield coefficients and the proportion of EMs whose yield coefficients are within the specified limits

Following the data-dependent reduction, a data-independent reduction was subsequently carried out. Here, a maximum value for the cosine similarity of two EMs of 0.995 was defined. Starting with the first reaction, all macro reactions were removed from matrix K that exceeded this value. There remained about 500 macro reactions from the matrix K (also called reduced matrix K), which continue to cover more than 95% of the volume of the solution space spanned by the approximately 3000 EMs.
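The data-independent reduction by cosine similarity can be sketched as a greedy filter. This is one possible reading of "starting with the first reaction": each macro reaction is kept only if its cosine similarity to every reaction already kept does not exceed the threshold. The function names and the toy columns are hypothetical; the threshold 0.995 is the value from the example.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def reduce_by_cosine(columns, max_similarity=0.995):
    """Greedily keep columns that are not too similar to any column kept so far."""
    kept = []
    for col in columns:
        if all(cosine_similarity(col, k) <= max_similarity for k in kept):
            kept.append(col)
    return kept


# Hypothetical macro reactions; the second is almost parallel to the first.
columns = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [1.0, 1.0]]
reduced = reduce_by_cosine(columns)
```

The nearly parallel second column is dropped, while the remaining directions, which span the same region of the solution space, are retained.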

Before the selection process, the components specified in the metabolic network according to Niu et al. (corresponding to the external metabolites of the metabolic network from a)) were compared with the measured component concentrations from c). Except for proline, all concentrations measured by Baughman et al. are considered in the metabolic network according to Niu et al. In order to be able to use the measurement of the proline concentration, either another simplified network containing proline as an external metabolite could be used, or the existing metabolic network could be extended.

Components that occurred in the calculated macro reactions but for which no data were available were also ignored in the following. The corresponding rows of the matrix K were accordingly deleted. The deletion of these rows does not mean that these inputs or outputs are not used by the cell. They continue to exist in the metabolic network; only the measurements that would make it possible to balance them are lacking. In this example, the inputs and outputs of arginine, glutamate, glycine, histidine, leucine, lysine, methionine, ammonium, oxygen, phenylalanine, serine, threonine, tryptophan, tyrosine and valine were neglected. In the following steps of the method, the reduced matrix K, which represents the background knowledge, and the rates q(t) from d) and the measured data from c), which form the process knowledge, were used to obtain the smallest possible subset L of macro reactions from K.

The "linear estimation of reaction rates of selected macro reactions" according to the invention was used as the quality criterion.

Analogous to the rates q(t), the measured values of the cell number and of the antibody were normalized here to C-mol. This is necessary so that the dimensions of the macro reactions match those of the measured values.

The subset was selected using a genetic algorithm. In the calculation of the objective function of this genetic algorithm, the linear optimization problem addressed in the "linear estimation of reaction rates of selected macro reactions" was solved. The final sum of squared residuals of the linear optimization problem calculated here was also the value of the objective function for the respective selection of macro reactions.

In order to select the size of the subset L from K, the optimization was performed repeatedly with a different number of macro reactions in L. The number represents a trade-off between model complexity and accuracy of reproduction. To determine how many reactions are sufficient, either the selection of the subset L can be repeated for a varying number of macro reactions, or a penalty term for the number of reactions can be added directly to the objective function of the genetic algorithm. In this case, several optimizations were performed with a predefined number of macro reactions (10, 7, 5, 4 and 3). The smallest sum of squares found with the genetic algorithm is plotted against the number of macro reactions in Figure 9. It turned out that in this case fewer than seven macro reactions are not enough to represent the course of the process sufficiently well. The selected macro reactions are given in Table 3.

Table 3: Selected subset of macro reactions (L). Non-underlined components are not included in the model, as no measurements are available for them.

0.474 alanine + 0.474 methionine -> 0.158 asparagine + 0.316 aspartate + 0.632 glycine + 0.158 tryptophan

0.00789 arginine + 0.0304 asparagine + 0.0161 glucose + 0.0236 glutamine + 0.00375 glutamate + 0.00366 histidine + 0.00953 isoleucine + 0.015 leucine + 0.112 methionine + 0.00626 phenylalanine + 0.0154 serine + 0.0109 valine -> 0.963 X (biomass) + 0.00276 aspartate + 0.24 glycine + 0.0208 tryptophan

[...]ine + 0.147 glutamate -> 0.295 aspartate + 0.885 glycine + 0.147 lactate

[...]ne + 0.113 asparagine + 0.0603 glucose + 0.0225 glutamine + 0.0824 histidine + 0.00909 isoleucine + 0.00597 phenylalanine + 0.0216 tryptophan + 0.00431 tyrosine + 0.0104 valine -> 0.918 X (biomass) + 0.061 alanine + 0.0865 aspartate + 0.343 glycine + 0.0631 methionine

[...]e + 0.412 aspartate + 0.00991 glucose + 0.0145 glutamine + 0.554 glycine + 0.00226 histidine + 0.00588 isoleucine + 0.00926 leucine + 0.00706 lysine + 0.0649 phenylalanine + 0.0095 serine + 0.00671 valine -> 0.594 X (biomass) + 0.049 alanine + 0.395 asparagine + 0.0503 threonine + 0.0388 tryptophan

0.0077 arginine + 0.179 aspartate + 0.0157 glucose + 0.104 glutamine + 0.216 glycine + 0.00357 histidine + 0.00929 isoleucine + 0.0146 leucine + 0.0112 lysine + 0.03[?] tyrosine + 0.0106 valine -> 0.939 X (biomass) + 0.0624 alanine + 0.152 asparagine + 0.0183 tryptophan

0.0342 arginine + 0.211 aspartate + 0.00762 glucose + 0.0195 glutamine + 0.244 glycine + 0.00452 histidine + 0.0546 isoleucine + 0.0185 leucine + 0.0171 lysine + 0.00406 methionine + 0.0178 tyrosine + 0.0203 valine -> 0.457 X (biomass) + 0.804 IgG (antibody) + 0.185 asparagine + 0.0153 tryptophan

In the macro reactions shown, all external metabolites of the metabolic network from a) are given. However, only the underlined external metabolites are part of the model, since measurement data from c) are available only for these.

6 step f)

For the selected set of macro reactions, the reaction rates over time were determined. In this example, using the method according to the invention "linear estimation of reaction rates of selected macro reactions", the measured values shown in Figure 10 were approximated by an estimate of the reaction rates r(t). The result of the method is a piecewise linear course of the individual (volumetric) reaction rates. By division by the interpolated course of the viable cell number X_v(t), the cell-specific reaction rates r(t) of the macro reactions shown in Table 3 were obtained. The reaction rates r(t) obtained are shown in Figure 10.

7 step g)

For all macro reactions shown in Table 3, generic kinetics according to formula 24 were adopted:

(Formula 24)

In this case they were realized by Monod kinetics, i.e. for each reaction k, a limitation according to formula 25 was introduced for each substrate i:

(Formula 25)

Here, r_{k,max} is the maximum reaction rate, N_i the number of limitations taken into account, C_i the concentration of component i, K_{m,k,i} the associated Monod constant and n_i the Hill parameter for the reaction order. Their values are adjusted in steps h) and j).

Further terms result from the analysis of the reaction rates r(t) from f). In addition to substrate limitations, inhibitions according to formula 26 were also taken into account in this example.

(Formula 26)

For this limitation, too, the values of the parameters K_{I,k,i} and n_i had to be adjusted. The kinetic terms of the reactions used are given in Table 4.

Table 4: Kinetic terms of the selected macro reactions from L

$$r_2(t) = r_{2,\max} \cdot \frac{[Glc](t)}{K_{m,Glc,2} + [Glc](t)} \cdot \frac{[Gln](t)}{K_{m,Gln,2} + [Gln](t)} \cdot \frac{[Asn](t)}{K_{m,Asn,2} + [Asn](t)} \cdot \frac{[Ala](t)}{K_{m,Ala,2} + [Ala](t)}$$

$$r_3(t) = r_{3,\max} \cdot \frac{[Glc](t)}{K_{m,Glc,3} + [Glc](t)} \cdot \frac{[Asn](t)}{K_{m,Asn,3} + [Asn](t)} \cdot \frac{K_{I,Lac,3}}{K_{I,Lac,3} + [Lac](t)}$$

8 step h)

For each reaction rate it was possible to algebraically calculate the course of the reaction rate with the kinetics given in Table 4 and the interpolated values of the concentrations C ^m (t) considered in the kinetics.

The parameters of these kinetics were adapted separately for each reaction k to the reaction rate r_k(t) determined in step f). The objective function for optimizing the parameters occurring in reaction k in this example was:

(Formula 27)

The courses of all calculated r_k(p_k, C^int(t)) adapted in this way are shown together with the corresponding r_k(t) in Figure 11. The courses of the former are shown as dashed lines. It can be seen that the courses agree qualitatively. This means that the dynamics of the process can also be reproduced satisfactorily with the chosen kinetics. This information is very useful in this modeling step because, if the reproduction is unsatisfactory, the quick steps g) (selection of other kinetics) and h) (parameter value estimation) can be repeated until the desired quality of fit is achieved. Step i) was not required here.

9 step j)

The model parameter values p were then fitted further using the measurement data from c). For this purpose, all parameters were optimized simultaneously. In addition, the processes of apoptosis and lysis, which had not been considered so far, were included. These are needed in the differential equations describing the development of the vital and total cell numbers:

$$\frac{dX_v}{dt} = (\mu - \mu_d) \cdot X_v$$ (formula 28)

(formula 29)

The chosen kinetics for the description of apoptosis was:

$$\mu_d(t) = \mu_{d,\max} \cdot \frac{[Lac](t) - C_{Lac,cr}}{K_{d,Lac} + ([Lac](t) - C_{Lac,cr})}, \quad [Lac] > C_{Lac,cr}$$ (formula 30)

$$\mu_d(t) = 0, \quad [Lac] \le C_{Lac,cr}$$ (formula 31)
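The piecewise apoptosis kinetics of formulas 30 and 31 can be sketched in Python as follows. The function and argument names are hypothetical, and the numerical values in the usage lines are made up for illustration.

```python
def death_rate(lac, mu_d_max, k_d_lac, c_lac_cr):
    """Death rate as a function of the lactate concentration.

    Below the critical lactate concentration the rate is zero;
    above it, a Monod-type term acts on the excess lactate.
    """
    excess = lac - c_lac_cr
    if excess <= 0.0:
        return 0.0
    return mu_d_max * excess / (k_d_lac + excess)


# Illustrative values only
mu_d_low = death_rate(lac=5.0, mu_d_max=0.1, k_d_lac=10.0, c_lac_cr=20.0)
mu_d_high = death_rate(lac=30.0, mu_d_max=0.1, k_d_lac=10.0, c_lac_cr=20.0)
```

The rate is continuous at the critical concentration, because the Monod-type term is zero when the excess lactate is zero.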

The lysis rate K_l was assumed to be constant over the process. In addition to the parameters of the reaction rates, the parameters introduced by apoptosis and lysis were determined in this step: C_{Lac,cr} (critical lactate concentration), μ_{d,max} (maximum death rate), K_{d,Lac} (Monod parameter describing the influence of the lactate concentration on the death rate) and K_l (lysis rate). In the example, starting from the initial values of the data set, the course of the estimated concentrations Ĉ(t) was determined by numerically solving the ODE system. The difference between the measured concentrations C^m(t) and the estimated concentrations Ĉ(t) was then minimized by conventional methods with the following objective function:

(Formula 32)
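The following Python sketch illustrates the principle of this step on a deliberately reduced problem. It assumes that the vital cell number follows dX_v/dt = (μ − μ_d) · X_v with constant rates, that the objective is a sum of squared residuals of the kind denoted SSR_C in the list of symbols, and that explicit Euler integration is accurate enough. The data are synthetic and all names are hypothetical. The full model would integrate all concentrations together.

```python
import math


def simulate_xv(xv0, mu, mu_d, t_end, dt=0.001):
    """Explicit Euler integration of dXv/dt = (mu - mu_d) * Xv."""
    steps = round(t_end / dt)
    xv = xv0
    for _ in range(steps):
        xv += dt * (mu - mu_d) * xv
    return xv


def ssr_c(p, t_meas, xv_meas, xv0):
    """Sum of squared residuals between measured and simulated values."""
    mu, mu_d = p
    return sum((xm - simulate_xv(xv0, mu, mu_d, t)) ** 2
               for t, xm in zip(t_meas, xv_meas))


# Synthetic measurements generated with a net growth rate of 0.03 per hour
t_meas = [0.0, 10.0, 20.0, 30.0]
xv_meas = [1.0 * math.exp(0.03 * t) for t in t_meas]

# A parameter set with the matching net rate and one with a wrong net rate
good = ssr_c((0.04, 0.01), t_meas, xv_meas, xv0=1.0)
bad = ssr_c((0.04, 0.03), t_meas, xv_meas, xv0=1.0)
```

Unlike the per-reaction fit in step h), every evaluation of this objective requires a numerical solution of the differential equations, which is why good starting values matter.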

With a total of 33 parameters p, this optimization is generally difficult to perform, since the objective function has many local optima. If a deterministic optimization algorithm, such as the Levenberg-Marquardt algorithm, is started at the parameter values known from step h), the chances of success are greatly increased. The fitted process course is shown in Figure 12. The fitted parameters are shown in Table 5.

Table 5: Parameters of the kinetics as well as of apoptosis and lysis

10. Step k)

The model consisting of the matrix L, the kinetics from Table 4 and the kinetics of apoptosis with the associated parameter values from Table 5 was output.

List of Symbols

_ (underscore) Denotes a vector

i (index) Denotes the i-th element of a vector

k (index) Denotes the k-th element of a vector

[ ] Indicates the concentration of the component in the brackets

C Concentration

ΔC Concentration difference

C^int Interpolated concentration

Ĉ Estimated concentration (e.g., by solving a differential equation)

C_s Shifted concentration

C_cr Critical concentration

C^m Measured concentration

D Dilution rate

q Determined cell-specific excretion and uptake rates

q Determined cell-specific excretion and uptake rate, converted from any unit to [amount of substance / (time · cell number)]

r Determined reaction rate

r̂ Estimated reaction rate (for example, by evaluating a reaction kinetics)

f Limitation term of a kinetics

p_k Parameters of a reaction kinetics

N Stoichiometric matrix

N_p External stoichiometric matrix

K Matrix containing the macro reactions

E Matrix containing all elementary modes

X_t Total cell number

X_v Vital cell number

μ Growth rate

μ_d Death rate

μ Growth rate, converted from any unit to a unit based on amount of substance

Number of prefixes 1 ^a

K_l Lysis rate

K_I Parameter of an inhibition limitation

K_m Parameter of a substrate limitation

n Hill parameter of an inhibition or substrate limitation

L Subset of macro reactions used for the model

p Model parameters

S Substrate

SSR_q Sum of squared residuals of the specific uptake or release rates

SSR_C Sum of squared residuals of the concentrations

SSR_r Sum of squared residuals of the reaction rates

Description of the figures:

Figure 1 shows the shift of measured data. The actual course of a measured quantity (C_i(t)) is shown, which changes abruptly when the dilution rate (D(t)) changes. The shifted course (C_s(t)) results only from changes caused by the cells.

Figure 2 shows the flux map of two specific rates q_1 and q_2. The contour lines indicate the frequency with which the respective combination of rates occurs in the measured data.

Figure 3 shows a three-dimensional representation of the solution space, which is spanned by a positive linear combination of EMs. In black, the solution space of the entire set is shown, in gray, that of a subset.

Figure 4 shows the flux map of two specific rates q_1 and q_2. The 2-dimensional projections of the macro reactions of a set L are shown as vectors.

Figure 5 shows a schematic representation of the metabolic network of Niu et al. Here, the boundary of the cell is shown as a box. The intracellular boundary of the mitochondrion is indicated by a dashed line. External components are marked with the index "xt". The arrows and dotted arrows indicate reactions.

Figure 6 shows the measurement data of a fermentation with hybridoma cells from Baughman et al. The total cell count (total cells) is calculated here as the sum of the living cells (vital cells) and the dead cells (dead cells). The abbreviations GLC, GLN, ASP, ASN, LAC, ALA and PRO denote the substrate glucose, the amino acids glutamine, aspartic acid, asparagine, alanine and proline, and the metabolite lactate. The abbreviation MAB designates the product, a monoclonal antibody, and BM the biomass.

Figure 7 shows the growth and death rates as well as the cell-specific uptake and release rates. The cell-specific rates are given in [mM / (h · 10^9 cells)] and the growth and death rates in [1/h].

Figure 8 shows the concentrations approximated by the "linear estimation of reaction rates of selected macro reactions" with the selected reaction set. The total cell number (X_t) and the antibody concentration (MAB) were converted to C-mol.

Figure 9 shows the smallest calculated sum of error squares ("minimum error") plotted against the number of macro reactions in the subset (n _R ).

Figure 10 shows the reaction rates of the macro reactions r(t) determined by the method according to the invention, "linear estimation of reaction rates of selected macro reactions".

Figure 11 shows the reaction rates of the macro reactions r(t) (solid line) determined by the method according to the invention, "linear estimation of reaction rates of selected macro reactions", together with the algebraically calculated reaction rates r̂ (dashed line).

Figure 12 shows a comparison of the measured concentrations C^m(t) (points) and the simulated process course Ĉ(t) (solid line). The concentrations are given in [mM]. Exceptions are the vital and total cell numbers (X_v / X_t in [10^9 cells]) and the antibody concentration (mAb in [10^-4 mM]).

Claims

1. A computer-implemented method for creating a model of a bioreaction with an organism, comprising the steps of:

a. Selected metabolic pathways of the organism, their stoichiometric and reversibility properties are entered into the procedure as background knowledge, and elementary modes are calculated from this input.

b. The elementary modes from a) are summarized in a matrix K, whereby the elementary modes summarize the metabolic pathways from a) into macro reactions and the matrix K contains the stoichiometry and the reversibility properties of all macro reactions.

c. The measurement data for the bioreaction with the organism are entered.

d. Using an interpolation method, on the basis of the input measurement data from c), the rates specific to the organism (excretion and uptake rates of one or more input variables and output variables) of the input metabolic pathways are calculated.

e. Relevant macro reactions are selected in the form of a subset of the elementary modes from a) by:

i. data-independent and/or data-dependent prereduction of the number of elementary modes from a),

ii. selection of the subset from the prereduction from e) i. with the measurement data from c) and/or one or more rates from d) by means of an algorithm according to a mathematical quality criterion, and summary of the subset in a matrix L,

iii. optionally, graphical display of the subsets.

f. Using an interpolation method, the reaction rates of the macro reactions of the subset r (t) are calculated on the basis of the given measurement data from c) and / or the rates from d).

g. Kinetics of the macro reactions are designed:

i. From the stoichiometry of the macro reactions, generic kinetics are designed.

ii. Factors influencing the macro reactions are determined from the reaction rates from f).

iii. The generic kinetics from g) i. are extended by terms that quantify the influencing factors determined in g) ii.

h. Optionally, a first adaptation of the model parameter values of the kinetics from g) to the calculated reaction rates from f) is performed separately for each macro reaction.

j. The model parameter values are adapted to the measurement data from c).

k. The matrix L, the kinetics from g) and the model parameter values from j) form the model and are output and/or transferred to a process control or process development module.

2. The computer-implemented method according to claim 1, wherein in step d) growth rates, and particularly preferably also death rates, of the organism are also calculated.

3. The computer-implemented method according to any one of claims 1 or 2, wherein in step g) an individual adaptation of the kinetics is performed based on an analysis of the reaction rates from f).

4. The computer-implemented method according to any one of claims 1 to 3, wherein in step h) the adjustment of the parameter values of the kinetics of g) is carried out by combining a plurality of adaptation methods.

5. The computer-implemented method according to any one of claims 1 to 4, wherein in step e) ii., to select the subset of macro reactions, a linear estimation of reaction rates of selected macro reactions is performed.

6. The computer-implemented method according to any one of claims 1 to 5, wherein in step e) ii., to select the subset of macro reactions, a linear estimation of reaction rates of selected macro reactions is performed in combination with an evolutionary algorithm.

7. The computer-implemented method according to any one of claims 1 to 6, wherein the measurement data are shifted prior to application of the interpolation method in step d) in order to describe a constant consumption without feed peaks.

8. The computer-implemented method according to any one of claims 1 to 7, wherein in step f) a linear estimation of reaction rates of selected macro reactions is performed.

9. The computer-implemented method according to any one of claims 1 to 8, wherein in step e) i. a data-dependent prereduction is performed and for this the method of linear estimation of reaction rates of selected macro reactions with NNLS is used.

10. The computer-implemented method according to any one of claims 1 to 9, wherein in step e) iii. the validity of the subset selection of macro reactions is checked using a Flux Map.

11. The computer-implemented method according to any one of claims 1 to 10, wherein in step e) ii. the selection from the prereduction from e) i. is performed with the measured data from c).

12. Computer program for carrying out the method steps according to one of claims 1 to 10.

13. Software for carrying out the method steps according to one of claims 1 to 11.