CN117690513A - Culture medium formula stability analysis method based on self-encoder - Google Patents
- Publication number
- CN117690513A (application number CN202311702960.8A)
- Authority
- CN
- China
- Prior art keywords
- formula
- cells
- culture medium
- self
- variance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/30—Prediction of properties of chemical compounds, compositions or mixtures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/0895—Weakly supervised learning, e.g. semi-supervised or self-supervised learning
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/70—Machine learning, data mining or chemometrics
Abstract
The invention relates to a method for analyzing the stability of a culture medium formula based on a self-encoder. The method uses an unsupervised learning algorithm built on a neural network to map the culture medium formula to a lower-dimensional hidden space, thereby achieving dimension reduction and noise reduction. After the mapping, the variance of the yield of the target product is calculated with a variance formula to obtain the formula stability analysis result. The method analyzes the overall stability of a culture medium formula, and highly stable formulas are screened out based on the result, so that the quality stability of the culture medium is ensured at the source and the difficulty of the production process is reduced.
Description
Technical Field
The invention relates to the field of culture medium research and development in modern biotechnology, and in particular to a method for analyzing the stability of a culture medium formula based on a self-encoder.
Background
Serum-free culture medium for animal cells is currently a key raw material for biopharmaceutical production based on animal cell culture technology. The quality of the culture medium product bears directly on the final safety and efficacy of the biopharmaceutical product, and its quality stability is decisive for the stability of the pharmaceutical manufacturer's production process.
The production route of serum-free culture medium products is complex. Taking dry powder products as an example, the process involves feeding, production, packaging, and finished product inspection. The production stage, which includes raw material drying, grinding, mixing, and so on, is where the process routes of culture medium manufacturers differ most; examples include the common cone-mixing pin-milling process and Gibco's fluid bed granulation (AGT) process. During production, the final quality of the product is affected by many process factors:
(1) Form of raw materials: a serum-free medium formula typically contains 60 to 100 components. Even though each component has a definite chemical structure, materials with the same CAS number can bring different impurities into the medium product because of different suppliers, production processes, product quality standards, and the like.
(2) Material drying: components differ in how well they tolerate drying, and for components that are volatile or contain water of crystallization the drying step easily introduces deviation into the production system.
(3) Grinding: common grinding processes include pin milling, ball milling, and jet milling, all of which readily generate a large amount of heat. Because the culture medium contains many heat-sensitive substances, their quality can change during grinding. Once the grinding process is fixed, low-temperature control and limiting the grinding time are effective protective measures, but they clearly increase the difficulty of the production process.
(4) Mixing: besides mixing parameters such as the fill factor, the mixing time, the manner and order in which powders enter the mixer, and the stirring/rotation speed, the mixing process is also disturbed by differences in component particle size and in the proportion of each component.
In summary, when a culture medium is converted from a formula into a product, the process introduces many interference factors that remain in a 'black box' state, so analysis and optimization are hard to carry out link by link. Only by improving the overall stability of the formula as far as possible, that is, by screening for formulas that resist interference and allow a simpler process flow, can stable product quality be ensured at the source and the difficulty of the process be reduced. There is therefore a need in the art for a method of assessing the overall stability of culture medium formulas, so that the more stable formula can be selected from serum-free medium formulas with similar performance.
Disclosure of Invention
The invention provides a method for analyzing the stability of a culture medium formula. The method takes the cell growth state (for example, the peak viable cell density) or the product expression (for example, protein yield, virus titer, or a particular quality index) as the response value. The result of the formula stability analysis, namely the fluctuation range of the response value, can guide the selection of the formula with higher stability or a smaller fluctuation range among formulas of similar performance. This facilitates establishing the subsequent culture medium production process, ensures the quality stability of the culture medium at the source, and reduces the difficulty of the production process.
It is an object of the present invention to provide a method for analyzing the stability of a culture medium formulation based on a self-encoder.
In a first aspect of the invention, a method for analyzing the stability of a culture medium formulation based on a self-encoder model is provided, comprising the steps of:
M1: providing a pre-trained self-encoder model, wherein the self-encoder model is a self-encoding network comprising an encoder network and a decoder network, and the encoder network and the decoder network are neural networks of the same dimension;
M2: providing a culture medium formula k to be tested that comprises p components, the concentration of each component being denoted $x_i$; normalizing the concentration of each component in formula k to obtain a normalized value $\tilde{x}_i$; and inputting the normalized values $\tilde{x}_i$ into the self-encoder model to obtain an output feature Z; wherein i is a positive integer from 1 to p, and p is a positive integer from 50 to 500;
wherein the normalization is performed using formula Q1:
$$\tilde{x}_i = \frac{x_i - x_i^{\min}}{x_i^{\max} - x_i^{\min}} \qquad \text{(Q1)}$$
where
$\tilde{x}_i$ is the normalized content value of component i;
$x_i$ is the content value of any component i in formula k;
$x_i^{\min}$ is the lowest content value of component i in the formula database;
$x_i^{\max}$ is the highest content value of component i in the formula database;
and the formula database contains (or consists of) all formulas used for pre-training;
M3: using the variance calculation formula Q3, calculating the variance of the yield $y_k$ of culture medium formula k based on the output feature Z obtained in M2;
where
$\mathrm{var}(y_k)$ is the variance of the yield of the culture medium formula k to be tested;
$y_k$ is the fitted yield of the culture medium formula k to be tested;
Z is the output feature obtained in step M2;
$Z^T$ is the transpose of the output feature Z;
$\Sigma_p^{-1}$ is the reciprocal of the variance of the prior distribution, with empirical value 0.1;
the variance of the prior distribution of the error term $\epsilon$ has empirical value 0.01;
the subscript k denotes formula k;
M4: determining the stability of culture medium formula k from the magnitude of the variance: the smaller the variance, the higher the stability of formula k; the larger the variance, the poorer the stability of formula k.
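The normalization in step M2 can be sketched as follows. This is a minimal illustration that assumes Q1 is ordinary min-max scaling against the lowest and highest database values of each component. The function name `min_max_normalize` is hypothetical, the zero-span handling is an added assumption, and the numbers are the X1 to X3 columns of Table 2, with bounds taken over the four formulas listed there rather than over a full formula database.

```python
import numpy as np

def min_max_normalize(x, x_min, x_max):
    """Scale each component concentration to [0, 1] using database bounds."""
    x = np.asarray(x, dtype=float)
    x_min = np.asarray(x_min, dtype=float)
    x_max = np.asarray(x_max, dtype=float)
    span = x_max - x_min
    # Assumption: a component whose bounds coincide is mapped to 0
    # instead of dividing by zero.
    safe_span = np.where(span == 0, 1.0, span)
    return np.where(span == 0, 0.0, (x - x_min) / safe_span)

# X1..X3 of formula BM31D-FM12D from Table 2.
x_k = [4.3, 5.5, 4028.0]
# Bounds over the four formulas in Table 2 (stand-in for the formula database).
db_min = [4.0, 5.5, 4028.0]
db_max = [5.9, 8.4, 5618.0]

x_norm = min_max_normalize(x_k, db_min, db_max)
print(x_norm)
```

Values outside the database bounds would fall outside [0, 1]; whether to clip them is a design choice the text does not address.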
In another preferred embodiment, in step M1, the self-encoder model is obtained by pre-training by a method comprising the steps of:
S1: providing a formula database comprising data for a plurality of culture medium formulas, wherein the data for each formula comprise formula component data, applicable cell type data, and yield data for a target product; the formula component data are the components that make up the culture medium formula; the formula database comprises n groups of culture medium formulas, where n is an integer from 1 to 1000, preferably from 20 to 500;
S2: normalizing the content of component h in any culture medium formula j in the formula database, the normalized content value being denoted $\tilde{x}_h$, wherein h is a positive integer from 1 to p, and p is a positive integer from 50 to 500;
wherein the normalized content value is calculated using formula Q1-1:
$$\tilde{x}_h = \frac{x_h - x_h^{\min}}{x_h^{\max} - x_h^{\min}} \qquad \text{(Q1-1)}$$
where
$\tilde{x}_h$ is the normalized content value of component h,
$x_h$ is the content value of any component h in the formula database,
$x_h^{\min}$ is the lowest content value of component h in the formula database,
$x_h^{\max}$ is the highest content value of component h in the formula database;
S3: providing an initialized self-encoder network, inputting the p normalized content values $\tilde{x}_h$ into the network as training samples for pre-training, and finally outputting p fitted content values $\hat{x}_h$; the loss function L is calculated according to formula Q2;
S4: training the self-encoder network by gradient descent so as to minimize the loss function L, thereby obtaining the pre-trained self-encoder model.
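Steps S3 and S4 can be illustrated with a small self-encoder (autoencoder) trained by plain gradient descent. This is a sketch under stated assumptions, not the patented network: the reconstruction loss is taken to be the mean squared error (the exact form of Q2 is not reproduced in this text), the network has a single tanh encoder layer and a linear decoder layer, the sizes n = 40, p = 55 and the 8-dimensional hidden space are arbitrary, and the training data are random stand-ins for normalized formulas.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_autoencoder(p, d):
    """One-layer encoder (p -> d) and one-layer decoder (d -> p)."""
    return {
        "W_enc": rng.normal(0.0, 0.1, size=(p, d)), "b_enc": np.zeros(d),
        "W_dec": rng.normal(0.0, 0.1, size=(d, p)), "b_dec": np.zeros(p),
    }

def encode(params, X):
    """Map normalized formulas to the hidden space (output feature Z)."""
    return np.tanh(X @ params["W_enc"] + params["b_enc"])

def decode(params, Z):
    """Reconstruct the normalized component values from Z."""
    return Z @ params["W_dec"] + params["b_dec"]

def mse_loss(params, X):
    """Assumed loss: mean squared reconstruction error."""
    return float(np.mean((decode(params, encode(params, X)) - X) ** 2))

def train_step(params, X, lr=0.2):
    """One full-batch gradient-descent update on the reconstruction loss."""
    Z = encode(params, X)
    X_hat = decode(params, Z)
    grad_out = 2.0 * (X_hat - X) / X.size                    # dL/dX_hat
    grad_pre = (grad_out @ params["W_dec"].T) * (1.0 - Z ** 2)  # through tanh
    params["W_dec"] -= lr * (Z.T @ grad_out)
    params["b_dec"] -= lr * grad_out.sum(axis=0)
    params["W_enc"] -= lr * (X.T @ grad_pre)
    params["b_enc"] -= lr * grad_pre.sum(axis=0)

# Synthetic stand-in for a normalized formula database (n = 40, p = 55).
X_train = rng.random((40, 55))
params = init_autoencoder(p=55, d=8)

initial_loss = mse_loss(params, X_train)
for _ in range(500):
    train_step(params, X_train)
final_loss = mse_loss(params, X_train)
print(initial_loss, final_loss)
```

After training, `encode(params, x)` plays the role of the output feature Z used in step M3. A practical implementation would more likely use a deep-learning framework with several hidden layers on each side.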
In another preferred embodiment, the method further comprises:
Step S5: when culture medium formula data are added to or updated in the formula database, repeating steps S2 to S4 to fine-tune the pre-trained self-encoder model and obtain an updated self-encoder model,
except that in step S3 the initialized self-encoder network is replaced with the pre-trained self-encoder model.
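Fine-tuning as in step S5 can be sketched as continued gradient descent starting from the pre-trained weights. Example 1 additionally keeps the first encoder layer and the last decoder layer frozen during fine-tuning, and the sketch below follows that choice. The layer names, layer sizes, tanh activations, mean-squared-error loss, and the random stand-in for pre-trained weights are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
p, h, d = 55, 16, 4  # hypothetical layer sizes

# Stand-in for a pre-trained network (random here; names are hypothetical).
params = {
    "enc1": rng.normal(0.0, 0.1, (p, h)),  # first encoder layer, frozen
    "enc2": rng.normal(0.0, 0.1, (h, d)),
    "dec1": rng.normal(0.0, 0.1, (d, h)),
    "dec2": rng.normal(0.0, 0.1, (h, p)),  # last decoder layer, frozen
}
FROZEN = ("enc1", "dec2")
TRAINABLE = ("enc2", "dec1")

def forward(params, X):
    a1 = np.tanh(X @ params["enc1"])
    z = np.tanh(a1 @ params["enc2"])      # hidden-space feature
    a3 = np.tanh(z @ params["dec1"])
    x_hat = a3 @ params["dec2"]           # linear reconstruction
    return a1, z, a3, x_hat

def fine_tune_step(params, X, lr=0.2):
    """One gradient-descent step that updates only the trainable layers."""
    a1, z, a3, x_hat = forward(params, X)
    g_out = 2.0 * (x_hat - X) / X.size                      # dL/dx_hat (MSE)
    g_pre3 = (g_out @ params["dec2"].T) * (1.0 - a3 ** 2)   # through tanh
    g_pre2 = (g_pre3 @ params["dec1"].T) * (1.0 - z ** 2)   # through tanh
    grads = {"dec1": z.T @ g_pre3, "enc2": a1.T @ g_pre2}
    for name in TRAINABLE:
        params[name] -= lr * grads[name]
    return float(np.mean((x_hat - X) ** 2))

# Newly added, already normalized formulas (synthetic stand-in).
X_new = rng.random((4, p))
frozen_before = {name: params[name].copy() for name in FROZEN}
trainable_before = {name: params[name].copy() for name in TRAINABLE}

for _ in range(50):
    loss = fine_tune_step(params, X_new)
print(loss)
```

Freezing the outer layers keeps the input and output mappings learned from the full database fixed while the inner layers adapt to the new formulas.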
In another preferred embodiment, the cell type data are encoded using one-hot encoding.
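A minimal sketch of one-hot encoding for the cell type, using four cell types named in this document. The helper names and the choice to append the one-hot vector to the normalized component values are assumptions; the text states only that cell types are one-hot encoded.

```python
CELL_TYPES = ["CHO", "MDCK", "Sf9", "Vero"]

def one_hot_cell_type(cell_type, categories=CELL_TYPES):
    """Return a vector with 1.0 at the position of cell_type, 0.0 elsewhere."""
    if cell_type not in categories:
        raise ValueError(f"unknown cell type: {cell_type!r}")
    return [1.0 if c == cell_type else 0.0 for c in categories]

def build_input(normalized_components, cell_type):
    """Assumed layout: component values followed by the cell-type one-hot."""
    return list(normalized_components) + one_hot_cell_type(cell_type)

sample = build_input([0.2, 0.0, 0.7], "Sf9")
print(sample)
```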
In another preferred embodiment, the cells are selected from the group consisting of: CHO cells, MDCK cells, BHK cells, Sf9 cells, High Five cells, 293 cells, MDBK cells, F81 cells, DF-1 cells, LMH cells, Vero cells, PK15 cells, ST cells, Marc-145 cells, hybridoma cells, diploid cells, and immune effector cells.
In another preferred embodiment, the cells are selected from the group consisting of: CHO cells, MDCK cells, Sf9 cells, and Vero cells.
In another preferred embodiment, in step M3, the variance $\mathrm{var}(y_k)$ of the yield $y_k$ is calculated using formula Q3,
where
$\mathrm{var}(y_k)$ is the variance of the yield of the culture medium formula k to be tested;
$y_k$ is the fitted yield of the culture medium formula k to be tested;
Z is the output feature obtained in step M2;
$Z^T$ is the transpose of the output feature Z;
$\Sigma_p^{-1}$ is the reciprocal of the variance of the prior distribution, with empirical value 0.1;
the variance of the prior distribution of the error term $\epsilon$ has empirical value 0.01;
the subscript k denotes formula k.
In another preferred embodiment, when $\mathrm{var}(y_k) < 0.5$, the stability of the culture medium formula is good.
In another preferred embodiment, the stability of the culture medium formula is good when $\mathrm{var}(y_k) < 0.3$, preferably $\mathrm{var}(y_k) < 0.2$, and more preferably $\mathrm{var}(y_k) < 0.1$, or $\mathrm{var}(y_k) < 0.01$, or $\mathrm{var}(y_k) < 0.001$.
In another preferred embodiment, the method further comprises using the highly stable culture medium formula identified from the variance value for cell culture.
It is understood that within the scope of the present invention, the above-described technical features of the present invention and the technical features specifically described below (e.g., in the examples) may be combined with each other to constitute new or preferred technical solutions. For reasons of space, these combinations are not described one by one here.
Drawings
Fig. 1 is a diagram of a self-encoder architecture.
Detailed Description
After extensive and intensive research, the inventors have for the first time provided a method for analyzing culture medium stability based on a self-encoder neural network. A self-encoder extracts the relationships and features in historical data on culture medium formula components and product expression, a relationship between the medium components and the response value is established, and the stability analysis result is obtained through a variance formula. The method can screen out, among formulas of similar performance, the formula with higher stability or a smaller fluctuation range, which facilitates establishing the subsequent culture medium production process. The invention was completed on this basis.
Self-encoder model and neural network
The self-encoder model is a self-encoding network comprising an encoder network and a decoder network. Both are neural networks with symmetrical structures, that is, the encoder network and the decoder network have the same number of hidden layers. The main purpose of the self-encoder model is to reconstruct the input data at the output layer, that is, to minimize the distance between the input data and the reconstructed data.
The optimization objective of the self-encoder model is to minimize the value of the loss function L.
Variance calculation formula
The invention uses a variance formula derived from the Bayesian formulation of the regression model conventionally used in the art, and uses the variance to describe the stability of the yield of the target product.
The yield variance $\mathrm{var}(y_k)$ is calculated using formula Q3,
where
$\mathrm{var}(y_k)$ is the variance of the yield of the culture medium formula k to be tested,
$y_k$ is the yield of the culture medium formula k to be tested,
Z is the output feature obtained in step M2,
$Z^T$ is the transpose of the output feature Z,
$\Sigma_p^{-1}$ is the reciprocal of the variance of the prior distribution, with empirical value 0.1,
the variance of the prior distribution of the error term $\epsilon$ has empirical value 0.01;
the formula also uses the reciprocal of the variance of the prior distribution of the error term $\epsilon$.
The yield y of the target product obtained by cell culture with culture medium formula k is fitted using formula Q4:
$$y = \beta^{T} Z + \epsilon \qquad \text{(Q4)}$$
where
Z is the output feature obtained in step M2;
$\beta^{T}$ is the transpose of $\beta$, the coefficients of the independent variables, whose prior distribution is the Gaussian distribution $N(0, \Sigma_p)$;
$\Sigma_p$ is the variance of the prior distribution of $\beta^{T}$, with empirical value 10;
$\epsilon$ is the error term, whose prior distribution is Gaussian;
the variance of the prior distribution of the error term $\epsilon$ has empirical value 0.01.
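For orientation, the sketch below computes the predictive variance of the standard textbook Bayesian linear regression model that matches Q4, with the empirical values given above (prior variance 10, error variance 0.01). It is an assumption that Q3 takes this form: the exact expression of Q3 is not reproduced in this text, and treating the prior as isotropic ($\Sigma_p = 10 \cdot I$) is a further assumption. The function name and the toy feature vectors are hypothetical.

```python
import numpy as np

def predictive_variance(Z_hist, z_new, prior_var=10.0, noise_var=0.01):
    """Variance of beta^T z_new under the posterior of beta.

    Z_hist: (n, d) hidden-space features of historical formulas.
    z_new:  (d,) hidden-space feature of the formula to be tested.
    """
    Z_hist = np.atleast_2d(np.asarray(Z_hist, dtype=float))
    z_new = np.asarray(z_new, dtype=float)
    d = z_new.shape[0]
    # Posterior precision of beta: data term plus prior precision.
    A = Z_hist.T @ Z_hist / noise_var + np.eye(d) / prior_var
    return float(z_new @ np.linalg.solve(A, z_new))

# Toy features: history covers the first two hidden directions only.
Z_hist = np.eye(3)[:2]
var_seen = predictive_variance(Z_hist, [1.0, 0.0, 0.0])    # direction in history
var_unseen = predictive_variance(Z_hist, [0.0, 0.0, 1.0])  # direction not in history
print(var_seen, var_unseen)
```

In this model the variance depends only on where the new feature lies relative to the historical features, not on the historical yields, so a formula far from anything seen before receives a large variance.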
The invention is based on an unsupervised learning algorithm of a self-encoder neural network, and the basic idea of the algorithm is to map a formula to a hidden space with lower dimensionality, thereby achieving the purposes of dimension reduction and noise reduction. After mapping, the variance of the yield of the target product can be calculated through a variance formula so as to obtain a stability analysis result. The advantage of using a self-encoder is that it can extract relationships and features between variables through the historical data, making full use of the information in the historical data.
Compared with the prior art, the invention has the main advantages that:
(1) The analysis method can guide and select the culture medium formula with higher stability or smaller fluctuation range in the similar performance, is convenient for the establishment of the subsequent culture medium production process, ensures the stable quality of the culture medium from the source and reduces the difficulty of the production process.
(2) Training the self-encoder on historical data enables the method to better handle the high dimensionality of culture medium components. The self-encoder can also process a wider variety of data, so the input is not limited to conventional numerical variables.
The invention will be further illustrated with reference to specific examples. It is to be understood that these examples are illustrative of the present invention and are not intended to limit the scope of the present invention. Percentages and parts are by weight unless otherwise indicated.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. In addition, any methods and materials similar or equivalent to those described herein can be used in the methods of the present invention. The preferred methods and materials described herein are presented for illustrative purposes only.
The invention provides a method for analyzing the stability of a culture medium formula that makes full use of historical data to analyze the stability of responses such as cell growth and product expression for complex culture medium compositions. The invention is further illustrated below with reference to an example.
Example 1
This example uses a CHO cell culture process producing a certain antibody. Cells were seeded at $1 \times 10^6$ cells/ml into cell culture shake tubes at 20 ml per tube, and a two-stage 'basal + fed-batch' culture process was used, with antibody production at the end of day 14 as the response value. The overall stability analysis of the formula covers both the basal medium (BM) and the fed-batch medium (FM).
The method specifically comprises the following steps:
1) Collect the components and content data of all culture medium formulas, together with their historical yield data, as the formula database. Where media of the same composition were used to culture different cells, the cells are classified using one-hot encoding. All historical culture medium formula components, their content data, and their yield data are shown in Table 1 below.
TABLE 1
2) Normalize the historical data. Normalization is carried out according to the maximum and minimum values of each component, which may be set from expert knowledge or taken as the maximum and minimum values of that component in the culture medium formula library.
3) Input the normalized components and their concentration data into a self-encoder network for pre-training, and persist the trained self-encoder network.
4) Obtain a new culture medium formula, process its component data (the corresponding yield data are not required) by the method of step 1), normalize them by the method of step 2), and input the normalized data into the pre-trained self-encoder network of step 3) for fine-tuning.
Step 4) can be repeated as new culture medium formulas continue to be obtained, fine-tuning the self-encoder network each time. During fine-tuning, the first layer of the encoder network and the last layer of the decoder network are frozen and not updated. When fine-tuning is finished, the pre-trained self-encoder model is obtained.
5) Provide 4 groups of culture medium formulas to be tested, normalize each component of each formula by the method of step 2), and input the normalized component data into the pre-trained self-encoder model obtained in step 4) to obtain the output feature Z.
Table 2. Culture medium formulas to be tested
Recipe name | X1 | X2 | X3 | X4 | X5 | … | X52 | X53 | X54 | X55 |
BM31D-FM4D1 | 4 | 8 | 4028 | 91 | 1 | … | 7385 | 297 | 1728 | 1809 |
BM31D-FM12D | 4.3 | 5.5 | 4028 | 89 | 0.2 | … | 7083 | 297 | 1728 | 1809 |
BM38D-FM4D1 | 5.9 | 5.5 | 4408 | 92 | 0.2 | … | 4160 | 88 | 1572 | 1809 |
BM38D-FM12D | 4.3 | 8.4 | 5618 | 89 | 0.2 | … | 2448 | 88 | 1572 | 1809 |
6) Using the output feature Z of each culture medium formula, the yield variance of the target product was calculated for each formula with the yield variance formula. The results are shown in Table 3:
Table 3
Recipe name | Estimated yield variance |
BM31D-FM4D1 | 0.20 |
BM31D-FM12D | 0.18 |
BM38D-FM4D1 | 0.27 |
BM38D-FM12D | 0.28 |
7) Cell experiments to verify the stability of the media: the 4 culture medium formulas to be tested from step 5) were each used to produce a certain antibody by the CHO cell culture process. Cells were inoculated at $1 \times 10^6$ cells/ml into cell culture shake tubes at 20 ml per tube and cultured by the two-stage 'basal + fed-batch' process, and antibody production was measured after 14 days of culture.
For each culture medium formula to be tested, 5 preparations were made using different batches of raw materials; CHO cell culture experiments were performed on each preparation under the same conditions, and antibody production was measured on day 14; the variance of the yields was then calculated for each formula.
The results are shown in Table 4 below:
Table 4
Recipe name | Variance |
BM31D-FM4D1 | 0.20 |
BM31D-FM12D | 0.19 |
BM38D-FM4D1 | 0.24 |
BM38D-FM12D | 0.33 |
As can be seen by comparing Table 3 with Table 4, the yield variance estimated by the method of the present invention is close to the yield variance measured in actual production: the estimates differ from the measured values by at most 0.05, and both rank the four formulas in the same order. That is, the method has high accuracy and can guide technicians in selecting the formula with the smallest fluctuation range and the highest stability.
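The comparison between the estimated variances (Table 3) and the measured variances (Table 4) can be checked with a few lines of Python. The values are copied from the two tables; the variable names are illustrative.

```python
# Estimated yield variance per formula (Table 3).
estimated = {
    "BM31D-FM4D1": 0.20,
    "BM31D-FM12D": 0.18,
    "BM38D-FM4D1": 0.27,
    "BM38D-FM12D": 0.28,
}
# Measured yield variance per formula (Table 4).
measured = {
    "BM31D-FM4D1": 0.20,
    "BM31D-FM12D": 0.19,
    "BM38D-FM4D1": 0.24,
    "BM38D-FM12D": 0.33,
}

# Absolute gap between estimate and measurement, rounded to table precision.
abs_diff = {name: round(abs(estimated[name] - measured[name]), 2)
            for name in estimated}

# Formulas ordered from most stable (smallest variance) to least stable.
rank_estimated = sorted(estimated, key=estimated.get)
rank_measured = sorted(measured, key=measured.get)
most_stable = rank_estimated[0]

print(abs_diff)
print(rank_estimated == rank_measured, most_stable)
```

Both rankings place BM31D-FM12D first, so the estimate and the experiment select the same formula as the most stable.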
All documents mentioned in this application are incorporated by reference as if each were individually incorporated by reference. Further, it will be appreciated that various changes and modifications may be made by those skilled in the art after reading the above teachings, and such equivalents are intended to fall within the scope of the claims appended hereto.
Claims (10)
1. A method for analyzing the stability of a culture medium formulation based on a self-encoder model, comprising the steps of:
M1: providing a pre-trained self-encoder model, wherein the self-encoder model is a self-encoding network comprising an encoder network and a decoder network, and the encoder network and the decoder network are neural networks of the same dimension;
M2: providing a culture medium formula k to be tested that comprises p components, the concentration of each component being denoted $x_i$; normalizing the concentration of each component in formula k to obtain a normalized value $\tilde{x}_i$; and inputting the normalized values $\tilde{x}_i$ into the self-encoder model to obtain an output feature Z; wherein i is a positive integer from 1 to p, and p is a positive integer from 50 to 500;
wherein the normalization is performed using formula Q1:
$$\tilde{x}_i = \frac{x_i - x_i^{\min}}{x_i^{\max} - x_i^{\min}} \qquad \text{(Q1)}$$
where
$\tilde{x}_i$ is the normalized content value of component i;
$x_i$ is the content value of any component i in formula k;
$x_i^{\min}$ is the lowest content value of component i in the formula database;
$x_i^{\max}$ is the highest content value of component i in the formula database;
and the formula database contains (or consists of) all formulas used for pre-training;
M3: using the variance calculation formula Q3, calculating the variance of the yield $y_k$ of culture medium formula k based on the output feature Z obtained in M2;
where
$\mathrm{var}(y_k)$ is the variance of the yield of the culture medium formula k to be tested;
$y_k$ is the fitted yield of the culture medium formula k to be tested;
Z is the output feature obtained in step M2;
$Z^T$ is the transpose of the output feature Z;
$\Sigma_p^{-1}$ is the reciprocal of the variance of the prior distribution, with empirical value 0.1;
the variance of the prior distribution of the error term $\epsilon$ has empirical value 0.01;
the subscript k denotes formula k;
M4: determining the stability of culture medium formula k from the magnitude of the variance: the smaller the variance, the higher the stability of formula k; the larger the variance, the poorer the stability of formula k.
2. The method of claim 1, wherein in step M1, the self-encoder model is obtained by pre-training by a method comprising the steps of:
S1: providing a formula database comprising data for a plurality of culture medium formulas, wherein the data for each culture medium formula comprise formula component data, applicable cell type data, and yield data for a target product; wherein the formula component data are the components that make up the culture medium formula; the formula database comprises n groups of culture medium formulas, wherein n is an integer from 1 to 1000; preferably, n is an integer from 20 to 500;
S2: normalizing the content of component h in any culture medium formula j in the formula database, the normalized content value being denoted $\hat{x}_h$, wherein h is a positive integer from 1 to p, and p is a positive integer from 50 to 500;
wherein the normalized content value is calculated using formula Q1-1:
$$\hat{x}_h = \frac{x_h - x_h^{\min}}{x_h^{\max} - x_h^{\min}} \qquad \text{(Q1-1)}$$
where
$\hat{x}_h$ is the normalized value of the content of component h,
$x_h$ is the content value of any component h in the formula database,
$x_h^{\min}$ is the lowest content value of component h in the formula database,
$x_h^{\max}$ is the highest content value of component h in the formula database;
S3: providing an initialized self-encoder network, inputting the p normalized content values $\hat{x}_h$ into the self-encoder network as training samples for pre-training, and finally outputting p fitted content values $\tilde{x}_h$; the loss function L is calculated according to formula Q2:
S4: training the self-encoder network by a gradient descent method so as to minimize the loss function L, thereby obtaining a pre-trained self-encoder model.
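Steps S3 and S4 can be sketched with a small NumPy autoencoder. This is an illustration under stated assumptions, not the claimed implementation: formula Q2 is not reproduced in this text, so mean squared reconstruction error is used as a stand-in loss, and the single tanh hidden layer, the latent size `d`, the learning rate, and all function names are hypothetical choices.

```python
import numpy as np

def init_autoencoder(p, d, rng):
    """Randomly initialized network: p inputs -> d latent features -> p outputs."""
    return {
        "W1": rng.normal(0.0, 0.1, (p, d)), "b1": np.zeros(d),
        "W2": rng.normal(0.0, 0.1, (d, p)), "b2": np.zeros(p),
    }

def forward(params, X):
    Z = np.tanh(X @ params["W1"] + params["b1"])   # encoder output (feature Z)
    X_fit = Z @ params["W2"] + params["b2"]        # decoder output (fitted contents)
    return Z, X_fit

def mse_loss(X, X_fit):
    # Assumed stand-in for Q2: mean squared reconstruction error.
    return float(np.mean((X_fit - X) ** 2))

def train(params, X, lr=0.1, epochs=500):
    """Full-batch gradient descent on the reconstruction loss; updates params in place."""
    history = []
    for _ in range(epochs):
        Z, X_fit = forward(params, X)
        history.append(mse_loss(X, X_fit))
        d_fit = 2.0 * (X_fit - X) / X.size          # dL/dX_fit
        grads = {"W2": Z.T @ d_fit, "b2": d_fit.sum(axis=0)}
        d_hidden = (d_fit @ params["W2"].T) * (1.0 - Z ** 2)  # back through tanh
        grads["W1"] = X.T @ d_hidden
        grads["b1"] = d_hidden.sum(axis=0)
        for name in params:
            params[name] -= lr * grads[name]
    return params, history

rng = np.random.default_rng(0)
X = rng.random((20, 8))            # toy data: 20 normalized formulas, 8 components
params = init_autoencoder(p=8, d=3, rng=rng)
params, history = train(params, X)
print(history[0], history[-1])     # loss before the first and before the last update
```

Calling `train` again on an enlarged dataset, starting from the returned `params` rather than a fresh initialization, corresponds to the fine-tuning described in step S5.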
3. The method of claim 2, wherein the method further comprises:
step S5: when culture medium formula data are added to or updated in the formula database, repeating steps S2-S4 to fine-tune the pre-trained self-encoder model and obtain an updated self-encoder model,
except that, in step S3, the initialized self-encoder network is replaced with the pre-trained self-encoder model.
4. The method of claim 1, wherein the cell type data is encoded using one-hot encoding.
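One-hot encoding, the standard name for the encoding referred to in claim 4, maps each cell type to a binary vector with a single 1. A minimal sketch follows; the four-entry category list is taken from claim 6 for brevity, and the names `CELL_TYPES` and `one_hot` are hypothetical.

```python
import numpy as np

# Example category list; a real database would enumerate every cell type it contains.
CELL_TYPES = ["CHO", "MDCK", "Sf9", "Vero"]

def one_hot(cell_type, categories=CELL_TYPES):
    """Return a vector with 1.0 at the index of cell_type and 0.0 elsewhere."""
    vec = np.zeros(len(categories))
    vec[categories.index(cell_type)] = 1.0   # raises ValueError for an unknown type
    return vec

encoded = one_hot("Sf9")
print(encoded)
```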
5. The method of claim 1, wherein the cell is selected from the group consisting of: CHO cells, MDCK cells, BHK cells, Sf9 cells, High Five cells, 293 cells, MDBK cells, F81 cells, DF-1 cells, LMH cells, Vero cells, PK15 cells, ST cells, Marc-145 cells, hybridoma cells, diploid cells, and immune effector cells.
6. The method of claim 1, wherein the cell is selected from the group consisting of: CHO cells, MDCK cells, Sf9 cells, and Vero cells.
7. The method of claim 1, wherein in step M3, the variance $\mathrm{var}(y_k)$ of the yield $y_k$ is calculated using formula Q3:
where
$\mathrm{var}(y_k)$ is the variance of the yield of the culture medium formula k to be tested;
$y_k$ is the fitted yield of the culture medium formula k to be tested;
$Z$ is the output feature obtained in step M2;
$Z^T$ is the transpose of the output feature Z;
the reciprocal of the variance of the prior distribution takes the empirical value 0.1;
the prior distribution variance of the error term ε takes the empirical value 0.01;
the subscript k denotes formula k.
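Formula Q3 itself is not reproduced in this text. As an illustration only, the sketch below computes the prior predictive variance of a Bayesian linear model (yield = weights times Z plus an error term, with zero-mean Gaussian priors), which happens to use exactly the listed quantities: Z, its transpose, a prior precision of 0.1, and an error variance of 0.01. Whether this matches Q3 is an assumption, and the names `ALPHA`, `SIGMA2`, and `predictive_variance` are hypothetical.

```python
import numpy as np

ALPHA = 0.1     # reciprocal of the prior distribution variance (empirical value)
SIGMA2 = 0.01   # prior variance of the error term (empirical value)

def predictive_variance(z, alpha=ALPHA, sigma2=SIGMA2):
    """Assumed form: var(y) = sigma2 + (Z^T Z) / alpha for a feature vector Z."""
    z = np.asarray(z, dtype=float).ravel()
    return float(sigma2 + (z @ z) / alpha)

z_k = np.array([0.1, -0.05, 0.02])   # toy output feature for formula k
var_k = predictive_variance(z_k)
print(var_k)
```

Under this assumed form the variance can never fall below the error variance 0.01, and it grows with the squared norm of the feature vector.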
8. The method of claim 1, wherein when $\mathrm{var}(y_k) < 0.5$, the stability of the culture medium formula is good.
9. The method of claim 1, wherein when $\mathrm{var}(y_k) < 0.3$, preferably $\mathrm{var}(y_k) < 0.2$, more preferably $\mathrm{var}(y_k) < 0.1$, or $\mathrm{var}(y_k) < 0.01$, or $\mathrm{var}(y_k) < 0.001$, the stability of the culture medium formula is good.
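The decision rule of step M4 and the thresholds of claims 8 and 9 can be sketched as follows. The threshold values come from the claims; the helper names `tightest_threshold` and `rank_by_stability` and the example variances are hypothetical.

```python
# Variance thresholds named in claims 8 and 9, from strictest to loosest.
THRESHOLDS = [0.001, 0.01, 0.1, 0.2, 0.3, 0.5]

def tightest_threshold(variance, thresholds=THRESHOLDS):
    """Return the smallest threshold the variance stays below, or None if it meets none."""
    for t in sorted(thresholds):
        if variance < t:
            return t
    return None

def rank_by_stability(variances):
    """Order formula names from most stable (lowest variance) to least stable."""
    return sorted(variances, key=variances.get)

variances = {"A": 0.4, "B": 0.05, "C": 0.7}   # toy variances for three formulas
for name in rank_by_stability(variances):
    print(name, variances[name], tightest_threshold(variances[name]))
```

A formula whose variance meets none of the thresholds (here "C") returns `None`, which under claim 8 would not be classed as having good stability.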
10. The method of claim 1, further comprising: using the culture medium formula identified as highly stable according to the variance value for cell culture.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311702960.8A CN117690513B (en) | 2023-12-12 | 2023-12-12 | Culture medium formula stability analysis method based on self-encoder |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117690513A (en) | 2024-03-12 |
CN117690513B (en) | 2024-06-25 |
Family
ID=90133135
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202311702960.8A CN117690513B (en) | Active | 2023-12-12 | 2023-12-12 | Culture medium formula stability analysis method based on self-encoder |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117690513B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117497037A (en) * | 2023-11-17 | 2024-02-02 | 上海倍谙基生物科技有限公司 | Culture medium component sensitivity analysis method based on generalized linear model |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114121161A (en) * | 2021-06-04 | 2022-03-01 | 东莞太力生物工程有限公司 | Culture medium formula development method and system based on transfer learning |
WO2022063341A1 (en) * | 2020-09-27 | 2022-03-31 | 深圳太力生物技术有限责任公司 | Basal culture medium development method, basal culture medium formulation and development, and system thereof |
CN114360652A (en) * | 2022-01-28 | 2022-04-15 | 深圳太力生物技术有限责任公司 | Cell strain similarity evaluation method and similar cell strain culture medium formula recommendation method |
CN114678085A (en) * | 2022-04-29 | 2022-06-28 | 深圳太力生物技术有限责任公司 | Metabolic parameter-based supplemented medium development method and system |
Also Published As
Publication number | Publication date |
---|---|
CN117690513B (en) | 2024-06-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||