WO2003023687A2 - Improved method for analyzing the profile of continuous data - Google Patents
- Publication number
- WO2003023687A2 (PCT/US2002/027805)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- parameter
- data
- readable medium
- machine readable
- aligning
- Prior art date
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B19/00—Programme-control systems
- G05B19/02—Programme-control systems electric
- G05B19/418—Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM]
- G05B19/41875—Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM] characterised by quality surveillance of production
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—Nc systems
- G05B2219/32—Operator till task planning
- G05B2219/32187—Correlation between controlling parameters for influence on quality parameters
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—Nc systems
- G05B2219/32—Operator till task planning
- G05B2219/32194—Quality prediction
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/02—Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]
Definitions
- the present invention relates to a method for analyzing the data from a production process.
- a method for analysis comprising the steps of: generating plots for each of a plurality of batches of a process manufacturing process based on data for at least one continuous parameter; aligning the plots based on at least one aligning continuous parameter of each of the plurality of batches; selecting a plurality of profiles for the aligned plots; analyzing the profiles using a regression method to provide analysis results indicating the level of success of the process manufacturing process; and displaying the analysis results to a user.
- a machine readable medium storing instructions that, if executed by a computer system, cause the computer system to perform a set of operations comprising: generating plots for each of a plurality of batches of a process manufacturing process based on data for at least one continuous parameter; aligning the plots based on at least one aligning continuous parameter of each of the plurality of batches; selecting a plurality of profiles for the aligned plots; analyzing the profiles using a regression method to provide analysis results indicating the level of success of the process manufacturing process; and displaying the analysis results to a user.
- a method for analysis comprising the steps of: generating plots for each of a plurality of batches of a process manufacturing process based on data for at least one continuous parameter; aligning the plots based on at least one aligning continuous parameter of each of the plurality of batches; selecting a plurality of profiles for the aligned plots; analyzing the profiles using a regression method to provide analysis results indicating the level of success of the process manufacturing process; and storing the analysis results in a machine readable medium.
- a machine readable medium storing instructions that, if executed by a computer system, cause the computer system to perform a set of operations comprising: generating plots for each of a plurality of batches of a process manufacturing process based on data for at least one continuous parameter; aligning the plots based on at least one aligning continuous parameter of each of the plurality of batches; selecting a plurality of profiles for the aligned plots; analyzing the profiles using a regression method to provide analysis results indicating the level of success of the process manufacturing process; and storing the analysis results in a second machine readable medium.
- FIG. 1 is a flow chart illustrating a preferred embodiment of the method of the present invention;
- FIG. 2 is a screenshot of the operation of a program used to implement a preferred embodiment of the method of the present invention showing plots for parameters of a process being analyzed prior to alignment;
- FIG. 3 is a screenshot of the operation of the program of FIG. 2 showing how the plots of FIG. 2 may be shifted using a preferred embodiment of the method of the present invention;
- FIG. 4 is a screenshot of the operation of the program of FIG. 2 showing one parameter on which a linear regression may be carried out in accordance with a preferred embodiment of the method of the present invention;
- FIG. 5 is a screenshot of the operation of the program of FIG. 2 showing five continuous parameters and a series of selected time-based profiles for a process being analyzed by the program of FIG. 2;
- FIG. 6 is a screenshot of the operation of the program of FIG. 2 showing one-line summaries of the multiple regression analysis output of the process being analyzed by the program of FIG. 2; and
- FIG. 7 is a screenshot of the operation of the program of FIG. 2 showing a detailed output of a model of the process being analyzed by the program of FIG. 2.
- a "raw data" source refers to unadjusted data contained in the original data sources.
- profile refers to the collection of values of one or more continuous parameters of one or more "parameter sets" at a given common time point.
- the term "the level of success of a manufacturing process" refers to the current relationship of the process to the predicted outcome of interest, such as an impurity or yield outcome parameter.
- process manufacturing process refers to a process that uses processing steps that exert chemical and physical changes on the raw materials and intermediate materials so that they are physically transformed into products that no longer resemble the starting materials.
- Examples of process manufacturing processes are the production of antibiotics or genetically engineered proteins from sugars and salts using living microorganisms, or the production of gasoline from crude oil.
- discrete manufacturing process refers to a process that is essentially an assembly operation. Examples of discrete manufacturing are the manufacturing of an automobile or a shirt that consists of assembling various parts to make a whole. The finished product looks like an assembly of the parts that the process started with. Some of the parts used to assemble a car may be produced by process manufacturing processes. For example, an engine block for an automobile is made from molten alloy that is produced by a process manufacturing process starting with chunks of mineral ores that are melted and processed so that they no longer resemble the mined starting materials.
- bioprocess refers to any process manufacturing process that involves the use of cell cultures, including living cell cultures such as bacterial cultures.
- the term "user" refers not only to end users of software employing the method of the present invention, but also to individuals, such as software developers or database designers, who carry out one or more steps of the method of the present invention.
- process refers to any process.
- the method of the present invention may be used to access and analyze data from processes for producing one or more products including manufacturing processes, purification processes, chemical synthesis processes, etc. or may be used for other types of processes such as tracking the shipment of goods, tracking inventory in a store, etc.
- a process of the present invention includes one or more steps.
- the term "parameter" refers to any property or characteristic used to classify an individual piece or multiple pieces of data. For the purposes of the present invention, there are two characteristics of "parameters": identification codes and "parameter values." Any parameter that is not used as an identification code for an analysis group is a parameter value. Parameters may include characteristics such as the temperature at a particular time, the pH of a solution, the purity of a compound, the source of a raw material, etc.
- parameter value refers to the specific piece of data resulting from a measurement associated with a specific parameter.
- Examples of specific parameter values include the particular batch number for a batch, the measured temperature associated with a batch material at a particular time in its production cycle, the test outcome for a specific parameter, etc.
- identification code refers to a code, name etc. that uniquely identifies a particular parameter.
- parameter set refers to a group of parameters that relate to the same batch of manufactured product.
- a parameter set may be obtained from a single data set or multiple data sets.
- a parameter set may have one or more "parameter values" associated with each parameter in the parameter set.
- equipment parameter refers to a parameter relating to one or more pieces of equipment used in a process manufacturing process of the present invention.
- Examples of equipment parameters include: RPMs of an agitator, pressure of a vessel, etc.
- the term "material parameter" refers to a parameter relating to a material that is processed by a process manufacturing process of the present invention.
- Examples of material parameters include: pH of a solution, concentration of metabolite, temperature of a liquid, etc.
- environmental parameter refers to a parameter relating to the environment to which material processed by the process manufacturing process is exposed.
- Examples of environmental parameters include: room air temperature, humidity, dew point, etc.
- the terms "material parameter", "environmental parameter" and "equipment parameter" refer to different species of parameters.
- raw material refers to starting materials used in a process for producing a product.
- intermediate material refers to a material produced at any point in the process prior to producing the final product of the process.
- An intermediate material may be produced by manufacturing the intermediate material from raw materials or other intermediate materials, by purifying raw materials or other intermediate materials, by the synthesis from raw materials or other intermediate materials, etc.
- batch refers to a given amount of product and the materials and conditions used to make that given amount of product, regardless of the amount of raw materials used, the amount of product produced, or the time taken to produce a given amount of product.
- Several types of discrete data, continuous data, and replicate data may all be related to a particular batch of product.
- the term batch as used in the present invention may refer to a production run of several hours, days, weeks, months, etc.
- the term “data source” refers to any source of data such as a database or a data storage file, data directly produced by a measurement device, data electronically sent from a remote location, data entered into a database from paper records, etc. Two data sources are considered to be “different” if the data sources employ different file formats or different data structures or have different physical locations.
- the term "data set" refers to a set of data or a database. A data set may be classified into a particular "complete data set type" based on the data set's primary data set type, secondary data set type and tertiary data set type.
- discrete data refers to data parameter values that are obtained only once during the process of producing one batch of product. Examples of discrete data include: the amount of an ingredient added at some step in a process, the source of an ingredient added at a particular step in a process, the date of production of an ingredient used in a process, etc.
- continuous data refers to data parameter values that are obtained at several times during the same step of the process of producing a batch of product, with each collection having an associated time.
- Examples of continuous data include: the temperature at a particular step of a process measured in 5 second intervals for the duration of the step, the moisture content of the effluent air at a particular step measured in 10 second intervals for the duration of the step, the amount of contamination present at a particular step measured in 15 minute intervals, etc.
- replicate data refers to data parameter values that are obtained from several measurements of the same parameter made independent of the time of the measurement, i.e. replicate data includes data obtained from multiple measurements of the same parameter made at the same time and data obtained from multiple measurements of the same parameter taken with no regard as to the time that the measurements were made. Replicate data may also be discrete data or continuous data.
- an analysis group refers to a collection of parameter sets that may be selected by a user wherein all of the parameter sets meet the "parameter restrictions" for one or more parameters.
- an analysis group could include all of the parameter sets which have median temperature parameter values of 35 to 38° C for three different time points, a minimum pH parameter value above 7, have the same raw materials supplier parameter, have a raw materials supplied date parameter value of January, etc.
- An analysis group is a structured data container that supports fast, efficient utilization of data via standardized interfaces. The structure of an analysis group permits it to hold all types of data concurrently, e.g. discrete, continuous, replicate, etc.
- An analysis group can be thought of as a sparsely populated multidimensional data cube, with parameter sets (that relate to individual batches of manufactured product) making up one axis, parameter names making up another axis, and time offsets (for continuous parameters) making up another axis, and replicate information making up another axis.
- Analysis groups also allow the dynamic creation of additional parameters within the analysis groups, allow for the data within them to be subsetted for subsequent operations and allow themselves to be updated with new data from the data sources on an on-demand basis. Analysis groups of the type employed by the method of the present invention are described in greater detail in U.S. Patent Application No.
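The data-cube description above can be illustrated with a minimal Python sketch. The class name `AnalysisGroup`, its method names, the use of `None` as the time offset for discrete values, and the sample batch data are all assumptions made for illustration; none of them is taken from the referenced application.

```python
class AnalysisGroup:
    """Sparse cube keyed by (batch, parameter, time_offset, replicate)."""

    def __init__(self):
        self._cells = {}  # only populated cells are stored

    def put(self, batch, parameter, value, time_offset=None, replicate=0):
        # time_offset=None marks a discrete value (obtained once per batch)
        self._cells[(batch, parameter, time_offset, replicate)] = value

    def discrete(self, batch, parameter):
        return self._cells.get((batch, parameter, None, 0))

    def series(self, batch, parameter, replicate=0):
        """Continuous values for one batch as (time_offset, value), sorted by time."""
        return sorted(
            (t, v)
            for (b, p, t, r), v in self._cells.items()
            if b == batch and p == parameter and r == replicate and t is not None
        )

    def batches(self):
        return sorted({key[0] for key in self._cells})

    def subset(self, restriction):
        """New group holding only batches for which restriction(group, batch) is true."""
        keep = {b for b in self.batches() if restriction(self, b)}
        out = AnalysisGroup()
        out._cells = {k: v for k, v in self._cells.items() if k[0] in keep}
        return out


# Hypothetical data: two batches with a discrete and a continuous parameter.
group = AnalysisGroup()
group.put("B1", "supplier", "Acme")
group.put("B2", "supplier", "Other")
for t, ph in [(0, 7.2), (5, 7.1)]:
    group.put("B1", "pH", ph, time_offset=t)
for t, ph in [(0, 7.0), (5, 6.8)]:
    group.put("B2", "pH", ph, time_offset=t)

# Parameter restriction: minimum pH value above 7.
high_ph = group.subset(lambda g, b: min(v for _, v in g.series(b, "pH")) > 7)
```

Storing only populated cells in a dictionary is one simple way to keep the cube sparse; a production system would likely use a database or an array library instead.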
- the term "computer system" refers to any type of computer system that implements software including an individual computer such as a personal computer, mainframe computer, mini-computer, etc. or a network of computers, such as a network of computers in a business, the Internet, personal data assistant, cell phone, etc.
- visual display device includes any type of visual display device such as a CRT monitor, LCD screen, etc.
Description
- the present invention provides a method for statistical analysis of large multi-parameter data sets and for finding correlations between values of multiple continuous parameters and selected outcome parameter(s).
- finding the correlation between the multiple continuous parameters and the selected outcome parameters is done through modeling these large sets of data.
- the continuous parameters along with the discrete outcome parameter are first fitted to a statistical model, such as linear regression or multiple regression or non-linear regression.
- the output from the statistical model is a mathematical relationship between the continuous parameters and the selected outcome parameter.
- the selected outcome parameter may then be predicted at several time points during the process to decide or determine the fate of the manufacturing process with a high degree of certainty and for better process control.
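As an illustration of the kind of mathematical relationship such a model produces, a multiple linear regression fitted at a profile time $t$ could be written as follows. The notation is an assumption introduced here, not the patent's own: $y_b$ is the discrete outcome for batch $b$, $x_{b,j}(t)$ is the value of continuous parameter $j$ for batch $b$ at aligned time $t$, $\beta_j(t)$ are the fitted coefficients, and $\varepsilon_b$ is the residual.

```latex
y_b = \beta_0(t) + \sum_{j=1}^{p} \beta_j(t)\, x_{b,j}(t) + \varepsilon_b ,
\qquad
\hat{y}_{\mathrm{new}}(t) = \hat{\beta}_0(t) + \sum_{j=1}^{p} \hat{\beta}_j(t)\, x_{\mathrm{new},j}(t)
```

The second expression is the predicted outcome for a batch still in progress, computed from its parameter values at time $t$ and the coefficients estimated from completed batches.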
- APA Advanced Profile Analysis
- the multiple continuous parameters may come from a single piece of equipment or unit operation, such as a fermentor or from multiple pieces of equipment at one or more process steps.
- the method may first allow the user to align all the parameters across multiple batches or production runs based on some criterion such as time at which a certain parameter value occurs, e.g., when dissolved oxygen reaches 50%. This allows the absolute times at which all the parameters are measured for a particular batch to be adjusted by the amount needed to align that batch to a process state similar to the rest of the batches.
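The alignment step can be sketched in Python as follows. This is a minimal illustration, assuming each continuous parameter is stored as a list of `(time, value)` pairs and that the aligning event is located by linear interpolation between samples; the function names `crossing_time` and `align_batches` and the sample values are hypothetical.

```python
def crossing_time(series, level):
    """First time the series reaches `level`, interpolated between samples.

    series: list of (time, value) pairs sorted by time.
    Returns None if the level is never reached.
    """
    for (t0, v0), (t1, v1) in zip(series, series[1:]):
        if v0 == level:
            return t0
        if (v0 - level) * (v1 - level) < 0:  # level lies strictly between samples
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    if series and series[-1][1] == level:
        return series[-1][0]
    return None


def align_batches(batches, aligning_parameter, level):
    """Shift every time stamp of each batch so the aligning event falls at t = 0.

    batches: {batch_id: {parameter_name: [(time, value), ...]}}
    Batches that never reach the level are left out of the result.
    """
    aligned = {}
    for batch_id, params in batches.items():
        shift = crossing_time(params[aligning_parameter], level)
        if shift is None:
            continue
        aligned[batch_id] = {
            name: [(t - shift, v) for t, v in points]
            for name, points in params.items()
        }
    return aligned


# Hypothetical data: dissolved oxygen (DO, %) and temperature for two batches.
batches = {
    "B1": {"DO": [(0, 100.0), (1, 60.0), (2, 40.0)],
           "temp": [(0, 37.0), (1, 37.0), (2, 36.0)]},
    "B2": {"DO": [(0, 100.0), (2, 80.0), (4, 20.0)],
           "temp": [(0, 37.0), (2, 37.5), (4, 37.0)]},
}
aligned = align_batches(batches, "DO", 50.0)
```

After alignment, time 0 in every batch corresponds to the moment dissolved oxygen reaches 50%, so values at the same shifted time can be compared across batches.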
- the method of the present invention may then allow the user to specify a single time point for the continuous parameters that is the best predictor of the selected process outcome.
- the method of the present invention may allow different time points to be used for each of the parameters.
- the method of the present invention may use multiple time points for the continuous parameters based on some interval, e.g. every hour, where the value of each continuous parameter is the actual value at the given time point.
- the method of the present invention may employ the average or some other derivation of each parameter that includes a user-specified number of time points or a specified time interval on either side of a particular time point.
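The options above for choosing the value of each continuous parameter (the value at a time point, possibly a different time point per parameter, or an average over a window around it) can be sketched as follows. Linear interpolation between samples is an assumption made here to obtain a value at an arbitrary time; the function names are hypothetical.

```python
def value_at(series, t):
    """Value at time t, linearly interpolated between neighbouring samples.

    series: list of (time, value) pairs with strictly increasing times.
    Returns None when t lies outside the sampled range.
    """
    for (t0, v0), (t1, v1) in zip(series, series[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return None


def window_mean(series, t, half_width):
    """Mean of all samples within half_width of time t (None if there are none)."""
    values = [v for ti, v in series if abs(ti - t) <= half_width]
    return sum(values) / len(values) if values else None


def build_profile(batch, time_points, half_width=None):
    """One profile row: a value per parameter, each taken at its own time point.

    batch: {parameter_name: series}; time_points: {parameter_name: time}.
    With half_width set, the window average is used instead of the point value.
    """
    row = {}
    for name, t in time_points.items():
        if half_width is None:
            row[name] = value_at(batch[name], t)
        else:
            row[name] = window_mean(batch[name], t, half_width)
    return row
```

For example, `build_profile(batch, {"DO": 0.5, "temp": 2})` takes dissolved oxygen at time 0.5 and temperature at time 2, while passing `half_width=1` averages every sample within one time unit of each chosen point.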
- the method may then allow the user to specify a statistical technique, e.g., multiple regression, that will be performed at each time point that relates the values of the continuous parameters, as specified above, to the outcome parameter(s) of interest.
- the resulting statistical models may be presented to the user ordered by the time of the profile or in order of significance.
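Fitting a model at each time point and ordering the results can be sketched as below. This is an illustration only: it uses ordinary least squares solved through the normal equations in plain Python, and it ranks time points by the coefficient of determination (R squared) as a stand-in for "significance", which the text does not define. All function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares fit with intercept; returns (coefficients, r_squared).

    rows: one list of predictor values per batch; y: one outcome per batch.
    The outcome must not be constant, otherwise R squared is undefined.
    """
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[a] * xi[b] for xi in x) for b in range(k)] for a in range(k)]
    xty = [sum(xi[a] * yi for xi, yi in zip(x, y)) for a in range(k)]
    beta = solve(xtx, xty)
    predicted = [sum(b * v for b, v in zip(beta, xi)) for xi in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


def rank_time_points(profiles_by_time, outcome):
    """Fit one model per time point and return [(time, r_squared)], best fit first."""
    results = [(t, fit_ols(rows, outcome)[1]) for t, rows in profiles_by_time.items()]
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Sorting the same result list by time instead of by R squared gives the other ordering mentioned above.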
- Variations may consist of applying any of a number of statistical techniques to relate the values of multiple continuous parameters to a discrete outcome parameter, for example multiple linear regression, non-linear regression, principal component analysis, etc.
- Bioprocesses are an example of data-intensive process manufacturing processes to which the analysis method of the present invention may have particular utility.
- a typical pharmaceutical bioprocess will now be described.
- a pharmaceutical bioprocess may begin with raw materials like glucose (a sugar), salts (such as sodium chloride and phosphates), and water, along with a starting living cell culture of a microorganism like E. coli.
- the E. coli is inoculated and grown in a sterilized flask of these raw materials that has been equilibrated to, and then incubated at, a suitable temperature until it replicates into a large number of cells, approximately 1 X 10 per milliliter. At this point the cells have nearly depleted the raw materials in the flask and built up waste products. They must be transferred to a fresh sterile container with fresh raw materials so that they can continue growing. This is generally accomplished by transferring them to a seed fermentor.
- a fermentor is a sealable, stainless steel tank that contains raw materials in a water-based medium that can be sterilized by heat and pressure much like a pressure cooker. Once the medium is sterilized and cooled to the right temperature, the E. coli is transferred into it in an aseptic manner, i.e., in a manner that excludes the possibility of contamination by other, unwanted microorganisms. After this is done, the fermentor is stirred, sterile air and/or oxygen is pumped in, the pH, temperature, pressure and dissolved oxygen concentrations are held constant at controlled setpoints and additional sterile raw materials are often pumped in to promote growth of the microorganism.
- Once the E. coli cells have grown in number, depleted the raw materials in the seed fermentor and built up near-toxic levels of waste products, they are transferred progressively into larger sterile fermentors until reaching the final stage of fermentation, called a production fermentor.
- the production fermentor is operated in much the same way as the seed fermentors, accumulating more data for later use.
- At the end of the production fermentor stage of the process, the microorganism has produced the maximum level of the desired product and the product is ready to be harvested and purified.
- the maximum level of product may be produced naturally or by means of an induction mechanism that introduces natural or non-natural biochemical pathways.
- the harvest procedure consists of concentrating the cells away from the liquid fermentation broth in which they are suspended. This is usually done by, e.g., a filtration or centrifugation process. Once the cells are concentrated, and if they contain the product of interest, they may be broken, typically using a force called hydrodynamic shear. The resulting product concentrate is then ready for purification. If the product of interest does not accumulate in the cells but in the medium instead, then the medium is taken on for further processing in a similar way to the contents of the broken cells.
- much data is again collected from the measuring devices that control the various processes and from the batch records. Examples of data collected include pressures, weights, volumes, flow rates, temperatures, pH's, operator name, start time, stop time, amount of base/acid added, amount of waste, etc.
- the purification process that follows is aimed at achieving a higher concentration of the desired product at the same time as removing the contaminating materials. This is accomplished by subjecting the harvested mixture to separation techniques that selectively favor the desired product over the contaminating materials. These techniques may consist of any or all of the following: salt fractionation, selective precipitation, crystallization, size exclusion, affinity binding, hydrophobic separation, ion exchange, diafiltration, etc. During each of these processes, many measuring devices are used to make measurements and control the conditions in the process stream and much data is accumulated for later use. Examples of measurements include temperatures, flow rates, volumes, start times, stop times, ionic strength, pH, color, etc.
- Once the desired product has reached its maximum practical level of concentration and purity, which may be the result of several similar or different operations of the purification process, it passes into the final stage of manufacturing called filling and finishing. During this stage of the process, it is mixed with neutral carrier molecules so that it may be prepared in the right dosage form for administration to the patients who need it. Often it is also sterilized if it is intended for injection into patients. The final product is placed into suitable containers and labeled with the ingredients and expiration date. Once again, much data is accumulated during this part of the process for later use. Examples of data accumulated include the names of operators, the vendors of materials used, room air quality measurements, calibration dates, equipment service dates, pyrogen levels, particulates, color, pH, ionic strength, potency levels, contaminant levels, etc.
- the analysis method of the present invention employs Advanced Profile Analysis (APA) to search through time-based profiles of the continuous process parameters of a process manufacturing process to determine the time (or combination of times) that correspond to specific features in the profiles that provide the best statistical model for predicting an outcome parameter of interest.
- the analysis method of the present invention may be used to find the parameters and times most useful for predicting the outcome, and to determine as early in the fermentation process as possible if a product is likely to fail an in-process or final product specification later in the process. That way, time and resources need not be wasted producing a bad product.
- the analysis method of the present invention also allows users to select specific features of continuous data profiles of a process manufacturing process that are then extracted and quantified, and used as discrete parameters, singly or together, for statistical analysis in combination with other process parameters. This gives the ability to find out, for example, what upstream parameters may be driving specific features in the oxygen uptake rate of the production fermentation. These results may be used to give improved control within the physical constraints of the equipment or the cost constraints of the process. Users could also determine what features in the post-induction CER (Carbon Dioxide Evolution Rate) of the production fermentation, in combination with measurements made in the recovery process, are associated with a troublesome contaminant in the final product.
- the only evidence for the contaminant might be a shoulder on the main peak in one of the downstream chromatography steps; this shoulder may be quantified using a number of methods. This information may give better process control, lower failure rates and higher quality and predictability. Scale-up problems that might result if such determinations were made at a smaller experimental scale may be avoided if the analyses are done on data taken from full-scale operations.
- a user may select any number of continuous parameters from a single operation for any number of batches together with a discrete outcome parameter of interest, e.g., Product Impurity A.
- the continuous data from the parameter sets are aligned.
- the profile plots may not be aligned across batches, perhaps because product is transferred into a fermentor in different physiological states equating to different times in its growth phase. In this case, it may be appropriate to align the profiles at the point where the dissolved oxygen reaches, for example, 30%, and the system will have the ability to do so.
- Other means of aligning the parameter sets may also be applied.
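The alignment step described above can be illustrated with a minimal sketch. It assumes each batch is a list of dissolved oxygen readings taken at a fixed sampling interval, and it re-zeroes each batch's time axis at the first sample that reaches the criterion. The function names and the readings are hypothetical and are not taken from the source.

```python
from typing import Dict, List, Optional


def first_crossing_index(values: List[float], threshold: float) -> Optional[int]:
    """Return the index of the first sample that reaches the threshold.

    The crossing direction is inferred from the first sample, so the same
    function handles rising and falling profiles.
    """
    if not values:
        return None
    started_below = values[0] < threshold
    for i, v in enumerate(values):
        if started_below and v >= threshold:
            return i
        if not started_below and v <= threshold:
            return i
    return None


def align_batches(profiles: Dict[str, List[float]], threshold: float) -> Dict[str, List[int]]:
    """Return, per batch, a time axis (in samples) with 0 at the crossing."""
    aligned = {}
    for batch, values in profiles.items():
        idx = first_crossing_index(values, threshold)
        if idx is None:
            continue  # batch never reaches the criterion, so it is left out
        aligned[batch] = [t - idx for t in range(len(values))]
    return aligned


# Hypothetical %DO readings for two batches, aligned where %DO reaches 30.
do_profiles = {
    "104a": [95.0, 80.0, 55.0, 30.0, 20.0],
    "108a": [98.0, 90.0, 75.0, 50.0, 28.0, 15.0],
}
aligned_times = align_batches(do_profiles, threshold=30.0)
```

After alignment, time 0 in every batch corresponds to the same process event, so values read at a given offset are comparable across batches.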
- time-based profiles are chosen and values are assigned to temporary discrete parameters derived at the intersections of the profiles with the lines on the continuous data plots.
- the time-based profiles, and therefore the corresponding discrete parameter values may be chosen manually or by using an automatically generated time interval approach, e.g., run a model that chooses time-based profiles at thirty-minute intervals.
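As an illustration of the automatically generated time interval approach, the following sketch produces profile times at a fixed step and reads the parameter value at each one. It assumes the parameter was recorded at times that include the profile times; the function names and readings are hypothetical.

```python
from typing import Dict, List


def interval_profile_times(start_min: int, end_min: int, step_min: int = 30) -> List[int]:
    """Generate profile times (in minutes) at a fixed interval, end inclusive."""
    if step_min <= 0:
        raise ValueError("step_min must be positive")
    return list(range(start_min, end_min + 1, step_min))


def values_at_profiles(times_min: List[int], values: List[float],
                       profile_times: List[int]) -> Dict[int, float]:
    """Read the recorded value at each profile time that has a sample."""
    lookup = dict(zip(times_min, values))
    return {t: lookup[t] for t in profile_times if t in lookup}


# Hypothetical CER readings every 10 minutes over a two-hour window.
times = list(range(0, 121, 10))
cer = [20.0 + t / 10 for t in times]

profile_times = interval_profile_times(0, 120)
discrete = values_at_profiles(times, cer, profile_times)
```

Each entry of `discrete` plays the role of one temporary discrete parameter derived from the continuous profile.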
- the user may choose to use a single value of the profiled parameter at that time, or use a neighborhood mean for each of the profile parameters.
- the user may prefer to use the average of a certain number of nearby values, neighborhood values, in each profile of a particular parameter across all the batches to which the analysis method of the present invention is being applied.
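A neighborhood mean of this kind might be computed as in the sketch below. It assumes evenly spaced samples and simply truncates the window at the ends of the profile, which is an assumption and not something the source specifies. The function name is hypothetical.

```python
from typing import List


def neighborhood_mean(values: List[float], index: int, half_width: int) -> float:
    """Average the sample at `index` with up to `half_width` samples on each side.

    Near the start or end of the profile the window is truncated rather than
    padded (an assumption of this sketch).
    """
    lo = max(0, index - half_width)
    hi = min(len(values), index + half_width + 1)
    window = values[lo:hi]
    return sum(window) / len(window)


# Hypothetical once-per-minute readings: 0.0, 1.0, ..., 11.0
readings = [float(i) for i in range(12)]
centered = neighborhood_mean(readings, index=5, half_width=3)   # uses samples 2..8
near_start = neighborhood_mean(readings, index=1, half_width=3)  # uses samples 0..4
single = neighborhood_mean(readings, index=5, half_width=0)      # single value
```

Setting `half_width=0` reproduces the single-value option, so one function covers both choices described above.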
- the data is then analyzed.
- the user has the option of saving the time-based profile data as new parameters for further analysis, or using it directly in a series of regression models to determine the time-based profile, or combination of time-based profiles, that give the best overall model that predicts the chosen outcome.
- the user will be able to choose from among the standard regression techniques available in commercially available software, such as Discoverant®, a software product made by Aegis Analytical Corporation, the assignee of the present invention.
- the multiple regression feature of Discoverant allows a user to perform regression using more than one independent parameter.
- One example is to estimate the effects on dissolution rate, i.e. the response parameter, of different factors, i.e. the independent continuous parameters, such as time-based profiles of dryer air humidity, KW input to a wet granulating mixer, and/or the addition rate of granulating solution to a granulator.
- the user may perform multiple regression using the following techniques: all parameters, forward selection, backwards elimination, stepwise, etc. When the all parameters technique is employed, multiple regression is performed using all of the parameters selected, whether their coefficient estimates are significant or not.
- When the forward selection technique is employed, predictor parameters are added to the regression model one at a time.
- the candidate parameter for inclusion at each step is the one that produces the biggest decrease in residual sum of squares (SSE). If the calculated significance at each step is smaller than the specified p-value, then the parameter is added to the model. Otherwise the process is terminated.
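The forward selection loop can be sketched in plain Python as follows. One substitution is made for the sake of a dependency-free example: instead of comparing a calculated significance with a specified p-value, the sketch compares the partial F statistic with a user-supplied F-to-enter threshold, a closely related stopping rule. The helper names, the threshold of 4.0, and the data are hypothetical, and the solver assumes the candidate columns are not collinear.

```python
from typing import Dict, List


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def sse(columns: List[List[float]], y: List[float]) -> float:
    """Residual sum of squares of a least-squares fit with an intercept."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def forward_selection(candidates: Dict[str, List[float]], y: List[float],
                      f_to_enter: float = 4.0) -> List[str]:
    """Add, one at a time, the candidate giving the biggest decrease in SSE."""
    selected: List[str] = []
    current_sse = sse([], y)
    while len(selected) < len(candidates):
        remaining = [name for name in candidates if name not in selected]
        trial = {name: sse([candidates[s] for s in selected + [name]], y)
                 for name in remaining}
        best = min(trial, key=trial.get)
        dof = len(y) - (len(selected) + 2)  # residual degrees of freedom
        if dof <= 0:
            break
        if trial[best] <= 1e-12:
            partial_f = float("inf")  # perfect fit
        else:
            partial_f = (current_sse - trial[best]) / (trial[best] / dof)
        if partial_f < f_to_enter:
            break  # the best remaining candidate is not significant enough
        selected.append(best)
        current_sse = trial[best]
    return selected


# Hypothetical data: the outcome tracks the first candidate, not the second.
candidates = {
    "CER_5min": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "BaseFlow_5min": [6.0, 4.0, 4.0, 6.0, 6.0, 4.0, 4.0, 6.0],
}
outcome = [2.1, 3.9, 6.1, 7.9, 9.9, 12.1, 13.9, 16.1]
selected = forward_selection(candidates, outcome)
```

Backwards elimination follows the same pattern in reverse: start from all candidates and repeatedly drop the one whose removal increases SSE the least, until the stopping rule is met.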
- When the backwards elimination technique is employed, regression is performed starting with all of the parameters in the model.
- the candidate parameter considered for removal at each step is the one that produces the smallest increase in SSE from the previous step.
- the procedure terminates when all the parameters are removed from the model or the calculated significance is greater than the specified p-value.
- the stepwise procedure technique is a combination of the forward selection and backward elimination techniques.
- FIG. 1 illustrates in flow chart 100 how a preferred embodiment of the method of the present invention may be implemented in a software program.
- a visual display device not shown in FIG. 1
- the time-based profiles for various batches will be displayed on a graph or plot on a screen on the visual display device.
- a user creates an analysis group from raw data sources by using a method such as described in US Patent Application No. 09/816,547, entitled "System, Method and Computer Program Product for Mapping Data of Multi-Database Origins," the entire contents and disclosure of which is hereby incorporated by reference.
- the user selects continuous parameters of interest and particular batches to be analyzed for those continuous parameters.
- the user aligns the continuous data from each of the batches. The user selects a single continuous parameter such as the dissolved oxygen concentration in the fermentor, the carbon dioxide evolution rate of the E. coli cells, the RPM of the agitator, the airflow through the fermentation tank, the optical density of the fermentation broth, etc.
- the user chooses an alignment criterion, such as the point where dissolved oxygen reaches 80%. Then the user activates the display button on a screen displayed on a visual display device to display the aligned batches on the visual display device.
- the user selects a multiple regression method such as forward selection, backward elimination, etc. to analyze the aligned profiles.
- the user selects a single discrete outcome parameter of interest such as fermentation yield.
- the user selects multiple time-based profiles, either automatically at repeating intervals or interactively at particular times chosen by the user via on-screen interactions, upon which the previously selected regression method will be performed.
- When the "analyze button" is clicked, the software program first creates the data set from which to build the specified regression model. For example, if a user is analyzing 5 batches and 3 continuous parameters and wants to use neighborhood means of +/- 3 data points for profiles at 5 and 10 minutes, the data set might look like the profiles in Tables 1 and 2 below:
- Batch refers to batch number
- Yield refers to percentage yield
- CER refers to Carbon Dioxide Evolution Rate
- BaseFlow refers to the amount of base fed into the fermentor in liters/minute
- %DO refers to percentage of dissolved oxygen in the fermentor. Yield is the discrete dependent parameter and CER, BaseFlow and %DO are the continuous independent parameters.
- the value of 28.2 for CER is determined by taking the average of the actual value of CER at 5 minutes, and the values at 2, 3, 4, 6, 7, and 8 minutes, assuming that the parameter has values recorded once per minute.
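The construction of the data set can be sketched as follows, assuming once-per-minute readings indexed from minute 0 and a neighborhood of +/- 3 data points. Only the value 28.2 for CER at 5 minutes comes from the text; the individual readings below are hypothetical and were chosen so that the readings at minutes 2 through 8 average to 28.2. The function names are also hypothetical.

```python
from typing import Dict, List


def neighborhood_mean(series: List[float], minute: int, half_width: int) -> float:
    """Mean of the reading at `minute` and `half_width` readings on each side."""
    lo = max(0, minute - half_width)
    window = series[lo:minute + half_width + 1]
    return sum(window) / len(window)


def build_dataset(batches: Dict[str, Dict[str, List[float]]],
                  profile_minutes: List[int],
                  half_width: int) -> Dict[str, Dict[str, float]]:
    """One row per batch, one column per (parameter, profile time) pair."""
    rows = {}
    for batch, params in batches.items():
        rows[batch] = {
            f"{name}@{minute}min": round(neighborhood_mean(series, minute, half_width), 3)
            for name, series in params.items()
            for minute in profile_minutes
        }
    return rows


# Hypothetical once-per-minute CER readings for one batch, minutes 0..13.
batches = {
    "104a": {
        "CER": [26.0, 26.5, 27.0, 27.5, 28.0, 28.2, 28.5,
                29.0, 29.2, 29.5, 30.0, 30.2, 30.5, 31.0],
    },
}
rows = build_dataset(batches, profile_minutes=[5, 10], half_width=3)
```

Each row can then be joined with the discrete outcome (Yield in the example) to form the input to the chosen regression technique.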
- the user-specified regression technique may be used to construct the "best fit" model. Using stepwise regression produces results such as the following:
- the software program will create a statistical model, which predicts the outcome based on the values of the independent continuous parameters, for each time-based profile or a single model for all the profiles together. If there are multiple data models, the software program may provide one-line summaries for each, sorted by order of significance followed by more detailed output. If the selected dependent parameter, i.e., the outcome parameter of interest, is used to determine the success or failure of a batch, the earliest significant time profile could be used to predict the likelihood of success or failure of a future batch. In any event, establishing correlations with desired outcomes can provide insight into where to focus root cause investigations of failures, thus saving time and money.
- a particular implementation of a software program of the type illustrated in FIG. 1 for use on a particular process manufacturing process is described in the example below.
- Although the analysis method of the present invention has been described above with respect to analyzing an example of a bioprocess, the method of the present invention may be used to analyze other types of process manufacturing processes such as, for example, the production of tablets or capsules containing active drug substances, the production of chemicals or specialty chemicals, the production of refined minerals, the production of gasoline from crude oil, etc.
- results may be presented in written or printed form, audio form, or any other means of communicating the results to a user.
- the data was aligned.
- production batches or production runs do not always start in a fermentor in the same physiological state. Therefore they must be aligned so that comparisons from batch to batch at specific time intervals make sense.
- any number of parameters could be used to determine how much to time-shift the data from each batch, e.g., Carbon Dioxide Evolution Rate (CER).
- the weight of Tank A is the plotted parameter used for alignment.
- FIG. 2 shows the plots before alignment.
- the alignment criteria may be where the maximum or minimum occurs, or as in the present example, where the amount of base used intersects the value 2.
- TankWt A is highlighted to indicate the chosen independent continuous parameter.
- the highlighted items to the right of TankWt A, e.g., 104a, 108a, etc., indicate the batches that were chosen for the graphical display and subsequent analysis. Clicking the display button plots the values of the continuous parameter, TankWt A in this case, on the y-axis vs. the time (in minutes) from the start of the step on the x-axis.
- the screen shot of FIG. 3 shows how the plots are shifted using the method of the present invention so that they all intersect the value 2 on the y-axis at the same point in time. All the batches are shifted to the left to line up with the batch that intersects the value 2 the earliest. The amount by which each batch is shifted may in itself be an important parameter to correlate with various outcomes of interest.
- the x-axis is rescaled to accommodate the batch that has to shift the most to the left to achieve the alignment criteria. For example, if the batch that takes the longest time to reach the value 2 takes 350 minutes longer than the fastest one to reach 2, the minimum value on the x-axis has to be no larger than -350 to show all the data.
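The shift amounts and the x-axis bound described above can be worked through in a short sketch. It assumes the time at which each batch reaches the alignment value is already known. The crossing times and the batch "112a" are hypothetical; they are chosen so that the slowest batch lags the fastest by 350 minutes, as in the example.

```python
from typing import Dict


def shifts_to_earliest(crossing_minutes: Dict[str, float]) -> Dict[str, float]:
    """Minutes each batch must be shifted left to line up with the earliest one."""
    earliest = min(crossing_minutes.values())
    return {batch: t - earliest for batch, t in crossing_minutes.items()}


# Hypothetical times (minutes from start of step) at which each batch reaches 2.
crossings = {"104a": 120.0, "108a": 470.0, "112a": 200.0}

shifts = shifts_to_earliest(crossings)

# The x-axis minimum must be no larger than this to show all shifted data.
x_axis_min_bound = -max(shifts.values())
```

Because the shift itself may correlate with outcomes of interest, `shifts` could also be kept as a discrete parameter in its own right.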
- each vertical line in the graph represents a time-based profile at a particular time. Consequently, a new discrete parameter will be created for TankWt A at each of the selected time profiles where the profile line intersects that parameter. For each batch, the created values correspond to values of TankWt A at each of the time points that correspond to intersection of the time-based profile lines with the continuous parameter.
- an option is provided to use neighborhood means instead.
- an average may be taken using the single intersection value and including a specified number of points on both sides of it to help average out noisy (i.e. highly variable) parameter values, thus resulting in more robust regression models.
- a linear interpolation may be used so that there will be a complete data set.
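A linear interpolation of this kind might look like the sketch below, for the case where a batch has no recorded sample exactly at a profile time. It assumes sample times are in ascending order and, as a further assumption, holds the first or last value constant outside the recorded range. The function name and data are hypothetical.

```python
from typing import List


def interpolate_at(times: List[float], values: List[float], t: float) -> float:
    """Linearly interpolate the value at time t from ascending sample times."""
    if t <= times[0]:
        return values[0]   # before the first sample: hold the first value
    if t >= times[-1]:
        return values[-1]  # after the last sample: hold the last value
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        if t0 <= t <= t1:
            v0, v1 = values[i], values[i + 1]
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("times must be in ascending order")


# Hypothetical irregular samples: nothing was recorded at minute 5.
sample_times = [0.0, 4.0, 10.0]
sample_values = [1.0, 3.0, 6.0]
at_five = interpolate_at(sample_times, sample_values, 5.0)
at_two = interpolate_at(sample_times, sample_values, 2.0)
```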
- all the values that are used in the program to build the regression model may be saved as new parameters so that they may be used elsewhere in the system to create additional models, e.g., principal component regression.
- a more typical situation is when there are multiple continuous parameters chosen to determine the profile that results in the best model in terms of predicting the outcome of interest, also known as the 'Dependent Parameter.'
- the screenshot of FIG. 5 shows five continuous parameters and a series of selected time-based profiles. To avoid having potentially hundreds of lines on a single graph, there is a multi-page display where each page shows the chosen continuous parameters for each batch.
- the y-axis provides a standardized or normalized scale that ranges from 0 to 100, i.e. range-scaled.
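Range scaling to a 0 to 100 axis can be sketched as below, so that parameters with different units share one plot. The sketch assumes the minimum and maximum are taken over the values being plotted; the source does not say which range is used. The function name and readings are hypothetical.

```python
from typing import List


def range_scale(values: List[float]) -> List[float]:
    """Map values linearly so the minimum becomes 0 and the maximum becomes 100."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # flat profile: no range to scale by
    return [100.0 * (v - lo) / (hi - lo) for v in values]


# Hypothetical tank weight readings.
scaled = range_scale([200.0, 250.0, 300.0])
```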
- the user can select the Point ID mode and click on a point. The system will then put a label on the plot that indicates the time offset and actual value.
- MassRate A, Rate PI, TankWt A, TankWt B, and TankWt C are highlighted to indicate the chosen independent continuous parameters.
- the highlighted items to the right, e.g., 104a, 108a, etc., indicate the batches that are chosen for the graphical display and subsequent analysis. Clicking the display button plots the values of the continuous parameters (in terms of % of range) on the y-axis vs. the time (in minutes) from the start of the step on the x-axis.
- a calculated p-value of less than 0.05 is generally considered to indicate that the model is statistically significant, i.e., the parameters in the model can be reliably used to help predict the value of the dependent (outcome) parameter of interest; Response A in the present example.
- Column 6 gives the value of R-Squared, which is the percentage of variability in the dependent parameter that can be explained by the model.
- Column 7 is an adjusted R-Squared, which is similar to R-Squared but is adjusted downward to account for the number of parameters and degrees of freedom and to help prevent misinterpretation from over fitting the model.
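The two statistics can be computed as in the sketch below, which uses the standard definitions: R-Squared is one minus the ratio of the residual sum of squares to the total sum of squares, and the adjustment scales the unexplained fraction by (n - 1) / (n - p - 1) for n batches and p predictor parameters. The source does not state which formula the software uses, so this is an assumption, and the numbers are hypothetical.

```python
def r_squared(sse: float, sst: float) -> float:
    """Fraction of variability in the dependent parameter explained by the model."""
    return 1.0 - sse / sst


def adjusted_r_squared(r2: float, n_obs: int, n_predictors: int) -> float:
    """R-Squared adjusted downward for the number of predictors in the model."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("not enough observations for this many predictors")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Hypothetical model: 12 batches, 1 predictor, SSE = 20, total SS = 100.
r2 = r_squared(sse=20.0, sst=100.0)
r2_adj = adjusted_r_squared(r2, n_obs=12, n_predictors=1)
```

Adding a predictor always leaves R-Squared the same or higher, whereas the adjusted value can fall, which is why it helps guard against over fitting.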
- a one-page detailed output is provided for each model so the user can see which of the continuous parameters are the most significant.
- Column 1 of the first table provides the parameter name
- Column 2 provides the coefficient estimate for the parameter
- Column 3 provides the standard error associated with the coefficient estimate
- Column 4 provides the calculated t-Statistic (Column 2/Column 3)
- Column 5 provides the calculated p-value that indicates the level of significance for each of the parameters.
- the second table provides the sum of squares and degrees of freedom for each source, along with mean square errors for the model and residual, and calculated F-Ratio and p-value that indicates the level of significance of the overall model.
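The arithmetic behind the two tables can be sketched briefly. The t-Statistic is the coefficient estimate divided by its standard error, and the F-Ratio is the model mean square divided by the residual mean square. The p-values are left out because they require the t and F distributions, which are outside the Python standard library. The function names and numbers are hypothetical.

```python
def t_statistic(coefficient: float, standard_error: float) -> float:
    """Coefficient estimate divided by its standard error."""
    return coefficient / standard_error


def f_ratio(ss_model: float, df_model: int, ss_resid: float, df_resid: int) -> float:
    """Model mean square divided by residual mean square."""
    ms_model = ss_model / df_model
    ms_resid = ss_resid / df_resid
    return ms_model / ms_resid


# Hypothetical values for one parameter and one overall model.
t_value = t_statistic(coefficient=2.0, standard_error=0.5)
f_value = f_ratio(ss_model=80.0, df_model=2, ss_resid=20.0, df_resid=10)
```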
- the R-Squared and adjusted R-Squared values are the same in the one-line summaries in FIG. 6.
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Manufacturing & Machinery (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Automation & Control Theory (AREA)
- Cold Cathode And The Manufacture (AREA)
- Paper (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2002323532A AU2002323532A1 (en) | 2001-09-12 | 2002-09-04 | An advanced method for profile analysis of continuous data |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US31832901P | 2001-09-12 | 2001-09-12 | |
US60/318,329 | 2001-09-12 | ||
US15361302A | 2002-05-24 | 2002-05-24 | |
US10/153,613 | 2002-05-24 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2003023687A2 true WO2003023687A2 (fr) | 2003-03-20 |
WO2003023687A3 WO2003023687A3 (fr) | 2003-08-28 |
Family
ID=26850705
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2002/027805 WO2003023687A2 (fr) | 2001-09-12 | 2002-09-04 | Procede perfectionne destine a analyser le profil de donnees continues |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2002323532A1 (fr) |
WO (1) | WO2003023687A2 (fr) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1517207A2 (fr) * | 2003-09-16 | 2005-03-23 | Voith Paper Patent GmbH | Système informatisé de mesure des données de qualité et/ou de processus |
GB2453035A (en) * | 2007-09-15 | 2009-03-25 | Curvaceous Software Ltd | Control of a multi-variable process. |
WO2009052080A1 (fr) * | 2007-10-17 | 2009-04-23 | Nalco Company | Procédé d'analyse de données de procédé de fabrication |
GB2477447A (en) * | 2006-09-13 | 2011-08-03 | Fisher Rosemount Systems Inc | Analyzing batch runs by using a batch signature to perform a user interface function |
US9086695B2 (en) | 2006-09-13 | 2015-07-21 | Fisher-Rosemount Systems, Inc. | Compact batch viewing techniques for use in batch processes |
US9275367B2 (en) | 2011-05-05 | 2016-03-01 | Aegis Analytical Corporation | System for designating, displaying and selecting types of process parameters and product outcome parameters |
US9927788B2 (en) | 2011-05-19 | 2018-03-27 | Fisher-Rosemount Systems, Inc. | Software lockout coordination between a process control system and an asset management system |
WO2019094787A1 (fr) * | 2017-11-09 | 2019-05-16 | Zymergen Inc. | Optimisation d'organismes pour une performance dans des conditions à plus grande échelle à partir d'une performance dans des conditions à plus petite échelle |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6243615B1 (en) * | 1999-09-09 | 2001-06-05 | Aegis Analytical Corporation | System for analyzing and improving pharmaceutical and other capital-intensive manufacturing processes |
-
2002
- 2002-09-04 AU AU2002323532A patent/AU2002323532A1/en not_active Abandoned
- 2002-09-04 WO PCT/US2002/027805 patent/WO2003023687A2/fr not_active Application Discontinuation
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6243615B1 (en) * | 1999-09-09 | 2001-06-05 | Aegis Analytical Corporation | System for analyzing and improving pharmaceutical and other capital-intensive manufacturing processes |
Non-Patent Citations (2)
Title |
---|
'Application note 4.1 fermentation monitoring. Affinity sensors' IASYS. 1996, pages 1 - 4, XP002961668 * |
BERRY M.J. ET AL.: 'Assay and purification of Fv fragments in fermenter cultures: design and evaluation of generic binding reagents' JOURNAL OF IMMUNOLOGICAL METHODS vol. 167, 1994, pages 173 - 182, XP002961667 * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1517207A3 (fr) * | 2003-09-16 | 2009-10-21 | Voith Patent GmbH | Système informatisé de mesure des données de qualité et/ou de processus |
EP1517207A2 (fr) * | 2003-09-16 | 2005-03-23 | Voith Paper Patent GmbH | Système informatisé de mesure des données de qualité et/ou de processus |
GB2477447B (en) * | 2006-09-13 | 2012-02-01 | Fisher Rosemount Systems Inc | Compact batch viewing and alarm generation techniques using a batch signature for use in batch processes |
GB2477447A (en) * | 2006-09-13 | 2011-08-03 | Fisher Rosemount Systems Inc | Analyzing batch runs by using a batch signature to perform a user interface function |
US9086695B2 (en) | 2006-09-13 | 2015-07-21 | Fisher-Rosemount Systems, Inc. | Compact batch viewing techniques for use in batch processes |
GB2453035B (en) * | 2007-09-15 | 2010-11-24 | Curvaceous Software Ltd | Multi-variable operations |
GB2453035A (en) * | 2007-09-15 | 2009-03-25 | Curvaceous Software Ltd | Control of a multi-variable process. |
US8380328B2 (en) | 2007-09-15 | 2013-02-19 | Process Plant Computing Limited | Method and system for monitoring and controlling a multi-variable process throughout a plurality of distinct phases of the process |
WO2009052080A1 (fr) * | 2007-10-17 | 2009-04-23 | Nalco Company | Procédé d'analyse de données de procédé de fabrication |
US9275367B2 (en) | 2011-05-05 | 2016-03-01 | Aegis Analytical Corporation | System for designating, displaying and selecting types of process parameters and product outcome parameters |
US9927788B2 (en) | 2011-05-19 | 2018-03-27 | Fisher-Rosemount Systems, Inc. | Software lockout coordination between a process control system and an asset management system |
WO2019094787A1 (fr) * | 2017-11-09 | 2019-05-16 | Zymergen Inc. | Optimisation d'organismes pour une performance dans des conditions à plus grande échelle à partir d'une performance dans des conditions à plus petite échelle |
CN111886330A (zh) * | 2017-11-09 | 2020-11-03 | 齐默尔根公司 | 基于在较小规模条件下的性能优化在较大规模条件下的有机体性能 |
Also Published As
Publication number | Publication date |
---|---|
AU2002323532A1 (en) | 2003-03-24 |
WO2003023687A3 (fr) | 2003-08-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wagih et al. | SGAtools: one-stop analysis and visualization of array-based genetic interaction screens | |
Royce | A discussion of recent developments in fermentation monitoring and control from a practical perspective | |
Konstantinov | Monitoring and control of the physiological state of cell cultures | |
WO2008010005A1 (fr) | Procédé d'optimisation en ligne d'une unité de fermentation à écoulement discontinu à des fins d'optimisation de la productivité | |
US20090048816A1 (en) | Method for on-line prediction of future performance of a fermentation unit | |
CN109154588A (zh) | 自动化生物过程开发 | |
WO2018229802A1 (fr) | Procédé de prédiction du résultat et de modélisation d'un processus dans un bioréacteur | |
WO2003023687A2 (fr) | Procede perfectionne destine a analyser le profil de donnees continues | |
Knijnenburg et al. | Exploiting combinatorial cultivation conditions to infer transcriptional regulation | |
WO2019111310A1 (fr) | Système d'analyse automatique | |
CN109684468B (zh) | 针对循证医学的文献筛选标注系统 | |
Alberghina et al. | Flow cytometry and cell cycle kinetics in continuous and fed-batch fermentations of budding yeast | |
Henson | Biochemical reactor modeling and control | |
Batut et al. | Hands-on: Hands-on: Reference-based RNA-Seq data analysis | |
Swartz | The use of economic analysis of penicillin G manufacturing costs in establishing priorities for fermentation process improvement | |
WO2001043038A1 (fr) | Localisation d'echantillons d'etudes cliniques, d'informations et de resultats | |
Wentworth et al. | Self-cycling fermentation of a citric acid producing strain of Candida lipolytica | |
US11519889B2 (en) | Biological sample automatic analysis system | |
Mariani et al. | Yeast population models for monitoring and control of biotechnical processes | |
WO2021076855A1 (fr) | Procédés de régulation de monosaccharide dynamique | |
Pieracci et al. | 15 Leveraging Multivariate Analysis Tools to Qualify Scaled-Down Models | |
Coleman | Establishment of a novel Pichia Pastoris host production platform | |
US20230002708A1 (en) | Adapting control of a cell culture in a production scale vessel with regard to a starting medium | |
Endo et al. | A database system and an expert system for realizing factory automation in the bioindustries | |
JPWO2019111312A1 (ja) | 生体試料分析システム |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BY BZ CA CH CN CO CR CU CZ DE DM DZ EC EE ES FI GB GD GE GH HR HU ID IL IN IS JP KE KG KP KR LC LK LR LS LT LU LV MA MD MG MN MW MX MZ NO NZ OM PH PL PT RU SD SE SG SI SK SL TJ TM TN TR TZ UA UG UZ VC VN YU ZA ZM |
|
AL | Designated countries for regional patents |
Kind code of ref document: A2 Designated state(s): GH GM KE LS MW MZ SD SL SZ UG ZM ZW AM AZ BY KG KZ RU TJ TM AT BE BG CH CY CZ DK EE ES FI FR GB GR IE IT LU MC PT SE SK TR BF BJ CF CG CI GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
122 | Ep: pct application non-entry in european phase | ||
NENP | Non-entry into the national phase |
Ref country code: JP |
|
WWW | Wipo information: withdrawn in national office |
Country of ref document: JP |