WO2003023687A2 - Improved method for analyzing the profile of continuous data - Google Patents

Improved method for analyzing the profile of continuous data

Info

Publication number
WO2003023687A2
Authority
WO
WIPO (PCT)
Prior art keywords
parameter
data
readable medium
machine readable
aligning
Prior art date
Application number
PCT/US2002/027805
Other languages
English (en)
Other versions
WO2003023687A3 (fr)
Inventor
Justin Neway
Brent Rognlie
Original Assignee
Aegis Analytical Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aegis Analytical Corporation
Priority to AU2002323532A
Publication of WO2003023687A2
Publication of WO2003023687A3

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B19/00 Programme-control systems
    • G05B19/02 Programme-control systems electric
    • G05B19/418 Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM]
    • G05B19/41875 Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM] characterised by quality surveillance of production
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00 Program-control systems
    • G05B2219/30 Nc systems
    • G05B2219/32 Operator till task planning
    • G05B2219/32187 Correlation between controlling parameters for influence on quality parameters
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00 Program-control systems
    • G05B2219/30 Nc systems
    • G05B2219/32 Operator till task planning
    • G05B2219/32194 Quality prediction
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Definitions

  • the present invention relates to a method for analyzing the data from a production process.
  • a method for analysis comprising the steps of: generating plots for each of a plurality of batches of a process manufacturing process based on data for at least one continuous parameter; aligning the plots based on at least one aligning continuous parameter of each of the plurality of batches; selecting a plurality of profiles for the aligned plots; analyzing the profiles using a regression method to provide analysis results indicating the level of success of the process manufacturing process; and displaying the analysis results to a user.
  • a machine readable medium storing instructions that, if executed by a computer system, cause the computer system to perform a set of operations comprising: generating plots for each of a plurality of batches of a process manufacturing process based on data for at least one continuous parameter; aligning the plots based on at least one aligning continuous parameter of each of the plurality of batches; selecting a plurality of profiles for the aligned plots; analyzing the profiles using a regression method to provide analysis results indicating the level of success of the process manufacturing process; and displaying the analysis results to a user.
  • a method for analysis comprising the steps of: generating plots for each of a plurality of batches of a process manufacturing process based on data for at least one continuous parameter; aligning the plots based on at least one aligning continuous parameter of each of the plurality of batches; selecting a plurality of profiles for the aligned plots; analyzing the profiles using a regression method to provide analysis results indicating the level of success of the process manufacturing process; and storing the analysis results in a machine readable medium.
  • a machine readable medium storing instructions that, if executed by a computer system, cause the computer system to perform a set of operations comprising: generating plots for each of a plurality of batches of a process manufacturing process based on data for at least one continuous parameter; aligning the plots based on at least one aligning continuous parameter of each of the plurality of batches; selecting a plurality of profiles for the aligned plots; analyzing the profiles using a regression method to provide analysis results indicating the level of success of the process manufacturing process; and storing the analysis results in a second machine readable medium.
  • FIG. 1 is a flow chart illustrating a preferred embodiment of the method of the present invention;
  • FIG. 2 is a screenshot of the operation of a program used to implement a preferred embodiment of the method of the present invention showing plots for parameters of a process being analyzed prior to alignment;
  • FIG. 3 is a screenshot of the operation of the program of FIG. 2 showing how the plots of FIG. 2 may be shifted using a preferred embodiment of the method of the present invention;
  • FIG. 4 is a screenshot of the operation of the program of FIG. 2 showing one parameter on which a linear regression may be carried out in accordance with a preferred embodiment of the method of the present invention;
  • FIG. 5 is a screenshot of the operation of the program of FIG. 2 showing five continuous parameters and a series of selected time-based profiles for a process being analyzed by the program of FIG. 2;
  • FIG. 6 is a screenshot of the operation of the program of FIG. 2 showing one-line summaries of the multiple regression analysis output of the process being analyzed by the program of FIG. 2;
  • FIG. 7 is a screenshot of the operation of the program of FIG. 2 showing a detailed output of a model of the process being analyzed by the program of FIG. 2.
  • a "raw data" source refers to unadjusted data contained in the original data sources.
  • profile refers to the collection of values of one or more continuous parameters of one or more "parameter sets" at a given common time point.
  • the term "the level of success of a manufacturing process" refers to the current relationship of the process to the predicted outcome of interest, such as an impurity or yield outcome parameter.
  • process manufacturing process refers to a process that uses processing steps that exert chemical and physical changes on the raw materials and intermediate materials so that they are physically transformed into products that no longer resemble the starting materials.
  • Examples of process manufacturing processes are the production of antibiotics or genetically engineered proteins from sugars and salts using living microorganisms, or the production of gasoline from crude oil.
  • discrete manufacturing process refers to a process that is essentially an assembly operation. Examples of discrete manufacturing are the manufacturing of an automobile or a shirt that consists of assembling various parts to make a whole. The finished product looks like an assembly of the parts that the process started with. Some of the parts used to assemble a car may be produced by process manufacturing processes. For example, an engine block for an automobile is made from molten alloy that is produced by a process manufacturing process starting with chunks of mineral ores that are melted and processed so that they no longer resemble the mined starting materials.
  • bioprocess refers to any process manufacturing process that involves the use of cell cultures, including living cell cultures such as bacterial cultures.
  • the term "user" refers not only to end-users of software employing the method of the present invention, but also to individuals, such as software developers or database designers, who carry out one or more steps of the method of the present invention.
  • process refers to any process.
  • the method of the present invention may be used to access and analyze data from processes for producing one or more products including manufacturing processes, purification processes, chemical synthesis processes, etc. or may be used for other types of processes such as tracking the shipment of goods, tracking inventory in a store, etc.
  • a process of the present invention includes one or more steps.
  • the term "parameter" refers to any property or characteristic used to classify an individual piece or multiple pieces of data. For the purposes of the present invention, there are two characteristics of "parameters": identification codes and "parameter values." Any parameter that is not used as an identification code for an analysis group is a parameter value. Parameters may include characteristics such as the temperature at a particular time, the pH of a solution, the purity of a compound, the source of a raw material, etc.
  • parameter value refers to the specific piece of data resulting from a measurement associated with a specific parameter.
  • Examples of specific parameter values include the particular batch number for a batch, the measured temperature associated with a batch material at a particular time in its production cycle, the test outcome for a specific parameter, etc.
  • identification code refers to a code, name etc. that uniquely identifies a particular parameter.
  • parameter set refers to a group of parameters that relate to the same batch of manufactured product.
  • a parameter set may be obtained from a single data set or multiple data sets.
  • a parameter set may have one or more "parameter values" associated with each parameter in the parameter set.
  • equipment parameter refers to a parameter relating to one or more pieces of equipment used in a process manufacturing process of the present invention.
  • Examples of equipment parameters include: RPMs of an agitator, pressure of a vessel, etc.
  • the term "material parameter" refers to a parameter relating to a material that is processed by a process manufacturing process of the present invention.
  • Examples of material parameters include: pH of a solution, concentration of metabolite, temperature of a liquid, etc.
  • environmental parameter refers to a parameter relating to the environment to which material processed by the process manufacturing process is exposed.
  • Examples of environmental parameters include: room air temperature, humidity, dew point, etc.
  • the terms "material parameter", "environmental parameter" and "equipment parameter" refer to different species of parameters.
  • raw material refers to starting materials used in a process for producing a product.
  • intermediate material refers to a material produced at any point in the process prior to producing the final product of the process.
  • An intermediate material may be produced by manufacturing the intermediate material from raw materials or other intermediate materials, by purifying raw materials or other intermediate materials, by the synthesis from raw materials or other intermediate materials, etc.
  • batch refers to a given amount of product and the materials and conditions used to make that given amount of product, regardless of the amount of raw materials used, the amount of product produced, or the time taken to produce a given amount of product.
  • Several types of discrete data, continuous data, and replicate data may all be related to a particular batch of product.
  • the term batch as used in the present invention may refer to a production run of several hours, days, weeks, months, etc.
  • the term "data source" refers to any source of data such as a database or a data storage file, data directly produced by a measurement device, data electronically sent from a remote location, data entered into a database from paper records, etc. Two data sources are considered to be "different" if the data sources employ different file formats or different data structures or have different physical locations.
  • the term "data set" refers to a set of data or a database. A data set may be classified into a particular "complete data set type" based on the data set's primary data set type, secondary data set type and the same tertiary data set type.
  • discrete data refers to data parameter values that are obtained only once during the process of producing one batch of product. Examples of discrete data include: the amount of an ingredient added at some step in a process, the source of an ingredient added at a particular step in a process, the date of production of an ingredient used in a process, etc.
  • continuous data refers to data parameter values that are obtained at several times during the same step of the process of producing a batch of product, with each collection having an associated time.
  • Examples of continuous data include: the temperature at a particular step of a process measured in 5 second intervals for the duration of the step, the moisture content of the effluent air at a particular step measured in 10 second intervals for the duration of the step, the amount of contamination present at a particular step measured in 15 minute intervals, etc.
  • replicate data refers to data parameter values that are obtained from several measurements of the same parameter made independent of the time of the measurement, i.e. replicate data includes data obtained from multiple measurements of the same parameter made at the same time and data obtained from multiple measurements of the same parameter taken with no regard as to the time that the measurements were made. Replicate data may also be discrete data or continuous data.
  • an analysis group refers to a collection of parameter sets that may be selected by a user wherein all of the parameter sets meet the "parameter restrictions" for one or more parameters.
  • an analysis group could include all of the parameter sets which have median temperature parameter values of 35 to 38° C for three different time points, a minimum pH parameter value above 7, have the same raw materials supplier parameter, have a raw materials supplied date parameter value of January, etc.
  • An analysis group is a structured data container that supports fast, efficient utilization of data via standardized interfaces. The structure of an analysis group permits it to hold all types of data concurrently, e.g. discrete, continuous, replicate, etc.
  • An analysis group can be thought of as a sparsely populated multidimensional data cube, with parameter sets (that relate to individual batches of manufactured product) making up one axis, parameter names making up another axis, and time offsets (for continuous parameters) making up another axis, and replicate information making up another axis.
  • Analysis groups also allow the dynamic creation of additional parameters within the analysis groups, allow for the data within them to be subsetted for subsequent operations and allow themselves to be updated with new data from the data sources on an on-demand basis. Analysis groups of the type employed by the method of the present invention are described in greater detail in U.S. Patent Application No.
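The "sparsely populated multidimensional data cube" described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the class name `AnalysisGroup`, its methods, and the dictionary-backed storage are hypothetical and are not taken from the patent or from any Aegis product. Only populated cells are stored, keyed by the four axes named in the text (parameter set/batch, parameter name, time offset, replicate).

```python
class AnalysisGroup:
    """Hypothetical sparse cube: only cells that hold data are stored."""

    def __init__(self):
        # Key: (batch, parameter, time_offset, replicate) -> value
        self._cells = {}

    def put(self, batch, parameter, value, time_offset=None, replicate=0):
        """Store one value; time_offset=None marks a discrete parameter."""
        self._cells[(batch, parameter, time_offset, replicate)] = value

    def discrete(self, batch, parameter):
        """Return the discrete value for a batch, or None if the cell is empty."""
        return self._cells.get((batch, parameter, None, 0))

    def series(self, batch, parameter):
        """Return (time_offset, value) pairs of a continuous parameter, by time."""
        points = [
            (t, v)
            for (b, p, t, _rep), v in self._cells.items()
            if b == batch and p == parameter and t is not None
        ]
        return sorted(points)

    def subset(self, keep_batch):
        """Return a new group holding only batches for which keep_batch(batch) is true."""
        sub = AnalysisGroup()
        sub._cells = {k: v for k, v in self._cells.items() if keep_batch(k[0])}
        return sub
```

A dictionary keyed by tuples keeps discrete, continuous and replicate data in one container without allocating the empty cells of a dense array, which is one plausible reading of "sparsely populated".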
  • the term "computer system" refers to any type of computer system that implements software including an individual computer such as a personal computer, mainframe computer, mini-computer, etc. or a network of computers, such as a network of computers in a business, the Internet, personal data assistant, cell phone, etc.
  • visual display device includes any type of visual display device such as a CRT monitor, LCD screen, etc.

Description

  • the present invention provides a method for statistical analysis of large multi-parameter data sets and finding correlation between values of multiple continuous parameters and selected outcome parameter(s).
  • finding the correlation between the multiple continuous parameters and the selected outcome parameters is done through modeling these large sets of data.
  • the continuous parameters along with the discrete outcome parameter are first fitted to a statistical model, such as linear regression or multiple regression or non-linear regression.
  • the output from the statistical model is a mathematical relationship between the continuous parameters and the selected outcome parameter.
  • the selected outcome parameter may then be predicted at several time points during the process to decide or determine the fate of the manufacturing process with a high degree of certainty and for better process control.
  • APA Advanced Profile Analysis
  • the multiple continuous parameters may come from a single piece of equipment or unit operation, such as a fermentor or from multiple pieces of equipment at one or more process steps.
  • the method may first allow the user to align all the parameters across multiple batches or production runs based on some criterion such as time at which a certain parameter value occurs, e.g., when dissolved oxygen reaches 50%. This allows the absolute times at which all the parameters are measured for a particular batch to be adjusted by the amount needed to align that batch to a process state similar to the rest of the batches.
  • the method of the present invention may then allow the user to specify a single time point for the continuous parameters that is the best predictor of the selected process outcome.
  • the method of the present invention may allow different time points to be used for each of the parameters.
  • the method of the present invention may use multiple time points for the continuous parameters based on some interval, e.g. every hour, where the value of each continuous parameter is the actual value at the given time point.
  • the method of the present invention may employ the average or some other derivation of each parameter that includes a user-specified number of time points or a specified time interval on either side of a particular time point.
  • the method may then allow the user to specify a statistical technique, e.g., multiple regression, that will be performed at each time point that relates the values of the continuous parameters, as specified above, to the outcome parameter(s) of interest.
  • the resulting statistical models may be presented to the user ordered by the time of the profile or in order of significance.
  • Variations may consist of using any of a number of statistical techniques to apply to the method of relating the values of multiple continuous parameters to a discrete outcome parameter. For example, multiple linear regression, non-linear regression, principal component analysis, etc.
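As a rough sketch of "a statistical technique ... performed at each time point", the following fits an ordinary least-squares multiple regression at every profile time and orders the resulting models. The function names, the data layout (a dictionary from profile time to a batches-by-parameters array), and the use of R² as the measure of "significance" are assumptions made for illustration; the text does not specify how significance is computed.

```python
import numpy as np

def fit_profile(X, y):
    """Least-squares fit of y on X with an intercept; returns (coefficients, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sse = float(resid @ resid)                    # residual sum of squares
    sst = float(((y - y.mean()) ** 2).sum())      # total sum of squares
    return coef, 1.0 - sse / sst

def rank_profiles(profiles, y):
    """Fit one model per profile time and order the models by R^2, best first.

    profiles: dict mapping profile time -> array of shape (n_batches, n_parameters)
    y:        outcome parameter, one value per batch
    """
    results = [(t, *fit_profile(X, y)) for t, X in profiles.items()]
    return sorted(results, key=lambda r: r[2], reverse=True)
```

Sorting the same list by `r[0]` instead would give the alternative presentation mentioned above, ordered by the time of the profile.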
  • Bioprocesses are an example of data-intensive process manufacturing processes to which the analysis method of the present invention may have particular utility.
  • a typical pharmaceutical bioprocess will now be described.
  • a pharmaceutical bioprocess may begin with raw materials like glucose (a sugar), salts (such as sodium chloride and phosphates), and water, along with a starting living cell culture of a microorganism like E. coli.
  • the E. coli is inoculated and grown in a sterilized flask of these raw materials that has been equilibrated to, and then incubated at, a suitable temperature until it replicates into a large number of cells, approximately 1 X 10 per milliliter. At this point the cells have nearly depleted the raw materials in the flask and built up waste products. They must be transferred to a fresh sterile container with fresh raw materials so that they can continue growing. This is generally accomplished by transferring them to a seed fermentor.
  • a fermentor is a sealable, stainless steel tank that contains raw materials in a water-based medium that can be sterilized by heat and pressure much like a pressure cooker. Once the medium is sterilized and cooled to the right temperature, the E. coli is transferred into it in an aseptic manner, i.e., in a manner that excludes the possibility of contamination by other, unwanted microorganisms. After this is done, the fermentor is stirred, sterile air and/or oxygen is pumped in, the pH, temperature, pressure and dissolved oxygen concentrations are held constant at controlled setpoints and additional sterile raw materials are often pumped in to promote growth of the microorganism.
  • Once the E. coli cells have grown in number, depleted the raw materials in the seed fermentor and built up near toxic levels of waste products, they are transferred progressively into larger sterile fermentors until reaching the final stage of fermentation called a production fermentor.
  • the production fermentor is operated in much the same way as the seed fermentors, accumulating more data for later use.
  • At the end of the production fermentor stage of the process, the microorganism has produced the maximum level of the desired product and the product is ready to be harvested and purified.
  • the maximum level of product may be produced naturally or by means of an induction mechanism that introduces natural or non-natural biochemical pathways.
  • the harvest procedure consists of concentrating the cells away from the liquid fermentation broth in which they are suspended. This is usually done by, e.g., a filtration or centrifugation process. Once the cells are concentrated, and if they contain the product of interest, they may be broken, typically using a force called hydrodynamic shear. The resulting product concentrate is then ready for purification. If the product of interest does not accumulate in the cells but in the medium instead, then the medium is taken on for further processing in a similar way to the contents of the broken cells.
  • much data is again collected from the measuring devices that control the various processes and from the batch records. Examples of data collected include pressures, weights, volumes, flow rates, temperatures, pH's, operator name, start time, stop time, amount of base/acid added, amount of waste, etc.
  • the purification process that follows is aimed at achieving a higher concentration of the desired product at the same time as removing the contaminating materials. This is accomplished by subjecting the harvested mixture to separation techniques that selectively favor the desired product over the contaminating materials. These techniques may consist of any or all of the following: salt fractionation, selective precipitation, crystallization, size exclusion, affinity binding, hydrophobic separation, ion exchange, diafiltration, etc. During each of these processes, many measuring devices are used to make measurements and control the conditions in the process stream and much data is accumulated for later use. Examples of measurements include temperatures, flow rates, volumes, start times, stop times, ionic strength, pH, color, etc.
  • Once the desired product has reached its maximum practical level of concentration and purity, which may be the result of several similar or different operations of the purification process, it passes into the final stage of manufacturing called filling and finishing. During this stage of the process, it is mixed with neutral carrier molecules so that it may be prepared in the right dosage form for administration to the patients who need it. Often it is also sterilized if it is intended for injection into patients. The final product is placed into suitable containers and labeled with the ingredients and expiration date. Once again, much data is accumulated during this part of the process for later use. Examples of data that are accumulated include the names of operators, the vendors of materials used, room air quality measurements, calibration dates, equipment service dates, pyrogen levels, particulates, color, pH, ionic strength, potency levels, contaminant levels, etc.
  • the analysis method of the present invention employs Advanced Profile Analysis (APA) to search through time-based profiles of the continuous process parameters of a process manufacturing process to determine the time (or combination of times) that correspond to specific features in the profiles that provide the best statistical model for predicting an outcome parameter of interest.
  • the analysis method of the present invention may be used to find the parameters and times most useful for predicting the outcome, and to determine as early in the fermentation process as possible if a product is likely to fail an in-process or final product specification later in the process. That way, time and resources need not be wasted producing a bad product.
  • the analysis method of the present invention also allows users to select specific features of continuous data profiles of a process manufacturing process that are then extracted and quantified, and used as discrete parameters singly or together for statistical analysis in combination with other process parameters. This gives the ability to find out, for example, what upstream parameters may be driving specific features in the oxygen uptake rate of the production fermentation. These results may be used to give improved control within the physical constraints of the equipment or the cost constraints of the process. Users could also determine what features in the post-induction CER (Carbon Dioxide Evolution Rate) of the production fermentation in combination with measurements made in the recovery process, are associated with a troublesome contaminant in the final product.
  • CER Carbon Dioxide Evolution Rate
  • the only evidence for the contaminant might be a shoulder on the main peak in one of the downstream chromatography steps which shoulder may be quantified using a number of methods. This information may give better process control, lower failure rates and higher quality and predictability. Scale-up problems that might result if such determinations were made on smaller experimental scale may be avoided if the analyses are done on data taken from full-scale operations.
  • a user may select any number of continuous parameters from a single operation for any number of batches together with a discrete outcome parameter of interest, e.g., Product Impurity A.
  • the continuous data from the parameter sets are aligned.
  • the profile plots may not be aligned across batches, perhaps because product is transferred into a fermentor in different physiological states equating to different times in its growth phase. In this case, it may be appropriate to align the profiles at the point where the dissolved oxygen reaches, for example, 30%, and the system will have the ability to do so.
  • Other means of aligning the parameter sets may also be applied.
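A minimal sketch of this kind of alignment follows, assuming each batch is held as a dictionary of parameter name to a `(times, values)` pair. The function names, the data layout, and the use of linear interpolation between samples are illustrative assumptions, not details given in the text. Each batch's time axis is shifted so that time zero is the moment the aligning parameter first reaches the chosen level.

```python
def crossing_time(times, values, level):
    """First time the series reaches `level`, interpolating linearly between samples.

    Works whether the series rises or falls through the level; returns None
    if the level is never reached.
    """
    for i, v in enumerate(values):
        if v == level:
            return times[i]
        if i > 0 and (values[i - 1] - level) * (v - level) < 0:
            frac = (level - values[i - 1]) / (v - values[i - 1])
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None

def align_batches(batches, align_param, level):
    """Shift every parameter of every batch so t = 0 is the crossing time.

    batches: dict batch -> dict parameter -> (times, values)
    """
    aligned = {}
    for batch, params in batches.items():
        times, values = params[align_param]
        t0 = crossing_time(times, values, level)
        if t0 is None:
            raise ValueError(f"batch {batch!r} never reaches {level} for {align_param}")
        aligned[batch] = {
            name: ([t - t0 for t in ts], list(vs))
            for name, (ts, vs) in params.items()
        }
    return aligned
```

Applying the same offset to all parameters of a batch mirrors the statement that the absolute times of all parameters measured for that batch are adjusted by the amount needed to align it.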
  • time-based profiles are chosen and values are assigned to temporary discrete parameters derived at the intersections of the profiles with the lines on the continuous data plots.
  • the time-based profiles, and therefore the corresponding discrete parameter values may be chosen manually or by using an automatically generated time interval approach, e.g., run a model that chooses time-based profiles at thirty-minute intervals.
  • the user may choose to use a single value of the profiled parameter at that time, or use a neighborhood mean for each of the profile parameters.
  • the user may prefer to use the average of a certain number of nearby values, neighborhood values, in each profile of a particular parameter across all the batches to which the analysis method of the present invention is being applied.
  • the data is then analyzed.
  • the user has the option of saving the time-based profile data as new parameters for further analysis, or using it directly in a series of regression models to determine the time-based profile, or combination of time-based profiles, that give the best overall model that predicts the chosen outcome.
  • the user will be able to choose from among the standard regression techniques available in commercially available software, such as Discoverant®, a software product made by Aegis Analytical Corporation, the assignee of the present invention.
  • the multiple regression feature of Discoverant allows a user to perform regression using more than one independent parameter.
  • One example is to estimate the effects on dissolution rate, i.e. the response parameter, of different factors, i.e. the independent continuous parameters, such as time-based profiles of dryer air humidity, KW input to a wet granulating mixer, and/or the addition rate of granulating solution to a granulator.
  • the user may perform multiple regression using the following techniques: all parameters, forward selection, backwards elimination, stepwise, etc. When the all parameters technique is employed, multiple regression is performed using all of the parameters selected, whether their coefficient estimates are significant or not.
  • When the forward selection technique is employed, predictor parameters are added to the regression model one at a time.
  • the candidate parameter for inclusion at each step is the one that produces the biggest decrease in residual sum of squares (SSE). If the calculated significance at each step is smaller than the specified p-value, then the parameter is added to the model. Otherwise the process is terminated.
  • When the backwards elimination technique is employed, regression is performed starting with all of the parameters in the model.
  • the candidate parameter considered for removal at each step is the one that produces the smallest increase in SSE from the previous step.
  • the procedure terminates when all the parameters are removed from the model or the calculated significance is greater than the specified p-value.
  • the stepwise procedure technique is a combination of the forward selection and backward elimination techniques.
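The forward selection rule described above can be sketched as follows. The text does not say how the "calculated significance" is obtained; this sketch assumes a partial F-test for the one added parameter, with the p-value computed by numerically integrating the Student t density (for a single added term, F equals t squared). All function names and the default threshold are hypothetical.

```python
import math
import numpy as np

def sse_of_fit(X, y, columns):
    """Residual sum of squares (SSE) for an intercept plus the chosen columns of X."""
    A = np.column_stack([np.ones(len(y)), X[:, np.array(columns, dtype=int)]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def partial_f_p_value(f_stat, dof):
    """P(F(1, dof) > f_stat), via the equivalent two-sided Student t test."""
    t = math.sqrt(f_stat)
    xs = np.linspace(0.0, t, 20001)
    log_c = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
             - 0.5 * math.log(dof * math.pi))
    pdf = np.exp(log_c - (dof + 1) / 2 * np.log1p(xs ** 2 / dof))
    area = float(np.sum((pdf[1:] + pdf[:-1]) * np.diff(xs)) / 2)  # trapezoid rule
    return max(0.0, 1.0 - 2.0 * area)

def forward_selection(X, y, names, p_enter=0.05):
    """Add, one at a time, the parameter giving the biggest decrease in SSE,
    stopping when its significance is not smaller than p_enter."""
    selected, remaining = [], list(range(X.shape[1]))
    current_sse = sse_of_fit(X, y, selected)
    while remaining:
        trials = {j: sse_of_fit(X, y, selected + [j]) for j in remaining}
        best = min(trials, key=trials.get)
        dof = len(y) - len(selected) - 2          # residual dof after adding `best`
        if dof <= 0:
            break
        if trials[best] <= 0.0:
            p = 0.0                                # perfect fit: treat as significant
        else:
            f_stat = (current_sse - trials[best]) / (trials[best] / dof)
            p = partial_f_p_value(f_stat, dof)
        if p >= p_enter:
            break
        selected.append(best)
        remaining.remove(best)
        current_sse = trials[best]
    return [names[j] for j in selected]
```

Backwards elimination would run the same comparison in the opposite direction, starting from all parameters and removing the one whose removal produces the smallest increase in SSE.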
  • FIG. 1 illustrates in flow chart 100 how a preferred embodiment of the method of the present invention may be implemented in a software program.
  • a visual display device not shown in FIG. 1
  • the time-based profiles for various batches will be displayed on a graph or plot on a screen on the visual display device.
  • a user creates an analysis group from raw data sources by using a method such as described in US Patent Application No. 09/816,547, entitled “System, Method and Computer Program Product for Mapping Data of Multi-Database Origins" the entire contents and disclosure of which is hereby incorporated by reference.
  • the user selects continuous parameters of interest and particular batches to be analyzed for those continuous parameters.
  • the user aligns the continuous data from each of the batches. The user selects a single continuous parameter such as the dissolved oxygen concentration in the fermentor, the carbon dioxide evolution rate of the E. coli cells, the RPM of the agitator, the airflow through the fermentation tank, the optical density of the fermentation broth, etc.
  • the user chooses an alignment criterion, such as the point where dissolved oxygen reaches 80%. Then the user activates the display button on a screen displayed on a visual display device to display the aligned batches on the visual display device.
  • the user selects a multiple regression method such as forward selection, backward elimination, etc. to analyze the aligned profiles.
  • the user selects a single discrete outcome parameter of interest such as fermentation yield.
  • the user selects multiple time-based profiles, either automatically at repeating intervals or interactively at particular times chosen by the user via on-screen interactions, upon which the previously selected regression method will be performed.
  • When the "analyze button" is clicked, the software program first creates the data set from which to build the specified regression model. For example, if a user is analyzing 5 batches and 3 continuous parameters, and wants to use neighborhood means of +/- 3 data points for profiles at 5 and 10 minutes, the process might look like the profiles in Tables 1 and 2 below:
  • Batch refers to batch number
  • Yield refers to percentage yield
  • CER refers to Carbon Dioxide Evolution Rate
  • BaseFlow refers to the amount of base fed into the fermentor in liters/minute
  • %DO refers to percentage of dissolved oxygen in the fermentor. Yield is the discrete dependent parameter and CER, BaseFlow and %DO are the continuous independent parameters.
  • the value of 28.2 for CER is determined by taking the average of the actual value of CER at 5 minutes, and the values at 2, 3, 4, 6, 7, and 8 minutes, assuming that the parameter has values recorded once per minute.
  • the user-specified regression technique may be used to construct the "best fit" model. Using stepwise regression produces results such as the following:
  • the software program will create a statistical model, which predicts the outcome based on the values of the independent continuous parameters, for each time-based profile or a single model for all the profiles together. If there are multiple data models, the software program may provide one-line summaries for each, sorted by order of significance followed by more detailed output. If the selected dependent parameter, i.e., the outcome parameter of interest, is used to determine the success or failure of a batch, the earliest significant time profile could be used to predict the likelihood of success or failure of a future batch. In any event, establishing correlations with desired outcomes can provide insight into where to focus root cause investigations of failures, thus saving time and money.
  • a particular implementation of a software program of the type illustrated in FIG. 1 for use on a particular process manufacturing process is described in the example below.
  • although the analysis method of the present invention has been described above with respect to analyzing an example of a bioprocess, the method of the present invention may be used to analyze other types of process manufacturing processes, such as, for example, the production of tablets or capsules containing active drug substances, the production of chemicals or specialty chemicals, the production of refined minerals, the production of gasoline from crude oil, etc.
  • results may be presented in written or printed form, audio form, or any other means of communicating the results to a user.
  • APA Advanced Profile Analysis
  • the data was aligned.
  • production batches or production runs do not always start in a fermentor in the same physiological state. Therefore they must be aligned so that comparisons from batch to batch at specific time intervals make sense.
  • any number of parameters could be used to determine how much to time-shift the data from each batch, e.g., Carbon Dioxide Evolution Rate (CER).
  • the weight of Tank A is the plotted parameter used for alignment.
  • FIG. 2 shows the plots before alignment.
  • the alignment criteria may be where the maximum or minimum occurs, or as in the present example, where the amount of base used intersects the value 2.
  • TankWt A is highlighted to indicate the chosen independent continuous parameter.
  • the highlighted items to the right of TankWt A, e.g., 104a, 108a, etc., indicate the batches that were chosen for the graphical display and subsequent analysis. Clicking the display button plots the values of the continuous parameter, TankWt A in this case, on the y-axis vs. the time (in minutes) from the start of the step on the x-axis.
  • the screen shot of FIG. 3 shows how the plots are shifted using the method of the present invention so that they all intersect the value 2 on the y-axis at the same point in time. All the batches are shifted to the left to line up with the batch that intersects the value 2 the earliest. The amount by which each batch is shifted may in itself be an important parameter to correlate with various outcomes of interest.
  • the x-axis is rescaled to accommodate the batch that has to shift the most to the left to achieve the alignment criteria. For example, if the batch that takes the longest time to reach the value 2 takes 350 minutes longer than the fastest one to reach 2, the minimum value on the x-axis has to be no larger than -350 to show all the data.
  • each vertical line in the graph represents a time-based profile at a particular time. Consequently, a new discrete parameter will be created for TankWt A at each of the selected time profiles where the profile line intersects that parameter. For each batch, the created values correspond to values of TankWt A at each of the time points that correspond to intersection of the time-based profile lines with the continuous parameter.
  • an option is provided to use neighborhood means instead.
  • an average may be taken using the single intersection value and including a specified number of points on both sides of it to help average out noisy (i.e. highly variable) parameter values, thus resulting in more robust regression models.
  • a linear interpolation may be used so that there will be a complete data set.
  • all the values that are used in the program to build the regression model may be saved as new parameters so that they may be used elsewhere in the system to create additional models, e.g., principal component regression.
  • a more typical situation is when there are multiple continuous parameters chosen to determine the profile that results in the best model in terms of predicting the outcome of interest, also known as the 'Dependent Parameter.'
  • the screenshot of FIG. 5 shows five continuous parameters and a series of selected time-based profiles. To avoid having potentially hundreds of lines on a single graph, there is a multi-page display where each page shows the chosen continuous parameters for each batch.
  • the y-axis provides a standardized or normalized scale that ranges from 0 to 100, i.e. range-scaled.
  • the user can select the Point ID mode and click on a point. The system will then put a label on the plot that indicates the time offset and actual value.
  • MassRate A, Rate PI, TankWt A, TankWt B, and TankWt C are highlighted to indicate the chosen independent continuous parameters.
  • the highlighted items to the right, e.g., 104a, 108a, etc., indicate the batches that are chosen for the graphical display and subsequent analysis. Clicking the display button plots the values of the continuous parameters (in terms of % of range) on the y-axis vs. the time (in minutes) from the start of the step on the x-axis.
  • a calculated p-value of less than 0.05 is generally considered to indicate that the model is statistically significant, i.e., the parameters in the model can be reliably used to help predict the value of the dependent (outcome) parameter of interest; Response A in the present example.
  • Column 6 gives the value of R-Squared, which is the percentage of variability in the dependent parameter that can be explained by the model.
  • Column 7 is an adjusted R-Squared, which is similar to R-Squared but is adjusted downward to account for the number of parameters and degrees of freedom, and to help prevent misinterpretation caused by overfitting the model.
  • a one-page detailed output is provided for each model so the user can see which of the continuous parameters are the most significant.
  • Column 1 of the first table provides the parameter name
  • Column 2 provides the coefficient estimate for the parameter
  • Column 3 provides the standard error associated with the coefficient estimate
  • Column 4 provides the calculated t-Statistic (Column 2 / Column 3)
  • Column 5 provides the calculated p-value that indicates the level of significance for each of the parameters.
  • the second table provides the sum of squares and degrees of freedom for each source, along with mean square errors for the model and residual, and calculated F-Ratio and p-value that indicates the level of significance of the overall model.
  • the R-Squared and adjusted R-Squared values are the same in the one-line summaries in FIG. 6.
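
The alignment step described above (shifting each batch to the left so that its alignment parameter reaches the criterion value at the same time as the earliest batch) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation described in the patent: the names `crossing_time` and `align_batches`, the dictionary layout, the sample values, and the use of linear interpolation to locate the crossing are all hypothetical.

```python
def crossing_time(times, values, threshold):
    """Return the first time the series reaches `threshold`, or None.

    The crossing is located by linear interpolation between the two
    samples that bracket the threshold (an assumption of this sketch).
    """
    points = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 == threshold:
            return t0
        if (v0 - threshold) * (v1 - threshold) < 0:
            return t0 + (threshold - v0) * (t1 - t0) / (v1 - v0)
    if points and points[-1][1] == threshold:
        return points[-1][0]
    return None


def align_batches(batches, threshold):
    """Shift batches left so they all cross `threshold` at the same time.

    `batches` maps a batch id to a (times, values) pair. Batches that
    never reach the threshold are left out (assumption). Returns the
    shifted series and the per-batch shift amounts.
    """
    crossings = {b: crossing_time(t, v, threshold) for b, (t, v) in batches.items()}
    reached = {b: c for b, c in crossings.items() if c is not None}
    if not reached:
        return {}, {}
    earliest = min(reached.values())
    shifts = {b: c - earliest for b, c in reached.items()}
    aligned = {
        b: ([t - shifts[b] for t in batches[b][0]], batches[b][1])
        for b in reached
    }
    return aligned, shifts


# Made-up sample data: two batches that reach the value 2 at different times.
batches = {
    "104a": ([0, 1, 2, 3], [0.0, 1.0, 2.0, 3.0]),
    "108a": ([0, 1, 2, 3, 4, 5], [0.0, 0.0, 0.0, 1.0, 2.0, 3.0]),
}
aligned, shifts = align_batches(batches, threshold=2.0)
```

In this sketch the slower batch ends up with negative time values after shifting, which matches the remark above that the x-axis minimum has to be rescaled to show all the data. The returned `shifts` dictionary holds the amount each batch was moved, which the text notes may itself be a parameter worth correlating with outcomes.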
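
The neighborhood-means option described above (averaging the intersection value with a specified number of points on either side) might look like the following sketch. It assumes one sample per minute, so that a list index equals the time in minutes, as in the CER example in the text. The function names, the `"CER@5"` key format, the truncation of the window at the ends of a series, and the sample values are assumptions made for illustration only.

```python
def neighborhood_mean(values, index, half_width):
    """Mean of values[index] and up to `half_width` points on each side.

    Near the ends of the series the window is truncated rather than
    padded (an assumption of this sketch).
    """
    lo = max(0, index - half_width)
    hi = min(len(values), index + half_width + 1)
    window = values[lo:hi]
    return sum(window) / len(window)


def build_profile_row(series_by_parameter, profile_indices, half_width=0):
    """Turn one batch's continuous series into discrete regression inputs.

    `series_by_parameter` maps a parameter name to its list of samples.
    `profile_indices` are the sample positions of the chosen time-based
    profiles. With half_width=0 the single intersection value is used.
    """
    row = {}
    for name, values in series_by_parameter.items():
        for idx in profile_indices:
            row[f"{name}@{idx}"] = neighborhood_mean(values, idx, half_width)
    return row


# Made-up CER readings, one per minute for minutes 0 through 10.
cer = [20.0, 22.0, 24.0, 26.0, 28.0, 30.0, 32.0, 34.0, 36.0, 38.0, 40.0]
row = build_profile_row({"CER": cer}, profile_indices=[5], half_width=3)
```

With `half_width=3` and a profile at minute 5, the window covers minutes 2 through 8, which is the same seven-point average the text describes for the CER value. Calling `build_profile_row` once per batch would give one row of the data set per batch.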
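
The forward selection procedure described above can be illustrated with a small, dependency-free sketch. Two things here are assumptions rather than anything stated in the text: the entry test uses a partial F-statistic threshold (`f_to_enter`) as a stand-in for the p-value comparison, because the Python standard library has no F-distribution, and the least-squares fit solves the normal equations directly, which presumes the predictors are not collinear. The function names and the synthetic data are hypothetical.

```python
def sse_of_fit(columns, y):
    """Residual sum of squares for a least-squares fit of y on an
    intercept plus the given predictor columns (normal equations)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal-equation system [X'X | X'y].
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum(
        (y[r] - sum(b * x for b, x in zip(beta, X[r]))) ** 2 for r in range(n)
    )


def forward_selection(predictors, y, f_to_enter=4.0):
    """Add predictors one at a time, each time taking the candidate that
    gives the largest drop in SSE, until the partial F-statistic of the
    best candidate falls below `f_to_enter` (stand-in for the p-value test)."""
    selected, remaining = [], list(predictors)
    current_sse = sse_of_fit([], y)
    n = len(y)
    while remaining:
        base = [predictors[name] for name in selected]
        trial = {name: sse_of_fit(base + [predictors[name]], y) for name in remaining}
        best = min(trial, key=trial.get)
        new_sse = trial[best]
        dof = n - (len(selected) + 2)  # residual degrees of freedom after adding
        if dof <= 0:
            break
        f_stat = (current_sse - new_sse) / (new_sse / dof) if new_sse > 0 else float("inf")
        if f_stat < f_to_enter:
            break
        selected.append(best)
        remaining.remove(best)
        current_sse = new_sse
    return selected


# Synthetic data: y depends on x1 and x2 but not on x3.
predictors = {
    "x1": [-1, -1, -1, -1, 1, 1, 1, 1],
    "x2": [-1, -1, 1, 1, -1, -1, 1, 1],
    "x3": [-1, 1, -1, 1, -1, 1, -1, 1],
}
y = [4.0, 3.0, 7.0, 6.0, 12.0, 13.0, 17.0, 18.0]
chosen = forward_selection(predictors, y)
```

For this synthetic data the sketch selects `x1` and then `x2`, and stops before adding `x3`. Backward elimination would run the same loop in reverse, starting from the full model and removing the parameter whose removal increases SSE the least.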
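
The summary statistics listed above (t-Statistic, R-Squared, adjusted R-Squared, and F-Ratio) are not written out as formulas in the text. Assuming they follow the usual ordinary least-squares definitions, with $n$ batches, $p$ predictor parameters, and sums of squares $SST = SSR + SSE$ (notation introduced here for illustration), they would be:

```latex
t_j = \frac{\hat{\beta}_j}{\operatorname{SE}(\hat{\beta}_j)}, \qquad
R^2 = 1 - \frac{SSE}{SST}, \qquad
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}, \qquad
F = \frac{SSR / p}{SSE / (n - p - 1)}
```

Here $\hat{\beta}_j$ is the coefficient estimate in Column 2 and $\operatorname{SE}(\hat{\beta}_j)$ is the standard error in Column 3, so $t_j$ is the Column 2 / Column 3 ratio mentioned above.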

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Manufacturing & Machinery (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Cold Cathode And The Manufacture (AREA)
  • Paper (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a method of analysis comprising the steps of: generating plots for each of a plurality of batches of a manufacturing process based on data for at least one continuous parameter; aligning the plots based on at least one aligning continuous parameter of each of the plurality of batches; selecting a plurality of profiles for the aligned plots; analyzing the profiles using a regression method to provide analysis results indicating the level of success of the manufacturing process; and displaying the analysis results to a user or storing them on a machine readable medium. The invention also relates to a machine readable medium that may be used to implement the method of the invention.
PCT/US2002/027805 2001-09-12 2002-09-04 An advanced method for profile analysis of continuous data WO2003023687A2 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2002323532A AU2002323532A1 (en) 2001-09-12 2002-09-04 An advanced method for profile analysis of continuous data

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US31832901P 2001-09-12 2001-09-12
US60/318,329 2001-09-12
US15361302A 2002-05-24 2002-05-24
US10/153,613 2002-05-24

Publications (2)

Publication Number Publication Date
WO2003023687A2 true WO2003023687A2 (fr) 2003-03-20
WO2003023687A3 WO2003023687A3 (fr) 2003-08-28

Family

ID=26850705

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2002/027805 WO2003023687A2 (fr) 2001-09-12 2002-09-04 An advanced method for profile analysis of continuous data

Country Status (2)

Country Link
AU (1) AU2002323532A1 (fr)
WO (1) WO2003023687A2 (fr)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1517207A2 (fr) * 2003-09-16 2005-03-23 Voith Paper Patent GmbH Computerized system for measuring quality and/or process data
GB2453035A (en) * 2007-09-15 2009-03-25 Curvaceous Software Ltd Control of a multi-variable process.
WO2009052080A1 (fr) * 2007-10-17 2009-04-23 Nalco Company Method for analyzing manufacturing process data
GB2477447A (en) * 2006-09-13 2011-08-03 Fisher Rosemount Systems Inc Analyzing batch runs by using a batch signature to perform a user interface function
US9086695B2 (en) 2006-09-13 2015-07-21 Fisher-Rosemount Systems, Inc. Compact batch viewing techniques for use in batch processes
US9275367B2 (en) 2011-05-05 2016-03-01 Aegis Analytical Corporation System for designating, displaying and selecting types of process parameters and product outcome parameters
US9927788B2 (en) 2011-05-19 2018-03-27 Fisher-Rosemount Systems, Inc. Software lockout coordination between a process control system and an asset management system
WO2019094787A1 (fr) * 2017-11-09 2019-05-16 Zymergen Inc. Optimization of organisms for performance in larger-scale conditions based on performance in smaller-scale conditions

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6243615B1 (en) * 1999-09-09 2001-06-05 Aegis Analytical Corporation System for analyzing and improving pharmaceutical and other capital-intensive manufacturing processes

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
'Application note 4.1 fermentation monitoring. Affinity sensors' IASYS. 1996, pages 1 - 4, XP002961668 *
BERRY M.J. ET AL.: 'Assay and purification of Fv fragments in fermenter cultures: design and evaluation of generic binding reagents' JOURNAL OF IMMUNOLOGICAL METHODS vol. 167, 1994, pages 173 - 182, XP002961667 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1517207A3 (fr) * 2003-09-16 2009-10-21 Voith Patent GmbH Computerized system for measuring quality and/or process data
EP1517207A2 (fr) * 2003-09-16 2005-03-23 Voith Paper Patent GmbH Computerized system for measuring quality and/or process data
GB2477447B (en) * 2006-09-13 2012-02-01 Fisher Rosemount Systems Inc Compact batch viewing and alarm generation techniques using a batch signature for use in batch processes
GB2477447A (en) * 2006-09-13 2011-08-03 Fisher Rosemount Systems Inc Analyzing batch runs by using a batch signature to perform a user interface function
US9086695B2 (en) 2006-09-13 2015-07-21 Fisher-Rosemount Systems, Inc. Compact batch viewing techniques for use in batch processes
GB2453035B (en) * 2007-09-15 2010-11-24 Curvaceous Software Ltd Multi-variable operations
GB2453035A (en) * 2007-09-15 2009-03-25 Curvaceous Software Ltd Control of a multi-variable process.
US8380328B2 (en) 2007-09-15 2013-02-19 Process Plant Computing Limited Method and system for monitoring and controlling a multi-variable process throughout a plurality of distinct phases of the process
WO2009052080A1 (fr) * 2007-10-17 2009-04-23 Nalco Company Method for analyzing manufacturing process data
US9275367B2 (en) 2011-05-05 2016-03-01 Aegis Analytical Corporation System for designating, displaying and selecting types of process parameters and product outcome parameters
US9927788B2 (en) 2011-05-19 2018-03-27 Fisher-Rosemount Systems, Inc. Software lockout coordination between a process control system and an asset management system
WO2019094787A1 (fr) * 2017-11-09 2019-05-16 Zymergen Inc. Optimization of organisms for performance in larger-scale conditions based on performance in smaller-scale conditions
CN111886330A (zh) * 2017-11-09 2020-11-03 齐默尔根公司 Optimizing organism performance in larger-scale conditions based on performance in smaller-scale conditions

Also Published As

Publication number Publication date
AU2002323532A1 (en) 2003-03-24
WO2003023687A3 (fr) 2003-08-28

Similar Documents

Publication Publication Date Title
Wagih et al. SGAtools: one-stop analysis and visualization of array-based genetic interaction screens
Royce A discussion of recent developments in fermentation monitoring and control from a practical perspective
Konstantinov Monitoring and control of the physiological state of cell cultures
WO2008010005A1 (fr) Method for on-line optimization of a fed-batch fermentation unit for the purpose of optimizing productivity
US20090048816A1 (en) Method for on-line prediction of future performance of a fermentation unit
CN109154588A (zh) Automated bioprocess development
WO2018229802A1 (fr) Method for predicting the outcome of and modeling a process in a bioreactor
WO2003023687A2 (fr) An advanced method for profile analysis of continuous data
Knijnenburg et al. Exploiting combinatorial cultivation conditions to infer transcriptional regulation
WO2019111310A1 (fr) Automatic analysis system
CN109684468B (zh) Literature screening and annotation system for evidence-based medicine
Alberghina et al. Flow cytometry and cell cycle kinetics in continuous and fed-batch fermentations of budding yeast
Henson Biochemical reactor modeling and control
Batut et al. Hands-on: Hands-on: Reference-based RNA-Seq data analysis
Swartz The use of economic analysis of penicillin G manufacturing costs in establishing priorities for fermentation process improvement
WO2001043038A1 (fr) Tracking of clinical study samples, information and results
Wentworth et al. Self-cycling fermentation of a citric acid producing strain of Candida lipolytica
US11519889B2 (en) Biological sample automatic analysis system
Mariani et al. Yeast population models for monitoring and control of biotechnical processes
WO2021076855A1 (fr) Methods of dynamic monosaccharide control
Pieracci et al. 15 Leveraging Multivariate Analysis Tools to Qualify Scaled-Down Models
Coleman Establishment of a novel Pichia Pastoris host production platform
US20230002708A1 (en) Adapting control of a cell culture in a production scale vessel with regard to a starting medium
Endo et al. A database system and an expert system for realizing factory automation in the bioindustries
JPWO2019111312A1 (ja) Biological sample analysis system

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BY BZ CA CH CN CO CR CU CZ DE DM DZ EC EE ES FI GB GD GE GH HR HU ID IL IN IS JP KE KG KP KR LC LK LR LS LT LU LV MA MD MG MN MW MX MZ NO NZ OM PH PL PT RU SD SE SG SI SK SL TJ TM TN TR TZ UA UG UZ VC VN YU ZA ZM

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ UG ZM ZW AM AZ BY KG KZ RU TJ TM AT BE BG CH CY CZ DK EE ES FI FR GB GR IE IT LU MC PT SE SK TR BF BJ CF CG CI GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP