WO2022265110A1 - Model function fitting device and model function fitting method
- Publication number
- WO2022265110A1 (application PCT/JP2022/024399)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- function
- model
- fitting
- model function
- chromatogram
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N30/00—Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
- G01N30/02—Column chromatography
- G01N30/86—Signal analysis
Definitions
- The present invention relates to a model function fitting device and a model function fitting method for fitting a model function to a chromatogram.
- Various model functions, such as those shown in Non-Patent Document 1, have been proposed for the quantitative and qualitative analysis of waveforms measured by a chromatograph. When a peak separation algorithm is applied, the model function is required to fit the measured waveform with high accuracy and, for any parameter values, not to take a shape that departs from a chromatogram peak waveform. To meet these demands, the EMG function and the BEMG function shown in Non-Patent Document 2 are used, for example.
- The purpose of the present invention is to provide a model function that enables highly accurate fitting.
- A model function fitting device includes an acquisition unit that acquires a chromatogram, and a fitting unit that fits a model function to the chromatogram under the constraint that the logarithm of the model function has a first part that can be approximated by a quadratic function and second parts, located on both sides of the first part, that can be approximated by a linear function.
- FIG. 1 is a block diagram of a model function fitting device according to this embodiment.
- FIG. 2 is a functional block diagram of the model function fitting device according to this embodiment.
- FIG. 3 shows the chromatogram.
- FIG. 4 shows the logarithm of the chromatogram.
- FIG. 5 shows the fitting of the generalized additive model to the chromatogram.
- FIG. 6 is a diagram showing residuals from a simulation of model function fitting according to the first embodiment.
- FIG. 7 is a diagram showing residuals from a simulation of BEMG function fitting.
- FIG. 8 is a diagram comparing the model function according to the first embodiment and the model function fitted by the unimodal restriction.
- FIG. 9 is a diagram showing model functions fitted to the simulation data.
- FIG. 10 is a diagram comparing simulation results of the model function and the BEMG function according to the second embodiment.
- FIG. 11 is a diagram comparing simulation results of the model function, the EMG function, and the BEMG function according to the second embodiment.
- FIG. 12 is a diagram showing a chromatogram C2 that is the object of the third embodiment.
- FIG. 13 is a diagram showing logarithm LC2 of chromatogram C2.
- FIG. 14 shows the chromatogram C3 converted by the conversion function.
- FIG. 15 is a diagram showing logarithm LC3 of chromatogram C3.
- FIG. 16 shows the chromatogram C5 converted by the conversion function.
- FIG. 17 is a diagram showing logarithm LC5 of chromatogram C5.
- FIG. 18 shows the chromatogram C7 transformed by the transformation function.
- FIG. 19 is a diagram showing logarithm LC7 of chromatogram C7.
- FIG. 20 shows a logarithmic chromatogram LC9.
- FIG. 21 is a diagram showing a group of splines used in GAM.
- FIG. 22 shows the GAM model applied to logarithmic chromatogram LC9.
- FIG. 23 shows the GAM model applied to the time-distortion function.
- FIG. 24 is a diagram showing the oscillation of coefficients due to fitting.
- FIG. 25 is a flow chart showing a model function fitting method according to the embodiment.
- FIG. 26 is a flow chart showing a model function fitting method according to the embodiment.
- FIG. 1 is a configuration diagram of a model function fitting device 1 according to an embodiment.
- The model function fitting device 1 of the present embodiment acquires sample measurement data MD obtained in a liquid chromatograph, a gas chromatograph, or the like.
- The model function fitting device 1 of this embodiment is configured by a personal computer. As shown in FIG. 1, the model function fitting device 1 includes a CPU (Central Processing Unit) 11, a RAM (Random Access Memory) 12, a ROM (Read Only Memory) 13, an operation unit 14, a display 15, a storage device 16, a communication interface (I/F) 17, and a device interface (I/F) 18.
- The CPU 11 performs overall control of the model function fitting device 1.
- The RAM 12 is used as a work area when the CPU 11 executes a program.
- Various data, programs, and the like are stored in the ROM 13.
- The operation unit 14 receives input operations by the user. The operation unit 14 includes a keyboard, a mouse, and the like.
- The display 15 displays information such as fitting results.
- The storage device 16 is a storage medium such as a hard disk.
- The storage device 16 stores a program P1 and the measurement data MD.
- The program P1 executes a process of obtaining a chromatogram and a process of fitting a model function to the chromatogram.
- The communication interface 17 is an interface for wired or wireless communication with other computers.
- The device interface 18 is an interface that accesses a storage medium 19 such as a CD, a DVD, or a semiconductor memory.
- FIG. 2 is a block diagram showing the functional configuration of the model function fitting device 1.
- The control unit 20 is a functional unit realized by the CPU 11 executing the program P1 while using the RAM 12 as a work area.
- The control unit 20 includes an acquisition unit 21, a fitting unit 22, and an output unit 23. That is, the acquisition unit 21, the fitting unit 22, and the output unit 23 are functional units realized by executing the program P1.
- The functional units 21 to 23 can also be said to be functional units provided in the CPU 11.
- The acquisition unit 21 receives the measurement data MD.
- For example, the acquisition unit 21 receives the measurement data MD from another computer, an analysis device, or the like via the communication interface 17.
- Alternatively, the acquisition unit 21 receives the measurement data MD stored in the storage medium 19 via the device interface 18.
- The fitting unit 22 performs the process of fitting the model function to the chromatogram.
- The fitting unit 22 of the present embodiment fits a model function to the chromatogram under the constraint that the logarithm of the model function has a first part that can be approximated by a quadratic function and second parts, located on both sides of the first part, that can be approximated by a linear function.
- The output unit 23 causes the display 15 to display the results of the fitting performed by the fitting unit 22, information on the fitted model function, and the like.
- The program P1 may be stored in the storage medium 19 and provided.
- The CPU 11 may access the storage medium 19 via the device interface 18 and store the program P1 stored in the storage medium 19 in the storage device 16 or the ROM 13.
- The CPU 11 may access the storage medium 19 via the device interface 18 and execute the program P1 stored in the storage medium 19.
- FIG. 3 is a diagram showing measurement data MD acquired by the acquisition unit 21.
- The measurement data MD shows the chromatogram C1 of the sample to be analyzed.
- The horizontal axis is time, and the vertical axis is intensity (detection value).
- The peak of the chromatogram C1 is detected at time T1. The peak height is normalized to an intensity of 100.
- FIG. 4 is a diagram showing the logarithm LC of the chromatogram C1.
- The logarithm LC of the chromatogram C1 has a first part A1 that can be approximated by a quadratic function.
- The first part A1 is the region containing the time T1.
- The logarithm LC also has second parts A2, A2 that can be approximated by a linear function on both sides of the first part A1. Therefore, the fitting unit 22 uses a sequence L[t] (t is time) having a non-positive second-order difference (second derivative) as the logarithm of the model function.
- That is, the characteristic that the logarithm of the model function is upwardly convex is used as a constraint on the model function.
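The constraint on L[t] can be illustrated with a small sketch. It is not taken from the patent: the helper name `log_model_from_second_differences` and all numeric values are hypothetical. It builds a sequence whose second-order differences are non-positive by construction (zero at both ends for the linear parts, constant in the middle for the quadratic part) and exponentiates it to obtain a single-peaked model.

```python
import numpy as np

def log_model_from_second_differences(l0, slope0, neg_curv):
    """Build L[t] whose second-order differences equal -neg_curv (all <= 0)."""
    neg_curv = np.asarray(neg_curv, dtype=float)
    if np.any(neg_curv < 0):
        raise ValueError("neg_curv must be non-negative")
    # First differences start at slope0 and can only decrease.
    slopes = slope0 - np.concatenate(([0.0], np.cumsum(neg_curv)))
    # The sequence starts at l0 and accumulates the first differences.
    return l0 + np.concatenate(([0.0], np.cumsum(slopes)))

# Hypothetical values: zero curvature at both ends (linear parts A2),
# constant negative curvature in the middle (quadratic part A1).
neg_curv = np.concatenate((np.zeros(10), np.full(20, 0.05), np.zeros(10)))
L = log_model_from_second_differences(l0=-5.0, slope0=0.5, neg_curv=neg_curv)
model = np.exp(L)  # model function on the intensity scale
```

Because the first differences of L can only decrease, the exponentiated sequence rises to one maximum and then falls, whatever non-negative values are chosen for `neg_curv`.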
- The fitting unit 22 fits a model function to the chromatogram C1 by using a generalized additive model (GAM).
- In the present embodiment, a smoothing spline model is used as the generalized additive model. That is, a sequence L[t] whose second-order difference is non-positive is used as the logarithm of the model function, and a generalized additive model using smoothing splines is applied to the model function.
- The method of applying the generalized additive model under the constraint that the second-order difference is non-positive is herein referred to as DGAM.
- FIG. 5 shows an example of fitting a generalized additive model using a smoothing spline to chromatogram C1.
- FIG. 5 shows how the quartic splines SP1, SP2, and SP3 are arranged with an interval of about 1/3 of the peak half width.
- The splines arranged in the region to the left of the spline SP1 and in the region to the right of the spline SP3 are omitted from the illustration.
- The interval at which the splines are arranged is not particularly limited. In this example, the spline interval is set to about 1/3 of the peak half-value width, at which the approximation accuracy is empirically high. The spline intervals need not be the same among the arranged splines.
- The number of parameters can be reduced by widening the spline interval in the region outside the peak. When the peak tails, it is better to determine the spline interval using the half-value width obtained on the non-tailing side, that is, on the side with the steeper change.
- The fitting unit 22 fits the model function to the chromatogram using the generalized additive model.
- The spline parameters are arranged in chronological order. By imposing the constraint that the second-order difference of these parameters is non-positive, the model can be restricted to an upwardly convex function while smoothing is applied.
- Compared with the case where the sequence L[t] itself is used as the model function, the number of parameters and the amount of calculation can both be reduced.
- The number of parameters can be reduced to the number of spline peaks.
- The number of parameters is not always a major problem, but when regression is performed by Bayesian estimation using Markov chain Monte Carlo (MCMC) methods, reducing the number of parameters is a large computational advantage.
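The idea of constraining spline coefficients can be sketched as follows. This is only an illustration under several assumptions that are not in the source: cubic B-splines on uniform knots stand in for the splines of FIG. 5, the fit is an ordinary bounded least-squares fit to the log-intensity rather than a GAM or Bayesian fit, and the function name `concave_spline_fit` and the test curve are hypothetical. On uniform knots, non-positive second-order differences of the coefficients are sufficient for the spline to be upwardly convex.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import lsq_linear

def concave_spline_fit(x, log_y, x_min, x_max, h, k=3):
    """Least-squares B-spline fit whose coefficients have non-positive
    second-order differences (uniform knot spacing h, degree k)."""
    m = int(round((x_max - x_min) / h))
    knots = x_min + h * np.arange(-k, m + k + 1)
    n = len(knots) - k - 1                  # number of spline coefficients
    A = BSpline(knots, np.eye(n), k)(x)     # design matrix, shape (len(x), n)

    # Reparameterize coef = M @ theta with theta = [c0, first slope, w_0, w_1, ...],
    # where w_i >= 0 and coef[i+2] - 2*coef[i+1] + coef[i] = -w_i.
    j = np.arange(n)
    M = np.zeros((n, n))
    M[:, 0] = 1.0
    M[:, 1] = j
    for i in range(n - 2):
        M[:, 2 + i] = -np.clip(j - 1 - i, 0, None)

    lower = np.r_[-np.inf, -np.inf, np.zeros(n - 2)]
    upper = np.full(n, np.inf)
    theta = lsq_linear(A @ M, log_y, bounds=(lower, upper)).x
    coef = M @ theta
    return BSpline(knots, coef, k), coef

# Hypothetical noise-free log-chromatogram: quadratic near the peak, linear tails.
x = np.linspace(-6.0, 6.0, 241)
log_y = 1.0 * x - 1.5 * np.logaddexp(0.0, x)
spline, coef = concave_spline_fit(x, log_y, -6.0, 6.0, h=0.5)
max_residual = np.max(np.abs(spline(x) - log_y))
```

The reparameterization turns the shape constraint into simple bounds, so a standard bounded linear solver can be used; the number of free parameters equals the number of spline coefficients, not the number of time samples.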
- FIG. 6 is a diagram showing the results of a fitting simulation using DGAM.
- FIG. 6 is a diagram showing residuals between chromatogram simulation data and model functions calculated by DGAM.
- FIG. 7 is a diagram showing a simulation result of fitting using BEMG as a comparative example.
- FIG. 7 is a diagram showing residuals between the same simulation data as in FIG. 6 and the model function calculated by BEMG.
- BEMG has an error of about 0.08%, whereas DGAM has an error of about 0.005%.
- The target error can be achieved by adjusting the spline interval and order. In this way, DGAM can approximate a model function with a minimum number of parameters.
- FIG. 8 is a diagram comparing fitting results by DGAM and by unimodal restriction.
- FIG. 8 shows a model function fitted to the same measurement data MD using DGAM and unimodal restriction.
- M1 is a model function fitted using DGAM
- M2 is a model function fitted using unimodal restriction.
- By using the DGAM of the present embodiment to calculate the area of one or more peaks included in the chromatogram, quantitative and qualitative analysis of the sample can be performed accurately. Approximation accuracy is important when the model function is used in a chromatogram separation algorithm. In impurity control for pharmaceuticals, it is necessary to control an impurity peak that is present in a very small amount (e.g., 0.05%) compared to the main component peak. In such applications, the error of the model function used for fitting must of course be less than 0.05%. However, model functions such as BEMG have an error of about 0.1%, as described above. This error is also confirmed in a chromatogram simulation using a Radke-Prausnitz adsorption isotherm model or the like.
- DGAM, which is the method of the present embodiment, has a fitting error of less than 0.05%, as described above. Therefore, DGAM can be used in chromatogram separation algorithms even in the pharmaceutical field, where minute impurities are controlled.
- Equation 1 represents the model function exp(g(x, a, b)) according to the second embodiment.
- x is the retention time normalized by the peak position and peak width. That is, if u is the peak position and s is the peak width, (x - u)/s is substituted for x.
- a and b are tailing parameters.
- As shown in FIG. 4, g(x, a, b) in Equation 1, that is, the logarithm of the model function exp(g(x, a, b)), has a first part A1 that can be approximated by a quadratic function and second parts A2, A2 that can be approximated by a linear function on both sides of the first part A1.
- The model function of the second embodiment also has the restriction that the logarithm of the model function is upwardly convex.
- The model function exp(g(x, a, b)) of this embodiment having such constraints is called an EMLC (Exponential of Modified Log Cosh) function.
- Equation 2 shows the model function exp(h(x, a, b)) normalized with respect to the peak position, peak height, and peak width.
- In Equation 2, Β is the beta function.
- FIG. 9 shows results of estimating the EMLC function by Bayesian estimation using simulation data as the measurement data MD.
- SD is simulation data
- M3 is a model function (EMLC function) estimated by Bayesian estimation.
- The peak height H is added as a parameter when performing the Bayesian estimation.
- The simulation shown in FIG. 9 was performed using the Bayesian estimation software Stan. Normally distributed noise centered at 0 is added to the simulation data. Bayesian estimation was performed with 2000 iterations and 12 chains.
- FIG. 10 shows simulation results of the Bayesian inference performed with Stan. As a comparative example, results of Bayesian estimation using BEMG as the model function for the same simulation data SD are also shown. As shown in FIG. 10, with EMLC, Rhat (a convergence diagnostic) is close to 1.0, and the accuracy of estimation is higher than with BEMG.
- n_eff (the effective sample size, which indicates how efficiently sampling was performed) is also larger with EMLC than with BEMG, indicating high estimation accuracy. In addition, se_mean is the standard error, and its value is smaller for EMLC than for BEMG. Thus, EMLC is advantageous in calculations that use derivatives of the function, such as Bayesian estimation by MCMC sampling.
- FIG. 11 is a diagram comparing the fitting results for the actually measured measurement data MD when EMLC is used as the model function and when EMG/BEMG is used as the model function as a comparative example.
- The measurement data MD is a chromatogram of a sample of primary metabolites, and FIG. 11 shows the mean square error between the area quantification of the sample (with the maximum peak height set to 1) and the area quantification of each model function.
- A method for obtaining the EMLC function shown in Equation 1 and the normalized EMLC function shown in Equation 2 will now be described.
- f(x, a, b) shown in Equation 3 is obtained by multiplying the sigmoid function by a constant and adding a constant.
- Equation 4 represents g(x, a, b) in Equation 1, which is obtained by integrating f(x, a, b). That is, g(x, a, b) is the logarithm of the EMLC function and has an upwardly convex shape.
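Equations 3 and 4 are not reproduced in this text, so the following is only a sketch of one function with the described properties, not the patent's formula. It assumes f is a sigmoid scaled and shifted so that it tends to a on the far left and to -b on the far right, and takes g as its antiderivative. With that assumption g can also be written as (a - b)x/2 - (a + b)·log(2·cosh(x/2)), a modified log-cosh form. The parameterization and the values a = 2.0 and b = 0.5 are hypothetical.

```python
import numpy as np

def f(x, a, b):
    """Scaled and shifted sigmoid: tends to a as x -> -inf and to -b as x -> +inf."""
    return a - (a + b) / (1.0 + np.exp(-x))

def g(x, a, b):
    """Antiderivative of f; np.logaddexp(0, x) is a numerically stable log(1 + exp(x))."""
    return a * x - (a + b) * np.logaddexp(0.0, x)

a, b = 2.0, 0.5                       # hypothetical tailing parameters
x = np.linspace(-20.0, 20.0, 4001)
gx = g(x, a, b)
peak_x = x[np.argmax(gx)]             # analytically, f(x) = 0 at x = log(a / b)
left_slope = (gx[1] - gx[0]) / (x[1] - x[0])
right_slope = (gx[-1] - gx[-2]) / (x[-1] - x[-2])
model = np.exp(gx - gx.max())         # peak height normalized to 1
```

Since f decreases monotonically, g is upwardly convex everywhere; it is close to quadratic around the peak and close to linear in both tails, with slopes a and -b.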
- The tailing/leading parameters should not have a large effect on peak feature quantities such as peak position, height, area, and width. Satisfying this condition makes fits to chromatograms with standard tailing/leading shapes less collinear. Since the peak position and height of exp(g(x, a, b)) can be found analytically, it is desirable to use a function gg(x, a, b) in which they are normalized. Equation 5 represents the function gg(x, a, b) obtained by normalizing g(x, a, b). Normalization may also be performed using another peak feature quantity, such as the center of gravity, instead of the peak position and height.
- In Equation 5, x is multiplied by Ns(a, b) to correct the peak width, normalizing the peak shape so that the area is kept constant.
- The function after this normalization is Equation 7. Instead of using the area, the correction may be performed using an empirically obtained formula for the half-value width or a formula obtained by machine learning. Equation 7 represents h(x, a, b) in Equation 2. More preferably, log Β, obtained by transforming the beta function Β, is used as shown in Equation 6 in order to prevent loss of significant digits in floating-point arithmetic.
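The benefit of working with log Β directly can be seen in a short sketch. The arguments below are hypothetical and chosen only to be large. It uses SciPy's `beta`, `betaln`, and `gammaln`, and is not the patent's Equation 6.

```python
import numpy as np
from scipy.special import beta, betaln, gammaln

p, q = 1000.0, 1000.0      # hypothetical large arguments
direct = beta(p, q)        # B(p, q) itself is far below the double-precision range
stable = betaln(p, q)      # log B(p, q), evaluated without forming B(p, q)
reference = gammaln(p) + gammaln(q) - gammaln(p + q)
```

Taking `np.log(direct)` would lose the value entirely, whereas `stable` stays finite and agrees with the log-gamma identity in `reference`.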
- Model functions such as EMG and BEMG have the problem that they are difficult for users to understand, because parameters such as the peak position and half-value width are not expressed in a form that is easy for humans to interpret.
- The EMLC function of the present embodiment is transformed by normalization into a form that is easy for users to understand.
- Model functions such as EMG and BEMG include a product of exp and erfc. When the peak tail is calculated, a value close to 0 is multiplied by a value close to ∞ to yield a very small value. This causes loss of precision, such as loss of significant digits, so a separate function must be prepared for the tail, which makes the calculation difficult.
- The EMLC function of the present embodiment can prevent this loss of significant digits through the normalization described above.
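The exp-times-erfc problem can be reproduced with a short sketch. It assumes the common textbook form of the EMG density with mean mu, standard deviation sigma, and exponential rate lam. The parameter values are hypothetical, and the `erfcx` workaround is a standard numerical technique, not something described in the patent.

```python
import numpy as np
from scipy.special import erfc, erfcx

def emg_naive(x, mu, sigma, lam):
    """EMG density evaluated literally as (lam / 2) * exp(...) * erfc(...)."""
    arg = 0.5 * lam * (2.0 * mu + lam * sigma**2 - 2.0 * x)
    z = (mu + lam * sigma**2 - x) / (np.sqrt(2.0) * sigma)
    with np.errstate(over="ignore", invalid="ignore"):
        return 0.5 * lam * np.exp(arg) * erfc(z)

def emg_log_stable(x, mu, sigma, lam):
    """Log of the EMG density using erfc(z) = exp(-z**2) * erfcx(z).
    Intended for z that is not strongly negative, where erfcx(z) stays finite."""
    arg = 0.5 * lam * (2.0 * mu + lam * sigma**2 - 2.0 * x)
    z = (mu + lam * sigma**2 - x) / (np.sqrt(2.0) * sigma)
    return np.log(0.5 * lam) + arg - z**2 + np.log(erfcx(z))

# Hypothetical point far out on the steep side of a sharp peak.
naive = emg_naive(-30.0, mu=0.0, sigma=1.0, lam=50.0)
stable = emg_log_stable(-30.0, mu=0.0, sigma=1.0, lam=50.0)
```

In the naive form the exponential overflows while erfc underflows, so their product is undefined, and a separate branch such as `emg_log_stable` is needed for that region. This is the kind of special-casing the text describes for EMG and BEMG.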
- FIG. 12 shows a chromatogram C2 targeted in the third and fourth embodiments.
- The horizontal axis is time, and the vertical axis is intensity (detection value).
- The peak height is normalized to an intensity of 1.0.
- FIG. 13 is a diagram showing the logarithm LC2 of the chromatogram C2.
- The logarithm LC2 of the chromatogram C2 has a first part A1 that can be approximated by a quadratic function, and second parts A2 and A2 that can be approximated by a linear function on both sides of the first part A1.
- The logarithm LC2 also has a third part A3 outside the second part A2, which is a region where the second derivative is positive. The third part A3 is due to the tailing that occurs in the chromatogram C2.
- The model function of the third embodiment has the constraint that the logarithm of the model function is upwardly convex in most regions, while deviation from that constraint is allowed in some regions.
- The fitting unit 22 fits, to the chromatogram C2, a model function obtained by applying a conversion function that is gentler than the exponential function to a function having a non-positive second-order difference (hereinafter referred to as the original function).
- The fitting unit 22 uses, for example, a composite function of an exponential function and a gamma correction function as the conversion function applied to the original function. With the original function B(t), the gamma correction function G, and the exponential function exp, the conversion function is expressed as exp(G(*)). Formula 8 is an example of a gamma correction function used in the conversion function.
- The parameter q is a constant of 0 or more. As the value of q increases, the effect of gamma correction is confined to the range where the chromatogram intensity is minute.
- The parameter p usually takes a value of 1 or less and is set within a range of positive values to allow the deviation.
- The parameter r is a parameter for adjusting the peak width.
- In FIG. 14, the solid line indicates the chromatogram C3 obtained by applying the conversion function exp(G(*)) to the original function B(t), and the dashed line indicates the chromatogram C4 obtained by applying the exponential function to the original function without applying the conversion function.
- The same original function is used for the chromatograms C3 and C4. That is, the chromatogram C3 is represented by exp(G(B(t))), and the chromatogram C4 is represented by exp(B(t)).
- In FIG. 15, the solid line indicates the logarithm LC3 of the chromatogram C3, and the dashed line indicates the logarithm LC4 of the chromatogram C4.
- (5-2) Conversion Function Using Polynomial
- A polynomial can also be used as another example of the conversion function.
- Expression 9 is an example using a polynomial as the conversion function Q.
- In FIG. 16, the solid line indicates the chromatogram C5 obtained by applying the conversion function shown in Equation 9 to the original function, and the dashed line indicates the chromatogram C6 obtained by applying the exponential function to the original function without applying the conversion function. The same original function is used for the chromatograms C5 and C6.
- In FIG. 17, the solid line indicates the logarithm LC5 of the chromatogram C5, and the dashed line indicates the logarithm LC6 of the chromatogram C6. The logarithm LC6 satisfies the constraint that the second derivative is non-positive, whereas the logarithm LC5 deviates from that constraint in regions away from the peak.
- For the conversion function Q(x) using a polynomial, a general expression such as Equation 10 can be used, for example. That is, the conversion function Q(x) is represented by a function containing an n-th order polynomial in the denominator.
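Equations 9 and 10 are not reproduced in this text. The sketch below therefore uses a hypothetical conversion function, chosen only because it has a polynomial in the denominator, increases monotonically for non-positive arguments, and decays more gently than exp: Q(x) = 1 / (1 - x + x²/2), which is the reciprocal of a truncated series of exp(-x). The original function B(t) = -t² is likewise an assumption.

```python
import numpy as np

def q_poly(x):
    """Hypothetical conversion function 1 / (1 - x + x**2 / 2), intended for x <= 0."""
    return 1.0 / (1.0 - x + 0.5 * x**2)

t = np.linspace(-4.0, 4.0, 801)
original = -t**2                      # assumed original function B(t)
log_exp = original                    # logarithm of exp(B(t))
log_conv = np.log(q_poly(original))   # logarithm of Q(B(t))

d2_exp = np.diff(log_exp, n=2)        # second-order differences, plain exponential
d2_conv = np.diff(log_conv, n=2)      # second-order differences, converted model
t_mid = t[1:-1]                       # time points matching the second differences
```

With this choice the converted model stays upwardly convex on the log scale near the peak but becomes convex downward in the tails, which is the kind of deviation from the constraint described for LC5.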
- A cosh function can also be used as another example of the conversion function.
- Formula 11 is an example using a cosh function as the conversion function Q.
- u is a parameter for peak width adjustment.
- The conversion function Q(x) is represented by a function including the cosh function in the denominator.
- In FIG. 18, the solid line indicates the chromatogram C7 obtained by applying the conversion function shown in Equation 11 to the original function, and the dashed line indicates the chromatogram C8 obtained by applying the exponential function to the original function without applying the conversion function. The same original function is used for the chromatograms C7 and C8.
- In FIG. 19, the solid line indicates the logarithm LC7 of the chromatogram C7, and the dashed line indicates the logarithm LC8 of the chromatogram C8. The logarithm LC8 satisfies the constraint that the second derivative is non-positive, whereas the logarithm LC7 deviates from that constraint in regions away from the peak.
- The cases of using a gamma correction function, a polynomial, or a cosh function have been described above as examples of the conversion function. These functions are only examples, and any monotonic function having a gentler slope than the exponential function can be used as the conversion function.
- A model function fitting method according to a fourth embodiment will be described. As in the third embodiment, the model function of the fourth embodiment has the constraint that the logarithm of the model function is upwardly convex in most regions, while deviation from that constraint is allowed. The fourth embodiment allows deviation from this constraint by distorting time.
- The fitting unit 22 fits the model function to the chromatogram by applying the GAM model to the time-distortion function.
- Let m(t) be the function that distorts time t.
- When a chromatogram represented by exp(-t^2) is subjected to time distortion by m(t), the chromatogram is represented by exp(-m(t)^2).
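As an illustration of time distortion, the sketch below applies a hypothetical m(t) to exp(-t^2). The choice of m(t), identity before the peak and log(1 + t) after it, is an assumption made for the example and is not the function used in the patent.

```python
import numpy as np

def m_tail(t):
    """Hypothetical time-distortion function: identity before the peak,
    logarithmic stretch after it (continuous, with slope 1 at t = 0)."""
    t = np.asarray(t, dtype=float)
    return np.where(t > 0.0, np.log1p(np.maximum(t, 0.0)), t)

t = np.linspace(-3.0, 6.0, 901)
log_plain = -t**2                     # logarithm of exp(-t^2)
log_distorted = -m_tail(t)**2         # logarithm of exp(-m(t)^2)
peak_plain = np.exp(log_plain)
peak_distorted = np.exp(log_distorted)
d2 = np.diff(log_distorted, n=2)      # second-order differences of the distorted log peak
t_mid = t[1:-1]
```

The distorted peak coincides with the plain Gaussian on the leading side and tails on the right. For this m(t), its logarithm stops being upwardly convex once log(1 + t) exceeds 1, which illustrates how distorting time permits a deviation from the constraint in the tail.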
- In FIG. 20, the solid line is simulation data of the logarithmic chromatogram LC9.
- the intensity of each feature point becomes as shown in FIG.
- the intensity value of each feature point is as shown in FIG.
- The example of FIG. 23 is one in which the time of the tailing portion on the right side of the peak is distorted. Conversely, when distorting the time of the leading portion on the left side of the peak, the time-distortion function m(t) is shaped so that its slope is gentle over the time range corresponding to the leading portion.
- The dashed line in FIG. 20 shows the result of fitting the GAM to the time-distortion function m(t).
- The constraint that the second derivative is non-positive may be imposed on the logarithmic chromatogram to which the time-distortion function m(t) is applied, or on the time-distortion function m(t) itself. For example, consider a case where a time-distortion function m(t) is applied to a chromatogram represented by exp(-t^2). The logarithmic chromatogram then becomes -m(t)^2. The constraint that the second derivative of the logarithmic chromatogram -m(t)^2 is non-positive is expressed by Equation 12.
- Although Equation 12 may be implemented in an optimization algorithm, the amount of calculation increases. Therefore, to reduce the amount of calculation, a constraint is instead imposed on the time-distortion function m(t).
- By constraining the slope of the time-distortion function m(t) to decrease as the distance from the center of the peak increases, that is, by constraining the value of the first derivative of m(t) to decrease, a restriction equivalent to the unimodal restriction can be given. This constraint is expressed by Equation 13, where tn are consecutively arranged feature points.
- In Equation 13 the lower limit is 0. However, for model functions of actual chromatograms and for functions satisfying the GAM model whose second derivative is non-positive, the value does not reach the lower limit of 0 and empirically stays within a certain range. Therefore, the lower limit may be set to an empirically obtained value greater than 0 and less than 1. Moreover, a GAM model can express only functions defined by splines, so fine systematic errors remain. As a result, even when fitting a waveform without tailing, oscillation of the coefficients as shown in FIG. 24 may occur. To allow such oscillation of the coefficients, an upper limit exceeding 1 may also be allowed in Equation 13.
- This allowable range is desirably obtained empirically from the systematic error of the model function using GAM and from the peak width (number of feature points/spline dimension). Assuming that the empirically obtained lower limit is Ca and the upper limit is Cb, the constraint of Equation 13 is expressed as Equation 14.
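Equations 13 and 14 are not reproduced in this text. The sketch below assumes that they bound the ratio of consecutive increments of m(t) at the feature points tn on the tailing side, between 0 and 1 for Equation 13 and between Ca and Cb for Equation 14. The function name `slope_ratio_ok`, the feature points, and the bounds are all hypothetical.

```python
import numpy as np

def slope_ratio_ok(m_values, c_a, c_b):
    """Check c_a <= (m[n+1] - m[n]) / (m[n] - m[n-1]) <= c_b
    at every interior feature point (assumed form of the constraint)."""
    steps = np.diff(np.asarray(m_values, dtype=float))
    ratios = steps[1:] / steps[:-1]
    return bool(np.all((ratios >= c_a) & (ratios <= c_b)))

t_n = np.arange(0.0, 5.5, 0.5)    # hypothetical feature points on the tailing side
tailing = np.log1p(t_n)           # slope decreases away from the peak
steepening = t_n**2               # slope increases away from the peak

accepted = slope_ratio_ok(tailing, c_a=0.5, c_b=1.0)
rejected = slope_ratio_ok(steepening, c_a=0.5, c_b=1.0)
too_strict = slope_ratio_ok(tailing, c_a=0.9, c_b=1.0)
```

A check of this kind involves only differences of m(t) at the feature points, which is why constraining m(t) is cheaper than constraining the second derivative of -m(t)^2 directly.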
- As described above, the model function of the fourth embodiment has the constraint that the logarithm of the model function is upwardly convex in most regions, while deviation from the constraint is allowed in some regions. This enables fitting of the model function with higher accuracy.
- FIG. 25 is a flow chart showing the model function fitting method of the first and second embodiments executed by the program P1. The flowchart in FIG. 25 is executed by the CPU 11.
- In step S1, the acquisition unit 21 acquires the measurement data MD.
- The measurement data MD is, for example, a chromatogram obtained by a liquid chromatograph.
- In step S2, the fitting unit 22 fits a model function to the chromatogram under the constraint that the logarithm of the model function has a first part that can be approximated by a quadratic function and second parts, located on both sides of the first part, that can be approximated by a linear function.
- In step S2 of the first embodiment, a model function with the constraint that the second-order difference is non-positive is used.
- In step S2 of the second embodiment, a function obtained by integrating a sigmoid function that has been multiplied by a constant and offset by a constant is used as the logarithm of the model function.
- FIG. 26 is a flowchart showing the model function fitting method of the third and fourth embodiments executed by the program P1. That is, FIG. 26 is a flowchart executed by the CPU 11.
- the measurement data MD is, for example, a chromatogram obtained by a liquid chromatograph.
- In step S12, the fitting unit 22 fits a model function to the chromatogram under the constraint that the logarithm of the model function has a first part that can be approximated by a quadratic function and second parts, located on both sides of the first part, that can be approximated by a linear function.
- In the third embodiment, the fitting unit 22 fits to the chromatogram the model function obtained by applying, to an original function whose second difference is non-positive, a conversion function with a gentler slope than the exponential function.
- In step S12 of the fourth embodiment, the fitting unit 22 fits to the chromatogram a model function that, by distorting time for an original function whose second difference is non-positive, is allowed to deviate from the constraint that the second difference is non-positive.
- a smoothing spline model is used as the generalized additive model.
- a Bezier function, a Gaussian function, or the like can be used in addition to the spline.
- the method in which a generalized additive model is applied under the constraint that the second difference is non-positive is referred to as DGAM.
- the EMLC function in the second embodiment may be used as an initial value when using DGAM.
- DGAM has a relatively large number of parameters, but by using the EMLC function as an initial value, effective constraints can be given from the initial state.
- an example using a gamma correction function, an example using a polynomial, and an example using a cosh function have been described as conversion functions. Other functions may also be used.
- the method of permitting partial deviation from the constraint that the second derivative of the logarithmic chromatogram is non-positive has been described.
- this constraint may be relaxed by direct empirical methods.
- the user may empirically set a parameter that permits deviation, or set the sum of positive values or their powers as a penalty term in solving the optimization problem.
- a model function fitting device comprises: an acquisition unit that acquires a chromatogram; and
- a fitting unit that fits a model function to the chromatogram after constraining the model function so that its logarithm has a first part that can be approximated by a quadratic function and second parts that are located on both sides of the first part and can be approximated by a linear function.
- the fitting unit may constrain the model function such that a secondary difference of the logarithmic function is non-positive.
- the fitting unit may fit the model function to the chromatogram by using a generalized additive model.
- the logarithmic function of the model function may be a function obtained by integrating a sigmoid function multiplied by a constant and adding a constant.
- the fitting unit may use, as the logarithmic function of the model function, a function obtained by integrating a sigmoid function multiplied by a constant and adding a constant.
- the model function may also be a normalized function of peak height and peak position.
- the format is easy for users to interpret, and the model functions are easy to handle.
- the model function may be a function obtained by correcting the peak width with an expression including a beta function and an exponential function.
- the format is easy for users to interpret, and the model functions are easy to handle.
- the fitting unit may fit to the chromatogram the model function obtained by applying a conversion function having a slope gentler than the exponential function to an original function having a non-positive second difference.
- the transform function may include a composite function of a gamma correction function and an exponential function.
- the gamma correction function allows deviation from the constraint that the second derivative is non-positive.
- the transform function may include a function having a polynomial of order n in the denominator.
- a transformation function containing polynomials can allow deviation from the constraint that the second derivative is non-positive.
- the transform function may include a function having a cosh function in the denominator.
- a transformation function including a cosh function can allow deviation from the constraint that the second derivative is non-positive.
- a model function fitting method comprises: obtaining a chromatogram; and
- fitting a model function to the chromatogram after constraining the model function so that its logarithm has a first part that can be approximated by a quadratic function and second parts that are located on both sides of the first part and can be approximated by a linear function.
- the fitting step may constrain the model function such that the second difference of the logarithmic function is non-positive.
- the logarithmic function of the model function may use a function obtained by integrating a sigmoid function multiplied by a constant and adding a constant.
- the fitting step may include fitting to the chromatogram the model function obtained by applying a conversion function having a slope gentler than the exponential function to an original function having a non-positive second difference.
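The list above mentions relaxing the non-positive second-derivative constraint by using the sum of the positive values, or of their powers, as a penalty term in the optimization. A minimal sketch of such a penalty follows. The function name `convexity_penalty`, the use of a discrete second difference on a uniform grid, and the two test curves are assumptions made for illustration; the patent does not specify this form.

```python
import numpy as np

def convexity_penalty(log_y, power=1.0):
    """Sum of the positive second differences of a log-chromatogram, raised to `power`.

    Returns 0.0 when the sampled curve is concave everywhere, and grows
    as the curve bends upward (positive second difference).
    """
    d2 = np.diff(np.asarray(log_y, dtype=float), n=2)
    # Keep only the positive part; non-positive second differences cost nothing.
    return float(np.sum(np.clip(d2, 0.0, None) ** power))

t = np.linspace(-5.0, 5.0, 201)
penalty_gauss = convexity_penalty(-t**2)            # log of a Gaussian: concave everywhere
penalty_tail = convexity_penalty(-np.log1p(t**2))   # log of a Lorentzian: convex for |t| > 1
```

In an optimizer this value would typically be multiplied by a user-chosen weight and added to the residual sum of squares, so that small departures from concavity are tolerated rather than forbidden.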
Landscapes
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Immunology (AREA)
- Pathology (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
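FIG. 1 is a configuration diagram of the model function fitting device 1 according to the embodiment. The model function fitting device 1 of the present embodiment acquires measurement data MD of a sample obtained by a liquid chromatograph, a gas chromatograph, or the like.
FIG. 2 is a block diagram showing the functional configuration of the model function fitting device 1. In FIG. 2, the control unit 20 is a functional unit realized by the CPU 11 executing the program P1 while using the RAM 12 as a work area. The control unit 20 includes an acquisition unit 21, a fitting unit 22, and an output unit 23. That is, the acquisition unit 21, the fitting unit 22, and the output unit 23 are functional units realized by executing the program P1. In other words, the functional units 21 to 23 can also be regarded as functional units of the CPU 11.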
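Next, the model function fitting method according to the first embodiment will be described. FIG. 3 shows the measurement data MD acquired by the acquisition unit 21. The measurement data MD represents the chromatogram C1 of the sample to be analyzed. In FIG. 3, the horizontal axis is time and the vertical axis is intensity (detected value). As shown in the figure, the peak of the chromatogram C1 is detected at time T1. The peak height is normalized to an intensity of 100.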
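Next, the model function fitting method according to the second embodiment will be described. Equation 1 expresses the model function exp(g(x, a, b)) according to the second embodiment. In Equation 1, x is the retention time normalized by the peak position and the peak width. That is, with u the peak position and s the peak width, (x-u)/s is substituted for x. In Equation 1, a and b are tailing parameters.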
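As a rough illustration of a log-model of this kind, the sketch below builds g by integrating a constant multiple of a sigmoid plus a constant, which is how the related claims describe the logarithm of the model function. This is not Equation 1 itself, which is not reproduced in this text. The closed form, the names `log_model` and `model`, and the choice to let a and b act as the left and right tail slopes are all assumptions made for this example.

```python
import numpy as np

def log_model(x, a, b):
    """Hypothetical g(x, a, b) whose derivative is a - (a + b) * sigmoid(x).

    The integral of sigmoid is softplus, log(1 + exp(x)), so
    g(x) = a * x - (a + b) * softplus(x), up to an additive constant.
    """
    x = np.asarray(x, dtype=float)
    # np.logaddexp(0, x) evaluates softplus without overflow.
    return a * x - (a + b) * np.logaddexp(0.0, x)

def model(t, u, s, a, b):
    """exp(g) evaluated at the normalized retention time (t - u) / s."""
    return np.exp(log_model((np.asarray(t, dtype=float) - u) / s, a, b))

x = np.linspace(-20.0, 20.0, 401)
g = log_model(x, a=2.0, b=0.5)
left_slope = (g[1] - g[0]) / (x[1] - x[0])        # tends to +a far to the left
right_slope = (g[-1] - g[-2]) / (x[-1] - x[-2])   # tends to -b far to the right
```

With this form the logarithm is linear in both tails and smoothly curved near the peak, and its second difference is non-positive whenever a + b > 0.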
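Next, the model function fitting method according to the third embodiment will be described. FIG. 12 shows the chromatogram C2 treated in the third and fourth embodiments. In FIG. 12, the horizontal axis is time and the vertical axis is intensity (detected value). As shown in the figure, the peak height is normalized to an intensity of 1.0. Tailing occurs in the chromatogram C2. FIG. 13 shows the logarithm LC2 of the chromatogram C2 shown in FIG. 12. As shown in FIG. 13, the logarithm LC2 of the chromatogram C2 has a first part A1 that can be approximated by a quadratic function and, on both sides of the first part A1, second parts A2, A2 that can be approximated by a linear function. However, outside the second parts A2, the logarithm LC2 also has a third part A3, a region in which the second derivative is positive. The third part A3 is caused by the tailing in the chromatogram C2. Thus, the model function of the third embodiment is constrained so that its logarithm is upwardly convex over most of its range, but deviation from that constraint is allowed in some regions.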
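A region like the third part A3 can be located numerically by looking at the sign of the second difference of the log-chromatogram. The sketch below is only an illustration: the function name `positive_curvature_mask`, the tolerance, and the synthetic heavy-tailed peak (a narrow Gaussian plus a weak, much wider Gaussian) are assumptions and do not come from the patent's data.

```python
import numpy as np

def positive_curvature_mask(y, tol=1e-9):
    """Flag samples where the second difference of log(y) exceeds tol."""
    log_y = np.log(np.asarray(y, dtype=float))
    d2 = np.diff(log_y, n=2)          # d2[i] is centred on sample i + 1
    mask = np.zeros(log_y.shape, dtype=bool)
    mask[1:-1] = d2 > tol             # end samples have no second difference
    return mask

# Synthetic peak with heavy tails (an assumption for this example only).
t = np.linspace(-10.0, 10.0, 401)
y = np.exp(-t**2 / 2.0) + 0.01 * np.exp(-t**2 / 50.0)
mask = positive_curvature_mask(y)

i_peak = int(np.argmin(np.abs(t)))          # sample nearest the peak top
i_tail = int(np.argmin(np.abs(t - 3.1)))    # sample where the wide component takes over
```

Near the peak top the mask is false (the logarithm is concave), while in the transition to the wide component it is true, which corresponds to the kind of region where the concavity constraint has to be relaxed.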
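The fitting unit 22 uses, for example, a composite of an exponential function and a gamma correction function as the conversion function applied to the original function. With B(t) the original function, G the gamma correction function, and exp the exponential function, the conversion function is expressed as exp(G(*)). Equation 8 is an example of the gamma correction function used in the conversion function.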
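A polynomial can also be used as another example of the conversion function. Equation 9 is an example in which a polynomial is used as the conversion function Q. The value of the original function is substituted for x in Equation 9. For example, when the original function is B(t) = -t^2, substituting -t^2 for x in Equation 9 yields the chromatogram to which the conversion function has been applied.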
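Equation 9 is not reproduced in this text, so the following is only a guess at the general shape of such a conversion function: a function with an n-th order polynomial in the denominator, here taken to be the truncated Taylor series of exp(-x). The name `poly_transform` and this particular polynomial are assumptions. The sketch shows that, applied to B(t) = -t^2, the result decays more slowly than exp(B(t)) and that its logarithm has a positive second difference in the tails.

```python
import math
import numpy as np

def poly_transform(x, n):
    """Hypothetical Q(x) = 1 / sum_{k=0..n} (-x)^k / k!, intended for x <= 0.

    For x <= 0 every term of the denominator is non-negative and the sum
    is at most exp(-x), so Q(x) >= exp(x): the slope is gentler than exp.
    """
    x = np.asarray(x, dtype=float)
    denom = sum((-x) ** k / math.factorial(k) for k in range(n + 1))
    return 1.0 / denom

t = np.linspace(-5.0, 5.0, 201)
B = -t**2                         # original function, second difference non-positive
q = poly_transform(B, n=2)        # equals 1 / (1 + t^2 + t^4 / 2)
d2_log_q = np.diff(np.log(q), n=2)
```

As n grows, the truncated series approaches exp(-x) and Q approaches the exponential function, so n controls how far the model is allowed to depart from the concave case.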
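A cosh function can also be used as another example of the conversion function. Equation 11 is an example in which a cosh function is used as the conversion function Q. In Equation 11, u is a parameter for adjusting the peak width. Thus, the conversion function Q(x) is expressed as a function containing a cosh function in its denominator.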
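Next, the model function fitting method according to the fourth embodiment will be described. As in the third embodiment, the model function of the fourth embodiment is constrained so that its logarithm is upwardly convex over most of its range, but deviation from that constraint is allowed in some regions. In the fourth embodiment, deviation from this constraint is allowed by distorting time. The fitting unit 22 fits the model function to the chromatogram by applying a GAM model to the time distortion function.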
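The effect of distorting time can be seen with a fixed analytic warp. In the patent the time distortion function is itself modelled with a GAM; the sketch below replaces that with a hand-picked logarithmic warp purely for illustration, and the names `warp` and `original` are hypothetical.

```python
import numpy as np

def warp(t):
    """Hypothetical monotone time distortion: close to identity near 0, compressed far out."""
    t = np.asarray(t, dtype=float)
    return np.sign(t) * np.log1p(np.abs(t))

def original(tau):
    """Original log-function whose second difference is non-positive."""
    return -np.asarray(tau, dtype=float) ** 2

t = np.linspace(-6.0, 6.0, 241)
d2_plain = np.diff(original(t), n=2)          # concave everywhere
d2_warped = np.diff(original(warp(t)), n=2)   # concave near the peak only
centre = len(d2_warped) // 2                  # second difference centred on t = 0
```

The original function stays strictly concave, while the warped version keeps a concave core and acquires positive curvature in the tails (for this warp, beyond |t| = e - 1). The warp used here is symmetric for simplicity; a one-sided warp would give one-sided tailing.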
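FIG. 25 is a flowchart showing the model function fitting method of the first and second embodiments executed by the program P1. That is, FIG. 25 is a flowchart executed by the CPU 11. First, in step S1, the acquisition unit 21 acquires the measurement data MD. The measurement data MD is, for example, a chromatogram obtained by a liquid chromatograph.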
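The acquire-then-fit flow can be tried out on synthetic data. The sketch below is a deliberately simple stand-in, not the constrained fitting of the embodiments: it fits only the quadratic core of the log-chromatogram by ordinary least squares. The function name `fit_log_quadratic`, the `frac` threshold, and the synthetic peak are assumptions made for this example.

```python
import numpy as np

def fit_log_quadratic(t, y, frac=0.5):
    """Fit log(y) ~ c2*t^2 + c1*t + c0 on the samples with y >= frac * max(y).

    Returns (position, width, height) of the Gaussian implied by the fit.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    core = y >= frac * y.max()                 # keep the region around the peak top
    c2, c1, c0 = np.polyfit(t[core], np.log(y[core]), deg=2)
    position = -c1 / (2.0 * c2)
    width = np.sqrt(-1.0 / (2.0 * c2))         # Gaussian sigma; needs c2 < 0
    height = np.exp(c0 - c1**2 / (4.0 * c2))
    return position, width, height

# Stand-in for measurement data MD: a noise-free Gaussian peak at T1 = 5.0
# with height 100 (synthetic values chosen for this example).
t = np.linspace(0.0, 10.0, 501)
y = 100.0 * np.exp(-((t - 5.0) ** 2) / (2.0 * 0.8**2))
position, width, height = fit_log_quadratic(t, y)
```

A fit like this ignores the linear tails entirely, which is the gap the constrained model functions of the embodiments are meant to close; it may still be useful as a starting point for a more complete optimization.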
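In the first embodiment, a smoothing spline model was used as the generalized additive model. As a modification of the first embodiment, Bezier functions, Gaussian functions, or the like can be used instead of splines.
It will be understood by those skilled in the art that the exemplary embodiments described above are specific examples of the following aspects.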
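A model function fitting device according to one aspect comprises:
an acquisition unit that acquires a chromatogram; and
a fitting unit that fits a model function to the chromatogram after imposing on the model function the constraint that the logarithmic function of the model function has a first part that can be approximated by a quadratic function and second parts that are located on both sides of the first part and can be approximated by a linear function.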
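In the model function fitting device according to item 1,
the fitting unit may impose on the model function the constraint that the second difference of the logarithmic function is non-positive.
In the model function fitting device according to item 2,
the fitting unit may fit the model function to the chromatogram by using a generalized additive model.
In the model function fitting device according to item 2 or 3,
the device may be used to calculate the area of one peak or of a plurality of peaks.
In the model function fitting device according to item 2 or 3,
as an initial value of the model function, a function may be used in which the logarithmic function of the model function is obtained by integrating a constant multiple of a sigmoid function plus a constant.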
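In the model function fitting device according to item 1,
the fitting unit may use a function in which the logarithmic function of the model function is obtained by integrating a constant multiple of a sigmoid function plus a constant.
In the model function fitting device according to item 6,
the model function may further be a function in which the peak height and the peak position are normalized.
In the model function fitting device according to item 7,
the model function may further be a function in which the peak width is corrected by an expression including a beta function and an exponential function.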
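In the model function fitting device according to item 1,
the fitting unit may fit to the chromatogram the model function obtained by applying a conversion function having a slope gentler than the exponential function to an original function whose second difference is non-positive.
In the model function fitting device according to item 9,
the conversion function may include a composite function of a gamma correction function and an exponential function.
In the model function fitting device according to item 9,
the conversion function may include a function having an n-th order polynomial in its denominator.
In the model function fitting device according to item 9,
the conversion function may include a function having a cosh function in its denominator.
In the model function fitting device according to item 1,
the fitting unit may fit to the chromatogram the model function that, by distorting time for an original function whose second difference is non-positive, is allowed to deviate from the constraint that the second difference is non-positive.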
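A model function fitting method according to another aspect comprises:
a step of acquiring a chromatogram; and
a step of fitting a model function to the chromatogram after imposing on the model function the constraint that the logarithmic function of the model function has a first part that can be approximated by a quadratic function and second parts that are located on both sides of the first part and can be approximated by a linear function.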
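In the model function fitting method according to item 14,
the fitting step may impose on the model function the constraint that the second difference of the logarithmic function is non-positive.
In the model function fitting method according to item 14,
the fitting step may use a function in which the logarithmic function of the model function is obtained by integrating a constant multiple of a sigmoid function plus a constant.
In the model function fitting method according to item 14,
the fitting step may fit to the chromatogram the model function obtained by applying a conversion function having a slope gentler than the exponential function to an original function whose second difference is non-positive.
In the model function fitting method according to item 14,
the fitting step may fit to the chromatogram the model function that, by distorting time for an original function whose second difference is non-positive, is allowed to deviate from the constraint that the second difference is non-positive.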
Claims (18)
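- A model function fitting device comprising: an acquisition unit that acquires a chromatogram; and
a fitting unit that fits a model function to the chromatogram after imposing on the model function the constraint that the logarithmic function of the model function has a first part that can be approximated by a quadratic function and second parts that are located on both sides of the first part and can be approximated by a linear function.
- The model function fitting device according to claim 1, wherein the fitting unit imposes on the model function the constraint that the second difference of the logarithmic function is non-positive.
- The model function fitting device according to claim 2, wherein the fitting unit fits the model function to the chromatogram by using a generalized additive model.
- The model function fitting device according to claim 2 or claim 3, which is used to calculate the area of one peak or of a plurality of peaks.
- The model function fitting device according to claim 2 or claim 3, wherein, as an initial value of the model function, a function is used in which the logarithmic function of the model function is obtained by integrating a constant multiple of a sigmoid function plus a constant.
- The model function fitting device according to claim 1, wherein the fitting unit uses a function in which the logarithmic function of the model function is obtained by integrating a constant multiple of a sigmoid function plus a constant.
- The model function fitting device according to claim 6, wherein the model function is further a function in which the peak height and the peak position are normalized.
- The model function fitting device according to claim 7, wherein the model function is further a function in which the peak width is corrected by an expression including a beta function and an exponential function.
- The model function fitting device according to claim 1, wherein the fitting unit fits to the chromatogram the model function obtained by applying a conversion function having a slope gentler than the exponential function to an original function whose second difference is non-positive.
- The model function fitting device according to claim 9, wherein the conversion function includes a composite function of a gamma correction function and an exponential function.
- The model function fitting device according to claim 9, wherein the conversion function includes a function having an n-th order polynomial in its denominator.
- The model function fitting device according to claim 9, wherein the conversion function includes a function having a cosh function in its denominator.
- The model function fitting device according to claim 1, wherein the fitting unit fits to the chromatogram the model function that, by distorting time for an original function whose second difference is non-positive, is allowed to deviate from the constraint that the second difference is non-positive.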
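- A model function fitting method comprising: a step of acquiring a chromatogram; and
a step of fitting a model function to the chromatogram after imposing on the model function the constraint that the logarithmic function of the model function has a first part that can be approximated by a quadratic function and second parts that are located on both sides of the first part and can be approximated by a linear function.
- The model function fitting method according to claim 14, wherein the fitting step imposes on the model function the constraint that the second difference of the logarithmic function is non-positive.
- The model function fitting method according to claim 14, wherein the fitting step uses a function in which the logarithmic function of the model function is obtained by integrating a constant multiple of a sigmoid function plus a constant.
- The model function fitting method according to claim 14, wherein the fitting step fits to the chromatogram the model function obtained by applying a conversion function having a slope gentler than the exponential function to an original function whose second difference is non-positive.
- The model function fitting method according to claim 14, wherein the fitting step fits to the chromatogram the model function that, by distorting time for an original function whose second difference is non-positive, is allowed to deviate from the constraint that the second difference is non-positive.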
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202280042686.3A CN117501117A (zh) | 2021-06-18 | 2022-06-17 | 模型函数拟合装置及模型函数拟合方法 |
JP2023530441A JPWO2022265110A1 (ja) | 2021-06-18 | 2022-06-17 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021-101606 | 2021-06-18 | ||
JP2021101606 | 2021-06-18 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022265110A1 true WO2022265110A1 (ja) | 2022-12-22 |
Family
ID=84526532
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/024399 WO2022265110A1 (ja) | 2021-06-18 | 2022-06-17 | モデル関数フィッティング装置およびモデル関数フィッティング方法 |
Country Status (3)
Country | Link |
---|---|
JP (1) | JPWO2022265110A1 (ja) |
CN (1) | CN117501117A (ja) |
WO (1) | WO2022265110A1 (ja) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2015049136A (ja) * | 2013-09-02 | 2015-03-16 | 株式会社島津製作所 | ピーク抽出方法及びプログラム |
JP2019190833A (ja) * | 2018-04-18 | 2019-10-31 | 東ソー株式会社 | クロマトグラムにおけるピークの信号処理方法 |
JP2020153864A (ja) * | 2019-03-20 | 2020-09-24 | 株式会社日立ハイテクサイエンス | クロマトグラフのデータ処理装置、データ処理方法、およびクロマトグラフ |
-
2022
- 2022-06-17 WO PCT/JP2022/024399 patent/WO2022265110A1/ja active Application Filing
- 2022-06-17 CN CN202280042686.3A patent/CN117501117A/zh active Pending
- 2022-06-17 JP JP2023530441A patent/JPWO2022265110A1/ja active Pending
Also Published As
Publication number | Publication date |
---|---|
JPWO2022265110A1 (ja) | 2022-12-22 |
CN117501117A (zh) | 2024-02-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22825097; Country of ref document: EP; Kind code of ref document: A1 |
| WWE | WIPO information: entry into national phase | Ref document number: 2023530441; Country of ref document: JP |
| WWE | WIPO information: entry into national phase | Ref document number: 202280042686.3; Country of ref document: CN |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: PCT application non-entry in European phase | Ref document number: 22825097; Country of ref document: EP; Kind code of ref document: A1 |