JP7390210B2

JP7390210B2 - Spectrum processing device and method

Info

Publication number: JP7390210B2
Application number: JP2020032724A
Authority: JP
Inventors: 朋喜中尾
Original assignee: Jeol Ltd
Current assignee: Jeol Ltd
Priority date: 2020-02-28
Filing date: 2020-02-28
Publication date: 2023-12-01
Anticipated expiration: 2040-02-28
Also published as: JP2021135222A

Description

The present invention relates to a spectrum processing device and method, and in particular to the estimation and removal of a specific component contained in a spectrum.

Spectra subjected to spectral analysis include NMR (nuclear magnetic resonance) spectra, X-ray spectra, spectroscopic spectra, and mass spectra. In general, a spectrum contains a waveform component of interest and a baseline component. The waveform component of interest is typically a portion containing one or more peaks, and it is the actual target of analysis. The baseline component, by contrast, is not an actual target of analysis and extends over a wide range in the frequency domain.

For example, when an NMR spectrum is observed, baseline components arise from ringing noise generated at the measurement stage, data loss after digital filtering, signal components originating from the molecular structure, and the like. Long-period fluctuation, curvature, slope, and the like that correspond to the base of the NMR spectrum are baseline components; alternatively, that base itself is the baseline component.

Patent Document 1 discloses a technique for correcting an NMR spectrum. It does not describe estimation of a baseline component and, in particular, does not describe estimation of a baseline component based on a baseline model.

Patent Document 1: Japanese Unexamined Patent Publication No. S62-124450

To improve the accuracy with which the waveform component of interest is analyzed, the baseline component in the spectrum is estimated and removed from the spectrum, either before the spectral analysis or concurrently with it. To estimate the baseline component, a mathematical model is fitted to it. Known functions for defining the model include Nth-order polynomials, cosine functions, sine functions, spline functions, and piecewise linear functions.

However, selecting an appropriate function to define the model and then giving that function appropriate parameters is usually a heavy burden on the user, and doing it properly requires experience. If this work is not done properly, the accuracy of baseline estimation falls, and the accuracy of spectral analysis falls with it. The baseline component may also be estimated for reasons other than the above.

To estimate the baseline component, it is necessary to identify the portion of the spectrum other than the portion where the waveform component of interest is dominant, that is, the portion where the baseline component is dominant. If the user performs this identification, the burden on the user increases and objectivity is reduced.

An object of the present invention is to accurately estimate a baseline component contained in a spectrum. Another object of the present invention is to estimate the baseline component and remove it from the spectrum without placing a heavy burden on the user.

A spectrum processing device according to the present invention includes: means for accepting a spectrum containing a baseline component; generating means for generating a baseline model by applying a weight sequence consisting of a plurality of weights to a basis function sequence consisting of a plurality of basis functions; determining means for determining, based at least on the spectrum, a sample point sequence that defines a sample portion of the spectrum and a sample portion of the baseline model; search means for searching for an optimal weight sequence such that a fitting condition for fitting the sample portion of the baseline model to the sample portion of the spectrum is satisfied; and subtraction means for subtracting, from the spectrum, an optimal baseline model generated based on the optimal weight sequence.

A spectrum processing method according to the present invention includes: generating a baseline model by applying a weight sequence consisting of a plurality of weights to a basis function sequence consisting of a plurality of basis functions; determining, based at least on a spectrum, a sample point sequence that defines a sample portion of the spectrum and a sample portion of the baseline model; searching for an optimal weight sequence such that a fitting condition for fitting the sample portion of the baseline model to the sample portion of the spectrum, and an Lp norm condition for reducing the Lp norm (where p ≤ 1) of the weight sequence, are both satisfied; and estimating, based on the optimal weight sequence, the baseline component contained in the spectrum as an optimal baseline model.

According to the present invention, a baseline component contained in a spectrum can be estimated accurately. Alternatively, according to the present invention, the baseline component can be estimated and removed from the spectrum without placing a heavy burden on the user.

FIG. 1 is a conceptual diagram showing a spectrum processing method according to the embodiment.
FIG. 2 is a block diagram showing a spectrum processing device according to the embodiment.
FIG. 3 is a flowchart showing a spectrum processing method according to a first example.
FIG. 4 is a diagram showing the effects of the spectrum processing method according to the first example.
FIG. 5 is a flowchart showing a spectrum processing method according to a second example.
FIG. 6 is a diagram showing the effects of the spectrum processing method according to the second example.
FIG. 7 is a flowchart showing a spectrum processing method according to a third example.
FIG. 8 is a flowchart showing a sample point sequence determination method according to the third example.
FIG. 9 is a diagram showing the effects of the sample point sequence determination method according to the third example.
FIG. 10 is a diagram showing the effects of the spectrum processing method according to the third example.

Embodiments are described below with reference to the drawings.

(1) Details of Embodiment
A spectrum processing device according to the embodiment has accepting means, generating means, determining means, search means, and subtraction means. The accepting means accepts a spectrum containing a baseline component. The generating means generates a baseline model by applying a weight sequence consisting of a plurality of weights to a basis function sequence consisting of a plurality of basis functions. The determining means determines, based at least on the spectrum, a sample point sequence that defines a sample portion of the spectrum and a sample portion of the baseline model. The search means searches for an optimal weight sequence such that a fitting condition for fitting the sample portion of the baseline model to the sample portion of the spectrum is satisfied. The subtraction means subtracts, from the spectrum, an optimal baseline model generated based on the optimal weight sequence.

With the above configuration, the optimal weight sequence that defines the optimal baseline model is searched for automatically, and the sample point sequence is improved automatically in the course of that search. This reduces the burden on the user and makes it possible to estimate the baseline component objectively.

When the sample point sequence is determined, the spectrum may be compared with the baseline model. This comparison makes it more likely that the portion where the baseline component is dominant can be identified while the portion where the waveform component of interest is dominant is avoided. Alternatively, the portion of the spectrum where the waveform component of interest exists may be identified, and the sample point sequence may be determined based on the remaining portion. Alternatively, the sample point sequence may be determined by identifying a flat portion of the spectrum.

In the search for the optimal weight sequence, the fitting condition alone may be considered, or the fitting condition may be considered together with other conditions. One such condition is the Lp norm condition described later. The spectrum is, for example, an NMR spectrum, but the above processing may be applied to other spectra. In the embodiment, the plurality of means described above correspond to a plurality of functions performed by a processor.

In the embodiment, the search means searches for the optimal weight sequence such that both the fitting condition and an Lp norm condition for reducing the Lp norm (where p ≤ 1) of the weight sequence are satisfied. As a result, the weight sequence gradually becomes sparse during the search; that is, the proportion of weights that are 0 or close to 0 increases. The number of basis functions in the basis function sequence that actually make up the baseline model therefore gradually decreases. Consequently, the baseline model tends to be built from a small number of basis functions that express the basic character of the baseline component as a broadband component.

If the optimal weight sequence is searched for based on the fitting condition alone, weights tend to be spread broadly and evenly over the basis functions in the basis function sequence, and in some cases this makes overfitting more likely. Introducing an Lp norm condition (specifically, an Lp norm minimization condition) in addition to the fitting condition makes overfitting less likely.

In general, increasing the number of basis functions in the basis function sequence causes problems such as slower convergence of the solution. With the above configuration, even if the basis function sequence is made up of many basis functions, only a few of them actually receive significant weights, so faster convergence can be expected. Put another way, the above configuration allows the basis function sequence to be built from a larger number of basis functions. In that case, good fitting results can be obtained even for a baseline component with an unusual shape.

Each basis function corresponds to an element of the baseline model, and each weight defines the contribution of the corresponding basis function. The optimal solution need not be optimal in the strict sense; it only has to be regarded as optimal for computational purposes. For example, the solution at the time the search end condition is satisfied is taken as the optimal solution.

With the Lp norm, the tendency to make the vector whose norm is computed sparser is observed when p is 1 or less, and in general the smaller p is, the stronger this tendency becomes. The value of p may therefore be changed according to the nature of the baseline component and the like. The value of p is set in the range from 0 to 1.0 inclusive; however, when p is 0 (that is, when the L0 norm is used), the solution may become unstable in some situations, so p is preferably greater than 0.
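The sparsity-promoting behavior described above can be illustrated numerically. The following is a minimal sketch, assuming the usual definition of the Lp (quasi-)norm, (Σ|x_k|^p)^(1/p), with the non-zero count used for p = 0; the function name `lp_norm` and the two sample weight sequences are hypothetical and chosen only for illustration.

```python
def lp_norm(x, p):
    """Return (sum |x_k|^p)^(1/p) for p > 0, or the number of non-zeros for p == 0."""
    if p == 0:
        return float(sum(1 for v in x if v != 0))
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


# Two hypothetical weight sequences with the same L2 norm (2.0).
sparse = [2.0, 0.0, 0.0, 0.0]   # one active basis function
dense = [1.0, 1.0, 1.0, 1.0]    # weight spread over all basis functions

for p in (2.0, 1.0, 0.5):
    print(f"p={p}: sparse={lp_norm(sparse, p):.3f}, dense={lp_norm(dense, p):.3f}")
```

At p = 2 the two sequences are indistinguishable, whereas at p = 1 and p = 0.5 the dense sequence is penalized progressively more than the sparse one, which is why minimizing the Lp norm with p ≤ 1 favors sparse weight sequences.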

In the embodiment, the determining means determines the sample point sequence based on the spectrum and the baseline model. In the course of the search for the optimal weight sequence, both the sample point sequence and the baseline model are progressively improved.

In the embodiment, the determining means obtains a residual sequence by comparing the spectrum with the baseline model and determines the sample point sequence based on the residual sequence. The residual sequence corresponds to the residual spectrum described later. In the embodiment, the determining means determines the sample point sequence based on those residuals in the residual sequence that satisfy a predetermined condition. For example, residuals of 0 or less may be identified, or residuals whose absolute value is within a threshold may be identified. In the embodiment, the determining means generates a histogram from the residual sequence and determines the sample point sequence based on the histogram. The histogram shows the statistical characteristics of the sample portion. On the assumption that the histogram for the portion of the spectrum where the baseline component is dominant follows a certain distribution, residuals that deviate from that distribution may be identified, and the sample point candidates may be narrowed down accordingly.

A program according to the embodiment has an accepting function, a generating function, a determining function, a search function, and a subtraction function. The accepting function accepts a spectrum containing a baseline component. The generating function generates a baseline model by applying a weight sequence consisting of a plurality of weights to a basis function sequence consisting of a plurality of basis functions. The determining function determines, based at least on the spectrum, a sample point sequence that defines a sample portion of the spectrum and a sample portion of the baseline model. The search function searches for an optimal weight sequence such that a fitting condition for fitting the sample portion of the baseline model to the sample portion of the spectrum, and an Lp norm condition for reducing the Lp norm (where p ≤ 1) of the weight sequence, are both satisfied. The subtraction function subtracts, from the spectrum, an optimal baseline model generated based on the optimal weight sequence. The program is installed on an information processing device via a portable storage medium or via a network. The concept of an information processing device includes a computer, a spectrum generation device, a spectrum measurement device, and the like.

A spectrum processing method according to the embodiment has a generating step, a determining step, and a search step. The generating step generates a baseline model by applying a weight sequence consisting of a plurality of weights to a basis function sequence consisting of a plurality of basis functions. The determining step determines, based at least on the spectrum, a sample point sequence that defines a sample portion of the spectrum and a sample portion of the baseline model. The search step searches for an optimal weight sequence such that a fitting condition for fitting the sample portion of the baseline model to the sample portion of the spectrum, and an Lp norm condition for reducing the Lp norm (where p ≤ 1) of the weight sequence, are both satisfied. Based on the optimal weight sequence, the baseline component contained in the spectrum is estimated as the optimal baseline model.

(2) Details of Embodiment
FIG. 1 shows a spectrum processing method according to the embodiment; specifically, it shows a spectrum processing algorithm executed by a processor. Reference numeral 100 denotes an accepting step, which corresponds to the accepting means. Reference numeral 102 denotes a generating step, which corresponds to the generating means. Reference numeral 104 denotes a determining step, which corresponds to the determining means. Reference numeral 106 denotes a search step, which corresponds to the search means. Reference numeral 108 denotes a subtraction step, which corresponds to the subtraction means.

The accepted NMR spectrum (hereinafter simply "spectrum") y is the object of processing. The horizontal axis f is the frequency axis, and the vertical axis is the intensity axis. The spectrum y consists of N intensities arranged along the frequency axis (N is an integer of 2 or more, for example several hundred or several thousand) and is handled as a vector in computation.

The spectrum y contains a waveform component of interest a and a baseline component b. In the illustrated example, the waveform component of interest a consists of a plurality of peaks. The baseline component b is the component other than the waveform component of interest a; it is a broadband component and corresponds to the base of the spectrum y. The baseline component b needs to be removed before the spectrum y is analyzed. The spectrum processing algorithm removes the baseline component b from the spectrum y.

In the generating step 102, the baseline model Ax is generated by applying the weight sequence x to the basis function sequence A, specifically by multiplying the basis function sequence A by the weight sequence x. The basis function sequence A consists of K basis functions (K is an integer of 2 or more, for example several to several tens). Usually, various kinds of functions of various orders are prepared as basis functions (model elements): for example, polynomials of order 1 to 10, cosine functions of order 1 to 20, and sine functions of order 1 to 20. The basis function sequence A may also include exponential functions, logarithmic functions, spline functions, piecewise linear functions, and the like. In the illustrated example, the K basis functions are arranged along the n axis, and each basis function is a waveform on the frequency axis. Each basis function is handled as a vector in computation.
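The generating step can be sketched in a few lines. The following is an illustrative sketch only, assuming a normalized frequency axis t in [0, 1] and a small basis of polynomials, cosines, and sines; the function names `build_basis` and `baseline_model`, the normalization, and the chosen orders are assumptions, not details taken from the embodiment.

```python
import math


def build_basis(n_points, poly_max=3, trig_max=2):
    """Return K basis vectors (rows), each sampled at n_points frequencies.

    Assumed layout: t^1..t^poly_max, then cos and sin of orders
    1..trig_max, on a normalized axis t in [0, 1].
    """
    t = [i / (n_points - 1) for i in range(n_points)]
    basis = [[ti ** d for ti in t] for d in range(1, poly_max + 1)]
    for m in range(1, trig_max + 1):
        basis.append([math.cos(math.pi * m * ti) for ti in t])
        basis.append([math.sin(math.pi * m * ti) for ti in t])
    return basis


def baseline_model(basis, x):
    """Compute Ax: the sum of the basis vectors weighted by the sequence x."""
    n_points = len(basis[0])
    return [sum(w * b[i] for w, b in zip(x, basis)) for i in range(n_points)]


basis = build_basis(n_points=5)            # K = 3 polynomials + 4 trig terms
x = [0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]    # only the linear term is active
model = baseline_model(basis, x)           # a straight line rising from 0 to 0.5
```

Because every weight except the first is 0, the remaining six basis functions drop out of the model, which is the situation the sparsity condition is meant to encourage.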

The weight sequence x consists of K weights arranged along the n axis. Each weight is a coefficient by which the corresponding function is multiplied. If a weight is 0, the corresponding function is effectively excluded from the components of the baseline model. The weight sequence x is handled as a vector in computation. In the search for the optimal baseline model approximating the baseline component b, the weight sequence x is updated repeatedly so that the evaluation value J described later is minimized. The weight sequence x at the time the search end condition is satisfied is the optimal weight sequence, and the baseline model Ax at that time is the optimal baseline model.

In the determining step 104, the sample point sequence I is determined. The sample point sequence I consists of M sample points arranged along the frequency axis (M is an integer of 2 or more, for example from a few percent to 100% of N) and is handled as a vector in computation. Each sample point is a sampling point selected from among all observation points. The sample point sequence I defines the sample matrix S_I needed for the vector operations. The sample matrix S_I determines the sample portion S_I y of the spectrum y and the sample portion S_I Ax of the baseline model Ax. The sample points making up the sample point sequence I are selected progressively so that the sequence concentrates on the portion where the baseline component b is dominant rather than the portion where the waveform component of interest a is dominant.

More specifically, in the determining step 104, the spectrum y is compared with the baseline model Ax, and a residual spectrum (residual sequence) corresponding to their difference is computed. The M sample points making up the sample point sequence I are selected by evaluating each of the N residuals in the residual spectrum individually. In the examples described later, the value of M usually decreases gradually.
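A residual-based selection of this kind might look like the following sketch. It implements the two example rules mentioned in the text (residual of 0 or less, or absolute residual within a threshold); the function name `select_sample_points`, its parameters, and the sample data are hypothetical.

```python
def select_sample_points(y, model, threshold=0.0, use_abs=False):
    """Return indices whose residual y - model passes the selection rule.

    use_abs=False keeps points with residual <= threshold
    (threshold=0.0 gives the "residual of 0 or less" rule).
    use_abs=True keeps points with |residual| <= threshold.
    """
    residual = [yi - mi for yi, mi in zip(y, model)]
    if use_abs:
        return [i for i, r in enumerate(residual) if abs(r) <= threshold]
    return [i for i, r in enumerate(residual) if r <= threshold]


# Hypothetical 5-point spectrum with a peak at index 2 and a flat model.
y = [1.0, 0.5, 6.0, 1.5, 0.0]
model = [1.0, 1.0, 1.0, 1.0, 1.0]

below_model = select_sample_points(y, model)                            # [0, 1, 4]
near_model = select_sample_points(y, model, threshold=0.5, use_abs=True)  # [0, 1, 3]
```

Either rule leaves out index 2, where the peak dominates, so the subsequent fit is driven by points where the baseline dominates.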

Alternatively, the portion of the spectrum y where the waveform component of interest a is dominant may be identified, and the sample point sequence I may be determined based on the remaining portion. A flat portion of the spectrum y may be identified, and the sample point sequence I may be determined based on that flat portion. The sample point sequence I may also be determined by other methods. In the embodiment, the spectrum y is compared with the baseline model Ax, so the sample point sequence I improves naturally as the baseline model Ax improves.

In the embodiment, the determining step 104 is executed for the initial determination. Thereafter, as described below, the determining step 104 is executed when, in the course of repeatedly updating the weight sequence x, the evaluation value J has been minimized but the search end condition is not satisfied. The determining step 104 may instead be executed each time the number of computations of the evaluation value J reaches a fixed value C (C is an integer of 2 or more), or each time the evaluation value J is computed.

In the search step 106 (see in particular reference numeral 106A), the weight sequence x is gradually improved so that the evaluation value J is minimized. The evaluation value J is defined by equation (1) below.

$$J = \| S_I y - S_I A x \|_2 + \lambda \| x \|_p \tag{1}$$

Here y is an N×1 vector, S_I is an M×N matrix, A is an N×K matrix, and the Lp norm is defined by equation (2) below.

$$\| x \|_p = \Bigl( \sum_{k=1}^{K} |x_k|^p \Bigr)^{1/p} \tag{2}$$

In equation (1), the first term specifies the condition for minimizing the L2 norm of the residual spectrum between the sample portions; that is, it specifies the fitting condition. The residual spectrum is obtained by subtracting the sample portion S_I Ax of the baseline model Ax from the sample portion S_I y of the spectrum y. If the two sample portions S_I y and S_I Ax match each other exactly, the L2 norm is 0.

In equation (1), the second term specifies the Lp norm condition for minimizing the Lp norm (where p ≤ 1) of the weight sequence x. As the Lp norm of the weight sequence x decreases, the weight sequence x becomes sparser; that is, among the K weights making up the weight sequence x, the number of weights that are 0 or close to 0 increases. As sparsity increases, the number of basis functions that effectively define the baseline model gradually decreases. Typically, only a small number of basis functions representing the main components of the baseline component b remain, and the others are excluded. As a result, overfitting is prevented or reduced. The coefficient λ balances the effect of the first term against that of the second term.

As noted above, 0 ≤ p ≤ 1, and 0 < p ≤ 1 when stability of the solution is required. When p is 1 or less, the sparsity of the solution increases in the course of a search that uses the Lp norm. In the embodiment, this property is put to effective use in the search for the optimal weight sequence x. In general p is set to 1, but when the weight sequence is expected to be highly sparse, p may be set to, for example, 0.75 or 0.5.

The evaluation value J makes it possible to take into account the fitting condition (the first condition) and the Lp norm condition (the second condition) at the same time. Other formulas may be used, however, as long as the sample portion S_I Ax of the baseline model Ax can be fitted to the sample portion S_I y of the spectrum y.
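As a concrete illustration of how the two conditions combine, the sketch below evaluates J as the L2 norm of the residual on the sample points plus λ times the Lp norm of the weights. This particular form (unsquared norms, λ on the second term) is an assumption inferred from the description of the two terms; the function name `evaluation_value` and the sample data are hypothetical.

```python
def evaluation_value(y, basis, x, sample_idx, lam, p=1.0):
    """Assumed form: J = ||S_I y - S_I A x||_2 + lam * ||x||_p, with 0 < p <= 1.

    basis holds K rows (one per basis function); sample_idx plays the
    role of the sample point sequence I.
    """
    model = [sum(w * b[i] for w, b in zip(x, basis)) for i in range(len(y))]
    fit = sum((y[i] - model[i]) ** 2 for i in sample_idx) ** 0.5
    penalty = sum(abs(w) ** p for w in x) ** (1.0 / p)
    return fit + lam * penalty


# Hypothetical data: the model matches y everywhere except a peak at index 2.
basis = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
y = [1.0, 2.0, 10.0]
x = [1.0, 1.0]                                   # Ax = [1, 2, 3]

j_without_peak = evaluation_value(y, basis, x, sample_idx=[0, 1], lam=0.1)
j_with_peak = evaluation_value(y, basis, x, sample_idx=[0, 1, 2], lam=0.1)
```

With the peak excluded from the sample points, only the penalty term contributes; including it inflates the fitting term by the peak height, which shows why the choice of sample points matters as much as the choice of weights.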

Based on equation (1), the weight sequence x is updated step by step (see reference numeral 106B) by applying a known optimization method such as a gradient method, so that the weight sequence x is gradually improved. In the process, the sample point sequence I is also improved in stages. The weight sequence x at the time the search end condition is satisfied is taken as the optimal weight sequence x, and the baseline model Ax at that time is taken as the optimal baseline model Ax. The search for the optimal solution may be ended, for example, when the evaluation value J falls to or below a fixed value or when the number of computations of the evaluation value J reaches a fixed value.
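One way such a search could be carried out for p = 1 is sketched below. It uses proximal gradient descent (ISTA) with soft-thresholding, which is a stand-in chosen for illustration rather than the method of the embodiment, and it minimizes the common squared variant ½‖S_I y − S_I Ax‖₂² + λ‖x‖₁ instead of the unsquared evaluation value. The names `soft_threshold` and `search_weights`, the fixed sample points, and the data are all hypothetical; the step size is assumed not to exceed the reciprocal of the largest eigenvalue of (S_I A)ᵀ(S_I A).

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t; values within [-t, t] become exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def search_weights(y, basis, sample_idx, lam, step, n_iter=200):
    """ISTA for 0.5 * ||S_I y - S_I A x||_2^2 + lam * ||x||_1 (assumed objective)."""
    k = len(basis)
    x = [0.0] * k
    for _ in range(n_iter):
        # Residual S_I (A x - y), evaluated on the sample points only.
        resid = {i: sum(x[j] * basis[j][i] for j in range(k)) - y[i]
                 for i in sample_idx}
        # Gradient of the squared fitting term with respect to each weight.
        grad = [sum(basis[j][i] * resid[i] for i in sample_idx) for j in range(k)]
        # Gradient step followed by the L1 proximal (soft-threshold) step.
        x = [soft_threshold(x[j] - step * grad[j], step * lam) for j in range(k)]
    return x


# Hypothetical data: a nearly flat baseline with a peak at index 2.
basis = [[1.0, 1.0, 1.0, 1.0, 1.0],
         [1.0, -1.0, 0.0, -1.0, 1.0]]
y = [3.1, 2.9, 9.0, 2.9, 3.1]
sample_idx = [0, 1, 3, 4]                      # peak excluded from the fit

x_opt = search_weights(y, basis, sample_idx, lam=0.8, step=0.25)
y_c = [y[i] - sum(x_opt[j] * basis[j][i] for j in range(len(basis)))
       for i in range(len(y))]                 # baseline-subtracted spectrum
```

In this toy case the weakly supported second basis function is driven to exactly 0 while the first keeps a large weight, so the estimated baseline is built from a single basis function. A fuller implementation would also re-run the sample point determination between updates, as the text describes.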

In the subtraction step 108, once the search end condition is satisfied (see reference numeral 106C), the optimal baseline model Ax is subtracted from the spectrum y. This subtraction generates a spectrum yc from which the baseline component b has been removed. The spectrum yc is the target of analysis. The subtraction is expressed by the following equation (3).

yc = y − Ax   (3)

FIG. 2 shows an NMR measurement system. In the illustrated configuration example, the NMR measurement system consists of an NMR measurement device 30 and a spectrum processing device 32. Spectrum data is transferred from the NMR measurement device 30 to the spectrum processing device 32, for example via a network or via a storage medium.

Although its details are not shown in FIG. 2, the NMR measurement device 30 consists of a spectrometer and a measurement unit. The measurement unit includes a static magnetic field generator, a probe, and the like. The static magnetic field generator has a bore in the form of a vertical through hole, and the insertion portion of the probe is inserted into the bore. A detection circuit that detects the NMR signal from a sample is provided in the head of the insertion portion. The spectrometer includes a control unit, a transmission unit, a reception unit, and the like. The transmission unit generates a transmission signal according to a transmission pulse sequence, and the transmission signal is sent to the probe. The sample is thereby irradiated with electromagnetic waves. The NMR signal from the sample is then detected at the probe, and the reception signal produced by that detection is sent to the reception unit. The reception unit generates an NMR spectrum by applying an FFT operation to the reception signal. The NMR spectrum is sent to the spectrum processing device 32 as required. The spectrum processing device 32 may also be incorporated into the NMR measurement device 30.

The spectrum processing device 32, which is an information processing device, includes a CPU 50, a memory 56, an input device 52, a display 54, and the like. A spectrum processing program 58 and a spectrum analysis program 60 are stored in the memory 56. The spectrum processing program 58 executes the spectrum processing algorithm shown in FIG. 1. The spectrum analysis program 60 analyzes the preprocessed spectrum. The programs 58 and 60 are executed by the CPU 50.

The CPU 50 functions as the accepting means, the generating means, the determining means, the searching means, and the subtracting means. The input device 52 consists of a keyboard, a pointing device, and the like, and is used to specify the value assigned to p, the coefficient λ, the end condition, and so on. For example, the iterative processing may be terminated when the amount of change in the solution becomes 0.1% or less. The display 54 is, for example, an LCD, on which the spectrum is displayed.

The first to third embodiments are described below. The embodiments differ in the method used to determine the sample point sequence I. The remaining parts are basically common to all three.

FIG. 3 shows the spectrum processing method according to the first embodiment. In S10, the weight sequence x is initialized. For example, random values are assigned to the K weights that make up the weight sequence x. In S12A, the sample point sequence I is initialized. Specifically, a residual spectrum (residual sequence) is obtained by subtracting the baseline model Ax from the spectrum y, and the observation points that produced residuals of 0 or less are identified from the residual spectrum. Those observation points are determined as the sample point sequence I. Alternatively, in S12A, all N observation points may be determined as the sample point sequence I.

In S16, the evaluation value J is calculated. In S18, it is determined whether J has reached its minimum (or a value that can be regarded as the minimum) under the current sample point sequence I. If J is determined not to be at its minimum, the weight sequence x is updated in S20, and J is then calculated again in S16. J may be judged to have been minimized when its change becomes equal to or less than a certain value.

If it is determined in S18 that the evaluation value J has reached its minimum, it is determined in S22 whether the search end condition is satisfied. The search end condition is judged to be satisfied, for example, when J becomes equal to or less than a threshold, when the change in J becomes equal to or less than a certain value, or when the number of times J has been calculated reaches a certain count.

If it is determined that the search end condition is not satisfied, the sample point sequence I is updated in S24A. Specifically, as described above, the residual spectrum is calculated and the residuals of 0 or less are identified in it. The observation points that produced residuals of 0 or less are determined as the sample point sequence I. The steps from S16 onward are then executed again. As is clear from FIG. 3, the determination of the sample point sequence I is repeated until the search end condition is satisfied. If it is determined in S22 that the search end condition is satisfied, the optimal baseline model Ax (that is, the estimated baseline component) is subtracted from the spectrum in S28.
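
The outer loop of FIG. 3 can be sketched as follows. To keep the example short, the inner minimization of J (S16 to S20) is replaced by a closed-form least-squares fit of a straight line with no Lp term; this is a stand-in, not the search described in the embodiment. The names `fit_line`, `estimate_baseline` and `outer_iters` are hypothetical, the initial sample point sequence uses the "all observation points" alternative of S12A, and a fixed iteration count stands in for the search end condition of S22.

```python
def fit_line(points, y):
    """Closed-form least-squares fit y[n] ~ c0 + c1 * n over the given sample points."""
    m = len(points)
    sn = sum(points)
    sy = sum(y[n] for n in points)
    snn = sum(n * n for n in points)
    sny = sum(n * y[n] for n in points)
    c1 = (m * sny - sn * sy) / (m * snn - sn * sn)
    c0 = (sy - c1 * sn) / m
    return c0, c1


def estimate_baseline(y, outer_iters=6):
    sample = list(range(len(y)))            # S12A alternative: all observation points
    c0, c1 = fit_line(sample, y)            # stand-in for the minimization of J
    for _ in range(outer_iters):            # S22 stand-in: stop after a fixed count
        residual = [y[n] - (c0 + c1 * n) for n in range(len(y))]
        # S24A: keep the observation points whose residual is 0 or less
        new_sample = [n for n, r in enumerate(residual) if r <= 0.0]
        if len(new_sample) < 2:             # guard: a line needs at least two points
            break
        sample = new_sample
        c0, c1 = fit_line(sample, y)
    # S28: subtract the estimated baseline from the spectrum
    corrected = [y[n] - (c0 + c1 * n) for n in range(len(y))]
    return (c0, c1), sample, corrected


# Usage: a sloping baseline 1 + 0.1 n with two peaks.
spectrum = [1.0 + 0.1 * n for n in range(20)]
spectrum[5] += 5.0
spectrum[12] += 3.0
(c0, c1), sample, corrected = estimate_baseline(spectrum)
```

In this noise-free example the first fit is pulled upward by the peaks, so only points on the true baseline have non-positive residuals, and the second fit recovers the baseline; this matches the tendency, noted for the first embodiment, for the bottom of the spectrum to become the sample portion.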

FIG. 4 shows the effect of the first embodiment. A10 in the first row is the input spectrum. The second row shows the first minimization process (the process of minimizing with respect to the weight sequence), the third row shows the second minimization process, the fourth row shows the third minimization process, and the fifth row shows the sixth minimization process.

In each row, the left side shows the sample point sequence as a series of observation points (see A20, A30, A40, and A50), the center shows the estimated baseline model (see A22, A32, A42, and A52), and the right side shows the residual spectrum generated by the subtraction (see A24, A34, A44, and A54). The baseline model improves as the estimation process is repeated, and in the end the baseline component contained in the spectrum is effectively removed.

In the first embodiment, the portion corresponding to the bottom of the spectrum tends to become the sample portion. As a result, the number of sample points tends to become very small. The first embodiment is therefore considered to work well for metabolomics data and the like.

FIG. 5 shows the spectrum processing method according to the second embodiment. In FIG. 5, steps identical to those shown in FIG. 3 are given the same reference numerals, and their description is omitted.

In S12B, the sample point sequence I is initialized. Specifically, a residual spectrum (residual sequence) is obtained by subtracting the baseline model Ax from the spectrum y, and the observation points whose absolute residual values are within a threshold are identified from the residual spectrum. Those observation points are determined as the sample point sequence I. Alternatively, in S12B, all N observation points may be determined as the sample point sequence I.

In S24B, the sample point sequence I is updated. Specifically, as described above, the residual spectrum is calculated, and the absolute residual values within the threshold are identified from it. The observation points that produced those absolute residual values are determined as the sample point sequence I. In the second embodiment, regardless of sign, a portion that produced small residuals becomes a sample portion and a portion that produced large residuals becomes a non-sample portion. For example, the absolute value of the maximum of the spectrum may be used as a reference value, and 0.1% of that reference value may be set as the threshold.
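
The selection rule of S12B and S24B can be sketched in a few lines. The function name `select_by_abs_residual` and the parameter `ratio` are hypothetical; the default ratio of 0.001 corresponds to the 0.1% example given above, and the reference value is taken here as the largest absolute value in the spectrum.

```python
def select_by_abs_residual(y, baseline, ratio=0.001):
    """Return indices whose absolute residual is within ratio * max|y|."""
    threshold = ratio * max(abs(v) for v in y)
    return [n for n in range(len(y)) if abs(y[n] - baseline[n]) <= threshold]


# Usage: threshold is about 0.1 (0.1% of the maximum value 100).
y = [1.0, 1.05, 100.0, 0.5]
baseline = [1.0, 1.0, 1.0, 1.0]
sample_points = select_by_abs_residual(y, baseline)   # indices 0 and 1 are kept
```

Unlike the rule of the first embodiment, points slightly above the current baseline model are kept as well as points below it, which is why more sample points tend to be retained.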

FIG. 6 shows the effect of the second embodiment. B10 in the first row is the input spectrum. The second row shows the first minimization process, the third row shows the second minimization process, the fourth row shows the third minimization process, and the fifth row shows the ninth minimization process.

In each row, the left side shows the sample point sequence as a series of observation points (see B20, B30, B40, and B50), the center shows the estimated baseline model (see B22, B32, B42, and B52), and the right side shows the residual spectrum generated by the subtraction (see B24, B34, B44, and B54). The baseline model improves as the minimization process, which includes updating the sample point sequence I, is repeated. In the end, the baseline component contained in the spectrum is effectively removed.

In the second embodiment, many sample points tend to be adopted, which makes it easier to obtain stable calculation results. The threshold, however, must be set appropriately.

FIG. 7 shows the spectrum processing method according to the third embodiment. In FIG. 7, steps identical to those shown in FIG. 3 are given the same reference numerals, and their description is omitted.

In S12C, the sample point sequence I is initially determined. Sample points (observation points) are selected so that the histogram described in detail below approaches a predetermined distribution (a normal distribution), that is, so that the histogram is normalized. Alternatively, in S12C, all observation points may be initially determined as the sample point sequence I.

In S24C, as in S12C, sample points (observation points) are selected so that the histogram approaches the predetermined distribution (a normal distribution), that is, so that the histogram is normalized.

The method of determining the sample point sequence in the third embodiment is described below with reference to FIG. 8. In S30, all observation points are provisionally determined as sample point candidates. In S32, the baseline model is subtracted from the spectrum to obtain a residual spectrum, and a histogram is then generated from the residual spectrum. That is, the count (frequency) is tallied for each residual magnitude, and a histogram (residual histogram) is generated from the result. In S34, the known chi-square (χ²) test is applied to the histogram. The test determines whether the histogram follows a certain distribution (chi-square distribution).

If it is determined in S36 that the histogram does not follow the certain distribution, that is, if the result is false, the bins (the bars that make up the histogram) satisfying an exclusion condition are excluded from the histogram in S38. Specifically, the bin farthest from the center of the histogram is excluded. This corresponds to excluding one or more observation points (that is, one or more sample point candidates). Step S38 modifies the histogram and, in particular, brings it closer to a normal distribution.

If, as a result of the test, it is determined in S36 that the histogram follows the certain distribution, that is, if the result is true, the sample point candidates at that time become the sample point sequence in S40.
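
The procedure of FIG. 8 can be sketched as follows, with several assumptions that the description leaves open. The histogram is tested against a normal distribution fitted to the candidate residuals (mean and standard deviation), the "center" of the histogram is taken to be that mean, the number of bins and the significance level are free parameters, and the chi-square critical value is computed with the Wilson-Hilferty approximation rather than an exact quantile. All names (`chi_square_critical`, `select_by_histogram`, `n_bins`, `alpha`, `min_points`) are hypothetical.

```python
from statistics import NormalDist, fmean, pstdev


def chi_square_critical(dof, alpha=0.05):
    """Wilson-Hilferty approximation of the upper chi-square critical value."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    a = 2.0 / (9.0 * dof)
    return dof * (1.0 - a + z * a ** 0.5) ** 3


def select_by_histogram(residual, n_bins=10, alpha=0.05, min_points=20):
    cand = list(range(len(residual)))                  # S30: all points are candidates
    while len(cand) > min_points:
        vals = [residual[n] for n in cand]
        lo, hi = min(vals), max(vals)
        mu, sigma = fmean(vals), pstdev(vals)
        if hi == lo or sigma == 0.0:
            break
        width = (hi - lo) / n_bins
        # S32: residual histogram (the maximum value is clamped into the last bin)
        bin_of = {n: min(int((residual[n] - lo) / width), n_bins - 1) for n in cand}
        counts = [0] * n_bins
        for b in bin_of.values():
            counts[b] += 1
        # S34: chi-square statistic against the fitted normal distribution
        model = NormalDist(mu, sigma)
        stat = 0.0
        for b in range(n_bins):
            expected = len(cand) * (model.cdf(lo + (b + 1) * width) - model.cdf(lo + b * width))
            stat += (counts[b] - expected) ** 2 / max(expected, 1e-12)
        if stat <= chi_square_critical(n_bins - 3, alpha):   # S36 true, so S40
            break
        # S38: exclude the non-empty bin whose center is farthest from the mean
        far = max((b for b in range(n_bins) if counts[b] > 0),
                  key=lambda b: abs(lo + (b + 0.5) * width - mu))
        cand = [n for n in cand if bin_of[n] != far]
    return cand


# Usage: 200 normally shaped residuals followed by 5 large (peak-like) residuals.
noise = [NormalDist().inv_cdf((i + 0.5) / 200) for i in range(200)]
selected = select_by_histogram(noise + [20.0] * 5)
```

Because no residual threshold is supplied, the only tuning parameters in this sketch are the bin count and the significance level, which is consistent with the stated advantage of the third embodiment over the second.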

FIG. 9 shows the effect of the sample point determination method according to the third embodiment. C10 in the first row shows all sample point candidates, corresponding to the entire spectrum. The second row shows the result of the first histogram modification, the third row shows the result of the second histogram modification, and the fourth row shows the result of the sixth histogram modification. In each row, the left side shows the determined sample point sequence (see C20, C30, and C40) and the right side shows the modified histogram (see C22, C32, and C42). Note that the scales of the three horizontal axes differ. The sample point sequence improves as the number of histogram modifications increases.

FIG. 10 shows the effect of the spectrum processing method according to the third embodiment. D10 in the first row is the input spectrum. The second row shows the first minimization process, the third row shows the second minimization process, the fourth row shows the third minimization process, and the fifth row shows the ninth minimization process. In each row, the left side shows the sample point sequence as a series of sample points (see D20, D30, D40, and D50), the center shows the estimated baseline model (see D22, D32, D42, and D52), and the right side shows the residual spectrum generated by the subtraction (see D24, D34, D44, and D54). The baseline model improves as the estimation process is repeated, and in the end the baseline component contained in the spectrum is effectively removed.

The embodiment to be used may be selected according to the properties of the spectrum and the method used to analyze it. For example, when a broad signal component is regarded as a baseline component and is to be removed, the first embodiment may be used. When sufficient accuracy in estimating the weight sequence can be ensured, the second or third embodiment may be used.

The third embodiment rests on the assumption that the histogram of the baseline component in a spectrum follows a normal distribution. As described above, the number of sample points decreases as the processing is repeated, and the histogram approaches a normal distribution. Like the second embodiment, the third embodiment can derive the optimal solution stably. Because it requires no threshold setting, the third embodiment has the advantage over the second embodiment that a relatively good solution is easier to obtain without much dependence on the situation or environment.

100: accepting step (accepting means); 102: generating step (generating means); 104: determining step (determining means); 106: searching step (searching means); 108: subtracting step (subtracting means).

Claims

means for accepting a spectrum including a baseline component;
generating means for generating a baseline model by applying a weight sequence consisting of a plurality of weights to a basis function sequence consisting of a plurality of basis functions;
determining means for obtaining a residual sequence by comparing the spectrum and the baseline model, and determining, based on the residual sequence, a sample point sequence that defines a sample portion in the spectrum and a sample portion in the baseline model;
searching means for searching for an optimal weight sequence so that a fitting condition for fitting the sample portion in the baseline model to the sample portion in the spectrum is satisfied;
subtraction means for subtracting an optimal baseline model generated based on the optimal weight sequence from the spectrum;
including;
In the process of searching for the optimal weight sequence, the determination of the sample point sequence by the determining means is repeated,
In the process of searching for the optimal weight sequence, the sample point sequence and the baseline model are improved;
A spectrum processing device characterized by:

The spectrum processing device according to claim 1,
The search means searches for the optimal weight sequence so that the fitting condition and the Lp norm condition for reducing the Lp norm (where p≦1) of the weight sequence are satisfied.
A spectrum processing device characterized by:

The spectrum processing device according to claim 2,
In the process of searching for the optimal weight sequence, the weight sequence gradually becomes sparse.
A spectrum processing device characterized by:

The spectrum processing device according to claim 1,
The determining means determines the sample point sequence based on a plurality of residuals satisfying a predetermined condition in the residual sequence.
A spectrum processing device characterized by:

The spectrum processing device according to claim 1,
The determining means
generates a histogram based on the residual sequence, and
determines the sample point sequence based on the histogram.
A spectrum processing device characterized by:

a function of accepting a spectrum including a baseline component;
a function of generating a baseline model by applying a weight sequence consisting of a plurality of weights to a basis function sequence consisting of a plurality of basis functions;
a function of obtaining a residual sequence by comparing the spectrum and the baseline model, and determining, based on the residual sequence, a sample point sequence that defines a sample portion in the spectrum and a sample portion in the baseline model;
a function of searching for an optimal weight sequence so that a fitting condition for fitting the sample portion in the baseline model to the sample portion in the spectrum and an Lp norm condition for reducing the Lp norm (where p≦1) of the weight sequence are satisfied;
a function of subtracting an optimal baseline model generated based on the optimal weight sequence from the spectrum;
including;
In the process of searching for the optimal weight sequence, the determination of the sample point sequence is repeated,
In the process of searching for the optimal weight sequence, the sample point sequence and the baseline model are improved;
A program characterized by:

generating a baseline model by applying a weight sequence consisting of a plurality of weights to a basis function sequence consisting of a plurality of basis functions;
obtaining a residual sequence by comparing a spectrum and the baseline model, and determining, based on the residual sequence, a sample point sequence that defines a sample portion in the spectrum and a sample portion in the baseline model;
searching for an optimal weight sequence so that a fitting condition for fitting the sample portion in the baseline model to the sample portion in the spectrum and an Lp norm condition for reducing the Lp norm (where p≦1) of the weight sequence are satisfied; and
estimating a baseline component included in the spectrum as an optimal baseline model based on the optimal weight sequence,
In the process of searching for the optimal weight sequence, the determination of the sample point sequence is repeated,
In the process of searching for the optimal weight sequence, the sample point sequence and the baseline model are improved;
A spectrum processing method characterized by: