JP2006072933A

JP2006072933A - Data processing method and program

Info

Publication number: JP2006072933A
Application number: JP2004258782A
Authority: JP
Inventors: Osao Kaseda; 長生綛田; Toru Kajima; 亨鹿島
Original assignee: Azbil Corp
Current assignee: Azbil Corp
Priority date: 2004-09-06
Filing date: 2004-09-06
Publication date: 2006-03-16
Anticipated expiration: 2024-09-06
Also published as: JP4535811B2

Abstract

PROBLEM TO BE SOLVED: To provide a data processing method and program capable of thinning out data so that accurate smoothing is achieved when identifying a curve or curved surface model.

SOLUTION: The inter-data distances among, for example, k input data are calculated (step S1), the minimum value is selected, and the data pair giving that minimum inter-data distance is extracted (step S2). One datum of the extracted pair is then deleted, which constitutes the thinning-out process (step S3). Thereafter, an evaluation index of how the remaining data are arranged, such as the minimum inter-data distance of the data group after thinning, is calculated (step S4). The thinning-out process is continued or terminated on the basis of the evaluation index, and the curved surface model is generated from the processed data.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a data processing method and program suitable for use in, for example, the response surface methodology (RSM), which makes effective use of data from a small number of experiments to improve design efficiency. In particular, it relates to a data processing method and program for obtaining a continuous curve or curved surface by interpolating a plurality of discrete data.

With the recent diversification of user needs, intensifying market competition, and the entry of inexpensive overseas products, market demands on product quality, delivery time, and cost have become increasingly severe, and there is a growing demand for more efficient product and production design and for lower development costs. The response surface methodology is attracting attention as a technique for efficiently carrying out designs that involve many experiments.

For example, Patent Document 1 discloses a method of identifying, by biharmonic spline interpolation, the curved surface model used in the response surface methodology for efficiently designing or adjusting formulations, materials, or system operating conditions and product manufacturing conditions from data collected according to the design of experiments. It also discloses the need for smoothing in this method and one technique for it.

If biharmonic spline interpolation is applied without smoothing and the collected data contain noise or outliers, the interpolation follows that noise or those outliers, and an inappropriate interpolated surface results. Smoothing is therefore required. Patent Document 1, for example, discloses two examples of data smoothing (function approximation that does not necessarily pass through all of the collected data): obtaining the interpolation function from thinned-out (sampled) data, and suppressing the curvature of the interpolated surface.

Biharmonic spline interpolation, however, has the drawback that the computation for identifying the curved surface model diverges as the number of input variables grows. Applying thin plate spline (TPS) interpolation is an effective way to avoid this. Thin plate spline interpolation inherently includes a function for obtaining a smoothing effect, and this function corresponds to the method of suppressing the curvature of the interpolated surface. Thinning out data is also effective as smoothing in thin plate spline interpolation.

As described above, identifying a curved surface model requires smoothing, and the following two methods are effective for it:
1. Suppressing the curvature of the interpolated surface
2. Thinning out the data

Patent Document 1: JP 2002-183111 A

However, with the method of suppressing the curvature of the interpolated surface, the degree of suppression must be determined by some means, and this determination has the following problems.

Generalized cross-validation (GCV) is often used to determine the degree of suppression, but the degree of curvature suppression determined by GCV does not always give effective smoothing (Wahba, "Spline Models for Observational Data", Society for Industrial and Applied Mathematics, 1990). According to this reference, a smoothing technique based on statistics is not effective when there are not enough data to separate signal from noise.

Alternatively, a person can determine the degree of curvature suppression while visually checking the collected data against the interpolated surface, but this requires trial and error, so estimating the curved surface model takes time and effort. Furthermore, when the noise magnitude or the data density is uneven, suppressing the curvature of the interpolated surface may fail to give the desired surface.

With the method of thinning out data, the way of thinning becomes complicated as the input space gains more inputs. For example, a thinning that looks unbiased along one particular input axis may turn out to be biased when viewed over the whole input space.

The present invention has been made to solve these problems, and its object is to provide a data processing method and program capable of thinning out data so that accurate smoothing is achieved when identifying a curve or curved surface model.

To achieve the above object, the data processing method according to the present invention is characterized by thinning out data so that the distribution of data in a data group consisting of a plurality of discrete data becomes uniform, and by identifying a continuous curve or curved surface model by interpolating the thinned data group obtained after the thinning-out process.

In the present invention, the data are thinned out in the direction that makes their distribution uniform, and the thinned data group is then interpolated to identify a continuous curve or curved surface model. If data are thinned out of dense regions until their density is, for example, comparable to that of sparse regions, thinned data with a uniform distribution are obtained, and the desired smoothed curve or curved surface model can be generated.

Furthermore, the thinning-out process can be based on a distance index such as the distance or the similarity between the data in the data group, so that the density of the data distribution can be judged, and the thinning-out process executed, by a simple method.

Furthermore, in the thinning-out process, the minimum-distance data pair, for which the distance or distance index between one datum in the data group and another is smallest, can be extracted and replaced with a representative value of that pair. Thinning is thus achieved by replacing the pair with a representative value such as either one of its two data or their average.

Alternatively, in the thinning-out process, a predetermined number of data can be extracted in order of increasing distance or distance index, starting from the smallest distance or distance index between one datum in the data group and the others, and replaced with a representative value of those data. Thinning is thus achieved by extracting a cluster of the predetermined number of data and replacing it with its representative value.
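The cluster-based variant can be sketched roughly as follows. This is only one possible reading of the description, not the patent's implementation: it assumes the cluster is seeded at one datum of the globally closest pair and grown with that datum's nearest neighbours, that the representative value is the mean, and that the distance is the squared Euclidean distance. The names `sq_dist` and `thin_cluster` are hypothetical.

```python
from itertools import combinations

def sq_dist(a, b):
    """Squared Euclidean distance between two points (tuples)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def thin_cluster(points, m):
    """Replace a cluster of m mutually close points with their mean.

    Assumption: the cluster is the first datum of the closest pair
    plus its m-1 nearest neighbours.
    """
    seed, _ = min(combinations(range(len(points)), 2),
                  key=lambda ij: sq_dist(points[ij[0]], points[ij[1]]))
    # Sort all indices by distance from the seed; the seed itself comes first.
    order = sorted(range(len(points)),
                   key=lambda n: sq_dist(points[seed], points[n]))
    chosen = order[:m]
    dim = len(points[0])
    mean = tuple(sum(points[n][d] for n in chosen) / m for d in range(dim))
    rest = [p for n, p in enumerate(points) if n not in chosen]
    return rest + [mean]  # representative value appended at the end

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (20.0, 0.0)]
result = thin_cluster(pts, 3)
```

Here the three points near the origin are merged into a single representative point, while the two isolated points are left untouched.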

Furthermore, the number of data in the data group after the thinning-out process can be used as an evaluation index: when the evaluation index is smaller than a predetermined value, the thinning-out process is terminated and the data group at that point is taken as the thinned data group. The thinning-out process can thus be stopped at the desired degree of thinning.

Furthermore, a continuous curve or curved surface model can be identified by spline interpolation of the thinned data group obtained after the thinning-out process.

The program according to the present invention is a program for causing a computer to execute a predetermined operation, characterized by thinning out data so that the distribution of data in a data group consisting of a plurality of discrete data becomes uniform, and by identifying a continuous curve or curved surface model by interpolating the thinned data group obtained after the thinning-out process.

According to the data processing method of the present invention, data can be thinned out so that accurate smoothing is achieved when identifying a curve or curved surface model.

According to the program of the present invention, the thinning-out process and the identification of curve and curved surface models described above can be realized in software.

Specific embodiments to which the present invention is applied are described in detail below with reference to the drawings. In this embodiment, the present invention is applied to a data processing method for identifying a curve or curved surface model that provides a way of thinning out data suited to the essential purpose of identifying a curved surface model and, as a result, achieves appropriate smoothing.

Identifying a curved surface model requires smoothing, and thinning out data, one technique for it, itself requires an appropriate procedure. Thinning out data appropriately according to the thinning-out process of this embodiment not only enables appropriate smoothing but also avoids a wasteful increase in computation without degrading the accuracy of the curved surface model identification.

Specifically, attention is paid to the mutual distances, in the input space, of the data group consisting of the collected data, and data whose distance or distance index to other collected data is small are thinned out preferentially. As a result, the mutual distances within the data group remaining after thinning (the thinned data group) tend toward uniformity. In other words, by focusing on inter-data distances and preferentially thinning out data located in regions of the input space where the data density is high, a biased arrangement of data is avoided and appropriate smoothing is achieved. Here, thinning out is meant in a broad sense: it covers not only simply deleting data but also reducing the number of data by combining them.

Embodiment 1.
First, Embodiment 1 of the present invention is described. FIG. 1 is a functional block diagram of the data processing apparatus according to this embodiment. As shown in FIG. 1, a curved surface model identification apparatus 300 serving as the data processing apparatus has a thinning processing unit 100, which thins out a data group consisting of a plurality of experimental data collected by, for example, the design of experiments, on the basis of the distribution of its data, and an interpolation processing unit 200, which interpolates the thinned data group to identify a continuous curved surface model. Although this embodiment is described as identifying a curved surface model, the same method can of course be applied to identifying a curve model.

As the experimental data, design-of-experiments data collected by, for example, an orthogonal array or a spherical central composite design can be used. The data may also be non-design data, for example when a few newly performed repeat experiments are added to design-of-experiments data collected in the past, or when collection according to an experimental design was attempted but some data are missing. A data group collected in this way is generally discrete data with a biased distribution. Each datum is an n-dimensional datum (vector) characterized by n kinds of variables. The interpolation processing unit 200 described later can identify the curved surface model by, for example, treating (n-1) of the variables as factors of the solution and the remaining one as the solution, that is, taking the (n-1)-dimensional data as input variables and the remaining one as the output variable.

The thinning processing unit 100 thins out the input discrete data group so as to eliminate the bias in its distribution. In the dimensional space to which the data belong, the data group has sparse regions containing few data and dense regions containing many. The thinning processing unit 100 performs thinning, such as deleting data from dense regions or merging several data, so that the data group attains a distribution (density) comparable, for example, to that of the sparse regions, and outputs the thinned data (the thinned data group) to the interpolation processing unit 200.

The interpolation processing unit 200 identifies a response surface model from the thinned data group by interpolation. For this interpolation, the multivariable splines described later (thin plate spline, biharmonic spline), for example, can be used. Multivariable splines can interpolate even randomly distributed data and are easily applied to many variables. A smooth response surface passing through the data is thereby uniquely determined.

Next, the thinning processing unit 100 of the curved surface model identification apparatus of this embodiment is described in detail. FIG. 2 is a block diagram showing the functions of the thinning processing unit 100. As shown in FIG. 2, the thinning processing unit 100 has an inter-data distance calculation unit 101, which calculates the distances between the data in the input data group; a data extraction unit 102, which extracts a data pair on the basis of the calculated inter-data distances and performs thinning; and an evaluation index calculation unit 103, which calculates an evaluation index of the data after thinning on the basis of their inter-data distances and decides whether to terminate the thinning. When the evaluation index calculation unit 103 judges that the data after thinning have reached a predetermined evaluation criterion, those data are output as the thinned data group.

Any of the processing in the curved surface model identification apparatus 300 of this embodiment may be executed by hardware or by having a CPU (Central Processing Unit) execute a computer program. When the processing is executed in software, the computer program can be provided on a recording medium or transmitted via the Internet or another transmission medium.

At the start of processing, the inter-data distance calculation unit 101 receives an input data group such as experimental data. Thereafter, until the evaluation index calculation unit 103 decides to terminate the thinning, it receives the data group thinned by the data extraction unit 102. For each data group it calculates the distance between each datum and every other datum. That is, when k data are input, the distances from one datum to the remaining (k-1) data are calculated, giving (k-1) inter-data distances for that datum.

Here, the inter-data distance may be the spatial distance between two data or a distance or distance index of the kind used as a similarity (dissimilarity) in cluster analysis. Specifically, the inter-data distance calculation unit 101 can normalize each variable making up the data (including the output variable) and calculate a distance such as the squared Euclidean distance, the Minkowski distance, or the Mahalanobis distance. Which distance or distance index to calculate may be chosen according to the nature of the collected data. In the following description, these distances and distance indexes are collectively called distances.
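The normalization and distance calculation described here can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text does not say how the variables are normalized, so min-max scaling to [0, 1] is an assumption, and the names `normalize`, `squared_euclidean`, and `pairwise_distances` are hypothetical.

```python
from itertools import combinations

def normalize(points):
    """Scale every variable (column) to [0, 1]; min-max scaling is an assumption."""
    cols = list(zip(*points))
    lo = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1.0 for c in cols]  # constant column: avoid /0
    return [tuple((v - l) / s for v, l, s in zip(p, lo, span)) for p in points]

def squared_euclidean(a, b):
    """Squared Euclidean distance between two points (tuples)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pairwise_distances(points):
    """Return {(i, j): distance} for all i < j, i.e. k(k-1)/2 entries."""
    return {(i, j): squared_euclidean(points[i], points[j])
            for i, j in combinations(range(len(points)), 2)}

# Four data with two input variables and one output variable (illustrative values).
data = [(0.0, 10.0, 1.0), (1.0, 20.0, 2.0), (2.0, 30.0, 5.0), (4.0, 50.0, 3.0)]
dists = pairwise_distances(normalize(data))
```

With k = 4 data, `dists` holds 4 × 3 / 2 = 6 distances. A Minkowski or Mahalanobis distance could be swapped in for `squared_euclidean` without changing the rest of the sketch.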

When, for example, k data are input to the inter-data distance calculation unit 101, the data extraction unit 102 receives (k-1) inter-data distances for each datum, that is, k(k-1)/2 inter-data distances in total for the k data. From all of these, the data extraction unit 102 extracts the data pair with the minimum inter-data distance Lmin as the pair to be thinned. The pair extracted in this way is the most densely placed pair in the input data group in terms of distance. Thinning is performed by deleting either one of the pair. The datum that was not deleted is output, together with the remaining data, to the evaluation index calculation unit 103 as the data after thinning.

Either datum of the pair to be thinned may be deleted; for example, the data may be numbered in input order and the datum with the smaller number deleted. Alternatively, instead of deleting one of the two, the average of the pair may be calculated, both data of the pair deleted, and the average inserted in their place as a representative value. Furthermore, although in this embodiment the pair with the minimum inter-data distance is extracted and one datum is deleted, a plurality of data pairs may be extracted in order of increasing inter-data distance, starting from the minimum, and several data deleted at once, as described later.
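A single thinning step with the two variants described above (deleting the lower-numbered datum, or replacing the pair with its average) might look like the following sketch. The squared Euclidean distance, the assumption that the data are already normalized, and the names `sq_dist`, `closest_pair`, and `thin_once` are illustrative choices, not taken from the patent.

```python
from itertools import combinations

def sq_dist(a, b):
    """Squared Euclidean distance between two points (tuples)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def closest_pair(points):
    """Return indices (i, j), i < j, of the pair with the minimum distance."""
    return min(combinations(range(len(points)), 2),
               key=lambda ij: sq_dist(points[ij[0]], points[ij[1]]))

def thin_once(points, merge=False):
    """Thin the closest pair once.

    merge=False: delete the datum with the smaller input number.
    merge=True:  replace both data with their average.
    """
    i, j = closest_pair(points)
    out = list(points)
    if merge:
        out[i] = tuple((a + b) / 2 for a, b in zip(points[i], points[j]))
        del out[j]
    else:
        del out[i]
    return out

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (2.0, 0.5)]
deleted = thin_once(pts)              # drops (0.0, 0.0)
merged = thin_once(pts, merge=True)   # replaces the pair with its midpoint
```

Both variants reduce the data count by one per step; the averaging variant keeps some information from both data at the cost of introducing a point that was never measured.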

The evaluation index calculation unit 103 receives the data after thinning, calculates their evaluation index, and decides on the basis of this index whether to terminate the thinning. When termination is decided, the data after thinning are output as the thinned data to the interpolation processing unit 200 in the subsequent stage.

The evaluation index of the data after thinning can be obtained by the methods below, but is not limited to them; any index will do as long as it yields thinned data from which the interpolation processing unit 200 in the subsequent stage generates a smooth curved surface model. To generate a smooth curved surface model by interpolation, the thinned data group given as input should preferably have an unbiased distribution. If the distribution has sparse and dense regions, a surface model fitted to the data in the dense regions is not smooth, whereas a surface model made smooth with respect to the data in the dense regions becomes linear in the sparse regions, so the desired curved surface model cannot be obtained.

To avoid this, in this embodiment the data extraction unit 102 extracts and thins the densest data pair in the input data group. The simplest evaluation index is the number of data after thinning. In this case, the evaluation index calculation unit 103 need only count the number of data after thinning or the number of thinning operations. For example, when the input data group has k data, thinning is repeated until several percent to several tens of percent of the data have been removed. The number of data at which thinning is terminated may be set according to, for example, the distribution of the collected experimental data. Alternatively, after thinning to some extent, the interpolation processing unit 200 may identify a curved surface model, and thinning may be resumed if the desired curved surface model is not obtained.

Thinning may also be terminated when the minimum inter-data distance in the data after thinning is equal to or greater than a predetermined threshold. The threshold may be specified externally through an input means or the like, or may be determined by the evaluation index calculation unit 103 or the like on the basis of, for example, the maximum inter-data distance in the data after thinning.

Furthermore, after thinning a predetermined number of times, the data after thinning may be output to the interpolation processing unit 200 to generate a curved surface model, which the evaluation index calculation unit 103 receives as a provisional curved surface model and whose standard deviation it calculates. This standard deviation is compared with the standard deviation of the data before thinning, and termination is decided when the standard deviation of the provisional curved surface model becomes smaller than that of the data before thinning. The provisional curved surface model at that point is taken as the curved surface model, and the data after thinning used to generate it as the thinned data group. Alternatively, thinning may likewise be terminated when the standard deviation of the provisional curved surface model falls to or below a predetermined value.

The processing procedure of the thinning processing unit described above is now explained. FIG. 3 is a flowchart of the thinning method of this embodiment. As shown in FIG. 3, first, each variable (including the output variable) of the input data, for example k data, is normalized, and an inter-data distance such as the squared Euclidean distance is calculated between the data (step S1). For each datum, (k-1) inter-data distances are obtained, giving k(k-1)/2 inter-data distances in total. As noted above, besides the squared Euclidean distance, any distance used as a similarity or dissimilarity in cluster analysis, such as the Minkowski distance or the Mahalanobis distance, may be used.

Next, the minimum of the inter-data distances is selected, the data pair giving this minimum is extracted (step S2), and one datum of the extracted pair is deleted as the thinning (step S3). Instead of deleting one datum, the pair may be replaced with a representative value such as its average.

Then, an evaluation index of how the data remaining after thinning are arranged is calculated (step S4). As described above, the evaluation index can be, for example, the number of data in the data group after thinning, the standard deviation of the data group after thinning, or the minimum inter-data distance of the data group after thinning. On the basis of this index, it is decided whether to continue thinning or to terminate it and proceed to generating the curved surface model. For example, thinning is terminated and the curved surface model is identified when the number of data after thinning is at or below a predetermined value, when the minimum inter-data distance after thinning is at or above a predetermined value, or when the standard deviation of the provisional curved surface model obtained from the data after thinning is smaller than that of the data group before thinning.
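Steps S1 to S4 can be put together in a loop such as the following sketch. It covers only two of the stopping criteria mentioned (data count and minimum inter-data distance) and omits the standard-deviation criterion, which needs the interpolated surface. It assumes the data are already normalized, uses the squared Euclidean distance (so `min_dist` is a squared threshold), and deletes the lower-numbered datum of each pair. The names `sq_dist`, `closest_pair`, `thin`, `max_count`, and `min_dist` are hypothetical.

```python
from itertools import combinations

def sq_dist(a, b):
    """Squared Euclidean distance between two points (tuples)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def closest_pair(points):
    """Steps S1 and S2: return (distance, i, j) of the closest pair, i < j."""
    return min((sq_dist(points[i], points[j]), i, j)
               for i, j in combinations(range(len(points)), 2))

def thin(points, max_count=None, min_dist=None):
    """Repeat thinning until one of the given stopping criteria is met."""
    if max_count is None and min_dist is None:
        raise ValueError("give at least one stopping criterion")
    pts = list(points)
    while len(pts) >= 2:
        _, i, _ = closest_pair(pts)   # S1, S2: find the densest pair
        del pts[i]                    # S3: delete the lower-numbered datum
        # S4: evaluate the remaining data and decide whether to stop
        if max_count is not None and len(pts) <= max_count:
            break
        if min_dist is not None and (
                len(pts) < 2 or closest_pair(pts)[0] >= min_dist):
            break
    return pts

pts = [(0.0,), (1.0,), (1.5,), (10.0,), (20.0,), (30.0,)]
by_count = thin(pts, max_count=4)
by_distance = thin(pts, min_dist=80.0)
```

Recomputing all pairwise distances on every pass keeps the sketch short; a practical implementation would more likely update a cached distance table after each deletion.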

In the present embodiment, data thinning is performed, based on the distribution state of the data, as the smoothing process required for identifying the curved surface model. For example, by thinning the data until the number of data points in densely populated regions becomes comparable to the number in sparse regions, the bias in the spatial arrangement of the data is eliminated and appropriate smoothing can be performed.

In conventional biharmonic spline and thin plate spline interpolation, the amount of computation and the required storage grow on the order of the square of the number of data points used to construct the interpolation surface formula. Although a larger number of data points is preferable for improving the accuracy of curved surface model identification, when redundant data are included it is difficult to avoid a wasteful increase in computation. In contrast, in the present embodiment, the thinning process removes data that can be regarded as redundant from the input data group, which reduces the number of data points used in the interpolation process. This eliminates wasteful computation and speeds up the processing.

Embodiment 2.
Next, Embodiment 2 of the present invention will be described. Whereas in Embodiment 1 the data pair with the minimum inter-data distance is extracted and thinned, in the present embodiment a plurality of data points are extracted in increasing order of inter-data distance, starting from the smallest, and the thinning process is applied to the cluster made up of these data points.

The curved surface model identification apparatus serving as the data processing apparatus in the present embodiment can have the same configuration as that shown in FIGS. 1 and 2. However, the processing in the data extraction unit 102 shown in FIG. 2 differs.

As in Embodiment 1, the data extraction unit 102 in the present embodiment receives the inter-data distances of the data from the inter-data distance calculation unit 101. It then extracts m data points, starting from the smallest inter-data distance (hereinafter also referred to as the thinning target cluster). In ordinary cluster analysis, m = 2. Next, a representative value of the thinning target cluster consisting of these m data points is calculated. The representative value may be the average of the m data points or, when the reliability of the data can be assessed, a weighted average weighted by reliability. The thinning process is performed by deleting all data belonging to the thinning target cluster and adding this representative value. That is, when m is greater than 2, a single thinning step reduces the number of data points by two or more.
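The reliability-weighted representative value mentioned above can be sketched as a plain weighted mean. How reliability is quantified is not specified in the text, so the non-negative weights here are an assumption, and weighted_representative is a hypothetical name.

```python
def weighted_representative(points, weights):
    """Weighted mean of the cluster members, one coordinate at a time."""
    total = sum(weights)
    return tuple(sum(w * x for w, x in zip(weights, column)) / total
                 for column in zip(*points))

cluster = [(0.0, 0.0), (2.0, 4.0)]
print(weighted_representative(cluster, [1.0, 1.0]))  # equal weights give the plain average
print(weighted_representative(cluster, [1.0, 3.0]))  # second point trusted three times as much
```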

Thereafter, as in Embodiment 1, the evaluation index calculation unit decides whether to terminate the thinning process based on the evaluation index of the data after thinning. FIG. 4 is a flowchart showing the thinning processing method according to the present embodiment.

As shown in FIG. 4, first, as in Embodiment 1, each variable (including output variables) of the input data, for example k data points, is normalized, and an inter-data distance such as the squared Euclidean distance is calculated between the data points (step S11). Next, m data points are extracted in increasing order of inter-data distance, starting from the smallest (step S12), and the average of the extracted m data points (the thinning target cluster) is obtained (step S13). Then, all data belonging to the thinning target cluster are deleted, and the average obtained in step S13 is added (step S14).
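Steps S12 to S14 can be sketched as follows. The text does not spell out exactly how the m data points are gathered from the sorted distances; this sketch assumes one plausible reading, namely walking the pairs in ascending order of distance and collecting their endpoints until m distinct points are found. The names select_cluster and thin_cluster are hypothetical, and the data are assumed to be normalized already.

```python
from itertools import combinations

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def select_cluster(data, m):
    """Step S12 (assumed reading): collect m distinct indices from the closest pairs."""
    pairs = sorted(combinations(range(len(data)), 2),
                   key=lambda p: sq_dist(data[p[0]], data[p[1]]))
    chosen = []
    for pair in pairs:
        for idx in pair:
            if idx not in chosen and len(chosen) < m:
                chosen.append(idx)
        if len(chosen) == m:
            break
    return chosen

def thin_cluster(data, m):
    """Steps S13 and S14: replace the cluster with the mean of its members."""
    cluster = select_cluster(data, m)
    members = [data[i] for i in cluster]
    mean = tuple(sum(column) / len(members) for column in zip(*members))
    return [p for i, p in enumerate(data) if i not in cluster] + [mean]

data = [(0.0, 0.0), (0.1, 0.0), (0.3, 0.0), (5.0, 5.0), (9.0, 1.0)]
print(thin_cluster(data, 3))
```

With m = 3 a single step removes three points and adds one, so the data count drops by two. Under this reading the collected points are not guaranteed to be mutually close when m is large, which is one reason to treat the selection rule as an assumption.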

Thereafter, as in Embodiment 1, the evaluation index of the data remaining after the thinning process is calculated (step S15), and it is decided whether to terminate the thinning process (step S16). That is, as described above, the number of data points in the data group after thinning, the standard deviation of the data group after thinning, the minimum inter-data distance of the data group after thinning, or the like is used as the evaluation index. For example, the thinning process is terminated and the curved surface model is identified when the number of data points after thinning is less than or equal to a predetermined value, when the minimum inter-data distance after thinning is greater than or equal to a predetermined value, or when the standard deviation of a provisional curved surface model obtained from the data group after thinning is smaller than the standard deviation of the data group before thinning.

In the present embodiment, the average of the thinning target cluster is used as the representative value, and smoothing is performed by a thinning process that replaces the data of the thinning target cluster with this average. This provides the same effects as Embodiment 1: appropriate thinning removes the bias in the data so that a smooth curved surface model can be obtained, and the thinning process reduces the amount of computation. In addition, increasing the number of data points extracted as a cluster speeds up the thinning process. Moreover, because the average value is inserted in place of the deleted cluster, the bias in the data is further reduced and the distribution of the data can be made uniform.

Next, a specific example of the processing of the interpolation processing unit 200 will be described. In the interpolation processing of the present embodiment, a continuous curve or curved surface model is identified by, for example, applying spline interpolation or Lagrange interpolation to the thinned data group, or by obtaining a multiple regression equation from the thinned data group.

Of these, spline interpolation will be described in detail. The problem of generating a surface with splines can be recast as a deformation problem for an elastic body such as a beam or a plate. That is, if the collected experimental data (sample points) are regarded as displacements imposed at certain locations on an elastic body, finding the smoothest spline surface can be regarded as the problem of minimizing the internal strain energy E, given by equation (1) below, that arises when those displacements are imposed on the elastic body.

Among the methods for solving equation (1) above, the solution using a Green function described in Patent Document 1 is easy to apply to multivariable and randomly distributed data, and it leads to the biharmonic spline given by equation (2) below.

Here, n is the number of given data points; d_i is the Euclidean distance between the X coordinate of the given data point i and an arbitrary X coordinate; and g(d_i) is a Green function of d_i, whose definition depends on the number of dimensions of the input variable X. For example, in the two-dimensional case X = (x_1, x_2), the Green function in equation (2) is given by equation (3) below. α_i is a coefficient that can be computed from the given data by a linear matrix operation.
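The "linear matrix operation" for the coefficients α_i can be sketched with NumPy as follows. The sketch takes the model to be a sum of α_i g(d_i) over the n data points, as the definitions above suggest, and assumes the commonly used two-dimensional biharmonic Green function g(d) = d^2 (ln d − 1). The patent's own equations (2) and (3) are not reproduced in this text, so both the Green function and the absence of any additional terms are assumptions. The names green_2d, fit_biharmonic and evaluate are hypothetical.

```python
import numpy as np

def green_2d(d):
    """Assumed 2-D biharmonic Green function g(d) = d^2 (ln d - 1), with g(0) = 0."""
    d = np.asarray(d, dtype=float)
    out = np.zeros_like(d)
    nz = d > 0
    out[nz] = d[nz] ** 2 * (np.log(d[nz]) - 1.0)
    return out

def fit_biharmonic(X, y):
    """Solve G alpha = y, where G[i, j] = g(distance between data points i and j)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.linalg.solve(green_2d(d), y)

def evaluate(X, alpha, Xq):
    """f(Xq) = sum_i alpha_i * g(d_i), with d_i the distance from Xq to data point i."""
    d = np.linalg.norm(Xq[:, None, :] - X[None, :, :], axis=-1)
    return green_2d(d) @ alpha

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([0.0, 1.0, 1.0, 3.0])
alpha = fit_biharmonic(X, y)
print(evaluate(X, alpha, X))                      # reproduces y at the data points
print(evaluate(X, alpha, np.array([[0.5, 0.5]])))  # value at an arbitrary X
```

The matrix G has n × n entries, which is where the quadratic growth in storage with the number of data points comes from.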

For a thin plate spline with a smoothing function, finding the smoothest spline surface can be regarded as the problem of minimizing the energy E′ given by equation (4) below.

From equation (4) above, the thin plate spline of equation (5) below is derived. Here, λ is the smoothing parameter, c_j is a coefficient, and p is the number of dimensions of the input variable.
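The role of the smoothing parameter λ can be illustrated with a textbook thin plate smoothing spline for p = 2. Equations (4) and (5) are not reproduced in this text, so this sketch assumes the standard formulation: kernel U(r) = r^2 ln r, an affine part with p + 1 coefficients c_j, and λ added to the diagonal of the kernel matrix. The scaling of λ in the patent may differ. The names tps_kernel, fit_tps and predict_tps are hypothetical.

```python
import numpy as np

def tps_kernel(r):
    """Standard 2-D thin plate kernel U(r) = r^2 ln r, with U(0) = 0 (assumed form)."""
    out = np.zeros_like(r)
    nz = r > 0
    out[nz] = r[nz] ** 2 * np.log(r[nz])
    return out

def fit_tps(X, y, lam):
    """Solve [[K + lam*I, P], [P^T, 0]] [w; c] = [y; 0] for kernel weights w and affine terms c."""
    n, p = X.shape
    K = tps_kernel(np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1))
    P = np.hstack([np.ones((n, 1)), X])
    A = np.block([[K + lam * np.eye(n), P],
                  [P.T, np.zeros((p + 1, p + 1))]])
    sol = np.linalg.solve(A, np.concatenate([y, np.zeros(p + 1)]))
    return sol[:n], sol[n:]

def predict_tps(X, w, c, Xq):
    """Evaluate the fitted surface at the query points Xq."""
    r = np.linalg.norm(Xq[:, None, :] - X[None, :, :], axis=-1)
    return tps_kernel(r) @ w + c[0] + Xq @ c[1:]

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
y = np.array([0.0, 1.0, 1.0, 3.0, 0.5])
for lam in (0.0, 1.0, 1e4):
    w, c = fit_tps(X, y, lam)
    print(lam, predict_tps(X, w, c, X))
```

With λ = 0 the surface passes through every sample point; as λ grows the fitted values move toward the least-squares plane, which is the behaviour the comparative examples rely on.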

Next, the effect of the present invention will be described using, as an example, a curve model to which the present embodiment is applied. FIG. 5 is a graph showing a working example in which a curve model was obtained by applying Embodiment 2 described above to given data points. FIG. 6 is a graph showing a comparative example in which a curve model was obtained from the same data points, without thinning, by the conventional curvature-suppressing method (thin plate spline).

As shown in FIG. 5, by extracting a cluster based on the minimum inter-data distance and repeating the process of replacing the cluster data with their average, smoothing is achieved and a smooth curve model that still expresses the characteristics of the sample points is generated.

The comparative example shown in FIG. 6 is a curve model obtained using a thin plate spline. Here λ is the smoothing parameter described above, which indicates the degree of curvature suppression; the larger λ is, the closer the curve model comes to a straight line. In contrast to the curve model shown in FIG. 5, when the degree of curvature suppression (smoothing parameter λ) is changed manually by an external instruction as shown in FIG. 6, increasing λ makes the curve almost a straight line in the region where the sample points are sparse (x = 10 to 40). Moreover, in the regions where the sample points are dense (near x = 0 and near x = 50), whether λ is small or large, the curve model cannot adequately interpolate the sample points, and unnecessary local minima and maxima arise, so that a smooth curve model cannot be obtained. That is, when the data distribution has both dense and sparse regions, it is difficult to obtain the desired curve model no matter how the smoothing parameter λ is changed.

Similarly, FIG. 7 shows a comparative example in which a curved surface model of three-dimensional data containing noise was obtained by varying the smoothing parameter λ without performing the thinning process. When the sample points contain noise, a surface that passes through all of the given sample points has many small bumps, as shown for λ = 0. As the smoothing parameter λ is gradually increased, the generated surface model becomes smoother, but it is smoothed toward a multiple regression equation (linear polynomial), and at λ = 200 it approaches a linear form (a plane). Thus, when the data distribution is biased, a smooth curved surface model cannot be generated by adjusting the smoothing parameter λ alone.

From the above, in order to identify a continuous curve model or curved surface model from discrete data whose distribution is biased, it is extremely effective to thin the data before generating the model so as to remove the bias in the distribution of the input data group.

The present invention is not limited to the embodiments described above, and various modifications can of course be made without departing from the gist of the present invention.

FIG. 1 is a functional block diagram showing the data processing apparatus according to Embodiment 1 of the present invention.
FIG. 2 is a block diagram showing the functions of the thinning processing unit of the data processing apparatus according to Embodiment 1 of the present invention.
FIG. 3 is a flowchart showing the thinning processing method in the data processing apparatus according to Embodiment 1 of the present invention.
FIG. 4 is a flowchart showing the thinning processing method in the data processing apparatus according to Embodiment 2 of the present invention.
FIG. 5 is a graph showing a working example in which a curve model was obtained by applying Embodiment 2 described above to given data points.
FIG. 6 is a graph showing a comparative example in which a curve model was obtained from the same data points, without thinning, by the conventional curvature-suppressing method (thin plate spline).
FIG. 7 is a graph showing a comparative example in which a curved surface model of three-dimensional data containing noise was obtained by varying the smoothing parameter λ without thinning.

Explanation of symbols

101 Inter-data distance calculation unit
102 Data extraction unit
103 Evaluation index calculation unit
200 Interpolation processing unit
300 Curved surface model identification apparatus

Claims

A data processing method comprising: thinning out data from dense regions so that the distribution of data in a data group consisting of a plurality of discrete data becomes uniform; and identifying a continuous curve or curved surface model by interpolating the thinned data group obtained by the thinning process.

The data processing method according to claim 1, wherein the thinning process is performed based on a distance or distance index between the data in the data group.

The data processing method according to claim 1, wherein, in the thinning process, a minimum-distance data pair that minimizes the distance or distance index between one data item and another data item included in the data group is extracted, and the minimum-distance data pair is replaced with a representative value of the minimum-distance data pair.

The data processing method according to claim 1, wherein, in the thinning process, a predetermined number of data items are extracted in increasing order of the distance or distance index between one data item and another data item included in the data group, starting from the minimum, and the predetermined number of data items are replaced with a representative value of the predetermined number of data items.

The data processing method according to claim 1, wherein the minimum value of the distance or distance index between one data item and another data item included in the data group is used as an evaluation index, and when the evaluation index is larger than a predetermined value, the thinning process is terminated and the data group obtained by the thinning process is used as the thinned data group.

The data processing method according to claim 1, wherein a continuous curve or curved surface model is identified by performing spline interpolation on the thinned data group after the thinning process.

A program for causing a computer to execute predetermined operations, the operations comprising: thinning out data so that the distribution of data in a data group consisting of a plurality of discrete data becomes uniform; and identifying a continuous curve or curved surface model by interpolating the thinned data group obtained by the thinning process.