JP2021022276A

JP2021022276A - Data processing method, data processing apparatus, and program

Info

Publication number: JP2021022276A
Application number: JP2019139631A
Authority: JP
Inventors: 直哉古渡; Naoya Kowatari; 正隆小石; Masataka Koishi
Original assignee: Yokohama Rubber Co Ltd
Current assignee: Yokohama Rubber Co Ltd
Priority date: 2019-07-30
Filing date: 2019-07-30
Publication date: 2021-02-18
Anticipated expiration: 2039-07-30
Also published as: JP7360016B2

Abstract

To efficiently create, from an original data set including experimental data and simulation data, a prediction module that accurately predicts a value related to a feature quantity from input values of a plurality of explanatory variables. SOLUTION: An original data set including experimental data and simulation data is used to create a corrected simulation data set made up of corrected simulation data obtained by correcting a calculated value in the simulation data. The corrected simulation data set, an experimental data set made up of the experimental data, and an integrated data set obtained by integrating the corrected simulation data set and the experimental data set are used as data for machine learning to create a plurality of prediction module candidates. A corrected simulation data set for verification, an experimental data set for verification, and an integrated data set for verification are used to evaluate prediction accuracy for each of the prediction module candidates and thereby determine a prediction module. SELECTED DRAWING: Figure 1

Description

The present invention relates to a data processing method, a data processing apparatus, and a program for forming a prediction module with which a computer, given the values of a plurality of explanatory variables as input, predicts and outputs a value related to a predetermined feature amount.

In recent years, techniques in which a computer performs machine learning and makes various predictions from input data have been actively proposed. Meanwhile, it has long been the practice to blend a plurality of rubber materials, fillers, oils, and the like by trial and error, prototype vulcanized rubber compositions, and measure their physical property data experimentally. As a result, a large amount of data linking the compounding information of vulcanized rubber compositions to physical property values has been accumulated. This accumulated data can be used as a training data set so that a computer machine-learns it and predicts physical property values from input data.

For example, a technique is known that uses a neural network to learn the mapping between a factor group, which consists of explanatory variables such as design and compounding, and a characteristic group, which consists of feature amounts. The technique estimates the feature amount values from the values of the explanatory variables and, for any given feature amount value, efficiently and easily obtains the optimum values of the explanatory variables that produce it (Patent Document 1).

Japanese Unexamined Patent Publication No. 2003-58582

In the neural network training of this technique, all of the prepared original data are read uniformly and used as a plurality of training data items. Original data often include data obtained from past experiments and simulations. That is, the original data often hold a plurality of experimental data items, in which the feature amount value is an experimental value of the measurement object, and a plurality of simulation data items, in which the feature amount value is a simulation calculated value obtained by running a simulation with a simulation model of the measurement object.
A simulation is computed with a simulation model so as to reproduce the experiment. When the factors (explanatory variables) are varied, the change in the feature amount value calculated by the simulation corresponds to the change in the feature amount value obtained experimentally. However, the simulated feature amount value often does not coincide with the experimental one, and a deviation often exists.
For this reason, it is difficult to lump together original data that include both experimental data and simulation data and create a prediction module that has machine-learned the relationship between the explanatory variables and the feature amount.

Accordingly, an object of the present invention is to provide a data processing method, a data processing apparatus, and a program that causes a computer to execute the data processing method, each of which, when determining a prediction module with which a computer predicts and outputs a value related to a predetermined feature amount from input values of a plurality of explanatory variables, can efficiently create a highly accurate prediction module that has machine-learned the relationship between the explanatory variables and the feature amount using an original data set including experimental data and simulation data.

One aspect of the present invention is a data processing method for forming a prediction module with which a computer predicts and outputs a value related to a predetermined feature amount from input values of a plurality of explanatory variables. The data processing method includes:
a step in which the computer uses an original data set to create a corrected simulation data set, where the original data set holds a plurality of data items, each holding as a set the value of each of a plurality of explanatory variables and a feature amount value to be associated with the explanatory variable values, the data items comprising experimental data, in which the feature amount value is an experimental value of a measurement object, and simulation data, in which the feature amount value is a simulation calculated value obtained by running a simulation with a simulation model of the measurement object, and where the corrected simulation data set is composed of corrected simulation data in which the feature amount value in the simulation data has been corrected based on the correspondence between the feature amount value in the simulation data and the feature amount value in the experimental data;
a step in which the computer separates each of the corrected simulation data set and an experimental data set composed of the experimental data into a training data set and a verification data set, thereby generating a training corrected simulation data set, a training experimental data set, a verification corrected simulation data set, and a verification experimental data set;
a step in which the computer creates a plurality of prediction module candidates, each having machine-learned the relationship between the explanatory variables and the feature amount, using each of the training corrected simulation data set, the training experimental data set, and a training integrated data set obtained by integrating the training corrected simulation data set and the training experimental data set;
a step in which the computer evaluates the prediction accuracy of each of the machine-learned prediction module candidates using the verification corrected simulation data set, the verification experimental data set, and a verification integrated data set obtained by integrating the verification corrected simulation data set and the verification experimental data set; and
a step in which the computer determines the prediction module from among the plurality of prediction module candidates based on the evaluation result of the prediction accuracy.
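The split, train, evaluate, and select steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method as claimed: the helper names (`split`, `fit_line`, `rmse`, `build_and_select`), the single explanatory variable, the straight-line model, the 25% verification fraction, and the rule of picking the candidate with the lowest error on the integrated verification set are all hypothetical choices that the source does not specify. The input simulation data are assumed to have been corrected already.

```python
import math
import random


def split(data, val_fraction, rng):
    """Shuffle a copy of the data and separate it into (training, verification)."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def fit_line(data):
    """Least-squares straight line; a stand-in for any machine-learned model."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    slope = sxy / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def rmse(model, data):
    """Root-mean-square prediction error of a model on (x, y) pairs."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in data) / len(data))


def build_and_select(corrected_sim, experimental, val_fraction=0.25, seed=0):
    rng = random.Random(seed)
    sim_train, sim_val = split(corrected_sim, val_fraction, rng)
    exp_train, exp_val = split(experimental, val_fraction, rng)

    # Three training sets and three verification sets, as in the steps above.
    train_sets = {
        "corrected_sim": sim_train,
        "experimental": exp_train,
        "integrated": sim_train + exp_train,
    }
    val_sets = {
        "corrected_sim": sim_val,
        "experimental": exp_val,
        "integrated": sim_val + exp_val,
    }

    # One prediction module candidate per training set.
    candidates = {name: fit_line(d) for name, d in train_sets.items()}

    # Evaluate every candidate on every verification set.
    scores = {
        name: {v_name: rmse(model, v) for v_name, v in val_sets.items()}
        for name, model in candidates.items()
    }

    # Assumed selection rule: lowest error on the integrated verification set.
    best = min(scores, key=lambda name: scores[name]["integrated"])
    return candidates[best], best, scores


# Illustrative data only: (explanatory value, feature amount value) pairs.
corrected_sim = [(float(x), 2.0 * x + 1.0) for x in range(8)]
experimental = [(x + 0.5, 2.0 * (x + 0.5) + 1.2) for x in range(8)]
model, best_name, scores = build_and_select(corrected_sim, experimental)
```

The `scores` table keeps every candidate's error on every verification set, so a different selection rule can be substituted without retraining.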

Preferably, the simulation data include simulation calculated values obtained by running the simulation with the simulation model using the explanatory variable values that realize the maximum value and the minimum value of the feature amount in the plurality of experimental data items, and, in correcting the feature amount value, the feature amount value of the training simulation data set is corrected using (i) the correspondence between the maximum and minimum values and the simulation calculated values corresponding to each of them, and (ii) the correspondence between a simulation calculated value and the feature amount value of experimental data whose feature amount value lies between the maximum value and the minimum value and whose explanatory variable values agree with those of the simulation within an allowable range.
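As a rough illustration of how such correspondences might be collected, the sketch below pairs each experimental data item with a simulation data item whose explanatory variable values agree within a tolerance, then checks that the experimental maximum and minimum are among the paired values. The function names, the per-variable absolute tolerance, and the sample values are assumptions made for this example; the source does not define how the allowable range is measured.

```python
def match_pairs(experimental, simulated, tol):
    """Return sorted (simulation value, experimental value) pairs.

    Each data item is (explanatory_values, feature_value). Two items match when
    every explanatory variable agrees within `tol` (an assumed criterion).
    """
    pairs = []
    for x_exp, y_exp in experimental:
        for x_sim, y_sim in simulated:
            same_length = len(x_exp) == len(x_sim)
            if same_length and all(abs(a - b) <= tol for a, b in zip(x_exp, x_sim)):
                pairs.append((y_sim, y_exp))
                break  # use the first matching simulation item
    return sorted(pairs)


def covers_extremes(pairs, experimental):
    """True if the experimental maximum and minimum both have a simulation partner."""
    exp_values = [y for _, y in experimental]
    paired_exp = {y_exp for _, y_exp in pairs}
    return max(exp_values) in paired_exp and min(exp_values) in paired_exp


# Illustrative values only.
experimental = [((1.0, 2.0), 10.0), ((2.0, 2.0), 14.0), ((3.0, 2.0), 20.0)]
simulated = [((1.0, 2.0), 8.0), ((2.01, 2.0), 11.0), ((3.0, 2.0), 15.0), ((2.5, 2.0), 13.0)]

pairs = match_pairs(experimental, simulated, tol=0.05)
has_extremes = covers_extremes(pairs, experimental)
```

The simulation item at `(2.5, 2.0)` has no experimental partner, so it contributes no pair; it is one of the items whose value would later be corrected using the pairs that do exist.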

Preferably, when the prediction accuracy is evaluated, the prediction module candidate machine-learned using the training integrated data set is evaluated for prediction accuracy with each of the verification corrected simulation data set, the verification experimental data set, and the verification integrated data set.

Preferably, when the prediction accuracy is evaluated, for the prediction module candidate machine-learned using the training integrated data set,
(1) its prediction accuracy on the verification experimental data set is compared with the prediction accuracy, on the verification experimental data set, of the prediction module candidate machine-learned using the training experimental data set,
(2) its prediction accuracy on the verification corrected simulation data set is compared with the prediction accuracy, on the verification corrected simulation data set, of the prediction module candidate machine-learned using the training corrected simulation data set, and
the prediction module candidate machine-learned using the training integrated data set is evaluated based on the comparison results.
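Comparisons (1) and (2) can be expressed as a small check on a table of verification errors. The sketch below is only one possible reading: the source says the integrated candidate is evaluated based on the comparison results but does not state a pass criterion, so the "no worse than the single-source candidate, within a tolerance" rule, the function name, and the error values are all assumptions.

```python
def evaluate_integrated(scores, tolerance=0.0):
    """Compare the integrated-trained candidate with the single-source candidates.

    scores[training_set][verification_set] is a prediction error (lower is better).
    """
    # (1) Integrated vs. experimental-trained, both on the experimental verification set.
    ok_exp = (
        scores["integrated"]["experimental"]
        <= scores["experimental"]["experimental"] + tolerance
    )
    # (2) Integrated vs. simulation-trained, both on the corrected-simulation verification set.
    ok_sim = (
        scores["integrated"]["corrected_sim"]
        <= scores["corrected_sim"]["corrected_sim"] + tolerance
    )
    return {
        "no_worse_on_experimental": ok_exp,
        "no_worse_on_corrected_sim": ok_sim,
        "accept": ok_exp and ok_sim,  # assumed acceptance rule
    }


# Illustrative error values only.
example_scores = {
    "experimental": {"experimental": 0.30},
    "corrected_sim": {"corrected_sim": 0.20},
    "integrated": {"experimental": 0.25, "corrected_sim": 0.22},
}

strict = evaluate_integrated(example_scores)
lenient = evaluate_integrated(example_scores, tolerance=0.05)
```

With these numbers the integrated candidate beats the experimental-only candidate on experimental data but is slightly worse on corrected simulation data, so it is accepted only when a small tolerance is allowed.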

Preferably, the simulation data include first simulation data and second simulation data that differ in at least one of the configuration of the simulation model and the simulation method, and the creation of the corrected simulation data set, the generation of the training corrected simulation data set and the verification corrected simulation data set, the creation of the prediction module candidates, and the evaluation of the prediction accuracy are performed using each of the first simulation data and the second simulation data.

Preferably, the feature amount is a physical quantity acting on a tire, and the explanatory variable values are values that define the tire.

Preferably, the method further includes a step in which, in response to input of a target value for the feature amount, the computer uses the prediction module to calculate optimum values of the explanatory variables that reproduce the target value, and, in the step of calculating the optimum values, the optimum values of the explanatory variables are calculated based on the feature amount values that the prediction module predicts in response to the explanatory variable values input to it.
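One simple way to realize such a search is to evaluate the prediction module over candidate explanatory variable values and keep the combination whose predicted feature amount is closest to the target. The sketch below uses an exhaustive grid search; this search strategy, the function names, and the toy predictor are assumptions for illustration, since the source does not name an optimization method.

```python
from itertools import product


def find_optimum(predict, target, grids):
    """Return the grid point whose predicted feature amount is closest to `target`.

    `predict` takes a tuple of explanatory variable values; `grids` holds one
    list of candidate values per explanatory variable.
    """
    best_x, best_gap = None, float("inf")
    for x in product(*grids):
        gap = abs(predict(x) - target)
        if gap < best_gap:
            best_x, best_gap = x, gap
    return best_x, best_gap


def toy_predict(x):
    """Hypothetical stand-in for a trained prediction module."""
    a, b = x
    return 2.0 * a + 0.5 * b


grids = [[0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0]]
best_x, gap = find_optimum(toy_predict, target=5.0, grids=grids)
```

Exhaustive search grows quickly with the number of explanatory variables, so a practical system would likely swap in a gradient-based or evolutionary optimizer while keeping the same interface.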

Preferably, the method further includes a step of visualizing the relationship between the explanatory variable values and the feature amount values.

Another aspect of the present invention is a data processing apparatus configured by a computer that forms a prediction module that predicts and outputs a predetermined feature amount value from input values of a plurality of explanatory variables. The data processing apparatus includes:
a data correction unit that uses an original data set to produce a corrected simulation data set, where the original data set holds a plurality of data items, each holding as a set the value of each of a plurality of explanatory variables and a feature amount value to be associated with the explanatory variable values, the data items comprising experimental data, in which the feature amount value is an experimental value of a measurement object obtained using a testing machine, and simulation data, in which the feature amount value is a simulation calculated value obtained by running a simulation with a simulation model of the measurement object, and where the data correction unit corrects the feature amount value in the simulation data based on the correspondence between the feature amount value in the simulation data and the feature amount value in the experimental data, the corrected simulation data set being composed of the resulting corrected simulation data;
a data set generation unit that separates each of the corrected simulation data set and an experimental data set composed of the experimental data into a training data set and a verification data set, thereby generating a training corrected simulation data set, a training experimental data set, a verification corrected simulation data set, and a verification experimental data set;
a prediction module candidate creation unit that creates a plurality of prediction module candidates, each having machine-learned the relationship between the explanatory variables and the feature amount, using each of the training corrected simulation data set, the training experimental data set, and a training integrated data set obtained by integrating the training corrected simulation data set and the training experimental data set;
a prediction module candidate evaluation unit that evaluates the prediction accuracy of each of the machine-learned prediction module candidates using the verification corrected simulation data set, the verification experimental data set, and a verification integrated data set obtained by integrating the verification corrected simulation data set and the verification experimental data set; and
a prediction module determination unit that determines the prediction module from among the plurality of prediction module candidates based on the evaluation result of the prediction accuracy.

Yet another aspect of the present invention is a program that causes a computer to execute a data processing method for forming a prediction module that predicts and outputs a predetermined feature amount value from input values of a plurality of explanatory variables. The program includes:
a procedure that causes the computer to use an original data set to generate a corrected simulation data set, where the original data set holds a plurality of data items, each holding as a set the value of each of a plurality of explanatory variables and a feature amount value to be associated with the explanatory variable values, the data items comprising experimental data, in which the feature amount value is an experimental value of a measurement object obtained using a testing machine, and simulation data, in which the feature amount value is a simulation calculated value obtained by running a simulation with a simulation model of the measurement object, and where the computer is caused to correct the feature amount value in the simulation data based on the correspondence between the feature amount value in the simulation data and the feature amount value in the experimental data, the corrected simulation data set being composed of the resulting corrected simulation data;
a procedure that causes the computer to separate each of the corrected simulation data set and an experimental data set composed of the experimental data into a training data set and a verification data set, thereby generating a training corrected simulation data set, a training experimental data set, a verification corrected simulation data set, and a verification experimental data set;
a procedure that causes the computer to create a plurality of prediction module candidates, each having machine-learned the relationship between the explanatory variables and the feature amount, using each of the training corrected simulation data set, the training experimental data set, and a training integrated data set obtained by integrating the training corrected simulation data set and the training experimental data set;
a procedure that causes the computer to evaluate the prediction accuracy of each of the machine-learned prediction module candidates using the verification corrected simulation data set, the verification experimental data set, and a verification integrated data set obtained by integrating the verification corrected simulation data set and the verification experimental data set; and
a procedure that causes the computer to determine the prediction module from among the plurality of prediction module candidates based on the evaluation result of the prediction accuracy.

According to the data processing method, data processing apparatus, and program described above, when determining a prediction module with which a computer predicts and outputs a value related to a predetermined feature amount from input values of a plurality of explanatory variables, a highly accurate prediction module that has machine-learned the relationship between the explanatory variables and the feature amount can be created efficiently using an original data set including experimental data and simulation data.

A diagram schematically illustrating an example of the flow of the data processing method of one embodiment.
A diagram showing an example of the configuration of the data processing apparatus of one embodiment.
(a) to (c) are diagrams explaining an example of correcting the feature amount values in the simulation data in the data processing method of one embodiment.
A simplified, easy-to-follow diagram explaining an example of the training original data set used in the data processing method of one embodiment.
(a) to (c) are diagrams showing examples of data sets created from the original data set in the data processing method of one embodiment.
A diagram showing an example of a data set created from the original data set in the data processing method of one embodiment.
A diagram explaining an example of the creation of prediction module candidates and the use of the verification data sets in the data processing method of one embodiment.
A diagram explaining the correspondence between the feature amount values in the simulation data and the feature amount values in the experimental data used in the data processing method of one embodiment.

Hereinafter, the data processing method, the data processing apparatus, and the program of one embodiment will be described with reference to the attached figures.
FIG. 1 is a diagram schematically illustrating an example of the flow of the data processing method of one embodiment. FIG. 2 is a diagram showing an example of the configuration of the data processing apparatus of one embodiment.
The data processing method of one embodiment is executed by a computer and creates a prediction module that predicts and outputs a value related to a predetermined feature amount from input values of a plurality of explanatory variables.
The prediction module is determined from among a plurality of prediction module candidates, which are created using a plurality of training data sets derived from the original data set, based on the results of evaluating those candidates with a plurality of verification data sets separately derived from the original data set.

The data processing apparatus 10 shown in FIG. 2 is configured by a computer including a CPU 12 and a memory 14. A display 30 and an input operation device 32, which includes a mouse and a keyboard for entering instructions, are connected to the data processing apparatus 10.
The input operation device 32 is used by the operator to enter desired instructions into the data processing apparatus 10. For example, the operator enters information from the input operation device 32 to set the conditions for creating prediction module candidates.
The display 30 is used to display the information that has been set. For example, it displays the numerical values of the data, the explanatory variables, and the missing explanatory variables in the original data set, the training original data set, the verification original data set, the various training sub-data sets, and the verification sub-data sets used in the data processing method, as well as the condition-setting screen for creating prediction module candidates and the evaluation results of the prediction accuracy of the prediction module candidates.

The memory 14 stores a program. When the CPU 12 reads and executes the program, the simulation data correction unit 15, the sub-data set creation unit 16, the prediction module candidate creation unit 18, the prediction module candidate evaluation unit 20, the prediction module determination unit 22, and the prediction unit 24 function as software modules. The functions of these units are described below together with the flow of the data processing method of one embodiment shown in FIG. 1.

The computer holds in advance a prediction model that can become a prediction module through machine learning. This prediction model becomes a prediction module candidate by being machine-trained on a plurality of training sub-data sets created from the original data set. At least one of these prediction module candidates becomes the prediction module. The prediction models include models using a neural network, of which the well-known deep learning is representative, models using the well-known random forest method, which performs classification or regression using a plurality of decision trees, and models using LASSO regression. A polynomial, kriging, or a nonlinear function using an RBF (radial basis function) can also be used as the prediction model.

The original data set is a group of data that holds a plurality of sets (for example, tens of thousands of sets), each set combining the values of a plurality of explanatory variables with the feature amount value to be associated with those explanatory variable values. The explanatory variables include, for example, the design dimensions of a product, the structure and physical property values of the constituent materials used in the product, or the manufacturing conditions under which the product is made. The feature amounts include, for example, characteristic values of the product and its sales volume in the market. For example, when the original data set includes the design dimensions of a structure and the structure of its constituent materials as explanatory variables and a characteristic value of the structure as the feature amount, each data item consists of the design dimension and structure information, obtained as the design dimensions and structure are varied, together with the characteristic value. In this case, therefore, the original data set contains many data items, each combining the design dimension and structure information for one variation with the corresponding characteristic value.
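A data item of this kind can be pictured as a small record type. The sketch below is illustrative only: the class name `Record`, the field names, the `source` tag used to tell experimental data from simulation data, and the sample variable names and values are all assumptions, not taken from the source.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Record:
    """One data item: explanatory variable values plus one feature amount value."""
    explanatory: Dict[str, float]  # hypothetical variable names mapped to values
    feature: float                 # characteristic value associated with those values
    source: str                    # "experiment" or "simulation" (assumed tag)


def split_by_source(records: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Separate an original data set into experimental and simulation data items."""
    experimental = [r for r in records if r.source == "experiment"]
    simulated = [r for r in records if r.source == "simulation"]
    return experimental, simulated


# Illustrative values only; a real original data set may hold tens of thousands of items.
original_data_set = [
    Record({"width_mm": 205.0, "groove_depth_mm": 8.0}, 1.02, "experiment"),
    Record({"width_mm": 215.0, "groove_depth_mm": 7.5}, 1.10, "experiment"),
    Record({"width_mm": 205.0, "groove_depth_mm": 8.0}, 0.95, "simulation"),
]

experimental_items, simulation_items = split_by_source(original_data_set)
```

Keeping the source tag on each record makes it straightforward to build the experimental data set and the simulation data set separately before any correction is applied.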

The original data set is often a huge body of data accumulated in the past. It holds many data items, each combining the values of a plurality of explanatory variables with the feature amount value to be associated with those values. These data items include many experimental data items and many simulation data items. Experimental data are data in which the feature amount value is an experimental value of the measurement object. Simulation data are data in which the feature amount value is a simulation calculated value obtained by running a simulation with a simulation model of the measurement object. The simulation data shown in FIG. 1 include simulation data 1 and simulation data 2, whose simulation calculated values (feature amount values) were computed using different simulation models or different simulation methods.

The simulation data correction unit 15 uses the experimental data to correct the simulation calculation values (feature quantity values) of the simulation data in the original data set (ST10 in FIG. 1). This creates a corrected simulation data set made up of a plurality of corrected simulation data, each pairing the explanatory variables with the corrected feature quantity value. A data set made up of experimental data is called an experimental data set.

FIGS. 3(a) to 3(c) illustrate an example of correcting the feature quantity values in the simulation data. FIG. 3(a) shows an example of the feature quantity values of experimental data and simulation data whose explanatory variables match within an allowable range. FIG. 3(b) shows the result of plotting the feature quantity values of FIG. 3(a) on a graph whose horizontal axis is the feature quantity of the simulation data and whose vertical axis is the feature quantity of the experimental data.
As shown in FIG. 3(b), the feature quantity values of the simulation data and the experimental data have a one-to-one correspondence. Using this relationship, as shown in FIG. 3(c), each feature quantity value in the simulation data is converted into the equivalent experimental value, and the converted value is taken as the corrected simulation value.
When the simulation data include simulation data 1 and simulation data 2, a relationship such as that shown in FIG. 3(b) is obtained separately for simulation data 1 and for simulation data 2, and the feature quantity values in each are converted into experimental values to obtain the corrected simulation values.
When the feature quantity values of the experimental data and those of the simulation data are each sorted by magnitude and the resulting order differs between the two, the two feature quantity values whose ranks differ may be replaced, in each of the experimental data and the simulation data, by the average of those two values, which is then used as the new feature quantity value.
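As an illustration of the conversion described with reference to FIG. 3, the following is a minimal sketch and not the implementation of the embodiment. It assumes that the one-to-one relationship is represented by matched pairs of simulation and experimental values with distinct simulation values, that values between pairs are converted by piecewise-linear interpolation, and that values outside the matched range are clamped to the end points. The function name `build_correction` and the numbers are hypothetical.

```python
from bisect import bisect_right


def build_correction(sim_values, exp_values):
    """Return a function that converts a simulation value to the experimental scale.

    sim_values[i] and exp_values[i] are assumed to come from a simulation
    record and an experimental record whose explanatory variables match
    within the allowable range. Simulation values are assumed distinct.
    """
    pairs = sorted(zip(sim_values, exp_values))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]

    def correct(value):
        # Assumption: clamp outside the matched range instead of extrapolating.
        if value <= xs[0]:
            return ys[0]
        if value >= xs[-1]:
            return ys[-1]
        i = bisect_right(xs, value)  # xs[i - 1] <= value < xs[i]
        t = (value - xs[i - 1]) / (xs[i] - xs[i - 1])
        return ys[i - 1] + t * (ys[i] - ys[i - 1])

    return correct


# Hypothetical matched pairs (simulation value, experimental value).
sim = [10.0, 20.0, 30.0]
exp = [12.0, 21.0, 33.0]
correct = build_correction(sim, exp)
corrected = [correct(v) for v in [10.0, 15.0, 25.0, 30.0]]
```

When there are several kinds of simulation data, one such conversion function would be built per kind, in line with the per-simulation relationships described above.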

Next, the sub-data set creation unit 16 separates each of the corrected simulation data sets 1 and 2 (made up of corrected simulation data) and the experimental data set (made up of experimental data) into a training data set and a verification data set. This yields training corrected simulation data sets 1 and 2 and a training experimental data set as training sub-data sets, and verification corrected simulation data sets 1 and 2 and a verification experimental data set as verification sub-data sets. The unit further creates, as a training sub-data set, a training integrated data set that combines the training corrected simulation data sets 1 and 2 with the training experimental data set, and, as a verification sub-data set, a verification integrated data set that combines the verification corrected simulation data sets 1 and 2 with the verification experimental data set (ST12 and ST14 in FIG. 1).
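The separation and integration steps (ST12, ST14) can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how records are assigned to the training and verification sides, so a seeded random shuffle with a 20% verification fraction is assumed here, and the names `split` and `make_sub_datasets` and the placeholder records are hypothetical.

```python
import random


def split(records, verification_fraction=0.2, seed=0):
    """Shuffle a copy of the records and return (training, verification)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_verify = max(1, int(len(shuffled) * verification_fraction))
    return shuffled[n_verify:], shuffled[:n_verify]


def make_sub_datasets(datasets):
    """datasets maps a source name to its list of records."""
    training, verification = {}, {}
    for name, records in datasets.items():
        training[name], verification[name] = split(records)
    # Integrated sets: concatenate the per-source sets on each side.
    training["integrated"] = [r for name in datasets for r in training[name]]
    verification["integrated"] = [r for name in datasets for r in verification[name]]
    return training, verification


# Placeholder records standing in for corrected simulation and experimental data.
datasets = {
    "corrected_sim_1": [("s1", i) for i in range(10)],
    "corrected_sim_2": [("s2", i) for i in range(10)],
    "experimental": [("e", i) for i in range(10)],
}
training, verification = make_sub_datasets(datasets)
```

Splitting each source before integrating keeps every record on the same side (training or verification) in both its own sub-data set and the integrated one.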

FIG. 4 shows, in simplified form for ease of understanding, an example of the training original data set obtained when the original data set is separated into a training original data set and a verification original data set. FIGS. 5(a) to 5(c) and FIG. 6 show examples of the training sub-data sets created from the training original data set in the data processing method of one embodiment.
The training original data set shown in FIG. 4 includes explanatory variables X_1 to X_n (n is a natural number) and contains data 1 to data 9 as the records giving values for those variables. The data set in FIG. 4 has nine records, but in practice the number of records is several thousand to several tens of thousands.
The training original data set shown in FIG. 4 has one feature quantity, but it may have more than one.
In FIG. 4, "..." indicates that a numerical value is actually present. For the "simulation index" in the figure, "0" indicates non-simulation data, "1" indicates data obtained by simulation 1, and "2" indicates data obtained by simulation 2, which differs from simulation 1 in its simulation model or simulation method. The "testing machine index" in the figure indicates the type of testing machine for non-simulation data: "1" indicates experimental data obtained with testing machine 1, and "0" indicates that the data are not experimental data obtained with a testing machine. FIG. 4 also shows that the feature quantity values of data 1 to 9 are V1 to V9.

FIG. 5(a) shows the training experimental data set, made up of the records of the training original data set whose "simulation index" is "0". FIG. 5(b) shows the training corrected simulation data set 1, made up of the records whose "simulation index" is "1", and FIG. 5(c) shows the training corrected simulation data set 2, made up of the records whose "simulation index" is "2". Because the feature quantity values in the training corrected simulation data sets 1 and 2 are corrected values, the feature quantity values shown in FIGS. 5(b) and 5(c) are V4' to V9'.
FIG. 6 shows the training integrated data set, which is made up of corrected simulation data and experimental data.

In the original data set, the "simulation index" and the "testing machine index" are explanatory variables in addition to X_1 to X_n, so the number of explanatory variables is n + 2. In the training experimental data set, the training corrected simulation data sets 1 and 2, and the training integrated data set, the explanatory variables are X_1 to X_n, so their number is n.
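The way the sub-data sets of FIGS. 5 and 6 follow from the "simulation index" can be sketched as below. This is a hypothetical illustration: records are modeled as dictionaries, the key names (`simulation_index`, `testing_machine_index`, `feature`) and the numerical values are invented for the example, and the feature quantity values of the simulation records are assumed to have been corrected already.

```python
INDEX_COLUMNS = ("simulation_index", "testing_machine_index")
SUBSET_BY_INDEX = {0: "experimental", 1: "corrected_sim_1", 2: "corrected_sim_2"}


def make_training_sub_datasets(original):
    """Group records by simulation index and drop the two index variables."""
    subsets = {name: [] for name in SUBSET_BY_INDEX.values()}
    for record in original:
        # Keep only X_1..X_n and the feature quantity (n explanatory variables).
        reduced = {k: v for k, v in record.items() if k not in INDEX_COLUMNS}
        subsets[SUBSET_BY_INDEX[record["simulation_index"]]].append(reduced)
    subsets["integrated"] = (
        subsets["experimental"]
        + subsets["corrected_sim_1"]
        + subsets["corrected_sim_2"]
    )
    return subsets


# Invented records with n = 2 explanatory variables plus the two indices.
original = [
    {"X_1": 0.1, "X_2": 1.0, "simulation_index": 0, "testing_machine_index": 1, "feature": 5.0},
    {"X_1": 0.2, "X_2": 1.1, "simulation_index": 1, "testing_machine_index": 0, "feature": 5.4},
    {"X_1": 0.3, "X_2": 1.2, "simulation_index": 2, "testing_machine_index": 0, "feature": 5.9},
]
subsets = make_training_sub_datasets(original)
```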

The prediction module candidate creation unit 18 creates prediction module candidates 1 to 5 by having a prediction model machine-learn from the training original data set and from the plurality of training sub-data sets that were created (ST16 in FIG. 1). Deep learning is used for the machine learning of the prediction model. For example, prediction module candidates are created with a layer configuration based on input conditions, such as a layer configuration of 1 to 7 layers.

FIG. 7 illustrates an example of how the prediction module candidates are created and how the verification data sets described later are used.
Prediction module candidates 1 to 5 are created by having the prediction model machine-learn the relationship between the explanatory variables and the feature quantity using, respectively, the training original data set, the training corrected simulation data sets 1 and 2, the training experimental data set, and the training integrated data set described above. A transfer learning method can also be used in the machine learning of the prediction module.

Accordingly, in prediction module candidate 1, created from the training original data set, X_1 to X_n, the "simulation index", and the "testing machine index" are defined as explanatory variables, giving n + 2 explanatory variables. In prediction module candidates 2 to 5, created from the training corrected simulation data sets 1 and 2, the training experimental data set, and the training integrated data set, respectively, X_1 to X_n are defined as explanatory variables, giving n explanatory variables.

The prediction module candidate evaluation unit 20 evaluates the prediction accuracy of each of the machine-learned prediction module candidates 1 to 5 using the verification corrected simulation data sets 1 and 2, the verification experimental data set, and the verification integrated data set, which combines the verification corrected simulation data sets 1 and 2 with the verification experimental data set (ST18 in FIG. 1).

The verification corrected simulation data sets 1 and 2, the verification experimental data set, and the verification integrated data set prepared as verification sub-data sets have X_1 to X_n as explanatory variables, like the training corrected simulation data sets 1 and 2, the training experimental data set, and the training integrated data set. They can therefore be used as verification sub-data sets for each of prediction module candidates 2 to 5. For example, predicted feature quantity values can be calculated by applying prediction module candidate 2 to each of the verification corrected simulation data sets 1 and 2, the verification experimental data set, and the verification integrated data set. The predicted values calculated by prediction module candidate 2 can then be compared with the feature quantity values in those verification data sets. Likewise, for prediction module candidates 3 to 5, the calculated predicted values can be compared with the feature quantity values of the verification corrected simulation data sets 1 and 2, the verification experimental data set, and the verification integrated data set, which serve as the correct values.

In evaluating the prediction accuracy of prediction module candidates 1 to 5, how closely the feature quantity value predicted by each candidate approximates the correct value is evaluated. The evaluation method is not particularly limited. For example, the ratio of the predicted value to the correct value is used as the evaluation value. When a plurality of feature quantities are set, the average of these ratios over the feature quantities, or the ratio that is farthest from 1, is used as the evaluation value. Alternatively, because there are many pairs of actual feature quantity values and values predicted by a prediction module candidate, the correlation coefficient R or the coefficient of determination R^2 between the actual and predicted values can be used as the evaluation value.
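The evaluation values named above can be computed as in the following sketch. It is illustrative only: the function names and the numbers are hypothetical, and because the text does not say how R^2 is defined, the common definition 1 - SS_res / SS_tot is assumed here (it can differ from the square of R when the predictions are biased).

```python
from math import sqrt


def ratio_scores(actual, predicted):
    """Ratio of predicted value to correct value for each pair."""
    return [p / a for a, p in zip(actual, predicted)]


def worst_ratio(ratios):
    """The ratio that is farthest from 1."""
    return max(ratios, key=lambda r: abs(r - 1.0))


def correlation_r(actual, predicted):
    """Pearson correlation coefficient R between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_a * var_p)


def determination_r2(actual, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented correct values and predictions for one feature quantity.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 33.0, 38.0]
ratios = ratio_scores(actual, predicted)
mean_ratio = sum(ratios) / len(ratios)
r = correlation_r(actual, predicted)
r2 = determination_r2(actual, predicted)
```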

The prediction module determination unit 22 determines a prediction module with high prediction accuracy based on the evaluation results (evaluation values) obtained by the prediction module candidate evaluation unit 20 (ST20 in FIG. 1). The single candidate with the highest prediction accuracy may be selected from the plurality of prediction module candidates as the prediction module, or a plurality of candidates whose prediction accuracy exceeds a threshold may be determined as prediction modules. Prediction module candidate 1, which has the most explanatory variables, is not necessarily the candidate with the highest prediction accuracy. Some explanatory variables do not contribute to the feature quantity, and such variables can act as noise and lower the prediction accuracy.
The evaluation results of the prediction accuracy are preferably displayed on the screen of the display 30.
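The two determination rules (the single most accurate candidate, or every candidate above a threshold) can be sketched as follows. The evaluation values are placeholders invented for the example, not results from the embodiment, and the sketch assumes that a larger evaluation value means higher prediction accuracy (as with R or R^2).

```python
def select_best(scores):
    """Return the name of the candidate with the highest evaluation value."""
    return max(scores, key=scores.get)


def select_above(scores, threshold):
    """Return the names of all candidates whose evaluation value exceeds the threshold."""
    return [name for name, score in scores.items() if score > threshold]


# Placeholder evaluation values (for example R^2) for candidates 1 to 5.
scores = {
    "candidate_1": 0.91,
    "candidate_2": 0.88,
    "candidate_3": 0.93,
    "candidate_4": 0.95,
    "candidate_5": 0.94,
}
best = select_best(scores)
accepted = select_above(scores, 0.92)
```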

The prediction unit 24 is set up with the determined prediction module and predicts a value related to the feature quantity from input values of the explanatory variables. The predicted value is output to the display 30.

As described above, when the data include both simulation data and experimental data, correcting the feature quantity values of the simulation data allows the corrected simulation data and the experimental data to be used together as the same kind of training data. The prediction model can also be machine-trained on a separate training corrected simulation data set for each type of corrected simulation data, as with corrected simulation data sets 1 and 2. A plurality of prediction module candidates can therefore be created. In addition, the prediction accuracy of the plurality of candidates can be evaluated using each of the verification corrected simulation data sets, the verification experimental data set, and the verification integrated data set as verification sub-data sets, that is, by making efficient use of the verification original data set. Consequently, even when the original data set contains both experimental data and simulation data, a prediction module that has machine-learned the relationship between the explanatory variables and the feature quantity with high prediction accuracy can be created efficiently.

According to one embodiment, when the original data set is divided into a training data set and a verification data set, it is preferable to perform the division a plurality of times, each time taking the verification data set from a different part of the original data set and using the remainder as the training data set; to evaluate, for each division, the prediction accuracy of the prediction module candidates created with that training data set; and to determine the prediction module from the candidates based on the average of the evaluation results over the divisions. In this way, training data sets for machine learning can be drawn from a wide range of the original data set without bias, verification data sets can likewise be used over a wide range without bias, and the prediction accuracy can be evaluated accurately.
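The repeated division described above resembles what is commonly called k-fold cross-validation, and can be sketched as follows. This is an illustration under assumptions: the verification part is taken as a contiguous block that moves through the data set, and the `train` and `evaluate` callables are hypothetical stand-ins (a least-squares slope and a negative mean absolute error) rather than the deep-learning model of the embodiment.

```python
def repeated_splits(records, k):
    """Yield (training, verification) pairs, each taking a different block for verification."""
    n = len(records)
    for i in range(k):
        start, end = i * n // k, (i + 1) * n // k
        yield records[:start] + records[end:], records[start:end]


def averaged_accuracy(records, k, train, evaluate):
    """Average the evaluation value over the k divisions."""
    scores = [evaluate(train(tr), ve) for tr, ve in repeated_splits(records, k)]
    return sum(scores) / len(scores)


def train(training):
    # Stand-in model: least-squares slope of a line through the origin.
    return sum(x * y for x, y in training) / sum(x * x for x, _ in training)


def evaluate(slope, verification):
    # Stand-in evaluation value: negative mean absolute error (higher is better).
    return -sum(abs(slope * x - y) for x, y in verification) / len(verification)


# Invented records (explanatory variable, feature quantity).
records = [(x, 2.0 * x) for x in range(1, 11)]
average = averaged_accuracy(records, 5, train, evaluate)
```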

According to one embodiment, the simulation data preferably include simulation calculation values computed by running the simulation model with the explanatory variable values that realize the maximum and the minimum of the feature quantity among the plurality of experimental data. In correcting the feature quantity values, the feature quantity values of the training simulation data set are then preferably corrected using both the correspondence between the maximum and minimum values and the simulation calculation values corresponding to each of them, and the correspondence between simulation calculation values and experimental feature quantity values whose explanatory variable values match within an allowable range, for experimental data whose feature quantity value lies between the maximum and the minimum. The simulation is not particularly limited; one example is a simulation using a well-known finite element model.
FIG. 8 illustrates the correspondence between the feature quantity values in the simulation data, that is, the simulation calculation values, and the feature quantity values in the experimental data. If the simulation data contain feature quantity values (simulation calculation values) for the explanatory variable values that realize the maximum and minimum of the feature quantity in the experimental data, calculation values lying between those two simulation calculation values can be corrected with high accuracy by interpolation. If the simulation data lack such values, the simulation calculation values corresponding to the experimental maximum and minimum can readily be computed by running a simulation with the simulation model.

In addition, the feature quantity values of the training simulation data set can be corrected with high accuracy by interpolation using the correspondence between simulation calculation values and experimental feature quantity values whose explanatory variable values match within the allowable range, for experimental data whose feature quantity value lies between the maximum and the minimum. Here too, if no simulation data have explanatory variable values that match those of the experimental data within the allowable range, the simulation calculation value corresponding to the experimental feature quantity value can readily be computed by running a simulation with the simulation model.
In this way, as shown in FIG. 8, feature quantity values can be matched between the simulation data and the experimental data wherever the explanatory variables agree within the allowable range. The values can therefore be corrected with high accuracy by interpolation.

According to one embodiment, when the prediction accuracy is evaluated, the prediction module candidate machine-trained on the training integrated data set is preferably evaluated, as shown in FIG. 7, with each of the verification corrected simulation data sets 1 and 2, the verification experimental data set, and the verification integrated data set. The candidate trained on the training integrated data set is generally expected to have higher prediction accuracy than any other candidate, but this is not always the case. For this reason, its prediction accuracy is preferably evaluated using as many as possible of the various verification sub-data sets created from the verification original data set.

According to one embodiment, when the prediction accuracy is evaluated, it is preferable, for the prediction module candidate machine-trained on the training integrated data set, to
(1) compare its prediction accuracy on the verification experimental data set with the prediction accuracy, on the same verification experimental data set, of the candidate machine-trained on the training experimental data set, and
(2) compare its prediction accuracy on the verification corrected simulation data set with the prediction accuracy, on the same verification corrected simulation data set, of the candidate machine-trained on the training corrected simulation data set,
and to evaluate the candidate trained on the training integrated data set based on the comparison results. The prediction accuracy of the candidate trained on the training integrated data set is generally expected to be better than that of the candidate trained on the training experimental data set when both are evaluated on the verification experimental data set, and better than that of the candidate trained on the training corrected simulation data set when both are evaluated on the verification corrected simulation data set. However, this is not always the case. It is therefore particularly preferable to compare the candidate trained on the training integrated data set against the prediction accuracy that the candidate created from the experimental data set achieves on verification data taken from the experimental data set, and against the prediction accuracy that the candidate created from the corrected simulation data set achieves on verification data taken from the corrected simulation data.
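Comparisons (1) and (2) can be expressed as a small check, sketched below. The layout of the accuracy table (a dictionary keyed by which data set the candidate was trained on and which it was verified on) and the numbers are assumptions made for illustration, and a larger value is assumed to mean higher accuracy.

```python
def compare_integrated(accuracy):
    """accuracy[(trained_on, verified_on)] holds an evaluation value (higher is better)."""
    return {
        # Comparison (1): both candidates verified on the experimental data set.
        "beats_experimental": accuracy[("integrated", "experimental")]
        > accuracy[("experimental", "experimental")],
        # Comparison (2): both candidates verified on the corrected simulation data set.
        "beats_corrected_sim": accuracy[("integrated", "corrected_sim")]
        > accuracy[("corrected_sim", "corrected_sim")],
    }


# Placeholder evaluation values, invented for the example.
accuracy = {
    ("integrated", "experimental"): 0.94,
    ("experimental", "experimental"): 0.90,
    ("integrated", "corrected_sim"): 0.92,
    ("corrected_sim", "corrected_sim"): 0.95,
}
comparison = compare_integrated(accuracy)
```

In this invented case the integrated candidate would pass comparison (1) but not comparison (2), which is the kind of outcome the text warns can occur.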

The simulation data may be of a single type, with the same simulation model configuration and simulation method. Preferably, however, as shown in FIG. 1, they include simulation data 1 (first simulation data) and simulation data 2 (second simulation data), which differ in at least one of the simulation model configuration and the simulation method. In this case, the processing of ST10 to ST20 shown in FIG. 1 is preferably performed using each of simulation data 1 and simulation data 2. Because the prediction accuracy of a plurality of prediction module candidates arising from the different simulations can then be evaluated, a prediction module with high prediction accuracy can be determined.

According to one embodiment, the feature quantity is preferably a physical quantity acting on a tire, for example a characteristic value of the tire, and the explanatory variable values are preferably values that define the tire. This makes it possible to predict the physical quantity acting on the tire with high accuracy from the values that define the tire. The values that define the tire include, for example, the size of the rim on which the tire is mounted, the aspect ratio of the tire, the tire width, the bead filler cross-sectional area, the angle and stiffness of the first steel cord, the angle and stiffness of the second steel cord, the angle and stiffness of the first carcass cord, and the angle and stiffness of the second carcass cord.

According to one embodiment, the prediction module can also be used in an optimization process that, given a target value for the feature quantity, calculates optimum values of the explanatory variables that reproduce the target value. That is, the data processing method of one embodiment preferably includes an optimization process in which, in response to input of a target value for the feature quantity, the data processing device 10 uses the prediction module to calculate optimum values of the explanatory variables that reproduce the target value. In this case, the optimum values are preferably calculated based on the feature quantity values that the prediction module predicts for the explanatory variable values input to it. An evolutionary algorithm, for example, is preferably used to calculate the optimum values. Evolutionary algorithms include Genetic Algorithm, Differential Evolution, Particle Swarm Optimization, Ant Colony Optimization, and the like. Design of experiments or the Latin hypercube method is also preferably used.
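As one possible illustration of the optimization process, the sketch below uses a basic Differential Evolution scheme (rand/1/bin) to search for explanatory variable values whose predicted feature quantity matches a target. It is a minimal sketch under assumptions: the function `predict` is a hypothetical stand-in for the trained prediction module, and the bounds, population size, and control parameters are arbitrary choices, not values from the embodiment.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, generations=100,
                           f=0.8, cr=0.9, seed=0):
    """Minimize objective(x) within bounds using a basic rand/1/bin scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # at least one component is mutated
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == j_rand or rng.random() < cr:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    v = min(max(v, lo), hi)  # keep the trial inside the bounds
                else:
                    v = pop[i][d]
                trial.append(v)
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:  # greedy replacement
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def predict(x):
    # Hypothetical stand-in for the trained prediction module.
    return 2.0 * x[0] + 0.5 * x[1] ** 2


target = 10.0
bounds = [(0.0, 5.0), (0.0, 5.0)]
best_x, best_cost = differential_evolution(
    lambda x: (predict(x) - target) ** 2, bounds
)
```

Because the prediction module is only evaluated, never differentiated, this kind of search works regardless of how the module was trained; when several combinations of explanatory variables reproduce the target, the search returns just one of them.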

According to one embodiment, the relationship between the explanatory variable values and the feature quantity values is preferably visualized.
The relationship is shown on the display 30. It is represented, for example, by a self-organizing map. Alternatively, a scatter plot may be used instead of a self-organizing map to visualize the relationship between the explanatory variables and the feature quantity values.

Such a data processing method can be achieved by reading a program to be executed by a computer from the memory 14 and executing it. Accordingly, this program comprises:
(1) a procedure for causing the computer, using an original data set that holds a plurality of items of experimental data and simulation data, to modify the value of the feature amount in the simulation data based on the correspondence between the value of the feature amount in the simulation data and the value of the feature amount in the experimental data, and to generate a modified simulation data set composed of modified simulation data;
(2) a procedure for causing the computer to separate each of the modified simulation data set and the experimental data set composed of the experimental data into a learning data set and a verification data set, thereby generating a learning modified simulation data set, a learning experimental data set, a verification modified simulation data set, and a verification experimental data set;
(3) a procedure for causing the computer to create a plurality of prediction module candidates by machine learning the relationship between the explanatory variables and the characteristic quantity using each of the learning modified simulation data set, the learning experimental data set, and the learning integrated data set;
(4) a procedure for causing the computer to evaluate the prediction accuracy of each of the plurality of machine-learned prediction module candidates using the verification modified simulation data set, the verification experimental data set, and the verification integrated data set; and
(5) a procedure for causing the computer to determine the prediction module from the plurality of prediction module candidates based on the evaluation result of the prediction accuracy.
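
The five procedures can be sketched end to end on toy data as follows. This is only an illustration under stated assumptions: a one-variable least-squares line replaces the machine-learned prediction model, the correction in procedure (1) is assumed to be a linear map fitted between simulated and experimental values at the same inputs, the integrated data sets are assumed to be simple concatenations, and the selection rule in procedure (5) is assumed to be the best mean score. The names `fit_line`, `r2`, `split`, `truth`, and `simulate` are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx

def r2(model, data):
    """Coefficient of determination of a line model on (x, y) pairs."""
    a, b = model
    mean_y = sum(y for _, y in data) / len(data)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in data)
    ss_tot = sum((y - mean_y) ** 2 for _, y in data)
    return 1.0 - ss_res / ss_tot

def split(data, rng, ratio=0.8):
    """Shuffle a copy and split it into (learning, verification) parts."""
    items = list(data)
    rng.shuffle(items)
    k = int(len(items) * ratio)
    return items[:k], items[k:]

rng = random.Random(0)
truth = lambda x: 2.0 * x + 1.0            # toy "real" behaviour (assumption)
simulate = lambda x: 0.8 * truth(x) - 0.5  # toy biased simulation (assumption)

# Original data set: (explanatory value, feature amount value) pairs.
experimental = [(x, truth(x) + rng.gauss(0.0, 0.3))
                for x in (rng.uniform(0.0, 10.0) for _ in range(40))]
simulation = [(x, simulate(x))
              for x in (rng.uniform(0.0, 10.0) for _ in range(200))]

# (1) Modify the simulated values using the experiment/simulation correspondence,
#     here a line fitted on simulations rerun at the experimental inputs.
a, b = fit_line([simulate(x) for x, _ in experimental],
                [y for _, y in experimental])
modified_sim = [(x, a * y + b) for x, y in simulation]

# (2) Split each data set into learning and verification parts.
sim_learn, sim_verif = split(modified_sim, rng)
exp_learn, exp_verif = split(experimental, rng)

# (3) Train one candidate per learning data set (integrated = concatenation).
learning_sets = {"modified_simulation": sim_learn, "experimental": exp_learn,
                 "integrated": sim_learn + exp_learn}
candidates = {name: fit_line(*zip(*data)) for name, data in learning_sets.items()}

# (4) Evaluate every candidate on every verification data set.
verification_sets = {"modified_simulation": sim_verif, "experimental": exp_verif,
                     "integrated": sim_verif + exp_verif}
scores = {name: {v: r2(model, data) for v, data in verification_sets.items()}
          for name, model in candidates.items()}

# (5) Pick the candidate with the best mean score (assumed selection rule).
best = max(scores, key=lambda name: sum(scores[name].values()) / len(scores[name]))
```

Keeping the scores as a table of candidate against verification data set makes it possible to compare, for instance, the integrated candidate with the experimental candidate on the experimental verification data alone before committing to a selection rule.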

(Example, comparative example)
To confirm the effect of the data processing method described above, 10,881 items of experimental data and 3,739 items of simulation data were prepared. The explanatory variables included, as information, the tire dimensions, the dimensions of the constituent materials of the tire, physical property values, and the form of the tire structure, and rolling resistance was used as the feature amount.

In the example, an original data set including experimental data and simulation data was prepared and processed by the data processing method described above to determine a prediction module. There were three prediction module candidates: one created from the learning original data set, one created from the learning integrated data set, and one created from the learning experimental data.

In the comparative example, by contrast, the prediction module was determined using an original data set that contains experimental data but no simulation data. In this case only one prediction module candidate is created, from the learning experimental data set, so this candidate automatically becomes the prediction module of the comparative example.

The evaluation results of the prediction module candidates in the example were as follows.
The prediction module candidates were created by training a prediction model by deep learning. The deep learning model had three layers.
For the prediction module candidate created from the learning original data set, the coefficient of determination R² between the values predicted using the verification original data set and the values of the feature amount in the verification original data set was low, at 0.71.
For the prediction module candidate created from the learning experimental data set, the coefficient of determination R² between the values of the feature amount predicted using the verification experimental data set, the verification modified simulation data set, and the verification integrated data set and the corresponding values of the feature amount in those data sets was 0.77.
For the prediction module candidate created from the learning integrated data set, the coefficient of determination R² between the values of the feature amount predicted using the verification experimental data set, the verification modified simulation data set, and the verification integrated data set and the corresponding values of the feature amount in those data sets was 0.88.
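
For reference, the coefficient of determination is commonly defined as shown below. The source does not state which definition was used, so this standard form is an assumption. The symbols are introduced here for illustration only: y_i is the value of the feature amount held in a verification data set, ŷ_i is the value predicted by the prediction module candidate for the same explanatory variables, ȳ is the mean of the y_i, and n is the number of verification items.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
               {\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2},
\qquad
\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i
```

Under this definition a value of 1 corresponds to perfect prediction and a value of 0 to a predictor no better than the mean ȳ.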

In the comparative example, the only prediction module candidate created is the one created from the learning experimental data set described above, so its coefficient of determination R² is 0.77.

Therefore, the coefficient of determination R² of the prediction module determined in the example is 0.88, whereas that of the prediction module determined in the comparative example is 0.77.
From this, it can be said that the prediction module of the example has higher prediction accuracy.

The data processing method, data processing device, and program of the present invention have been described in detail above. The present invention is not limited to the above embodiments, however, and various improvements and modifications may of course be made without departing from the gist of the present invention.

10 Data processing device
12 CPU
14 Memory
15 Simulation data correction unit
16 Sub data set creation unit
18 Prediction module candidate creation unit
20 Prediction module candidate creation unit
22 Prediction module determination unit
24 Prediction unit

Claims

A data processing method for forming a prediction module with which a computer predicts and outputs a value related to a predetermined feature amount in response to input of values of a plurality of explanatory variables, the method comprising:
a step in which the computer, using an original data set that holds a plurality of items of experimental data and simulation data, each being data that holds a set of a value of each of the plurality of explanatory variables and a value of the feature amount associated with the values of the explanatory variables, the value of the feature amount in the experimental data being an experimental value of a measurement object and the value of the feature amount in the simulation data being a simulation calculated value calculated by performing a simulation using a simulation model of the measurement object, creates a modified simulation data set composed of modified simulation data in which the value of the feature amount in the simulation data is modified based on the correspondence between the value of the feature amount in the simulation data and the value of the feature amount in the experimental data;
a step in which the computer separates each of the modified simulation data set and the experimental data set composed of the experimental data into a learning data set and a verification data set, thereby generating a learning modified simulation data set, a learning experimental data set, a verification modified simulation data set, and a verification experimental data set;
a step in which the computer creates a plurality of prediction module candidates by machine learning the relationship between the explanatory variables and the characteristic quantity using each of the learning modified simulation data set, the learning experimental data set, and a learning integrated data set in which the learning modified simulation data set and the learning experimental data set are integrated;
a step in which the computer evaluates the prediction accuracy of each of the plurality of machine-learned prediction module candidates using the verification modified simulation data set, the verification experimental data set, and a verification integrated data set in which the verification modified simulation data set and the verification experimental data set are integrated; and
a step in which the computer determines the prediction module from the plurality of prediction module candidates based on the evaluation result of the prediction accuracy.

The data processing method according to claim 1, wherein the simulation data includes simulation calculated values calculated by performing the simulation using the simulation model with the values of the explanatory variables that realize the maximum value and the minimum value of the feature amount in the plurality of experimental data, and
in the modification of the value of the feature amount, the value of the feature amount of the learning simulation data set is modified using the correspondence between the maximum value and the minimum value and the simulation calculated values corresponding to each of them, and the correspondence between a value of the feature amount of the experimental data lying between the maximum value and the minimum value and the simulation calculated value whose explanatory variable values match those of that experimental data within an allowable range.

The data processing method according to claim 1 or 2, wherein, in evaluating the prediction accuracy, the prediction accuracy of the prediction module candidate machine-learned using the learning integrated data set is evaluated for each of the verification modified simulation data set, the verification experimental data set, and the verification integrated data set.

The data processing method according to any one of claims 1 to 3, wherein, in evaluating the prediction accuracy, for the prediction module candidate machine-learned using the learning integrated data set,
(1) the prediction accuracy obtained with the verification experimental data set is compared with the prediction accuracy obtained with the verification experimental data set by the prediction module candidate machine-learned using the learning experimental data set,
(2) the prediction accuracy obtained with the verification modified simulation data set is compared with the prediction accuracy obtained with the verification modified simulation data set by the prediction module candidate machine-learned using the learning modified simulation data set, and
the prediction module candidate machine-learned using the learning integrated data set is evaluated based on the comparison results.

The data processing method according to any one of claims 1 to 4, wherein the simulation data includes first simulation data and second simulation data that differ in at least one of the configuration of the simulation model and the simulation method, and
the creation of the modified simulation data set, the generation of the learning modified simulation data set and the verification modified simulation data set, the creation of the prediction module candidates, and the evaluation of the prediction accuracy are performed using each of the first simulation data and the second simulation data.

The data processing method according to any one of claims 1 to 5, wherein the feature amount is a physical quantity acting on a tire, and
the values of the explanatory variables are values that define the tire.

The data processing method according to any one of claims 1 to 6, further comprising a step in which the computer, in response to input of a target value for the feature amount, uses the prediction module to calculate an optimum value for the explanatory variables that reproduces the target value,
wherein, in the step of calculating the optimum value, the optimum value for the explanatory variables is calculated based on the value of the feature amount predicted by the prediction module according to the values of the explanatory variables input to the prediction module.

The data processing method according to any one of claims 1 to 7, further comprising a step of visualizing the relationship between the value of the explanatory variable and the value of the feature amount.

A data processing device constituted by a computer that forms a prediction module that predicts and outputs a value of a predetermined feature amount in response to input of values of a plurality of explanatory variables, the device comprising:
a data correction unit that, using an original data set that holds a plurality of items of experimental data and simulation data, each being data that holds a set of a value of each of the plurality of explanatory variables and a value of the feature amount associated with the values of the explanatory variables, the value of the feature amount in the experimental data being an experimental value of a measurement object obtained using a testing machine and the value of the feature amount in the simulation data being a simulation calculated value calculated by performing a simulation using a simulation model of the measurement object, modifies the value of the feature amount in the simulation data based on the correspondence between the value of the feature amount in the simulation data and the value of the feature amount in the experimental data, and thereby produces a modified simulation data set composed of modified simulation data;
a data set generation unit that separates each of the modified simulation data set and the experimental data set composed of the experimental data into a learning data set and a verification data set, thereby generating a learning modified simulation data set, a learning experimental data set, a verification modified simulation data set, and a verification experimental data set;
a prediction module candidate creation unit that creates a plurality of prediction module candidates by causing the computer to machine learn the relationship between the explanatory variables and the characteristic quantity using each of the learning modified simulation data set, the learning experimental data set, and a learning integrated data set in which the learning modified simulation data set and the learning experimental data set are integrated;
a prediction module candidate evaluation unit that evaluates the prediction accuracy of each of the plurality of machine-learned prediction module candidates using the verification modified simulation data set, the verification experimental data set, and a verification integrated data set in which the verification modified simulation data set and the verification experimental data set are integrated; and
a prediction module determination unit that determines the prediction module from the plurality of prediction module candidates based on the evaluation result of the prediction accuracy.

A program that causes a computer to execute a data processing method for forming a prediction module that predicts and outputs a value of a predetermined feature amount in response to input of values of a plurality of explanatory variables, the program comprising:
a procedure for causing the computer, using an original data set that holds a plurality of items of experimental data and simulation data, each being data that holds a set of a value of each of the plurality of explanatory variables and a value of the feature amount associated with the values of the explanatory variables, the value of the feature amount in the experimental data being an experimental value of a measurement object obtained using a testing machine and the value of the feature amount in the simulation data being a simulation calculated value calculated by performing a simulation using a simulation model of the measurement object, to modify the value of the feature amount in the simulation data based on the correspondence between the value of the feature amount in the simulation data and the value of the feature amount in the experimental data, and to generate a modified simulation data set composed of modified simulation data;
a procedure for causing the computer to separate each of the modified simulation data set and the experimental data set composed of the experimental data into a learning data set and a verification data set, thereby generating a learning modified simulation data set, a learning experimental data set, a verification modified simulation data set, and a verification experimental data set;
a procedure for causing the computer to create a plurality of prediction module candidates by machine learning the relationship between the explanatory variables and the characteristic quantity using each of the learning modified simulation data set, the learning experimental data set, and a learning integrated data set in which the learning modified simulation data set and the learning experimental data set are integrated;
a procedure for causing the computer to evaluate the prediction accuracy of each of the plurality of machine-learned prediction module candidates using the verification modified simulation data set, the verification experimental data set, and a verification integrated data set in which the verification modified simulation data set and the verification experimental data set are integrated; and
a procedure for causing the computer to determine the prediction module from the plurality of prediction module candidates based on the evaluation result of the prediction accuracy.