JP7292235B2

JP7292235B2 - Analysis support device and analysis support method

Info

Publication number: JP7292235B2
Application number: JP2020052908A
Authority: JP
Inventors: 文也工藤
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2020-03-24
Filing date: 2020-03-24
Publication date: 2023-06-16
Anticipated expiration: 2040-03-24
Also published as: JP2021152751A

Description

The present invention relates to an analysis support device and an analysis support method.

In recent years, business data can be collected in many industrial fields, and there is a growing need to assist and automate data analysis, such as factor analysis and the generation of models that predict KPIs (Key Performance Indicators) of corporate activities, sales being a typical example. In addition, tools have been developed that learn from past analysis cases and support the preprocessing, feature generation, and modeling required for data analysis.

As background art in this technical field, there is International Publication No. 2019/073900 (Patent Document 1). This publication states: "A disease is determined simply and with high accuracy based on body sounds caused by biological activity. A machine learning device (3) that learns a determination algorithm D2 for determining a disease based on body sounds caused by biological activity, comprising: a sound information acquisition unit (33) that acquires sound information of the body sounds of a plurality of subjects; a diagnostic information acquisition unit (34) that acquires diagnostic information regarding diseases of the plurality of subjects; and a learning unit (35) that learns the determination algorithm D2 based on the sound information and the diagnostic information of each subject." (see the abstract).

Patent Document 1: International Publication No. 2019/073900 (WO2019/073900)

The conventional technology extracts predefined attribute information from the data of analyzed cases and the data of the analysis target case, obtains the similarity between each analyzed case and the analysis target case, and extracts analyzed cases similar to the analysis target case. However, because the attribute information used to calculate the similarity is chosen by a designer, the attribute information needed to accurately extract analyzed cases similar to the analysis target case is not necessarily selected. Furthermore, the attribute information selected by the designer is likely to express only part of the properties of the case data. In other words, with the conventional technology, designing the similarity itself was difficult.

Further, the technique described in Patent Document 1 learns a determination algorithm using a fixed type of modeling and fixed feature amounts, but has no mechanism for learning efficiently by using past modeling results. Accordingly, an object of one aspect of the present invention is to identify, accurately and efficiently, an analyzed case similar to an analysis target case so that the analysis target case can be analyzed with high accuracy.

To solve the above problem, one aspect of the present invention adopts the following configuration. An analysis support device has a processor and a memory. The memory holds analysis target case data indicating the explanatory variables and the objective variable of an analysis target case, and analysis evaluation data indicating combinations of a model that analyzed an analyzed case and the parameters applied to that model, together with the evaluation value of the model when the analyzed case was analyzed by the model with those parameters applied. The processor applies a predetermined subset of the model-and-parameter combinations included in the analysis evaluation data to the explanatory variables of the analysis target case, calculates the evaluation value of each such model when it predicts the objective variable of the analysis target case, calculates a similarity by comparing the calculated evaluation values with each of the evaluation values indicated by the analysis evaluation data, and, based on the calculated similarity, identifies a similar case, which is an analyzed case similar to the analysis target case.

According to one aspect of the present invention, an analyzed case similar to an analysis target case can be identified accurately and efficiently, so that the analysis target case can be analyzed with high accuracy.

Problems, configurations, and effects other than those described above will become clear from the following description of the embodiments.

FIG. 1 is a block diagram showing a configuration example of the analysis support device in Embodiment 1.
FIG. 2 is an example of the input table in Embodiment 1.
FIG. 3 is an example of the analysis information in Embodiment 1.
FIG. 4 is an example of the analysis database in Embodiment 1.
FIG. 5 is an example of the parameter database in Embodiment 1.
FIG. 6 is a flowchart showing an example of the automatic analysis execution processing in Embodiment 1.
FIG. 7 is a flowchart showing an example of the rule-based problem classification processing in Embodiment 1.
FIG. 8 is a flowchart showing an example of the common parameter search processing in Embodiment 1.
FIG. 9 is a flowchart showing an example of the similar case extraction processing in Embodiment 1.
FIG. 10 is an explanatory diagram showing a specific example of the similar case extraction processing in Embodiment 1.
FIG. 11 is a flowchart showing an example of the parameter recommendation processing in Embodiment 1.
FIG. 12 is a flowchart showing an example of the surrounding search processing in Embodiment 1.
FIG. 13 is a flowchart showing an example of the relearning processing in Embodiment 1.
FIG. 14 is a flowchart showing an example of the analysis database creation processing in Embodiment 1.
FIG. 15 is an example of a display screen shown on the output device when the automatic analysis execution processing is performed in Embodiment 1.

Embodiments of the present invention are described in detail below with reference to the drawings. In this embodiment, identical components are in principle given the same reference numerals, and repeated descriptions are omitted. Note that this embodiment is merely one example of realizing the present invention and does not limit its technical scope.

FIG. 1 is a block diagram showing a configuration example of the analysis support device. The analysis support device 100 is configured, for example, by a computer having a CPU (Central Processing Unit) 101, a memory 102, an auxiliary storage device 103, an input device 104, an output device 105, and a communication device 106, which are connected to one another by an internal communication line such as a bus 107.

The CPU 101 includes a processor and executes programs stored in the memory 102. The memory 102 includes a ROM (Read Only Memory), which is a non-volatile storage element, and a RAM (Random Access Memory), which is a volatile storage element. The ROM stores immutable programs (for example, a BIOS (Basic Input/Output System)). The RAM is a high-speed, volatile storage element such as a DRAM (Dynamic Random Access Memory), and temporarily stores the programs executed by the CPU 101 and the data used when those programs run.

The auxiliary storage device 103 is a large-capacity, non-volatile storage device such as a magnetic storage device (HDD (Hard Disk Drive)) or a flash memory (SSD (Solid State Drive)), and stores the programs executed by the CPU 101 and the data used when those programs run. That is, a program is read from the auxiliary storage device 103, loaded into the memory 102, and executed by the CPU 101.

The input device 104 is a device that receives input from an operator, such as a keyboard or a mouse. The output device 105 is a device that outputs the execution results of a program in a form the operator can view, such as a display or a printer. The communication device 106 is a network interface device that controls communication with other devices according to a predetermined protocol.

The programs executed by the CPU 101 are provided to the analysis support device 100 via removable media (CD-ROM, flash memory, etc.) or a network, and are stored in the non-volatile auxiliary storage device 103, which is a non-transitory storage medium. For this reason, the analysis support device 100 preferably has an interface for reading data from removable media.

The analysis support device 100 is a computer system configured on one physical computer, or on a plurality of logically or physically configured computers. It may run in separate threads on the same computer, or on a virtual computer built on a plurality of physical computer resources.

The CPU 101 includes, for example, an automatic analysis execution unit 111, an analysis database creation unit 112, a rule-based problem classification unit 113, a common parameter search unit 114, a similar case extraction unit 115, a parameter recommendation unit 116, a surrounding search unit 117, and a relearning unit 118.

The automatic analysis execution unit 111 controls the automatic analysis processing of the analysis target case indicated by the input table 121, which is described later. The analysis database creation unit 112 generates the analysis database 123, which is described later. The rule-based problem classification unit 113 identifies the problem to be solved by a model in the analysis target case indicated by the input table 121.

The common parameter search unit 114 applies models, with some of the parameters indicated by the analysis database 123 applied to them, to the explanatory variables of the input table 121. The similar case extraction unit 115 extracts a similar case, that is, a previously analyzed case that is similar to the analysis target case.

The parameter recommendation unit 116 recommends, from among the parameters already applied in the similar case, the parameters that give the best evaluation value when applied to the analysis target case. The surrounding search unit 117 searches parameters in the neighborhood of the recommended parameters and outputs, from among the searched parameters, the best parameters, that is, those giving the best evaluation value when applied to the analysis target case. The relearning unit 118 uses the best parameters to generate the model 124, described later, for analyzing the analysis target case.

For example, the CPU 101 functions as the automatic analysis execution unit 111 by operating according to an automatic analysis execution program loaded into the memory 102, and functions as the analysis database creation unit 112 by operating according to an analysis database creation program loaded into the memory 102. The same applies to the other functional units included in the CPU 101.

Note that some or all of the functions of the functional units included in the CPU 101 may be realized by hardware such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field-Programmable Gate Array).

The auxiliary storage device 103 holds, for example, an input table 121, analysis information 122, an analysis database 123, a model 124, and a parameter database 125. Some or all of the information stored in the auxiliary storage device 103 may instead be stored in the memory 102, or in a database or the like connected to the analysis support device 100.

The input table 121 holds the data of the analysis target case. The analysis information 122 holds additional information for analyzing the input table 121. The analysis database 123 holds information indicating the models and parameters used for analysis in previously analyzed cases, together with their evaluation values. As described later, the analysis database 123 does not need to contain information that directly describes the analyzed cases themselves (attribute information of the analyzed cases).

The model 124 includes models for analyzing the analysis target case indicated by the input table 121 and models that analyzed previously analyzed cases. The parameter database 125 holds information indicating the parameters applied to the model 124.

In this embodiment, the information used by the analysis support device 100 does not depend on any particular data structure and may be represented in any form. Although information is represented in table form in this embodiment, it can be stored in a data structure suitably selected from, for example, a list, a database, or a queue.

FIG. 2 is an example of the input table 121. The input table 121 is table data in which each column stores the values of one variable and each row stores the values of the respective variables for one record.

In the example of FIG. 2, the input table 121 includes a PassengerId column 1211, a Survived column 1212, a Sex column 1213, an Age column 1214, a SibSp column 1215, a Fare column 1216, and an Embarked column 1217. The number of variables (columns) in the input table 121 is arbitrary, and each variable can take values of various types, such as character strings, flags, and continuous numbers (numerical values).

In the example of FIG. 2, the PassengerId column 1211 is the ID column (that is, a variable for identifying a record), the Survived column 1212 is the objective variable column, and the remaining columns are explanatory variable columns. The purpose of the analysis in this embodiment includes generating a model, that is, a prediction formula that predicts the value of the objective variable column more accurately from the values of the explanatory variable columns.

In the input table 121 of FIG. 2, the Survived column 1212 is the objective variable column and its value is either 1 or 0, so predicting the value of the Survived column 1212 is a binary classification problem. Because the ID column is generally a serial number that identifies a data sample, it is often not used directly in the analysis.
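The column roles described for FIG. 2 can be sketched as follows. This is a minimal illustration, not the device's implementation: the row values are made up, and the helper name `split_columns` is hypothetical.

```python
# Hypothetical rows using the column names listed for FIG. 2 (values are invented).
input_table = [
    {"PassengerId": 1, "Survived": 0, "Sex": "male", "Age": 22.0,
     "SibSp": 1, "Fare": 7.25, "Embarked": "S"},
    {"PassengerId": 2, "Survived": 1, "Sex": "female", "Age": 38.0,
     "SibSp": 1, "Fare": 71.28, "Embarked": "C"},
    {"PassengerId": 3, "Survived": 1, "Sex": "female", "Age": 26.0,
     "SibSp": 0, "Fare": 7.92, "Embarked": "S"},
]


def split_columns(rows, id_column, target_column):
    """Separate explanatory variables from the objective variable.

    The ID column is dropped because it only identifies a record.
    """
    features = [
        {name: value for name, value in row.items()
         if name not in (id_column, target_column)}
        for row in rows
    ]
    targets = [row[target_column] for row in rows]
    return features, targets


features, targets = split_columns(input_table, "PassengerId", "Survived")
print(sorted(features[0]))  # explanatory variable names
print(targets)              # objective variable values (0 or 1)
```

Because the objective variable here takes only the values 0 and 1, a model trained on `features` to predict `targets` would be solving a binary classification problem.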

FIG. 3 is an example of the analysis information 122. The analysis information 122 includes, for example, an ID column 1221, an objective variable name column 1222, a problem column 1223, and a type column 1224. The ID column 1221 holds an ID that identifies a record of the analysis information 122. The objective variable name column 1222 indicates the objective variable column of the input table 121. The problem column 1223 indicates which class of problem the input table 121 falls into, such as binary classification, multi-class classification, or regression. The type column 1224 indicates the type of the objective variable column.

FIG. 4 is an example of the analysis database 123. The analysis database 123 is defined for each problem classification, and FIG. 4 shows an example of the analysis database 123 for a problem classified as both a multi-class problem and a rare phenomenon problem. The analysis database 123 includes, for example, an Analysis_ID column 1231, a Model_Name column 1232, a Common_Search column 1233, a Recipe_ID column 1234, a Best_Flag column 1235, and an Accuracy column 1236, and holds information on past search results.

The Analysis_ID column 1231 holds an ID that identifies an analysis case (that is, the ID assigned when one input table 121 is analyzed). Because one or more rounds of search and evaluation are performed for a single analysis case while varying the parameters, multiple results can be stored for one Analysis_ID.

The Model_Name column 1232 holds the name of the model 124 used in the analysis. Linear regression, logistic regression, XGBoost, RandomForest, SVM (Support Vector Machine), SVR (Support Vector Regression), NeuralNetwork, and GaussianProcess are all examples of the model 124 used for analysis.

The Common_Search column 1233 holds a flag for identifying the parameters that are used to extract analyzed cases similar to the analysis target case indicated by the input table 121.

The analysis support device 100 of this embodiment extracts an analyzed case similar to the analysis target case and recommends parameters that enable highly accurate analysis when used to analyze the analysis target case. To find such similar cases, the analysis support device 100 searches the same parameters in every case, which makes the cases comparable with one another. For this reason, the analysis database 123 has the Common_Search column 1233 as a flag for identifying the samples in which the same parameters were searched across analysis cases.

The Recipe_ID column 1234 holds a Recipe_ID, which is an ID identifying each parameter set that has been searched. When a parameter set included in the analysis database 123 is recommended, it is specified by this Recipe_ID.

The Best_Flag column 1235 holds a Best_Flag, which is a flag for identifying the parameter set with which the model showed the best performance within the same analysis case, that is, among records having the same Analysis_ID. In the example of FIG. 4, model performance is evaluated by prediction accuracy (the value of the Accuracy column 1236), but model performance can also be evaluated from various other viewpoints, such as other evaluation metrics and model stability.

The Accuracy column 1236 indicates the prediction accuracy (Accuracy), which is one example of the evaluation value of the model when the parameter set indicated by the Recipe_ID column 1234 is applied to it. Although the analysis database 123 is called a database for convenience of explanation, it does not have to be a database; it may be, for example, in text format or another format.

FIG. 5 is an example of the parameter database 125. The parameter database 125 includes, for example, a Recipe_ID column 1251 that holds Recipe_IDs. The parameter database 125 also includes, for example, a param_max_depth column 1252, a param_leafe_num column 1253, a param_n_estimator column 1254, and a param_learing_rate column 1255, each of which holds parameter values. In the example of FIG. 5, the parameter database 125 holds four parameters, but the types and number of parameters it holds depend on the model 124 that the analysis support device 100 has.
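The relationship between the two tables can be sketched as a lookup keyed by Recipe_ID. The sketch below uses the column names given for FIG. 4 and FIG. 5 (including their original spellings), but the record contents, the `AnalysisRecord` class, and the `best_parameters` helper are illustrative assumptions rather than anything defined in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalysisRecord:
    """One row of the analysis database (columns as named for FIG. 4)."""
    analysis_id: str
    model_name: str
    common_search: bool  # True if this recipe was searched in every case
    recipe_id: str
    best_flag: bool      # True for the best recipe within one Analysis_ID
    accuracy: float


# Hypothetical contents; the values are invented for illustration.
analysis_db = [
    AnalysisRecord("A001", "XGBoost", True, "R01", False, 0.71),
    AnalysisRecord("A001", "XGBoost", True, "R02", False, 0.78),
    AnalysisRecord("A001", "XGBoost", False, "R07", True, 0.84),
    AnalysisRecord("A002", "XGBoost", True, "R01", True, 0.66),
    AnalysisRecord("A002", "XGBoost", True, "R02", False, 0.60),
]

# Parameter database keyed by Recipe_ID (columns as named for FIG. 5).
parameter_db = {
    "R01": {"param_max_depth": 3, "param_leafe_num": 15,
            "param_n_estimator": 100, "param_learing_rate": 0.1},
    "R02": {"param_max_depth": 5, "param_leafe_num": 31,
            "param_n_estimator": 200, "param_learing_rate": 0.05},
    "R07": {"param_max_depth": 6, "param_leafe_num": 63,
            "param_n_estimator": 300, "param_learing_rate": 0.03},
}


def best_parameters(analysis_id):
    """Return the parameter set flagged as best for one analysis case."""
    for record in analysis_db:
        if record.analysis_id == analysis_id and record.best_flag:
            return parameter_db[record.recipe_id]
    return None


print(best_parameters("A001"))
```

Keeping the parameter values in a separate table means the analysis database only needs to store a Recipe_ID per evaluation, which matches the way a recommended parameter set is specified by its Recipe_ID.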

Note that the illustrated columns of each piece of data held by the auxiliary storage device 103 are merely examples; each table does not necessarily have to include all the illustrated columns and may include additional columns.

FIG. 6 is a flowchart showing an example of the automatic analysis execution processing. It is assumed that the values of the input table 121 and the analysis information 122 have been entered before the processing of FIG. 6. However, the problem column 1223 of the analysis information 122 need not contain a value (it may contain a null value).

In the automatic analysis execution processing, the automatic analysis execution unit 111 searches the analysis database 123 for past analyzed cases similar to the analysis target case, based on the input table 121, the analysis information 122, and the parameter database 125. The automatic analysis execution unit 111 then extracts the parameters that give the best evaluation value, applies the extracted parameters, and outputs a model for analyzing the analysis target case.

Since the details of each step in FIG. 6 are described later, only a brief description is given here. First, the rule-based problem classification unit 113 determines which problem class the problem to be solved by the model in the analysis target case indicated by the input table 121 belongs to (S601). The common parameter search unit 114 calculates the evaluation value of the model obtained when a model using parameters that were applied to past analyzed cases is applied to the explanatory variables of the input table 121 (S602).

The similar case extraction unit 115 compares the evaluation values obtained for the analysis target case in the input table 121 with those parameters applied against the evaluation values of past analyzed cases, and extracts the similar case, that is, the analyzed case most similar to the analysis target case (S603).
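The comparison in S603 can be sketched as a nearest-neighbor lookup over vectors of evaluation values, one entry per common parameter set. This passage does not specify the similarity measure, so the Euclidean distance used below is an assumption, as are the function name `most_similar_case` and the sample scores.

```python
import math


def most_similar_case(target_scores, past_scores):
    """Find the analyzed case whose evaluation values are closest.

    target_scores: {recipe_id: evaluation value} for the analysis target case.
    past_scores:   {analysis_id: {recipe_id: evaluation value}} for past cases.
    Assumption: similarity is taken as a small Euclidean distance over the
    recipes that both cases share.
    """
    best_id, best_distance = None, math.inf
    for analysis_id, scores in past_scores.items():
        shared = sorted(set(target_scores) & set(scores))
        if not shared:
            continue  # no common parameter set, so the cases are not comparable
        distance = math.sqrt(
            sum((target_scores[r] - scores[r]) ** 2 for r in shared)
        )
        if distance < best_distance:
            best_id, best_distance = analysis_id, distance
    return best_id, best_distance


# Hypothetical evaluation values for two common recipes.
target = {"R01": 0.70, "R02": 0.77}
past = {
    "A001": {"R01": 0.71, "R02": 0.78},
    "A002": {"R01": 0.66, "R02": 0.60},
}
similar_id, distance = most_similar_case(target, past)
print(similar_id)
```

The point of the design is that two cases are compared through how the same models and parameters perform on them, so no hand-picked attribute information about the cases themselves is needed.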

The parameter recommendation unit 116 extracts and recommends, from among the parameters applied to the similar case, the parameters with the highest evaluation value (S604). The surrounding search unit 117 searches the neighborhood of the recommended parameters and extracts the best parameters (S605). The relearning unit 118 uses the best parameters to generate the model 124 for analyzing the analysis target case.
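Steps S604 and S605 can be sketched as "pick the best known recipe, then probe its neighbors". The details of the surrounding search are described elsewhere in the document, so the one-parameter-at-a-time neighborhood below is only an assumption, and `recommend`, `surrounding_search`, and the toy `evaluate` function are hypothetical names.

```python
def recommend(similar_case_results):
    """Return the Recipe_ID with the highest evaluation value (S604)."""
    return max(similar_case_results, key=similar_case_results.get)


def surrounding_search(params, evaluate, steps):
    """Probe params +/- one step per parameter and keep the best (S605).

    Assumption: the neighborhood is explored one parameter at a time.
    `evaluate` maps a parameter dict to an evaluation value (higher is better).
    """
    best_params, best_score = dict(params), evaluate(params)
    for name, step in steps.items():
        for delta in (-step, step):
            candidate = dict(params)
            candidate[name] = params[name] + delta
            score = evaluate(candidate)
            if score > best_score:
                best_params, best_score = candidate, score
    return best_params, best_score


# Toy stand-in for training and scoring a model on the analysis target case.
def evaluate(p):
    return -abs(p["param_max_depth"] - 5) - abs(p["param_n_estimator"] - 120) / 100


recipe_id = recommend({"R01": 0.71, "R02": 0.78, "R07": 0.84})
start = {"param_max_depth": 4, "param_n_estimator": 100}
best_params, best_score = surrounding_search(
    start, evaluate, {"param_max_depth": 1, "param_n_estimator": 20}
)
print(recipe_id, best_params)
```

In a real run, `evaluate` would train the model with the candidate parameters on the analysis target case and return its evaluation value, which is far more expensive than this toy function.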

FIG. 7 is a flowchart showing an example of the rule-based problem classification processing in step S601. The rule-based problem classification unit 113 reads the input table 121 and the analysis information 122 (S701).

The rule-based problem classification unit 113 determines whether a problem classification is defined in the analysis information 122, that is, whether a value has been entered in the problem column 1223 (S702).

If the rule-based problem classification unit 113 determines that a problem classification is defined in the analysis information 122 (S702: Yes), it outputs the problem classification indicated by the problem column 1223 as the problem classification result (S707) and ends the rule-based problem classification processing.

If the rule-based problem classification unit 113 determines that no problem classification is defined in the analysis information 122 (S702: No), it identifies the objective variable column of the input table 121 from the objective variable name column 1222 of the analysis information 122, and determines whether the number of unique elements (that is, the number of distinct values) in the objective variable column of the input table 121 is 2 or less (S703).

If the rule-based problem classification unit 113 determines that the number of unique elements in the objective variable column is 2 or less (S703: Yes), it includes in the problem classification result information indicating that the problem of the input table 121 includes a binary classification problem, executes the rare phenomenon determination processing (S705), outputs the problem classification result (S707), and ends the rule-based problem classification processing.

In step S705, if the values of the objective variable column are heavily skewed, the rule-based problem classification unit 113 includes in the problem classification result information indicating that the problem of the input table 121 includes a rare phenomenon problem. Specifically, for example, the rule-based problem classification unit 113 determines that the values of the objective variable column are heavily skewed when their variance is equal to or greater than a predetermined value, or when the difference between the largest and smallest relative frequencies of the values is equal to or greater than a predetermined value.
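The relative-frequency criterion for the rare phenomenon determination can be sketched as follows. The function name `is_rare_phenomenon` and the threshold of 0.8 are assumptions for illustration; the source only says that the difference is compared with a predetermined value.

```python
from collections import Counter


def is_rare_phenomenon(target_values, threshold=0.8):
    """Return True if the objective variable values are heavily skewed.

    Skew is measured as the gap between the largest and smallest relative
    frequencies among the observed values. The threshold is a hypothetical
    stand-in for the "predetermined value".
    """
    if not target_values:
        return False
    total = len(target_values)
    frequencies = [count / total for count in Counter(target_values).values()]
    return max(frequencies) - min(frequencies) >= threshold


# 95 negatives and 5 positives: the gap is about 0.9, so this is flagged.
print(is_rare_phenomenon([0] * 95 + [1] * 5))
# A balanced objective variable has a gap of 0 and is not flagged.
print(is_rare_phenomenon([0] * 50 + [1] * 50))
```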

If the rule-based problem classification unit 113 determines that the number of unique elements in the objective variable column exceeds 2 (S703: No), it determines whether the objective variable is character data (S704). Specifically, for example, the rule-based problem classification unit 113 may refer to the value of the type column 1224 of the analysis information 122 to determine whether the objective variable is character data, or it may determine that the objective variable is character data if the values of the objective variable column of the input table 121 include even one piece of character data. In this way, the rule-based problem classification unit 113 performs problem classification based on the data type of the objective variable.

When the rule-based problem classification unit 113 determines that the objective variable is character data (S704: Yes), it includes in the problem classification result information indicating that the problem of the input table 121 includes a multi-class problem, executes the rare phenomenon determination process (S705), outputs the problem classification result (S707), and ends the rule-based problem classification process.

When the rule-based problem classification unit 113 determines that the objective variable is not character data (that is, it is numerical data) (S704: No), it determines whether the number of unique elements in the objective variable column of the input table 121 is equal to or less than a predetermined threshold α (S705).

When the rule-based problem classification unit 113 determines that the number of unique elements in the objective variable column is equal to or less than α (S705: Yes), it includes in the problem classification result information indicating that the problem of the input table 121 includes a multi-class problem, executes the rare phenomenon determination process (S705), outputs the problem classification result (S707), and ends the rule-based problem classification process. When the rule-based problem classification unit 113 determines that the number of unique elements in the objective variable column exceeds α (S705: No), it includes in the problem classification result information indicating that the problem of the input table 121 is a regression problem, outputs the problem classification result (S707), and ends the rule-based problem classification process.

Through the processing of FIG. 7, the problem classification result for the input table 121 is one of the following: the problem indicated by the problem column 1223 of the analysis information 122, a binary problem, a binary and rare phenomenon problem, a multi-class problem, a multi-class and rare phenomenon problem, or a regression problem.
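The branching described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `classify_problem`, the default threshold `alpha=10`, and the cut-off `imbalance_threshold=0.8` are assumptions chosen for the example, and only the relative-frequency criterion for the rare phenomenon determination is shown. The case in which the problem is already given by the problem column 1223 is omitted.

```python
from collections import Counter


def classify_problem(target_values, alpha=10, imbalance_threshold=0.8):
    """Return a set of problem labels for one objective variable column.

    alpha and imbalance_threshold are hypothetical values; the source only
    says they are predetermined.
    """
    values = list(target_values)
    if not values:
        raise ValueError("the objective variable column is empty")

    n_unique = len(set(values))
    # The column counts as character data if it holds at least one string.
    is_text = any(isinstance(v, str) for v in values)

    if n_unique <= 2:            # unique elements <= 2 (S703: Yes)
        labels = {"binary"}
    elif is_text:                # character data (S704: Yes)
        labels = {"multiclass"}
    elif n_unique <= alpha:      # numeric, but few distinct values
        labels = {"multiclass"}
    else:                        # numeric with many distinct values
        return {"regression"}    # no rare phenomenon check on this branch

    # Rare phenomenon determination: difference between the largest and
    # smallest relative frequency of the objective variable values.
    counts = Counter(values)
    freqs = [c / len(values) for c in counts.values()]
    if max(freqs) - min(freqs) >= imbalance_threshold:
        labels.add("rare")
    return labels
```

For instance, a column with 95 zeros and 5 ones would be labeled both binary and rare under these assumed thresholds, while a balanced 0/1 column would be labeled binary only.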

FIG. 8 is a flowchart showing an example of the common parameter search process in step S602. The common parameter search unit 114 reads the input table 121, the analysis information 122, and the analysis database 123 corresponding to the problem indicated by the problem classification result output in step S707 (S801).

The common parameter search unit 114 identifies the model and the Recipe_ID of each record whose value in the Common_Search column 1233 of the analysis database 123 read in step S801 is 1, and extracts the parameter sets corresponding to the identified Recipe_IDs from the parameter database 125 as common_grid (S802).

For each parameter set extracted as common_grid, the common parameter search unit 114 applies the parameter set to the model associated with it in the analysis database 123, applies that model to the values of the explanatory variables of the input table 121, and calculates the resulting evaluation value of the model (for example, prediction accuracy (Accuracy)) (S803).

The common parameter search unit 114 outputs the searched parameter group θ_{search}, which is the set of parameter sets extracted in step S802, in association with the evaluation values obtained in step S803 (S804).
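The flow of steps S801 to S804 can be sketched as follows. This is an illustrative sketch only: the analysis database is assumed to be a list of dictionaries keyed by the column names used in this description, the parameter database is assumed to be a dictionary from Recipe_ID to a parameter dictionary, and `evaluate` is a hypothetical callback that trains and scores the named model on the input table.

```python
def common_parameter_search(analysis_db, parameter_db, evaluate):
    """Evaluate every common_grid parameter set on the analysis target case.

    analysis_db:  records for one problem type (list of dicts).
    parameter_db: Recipe_ID -> parameter set (dict).
    evaluate:     hypothetical callback (model_name, params) -> evaluation value.
    Returns Recipe_ID -> evaluation value on the analysis target case.
    """
    # S802: collect the (Recipe_ID, model) pairs flagged with Common_Search == 1.
    # The same Recipe_ID appears once per analyzed case, so duplicates collapse.
    common_grid = {}
    for record in analysis_db:
        if record["Common_Search"] == 1:
            common_grid[record["Recipe_ID"]] = record["Model_Name"]

    # S803: score each parameter set with its model on the new data.
    evaluations = {}
    for recipe_id, model_name in common_grid.items():
        evaluations[recipe_id] = evaluate(model_name, parameter_db[recipe_id])

    # S804: the searched parameter group and its evaluation values, together.
    return evaluations
```

Because only the flagged records are evaluated, the cost of this step grows with the size of common_grid rather than with the total number of parameter sets stored for past cases.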

FIG. 9 is a flowchart showing an example of the similar case extraction process in step S603. The similar case extraction unit 115 reads the input table 121, the analysis information 122, the searched parameter group θ_{search} output in step S804, and the analysis database 123 corresponding to the problem indicated by the problem classification result output in step S707 (S901).

For each Analysis_ID of an analyzed case in the analysis database 123, the similar case extraction unit 115 compares the evaluation values (Accuracy) for common_grid with the evaluation values for the searched parameter group θ_{search}, and calculates the degree of similarity between that analyzed case and the analysis target case of the input table 121 (S902).

In step S902, the similar case extraction unit 115 calculates as the degree of similarity, for example, the correlation coefficient, the Euclidean distance, the Manhattan distance, or the absolute error between the evaluation values of the analyzed case for common_grid and the evaluation values for the searched parameter group θ_{search}. The similar case extraction unit 115 outputs, as the similar case ID, the Analysis_ID of the analyzed case with the highest degree of similarity to the analysis target case of the input table 121 (S903).

FIG. 10 is an explanatory diagram showing a specific example of the similar case extraction process shown in FIG. 9. As described above, the similar case extraction unit 115 calculates the degree of similarity between the Accuracy values obtained for the analysis target case of the input table 121, that is, the evaluation values of the models when the searched parameter group θ_{search} is applied, and the Accuracy values of the parameter sets corresponding to θ_{search} in each analyzed case of the analysis database 123. As described above, the analysis database 123 stores a plurality of analyzed cases, each assigned an Analysis_ID that identifies it.

The correlation coefficients between the evaluation values of the analyzed cases whose Analysis_IDs are "1", "2", "3", and "4" and the evaluation values of the analysis target case obtained from the input table 121 are r = 0.11, r = 0.32, r = 0.25, and r = 0.85, respectively. Therefore, in the example of FIG. 10, the analyzed case most similar to the analysis target case is the one whose Analysis_ID is "4", with r = 0.85.

Accordingly, in the example of FIG. 10, the similar case extraction unit 115 outputs "4" as the similar case ID. When the correlation coefficient is used as the degree of similarity, the similar case extraction unit 115 may instead output, as the similar case ID, the Analysis_ID of the analyzed case whose correlation coefficient has the largest absolute value.
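The comparison in steps S902 and S903 can be illustrated with the correlation coefficient as the similarity measure. The sketch below assumes that the evaluation values are held in dictionaries keyed by Recipe_ID; the function names and the `use_abs` switch (for the absolute-value variant mentioned above) are illustrative choices, not names from the source.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: treat a constant sequence as uncorrelated
    return cov / (norm_x * norm_y)


def most_similar_case(target_scores, analyzed_scores, use_abs=False):
    """Pick the analyzed case whose common_grid scores best match the target.

    target_scores:   Recipe_ID -> evaluation value on the analysis target case.
    analyzed_scores: Analysis_ID -> {Recipe_ID -> evaluation value}.
    Returns (similar case ID, {Analysis_ID -> correlation coefficient}).
    """
    recipe_ids = sorted(target_scores)  # fix one order for both sequences
    target = [target_scores[rid] for rid in recipe_ids]

    similarities = {}
    for analysis_id, scores in analyzed_scores.items():
        past = [scores[rid] for rid in recipe_ids]
        similarities[analysis_id] = pearson(target, past)

    if use_abs:
        best_id = max(similarities, key=lambda a: abs(similarities[a]))
    else:
        best_id = max(similarities, key=lambda a: similarities[a])
    return best_id, similarities
```

If a distance such as the Euclidean or Manhattan distance were used instead, the most similar case would be the one with the smallest value, so `max` would be replaced by `min`.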

Through the processing described above, the analysis support device 100 calculates the evaluation values of the models for the analysis target case indicated by the input table 121 only for common_grid among the analysis results of the analyzed cases, and extracts a similar case by comparing the calculated evaluation values with the evaluation values of the models in the previously analyzed cases. Similar cases can therefore be extracted with a small amount of processing.

In addition, in the processing described above, the analysis support device 100 can extract similar cases using the analysis database 123 (the models and parameters of the analyzed cases and the evaluation values of the models) without using the data of the analyzed cases themselves (the attribute information of the analyzed cases). This reduces the amount of data, and even if the analysis database 123 were leaked, personal information and the like would not be leaked.

FIG. 11 is a flowchart showing an example of the parameter recommendation process in step S604. The parameter recommendation unit 116 reads the input table 121, the analysis information 122, the similar case ID output in step S903, the parameter database 125, and the analysis database 123 corresponding to the problem classification result (S1101).

The parameter recommendation unit 116 refers to the analysis database 123 corresponding to the problem classification result and retrieves, as the best evaluation value, the highest Accuracy among the evaluation values of the analyzed case indicated by the similar case ID (S1102).

The parameter recommendation unit 116 acquires the Recipe_ID of the record showing the best evaluation value and acquires the parameter set corresponding to that Recipe_ID from the parameter database 125 (S1103). The parameter recommendation unit 116 outputs the acquired parameter set as the recommended parameter θ_{recommend} (S1104) and ends the parameter recommendation process.

In the example of FIG. 11, the parameter recommendation unit 116 recommends the single parameter set that showed the highest evaluation value in the similar case. Alternatively, it may, for example, output the parameter set that showed the lowest evaluation value in the similar case as the recommended parameter θ_{recommend}, or output a parameter set selected by a criterion other than the magnitude of the evaluation value as the recommended parameter θ_{recommend}.

The parameter recommendation unit 116 may also, for example, output as the recommended parameter θ_{recommend} a predetermined number (for example, 10) of parameter sets in descending order of evaluation value in the similar case, or a predetermined number (for example, 10) of parameter sets in ascending order of evaluation value in the similar case.

The parameter recommendation unit 116 may also, for example, output as the recommended parameter θ_{recommend} all parameter sets whose evaluation value in the similar case is equal to or greater than a predetermined value, or all parameter sets whose evaluation value in the similar case is less than a predetermined value.
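The lookup in steps S1102 to S1104, together with the top-N variants described above, can be sketched as follows. The data layout (a list of record dictionaries and a dictionary from Recipe_ID to parameter set) and the argument names `top_k` and `highest_first` are assumptions made for illustration.

```python
def recommend_parameters(analysis_db, parameter_db, similar_case_id,
                         top_k=1, highest_first=True):
    """Return the recommended parameter sets taken from the similar case.

    analysis_db:     list of records with Analysis_ID, Recipe_ID, Accuracy.
    parameter_db:    Recipe_ID -> parameter set (dict).
    similar_case_id: Analysis_ID output by the similar case extraction.
    top_k:           how many parameter sets to recommend (hypothetical name).
    highest_first:   True ranks by highest Accuracy, False by lowest.
    """
    # Keep only the records that belong to the similar case.
    records = [r for r in analysis_db if r["Analysis_ID"] == similar_case_id]

    # Rank the records by evaluation value (S1102).
    records.sort(key=lambda r: r["Accuracy"], reverse=highest_first)

    # Resolve each Recipe_ID to its parameter set (S1103) and return (S1104).
    return [parameter_db[r["Recipe_ID"]] for r in records[:top_k]]
```

With `top_k=1` this corresponds to the single-recommendation example of FIG. 11; a threshold-based variant would filter `records` on `Accuracy` instead of slicing.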

FIG. 12 is a flowchart showing an example of the peripheral search process in step S605. The peripheral search unit 117 reads the input table 121, the analysis information 122, and the recommended parameter θ_{recommend} output in step S1104 (S1201).

Using the recommended parameter θ_{recommend} as the initial value, the peripheral search unit 117 applies models configured with parameter sets in the neighborhood of θ_{recommend} to the explanatory variables of the input table 121 and searches for the parameter set showing the best evaluation value (S1202). The peripheral search unit 117 outputs the parameter set showing the best evaluation value as the best parameter θ_{best} (S1203) and ends the peripheral search process.

In this way, by searching the neighborhood of the recommended parameter θ_{recommend} obtained from the similar case, the peripheral search unit 117 can obtain the best parameter θ_{best}, which gives a highly accurate result, with a small amount of processing.

In step S1202, the peripheral search unit 117 searches, for example, parameter sets within a predetermined range.

Alternatively, in step S1202, the peripheral search unit 117 may search parameter sets within a range that is centered on the recommended parameter θ_{recommend} and updated (for example, enlarged or reduced) based on a predetermined condition.

Specifically, for example, the peripheral search unit 117 calculates var = (x_{max} − x_{min}) × (1 − |Corr|), where x_{max} and x_{min} are the maximum and minimum values of a predetermined parameter range and Corr is the correlation coefficient between the similar case and the analysis target case indicated by the input table 121. The peripheral search unit 117 then searches for parameter sets in the range whose minimum value is x_{new_min} = x_{recommend} − var and whose maximum value is x_{new_max} = x_{recommend} + var, where x_{recommend} is the recommended parameter.
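The range update can be written directly from the formula above. In the sketch below, `updated_search_range` follows the formula as stated, while `neighborhood_search`, the evenly spaced candidate grid, and the `steps` argument are assumptions added only to show how the range might be used; the source does not specify how candidates inside the range are generated.

```python
def updated_search_range(x_recommend, x_min, x_max, corr):
    """Return (x_new_min, x_new_max) centered on the recommended parameter.

    var = (x_max - x_min) * (1 - |Corr|), so the range narrows as the
    similar case correlates more strongly with the analysis target case.
    """
    var = (x_max - x_min) * (1 - abs(corr))
    return x_recommend - var, x_recommend + var


def neighborhood_search(evaluate, x_recommend, x_min, x_max, corr, steps=11):
    """Evaluate evenly spaced candidates in the updated range (an assumption).

    evaluate is a hypothetical callback mapping a parameter value to an
    evaluation value; the candidate with the highest value is returned.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    lo, hi = updated_search_range(x_recommend, x_min, x_max, corr)
    candidates = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(candidates, key=evaluate)
```

With Corr = 0.85 and a predetermined range of 0 to 100, for example, var is about 15, so the search covers roughly x_{recommend} ± 15. Note that the formula as stated does not clamp the new range to the original [x_{min}, x_{max}].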

FIG. 13 is a flowchart showing an example of the relearning process in step S606. The relearning unit 118 reads the input table 121, the analysis information 122, and the best parameter θ_{best} output in step S1203 (S1301).

The relearning unit 118 performs preprocessing on the input table 121 based on the best parameter θ_{best} (S1302). Normalization of numeric columns, unification of spelling variants in string columns, dimensionality reduction using PCA (Principal Component Analysis) or the like, conversion of string columns into dummy variables, outlier processing, and abnormal value processing are all examples of preprocessing.

The relearning unit 118 generates features from the input table 121 in accordance with the best parameter θ_{best} (S1303). Specifically, for example, the relearning unit 118 generates features by splitting numeric columns, logarithmic transformation, exponential transformation, time-series feature transformation, and/or conversion of time data into year, month, and day.

The relearning unit 118 performs modeling on the input table 121 using the best parameter θ_{best} (S1304). Linear regression, logistic regression, SVM, SVR, Gaussian Process, Random Forest, LightGBM, XGBoost, and Neural Network are all examples of modeling. The relearning unit 118 outputs the model 124 obtained by the modeling in step S1304, which predicts the objective variable of the case of the input table 121, together with the analysis result produced by the model 124 (S1305), and ends the relearning process.

FIG. 14 is a flowchart showing an example of the analysis database creation process. Because the analysis database creation process accumulates the analysis results for the input table 121 in the analysis database 123, it is preferably performed after the automatic analysis execution process of FIG. 6 has finished.

The analysis database creation unit 112 reads the input table 121 and the analysis information 122 (S1401).

The analysis database creation unit 112 executes the common parameter search process (S1402). In step S1402, the analysis database creation unit 112 searches parameter sets for the input table 121 in the same manner as in the common parameter search process of FIG. 8.

The analysis database creation unit 112 executes a detailed search (S1403). In step S1403, the analysis database creation unit 112 searches parameter sets other than the parameters common to the analysis cases that were searched in step S1402 (that is, other than common_grid), for example, all parameter sets within a predetermined range of each common_grid.

Among all the parameter sets searched in steps S1402 and S1403, the analysis database creation unit 112 assigns 1 as the Best_Flag to the sample showing the highest evaluation value (S1404). Because the processing of step S1404 serves to speed up operation at run time, it does not necessarily have to be executed.

The analysis database creation unit 112 assigns an Analysis_ID to the case of the input table 121, records the Analysis_ID, Model_Name, Recipe_ID, Best_Flag, and Accuracy in the analysis database 123 (S1404), and ends the analysis database creation process.

When the analysis database creation unit 112 has searched a parameter set that is not stored in the parameter database 125, it assigns a Recipe_ID to that parameter set and records the Recipe_ID and the parameter set in the parameter database 125.
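The bookkeeping of the analysis database creation process can be sketched as follows. This is a simplified illustration under several assumptions: Recipe_IDs are taken to be consecutive integers, the search results are passed in as `(model_name, params, accuracy)` tuples, and the helper names `register_recipe` and `build_analysis_records` are hypothetical.

```python
def register_recipe(parameter_db, params):
    """Return the Recipe_ID for params, adding a new entry if it is unknown."""
    for recipe_id, existing in parameter_db.items():
        if existing == params:
            return recipe_id
    # Assumption: integer Recipe_IDs, the next one being max + 1.
    recipe_id = max(parameter_db, default=0) + 1
    parameter_db[recipe_id] = dict(params)
    return recipe_id


def build_analysis_records(analysis_id, results, parameter_db):
    """Turn search results into analysis database records.

    results: list of (model_name, params, accuracy) tuples covering both the
    common parameter search and the detailed search.
    """
    if not results:
        raise ValueError("no search results to record")

    # Best_Flag marks the single result with the highest evaluation value.
    best_index = max(range(len(results)), key=lambda i: results[i][2])

    records = []
    for i, (model_name, params, accuracy) in enumerate(results):
        records.append({
            "Analysis_ID": analysis_id,
            "Model_Name": model_name,
            "Recipe_ID": register_recipe(parameter_db, params),
            "Best_Flag": 1 if i == best_index else 0,
            "Accuracy": accuracy,
        })
    return records
```

Storing a Best_Flag at creation time means the parameter recommendation step can look up the best record directly instead of sorting all records of the similar case.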

The value of the Common_Search column 1233 of the analysis database 123 may be recorded, for example, according to input from the user of the analysis support device 100, or may be recorded automatically (for example, 1 for records of predetermined models that show an Accuracy equal to or greater than a predetermined value, and 0 for all other records).

FIG. 15 is an example of a display screen shown on the output device 105 when the automatic analysis execution process is performed. The display screen 1500 includes, for example, an output information display area 1501, a similar case extraction and recommendation execution button 1502, a re-search execution button 1503, and a model learning execution button 1504.

When the similar case extraction and recommendation execution button 1502 is selected, the automatic analysis execution process of FIG. 6 is performed, and the output information display area 1501 displays, for example, the similar case ID extracted in the similar case extraction process of FIG. 9, the degree of similarity between the analysis target case and that similar case, and the recommended parameter output in the parameter recommendation process of FIG. 11.

When the re-search execution button 1503 is selected, the peripheral search process of FIG. 12 is executed and the best parameter is additionally displayed in the output information display area 1501. When the model learning execution button 1504 is selected, the relearning process of FIG. 13 is executed and the output model is additionally displayed.

The present invention is not limited to the embodiments described above and includes various modifications. For example, the embodiments above have been described in detail to explain the present invention clearly, and the invention is not necessarily limited to configurations having all of the described elements. Part of the configuration of one embodiment may be replaced with the configuration of another embodiment, and the configuration of another embodiment may be added to the configuration of one embodiment. For part of the configuration of each embodiment, other configurations may be added, deleted, or substituted.

Some or all of the configurations, functions, processing units, processing means, and the like described above may be implemented in hardware, for example by designing them as integrated circuits. The configurations, functions, and the like described above may also be implemented in software, with a processor interpreting and executing programs that implement the respective functions. Information such as the programs, tables, and files that implement each function can be stored in a memory, in a storage device such as a hard disk or an SSD (Solid State Drive), or on a recording medium such as an IC card, an SD card, or a DVD.

The control lines and information lines shown are those considered necessary for the explanation, and not all control lines and information lines of an actual product are necessarily shown. In practice, almost all of the components may be considered to be interconnected.

100 analysis support device, 101 CPU, 102 memory, 103 auxiliary storage device, 104 input device, 105 output device, 106 communication device, 111 automatic analysis execution unit, 112 analysis database creation unit, 113 rule-based problem classification unit, 114 common parameter search unit, 115 similar case extraction unit, 116 parameter recommendation unit, 117 peripheral search unit, 118 relearning unit, 121 input table, 122 analysis information, 123 analysis database, 124 model, 125 parameter database

Claims

An analysis support device
comprising a processor and a memory,
wherein the memory holds:
analysis target case data indicating explanatory variables and an objective variable of an analysis target case; and
analysis evaluation data indicating combinations of a model that analyzed an analyzed case and parameters applied to the model, and an evaluation value of the model when the analyzed case was analyzed by the model to which the parameters were applied, and
wherein the processor:
calculates the evaluation value of the model when the objective variable of the analysis target case is predicted by applying predetermined combinations of models and parameters included in the analysis evaluation data to the explanatory variables of the analysis target case;
calculates a degree of similarity by comparing the calculated evaluation value with each evaluation value indicated by the analysis evaluation data; and
identifies a similar case, which is an analyzed case similar to the analysis target case, based on the calculated degree of similarity.

The analysis support device according to claim 1,
wherein the processor refers to the analysis evaluation data and determines, as a recommended parameter, the parameters of the combination with the highest evaluation value among the combinations of a model that analyzed the similar case and parameters applied to the model.

The analysis support device according to claim 2,
further comprising a display device,
wherein the processor displays information indicating the similar case, the degree of similarity, and the recommended parameter on the display device.

The analysis support device according to claim 2,
wherein the processor:
searches for parameters in a search range that includes the recommended parameter and satisfies a predetermined condition; and
determines, as a best parameter, the searched parameter that yields the highest evaluation value of the model when the objective variable of the analysis target case is predicted by the model corresponding to the recommended parameter with the searched parameter applied.

The analysis support device according to claim 4,
wherein the processor:
calculates a correlation coefficient between the calculated evaluation value and the evaluation value of the similar case; and
determines the search range so that the search range becomes smaller as the absolute value of the correlation coefficient becomes larger.

The analysis support device according to claim 4,
wherein the processor outputs an analysis result obtained by applying, to the analysis target case, the model corresponding to the recommended parameter with the best parameter applied.

The analysis support device according to claim 1,
wherein the processor stores, in the analysis evaluation data, the combinations of the predetermined subset of models that analyzed the analysis target case and the parameters applied to those models, together with the calculated evaluation values.

The analysis support device according to claim 1,
wherein the analysis evaluation data indicates the problem solved by the model in each analyzed case, and
wherein the processor:
identifies the problem to be solved by a model in the analysis target case based on the number of elements of the objective variable, the data type of the objective variable, and the relative frequencies of the values of the objective variable in the analysis target case data; and
calculates the evaluation value of the model when the objective variable of the analysis target case is predicted by those combinations of models and parameters, among the predetermined combinations of models and parameters included in the analysis evaluation data, that correspond to the identified problem.

An analysis support method performed by an analysis support device,
wherein the analysis support device has a processor and a memory,
wherein the memory holds:
analysis target case data indicating explanatory variables and an objective variable of an analysis target case; and
analysis evaluation data indicating combinations of a model that analyzed an analyzed case and parameters applied to the model, and an evaluation value of the model when the analyzed case was analyzed by the model to which the parameters were applied, and
wherein the analysis support method comprises:
calculating, by the processor, the evaluation value of the model when the objective variable of the analysis target case is predicted by applying predetermined combinations of models and parameters included in the analysis evaluation data to the explanatory variables of the analysis target case;
calculating, by the processor, a degree of similarity by comparing the calculated evaluation value with each evaluation value indicated by the analysis evaluation data; and
identifying, by the processor, a similar case, which is an analyzed case similar to the analysis target case, based on the calculated degree of similarity.