JP2021149893A

JP2021149893A - Neural network analysis apparatus, neural network analysis method, and program

Info

Publication number: JP2021149893A
Application number: JP2020052073A
Authority: JP
Inventors: 秀将伊藤; Hidemasa Ito
Original assignee: Toshiba Corp; Toshiba Digital Solutions Corp
Current assignee: Toshiba Corp; Toshiba Digital Solutions Corp
Priority date: 2020-03-24
Filing date: 2020-03-24
Publication date: 2021-09-27
Anticipated expiration: 2040-03-24
Also published as: JP7374829B2

Abstract

To provide a neural network analysis apparatus, a neural network analysis method, and a program that facilitate interpretation of a trained model while suppressing deterioration of prediction accuracy. SOLUTION: A neural network analysis apparatus according to an embodiment comprises a node usage index pattern acquisition unit and a clustering unit. The node usage index pattern acquisition unit acquires a node usage index pattern for each item of test data input to a trained model represented by an analysis target neural network, which is the neural network to be analyzed. The node usage index pattern is information indicating, for each node of the input layer, intermediate layer and output layer of the neural network, a node usage index, which is a value indicating the degree to which the node is used in response to the input to the input layer. The clustering unit clusters the test data and the node usage index patterns on the basis of the node usage index patterns acquired by the node usage index pattern acquisition unit. SELECTED DRAWING: Figure 1

Description

Embodiments of the present invention relate to a neural network analysis device, a neural network analysis method, and a program.

In recent years, it has become common to perform machine learning with a neural network and to predict results using the resulting trained model. However, because a trained model generated by machine learning with a neural network is a non-linear model, it can be difficult to interpret. Here, interpreting a trained model means knowing the basis for its prediction results.

To address this, methods have been proposed that approximate the trained model with a linear model (see Patent Document 1), or that approximate the trained model so as to simplify a combination of non-linear models (see Non-Patent Documents 1 and 2). However, none of these methods analyzes the state of the nodes of the neural network, such as whether or not each node has fired (activation). As a result, the methods proposed so far have in some cases been unable to achieve both prediction accuracy and ease of interpretation.

Patent Document 1: JP-A-2018-169960

Non-Patent Document 1: Leo Breiman and Nong Shang, "BORN AGAIN TREES", [online], [retrieved March 19, 2020], Internet <URL: https://www.stat.berkeley.edu/users/breiman/BAtrees.pdf>

Non-Patent Document 2: Marco Tulio Ribeiro, Sameer Singh and Carlos Guestrin, ""Why Should I Trust You?" Explaining the Predictions of Any Classifier", [online], [retrieved March 19, 2020], Internet <URL: https://www.kdd.org/kdd2016/papers/files/rfp0573-ribeiroA.pdf>

The problem to be solved by the present invention is to provide a neural network analysis device, a neural network analysis method, and a program that facilitate interpretation of a trained model while suppressing a decrease in prediction accuracy.

The neural network analysis device of the embodiment has a node usage index pattern acquisition unit and a clustering unit. The node usage index pattern acquisition unit acquires a node usage index pattern for each item of test data input to the trained model represented by the analysis target neural network, which is the neural network to be analyzed. The node usage index pattern is information indicating a node usage index for each node of the input layer, the intermediate layer, and the output layer of the neural network, and the node usage index is a value indicating the degree to which a node of the neural network is used in response to the input to the input layer. The clustering unit clusters the test data and the node usage index patterns based on the node usage index patterns acquired by the node usage index pattern acquisition unit.

A first explanatory diagram outlining the neural network analysis device 1 of the embodiment.

A second explanatory diagram outlining the neural network analysis device 1 of the embodiment.

A diagram showing an example of the hardware configuration of the neural network analysis device 1 of the embodiment.

A diagram showing an example of the functional configuration of the control unit 10 in the embodiment.

A flowchart showing an example of the flow of processing in the analysis processing initialization stage executed by the control unit 10 in the embodiment.

A flowchart showing an example of the flow of processing in the analysis processing update stage executed by the control unit 10 in the embodiment.

Hereinafter, the neural network analysis device, the neural network analysis method, and the program of the embodiment will be described with reference to the drawings. First, an outline of the neural network analysis device of the embodiment will be given with reference to FIGS. 1 and 2.

FIG. 1 is a first explanatory diagram outlining the neural network analysis device 1 of the embodiment. The neural network analysis device 1 approximates the trained model represented by the neural network to be analyzed (hereinafter the "analysis target model") with a linear model, based on the node usage index patterns obtained when test data is input to the analysis target model. Hereinafter, the neural network to be analyzed is referred to as the analysis target neural network.

The node usage index pattern is information indicating the node usage index for each node of the input layer, the intermediate layer, and the output layer of the neural network. The node usage index is a value indicating the degree to which a node is used in response to the input to the input layer. The node usage index is, for example, the value of the slope of the node's activation function. For a node whose input-output relationship is represented by a linear function, the node usage index may be the slope of that linear function. The node usage index is represented by, for example, the following equations (1) and (2).

In equations (1) and (2), σ is an activation function whose variable is f(x). f(x) is a function representing the value input to a node, and it outputs a value that depends on x. x is the output value of the nodes of the immediately preceding layer; hereinafter, x is referred to as the previous node output value. In the input layer, which has no preceding layer, the previous node output value x is the value input to the input layer itself. The value input to the input layer is, for example, a feature of the data input to the analysis target model. In equations (1) and (2), w and b are weight parameters of the analysis target neural network, and x, w and b are expressed as tensors such as matrices and vectors.
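Equations (1) and (2) themselves are not reproduced in this text. One reading that is consistent with the symbol definitions above is given below as an assumption, not as the published formulas, and without claiming which line corresponds to which equation number. The symbol u for the node usage index is introduced here only for illustration.

```latex
% Assumed form: the node usage index u is the slope of the activation function
% at the value input to the node, and that value is affine in the previous
% node output value x.
u = \frac{\partial \sigma\bigl(f(x)\bigr)}{\partial f(x)}, \qquad f(x) = w x + b
```

Under this reading, a ReLU activation gives u = 1 when f(x) > 0 and u = 0 otherwise, which matches the 0-or-1 behavior described for ReLU nodes.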

Hereinafter, for simplicity, the neural network analysis device 1 will be described using the case where the node usage index is the value of the slope of the node's activation function.

When the activation function of a node is ReLU, for example, the node usage index takes the value 0 or 1. In that case, a node usage index of 1 means that the node has fired (activation).

The node usage index of a node of the input layer is a value that is in a bijective relationship with the value input to the node. For example, the node usage index of a node of the input layer is identical to the value input to the node. Hereinafter, for simplicity, the neural network analysis device 1 will be described using the case where the node usage index of a node of the input layer is identical to the value input to the node.

The node usage indices of the intermediate layer and the output layer are not necessarily in a bijective relationship with the values input to the nodes. For example, when the activation function of the nodes of the intermediate layer and the output layer is ReLU, the node usage index of those nodes has two candidate values, 0 and 1. In that case, the node usage index is 1 when the value input to the node is greater than 0, and it is 0 when the value input to the node is 0 or less. Many different input values therefore map to the same node usage index.

In a neural network, the output of a node in a preceding layer (hereinafter the "node output value") is input to the nodes of the following layer. For example, the node output values of the nodes of the input layer are input to the nodes of the first intermediate layer. The node output values of the nodes of the n-th intermediate layer are input to the nodes of the (n+1)-th intermediate layer (n is a natural number). The node output values of the nodes of the last intermediate layer are input to the nodes of the output layer.
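The definitions above can be sketched in code. The following is a minimal illustration, assuming a small fully connected network with ReLU activations in every intermediate and output node. The function name `usage_pattern` and the example weights are hypothetical and do not come from the embodiment.

```python
from typing import List, Sequence

Layer = Sequence[Sequence[float]]  # one weight row per node in the layer


def usage_pattern(x: Sequence[float],
                  weights: Sequence[Layer],
                  biases: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return the node usage index of every node, layer by layer.

    Assumptions (not taken from the source): dense layers, ReLU in every
    intermediate and output node, usage index = slope of ReLU (0 or 1).
    """
    pattern: List[List[float]] = [list(x)]  # input layer: index equals the input value
    out = list(x)
    for w, b in zip(weights, biases):
        # Value input to each node of this layer.
        pre = [sum(wi * oi for wi, oi in zip(row, out)) + bi
               for row, bi in zip(w, b)]
        pattern.append([1 if z > 0 else 0 for z in pre])  # slope of ReLU
        out = [z if z > 0 else 0.0 for z in pre]          # node output value
    return pattern


# Hypothetical 2-2-1 network used only for illustration.
WEIGHTS = [[[1.0, -1.0], [-1.0, 1.0]], [[1.0, 1.0]]]
BIASES = [[0.0, 0.0], [-0.5]]

pattern_a = usage_pattern([1, 0], WEIGHTS, BIASES)
pattern_b = usage_pattern([1, 1], WEIGHTS, BIASES)
```

With these example weights, the input [1, 0] excites the first intermediate node and the output node, while the input [1, 1] excites neither intermediate node nor the output node, so the two inputs yield different patterns.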

FIG. 2 is a second explanatory diagram outlining the neural network analysis device 1 of the embodiment. FIG. 2 shows four types of node usage index patterns, P1, P2, P3 and P4. The analysis target neural network whose node usage index patterns are shown in FIG. 2 has one node in the output layer, and the node usage index of that node has four candidate values, for example 0, 1, 2 and 3.

In FIG. 2, the node usage index of the output layer node differs among the node usage index patterns P1, P2, P3 and P4. The lines connecting nodes of a preceding layer to nodes of the following layer (hereinafter "node connection lines") are drawn only as a visual aid for the explanation. A node connection line indicates a node of the preceding layer that is dominant in determining the node usage index of a node of the following layer. A dominant node of the preceding layer is a node for which a change in its node usage index causes a change of at least a predetermined magnitude in the node usage index of the node of the following layer.

The analysis target neural network of FIG. 2 determines, according to the input values, which of the four node usage index patterns P1, P2, P3 and P4 is appropriate, and determines its output using the node usage index pattern so determined. Because FIG. 2 has only four types of node usage index patterns, a user of the analysis target neural network is likely to find its operation easy to interpret. However, the more types of node usage index patterns there are, the harder it becomes to interpret the operation of the analysis target neural network. The neural network analysis device 1 therefore classifies the node usage index patterns of the analysis target neural network.

For the analysis target neural network of FIG. 2, for example, the neural network analysis device 1 classifies the node usage index patterns into one or more sets such that the node usage index pattern P1 and the node usage index pattern P3 are treated as the same type of node usage index pattern. After the classification, the neural network analysis device 1 acquires node index information for each set. The node index information is information indicating a node index for each node of the analysis target neural network. The node index is an index representing a node probability distribution. The node probability distribution is a probability distribution defined for each node of the analysis target neural network, with the node usage index as its random variable.

The node index is, for example, the expected value of the node probability distribution. The node index may instead be, for example, the median of the node probability distribution, or an index expressed by the Kullback-Leibler distance function. In FIG. 2, the information EV1 is the node index information for the set containing the node usage index pattern P1 and the node usage index pattern P3. The values of the five nodes of the information EV1 in FIG. 2 are therefore, for example, the expected values of the node probability distributions of the respective nodes. Specifically, the probability represented by the probability distribution is the probability of appearance (responsibility) of each node usage index pattern in each cluster.
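As a rough illustration of the expected-value case, the sketch below averages the node usage index of each node over the patterns in one set, optionally weighting each pattern by its probability of appearance. The function name `node_index_information`, the five-node example patterns, and the weights are hypothetical; the actual values shown in FIG. 2 are not available in this text.

```python
from typing import List, Optional, Sequence

Pattern = Sequence[Sequence[float]]  # node usage index per node, layer by layer


def node_index_information(patterns: Sequence[Pattern],
                           weights: Optional[Sequence[float]] = None) -> List[float]:
    """Expected node usage index of each node over the patterns in one set.

    `weights` stands in for the probability of appearance of each pattern in
    the set; equal weights are assumed when it is omitted.
    """
    if weights is None:
        weights = [1.0] * len(patterns)
    total = sum(weights)
    # Flatten each pattern so that position i always refers to the same node.
    flat = [[v for layer in p for v in layer] for p in patterns]
    return [sum(w * v for w, v in zip(weights, column)) / total
            for column in zip(*flat)]


# Hypothetical stand-ins for two patterns of a 2-2-1 network (five nodes).
p1 = [[1, 0], [1, 0], [1]]
p3 = [[1, 0], [0, 1], [1]]

ev_equal = node_index_information([p1, p3])
ev_weighted = node_index_information([p1, p3], weights=[0.75, 0.25])
```

Nodes on which the two patterns agree keep an expected value of exactly 0 or 1, while nodes on which they differ take an intermediate value that reflects how often each pattern appears in the set.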

Based on the node index information, the neural network analysis device 1 generates, for each set obtained by the classification, one linear model representing the distribution of the node usage index parameters belonging to that set. After the linear models have been generated, the neural network analysis device 1 updates them each time test data is input. Hereinafter, the single linear model generated for each set is referred to as the representative linear model. The collection of the representative linear models of all sets is the approximation result model.

In this way, the neural network analysis device 1 classifies the node usage index patterns of the analysis target neural network, thereby reducing the number of types of node usage index patterns, and then acquires a linear model for each type. The neural network analysis device 1 thus makes it easier for a user to interpret the operation of the neural network.

A linear model is a model in which the output for a sum of inputs equals the sum of the outputs for the individual inputs (that is, a linear map). The neural network analysis device 1 may approximate the analysis target model with a single linear model or with a plurality of linear models. Approximating the analysis target model with a linear model means acquiring a linear model whose input-output relationship resembles that of the analysis target model to some degree (hereinafter the "approximation result model").
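The link between a fixed node usage index pattern and a linear map can be illustrated as follows. This is a sketch under stated assumptions, not the embodiment's procedure for generating a representative linear model: it assumes a bias-free ReLU network, so that fixing the 0/1 gates of the intermediate and output nodes collapses the network into a single matrix. With biases the collapsed map would be affine rather than linear. All names and weights are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gated_linear_map(weights: Sequence[Matrix],
                     gates: Sequence[Sequence[float]]) -> Matrix:
    """Collapse a bias-free ReLU network with fixed gates into one matrix.

    gates[l][i] is the usage index (0 or 1) of node i in non-input layer l.
    """
    eff: Matrix = []
    for w, g in zip(weights, gates):
        gated = [[gi * wij for wij in row] for row, gi in zip(w, g)]
        eff = gated if not eff else matmul(gated, eff)
    return eff


def apply(m: Matrix, x: Sequence[float]) -> List[float]:
    """Apply the linear map m to the input vector x."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]


# Hypothetical 2-2-1 network without biases, and the gates produced by input [1, 0].
weights = [[[1.0, -1.0], [-1.0, 1.0]], [[2.0, 3.0]]]
gates = [[1, 0], [1]]

m = gated_linear_map(weights, gates)
y = apply(m, [1, 0])
```

The collapsed map satisfies the additivity property stated above: `apply(m, [1, 1])` equals the element-wise sum of `apply(m, [1, 0])` and `apply(m, [0, 1])`.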

Test data is data input to the analysis target model. The test data may be any data that can be input to the analysis target model.

The description now returns to FIG. 1. FIG. 2 was used to outline the operation of the neural network analysis device 1 with a neural network whose output layer node has four candidates for the node usage index. However, the more candidates a node has for its node usage index, the harder the explanation becomes. For simplicity, the neural network analysis device 1 is therefore described below using the case where each node has two candidates for the node usage index. Specifically, a neural network in which each node has two candidates for the node usage index is a neural network that satisfies the following input information condition, input layer condition, and ReLU firing condition.

The input information condition is that the information input to the neural network indicates the value input to each node of the input layer, and that the value input to each node is 0 or 1. The information input to the neural network is, for example, test data.

The input layer condition is that the node usage index of each node of the input layer is identical to the value input to that node.

The ReLU firing condition is that the activation function of the nodes of the intermediate layer and the output layer is ReLU. The ReLU firing condition therefore includes the condition that the node usage index is 1 when the input value is greater than 0 and is 0 when the input value is 0 or less.

Hereinafter, among the nodes of the intermediate layer, a node whose node usage index is 1 is referred to as an intermediate layer excited node, and a node whose node usage index is 0 is referred to as an intermediate layer non-excited node. Likewise, among the nodes of the output layer, a node whose node usage index is 1 is referred to as an output layer excited node, and a node whose node usage index is 0 is referred to as an output layer non-excited node.

Hereinafter, among the nodes of the input layer, a node whose node usage index is 1 is referred to as an input layer excited node, and a node whose node usage index is 0 is referred to as an input layer non-excited node.

Hereinafter, for simplicity, a node that is an input layer excited node, an intermediate layer excited node, or an output layer excited node is referred to as an excited node. That is, an excited node is a node whose node usage index is 1. Similarly, a node that is an input layer non-excited node, an intermediate layer non-excited node, or an output layer non-excited node is referred to as a non-excited node. That is, a non-excited node is a node whose node usage index is 0.
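The terminology above can be made concrete with a short sketch that splits a 0/1 node usage index pattern into excited and non-excited nodes per layer. The function name `split_nodes`, the layer labels, and the example pattern are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def split_nodes(pattern: Sequence[Sequence[int]]
                ) -> Tuple[Dict[str, List[int]], Dict[str, List[int]]]:
    """Return (excited, non_excited) node positions for each layer.

    The pattern lists the input layer first and the output layer last; every
    usage index must be 0 or 1, as required by the three conditions above.
    """
    names = (["input"]
             + [f"intermediate_{i}" for i in range(1, len(pattern) - 1)]
             + ["output"])
    excited: Dict[str, List[int]] = {}
    non_excited: Dict[str, List[int]] = {}
    for name, layer in zip(names, pattern):
        if any(v not in (0, 1) for v in layer):
            raise ValueError(f"usage index in layer {name!r} is not 0 or 1")
        excited[name] = [i for i, v in enumerate(layer) if v == 1]
        non_excited[name] = [i for i, v in enumerate(layer) if v == 0]
    return excited, non_excited


# Hypothetical pattern of a 2-2-1 network.
excited, non_excited = split_nodes([[1, 0], [1, 0], [1]])
```

Each node position appears in exactly one of the two results, which mirrors the statement that a node is either an excited node or a non-excited node when the three conditions hold.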

Because the node usage index pattern is information indicating the node usage index of each node of the input layer, the intermediate layer, and the output layer of the neural network, when the input information condition, the input layer condition, and the ReLU firing condition are satisfied, the node usage index pattern is information indicating which nodes are excited nodes.

The node usage index pattern Ex1 and the node usage index pattern Ex2 in FIG. 1 are each an example of a node usage index pattern in a neural network that satisfies the input information condition, the input layer condition, and the ReLU firing condition. The node usage index patterns Ex1 and Ex2 differ in at least one of their input layer excited nodes, intermediate layer excited nodes, and output layer excited nodes. In this way, the node usage index pattern is in a bijective relationship with the information input to the input layer, such as test data.

FIG. 3 is a diagram showing an example of the hardware configuration of the neural network analysis device 1 of the embodiment. The neural network analysis device 1 includes a control unit 10 that has a processor 91, such as a CPU (Central Processing Unit), and a memory 92 connected by a bus, and it executes a program. By executing the program, the neural network analysis device 1 functions as a device including the control unit 10, a communication unit 11, a storage unit 12, and a user interface 13.

More specifically, in the neural network analysis device 1, the processor 91 reads the program stored in the storage unit 12 and stores it in the memory 92. When the processor 91 executes the program stored in the memory 92, the neural network analysis device 1 functions as a device including the control unit 10, the communication unit 11, the storage unit 12, and the user interface 13.

The control unit 10 controls the operation of each functional unit of the neural network analysis device 1. For example, the control unit 10 executes processing for acquiring the approximation result model of the analysis target model and records the acquired approximation result model in the storage unit 12. The control unit 10 also controls, for example, the operation of the communication unit 11.

The communication unit 11 includes a communication interface for connecting the neural network analysis device 1 to an external device. The communication unit 11 transmits, for example, the approximation result model to the external device at the communication destination. The external device is, for example, a printer that outputs the approximation result model. The communication unit 11 also acquires, for example, test data from an external device.

The storage unit 12 is configured using a storage device such as a magnetic hard disk device or a semiconductor storage device. The storage unit 12 stores various information related to the neural network analysis device 1. For example, the storage unit 12 stores in advance a program that controls the operation of each functional unit of the neural network analysis device 1, and stores in advance information representing the analysis target neural network. The information representing the analysis target model includes, for example, the hyperparameters of the neural network that represents the analysis target model. The storage unit 12 also stores, for example, the approximation result model, information indicating each set into which the node usage index patterns are classified, and the representative linear models. The clusters described later are an example of the sets into which the node usage index patterns are classified. The storage unit 12 also stores the node usage index patterns.

The user interface 13 includes an input unit 131 that receives input to the neural network analysis device 1 and an output unit 132 that displays various information about the neural network analysis device 1. The user interface 13 is, for example, a touch panel. The input unit 131 receives input to its own device. The input unit 131 is, for example, an input terminal such as a mouse, a keyboard, or a touch panel, and may instead be configured as an interface that connects such input terminals to its own device. The input received by the input unit 131 is, for example, information representing the analysis target model or test data.

The output unit 132 is, for example, a display device such as a liquid crystal display or an organic EL (Electro Luminescence) display. The output unit 132 may instead be configured as an interface that connects such display devices to its own device, or may be an audio output device such as a speaker. The information output by the output unit 132 is, for example, the approximation result model. Hereinafter, for simplicity, the neural network analysis device 1 will be described using the case where the output unit 132 is a display device.

FIG. 4 is a diagram showing an example of the functional configuration of the control unit 10 in the embodiment.

The control unit 10 includes a test data acquisition unit 101, a node usage index pattern acquisition unit 102, a clustering unit 103, a node index information acquisition unit 104, a representative linear model acquisition unit 105, an operation control unit 106, and an output control unit 107.

The test data acquisition unit 101 acquires test data via the communication unit 11 or the user interface 13.

The node usage index pattern acquisition unit 102 inputs the test data to the analysis target model and acquires the node usage index pattern corresponding to the input test data.

The clustering unit 103 includes an initialization stage clustering unit 301 and an update stage clustering unit 302.

The initialization stage clustering unit 301 clusters the test data into a predetermined number K of clusters (K is an integer) based on the node usage index patterns acquired by the node usage index pattern acquisition unit 102. Because the node usage index patterns are in a bijective relationship with the test data, clustering the test data based on the node usage index patterns also clusters the node usage index patterns. In this way, the initialization stage clustering unit 301 clusters the test data and the node usage index patterns based on the node usage index patterns acquired by the node usage index pattern acquisition unit 102.
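The text above does not state which clustering algorithm is used at this stage. As a stand-in chosen only for illustration, the sketch below groups flattened node usage index patterns with a simple k-means-style loop using squared Euclidean distance. The function name `cluster_patterns`, the initialization rule, and the example patterns are all assumptions.

```python
from typing import List, Sequence, Tuple

Pattern = Sequence[Sequence[float]]


def cluster_patterns(patterns: Sequence[Pattern], k: int,
                     iterations: int = 10) -> Tuple[List[int], List[List[float]]]:
    """Assign each pattern (and hence each test datum) to one of k clusters."""
    flat = [[float(v) for layer in p for v in layer] for p in patterns]
    # Assumed initialization: the first k distinct patterns become the centers.
    centers: List[List[float]] = []
    for f in flat:
        if f not in centers:
            centers.append(list(f))
        if len(centers) == k:
            break
    if len(centers) < k:
        raise ValueError("fewer than k distinct patterns")
    labels = [0] * len(flat)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(f, centers[c])))
                  for f in flat]
        # Update step: each center becomes the per-node mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(flat, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical patterns of a 2-2-1 network, one per test datum.
patterns = [
    [[1, 0], [1, 0], [1]],
    [[0, 1], [0, 1], [0]],
    [[1, 0], [1, 1], [1]],
    [[0, 1], [0, 0], [0]],
]
labels, centers = cluster_patterns(patterns, k=2)
```

Because each test datum corresponds to exactly one pattern, the returned labels serve as the cluster assignment for both the test data and the patterns. In this stand-in, each center is also the per-node mean of the patterns in its cluster.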

The update stage clustering unit 302 operates when at least the condition that the K clusters for classifying the test data have already been generated is satisfied. In other words, it operates when a predetermined condition, which includes the condition that the K clusters have already been generated, is satisfied. Based on the output of the analysis target model when additional test data is input, the update stage clustering unit 302 clusters all of the test data that has been input to the analysis target model, including the additional test data, into a plurality of clusters.

For example, the update stage clustering unit 302 clusters all of the test data and node used index patterns into a plurality of clusters by executing a belonging cluster determination process. In the belonging cluster determination process, the update stage clustering unit 302 first compares the output result of the analysis target model when the additional test data is input with the output result of the representative linear model of each cluster when the same additional test data is input. The output result of a representative linear model is, for example, the node output value of each node in the output layer of the representative linear model.

追加のテストデータが入力された場合における解析対象モデルの出力結果は、例えば、追加のテストデータが入力された場合における解析対象モデルの出力層の各ノードのノード出力値である。 The output result of the analysis target model when the additional test data is input is, for example, the node output value of each node of the output layer of the analysis target model when the additional test data is input.

In the belonging cluster determination process, the update stage clustering unit 302 next determines that the cluster to which the highly similar model belongs is the cluster to which the additional test data belongs. The highly similar model is the representative linear model whose output result differs least from the output result of the analysis target model when the additional test data is input. In the belonging cluster determination process, the update stage clustering unit 302 does not change the cluster of test data that was already clustered before the process was executed.
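The selection of the highly similar model can be illustrated with a short sketch. It assumes that each representative linear model is stored as a weight matrix and a bias vector and that the "difference" between outputs is measured by the squared error; neither choice is stated in the text, and the names `linear_output` and `assign_cluster` are hypothetical.

```python
def linear_output(weights, bias, x):
    """Output of one representative linear model: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def assign_cluster(model_output, linear_models, x):
    """Return the index of the cluster whose linear model output is closest."""
    def gap(k):
        weights, bias = linear_models[k]
        y_lin = linear_output(weights, bias, x)
        # Assumed difference measure: squared error between the two outputs.
        return sum((a - c) ** 2 for a, c in zip(model_output, y_lin))
    return min(range(len(linear_models)), key=gap)


linear_models = [
    ([[1.0, 0.0]], [0.0]),  # cluster 0: y = x0
    ([[0.0, 2.0]], [1.0]),  # cluster 1: y = 2 * x1 + 1
]
x_new = [3.0, 2.0]          # additional test data
y_model = [2.9]             # hypothetical output of the analysis target model
best = assign_cluster(y_model, linear_models, x_new)
```

Only the additional test data receives a cluster here; assignments of previously clustered test data are left untouched, in line with the description above.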

Equation (3) below is an example of a function representing the degree of similarity (hereinafter referred to as “class similarity”) between the cluster to which the additional test data belongs and the analysis target model when the additional test data is input in the belonging cluster determination process. The condition that the difference is the smallest in the belonging cluster determination process may instead be the condition that the value of equation (3) is large.

q represents the class similarity. In the following, an underscore denotes a subscript: A_B represents A with subscript B. Likewise, a hat denotes a superscript: A^B represents A with superscript B.

n is a data index number. A data index is an identifier for identifying each piece of data. n is an integer from 1 to N inclusive, where N is an integer equal to the number of data. k is a node used index pattern index, that is, an identifier for identifying each cluster generated by executing the belonging cluster determination process. u_k^(n) is defined as a probability that takes the value 1 if the n-th data belongs to the k-th node used index pattern and 0 if it does not.

p(A|B) is the probability that condition A is satisfied when condition B is satisfied (that is, a conditional probability). y is the teacher signal of the data, and y^(n) is the teacher signal of the n-th data. Φ is one of the model parameters. A model parameter is a parameter relating to the structure of the model represented by the neural network, for example a parameter representing the structure of the model or a weight parameter in the model. Specifically, Φ is a weight vector. g is a region index, that is, an identifier for identifying a region. A region here is a domain in the sense of mathematical analysis, specifically a connected open subset of the feature space. f is a function that obtains the region index from the node used index pattern index k; f is, for example, the identity map. φg is the model parameter whose node used index pattern index is g. x^(n) is the n-th input data; its definition differs from that of the preceding-stage node output value x itself.

ｍはノード被使用指標パターンの特徴量インデックスである。ノード被使用指標パターンの特徴量インデックスとはノード被使用指標パターンのベクトルの各要素を識別するための識別子である。ノード被使用指標パターンの特徴量インデックスとは、例えば、ノード被使用指標パターンのベクトルの１要素を指す番号である。ｍは１以上Ｍ以下の整数である。Ｍは整数である。Ｍの定義はノード被使用指標パターン特徴量を示すベクトルの要素数である。 m is a feature index of the node used index pattern. The feature amount index of the node-used index pattern is an identifier for identifying each element of the vector of the node-used index pattern. The feature amount index of the node-used index pattern is, for example, a number indicating one element of the vector of the node-used index pattern. m is an integer of 1 or more and M or less. M is an integer. The definition of M is the number of elements of the vector indicating the node usage index pattern feature quantity.

η is one of the model parameters. Specifically, η represents the representative node used index pattern. s represents a node used index pattern. (s_m)^(n) is the m-th element of the feature vector of the node used index pattern of the n-th data. η_(k, m) is defined by equation (4) below.

The product $\prod_{m=1}^{M} \eta_{k,m}^{s_m^{(n)}} (1-\eta_{k,m})^{1-s_m^{(n)}}$ represents the probability that the node used index pattern of the n-th data occurs under the k-th representative node used index pattern. α is one of the model parameters. Specifically, α represents the weight of the representative node used index pattern. The symbol shown in equation (5) below is the mathematical symbol for a product.
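As a worked illustration of this occurrence probability, the sketch below evaluates the product over m for one pattern. It assumes that every element s_m is binary (0 or 1), which the Bernoulli form of the expression suggests but the text does not state outright; the function name `pattern_probability` and the sample values are hypothetical.

```python
def pattern_probability(s, eta_k):
    """Product over m of eta^s_m * (1 - eta)^(1 - s_m) for one pattern s."""
    prob = 1.0
    for s_m, eta in zip(s, eta_k):
        # For s_m = 1 the factor is eta; for s_m = 0 it is (1 - eta).
        prob *= (eta ** s_m) * ((1.0 - eta) ** (1 - s_m))
    return prob


s_n = [1, 0, 1]            # node used index pattern of the n-th data
eta_k = [0.9, 0.2, 0.5]    # k-th representative node used index pattern
p = pattern_probability(s_n, eta_k)  # 0.9 * 0.8 * 0.5, about 0.36
```

For long patterns the product becomes very small, so an implementation would more likely sum logarithms of the factors; that variation is a design choice outside the text.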

α_k is defined by equation (6) below.

In this way, the update stage clustering unit 302 executes a process of updating the clusters. Specifically, updating the clusters means updating the conditions under which test data belongs to each cluster.

The node index information acquisition unit 104 acquires node index information for each cluster based on the node used index patterns belonging to that cluster. Hereinafter, for simplicity of explanation, the neural network analysis device 1 will be described taking as an example the case where the node index is the expected value of the node probability distribution.
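When the node index is an expected value, one simple way to estimate it is the per-node sample mean over the patterns in each cluster. The sketch below takes that approach as an assumption; the text does not say how the expectation is computed, and `cluster_expectations` is a hypothetical name.

```python
def cluster_expectations(patterns, labels, k):
    """Per-cluster, per-node sample mean of the node used index patterns."""
    means = []
    for c in range(k):
        members = [p for p, l in zip(patterns, labels) if l == c]
        if members:
            # zip(*members) iterates over nodes; average each node's values.
            means.append([sum(col) / len(members) for col in zip(*members)])
        else:
            means.append(None)  # empty cluster: no estimate available
    return means


patterns = [[1, 0], [1, 1], [0, 1]]
labels = [0, 0, 1]
node_index = cluster_expectations(patterns, labels, k=2)
```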

The representative linear model acquisition unit 105 executes a process of acquiring a representative linear model for each classified cluster based on the node index information. When no representative linear model has been generated yet, the process of acquiring a representative linear model is a process of newly generating one. When a representative linear model has already been generated for each cluster, the process of acquiring a representative linear model may be either a process of updating the already generated representative linear model or a process of generating a new representative linear model.

Specifically, the process of acquiring a representative linear model for each classified cluster based on the node index information fixes the activation function of each node of the neural network at the value indicated by the node index information and then calculates the representative linear model. Fixing the activation function of each node at the value indicated by the node index information means making constant the gradient obtained when the output is differentiated with respect to the input, that is, linearizing the model. Making the gradient constant specifically means calculating the gradient vector of the activation function for a given input and replacing the activation function of the model with a function that takes the Hadamard product of the input and that gradient vector. Because the gradient of the model no longer changes with the input value, a model whose gradient has been made constant is a linear model. After the activation function of each node has been fixed at the value indicated by the node index information, the representative linear model is calculated, specifically, as the matrix product of the weight matrix of the layer immediately before the layer to which each node belongs, the node index information, and the weight matrix of the layer after the activation function. Note that the left side of equation (4) above (that is, η_(k, m)) represents the representative linear model.
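The matrix product described above can be sketched for a network with a single hidden layer. The sketch assumes that biases are omitted and that the node index information for the hidden layer is supplied as one fixed gradient value per hidden node (`gate`); these simplifications and the names `matmul` and `representative_linear_model` are illustrative assumptions, not details from the text.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def representative_linear_model(w1, gate, w2):
    """Collapse W2 * diag(gate) * W1 into a single weight matrix."""
    # Scaling row i of W1 by gate[i] is the same as diag(gate) @ W1,
    # i.e. the Hadamard product of the hidden pre-activations with the gate.
    gated = [[g * w for w in row] for g, row in zip(gate, w1)]
    return matmul(w2, gated)


w1 = [[1.0, -1.0], [0.5, 2.0]]  # weights before the activation (hidden x input)
w2 = [[1.0, 1.0]]               # weights after the activation (output x hidden)
gate = [1.0, 0.5]               # fixed activation gradients from the node index
w_lin = representative_linear_model(w1, gate, w2)
```

Once `gate` is fixed, the result no longer depends on the input, which is why the collapsed model is linear.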

以下の式（７）は、代表線形モデルを取得する処理の一例を表す数式である。 The following equation (7) is an equation representing an example of processing for acquiring a representative linear model.

β is a hyperparameter that defines the update step size. L is a loss function whose variables are y, x and Φ_g. L is, for example, the squared error; L may instead be the classification error. ΔL represents the amount of change in L when the variable Φ_g of the loss function is changed.
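Equation (7) itself is not reproduced in this text, so the following is only a sketch under an assumed form: an ordinary gradient descent step, Φ_g ← Φ_g − β ∂L/∂Φ_g, with L taken as the squared error of a linear model with a scalar output. The function name `update_step` and the sample values are hypothetical.

```python
def update_step(phi, x, y, beta):
    """One assumed gradient descent step on the squared error (phi . x - y)^2."""
    pred = sum(p * xi for p, xi in zip(phi, x))
    err = pred - y
    # Gradient of the squared error with respect to each weight.
    grad = [2.0 * err * xi for xi in x]
    # beta scales how far the weights move along the negative gradient.
    return [p - beta * g for p, g in zip(phi, grad)]


phi_g = update_step([0.0, 0.0], x=[1.0, 2.0], y=1.0, beta=0.1)
```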

The operation control unit 106 controls the operations of the node used index pattern acquisition unit 102, the initialization stage clustering unit 301, the node index information acquisition unit 104, the representative linear model acquisition unit 105, and the update stage clustering unit 302. Until a predetermined condition relating to the acquisition of the linear model is satisfied (this period is hereinafter referred to as the “analysis processing initialization stage”), the operation control unit 106 operates the node used index pattern acquisition unit 102, the initialization stage clustering unit 301, the node index information acquisition unit 104, and the representative linear model acquisition unit 105.

In the analysis processing update stage, the operation control unit 106 operates the node used index pattern acquisition unit 102, the node index information acquisition unit 104, the representative linear model acquisition unit 105, and the update stage clustering unit 302. That is, in the analysis processing update stage, the operation control unit 106 operates the update stage clustering unit 302 in place of the initialization stage clustering unit 301 that it operated in the analysis processing initialization stage. The analysis processing update stage is the period after the predetermined condition relating to the acquisition of the linear model (hereinafter referred to as the “operation change condition”) has been satisfied.

動作変更条件は、例えば、代表線形モデルの取得に用いたテストデータの数が所定の数に達したという条件である。動作変更条件は、例えば、新たなテストデータが追加されたことによる代表線形モデルの変化が所定の変化未満という条件であってもよい。 The operation change condition is, for example, a condition that the number of test data used for acquiring the representative linear model has reached a predetermined number. The operation change condition may be, for example, a condition that the change of the representative linear model due to the addition of new test data is less than a predetermined change.

The operation change condition may also be, for example, the condition that test data to which information indicating that it is to be used in the analysis processing update stage has been attached in advance is input. Hereinafter, for simplicity of explanation, the neural network analysis device 1 will be described taking as an example the case where the operation change condition is that the number of test data used to acquire the linear model has reached a predetermined number.
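The first two example conditions (a count threshold, and a small change in the representative linear model) can be expressed as a simple predicate. The text presents them as separate alternatives; treating them as an either/or check in one function is an assumption made for illustration, and every name below is hypothetical.

```python
def should_switch_to_update_stage(num_test_data, required_count,
                                  model_change=None, change_tolerance=None):
    """Return True once either example operation change condition holds."""
    # Condition 1: enough test data has been used to acquire the linear models.
    if num_test_data >= required_count:
        return True
    # Condition 2: the newest test data barely changed the linear models.
    if model_change is not None and change_tolerance is not None:
        return model_change < change_tolerance
    return False


still_initializing = should_switch_to_update_stage(50, 100)
reached_count = should_switch_to_update_stage(100, 100)
converged_early = should_switch_to_update_stage(10, 100, 0.001, 0.01)
```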

なお、動作変更条件が満たされたか否かを示す情報は、記憶部１２が記憶する。 The storage unit 12 stores information indicating whether or not the operation change condition is satisfied.

出力制御部１０７は、出力部１３２の動作を制御する。出力制御部１０７は、出力部１３２の動作を制御して、出力部１３２に例えば近似結果モデルを出力させる。 The output control unit 107 controls the operation of the output unit 132. The output control unit 107 controls the operation of the output unit 132 to cause the output unit 132 to output, for example, an approximation result model.

図５は、実施形態における制御部１０が実行する解析処理初期化段階における処理の流れの一例を示すフローチャートである。 FIG. 5 is a flowchart showing an example of a processing flow in the analysis processing initialization stage executed by the control unit 10 in the embodiment.

テストデータ取得部１０１が１又は複数のテストデータを取得する（ステップＳ１０１）。次にノード被使用指標パターン取得部１０２が、ステップＳ１０１で取得された各テストデータを解析対象モデルに入力し、テストデータごとにノード被使用指標パターンを取得する（ステップＳ１０２）。次に初期化段階クラスタリング部３０１が、取得済みのノード被使用指標パターンに基づきテストデータ及びノード被使用指標パターンをクラスタリングする（ステップＳ１０３）。テストデータのクラスタリングによってノード被使用指標パターンもクラスタリングされる。 The test data acquisition unit 101 acquires one or a plurality of test data (step S101). Next, the node used index pattern acquisition unit 102 inputs each test data acquired in step S101 into the analysis target model, and acquires a node used index pattern for each test data (step S102). Next, the initialization stage clustering unit 301 clusters the test data and the node used index pattern based on the acquired node used index pattern (step S103). Node usage index patterns are also clustered by clustering test data.

次にノード指標情報取得部１０４が、属するノード被使用指標パターンに基づきクラスタごとにノード指標情報を取得する（ステップＳ１０４）。すなわち、ノード指標情報取得部１０４が、属するノード被使用指標パターンに基づきクラスタごとに解析対象ニューラルネットワークのノードごとにノード確率分布の期待値を取得する。 Next, the node index information acquisition unit 104 acquires node index information for each cluster based on the node used index pattern to which it belongs (step S104). That is, the node index information acquisition unit 104 acquires the expected value of the node probability distribution for each node of the neural network to be analyzed for each cluster based on the node used index pattern to which it belongs.

次に代表線形モデル取得部１０５が、ノード指標情報に基づき分類後のクラスタごとに代表線形モデルを取得する（ステップＳ１０５）。次に動作制御部１０６が、動作変更条件が満たされたか否かを判定する（ステップＳ１０６）。動作変更条件が満たされた場合、制御部１０が実行する解析処理初期化段階における処理が終了する。一方、動作変更条件が満たされていない場合、ステップＳ１０１の処理に戻る。なお、図５に記載の“更新の段階”とは、解析処理更新段階を意味する。 Next, the representative linear model acquisition unit 105 acquires a representative linear model for each classified cluster based on the node index information (step S105). Next, the operation control unit 106 determines whether or not the operation change condition is satisfied (step S106). When the operation change condition is satisfied, the processing in the analysis processing initialization stage executed by the control unit 10 ends. On the other hand, if the operation change condition is not satisfied, the process returns to the process of step S101. The "update stage" shown in FIG. 5 means the analysis process update stage.

FIG. 6 is a flowchart showing an example of the processing flow in the analysis processing update stage executed by the control unit 10 in the embodiment. The processing in the analysis processing update stage is executed when, after the processing in the initialization stage has ended, there is test data that was not used in the processing in the initialization stage. For example, it is executed when test data is input via the user interface 13 or the communication unit 11 after the processing in the initialization stage has ended. It may also be executed when, at the time the processing in the initialization stage ends, there is test data that had already been input via the user interface 13 or the communication unit 11 but was not used in the processing in the initialization stage.

テストデータ取得部１０１が１又は複数のテストデータを取得する（ステップＳ２０１）。次にノード被使用指標パターン取得部１０２が、ステップＳ２０１で取得された各テストデータを解析対象モデルに入力し、テストデータごとにノード被使用指標パターンを取得する（ステップＳ２０２）。次に更新段階クラスタリング部３０２が、Ｓ２０２で取得済みのノード被使用指標パターンに基づきクラスタを更新する（ステップＳ２０３）。 The test data acquisition unit 101 acquires one or a plurality of test data (step S201). Next, the node used index pattern acquisition unit 102 inputs each test data acquired in step S201 into the analysis target model, and acquires a node used index pattern for each test data (step S202). Next, the update stage clustering unit 302 updates the cluster based on the node usage index pattern acquired in S202 (step S203).

次にノード指標情報取得部１０４が、属するノード被使用指標パターンに基づきクラスタごとにノード指標情報を取得する（ステップＳ２０４）。すなわち、ノード指標情報取得部１０４が、属するノード被使用指標パターンに基づきクラスタごとに解析対象ニューラルネットワークのノードごとにノード確率分布の期待値を取得する。 Next, the node index information acquisition unit 104 acquires node index information for each cluster based on the node used index pattern to which it belongs (step S204). That is, the node index information acquisition unit 104 acquires the expected value of the node probability distribution for each node of the neural network to be analyzed for each cluster based on the node used index pattern to which it belongs.

次に代表線形モデル取得部１０５が、ノード指標情報に基づき分類後のクラスタごとに代表線形モデルを取得する（ステップＳ２０５）。取得された各クラスタの代表線形モデルを要素とする集合が近似結果モデルである。次に出力制御部１０７が出力部１３２に近似結果モデルを表示させる（ステップＳ２０６）。 Next, the representative linear model acquisition unit 105 acquires a representative linear model for each classified cluster based on the node index information (step S205). The set whose elements are the representative linear model of each acquired cluster is the approximation result model. Next, the output control unit 107 causes the output unit 132 to display the approximation result model (step S206).

このように構成されたニューラルネット解析装置１は、ノード被使用指標パターンに基づいて解析対象モデルを線形モデルで近似する制御部１０を備える。そのため、このように構成されたニューラルネット解析装置１は、予測精度の低下を抑制しつつ学習済みモデルの解釈を容易にする。 The neural network analysis device 1 configured in this way includes a control unit 10 that approximates the analysis target model with a linear model based on the node usage index pattern. Therefore, the neural network analysis device 1 configured in this way facilitates the interpretation of the trained model while suppressing a decrease in prediction accuracy.

（変形例） (Modification example)

なお、ニューラルネット解析装置１は必ずしも更新段階クラスタリング部３０２を備える必要はない。このような場合、解析処理更新段階におけるクラスタリングを初期化段階クラスタリング部３０１が解析処理初期化段階におけるクラスタリングと同様の方法で実行してもよい。具体的にはこのような場合、解析処理更新段階においても、テストデータが追加されるたびに、新たに追加されたテストデータと既に使用されたテストデータとを用いてステップＳ１０２からステップＳ１０５の処理が実行される。そして、ステップＳ１０５の実行結果が近似結果モデルとして表示される。 The neural network analysis device 1 does not necessarily have to include the update stage clustering unit 302. In such a case, the clustering in the analysis process update stage may be executed by the initialization stage clustering unit 301 in the same manner as the clustering in the analysis process initialization stage. Specifically, in such a case, even in the analysis processing update stage, each time test data is added, the processing of steps S102 to S105 is performed using the newly added test data and the already used test data. Is executed. Then, the execution result of step S105 is displayed as an approximation result model.

なお、ニューラルネット解析装置１が行う処理は、以下の式（８）及び（９）によって表される式に基づき式（１０）で表される式を取得する処理である。ニューラルネット解析装置１が行う処理は具体的には、解析対象モデルをテストデータが解析対象モデルに入力された際のノード被使用指標パターンに基づき線形モデルで近似する処理である。 The process performed by the neural network analysis device 1 is a process of acquiring the equation represented by the equation (10) based on the equations represented by the following equations (8) and (9). Specifically, the process performed by the neural net analysis device 1 is a process of approximating the analysis target model with a linear model based on the node usage index pattern when the test data is input to the analysis target model.

Ω represents the set of model parameters, as expressed by equation (9). G represents the set of clusters. The left side of equation (8) is the joint probability of the output value, the node used index pattern, and the class membership given the cluster set and the model parameters. The function of equation (10) gives the probability of occurrence of the output value y given the parameter Φ_g and the data x.

なお、ニューラルネット解析装置１が備える機能部は必ずしも一つの筐体に実装される必要は無い。ニューラルネット解析装置１は、ネットワークを介して通信可能に接続された複数台の情報処理装置を用いて実装されてもよい。この場合、ニューラルネット解析装置１が備える各機能部は、複数の情報処理装置に分散して実装されてもよい。例えば、初期化段階クラスタリング部３０１と、更新段階クラスタリング部３０２とは異なる情報処理装置に実装されてもよい。 The functional unit included in the neural network analysis device 1 does not necessarily have to be mounted in one housing. The neural network analysis device 1 may be implemented by using a plurality of information processing devices that are communicably connected via a network. In this case, each functional unit included in the neural network analysis device 1 may be distributed and mounted in a plurality of information processing devices. For example, the initialization stage clustering unit 301 and the update stage clustering unit 302 may be mounted on different information processing devices.

According to at least one embodiment described above, the neural network analysis device 1 includes the node used index pattern acquisition unit 102 and the clustering unit 103. The node used index pattern acquisition unit 102 acquires a node used index pattern for each test data input to the trained model represented by the analysis target neural network, which is the neural network to be analyzed. The node used index pattern is information indicating, for each node of the input layer, the intermediate layer, and the output layer of the neural network, a node used index, which is a value indicating the degree to which the node is used according to the input to the input layer. The clustering unit 103 clusters the test data and the node used index patterns based on the node used index patterns acquired by the node used index pattern acquisition unit 102. Therefore, the neural network analysis device 1 configured in this way facilitates the interpretation of the trained model while suppressing a decrease in prediction accuracy.

なお、制御部１０の各機能の全て又は一部は、ＡＳＩＣ（Application Specific Integrated Circuit）やＰＬＤ（Programmable Logic Device）やＦＰＧＡ（Field Programmable Gate Array）等のハードウェアを用いて実現されてもよい。プログラムは、コンピュータ読み取り可能な記録媒体に記録されてもよい。コンピュータ読み取り可能な記録媒体とは、例えばフレキシブルディスク、光磁気ディスク、ＲＯＭ、ＣＤ−ＲＯＭ等の可搬媒体、コンピュータシステムに内蔵されるハードディスク等の記憶装置である。プログラムは、電気通信回線を介して送信されてもよい。 All or part of each function of the control unit 10 may be realized by using hardware such as an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array). The program may be recorded on a computer-readable recording medium. The computer-readable recording medium is, for example, a flexible disk, a magneto-optical disk, a portable medium such as a ROM or a CD-ROM, or a storage device such as a hard disk built in a computer system. The program may be transmitted over a telecommunication line.

本発明のいくつかの実施形態を説明したが、これらの実施形態は、例として提示したものであり、発明の範囲を限定することは意図していない。これら実施形態は、その他の様々な形態で実施されることが可能であり、発明の要旨を逸脱しない範囲で、種々の省略、置き換え、変更を行うことができる。これら実施形態やその変形は、発明の範囲や要旨に含まれると同様に、特許請求の範囲に記載された発明とその均等の範囲に含まれるものである。 Although some embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. These embodiments and modifications thereof are included in the scope and gist of the invention, as well as in the scope of the invention described in the claims and the equivalent scope thereof.

1 ... neural network analysis device, 10 ... control unit, 11 ... communication unit, 12 ... storage unit, 13 ... user interface, 131 ... input unit, 132 ... output unit, 101 ... test data acquisition unit, 102 ... node used index pattern acquisition unit, 103 ... clustering unit, 104 ... node index information acquisition unit, 105 ... representative linear model acquisition unit, 106 ... operation control unit, 301 ... initialization stage clustering unit, 302 ... update stage clustering unit

Claims

A neural network analysis apparatus comprising:
a node used index pattern acquisition unit that acquires, for each test data input to a trained model represented by an analysis target neural network, which is a neural network to be analyzed, a node used index pattern, which is information indicating, for each node of an input layer, an intermediate layer, and an output layer of the neural network, a node used index that is a value indicating the degree to which the node is used according to an input to the input layer; and
a clustering unit that clusters the test data and the node used index patterns based on the node used index patterns acquired by the node used index pattern acquisition unit.

The neural network analysis apparatus according to claim 1, further comprising a representative linear model acquisition unit that acquires, for each cluster generated by the clustering performed by the clustering unit, a representative linear model that is a linear model based on the distribution of the node used index patterns belonging to the cluster.

The neural network analysis apparatus according to claim 2, wherein, when a predetermined condition including the condition that the clusters have already been generated is satisfied, the clustering unit determines, based on the output result of the trained model for newly added test data and the output result of each representative linear model for the newly added test data, that the cluster to which the representative linear model whose output result differs least from the output result of the trained model for the newly added test data belongs is the cluster to which the newly added test data belongs, and updates the clusters based on the determination result.

The neural network analysis apparatus according to claim 2 or 3, wherein the representative linear model acquisition unit acquires, for each cluster and for each node of the analysis target neural network, a node index obtained by treating the node used index as a random variable, and acquires the representative linear model based on the acquired node index.

The neural network analysis apparatus according to claim 4, wherein the node index is the expected value of the random variable.

The neural network analysis apparatus according to any one of claims 2 to 5, further comprising an output control unit that controls the operation of an output unit that outputs the representative linear model and causes the output unit to output the representative linear model.

A neural network analysis method executed by a computer, the method comprising:
a node used index pattern acquisition step of acquiring, for each test data input to a trained model represented by an analysis target neural network, which is a neural network to be analyzed, a node used index pattern, which is information indicating, for each node of an input layer, an intermediate layer, and an output layer of the neural network, a node used index that is a value indicating the degree to which the node is used according to an input to the input layer; and
a clustering step of clustering the test data and the node used index patterns based on the node used index patterns acquired in the node used index pattern acquisition step.

A program that causes a computer to execute:
a node used index pattern acquisition step of acquiring, for each test data input to a trained model represented by an analysis target neural network, which is a neural network to be analyzed, a node used index pattern, which is information indicating, for each node of an input layer, an intermediate layer, and an output layer of the neural network, a node used index that is a value indicating the degree to which the node is used according to an input to the input layer; and
a clustering step of clustering the test data and the node used index patterns based on the node used index patterns acquired in the node used index pattern acquisition step.