JP2022099572A - Information processor, information processing method and program

Info

Publication number: JP2022099572A
Application number: JP2020213408A
Authority: JP
Inventors: 裕生宮下; Yuki Miyashita; 恵介石黒; Keisuke Ishiguro
Original assignee: Nagoya Electric Works Co Ltd
Current assignee: Nagoya Electric Works Co Ltd
Priority date: 2020-12-23
Filing date: 2020-12-23
Publication date: 2022-07-05
Anticipated expiration: 2040-12-23
Also published as: JP7461866B2

Abstract

To provide an information processor, an information processing method, and a program that help a user grasp which class's discrimination each area of an image contributed to more. SOLUTION: An information processor 10 includes an acquisition unit 21c that acquires, for each class, a contribution indicating the degree to which an area of an input image fed to a machine learning model contributed to class discrimination by the machine learning model, and a derivation unit 21d that derives comparison information showing a comparison result of the per-class contributions acquired by the acquisition unit 21c. SELECTED DRAWING: Figure 1

Description

The present invention relates to an information processing apparatus, an information processing method, and a program.

In image class discrimination using a machine-learned model, there are techniques for visualizing the regions of an image that served as the basis for the class discrimination.
Non-Patent Document 1 discloses Grad-CAM (Gradient-weighted Class Activation Mapping), a technique for visualizing the basis of a CNN's decision.

R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, "Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization," arXiv:1610.02391v3, 2017.

However, with Non-Patent Document 1, when heat maps corresponding to each of a plurality of classes are displayed and the same region has similar colors in those heat maps, the user cannot tell which class's discrimination that region contributed to more.
The present invention has been made in view of the above problem, and its object is to help a user grasp which class's discrimination each region of an image contributed to more.

To achieve the above object, the information processing apparatus of the present invention includes an acquisition unit that acquires, for each class, a contribution degree indicating the degree to which a region of an input image input to a machine learning model contributed to class discrimination by the machine learning model, and a derivation unit that derives comparison information indicating a comparison result of the per-class contribution degrees acquired by the acquisition unit.

With the configuration described above, comparison information indicating the comparison result of the per-class contribution degrees can be derived for each region of the input image. When this comparison information is presented to the user, the user can easily grasp which class's discrimination each region of the input image contributed to more. In other words, the configuration of the present invention supports this understanding by the user.

FIG. 1 is a block diagram showing the configuration of the information processing apparatus.
FIG. 2 is a diagram showing the structure of the model.
FIG. 3 is a flowchart showing the derivation process.
FIG. 4 is a diagram showing the original image.
FIG. 5 is a diagram outlining the derivation of an evaluation image.
FIG. 6 is a diagram showing the evaluation image for the vacant class.
FIG. 7 is a diagram showing the evaluation image for the parking class.
FIG. 8 is a flowchart showing the derivation process.
FIG. 9 is a diagram showing the output comparison image.
FIG. 10 is a diagram showing the display screen.

Embodiments of the present invention are described below in the following order.
(1) Configuration of information processing apparatus:
(2) Derivation processing:
(3) Other embodiments:

(1) Configuration of information processing apparatus:
FIG. 1 is a block diagram showing the configuration of an information processing apparatus 10 according to an embodiment of the present invention. The information processing apparatus 10 includes a control unit 20, a recording medium 30, and a display unit 40. The control unit 20 is a computer including a CPU, RAM, ROM, and the like, and executes various programs, such as a derivation program 21, recorded on the recording medium 30 or the like. The derivation program 21 causes the control unit 20 to execute a function of deriving an image that shows, for each region of the image subject to class discrimination, the result of comparing the contribution degrees to the discrimination of each class.

The recording medium 30 records various programs, such as the derivation program 21, and various data. The recording medium 30 is, for example, a hard disk drive (HDD) or a solid state drive (SSD). In the present embodiment, the recording medium 30 records teacher data 31, a machine learning model 32, an original image 33, determination reference data 34, evaluation images 35, a comparison image 36, and the like.
The teacher data 31 is the training data used for machine learning of the machine learning model 32. The machine learning model 32 is a model used for class discrimination of an input image. Here, a model is information (for example, an expression) indicating the correspondence between the data subject to class discrimination and the data of the class discrimination result. In the present embodiment, the machine learning model 32 determines which of two classes an input image belongs to: a parking class, indicating that a vehicle is parked, or a vacant class, indicating that the space is vacant. Also in the present embodiment, the machine learning model 32 is a model including a CNN (convolutional neural network).

FIG. 2 is a diagram showing the structure of the machine learning model 32 of the present embodiment. In FIG. 2, changes in the data format through the CNN are shown as rectangular parallelepipeds. In the machine learning model 32 of the present embodiment, image data representing the image subject to class discrimination is input to the input layer L_i of the CNN, passes through one or more convolutional layers, one or more pooling layers, and a fully connected layer L_n, and output data is output to each node of the output layer L_o. In the present embodiment, the image data input to the CNN of the machine learning model 32 is H pixels high and W pixels wide, and gradation values for three channels (R: red, G: green, B: blue) are defined for each pixel. Accordingly, the image at the input layer L_i is shown schematically in FIG. 2 as a rectangular parallelepiped of height H, width W, and depth 3.

FIG. 2 shows an example in which the image input to the input layer L_i is converted into a feature map of height H_1 × width W_1 × D_1 channels through convolution with filters of a predetermined size and number, an activation function, and a pooling layer. FIG. 2 further shows an example in which, after passing through a plurality of layers, the data is finally converted into a feature map of height H_m × width W_m × D_m channels at layer L_m, the final convolutional layer. After the CNN produces this H_m × W_m × D_m feature map, the value of each node of the fully connected layer L_n is obtained by full connection. Using the values of the nodes of the fully connected layer L_n as input data, the value of each node of the output layer L_o is output. The output layer L_o of the CNN of the machine learning model 32 of the present embodiment includes two nodes, corresponding to the vacant class and the parking class respectively, and each node outputs a value reflecting how likely the input image is to belong to the corresponding class. Each channel of a feature map in a convolutional layer is data obtained by convolving the image input to the input layer L_i with a filter, and each region of each channel corresponds to a region of the input image. That is, each channel of a feature map in a convolutional layer preserves the positional information of the input image.

The original image 33 is the image subject to class discrimination and, in the present embodiment, is an image of a parking space in a parking lot. The determination reference data 34 is information indicating the criterion for comparing the evaluation images 35 with each other. An evaluation image 35 is an image showing how much each region of the original image 33 contributed to the discrimination of the corresponding class and, in the present embodiment, is a heat map obtained by Grad-CAM. The comparison image 36 is an image showing the result of comparing the evaluation images 35 of the output classes. The display unit 40 displays various information and is, for example, a monitor or a touch panel.

By executing the derivation program 21 recorded on the recording medium 30, the control unit 20 functions as a learning unit 21a, a discrimination unit 21b, an acquisition unit 21c, a derivation unit 21d, and a display control unit 21e. The learning unit 21a is a function that trains the machine learning model 32. The discrimination unit 21b is a function that discriminates the class of the original image 33 using the machine learning model 32 trained by the learning unit 21a. The acquisition unit 21c is a function that acquires information indicating the contribution degree of each region of the original image 33 to the vacant class discrimination and to the parking class discrimination. In the present embodiment, the acquisition unit 21c acquires, as this information, the evaluation images 35, which are heat maps obtained by Grad-CAM. The derivation unit 21d is a function that derives information indicating the result of comparing the contribution degrees of the output classes acquired by the acquisition unit 21c. Hereinafter, this information is referred to as comparison information. In the present embodiment, the derivation unit 21d derives the comparison image 36 as the comparison information. The display control unit 21e is a function that displays the comparison information (in the present embodiment, the comparison image 36) derived by the derivation unit 21d.

(2) Derivation processing:
FIG. 3 is a flowchart showing an example of the derivation process executed by the information processing apparatus 10. Before the process of FIG. 3 starts, the control unit 20, by the function of the learning unit 21a, trains the machine learning model 32 recorded in advance on the recording medium 30, using the teacher data 31 recorded in advance on the recording medium 30. Here, the teacher data 31 is a plurality of labeled images. Each image of the teacher data 31 is an image of a parking space in the same format (height and width in pixels) as the original image 33 and is associated with a label indicating whether a vehicle is parked in the parking space shown in the image. The control unit 20 inputs the teacher data 31 to the machine learning model 32, acquires the output, and changes the variable parameters of the machine learning model so that the difference between the output values and the label associated with the input data (a label in which one of the nodes is 1) becomes smaller. It repeats this computation until the difference is equal to or less than a predetermined value, thereby training the machine learning model 32. The control unit 20 then records the trained machine learning model 32 on the recording medium 30.

When the comparison result derivation process of FIG. 3 starts, the control unit 20, by the function of the discrimination unit 21b, performs class discrimination using the machine learning model 32 (step S100). More specifically, the control unit 20 runs the machine learning model 32 with the original image 33 as the input image and acquires the output value of each node of the output layer L_o. The control unit 20 then takes the class corresponding to the node with the largest output value as the discrimination result. FIG. 4 shows the original image 33 used in the present embodiment. As shown in FIG. 4, the original image 33 in the present embodiment is an image of a vacant parking space in a parking lot.
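
The decision rule of step S100 (take the class whose output node has the largest value) can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the function name `discriminate`, the class labels, and the output values are hypothetical.

```python
def discriminate(output_values, class_names):
    """Return the class whose output-layer node has the largest value.

    If two nodes tie, the first one wins; the source does not say how
    ties are resolved, so this is an assumption.
    """
    best = max(range(len(output_values)), key=lambda n: output_values[n])
    return class_names[best]


# Hypothetical values of the two nodes of output layer L_o.
CLASS_NAMES = ["vacant", "parked"]
result = discriminate([0.91, 0.09], CLASS_NAMES)
print(result)
```

With the illustrative values above, the node for the vacant class is larger, so the vacant class is returned.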

Next, the control unit 20, by the function of the acquisition unit 21c, selects one of the classes discriminated by the machine learning model 32 (the vacant class and the parking class) (step S105). Hereinafter, the class selected in the most recent execution of step S105 is referred to as the selected class. Next, the control unit 20, by the function of the acquisition unit 21c, selects one channel from the feature map at layer L_m, the final convolutional layer of the CNN of the machine learning model 32 (step S110). In the example of FIG. 2, the control unit 20 selects one channel (a feature map of height H_m × width W_m) from the H_m × W_m × D_m feature map obtained at layer L_m. Hereinafter, the channel selected in the most recent execution of step S110 is referred to as the selected channel.

Next, the control unit 20, by the function of the acquisition unit 21c, acquires a weight indicating the importance of the selected channel for the discrimination of the selected class (step S115). The details of step S115 are as follows. Among the classes discriminated by the machine learning model 32 (the vacant class and the parking class), let c be the index of the selected class. Let y^c be the output value of the node of the output layer L_o corresponding to the selected class. Among the D_m channels of the feature map at layer L_m, let k be the index of the selected channel. Let A^k be the feature map of the selected channel (a feature map of height H_m × width W_m), and let Z be the number of pixels in A^k (H_m × W_m). Let i and j be the indexes of the horizontal and vertical positions in A^k, respectively, and let A^k_{i,j} be the value of the pixel at position (i, j) in A^k. Finally, let α^c_k be the weight of the selected channel for the discrimination of the selected class.
The control unit 20 acquires the weight α^c_k using the following Equation 1.

\[ \alpha^c_k = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{i,j}} \tag{1} \]
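
The weight of step S115 is the average, over all Z pixels of the selected channel, of the gradient of y^c with respect to each pixel. A minimal sketch of that averaging, assuming the per-pixel gradients are already available as a nested list (the function name `channel_weight` and the sample values are hypothetical):

```python
def channel_weight(gradient_map):
    """Average the gradients dy^c/dA^k_{i,j} over all Z pixels of one channel.

    gradient_map is a list of rows (H_m rows of W_m values each).
    """
    z = sum(len(row) for row in gradient_map)      # Z = H_m * W_m
    total = sum(sum(row) for row in gradient_map)  # sum over i and j
    return total / z


# Hypothetical 2 x 2 gradient map for one channel k and one class c.
grads_k = [[1.0, 3.0], [0.0, -2.0]]
alpha = channel_weight(grads_k)
print(alpha)
```

Because the weight is a plain average, a channel whose gradients cancel out receives a weight near zero even if individual pixels have large gradients.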

In this way, for every pixel of the selected channel, the control unit 20 obtains the partial derivative of y^c with respect to the pixel value A^k_{i,j} of the selected channel, ∂y^c/∂A^k_{i,j}, that is, the gradient of y^c with respect to A^k_{i,j}, as an index of the degree to which the pixel contributes to the discrimination of the selected class. The control unit 20 then acquires the average of this index over all pixels of the selected channel (Z pixels) as the weight α^c_k. In the present embodiment, the control unit 20 obtains ∂y^c/∂A^k_{i,j} in Equation 1 as follows. In the machine learning model 32, the value of each pixel of the feature map at layer L_m is connected to the nodes of the fully connected layer, and the output value of each node of the output layer L_o is obtained using the node values of the fully connected layer as input data. Therefore, each node value of the output layer L_o can be expressed as a function whose arguments are the pixel values of the feature map at layer L_m. That is, y^c can be expressed as a function y^c(A^k_{1,1}, A^k_{1,2}, ..., A^k_{i,j}, ..., A^k_{W_m,H_m}) that takes at least A^k as arguments. The control unit 20 then obtains the rate of change of y^c in response to a change in A^k_{i,j} as the value of ∂y^c/∂A^k_{i,j}. In the present embodiment, the control unit 20 obtains ∂y^c/∂A^k_{i,j} using the following Equation 2. In Equation 2, h is a predetermined real number.
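
Equation 2 itself is not reproduced in this text. One plausible reading, consistent with "the rate of change of y^c in response to a change in A^k_{i,j}" and a predetermined real number h, is a forward difference: (y^c(..., A^k_{i,j} + h, ...) - y^c(..., A^k_{i,j}, ...)) / h. The sketch below assumes that form; the toy function `toy_output`, standing in for the part of the model after layer L_m, and its weights are hypothetical, and only a single channel is modeled.

```python
import copy


def numerical_gradient(y_c, feature_map, h=1e-4):
    """Approximate dy^c/dA^k_{i,j} for every pixel by a forward difference.

    The forward-difference form is an assumption; the source only says
    that h is a predetermined real number.
    """
    base = y_c(feature_map)
    grads = []
    for j, row in enumerate(feature_map):
        grad_row = []
        for i in range(len(row)):
            perturbed = copy.deepcopy(feature_map)
            perturbed[j][i] += h  # change only pixel (i, j) by h
            grad_row.append((y_c(perturbed) - base) / h)
        grads.append(grad_row)
    return grads


# Hypothetical stand-in for the layers after L_m: a weighted sum of pixels.
TOY_WEIGHTS = [[0.5, -1.0], [2.0, 0.0]]


def toy_output(feature_map):
    return sum(
        w * a
        for w_row, a_row in zip(TOY_WEIGHTS, feature_map)
        for w, a in zip(w_row, a_row)
    )


grads = numerical_gradient(toy_output, [[1.0, 2.0], [3.0, 4.0]])
print(grads)
```

For this linear toy function the gradient of each pixel equals its weight, which makes the approximation easy to check. A numerical difference needs one extra forward pass per pixel, so it is simple but costly compared with an analytic derivative.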

However, the method of obtaining ∂y^c/∂A^k_{i,j} is not limited to the method using Equation 2. For example, information on an expression obtained by partially differentiating y^c with respect to A^k_{i,j}, written in terms of the parameters used in the machine learning model 32 (node connection weights, element values of the filters used for convolution, pixel values of the feature maps, and so on), may be recorded in advance on the recording medium 30. In this case, the control unit 20 may obtain ∂y^c/∂A^k_{i,j} by substituting the value of each parameter into the expression indicated by this information.

Next, the control unit 20, by the function of the acquisition unit 21c, determines whether all channels have been selected (step S120). That is, the control unit 20 determines whether, for the selected class chosen in step S105, the loop of steps S110 to S120 has been performed for all channels (D_m channels) of the feature map at layer L_m. If it is not determined in step S120 that all channels have been selected since the selected class was chosen, the control unit 20 repeats the processing from step S110.

If it is determined in step S120 that all channels have been selected since the selected class was chosen, the control unit 20, by the function of the acquisition unit 21c, combines the feature maps of the channels according to their weights and acquires the evaluation image for the selected class (step S125). Specifically, using the following Equation 3, the control unit 20 linearly combines all channels of the feature map at layer L_m, each multiplied by the weight acquired in step S115, to obtain the evaluation image 35 for the selected class.

\[ L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\left( \sum_{k} \alpha^c_k A^k \right) \tag{3} \]

L^c_{Grad-CAM} in Equation 3 is the evaluation image 35 for the selected class. The value of each pixel of the image L^c_{Grad-CAM} is an index (contribution degree) of how much the corresponding region of the original image contributed to the discrimination of the selected class. In the present embodiment, as shown in FIG. 5, the control unit 20 uses Equation 3 to multiply the feature map of every channel at layer L_m by the corresponding weight and sum the results into a single image of height H_m × width W_m. The control unit 20 then applies the ReLU function, setting every pixel whose value is less than 0 to 0. The portions considered to contribute to the discrimination of the selected class are those with positive pixel values. By clipping in this way, the control unit 20 reduces the amount of information in the portions that do not contribute to the discrimination of the class.
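
The combination described above (weight each channel, sum the channels pixel by pixel, then clip negative values to 0 with ReLU) can be sketched as follows. This is an illustrative sketch only: the function name `grad_cam_map` and the sample feature maps and weights are hypothetical.

```python
def grad_cam_map(feature_maps, weights):
    """Weighted sum of the channels A^k followed by ReLU.

    feature_maps: list of D_m channels, each a list of H_m rows of W_m values.
    weights: list of D_m weights alpha^c_k for the selected class c.
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    heatmap = [[0.0] * width for _ in range(height)]
    for a_k, alpha_k in zip(feature_maps, weights):
        for j in range(height):
            for i in range(width):
                heatmap[j][i] += alpha_k * a_k[j][i]
    # ReLU: pixels with a negative combined value are set to 0.
    return [[max(0.0, v) for v in row] for row in heatmap]


# Two hypothetical 2 x 2 channels and their weights.
maps = [
    [[1.0, 2.0], [0.0, 4.0]],
    [[2.0, 1.0], [1.0, 0.0]],
]
alphas = [0.5, -1.0]
evaluation_image = grad_cam_map(maps, alphas)
print(evaluation_image)
```

In this example only the bottom-right pixel keeps a positive value after ReLU; the other pixels are zero or negative before clipping and therefore carry no contribution in the resulting map.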

FIG. 6 shows the evaluation image 35a acquired in step S125 when the selected class is the vacant class. FIG. 7 shows the evaluation image 35b acquired in step S125 when the selected class is the parking class. In FIGS. 6 and 7, the evaluation images 35a and 35b are each represented as a heat map. The larger the value of a pixel of the evaluation images 35a and 35b, the brighter the color in which that pixel is displayed (in the examples of FIGS. 6 and 7, the whiter the color). By viewing an evaluation image 35, the user can therefore intuitively grasp that the regions of the original image 33 corresponding to brighter regions contributed more to the discrimination of the corresponding class.

Next, the control unit 20, by the function of the acquisition unit 21c, determines whether all of the classes that the machine learning model 32 can discriminate (the vacant class and the parking class) have been selected in step S105 (step S130). If it is not determined that all of these classes have been selected in step S105, the control unit 20 repeats the processing from step S105. Thus, in the present embodiment, by executing steps S105 to S130, the control unit 20 uses the Grad-CAM method to acquire, for each class, an evaluation image 35 indicating the contribution degree of each region of the original image 33 to the class discrimination.

If it is determined in step S130 that all classes have been selected, the control unit 20, by the function of the derivation unit 21d, compares the evaluation image 35a and the evaluation image 35b acquired in step S125 pixel by pixel and derives the comparison image 36 showing the comparison result (step S135). The determination reference data 34 recorded on the recording medium 30 indicates the criterion for comparing the evaluation images 35 with each other. The control unit 20 compares the evaluation image 35a with the evaluation image 35b based on the determination reference data 34. In the present embodiment, the determination reference data 34 indicates that the pixel values are to be compared by taking, for each pixel, the difference between the pixel value of the evaluation image for the vacant class and the pixel value of the evaluation image for the parking class.

The details of step S135 are described here with reference to FIG. 8. In step S200, the control unit 20, by the function of the derivation unit 21d, generates image data of the same size as the evaluation images 35 acquired in step S125 as the comparison image 36 in its initial state. In the present embodiment, image data of height H_m × width W_m is generated and recorded in the RAM. Next, the control unit 20, by the function of the derivation unit 21d, initializes the indexes i and j (step S205). Specifically, the control unit 20 initializes to 1 both the index i, which indicates the horizontal position of a pixel in the image, and the index j, which indicates the vertical position of a pixel in the image, and records them in the RAM. Next, the control unit 20, by the function of the derivation unit 21d, compares the value of the pixel at position (i, j) in the evaluation image 35a with the value of the pixel at position (i, j) in the evaluation image 35b (step S210). Specifically, the control unit 20 determines the magnitude relationship of the two pixel values (greater than, less than, or equal).

Next, the control unit 20, by the function of the derivation unit 21d, colors the pixel at position (i, j) in the comparison image 36 according to the comparison result of step S210 (step S215). In the present embodiment, if the value of the pixel at position (i, j) in the evaluation image 35a shown in FIG. 6 is larger than the value of the pixel at position (i, j) in the evaluation image 35b shown in FIG. 7, the control unit 20 colors the pixel at position (i, j) in the comparison image 36 with a first color. More specifically, the control unit 20 sets the value of the pixel at position (i, j) in the comparison image 36 to the pixel value corresponding to the first color. In the present embodiment, the first color is black, but it may be another color such as red, blue, yellow, or green.

If the value of the pixel at position (i, j) in the evaluation image 35a shown in FIG. 6 is smaller than the value of the pixel at position (i, j) in the evaluation image 35b shown in FIG. 7, the control unit 20 colors the pixel at position (i, j) in the comparison image 36 with a second color different from the first color. More specifically, the control unit 20 sets the value of the pixel at position (i, j) in the comparison image 36 to the pixel value corresponding to the second color. In the present embodiment, the second color is white, but it may be another color such as red, blue, yellow, or green.

When the pixel value at position (i, j) in the evaluation image 35a shown in FIG. 6 is equal to the pixel value at position (i, j) in the evaluation image 35b shown in FIG. 7, the control unit 20 colors the pixel at position (i, j) in the comparison image 36 with a third color. More specifically, the control unit 20 sets the pixel value of that pixel to the pixel value corresponding to the third color. In the present embodiment the third color is gray, but it may be another color such as red, blue, yellow, or green.

Next, by the function of the derivation unit 21d, the control unit 20 determines whether the index i is greater than or equal to W_m (step S220). If the index i is not determined to be greater than or equal to W_m in step S220, the control unit 20 increments the index i (step S230) and repeats the processing from step S210.

If the index i is determined to be greater than or equal to W_m in step S220, the control unit 20, by the function of the derivation unit 21d, initializes the index i to 1 (step S225). Next, the control unit 20 determines whether the index j is greater than or equal to H_m (step S235). If the index j is determined to be less than H_m in step S235, the control unit 20 increments the index j (step S240) and repeats the processing from step S210. If the index j is determined to be greater than or equal to H_m in step S235, the control unit 20 regards the comparison image 36 as complete, ends the processing of FIG. 8, and executes step S140 and the subsequent steps shown in FIG. 3. The above processing generates a comparison image 36 consisting of W_m pixels horizontally and H_m pixels vertically, in which each pixel is colored according to the comparison result. FIG. 9 shows the comparison image 36 generated by the processing of FIG. 8.
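The loop of steps S210 to S240 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_comparison_image`, the list-of-rows representation, and the grayscale values chosen for black, white, and gray are assumptions made here, and the indices run from 0 rather than from 1 as in the flowchart.

```python
# Illustrative colour values (assumed 8-bit grayscale encoding, not from the source).
BLACK, WHITE, GRAY = 0, 255, 128


def build_comparison_image(eval_a, eval_b):
    """Compare two evaluation images pixel by pixel (steps S210 to S240).

    eval_a, eval_b: lists of H_m rows, each holding W_m contribution values.
    Returns a list of rows with one colour value per pixel.
    """
    h_m = len(eval_a)
    w_m = len(eval_a[0]) if h_m else 0
    comparison = [[GRAY] * w_m for _ in range(h_m)]
    for j in range(h_m):        # vertical index j
        for i in range(w_m):    # horizontal index i
            a, b = eval_a[j][i], eval_b[j][i]
            if a > b:
                comparison[j][i] = BLACK   # first colour
            elif a < b:
                comparison[j][i] = WHITE   # second colour
            else:
                comparison[j][i] = GRAY    # third colour
    return comparison
```

With `eval_a` standing for the evaluation image 35a and `eval_b` for 35b, a black pixel marks a position where 35a holds the larger value.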

The comparison image 36 is divided into black, white, and gray regions. In the example of FIG. 9, it is divided into one black region 361, a plurality of white regions 362, and a plurality of gray regions 363.

A black region in the comparison image 36 corresponds to a region of the original image 33 that contributes more to the discrimination of the empty vehicle class than to the discrimination of the parking class. In the example of FIG. 9, when the comparison image 36 and the original image 33 of FIG. 4 are scaled to the same size and superimposed, the region of the original image 33 that overlaps the region 361 contributes more to the discrimination of the empty vehicle class than to that of the parking class. A white region in the comparison image 36 corresponds to a region of the original image 33 that contributes more to the discrimination of the parking class than to that of the empty vehicle class. In the example of FIG. 9, when the two images are scaled to the same size and superimposed, the region of the original image 33 that overlaps a region 362 contributes more to the discrimination of the parking class than to that of the empty vehicle class. A gray region in the comparison image 36 corresponds to a region of the original image 33 whose contribution is the same for the parking class and the empty vehicle class. In the example of FIG. 9, when the two images are scaled to the same size and superimposed, the region of the original image 33 that overlaps a region 363 can be regarded as having contributed to the discrimination of both the empty vehicle class and the parking class.

By viewing such a comparison image 36, the user can more easily grasp which class each region of the original image 33 contributed to more. For example, with the comparison image 36 of FIG. 9, by checking the black region 361, the white regions 362, and the gray regions 363, the user can identify the regions that contributed more to the empty vehicle class, the regions that contributed more to the parking class, and the regions whose contribution is the same for both classes. The user can also see intuitively that the black region 361 occupies much of the comparison image 36, that is, that a large part of the image contributes more to the discrimination of the empty vehicle class.

The description now returns to the flowchart of FIG. 3. In step S140, by the function of the display control unit 21e, the control unit 20 displays the comparison image 36 derived in step S135 on the display unit 40, side by side with the original image 33 and the evaluation images 35 acquired in step S135. FIG. 10 shows the screen displayed on the display unit 40 in step S140. The control unit 20 resizes the comparison image 36, the evaluation images 35a and 35b, and the original image 33 to the same size before displaying them. The control unit 20 displays the evaluation images 35 as heat map images. In doing so, it normalizes the pixel values of the heat map images for the respective classes, so that pixels with equal pixel values in different evaluation images 35 are shown in the same color.
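One way to obtain heat maps in which equal pixel values map to equal colors is to scale all evaluation images with a single shared minimum and maximum. The text does not state the normalization formula, so the min-max scaling below, like the function name `normalize_jointly`, is an assumption made for illustration.

```python
def normalize_jointly(evaluation_images):
    """Scale several evaluation images to [0, 1] using one shared min and max.

    evaluation_images: list of images, each a list of rows of numbers.
    Because the scale is shared, equal input values in different images
    receive equal normalized values (and hence equal heat map colours).
    """
    values = [v for img in evaluation_images for row in img for v in row]
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All values identical: map everything to 0.0 to avoid division by zero.
        return [[[0.0 for _ in row] for row in img] for img in evaluation_images]
    return [[[(v - lo) / span for v in row] for row in img]
            for img in evaluation_images]
```

A shared scale also means that one very large value in one class compresses every other value toward the dark end of the color map.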

Because the comparison image 36 has the same size as the original image 33, the user can more easily grasp which region of the original image 33 each region of the comparison image 36 corresponds to. As a result, the user can more easily grasp which region of the original image contributed more to the discrimination of which class. The control unit 20 may also display the comparison image 36 side by side with the original image 33. This lets the user compare the original image 33 with the comparison image 36 and more easily grasp which region of the original image 33 each region of the comparison image 36 corresponds to.

With the above configuration, the information processing apparatus 10 acquires, for each class, a contribution indicating the degree to which a region of the original image 33 input to the machine learning model 32 contributed to the class discrimination by the machine learning model 32, and derives a comparison image 36 showing the result of comparing the acquired contributions of the classes. In this way, a comparison image 36 showing, for each region of the original image 33, the result of comparing the contributions to the discrimination of each class can be derived. When this comparison image 36 is presented to the user, the user can easily grasp which class each region of the original image 33 contributed to more. That is, the information processing apparatus 10 can assist the user in grasping this.
In addition, because the information processing apparatus 10 displays the derived comparison image 36 on the display unit 40, the user can grasp the information shown by the comparison image 36 at a glance.

The information processing apparatus 10 also colors the pixel at position (i, j) in the comparison image 36 with different colors depending on whether the pixel value at position (i, j) in the evaluation image 35a is larger or smaller than the pixel value at position (i, j) in the evaluation image 35b. This allows the information processing apparatus 10 to display the comparison image 36 in a way that makes it easier for the user to grasp which class each region of the original image 33 contributes to more.

The information processing apparatus 10 also colors the pixel at position (i, j) in the comparison image 36 with a predetermined color when the pixel value at position (i, j) in the evaluation image 35a is equal to the pixel value at position (i, j) in the evaluation image 35b. This allows the information processing apparatus 10 to display the comparison image 36 in a way that makes it easier for the user to grasp the regions of the original image 33 whose contribution to the class discrimination is equal for each class.

Conventionally, displaying heat maps with Grad-CAM has had the following problem. The pixel values of the heat maps for the respective classes are normalized, and this normalization can make a region that should stand out in a heat map inconspicuous. Once the pixel values are normalized, pixels with equal pixel values in different heat maps are shown in the same color. Under this normalization, the pixel values of some regions may be far larger than those of other regions. In that case only those regions are displayed in bright colors, and the other regions are displayed in dark colors. Even if the other regions include a region that served as a basis for the class discrimination, that region no longer stands out.

The region 701 in the evaluation image 35b for the parking class has pixel values far larger than those of every region of the evaluation image 35a for the empty vehicle class and of the other regions of the evaluation image for the parking class. For this reason, although the original image 33 shows an empty space, the heat map image of the evaluation image 35a for the empty vehicle class is dark overall, as shown in FIGS. 6 and 10. That is, because the heat maps for the classes are normalized so that the Grad-CAM values can be evaluated under the same conditions, the locations that contributed greatly to the determination of the empty vehicle class are buried. As a result, it becomes difficult to grasp the regions that served as the basis for discriminating the empty vehicle class.
In contrast, the comparison image 36 derived by the information processing apparatus 10 of the present embodiment is obtained from the result of comparing, region by region, the contributions to the discrimination of each class, so it is unlikely to fail to show the part corresponding to a region that served as a basis for the class discrimination. That is, the information processing apparatus 10 can reduce the possibility that the regions serving as the basis for class discrimination become difficult to grasp.
Furthermore, in a Grad-CAM heat map, among the regions that contributed to the class discrimination, pixels with relatively large pixel values (degrees of contribution) are displayed in bright colors and the other pixels in darker colors. That is, only regions with a relatively large degree of contribution stand out. In contrast, the method of the present embodiment can show the regions that contributed to the class discrimination regardless of the magnitude of their contribution.

(3) Other embodiments:
The above embodiment is one example of carrying out the present invention. Various other embodiments can be adopted as long as a contribution indicating the degree to which a region of an image input to a machine learning model contributed to the class discrimination by the machine learning model is acquired for each class, and comparison information showing the result of comparing the contributions of the classes is derived. For example, an external device may provide the function of training the machine learning model. An external device may also provide the function of displaying the derived comparison image.

In the above embodiment, in step S215, the control unit 20 colors the corresponding pixel with the first color when, according to the comparison result of step S210, the pixel value for the empty vehicle class is greater than the pixel value for the parking class. However, even in that case, the control unit 20 may color the corresponding pixel with different colors depending on the magnitude of the difference between the pixel value for the empty vehicle class and the pixel value for the parking class. For example, the control unit 20 may color the pixel red when (pixel value for the empty vehicle class - pixel value for the parking class) is greater than or equal to a predetermined threshold, and pink when it is less than the threshold. The control unit 20 may also color the pixel with a density that corresponds to the magnitude of (pixel value for the empty vehicle class - pixel value for the parking class).

Likewise, in the above embodiment, in step S215, the control unit 20 colors the corresponding pixel with the second color when, according to the comparison result of step S210, the pixel value for the empty vehicle class is less than the pixel value for the parking class. However, even in that case, the control unit 20 may color the corresponding pixel with different colors depending on the magnitude of the difference between the pixel value for the parking class and the pixel value for the empty vehicle class. For example, the control unit 20 may color the pixel blue when (pixel value for the parking class - pixel value for the empty vehicle class) is greater than or equal to a predetermined threshold, and light blue when it is less than the threshold. The control unit 20 may also color the pixel with a density that corresponds to the magnitude of (pixel value for the parking class - pixel value for the empty vehicle class).
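The graded coloring described in the two preceding paragraphs can be sketched as below. The color names red, pink, blue, and light blue come from the text; the function name `graded_color`, the default threshold of 0.5, and the use of gray for equal values (taken from the base embodiment) are assumptions of this sketch.

```python
def graded_color(empty_value, parking_value, threshold=0.5):
    """Pick a colour from the sign and the size of the contribution difference.

    threshold is an illustrative value; the source only calls it
    "a predetermined threshold".
    """
    diff = empty_value - parking_value
    if diff > 0:
        # Empty vehicle class contributes more: strong vs. weak difference.
        return "red" if diff >= threshold else "pink"
    if diff < 0:
        # Parking class contributes more: strong vs. weak difference.
        return "blue" if -diff >= threshold else "light blue"
    # Equal contributions: third colour of the base embodiment.
    return "gray"
```

A continuous variant would map the magnitude of `diff` to a color density instead of switching at one threshold.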

In the above embodiment, in step S210, the control unit 20 compares the pixel value for the empty vehicle class with the pixel value for the parking class. As the comparison result, the control unit 20 identifies which of the following relations holds between the two: (pixel value for the empty vehicle class > pixel value for the parking class), (pixel value for the empty vehicle class < pixel value for the parking class), or (pixel value for the empty vehicle class = pixel value for the parking class). However, in step S210 the control unit 20 may instead identify which of the following relations holds: (pixel value for the empty vehicle class > (pixel value for the parking class + predetermined threshold)), ((pixel value for the empty vehicle class + predetermined threshold) < pixel value for the parking class), or (|pixel value for the empty vehicle class - pixel value for the parking class| ≤ predetermined threshold). In that case, in step S215 the control unit 20 may color the corresponding pixel with a different color for each of these three cases.
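The three threshold-based relations above amount to a comparison with a tolerance band around equality. The following sketch returns a label for each case; the function name `compare_with_margin` and the label strings are hypothetical and chosen only for illustration.

```python
def compare_with_margin(empty_value, parking_value, threshold):
    """Three-way comparison with a tolerance band of width `threshold`.

    Returns "empty" when the empty vehicle class exceeds the parking class
    by more than the threshold, "parking" in the opposite case, and
    "similar" when the absolute difference is at most the threshold.
    """
    if empty_value > parking_value + threshold:
        return "empty"
    if empty_value + threshold < parking_value:
        return "parking"
    return "similar"
```

With a threshold of 0 this reduces to the strict greater-than, less-than, and equal comparison of the base embodiment.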

In the above embodiment, in step S215, the control unit 20 colors the corresponding pixel with the third color when, according to the comparison result of step S210, (pixel value for the empty vehicle class = pixel value for the parking class). However, in that case the control unit 20 may leave the corresponding pixel uncolored. For example, when (pixel value for the empty vehicle class = pixel value for the parking class), the control unit 20 may set the corresponding pixel to transparent or set it to a null value.

In the above embodiment, the machine learning model 32 is a model used to discriminate classes relating to whether a vehicle is parked (the empty vehicle class and the parking class). However, the classification target of the machine learning model is not limited to this example. For example, the machine learning model 32 may be a model used to discriminate other kinds of classes. Specifically, the machine learning model 32 may be a model that determines whether the class of the input image is a dog class, indicating a dog, or a cat class, indicating a cat.

In the above embodiment, the machine learning model 32 is a model that determines which of two classes an input image belongs to. However, the output of the model is not limited to this form. For example, the machine learning model 32 may be a model that determines which of three or more classes an input image belongs to. Specifically, the machine learning model 32 may be a model that determines which of three classes an input image belongs to. In this case, the information processing apparatus 10 may acquire an evaluation image 35 for each of the three classes (referred to as class A, class B, and class C), for example by the processing of steps S105 to S130.

Then, in step S210, the control unit 20 may compare the pixel values at position (i, j) in the evaluation images 35 for the three classes and, in step S215, color the pixel at position (i, j) in the comparison image 36 according to the comparison result. For example, the control unit 20 may use a different color for each of the following cases: the pixel value for class A is the largest, the pixel value for class B is the largest, the pixel value for class C is the largest, and the pixel values for classes A to C are all equal. Alternatively, the control unit 20 may use a different color for each magnitude relation, such as (A > B > C), (A > C > B), (B > A > C), (B > C > A), (C > A > B), (C > B > A), and (C = A = B), where A, B, and C denote the pixel values for class A, class B, and class C, respectively.
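The first three-class variant (one color per dominant class, plus one for equality) can be sketched as below. The function name `dominant_class` is hypothetical. The text only mentions the case where all three values are equal; treating a maximum shared by just two classes as a tie as well is an assumption of this sketch.

```python
def dominant_class(values):
    """Return the class whose pixel value is largest, or "tie".

    values: dict mapping a class name (e.g. "A", "B", "C") to the pixel
    value at one position (i, j) of that class's evaluation image.
    "tie" is returned whenever the maximum is shared by two or more classes.
    """
    top = max(values.values())
    winners = [name for name, v in values.items() if v == top]
    return winners[0] if len(winners) == 1 else "tie"


def ranking(values):
    """Return class names ordered from largest to smallest pixel value."""
    return sorted(values, key=values.get, reverse=True)
```

`ranking` supports the second variant, in which each ordering such as A > B > C is given its own color; how ties inside an ordering are colored is left open here.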

In the above embodiment, the control unit 20 derives the comparison image 36 by the function of the derivation unit 21d. However, the comparison result need not be output as a comparison image. For example, by the function of the derivation unit 21d, the control unit 20 may derive information indicating the proportion of each of the following: regions of the original image 33 whose contribution to the discrimination of one of the two classes is larger than their contribution to the discrimination of the other class, regions whose contribution to the one class is smaller than their contribution to the other class, and regions whose contribution to the one class is equal to their contribution to the other class.

More specifically, the control unit 20 may determine the proportions of the comparison image 36 occupied by the regions colored with the first color, the regions colored with the second color, and the regions colored with the third color. The control unit 20 may then derive information indicating the determined proportions (for example, text information or graph information such as a bar chart or pie chart). The control unit 20, acting as the display control unit 21e, may then display the derived information on the display unit 40. This allows the user to grasp more easily what proportion of the image is occupied by regions that contributed more to the empty vehicle class, regions that contributed more to the parking class, and regions that contributed equally to both.
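Counting how much of the comparison image each color occupies is a simple tally. The sketch below assumes the comparison image is a list of rows of hashable color values; the function name `region_ratios` is hypothetical.

```python
from collections import Counter


def region_ratios(comparison_image):
    """Return the fraction of pixels carrying each colour.

    comparison_image: list of rows, each a list of colour values
    (strings, integers, or any other hashable representation).
    """
    counts = Counter(v for row in comparison_image for v in row)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {color: n / total for color, n in counts.items()}
```

The resulting dictionary could be rendered as text or passed to a charting library to draw the bar chart or pie chart mentioned above.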

In the above embodiment, as information indicating the contribution of each region of the original image 33 to the discrimination of the selected class, the control unit 20 acquires, by a method similar to Grad-CAM, an evaluation image 35 obtained by combining the channels of the feature map in layer L_m, the final convolutional layer. However, the control unit 20 may acquire other information as the information indicating this contribution. For example, the control unit 20 may acquire an image obtained by the CAM (Class Activation Mapping) method.

As another example, the control unit 20 may acquire, from a feature map in an intermediate convolutional layer, an evaluation image 35 in which each pixel indicates a contribution to the class discrimination. For example, instead of the feature map in layer L_m, the final convolutional layer, the control unit 20 may take an intermediate convolutional layer (a layer closer to the input than layer L_m) as layer L_l and perform the same processing as in the above embodiment using the feature map in layer L_l. In that case, the control unit 20 uses the horizontal and vertical sizes of the feature map in layer L_l in place of W_m and H_m. In this case too, the control unit 20 may resize the comparison image 36 to the same size as the input original image 33.
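Resizing a small comparison image to the size of the original image can be done in several ways; the text does not name an interpolation method. The sketch below uses nearest-neighbor scaling, which is an assumption chosen here because it keeps the discrete color labels intact. The function name `resize_nearest` is hypothetical.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a list-of-rows image with nearest-neighbour sampling.

    Each output pixel copies the input pixel whose index is the output
    index scaled by in_size / out_size and rounded down, so no new
    colour values are introduced.
    """
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]
```

Interpolating methods such as bilinear scaling would blend neighboring labels and are better suited to the continuous evaluation images than to the comparison image.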

In the above embodiment, the control unit 20 obtains the information indicating the contribution of each region of the original image 33 to the discrimination of the selected class, and the comparison information showing the result of comparing the contributions of the classes, in image form as the evaluation images 35 and the comparison image 36. However, the control unit 20 may obtain this information in a form other than an image (for example, CSV data, array data, or text data).
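As an illustration of a non-image output, the comparison result can be written as CSV text with Python's standard `csv` module. The encoding of the comparison result as 1, -1, and 0 in the usage below, and the function name `comparison_to_csv`, are assumptions made for this sketch.

```python
import csv
import io


def comparison_to_csv(comparison):
    """Serialize a list-of-rows comparison result as CSV text.

    Each inner list becomes one CSV line; values are written with str().
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(comparison)
    return buffer.getvalue()
```

For example, `comparison_to_csv([[1, -1], [0, 1]])` yields two lines, `1,-1` and `0,1`, where 1 could stand for "first class larger", -1 for "second class larger", and 0 for "equal".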

The degree of contribution may be any measure of the degree to which a region of the input image input to the machine learning model contributed to the class discrimination by the machine learning model. Accordingly, the degree of contribution may be an index indicating the magnitude of the influence on the value of the output-layer node corresponding to a class in the machine learning model, and various other definitions can also be used. For example, it may be the average of the gradient values in each region when the image is divided into a plurality of regions.
The derivation unit only needs to derive, as the result of comparing the contributions of the respective classes, information obtained by comparing those contributions. For example, the derivation unit may derive information indicating the magnitude relationship between the contributions of the respective classes, or may derive information indicating the difference in contribution for each class and each region.
The first color, the second color, and the third color only need to differ from one another. For example, they may be colors that are easy to tell apart (for example, colors that differ markedly in saturation, lightness, or hue).
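
The variations above can be sketched as follows: a contribution defined as a per-region average, a comparison expressed as a per-region difference and as a three-color map, and the proportion of regions falling into each case. All names (`block_mean`, `compare`, `proportions`), the specific RGB values, and the `tol` tolerance for treating contributions as equal are assumptions made for this example and are not specified in the text.

```python
FIRST_COLOR = (255, 0, 0)   # hypothetical: class A contributes more
SECOND_COLOR = (0, 0, 255)  # hypothetical: class A contributes less
THIRD_COLOR = (0, 255, 0)   # hypothetical: contributions are equal


def block_mean(values, block):
    """Average a 2D list over non-overlapping block x block regions."""
    height, width = len(values), len(values[0])
    result = []
    for y0 in range(0, height, block):
        row = []
        for x0 in range(0, width, block):
            cells = [values[y][x]
                     for y in range(y0, min(y0 + block, height))
                     for x in range(x0, min(x0 + block, width))]
            row.append(sum(cells) / len(cells))
        result.append(row)
    return result


def compare(contrib_a, contrib_b, tol=0.0):
    """Return (difference map, color map) for two per-class contribution maps."""
    diff = [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(contrib_a, contrib_b)]
    colors = [[FIRST_COLOR if d > tol else SECOND_COLOR if d < -tol else THIRD_COLOR
               for d in row]
              for row in diff]
    return diff, colors


def proportions(diff, tol=0.0):
    """Fraction of regions where class A is larger, smaller, or equal."""
    flat = [d for row in diff for d in row]
    larger = sum(d > tol for d in flat)
    smaller = sum(d < -tol for d in flat)
    equal = len(flat) - larger - smaller
    return larger / len(flat), smaller / len(flat), equal / len(flat)


diff, colors = compare([[3, 1, 2, 4]], [[1, 1, 5, 0]])
ratios = proportions(diff)
```

With floating-point contributions, exact equality is rare, so a small positive `tol` may be a more practical way to decide when two contributions count as equal.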

Furthermore, the technique of the present invention is also applicable as a program or a method. It may also be modified as appropriate, for example by implementing part of it in software and part of it in hardware. The invention also holds as a recording medium storing a program that controls the device. Of course, the recording medium for the software may be a magnetic recording medium or a semiconductor memory, and any recording medium developed in the future can be regarded in exactly the same way.

10 ... Information processing device, 20 ... Control unit, 21 ... Derivation program, 21a ... Learning unit, 21b ... Discrimination unit, 21c ... Acquisition unit, 21d ... Derivation unit, 21e ... Display control unit, 30 ... Recording medium, 31 ... Teacher data, 32 ... Machine learning model, 33 ... Original image, 34 ... Judgment standard data, 35 ... Evaluation image, 36 ... Comparison image, 40 ... Display unit

Claims

An information processing device comprising:
an acquisition unit that acquires, for each class, a degree of contribution indicating the degree to which a region of an input image input to a machine learning model contributed to class discrimination by the machine learning model; and
a derivation unit that derives comparison information indicating a result of comparing the degrees of contribution of the respective classes acquired by the acquisition unit.

The information processing device according to claim 1, further comprising a display control unit that displays the comparison information derived by the derivation unit.

The information processing device according to claim 1 or 2, wherein the acquisition unit acquires each pixel value of an image derived by Grad-CAM as the degree of contribution.

The information processing device according to any one of claims 1 to 3, wherein the machine learning model is used to discriminate which of two classes the input image belongs to: an empty vehicle class indicating an image of a vacant state, or a parking class indicating an image of a parked state, and
the derivation unit derives, as the comparison information, an image in which a portion corresponding to a region of the input image where the contribution to the discrimination of the empty vehicle class is larger than the contribution to the discrimination of the parking class is colored in a first color, and a portion corresponding to a region of the input image where the contribution to the discrimination of the empty vehicle class is smaller than the contribution to the discrimination of the parking class is colored in a second color.

The information processing device according to claim 4, wherein the derivation unit derives, as the comparison information, the image in which a portion corresponding to a region of the input image where the contribution to the discrimination of the empty vehicle class is equal to the contribution to the discrimination of the parking class is further colored in a third color.

The information processing device according to any one of claims 1 to 5, wherein the machine learning model is used to discriminate which of two classes the input image belongs to, and
the derivation unit further derives information indicating the proportion of each of: regions of the input image where the contribution to the discrimination of one class is larger than the contribution to the discrimination of the other class; regions of the input image where the contribution to the discrimination of the one class is smaller than the contribution to the discrimination of the other class; and regions of the input image where the contribution to the discrimination of the one class is equal to the contribution to the discrimination of the other class.

An information processing method executed by an information processing device, the method comprising:
a step of acquiring, for each class, a degree of contribution indicating the degree to which a region of an input image input to a machine learning model contributed to class discrimination by the machine learning model; and
a step of deriving comparison information indicating a result of comparing the acquired degrees of contribution of the respective classes.

A program that causes a computer to function as:
an acquisition unit that acquires, for each class, a degree of contribution indicating the degree to which a region of an input image input to a machine learning model contributed to class discrimination by the machine learning model; and
a derivation unit that derives comparison information indicating a result of comparing the degrees of contribution of the respective classes acquired by the acquisition unit.