JP7461866B2

JP7461866B2 - Information processing device, information processing method, and program

Info

Publication number: JP7461866B2
Application number: JP2020213408A
Authority: JP
Inventors: 裕生宮下; 恵介石黒
Original assignee: Nagoya Electric Works Co Ltd
Current assignee: Nagoya Electric Works Co Ltd
Priority date: 2020-12-23
Filing date: 2020-12-23
Publication date: 2024-04-04
Anticipated expiration: 2040-12-23
Also published as: JP2022099572A

Description

The present invention relates to an information processing device, an information processing method, and a program.

In image classification using a machine-learned model, there are techniques for visualizing the regions of an image that served as the basis for the classification decision.
Non-Patent Document 1 discloses Grad-CAM (Gradient-weighted Class Activation Mapping), a technique for visualizing the basis of a CNN's decisions.

R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, "Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization," arXiv:1610.02391v3, 2017.

However, with the technique of Non-Patent Document 1, when heat maps corresponding to multiple classes are displayed and the same region has a similar color in each heat map, the user cannot tell which class's discrimination that region contributed to more.
The present invention has been made in view of this problem, and its object is to help a user understand which class's discrimination each region of an image contributed to more.

To achieve the above object, the information processing device of the present invention includes: an acquisition unit that acquires, for each class, a contribution degree indicating the degree to which a region of an input image input to a machine learning model contributed to class discrimination by the machine learning model; and a derivation unit that derives comparison information indicating the result of comparing the per-class contribution degrees acquired by the acquisition unit.

With the configuration described above, comparison information indicating the result of comparing the per-class contribution degrees can be derived for each region of the input image. When this comparison information is presented to a user, the user can easily see which class's discrimination each region of the input image contributed to more. In other words, the configuration of the present invention can support this understanding by the user.

FIG. 1 is a block diagram showing the configuration of the information processing device.
FIG. 2 is a diagram showing the structure of the model.
FIG. 3 is a flowchart showing the derivation process.
FIG. 4 is a diagram showing the original image.
FIG. 5 is a diagram outlining the process for deriving an evaluation image.
FIG. 6 is a diagram showing the evaluation image for the vacant class.
FIG. 7 is a diagram showing the evaluation image for the parking class.
FIG. 8 is a flowchart showing the derivation process.
FIG. 9 is a diagram showing the comparison image that is output.
FIG. 10 is a diagram showing a display screen.

Embodiments of the present invention are described in the following order.
(1) Configuration of the information processing device:
(2) Derivation process:
(3) Other embodiments:

(1) Configuration of the information processing device:
FIG. 1 is a block diagram showing the configuration of an information processing device 10 according to an embodiment of the present invention. The information processing device 10 includes a control unit 20, a recording medium 30, and a display unit 40. The control unit 20 is a computer including a CPU, a RAM, a ROM, and the like, and executes various programs, such as a derivation program 21, recorded on the recording medium 30 or elsewhere. The derivation program 21 causes the control unit 20 to execute a function of deriving, for each region of an image to be classified, an image showing the result of comparing the contribution degrees to the discrimination of each class.

The recording medium 30 records various programs, such as the derivation program 21, and various data. The recording medium 30 is, for example, a hard disk drive (HDD) or a solid state drive (SSD). In this embodiment, the recording medium 30 records training data 31, a machine learning model 32, an original image 33, judgment criteria data 34, evaluation images 35, a comparison image 36, and the like.
The training data 31 is the teacher data used for machine learning of the machine learning model 32. The machine learning model 32 is a model used to discriminate the class of an input image. Here, a model is information (for example, a formula) indicating the correspondence between the data to be classified and the data of the class discrimination result. In this embodiment, the machine learning model 32 determines which of two classes an input image belongs to: a parking class, indicating that a vehicle is parked, or a vacant class, indicating that the space is empty. Also in this embodiment, the machine learning model 32 is a model that includes a CNN (convolutional neural network).

FIG. 2 is a diagram showing the structure of the machine learning model 32 of this embodiment. In FIG. 2, the changes in data format through the CNN are shown as rectangular cuboids. In the machine learning model 32 of this embodiment, image data representing the image to be classified is the input data to the input layer $L_i$ of the CNN, and output data is output to each node of the output layer $L_o$ via one or more convolution layers, one or more pooling layers, and a fully connected layer $L_n$. In this embodiment, the image data input to the CNN is $H$ pixels high and $W$ pixels wide, and gradation values of three channels (R: red, G: green, B: blue) are defined for each pixel. Accordingly, the image at the input layer $L_i$ is shown schematically in FIG. 2 as a cuboid of height $H$, width $W$, and depth 3.

FIG. 2 shows an example in which the image input to the input layer $L_i$ is converted into a feature map of height $H_1$ × width $W_1$ × $D_1$ channels through a convolution operation using filters of a predetermined size and number, an activation function, and a pooling layer. FIG. 2 further shows that, after passing through multiple layers, the data is finally converted into a feature map of height $H_m$ × width $W_m$ × $D_m$ channels at layer $L_m$, the final convolution layer. After the CNN produces this feature map, the value of each node of the fully connected layer $L_n$ is obtained by full connection. The node values of $L_n$ then serve as input data from which the value of each node of the output layer $L_o$ is output. The output layer $L_o$ of the CNN of the machine learning model 32 of this embodiment includes two nodes, corresponding to the vacant class and the parking class respectively, and each node outputs a value reflecting how likely the input image is to belong to the corresponding class.
In addition, each channel of a feature map in a convolution layer is data obtained by convolving the image input to $L_i$ with a filter, and each region in each channel corresponds to a region in the input image. In other words, each channel of a feature map in a convolution layer preserves the positional information of the input image.
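As a rough illustration of this positional correspondence, the following sketch maps one feature-map cell back to a rectangle of input pixels. It assumes uniform downsampling and ignores overlapping receptive fields, so it is only an approximation. The function name `cell_to_region` and the sizes 224 and 7 are hypothetical and do not appear in this description.

```python
def cell_to_region(i, j, H, W, Hm, Wm):
    """Approximate input-pixel bounds for one feature-map cell.

    i, j are the 1-based horizontal and vertical indices of the cell in an
    Hm x Wm feature map; H, W are the input image height and width.
    Returns (top, bottom, left, right) with bottom/right exclusive.
    Assumption: the network downsamples uniformly in each direction.
    """
    top = (j - 1) * H // Hm
    bottom = j * H // Hm
    left = (i - 1) * W // Wm
    right = i * W // Wm
    return top, bottom, left, right


# Hypothetical sizes: a 224 x 224 input reduced to a 7 x 7 feature map.
print(cell_to_region(1, 1, 224, 224, 7, 7))
print(cell_to_region(7, 7, 224, 224, 7, 7))
```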

The original image 33 is the image to be classified; in this embodiment, it is an image of a parking space in a parking lot. The judgment criteria data 34 is information indicating the criterion for comparing evaluation images 35 with each other. An evaluation image 35 is an image indicating how much each region of the original image 33 contributed to the discrimination of the corresponding class; in this embodiment, it is a heat map obtained by Grad-CAM. The comparison image 36 is an image indicating the result of comparing the evaluation images 35 of the output classes. The display unit 40 displays various information and is, for example, a monitor or a touch panel.

By executing the derivation program 21 recorded on the recording medium 30, the control unit 20 functions as a learning unit 21a, a discrimination unit 21b, an acquisition unit 21c, a derivation unit 21d, and a display control unit 21e. The learning unit 21a is a function that trains the machine learning model 32. The discrimination unit 21b is a function that discriminates the class of the original image 33 using the machine learning model 32 trained by the learning unit 21a. The acquisition unit 21c is a function that acquires information indicating the contribution degree of each region of the original image 33 to the vacant class discrimination and to the parking class discrimination. In this embodiment, the acquisition unit 21c acquires, as this information, evaluation images 35, which are heat maps obtained by Grad-CAM. The derivation unit 21d is a function that derives information indicating the result of comparing the per-class contribution degrees acquired by the acquisition unit 21c. This information is hereinafter referred to as comparison information.
In this embodiment, the derivation unit 21d derives the comparison image 36 as the comparison information. The display control unit 21e is a function that displays the comparison information derived by the derivation unit 21d (in this embodiment, the comparison image 36).

(2) Derivation process:
FIG. 3 is a flowchart showing an example of the derivation process executed by the information processing device 10. Before the process of FIG. 3 starts, the control unit 20 uses the function of the learning unit 21a to train the machine learning model 32 recorded in advance on the recording medium 30, using the training data 31 also recorded in advance on the recording medium 30. The training data 31 consists of multiple labeled images. Each image is an image of a parking space in the same format (height and width in pixels) as the original image 33 and is associated with a label indicating whether a vehicle is parked in the parking space it shows. The control unit 20 inputs the training data 31 to the machine learning model 32, obtains the output, and changes the variable parameters of the machine learning model 32 so as to reduce the difference between the output values and the label associated with the input data (in which one of the output nodes is 1). It repeats this calculation until the difference is equal to or less than a preset value, thereby training the machine learning model 32. The control unit 20 then records the trained machine learning model 32 on the recording medium 30.
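The training loop described above can be sketched as follows. This is a minimal illustration and not the embodiment's implementation: a two-node softmax classifier on synthetic 2-D points stands in for the CNN and the parking-space images, and the cross-entropy loss, the learning rate, and the threshold value are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the training data: 2-D points instead of images.
x = np.vstack([rng.normal(-2.0, 0.5, (50, 2)), rng.normal(2.0, 0.5, (50, 2))])
labels = np.zeros((100, 2))
labels[:50, 0] = 1.0  # node 0 is 1 for "vacant" samples (assumed node order)
labels[50:, 1] = 1.0  # node 1 is 1 for "parking" samples

w = np.zeros((2, 2))  # variable parameters of the toy model
b = np.zeros(2)
threshold = 0.05      # assumed preset value for the stopping test
lr = 0.5              # assumed learning rate


def forward(x, w, b):
    """Softmax output of the two output nodes."""
    z = x @ w + b
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


for step in range(10_000):
    p = forward(x, w, b)
    # Difference between outputs and labels, measured as cross-entropy.
    loss = -np.mean(np.sum(labels * np.log(p + 1e-12), axis=1))
    if loss <= threshold:
        break  # difference is at or below the preset value
    grad = (p - labels) / len(x)
    w -= lr * x.T @ grad
    b -= lr * grad.sum(axis=0)

print(step, loss)
```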

When the comparison result derivation process of FIG. 3 starts, the control unit 20 uses the function of the discrimination unit 21b to perform class discrimination with the machine learning model 32 (step S100). More specifically, the control unit 20 runs the machine learning model 32 with the original image 33 as the input image and obtains the output value of each node of the output layer $L_o$. The control unit 20 then takes the class corresponding to the node with the largest output value as the discrimination result. FIG. 4 shows the original image 33 used in this embodiment. As shown in FIG. 4, the original image 33 is an image of a vacant parking space in a parking lot.
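Step S100 amounts to taking the class whose output node has the largest value. A minimal sketch follows; the node order in `CLASS_NAMES` and the function name `discriminate` are assumptions made for illustration.

```python
import numpy as np

# Assumed node order of the output layer: node 0 = vacant, node 1 = parking.
CLASS_NAMES = ("vacant", "parking")


def discriminate(output_values):
    """Return the class whose output node has the largest value."""
    return CLASS_NAMES[int(np.argmax(output_values))]


print(discriminate(np.array([0.9, 0.1])))
```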

Next, the control unit 20 uses the function of the acquisition unit 21c to select one of the classes that the machine learning model 32 discriminates (the vacant class and the parking class) (step S105). The class selected in the most recent execution of step S105 is hereinafter referred to as the selected class. Next, the control unit 20 uses the function of the acquisition unit 21c to select one channel from the feature map at layer $L_m$, the final convolution layer of the CNN of the machine learning model 32 (step S110). In the example of FIG. 2, the control unit 20 selects one channel (a feature map of height $H_m$ × width $W_m$) from the feature map of height $H_m$ × width $W_m$ × $D_m$ channels obtained at layer $L_m$. The channel selected in the most recent execution of step S110 is hereinafter referred to as the selected channel.

Next, the control unit 20 uses the function of the acquisition unit 21c to acquire a weight indicating the importance of the selected channel to the discrimination of the selected class (step S115). The details of step S115 are as follows. Let $c$ be the index of the selected class among the classes discriminated by the machine learning model 32 (the vacant class and the parking class), and let $y^c$ be the output value of the node of the output layer $L_o$ that corresponds to the selected class. Let $k$ be the index of the selected channel among the $D_m$ channels of the feature map at layer $L_m$, and let $A^k$ be the feature map of the selected channel (height $H_m$ × width $W_m$). Let $Z$ be the number of pixels in $A^k$ ($H_m \times W_m$). Let $i$ and $j$ be the indices of the horizontal and vertical positions in $A^k$, and let $A^k_{i,j}$ be the value of the pixel at position $(i, j)$ in $A^k$. Finally, let $\alpha^c_k$ be the weight of the selected channel for the discrimination of the selected class.
The control unit 20 obtains the weight $\alpha^c_k$ using Equation 1:

$$\alpha^c_k = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{i,j}} \tag{1}$$
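The averaging in step S115 can be sketched with NumPy as follows. The sketch assumes the per-pixel gradients have already been computed and stacked into one array with one channel per leading index; the function name `channel_weights` and the array layout are illustrative choices.

```python
import numpy as np


def channel_weights(grads):
    """Average the gradients of y^c over the Z pixels of each channel.

    grads: array of shape (D_m, H_m, W_m) holding the partial derivative
    of y^c with respect to each pixel A^k_{i,j} (assumed layout).
    Returns an array of shape (D_m,), one weight per channel.
    """
    return grads.mean(axis=(1, 2))


# Toy gradients for D_m = 2 channels of 2 x 3 pixels.
grads = np.arange(12, dtype=float).reshape(2, 2, 3)
print(channel_weights(grads))
```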

In this way, for every pixel of the selected channel, the control unit 20 obtains the value $\partial y^c / \partial A^k_{i,j}$ obtained by partially differentiating $y^c$ with respect to the pixel value $A^k_{i,j}$, that is, the gradient of $y^c$ with respect to $A^k_{i,j}$, as an index of how much that pixel contributes to the discrimination of the selected class. The control unit 20 then takes the average of this index over all $Z$ pixels of the selected channel as the weight $\alpha^c_k$. In this embodiment, the control unit 20 obtains $\partial y^c / \partial A^k_{i,j}$ in Equation 1 as follows. In the machine learning model 32, the pixel values of the feature map at layer $L_m$ are connected to the nodes of the fully connected layer, and the output value of each node of the output layer $L_o$ is obtained with the node values of the fully connected layer as input data. Each node value of $L_o$ can therefore be expressed as a function whose arguments are the pixel values of the feature map at $L_m$. That is, $y^c$ can be expressed as a function $y^c(A^k_{1,1}, A^k_{1,2}, \ldots, A^k_{i,j}, \ldots, A^k_{W_m,H_m})$ that takes at least $A^k$ as its arguments. The control unit 20 obtains the rate of change of $y^c$ in response to a change in $A^k_{i,j}$ as the value of $\partial y^c / \partial A^k_{i,j}$. In this embodiment, the control unit 20 obtains $\partial y^c / \partial A^k_{i,j}$ using Equation 2, in which $h$ is a predetermined real number.
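Equation 2 is not reproduced in this text, so its exact form is not known. One plausible reading, given that $h$ is a predetermined real number, is a forward difference: $(y^c(\ldots, A^k_{i,j} + h, \ldots) - y^c(\ldots, A^k_{i,j}, \ldots)) / h$. The sketch below assumes that form. The function name `numerical_gradient`, the step `h=1e-4`, and the linear toy `head` that stands in for the layers after $L_m$ are all hypothetical.

```python
import numpy as np


def numerical_gradient(head, feature_map, k, h=1e-4):
    """Forward-difference estimate of d head / d A^k_{i,j} for channel k.

    head: callable mapping a (D_m, H_m, W_m) feature map to the scalar y^c.
    Assumption: Equation 2 is a forward difference with step h.
    """
    Hm, Wm = feature_map.shape[1:]
    base = head(feature_map)
    grad = np.zeros((Hm, Wm))
    for j in range(Hm):
        for i in range(Wm):
            shifted = feature_map.copy()
            shifted[k, j, i] += h  # perturb a single pixel
            grad[j, i] = (head(shifted) - base) / h
    return grad


# Toy check with a linear head, whose exact gradient is its weight array.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4, 5))
head_weights = rng.normal(size=A.shape)
head = lambda fm: float(np.sum(head_weights * fm))
g = numerical_gradient(head, A, k=1)
```

Because this needs one extra forward pass per pixel, it is slow for large feature maps; frameworks with automatic differentiation would normally be used instead.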

However, the method of obtaining $\partial y^c / \partial A^k_{i,j}$ is not limited to Equation 2. For example, the recording medium 30 may hold, in advance, information on the expression obtained by partially differentiating $y^c$ with respect to $A^k_{i,j}$, written in terms of the parameters used in the machine learning model 32 (node connection weights, element values of the filters used for convolution, pixel values of the feature maps, and so on). In this case, the control unit 20 may obtain $\partial y^c / \partial A^k_{i,j}$ by substituting the value of each parameter into the expression indicated by this information.
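As an illustration of this closed-form alternative, the sketch below assumes a specific, hypothetical head after layer $L_m$: the flattened feature map `a` feeds one ReLU fully connected layer (`W1`, `b1`) followed by a linear output layer (`W2`). For that head the derivative can be written directly in terms of the connection weights. The actual structure of the model's head is not specified at this level of detail in the description.

```python
import numpy as np


def analytic_gradient(a, W1, b1, W2, c):
    """Closed-form d y^c / d a for y = W2 @ relu(W1 @ a + b1).

    a:  flattened feature map, shape (n,)
    W1: fully connected weights, shape (m, n); b1: biases, shape (m,)
    W2: output-layer weights, shape (num_classes, m)
    Returns the gradient with shape (n,); reshape it to (D_m, H_m, W_m)
    to recover per-pixel values.
    """
    active = (W1 @ a + b1) > 0          # ReLU passes gradient only where active
    return (W2[c] * active) @ W1


rng = np.random.default_rng(1)
a = rng.normal(size=6)
W1 = rng.normal(size=(4, 6))
b1 = rng.normal(size=4)
W2 = rng.normal(size=(2, 4))
g = analytic_gradient(a, W1, b1, W2, c=0)
```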

Next, the control unit 20 uses the function of the acquisition unit 21c to determine whether all channels have been selected (step S120). That is, the control unit 20 determines whether the loop of steps S110 to S120 has been performed for all $D_m$ channels of the feature map at layer $L_m$ for the selected class chosen in step S105. If it is not determined in step S120 that all channels have been selected since the selected class was chosen, the control unit 20 repeats the process from step S110.

If it is determined in step S120 that all channels have been selected since the selected class was chosen, the control unit 20 uses the function of the acquisition unit 21c to combine the feature maps of the channels according to their weights and acquire the evaluation image for the selected class (step S125). Specifically, the control unit 20 acquires the evaluation image 35 for the selected class by linearly combining all channels of the feature map at layer $L_m$, each multiplied by the weight acquired in step S115, using Equation 3:

$$L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\left(\sum_{k} \alpha^c_k A^k\right) \tag{3}$$

$L^c_{\text{Grad-CAM}}$ in Equation 3 is the evaluation image 35 for the selected class. The value of each pixel of $L^c_{\text{Grad-CAM}}$ is an index (contribution degree) of how much the corresponding region of the original image contributed to the discrimination of the selected class. In this embodiment, as shown in FIG. 5, the control unit 20 uses Equation 3 to multiply the feature map of every channel at layer $L_m$ by its corresponding weight and sum the results into a single piece of image data of height $H_m$ × width $W_m$. The control unit 20 then applies the ReLU function, setting every pixel value below 0 in this image data to 0. The parts that contribute to the discrimination of the selected class are considered to be those with positive pixel values, so this step reduces the amount of information from parts that do not contribute to the discrimination of the class.
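The weighted sum followed by ReLU can be sketched with NumPy as follows. The function name `grad_cam` and the (channels, height, width) array layout are illustrative assumptions.

```python
import numpy as np


def grad_cam(feature_maps, weights):
    """Weighted sum of channels followed by ReLU.

    feature_maps: array of shape (D_m, H_m, W_m), one A^k per channel.
    weights:      array of shape (D_m,), one alpha^c_k per channel.
    Returns the (H_m, W_m) evaluation image for class c.
    """
    combined = np.tensordot(weights, feature_maps, axes=1)  # sum over k
    return np.maximum(combined, 0.0)                        # ReLU


# Two channels of 1 x 2 pixels.
fm = np.array([[[1.0, -2.0]], [[3.0, 4.0]]])
print(grad_cam(fm, np.array([1.0, 1.0])))
print(grad_cam(fm, np.array([1.0, -1.0])))
```

In the second call the weighted sum is negative at both pixels, so ReLU clips the whole map to zero.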

FIG. 6 shows the evaluation image 35a acquired in step S125 when the selected class is the vacant class. FIG. 7 shows the evaluation image 35b acquired in step S125 when the selected class is the parking class. In FIGS. 6 and 7, evaluation images 35a and 35b are each rendered as a heat map: the larger a pixel's value, the brighter the color in which it is displayed (in these examples, the whiter it is). By viewing an evaluation image 35, the user can thus intuitively see that regions of the original image 33 corresponding to brighter areas contributed more to the discrimination of the corresponding class.
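One way to render such a heat map is to scale the evaluation image to 8-bit gray levels so that larger values appear whiter. The description does not state how pixel values are mapped to brightness, so the scaling by the image's maximum below is an assumption, as is the function name `to_grayscale`.

```python
import numpy as np


def to_grayscale(evaluation_image):
    """Map non-negative contribution values to 0..255 (larger = whiter).

    Assumption: values are scaled by the image's own maximum.
    """
    peak = evaluation_image.max()
    if peak <= 0:
        # No positive contribution anywhere: render an all-black image.
        return np.zeros(evaluation_image.shape, dtype=np.uint8)
    return np.round(255 * evaluation_image / peak).astype(np.uint8)


print(to_grayscale(np.array([[0.0, 1.0], [3.0, 4.0]])))
```

Scaling each map by its own maximum makes brightness incomparable across classes, which is the difficulty that the comparison image is meant to address.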

Next, the control unit 20 uses the function of the acquisition unit 21c to determine whether all classes that the machine learning model 32 can discriminate (the vacant class and the parking class) have been selected in step S105 (step S130). If it is not determined that all of these classes have been selected, the control unit 20 repeats the process from step S105. In this way, by executing steps S105 to S130, the control unit 20 of this embodiment uses the Grad-CAM technique to acquire, for each class, an evaluation image 35 indicating the contribution degree of each region of the original image 33 to class discrimination.
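The nested loops of steps S105 to S130 can be sketched as follows. The callable `head_gradient(c, k)`, which is assumed to return the (H_m, W_m) array of gradients of $y^c$ with respect to channel k, is a hypothetical interface, and the toy linear head used to exercise it is not part of the embodiment.

```python
import numpy as np


def evaluation_images(feature_maps, head_gradient, num_classes=2):
    """Return one Grad-CAM evaluation image per class."""
    num_channels = feature_maps.shape[0]
    images = []
    for c in range(num_classes):                      # S105: select a class
        weights = np.empty(num_channels)
        for k in range(num_channels):                 # S110: select a channel
            weights[k] = head_gradient(c, k).mean()   # S115: mean gradient
        # S120 satisfied here: every channel has been processed.
        combined = np.tensordot(weights, feature_maps, axes=1)
        images.append(np.maximum(combined, 0.0))      # S125: weighted sum + ReLU
    return images                                     # S130: all classes done


# Toy linear head: the gradient for class c, channel k is head_weights[c, k].
rng = np.random.default_rng(0)
feature_maps = rng.normal(size=(4, 3, 5))
head_weights = rng.normal(size=(2, 4, 3, 5))
maps = evaluation_images(feature_maps, lambda c, k: head_weights[c, k])
```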

If it is determined in step S130 that all classes have been selected, the control unit 20 uses the function of the derivation unit 21d to compare the evaluation image 35a and the evaluation image 35b acquired in step S125 pixel by pixel and derives the comparison image 36 indicating the comparison result (step S135). The judgment criteria data 34 recorded on the recording medium 30 indicates the criterion for comparing the evaluation images 35, and the control unit 20 compares evaluation image 35a with evaluation image 35b on that basis. In this embodiment, the judgment criteria data 34 indicates that pixel values are compared by taking, for each pixel, the difference between the pixel value of the vacant-class evaluation image and that of the parking-class evaluation image.

The details of step S135 are described here with reference to FIG. 8. In step S200, the control unit 20 uses the function of the derivation unit 21d to generate image data of the same size as the evaluation images 35 acquired in step S125 as the comparison image 36 in its initialized state. In this embodiment, image data of height $H_m$ × width $W_m$ is generated and recorded in the RAM. Next, the control unit 20 uses the function of the derivation unit 21d to initialize the indices $i$ and $j$ (step S205). Specifically, the control unit 20 initializes both the index $i$, which indicates the horizontal position of a pixel in the image, and the index $j$, which indicates its vertical position, to 1 and records them in the RAM. Next, the control unit 20 uses the function of the derivation unit 21d to compare the pixel value at position $(i, j)$ in evaluation image 35a with the pixel value at position $(i, j)$ in evaluation image 35b (step S210). Specifically, the control unit 20 determines the magnitude relationship between the two pixel values (greater, smaller, or equal).

Next, the control unit 20 uses the function of the derivation unit 21d to color the pixel at position $(i, j)$ in the comparison image 36 according to the comparison result of step S210 (step S215). In this embodiment, if the pixel value at position $(i, j)$ in evaluation image 35a shown in FIG. 6 is greater than the pixel value at position $(i, j)$ in evaluation image 35b shown in FIG. 7, the control unit 20 colors the pixel at position $(i, j)$ in the comparison image 36 with a first color. More specifically, the control unit 20 sets the value of that pixel to the pixel value corresponding to the first color. In this embodiment the first color is black, but it may be another color such as red, blue, yellow, or green.

If the pixel value at position $(i, j)$ in evaluation image 35a shown in FIG. 6 is smaller than the pixel value at position $(i, j)$ in evaluation image 35b shown in FIG. 7, the control unit 20 colors the pixel at position $(i, j)$ in the comparison image 36 with a second color different from the first color. More specifically, the control unit 20 sets the value of that pixel to the pixel value corresponding to the second color. In this embodiment the second color is white, but it may be another color such as red, blue, yellow, or green.

If the pixel value at position $(i, j)$ in evaluation image 35a shown in FIG. 6 is equal to the pixel value at position $(i, j)$ in evaluation image 35b shown in FIG. 7, the control unit 20 colors the pixel at position $(i, j)$ in the comparison image 36 with a third color. More specifically, the control unit 20 sets the value of that pixel to the pixel value corresponding to the third color. In this embodiment the third color is gray, but it may be another color such as red, blue, yellow, or green.
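The three coloring rules can be expressed compactly on whole arrays. The sketch below is a vectorized equivalent of the per-pixel rules and is not the flowchart procedure itself. The 8-bit gray levels chosen for black, white, and gray, and the function name `comparison_image`, are assumptions.

```python
import numpy as np

# Assumed 8-bit gray levels for the first, second, and third colors.
BLACK, WHITE, GRAY = 0, 255, 128


def comparison_image(eval_vacant, eval_parking):
    """Color each pixel by which evaluation image has the larger value."""
    diff = eval_vacant - eval_parking
    out = np.full(diff.shape, GRAY, dtype=np.uint8)  # equal values -> third color
    out[diff > 0] = BLACK   # vacant-class value larger -> first color
    out[diff < 0] = WHITE   # parking-class value larger -> second color
    return out


a = np.array([[1.0, 0.0], [2.0, 2.0]])
b = np.array([[0.0, 1.0], [2.0, 3.0]])
print(comparison_image(a, b))
```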

Next, the control unit 20, using the function of the derivation unit 21d, determines whether the index i is equal to or greater than W_m (step S220). If the index i is not determined to be equal to or greater than W_m in step S220, the control unit 20 increments the index i (step S230) and repeats the processing from step S210 onward.

If the index i is determined to be equal to or greater than W_m in step S220, the control unit 20, using the function of the derivation unit 21d, resets the index i to 1 (step S225). Next, the control unit 20 determines whether the index j is equal to or greater than H_m (step S235). If the index j is determined to be less than H_m in step S235, the control unit 20 increments the index j (step S240) and repeats the processing from step S210 onward. If, on the other hand, the index j is determined to be equal to or greater than H_m in step S235, the control unit 20 regards the comparison image 36 as complete, ends the processing of FIG. 8, and executes step S140 and the subsequent steps shown in FIG. 3. This processing generates a comparison image 36 consisting of W_m pixels horizontally and H_m pixels vertically, in which each pixel is colored according to the comparison result. FIG. 9 shows the comparison image 36 generated by the processing of FIG. 8.
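The loop of steps S210 to S240 can be sketched as follows. This is a minimal illustration, not the implementation in the specification: the function name `make_comparison_image`, the use of NumPy arrays for the evaluation images 35a and 35b, and the concrete RGB triples are all assumptions made for the example.

```python
import numpy as np

# Hypothetical RGB values for the three colors named in the embodiment.
BLACK = (0, 0, 0)        # first color: value in 35a > value in 35b
WHITE = (255, 255, 255)  # second color: value in 35a < value in 35b
GRAY = (128, 128, 128)   # third color: values are equal


def make_comparison_image(eval_a: np.ndarray, eval_b: np.ndarray) -> np.ndarray:
    """Return an (H_m, W_m, 3) uint8 image colored by per-pixel comparison."""
    if eval_a.shape != eval_b.shape:
        raise ValueError("evaluation images must have the same shape")
    h_m, w_m = eval_a.shape
    comparison = np.empty((h_m, w_m, 3), dtype=np.uint8)
    # The flowchart uses 1-based indices i (horizontal) and j (vertical);
    # these loops are the 0-based equivalent.
    for j in range(h_m):
        for i in range(w_m):
            a, b = eval_a[j, i], eval_b[j, i]
            if a > b:
                comparison[j, i] = BLACK
            elif a < b:
                comparison[j, i] = WHITE
            else:
                comparison[j, i] = GRAY
    return comparison
```

The explicit double loop mirrors the flowchart; a vectorized version using boolean masks would give the same result.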

The comparison image 36 is divided into black, white, and gray areas. In the example of FIG. 9, the comparison image 36 is divided into one black area 361, multiple white areas 362, and multiple gray areas 363.

A black area in the comparison image 36 corresponds to an area of the original image 33 that contributes more to the discrimination of the vacant class than to the discrimination of the parking class. In the example of FIG. 9, when the comparison image 36 and the original image 33 of FIG. 4 are resized to the same size and overlaid, the area of the original image 33 that overlaps area 361 contributes more to the discrimination of the vacant class than to that of the parking class. A white area in the comparison image 36 corresponds to an area of the original image 33 that contributes more to the discrimination of the parking class than to the discrimination of the vacant class. In the example of FIG. 9, the area of the original image 33 that overlaps area 362 contributes more to the discrimination of the parking class than to that of the vacant class. A gray area in the comparison image 36 corresponds to an area of the original image 33 whose contribution to the discrimination of the parking class equals its contribution to the discrimination of the vacant class. In the example of FIG. 9, the area of the original image 33 that overlaps area 363 can be regarded as having contributed to the discrimination of both the vacant class and the parking class.

By viewing such a comparison image 36, the user can more easily understand which class each area of the original image 33 contributed to more. For example, with the comparison image 36 of FIG. 9, the user can check the black area 361, the white areas 362, and the gray areas 363 to identify the areas that contributed more to the discrimination of the vacant class, the areas that contributed more to the discrimination of the parking class, and the areas that contributed equally to both. The user can also see intuitively that the black area 361 is large in the comparison image 36, and therefore that much of the image contributes more to the discrimination of the vacant class.

Returning to the flowchart of FIG. 3: in step S140, the control unit 20, using the function of the display control unit 21e, displays the comparison image 36 derived in step S135 on the display unit 40, side by side with the original image 33 and the evaluation images 35 acquired in step S135. FIG. 10 shows the screen displayed on the display unit 40 in step S140. The control unit 20 resizes the comparison image 36, the evaluation images 35a and 35b, and the original image 33 to the same size before displaying them. The control unit 20 displays the evaluation images 35 as heat map images. In doing so, the control unit 20 normalizes the pixel values of the heat map images for the respective classes on a common scale. That is, pixels having equal pixel values in different evaluation images 35 are displayed in the same color.
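The specification does not give the normalization formula. One way to obtain the stated property (equal pixel values in different evaluation images map to the same color) is a min-max scaling that shares a single minimum and maximum across all classes. The sketch below assumes that interpretation; `normalize_jointly` is a hypothetical name.

```python
import numpy as np


def normalize_jointly(eval_images):
    """Scale all class maps to [0, 1] using one min/max shared across classes.

    Assumption: the maps have identical shapes. Because the scale is shared,
    equal raw values in different maps receive equal normalized values.
    """
    stacked = np.stack([np.asarray(m, dtype=np.float64) for m in eval_images])
    lo, hi = stacked.min(), stacked.max()
    if hi == lo:
        # Constant input: nothing to distinguish, return all zeros.
        return [np.zeros_like(m) for m in stacked]
    return [(m - lo) / (hi - lo) for m in stacked]
```

With a shared scale, one very large value in a single class map compresses every other value toward zero, so the remaining areas of all maps are rendered in dark colors.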

As a result, by viewing a comparison image 36 of the same size as the original image 33, the user can more easily determine which area of the original image 33 each area of the comparison image 36 corresponds to, and consequently which area of the original image contributed more to the discrimination of which class. The control unit 20 may also display the comparison image 36 side by side with the original image 33. This lets the user compare the original image 33 with the comparison image 36 and more easily determine which area of the original image 33 each area of the comparison image 36 corresponds to.

With the above configuration, the information processing device 10 acquires, for each class, a contribution degree indicating the degree to which a region of the original image 33 input to the machine learning model 32 contributed to class discrimination by the machine learning model 32, and derives a comparison image 36 indicating the result of comparing the acquired contribution degrees between classes. A comparison image 36 showing, for each region of the original image 33, the result of comparing the per-class contribution degrees can thus be derived. When this comparison image 36 is presented to the user, the user can easily understand which class each region of the original image 33 contributed to more. In other words, the information processing device 10 can assist the user in this understanding.
Furthermore, because the information processing device 10 displays the derived comparison image 36 on the display unit 40, the user can grasp the information represented by the comparison image 36 at a glance.

In addition, the information processing device 10 colors the pixel at position (i, j) in the comparison image 36 in different colors depending on whether the pixel value at position (i, j) in the evaluation image 35a is greater than or smaller than the pixel value at position (i, j) in the evaluation image 35b. This allows the information processing device 10 to display the comparison image 36 in a way that makes it easier for the user to understand which class each area of the original image 33 contributes to more.

In addition, the information processing device 10 colors the pixel at position (i, j) in the comparison image 36 with a predetermined color if the pixel value at position (i, j) in the evaluation image 35a is equal to the pixel value at position (i, j) in the evaluation image 35b. This allows the information processing device 10 to display the comparison image 36 in a way that makes it easier for the user to identify the areas of the original image 33 whose contribution to class discrimination is equal for each class.

Conventionally, displaying heat maps with Grad-CAM has had the following problem: the pixel values of the heat maps for the respective classes are normalized, and this normalization can make the areas that should stand out in a heat map inconspicuous. Once the pixel values are normalized, pixels having equal pixel values in different heat maps are displayed in the same color. After this normalization, the pixel values of some areas may be far larger than those of other areas. In that case only those areas are displayed in bright colors, and the other areas are displayed in dark colors. Even if the other areas include areas that served as a basis for class discrimination, those areas no longer stand out.

Area 701 in the evaluation image 35b for the parking class has pixel values far larger than those of every area of the evaluation image 35 for the vacant class and of the other areas of the evaluation image for the parking class. Therefore, even though the original image 33 is an image of a vacant state, the heat map image of the evaluation image 35a for the vacant class is dark overall, as shown in FIGS. 6 and 10. In other words, because the heat maps for the classes are normalized so that the Grad-CAM values can be evaluated under the same conditions, the areas that contributed greatly to the determination of the vacant class are buried. As a result, it becomes difficult to identify the areas that served as the basis for determining the vacant class.
In contrast, the comparison image 36 derived by the information processing device 10 of this embodiment is obtained from the result of comparing, region by region, the contribution degrees to the discrimination of each class, so it is unlikely to fail to show the parts corresponding to the areas on which the class discrimination was based. In other words, the information processing device 10 can reduce the likelihood that the areas on which the class discrimination was based become difficult to identify.
In addition, in a Grad-CAM heat map, among the areas that contributed to class discrimination, pixels with relatively large pixel values (degrees of contribution) are displayed in bright colors and the other pixels in darker colors. That is, only areas with a relatively large degree of contribution stand out. In contrast, the method of this embodiment can show the areas that contributed to class discrimination regardless of the magnitude of their contribution.

(3) Other embodiments:
The above embodiment is one example of implementing the present invention. Various other embodiments can be adopted as long as a contribution degree, indicating the degree to which a region of an image input to a machine learning model contributed to class discrimination by the machine learning model, is acquired for each class and comparison information indicating the result of comparing the contribution degrees between classes is derived. For example, an external device may provide the function of training the machine learning model. An external device may also provide the function of displaying the derived comparison image.

In the above embodiment, in step S215, the control unit 20 colors the corresponding pixel with the first color when the comparison in step S210 shows that the pixel value for the vacant class is greater than the pixel value for the parking class. However, even in that case, the control unit 20 may color the corresponding pixel in different colors depending on the magnitude of the difference between the two pixel values. For example, the control unit 20 may color the pixel red when (pixel value for the vacant class - pixel value for the parking class) is equal to or greater than a predetermined threshold, and pink when it is less than the predetermined threshold. The control unit 20 may also color the pixel with a density corresponding to the magnitude of (pixel value for the vacant class - pixel value for the parking class).

Likewise, in the above embodiment, in step S215, the control unit 20 colors the corresponding pixel with the second color when the comparison in step S210 shows that the pixel value for the vacant class is less than the pixel value for the parking class. However, even in that case, the control unit 20 may color the corresponding pixel in different colors depending on the magnitude of the difference between the two pixel values. For example, the control unit 20 may color the pixel blue when (pixel value for the parking class - pixel value for the vacant class) is equal to or greater than a predetermined threshold, and light blue when it is less than the predetermined threshold. The control unit 20 may also color the pixel with a density corresponding to the magnitude of (pixel value for the parking class - pixel value for the vacant class).
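The red/pink and blue/light-blue variation can be sketched as below. The function name `color_by_difference`, the RGB triples, and the choice of gray for equal pixel values (carried over from the main embodiment) are assumptions for illustration; the threshold is assumed to be positive.

```python
import numpy as np

# Hypothetical RGB triples for the colors named in the text.
RED, PINK = (255, 0, 0), (255, 192, 203)
BLUE, LIGHT_BLUE = (0, 0, 255), (173, 216, 230)
GRAY = (128, 128, 128)  # equal values, as in the main embodiment


def color_by_difference(vacant, parking, threshold):
    """Color each pixel by the sign and magnitude of (vacant - parking)."""
    diff = np.asarray(vacant, dtype=np.float64) - np.asarray(parking, dtype=np.float64)
    out = np.empty(diff.shape + (3,), dtype=np.uint8)
    out[diff == 0] = GRAY
    # Vacant class dominates: strong difference is red, weak is pink.
    out[(diff > 0) & (diff >= threshold)] = RED
    out[(diff > 0) & (diff < threshold)] = PINK
    # Parking class dominates: strong difference is blue, weak is light blue.
    out[(diff < 0) & (-diff >= threshold)] = BLUE
    out[(diff < 0) & (-diff < threshold)] = LIGHT_BLUE
    return out
```

The density-based variation mentioned in the text could replace the two fixed colors per sign with an intensity proportional to the absolute difference.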

Also, in the above embodiment, the control unit 20 compares the pixel value for the vacant class with the pixel value for the parking class in step S210. As the comparison result, the control unit 20 identifies which of the following holds: (pixel value for the vacant class > pixel value for the parking class), (pixel value for the vacant class < pixel value for the parking class), or (pixel value for the vacant class = pixel value for the parking class). However, in step S210 the control unit 20 may instead identify which of the following holds: (pixel value for the vacant class > (pixel value for the parking class + predetermined threshold)), ((pixel value for the vacant class + predetermined threshold) < pixel value for the parking class), or (|pixel value for the vacant class - pixel value for the parking class| ≦ predetermined threshold). In that case, in step S215 the control unit 20 may color the corresponding pixel in a different color for each of these three cases.
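The three threshold-based cases partition every pixel into exactly one category, which a short sketch makes explicit. The function name `classify_with_tolerance` and the integer labels +1, -1, and 0 are illustrative assumptions; the text only requires that the three cases be distinguished.

```python
import numpy as np


def classify_with_tolerance(vacant, parking, threshold):
    """Label each pixel: +1 vacant dominant, -1 parking dominant, 0 within tolerance.

    vacant > parking + t   is equivalent to  diff > t
    vacant + t < parking   is equivalent to  diff < -t
    |vacant - parking| <= t  covers everything else
    """
    diff = np.asarray(vacant, dtype=np.float64) - np.asarray(parking, dtype=np.float64)
    labels = np.zeros(diff.shape, dtype=np.int8)
    labels[diff > threshold] = 1
    labels[diff < -threshold] = -1
    return labels
```

Compared with the exact-equality test of the main embodiment, a tolerance band keeps near-equal floating-point contributions from being split arbitrarily between the first and second colors.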

Also, in the above embodiment, in step S215, the control unit 20 colors the corresponding pixel with the third color when the comparison in step S210 shows that (pixel value for the vacant class = pixel value for the parking class). However, the control unit 20 may leave the corresponding pixel uncolored in that case. For example, the control unit 20 may set the corresponding pixel to transparent or set it to a null value.

Also, in the above embodiment, the machine learning model 32 is a model used to discriminate classes relating to whether or not a vehicle is parked (the vacant class and the parking class). However, the classification targets of the machine learning model are not limited to this example. The machine learning model 32 may be a model used to discriminate other kinds of classes. For example, the machine learning model 32 may be a model that determines whether the class of an input image is a dog class, indicating a dog, or a cat class, indicating a cat.

Also, in the above embodiment, the machine learning model 32 is a model that determines which of two classes an input image belongs to. However, the form of the model output is not limited to this. The machine learning model 32 may be a model that determines which of three or more classes an input image belongs to, for example which of three classes. In that case, the information processing device 10 may acquire an evaluation image 35 for each of the three classes (referred to as class A, class B, and class C), for example by the processing of steps S105 to S130.

Then, in step S210, the control unit 20 may compare the pixel values at position (i, j) in the evaluation images 35 for the three classes, and in step S215 color the pixel at position (i, j) in the comparison image 36 according to the comparison result. For example, the control unit 20 may use a different color for each of the following cases: the pixel value for class A is the largest, the pixel value for class B is the largest, the pixel value for class C is the largest, and the pixel values for classes A to C are all equal. Alternatively, the control unit 20 may use a different color for each ordering of the pixel values, for example (class A > class B > class C), (class A > class C > class B), (class B > class A > class C), (class B > class C > class A), (class C > class A > class B), (class C > class B > class A), and (class C = class A = class B), where each class name stands for the pixel value for that class.
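The "largest class wins" variation for three or more classes can be sketched as follows. The function name `dominant_class_map` is hypothetical. The text only names the case where all classes are equal; treating any tie for the maximum as a single "tie" label (-1) is an assumption made here to keep the example total.

```python
import numpy as np


def dominant_class_map(eval_images):
    """Return, per pixel, the index of the class with the strictly largest value.

    Pixels where two or more classes share the maximum are labeled -1
    (an assumption; the text only describes the all-equal case).
    """
    stacked = np.stack([np.asarray(m, dtype=np.float64) for m in eval_images])
    top = stacked.max(axis=0)
    winners = stacked.argmax(axis=0)
    tie = (stacked == top).sum(axis=0) > 1
    winners[tie] = -1
    return winners
```

Each label can then be mapped to its own color, in the same way the two-class embodiment maps its three comparison outcomes to the first, second, and third colors.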

Also, in the above embodiment, the control unit 20 derives the comparison image 36 using the function of the derivation unit 21d. However, the output form of the comparison result is not limited to a comparison image. For example, the control unit 20 may, using the function of the derivation unit 21d, derive information indicating the proportion of each of the following: areas of the original image 33 whose contribution to the discrimination of one of the two classes is greater than their contribution to the discrimination of the other class, areas of the original image 33 whose contribution to the discrimination of the other class is greater than their contribution to the discrimination of the one class, and areas of the original image 33 whose contributions to the discrimination of the two classes are equal.

More specifically, the control unit 20 may determine the proportions of the comparison image 36 occupied by the areas colored with the first color, the areas colored with the second color, and the areas colored with the third color. The control unit 20 may then derive information indicating the determined proportions (for example, text information or graph information such as a bar graph or pie chart) and, acting as the display control unit 21e, display the derived information on the display unit 40. This lets the user more easily understand the proportions of the areas that contributed more to the discrimination of the vacant class, the areas that contributed more to the discrimination of the parking class, and the areas that contributed equally to both.
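Computing the proportions reduces to counting pixels in each comparison category. The sketch below works directly on the two evaluation images rather than on the colored comparison image, which yields the same counts; the function name `region_ratios` and the dictionary keys are illustrative assumptions.

```python
import numpy as np


def region_ratios(vacant, parking):
    """Return the fraction of pixels in each of the three comparison categories."""
    vacant = np.asarray(vacant)
    parking = np.asarray(parking)
    total = vacant.size
    return {
        "vacant_dominant": np.count_nonzero(vacant > parking) / total,
        "parking_dominant": np.count_nonzero(vacant < parking) / total,
        "equal": np.count_nonzero(vacant == parking) / total,
    }
```

The resulting dictionary could be rendered as text or passed to a charting library to produce the bar graph or pie chart mentioned in the text.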

Also, in the above embodiment, as information indicating the contribution of each region of the original image 33 to the discrimination of the selected class, the control unit 20 acquires an evaluation image 35 obtained by combining the channels of the feature map in layer L_m, the final convolutional layer, by a method similar to Grad-CAM. However, the control unit 20 may acquire other information as the information indicating this contribution. For example, the control unit 20 may acquire an image obtained by the CAM (Class Activation Mapping) method as the information indicating the contribution to the discrimination of the selected class.

As another example, the control unit 20 may acquire, from the feature map of an intermediate convolutional layer, an evaluation image 35 in which each pixel indicates the contribution to class discrimination. For example, instead of the feature map of layer L_m, the final convolutional layer, the control unit 20 may take an intermediate convolutional layer (a layer closer to the input side than layer L_m) as layer L_l and perform the same processing as in the above embodiment using the feature map of layer L_l. In that case, the control unit 20 uses the horizontal and vertical sizes of the feature map of layer L_l in place of W_m and H_m. In this case too, the control unit 20 may resize the comparison image 36 to the same size as the input original image 33.

Also, in the above embodiment, the control unit 20 obtains the information indicating the contribution of each region of the original image 33 to the discrimination of the selected class, and the comparison information indicating the result of comparing the contribution degrees between classes, in image form as the evaluation images 35 and the comparison image 36. However, the control unit 20 may obtain this information in a format other than an image (for example, CSV data, array data, or text data).

The contribution degree need only be the degree to which a region of an input image input to a machine learning model contributed to class discrimination by that model. Accordingly, the contribution degree may be an index indicating the magnitude of the influence on the output-layer node value corresponding to a class, and various other definitions may also be used. For example, when the image is divided into multiple regions, the contribution degree may be the average of the gradient values in each region.
The derivation unit need only derive, as the result of comparing the contribution degrees between classes, information obtained by that comparison. For example, the derivation unit may derive information indicating the magnitude relationship between the per-class contribution degrees, or information indicating the difference between the contribution degrees for each class and each region.
The first, second, and third colors need only differ from one another. For example, they may be colors that are easy to tell apart (for example, colors that differ markedly in saturation, lightness, or hue).

The technique of the present invention can also be applied as a program or a method. It can be modified as appropriate, for example by implementing part of it in software and part in hardware. The invention also holds as a recording medium storing a program that controls the device. Naturally, the recording medium for the software may be a magnetic recording medium or a semiconductor memory, and exactly the same reasoning applies to any recording medium developed in the future.

10: information processing device, 20: control unit, 21: derivation program, 21a: learning unit, 21b: discrimination unit, 21c: acquisition unit, 21d: derivation unit, 21e: display control unit, 30: recording medium, 31: teacher data, 32: machine learning model, 33: original image, 34: criteria data, 35: evaluation image, 36: comparison image, 40: display unit

Claims

An information processing device comprising:
an acquisition unit that acquires, as a pixel value for each class, a contribution degree indicating the degree to which an area of an input image contributed to class determination by a machine learning model, the input image being input to the machine learning model, and the machine learning model being used to determine which of two classes the input image belongs to, the two classes being a vacant vehicle class indicating an image of a vacant space and a parking class indicating an image of a parked vehicle; and
a derivation unit that derives, as comparison information indicating a result of comparing the contribution degrees acquired by the acquisition unit for the respective classes, an image in which corresponding pixels are colored with different colors according to:
the magnitude of the value obtained by subtracting the pixel value for the parking class from the pixel value for the vacant vehicle class, when the pixel value for the vacant vehicle class is greater than the pixel value for the parking class; and
the magnitude of the value obtained by subtracting the pixel value for the vacant vehicle class from the pixel value for the parking class, when the pixel value for the vacant vehicle class is less than the pixel value for the parking class.

The information processing device according to claim 1, further comprising a display control unit that displays the comparison information derived by the derivation unit.

The information processing device according to claim 1 or 2, wherein the acquisition unit acquires each pixel value of an image derived by Grad-CAM as the contribution degree.

The information processing device according to any one of claims 1 to 3, wherein the derivation unit further derives information indicating the respective proportions of the input image occupied by:
areas in which the contribution degree to the discrimination of one class is greater than the contribution degree to the discrimination of the other class;
areas in which the contribution degree to the discrimination of the other class is greater than the contribution degree to the discrimination of the one class; and
areas in which the contribution degree to the discrimination of the one class is equal to the contribution degree to the discrimination of the other class.

An information processing method executed by an information processing device, the method comprising:
a step of acquiring, as a pixel value for each class, a contribution degree indicating the degree to which an area of an input image contributed to class determination by a machine learning model, the input image being input to the machine learning model, and the machine learning model being used to determine which of two classes the input image belongs to, the two classes being a vacant vehicle class indicating an image of a vacant space and a parking class indicating an image of a parked vehicle; and
a step of deriving, as comparison information indicating a result of comparing the acquired contribution degrees for the respective classes, an image in which corresponding pixels are colored with different colors according to:
the magnitude of the value obtained by subtracting the pixel value for the parking class from the pixel value for the vacant vehicle class, when the pixel value for the vacant vehicle class is greater than the pixel value for the parking class; and
the magnitude of the value obtained by subtracting the pixel value for the vacant vehicle class from the pixel value for the parking class, when the pixel value for the vacant vehicle class is less than the pixel value for the parking class.

A program that causes a computer to function as:
an acquisition unit that acquires, as a pixel value for each class, a contribution degree indicating the degree to which an area of an input image contributed to class determination by a machine learning model, the input image being input to the machine learning model, and the machine learning model being used to determine which of two classes the input image belongs to, the two classes being a vacant vehicle class indicating an image of a vacant space and a parking class indicating an image of a parked vehicle; and
a derivation unit that derives, as comparison information indicating a result of comparing the contribution degrees acquired by the acquisition unit for the respective classes, an image in which corresponding pixels are colored with different colors according to:
the magnitude of the value obtained by subtracting the pixel value for the parking class from the pixel value for the vacant vehicle class, when the pixel value for the vacant vehicle class is greater than the pixel value for the parking class; and
the magnitude of the value obtained by subtracting the pixel value for the vacant vehicle class from the pixel value for the parking class, when the pixel value for the vacant vehicle class is less than the pixel value for the parking class.