JP7351186B2

JP7351186B2 - Analysis equipment, analysis program and analysis method

Info

Publication number: JP7351186B2
Application number: JP2019200863A
Authority: JP
Inventors: 智規久保田; 康之村田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2019-11-05
Filing date: 2019-11-05
Publication date: 2023-09-27
Anticipated expiration: 2039-11-05
Also published as: US11663487B2; US20210133481A1; JP2021076924A

Description

本発明は、解析装置、解析プログラム及び解析方法に関する。 The present invention relates to an analysis device, an analysis program, and an analysis method.

近年、ＣＮＮ（Convolutional Neural Network）を用いた画像認識処理において、誤ったラベルが推論された場合の誤推論の原因を解析する解析技術が提案されている。一例として、スコア最大化法（Activation Maximization）が挙げられる。また、画像認識処理において推論時に注目される画像箇所を解析する解析技術が提案されている。一例として、ＢＰ（Back Propagation）法、ＧＢＰ（Guided Back Propagation）法等が挙げられる。 In recent years, analysis techniques have been proposed for analyzing the cause of erroneous inference when an erroneous label is inferred in image recognition processing using CNN (Convolutional Neural Network). An example is the score maximization method (activation maximization). Furthermore, an analysis technique has been proposed for analyzing image parts that are of interest during inference in image recognition processing. Examples include the BP (Back Propagation) method and the GBP (Guided Back Propagation) method.

スコア最大化法は、推論の正解ラベルが最大スコアとなるように入力画像を変更した際の変更部分を、誤推論の原因となる画像箇所として特定する方法である。また、ＢＰ法やＧＢＰ法は、推論したラベルから逆伝播し、入力画像までたどることで、推論の際に反応した特徴部分を可視化する方法である。 The score maximization method is a method in which a changed part when an input image is changed so that the correct label for inference has the maximum score is identified as an image part that causes incorrect inference. Furthermore, the BP method and the GBP method are methods for visualizing characteristic parts that reacted during inference by backpropagating from the inferred label and tracing it to the input image.

JP 2018-097807 A
JP 2018-045350 A
Ramprasaath R. Selvaraju, et al.: Grad-CAM: Visual explanations from deep networks via gradient-based localization. The IEEE International Conference on Computer Vision (ICCV), pp. 618-626, 2017.

しかしながら、上述した解析技術の場合、誤推論の原因となる画像箇所を十分な精度で特定することができないという問題がある。 However, in the case of the above-mentioned analysis technique, there is a problem in that it is not possible to specify with sufficient accuracy the image location that causes the incorrect inference.

一つの側面では、誤推論の原因となる画像箇所を特定する際の精度を向上させることを目的としている。 One aspect of the invention is to improve accuracy in identifying image locations that cause erroneous inferences.

According to one aspect, an analysis device includes:
a refined image generation unit that, using the difference between an incorrect inference image (an image for which an incorrect label is inferred during image recognition processing) and a refined image, generates from the incorrect inference image a refined image in which the score of the correct label of the inference is maximized;
a map generation unit that generates a third map indicating the importance of each pixel for inferring the correct label, by superimposing a first map, which indicates which of the pixels of the incorrect inference image were changed when the score-maximized refined image was generated, and a second map, which indicates the degree of attention paid at inference time to each of the pixels of the score-maximized refined image and which has been adjusted based on the frequency of appearance of each degree of attention; and
a specifying unit that identifies the set of pixels that causes the incorrect inference by calculating the pixel values of the third map for each set of pixels in the incorrect inference image,
wherein, when the specifying unit has identified the set of pixels, the refined image generation unit corrects the difference for the region of the identified set of pixels and, using the corrected difference, regenerates from the incorrect inference image a refined image in which the score of the correct label of the inference is maximized.

誤推論の原因となる画像箇所を特定する際の精度を向上させることができる。 It is possible to improve the accuracy in identifying image locations that cause incorrect inferences.

A diagram showing an example of the functional configuration of the analysis device.
A diagram showing an example of the hardware configuration of the analysis device.
A first diagram showing an example of the functional configuration of the incorrect inference cause extraction unit.
A diagram showing a specific example of processing by the refined image generation unit.
A diagram showing a specific example of processing by the map generation unit.
A diagram showing details of the functional configuration of the important feature map generation unit.
A diagram showing a specific example of processing by the selective back-error propagation unit.
A diagram showing the processing contents of the non-attention pixel offset adjustment unit.
A diagram showing a specific example of processing by the non-attention pixel offset adjustment unit.
A diagram showing a specific example of processing by the superpixel division unit.
A diagram showing a specific example of processing by the important superpixel determination unit.
A diagram showing a specific example of processing by the region extraction unit and the combining unit.
A first flowchart showing the flow of the incorrect inference cause extraction processing.
A second flowchart showing the flow of the incorrect inference cause extraction processing.
A flowchart showing the flow of the important feature map generation processing.
A first diagram showing a specific example of the incorrect inference cause extraction processing.
A second diagram showing an example of the functional configuration of the incorrect inference cause extraction unit.
A second diagram showing a specific example of the incorrect inference cause extraction processing.
A third diagram showing an example of the functional configuration of the incorrect inference cause extraction unit.
A third diagram showing a specific example of the incorrect inference cause extraction processing.
A fourth diagram showing an example of the functional configuration of the incorrect inference cause extraction unit.
A first diagram showing an example of the functional configuration of the detailed cause analysis unit.
A first diagram showing a specific example of processing by the detailed cause analysis unit.
A first flowchart showing the flow of the detailed cause analysis processing.
A fifth diagram showing an example of the functional configuration of the incorrect inference cause extraction unit.
A second diagram showing an example of the functional configuration of the detailed cause analysis unit.
A second diagram showing a specific example of processing by the detailed cause analysis unit.
A second flowchart showing the flow of the detailed cause analysis processing.

以下、各実施形態について添付の図面を参照しながら説明する。なお、本明細書及び図面において、実質的に同一の機能構成を有する構成要素については、同一の符号を付することにより重複した説明を省略する。 Each embodiment will be described below with reference to the accompanying drawings. Note that, in this specification and the drawings, components having substantially the same functional configuration are designated by the same reference numerals, thereby omitting redundant explanation.

[First embodiment]
<Functional configuration of analysis device>
First, the functional configuration of the analysis device according to the first embodiment will be described. FIG. 1 is a diagram showing an example of the functional configuration of the analysis device. An analysis program is installed in the analysis device 100, and when the program is executed, the analysis device 100 functions as an inference unit 110, an incorrect inference image extraction unit 120, and an incorrect inference cause extraction unit 140.

The inference unit 110 performs image recognition processing using a trained CNN. Specifically, when the input image 10 is input, the inference unit 110 infers a label indicating the type of object included in the input image 10 (in this embodiment, the type of vehicle) and outputs the inferred label.

The incorrect inference image extraction unit 120 determines whether the (known) label indicating the type of object included in the input image 10 matches the label inferred by the inference unit 110. The incorrect inference image extraction unit 120 also extracts each input image for which the labels are determined not to match (that is, for which an incorrect label was inferred) as an "incorrect inference image" and stores it in the incorrect inference image storage unit 130.
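The extraction step described above can be sketched as a simple filter. This is a minimal illustration, not the patented implementation: the function name `extract_incorrect_inference_images`, the image identifiers, and the dictionary used as a stand-in for the trained CNN are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def extract_incorrect_inference_images(
    samples: Sequence[Tuple[str, str]],
    infer: Callable[[str], str],
) -> List[str]:
    """Return the ids of images whose inferred label differs from the known label."""
    incorrect = []
    for image_id, known_label in samples:
        # An image is an "incorrect inference image" when the labels do not match.
        if infer(image_id) != known_label:
            incorrect.append(image_id)
    return incorrect


# Hypothetical stand-in for the inference unit: a fixed lookup table.
fake_predictions = {"img0": "A", "img1": "B", "img2": "A"}
samples = [("img0", "A"), ("img1", "A"), ("img2", "B")]
incorrect_images = extract_incorrect_inference_images(samples, fake_predictions.get)
```

Here `incorrect_images` plays the role of the contents of the incorrect inference image storage unit 130.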

The incorrect inference cause extraction unit 140 identifies, for an incorrect inference image, the image location that causes the incorrect inference, and outputs incorrect inference cause information. Specifically, the incorrect inference cause extraction unit 140 includes a refined image generation unit 141, a map generation unit 142, and a specifying unit 143.

リファイン画像生成部１４１は、誤推論画像格納部１３０に格納された誤推論画像を読み出す。また、リファイン画像生成部１４１は、読み出した誤推論画像から、推論の正解ラベルのスコアを最大化させたスコア最大化リファイン画像を生成する。 The refined image generation unit 141 reads out the incorrect inference image stored in the incorrect inference image storage unit 130. Further, the refined image generation unit 141 generates a score-maximizing refined image in which the score of the correct inference label is maximized from the read incorrect inference image.

マップ生成部１４２は、誤推論の原因を解析する既知の解析技術等を用いて、ラベルの推論に影響する領域を識別するマップを生成する。 The map generation unit 142 generates a map that identifies areas that influence label inference using known analysis techniques that analyze the causes of erroneous inference.

特定部１４３は、誤推論画像のうち、生成されたマップに含まれる、ラベルの推論に影響する領域について、生成されたリファイン画像で置き換える。また、特定部１４３は、当該領域をリファイン画像で置き換えた誤推論画像を入力としてラベルを推論し、推論したラベルのスコアから、置き換えの効果を判定する。 The specifying unit 143 replaces a region of the incorrect inference image that is included in the generated map and that affects label inference with the generated refined image. Further, the specifying unit 143 infers a label by inputting the incorrect inference image in which the region is replaced with a refined image, and determines the effect of the replacement from the score of the inferred label.

また、特定部１４３は、ラベルの推論に影響する領域の大きさを変えながら入力することでラベルを推論し、推論したラベルのスコアから、誤推論の原因となる画像箇所を特定する。更に、特定部１４３は、特定した誤推論の原因となる画像箇所を、誤推論原因情報として出力する。 Further, the specifying unit 143 infers a label by inputting data while changing the size of a region that affects inference of a label, and specifies an image location that causes incorrect inference from the score of the inferred label. Further, the specifying unit 143 outputs the identified image portion that causes the erroneous inference as erroneous inference cause information.

このように、ラベルの推論に影響する領域をリファイン画像で置き換える際、置き換えの効果を参照することで、誤推論の原因となる画像箇所を精度よく特定することができる。 In this way, when replacing an area that affects label inference with a refined image, by referring to the effect of the replacement, it is possible to accurately identify the image location that causes incorrect inference.

<Hardware configuration of analysis device>
Next, the hardware configuration of the analysis device 100 will be described. FIG. 2 is a diagram showing an example of the hardware configuration of the analysis device. As shown in FIG. 2, the analysis device 100 includes a CPU (Central Processing Unit) 201, a ROM (Read Only Memory) 202, and a RAM (Random Access Memory) 203. The CPU 201, the ROM 202, and the RAM 203 form a so-called computer.

また、解析装置１００は、補助記憶装置２０４、表示装置２０５、操作装置２０６、Ｉ／Ｆ（Interface）装置２０７、ドライブ装置２０８を有する。なお、解析装置１００の各ハードウェアは、バス２０９を介して相互に接続されている。 The analysis device 100 also includes an auxiliary storage device 204, a display device 205, an operating device 206, an I/F (Interface) device 207, and a drive device 208. Note that each piece of hardware in the analysis device 100 is interconnected via a bus 209.

ＣＰＵ２０１は、補助記憶装置２０４にインストールされている各種プログラム（例えば、解析プログラム等）を実行する演算デバイスである。なお、図２には示していないが、演算デバイスとしてアクセラレータ（例えば、ＧＰＵ（Graphics Processing Unit）など）を組み合わせてもよい。 The CPU 201 is a calculation device that executes various programs (for example, an analysis program, etc.) installed in the auxiliary storage device 204. Note that although not shown in FIG. 2, an accelerator (for example, a GPU (Graphics Processing Unit), etc.) may be combined as a calculation device.

ＲＯＭ２０２は、不揮発性メモリである。ＲＯＭ２０２は、補助記憶装置２０４にインストールされている各種プログラムをＣＰＵ２０１が実行するために必要な各種プログラム、データ等を格納する主記憶デバイスとして機能する。具体的には、ＲＯＭ２０２はＢＩＯＳ（Basic Input/Output System）やＥＦＩ（Extensible Firmware Interface）等のブートプログラム等を格納する、主記憶デバイスとして機能する。 ROM202 is a nonvolatile memory. The ROM 202 functions as a main storage device that stores various programs, data, etc. necessary for the CPU 201 to execute various programs installed in the auxiliary storage device 204 . Specifically, the ROM 202 functions as a main storage device that stores boot programs such as BIOS (Basic Input/Output System) and EFI (Extensible Firmware Interface).

ＲＡＭ２０３は、ＤＲＡＭ（Dynamic Random Access Memory）やＳＲＡＭ（Static Random Access Memory）等の揮発性メモリである。ＲＡＭ２０３は、補助記憶装置２０４にインストールされている各種プログラムがＣＰＵ２０１によって実行される際に展開される作業領域を提供する、主記憶デバイスとして機能する。 The RAM 203 is a volatile memory such as DRAM (Dynamic Random Access Memory) or SRAM (Static Random Access Memory). The RAM 203 functions as a main storage device that provides a work area in which various programs installed in the auxiliary storage device 204 are expanded when executed by the CPU 201 .

補助記憶装置２０４は、各種プログラムや、各種プログラムが実行される際に用いられる情報を格納する補助記憶デバイスである。例えば、誤推論画像格納部１３０は、補助記憶装置２０４において実現される。 The auxiliary storage device 204 is an auxiliary storage device that stores various programs and information used when the various programs are executed. For example, the incorrect inference image storage unit 130 is implemented in the auxiliary storage device 204.

表示装置２０５は、誤推論原因情報等を含む各種表示画面を表示する表示デバイスである。操作装置２０６は、解析装置１００のユーザが解析装置１００に対して各種指示を入力するための入力デバイスである。 The display device 205 is a display device that displays various display screens including erroneous inference cause information and the like. The operating device 206 is an input device through which a user of the analysis device 100 inputs various instructions to the analysis device 100.

Ｉ／Ｆ装置２０７は、例えば、不図示のネットワークと接続するための通信デバイスである。 The I/F device 207 is, for example, a communication device for connecting to a network (not shown).

ドライブ装置２０８は記録媒体２１０をセットするためのデバイスである。ここでいう記録媒体２１０には、ＣＤ－ＲＯＭ、フレキシブルディスク、光磁気ディスク等のように情報を光学的、電気的あるいは磁気的に記録する媒体が含まれる。また、記録媒体２１０には、ＲＯＭ、フラッシュメモリ等のように情報を電気的に記録する半導体メモリ等が含まれていてもよい。 The drive device 208 is a device for setting the recording medium 210. The recording medium 210 here includes a medium for recording information optically, electrically, or magnetically, such as a CD-ROM, a flexible disk, or a magneto-optical disk. Further, the recording medium 210 may include a semiconductor memory or the like that electrically records information, such as a ROM or a flash memory.

The various programs installed in the auxiliary storage device 204 are installed, for example, by setting a distributed recording medium 210 in the drive device 208 and having the drive device 208 read out the programs recorded on the recording medium 210. Alternatively, the various programs installed in the auxiliary storage device 204 may be installed by being downloaded from a network (not shown).

<Functional configuration of incorrect inference cause extraction unit>
Next, among the functions implemented in the analysis device 100 according to the first embodiment, the functional configuration of the incorrect inference cause extraction unit 140 will be described in detail. FIG. 3 is a first diagram showing an example of the functional configuration of the incorrect inference cause extraction unit. Each unit of the incorrect inference cause extraction unit 140 (the refined image generation unit 141, the map generation unit 142, and the specifying unit 143) is described in detail below.

(1) Details of refined image generation unit
First, the refined image generation unit 141 will be described in detail. As shown in FIG. 3, the refined image generation unit 141 includes an image refiner unit 301, an image error calculation unit 302, an inference unit 303, and a score error calculation unit 304.

画像リファイナ部３０１は、例えば、画像の生成モデルとしてＣＮＮを用いて、誤推論画像からリファイン画像を生成する。 The image refiner unit 301 generates a refined image from the incorrectly inferred image using, for example, CNN as an image generation model.

The image refiner unit 301 changes the incorrect inference image so that the score of the correct label is maximized when inference is performed using the generated refined image. In addition, when generating the refined image using an image generation model, for example, the image refiner unit 301 generates the refined image so that the amount of change from the incorrect inference image (the difference between the refined image and the incorrect inference image) is small. The image refiner unit 301 can thereby obtain an image (the refined image) that is visually close to the image before the change (the incorrect inference image).

Specifically, when a CNN is used as the image generation model, the image refiner unit 301 trains the CNN so as to minimize:
・the score error, which is the error between the score obtained when inference is performed using the generated refined image and the score in which the score of the correct label is maximized, and
・the image difference value, which is the difference between the generated refined image and the incorrect inference image (for example, an image difference (L1 difference), SSIM (Structural Similarity), or a combination thereof).
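The two quantities minimized during training can be sketched as a single objective. This is only an illustration under stated assumptions: the text does not give the exact form of the score error or how the two terms are combined, so the absolute error against a one-hot target, the mean L1 image difference, and the weighting parameter `image_weight` are all assumptions, as are the function names.

```python
from typing import Sequence


def score_error(scores: Sequence[float], correct_index: int) -> float:
    """Assumed form: absolute error against a target where only the correct label scores 1."""
    return sum(
        abs(s - (1.0 if i == correct_index else 0.0)) for i, s in enumerate(scores)
    )


def l1_image_difference(
    refined: Sequence[Sequence[float]], original: Sequence[Sequence[float]]
) -> float:
    """Mean per-pixel absolute difference between the refined and incorrect inference images."""
    total = 0.0
    count = 0
    for refined_row, original_row in zip(refined, original):
        for r, o in zip(refined_row, original_row):
            total += abs(r - o)
            count += 1
    return total / count


def refiner_loss(scores, correct_index, refined, original, image_weight=1.0) -> float:
    """Assumed combination: score error plus a weighted image difference value."""
    return score_error(scores, correct_index) + image_weight * l1_image_difference(
        refined, original
    )


loss = refiner_loss([0.2, 0.8], 0, [[0.0, 1.0]], [[0.5, 1.0]])
```

Minimizing the first term pushes the correct label's score up, while the second term keeps the refined image visually close to the incorrect inference image. An SSIM term could replace or complement the L1 term.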

画像誤差演算部３０２は、誤推論画像と、学習中に画像リファイナ部３０１より出力されるリファイン画像との差分を算出し、画像差分値を、画像リファイナ部３０１に入力する。画像誤差演算部３０２では、例えば、画素ごとの差分（Ｌ１差分）やＳＳＩＭ（Structural Similarity）演算を行うことにより、画像差分値を算出し、画像リファイナ部３０１に入力する。 The image error calculation unit 302 calculates the difference between the incorrect inference image and the refined image output from the image refiner unit 301 during learning, and inputs the image difference value to the image refiner unit 301. The image error calculation unit 302 calculates an image difference value by, for example, performing a pixel-by-pixel difference (L1 difference) or SSIM (Structural Similarity) calculation, and inputs it to the image refiner unit 301.

The inference unit 303 has a trained CNN that performs inference with, as input, either the refined image generated by the image refiner unit 301 or the composite image generated by the important superpixel determination unit 322 (described later), and outputs the score of the inferred label. The composite image is simply the incorrect inference image in which the regions that are included in the map generated by the map generation unit 142 (the important feature index map) and that affect inference of the correct label have been replaced with the refined image.

なお、推論部３０３により出力されたスコアは、スコア誤差演算部３０４または重要スーパーピクセル評価部３２３に通知される。 Note that the score output by the inference section 303 is notified to the score error calculation section 304 or the important superpixel evaluation section 323.

スコア誤差演算部３０４は、推論部３０３により通知されたスコアと、正解ラベルのスコアを最大にしたスコアとの誤差を算出し、画像リファイナ部３０１にスコア誤差を通知する。スコア誤差演算部３０４により通知されたスコア誤差は、画像リファイナ部３０１においてＣＮＮの学習に用いられる。 The score error calculation unit 304 calculates the error between the score notified by the inference unit 303 and the score that maximizes the score of the correct label, and notifies the image refiner unit 301 of the score error. The score error notified by the score error calculation unit 304 is used for CNN learning in the image refiner unit 301.

The refined images output from the image refiner unit 301 during training of its CNN are stored in the refined image storage unit 305. Training of the CNN of the image refiner unit 301 is performed:
・for a predetermined number of training iterations (for example, a maximum of N iterations), or
・until the score of the correct label exceeds a predetermined threshold, or
・until the score of the correct label exceeds a predetermined threshold and the image difference value falls below a predetermined threshold.
The refined image obtained when the score of the correct label output from the inference unit 303 is maximized is hereinafter referred to as the "score-maximized refined image."
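The three alternative end conditions, and the selection of the score-maximized refined image from the stored refined images, might look like the following sketch. The mode names, default thresholds, and the `(image_id, correct_label_score)` history format are hypothetical choices made only for this illustration.

```python
from typing import List, Tuple


def training_finished(
    mode: str,
    iteration: int,
    correct_score: float,
    image_difference: float,
    max_iterations: int = 100,
    score_threshold: float = 0.9,
    difference_threshold: float = 0.05,
) -> bool:
    """Evaluate one of the three end conditions listed in the text."""
    if mode == "iterations":
        return iteration >= max_iterations
    if mode == "score":
        return correct_score > score_threshold
    if mode == "score_and_difference":
        return (
            correct_score > score_threshold
            and image_difference < difference_threshold
        )
    raise ValueError(f"unknown mode: {mode}")


def select_score_maximized(history: List[Tuple[str, float]]) -> str:
    """Pick the stored refined image whose correct-label score is highest."""
    return max(history, key=lambda entry: entry[1])[0]


best = select_score_maximized([("r1", 0.40), ("r2", 0.97), ("r3", 0.90)])
```

Keeping every intermediate refined image, as the refined image storage unit 305 does, is what makes the later selection possible.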

(2) Details of map generation unit
Next, the map generation unit 142 will be described in detail. As shown in FIG. 3, the map generation unit 142 includes an important feature map generation unit 311, a deterioration measure map generation unit 312, and a superimposition unit 313.

The important feature map generation unit 311 acquires, from the inference unit 303, the inference unit structure information obtained when inference is performed with the score-maximized refined image as input. The important feature map generation unit 311 also generates a "grayscale important feature map" based on the inference unit structure information by using the BP (Back Propagation) method, the GBP (Guided Back Propagation) method, or the selective BP method. The grayscale important feature map is an example of the second map. It is a grayscale version of a map showing the degree of attention paid at inference time to each of the pixels of the score-maximized refined image.

なお、ＢＰ法は、推論したラベルが正解する入力データ（ここでは、スコア最大化リファイン画像）の推論を行うことで得た分類確率から各ラベルの誤差を計算し、入力層まで逆伝播して得られる勾配の大小を画像化することで、特徴部分を可視化する方法である。また、ＧＢＰ法は、勾配情報の正値のみを特徴部分として画像化することで、特徴部分を可視化する方法である。 Note that in the BP method, the error of each label is calculated from the classification probability obtained by performing inference on input data for which the inferred label is correct (in this case, the score maximization refined image), and the error is back-propagated to the input layer. This is a method of visualizing characteristic parts by creating an image of the magnitude of the obtained gradient. Further, the GBP method is a method of visualizing a characteristic part by imaging only positive values of gradient information as a characteristic part.

更に、選択的ＢＰ法は、正解ラベルの誤差のみを最大にしたうえで、ＢＰ法またはＧＢＰ法を用いて処理を行う方法である。選択的ＢＰ法の場合、可視化される特徴部分は、正解ラベルのスコアのみに影響を与える特徴部分となる。 Furthermore, the selective BP method is a method in which processing is performed using the BP method or the GBP method after maximizing only the error of the correct label. In the case of the selective BP method, the visualized feature part is a feature part that only affects the score of the correct label.
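The difference between the three methods can be illustrated on a toy two-layer network (linear, ReLU, linear). This is a sketch under assumptions, not the patent's implementation: guided backpropagation is modeled in its usual formulation (negative gradients are zeroed at each ReLU), and the selective BP method is modeled by an error vector that is nonzero only for the correct label. The weights and helper names are hypothetical.

```python
from typing import List


def matvec(matrix: List[List[float]], vector: List[float]) -> List[float]:
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix: List[List[float]]) -> List[List[float]]:
    return [list(column) for column in zip(*matrix)]


def selective_error(num_labels: int, correct_index: int) -> List[float]:
    """Error vector that carries a signal only for the correct label."""
    return [1.0 if i == correct_index else 0.0 for i in range(num_labels)]


def input_gradient(w1, w2, x, error, guided=False) -> List[float]:
    """Backpropagate `error` through scores = w2 @ relu(w1 @ x) down to the input."""
    pre_activation = matvec(w1, x)
    grad_hidden = matvec(transpose(w2), error)
    grad_pre = []
    for pre, grad in zip(pre_activation, grad_hidden):
        if pre <= 0:
            grad = 0.0  # ReLU was inactive in the forward pass
        elif guided and grad < 0:
            grad = 0.0  # guided variant: drop negative gradients
        grad_pre.append(grad)
    return matvec(transpose(w1), grad_pre)


w1 = [[1.0, 0.0], [0.0, 1.0]]
w2 = [[1.0, -1.0], [0.0, 1.0]]
x = [1.0, 2.0]
error = selective_error(num_labels=2, correct_index=0)
bp_map = input_gradient(w1, w2, x, error)
gbp_map = input_gradient(w1, w2, x, error, guided=True)
```

In this toy case the plain BP gradient contains a negative entry for the second input, which the guided variant suppresses, so only inputs that support the correct label remain visible.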

劣化尺度マップ生成部３１２は、誤推論画像とスコア最大化リファイン画像とに基づいて、第１のマップの一例である"劣化尺度マップ"を生成する。劣化尺度マップは、スコア最大化リファイン画像を生成する際に変更がなされた各画素の変更度合いを示している。 The deterioration measure map generation unit 312 generates a "deterioration measure map" which is an example of a first map, based on the incorrect inference image and the score maximization refined image. The deterioration scale map shows the degree of change of each pixel that was changed when generating the score-maximizing refined image.

重畳部３１３は、重要特徴マップ生成部３１１において生成されたグレイスケール化重要特徴マップと、劣化尺度マップ生成部３１２において生成された劣化尺度マップとに基づいて、第３のマップの一例である"重要特徴指標マップ"を生成する。重要特徴指標マップは、正解ラベルを推論するための各画素の重要度を示している。 The superimposition unit 313 generates an example of a third map based on the grayscale important feature map generated in the important feature map generation unit 311 and the deterioration measure map generated in the deterioration measure map generation unit 312. "Important feature index map" is generated. The important feature index map indicates the importance of each pixel for inferring the correct label.

(3) Details of specifying unit
Next, the specifying unit 143 will be described in detail. As shown in FIG. 3, the specifying unit 143 includes a superpixel division unit 321, an important superpixel determination unit 322, and an important superpixel evaluation unit 323.

The superpixel division unit 321 divides the incorrect inference image into "superpixels," which are regions (sets of pixels) corresponding to the individual parts of the object (a vehicle in this embodiment) included in the incorrect inference image, and outputs superpixel division information. To divide the incorrect inference image into superpixels, either an existing division function or a CNN or the like trained to divide the image by vehicle part is used.

重要スーパーピクセル決定部３２２は、スーパーピクセル分割部３２１により出力されたスーパーピクセル分割情報に基づいて、重畳部３１３により生成された重要特徴指標マップの各画素の値を、スーパーピクセルごとに加算する。 The important superpixel determining unit 322 adds the values of each pixel of the important feature index map generated by the superimposing unit 313 for each superpixel based on the superpixel division information output by the superpixel division unit 321.

また、重要スーパーピクセル決定部３２２は、各スーパーピクセルのうち、加算した各画素の加算値が所定の条件を満たす（重要特徴指標閾値以上）のスーパーピクセルを抽出する。また、重要スーパーピクセル決定部３２２は、抽出したスーパーピクセルの中から選択したスーパーピクセルを組み合わせたスーパーピクセル群を、変更可能領域（スコア最大化リファイン画像によって置き換えられる領域）と規定する。また、重要スーパーピクセル決定部３２２は、組み合わせたスーパーピクセル群以外のスーパーピクセル群を変更不可領域（スコア最大化リファイン画像によって置き換えられない領域）と規定する。 In addition, the important superpixel determination unit 322 extracts, from each superpixel, a superpixel whose summed value of each pixel satisfies a predetermined condition (greater than or equal to the important feature index threshold). Further, the important superpixel determining unit 322 defines a superpixel group that is a combination of superpixels selected from the extracted superpixels as a changeable region (a region to be replaced by the score-maximizing refined image). Furthermore, the important superpixel determining unit 322 defines superpixel groups other than the combined superpixel group as unchangeable areas (areas that cannot be replaced by the score-maximizing refined image).
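The per-superpixel summation and thresholding can be sketched as follows. The representation is an assumption made for illustration: the important feature index map and the superpixel division information are given as same-shaped nested lists, with the division information holding one integer superpixel id per pixel.

```python
from typing import Dict, List


def sum_map_per_superpixel(
    index_map: List[List[float]], segments: List[List[int]]
) -> Dict[int, float]:
    """Add up the important feature index map values inside each superpixel."""
    totals: Dict[int, float] = {}
    for value_row, segment_row in zip(index_map, segments):
        for value, segment_id in zip(value_row, segment_row):
            totals[segment_id] = totals.get(segment_id, 0.0) + value
    return totals


def extract_superpixels(totals: Dict[int, float], threshold: float) -> List[int]:
    """Keep the superpixels whose summed value is at or above the threshold."""
    return sorted(s for s, total in totals.items() if total >= threshold)


index_map = [[0.1, 0.9], [0.2, 0.8]]
segments = [[0, 1], [0, 1]]
totals = sum_map_per_superpixel(index_map, segments)
extracted = extract_superpixels(totals, threshold=1.0)
```

Lowering `threshold` admits more superpixels, which corresponds to widening the changeable region.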

更に、重要スーパーピクセル決定部３２２は、誤推論画像から、変更不可領域に対応する画像部分を抽出し、リファイン画像から、変更可能領域に対応する画像部分を抽出し、両者を合成することで、合成画像を生成する。画像リファイナ部３０１からは、学習回数に応じた数のリファイン画像であって、画像リファイナ部３０１で出力条件を満たしたリファイン画像が出力される。このため、重要スーパーピクセル決定部３２２では、当該数のリファイン画像それぞれについて、合成画像を生成する。 Furthermore, the important superpixel determination unit 322 extracts an image part corresponding to the unchangeable area from the incorrect inference image, extracts an image part corresponding to the changeable area from the refined image, and combines the two. Generate a composite image. The image refiner unit 301 outputs refined images whose number corresponds to the number of times of learning and which satisfies the output conditions of the image refiner unit 301 . Therefore, the important superpixel determining unit 322 generates a composite image for each of the number of refined images.
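The composition step amounts to a per-pixel selection between the two images. A minimal sketch, assuming single-channel images stored as nested lists and a hypothetical `changeable` set of superpixel ids:

```python
from typing import List, Set


def compose_image(
    incorrect: List[List[float]],
    refined: List[List[float]],
    segments: List[List[int]],
    changeable: Set[int],
) -> List[List[float]]:
    """Take refined pixels inside the changeable region and original pixels elsewhere."""
    return [
        [
            refined_px if segment_id in changeable else incorrect_px
            for incorrect_px, refined_px, segment_id in zip(
                incorrect_row, refined_row, segment_row
            )
        ]
        for incorrect_row, refined_row, segment_row in zip(
            incorrect, refined, segments
        )
    ]


incorrect = [[1.0, 1.0], [1.0, 1.0]]
refined = [[9.0, 9.0], [9.0, 9.0]]
segments = [[0, 1], [0, 1]]
composite = compose_image(incorrect, refined, segments, changeable={1})
```

One composite image of this kind is produced for each refined image and each candidate changeable region.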

The important superpixel determination unit 322 increases the number of superpixels to be extracted (widening the changeable region and narrowing the unchangeable region) by gradually lowering the important feature index threshold used to define the changeable and unchangeable regions. The important superpixel determination unit 322 also updates the changeable and unchangeable regions while changing the combination of superpixels selected from the extracted superpixels.

The important superpixel evaluation unit 323 acquires the score of the correct label that is inferred each time a composite image generated by the important superpixel determination unit 322 is input to the inference unit 303. As described above, the important superpixel determination unit 322 generates a number of composite images that depends on the number of refined images output by the image refiner unit 301, the number of times the important feature index threshold is lowered, and the number of superpixel combinations. The important superpixel evaluation unit 323 therefore acquires a corresponding number of correct-label scores. Based on the acquired scores, the important superpixel evaluation unit 323 identifies the combination of superpixels (changeable region) that causes the incorrect inference and outputs it as incorrect inference cause information.

At this time, the important superpixel evaluation unit 323 identifies the changeable region so that its area is as small as possible. For example, when evaluating the scores acquired from the inference unit 303, the important superpixel evaluation unit 323 gives priority, among the superpixels or superpixel combinations obtained before the important feature index threshold is lowered, to those with the smallest area. The important superpixel evaluation unit 323 also identifies the changeable region at the point at which the correct label comes to be inferred as the important feature index threshold is lowered (that is, the changeable region of minimum area extracted with the limiting important feature index threshold at which the correct label can be inferred).
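The search for a small changeable region can be sketched as follows. This is only an illustrative sketch, not the embodiment's implementation: `infers_correct` is a hypothetical callable standing in for generating a composite image and checking whether the inference unit 303 infers the correct label, and the exhaustive enumeration of combinations, the fixed decrement `step`, and all function and parameter names are assumptions introduced here.

```python
from itertools import combinations


def find_minimal_changeable_region(sums, areas, infers_correct,
                                   threshold, lower_limit, step):
    """Lower the threshold until a combination of extracted superpixels works.

    sums:  {superpixel id: summed important feature index value}
    areas: {superpixel id: area in pixels}
    infers_correct: hypothetical callable(set of ids) -> bool
    """
    while threshold >= lower_limit:
        # Extract superpixels whose summed value reaches the current threshold.
        extracted = [sp for sp, total in sums.items() if total >= threshold]
        # Enumerate every non-empty combination (assumption: exhaustive search).
        candidates = []
        for size in range(1, len(extracted) + 1):
            candidates.extend(combinations(extracted, size))
        # Evaluate combinations with the smallest total area first.
        candidates.sort(key=lambda combo: sum(areas[sp] for sp in combo))
        for combo in candidates:
            if infers_correct(set(combo)):
                return set(combo), threshold
        threshold -= step  # widen the changeable region and try again
    return None, None
```

Exhaustive enumeration grows exponentially with the number of extracted superpixels, so a practical implementation would likely restrict the combinations that are tried.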

<Specific examples of processing by each part of the incorrect inference cause extraction unit>
Next, specific examples of the processing performed by each part of the incorrect inference cause extraction unit 140 will be described.

(1) Specific example of processing by the refined image generation unit
First, a specific example of the processing performed by the refined image generation unit 141 will be described. FIG. 4 is a diagram showing a specific example of the processing performed by the refined image generation unit. The example on the left side of FIG. 4 shows an incorrect inference image 410, which contains a vehicle whose correct label is "vehicle type A", being input for inference and incorrectly inferred as label = "vehicle type B".

The example on the left side of FIG. 4 also shows that the scores obtained when the incorrect inference image 410 was input for inference were:
・score of vehicle type A = 0.0142,
・score of vehicle type B = 0.4549,
・score of vehicle type C = 0.0018.

The example on the right side of FIG. 4, on the other hand, shows the refined image generation unit 141 performing the process of generating refined images from the incorrect inference image 410 and producing a score-maximizing refined image 420. In this example, the refined image generation unit 141 generated the score-maximizing refined image 420 by changing the color of the headlights 421 and the color of the road marking 422 in the incorrect inference image 410.

The example on the right side of FIG. 4 also shows that, when the score-maximizing refined image 420 was input for inference, a label matching the correct label = "vehicle type A" was inferred. It further shows that the scores obtained when the score-maximizing refined image 420 was input for inference were:
・score of vehicle type A = 0.9927,
・score of vehicle type B = 0.0042,
・score of vehicle type C = 0.0022.

In this way, by changing the incorrect inference image 410, the refined image generation unit 141 can generate a score-maximizing refined image 420 for which a label matching the correct label is inferred and the score of the correct label is maximized.

Note that, as the example on the right side of FIG. 4 shows, the score-maximizing refined image 420 generated by the refined image generation unit 141 may differ from the incorrect inference image 410 even in road markings that are unrelated to the vehicle. This is because the error backpropagation performed in training to maximize the correct-label score affects the CNN paths (units) that influence the correct-label score, but the affected paths (units) are not necessarily related to the cause of the incorrect inference.

In other words, an attempt to identify the image portion that causes an incorrect inference from the changed portions, as in the known score maximization method, cannot identify it with sufficient accuracy (the changed portions need to be narrowed down further). In the incorrect inference cause extraction unit 140 according to the present embodiment, the map generation unit 142 and the identification unit 143 perform this further narrowing.

(2) Specific example of processing by the map generation unit
(2-1) Specific example of processing by the map generation unit as a whole
Next, a specific example of the processing performed by the map generation unit 142 will be described, beginning with the processing of the map generation unit 142 as a whole. FIG. 5 is a diagram showing a specific example of the processing performed by the map generation unit.

As shown in FIG. 5, in the map generation unit 142, the important feature map generation unit 311 acquires from the inference unit 303 the inference unit structure information 501 obtained when the inference unit 303 performed inference with the score-maximizing refined image as input. Based on the acquired inference unit structure information 501, the important feature map generation unit 311 generates an important feature map using, for example, the selective BP method, and performs offset adjustment on the generated important feature map. The details of these processes are described later.

The important feature map generation unit 311 also converts the offset-adjusted important feature map to grayscale, generating the grayscale important feature map 502.

The grayscale important feature map 502 shown in FIG. 5 is expressed in grayscale with pixel values from 0 to 255. Accordingly, in the grayscale important feature map 502, pixels whose values are close to 255 are pixels that receive attention during inference (pixels of interest), and pixels whose values are close to 0 are pixels that do not receive attention during inference (non-target pixels).

Meanwhile, the deterioration scale map generation unit 312 reads the score-maximizing refined image 512 from the refined image storage unit 305 and performs an SSIM (Structural Similarity) calculation between it and the incorrect inference image 511, thereby generating the deterioration scale map 513. The deterioration scale map 513 takes values from 0 to 1; a pixel value closer to 1 indicates a smaller difference between the images, and a pixel value closer to 0 indicates a larger difference.
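As an illustration of how a per-pixel SSIM map of this kind can be computed, the following sketch evaluates the standard SSIM expression over a small square window around each pixel of two single-channel images. It is a sketch under stated assumptions rather than the embodiment's method: the uniform (non-Gaussian) window, the `radius` parameter, the constants 0.01 and 0.03 (the values commonly used with SSIM), and the clamping of negative values to 0 so that the result stays in the 0 to 1 range described above are all choices made here.

```python
def ssim_window(x, y, dynamic_range=255.0):
    """SSIM of two equally sized flat lists of pixel values."""
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))


def ssim_map(img_a, img_b, radius=1):
    """Per-pixel SSIM between two single-channel images (lists of rows)."""
    height, width = len(img_a), len(img_a[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Window clipped at the image border.
            rows = range(max(0, r - radius), min(height, r + radius + 1))
            cols = range(max(0, c - radius), min(width, c + radius + 1))
            win_a = [img_a[i][j] for i in rows for j in cols]
            win_b = [img_b[i][j] for i in rows for j in cols]
            # Assumption: clamp negative SSIM values to 0.
            out[r][c] = max(0.0, ssim_window(win_a, win_b))
    return out
```

Pixels whose surrounding window is identical in both images receive a value of 1, and the value falls toward 0 as the windows diverge.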

The superimposition unit 313 generates the important feature index map 520 using the grayscale important feature map 502 generated by the important feature map generation unit 311 and the deterioration scale map 513 generated by the deterioration scale map generation unit 312.

Specifically, the superimposition unit 313 generates the important feature index map 520 based on the following formula.
(Formula 1)
Important feature index map = grayscale important feature map × (1 - deterioration scale map)
In the above formula, the term (1 - deterioration scale map) takes values from 0 to 1; the closer it is to 1, the larger the image difference, and the closer it is to 0, the smaller the image difference. The important feature index map 520 is therefore an image in which the grayscale important feature map, which indicates the degree of attention given during inference to each pixel of the score-maximizing refined image, is weighted according to the magnitude of the image difference.

Specifically, the important feature index map 520 is generated by:
・reducing the pixel values of the grayscale important feature map in portions where the deterioration scale map 513 indicates a small image difference, and
・increasing the pixel values of the grayscale important feature map in portions where the deterioration scale map 513 indicates a large image difference.
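Formula 1 is an element-wise product, which the following sketch makes concrete. The function name and the representation of each map as a list of rows are assumptions made for illustration only.

```python
def important_feature_index_map(gray_map, deterioration_map):
    """Formula 1: gray * (1 - deterioration), applied pixel by pixel.

    gray_map:          rows of values in 0..255
    deterioration_map: rows of values in 0..1 (1 = no image difference)
    """
    return [[g * (1.0 - d) for g, d in zip(gray_row, det_row)]
            for gray_row, det_row in zip(gray_map, deterioration_map)]


# A strongly attended pixel (200) is kept only where the images differ.
example = important_feature_index_map([[200, 200], [0, 100]],
                                      [[1.0, 0.0], [0.5, 0.5]])
```

In this example the first pixel is suppressed to 0 because the two images are identical there, while the second keeps its full value of 200 because the image difference is maximal.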

Note that the important feature index map may be inverted so that it is easier to distinguish when visualized. The important feature index map shown in FIG. 5 is displayed after being inverted based on the following formula.
(Formula 2)
(Inverted) important feature index map = 255 - [grayscale important feature map × (1 - deterioration scale map)]
The advantages of having the superimposition unit 313 superimpose the grayscale important feature map 502 and the deterioration scale map 513 based on the above formula are as follows.

The grayscale important feature map 502 generated by the important feature map generation unit 311 is nothing other than the portion of interest calculated by the inference unit 303 when it performed inference with the score-maximizing refined image as input and the score of the correct label reached its maximum.

The deterioration scale map 513 generated by the deterioration scale map generation unit 312, on the other hand, represents the portions that were changed when the incorrect inference image was modified so as to maximize the score of the correct label, and thus represents the region that causes the incorrect inference. However, the deterioration scale map 513 is not the minimum region needed to maximize the score of the correct label.

The superimposition unit 313 narrows down the region needed to maximize the score of the correct label by superimposing the portions changed when the incorrect inference image was modified to maximize the correct-label score on the portion of interest calculated by the inference unit 303. This makes it possible to narrow down the region that causes the incorrect inference.

(2-2) Specific example of processing by the important feature map generation unit
Next, the important feature map generation unit 311 of the map generation unit 142 will be described in more detail using FIG. 6, with reference to FIGS. 7 to 9. FIG. 6 is a diagram showing the details of the functional configuration of the important feature map generation unit.

As shown in FIG. 6, the important feature map generation unit 311 includes a selective back error propagation unit 611, a non-target pixel offset adjustment unit 612, and a grayscale conversion unit 613.

The selective back error propagation unit 611 is an example of a generation unit. It acquires the inference unit structure information 501 from the inference unit 303 and generates the important feature map 601 using the selective BP method. FIG. 7 is a diagram showing a specific example of the processing performed by the selective back error propagation unit. As shown in FIG. 7, the selective back error propagation unit 611 performs selective back error propagation for the correct label inferred by the inference unit 303, under the constraint that the inference error is not zero, and thereby generates the important feature map 601. When the important feature map 601 is generated, the output of the selective back error propagation unit 611 may be used as is, or its absolute value may be used.

The non-target pixel offset adjustment unit 612 is an example of an adjustment unit. It performs offset adjustment on the generated important feature map 601 so that the pixel values of non-target pixels become zero. In the important feature map 601 generated by the selective back error propagation unit 611, the pixel values of non-target pixels may be calculated as zero or as non-zero, depending on the type of CNN used in the inference unit 303.

The non-target pixel offset adjustment unit 612 therefore performs offset adjustment so that the pixel values of the non-target pixels contained in the important feature map 601 become zero. In doing so, it treats the pixels having the most frequently occurring pixel value in the important feature map 601 as the non-target pixels.

FIG. 8 is a diagram showing the processing performed by the non-target pixel offset adjustment unit. As shown in FIG. 8(a), on acquiring the important feature map 601 from the selective back error propagation unit 611, the non-target pixel offset adjustment unit 612 generates a histogram showing the frequency of occurrence of each pixel value. It then performs scaling so that the minimum value becomes "0" and the maximum value becomes "255" (FIG. 8(b)). When visualization or further processing needs to preserve the signal output of the selective back error propagation unit 611, the scaling may be omitted. In the following description, scaling is assumed to be performed unless otherwise noted.

Next, in the case where the absolute value of the output of the selective back error propagation unit 611 is used, the non-target pixel offset adjustment unit 612 searches for the pixel value with the highest frequency of occurrence (FIG. 8(c)) and performs offset adjustment so that this pixel value becomes zero (FIG. 8(d)).

The non-target pixel offset adjustment unit 612 then calculates the absolute value of each pixel value (FIG. 8(e)), thereby generating the offset important feature map 602 (see FIG. 6).
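The sequence of FIG. 8 (histogram, scaling to 0 through 255, search for the most frequent pixel value, offset, absolute value) can be sketched as below. This is an illustrative sketch, not the embodiment's implementation: rounding the scaled values to integers so that histogram bins are well defined, building the histogram after scaling, the tie-breaking behaviour of `Counter.most_common`, and the function and parameter names are assumptions.

```python
from collections import Counter


def offset_adjust(feature_map, scale=True):
    """Shift the most frequent pixel value to zero, then take absolute values."""
    flat = [v for row in feature_map for v in row]
    if scale:
        lo, hi = min(flat), max(flat)
        span = (hi - lo) or 1  # avoid division by zero for a uniform map
        # Scale so that the minimum maps to 0 and the maximum to 255.
        scaled = [[round((v - lo) * 255 / span) for v in row]
                  for row in feature_map]
    else:
        scaled = [list(row) for row in feature_map]
    # Histogram of pixel values; the most frequent value is taken to be
    # the value of the non-target pixels.
    histogram = Counter(v for row in scaled for v in row)
    offset, _ = histogram.most_common(1)[0]
    # Offset adjustment followed by the absolute value.
    return [[abs(v - offset) for v in row] for row in scaled]


adjusted = offset_adjust([[128, 128, 0], [128, 255, 128]])
```

Here the most frequent value is 128, so the four pixels with that value become 0 and the remaining pixels become their distance from 128.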

FIG. 9 is a diagram showing a specific example of the processing performed by the non-target pixel offset adjustment unit. FIG. 9(a) shows a case in which the pixel values of non-target pixels are calculated as zero in the important feature map generated by the selective back error propagation unit 611. FIG. 9(b) shows a case in which the pixel values of non-target pixels are calculated as non-zero. The non-target pixels are adjusted by an offset because, for example, when information is filtered by the degree of attention or signal strength is adjusted as described above, applying information whose signal strength reflects the degree of attention allows the data to be processed into the intended information. When the map is used for known visualization purposes, a viewer can implicitly read the locations that receive no attention as having some particular pixel value. When the map is processed by a computer, however, no such implicit handling takes place, so the offset adjustment described above is necessary.

As noted above, which case applies depends on the type of CNN used in the inference unit 303. The offset amount for each type can therefore be calculated from, for example, an image consisting of uniform pixel values (a featureless image, that is, an image consisting only of non-target pixels, such as the image 901 shown in FIGS. 9(a) and 9(b)). Specifically, the offset amount can be calculated by inputting the image 901 to inference units 303 that use different types of CNN, acquiring the inference unit structure information 501 in each case, and generating an important feature map using the selective BP method.

FIG. 9(a) is the case in which the offset amount is "0"; in this case, the non-target pixel offset adjustment unit 612 does not perform offset adjustment (the important feature map indicated by reference numeral 902 is the offset important feature map). FIG. 9(b) is the case in which the offset amount is "128"; in this case, the non-target pixel offset adjustment unit 612 performs offset adjustment (important feature map indicated by reference numeral 912 → offset important feature map indicated by reference numeral 913).

Returning to the description of FIG. 6, the grayscale conversion unit 613 performs grayscale conversion on the offset important feature map 602 adjusted by the non-target pixel offset adjustment unit 612 and generates the grayscale important feature map 502. Specifically, grayscale conversion is performed when the offset important feature map 602 is a color map. Alternatively, the RGB channels may be separated, each channel treated as grayscale, and a grayscale important feature map 502 generated for each channel. If the offset important feature map 602 is not a color map, the grayscale conversion unit 613 does not perform grayscale conversion.

(3) Specific example of processing by the identification unit
Next, specific examples of the processing performed by each part of the identification unit 143 (superpixel division unit 321, important superpixel determination unit 322, and important superpixel evaluation unit 323) will be described.

(3-1) Specific example of processing by the superpixel division unit
First, a specific example of the processing performed by the superpixel division unit 321 included in the identification unit 143 will be described. FIG. 10 is a diagram showing a specific example of the processing performed by the superpixel division unit. As shown in FIG. 10, the superpixel division unit 321 includes a division unit 1010 that performs, for example, SLIC (Simple Linear Iterative Clustering) processing. The division unit 1010 divides the incorrect inference image 511 into superpixels, which are partial images corresponding to the individual parts of the vehicle contained in the incorrect inference image 511. The superpixel division unit 321 outputs the superpixel division information 1001 generated by this division.

(3-2) Specific example of processing by the important superpixel determination unit
Next, a specific example of the processing performed by the important superpixel determination unit 322 included in the identification unit 143 will be described. FIG. 11 is a diagram showing a specific example of the processing performed by the important superpixel determination unit.

As shown in FIG. 11, the important superpixel determination unit 322 includes a region extraction unit 1110 and a combining unit 1111.

The important superpixel determination unit 322 overlays the important feature index map 520 output from the superimposition unit 313 on the superpixel division information 1001 output from the superpixel division unit 321, thereby generating the important superpixel image 1101. In FIG. 11, the (inverted) important feature index map is shown as the important feature index map 520.

For each superpixel in the generated important superpixel image 1101, the important superpixel determination unit 322 sums the values of the pixels of the important feature index map 520 that belong to that superpixel.

The important superpixel determination unit 322 then determines whether the sum for each superpixel is equal to or greater than the important feature index threshold and extracts the superpixels for which this is the case. In FIG. 11, the important superpixel image 1102 shows an example of the sum for each superpixel.
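The per-superpixel summation and threshold test can be sketched as follows. The representation of the superpixel division information as a label map (one superpixel id per pixel) and the function names are assumptions made for illustration.

```python
def sum_per_superpixel(labels, index_map):
    """Sum important feature index values over the pixels of each superpixel.

    labels:    rows of superpixel ids, one id per pixel
    index_map: rows of important feature index values, same shape as labels
    """
    sums = {}
    for label_row, value_row in zip(labels, index_map):
        for label, value in zip(label_row, value_row):
            sums[label] = sums.get(label, 0) + value
    return sums


def extract_superpixels(sums, threshold):
    """Return the ids of superpixels whose sum is at least the threshold."""
    return sorted(label for label, total in sums.items() if total >= threshold)


labels = [[0, 0, 1],
          [2, 2, 1]]
index_map = [[10, 20, 5],
             [0, 1, 5]]
sums = sum_per_superpixel(labels, index_map)
extracted = extract_superpixels(sums, threshold=10)
```

With these inputs, superpixels 0 and 1 (sums 30 and 10) are extracted and superpixel 2 (sum 1) is not; lowering the threshold to 1 would extract all three.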

From the extracted superpixels, the important superpixel determination unit 322 selects and combines superpixels to define the changeable region, and defines the superpixels other than those combined as the unchangeable region. The important superpixel determination unit 322 then notifies the region extraction unit 1110 of the defined changeable region and unchangeable region.

The region extraction unit 1110 extracts the image portion corresponding to the unchangeable region from the incorrect inference image 511 and extracts the image portion corresponding to the changeable region from the refined image 1121.

The combining unit 1111 combines the image portion corresponding to the changeable region extracted from the refined image 1121 with the image portion corresponding to the unchangeable region extracted from the incorrect inference image 511 to generate a composite image.

FIG. 12 is a diagram showing a specific example of the processing performed by the region extraction unit and the combining unit. The upper part of FIG. 12 shows the region extraction unit 1110 extracting the image portion corresponding to the changeable region (the white portion of image 1201) from the refined image 1121.

The lower part of FIG. 12 shows the region extraction unit 1110 extracting the image portion corresponding to the unchangeable region (the white portion of image 1201') from the incorrect inference image 511. Image 1201' is image 1201 with its white and black portions inverted (for convenience of explanation, the white portion in the lower part of FIG. 12 represents the image portion corresponding to the unchangeable region).

As shown in FIG. 12, the combining unit 1111 combines the image portion of the refined image 1121 corresponding to the changeable region and the image portion of the incorrect inference image 511 corresponding to the unchangeable region, both output from the region extraction unit 1110, to generate the composite image 1220.
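The extraction and combination steps amount to a per-pixel selection between the two images, as the following sketch shows. The label-map representation of the superpixels, the set of changeable superpixel ids, and the function name are assumptions introduced for illustration.

```python
def compose_image(incorrect_image, refined_image, labels, changeable):
    """Take changeable-region pixels from the refined image, the rest from
    the incorrect inference image.

    labels:     rows of superpixel ids, one id per pixel
    changeable: set of superpixel ids forming the changeable region
    """
    return [[refined_image[r][c] if labels[r][c] in changeable
             else incorrect_image[r][c]
             for c in range(len(labels[0]))]
            for r in range(len(labels))]


composite = compose_image(
    incorrect_image=[[1, 1], [1, 1]],
    refined_image=[[9, 9], [9, 9]],
    labels=[[0, 0], [1, 2]],
    changeable={1},
)
```

Only the pixel belonging to superpixel 1 is replaced by the refined image; every other pixel keeps its value from the incorrect inference image.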

In this way, by using the important feature index map 520 when generating the composite image 1220, the identification unit 143 can specify, in units of superpixels, the region to be replaced with the refined image 1121.

<Flow of the incorrect inference cause extraction process>
Next, the flow of the incorrect inference cause extraction process performed by the incorrect inference cause extraction unit 140 will be described. FIGS. 13 and 14 are first and second flowcharts showing the flow of the incorrect inference cause extraction process.

In step S1301, each part of the incorrect inference cause extraction unit 140 performs initialization. Specifically, the image refiner unit 301 sets the number of CNN training iterations to zero and sets the maximum number of training iterations to a value specified by the user. The determining unit 325 sets the important feature index threshold and its lower limit to values specified by the user.

In step S1302, the image refiner unit 301 changes the incorrect inference image to generate a refined image.

In step S1303, the inference unit 303 performs inference with the refined image as input and calculates the score of the correct label.

In step S1304, the image refiner unit 301 trains the CNN using the image difference value and the score error.

In step S1305, the image refiner unit 301 determines whether the number of training iterations has exceeded the maximum number of training iterations. If it determines that the maximum has not been exceeded (No in step S1305), the process returns to step S1302 and generation of refined images continues.

If, on the other hand, it determines in step S1305 that the number of training iterations has exceeded the maximum number of training iterations (Yes in step S1305), the process proceeds to step S1306. At this point, the refined image storage unit 305 holds the refined images generated in each training iteration (including the score-maximizing refined image).
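The control flow of steps S1301 through S1305 can be sketched as the loop below. It shows only the loop structure: `refine`, `score_correct_label`, and `train_step` are hypothetical callables standing in for the image refiner unit 301 and the inference unit 303, and the point at which the iteration counter is incremented, as well as picking the score-maximizing image with `max`, are assumptions, since the flowchart description does not fix them.

```python
def generate_refined_images(image, refine, score_correct_label,
                            train_step, max_iterations):
    """Loop of steps S1301-S1305 with hypothetical callables."""
    iterations = 0                                 # S1301: initialization
    stored = []                                    # refined image storage
    while True:
        refined = refine(image)                    # S1302: refined image
        score = score_correct_label(refined)       # S1303: correct-label score
        stored.append((refined, score))
        train_step(image, refined, score)          # S1304: train the refiner
        iterations += 1
        if iterations > max_iterations:            # S1305: maximum exceeded?
            break
    # The stored image with the highest correct-label score.
    best, _ = max(stored, key=lambda pair: pair[1])
    return [img for img, _ in stored], best


# Toy stand-ins: the "image" is a number and training raises a gain.
state = {"gain": 0}
images, best = generate_refined_images(
    image=5,
    refine=lambda img: img + state["gain"],
    score_correct_label=lambda img: 1 - abs(10 - img) / 10,
    train_step=lambda img, refined, score: state.update(gain=state["gain"] + 1),
    max_iterations=3,
)
```

Because the loop exits only once the counter exceeds the maximum, this sketch runs one more iteration than `max_iterations`.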

In step S1306, the important feature map generation unit 311 acquires from the inference unit 303 the inference unit structure information obtained when inference was performed with the score-maximizing refined image as input, and generates a grayscale important feature map based on the acquired inference unit structure information (details are described later).

In step S1307, the deterioration scale map generation unit 312 generates a deterioration scale map based on the incorrect inference image and the score-maximizing refined image.

In step S1308, the superimposition unit 313 generates an important feature index map based on the grayscale important feature map and the deterioration scale map.

ステップＳ１３０９において、スーパーピクセル分割部３２１は、誤推論画像をスーパーピクセルに分割し、スーパーピクセル分割情報を生成する。 In step S1309, the superpixel division unit 321 divides the incorrectly inferred image into superpixels and generates superpixel division information.

ステップＳ１３１０において、重要スーパーピクセル決定部３２２は、重要特徴指標マップの各画素の値を、スーパーピクセル単位で加算する。 In step S1310, the important superpixel determination unit 322 adds the values of each pixel of the important feature index map in units of superpixels.

ステップＳ１３１１において、重要スーパーピクセル決定部３２２は、加算値が重要特徴指標閾値以上のスーパーピクセルを抽出し、抽出したスーパーピクセルの中から選択したスーパーピクセルを組み合わせて変更可能領域を規定する。また、重要スーパーピクセル決定部３２２は、組み合わせたスーパーピクセル以外のスーパーピクセルを変更不可領域と規定する。 In step S1311, the important superpixel determination unit 322 extracts superpixels whose added value is greater than or equal to the important feature index threshold, and defines a changeable region by combining the superpixels selected from the extracted superpixels. Furthermore, the important superpixel determination unit 322 defines superpixels other than the combined superpixels as unchangeable areas.
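Steps S1310 and S1311 can be illustrated with a minimal sketch. It assumes the important feature index map and the superpixel division information are both given as equally sized 2-D lists, with each entry of `labels` holding the superpixel number of that pixel. The function names, the data layout, and the sample values are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def sum_per_superpixel(index_map, labels):
    """Add up important-feature-index values over each superpixel (cf. S1310)."""
    totals = defaultdict(float)
    for map_row, label_row in zip(index_map, labels):
        for value, label in zip(map_row, label_row):
            totals[label] += value
    return dict(totals)


def extract_superpixels(totals, threshold):
    """Return labels whose summed value is at or above the threshold (cf. S1311)."""
    return sorted(label for label, total in totals.items() if total >= threshold)


# Hypothetical 2x3 map with three superpixels (labels 0, 1, 2).
index_map = [[1.0, 2.0, 9.0],
             [0.0, 3.0, 8.0]]
labels = [[0, 0, 1],
          [0, 2, 1]]

totals = sum_per_superpixel(index_map, labels)
candidates = extract_superpixels(totals, threshold=5.0)
```

The changeable region would then be built from a combination chosen among `candidates`, and every other superpixel would form the unchangeable region.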

続いて、図１４のステップＳ１４０１において、重要スーパーピクセル決定部３２２は、リファイン画像格納部３０５からリファイン画像を読み出す。 Subsequently, in step S1401 in FIG. 14, the important superpixel determination unit 322 reads the refined image from the refined image storage unit 305.

ステップＳ１４０２において、重要スーパーピクセル決定部３２２は、リファイン画像から、変更可能領域に対応する画像部分を抽出する。 In step S1402, the important superpixel determining unit 322 extracts an image portion corresponding to the changeable region from the refined image.

ステップＳ１４０３において、重要スーパーピクセル決定部３２２は、誤推論画像から、変更不可領域に対応する画像部分を抽出する。 In step S1403, the important superpixel determining unit 322 extracts an image portion corresponding to the unchangeable region from the incorrectly inferred image.

ステップＳ１４０４において、重要スーパーピクセル決定部３２２は、リファイン画像から抽出した変更可能領域に対応する画像部分と、誤推論画像から抽出した変更不可領域に対応する画像部分とを合成し、合成画像を生成する。 In step S1404, the important superpixel determination unit 322 combines the image portion corresponding to the changeable region extracted from the refined image with the image portion corresponding to the unchangeable region extracted from the incorrect inference image, and generates a composite image.
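The composition in steps S1402 to S1404 can be sketched as a per-pixel selection. The sketch assumes single-channel images stored as 2-D lists and a set of superpixel labels that make up the changeable region. The name `compose` and the sample data are hypothetical.

```python
def compose(refined, misinferred, labels, changeable):
    """Take refined pixels inside the changeable region and
    incorrect-inference pixels everywhere else."""
    return [
        [r if label in changeable else m
         for r, m, label in zip(refined_row, mis_row, label_row)]
        for refined_row, mis_row, label_row in zip(refined, misinferred, labels)
    ]


# Hypothetical 2x3 images; superpixel 1 is the changeable region.
refined = [[11, 12, 13],
           [14, 15, 16]]
misinferred = [[1, 2, 3],
               [4, 5, 6]]
labels = [[0, 0, 1],
          [0, 2, 1]]

composite = compose(refined, misinferred, labels, changeable={1})
```

The resulting `composite` is what would be passed to the inference unit in step S1405 to obtain the score of the correct label.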

ステップＳ１４０５において、推論部３０３は、合成画像を入力として推論し、正解ラベルのスコアを算出する。また、重要スーパーピクセル評価部３２３は、推論部３０３により算出された正解ラベルのスコアを取得する。 In step S1405, the inference unit 303 performs inference using the composite image as input, and calculates the score of the correct label. Further, the important superpixel evaluation unit 323 obtains the score of the correct label calculated by the inference unit 303.

ステップＳ１４０７において、画像リファイナ部３０１は、リファイン画像格納部３０５に格納された全てのリファイン画像を読み出したか否かを判定する。ステップＳ１４０７において、読み出していないリファイン画像があると判定した場合には（ステップＳ１４０７においてＮｏの場合には）、ステップＳ１４０１に戻る。 In step S1407, the image refiner unit 301 determines whether all refined images stored in the refined image storage unit 305 have been read out. If it is determined in step S1407 that there is a refined image that has not been read out (No in step S1407), the process returns to step S1401.

一方、ステップＳ１４０７において、全てのリファイン画像を読み出したと判定した場合には（ステップＳ１４０７においてＹｅｓの場合には）、ステップＳ１４０８に進む。 On the other hand, if it is determined in step S1407 that all refined images have been read out (in the case of Yes in step S1407), the process advances to step S1408.

ステップＳ１４０８において、重要スーパーピクセル決定部３２２は、重要特徴指標閾値が下限値に到達したか否かを判定する。ステップＳ１４０８において、下限値に到達していないと判定した場合には（ステップＳ１４０８においてＮｏの場合には）、ステップＳ１４０９に進む。 In step S1408, the important superpixel determining unit 322 determines whether the important feature index threshold has reached the lower limit. If it is determined in step S1408 that the lower limit has not been reached (No in step S1408), the process advances to step S1409.

ステップＳ１４０９において、重要スーパーピクセル決定部３２２は、重要特徴指標閾値を下げた後、図１３のステップＳ１３１１に戻る。 In step S1409, the important superpixel determining unit 322 lowers the important feature index threshold, and then returns to step S1311 in FIG. 13.

一方、ステップＳ１４０８において、下限値に到達したと判定した場合には（ステップＳ１４０８においてＹｅｓの場合には）、ステップＳ１４１０に進む。 On the other hand, if it is determined in step S1408 that the lower limit value has been reached (in the case of Yes in step S1408), the process advances to step S1410.

ステップＳ１４１０において、重要スーパーピクセル評価部３２３は、取得した正解ラベルのスコアに基づいて、誤推論の原因となるスーパーピクセルの組み合わせ（変更可能領域）を特定し、誤推論原因情報として出力する。 In step S1410, the important superpixel evaluation unit 323 identifies a superpixel combination (changeable region) that causes incorrect inference based on the score of the acquired correct label, and outputs it as incorrect inference cause information.

＜重要特徴マップ生成処理の流れ＞ <Flow of important feature map generation process>
次に、図１３のステップＳ１３０６の重要特徴マップ生成処理の詳細について説明する。図１５は、重要特徴マップ生成処理の流れを示すフローチャートである。 Next, details of the important feature map generation process in step S1306 in FIG. 13 will be described. FIG. 15 is a flowchart showing the flow of the important feature map generation process.

ステップＳ１５０１において、選択的逆誤差伝播部６１１は、推論部３０３から推論部構造情報を取得する。 In step S1501, the selective back error propagation unit 611 obtains inference unit structure information from the inference unit 303.

ステップＳ１５０２において、選択的逆誤差伝播部６１１は、推論部構造情報について選択的ＢＰ法を用いて、重要特徴マップを生成する。 In step S1502, the selective backpropagation unit 611 generates an important feature map using the selective BP method for the inference unit structure information.

ステップＳ１５０３において、非注目画素オフセット調整部６１２は、重要特徴マップに基づいてヒストグラムを生成し、出現頻度が最大となる画素値を特定する。 In step S1503, the non-target pixel offset adjustment unit 612 generates a histogram based on the important feature map, and identifies the pixel value with the maximum frequency of appearance.

ステップＳ１５０４において、非注目画素オフセット調整部６１２は、出現頻度が最大となる画素値に基づいて、オフセット量を算出する。 In step S1504, the non-target pixel offset adjustment unit 612 calculates the offset amount based on the pixel value with the highest frequency of appearance.

ステップＳ１５０５において、非注目画素オフセット調整部６１２は、算出したオフセット量に基づいて、重要特徴マップのオフセット調整を行い、オフセット重要特徴マップを生成する。 In step S1505, the non-target pixel offset adjustment unit 612 performs offset adjustment of the important feature map based on the calculated offset amount, and generates an offset important feature map.

ステップＳ１５０６において、グレイスケール化部６１３は、オフセット重要特徴マップについて、グレイスケール化処理を行い、グレイスケール化重要特徴マップを生成する。 In step S1506, the grayscale conversion unit 613 performs grayscale processing on the offset important feature map to generate a grayscale important feature map.
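Steps S1503 to S1506 can be sketched as follows. The sketch assumes a single-channel integer map. It also assumes that the offset is simply the most frequent pixel value, subtracted so that non-attended pixels become zero, and that grayscaling takes the absolute value. The last two choices are assumptions made for illustration; the actual offset calculation and grayscale processing are defined elsewhere in the description. All function names are hypothetical.

```python
from collections import Counter


def most_frequent_value(feature_map):
    """Build a histogram of pixel values and return the most frequent one (cf. S1503)."""
    counts = Counter(value for row in feature_map for value in row)
    return counts.most_common(1)[0][0]


def offset_adjust(feature_map):
    """Shift the map so that the most frequent value becomes zero
    (cf. S1504 and S1505; the subtraction rule is an assumption)."""
    offset = most_frequent_value(feature_map)
    return [[value - offset for value in row] for row in feature_map]


def to_grayscale(offset_map):
    """Illustrative grayscale step: keep only the magnitude (an assumption)."""
    return [[abs(value) for value in row] for row in offset_map]


# Hypothetical map in which 128 marks pixels not attended to during inference.
feature_map = [[128, 128, 140],
               [128, 100, 128]]

offset_map = offset_adjust(feature_map)
gray_map = to_grayscale(offset_map)
```

With this convention, pixels that were not attended to during inference end up with the value zero in `gray_map`.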

＜誤推論原因抽出処理の具体例＞ <Specific example of incorrect inference cause extraction process>
次に、誤推論原因抽出処理の具体例について説明する。図１６は、誤推論原因抽出処理の具体例を示す第１の図である。 Next, a specific example of the incorrect inference cause extraction process will be described. FIG. 16 is a first diagram showing a specific example of the incorrect inference cause extraction process.

図１６に示すように、はじめに、リファイン画像生成部１４１により、誤推論画像からスコア最大化リファイン画像が生成されると、マップ生成部１４２では、重要特徴指標マップを生成する。 As shown in FIG. 16, first, when the refined image generation unit 141 generates a score-maximizing refined image from the incorrect inference image, the map generation unit 142 generates an important feature index map.

続いて、誤推論画像に基づいて、スーパーピクセル分割部３２１によりスーパーピクセル分割情報が生成されると、重要スーパーピクセル決定部３２２では、重要スーパーピクセル画像を生成する。 Subsequently, when the superpixel division section 321 generates superpixel division information based on the incorrect inference image, the important superpixel determination section 322 generates an important superpixel image.

続いて、重要スーパーピクセル決定部３２２では、重要特徴指標マップのもと、重要スーパーピクセル画像において変更可能領域及び変更不可領域を規定する。このとき、重要スーパーピクセル決定部３２２では、重要特徴指標閾値を変えるとともに、重要特徴指標閾値を超えるスーパーピクセルの中から選択する組み合わせを変えることで、複数の変更可能領域と変更不可領域との組を生成する。また、重要スーパーピクセル決定部３２２では、生成した複数の変更可能領域と変更不可領域との組を用いて、合成画像を生成する。 Subsequently, the important superpixel determination unit 322 defines a changeable region and an unchangeable region in the important superpixel image based on the important feature index map. At this time, the important superpixel determination unit 322 generates a plurality of sets of a changeable region and an unchangeable region by changing the important feature index threshold and by changing the combination selected from among the superpixels exceeding the important feature index threshold. Furthermore, the important superpixel determination unit 322 generates composite images using the generated sets of changeable and unchangeable regions.

続いて、重要スーパーピクセル評価部３２３では、生成された合成画像を入力として推論部３０３が推論した正解ラベルのスコアを取得する。これにより、重要スーパーピクセル評価部３２３では、取得した正解ラベルのスコアに基づいて、誤推論の原因となるスーパーピクセルの組み合わせ（変更可能領域）を特定し、誤推論原因情報として出力する。 Subsequently, the important superpixel evaluation unit 323 receives the generated composite image as input and obtains the score of the correct label inferred by the inference unit 303. Thereby, the important superpixel evaluation unit 323 identifies the superpixel combination (changeable region) that causes the incorrect inference based on the score of the acquired correct label, and outputs it as incorrect inference cause information.

なお、合成画像の生成処理は、図１６に示すように、リファイン画像生成部１４１により生成された複数のリファイン画像それぞれに対して行ってもよいし、 Note that the composite image generation process may be performed on each of the plurality of refined images generated by the refined image generation unit 141, as shown in FIG. 16, or
・最後の回のリファイン画像に対して行ってもよい、あるいは、 - it may be performed on the refined image from the last iteration, or
・それぞれのリファイン画像を入力とした推論での正解ラベルのスコアが一番よいリファイン画像（スコア最大化リファイン画像）に対して行ってもよい。 - it may be performed on the refined image with the best correct-label score in inference using each refined image as input (the score-maximizing refined image).

以上の説明から明らかなように、第１の実施形態に係る解析装置１００は、画像認識処理の際に誤ったラベルが推論される誤推論画像から、推論の正解ラベルのスコアを最大化させたスコア最大化リファイン画像を生成する。 As is clear from the above description, the analysis device 100 according to the first embodiment maximizes the score of the correct label of inference from the incorrect inference image from which an incorrect label is inferred during image recognition processing. Generate a score-maximizing refined image.

また、第１の実施形態に係る解析装置１００は、スコア最大化リファイン画像を生成した際の推論部構造情報に基づいて、スコア最大化リファイン画像の複数の画素のうち推論時の非注目画素の画素値がゼロとなるグレイスケール化重要特徴マップを生成する。 Furthermore, based on the inference unit structure information obtained when the score-maximizing refined image was generated, the analysis device 100 according to the first embodiment generates a grayscaled important feature map in which, among the plurality of pixels of the score-maximizing refined image, the pixels not attended to during inference have a pixel value of zero.

また、第１の実施形態に係る解析装置１００は、スコア最大化リファイン画像と誤推論画像との差分に基づいて、スコア最大化リファイン画像を生成する際に変更された画素の変更度合いを示す劣化尺度マップを生成する。 In addition, based on the difference between the score-maximizing refined image and the incorrect inference image, the analysis device 100 according to the first embodiment generates a deterioration measure map indicating the degree to which each pixel was changed when the score-maximizing refined image was generated.

また、第１の実施形態に係る解析装置１００は、グレイスケール化重要特徴マップと、劣化尺度マップと、を重畳することで、正解ラベルを推論するための各画素の重要度を示す重要特徴指標マップを生成する。 Furthermore, the analysis device 100 according to the first embodiment superimposes the grayscaled important feature map and the deterioration measure map to generate an important feature index map indicating the importance of each pixel for inferring the correct label.

また、第１の実施形態に係る解析装置１００は、誤推論画像を分割することでスーパーピクセルを生成し、重要特徴指標マップの各画素値を、スーパーピクセル単位で加算する。また、第１の実施形態に係る解析装置１００は、加算値が重要特徴指標閾値以上のスーパーピクセルを抽出し、抽出したスーパーピクセルの中から選択したスーパーピクセルの組み合わせに基づいて、変更可能領域と変更不可領域とを規定する。 Furthermore, the analysis device 100 according to the first embodiment generates superpixels by dividing the incorrect inference image, and adds the pixel values of the important feature index map in units of superpixels. The analysis device 100 according to the first embodiment also extracts superpixels whose added value is equal to or greater than the important feature index threshold, and defines a changeable region and an unchangeable region based on a combination of superpixels selected from the extracted superpixels.

また、第１の実施形態に係る解析装置１００は、規定した変更可能領域をリファイン画像で置き換えた誤推論画像を、推論部に入力することで正解ラベルを推論する。更に第１の実施形態に係る解析装置１００は、重要特徴指標閾値及び選択するスーパーピクセルの組み合わせを変えながら推論し、推論した正解ラベルの各スコアから、誤推論の原因となるスーパーピクセルの組み合わせ（変更可能領域）を特定する。 Furthermore, the analysis device 100 according to the first embodiment infers the correct label by inputting, to the inference unit, the incorrect inference image in which the defined changeable region has been replaced with the refined image. In addition, the analysis device 100 according to the first embodiment performs inference while changing the important feature index threshold and the combination of selected superpixels, and identifies, from the scores of the inferred correct label, the combination of superpixels (changeable region) that causes the incorrect inference.

このように、第１の実施形態では、正解ラベルの推論に影響するスーパーピクセルをリファイン画像で置き換え、置き換えの効果を参照しながら、スーパーピクセルを特定することで、誤推論の原因となる画像箇所を特定する。これにより、第１の実施形態によれば、誤推論の原因となる画像箇所を特定する際の精度を向上させることができる。 In this way, in the first embodiment, superpixels that affect the inference of the correct label are replaced with the refined image, and the superpixels are identified while referring to the effect of the replacement, whereby the image locations that cause the incorrect inference are identified. As a result, according to the first embodiment, it is possible to improve the accuracy in identifying image locations that cause incorrect inference.

［第２の実施形態］ [Second embodiment]
上記第１の実施形態では、スコア最大化リファイン画像を生成した際の推論部構造情報に基づいて、重要特徴指標マップを生成した。これに対して、第２の実施形態では、スコア最大化リファイン画像を生成するまでの学習中に取得した、リファイン画像それぞれに基づいて重要特徴指標マップを生成し、生成した重要特徴指標マップに基づいて、平均重要特徴指標マップを生成する。そして、重要スーパーピクセル決定部３２２では、平均重要特徴指標マップに基づいて、重要特徴指標閾値以上のスーパーピクセルを抽出する。以下、第２の実施形態について、上記第１の実施形態との相違点を中心に説明する。 In the first embodiment described above, the important feature index map is generated based on the inference unit structure information obtained when the score-maximizing refined image is generated. In contrast, in the second embodiment, an important feature index map is generated from each of the refined images acquired during learning up to the generation of the score-maximizing refined image, and an average important feature index map is generated from the generated important feature index maps. Then, the important superpixel determination unit 322 extracts superpixels whose value is equal to or greater than the important feature index threshold based on the average important feature index map. The second embodiment will be described below, focusing on the differences from the first embodiment.

＜誤推論原因抽出部の機能構成＞ <Functional configuration of incorrect inference cause extraction unit>
はじめに、第２の実施形態に係る解析装置の、誤推論原因抽出部の機能構成について説明する。図１７は、誤推論原因抽出部の機能構成の一例を示す第２の図である。図３に示した誤推論原因抽出部の機能構成との相違点は、マップ生成部１７１０である。以下、マップ生成部１７１０の詳細について説明する。 First, the functional configuration of the incorrect inference cause extraction unit of the analysis device according to the second embodiment will be described. FIG. 17 is a second diagram illustrating an example of the functional configuration of the incorrect inference cause extraction unit. The difference from the functional configuration of the incorrect inference cause extraction unit shown in FIG. 3 is the map generation unit 1710. The details of the map generation unit 1710 will be explained below.

図１７に示すように、マップ生成部１７１０は、重要特徴マップ生成部３１１、劣化尺度マップ生成部３１２、重畳部３１３に加えて、平均化部１７１１を有する。 As shown in FIG. 17, the map generation section 1710 includes an averaging section 1711 in addition to an important feature map generation section 311, a deterioration measure map generation section 312, and a superimposition section 313.

マップ生成部１７１０は、画像リファイナ部３０１の学習中に生成されたリファイン画像と、当該リファイン画像を入力として推論部３０３が推論した際の推論部構造情報とを、逐次、取得する。また、リファイン画像及び推論部構造情報を取得するごとに、マップ生成部１７１０では、重要特徴マップ生成部３１１、劣化尺度マップ生成部３１２、重畳部３１３が動作し、重要特徴指標マップを生成する。 The map generation unit 1710 sequentially acquires a refined image generated during learning by the image refiner unit 301 and inference unit structure information when the inference unit 303 performs inference using the refined image as input. Furthermore, each time a refined image and inference unit structure information are acquired, in the map generation unit 1710, the important feature map generation unit 311, the deterioration measure map generation unit 312, and the superimposition unit 313 operate to generate an important feature index map.

平均化部１７１１は、重要特徴マップ生成部３１１及び劣化尺度マップ生成部３１２がリファイン画像及び推論部構造情報を取得するごとに重畳部３１３が生成した複数回分の重要特徴指標マップの平均値を算出し、平均重要特徴指標マップを生成する。 The averaging unit 1711 calculates the average of the multiple important feature index maps that the superimposition unit 313 generated each time the important feature map generation unit 311 and the deterioration measure map generation unit 312 acquired a refined image and inference unit structure information, and generates an average important feature index map.
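The averaging performed by the averaging unit 1711 can be sketched as a per-pixel mean over the collected maps. The sketch assumes that all maps are 2-D lists of the same size and that an unweighted arithmetic mean is intended; the description does not specify any weighting. The name `average_maps` is hypothetical.

```python
def average_maps(maps):
    """Return the per-pixel arithmetic mean of equally sized 2-D maps."""
    if not maps:
        raise ValueError("at least one important feature index map is required")
    count = len(maps)
    return [
        [sum(values) / count for values in zip(*rows)]
        for rows in zip(*maps)
    ]


# Hypothetical maps from two learning iterations.
map_iteration_1 = [[0.0, 2.0],
                   [4.0, 6.0]]
map_iteration_2 = [[2.0, 4.0],
                   [6.0, 8.0]]

average_map = average_maps([map_iteration_1, map_iteration_2])
```

Averaging over iterations is what lets the second embodiment dampen fluctuations that appear in any single refined image.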

＜誤推論原因抽出処理の具体例＞ <Specific example of incorrect inference cause extraction process>
次に、誤推論原因抽出処理の具体例について説明する。図１８は、誤推論原因抽出処理の具体例を示す第２の図である。 Next, a specific example of the incorrect inference cause extraction process will be described. FIG. 18 is a second diagram showing a specific example of the incorrect inference cause extraction process.

図１８に示すように、リファイン画像生成部１４１において、画像リファイナ部３０１が、１回目の学習結果に基づき、誤推論画像からリファイン画像を生成すると、マップ生成部１７１０では、重要特徴指標マップを生成する。 As shown in FIG. 18, in the refined image generation unit 141, when the image refiner unit 301 generates a refined image from the incorrect inference image based on the first learning result, the map generation unit 1710 generates an important feature index map.

また、リファイン画像生成部１４１において、画像リファイナ部３０１が、２回目の学習結果に基づき、誤推論画像からリファイン画像を生成すると、マップ生成部１７１０では、重要特徴指標マップを生成する。以降、同様の処理を繰り返し、リファイン画像生成部１４１がスコア最大化リファイン画像を生成すると、マップ生成部１７１０では、重要特徴指標マップを生成する。 Further, in the refined image generation unit 141, when the image refiner unit 301 generates a refined image from the incorrectly inferred image based on the second learning result, the map generation unit 1710 generates an important feature index map. Thereafter, similar processing is repeated, and when the refined image generation unit 141 generates a score-maximized refined image, the map generation unit 1710 generates an important feature index map.

続いて、平均化部１７１１では、１回目のリファイン画像が生成されてからスコア最大化リファイン画像が生成されるまでの間に生成された、複数回分の重要特徴指標マップを取得する。また、平均化部１７１１では、取得した複数回分の重要特徴指標マップの平均値を算出し、平均重要特徴指標マップを生成する。 Subsequently, the averaging unit 1711 obtains multiple important feature index maps that have been generated between the generation of the first refined image and the generation of the score-maximizing refined image. Furthermore, the averaging unit 1711 calculates the average value of the plurality of acquired important feature index maps to generate an average important feature index map.

続いて、スーパーピクセル分割部３２１では、誤推論画像に基づいて、スーパーピクセル分割情報を生成し、重要スーパーピクセル決定部３２２では、重要スーパーピクセル画像を生成する。 Subsequently, the superpixel division section 321 generates superpixel division information based on the incorrectly inferred image, and the important superpixel determination section 322 generates an important superpixel image.

続いて、重要スーパーピクセル決定部３２２では、平均重要特徴指標マップのもと、重要スーパーピクセル画像において変更可能領域及び変更不可領域を規定する。このとき、重要スーパーピクセル決定部３２２では、重要特徴指標閾値を変えるとともに、重要特徴指標閾値を超えるスーパーピクセルの中から選択する組み合わせを変えることで、複数の変更可能領域と変更不可領域との組を生成する。また、重要スーパーピクセル決定部３２２では、生成した複数の変更可能領域と変更不可領域との組を用いて、合成画像を生成する。 Subsequently, the important superpixel determination unit 322 defines a changeable region and an unchangeable region in the important superpixel image based on the average important feature index map. At this time, the important superpixel determination unit 322 generates a plurality of sets of a changeable region and an unchangeable region by changing the important feature index threshold and by changing the combination selected from among the superpixels exceeding the important feature index threshold. Furthermore, the important superpixel determination unit 322 generates composite images using the generated sets of changeable and unchangeable regions.

続いて、重要スーパーピクセル評価部３２３では、生成された合成画像を入力として推論部３０３が推論した正解ラベルのスコアを取得する。これにより、重要スーパーピクセル評価部３２３では、取得した正解ラベルに基づいて誤推論の原因となるスーパーピクセルの組み合わせ（変更可能領域）を特定し、誤推論原因情報として出力する。 Subsequently, the important superpixel evaluation unit 323 receives the generated composite image as input and obtains the score of the correct label inferred by the inference unit 303. Thereby, the important superpixel evaluation unit 323 identifies the superpixel combination (changeable region) that causes the incorrect inference based on the obtained correct label and outputs it as incorrect inference cause information.

なお、上記誤推論原因抽出処理において、リファイン画像を取得して重要特徴指標マップを生成する間隔は任意であり、１回の学習ごとに重要特徴指標マップを生成しても、複数回の学習ごとに重要特徴指標マップを生成してもよい。あるいは、リファイン画像を入力した際の推論部３０３の正解ラベルのスコアを評価し、所定の閾値よりも大きい場合に、当該リファイン画像を取得して重要特徴指標マップを生成してもよい。 Note that, in the above incorrect inference cause extraction process, the interval at which refined images are acquired and important feature index maps are generated is arbitrary: an important feature index map may be generated for every learning iteration or once every several learning iterations. Alternatively, the score of the correct label output by the inference unit 303 when a refined image is input may be evaluated, and the refined image may be acquired to generate an important feature index map only if the score is greater than a predetermined threshold.

また、合成画像の生成処理は、図１８に示すように、リファイン画像生成部１４１により生成された複数のリファイン画像それぞれに対して行ってもよいし、 Further, the composite image generation process may be performed on each of the plurality of refined images generated by the refined image generation unit 141, as shown in FIG. 18, or
・最後の回のリファイン画像に対して行ってもよい、あるいは、 - it may be performed on the refined image from the last iteration, or
・それぞれのリファイン画像を入力とした推論での正解ラベルのスコアが一番よいリファイン画像（スコア最大化リファイン画像）に対して行ってもよい。 - it may be performed on the refined image with the best correct-label score in inference using each refined image as input (the score-maximizing refined image).

以上の説明から明らかなように、第２の実施形態に係る解析装置１００は、スコア最大化リファイン画像を生成するまでの学習中に生成した、重要特徴指標マップに基づいて、平均重要特徴指標マップを生成する。また、第２の実施形態に係る解析装置１００は、平均重要特徴指標マップに基づいて、重要特徴指標閾値以上のスーパーピクセルを抽出する。 As is clear from the above description, the analysis device 100 according to the second embodiment generates an average important feature index map based on the important feature index maps generated during learning up to the generation of the score-maximizing refined image. Furthermore, the analysis device 100 according to the second embodiment extracts superpixels whose value is equal to or greater than the important feature index threshold based on the average important feature index map.

これにより、第２の実施形態によれば、上記第１の実施形態による効果に加えて、更に、リファイン画像のゆらぎによる、重要特徴指標マップへの影響を低減させることが可能となる。 As a result, according to the second embodiment, in addition to the effects of the first embodiment, it is possible to further reduce the influence of fluctuations in the refined image on the important feature index map.

［第３の実施形態］ [Third embodiment]
上記第１の実施形態では、スコア最大化リファイン画像が生成され、重要特徴指標マップが生成されると、特定部１４３では、変更可能領域及び変更不可領域を規定し、誤推論の原因となる画像箇所を特定する処理を開始するものとして説明した。 In the first embodiment described above, once the score-maximizing refined image and the important feature index map have been generated, the specifying unit 143 defines a changeable region and an unchangeable region and starts the process of identifying the image locations that cause the incorrect inference.

また、上記第２の実施形態では、スコア最大化リファイン画像が生成され、平均重要特徴指標マップが生成されると、特定部１４３では、変更可能領域及び変更不可領域を規定し、誤推論の原因となる画像箇所を特定する処理を開始するものとして説明した。 Likewise, in the second embodiment described above, once the score-maximizing refined image and the average important feature index map have been generated, the specifying unit 143 defines a changeable region and an unchangeable region and starts the process of identifying the image locations that cause the incorrect inference.

これに対して、第３の実施形態では、特定部１４３が変更可能領域及び変更不可領域を規定した後に、リファイン画像生成部が、規定された変更可能領域を加味して、再度、スコア最大化リファイン画像を生成しなおす。 On the other hand, in the third embodiment, after the specifying unit 143 defines the changeable area and the unchangeable area, the refine image generation unit takes into account the specified changeable area and maximizes the score again. Regenerate the refined image.

このように、変更可能領域を加味してスコア最大化リファイン画像を再生成することで、第３の実施形態によれば、正解ラベルの推論に影響する特徴部分がより明確になった重要特徴指標マップ（あるいは平均重要特徴指標マップ）を生成することが可能となる。この結果、合成画像を入力としてラベルを推論した際のスコアを上げることが可能となる。 By regenerating the score-maximizing refined image with the changeable region taken into account in this way, the third embodiment makes it possible to generate an important feature index map (or an average important feature index map) in which the feature parts that influence the inference of the correct label are clearer. As a result, it is possible to increase the score obtained when a label is inferred using a composite image as input.

以下、第３の実施形態について、上記第１または第２の実施形態との相違点を中心に説明する。 The third embodiment will be described below, focusing on the differences from the first or second embodiment.

＜誤推論原因抽出部の機能構成＞ <Functional configuration of incorrect inference cause extraction unit>
はじめに、第３の実施形態に係る解析装置の、誤推論原因抽出部の機能構成について説明する。図１９は、誤推論原因抽出部の機能構成の一例を示す第３の図である。図１７に示した誤推論原因抽出部の機能構成との相違点は、リファイン画像生成部１９１０及び特定部１９２０である。以下、リファイン画像生成部１９１０及び特定部１９２０の詳細について説明する。 First, the functional configuration of the incorrect inference cause extraction unit of the analysis device according to the third embodiment will be described. FIG. 19 is a third diagram showing an example of the functional configuration of the incorrect inference cause extraction unit. The differences from the functional configuration of the incorrect inference cause extraction unit shown in FIG. 17 are the refined image generation unit 1910 and the specifying unit 1920. Details of the refined image generation unit 1910 and the specifying unit 1920 will be described below.

（１）リファイン画像生成部の詳細 (1) Details of the refined image generation unit
はじめに、リファイン画像生成部１９１０の詳細について説明する。図１９に示すように、リファイン画像生成部１９１０は、リファイン画像生成部１４１の画像誤差演算部３０２とは機能の異なる画像誤差演算部１９１１を有する。 First, details of the refined image generation unit 1910 will be explained. As shown in FIG. 19, the refined image generation unit 1910 includes an image error calculation unit 1911 whose function differs from that of the image error calculation unit 302 of the refined image generation unit 141.

画像誤差演算部１９１１は、画像誤差演算部３０２と同様、学習中に画像リファイナ部３０１に入力される誤推論画像と、学習中に画像リファイナ部３０１より出力されるリファイン画像との差分を算出し、画像差分値を、画像リファイナ部３０１に入力する。ただし、画像誤差演算部１９１１の場合、画像リファイナ部３０１に画像差分値を入力する際、補正部１９２１より通知された変更可能領域に対応する画像部分について、画像差分値を補正する。 Similar to the image error calculation unit 302, the image error calculation unit 1911 calculates the difference between the incorrect inference image input to the image refiner unit 301 during learning and the refined image output from the image refiner unit 301 during learning. , and the image difference values are input to the image refiner unit 301. However, in the case of the image error calculation unit 1911, when inputting the image difference value to the image refiner unit 301, the image difference value is corrected for the image portion corresponding to the changeable area notified by the correction unit 1921.

具体的には、画像誤差演算部１９１１は、変更可能領域に対応する画像部分の画像差分値に１未満の係数をかけて補正する。これにより、画像リファイナ部３０１では、変更可能領域に対応する画像部分の画像差分値を、変更可能領域以外の領域に対応する画像部分の画像差分値よりも弱めて再学習したうえで、スコア最大化リファイン画像を再生成することができる。 Specifically, the image error calculation unit 1911 corrects the image difference value of the image portion corresponding to the changeable region by multiplying it by a coefficient less than 1. As a result, the image refiner unit 301 can relearn with the image difference values of the image portion corresponding to the changeable region weakened relative to those of the image portions corresponding to the other regions, and then regenerate the score-maximizing refined image.
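The correction applied by the image error calculation unit 1911 can be sketched as a masked scaling of the image difference. The sketch assumes single-channel images as 2-D lists, a Boolean mask marking the changeable region, and an absolute per-pixel difference; the description says only that a difference is calculated and that the coefficient is less than 1. The name `corrected_difference` and the default coefficient of 0.5 are hypothetical.

```python
def corrected_difference(misinferred, refined, changeable_mask, coeff=0.5):
    """Per-pixel difference, weakened by `coeff` inside the changeable region."""
    if not 0.0 <= coeff < 1.0:
        raise ValueError("coeff must be at least 0 and less than 1")
    return [
        [abs(r - m) * (coeff if changeable else 1.0)
         for m, r, changeable in zip(mis_row, refined_row, mask_row)]
        for mis_row, refined_row, mask_row
        in zip(misinferred, refined, changeable_mask)
    ]


# Hypothetical 2x2 images; the left column is the changeable region.
misinferred = [[10, 10],
               [10, 10]]
refined = [[14, 10],
           [6, 12]]
changeable_mask = [[True, False],
                   [True, False]]

difference = corrected_difference(misinferred, refined, changeable_mask)
```

Because the penalty is smaller inside the changeable region, relearning is free to modify those pixels more strongly than the rest of the image.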

（２）特定部の詳細 (2) Details of the specifying unit
次に、特定部１９２０の詳細について説明する。図１９に示すように、特定部１９２０は、スーパーピクセル分割部３２１、重要スーパーピクセル決定部３２２、重要スーパーピクセル評価部３２３に加えて、補正部１９２１を有する。 Next, details of the specifying unit 1920 will be explained. As shown in FIG. 19, the specifying unit 1920 includes a correction unit 1921 in addition to a superpixel division unit 321, an important superpixel determination unit 322, and an important superpixel evaluation unit 323.

補正部１９２１は、重要スーパーピクセル決定部３２２において規定された変更可能領域を取得し、画像誤差演算部１９１１に通知する。これにより、リファイン画像生成部１９１０では、変更可能領域を加味して再学習したうえで、スコア最大化リファイン画像を再生成することができる。 The correction unit 1921 acquires the changeable area defined by the important superpixel determination unit 322 and notifies the image error calculation unit 1911. As a result, the refined image generation unit 1910 can regenerate the score-maximizing refined image after re-learning with the changeable area taken into consideration.

＜誤推論原因抽出処理の具体例＞ <Specific example of incorrect inference cause extraction process>
次に、誤推論原因抽出処理の具体例について説明する。図２０は、誤推論原因抽出処理の具体例を示す第３の図である。図１８を用いて説明した、誤推論原因抽出処理の具体例との相違点は、重要スーパーピクセル画像に基づいて変更可能領域と変更不可領域との組が生成された際、補正部１９２１が、変更可能領域を画像誤差演算部１９１１に通知する点である。これにより、リファイン画像生成部１９１０では、変更可能領域を加味して再学習したうえで、スコア最大化リファイン画像を再生成し、マップ生成部１７１０では、平均重要特徴指標マップを再生成することができる。 Next, a specific example of the incorrect inference cause extraction process will be described. FIG. 20 is a third diagram showing a specific example of the incorrect inference cause extraction process. The difference from the specific example described with reference to FIG. 18 is that, when a set of a changeable region and an unchangeable region is generated based on the important superpixel image, the correction unit 1921 notifies the image error calculation unit 1911 of the changeable region. As a result, the refined image generation unit 1910 can regenerate the score-maximizing refined image after relearning with the changeable region taken into account, and the map generation unit 1710 can regenerate the average important feature index map.

また、合成画像の生成処理は、図２０に示すように、リファイン画像生成部１９１０により生成された複数のリファイン画像それぞれに対して行ってもよいし、 Further, the composite image generation process may be performed on each of the plurality of refined images generated by the refined image generation unit 1910, as shown in FIG. 20, or
・最後の回のリファイン画像に対して行ってもよい、あるいは、 - it may be performed on the refined image from the last iteration, or
・それぞれのリファイン画像を入力とした推論での正解ラベルのスコアが一番よいリファイン画像（スコア最大化リファイン画像）に対して行ってもよい。 - it may be performed on the refined image with the best correct-label score in inference using each refined image as input (the score-maximizing refined image).

以上の説明から明らかなように、第３の実施形態に係る解析装置１００は、変更可能領域を加味して再学習することで、スコア最大化リファイン画像を再生成するとともに、重要特徴指標マップ（あるいは平均重要特徴指標マップ）を再生成する。 As is clear from the above description, the analysis device 100 according to the third embodiment relearns with the changeable region taken into account, thereby regenerating the score-maximizing refined image and also regenerating the important feature index map (or the average important feature index map).

これにより、第３の実施形態によれば、正解ラベルの推論に影響する特徴部分がより明確になった重要特徴指標マップ（あるいは平均重要特徴指標マップ）を生成することが可能となり、合成画像を入力としてラベルを推論した際のスコアを上げることが可能となる。 As a result, according to the third embodiment, it is possible to generate an important feature index map (or an average important feature index map) in which the feature parts that influence the inference of the correct label are clearer, and to increase the score obtained when a label is inferred using a composite image as input.

［第４の実施形態］ [Fourth embodiment]
上記第１乃至第３の実施形態では、誤推論の原因となるスーパーピクセルの組み合わせ（変更可能領域）を特定し、誤推論原因情報として出力するものとして説明した。しかしながら、誤推論原因情報の出力方法はこれに限定されず、例えば、変更可能領域内の重要部分を可視化して出力してもよい。以下、第４の実施形態について、上記第１乃至第３の実施形態との相違点を中心に説明する。 In the first to third embodiments described above, the combination of superpixels (changeable region) that causes the incorrect inference is identified and output as incorrect inference cause information. However, the method of outputting the incorrect inference cause information is not limited to this; for example, important parts within the changeable region may be visualized and output. The fourth embodiment will be described below, focusing on the differences from the first to third embodiments.

＜誤推論原因抽出部の機能構成＞ <Functional configuration of incorrect inference cause extraction unit>
はじめに、第４の実施形態に係る解析装置１００における、誤推論原因抽出部の機能構成について説明する。図２１は、誤推論原因抽出部の機能構成の一例を示す第４の図である。図３に示した誤推論原因抽出部１４０の機能構成との相違点は、詳細原因解析部２１１０を有する点である。 First, the functional configuration of the incorrect inference cause extraction unit in the analysis device 100 according to the fourth embodiment will be described. FIG. 21 is a fourth diagram illustrating an example of the functional configuration of the incorrect inference cause extraction unit. The difference from the functional configuration of the incorrect inference cause extraction unit 140 shown in FIG. 3 is that it includes a detailed cause analysis unit 2110.

The detailed cause analysis unit 2110 uses the incorrect inference image and the score-maximizing refined image to visualize the important parts within the changeable region, and outputs the result as an action result image.

<Functional configuration of the detailed cause analysis unit>
Next, the functional configuration of the detailed cause analysis unit 2110 will be described. FIG. 22 is a first diagram showing an example of the functional configuration of the detailed cause analysis unit. As shown in FIG. 22, the detailed cause analysis unit 2110 includes an image difference calculation unit 2201, an SSIM calculation unit 2202, a cutout unit 2203, and an action unit 2204.

The image difference calculation unit 2201 calculates the pixel-by-pixel difference between the incorrect inference image and the score-maximizing refined image, and outputs a difference image.

The SSIM calculation unit 2202 performs an SSIM calculation using the incorrect inference image and the score-maximizing refined image, and outputs an SSIM image.

The cutout unit 2203 cuts out the image portion corresponding to the changeable region from the difference image. The cutout unit 2203 also cuts out the image portion corresponding to the changeable region from the SSIM image. Furthermore, the cutout unit 2203 multiplies the cut-out difference image by the cut-out SSIM image to generate a multiplied image.

The action unit 2204 generates an action result image based on the incorrect inference image and the multiplied image.

<Specific example of processing by the detailed cause analysis unit>
Next, a specific example of processing by the detailed cause analysis unit 2110 will be described. FIG. 23 is a diagram showing a specific example of processing by the detailed cause analysis unit.

As shown in FIG. 23, first, the image difference calculation unit 2201 calculates the difference (= (A) - (B)) between the incorrect inference image (A) and the score-maximizing refined image (B), and outputs a difference image. The difference image is pixel correction information for the image locations that cause the incorrect inference.
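The pixel-wise difference can be sketched in NumPy as follows. This is a minimal illustration rather than the implementation of the image difference calculation unit 2201: the function name `difference_image` and the assumption that both images are 8-bit arrays of identical shape are hypothetical. The cast to a signed type matters because (A) - (B) can be negative, and unsigned 8-bit subtraction would wrap around.

```python
import numpy as np

def difference_image(incorrect: np.ndarray, refined: np.ndarray) -> np.ndarray:
    """Return the signed per-pixel difference (A) - (B)."""
    if incorrect.shape != refined.shape:
        raise ValueError("images must have the same shape")
    # Widen to a signed type first so negative differences are preserved.
    return incorrect.astype(np.int16) - refined.astype(np.int16)

# Toy 2x2 images (hypothetical values).
a = np.array([[10, 200], [30, 40]], dtype=np.uint8)  # incorrect inference image (A)
b = np.array([[12, 180], [30, 45]], dtype=np.uint8)  # score-maximizing refined image (B)
diff = difference_image(a, b)
```

Pixels that the refinement left untouched come out as zero, so the nonzero entries of `diff` mark where, and in which direction, the image was corrected.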

Next, the SSIM calculation unit 2202 performs an SSIM calculation based on the incorrect inference image (A) and the score-maximizing refined image (B) (y = SSIM((A), (B))). The SSIM calculation unit 2202 then inverts the result of the SSIM calculation (y' = 255 - (y × 255)) and outputs an SSIM image. The SSIM image designates, with high precision, the image locations that cause the incorrect inference: a large pixel value indicates a large difference, and a small pixel value indicates a small difference. Note that the result of the SSIM calculation may also be inverted by, for example, calculating y' = 1 - y.
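One way to obtain a per-pixel SSIM map and invert it is sketched below. The text does not specify the SSIM window or constants, so the uniform 7×7 window, edge padding, population variance, and the customary constants K1 = 0.01 and K2 = 0.03 are all assumptions, as are the names `ssim_map` and `inverted_ssim_image`. The sketch handles single-channel images only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def ssim_map(x, y, win=7, data_range=255.0):
    """Per-pixel SSIM over a uniform win x win window (win must be odd)."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    pad = win // 2
    # Edge-pad so the output has the same height and width as the input.
    xw = sliding_window_view(np.pad(x, pad, mode="edge"), (win, win))
    yw = sliding_window_view(np.pad(y, pad, mode="edge"), (win, win))
    mu_x = xw.mean(axis=(-2, -1))
    mu_y = yw.mean(axis=(-2, -1))
    var_x = xw.var(axis=(-2, -1))
    var_y = yw.var(axis=(-2, -1))
    cov = (xw * yw).mean(axis=(-2, -1)) - mu_x * mu_y
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )

def inverted_ssim_image(x, y):
    """y' = 255 - (y * 255): large values mean large structural change."""
    return 255.0 - ssim_map(x, y) * 255.0

# Toy data: a random image and a copy with one 4x4 patch altered.
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=(16, 16)).astype(np.uint8)
b = a.copy()
b[6:10, 6:10] = 255 - b[6:10, 6:10]
inv = inverted_ssim_image(a, b)
```

Because SSIM never exceeds 1, the inverted map is non-negative, which matches the later remark that the SSIM image carries positive values only. The alternative inversion y' = 1 - y yields the same map up to the factor 255.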

Subsequently, the cutout unit 2203 cuts out the image portion corresponding to the changeable region from the difference image and outputs a cut-out image (C). Similarly, the cutout unit 2203 cuts out the image portion corresponding to the changeable region from the SSIM image and outputs a cut-out image (D).

Here, the changeable region identifies the region of the image that causes the incorrect inference, and the purpose of the detailed cause analysis unit 2110 is to analyze the cause further, at pixel granularity, within that identified region.

For this reason, the cutout unit 2203 multiplies the cut-out image (C) by the cut-out image (D) to generate a multiplied image (G). The multiplied image (G) is pixel correction information in which the pixel corrections at the image locations that cause the incorrect inference are designated with even higher precision.
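The cut-out and multiplication steps can be sketched as follows. The sketch assumes that the changeable region is available as a boolean mask and that "cutting out" means zeroing everything outside that mask; both are illustrative assumptions, and the names `cut_out` and `multiplied_image` are hypothetical.

```python
import numpy as np

def cut_out(image, region_mask):
    """Keep pixels inside the changeable region and zero the rest."""
    return np.where(region_mask, image, 0)

def multiplied_image(diff, ssim_img, region_mask):
    c = cut_out(diff, region_mask)      # cut-out image (C), signed
    d = cut_out(ssim_img, region_mask)  # cut-out image (D), non-negative
    return c * d                        # multiplied image (G)

# Toy inputs (hypothetical values); only the top row is changeable.
diff = np.array([[-2.0, 20.0], [0.0, -5.0]])
ssim_img = np.array([[10.0, 100.0], [0.0, 50.0]])
mask = np.array([[True, True], [False, False]])
g = multiplied_image(diff, ssim_img, mask)
```

The product keeps the sign of the difference image while the SSIM image acts as a weight, so a pixel contributes strongly only where the two agree that a change occurred.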

The cutout unit 2203 also performs emphasis processing on the multiplied image (G) and outputs an emphasized multiplied image (H). The cutout unit 2203 calculates the emphasized multiplied image (H) based on the following formula.
(Formula 3)
Emphasized multiplied image (H) = 255 × (G) / (max(G) - min(G))
Subsequently, the action unit 2204 visualizes the important parts by subtracting the emphasized multiplied image (H) from the incorrect inference image (A), and generates an action result image.
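Formula 3 and the subsequent subtraction can be sketched as below. The guard for a constant (G), the rounding, and the clipping of the result to the 8-bit range for display are assumptions that the text does not state; `emphasize` and `action_result` are hypothetical names.

```python
import numpy as np

def emphasize(g):
    """Formula 3: H = 255 * G / (max(G) - min(G))."""
    span = g.max() - g.min()
    if span == 0:
        # Assumption: a constant G carries no information, so return zeros.
        return np.zeros_like(g, dtype=np.float64)
    return 255.0 * g / span

def action_result(incorrect, h):
    """Subtract H from the incorrect inference image (A)."""
    out = incorrect.astype(np.float64) - h
    # Assumption: round and clip to 0..255 so the result is displayable.
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Toy inputs (hypothetical values).
g = np.array([[-20.0, 80.0], [0.0, 0.0]])
h = emphasize(g)
a = np.array([[10, 200], [30, 40]], dtype=np.uint8)
result = action_result(a, h)
```

Note that Formula 3 rescales (G) so that its range spans 255 but does not shift it, so (H) keeps its sign and is not necessarily confined to 0 through 255.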

Note that the emphasis processing method shown in FIG. 23 is only an example; the emphasis processing may be performed by any other method that makes the important parts easier to identify when visualized.

<Flow of detailed cause analysis processing>
Next, the flow of the detailed cause analysis processing performed by the detailed cause analysis unit 2110 will be described. FIG. 24 is a first flowchart showing the flow of the detailed cause analysis processing.

In step S2401, the image difference calculation unit 2201 calculates a difference image between the incorrect inference image and the score-maximizing refined image.

In step S2402, the SSIM calculation unit 2202 calculates an SSIM image based on the incorrect inference image and the score-maximizing refined image.

In step S2403, the cutout unit 2203 cuts out the portion of the difference image corresponding to the changeable region.

In step S2404, the cutout unit 2203 cuts out the portion of the SSIM image corresponding to the changeable region.

In step S2405, the cutout unit 2203 multiplies the cut-out difference image by the cut-out SSIM image to generate a multiplied image.

In step S2406, the cutout unit 2203 performs emphasis processing on the multiplied image. The action unit 2204 then subtracts the emphasized multiplied image from the incorrect inference image and outputs an action result image.

As is clear from the above description, the analysis device 100 according to the fourth embodiment generates a difference image and an SSIM image based on the incorrect inference image and the score-maximizing refined image, cuts out the changeable region from each, and multiplies the results. As a result, the analysis device 100 according to the fourth embodiment makes it possible to visually identify, pixel by pixel, the image locations within the changeable region that cause the incorrect inference.

[Fifth embodiment]
The fourth embodiment described a case in which the image locations that cause incorrect inference are visualized pixel by pixel using the difference image and the SSIM image generated based on the incorrect inference image and the score-maximizing refined image.

In contrast, the fifth embodiment additionally uses a grayscale important feature map to visualize, pixel by pixel, the image locations that cause incorrect inference. The fifth embodiment is described below, focusing on the differences from the fourth embodiment.

<Functional configuration of the incorrect inference cause extraction unit>
First, the functional configuration of the incorrect inference cause extraction unit in the analysis device 100 according to the fifth embodiment will be described. FIG. 25 is a fifth diagram showing an example of the functional configuration of the incorrect inference cause extraction unit. The differences from the functional configuration shown in FIG. 21 are that, in FIG. 25, the function of the detailed cause analysis unit 2510 differs from that of the detailed cause analysis unit 2110 shown in FIG. 21, and that the detailed cause analysis unit 2510 acquires inference unit structure information from the inference unit 303.

The detailed cause analysis unit 2510 visualizes, pixel by pixel, the image locations that cause incorrect inference, using the difference image, the SSIM image, and the grayscale important feature map generated based on the incorrect inference image, the score-maximizing refined image, and the inference unit structure information.

Note that the difference image, the SSIM image, and the grayscale important feature map that the detailed cause analysis unit 2510 uses to visualize, pixel by pixel, the image locations that cause incorrect inference have the following attributes.
- Difference image: per-pixel difference information with positive and negative values, indicating how much each pixel should be modified to improve the classification probability of the specified label.
- SSIM image: difference information that takes into account changes in the entire image and in local regions, and that contains fewer artifacts (unintended noise) than per-pixel difference information. In other words, it is difference information of higher precision (although it carries positive values only).
- Grayscale important feature map: a map that visualizes the feature parts that affect the inference of the correct label.

<Functional configuration of the detailed cause analysis unit>
Next, the functional configuration of the detailed cause analysis unit 2510 will be described. FIG. 26 is a second diagram showing an example of the functional configuration of the detailed cause analysis unit. The differences from the functional configuration shown in FIG. 22 are that, in FIG. 26, an important feature map generation unit 2601 is included, and that the function of the cutout unit 2602 differs from that of the cutout unit 2203 in FIG. 22.

The important feature map generation unit 2601 acquires, from the inference unit 303, the inference unit structure information obtained when inference is performed with the score-maximizing refined image as input. The important feature map generation unit 2601 then generates a grayscale important feature map based on the inference unit structure information by using the selective BP method.

In addition to cutting out the image portions corresponding to the changeable region from the difference image and the SSIM image, the cutout unit 2602 cuts out the image portion corresponding to the changeable region from the grayscale important feature map. Furthermore, the cutout unit 2602 multiplies together the cut-out difference image, the cut-out SSIM image, and the cut-out grayscale important feature map to generate a multiplied image.

Multiplying the difference image, the SSIM image, and the grayscale important feature map in this way makes it possible to visually identify, pixel by pixel in the action result image, the image locations that cause the incorrect inference.

Note that using the difference image in the multiplication means that the action result image is automatically corrected toward an image that raises the score of the correct label. Accordingly, the difference image may be output as the action result image. Furthermore, if this advantage is not needed, the detailed cause analysis unit 2510 may perform the multiplication using only the SSIM image and the grayscale important feature map (without the difference image) and output an action result image.
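The three-way multiplication and its variant without the difference image can be sketched as follows. As before, representing the changeable region as a boolean mask and zeroing pixels outside it are assumptions, and the function name `multiplied_image_with_features` and its optional `diff` parameter are hypothetical.

```python
import numpy as np

def multiplied_image_with_features(ssim_img, feature_map, region_mask, diff=None):
    """Multiply the cut-out SSIM image (D) and feature map (J), and
    optionally the cut-out difference image (C)."""
    d = np.where(region_mask, ssim_img, 0.0)     # cut-out image (D)
    j = np.where(region_mask, feature_map, 0.0)  # cut-out image (J)
    g = d * j
    if diff is not None:
        # (C) is the only signed factor, so it sets the correction direction.
        g = np.where(region_mask, diff, 0.0) * g
    return g

# Toy inputs (hypothetical values); the whole row is changeable.
diff = np.array([[-2.0, 4.0]])
ssim_img = np.array([[10.0, 5.0]])
feature_map = np.array([[0.5, 1.0]])
mask = np.array([[True, True]])

g_with = multiplied_image_with_features(ssim_img, feature_map, mask, diff)
g_without = multiplied_image_with_features(ssim_img, feature_map, mask)
```

With the difference image, the product is signed and indicates in which direction each pixel should move; without it, the product only indicates how strongly each pixel matters.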

<Specific example of processing by the detailed cause analysis unit>
Next, a specific example of processing by the detailed cause analysis unit 2510 will be described. FIG. 27 is a second diagram showing a specific example of processing by the detailed cause analysis unit. The differences from the specific example of processing by the detailed cause analysis unit 2110 in FIG. 23 are as follows. First, the important feature map generation unit 2601 performs important feature map generation processing using the selective BP method based on the inference unit structure information (I), and generates a grayscale important feature map. Second, the cutout unit 2602 cuts out the image portion corresponding to the changeable region from the grayscale important feature map and outputs a cut-out image (J). Third, the cutout unit 2602 multiplies together the cut-out image (C), the cut-out image (D), and the cut-out image (J) to generate the multiplied image (G).

<Flow of detailed cause analysis processing>
Next, the flow of the detailed cause analysis processing performed by the detailed cause analysis unit 2510 will be described. FIG. 28 is a second flowchart showing the flow of the detailed cause analysis processing. The differences from the flowchart shown in FIG. 24 are steps S2801, S2802, and S2803.

In step S2801, the important feature map generation unit 2601 acquires, from the inference unit 303, the inference unit structure information obtained when a label is inferred with the score-maximizing refined image as input. The important feature map generation unit 2601 then generates a grayscale important feature map based on the inference unit structure information by using the selective BP method.

In step S2802, the cutout unit 2602 cuts out the image portion corresponding to the changeable region from the grayscale important feature map.

In step S2803, the cutout unit 2602 multiplies together the difference image, the SSIM image, and the grayscale important feature map, each cut out to the image portion corresponding to the changeable region, to generate a multiplied image.

As is clear from the above description, the analysis device 100 according to the fifth embodiment generates a difference image, an SSIM image, and a grayscale important feature map based on the incorrect inference image, the score-maximizing refined image, and the inference unit structure information, cuts out the changeable region from each, and multiplies the results. As a result, the analysis device 100 according to the fifth embodiment makes it possible to visually identify, pixel by pixel, the image locations within the changeable region that cause the incorrect inference.

Note that the present invention is not limited to the configurations shown here, such as the configurations described in the above embodiments and their combinations with other elements. These points can be changed without departing from the spirit of the present invention and can be determined appropriately according to the form of application.

100: Analysis device
140: Incorrect inference cause extraction unit
141: Refined image generation unit
142: Map generation unit
143: Specifying unit
301: Image refiner unit
302: Image error calculation unit
303: Inference unit
304: Score error calculation unit
311: Important feature map generation unit
312: Deterioration scale map generation unit
313: Superposition unit
321: Superpixel division unit
322: Important superpixel determination unit
323: Important superpixel evaluation unit
611: Selective back error propagation unit
612: Non-target pixel offset adjustment unit
613: Grayscaling unit
1110: Region extraction unit
1111: Composition unit
1710: Map generation unit
1711: Averaging unit
1910: Refined image generation unit
1911: Image error calculation unit
1920: Specifying unit
1921: Correction unit
2110: Detailed cause analysis unit
2201: Image difference calculation unit
2202: SSIM calculation unit
2203: Cutout unit
2204: Action unit
2510: Detailed cause analysis unit
2601: Important feature map generation unit
2602: Cutout unit

Claims

An analysis device comprising:
a refined image generation unit that generates, from an incorrect inference image for which an incorrect label was inferred in image recognition processing, a refined image that maximizes the score of the correct label of inference, using the difference between the incorrect inference image and the refined image;
a map generation unit that generates a third map indicating the importance of each pixel for inferring the correct label by superimposing a first map, which indicates the pixels among the plurality of pixels of the incorrect inference image that were changed when the score-maximized refined image was generated, and a second map, which indicates the degree of attention of each pixel that was attended to at the time of inference among the plurality of pixels of the score-maximized refined image and which is adjusted based on the frequency of appearance of each degree of attention; and
a specifying unit that identifies a set of pixels that causes the incorrect inference by calculating the pixel values of the third map for each set of pixels in the incorrect inference image,
wherein, when the set of pixels is identified by the specifying unit, the refined image generation unit corrects the difference for the region of the identified set of pixels and, using the corrected difference, regenerates from the incorrect inference image a refined image that maximizes the score of the correct label of inference.

The analysis device according to claim 1, wherein the map generation unit identifies the pixel value at which the frequency of appearance of the degree of attention is maximum, and adjusts the map indicating the degree of attention of each pixel that was attended to at the time of inference among the plurality of pixels of the refined image so that the identified pixel value becomes zero.

The analysis device according to claim 2, wherein the map generation unit generates the map indicating the degree of attention of each pixel that was attended to at the time of inference among the plurality of pixels of the refined image using any one of a BP method, a GBP method, or a selective BP method.

The analysis device according to claim 1, wherein a multiplied image is visualized in the incorrect inference image, the multiplied image being obtained by multiplying together:
an image in which a predetermined region is cut out from a difference image calculated based on the difference between the incorrect inference image and the score-maximized refined image;
an image in which a predetermined region is cut out from an SSIM image obtained by performing an SSIM calculation on the incorrect inference image and the score-maximized refined image; and
an image in which a predetermined region is cut out from the second map.

An analysis program that causes a computer to execute processing comprising:
generating, from an incorrect inference image for which an incorrect label was inferred in image recognition processing, a refined image that maximizes the score of the correct label of inference, using the difference between the incorrect inference image and the refined image;
generating a third map indicating the importance of each pixel for inferring the correct label by superimposing a first map, which indicates the pixels among the plurality of pixels of the incorrect inference image that were changed when the score-maximized refined image was generated, and a second map, which indicates the degree of attention of each pixel that was attended to at the time of inference among the plurality of pixels of the score-maximized refined image and which is adjusted based on the frequency of appearance of each degree of attention; and
identifying a set of pixels that causes the incorrect inference by calculating the pixel values of the third map for each set of pixels in the incorrect inference image,
wherein, when the set of pixels is identified, the difference is corrected for the region of the identified set of pixels, and the corrected difference is used to regenerate, from the incorrect inference image, a refined image that maximizes the score of the correct label of inference.

An analysis method in which a computer executes processing comprising:
generating, from an incorrect inference image for which an incorrect label was inferred in image recognition processing, a refined image that maximizes the score of the correct label of inference, using the difference between the incorrect inference image and the refined image;
generating a third map indicating the importance of each pixel for inferring the correct label by superimposing a first map, which indicates the pixels among the plurality of pixels of the incorrect inference image that were changed when the score-maximized refined image was generated, and a second map, which indicates the degree of attention of each pixel that was attended to at the time of inference among the plurality of pixels of the score-maximized refined image and which is adjusted based on the frequency of appearance of each degree of attention; and
identifying a set of pixels that causes the incorrect inference by calculating the pixel values of the third map for each set of pixels in the incorrect inference image,
wherein, when the set of pixels is identified, the difference is corrected for the region of the identified set of pixels, and the corrected difference is used to regenerate, from the incorrect inference image, a refined image that maximizes the score of the correct label of inference.