JP2021111306A

JP2021111306A - Computer system and program

Info

Publication number: JP2021111306A
Application number: JP2020027250A
Authority: JP
Inventors: 翔吾金石; Shogo Kaneishi; 繁幸岩田; Shigeyuki Iwata; 聡原; Satoshi Hara
Original assignee: Zenrin Co Ltd
Current assignee: Zenrin Co Ltd
Priority date: 2020-01-07
Filing date: 2020-02-20
Publication date: 2021-08-02
Anticipated expiration: 2040-02-20
Also published as: JP7365261B2

Abstract

To automatically analyze a machine learning model for image recognition. SOLUTION: A computer system according to an embodiment acquires, for each of multiple images, a correctness determination result indicating whether an image recognition result obtained by inputting the image into a machine learning model is correct, calculates, for each of the multiple images, a relationship degree indicating the relationship between an area of interest in the image in the machine learning model and each of one or more attribute labels associated with the image in advance, and estimates, based on the correctness determination result and the relationship degree of each of the multiple images, an attribute label related to incorrect image recognition from the one or more attribute labels. SELECTED DRAWING: Figure 4

Description

One aspect of the present disclosure relates to a computer system, a processing method, a program, and/or a data structure.

Techniques for analyzing a machine learning model for image recognition are known (see, for example, Non-Patent Documents 1 to 3).

[Non-Patent Document 1] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. Why should I trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135-1144. ACM, 2016.
[Non-Patent Document 2] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, et al. Grad-cam: Visual explanations from deep networks via gradient-based localization. In ICCV, pages 618-626, 2017.
[Non-Patent Document 3] Avanti Shrikumar, Peyton Greenside, Anshul Kundaje. Learning important features through propagating activation differences. Proceedings of the 34th International Conference on Machine Learning, PMLR 70, 3145-3153 (2017).

One aspect of the present disclosure aims to automatically analyze a machine learning model for image recognition.

A computer system according to one aspect of the present disclosure includes a processor. For each of a plurality of images, the processor acquires a correctness determination result indicating whether the image recognition result obtained by inputting the image into a machine learning model is correct. For each of the plurality of images, the processor also calculates a degree of relationship indicating the relationship between a region of interest in the image in the machine learning model and each of one or more attribute labels associated with the image in advance. Based on the correctness determination results and the degrees of relationship of the plurality of images, the processor estimates, from among the one or more attribute labels, an attribute label related to erroneous image recognition.

FIG. 1 is a diagram showing an example of an analysis method performed by the analysis system according to the embodiment as a processing flow S1.
FIG. 2 is a diagram showing an example of a highlight image indicating a region of interest.
FIG. 3 is a diagram showing an example of the hardware configuration of a computer used for the analysis system according to the embodiment.
FIG. 4 is a diagram showing an example of the functional configuration of the analysis system according to the embodiment.
FIG. 5 is a diagram showing several examples of attribute labels.
FIG. 6 is a flowchart showing an example of analysis processing performed by the analysis system according to the embodiment.
FIG. 7 is a flowchart showing an example of processing for determining the correctness of image recognition.
FIG. 8 is a flowchart showing an example of processing for calculating the degree of relationship between a region of interest and an attribute label.
FIG. 9 is a flowchart showing an example of processing for estimating an attribute label related to erroneous image recognition.

Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. In the description of the drawings, identical or equivalent elements are given the same reference numerals, and duplicate description is omitted.

[System overview]
The analysis system 1 according to the embodiment is a computer system that analyzes a machine learning model for image recognition. More specifically, the analysis system 1 estimates why the machine learning model misrecognizes images.

An image is information expressed on a predetermined medium so that a person can perceive some event through vision. An image may be one generated by an imaging device such as a camera, or one to which any other image processing has been applied after capture. Electronic data representing an image, that is, image data, is processed by a computer to visualize the image, and as a result a person can perceive the image visually. For example, an image may be a still image, that is, a photograph, or a frame image constituting a video. Image recognition refers to a technique for identifying features of the scene shown in an image.

Machine learning is a technique for autonomously finding laws or rules by learning iteratively from given information. The specific machine learning technique is not limited and may be designed according to any policy. Machine learning uses a machine learning model. In general, a machine learning model is built using a multilayer neural network. A neural network is an information processing model that imitates the workings of the human cranial nervous system. In the present disclosure, a machine learning model is an algorithm that processes vector data representing image data as an input vector and outputs vector data representing a recognition result as an output vector. A machine learning model put into operation is generated by training and is therefore also called a "trained model". A trained model is the best computational model, estimated to have the highest prediction accuracy, and can therefore be called the "best trained model". Note, however, that this best trained model is not necessarily the best in reality.

A computer generates the best trained model by processing training data that contains many combinations of an image and a correct recognition result. The computer inputs an input vector representing an image into the machine learning model and thereby calculates an output vector representing a recognition result. The computer then obtains the error between that output vector and the recognition result given in the training data, that is, the difference between the estimated result and the correct label, and updates given parameters in the machine learning model based on that error. By repeating such training, the computer generates the best trained model. In one example, this trained model is stored in the memory of the analysis system 1. The process of generating the best trained model can be called the training phase. The computer that generates the trained model is not limited; it may be, for example, the analysis system 1, or a computer or computer system different from the analysis system 1.

Analysis of a machine learning model is processing for clarifying how the machine learning model works. The analysis system 1 analyzes the machine learning model in order to understand the mechanism by which the model recognizes images. In one example, analyzing a machine learning model means estimating why the model misrecognizes images.

FIG. 1 is a diagram showing an example of an analysis method performed by the analysis system 1 as a processing flow S1. The analysis system 1 processes a data set containing n combinations of an image x, a correct label k, and an attribute label r. The image x can be called an original image. A correct label is the output value (that is, the correct answer) that should be obtained by processing the image with the machine learning model. In one example, the correct label indicates the correct answer to a classification problem, that is, the class (group) to which the image belongs. An attribute label is a value indicating an attribute related to the image. An attribute related to an image is any information that characterizes the image. In one example, the data set is prepared by setting the correct label and the attribute labels for each image manually or automatically.

In step S11, the analysis system 1 executes image recognition by inputting the image x into the machine learning model. This machine learning model is a trained model, so the image recognition corresponds to the operation phase.

In step S12, the analysis system 1 executes a highlighting process that extracts the region on which the machine learning model focused during the image recognition (also called the "region of interest" in the present disclosure). In one example, the analysis system 1 extracts the region of interest from the image x based on the technique described in Non-Patent Document 1 above and generates a highlight image g showing that region. The number of highlight images g equals the number of images x. FIG. 2 is a diagram showing an example of a highlight image. In this example, the highlight image 202 obtained from the image 201 singles out, as the region of interest, a sign plate indicating a speed limit of 30 km/h. This means that the machine learning model paid particular attention to that sign plate when recognizing the image 201. The blacked-out portion of the highlight image 202, that is, the portion other than the region of interest, is a region to which the machine learning model paid little or no attention during image recognition.

In step S13, the analysis system 1 compares the image recognition result with the correct label k to determine whether the recognition result is correct, and obtains a correctness determination result c.
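The comparison in step S13 can be sketched as follows. This is a minimal illustration, assuming that class labels are plain integers; the function name `judge_correctness` and the sample values are hypothetical and do not appear in the disclosure.

```python
from typing import List, Sequence


def judge_correctness(predicted: Sequence[int], correct: Sequence[int]) -> List[bool]:
    """Return one correctness result c per image: True when the recognized
    class equals the correct label k, False otherwise."""
    if len(predicted) != len(correct):
        raise ValueError("predicted and correct must have the same length")
    return [p == k for p, k in zip(predicted, correct)]


# Hypothetical recognition results and correct labels for three images.
results = judge_correctness([2, 0, 1], [2, 1, 1])
print(results)
```

Keeping the result as one Boolean per image makes it straightforward to combine with per-image degrees of relationship in a later analysis step.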

In step S14, the analysis system 1 executes a collation process that calculates the degree of relationship between the highlight image g (that is, the region of interest) and the attribute label r. The degree of relationship is an index indicating the strength of the relationship between the region of interest and an attribute label. In the present disclosure, a higher degree of relationship means a stronger relationship between the region of interest and the attribute label.
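The text above does not give a formula for the degree of relationship. The sketch below shows one possible index, purely as an assumption: the share of the total absolute attention that falls on pixels carrying each attribute label. The function name `relation_degrees`, the per-label binary masks, and the sample values are all hypothetical.

```python
import numpy as np


def relation_degrees(attention: np.ndarray, label_masks: np.ndarray) -> np.ndarray:
    """Hypothetical index, not the formula of the disclosure.

    attention:   (H, W) per-pixel attention values of a highlight image.
    label_masks: (H, W, T) binary masks, one channel per attribute label.
    Returns a length-T vector; each entry is the fraction of total absolute
    attention located on pixels that carry the corresponding label.
    """
    weight = np.abs(attention)
    total = weight.sum()
    if total == 0:
        return np.zeros(label_masks.shape[-1])
    # Sum attention over the pixels of each label, then normalize.
    return np.tensordot(weight, label_masks, axes=([0, 1], [0, 1])) / total


attention = np.array([[0.0, 3.0],
                      [1.0, 0.0]])
label_masks = np.zeros((2, 2, 2))
label_masks[0, 1, 0] = 1  # label 0 covers the top-right pixel
label_masks[1, :, 1] = 1  # label 1 covers the bottom row
degrees = relation_degrees(attention, label_masks)
print(degrees)
```

With this choice a higher value means that more of the model's attention lies on pixels with that label, which is consistent with the stated convention that a higher degree indicates a stronger relationship.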

The analysis system 1 executes steps S11 to S14 for each combination in the data set. In step S15, the analysis system 1 executes an analysis based on the correctness determination results and the collation results (that is, the degrees of relationship) for the n combinations, and estimates an attribute label related to erroneous image recognition. The analysis system 1 outputs that attribute label as the analysis result.

[System configuration]
FIG. 3 is a diagram showing an example of the hardware configuration of a computer 110 used for the analysis system 1. For example, the analysis system 1 has a control circuit 100. In one example, the control circuit 100 has one or more processors 101, a memory 102, a storage 103, a communication port 104, and an input/output port 105. The processor 101 executes an operating system and application programs. The storage 103 consists of a non-transitory storage medium such as a hard disk, a non-volatile semiconductor memory, or a removable medium (for example, a magnetic disk or an optical disc), and stores the operating system and the application programs. The memory 102 temporarily stores programs loaded from the storage 103 or results of computation by the processor 101. In one example, the processor 101 functions as each functional module by executing a program in cooperation with the memory 102. The communication port 104 performs data communication with other devices via a communication network NW in accordance with commands from the processor 101. The input/output port 105 inputs and outputs electrical signals to and from input/output devices (user interfaces) such as a keyboard, a mouse, and a monitor in accordance with commands from the processor 101.

The storage 103 stores a program 120 for causing the computer 110 to function as the analysis system 1. Each functional module of the analysis system 1 is realized by the processor 101 executing this program 120. The program 120 may be provided after being fixedly recorded on a non-transitory recording medium such as a CD-ROM, a DVD-ROM, or a semiconductor memory. Alternatively, the program 120 may be provided via a communication network as a data signal superimposed on a carrier wave.

The analysis system 1 may consist of one or more computers 110. When a plurality of computers 110 are used, these computers 110 are connected to each other via the communication network NW to form what is logically a single analysis system 1.

The computer 110 that functions as the analysis system 1 is not limited. For example, the analysis system 1 may consist of a large computer 110 such as a business server, or of a small computer 110 such as a personal computer.

FIG. 4 is a diagram showing an example of the functional configuration of the analysis system 1. In one example, the analysis system 1 includes an information processing device 10, a storage device 20, an input device 30, and an output device 40. In one example, the information processing device 10 operates according to instruction signals input from the input device 30, accesses the storage device 20 to read and write data, and outputs data indicating processing results to the output device 40.

The information processing device 10 includes, as functional modules, a data acquisition unit 11, an image recognition unit 12, a correctness determination unit 13, a highlight unit 14, a collation unit 15, and an analysis unit 16. The data acquisition unit 11 is a functional module that acquires a data set. The image recognition unit 12 is a functional module that executes image recognition using a machine learning model. This machine learning model is a trained model, so the image recognition executed by the image recognition unit 12 corresponds to the operation phase. The correctness determination unit 13 is a functional module that determines whether the result of that image recognition is correct. The highlight unit 14 is a functional module that extracts the region of interest in that image recognition. The collation unit 15 is a functional element that calculates the degree of relationship between the region of interest and an attribute label. The analysis unit 16 is a functional module that estimates an attribute label related to erroneous image recognition based on the correctness determination results of the image recognition and the degrees of relationship. An attribute label related to erroneous image recognition is an attribute label presumed to have caused the machine learning model to misrecognize an image; in short, it is an attribute label that contributed to misrecognition (misclassification). Note that an "attribute label related to erroneous image recognition" is a result estimated by the analysis system 1 and is not necessarily an attribute label that actually caused the machine learning model to misrecognize the image.

The storage device 20 temporarily or permanently stores various data processed by the analysis system 1. In one example, the storage device 20 stores at least one of the following: the data set prepared in advance, the image recognition results obtained by the image recognition unit 12, the correctness determination results obtained by the correctness determination unit 13, the highlight results obtained by the highlight unit 14, the collation results (degrees of relationship) obtained by the collation unit 15, and the analysis results obtained by the analysis unit 16.

[Attribute label]
At least one attribute label is set for each image in the data set, that is, for each original image. The attributes expressed as attribute labels are not limited. The target for which an attribute label is set is not limited either; for example, an attribute label may be set for the entire image or for any partial region within the image. Examples of attribute labels set for the entire image include, but are not limited to, shooting conditions (for example, shooting date and weather) and attributes of the image as a whole (for example, subject type, resolution, contrast, brightness, and noise). Examples of attribute labels set for a partial region include, but are not limited to, geometric information (for example, position within the image, area, and shape) and attributes of that partial region (for example, type, color, orientation, brightness, and presence or absence of an occluding object). Partial regions may be set arbitrarily. A partial region may be set for each object (that is, each subject), for example for each person, car, or sign. Alternatively, a partial region may be set for each component of an individual object (subject). For example, if the object (subject) is a sign, a partial region may be set for each component such as the sign plate, border, illustration, characters, and post. Alternatively, a partial region may be set for each superpixel or for each pixel. A superpixel is a region consisting of a plurality of contiguous pixels with similar pixel information such as luminance and color.

In one example, at least one attribute label is set for each pixel in each image. The at least one attribute label corresponding to a given pixel can be represented by a data structure in the form of a one-dimensional array, which the present disclosure calls a "pixel label vector". Each element of the pixel label vector indicates the corresponding attribute label, so the number of elements in the pixel label vector equals the total number of attribute label types. For example, let $r_{ij}$ be the pixel label vector of the (i, j)-th pixel of an image, let e denote an individual element of the pixel label vector, and let T be the total number of attribute label types; then $r_{ij} = [e_{ij}^1, e_{ij}^2, \ldots, e_{ij}^T]$. The values set in the individual elements of the pixel label vector are not limited. In one example, at least one element may be expressed as a binary value of 0 or 1. For example, when a certain attribute label is assigned to a pixel, the element corresponding to that attribute label may be set to 1, and when that attribute label is not assigned to the pixel, the element may be set to 0. As a specific example, if the pixel label vector of a pixel is $r_{ij} = [1, 1, 0, 0, \ldots, 0, 1]$, the pixel has the two attribute labels corresponding to the first and second elements and does not have the two attribute labels corresponding to the third and fourth elements. At least one element of the pixel label vector may be expressed as a continuous value; for example, the element corresponding to image resolution may be set to the resolution value itself.

FIG. 5 is a diagram showing several examples of attribute labels. Example (a) of FIG. 5 shows an image 211 of a merging-traffic sign and an illustration 212 for explaining the attribute labels of the image 211. In example (a), "sky", "sign", "tree", and "occluded" are set as the attribute labels of the image 211. Specifically, the attribute label "sky" is assigned to pixels corresponding to the sky, the attribute label "sign" to pixels corresponding to the sign, and the attribute label "tree" to pixels corresponding to the tree. The attribute label "occluded" is assigned to pixels corresponding to the portion of the sign that is blocked by the tree. In this example, the attribute label "occluded" indicates that the sign is blocked by another object.

Example (b) of FIG. 5 shows an image 221 of a sign prohibiting travel in directions other than the designated ones and an illustration 222 for explaining the attribute labels of the image 221. In example (b), "sky", "sign", "tree", "pillar", and "occluded" are set as the attribute labels of the image 221. Specifically, the attribute label "sky" is assigned to pixels corresponding to the sky, the attribute label "sign" to pixels corresponding to the sign, the attribute label "tree" to pixels corresponding to the tree, and the attribute label "pillar" to pixels corresponding to the pillar. The attribute label "occluded" is assigned to pixels corresponding to the portion of the sign that is blocked by the pillar. In this example as well, the attribute label "occluded" indicates that the sign is blocked by another object.

Suppose that a pixel label vector representing five types of attribute labels in the order sky, sign, tree, pillar, and presence or absence of occlusion is defined in common for the images 211 and 221. Suppose further that each element of the pixel label vector is set to 0 when the attribute label is not assigned and to 1 when it is assigned. An example of attribute label settings under these assumptions follows. For the image 211, the pixel label vector corresponding to the unobstructed portion of the sign is [0,1,0,0,0]. The pixel label vector corresponding to the portion of the sign blocked by the tree is [0,0,1,0,1]. The pixel label vector corresponding to the portion of the tree that does not block the sign is [0,0,1,0,0]. Because no pillar appears in the image 211, it has no pixel label vector whose fourth element is 1. Likewise, for the image 221, the pixel label vector corresponding to the sky is [1,0,0,0,0]. The pixel label vector corresponding to the portion of the sign blocked by the pillar is [0,0,0,1,1]. The pixel label vector corresponding to the portion of the pillar that does not block the sign is [0,0,0,1,0].
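The five-element encoding described above can be illustrated with a small sketch. It assumes NumPy arrays as the container; the names `LABEL_TYPES`, `pixel_label_vector`, and `label_map`, as well as the toy 2 x 3 pixel layout, are hypothetical and chosen only for illustration.

```python
import numpy as np

# Attribute label types in the order used in the text above.
LABEL_TYPES = ("sky", "sign", "tree", "pillar", "occluded")


def pixel_label_vector(assigned):
    """Encode a set of attribute label names as a binary pixel label vector."""
    assigned = set(assigned)
    unknown = assigned - set(LABEL_TYPES)
    if unknown:
        raise ValueError(f"unknown attribute labels: {sorted(unknown)}")
    return np.array([int(name in assigned) for name in LABEL_TYPES], dtype=np.uint8)


# Toy 2 x 3 "image" loosely modeled on image 211 (no pillar present).
label_map = np.array([
    [pixel_label_vector({"sky"}), pixel_label_vector({"sign"}),
     pixel_label_vector({"tree", "occluded"})],
    [pixel_label_vector({"sky"}), pixel_label_vector({"tree"}),
     pixel_label_vector({"tree"})],
])  # shape (height, width, T) = (2, 3, 5)

# Boolean mask of the pixels flagged as occluded.
occluded = label_map[..., LABEL_TYPES.index("occluded")] == 1
print(label_map.shape, int(occluded.sum()))
```

Storing the vectors as one (height, width, T) array keeps each attribute label available as a binary mask over the image, which is convenient when comparing labels against per-pixel attention values.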

[Processing procedure in the system]
The operation of the analysis system 1 and the analysis method according to the present embodiment will be described with reference to FIGS. 6 to 9. FIG. 6 is a flowchart showing an example of the overall flow of the analysis processing as a processing flow S2. FIG. 6 also shows the correspondence with the processing flow S1. FIG. 7 is a flowchart showing in detail an example of the processing for determining the correctness of image recognition. FIG. 8 is a flowchart showing in detail an example of the processing for calculating the degree of relationship between the region of interest and an attribute label. FIG. 9 is a flowchart showing in detail an example of the processing for estimating an attribute label related to erroneous image recognition.

図６に示すように、ステップＳ２１では、データ取得部１１がデータセットを取得する。上述したように、データセットは予め用意されて所定の記憶装置２０に予め格納される。データ取得部１１はその記憶装置２０にアクセスしてデータセットを読み出す。 As shown in FIG. 6, in step S21, the data acquisition unit 11 acquires the data set. As described above, the data set is prepared in advance and stored in a predetermined storage device 20 in advance. The data acquisition unit 11 accesses the storage device 20 and reads out the data set.

In step S22, the image recognition unit 12 executes image recognition with the machine learning model on each image of the data set. This corresponds to step S11. For each of the one or more images in the data set, the image recognition unit 12 inputs the image into the machine learning model, acquires the data output from the machine learning model as the image recognition result, and stores this image recognition result in the storage device 20.

In one example, this image recognition corresponds to the classification model represented by the following equation (1).

Equation (1) expresses determining which of K classes (groups) an image belongs to. [0,1] indicates the binary choice of whether or not the image belongs to a given class (group). The variable p is the probability that the image belongs to a given class, and the variable p_k is the probability that the image belongs to class k. The classification model f can also be regarded as a function p = f(x) that converts an image x into probabilities. The result obtained with the classification model f is expressed as k^* = argmax_k p_k. This means that the class k with the largest p_k is obtained as the recognition result k^*.
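The relation k^* = argmax_k p_k can be illustrated with a short Python sketch. The function name `recognize` and the probability values are hypothetical and serve only to show the selection rule; the list `p` stands in for the output p = f(x) of an actual classification model.

```python
def recognize(probabilities):
    """Return the index k* of the class with the largest probability p_k."""
    return max(range(len(probabilities)), key=lambda k: probabilities[k])


p = [0.1, 0.7, 0.2]  # hypothetical output p = f(x) for K = 3 classes
k_star = recognize(p)
```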

In step S23, the highlight unit 14 extracts, for each image of the data set, the region of interest in the image recognition performed by the image recognition unit 12. This corresponds to step S12. As described above, in one example, the highlight unit 14 extracts the region of interest based on the method described in Non-Patent Document 1 and generates a highlight image showing the region of interest. As a result, a highlight image as shown in FIG. 2 is generated for each of the one or more images in the data set. In one example, each pixel of the highlight image is associated with a degree of influence with respect to a minute change in the corresponding pixel of the original image (that is, image x). This degree of influence is a scalar value and is an example of information indicating the region of interest. The degrees of influence of all pixels of the highlight image corresponding to an image x classified into class k can be expressed by the gradient vector ∇_x f_k(x). The number of elements of the gradient vector is equal to the number of pixels of the highlight image and is therefore also equal to the number of pixels of the original image. The gradient vector indicating the degree of influence of each pixel is also an example of information indicating the region of interest. The highlight unit 14 stores the highlight image including the gradient vector in the storage device 20 as the highlight result.
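The idea of a per-pixel degree of influence with respect to a minute change can be illustrated numerically. The following is a minimal sketch that estimates the gradient ∇_x f_k(x) by central finite differences; it is not the highlighting method of Non-Patent Document 1. The function `toy_model`, its weights, the step size `eps`, and the pixel values are all assumptions made for this example.

```python
import math


def toy_model(x, k):
    """Hypothetical stand-in for f_k: softmax over two fixed linear scores."""
    weights = [[0.5, -1.0, 0.0, 2.0], [-0.5, 1.0, 0.0, -2.0]]
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    return exps[k] / sum(exps)


def influence(f, x, k, eps=1e-5):
    """Estimate the gradient of f_k at x, one pixel at a time."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grad.append((f(up, k) - f(down, k)) / (2 * eps))
    return grad


x = [0.2, 0.4, 0.6, 0.8]  # flattened 2x2 image (illustrative values)
gradient = influence(toy_model, x, k=0)
```

The resulting list has one element per pixel, matching the statement that the number of elements of the gradient vector equals the number of pixels of the original image. In practice an automatic differentiation framework would normally replace the finite-difference loop.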

In step S24, the correctness determination unit 13 determines the correctness of image recognition for each image of the data set. This corresponds to step S13. The detailed flow of this determination process is described with reference to FIG. 7.

In step S241, the correctness determination unit 13 selects one image to be processed. In step S242, the correctness determination unit 13 acquires the image recognition result for that image. This image recognition result is the data output from the machine learning model in the image recognition unit 12. In step S243, the correctness determination unit 13 refers to the data set and acquires the correct label corresponding to the image. In step S244, the correctness determination unit 13 determines the correctness of image recognition by comparing the image recognition result with the correct label. The correctness determination unit 13 determines that the image recognition is correct when the image recognition result matches the correct label, and determines that it is incorrect otherwise. The correctness determination unit 13 stores this correctness determination result in the storage device 20. As shown in step S245, the correctness determination unit 13 executes the processes of steps S241 to S244 for each image of the data set. Step S24 ends when all images of the data set have been processed (YES in step S245).
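Steps S241 to S245 amount to a loop that compares each recognition result with its correct label. A minimal Python sketch follows; the dictionary layout, the image identifiers, and the class names are hypothetical and chosen only for illustration.

```python
def judge_correctness(recognition_results, correct_labels):
    """Map each image id to True (recognition correct) or False (incorrect)."""
    return {
        image_id: recognition_results[image_id] == correct_labels[image_id]
        for image_id in recognition_results
    }


# Hypothetical recognition results and correct labels for two images.
recognition_results = {"img_001": "stop", "img_002": "yield"}
correct_labels = {"img_001": "stop", "img_002": "stop"}

judgments = judge_correctness(recognition_results, correct_labels)
```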

Returning to FIG. 6, in step S25, the collation unit 15 calculates, for each image of the data set, the relationship degree between the region of interest and each attribute label. This corresponds to step S14. The detailed flow of this calculation process is described with reference to FIG. 8.

In step S251, the collation unit 15 selects one image to be processed. In step S252, the collation unit 15 acquires the highlight result for that image. This highlight result is the highlight image generated by the highlight unit 14. In step S253, the collation unit 15 refers to the data set and acquires the one or more attribute labels corresponding to the image. In step S254, the collation unit 15 calculates the relationship degree between the region of interest and each attribute label based on the highlight result and the attribute labels. Specifically, for each of the one or more attribute labels, the collation unit 15 calculates the inner product of the gradient vector and the attribute label as the relationship degree. This inner product is an example of the relationship degree. A larger inner product means a higher relationship degree, that is, a stronger relationship between the region of interest and the attribute label.

The calculation of the inner product for one image is as follows. For each of the one or more attribute labels t, the collation unit 15 generates a vector r_t consisting of the set of element values of that attribute label t for all pixels of the image. In the present disclosure, this vector r_t is also referred to as an "image label vector". The number of elements of the image label vector corresponding to one attribute label t is equal to the number of pixels of the image. The collation unit 15 calculates the inner product of this image label vector r_t and the gradient vector ∇_x f_k(x). This inner product is expressed by the following equation (2). Equation (2) represents the inner product of the image label vector r_t and the gradient vector ∇_x f_k(x) for the n-th image.
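The inner product of an image label vector and a gradient vector can be sketched as follows. The function name `relationship_degree`, the gradient values, and the two label vectors are hypothetical; both vectors are assumed to be flattened in the same pixel order.

```python
def relationship_degree(gradient, label_vector):
    """Inner product of a gradient vector and an image label vector r_t."""
    if len(gradient) != len(label_vector):
        raise ValueError("both vectors must have one element per pixel")
    return sum(g * r for g, r in zip(gradient, label_vector))


# Illustrative 4-pixel image: per-pixel influence and two attribute labels.
gradient = [0.25, -0.5, 0.0, 1.0]
label_vectors = {"sign": [1, 1, 0, 0], "tree": [0, 0, 1, 1]}

degrees = {t: relationship_degree(gradient, r) for t, r in label_vectors.items()}
```

With these values the "tree" label obtains the larger inner product, so in this toy case the region of interest is more strongly related to the tree than to the sign.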

As shown in step S255, the collation unit 15 executes the processes of steps S251 to S254 for each image of the data set. Step S25 ends when all images of the data set have been processed (YES in step S255).

Returning to FIG. 6, in step S26, the analysis unit 16 estimates the attribute label related to erroneous image recognition based on the collation results and the correctness determination results. This corresponds to step S15. The detailed flow of this estimation process is described with reference to FIG. 9.

In step S261, the analysis unit 16 acquires the collation result for each image in the data set. In step S262, the analysis unit 16 acquires the correctness determination result for each of those images. In step S263, the analysis unit 16 estimates the attribute label related to erroneous image recognition based on the collation results and the correctness determination results. Two estimation methods are illustrated below for the present embodiment.

The first method is based on the reasoning that, if an image can be recognized correctly by correcting the pixels that contributed to the erroneous image recognition (that is, misclassification), the attribute label that contributed to the correction is related to the misrecognition. In the first method, the analysis unit 16 corrects pixels using correction weights. A correction weight is an example of a contribution degree, which is an index indicating how much an attribute label contributed to erroneous image recognition. In the present disclosure, an attribute label with a larger correction weight is regarded as contributing more to erroneous image recognition. The pixel correction for one attribute label t is expressed by the following equation (3).

Here, x_ij denotes a pixel value. α_t denotes the correction weight, which is set in the range from 0 to 1 inclusive. The correction weight α_t can be regarded as a parameter for changing pixels, and more specifically as a parameter for changing individual pixel values. r_ijt denotes the value of the attribute label t set for the pixel; therefore, if r_ijt is 0, the pixel value is not corrected.

The analysis unit 16 searches for correction weights α_t such that images whose image recognition results were incorrect are classified correctly after correction, and images whose image recognition results were correct are still classified correctly after correction. This search is a process for finding the attribute labels related to misclassification under a constraint that takes correctly classified images into account. The combinations of image x, attribute label r, and correct label k corresponding to the N misclassified images are denoted {x^(n), r^(n), k^(n)}_{n=1}^N. The combinations of image x′, attribute label r′, and correct label k′ corresponding to the M correctly classified images are denoted {x′^(m), r′^(m), k′^(m)}_{m=1}^M. The analysis unit 16 identifies these combinations using the correctness determination results. In this case, the search for the correction weight α_t amounts to searching for α_t that satisfies the following equation (4).

Here, f_k denotes the function (classification model) that converts a corrected image into a probability. argmax_k f_k means that the class k with the maximum probability is obtained as the recognition result.

The analysis unit 16 acquires the attribute label corresponding to the maximum correction weight α_t as the attribute label related to erroneous image recognition. The objective function expressing the search for the correction weight α_t is defined by the following equation (5).

The analysis unit 16 approximates this objective function by a first-order Taylor expansion. This approximation is defined by the following equation (6), and the inner product (relationship degree) obtained as the collation result is used in this calculation.
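As background on why the inner product appears in such an approximation, the generic first-order Taylor expansion can be written as below. This is only an illustrative sketch and not a reproduction of equations (3), (5), or (6): the perturbation δ, and in particular the assumption that the correction for attribute label t shifts the image along r_t scaled by α_t, are introduced here for illustration.

```latex
% Generic first-order Taylor expansion of f_k around the image x
f_k(x + \delta) \approx f_k(x) + \langle \nabla_x f_k(x),\, \delta \rangle

% Assumed (hypothetical) additive perturbation along the image label vectors
\delta = \sum_t \alpha_t\, r_t
\quad\Longrightarrow\quad
f_k(x + \delta) \approx f_k(x) + \sum_t \alpha_t\, \langle r_t,\, \nabla_x f_k(x) \rangle
```

Under that assumption, the change in the class probability becomes linear in the correction weights α_t, with the relationship degrees ⟨r_t, ∇_x f_k(x)⟩ as coefficients, so the inner products computed in step S25 can be reused without re-evaluating the model.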

The analysis unit 16 calculates the maximum correction weight α_t based on equation (6) and acquires the attribute label corresponding to that maximum correction weight α_t as the attribute label related to erroneous image recognition.

Like the first method, the second method is based on the reasoning that, if an image can be classified correctly by correcting the pixels that contributed to the erroneous image recognition (that is, misclassification), the attribute label that contributed to the correction is related to the misrecognition. The second method also uses the correction weight α_t, as in the first method.

In the second method as well, the analysis unit 16 searches for correction weights α_t such that images whose image recognition results were incorrect are classified correctly after correction, and images whose image recognition results were correct are still classified correctly after correction. As in the first method, the analysis unit 16 uses the correctness determination results to identify the combinations of image x, attribute label r, and correct label k corresponding to the N misclassified images, and the combinations of image x′, attribute label r′, and correct label k′ corresponding to the M correctly classified images.

In the second method, the objective function expressing the search for the maximum correction weight α_t is defined by the following equation (7).

Here, f_{k^(n)} denotes the probability that the image belongs to the correct class, and f_h denotes the probability that the image belongs to an incorrect class (that is, a class other than the correct class). This objective function therefore aims to obtain the correction weight α_t that maximizes the difference between the probability of the correct class and the probability of an incorrect class, while still classifying the correctly classified images correctly after correction.

The analysis unit 16 approximates this objective function by a first-order Taylor expansion. In one example, this approximation is defined by the following equation (8), and the inner product (relationship degree) obtained as the collation result is used in this calculation.

The analysis unit 16 calculates the maximum correction weight α_t based on equation (8) and acquires the attribute label corresponding to that maximum correction weight α_t as the attribute label related to erroneous image recognition.

As a modification of the second method, the analysis unit 16 may approximate the objective function by the first-order Taylor expansion given by the following equation (9). The analysis unit 16 calculates the maximum correction weight α_t based on equation (9) and acquires the attribute label corresponding to that maximum correction weight α_t as the attribute label related to erroneous image recognition. The inner product (relationship degree) obtained as the collation result is also used in this calculation.

Here, λΣ_m β_m is a term that relaxes the constraint within a certain range, and λ is a parameter that controls this term. γΣ_t |α_t| is a term intended to extract the important attribute labels by making the correction weights α_t sparse (that is, by setting as many correction weights α_t as possible to 0). γ is a parameter that keeps multiple attribute labels from being estimated as related to misrecognition. In other words, γ is a parameter for reducing the number of attribute labels estimated to be causes of misrecognition.

Both the first and second methods are examples of a process that calculates, as the contribution degree, a correction weight satisfying two conditions. First, when a first corrected image, obtained by applying the correction weight to an image whose image recognition result was correct, is input into the machine learning model, the new image recognition result is correct. Second, when a second corrected image, obtained by applying the correction weight to an image whose image recognition result was incorrect, is input into the machine learning model, the new image recognition result is also correct. The analysis unit 16 calculates this correction weight for each of the one or more attribute labels. The analysis unit 16 then estimates the attribute label with the highest correction weight (contribution degree) as the attribute label related to erroneous image recognition.

Both the first and second methods use both images whose image recognition results are correct and images whose image recognition results are incorrect. However, when misrecognized images are few or absent, α may be searched for using only images whose image recognition results are correct. In this search, the correctness determination unit 13 is used to calculate the probability that each input image belongs to its correct class. A threshold is applied to the probability of the correct class, and the combinations of image x, attribute label r, and correct label k corresponding to the N images classified with a probability below the threshold are denoted {x^(n), r^(n), k^(n)}_{n=1}^N. The combinations of image x′, attribute label r′, and correct label k′ corresponding to the M images classified with a probability at or above the threshold are denoted {x′^(m), r′^(m), k′^(m)}_{m=1}^M. Applying either the first or the second method to these combinations estimates the attribute label that affects erroneous recognition.
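The threshold-based grouping described above can be sketched as a simple partition. The field name `p_correct`, the sample values, and the threshold of 0.8 are hypothetical; the source does not specify how the threshold is chosen.

```python
def split_by_confidence(samples, threshold):
    """Split samples on the probability assigned to their correct class."""
    below = [s for s in samples if s["p_correct"] < threshold]
    at_or_above = [s for s in samples if s["p_correct"] >= threshold]
    return below, at_or_above


# Hypothetical correctly recognized images with their correct-class probability.
samples = [
    {"image_id": "img_001", "p_correct": 0.95},
    {"image_id": "img_002", "p_correct": 0.55},
    {"image_id": "img_003", "p_correct": 0.80},
]

low_confidence, high_confidence = split_by_confidence(samples, threshold=0.8)
```

The low-confidence group then takes the place of the N misclassified images and the high-confidence group takes the place of the M correctly classified images.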

The analysis unit 16 outputs the estimated attribute label as the analysis result. The method of outputting the analysis result is not limited. For example, the analysis unit 16 may store the analysis result in the storage device 20, display it on a monitor, or transmit it to another computer.

[Effect]
As described above, a computer system according to one aspect of the present disclosure includes a processor. The processor acquires, for each of a plurality of images, a correctness determination result indicating whether the image recognition result obtained by inputting the image into a machine learning model is correct; calculates, for each of the plurality of images, a relationship degree indicating the relationship between the region of interest in the image in the machine learning model and each of one or more attribute labels associated with the image in advance; and estimates, based on the correctness determination result and the relationship degree of each of the plurality of images, an attribute label related to erroneous image recognition from the one or more attribute labels.

A program according to one aspect of the present disclosure causes a computer to execute: a step of acquiring, for each of a plurality of images, a correctness determination result indicating whether the image recognition result obtained by inputting the image into a machine learning model is correct; a step of calculating, for each of the plurality of images, a relationship degree indicating the relationship between the region of interest in the image in the machine learning model and each of one or more attribute labels associated with the image in advance; and a step of estimating, based on the correctness determination result and the relationship degree of each of the plurality of images, an attribute label related to erroneous image recognition from the one or more attribute labels.

In these aspects, obtaining the relationship degree identifies the attribute labels related to the region that the machine learning model focused on within an image. Considering the relationship degree together with the correctness of image recognition across a plurality of images then makes it possible to estimate which attribute label affected the erroneous image recognition. Executing this series of processes allows a machine learning model for image recognition to be analyzed automatically. With this automatic analysis, for example, the cause of misrecognition in a machine learning model can be identified efficiently by analyzing a large number of images, without relying on visual inspection by an expert. In another example, the automatic analysis can identify causes of misrecognition that expert knowledge does not cover.

In a computer system according to another aspect, the processor may calculate, for each of the one or more attribute labels, a contribution degree to erroneous image recognition based on the correctness determination result and the relationship degree of each of the plurality of images, and may estimate, from the one or more attribute labels, at least the attribute label with the highest contribution degree as the attribute label related to erroneous image recognition. Calculating the contribution degree for each attribute label and selecting the attribute label with the highest contribution degree extracts the attribute label estimated to be most related to erroneous image recognition.

In a computer system according to another aspect, the processor may calculate a correction weight as the contribution degree for each of the one or more attribute labels. The correction weight is one for which the new image recognition result, obtained by inputting into the machine learning model a first corrected image produced by applying the correction weight to an image whose image recognition result was correct, is correct, and for which the new image recognition result, obtained by inputting into the machine learning model a second corrected image produced by applying the correction weight to an image whose image recognition result was incorrect, is also correct. If an image can be recognized correctly by correcting the pixels that contributed to erroneous image recognition, the attribute label that contributed to the correction is highly likely to be related to the misrecognition. Using this correction weight as the contribution degree therefore allows the attribute label related to erroneous image recognition to be estimated accurately.

In a computer system according to another aspect, the processor may estimate the attribute label with the highest contribution degree by approximating the objective function for obtaining the contribution degree with a first-order Taylor expansion. Introducing the first-order Taylor expansion simplifies the calculation of the contribution degree, so the attribute label related to erroneous image recognition can be estimated in a shorter time.

In a computer system according to another aspect, the processor may calculate, as the relationship degree, the inner product of a gradient vector indicating the region of interest and an image label vector indicating an attribute label. Using this inner product makes it easy to obtain the degree of relationship between each attribute label and the region of interest, so the attribute label related to erroneous image recognition can be estimated in a shorter time.

[Modifications]
The present disclosure has been described in detail above based on its embodiments. However, the present disclosure is not limited to the above embodiments and can be modified in various ways without departing from its gist.

In the above embodiment, the analysis system 1 calculates the image recognition result, the correctness determination result, and the highlight result. However, the computer system according to the present disclosure may acquire at least one of these results from an external computer system. That is, at least one of image recognition, correctness determination, and highlight processing is not an essential process of the computer system according to the present disclosure.

In the above embodiment, the inner product is shown as an example of the relationship degree indicating the relationship between the region of interest and an attribute label, but a parameter other than the inner product may be used as the relationship degree. Likewise, the correction weight is shown as an example of the contribution degree indicating how much an attribute label contributed to erroneous image recognition, but a parameter other than the correction weight may be used as the contribution degree.

In the above embodiment, the analysis unit 16 estimates the attribute label with the highest contribution degree (correction weight) as the attribute label related to erroneous image recognition. However, the estimated attribute labels are not limited to this. For example, the computer system according to the present disclosure may estimate one or more attribute labels whose contribution degree (for example, correction weight) is equal to or greater than a given threshold as attribute labels related to erroneous image recognition. In either case, the computer system according to the present disclosure estimates at least the attribute label with the highest contribution degree (correction weight) as an attribute label related to erroneous image recognition.
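The two selection policies (highest contribution degree only, or every label at or above a threshold) can be sketched as follows. The function name `estimate_labels`, the contribution values, and the thresholds are hypothetical. The sketch always includes the top label, reflecting the statement that at least the attribute label with the highest contribution degree is estimated.

```python
def estimate_labels(contributions, threshold=None):
    """Select attribute labels related to erroneous recognition.

    Without a threshold, only the label with the highest contribution is
    returned. With a threshold, every label at or above it is returned, and
    the top label is always included.
    """
    top = max(contributions, key=contributions.get)
    if threshold is None:
        return [top]
    selected = [t for t, c in contributions.items() if c >= threshold]
    if top not in selected:
        selected.append(top)
    return selected


# Hypothetical correction weights (contribution degrees) per attribute label.
contributions = {"sky": 0.05, "tree": 0.7, "pillar": 0.4, "occlusion": 0.9}

top_only = estimate_labels(contributions)
above_half = estimate_labels(contributions, threshold=0.5)
```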

In the present disclosure, the expression "the processor executes a first process, executes a second process, ... and executes an n-th process", or any corresponding expression, denotes a concept that includes the case where the entity executing the n processes from the first process to the n-th process (that is, the processor) changes partway through. That is, this expression covers both the case where all n processes are executed by the same processor and the case where the processor changes among the n processes according to any policy.

When a computer system compares the magnitudes of two numerical values, either of the two criteria "greater than or equal to" and "greater than" may be used, and either of the two criteria "less than or equal to" and "less than" may be used. The choice between such criteria does not change the technical significance of the process of comparing the two values.

The processing procedure of the method executed by the processor is not limited to the examples in the above embodiment. For example, some of the steps (processes) described above may be omitted, or the steps may be executed in a different order. Any two or more of the steps described above may be combined, and some of the steps may be modified or deleted. Alternatively, other steps may be executed in addition to the steps described above.

The aspects described in all or part of the above embodiments solve any one of the following problems: control relating to image processing; improvement of processing speed; improvement of processing accuracy; improvement of usability; improvement of functions that use data, or provision of appropriate functions; other improvement of functions or provision of appropriate functions; provision of appropriate data, programs, recording media, devices, and/or systems, such as reduction of the size of data and/or programs or miniaturization of devices and/or systems; and optimization of the production or manufacture of data, programs, recording media, devices, and/or systems, such as reduction of the production or manufacturing cost of data, programs, devices, or systems, facilitation of production or manufacture, or shortening of production or manufacturing time.

1 ... analysis system, 10 ... information processing device, 11 ... data acquisition unit, 12 ... image recognition unit, 13 ... correctness determination unit, 14 ... highlight unit, 15 ... collation unit, 16 ... analysis unit, 20 ... storage device.

Claims

A computer system comprising a processor,
wherein the processor:
acquires, for each of a plurality of images, a correctness determination result indicating whether an image recognition result obtained by inputting the image into a machine learning model is correct;
calculates, for each of the plurality of images, a degree of relationship indicating the relationship between a region of interest in the image in the machine learning model and each of one or more attribute labels associated with the image in advance; and
estimates, based on the correctness determination result and the degree of relationship of each of the plurality of images, an attribute label related to erroneous image recognition from the one or more attribute labels.

The computer system according to claim 1, wherein the processor:
calculates, for each of the one or more attribute labels, a degree of contribution to the erroneous image recognition based on the correctness determination result and the degree of relationship of each of the plurality of images; and
estimates, from the one or more attribute labels, at least the attribute label having the highest degree of contribution as the attribute label related to the erroneous image recognition.

The computer system according to claim 2, wherein the processor calculates, for each of the one or more attribute labels, as the degree of contribution, a correction weight such that a new image recognition result, obtained by inputting into the machine learning model a first corrected image obtained by applying the correction weight to an image for which the image recognition result was correct, is correct, and such that a new image recognition result, obtained by inputting into the machine learning model a second corrected image obtained by applying the correction weight to an image for which the image recognition result was incorrect, is also correct.

The computer system according to claim 2 or 3, wherein the processor estimates the attribute label having the highest degree of contribution by approximating an objective function for the degree of contribution by a first-order Taylor expansion.

The computer system according to any one of claims 1 to 4, wherein the processor calculates, as the degree of relationship, the inner product of a gradient vector indicating the region of interest and an image label vector indicating the attribute label.

A program that causes a computer to execute:
a step of acquiring, for each of a plurality of images, a correctness determination result indicating whether an image recognition result obtained by inputting the image into a machine learning model is correct;
a step of calculating, for each of the plurality of images, a degree of relationship indicating the relationship between a region of interest in the image in the machine learning model and each of one or more attribute labels associated with the image in advance; and
a step of estimating, based on the correctness determination result and the degree of relationship of each of the plurality of images, an attribute label related to erroneous image recognition from the one or more attribute labels.