JP2021039757A

JP2021039757A - Classifying apparatus, learning apparatus, classifying method, learning method, control program, and recording medium

Info

Publication number: JP2021039757A
Application number: JP2020146890A
Authority: JP
Inventors: Habaragamuwa Harshana; Masaru Oishi; Masaru Takeya; Kenichi Tanaka
Original assignee: National Agriculture and Food Research Organization
Current assignee: National Agriculture and Food Research Organization
Priority date: 2019-09-02
Filing date: 2020-09-01
Publication date: 2021-03-11
Anticipated expiration: 2040-09-01

Abstract

To realize a technique for carrying out classification processing more suitably. SOLUTION: A classifying apparatus (1) according to an embodiment of the present invention has an acquiring unit (5) for acquiring input data that can take a plurality of kinds, a first classifier (9) for mapping the input data to feature vectors in a feature amount space, and a second classifier (11) for outputting information relating to the plurality of kinds using the feature vectors in the feature amount space as input. The second classifier (11) receives feature vectors other than those in the common part of the feature vectors in the feature amount space corresponding to the respective input data of the plurality of kinds. SELECTED DRAWING: Figure 1

Description

The present invention relates to a classification device, a learning device, a classification method, a learning method, a control program, and a recording medium.

Classification devices using neural networks are conventionally known. As one example, such a classification device classifies an object as genuine or counterfeit. A convolutional neural network (CNN) for determining whether a plant is healthy or diseased is also known (Non-Patent Document 1).

Toda, Y., & Okura, F. (2019). How Convolutional Neural Networks Diagnose Plant Disease. Plant Phenomics, 2019, 1-14. URL (https://doi.ort/10.34133/2019/9237136)

Classification devices face a first problem: it is desirable for classification processing to be performed more suitably.

The conventional techniques described above also have the following second problem. Because a learning model is usually a black box, the user cannot adequately interpret (or explain) which features of the object the model has learned and uses for classification (the interpretability problem). In addition, it is not possible to confirm whether the algorithm of the learning model has properly learned the features of the object.

An object of one aspect of the present invention is to realize a technique that solves at least one of the first problem and the second problem.

To solve at least the first problem, a classification device according to one aspect of the present invention includes: an acquisition unit that acquires input data that can take a plurality of types; a first classifier that maps the input data to a feature vector in a feature space; and a second classifier that takes a feature vector in the feature space as input and outputs information regarding the type. The second classifier receives feature vectors other than those in the common part of the feature vectors in the feature space that correspond to the respective types of input data.
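The data flow of this aspect can be illustrated with a minimal sketch. It assumes, purely for illustration, that the feature vector has six components and that the common part occupies fixed positions; the names `LAYOUT`, `drop_common`, `classify`, `toy_first_classifier`, and `toy_second_classifier` are hypothetical and do not appear in the source, and the two toy functions merely stand in for trained networks.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical layout of a 6-component feature vector (an assumption for
# illustration; the source does not fix dimensions or ordering).
LAYOUT: Dict[str, range] = {
    "healthy_specific": range(0, 2),
    "disease_specific": range(2, 4),
    "common": range(4, 6),
}


def drop_common(z: Sequence[float]) -> List[float]:
    """Keep only the components that lie outside the common part."""
    common = set(LAYOUT["common"])
    return [v for i, v in enumerate(z) if i not in common]


def classify(
    x: Sequence[float],
    first_classifier: Callable[[Sequence[float]], List[float]],
    second_classifier: Callable[[List[float]], str],
) -> str:
    """Map input to a feature vector, remove the common part, then classify."""
    z = first_classifier(x)
    return second_classifier(drop_common(z))


def toy_first_classifier(x: Sequence[float]) -> List[float]:
    # Stand-in for a trained encoder: treats the input as the feature vector.
    return list(x)


def toy_second_classifier(features: List[float]) -> str:
    # Stand-in for a trained classifier: compares the two type-specific parts.
    healthy_score = features[0] + features[1]
    disease_score = features[2] + features[3]
    return "healthy" if healthy_score >= disease_score else "diseased"


result = classify(
    [0.9, 0.8, 0.1, 0.0, 5.0, 5.0], toy_first_classifier, toy_second_classifier
)
print(result)
```

In this sketch the large values in the common components have no effect on the result, because they are removed before the second classifier sees the vector.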

To solve at least the second problem, a learning device according to one aspect of the present invention includes a learning unit that trains a first classifier, the first classifier mapping input data that can take a plurality of types to a feature vector in a feature space that includes a plurality of mutually different subspaces. The learning unit includes a decoder that takes a feature vector in the feature space as input. When first-type data is input to the first classifier as the input data, the learning unit performs masked learning in which it masks the subspace of the feature space corresponding to features unique to second-type data and then trains the first classifier and the decoder so that the difference between the data output by the decoder and the input data becomes small. When second-type data is input to the first classifier as the input data, the learning unit performs masked learning in which it masks the subspace of the feature space corresponding to features unique to first-type data and then trains the first classifier and the decoder so that the difference between the data output by the decoder and the input data becomes small.

To solve at least the second problem, a learning device according to one aspect of the present invention includes: a learning unit that trains a first classifier that maps input data that can take a plurality of types to a feature vector in a feature space; and a third classifier that takes a feature vector in the feature space as input and outputs information regarding the type. The learning unit trains the first classifier so that the classification accuracy of the third classifier becomes low when a feature vector included in the common part of the feature vectors in the feature space corresponding to the respective types of input data is input to the third classifier.
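One way to express "train the first classifier so that the third classifier does poorly on the common part" is an adversarial objective. The sketch below is an assumption, not a formula taken from the source: it supposes that the third classifier outputs class probabilities, that it is itself trained with an ordinary cross-entropy loss, and that the first classifier's loss subtracts that cross-entropy from a reconstruction error. The function names and the `weight` parameter are hypothetical.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], label: int) -> float:
    """Negative log-probability assigned to the true label."""
    return -math.log(max(probs[label], 1e-12))


def third_classifier_loss(probs_from_common: Sequence[float], label: int) -> float:
    # Assumption: the third classifier tries to predict the type correctly
    # from the common part alone.
    return cross_entropy(probs_from_common, label)


def first_classifier_loss(
    reconstruction_error: float,
    probs_from_common: Sequence[float],
    label: int,
    weight: float = 1.0,
) -> float:
    # The first classifier is penalised when the common part lets the third
    # classifier recognise the type, i.e. when the cross-entropy is small.
    return reconstruction_error - weight * cross_entropy(probs_from_common, label)


# Common part carries no type information: third classifier is at chance.
loss_uninformative = first_classifier_loss(0.5, [0.5, 0.5], label=0)
# Common part leaks the type: third classifier is confidently correct.
loss_leaky = first_classifier_loss(0.5, [0.99, 0.01], label=0)
print(loss_uninformative < loss_leaky)
```

A plain negative cross-entropy is unbounded and can reward confidently wrong predictions; pushing the third classifier's output toward a uniform distribution is another option one might consider. Which variant, if any, matches the embodiment is not stated in this section.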

To solve at least the first problem, a classification method according to one aspect of the present invention is a classification method executed by a device, and includes: an acquisition step of acquiring input data that can take a plurality of types; a first classification step of mapping the input data to a feature vector in a feature space; and a second classification step of taking a feature vector in the feature space as input and outputting information regarding the type. In the second classification step, feature vectors other than those in the common part of the feature vectors in the feature space corresponding to the respective types of input data are input.

To solve at least the second problem, a learning method according to one aspect of the present invention includes a learning step of training a first classifier that maps input data that can take a plurality of types to a feature vector in a feature space that includes a plurality of mutually different subspaces. In the learning step, when first-type data is input to the first classifier as the input data, masked learning is performed in which the subspace of the feature space corresponding to features unique to second-type data is masked, and the first classifier and a decoder that takes a feature vector in the feature space as input are then trained so that the difference between the data output by the decoder and the input data becomes small. When second-type data is input to the first classifier as the input data, masked learning is performed in which the subspace of the feature space corresponding to features unique to first-type data is masked, and the first classifier and the decoder are then trained so that the difference between the data output by the decoder and the input data becomes small.

To solve at least the second problem, a learning method according to one aspect of the present invention includes a learning step of training a first classifier that maps input data that can take a plurality of types to a feature vector in a feature space. In the learning step, the first classifier is trained so that, when a feature vector included in the common part of the feature vectors in the feature space corresponding to the respective types of input data is input to a third classifier that takes a feature vector in the feature space as input and outputs information regarding the type, the classification accuracy of the third classifier becomes low.

According to one aspect of the present invention, a technique can be realized that solves at least one of the first problem and the second problem.

FIG. 1 is a functional block diagram of the classification device according to the embodiment.
FIG. 2 is a flowchart showing the flow of the learning process according to the embodiment.
FIG. 3 is a conceptual diagram showing unsupervised learning of the first classifier and the decoder according to the embodiment.
FIG. 4 is a conceptual diagram showing conditional learning of the first classifier and the decoder according to the embodiment, and the supervised learning of the second classifier that is executed subsequently.
FIG. 5 is a diagram showing an example of images generated by adding features of the diseased state or features of the healthy state to an input image according to the embodiment.
FIG. 6 is a conceptual diagram showing a learning example of the first classifier, the decoder, and the third classifier according to the embodiment.
FIG. 7 is a conceptual diagram showing supervised learning of the second classifier according to the embodiment.
FIG. 8 is a flowchart showing the flow of the classification process according to the embodiment.
FIG. 9 is a conceptual diagram showing a learning example of the first classifier, the decoder, and the third classifier according to the supplementary notes to the embodiment.
FIG. 10 is a conceptual diagram showing supervised learning of the second classifier according to the supplementary notes to the embodiment.
FIG. 11 is a conceptual diagram showing an example of interpretability improvement processing according to the supplementary notes to the embodiment.
FIG. 12 is a diagram showing an example of an input image input to the classification device (learning device) according to the supplementary notes to the embodiment, and examples of generated images.
FIG. 13 is a conceptual diagram explaining an example of processing for determining the features that had a greater influence on the classification result.

[Embodiment]
An embodiment of the present invention is described in detail below. A classification device according to one aspect of the present invention is a device that classifies input data, such as images, into one of a plurality of types.

[1. Classification device configuration example]
FIG. 1 is a functional block diagram of the classification device (learning device) 1 according to the present embodiment. As shown in FIG. 1, the classification device 1 includes a control unit 3, an output unit 17, and a storage unit 19.

The control unit 3 is a control device that governs the classification device 1 as a whole, and also functions as an acquisition unit 5, a classification unit 7, a learning unit 13, and a decoder 15.

The acquisition unit 5 acquires input data that can take a plurality of types. The form of the input data acquired by the acquisition unit 5 does not limit the present embodiment. In the following description, the input data is image data of an organism, and the plurality of types includes types indicating whether the organism is healthy or diseased. In the present embodiment, image data of plant leaves is used as an example of image data of an organism, but the image may instead show, for example, a part of the body of an animal or an insect. The number of types is not limited to two and may be three or more, as described later. In another aspect, the input data may be image data of an industrial product or the like, with the plurality of types including types indicating whether the product is a normal, non-defective item or a defective item. In yet another aspect, the input data may be audio data, with the plurality of types including types classified by the source of the sound.

The source from which the acquisition unit 5 acquires the input data is not particularly limited; it may be, for example, an external device connected via the Internet or the storage unit 19 inside the classification device 1.

(Classification unit 7)
The classification unit 7 classifies the input data into one of a plurality of types and includes a first classifier 9, a second classifier 11, and a third classifier 12. The classification processing by the classification unit 7 and the decoding processing by the decoder 15 described later are defined by a learning model corresponding to a parameter set stored in the storage unit 19. The learning model in the present embodiment includes at least one of a model that associates input data with feature vectors in the feature space described later and a model that associates vectors in the feature space with output data.

(First classifier 9)
The first classifier 9 maps the input data acquired by the acquisition unit 5 to a feature vector (latent variable) in a feature space that includes a plurality of mutually different subspaces. The specific configuration of the first classifier 9 does not limit the present embodiment; as one example, the encoder-side network of an autoencoder (AE) can be used. Alternatively, the encoder-side network of a variational autoencoder (VAE) may be used as the first classifier 9. The feature space is described in detail later.

(Second classifier 11)
The second classifier 11 takes a feature vector in the feature space as input and outputs information regarding the type. The specific configuration of the second classifier 11 does not limit the present embodiment; as one example, a fully connected neural network can be used. The information regarding the type output by the second classifier 11 likewise does not limit the present embodiment; as one example, it is information indicating an estimate of whether the plant leaf shown in the image to be classified is in a healthy state or a diseased state.

The learning unit 13 performs the various kinds of training of the classification unit 7 and the decoder 15. The learning process performed by the learning unit 13 is described in detail later. As an alternative to the configuration shown in FIG. 1, the decoder 15 may be included in the learning unit 13.

In the following description, first-type data is an image showing a plant leaf in a healthy state, and second-type data is an image showing a plant leaf in a diseased state.

(Third classifier 12)
The third classifier 12 is used to determine whether the subspaces of the feature space have been defined suitably. The third classifier 12 takes a feature vector in the feature space as input and outputs information regarding the type; the processing is described in detail later.

(Decoder 15)
The decoder 15 takes a feature vector in the feature space as input and outputs a decoded image. The specific configuration of the decoder 15 does not limit the present embodiment; as one example, the decoder-side network of an autoencoder (AE) can be used. Alternatively, the decoder-side network of a variational autoencoder (VAE) may be used as the decoder 15.

The output unit 17 is a display panel, a speaker, or the like that outputs the classification result of the classification unit 7, or a communication module that outputs the classification result to an external display device or the like. The storage unit 19 is a storage device that stores various data, including the functions and parameter sets that define the learning model.

The classification device 1 may also include an input interface, such as a keyboard, through which the user issues instructions to the classification device 1.

[2. Learning process example]
An example of the learning process in the classification device 1 is described below. In this example, the learning process broadly consists of the following steps.

(1) Unsupervised learning of the first classifier 9 and the decoder 15 is performed in advance using the learning data.

(2) Using data labeled as first-type data or second-type data, conditional learning of the first classifier 9 and the decoder 15 is performed. At the same time, conditional learning of the first classifier 9 is performed so that the classification accuracy of the third classifier 12 becomes low when a feature vector included in the common part of the feature vectors is input to the third classifier 12.

(3) With the parameters of the learning model used by the encoder included in the first classifier 9 held fixed, supervised learning of the second classifier 11 is performed using the output of the encoder.
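The three steps above can be summarised as a small schedule that records which components are updated at each stage. This is only a bookkeeping sketch: the `Stage` class, the component names, and `frozen_components` are hypothetical, and the assumption that the third classifier 12 is itself updated during step (2) is an interpretation rather than something step (2) states.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

ALL_COMPONENTS: FrozenSet[str] = frozenset(
    {"first_classifier", "decoder", "second_classifier", "third_classifier"}
)


@dataclass(frozen=True)
class Stage:
    name: str
    updates: FrozenSet[str]  # components whose parameters change in this stage
    needs_labels: bool       # whether labeled data is required


STAGES: List[Stage] = [
    Stage(
        "(1) unsupervised pre-training",
        frozenset({"first_classifier", "decoder"}),
        needs_labels=False,
    ),
    # Assumption: the third classifier is also trained during stage (2).
    Stage(
        "(2) conditional learning",
        frozenset({"first_classifier", "decoder", "third_classifier"}),
        needs_labels=True,
    ),
    # The encoder parameters are held fixed while the second classifier learns.
    Stage(
        "(3) supervised learning of the second classifier",
        frozenset({"second_classifier"}),
        needs_labels=True,
    ),
]


def frozen_components(stage: Stage) -> FrozenSet[str]:
    """Components that are not updated during the given stage."""
    return ALL_COMPONENTS - stage.updates


for stage in STAGES:
    print(stage.name, "| updates:", sorted(stage.updates),
          "| labels needed:", stage.needs_labels)
```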

FIG. 2 is a flowchart showing the flow of the learning process performed by the classification device 1 according to the present embodiment.

In step S101, the acquisition unit 5 acquires images of plant leaves as the input data used for learning. At least some of these images are labeled to indicate whether the leaf in the image is in a healthy state or a diseased state.

In step S102, the learning unit 13 performs unsupervised learning of the first classifier 9 and the decoder 15 using the images acquired by the acquisition unit 5. FIG. 3 is a conceptual diagram showing the unsupervised learning in step S102. Unsupervised learning here means inputting an image acquired by the acquisition unit 5 into the first classifier 9 and updating the parameters of the first classifier 9 and the decoder 15 so that the difference between the image output by the decoder 15 and the image acquired by the acquisition unit 5 becomes smaller.

Because the learning executed in step S102 is unsupervised, the images used for learning do not need to carry the above labels.

Through this unsupervised learning, a feature vector corresponding to each item of input data is generated in the feature space. The unsupervised learning can also give rise to a situation in which the feature vectors corresponding to the input data are distributed unevenly in the feature space according to the type of the input data.

The learning unit 13 trains the first classifier 9 and the decoder 15 so that the difference between the image output by the decoder 15 and the image input to the first classifier 9 becomes small. The masking described later is not applied in the unsupervised learning of step S102.
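The reconstruction objective of step S102 can be illustrated with a deliberately tiny stand-in. The sketch below assumes a linear encoder with a one-dimensional latent value, a linear decoder, a squared-error loss, and plain stochastic gradient descent; none of these choices comes from the source, which leaves the network architecture open, and the two-component "images" are toy data.

```python
from typing import List, Sequence

Vector = List[float]


def encode(e: Vector, x: Sequence[float]) -> float:
    """Linear encoder: maps a 2-component input to a scalar latent value."""
    return sum(w * v for w, v in zip(e, x))


def decode(d: Vector, z: float) -> Vector:
    """Linear decoder: maps the scalar latent value back to input space."""
    return [w * z for w in d]


def reconstruction_loss(e: Vector, d: Vector, x: Sequence[float]) -> float:
    x_hat = decode(d, encode(e, x))
    return sum((a - b) ** 2 for a, b in zip(x_hat, x))


def train_step(e: Vector, d: Vector, x: Sequence[float], lr: float = 0.01) -> None:
    """One gradient-descent update of encoder and decoder on a single sample."""
    z = encode(e, x)
    x_hat = decode(d, z)
    err = [2.0 * (a - b) for a, b in zip(x_hat, x)]  # dL/dx_hat
    grad_z = sum(g * w for g, w in zip(err, d))      # dL/dz, using d before update
    for i in range(len(d)):
        d[i] -= lr * err[i] * z
    for j in range(len(e)):
        e[j] -= lr * grad_z * x[j]


def total_loss(e: Vector, d: Vector, data: Sequence[Sequence[float]]) -> float:
    return sum(reconstruction_loss(e, d, x) for x in data)


# Toy unlabeled data lying on a line, so one latent dimension suffices.
data = [[1.0, 2.0], [2.0, 4.0], [-1.0, -2.0], [0.5, 1.0]]
e, d = [0.1, 0.1], [0.1, 0.1]

initial = total_loss(e, d, data)
for _ in range(200):
    for x in data:
        train_step(e, d, x)
final = total_loss(e, d, data)
print(initial, final)
```

No labels are used anywhere in this loop, which mirrors the point that step S102 does not require labeled images.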

The processing of step S102 also allows the weights and other learning-related parameters of the first classifier 9 and the decoder 15 to be adjusted to the tendencies of the input data.

In step S103, the learning unit 13 performs conditional learning of the first classifier 9 and the decoder 15 using the labeled images acquired by the acquisition unit 5. As a result, a plurality of subspaces corresponding to mutually different sets of feature vectors are defined in the feature space. It is also determined whether each feature vector and each subspace corresponds to the features of leaves in a healthy state or to the features of leaves in a diseased state. To generate results that are easy for the user to understand, the learning unit 13 may apply a generative adversarial network (GAN) to the images generated by the first classifier 9 and the decoder 15 and the input images.

As a result of the conditional learning in step S103, each feature vector lies in one of a plurality of mutually different subspaces according to its type. These subspaces include a subspace corresponding to feature vectors that represent the features of leaves in a healthy state and a subspace corresponding to feature vectors that represent the features of leaves in a diseased state. The subspaces also share a common part, which corresponds to features that do not depend on whether the leaf is healthy or diseased, that is, to features of the leaf itself.

From another perspective, the subspace corresponding to feature vectors that represent the features of leaves in a healthy state consists of:
- the subspace corresponding to feature vectors that represent features unique to leaves in a healthy state (the subspace specific to the healthy state), and
- the subspace corresponding to features that do not depend on whether the leaf is healthy or diseased, that is, features common to the leaf itself (the common subspace).
The subspace corresponding to feature vectors that represent the features of leaves in a diseased state consists of:
- the subspace corresponding to feature vectors that represent features unique to leaves in a diseased state (the subspace specific to the diseased state), and
- the subspace corresponding to features that do not depend on whether the leaf is healthy or diseased, that is, features common to the leaf itself (the common subspace).

Put differently, the subspace corresponding to feature vectors that represent the features of leaves in a healthy state is the sum space of the subspace specific to the healthy state and the common subspace. Likewise, the subspace corresponding to feature vectors that represent the features of leaves in a diseased state is the sum space of the subspace specific to the diseased state and the common subspace.
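If one assumes that each subspace is spanned by a fixed group of feature-vector components (an assumption made here for illustration; the source does not state how the subspaces are laid out), the relationships above reduce to simple set operations on component indices. All names and index values below are hypothetical.

```python
from typing import FrozenSet, List, Sequence

# Hypothetical component indices of a 6-component feature vector.
HEALTHY_SPECIFIC: FrozenSet[int] = frozenset({0, 1})
DISEASE_SPECIFIC: FrozenSet[int] = frozenset({2, 3})
COMMON: FrozenSet[int] = frozenset({4, 5})

# For coordinate-aligned subspaces, the sum space is spanned by the union
# of the two index groups.
HEALTHY_SUBSPACE = HEALTHY_SPECIFIC | COMMON
DISEASED_SUBSPACE = DISEASE_SPECIFIC | COMMON


def project(z: Sequence[float], subspace: FrozenSet[int]) -> List[float]:
    """Zero every component that lies outside the given subspace."""
    return [v if i in subspace else 0.0 for i, v in enumerate(z)]


z = [0.9, 0.8, 0.3, 0.2, 0.5, 0.4]
print(project(z, HEALTHY_SUBSPACE))   # keeps healthy-specific and common parts
print(project(z, DISEASED_SUBSPACE))  # keeps disease-specific and common parts
print(sorted(HEALTHY_SUBSPACE & DISEASED_SUBSPACE))  # the shared common part
```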

FIG. 4 is a conceptual diagram showing the conditional learning. In the example of FIG. 4, the feature vector 21 represents features unique to leaves in a diseased state, the feature vector 23 represents features unique to leaves in a healthy state, and the feature vector 25 represents features of the leaf itself that are common to healthy and diseased leaves. The learning process of the second classifier 11 shown in FIG. 4 is executed after the learning process of the first classifier 9 and the decoder 15; it is described later under step S104.

In step S103, when performing conditional learning of the first classifier 9, the learning unit 13 uses a mask so that the feature vectors in the common part of the subspaces do not contain feature vectors unique to healthy leaves or feature vectors unique to diseased leaves, and so that each feature is reflected in its corresponding feature vector. The learning unit 13 masks the space other than the subspace corresponding to the type of the input image, and then trains the first classifier 9 and the decoder 15 so that the difference between the image output by the decoder 15 with the feature vector as input and the original input image becomes small. Here, masking means setting the components of the feature vector that correspond to the masked subspace to 0 or to random values, or cutting off the nodes in the first classifier 9 that are connected to the corresponding feature vector. A method that preserves the independence of each feature is desirable. When an image of a leaf in a healthy state is input to the first classifier 9, the learning unit 13 performs masked learning in which it masks the subspace of the feature space formed by the feature vectors unique to diseased leaves and then trains the first classifier 9 and the decoder 15 so that the difference between the data output by the decoder 15 and the input image becomes small.
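Two of the masking options described above, setting the masked components to 0 or to random values, can be sketched as follows. The function name `mask`, the `mode` parameter, and the choice of a standard normal distribution for the random values are assumptions; the third option, cutting off nodes inside the first classifier 9, depends on the network implementation and is not shown.

```python
import random
from typing import Iterable, List, Optional, Sequence


def mask(
    z: Sequence[float],
    masked_indices: Iterable[int],
    mode: str = "zero",
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return a copy of z with the masked components overwritten.

    mode="zero"   sets masked components to 0.0.
    mode="random" replaces them with samples from N(0, 1) (an assumed choice).
    """
    if mode not in ("zero", "random"):
        raise ValueError(f"unknown mask mode: {mode!r}")
    rng = rng or random.Random()
    masked = set(masked_indices)
    out: List[float] = []
    for i, v in enumerate(z):
        if i not in masked:
            out.append(v)            # unmasked components pass through unchanged
        elif mode == "zero":
            out.append(0.0)
        else:
            out.append(rng.gauss(0.0, 1.0))
    return out


z = [1.0, 2.0, 3.0, 4.0]
zeroed = mask(z, [2, 3])
randomized = mask(z, [2, 3], mode="random", rng=random.Random(0))
print(zeroed)
print(randomized[:2])
```

In either mode the unmasked components are left untouched, so the decoder still receives the information carried by the subspace that matches the input's type.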

This means that the learning unit 13 performs the learning process after erasing the portion of the image that shows features unique to leaves in a diseased state, replacing it with random image components, or cutting off the nodes in the first classifier 9 that are connected to the relevant feature vector. A method that preserves the independence of each feature is desirable. By training in this way, when an image of a leaf in a healthy state is input to the first classifier 9, learning can be realized in which the classification accuracy via the subspace formed by the feature vectors of diseased leaves does not improve.

Likewise, when an image of a diseased leaf is input to the first classifier 9, the learning unit 13 performs masked learning: it masks the subspace of the feature space formed by the feature vectors specific to healthy leaves, and then trains the first classifier 9 and the decoder 15 so that the difference between the data output by the decoder 15 and the input image becomes small.

That is, the learning unit 13 performs the learning process after erasing the portion of the image that shows features specific to healthy leaves or replacing it with random image components, or after blocking, in the first classifier 9, the nodes connected to the corresponding feature vector. A method that preserves the independence of each feature is desirable. With this training, when an image of a diseased leaf is input to the first classifier 9, the feature vectors of healthy leaves are updated only slightly.

In masked learning, the first classifier 9 and the decoder 15, and, as described later, the third classifier 12, are also trained so that the feature vector corresponding to features that do not depend on the healthy or diseased state contains no information on features specific to either state; that is, so that the classification accuracy via the subspace formed by that feature vector does not improve.

Stated more generally, masked learning is configured as follows. When data of a certain type is input to the first classifier 9, the subspace corresponding to the features specific to that type is not masked, the subspace common to the features of that type and the features of other types is not masked either, and the subspaces corresponding to the features specific to the other types are masked. The first classifier 9 and the decoder 15 are then trained so that the difference between the data output by the decoder 15 and the input data becomes small.

The feature space may have a high dimensionality, for example 256 to 4096. If, by way of example, the feature space is shown schematically as three-dimensional, feature vectors such as the following are generated:
- Feature vector corresponding to the first type of data = (x1, y1, 0)
- Feature vector corresponding to the second type of data = (0, y2, z2)
Here, x1 and y1 are, for example, the principal components of the feature vector corresponding to the first type of data, and y2 and z2 are, for example, the principal components of the feature vector corresponding to the second type of data. A component shown as "0" is a non-principal component. It need not be exactly zero; it has a property such as "its value is sufficiently small compared with the other components" or "it takes a non-zero value only rarely even when many input vectors are input".

In the schematic example above, the following correspondence holds:
- Subspace corresponding to the first type of data (X, Y, 0)
= subspace specific to the first type of data (X, 0, 0) + common subspace (0, Y, 0)
- Subspace corresponding to the second type of data (0, Y, Z)
= subspace specific to the second type of data (0, 0, Z) + common subspace (0, Y, 0)
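The masking rule and the schematic three-dimensional subspaces above can be sketched as follows. This is a minimal illustration in plain Python, not the implementation of the embodiment: the function name `mask_latent`, the index layout `LAYOUT`, and the use of Gaussian noise for the random fill are assumptions introduced here for illustration only.

```python
import random

# Hypothetical layout of the schematic 3-dimensional feature vector (assumption):
# index 0 = specific to type 1 (X), index 1 = common (Y), index 2 = specific to type 2 (Z).
LAYOUT = {"type1": range(0, 1), "common": range(1, 2), "type2": range(2, 3)}


def mask_latent(z, data_type, layout=LAYOUT, mode="zero", rng=None):
    """Return a copy of z in which every type-specific subspace other than
    that of data_type is masked; the common subspace is left untouched."""
    rng = rng or random.Random(0)
    masked = list(z)
    for name, indices in layout.items():
        if name in (data_type, "common"):
            continue  # own-specific and common components are not masked
        for i in indices:
            # Masking sets the component to 0 or to a random value.
            masked[i] = 0.0 if mode == "zero" else rng.gauss(0.0, 1.0)
    return masked


print(mask_latent([0.7, 0.2, 0.5], "type1"))  # component specific to type 2 is zeroed
print(mask_latent([0.7, 0.2, 0.5], "type2"))  # component specific to type 1 is zeroed
```

In a training loop, the masked vector would be passed to the decoder and the reconstruction compared with the input image; that part is omitted here.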

During the learning process, the first classifier 9 maps images of healthy leaves to feature vectors in one of the plurality of subspaces, and maps images of diseased leaves to feature vectors in another subspace different from that one.

The masked learning in this step S103 also enables the following in the classification process described later: data of one type, as input data, is mapped to a feature vector in one of the plurality of subspaces of the feature space, and data of another type is mapped to a feature vector in another subspace different from that one.

FIG. 5 shows examples of a leaf having diseased-state features and a leaf having healthy-state features, output as a result of training in which the healthy-leaf features or the diseased-leaf features of the original input image are masked. In other words, the learning unit 13 can be said to add healthy-leaf features or diseased-leaf features to the original input image. By checking output images such as those illustrated in FIG. 5, the user can suitably judge whether the training has been performed properly.

In this step S103, the learning unit 13 may further train the third classifier 12 as follows, which makes it possible to confirm, for example, that the definition of the subspaces of the feature space has been suitably improved. FIG. 6 is a conceptual diagram showing an example of such learning processing.

(Learning phase of the third classifier 12)
Feature vectors in the common subspace are input to the third classifier 12, and the first classifier 9 and the decoder 15 are trained so that the correct-answer rate of the output of the third classifier 12 becomes low.

In other words, the learning unit 13 trains the first classifier 9 and the decoder 15 so that
- the classification accuracy of the third classifier 12 becomes low when feature vectors contained in the common part of the feature vectors in the feature space corresponding to each of the plurality of types of input data are input to the third classifier 12, and, as described above,
- the difference between the image output by the decoder 15 and the image input to the first classifier 9 becomes small.

The learning phase of the third classifier 12 is described more specifically below.

For the first classifier (encoder) 9 and the decoder 15, if the input image to the first classifier 9 differs from the generated image output by the decoder 15, the parameters of the first classifier 9 and the decoder 15 are changed so that the difference (also referred to as error A) becomes small. Repeating this with various input images makes it possible to generate images that closely resemble the input images (this processing is also referred to as process A).

In the process according to the present embodiment, in addition to the above, feature vector 25, which is contained in the common part of the feature vectors in the feature space corresponding to each of the plurality of types of input data, is input to the third classifier 12. If the classification result of the third classifier 12 differs from the correct answer (the input label), an error (also referred to as error B) is sent to the first classifier 9. The parameters of the decoder 15 are changed so that error A becomes small, whereas the parameters of the first classifier (encoder) 9 are changed so that error A becomes small and error B becomes large (this processing is also referred to as process B).

In this way, process A and process B are performed simultaneously.
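How processes A and B might combine into per-network objectives can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how error A or error B is computed, so mean squared error and cross-entropy are used here as stand-ins, and the balancing factor `weight` and all function names are hypothetical.

```python
import math


def reconstruction_error(image, reconstruction):
    """Error A (assumed form): mean squared difference between input and decoder output."""
    return sum((a - b) ** 2 for a, b in zip(image, reconstruction)) / len(image)


def classification_error(probabilities, label):
    """Error B (assumed form): cross-entropy of the third classifier on the common part."""
    return -math.log(max(probabilities[label], 1e-12))


def decoder_objective(error_a):
    # The decoder is updated only so that error A becomes small.
    return error_a


def encoder_objective(error_a, error_b, weight=1.0):
    # The encoder is updated so that error A becomes small and error B becomes
    # large, so error B enters with a negative sign. `weight` is a hypothetical
    # balancing factor that the source does not specify.
    return error_a - weight * error_b


error_a = reconstruction_error([0.0, 1.0], [0.0, 0.5])
error_b = classification_error([0.5, 0.5], label=0)
print(decoder_objective(error_a), encoder_objective(error_a, error_b))
```

Minimizing `encoder_objective` rewards the encoder when the third classifier cannot tell the types apart from the common part, which matches the stated goal of keeping type-specific information out of the common subspace.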

In this way, the learning unit 13 trains the first classifier 9 and the decoder 15 so that the common part of the subspace corresponding to the first type of data and the subspace corresponding to the second type of data in the feature space contains neither features specific to the first type of data nor features specific to the second type of data.

In step S104, the learning unit 13 performs supervised learning of the second classifier 11, using as input the feature vectors specific to healthy leaves and the feature vectors specific to diseased leaves. Here, the second classifier 11 receives the feature vectors in the part of the feature space other than the common part of the plurality of subspaces. In other words, of the components of a feature vector in the feature space, the second classifier 11 receives those that lie outside the common part of the plurality of subspaces.

Inputting to the second classifier 11 the feature vectors in the space other than the common part improves the classification accuracy of the second classifier 11.

FIG. 7 is a conceptual diagram of the supervised learning described above. In the example of FIG. 7, the "encoder" has already been trained in step S103 and is used with its parameters fixed. "Fully Connected" denotes the fully connected network of the second classifier 11. "Decision" denotes the type information output by the second classifier 11, that is, information indicating whether the leaf shown in the image is in a healthy state or a diseased state.

The configuration described above with reference to the flowchart of FIG. 2 resolves the problem of poor explainability or interpretability, namely that it is difficult to confirm whether the algorithm of the learning model has properly learned the features of the object.

The first classifier 9 and the decoder 15 can also generate pseudo input data for use in learning. From step S102 onward, images generated by the first classifier 9 and the decoder 15, acting as a variational autoencoder, may be used in the learning process.

As described above with reference to the flowchart of FIG. 2, the classification device 1 also functions as a learning device 1 that trains a neural network. The learning device 1 includes a learning unit 13 that trains the first classifier 9, which maps input data that can take a plurality of types, such as images of plant leaves, to feature vectors in a feature space containing a plurality of mutually different subspaces. The learning unit 13 includes a decoder 15 that takes feature vectors in the feature space as input, and performs the following masked learning. When data of a first type, such as an image of a plant leaf in a healthy state, is input to the first classifier 9 as the input data, the learning unit 13 masks the subspace of the feature space corresponding to features specific to data of a second type, and then trains the first classifier 9 and the decoder 15 so that the difference between the data output by the decoder 15 and the input data becomes small. When data of the second type, such as an image of a plant leaf in a diseased state, is input to the first classifier 9 as the input data, the learning unit 13 masks the subspace of the feature space corresponding to features specific to data of the first type, and then trains the first classifier 9 and the decoder 15 so that the difference between the data output by the decoder 15 and the input data becomes small.

Likewise, a learning method using the learning device 1 or the like can be said to include a learning step of training the first classifier 9, which maps input data that can take a plurality of types to feature vectors in a feature space containing a plurality of mutually different subspaces. The learning step performs the following masked learning. When data of a first type is input to the first classifier 9 as the input data, the subspace of the feature space corresponding to features specific to data of a second type is masked, and the first classifier 9 and the decoder 15, which takes feature vectors in the feature space as input, are then trained so that the difference between the data output by the decoder 15 and the input data becomes small. When data of the second type is input to the first classifier 9 as the input data, the subspace of the feature space corresponding to features specific to data of the first type is masked, and the first classifier 9 and the decoder 15 are then trained so that the difference between the data output by the decoder 15 and the input data becomes small.

<Additional notes on the learning processing example>
In the learning processing example described above, the process of performing unsupervised learning of the first classifier 9 and the decoder 15 in advance using training data is not essential. That is, the classification device 1 may be configured to acquire images of plant leaves in step S101 and then proceed to the masked learning in step S103.

[3. Classification processing example]
An example of the input data classification process executed by the classification device 1 after the training described above is explained below. FIG. 8 is a flowchart showing the flow of the classification process performed by the classification device 1 according to the present embodiment.

In step S201, the acquisition unit 5 acquires an image of a plant leaf as the input data to be classified.

In step S202, the first classifier 9 maps the image acquired by the acquisition unit 5 to a feature vector in the feature space. The subspace of the feature space to which the image is mapped differs depending on whether the image shows a leaf in a healthy state or a leaf in a diseased state.

In step S203, the second classifier 11 takes as input the feature vector in the feature space mapped by the first classifier 9, and outputs information indicating whether the plant leaf shown in the image to be classified is in a healthy state or a diseased state.

In step S204, the output unit 17 displays the information output by the second classifier 11, for example on its own display.
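Steps S202 to S204 can be sketched as a small pipeline. This is a schematic illustration only: `classify`, `toy_encoder`, and `toy_classifier` are hypothetical names, and the toy stand-ins exist purely so that the sketch runs end to end; they do not reflect the trained networks of the embodiment.

```python
def classify(image, encoder, classifier, common_indices):
    """Encode the image, drop the common components, and classify the rest."""
    z = encoder(image)  # S202: map the image to a feature vector
    common = set(common_indices)
    specific = [v for i, v in enumerate(z) if i not in common]
    return classifier(specific)  # S203: output the type information


# Toy stand-ins (hypothetical) for the first and second classifiers.
def toy_encoder(image):
    return [sum(image), 1.0, max(image)]


def toy_classifier(features):
    return "healthy" if features[0] >= features[1] else "diseased"


# S204: display the result.
print(classify([0.2, 0.9, 0.4], toy_encoder, toy_classifier, common_indices=[1]))
```

The only structural point the sketch makes is that the common components are removed before the second classifier sees the feature vector.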

[4. Importance evaluation process]
To further improve the interpretability of the classification process according to the present embodiment, the classification device 1 may be configured to perform the series of importance evaluation processes listed below.

(4-1: Transformation of the coordinate system of the feature space or of the feature vector)
The control unit 3 of the classification device 1 transforms the coordinate system of the feature space, or a feature vector in the feature space, and inputs the transformed feature space or feature vector to the second classifier 11. This allows the user to easily confirm which region of the feature space, or which component of the feature vector, affects the estimation result output by the second classifier 11, and to what degree.

That is, in general, let X denote the feature vector before transformation, X' the feature vector after transformation, and f the function that transforms the vector. By transforming the feature vector according to
X' = f(X)
and inputting the transformed feature vector to the second classifier 11 or the decoder, the user can easily confirm which region of the feature space, or which component of the feature vector, affects the estimation result, and to what degree. The user can also easily confirm which weights and nonlinearities contribute to which outputs.

The specific form of the function f and the method of determining it do not limit the present embodiment. Examples of the function f include transformation functions used in various techniques such as backpropagation.

To further improve interpretability, the function f may take as arguments, in addition to the feature vector X before transformation, at least part of the estimation result output by the second classifier 11. As an example, letting Dout denote the component of the estimation result output by the second classifier 11 that indicates disease, the feature vector may be transformed according to
X' = f(X, Dout)
and the feature vectors before and after transformation may each be input to the decoder.

This allows the user to easily confirm which component of the disease feature vector affects the estimation result "diseased", and to what degree. The user can also easily confirm which weights and nonlinearities contribute to the estimation result "diseased".

As described later, one example of such a transformation is to generate the transformed feature vector by changing those components of the feature vector that are multiplied by relatively large weight coefficients.
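One possible form of such a transformation X' = f(X) is sketched below. It is an assumption-laden illustration, not the function used in the embodiment: `weights` is taken to be one weight coefficient per component (for example from a linear layer of the second classifier), and `k` and `delta` are hypothetical parameters that control how many components are changed and by how much.

```python
def transform_top_weighted(x, weights, k=1, delta=1.0):
    """X' = f(X): shift the k components with the largest |weight| by delta."""
    # Rank component indices by the magnitude of their weight coefficient.
    order = sorted(range(len(x)), key=lambda i: abs(weights[i]), reverse=True)
    chosen = set(order[:k])
    return [v + delta if i in chosen else v for i, v in enumerate(x)]


x = [0.25, 0.5, 0.75]
weights = [0.2, -1.5, 0.7]
print(transform_top_weighted(x, weights, k=1, delta=0.5))  # only index 1 changes
```

Feeding both `x` and the transformed vector to the decoder then gives the two output images that the evaluation in section 4-2 compares.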

(4-2: Output of evaluation results)
The output unit 17 of the classification device 1 may be configured to display an image (also referred to as a difference image) that visualizes the difference between the output image of the decoder when the feature vector before transformation is input to it (also referred to as output image A1) and the output image of the decoder when the feature vector after transformation is input to it (also referred to as output image A2).

As an example, the control unit 3 of the classification device 1 may generate, as the difference image, a subtraction image of output image A1 and output image A2 or a heat map image based on output image A1 and output image A2, and the output unit 17 may display that difference image.
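A subtraction image and a simple heat map can be computed as in the following sketch. It assumes grayscale images represented as nested lists of floats of equal size; the function names and the normalization to the range 0 to 1 are choices made here for illustration and are not taken from the source.

```python
def difference_image(image_a1, image_a2):
    """Pixel-wise absolute difference between two equally sized grayscale images."""
    return [[abs(p - q) for p, q in zip(row1, row2)]
            for row1, row2 in zip(image_a1, image_a2)]


def normalize_to_heatmap(diff):
    """Scale the difference to [0, 1] so that it can be rendered as a heat map."""
    peak = max(max(row) for row in diff)
    if peak == 0:
        return [[0.0 for _ in row] for row in diff]  # identical images
    return [[value / peak for value in row] for row in diff]


a1 = [[0.0, 0.5], [1.0, 0.25]]
a2 = [[0.0, 0.0], [0.5, 0.25]]
print(normalize_to_heatmap(difference_image(a1, a2)))
```

Pixels with a value near 1 in the result are those most affected by the transformation of the feature vector.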

When these difference images are displayed, it is preferable to also display the parameters and the like that characterize the transformation function f. This allows the user to easily confirm how, and to what degree, the output image is affected by a given transformation.

[5. Differences from the prior art]
This section supplements the differences between the prior art and the configuration according to the present embodiment described above in "2. Learning processing example" and "3. Classification processing example". Existing methods for confirming which features in the input data correspond to which class include, for example, the following.
- Visualization of hidden-layer activations
The activations of each layer and each filter can be used to confirm which locations are activated for a particular image.
- Visual confirmation of features
The activation is increased by repeatedly adding the gradient to the image. Finally, the user can check the locations in the image that were changed.
- Visualization of a semantic dictionary
Global average pooling is applied to the final layer of a CNN, followed by a single linear layer for classification. After the important channels of the global average pooling layer are identified, visualizing their features makes it possible to identify the important regions in an image of a particular class.
- Attention maps
Several common algorithms are classified as attention maps (vanilla gradient, GRAD-CAM, guided backpropagation, integrated gradients, and so on).

In general, these methods use backpropagated gradients to explain the activation of a given class. For example, GRAD-CAM combines global average pooling with backpropagated gradients to identify the important regions in an image of a particular class.

Most of the existing methods described above are used to explain the results of classification networks such as AlexNet and VGG16 in classification tasks, for example determining whether a plant is diseased. However, with these methods it is not known which features in the input data were extracted by the filters in the upper layers of the neural network. In addition, confirming which features were used to produce a class activation requires additional training. In the semantic dictionary method, the top-level neurons involved in the activation of each class can be visualized as a feature visualization map, but it is difficult to identify human-understandable features from such a map. Attention maps can produce a feature visualization map for each class without additional training, but, as above, it is difficult for the user to identify understandable features from the map. Moreover, some attention map methods do require additional training. Finally, the existing methods focus only on size-dependent features, such as a part of a leaf, and cannot take into account size-independent features such as color, which are difficult to express in a feature map.

In contrast, the configuration according to the present embodiment makes it possible to confirm which features contained in an image correspond to which class, and which features are common to a plurality of classes. Moreover, because this configuration uses the training pipeline itself, no additional training is needed.

In the existing methods, the entire neural network corresponding to the classification unit 7 of the present embodiment is trained as a whole. The configuration according to the present embodiment differs from the existing methods in that the first classifier 9 and the second classifier 11 are trained in stages.

Compared with Fader Networks, the configuration according to the present embodiment differs as follows. When Fader Networks are trained, conditional training is performed in which each element of the encoder output vector is changed to 1-0. In contrast, the configuration according to the present embodiment uses different parts of the vector for different classes and passes the class values directly rather than using 1-0. The configuration according to the present embodiment also uses a classifier in place of the class vector, a component that Fader Networks lack. As a result, the algorithm pipeline of the present embodiment can perform classification using explainable features and can send more class-specific information to the decoder layers. Because the correct features represented in the encoder need to be confirmed, the configuration according to the present embodiment also does not use activations of the lower encoder layers in the decoder.

[6. Application example]
For example, unsupervised learning of a variational autoencoder is performed using images of plant leaves downloaded from the PlantVillage Dataset, a plant image database publicly available on the Internet. The input image size is 256 × 256 pixels, and the encoder outputs a 4096-dimensional mean and a 4096-dimensional standard deviation. The encoder consists of six convolutional layers followed by three fully connected layers, and the decoder is its mirror image: three fully connected layers followed by six deconvolutional layers. The 4096-dimensional feature vector (latent variable) of this trained variational autoencoder is divided so that 1/4 is defined as the healthy feature part, 1/2 as the common feature part, and 1/4 as the disease feature part. Conditional learning of the trained variational autoencoder is then performed using leaf images labeled with healthy or diseased class information. To keep healthy and disease features out of the common feature part of the latent variable, the encoder is trained so that the classification accuracy of a three-layer CNN healthy/disease classifier does not improve. To reflect healthy features in the healthy feature part and disease features in the disease feature part, the encoder and decoder are trained with the disease feature part replaced by random values for healthy leaves, and the healthy feature part replaced by random values for diseased leaves. In addition, to produce results that are easy for the user to understand, a generative adversarial network (GAN) is applied to the images generated by the encoder and decoder and the input images. Finally, supervised learning of a three-layer CNN healthy/disease classifier is performed using as input the 2048-dimensional latent variable obtained by concatenating the healthy feature part and the disease feature part of the trained encoder.
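The partition of the latent variable described above can be checked with a short sketch. The proportions (1/4, 1/2, 1/4) and the sizes 4096 and 2048 come from the text; the ordering of the three parts within the vector and the name `latent_layout` are assumptions made here for illustration.

```python
def latent_layout(dim=4096, healthy_fraction=0.25, common_fraction=0.5):
    """Split a latent vector of length dim into healthy, common, and disease slices.

    The ordering healthy / common / disease is an assumption; the source only
    gives the proportions 1/4, 1/2, and 1/4.
    """
    n_healthy = int(dim * healthy_fraction)
    n_common = int(dim * common_fraction)
    return {
        "healthy": slice(0, n_healthy),
        "common": slice(n_healthy, n_healthy + n_common),
        "disease": slice(n_healthy + n_common, dim),
    }


layout = latent_layout()
z = list(range(4096))  # placeholder latent vector
sizes = {name: len(z[part]) for name, part in layout.items()}
# Input to the healthy/disease classifier: healthy part followed by disease part.
classifier_input = z[layout["healthy"]] + z[layout["disease"]]
print(sizes, len(classifier_input))
```

With these proportions the healthy and disease parts have 1024 components each, which is consistent with the 2048-dimensional classifier input stated in the text.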

[7. Example]
With the configuration described above in "6. Application example of the present application", the classification process of the classification device 1 of the present embodiment was executed on three data sets as input data: healthy tomato leaves and leaves affected by tomato late blight; healthy tomato leaves and leaves affected by tomato mosaic disease; and healthy bell pepper leaves and leaves affected by bell pepper bacterial spot. The average classification accuracy over the three sets was 94%. It was also confirmed that the features used included leaf color, shape, surface texture, and the like.

[8. Appendix 1 of the embodiment]
As described above, the classification device (learning device) 1 according to the present embodiment can be described as including an acquisition unit 5 that acquires input data that can take a plurality of types, a first classifier 9 that maps the input data to a feature vector in a feature space, and a second classifier 11 that takes a feature vector in the feature space as input and outputs information about the type, wherein feature vectors other than the common part of the feature vectors in the feature space corresponding to each of the plurality of types of input data are input to the second classifier 11.

In the classification device (learning device) 1 according to the present embodiment, because feature vectors other than the common part of the feature vectors in the feature space corresponding to each of the plurality of types of input data are input to the second classifier 11 as described above, more suitable classification processing can be realized. For example, the classification accuracy is improved compared with a configuration in which the feature vector of the common part is also input to the second classifier 11.

Further, in the classification device (learning device) 1 according to the present embodiment, the acquisition unit 5 acquires input data that can take a plurality of types, as described above. Here, the plurality of types is not limited to two, and may be three or more as described later.

Therefore, the classification device (learning device) 1 according to the present embodiment can suitably perform not only classification into two classes (for example, whether a plant is healthy or diseased) but also classification into multiple classes (for example, not only whether a plant is healthy or diseased but also which kind of disease it has).

Further, as described above, the classification device (learning device) 1 according to the present embodiment includes a learning unit 13 that trains the first classifier 9, which maps input data that can take a plurality of types to a feature vector in the feature space, and a third classifier 12 that takes a feature vector in the feature space as input and outputs information about the type. The learning unit 13 trains the first classifier 9 so that the classification accuracy of the third classifier 12 becomes low when a feature vector included in the common part of the feature vectors in the feature space corresponding to each of the plurality of types of input data is input to the third classifier 12.

In the classification device (learning device) 1 according to the present embodiment, because the learning unit 13 trains the first classifier 9 so that the third classifier 12 classifies poorly from the common part as described above, the interpretability can be improved.
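The requirement that the third classifier 12 classify poorly from the common part can be written, under one possible formulation, as an adversarial term in the objective of the first classifier 9. The sketch below is an assumption for illustration only: the text does not specify a loss function, and the names `cross_entropy`, `encoder_objective`, and `weight` are hypothetical. Here the encoder is rewarded when the cross-entropy of the third classifier on the common part is large.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy of predicted class probabilities against the true class index."""
    return -math.log(max(probs[label], 1e-12))


def encoder_objective(reconstruction_error, adversary_probs, label, weight=1.0):
    """Value the encoder tries to minimize (assumed form, not from the source).

    reconstruction_error: difference between the decoder output and the input.
    adversary_probs: output of the third classifier when given only the common part.
    The adversary's loss is subtracted, so the objective falls as the third
    classifier becomes less accurate on the common part.
    """
    return reconstruction_error - weight * cross_entropy(adversary_probs, label)
```

For the same reconstruction error, an encoder whose common part leaves the third classifier at chance level (probabilities `[0.5, 0.5]`) obtains a lower objective than one whose common part lets it predict the true class with probability 0.9.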

[9. Appendix 2 of the embodiment]
In the above-described embodiment, processing distinguishes between the feature vector 21, which indicates features unique to diseased leaves, and the feature vector 23, which indicates features unique to healthy leaves. However, this does not limit the matters described in this specification.

As an example, the classification device (learning device) 1 according to the present embodiment may be configured to treat the feature vector 21 and the feature vector 23 together, without distinguishing them, as a classifiable feature vector 24.

Even in such a configuration, the classification device (learning device) 1 according to the present embodiment includes the acquisition unit 5 that acquires input data that can take a plurality of types, the first classifier 9 that maps the input data to a feature vector in the feature space, and the second classifier 11 that takes a feature vector in the feature space as input and outputs information about the type, and feature vectors other than the common part of the feature vectors in the feature space corresponding to each of the plurality of types of input data are input to the second classifier 11.

FIG. 9 is a conceptual diagram showing an example of learning processing with such a configuration, and corresponds to FIG. 6. As shown in FIG. 9, in the learning processing according to this example, the feature vector in the feature space is composed of the classifiable feature vector 24 and a common feature vector 25, which indicates features of the leaf itself that are common to leaves in the healthy state and in the diseased state.

In other words, in the learning processing according to this example, the feature space is composed of:
- the subspace corresponding to the classifiable feature vector (classifiable subspace), and
- the subspace corresponding to features of the leaf itself that are common regardless of whether the leaf is in a healthy state or a diseased state (common subspace).

Further, in the learning processing according to this example, the first classifier 9, the decoder 15, and the third classifier 12 are trained without the masking of each subspace described in step S103. The other aspects of the learning processing of the first classifier 9, the decoder 15, and the third classifier 12 have been described in the above embodiment, so their description is omitted here.

Even in such a configuration, the classification device (learning device) 1 according to the present embodiment includes the learning unit 13 that trains the first classifier 9, which maps input data that can take a plurality of types to a feature vector in the feature space, and the third classifier 12 that takes a feature vector in the feature space as input and outputs information about the type. The learning unit 13 trains the first classifier 9 so that the classification accuracy of the third classifier 12 becomes low when a feature vector included in the common part of the feature vectors in the feature space corresponding to each of the plurality of types of input data is input to the third classifier 12.

FIG. 10 is a conceptual diagram showing supervised learning of the second classifier 11 when the above configuration is used, and corresponds to FIG. 7. In the example of FIG. 10, "Fully Connected" indicates the fully connected network included in the second classifier 11. "Decision" is the information about the type output by the second classifier 11, indicating whether the leaf shown in the image is in a healthy state or a diseased state. Here, the diseased state can be classified into multiple classes, such as disease A, disease B, disease C, and so on. The other aspects of the learning processing of the second classifier 11 have been described in the above embodiment, so their description is omitted here.

According to the classification device (learning device) 1 configured as described above, rather than a binary classification of healthy or diseased, classification into multiple classes (multi-class classification), for example healthy, disease A, disease B, disease C, and so on, can easily be performed.

[10. Appendix 3 of the embodiment]
The above embodiment described the importance evaluation processing for further improving the interpretability of the classification processing. The following describes a specific processing example for improving the interpretability of the classification processing.

FIG. 11 is a conceptual diagram showing an example of the interpretability improvement processing according to this example. The classification device (learning device) 1 according to the present embodiment performs the processing of this example as follows, as one example.

(Step S301)
First, for one input image, the first classifier 9 sets a reference feature point (feature vector) 39 in the classifiable feature space, as indicated by reference numeral 31 in FIG. 11.

(Step S302)
Subsequently, for a plurality of input images of the same class, the first classifier 9 sets (generates) a plurality of points in the vicinity of the reference feature point 39 in the classifiable feature space, as indicated by reference numeral 33 in FIG. 11. In this step, the plurality of points is generated by Monte Carlo sampling, as one example.
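One simple reading of the point generation in step S302 is random sampling around the reference feature point. The sketch below is an assumption: the text names Monte Carlo sampling but not the distribution, so the isotropic Gaussian perturbation, the function name `sample_near`, and the parameter `scale` are hypothetical.

```python
import random


def sample_near(reference, n_points, scale, rng):
    """Draw n_points feature vectors around `reference`.

    Each coordinate is perturbed by independent Gaussian noise with standard
    deviation `scale` (an assumed sampling distribution).
    """
    return [
        [x + rng.gauss(0.0, scale) for x in reference]
        for _ in range(n_points)
    ]
```

A call such as `sample_near(point_39, 100, 0.1, random.Random(0))` would produce 100 nearby vectors, where `point_39` stands for the reference feature point; a smaller `scale` keeps the samples closer to it.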

(Step S303)
Subsequently, as indicated by reference numeral 35 in FIG. 11, the second classifier 11 determines, in the classifiable feature space, a hyperplane (separating hyperplane) in the feature space that defines the boundary between mutually adjacent clusters, each cluster being one to which the feature vectors in the feature space corresponding to one of the plurality of types of input data belong.

In the example indicated by reference numeral 35 in FIG. 11, the determined separating hyperplane 41 separates the cluster classified as healthy from the cluster classified as diseased.

(Step S304)
Subsequently, the second classifier 11 determines one or more feature vectors located in the vicinity of the hyperplane determined in step S303, and inputs the determined feature vector or vectors to the decoder 15 (the "trained decoder" in FIG. 11). The feature vector input to the decoder 15 may also be specifiable by the user.

In the example indicated by reference numeral 37 in FIG. 11, a diseased leaf image is generated by inputting the feature point 39 set in step S301 (classified as diseased) to the decoder 15 (the "trained decoder" in FIG. 11). A healthy leaf image is generated by inputting to the decoder 15 a feature vector that is located in the vicinity of the feature point 39 (in the vicinity of the separating hyperplane 41) and is classified into a type (class) different from that of the feature point 39. More specifically, in the example indicated by reference numeral 37 in FIG. 11, the feature vector of the healthy point closest to the reference feature point 39 is used as this feature vector.
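The selection used in the example of reference numeral 37, namely the point of the other class closest to the reference feature point, can be sketched as below. The function name `nearest_of_other_class` and the choice of Euclidean distance are assumptions; the text says only "closest".

```python
import math


def nearest_of_other_class(reference, reference_label, points, labels):
    """Return the point closest to `reference` whose label differs from it.

    Returns None when no point of another class exists.
    """
    best_point, best_dist = None, math.inf
    for point, label in zip(points, labels):
        if label == reference_label:
            continue  # skip points of the same class as the reference
        dist = math.dist(reference, point)  # Euclidean distance (assumed metric)
        if dist < best_dist:
            best_point, best_dist = point, dist
    return best_point
```

Decoding the returned vector and the reference vector side by side would give the image pair that step S305 displays for comparison.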

(Step S305)
Subsequently, the output unit 17 displays the image data output by the decoder 15 on a display panel (display unit). As an example, the output unit 17 displays the healthy leaf image and the diseased leaf image generated by the decoder 15 so that they can be compared.

The classification device (learning device) 1 configured as described above displays images that are generated from feature vectors classified into different types and are themselves classified into mutually different types (classes).

FIG. 12 shows examples of input images input to the classification device (learning device) 1 and examples (display examples) of the generated diseased leaf images and healthy leaf images. As shown in FIG. 12, the classification device (learning device) 1 can generate, from an input diseased leaf image, a healthy leaf image that is only barely classified as healthy, and can generate, from an input healthy leaf image, a diseased leaf image that is only barely classified as diseased.

Further, for teacher images of a plurality of types, generating feature points in the classifiable feature space as indicated by reference numeral 33 in FIG. 11 produces a plurality of clusters (when the teacher images are healthy, disease A, disease B, and disease C, four or more clusters, because even the healthy class may split into two clusters). By linearly interpolating the feature vector toward the feature point at the centroid of each cluster and decoding each interpolated vector with the decoder 15 as indicated by reference numeral 37 in FIG. 11, images can be generated that gradually change into a typical healthy leaf, a typical disease A leaf, a typical disease B leaf, and so on. This makes it possible to visualize the features the model used for classification, as in FIG. 5.
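The linear interpolation toward a cluster centroid can be illustrated as follows. This is a sketch under assumptions: the helper names `centroid` and `interpolate` and the even spacing of the steps are hypothetical, and the decoder itself is not shown.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def interpolate(start, end, steps):
    """Return steps + 1 vectors evenly spaced from `start` to `end`, inclusive."""
    return [
        [s + (e - s) * t / steps for s, e in zip(start, end)]
        for t in range(steps + 1)
    ]
```

Passing each vector returned by `interpolate(z, centroid(cluster), steps)` to the decoder would yield the gradually changing image sequence described above, ending at the typical leaf of that cluster.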

In this way, the user can easily grasp the criteria by which the classification device (learning device) 1 performs classification. Therefore, according to the classification device (learning device) 1 configured as described above, the interpretability can be improved.

Further, the classification device (learning device) 1 according to this example may additionally perform the following processing to determine the features that had a greater influence on the classification result. FIG. 13 is a conceptual diagram illustrating a processing example for determining the features that had a greater influence on the classification result.

(Step S401)
The second classifier 11 is trained using the plurality of points generated in step S302 (see reference numeral 51 in FIG. 13) as teacher data. Alternatively, the classification device (learning device) 1 may include another classifier, which is trained using the plurality of points generated in step S302 as teacher data. In this example, a linear classifier can be used as the second classifier 11 or the other classifier.

Reference numeral 52 in FIG. 13 shows the relationship among the set of weighting coefficients $W^c$ of the second classifier 11 or the other classifier trained in this way,
$W^c = [W_{11}, W_{12}, W_{13}, \dots, W_{21}, W_{22}, W_{23}, \dots]$,
the feature vector $X^c$,
$X^c = [X_1, X_2, X_3, \dots]$,
and the vector $Y^c$ indicating the classification result of the second classifier 11 or the other classifier,
$Y^c = [Y_1, Y_2]$.

(Step S402)
Subsequently, the second classifier 11 or the other classifier identifies the components of the feature vector $X^c$ (also called important components) that are multiplied by relatively large coefficients among the weighting coefficients included in the set $W^c$. As an example, the second classifier 11 or the other classifier identifies the component of the feature vector $X^c$ that is multiplied by the largest of the weighting coefficients included in the set $W^c$.

The way of identifying the important components of the feature vector is not limited to the above example. For example, a component ($X_b$) for which the value of coefficient × component ($W_{ab} \times X_b$) is larger may be identified as an important component. This is because, in general, even if a coefficient is relatively large, the contribution of the component to the classification result vector $Y^c$ is small when the value of the component it multiplies is close to 0.
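The two ways of identifying an important component in step S402 can be compared with a small sketch. Several points here are assumptions: the linear classifier is taken to compute each output as the sum over b of W_ab × X_b, magnitudes (absolute values) are compared rather than signed values, and the function names are hypothetical.

```python
def linear_scores(W, x):
    """Outputs of an assumed linear classifier: one weighted sum per row of W."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def important_by_weight(W, a):
    """Index b of the largest-magnitude coefficient W[a][b] for output a."""
    return max(range(len(W[a])), key=lambda b: abs(W[a][b]))


def important_by_contribution(W, x, a):
    """Index b of the largest-magnitude product W[a][b] * x[b] for output a."""
    return max(range(len(x)), key=lambda b: abs(W[a][b] * x[b]))
```

For `W = [[0.1, 2.0, -0.5]]` and `x = [4.0, 0.01, 1.0]`, the weight-based rule selects index 1, while the contribution-based rule selects index 2, because the component multiplied by the large coefficient 2.0 is close to 0. This is the situation the text gives as the reason for the second rule.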

(Step S403)
Subsequently, the second classifier 11 or the other classifier generates a converted feature vector by changing the value of the important component identified in step S402. The control unit 3 then generates a decoded image by inputting the converted feature vector to the decoder 15 (the "trained decoder" at reference numeral 53 in FIG. 13). Here, the value of the important component is preferably changed multiple times in a stepwise manner. As an example, it is preferable to change the important component of the feature vector little by little and generate a decoded image for each value. Reference numeral 53 in FIG. 13 shows examples of a plurality of images generated stepwise in this way. Note that the "trained encoder" at reference numeral 53 in FIG. 13 refers to the first classifier 9.
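The stepwise change of the important component in step S403 can be sketched as follows. The function name `sweep_component` and the use of additive offsets are assumptions; the text states only that the value is changed little by little.

```python
def sweep_component(z, index, deltas):
    """Return one copy of z per delta, with z[index] shifted by that delta.

    All other components are left unchanged, so any difference between the
    decoded images can be attributed to the swept component.
    """
    variants = []
    for delta in deltas:
        v = list(z)
        v[index] = z[index] + delta
        variants.append(v)
    return variants
```

Each returned vector would then be passed to the decoder, giving a row of decoded images comparable to those at reference numeral 53 in FIG. 13.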

Thus, in this example, a converted feature vector, obtained by changing a relatively important component among the components of the one or more feature vectors located in the vicinity of the hyperplane, is input to the decoder 15.

According to the classification device (learning device) 1 configured in this way, the decoded images obtained by changing the important component of the feature vector can be compared, so the user can easily grasp the criteria by which the classification device (learning device) 1 performs classification. Therefore, according to the classification device (learning device) 1 configured as described above, the interpretability can be improved.

[Example of realization by software]
The control blocks of the classification device (learning device) 1 (in particular the acquisition unit 5, the classification unit 7, the learning unit 13, and the decoder 15) may be realized by a logic circuit (hardware) formed in an integrated circuit (IC chip) or the like, or may be realized by software.

In the latter case, the classification device 1 includes a computer that executes the instructions of a program, which is software that realizes each function. This computer includes, for example, one or more processors and a computer-readable recording medium that stores the program. In the computer, the processor reads the program from the recording medium and executes it, thereby achieving the object of the present invention. As the processor, for example, a CPU (Central Processing Unit) can be used. A GPU (Graphics Processing Unit) may also be used in combination to speed up processing. As the recording medium, a "non-transitory tangible medium" can be used, for example a ROM (Read Only Memory), a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit. A RAM (Random Access Memory) into which the program is loaded may also be provided. The program may be supplied to the computer via any transmission medium capable of transmitting it (a communication network, broadcast waves, or the like). One aspect of the present invention can also be realized in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission.

The present invention is not limited to the embodiments described above, and various modifications are possible within the scope of the claims. Embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention.

1 Classification device (learning device)
3 Control unit
5 Acquisition unit
7 Classification unit
9 First classifier
11 Second classifier
12 Third classifier
13 Learning unit
15 Decoder
17 Output unit (display unit)
19 Storage unit

Claims

A classification device comprising:
an acquisition unit that acquires input data that can take a plurality of types;
a first classifier that maps the input data to a feature vector in a feature space; and
a second classifier that takes a feature vector in the feature space as input and outputs information about the type,
wherein feature vectors other than a common part of the feature vectors in the feature space corresponding to each of the plurality of types of input data are input to the second classifier.

The classification device according to claim 1, wherein
the feature space includes a plurality of mutually different subspaces, and
the first classifier
maps data of a certain type as the input data to a feature vector in a certain subspace among the plurality of subspaces, and
maps data of another type as the input data to a feature vector in another subspace, different from the certain subspace, among the plurality of subspaces.

The classification device according to claim 1 or 2, further comprising:
a learning unit that trains the first classifier; and
a third classifier that takes a feature vector in the feature space as input and outputs information about the type,
wherein the learning unit trains the first classifier so that the classification accuracy of the third classifier becomes low when a feature vector included in the common part of the feature vectors in the feature space corresponding to each of the plurality of types of input data is input to the third classifier.

The classification device according to claim 3, wherein the learning unit trains the first classifier by performing conditional learning in which input data labeled to indicate whether it is first type data or second type data is input to the first classifier.

The classification device according to claim 4, further comprising a decoder that takes a feature vector in the feature space as input, wherein
the learning unit performs masked learning in which,
when the first type data is input to the first classifier, the first classifier and the decoder are trained so that the difference between the input data and the data output by the decoder after masking the subspace of the feature space corresponding to features unique to the second type data becomes small, and,
when the second type data is input to the first classifier, the first classifier and the decoder are trained so that the difference between the input data and the data output by the decoder after masking the subspace of the feature space corresponding to features unique to the first type data becomes small.

The classification device according to claim 5, wherein the learning unit trains the first classifier so that the common part of the subspace corresponding to the first type data and the subspace corresponding to the second type data in the feature space includes neither features unique to the first type data nor features unique to the second type data.

The classification device according to claim 5 or 6, wherein, prior to the masked learning, the learning unit trains the first classifier and the decoder so that the difference between the data output by the decoder and the input data becomes small, without masking the feature space.

The classification device according to any one of claims 1 to 4, wherein
the second classifier determines a hyperplane in the feature space that defines the boundary between mutually adjacent clusters, each cluster being one to which feature vectors in the feature space corresponding to one of the plurality of types of input data belong, and
the classification device further comprises:
a decoder to which one or more feature vectors located in the vicinity of the hyperplane determined by the second classifier, or a feature vector specified by the user, are input; and
a display unit that displays data output by the decoder.

The classification device according to claim 8, wherein a converted feature vector, obtained by changing a relatively important component among the components of the one or more feature vectors located in the vicinity of the hyperplane, is input to the decoder.

The classification device according to any one of claims 1 to 9, wherein the input data is image data relating to an organism, and the plurality of types include types relating to whether the organism is healthy or diseased.

A learning unit for learning a first classifier that maps input data that can take multiple types to a feature vector in a feature space containing a plurality of different subspaces is provided.
The learning unit
A decoder that inputs a feature vector in the feature space is provided.
When first-class data is input to the first classifier as the input data, the decoding device after masking the subspace corresponding to the second-class data-specific feature in the feature quantity space. This is masked learning in which the first classifier and the decoder are trained so that the difference between the data output by the data and the input data is small.
When the second type of data is input to the first classifier as the input data, the partial space corresponding to the feature peculiar to the first type of data is masked in the feature quantity space, and then the said. A learning device characterized in that learning with a mask for learning the first classifier and the decoder is performed so that the difference between the data output by the decoder and the input data becomes small.

A learning device comprising:
a learning unit that trains a first classifier that maps input data that can take multiple types to a feature vector in a feature space; and
a third classifier that takes a feature vector in the feature space as input and outputs information about the type, wherein
the learning unit trains the first classifier so that the classification accuracy of the third classifier becomes low when feature vectors included in the common portion of the feature vectors in the feature space corresponding to each of the plurality of types of input data are input to the third classifier.
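As a non-limiting illustration of training the first classifier so that the third classifier performs poorly on the common portion, the following Python sketch uses an adversarial pair of updates. The linear map `W_common` (the part of the first classifier that produces the common portion), the logistic third classifier with weights `v`, the cross-entropy loss, and the random data are all assumptions for illustration; the claims do not prescribe a particular loss or update rule.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def third_loss(x, y, W_common, v):
    """Cross-entropy of the third classifier on the common-portion features."""
    p = sigmoid(x @ W_common @ v)
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def classifier_step(x, y, W_common, v, lr):
    """Third classifier: gradient descent on its own loss."""
    c = x @ W_common
    p = sigmoid(c @ v)
    return v - lr * (c.T @ (p - y)) / len(y)

def encoder_step(x, y, W_common, v, lr):
    """First classifier: gradient ascent on the third classifier's loss,
    so that the common portion carries less type information."""
    p = sigmoid(x @ W_common @ v)
    grad_W = x.T @ np.outer(p - y, v) / len(y)
    return W_common + lr * grad_W

# Placeholder data: 32 inputs of size 4, type label derived from the first input dim.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 4))
y = (x[:, 0] > 0).astype(float)
W_common = rng.standard_normal((4, 2))
v = rng.standard_normal(2)

loss_before = third_loss(x, y, W_common, v)
loss_after_classifier = third_loss(x, y, W_common, classifier_step(x, y, W_common, v, 0.05))
loss_after_encoder = third_loss(x, y, encoder_step(x, y, W_common, v, 0.05), v)
```

In practice the two steps would be alternated over many iterations. Plain gradient ascent is only one possible choice; driving the third classifier toward chance-level output is another common option.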

A classification method performed by a device, comprising:
an acquisition step of acquiring input data that can take multiple types;
a first classification step of mapping the input data to a feature vector in a feature space; and
a second classification step in which a feature vector in the feature space is input and information about the type is output, wherein
in the second classification step, feature vectors other than a common portion of the feature vectors in the feature space corresponding to each of the plurality of types of input data are input.
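As a non-limiting illustration of the second classification step, the following Python sketch removes the common portion from each feature vector before a toy linear classifier sees it. The 8-dimensional layout, the position of the common portion, and the classifier weights are assumptions for illustration only.

```python
import numpy as np

# Hypothetical layout of an 8-dimensional feature vector:
# dims 0-2 specific to type A, dims 3-4 common to both types,
# dims 5-7 specific to type B.
COMMON = slice(3, 5)

def drop_common(z):
    """Return the feature vectors with the common portion removed."""
    keep = np.ones(z.shape[-1], dtype=bool)
    keep[COMMON] = False
    return z[..., keep]

def second_classifier(z_specific, w, b):
    """Toy linear classifier: 0 means type A, 1 means type B."""
    return (z_specific @ w + b > 0).astype(int)

# Two feature vectors as the first classification step might produce them.
z = np.array([[0.9, 0.8, 0.7, 5.0, 5.0, 0.0, 0.1, 0.0],   # mostly type-A features
              [0.0, 0.1, 0.0, 5.0, 5.0, 0.9, 0.8, 0.7]])  # mostly type-B features

w = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])  # assumed weights
labels = second_classifier(drop_common(z), w, 0.0)
```

The large values in the common dimensions never reach the classifier, so they cannot influence the output regardless of their magnitude.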

A learning method comprising a learning step of training a first classifier that maps input data that can take multiple types to a feature vector in a feature space containing a plurality of mutually different subspaces, wherein
the learning step performs masked learning in which:
when first-type data is input to the first classifier as the input data, the subspace of the feature space corresponding to features specific to second-type data is masked, and the first classifier and a decoder to which a feature vector in the feature space is input are then trained so that the difference between the data output by the decoder and the input data becomes small; and
when second-type data is input to the first classifier as the input data, the subspace of the feature space corresponding to features specific to the first-type data is masked, and the first classifier and the decoder are then trained so that the difference between the data output by the decoder and the input data becomes small.

A learning method comprising a learning step of training a first classifier that maps input data that can take multiple types to a feature vector in a feature space, wherein
in the learning step, the first classifier is trained so that the classification accuracy of a third classifier, which takes a feature vector in the feature space as input and outputs information about the type, becomes low when feature vectors included in the common portion of the feature vectors in the feature space corresponding to each of the plurality of types of input data are input to the third classifier.

A control program for causing a computer to function as the classification device according to claim 1, the control program causing the computer to function as the acquisition unit, the first classifier, and the second classifier.

A control program for causing a computer to function as the learning device according to claim 11 or 12, the control program causing the computer to function as the learning unit.

A computer-readable recording medium on which the control program according to claim 16 is recorded.

A computer-readable recording medium on which the control program according to claim 17 is recorded.