JPWO2020054551A1

JPWO2020054551A1 - Information processing device, information processing method, and program

Info

Publication number: JPWO2020054551A1
Application number: JP2020545955A
Authority: JP
Inventors: 和久高木
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2018-09-11
Filing date: 2019-09-04
Publication date: 2021-08-30
Anticipated expiration: 2039-09-04
Also published as: WO2020054551A1; JP7156383B2

Abstract

The information processing device 100 of the present invention includes: a feature amount extraction unit 110 that extracts a first feature amount of training data using a neural network model; a clustering unit 120 that converts the first feature amount into a second feature amount based on information, set in the neural network model, corresponding to the class to which the training data belongs, and clusters the second feature amount; and a correction target selection unit 130 that selects, based on the clustering result of the second feature amount, the training data whose class is to be corrected.

Description

The present invention relates to an information processing device, an information processing method, and a program that support correcting the labels of training data used in a neural network.

In recent years, machine learning using neural networks has been used in various fields. Creating an inference model with a neural network requires a large amount of training data, and because of that volume, labeling is carried out by multiple people or over a long period. As a result, training data may be labeled according to ambiguous judgment criteria. Such labeling lowers the inference accuracy of the inference model.

Correcting labels is an effective way to improve the accuracy of such an inference model. For this reason, experts have visually checked the training data and corrected the labels while organizing the labeling criteria.

Patent Document 1: International Publication No. 2017/179258

However, the training data whose labels must be checked is voluminous and unorganized, so label correction takes many man-hours. Patent Document 1 describes a technique related to this problem. In Patent Document 1, the man-hours required for label correction are reduced by clustering likelihood vectors and changing how images are displayed according to their difference from the mean within each cluster.

Here, feature amounts obtained from a neural network have the property that those given different labels are separated from each other in the direction perpendicular to the identification plane. Consequently, as shown in FIG. 1, similar data items C1 and C2 that have been labeled differently, that is, data items labeled under ambiguous criteria, are unlikely to be grouped into the same cluster. Patent Document 1 does not describe a solution to this problem. As a result, there remains the problem that labeling accuracy cannot be improved while reducing the man-hours required to correct the labels of training data.

Accordingly, an object of the present invention is to provide an information processing device, an information processing method, and a program that can solve the above problem, namely that, in machine learning using a neural network, the man-hours required to correct the labels of training data cannot be reduced while improving labeling accuracy.

An information processing device according to one aspect of the present invention includes:
a feature amount extraction unit that extracts a first feature amount of training data using a neural network model;
a clustering unit that converts the first feature amount into a second feature amount based on information, set in the neural network model, corresponding to the class to which the training data belongs, and clusters the second feature amount; and
a correction target selection unit that selects, based on the clustering result of the second feature amount, the training data whose class is to be corrected.

A program according to one aspect of the present invention causes an information processing device to realize:
a feature amount extraction unit that extracts a first feature amount of training data using a neural network model;
a clustering unit that converts the first feature amount into a second feature amount based on information, set in the neural network model, corresponding to the class to which the training data belongs, and clusters the second feature amount; and
a correction target selection unit that selects, based on the clustering result of the second feature amount, the training data whose class is to be corrected.

An information processing method according to one aspect of the present invention includes:
extracting a first feature amount of training data using a neural network model;
converting the first feature amount into a second feature amount based on information, set in the neural network model, corresponding to the class to which the training data belongs, and clustering the second feature amount; and
selecting, based on the clustering result of the second feature amount, the training data whose class is to be corrected.

Configured as described above, the present invention can reduce the man-hours required to correct the labels of training data and improve labeling accuracy in machine learning using a neural network.

FIG. 1 is a diagram for explaining a problem in machine learning.
FIG. 2 is a block diagram showing the configuration of the label correction support device in Embodiment 1 of the present invention.
FIG. 3 is a diagram for explaining processing by the label correction target presentation method determination device disclosed in FIG. 2.
FIG. 4 is a diagram for explaining processing by the presentation/correction device disclosed in FIG. 2.
FIG. 5 is a flowchart showing the operation of the label correction support device disclosed in FIG. 2.
FIG. 6 is a flowchart showing the operation of the label correction support device disclosed in FIG. 2.
FIG. 7 is a block diagram showing the configuration of the label correction support device in Embodiment 2 of the present invention.
FIG. 8 is a block diagram showing the configuration of the label correction support device in Embodiment 3 of the present invention.
FIG. 9 is a block diagram showing the configuration of the information processing device in Embodiment 4 of the present invention.

<Embodiment 1>
The first embodiment of the present invention will be described with reference to FIGS. 2 to 6. FIG. 2 is a diagram for explaining the configuration of the label correction support device, and FIGS. 3 to 6 are diagrams for explaining its operation.

[Configuration]
The present invention is implemented as the label correction support device 1 shown in FIG. 2. The label correction support device 1 consists of one or more information processing devices, each including an arithmetic unit and a storage device 10. The label correction support device 1 includes a label correction target presentation method determination device 20 and a presentation/correction device 30, both constructed by the arithmetic unit executing a program. The label correction target presentation method determination device 20 in turn includes a feature amount extraction device 21, an image selection device 22, a clustering device 23, and a cluster selection/sorting device 24. Each component is described in detail below.

First, the label correction support device 1 of the present invention supports correcting the labels of training data used in a neural network. Specifically, as described later, the label correction support device 1 has a function of presenting training data to the user, obtaining feedback, and correcting the labels of that training data. The label correction support device 1 is applicable to multi-class classification problems in which the classes to which the training data belong are mutually exclusive. An example of such a problem, when the training data are images, is classifying whether an image shows a "dog" or a "cat". In the following, the number of classes is denoted by C (C = 2 in this example). Note that the training data targeted by the present invention are not limited to images.

The storage device 10 stores the training data, the neural network model, and various setting values. The training data consist of multiple pairs, each made up of an image and a label.

The label correction target presentation method determination device 20 takes the training data, the neural network model, and the various setting values from the storage device 10 as inputs and, as described later, outputs several clusters of similar images arranged in order starting from the one whose labeling criteria are most ambiguous.

The feature amount extraction device 21 (feature amount extraction unit) takes the training data and the neural network model from the storage device 10 as inputs and outputs a feature amount x_n (n = 1, ..., N) (first feature amount) for each of the N images that make up the training data. The feature amount extraction device 21 extracts, as the feature amount, the input vector to the classification layer that immediately precedes the classification activation layer located at the end of the neural network. Here, the classification layer takes as input the feature vector x_n, whose elements are the outputs of the neurons in the immediately preceding layer, and outputs a C-dimensional classification vector z whose elements are the distances from the C hyperplanes w_c · x + b_c = 0 (c = 1, ..., C), one per class. The classification activation layer takes the classification vector z as input and outputs a C-dimensional vector y obtained by applying an activation function such as the Softmax function to the value of each dimension. Each element y_c (c = 1, ..., C) of this vector can be regarded as a confidence (degree of certainty) y_c indicating how likely the input image is to belong to the class corresponding to that element, that is, the likelihood for the class to which the training data belongs.
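The relationship between the feature amount x_n, the classification vector z, and the confidence vector y can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the toy weights, and the use of z_c = w_c · x + b_c (a signed distance scaled by the norm of w_c) are assumptions made for the example.

```python
import math

def classification_layer(x, weights, biases):
    """Return z, where z_c = w_c . x + b_c for each class c (assumed form)."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
            for w, b in zip(weights, biases)]

def softmax(z):
    """Numerically stable Softmax; the outputs sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 3-dimensional feature amount and C = 2 classes.
x_n = [1.0, 2.0, -1.0]
W = [[0.5, -0.2, 0.1], [-0.3, 0.4, 0.2]]   # one weight vector w_c per class
b = [0.1, -0.1]                            # one bias b_c per class

z = classification_layer(x_n, W, b)  # classification vector
y = softmax(z)                       # confidence y_c per class
```

In a real network, x_n would be read from the layer feeding the classification layer, not written by hand as it is here.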

The image selection device 22 (feature amount selection unit) takes the feature amounts x_n from the feature amount extraction device 21 and the various setting values from the storage device 10 as inputs, and selects and outputs only the feature amounts x_m (m = 1, ..., M) of images labeled under ambiguous criteria. Here, M is the number of feature amounts after selection. This selection can be realized by any method that uses the confidence y_c or the feature amount x_n described above. For example, the selection may depend on whether the confidence y_c falls within a preset range, or that range may be derived from the mean or variance of the confidence y_c. As one example, feature amounts whose confidence y_c is lower than a threshold stored as a setting value in the storage device 10 are selected. In this way, the image selection device 22 serves to suppress a problem that arises in the clustering device 23, described later, in which images labeled under clear criteria are clustered together with images labeled under ambiguous criteria. The setting values needed for selection are the image selection setting values among the various setting values.
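The threshold-based selection mentioned as one example could look like the sketch below. The text does not say which element of y is compared with the threshold, so this sketch assumes the largest confidence over all classes; the function name `select_ambiguous` and the threshold 0.7 are hypothetical.

```python
def select_ambiguous(confidences, threshold=0.7):
    """Return indices of samples whose top confidence is below the threshold.

    Assumption: a sample counts as ambiguously labeled when even its most
    likely class has low confidence.
    """
    return [i for i, y in enumerate(confidences) if max(y) < threshold]

# Hypothetical confidence vectors for four images and C = 2 classes.
confidences = [
    [0.95, 0.05],  # clearly one class: dropped
    [0.55, 0.45],  # ambiguous: kept
    [0.40, 0.60],  # ambiguous: kept
    [0.10, 0.90],  # clearly one class: dropped
]
selected = select_ambiguous(confidences)  # -> [1, 2]
```

The selected indices identify which feature amounts x_n are passed on as x_m.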

The clustering device 23 (clustering unit) takes as inputs the neural network model and the various setting values from the storage device 10 and the selected feature amounts x_m from the image selection device 22, and clusters the feature amounts x_m of the training data based on the neural network model. It then outputs the clustering result k_m,c (m = 1, ..., M; c = 1, ..., C), that is, a cluster ID for each feature amount and each class. Specifically, the clustering device 23 first converts the selected feature amounts x_m by orthogonally projecting them onto the identification plane corresponding to each class c of the neural network model. That is, as shown in FIG. 3, the feature amount of the training data is orthogonally projected onto the identification plane corresponding to the class to which the training data belongs, which compresses the dimension of the feature amount. The orthogonal projection onto the identification plane is performed, for example, by the formula shown in FIG. 3. Next, the clustering device 23 clusters the converted feature amounts x'_m,c (second feature amounts) using the neural network model. Possible clustering methods include the well-known k-means method and Mean-Shift method.
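The formula of FIG. 3 is not reproduced in this text, so the following sketch assumes the standard orthogonal projection of a point onto the hyperplane w · v + b = 0, namely x' = x - ((w · x + b) / ||w||^2) w. The function name and the toy values are illustrative only.

```python
def project_onto_hyperplane(x, w, b):
    """Orthogonally project x onto the hyperplane {v : w . v + b = 0}."""
    norm_sq = sum(w_i * w_i for w_i in w)
    if norm_sq == 0.0:
        raise ValueError("w must be a non-zero vector")
    # Signed offset of x from the plane, measured along w.
    scale = (sum(w_i * x_i for w_i, x_i in zip(w, x)) + b) / norm_sq
    return [x_i - scale * w_i for x_i, w_i in zip(x, w)]

# Hypothetical identification plane: second coordinate = 0.
w_c, b_c = [0.0, 1.0], 0.0

# Two similar samples that lie on opposite sides of the plane.
x_1 = [1.0, 2.0]
x_2 = [1.1, -2.0]

p_1 = project_onto_hyperplane(x_1, w_c, b_c)  # -> [1.0, 0.0]
p_2 = project_onto_hyperplane(x_2, w_c, b_c)  # -> [1.1, 0.0]
```

After projection the component perpendicular to the plane is removed, so the two samples, which were far apart before, end up close together and can fall into the same cluster.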

In this way, the clustering device 23 has a function of converting the selected feature amounts x_m. This function addresses the problem described above: because feature amounts obtained from a neural network are separated in the direction perpendicular to the identification plane when they carry different labels, similar images that have been labeled differently, that is, images labeled under ambiguous criteria, are unlikely to be grouped into the same cluster. The orthogonal projection onto the identification plane of a class discards information in the confidence direction, but the image selection device 22 has already selected the feature amounts to be converted, which compensates for that loss. The setting values needed for the k-means method and the like are the clustering setting values among the various setting values.

The cluster selection/sorting device 24 (correction target selection unit) takes as inputs the various setting values from the storage device 10, the selected feature amounts x_m from the image selection device 22, and the clustering result k_m,c from the clustering device 23, and outputs cluster sorting information. Specifically, the cluster selection/sorting device 24 assigns to each cluster k an ambiguity a_k representing how disordered the labels of the feature amounts in that cluster are, arranges the clusters k based on the ambiguity a_k and the class c, and excludes clusters k that do not satisfy the presentation condition. In other words, the training data belonging to the clusters that are not excluded are selected as the data whose class is to be corrected. The clusters may be arranged, for example, by class c and then in descending order of ambiguity a_k within each class, or in descending order of ambiguity a_k regardless of class.

The cluster selection/sorting device 24 may calculate the ambiguity a_k using, for example, the entropy of the labels within the cluster or a simple ratio of the labels. The presentation condition may be evaluated, for example, by comparing the ambiguity a_k with a threshold, or by comparing the rank of the ambiguity a_k of cluster k with a threshold. The threshold for the ambiguity a_k may be, for example, a fixed value, or the ambiguity a_k' of another cluster plus a fixed value. The setting values needed by this device are the cluster selection/sorting setting values among the various setting values.
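As one possible reading of the entropy-based option, the sketch below computes the Shannon entropy of the labels in each cluster as the ambiguity a_k, drops clusters below a fixed threshold, and sorts the rest in descending order of ambiguity. The function names, the base-2 logarithm, and the threshold 0.5 are assumptions for illustration.

```python
import math
from collections import Counter

def label_entropy(labels):
    """Shannon entropy (base 2) of the label distribution in one cluster."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def rank_clusters(cluster_ids, labels, min_ambiguity=0.5):
    """Return (cluster ID, ambiguity) pairs, most ambiguous first.

    Clusters whose ambiguity is below min_ambiguity are excluded,
    mirroring the fixed-threshold presentation condition.
    """
    by_cluster = {}
    for k, label in zip(cluster_ids, labels):
        by_cluster.setdefault(k, []).append(label)
    scored = [(k, label_entropy(ls)) for k, ls in by_cluster.items()]
    kept = [(k, a) for k, a in scored if a >= min_ambiguity]
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Hypothetical clustering result for twelve images.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
labels = ["dog", "cat", "dog", "cat",
          "dog", "dog", "dog", "dog",
          "dog", "dog", "dog", "cat"]

ranked = rank_clusters(cluster_ids, labels)
```

Here cluster 0 (evenly mixed labels) ranks first, cluster 2 (one dissenting label) second, and cluster 1 (uniform labels, zero entropy) is excluded.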

The presentation/correction device 30 (class setting unit) takes as inputs the training data from the storage device 10, the clustering result from the clustering device 23, and the cluster sorting information from the cluster selection/sorting device 24, and presents to the user, cluster by cluster and in order, the correction target images (the training data to be corrected) together with the labels indicating the classes to which those images belong. Then, based on label correction information entered by the user for the presented correction target images, the presentation/correction device 30 updates the labels indicating the classes of the training data stored in the storage device 10. FIG. 4 shows an example of the presentation and correction method. With the method described above, similar images whose labeling criteria are ambiguous are presented together with their labels. In this example, the user is asked to enter the corrected label in the field marked "?". The interface also displays the previous or next cluster when an "arrow" mark is pressed.

[Operation]
Next, the operation of the label correction support device 1 described above will be described with reference to the flowcharts of FIGS. 5 and 6. FIG. 5 shows the overall operation of the label correction support device 1, and FIG. 6 shows the operation of the clustering device 23.

First, the feature amount extraction device 21 takes the training data and the neural network model from the storage device 10 as inputs and extracts the feature amounts x_n (n = 1, ..., N) (first feature amounts) of the training data (step S1). At this time, the feature amount extraction device 21 also calculates, from the feature amount of each item of training data, the confidence y_c for the class to which that training data belongs.

Next, the image selection device 22 uses the feature amounts x_n and the confidences y_c to select only the feature amounts x_m (m = 1, ..., M) of images judged to have been labeled under ambiguous criteria (step S2). The clustering device 23 then uses the selected feature amounts to perform the feature amount conversion (step S3) and clustering (step S4) described below.

The processing performed by the clustering device 23 is described here with reference to the flowchart of FIG. 6. First, the clustering device 23 receives the neural network model from the storage device 10 and the selected feature amounts x_m (m = 1, ..., M) from the image selection device 22 (step S11). Here, M is the number of feature amounts selected by the image selection device 22. The neural network model includes the weight parameter w_c and the bias parameter b_c of the classification layer for each class c (c = 1, ..., C).

Next, the clustering device 23 initializes a variable by setting c = 1 (step S12). The clustering device 23 then orthogonally projects the selected feature amounts onto the identification plane (step S13). Specifically, using the weight parameter w_c and the bias parameter b_c of the neural network model, it converts every selected feature amount x_m, by the formula shown in FIG. 3, into a feature amount x'_m,c (second feature amount) orthogonally projected onto the identification plane of the corresponding class c.

Next, the clustering device 23 clusters the orthogonally projected feature amounts x'_m,c by the k-means method or the Mean-Shift method (step S14). This yields the cluster ID k_m,c (= 1, ..., K) to which each projected feature amount x'_m,c belongs. Here, K is the number of clusters.
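For illustration, step S14 with the k-means option could be realized by a plain Lloyd's-algorithm loop such as the one below. This is a dependency-free sketch, not the patented implementation: the seeded random initialization, the fixed iteration count, and the 0-based cluster IDs (the text numbers clusters from 1) are all assumptions.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((a_i - b_i) ** 2 for a_i, b_i in zip(a, b))

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means; returns a cluster ID in 0..k-1 for each point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assignments

# Hypothetical projected feature amounts forming two obvious groups.
projected = [[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0]]
cluster_ids = kmeans(projected, k=2)
```

The number of clusters K and the iteration count would correspond to the clustering setting values held in the storage device 10.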

After that, the clustering device 23 updates the value of c to c + 1 (step S15). If c <= C (No in step S16), it performs clustering for the next class (steps S13 to S15). If c > C (Yes in step S16), it sends the clustering result k_m,c (m = 1, ..., M; c = 1, ..., C) to the cluster selection/sorting device 24 (step S17).
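The per-class loop of steps S12 to S17 can be sketched as below. The clustering method is passed in as a function so that k-means, Mean-Shift, or anything else can be plugged in; the threshold-based `toy_cluster` used here is only a stand-in, and all names and values are hypothetical. Classes are indexed from 0 in the code, whereas the text counts from 1.

```python
def project(x, w, b):
    """Orthogonal projection of x onto the plane w . v + b = 0 (assumed form)."""
    scale = (sum(wi * xi for wi, xi in zip(w, x)) + b) / sum(wi * wi for wi in w)
    return [xi - scale * wi for xi, wi in zip(x, w)]

def cluster_per_class(features, weights, biases, cluster_fn):
    """Run projection and clustering once per class; returns {c: cluster IDs}."""
    results = {}
    # The for loop covers S12 (init c), S15 (increment), and S16 (end check).
    for c, (w_c, b_c) in enumerate(zip(weights, biases)):
        projected = [project(x, w_c, b_c) for x in features]  # S13
        results[c] = cluster_fn(projected)                    # S14
    return results                                            # S17

def toy_cluster(points):
    """Stand-in clustering: split on the first coordinate at 2.0."""
    return [0 if p[0] < 2.0 else 1 for p in points]

features = [[1.0, 2.0], [1.1, -2.0], [4.0, 0.5]]
weights = [[0.0, 1.0], [1.0, 0.0]]   # one plane per class
biases = [0.0, 0.0]

result = cluster_per_class(features, weights, biases, toy_cluster)
# -> {0: [0, 0, 1], 1: [0, 0, 0]}
```

Each sample thus receives one cluster ID per class, which matches the two-index form k_m,c of the clustering result.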

Next, the cluster selection/sorting device 24 assigns to each cluster k an ambiguity a_k representing how disordered the labels of the feature amounts in that cluster are, arranges the clusters k based on the ambiguity a_k and the class c, and selects the images, that is, the training data, whose class is to be corrected (step S5).

Then, as shown in FIG. 4, the presentation/correction device 30 presents to the user, cluster by cluster and in order, the images to be corrected and their labels (step S6). Based on label correction information entered by the user for the presented correction target images, the presentation/correction device 30 updates the labels indicating the classes of the training data stored in the storage device 10 (step S7).

As described above, the present invention can resolve the problem that, because feature amounts obtained from a neural network are separated in the direction perpendicular to the identification plane when they carry different labels, similar images that have been labeled differently, that is, images labeled under ambiguous criteria, are unlikely to be grouped into the same cluster. This is because the present invention clusters the orthogonal projections of the selected feature amounts onto the identification plane and determines how the correction targets are presented by selecting and sorting the resulting clusters.

<Embodiment 2>
Next, a second embodiment of the present invention will be described with reference to FIG. 7. FIG. 7 is a diagram showing the configuration of the label correction support device 1 in this embodiment.

The label correction support device 1 in this embodiment includes an automatic label correction device 40 constructed by the arithmetic unit executing a program. The automatic label correction device 40 takes as inputs the clustering result from the clustering device 23 and the cluster sorting information from the cluster selection/sorting device 24, and updates the training data in the storage device 10. Possible update methods include randomly selecting a label for each presented cluster and updating the labels of all images in that cluster to the selected label, or selecting a label according to whether the average confidence of all images in the cluster exceeds a threshold and updating the labels accordingly. In this embodiment, updating the image labels in bulk and automatically in this way makes label correction easier.
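The random-selection update could be sketched as follows. The text does not state which labels the random choice is drawn from, so this sketch assumes the labels already present in the cluster; `auto_relabel`, the seed, and the toy data are hypothetical.

```python
import random

def auto_relabel(cluster_ids, labels, target_clusters, seed=0):
    """Return a copy of labels in which each target cluster is made uniform.

    Assumption: the replacement label is drawn at random from the labels
    that already occur in the cluster.
    """
    rng = random.Random(seed)
    new_labels = list(labels)
    for k in target_clusters:
        members = [i for i, cid in enumerate(cluster_ids) if cid == k]
        if not members:
            continue
        candidates = sorted({labels[i] for i in members})
        chosen = rng.choice(candidates)
        for i in members:
            new_labels[i] = chosen
    return new_labels

# Hypothetical data: cluster 0 is presented for correction, cluster 1 is not.
cluster_ids = [0, 0, 0, 1, 1]
labels = ["dog", "cat", "dog", "cat", "cat"]
updated = auto_relabel(cluster_ids, labels, target_clusters=[0])
```

A random choice makes the labels within a cluster consistent but not necessarily correct; the confidence-based alternative mentioned in the text would replace the `rng.choice` line with a rule based on the cluster's average confidence.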

<Embodiment 3>
Next, a third embodiment of the present invention will be described with reference to FIG. 8. FIG. 8 is a diagram showing the configuration of the label correction support device 1 in this embodiment.

In addition to the configuration of the label correction support device 1 described in Embodiment 1, the label correction support device 1 in this embodiment further includes a setting value update device 50 constructed by the arithmetic unit executing a program. The setting value update device 50 has a function of presenting the various setting values to the user, receiving updated values from the user, and updating the various setting values stored in the storage device 10 based on those updated values.

<Embodiment 4>
Next, a fourth embodiment of the present invention will be described with reference to FIG. 9. FIG. 9 is a block diagram showing the configuration of the information processing device in Embodiment 4. This embodiment outlines the configuration of the label correction support device 1 described in Embodiment 1.

As shown in FIG. 9, the information processing device 100 in the present embodiment includes:
a feature amount extraction unit 110 that extracts a first feature amount of learning data using a neural network model;
a clustering unit 120 that converts the first feature amount into a second feature amount based on information, set in the neural network model, corresponding to the class to which the learning data belongs, and clusters the second feature amount; and
a correction target selection unit 130 that selects, based on the clustering result of the second feature amount, the learning data whose class is to be corrected.

The feature amount extraction unit 110, the clustering unit 120, and the correction target selection unit 130 are realized by the information processing device executing a program.

The information processing device 100 configured as described above operates to execute the following processing:
extracting a first feature amount of learning data using a neural network model;
converting the first feature amount into a second feature amount based on information, set in the neural network model, corresponding to the class to which the learning data belongs, and clustering the second feature amount; and
selecting, based on the clustering result of the second feature amount, the learning data whose class is to be corrected.

According to the above invention, the first feature amount of the learning data is converted into the second feature amount based on the information corresponding to the class to which the learning data belongs and is then clustered, which makes it possible to select the learning data whose class is to be corrected. As a result, the man-hours required to correct the labels of the learning data can be reduced and the labeling accuracy can be improved.
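The flow of extraction, class-dependent conversion, clustering, and selection can be sketched as a small pipeline. The sketch below is not the implementation of the embodiment: the three stages are passed in as placeholder callables, the toy stand-ins used in the example are hypothetical, and the selection rule (flag every sample in a cluster that contains more than one class) is an assumed simplification.

```python
from collections import defaultdict


def select_correction_targets(samples, labels, extract, transform, cluster):
    """Return sorted indices of samples whose cluster mixes several classes."""
    # Step 1: first feature amount of each learning sample.
    first = [extract(s) for s in samples]
    # Step 2: class-dependent conversion into the second feature amount.
    second = [transform(f, y) for f, y in zip(first, labels)]
    # Step 3: cluster the second feature amounts (one cluster id per sample).
    assignment = cluster(second)
    members = defaultdict(list)
    for idx, cid in enumerate(assignment):
        members[cid].append(idx)
    # Step 4 (assumed rule): a cluster with more than one class is suspicious.
    targets = []
    for idxs in members.values():
        if len({labels[i] for i in idxs}) > 1:
            targets.extend(idxs)
    return sorted(targets)


# Toy stand-ins, purely hypothetical.
samples = [(0.1, 5.0), (0.2, -3.0), (1.9, 0.0), (2.1, 1.0)]
labels = ["a", "b", "c", "c"]
targets = select_correction_targets(
    samples,
    labels,
    extract=lambda s: s,                           # features are the samples
    transform=lambda f, y: f[0],                   # keep the first coordinate
    cluster=lambda feats: [round(v) for v in feats],  # bucket by nearest integer
)
print(targets)  # [0, 1]
```

Samples 0 and 1 fall into the same cluster despite carrying different labels, so they are the ones presented for correction, while the consistently labeled cluster is skipped.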

<Additional notes>
Part or all of the above embodiments may also be described as in the following appendices. The outline of the configurations of the information processing device, the information processing method, and the program according to the present invention is described below. However, the present invention is not limited to the following configurations.

(Appendix 1)
An information processing device comprising:
a feature amount extraction unit that extracts a first feature amount of learning data using a neural network model;
a clustering unit that converts the first feature amount into a second feature amount based on information, set in the neural network model, corresponding to the class to which the learning data belongs, and clusters the second feature amount; and
a correction target selection unit that selects, based on the clustering result of the second feature amount, the learning data whose class is to be corrected.

(Appendix 2)
The information processing device according to Appendix 1,
wherein the clustering unit compresses the dimension of the first feature amount to convert it into the second feature amount.

(Appendix 3)
The information processing device according to Appendix 1 or 2,
wherein the clustering unit orthogonally projects the first feature amount onto the identification plane, in the neural network model, corresponding to the class to which the learning data belongs, thereby converting the first feature amount into the second feature amount.
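As an illustration of the orthogonal projection, assume the identification plane of a class can be written as the hyperplane w·z + b = 0, where w and b are hypothetical parameters taken from the model for that class. The projection of a feature vector x is then x' = x − ((w·x + b) / ‖w‖²) w. A minimal sketch under that assumption:

```python
def project_onto_hyperplane(x, w, b=0.0):
    """Orthogonally project x onto the hyperplane {z : w.z + b = 0}."""
    norm_sq = sum(wi * wi for wi in w)
    if norm_sq == 0.0:
        raise ValueError("w must be a non-zero vector")
    # Signed distance factor of x from the plane, measured along w.
    scale = (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_sq
    # Remove the component of x along the plane normal.
    return [xi - scale * wi for xi, wi in zip(x, w)]


x = [3.0, 4.0, 1.0]   # first feature amount (hypothetical values)
w = [0.0, 0.0, 2.0]   # normal vector of the assumed identification plane
projected = project_onto_hyperplane(x, w)
print(projected)  # [3.0, 4.0, 0.0]
```

The projected vector has no component along w, so it varies in one fewer dimension than the original feature, which is consistent with treating the projection as a form of dimension compression.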

(Appendix 4)
The information processing device according to any one of Appendices 1 to 3,
further comprising a feature amount selection unit that selects the first feature amount based on a value derived from the first feature amount,
wherein the clustering unit converts the selected first feature amount into the second feature amount.

(Appendix 5)
The information processing device according to Appendix 4,
wherein the feature amount selection unit selects the first feature amount based on a value, derived from the first feature amount, representing the certainty with respect to the class to which the learning data belongs.
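One way to picture selection by certainty is to compute, for each sample, a softmax probability for its currently assigned class and keep the samples that pass a threshold. The sketch below rests on assumptions not fixed by the text: that the certainty is a softmax over per-class scores, and that samples at or above the threshold are the ones kept. The names `softmax` and `select_by_confidence` and the threshold 0.6 are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of per-class scores."""
    m = max(scores)
    exps = [math.exp(v - m) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_by_confidence(scores_per_sample, label_indices, threshold):
    """Return indices whose certainty for their assigned class is >= threshold."""
    selected = []
    for i, (scores, label) in enumerate(zip(scores_per_sample, label_indices)):
        # Certainty of the class the sample is currently labeled with.
        if softmax(scores)[label] >= threshold:
            selected.append(i)
    return selected


scores = [[2.0, 0.0], [0.0, 0.0], [0.0, 3.0], [3.0, 0.0]]
label_indices = [0, 0, 0, 0]  # every sample is labeled as class 0
selected = select_by_confidence(scores, label_indices, threshold=0.6)
print(selected)  # [0, 3]
```

If the opposite direction is wanted (keeping low-certainty samples as the suspicious ones), only the comparison in `select_by_confidence` needs to change.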

(Appendix 6)
The information processing device according to any one of Appendices 1 to 5,
wherein the correction target selection unit selects the learning data whose class is to be corrected based on the randomness, within the cluster to which the clustered second feature amount belongs, of the classes to which the learning data underlying the second feature amounts belong.
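The randomness of classes within a cluster could be quantified in several ways. As one possible measure, the sketch below uses the Shannon entropy of the class labels in each cluster and ranks clusters from most to least mixed. The choice of entropy and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy, in bits, of the class labels in one cluster."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def rank_clusters_by_entropy(assignment, labels):
    """Return cluster ids ordered from the most mixed to the least mixed."""
    clusters = {}
    for cid, y in zip(assignment, labels):
        clusters.setdefault(cid, []).append(y)
    scored = [(label_entropy(ys), cid) for cid, ys in clusters.items()]
    # Higher entropy first: those clusters are the likeliest correction targets.
    return [cid for _, cid in sorted(scored, key=lambda t: -t[0])]


assignment = [0, 0, 0, 0, 1, 1, 1, 2, 2]
labels = ["a", "a", "a", "a", "a", "b", "b", "a", "b"]
ranking = rank_clusters_by_entropy(assignment, labels)
print(ranking)  # [2, 1, 0]
```

Cluster 2 holds an even split of two classes and is ranked first, while cluster 0 contains a single class (entropy 0) and is ranked last.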

(Appendix 7)
The information processing device according to any one of Appendices 1 to 6,
further comprising a class setting unit that changes and sets the class to which the learning data selected for class correction belongs.

(Appendix 8)
A program for causing an information processing device to realize:
a feature amount extraction unit that extracts a first feature amount of learning data using a neural network model;
a clustering unit that converts the first feature amount into a second feature amount based on information, set in the neural network model, corresponding to the class to which the learning data belongs, and clusters the second feature amount; and
a correction target selection unit that selects, based on the clustering result of the second feature amount, the learning data whose class is to be corrected.

(Appendix 8.1)
The program according to Appendix 8,
further causing the information processing device to realize a feature amount selection unit that selects the first feature amount based on a value derived from the first feature amount,
wherein the clustering unit converts the selected first feature amount into the second feature amount.

(Appendix 8.2)
The program according to Appendix 8 or 8.1,
further causing the information processing device to realize a class setting unit that changes and sets the class to which the learning data selected for class correction belongs.

(Appendix 9)
An information processing method comprising:
extracting a first feature amount of learning data using a neural network model;
converting the first feature amount into a second feature amount based on information, set in the neural network model, corresponding to the class to which the learning data belongs, and clustering the second feature amount; and
selecting, based on the clustering result of the second feature amount, the learning data whose class is to be corrected.

(Appendix 10)
The information processing method according to Appendix 9,
wherein the dimension of the first feature amount is compressed to convert it into the second feature amount, and the second feature amount is clustered.

(Appendix 11)
The information processing method according to Appendix 9 or 10,
wherein the first feature amount is orthogonally projected onto the identification plane, in the neural network model, corresponding to the class to which the learning data belongs, thereby converting the first feature amount into the second feature amount, and the second feature amount is clustered.

(Appendix 12)
The information processing method according to any one of Appendices 9 to 11,
wherein the first feature amount is selected based on a value derived from the first feature amount, and
the selected first feature amount is converted into the second feature amount, and the second feature amount is clustered.

(Appendix 13)
The information processing method according to Appendix 12,
wherein the first feature amount is selected based on a value, derived from the first feature amount, representing the certainty with respect to the class to which the learning data belongs.

(Appendix 14)
The information processing method according to any one of Appendices 9 to 13,
wherein the learning data whose class is to be corrected is selected based on the randomness, within the cluster to which the clustered second feature amount belongs, of the classes to which the learning data underlying the second feature amounts belong.

(Appendix 15)
The information processing method according to any one of Appendices 9 to 14,
wherein the class to which the learning data selected for class correction belongs is changed and set.

The above-described program can be stored using various types of non-transitory computer readable media and supplied to a computer. Non-transitory computer readable media include various types of tangible storage media. Examples of non-transitory computer readable media include magnetic recording media (for example, flexible disks, magnetic tapes, and hard disk drives), magneto-optical recording media (for example, magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memories (for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, and RAM (Random Access Memory)). The program may also be supplied to a computer by various types of transitory computer readable media. Examples of transitory computer readable media include electrical signals, optical signals, and electromagnetic waves. A transitory computer readable medium can supply the program to a computer via a wired communication path such as an electric wire or an optical fiber, or via a wireless communication path.

The present invention has been described above with reference to the embodiments and the like, but the present invention is not limited to the embodiments described above. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within its scope.

The present invention claims the benefit of priority based on Japanese Patent Application No. 2018-169829, filed in Japan on September 11, 2018, and the entire contents of that patent application are incorporated herein.

1 Label correction support device
10 Storage device
20 Label correction target presentation method determination device
21 Feature amount extraction device
22 Image selection device
23 Clustering device
24 Cluster selection/sorting device
30 Presentation/correction device
40 Automatic label correction device
50 Setting value update device
100 Information processing device
110 Feature amount extraction unit
120 Clustering unit
130 Correction target selection unit

Claims

1. An information processing device comprising:
a feature amount extraction unit that extracts a first feature amount of learning data using a neural network model;
a clustering unit that converts the first feature amount into a second feature amount based on information, set in the neural network model, corresponding to the class to which the learning data belongs, and clusters the second feature amount; and
a correction target selection unit that selects, based on the clustering result of the second feature amount, the learning data whose class is to be corrected.

2. The information processing device according to claim 1,
wherein the clustering unit compresses the dimension of the first feature amount to convert it into the second feature amount.

3. The information processing device according to claim 1 or 2,
wherein the clustering unit orthogonally projects the first feature amount onto the identification plane, in the neural network model, corresponding to the class to which the learning data belongs, thereby converting the first feature amount into the second feature amount.

4. The information processing device according to any one of claims 1 to 3,
further comprising a feature amount selection unit that selects the first feature amount based on a value derived from the first feature amount,
wherein the clustering unit converts the selected first feature amount into the second feature amount.

5. The information processing device according to claim 4,
wherein the feature amount selection unit selects the first feature amount based on a value, derived from the first feature amount, representing the certainty with respect to the class to which the learning data belongs.

6. The information processing device according to any one of claims 1 to 5,
wherein the correction target selection unit selects the learning data whose class is to be corrected based on the randomness, within the cluster to which the clustered second feature amount belongs, of the classes to which the learning data underlying the second feature amounts belong.

7. The information processing device according to any one of claims 1 to 6,
further comprising a class setting unit that changes and sets the class to which the learning data selected for class correction belongs.

8. A program for causing an information processing device to realize:
a feature amount extraction unit that extracts a first feature amount of learning data using a neural network model;
a clustering unit that converts the first feature amount into a second feature amount based on information, set in the neural network model, corresponding to the class to which the learning data belongs, and clusters the second feature amount; and
a correction target selection unit that selects, based on the clustering result of the second feature amount, the learning data whose class is to be corrected.

9. An information processing method comprising:
extracting a first feature amount of learning data using a neural network model;
converting the first feature amount into a second feature amount based on information, set in the neural network model, corresponding to the class to which the learning data belongs, and clustering the second feature amount; and
selecting, based on the clustering result of the second feature amount, the learning data whose class is to be corrected.

10. The information processing method according to claim 9,
wherein the dimension of the first feature amount is compressed to convert it into the second feature amount, and the second feature amount is clustered.

11. The information processing method according to claim 9 or 10,
wherein the first feature amount is orthogonally projected onto the identification plane, in the neural network model, corresponding to the class to which the learning data belongs, thereby converting the first feature amount into the second feature amount, and the second feature amount is clustered.

12. The information processing method according to any one of claims 9 to 11,
wherein the first feature amount is selected based on a value derived from the first feature amount, and
the selected first feature amount is converted into the second feature amount, and the second feature amount is clustered.

13. The information processing method according to claim 12,
wherein the first feature amount is selected based on a value, derived from the first feature amount, representing the certainty with respect to the class to which the learning data belongs.

14. The information processing method according to any one of claims 9 to 13,
wherein the learning data whose class is to be corrected is selected based on the randomness, within the cluster to which the clustered second feature amount belongs, of the classes to which the learning data underlying the second feature amounts belong.

15. The information processing method according to any one of claims 9 to 14,
wherein the class to which the learning data selected for class correction belongs is changed and set.